Demonstration Material · A walkthrough of Aurora in real use, on synthetic data.

Layer 04 · AI Business Value Engineering

See it Work · sample run · synthetic data

Prove.
Weeks of value measurement now automated and repeatable.

Every variance classification, attestation signature, and dollar figure on this page comes from an actual run of the Aurora L04 Prove automation against the locked baseline captured at L01 signoff. Layer 04 measures realized value against the projection L01 made — before the build started. Every delta is classified across seven categories. The result is a numerical readout against a signed baseline, not a slide deck made the week before.

$3.0M

Net pilot-window value

4 / 4

Signers attested

0

Reverse-fit narratives

7 / 7

ADW checks armed

Source: Keystone-2026Q2 L04 baseline + pilot readout · attested 2026-05-23 · baseline hash sha256:demo-l04-baseline-…7f3c1b9e

The Prove Problem

Most AI programs cannot tell you what they actually delivered.
The methodology makes the answer structural.

Four things go wrong with AI value measurement after a program ships. Aurora's L04 layer makes each of them impossible to do quietly.

The projection drifts to match the outcome.

The pre-build value claim quietly gets revised every quarter to match what actually happened. Every readout shows green. Aurora locks the baseline at L01 signoff, hash-signs it, and refuses silent revision. Re-baselining requires a documented change order with four-party re-attestation.

Variance gets narrated away.

"It seems to be working" becomes the executive narrative. "Mixed signals" becomes the variance explanation. Aurora classifies every delta against a seven-category rubric with mutual exclusion. Each variance carries its decision-tree path on the record. Free-text rationale cannot override the category.

Positive variance gets reverse-fit into "we crushed it."

A +$2.84M positive variance tempts the team to annualize the pilot into $76M/yr. Aurora's Structural Rule 2 refuses: variance is explained *against the projection*, not by re-shaping the projection. The independent auditor confirms zero reverse-fit narratives before any readout signs.

One person signs off on the value claim.

The CRO signs off on outcomes. The CFO never validates the economics. The CEO never reviews the variance classification. Aurora requires four-party joint attestation per readout: CFO (economics) + CRO (GTM) + CEO (strategic) + Northbeam (methodology). All four signatures, or no readout ships.

Net AI Business Value Engineering · Pilot Window

$3.0M net pilot value
at $63K/yr AI cost overhead.

Aurora discloses what the AI loops cost. The fully-loaded overhead is structurally honest because the methodology refuses to hide infrastructure cost behind value-only headlines.

$3.0M

Net pilot-window value

$3,002,923

Gross $3,007,769 · less pilot-window overhead pro-rata $4,846

| Overhead category | Annual |
| --- | --- |
| Tokens | $6,000 |
| Infra | $24,000 |
| Ops | $15,000 |
| Model maintenance | $10,000 |
| Compliance | $8,000 |
| Total annual AI cost overhead | $63,000 |
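The headline figures can be re-derived from the numbers printed on this page. The following is a minimal arithmetic sketch in Python; the variable names are illustrative rather than Aurora field names, and the pilot-window pro-rata overhead is taken as stated on the page rather than recomputed.

```python
# Figures copied from the page; the names are illustrative, not Aurora fields.
annual_overhead_usd = {
    "Tokens": 6_000,
    "Infra": 24_000,
    "Ops": 15_000,
    "Model maintenance": 10_000,
    "Compliance": 8_000,
}
gross_pilot_value_usd = 3_007_769
pilot_overhead_pro_rata_usd = 4_846  # taken as stated, not re-derived here

# Sum the overhead categories and subtract the pilot-window share from gross.
total_annual_overhead_usd = sum(annual_overhead_usd.values())
net_pilot_value_usd = gross_pilot_value_usd - pilot_overhead_pro_rata_usd

print(f"Total annual AI cost overhead: ${total_annual_overhead_usd:,}")
print(f"Net pilot-window value: ${net_pilot_value_usd:,}")
```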

Pre-Aurora vs Post-Aurora

What changed, measured.
Not what felt different.

Pre-Aurora state cited from L01 evidence chain. Post-Aurora state observed in the 2-week pilot window (2026-06-02 → 06-13). Every delta is citable.

| Dimension | Pre-Aurora (L01-cited) | Post-Aurora (pilot observed) | Delta |
| --- | --- | --- | --- |
| Customer-health definitions | 4 parallel definitions (KCHS v2 / Riya Notion / Diego SF / Priya engagement) | 1 canonical, hash-signed | −75% |
| KCHS null-input rate | 3 of 5 inputs returning nulls for affected population | 0 nulls post-WS-005 repair (12 accounts backfilled) | −100% |
| False-positive page rate | ~30% (Riya operator estimate) | 12% auto-suppressed by ADR-017 gate before reaching CSM | −83% reaching CSM |
| Canonical-dashboard viewership | 6 unique / 90 days (4 of those = Riya confirming brokenness) | Replaced by canonical query layer feeding board memo | retired |
| At-risk ARR addressable | $3.8M (L01-cited) | $4.62M observed in Q3-Q4 pilot cohort (+21.6%) | +$820K |
| Customer-health decisioning loop | Operator-mediated (Devon manual suppression, Sarah uses Devon's list, Diego runs separate forecast) | End-to-end AI loop: canonical signal → driver attribution → validated dispatch → CSM action → save tracking → board memo | mechanism live |
| Q1 board memo defensibility | "Diego says 11.4%, Sarah says 14.1% — Lin doesn't know which" (Tom Q12 admission) | Single canonical query produces one number with hash + assumption tree + audit-review sign-off | defensible |
| Build-partner bus-factor | 1 (Riya sole maintainer of canonical query path) | 2 (Riya + Marcus Patel per ADR-004 with KT milestones) | 1 → 2 |

Intangible Value Catalog

AI BVE isn't just dollars.
Intangibles get attestation cadence too.

Six intangible-value items captured at baseline; three pilot-validated, three pending instrumented attestation. Each carries a named authority and cadence — so the value claim doesn't quietly disappear when the spotlight moves on.

| ID | Item | Pilot status | Cadence |
| --- | --- | --- | --- |
| iv-01 | Sarah↔Diego trust recovery (ADR-003 disclosure protocol) | ✓ pilot-validated | quarterly |
| iv-02 | Riya bus-factor risk reduction (ADR-004 + Marcus backstop) | ✓ pilot-validated | monthly → Q |
| iv-03 | Board defensibility moment (June 18 memo with canonical number) | ✓ shipped | one-time → Q |
| iv-04 | Devon shadow-suppression workflow retired | ⏸ deploy+30d | one-time → Q |
| iv-05 | CS team operational confidence (driver-informed intervention) | ⏸ Q3 NPS survey | quarterly |
| iv-06 | Sales↔CS coordination friction reduction | ⏸ Q3 cross-functional pulse | quarterly |

Forward Calendar · ADW + Scheduled Readouts

Continuous observability.
Scheduled measurement windows.

The engagement now lives under ADW with scheduled quarterly readouts. Every readout produces a fresh scorecard, fresh variance classification, and fresh four-party attestation — or surfaces as an ADW alert if material drift is detected between scheduled windows.

2026-05-23 · 03:00Z

First ADW daily scan · automated · checks all 7 ADR-compliance + 3 shadow-IT re-emergence states

2026-06-14

Wave-1 deploy + production-credential confirmation target · owner Tom + Bill · opt-04 critical-path gate · 47 deferred measurements depend on this

2026-06-18

Board memo presented to Board of Directors · owner Lin + Maya · L01-evidence-anchored canonical churn number

2026-06-21

Devon shadow-suppression retirement attestation window · owner Devon + Sarah · deploy+7d grace · iv-04 close

2026-07-05

Wave-2 L03 deploy target (WS-007 · CS Ops self-serve maturity) · owner Tom + Sarah · single workstream · ~3-4 commits

2026-07-15

thirty_day L04 readout · all 4 parties · MUST close opt-01 (re-baseline change order) + opt-04 (credentials)

2026-07-26

Wave-3 L03 deploy target (WS-006 · process docs + drift-watch) · owner Tom

2026-09-15

ninety_day L04 readout · all 4 parties · steady-state rate calibration window

2027-01-15

Q4 FY26 close · quarterly L04 readout · success-metric measurement window · Lin + Bill joint methodology attestation · success metric measured against engagement-start baseline per AMB-026 (trailing-12-months)

How the discipline works

The outcomes above are credible because the mechanics below are structurally enforced.

Baseline lock at engagement start · variance classified faithfully · four-party joint attestation per readout · continuous observability post-deploy.

Baseline · Locked at L01 Signoff

The projection L04 measures against — captured before the build started.

Per Structural Rule 1: at L01 signoff, every promoted workstream's value_score block is pulled verbatim from the dossier into L04_BASELINE.yaml, the success metric is pulled verbatim from the L02 charter, and the file is hash-signed. The body of the baseline does not change during readouts. Revision requires a documented change order with prior-hash → new-hash transition and four-party re-attestation.

L04_BASELINE · v1 · sha256:demo-l04-baseline-…7f3c1b9e

UPSTREAM · L01 dossier · sha256:e6518d6e… · 7 workstreams promoted

UPSTREAM · L02 charter · sha256:dcb07909… · 17 ADRs · 40 acceptance criteria

METRIC · success_metric · 2pt mid-market churn reduction · Q4 FY26 close

POOL · pre_aurora_baseline.at_risk_arr_pool_usd · $3,800,000

FORCING · June 18, 2026 board memo · Lin Zhao + Bill Tennant joint methodology attestation

ATTESTED · 4 of 4 signers · 2026-05-23T01:30:00Z · baseline body locked
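The lock-and-revise rule can be illustrated with a short Python sketch. It is not Aurora's implementation. The dictionary layout, the function names, and the change-order id are hypothetical, and canonical JSON stands in for the YAML file so the example needs only the standard library.

```python
import hashlib
import json

# Signer roles named on this page; everything else here is hypothetical.
REQUIRED_SIGNERS = {"CFO", "CRO", "CEO", "Northbeam"}

def body_hash(body: dict) -> str:
    """Hash a canonical JSON serialization so key order cannot change the digest."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def revise_baseline(locked: dict, new_body: dict, change_order_id: str, signers: set) -> dict:
    """Return a new locked baseline, or raise if the revision is not fully attested."""
    if not change_order_id:
        raise ValueError("revision requires a documented change order")
    missing = REQUIRED_SIGNERS - set(signers)
    if missing:
        raise PermissionError(f"re-attestation incomplete, missing: {sorted(missing)}")
    revision = {
        "change_order": change_order_id,
        "prior_hash": locked["hash"],
        "new_hash": body_hash(new_body),
    }
    return {
        "body": new_body,
        "hash": revision["new_hash"],
        "revisions": locked["revisions"] + [revision],
    }

# Lock the pool value from the baseline above.
body = {"at_risk_arr_pool_usd": 3_800_000}
locked = {"body": body, "hash": body_hash(body), "revisions": []}
```

Calling `revise_baseline(locked, {"at_risk_arr_pool_usd": 4_620_000}, "CO-example", {"CFO", "CRO", "CEO", "Northbeam"})` returns a new baseline whose revision entry records the prior-hash to new-hash transition. The same call with fewer signers raises `PermissionError`, which is the sketch's stand-in for refusing silent revision.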

Variance Classification · 7-Category Rubric

Three variances. Each one its own category.
Mutual exclusion enforced via decision tree.

Per Structural Rule 6: every variance enters exactly one category. The classification path (workflow → spec → build → model → adoption → assumption → macro) is recorded for each entry. No free-text rationale can contradict the category.
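One way to picture mutual exclusion is an ordered walk in which the first "yes" ends the classification. The Python sketch below assumes that first-match behavior, which is consistent with the recorded paths on this page but is not confirmed as Aurora's actual logic. The function name and the shape of the `answers` input are hypothetical.

```python
# Category order as listed in the classification path above.
CATEGORY_ORDER = ["workflow", "spec", "build", "model", "adoption", "assumption", "macro"]

def classify(answers: dict) -> tuple:
    """Walk the categories in order; the first 'yes' wins.

    Returns (category, path), where path records each question asked,
    so the decision-tree path can be stored alongside the variance.
    """
    path = []
    for category in CATEGORY_ORDER:
        hit = bool(answers.get(category, False))
        path.append((category, hit))
        if hit:
            return category, path
    raise ValueError("variance matched no category; it cannot be left unclassified")

# A variance where both 'model' and 'adoption' could apply still lands in one category.
category, path = classify({"model": True, "adoption": True})
print(category, path)
```

Because the walk stops at the first match, a variance that could plausibly fit two categories still receives exactly one, and the recorded path shows which questions were answered "no" before it.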

v-01 · assumption · positive · medium severity

+$820K addressable

At-risk ARR pool grew during the Q3-Q4 cohort window.

Projection (L01 baseline): $3.8M ARR-at-risk pool, per L01 src_slack05 T_DM_001:4 (Riya 2026-05-29 DM to Devon).
Realized (pilot observation): $4.62M observed exposure in the 2-week pilot cohort. Delta +$820K (+21.6%) above projection.

Decision-tree path: workflow? no · spec? no · build? no · model? no · adoption? no · assumption? YES — L01 baseline carried the assumption "ARR-at-risk pool is materially stable at ~$3.8M." Q3-Q4 cohort observed a larger pool. Per rubric: "an L01 confidence_interval.assumptions[] entry proved false" → classify as assumption.

Why positive direction: A larger pool is a larger addressable opportunity, not a larger problem. Treatment: Re-baseline at 30-day readout via documented change order (opt-01).

v-02 · model · positive · critical magnitude

+$2.84M pilot pro-rata

The AI driver-attribution + dispatch combination materially exceeded the L01 conservative attribution split.

Projection (pilot pro-rata): Combined WS-002 + WS-003 → ~$100K of attributable retained ARR/margin for the 2-week window.
Realized: $2.94M ARR retained. In 2 weeks the pilot captured 77% of the $3.8M at-risk ARR pool projected at L01, a +$2.84M delta over the ~$100K pro-rata projection.

Decision-tree path: workflow? no · spec? no (L02 acceptance criteria satisfied as written) · build? no (L03 73/73 tests passing) · model? YES — WS-002 attribution + WS-003 dispatch combined at the upper edge of L01's confidence interval, AND L01's conservative per-workstream attribution distributed value cross-workstream in a way that under-attributed the combined AI loop. Adoption considered as enabler — but per mutual-exclusion: classify at the locus where value materialized → model.

⚠ Why this is NOT "L01 sandbagged": The L01 projection deliberately distributed value conservatively across workstreams. The realized number is not the projection rewritten; the projection's conservatism explains why the realized figure is higher. Treatment: Conservative pilot extrapolation discipline through quarterly readouts (opt-02, the IP-defensibility recommendation).

v-03 · build · positive · low severity

qualitative

ADR-016 type-level structural enforcement exceeded aspirational spec.

Projection (L02 charter): Behavioral requirement that external customer-facing actions require human confirmation.
Realized (L03 build): Type-level structural enforcement — human_confirmation: HumanConfirmation is a required parameter on every external-action function signature; PermissionError raised on confirmed=False. Forbidden-pattern grep confirms no bypass paths.

Decision-tree path: workflow? no · spec? no (the spec was correct as written) · build? YES — L03 implemented the spec at a stronger structural level than required. Treatment: noted; positive build variance with no remediation required.
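The enforcement pattern described for v-03 can be sketched in Python as follows. Only `HumanConfirmation`, the `human_confirmation` parameter, `confirmed`, and `PermissionError` come from the page. The `confirmed_by` field and the `send_customer_email` function are hypothetical stand-ins for an external action.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HumanConfirmation:
    confirmed: bool
    confirmed_by: str  # hypothetical field, not named in the source

def send_customer_email(account_id: str, body: str, *,
                        human_confirmation: HumanConfirmation) -> str:
    """Hypothetical external action.

    The keyword-only parameter has no default, so a call that omits it
    fails with TypeError before any side effect can happen.
    """
    if not human_confirmation.confirmed:
        raise PermissionError("external action requires human confirmation")
    return f"sent to {account_id}"

ok = HumanConfirmation(confirmed=True, confirmed_by="csm-on-call")
print(send_customer_email("acct-1", "Checking in", human_confirmation=ok))
```

The design point is that confirmation is part of the signature rather than a convention: a caller cannot reach the external action without constructing a `HumanConfirmation`, and a static type checker would flag a call that passes anything else.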

The pre-attestation audit detected zero reverse-fit narratives. Auditor signature auditor-pilot-v1-clean attached to readout body. Every variance entry cites: (a) the projection field, (b) the realized observation, (c) the rubric criterion applied.

Four-Party Joint Attestation · Structural Rule 3

Four parties. One readout. One hash.

Per Structural Rule 3: every readout requires four-party attestation. CFO (commercial economics) · CRO (commercial go-to-market) · CEO (strategic) · Northbeam (methodology). All four signatures, or no readout ships. Each signature is hash-locked to the readout body.
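A minimal Python sketch of "hash-locked to the readout body" follows, under the assumption that each signer attests to a SHA-256 digest of the readout text. The function names and the shape of `signatures` are hypothetical, and real signatures would be cryptographic rather than bare digests.

```python
import hashlib

REQUIRED_ROLES = ("CFO", "CRO", "CEO", "Northbeam")

def readout_hash(readout_body: str) -> str:
    """Digest of the exact readout text that signers attest to."""
    return hashlib.sha256(readout_body.encode("utf-8")).hexdigest()

def can_ship(readout_body: str, signatures: dict) -> bool:
    """signatures maps role -> the readout hash that role signed.

    The readout ships only if every required role signed the current body.
    """
    current = readout_hash(readout_body)
    return all(signatures.get(role) == current for role in REQUIRED_ROLES)

body_v1 = "pilot readout v1"
signed = {role: readout_hash(body_v1) for role in REQUIRED_ROLES}
print(can_ship(body_v1, signed))                  # all four signed this body
print(can_ship(body_v1 + " (edited)", signed))    # body changed after signing
```

Editing the body after signing changes its digest, so the existing signatures no longer match and the edited readout would need fresh attestation from all four parties.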

✓

Lin Zhao

CFO · Commercial economics

"I attest to the realized economics. I accept that the $2.94M ARR-retained is the pilot-window observation and is NOT defensibly annualizable yet. I see and acknowledge the 5 deferred_pending_real_systems entries."

✓

Diego Martinez

CRO · Commercial GTM

"Sales↔CS coordination outcomes reflect post-ADR-003 mediation: zero broken-data escalations reached me or Alex in the pilot window. I accept variance v-02 classification as model (AI at upper edge), not adoption."

✓

Maya Chen

CEO · Strategic

"This pilot readout reflects faithful measurement against the L04 immutable baseline. I authorize opt-01 through opt-04 to feed Wave-2 + Wave-3 build decisions and the 30-day readout."

✓

Bill Tennant

Northbeam · Methodology

"Three variances classified deterministically. The aggressive positive v-02 magnitude is explained against the L01 conservative split rather than re-fit. Conservative extrapolation discipline (opt-02) is the methodology integrity statement."

The discipline beat — what the attestation refuses to claim.

The four signers are deliberately precise about what is and is not attested. This is the credibility statement.

NOT · the pilot save rate (63.7%) will sustain across subsequent cohorts

NOT · the $2.94M ARR retained is annualizable

NOT · the 5 deferred measurements have been measured (they're explicitly deferred)

NOT · the 2pt churn-reduction success metric has been hit (measured at Q4 FY26 close)

✓ Attesting: the mechanism is live, the variances are classified faithfully, the trajectory framing for the board is grounded in measurement.

Optimization Recommendations

Three forward actions. Each with owner, window, and blocking status.

opt-01 · assumption variance · BLOCKING

owner: Tom + Lin · target: 30-day · BLOCKING

Re-baseline at_risk_arr_pool at thirty_day readout (v-01)

At the 30-day readout, query the full live Snowflake cohort and re-baseline the at-risk ARR pool to the observed $4.62M value. Per Structural Rule 1, this requires a documented change order recorded in L04_BASELINE.yaml § baseline.baseline_revisions[] with prior hash, new hash, and four-party joint re-attestation. No silent re-baselining. The success-metric arithmetic at Q4 FY26 close depends on a stable denominator.

opt-02 · model variance · BOARD CREDIBILITY

owner: Bill + Sarah · target: 30-day → quarterly

Hold pilot extrapolation conservatively (v-02)

The pilot's strong positive variance is exciting but unstable — subsequent cohorts will normalize because the most-at-risk accounts engage first. Annualizing the pilot rate now and then showing normalization at the 90-day readout creates a board-confidence problem you don't want to manage. Conservative extrapolation protects the credibility of every subsequent readout to the board. The disciplined narrative: "Pilot captured 77% of the L01 ARR-at-risk addressable in 2 weeks; 30-day and 90-day readouts will calibrate the steady-state rate."

opt-04 · deferred measurements · BLOCKING

owner: Tom + Bill · target: by 2026-06-14 · SINGLE LARGEST GATE

Production-credential confirmation

Surface Hightouch + Gainsight + Snowflake + Salesforce + Slack production credentials by Wave-1 deploy. Without resolution by 30-day readout, ~47 measurements remain deferred AND Risk R8 materializes. L04's structural integrity rests on honest-deferral discipline (Structural Rule 4). Honesty is sustained only if the credential gap is actively chased. Quarterly readouts on deferred measurements force the question: "Why hasn't instrumentation been built?"

Aurora Dependency Watch · Armed at Pilot Attestation

Continuous structural observability.

ADW extends past the L04 readout into continuous post-deploy observation. Every L02 ADR with a structural enforcement clause becomes a continuous check. Every L01 shadow-IT gap becomes a re-emergence check. Alerts are themselves citation-anchored — an alert without an evidence ref is rejected.

7 ADR-compliance checks

✓ PASS · ADR-001 · no parallel-definition references in production

✓ PASS · ADR-003 · disclosure protocol before any external publication

✓ PASS · ADR-006 · canonical YAML hash matches runtime on every refresh

✓ PASS · ADR-008 · no opaque-ML imports (torch/tensorflow/keras/transformers)

✓ PASS · ADR-015 · no bare-LLM calls in narrative synthesizer

✓ PASS · ADR-016 · human-confirmation on every external action

✓ PASS · ADR-017 · no validation-gate bypass paths
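As an illustration of how a check like ADR-008 could run continuously, the Python sketch below scans source text for imports of the four listed modules using the standard-library `ast` module. This is an assumed approach, not ADW's documented mechanism, and it would not catch dynamic imports such as `importlib.import_module`.

```python
import ast

# Module names taken from the ADR-008 line above.
FORBIDDEN_MODULES = {"torch", "tensorflow", "keras", "transformers"}

def forbidden_imports(source: str) -> list:
    """Return the sorted forbidden top-level modules imported by the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # module is None for "from . import x"
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in FORBIDDEN_MODULES:
                found.add(root)
    return sorted(found)

sample = "import numpy\nfrom torch.nn import Linear\nimport keras as k\n"
print(forbidden_imports(sample))
```

A check passes when the returned list is empty for every file in scope; a non-empty list names the modules to cite in the resulting alert.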

Shadow-IT re-emergence + alert routing

✓ PASS · Riya's Notion read-only + redirect to canonical

✓ PASS · Diego SF renewal model deprecated · folded into canonical

⏸ DEFER · Devon suppression retirement (deploy+30d window)

Alert routing · 4-tier

info

watch

material

critical

ADR violations + shadow-IT re-emergence = always material-tier or higher, regardless of variance threshold. Per Structural Rule 7, the system cannot quietly tolerate a pattern the team explicitly rejected.
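The two routing rules stated in this section, a material-tier floor for structural findings and rejection of alerts that carry no evidence reference, can be sketched in Python as follows. The tier names come from the page. The `kind` strings and the function signature are hypothetical.

```python
from enum import IntEnum

class Tier(IntEnum):
    INFO = 0
    WATCH = 1
    MATERIAL = 2
    CRITICAL = 3

# Hypothetical labels for the two alert kinds that carry a tier floor.
STRUCTURAL_KINDS = {"adr_violation", "shadow_it_reemergence"}

def route_alert(kind: str, variance_tier: Tier, evidence_refs: list) -> Tier:
    """Return the tier an alert is routed at.

    Alerts with no evidence reference are rejected outright. Structural
    alerts are raised to at least MATERIAL, whatever the variance threshold says.
    """
    if not evidence_refs:
        raise ValueError("alert rejected: no evidence reference attached")
    if kind in STRUCTURAL_KINDS:
        return max(variance_tier, Tier.MATERIAL)
    return variance_tier

print(route_alert("adr_violation", Tier.INFO, ["ADR-017"]).name)
```

Using an ordered enum keeps the floor a one-line comparison, and an alert already at critical stays critical.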

Aurora Chain · Fully Sealed

Four layers. Four hashes.
One audit-grade chain.

Each layer handed a signed artifact to the next. Every hash was verified at the downstream phase entry, or the downstream layer halted before code landed. The chain that started at L01 with seven workstreams and a synthesized current-state map closes at L04 with a four-party attested readout, and continues running under ADW until the success-metric measurement window.

L01 · Discover
Workflow Intelligence Dossier · signed
sha256:e6518d6e…
L02 · Specify
Charter · 17 ADRs · locked
sha256:dcb07909…
L03 · Build
Wave 1 · 73 tests · verified
sha256:a8e3c120…
L04 · Prove
Baseline locked · pilot attested · ADW armed
sha256:demo-l04-baseline-…7f3c1b9e

Aurora · End-to-End

Four layers. One signed chain.
See the full methodology in motion.

L01 surfaces the real workflow. L02 turns it into a binding spec. L03 ships the system with an independent verifier on every commit. L04 measures realized value against the projection. The master cinematic walks the chain end-to-end.

Discuss a Custom Engagement →
