# Test plan — §13 canon decision + checkpoint redesign (per-holder scoring, COMPARE/UNION, dirty-bit checkpoints)

TDD: tests written FIRST. Almost all of this is **design, not built** → expect mostly RED. A good RED is an
**assertion failure on observable behaviour** or a clean `typeof x === 'function'` guard, NEVER a crash.
Black-box first (converged state + `network.log()` + decoder reads); guard any internal/unbuilt API probe.

Read `research/synced-clock/DESIGN_PARTICIPATION.md` §13.3, §13.3.1, §13.4, §13.6, §13.6.1, §13.7 first.
The previous round's plan (`tests/DESYNC_SEND_MODEL_TEST_PLAN.md`) has the harness conventions — reuse them.

## Assumed (unbuilt) API names — use these EXACTLY so tests flip together when built
- Umbrella opt-in: **`knowledgeDrivenSend: true`** (constructor) gates the whole new model.
- **`canonResolution: 'compare' | 'union'`** (constructor, default `'compare'`) — the §13.6 resolution variant.
  Session-wide: all peers in a harness use the same value.
- Pure module `research/synced-clock/CanonDecision.js` EXISTS (`canonProduct`, `canonWinner`) — built last round.
- Proposed reads (guard with `typeof`): `peer.isOwnInputCorroborated(tick)`, `peer.corroboratedOwnCount()`
  (built last round); and NEW ones this round may probe — `peer.checkpointKnown(peerId, tick)` /
  dirty state, `decoder.removeChange(tick)` returning `{changed, earliestAffectedTick}`, a per-holder scoring
  helper. If a name doesn't exist, the test fails cleanly on the `typeof` guard — that's an acceptable RED.
- Wire types: `em-input`, `em-input-ack`, `em-beat`, `em-assert`. (A checkpoint/assert ack type may not exist
  yet — probe the log defensively.)
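
The guarded-probe rule above can be sketched as a small helper, so that an unbuilt name produces a clean RED rather than a `TypeError`. This is only an illustration: `probe` is a hypothetical helper name, and the stub peers stand in for real harness peers.

```js
// Hypothetical helper (not part of the engine): call an optional method only
// if it exists. Returns { present: false } instead of throwing when unbuilt.
function probe(target, name, ...args) {
  if (target == null || typeof target[name] !== 'function') {
    return { present: false, value: undefined };
  }
  return { present: true, value: target[name](...args) };
}

// Stub peers for illustration only; real peers come from the harness.
const builtPeer = { corroboratedOwnCount: () => 3 };
const unbuiltPeer = {};

const built = probe(builtPeer, 'corroboratedOwnCount');
const unbuilt = probe(unbuiltPeer, 'checkpointKnown', 'B', 120);
console.log(built, unbuilt);
```

In a vitest test, asserting `expect(result.present).toBe(true)` before using `result.value` turns a missing API into an assertion failure, which is the acceptable kind of RED.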

## Conventions (same as last round)
- One new file per category: `tests/desync-canon-<slug>.test.js` (DESYNC bucket; run via
  `DESYNC_SPECS=1 NO_COLOR=1 npx vitest run tests/desync-canon-<slug>.test.js`).
- `factory = (transport, opts) => new SimulationEngine(transport, opts)`; `PeerHarness`; `summingStep`,
  `scripted`, base config with `recovery:true, syncedTick:true, synced:true, inputForwarding:true,
  knowledgeDrivenSend:true`. For tests where membership must affect state, write a `step` that **uses the
  participant set** (e.g. only sums inputs from currently-participating ids) — see D6.
- Message observability via `h.network.log()` (each entry `{at, from, to, event:'deliver'|'drop', msg}`).
- Each test = NL doc comment (WHAT / WHY-it-falsifies / what-RED-means) THEN code. Try to BREAK the design.
- DO NOT touch: `vitest.config.js`, `package.json`, existing test files, `tests/graph-pacman-scenarios.test.js`,
  `New thoughts..txt`. Create only your one file. Run it; report red/green tally + confirm clean execution.
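
A minimal sketch of the log-based observability convention, assuming `h.network.log()` returns an array of the entries described above. The `msg.type` field is an assumption (the plan only fixes the entry shape), and `countDelivered` is a hypothetical helper name.

```js
// Count delivered messages per wire type from a list of log entries.
// ASSUMPTION: msg.type holds the wire type (e.g. 'em-input'); the plan only
// fixes the entry shape { at, from, to, event, msg }.
function countDelivered(log, { since = 0 } = {}) {
  const counts = {};
  for (const entry of log) {
    if (entry.event !== 'deliver' || entry.at < since) continue;
    const type = entry.msg && entry.msg.type ? entry.msg.type : 'unknown';
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}

// Hand-written sample log; a real test would pass h.network.log() instead.
const sampleLog = [
  { at: 1, from: 'A', to: 'B', event: 'deliver', msg: { type: 'em-input' } },
  { at: 2, from: 'B', to: 'A', event: 'deliver', msg: { type: 'em-input-ack' } },
  { at: 3, from: 'A', to: 'C', event: 'drop', msg: { type: 'em-input' } },
  { at: 9, from: 'A', to: 'B', event: 'deliver', msg: { type: 'em-beat' } },
];

const allCounts = countDelivered(sampleLog);
const lateCounts = countDelivered(sampleLog, { since: 5 });
console.log(allCounts, lateCounts);
```

The `since` cutoff is useful for steady-state checks: count only what was delivered after warm-up.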

## Categories (one sub-agent each)

### D1 — Per-holder verifiable scoring  → `desync-canon-scoring.test.js`
§13.6. Pin: a checkpoint is scored from ITS HOLDER's seat — an input authored by `p ≠ holder` counts as
corroborated **verifiably from content** (no flag needed); the holder's OWN inputs count per a **per-input**
frozen flag. So **B holding A's inputs is a stronger witness than A**: B's checkpoint counts A's input as
corroborated, while A (its ack lost) counts the same input as 0. Observer-independence: A, B, and a bystander C all compute the *same* score for a given
checkpoint. The product/winner come from `CanonDecision.js` (built) but the per-holder VECTOR ASSEMBLY is new —
test the assembly (a helper if exposed; else end-to-end that the holder's higher count wins). Falsify the OLD
single-count / author-asserts-everywhere model (it would under-count B's holdings).
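
The per-holder rule can be sketched as a pure function of the checkpoint's content, which is what makes the score observer-independent. The checkpoint shape and the name `scoreFromHolderSeat` are assumptions for illustration; the sketch deliberately does not call `canonProduct` or `canonWinner`, since their signatures are not given here.

```js
// ASSUMED shape: checkpoint = { holder, inputs: [{ author, tick, corroborated }] }.
// `corroborated` is the per-input frozen flag; it only matters for inputs the
// holder authored itself.
function scoreFromHolderSeat(checkpoint) {
  const perAuthor = {};
  for (const input of checkpoint.inputs) {
    const counts =
      input.author !== checkpoint.holder || // held by a non-author: verifiable from content
      input.corroborated === true;          // holder's own input: needs its frozen flag
    if (counts) {
      perAuthor[input.author] = (perAuthor[input.author] || 0) + 1;
    }
  }
  return perAuthor;
}

// A authored an input at tick 10. B holds it, but B's ack to A was lost,
// so A's frozen flag for that input is still false.
const checkpointA = { holder: 'A', inputs: [{ author: 'A', tick: 10, corroborated: false }] };
const checkpointB = { holder: 'B', inputs: [{ author: 'A', tick: 10, corroborated: false }] };

const scoreA = scoreFromHolderSeat(checkpointA);
const scoreB = scoreFromHolderSeat(checkpointB);
console.log(scoreA, scoreB);
```

Because the function reads nothing but the checkpoint, A, B, and C all get the same vector for `checkpointB`, and B's vector is the one that credits A's input.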

### D2 — COMPARE vs UNION resolution  → `desync-canon-resolution.test.js`
§13.6/§13.7. COMPARE (default): converge to ONE winner's corroborated set; a loser's unique corroborated input
is lost. UNION: converge to the UNION of all corroborated inputs; NO known-corroborated input lost. The
two-subnet example (A–B bridge): UNION → everyone gets both subnets' corroborated inputs; COMPARE → one
subnet's set wins, the other is overridden. Both DROP uncorroborated inputs. Uniformity guard: a harness where
peers use *different* `canonResolution` must NOT be asserted to converge (document it as a divergence; do not
silently expect convergence). Drive via the harness `canonResolution` flag.
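
The difference between the two variants can be shown on the two-subnet example with a toy resolver. Everything here is an assumption for illustration: the candidate shape, the `'author@tick'` keys, and especially the stand-in winner rule (largest set, ties broken by holder id), which is not the §13.6 rule.

```js
// ASSUMED shape: candidate = { holder, corroborated: ['author@tick', ...] }.
// Stand-in winner rule for COMPARE: largest set wins, ties broken by holder id.
function resolveCanon(candidates, mode) {
  if (mode !== 'compare' && mode !== 'union') {
    throw new Error(`unknown canonResolution: ${mode}`);
  }
  if (candidates.length === 0) return [];
  if (mode === 'union') {
    const all = new Set();
    for (const c of candidates) for (const key of c.corroborated) all.add(key);
    return [...all].sort();
  }
  const byStrength = (x, y) => {
    const diff = y.corroborated.length - x.corroborated.length;
    if (diff !== 0) return diff;
    return x.holder < y.holder ? -1 : x.holder > y.holder ? 1 : 0;
  };
  const winner = [...candidates].sort(byStrength)[0];
  return [...winner.corroborated].sort();
}

// Two subnets that each corroborated inputs the other never saw.
const subnets = [
  { holder: 'A', corroborated: ['A@5', 'C@7'] },
  { holder: 'B', corroborated: ['B@6'] },
];

const unionCanon = resolveCanon(subnets, 'union');     // keeps every corroborated input
const compareCanon = resolveCanon(subnets, 'compare'); // one subnet's set wins
console.log(unionCanon, compareCanon);
```

Uncorroborated inputs never enter either candidate set, so both variants drop them.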

### D3 — Predicate reconcile + tombstoning  → `desync-canon-reconcile.test.js`
§13.7 + `SparseInputDecoder`. Reconcile to canon via the NORMAL input-apply path: a moot canon change causes NO
rollback; a real one rolls back from the EARLIEST-affected tick (which may be ≫ S), not blindly to S. The skip
condition is `myInputs == canon` (full set) — a node that "won" but held uncorroborated extras still erases
them. **Removal**: reconcile can REMOVE inputs; `SparseInputDecoder.removeChange(tick)` must return
`{changed, earliestAffectedTick}` symmetric to `applyChange` (a removal reverting to the value held *entering*
that tick is moot). **Tombstone**: a removed input is RETAINED (resolvable by tick) within the history window,
not hard-deleted — so a later reference resolves without a request. Pure decoder-level tests are great here.
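
A toy model of the removal and tombstone semantics, to make the "moot removal" case concrete. `ToySparseDecoder` is a hypothetical stand-in, not `SparseInputDecoder`: it assumes a single value stream, reports `earliestAffectedTick: null` for a moot change, and omits history-window pruning.

```js
// Toy stand-in for illustration only; the real SparseInputDecoder may differ.
class ToySparseDecoder {
  constructor(initial = 0) {
    this.initial = initial;
    this.changes = new Map();    // tick -> value (live changes)
    this.tombstones = new Map(); // tick -> value (removed, still resolvable)
  }

  // Value held when entering `tick`: the latest live change strictly before it.
  valueEntering(tick) {
    let bestTick = -Infinity;
    let value = this.initial;
    for (const [t, v] of this.changes) {
      if (t < tick && t > bestTick) { bestTick = t; value = v; }
    }
    return value;
  }

  applyChange(tick, value) {
    const before = this.changes.has(tick) ? this.changes.get(tick) : this.valueEntering(tick);
    this.changes.set(tick, value);
    this.tombstones.delete(tick);
    const changed = before !== value;
    return { changed, earliestAffectedTick: changed ? tick : null };
  }

  removeChange(tick) {
    if (!this.changes.has(tick)) return { changed: false, earliestAffectedTick: null };
    const removed = this.changes.get(tick);
    this.changes.delete(tick);
    this.tombstones.set(tick, removed); // retained, not hard-deleted
    const changed = removed !== this.valueEntering(tick);
    return { changed, earliestAffectedTick: changed ? tick : null };
  }

  // Resolve a value by tick from live changes first, then tombstones.
  resolve(tick) {
    return this.changes.has(tick) ? this.changes.get(tick) : this.tombstones.get(tick);
  }
}

const decoder = new ToySparseDecoder(0);
decoder.applyChange(10, 1);
decoder.applyChange(20, 1);                 // repeats the value already held
const mootRemoval = decoder.removeChange(20); // reverts to the same value: moot
const realRemoval = decoder.removeChange(10); // reverts 1 -> 0: real
console.log(mootRemoval, realRemoval, decoder.resolve(10));
```

Only `realRemoval` would justify a rollback, and it names tick 10 as the place to roll back from.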

### D4 — Checkpoint reliability + content-hash versioning + freeze-optional  → `desync-canon-reliability.test.js`
§13.3.1 + §13.6. A checkpoint is **dirty to a peer** when (1) new or (2) its hash changed; an ack of a specific
`(checkpoint, hash)` clears the bit. **Headline falsifier:** a stale-hash ack (acking the OLD hash after the
checkpoint changed) must NOT clear the dirty bit. **Content-hash versioning:** two checkpoints with the SAME
result state but DIFFERENT corroborated-input sets must get DIFFERENT version hashes (else a stale-input ack
wrongly clears dirty) — the `left+right` vs `nothing` net-zero case. Resend is delta (only dirty), silent in
steady state, bounded by the history window. **Freeze-optional:** a corroboration update is a content change →
re-broadcast → re-converge with no deadlock; corroboration is METADATA (an update must NOT move the
result-state hash → no phantom desync / no false B8). Assert both: update re-propagates AND result hash steady.
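
The dirty-bit and versioning rules can be sketched together. This is a toy under stated assumptions: `DirtyTracker`, `resultKey`, and `versionKey` are hypothetical names, and the "hashes" are canonical JSON strings standing in for real digests so the example stays dependency-free.

```js
// Stand-ins for hashes: canonical strings. A real build would digest these.
// resultKey covers ONLY the result state; versionKey also covers the
// corroborated-input set (metadata), so it moves when corroboration changes.
const resultKey = (cp) => JSON.stringify(cp.state);
const versionKey = (cp) =>
  JSON.stringify({ state: cp.state, inputs: [...cp.corroborated].sort() });

class DirtyTracker {
  constructor() {
    this.current = new Map(); // checkpointId -> current version
    this.acked = new Map();   // `${peer}|${checkpointId}` -> acked version
  }
  publish(checkpointId, version) {
    this.current.set(checkpointId, version);
  }
  ack(peer, checkpointId, version) {
    // Only an ack of the CURRENT version clears the bit; stale acks are ignored.
    if (this.current.get(checkpointId) === version) {
      this.acked.set(`${peer}|${checkpointId}`, version);
    }
  }
  isDirty(peer, checkpointId) {
    return this.current.has(checkpointId) &&
      this.acked.get(`${peer}|${checkpointId}`) !== this.current.get(checkpointId);
  }
}

// Net-zero case: same result state, different corroborated-input sets.
const nothing = { state: 0, corroborated: [] };
const leftRight = { state: 0, corroborated: ['right@5', 'left@5'] };

const tracker = new DirtyTracker();
tracker.publish('cp@100', versionKey(nothing));
const dirtyWhenNew = tracker.isDirty('B', 'cp@100');
tracker.ack('B', 'cp@100', versionKey(nothing));
const cleanAfterAck = !tracker.isDirty('B', 'cp@100');

tracker.publish('cp@100', versionKey(leftRight)); // content changed
tracker.ack('B', 'cp@100', versionKey(nothing));  // stale-hash ack
const dirtyAfterStaleAck = tracker.isDirty('B', 'cp@100');
tracker.ack('B', 'cp@100', versionKey(leftRight));
const cleanAfterFreshAck = !tracker.isDirty('B', 'cp@100');
```

Resend then reduces to iterating the `(peer, checkpoint)` pairs for which `isDirty` is true, which is empty in steady state.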

### D5 — Delta-encoded checkpoint inputs + reference disambiguation  → `desync-canon-delta-inputs.test.js`
§13.3. For an input the peer is evidenced (`known[peer][source]`) to hold, the checkpoint carries only a
`(source, tick)` **reference**, not the value; full data only for the delta the peer lacks. The receiver
resolves a reference from its own decoder by tick; an input removed by a reconcile is resolved from its
**tombstone**; an unresolvable reference (history-window skew) is requested like a missing input. Pin the
**reference-disambiguation invariant**: exactly ONE authored value per `(player, tick)`; liveness/disconnect is
NOT a stream entry, so "A disconnected at 200" does not collide with A's real input at `(A,200)`. Falsify a
naive full-input checkpoint (it ships values the peer already has).
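
A minimal sketch of the reference-versus-value encoding and the receiver-side resolution. The input shape, the `'source@tick'` key format, and the function names are assumptions; `lookup` stands for whatever the receiver uses to resolve a tick from live decoder entries or tombstones.

```js
// ASSUMED shapes: inputs = [{ source, tick, value }];
// peerHolds = Set of 'source@tick' keys the peer is evidenced to hold.
function encodeForPeer(inputs, peerHolds) {
  return inputs.map(({ source, tick, value }) =>
    peerHolds.has(`${source}@${tick}`)
      ? { source, tick }          // reference only
      : { source, tick, value }); // full data for the delta the peer lacks
}

// lookup(source, tick) returns the receiver's own value, or undefined.
function decodeAtPeer(entries, lookup) {
  const resolved = [];
  const missing = []; // unresolvable references, to be requested like missing inputs
  for (const entry of entries) {
    if ('value' in entry) { resolved.push(entry); continue; }
    const value = lookup(entry.source, entry.tick);
    if (value === undefined) missing.push({ source: entry.source, tick: entry.tick });
    else resolved.push({ ...entry, value });
  }
  return { resolved, missing };
}

const checkpointInputs = [
  { source: 'A', tick: 200, value: 'left' },
  { source: 'B', tick: 201, value: 'right' },
  { source: 'C', tick: 150, value: 'up' },
];
// The sender believes the peer holds A@200 and C@150...
const wire = encodeForPeer(checkpointInputs, new Set(['A@200', 'C@150']));
// ...but the peer's history window no longer contains C@150.
const peerStore = new Map([['A@200', 'left']]);
const decoded = decodeAtPeer(wire, (source, tick) => peerStore.get(`${source}@${tick}`));
console.log(wire, decoded);
```

The scheme relies on the disambiguation invariant: a `(source, tick)` key names at most one authored value, so a bare reference is never ambiguous.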

### D6 — Liveness in the checkpoint + grow-only-max reconcile  → `desync-canon-liveness.test.js`
§13.3. Use a **membership-affecting `step`** (only count inputs from currently-participating players, with
`disconnect → leave`). Two peers with DIFFERENT attendance-derived disconnect ticks compute different membership
→ different state → a desync that is **only repairable if the checkpoint carries the liveness signal**. Assert:
the per-player last-heard/disconnect tick rides in the checkpoint and reconciles by **grow-only-max** (the later
proof-of-life wins, always composes); a node holding a stale-low value rolls back, re-derives membership, and
all converge. A's real input also serves as proof-of-life that corrects a mistaken disconnect. (Recovery.js's
`mergeLastAttendanceTicks` is the grow-only-max core — some of this may be partly GREEN; the checkpoint-carrying
part is RED.)
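
The grow-only-max reconcile can be illustrated with a plain per-player merge. This is a sketch of the idea only, not `mergeLastAttendanceTicks` from Recovery.js, whose signature is not given here; `mergeLastHeard` and the `{ playerId: tick }` shape are assumptions.

```js
// Merge two per-player last-heard maps by taking the later tick for each player.
// ASSUMED shape: { [playerId]: lastHeardTick }.
function mergeLastHeard(local, remote) {
  const merged = { ...local };
  for (const [player, tick] of Object.entries(remote)) {
    merged[player] = Math.max(merged[player] ?? -Infinity, tick);
  }
  return merged;
}

// The local node holds a stale-low value for A; the checkpoint carries a later one.
const localView = { A: 180, B: 300 };
const fromCheckpoint = { A: 200, C: 50 };

const mergedForward = mergeLastHeard(localView, fromCheckpoint);
const mergedReverse = mergeLastHeard(fromCheckpoint, localView);
const mergedTwice = mergeLastHeard(mergedForward, fromCheckpoint);
console.log(mergedForward);
```

Taking the maximum makes the merge commutative and idempotent, which is why it "always composes" regardless of delivery order or duplicates. A node whose value for A moved (here 180 to 200) is the one that must roll back and re-derive membership.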

### D7 — Acceptance / whole-system SPIRIT  → `desync-canon-acceptance.test.js`  (implement the spec below faithfully)
The end-to-end behaviours the §13 redesign must achieve. Lossy, multi-peer, several seeds. Mostly RED.
1. **HOLDER RESCUES A CORROBORATED-BUT-UNPROVABLE INPUT (headline).** `b` authored by B, held by B and C; B's
   ack from C is lost so B cannot prove it. End-to-end, `b` **survives** (the converged state is closer to the
   "with `b`" outcome than to "without") because C, a holder, verifiably champions it; `b` is NOT destroyed.
   This is the central correctness win over author-asserts.
2. **COMPARE converges to ONE canon under loss** (3–4 peers, 10–25% drop, all `sumOf` equal across seeds);
   **UNION keeps ALL corroborated** in the two-subnet topology (a corroborated input unique to one subnet
   survives under UNION, may be lost under COMPARE).
3. **LONE STRAGGLER dropped, replicated input never destroyed.** A truly-lone input (held only by its author)
   is dropped (network beats straggler); contrast it with a replicated input that survives — no destruction.
4. **DISCONNECT-TIMING divergence repairs.** Membership-affecting step; two peers briefly disagree on a
   disconnect tick → diverge → repair to identical state once the liveness signal reconciles (grow-only-max).
5. **CHECKPOINT TRAFFIC SILENT IN STEADY STATE.** On a clean link after warm-up, checkpoint/ack traffic per
   node → ~0 (dirty-bit; no blind re-broadcast). And a corroboration update re-converges WITHOUT a phantom
   desync (result state stays equal across peers throughout).
6. **DETERMINISM under adversarial ordering** (jitter/reorder, multiple seeds → identical canon) and
   **PARTITION + HEAL** (a partitioned peer diverges past grace, heals, all re-converge).