# Participation — join, leave, disconnect, reconnect as one announced-set model

Status: SPEC (agreed 2026-06-05). Design-stage, **not yet implemented**. Consolidates the
cross-session brainstorming on who counts as a player, how they join and leave, and how the network
liveness of a player relates to their membership. Supersedes every earlier framing in which
"participant = has an input array" or in which join/leave were one-dimensional `startParticipating`/
`stopParticipating` signals.

## 1. The core mental model

> A participant is an id the engine has *announced* to the game and not since *denounced*. It is
> derived simulation state — recomputed on rollback — never a mutable side-register.

Three facts follow from that one sentence, and almost every subtlety below is a consequence of them:

1. **Membership is derived, not stored.** "Who is a participant at tick T" is a pure function of the
   finalized signal stream up to T (joins, leaves, disconnect ticks) intersected with the engine's
   announce decisions. Rollback re-runs that function and gets the same answer. There is no
   authoritative `Set` that a message mutates out of band — that would diverge the instant a rollback
   re-ran the tick.
2. **Having an input array ≠ being a participant.** A NON-participant can carry an input array — it
   just means "trying to join" (e.g. the loser of a join race who pressed the button but wasn't
   selected). Conversely a participant's avatar persists in the hashed game state even when their
   input array has been garbage-collected. The array exists for *rollback*, not for *membership*.
3. **Network-liveness (`IsDisconnected`) is orthogonal to membership.** Being connected and being a
   participant are two independent axes; all four combinations are real (see §6).

### Two-level vocabulary (DECISIONS.md #12)

Public API says **player**; engine-internal says **participant**. This doc uses *participant* for the
mechanism and *player* when talking about the game-facing surface. They denote the same entity.

## 2. The three verbs (two engine-offered, one game-initiated)

All membership change flows through exactly three calls. Two are the engine *reporting* and the game
*consuming*; one is the game *commanding*.

```
  discoverParticipants(predicate, limit)   engine offers candidates; consuming = ANNOUNCE (promote)
  getStoppedParticipating()                engine reports departures; consuming = DENOUNCE (demote)
  releaseParticipant(id)                   game commands a denounce; immediate, unconditional
```

`discover`/`release` are deliberately an acquire/release pair. The asymmetry between the two
engine-offered verbs is **principled, not accidental** — see §4.

### 2.1 `discoverParticipants(predicate, limit)` — the only announce path

Runs the dev's `predicate(frozenContext, input)` over the **discoverable** inputs present at this
tick (see §3), in id order, admitting up to `limit`. Consuming a returned candidate IS the
announce: that id becomes a participant from this tick forward.

- This preserves **conditional join**: you join *iff* your discoverable input satisfies the
  predicate (e.g. "you grabbed the ball where the ball actually is", "there is a free slot").
- Slot enforcement lives HERE, never at input-acceptance. The engine never rejects a candidate
  input for being "full"; it stores the input and lets `discoverParticipants(limit)` decide each
  tick. This dissolves the old reception-order race: a slot freed by a finalized leave is filled by
  the next discovery pass over the existing candidates — identically on every node, because it is
  re-derived from the same finalized stream.
- It is the **single** announce entry point by deliberate choice. Every distinct announce path is
  another surface that must be tick-derived and rollback-identical (the queue-pop trap, §5); a second
  path doubles that risk. An id-pinned re-admit is just
  `discoverParticipants((ctx, input) => input.playerId === id3 && eligible, limit = 1)` — sugar over a
  predicate, not a new mechanism. (A `rediscover` helper may be added LATER if real game code proves
  the pinned-predicate ergonomics painful. It is not in v1.)

### 2.2 `getStoppedParticipating()` — the departure report

The engine surfaces ids that have left (a voluntary `stopParticipating`, or a disconnect that has
been converted to a leave under Flag 1 — see §4). Under **consume-gated** demotion, consuming this
report IS the denounce, mirroring discovery's consume-is-announce. Under **auto** demotion the
denounce already happened at the signal tick and this report is purely informational — but it still
exists, because a game with its own avatar objects must LEARN of departures to clean up, score, or
update pause logic. Even under auto, the game can equivalently diff its avatar set against
`participants()` each tick — a pure function, no queue.

**Consume is DETERMINISTIC, not "local" in a way that diverges.** Consuming is a per-node call, but the
simulation is deterministic: every node runs the same game code, so if the game consumes at the same
logical point (inside the step, or at an agreed tick), **all nodes consume the same departures at the
same tick and their membership agrees**. The consume is a rollback-re-derivable transition (§5) — re-run
on rollback, idempotent under re-sim — not a source of divergence. The only way consume could desync
membership is if a game consumed on a NON-deterministic trigger (a UI event, a wall-clock timer); that is
a game-layer misuse of a deterministic primitive, the same hazard as making any other input
non-deterministic. (A test that consumes on only one node is just modelling one node's view; to compare
two nodes' membership, consume on both — they will then agree.)

### 2.3 `releaseParticipant(id)` — game-initiated denounce

The game already knows it wants the player gone (round-end kick, out-of-lives, idle-eviction), so
there is nothing to detect. Always immediate, always unconditional, governed by **neither** flag in
§4. Flag 2 applies only to stop-*signals*, not to direct commands.

## 3. The `discoverable` input flag — the one fact the engine needs

An input declares `discoverable: true` iff it is the *kind* that COULD promote a spectator to a
participant — i.e. the kind `discoverParticipants` is allowed to return. Ordinary gameplay inputs
(`left`/`right`) are `discoverable: false` and can NEVER cause promotion; `joinGame` or a ball
`grabAt` are `discoverable: true`.

This gives the engine exactly the one semantic fact it needs (which inputs can promote) without it
understanding game rules. It also powers input-volume filtering: a non-discoverable input that does
not follow a live discoverable can be dropped on sight (a pure spectator's `left`/`right` can never
matter), which alone defuses the "1000 sloppy nodes all hold `left`" flood without any cap. (The
earlier "null input is a special case" framing is subsumed — null is just one `discoverable: false`
value.)

## 4. The autoRelease split — TWO orthogonal flags

`autoRelease` was a single flag; it is **two**, because it conflated two independent questions.

- **Flag 1 — does a DISCONNECT auto-emit a `stopParticipating`?**
  - ON: a network disconnect becomes a LEAVE (denounce path entered).
  - OFF: a disconnect is a *blip* — the player stays announced, only `IsDisconnected` flips true. The
    slot is held (§6, reservation).
- **Flag 2 — does a `stopParticipating` signal AUTO-demote, or wait for the game to consume
  `getStoppedParticipating`?**
  - AUTO: the engine frees the slot at the signal tick. Lazy-dev win, especially for **Model-A**
    games that derive their entities from `participants()` each tick — the avatar simply vanishes
    with zero dev effort.
  - CONSUME: membership and the game's cleanup move in lockstep; the slot never frees behind the
    game's back. Cost: the dev must consume the report to free the slot.

All four combinations are meaningful. `releaseParticipant` is governed by neither — it is a direct
command and always demotes immediately.

**The Flag 2 tradeoff is a genuine knob, not a settled winner.** Both settings are fully
deterministic (the demotion is a derived transition at the agreed signal tick, re-derived on
rollback; the announced-set is rolled-back state; there is no side-queue either way). The difference
is ergonomics. Auto's cost: a **Model-B** game (its own avatar objects) that ignores departures yet
polls discovery gets a ghost avatar plus a refilled slot — a sloppy-game bug, caught by
loud-fail-on-departed-query (§7). Consume's cost: the dev must remember to drain the report.

(Soft, deferred option: Flag 2 could differ by *source* — auto for disconnects, which are genuinely
gone, but consume for voluntary leaves, where the player is still connected and the game may want a
lockstep reaction. Start uniform; split only if real code wants it.)

### Why join is conditional but leave is not

JOIN is **conditional**: there is contention for scarce slots, so a per-candidate predicate must
SELECT among many candidates → join MUST be consume-gated (a selection decision cannot be automatic).
LEAVE is **unconditional**: a `stopParticipating` or disconnect is a declaration of fact, not a
vetoable request; there is no contention and nothing to select among, so the consume carries cleanup
only, no decision → leave CAN be auto. Consume-join + auto-leave therefore mirrors
conditional-vs-unconditional; it is not an inconsistency.

There is **no conditional `stopParticipating`.** Vetoing a leave would mean the engine simulating a
player who has declared themselves gone — incoherent. The one thing that resembles "conditional
leave" — "you can't quit mid-round, wait for the round boundary" — is TEMPORAL, not selective: it is
the game choosing WHEN to process a leave, which is exactly Flag 2 = consume-gated (the stop-signal
sits, the player stays announced, until the game consumes at a safe point). "Conditional leave"
collapses to the consume knob; it is not a new primitive.

## 5. The queue-pop trap (the determinism invariant)

Both `discoverParticipants` and `getStoppedParticipating` run inside the deterministic loop, so
rollback RE-RUNS them for the same tick. If either were implemented as draining a mutable "pending"
queue, the second run would return empty → divergence.

**Rule:** "who joined / left at tick T" must be a pure function of the finalized signal stream at T
(a disconnect-derived `stopParticipating` is an agreed input at an agreed tick), idempotent under
re-simulation. The announce/denounce is a rollbackable derived transition, NOT a global latch that a
single pass consumes. The announced-set is rolled-back state, recomputed by re-running discovery — a
late, lower-id `joinGame` arriving into the non-finalized window changes who won, so the announced
set must change with it.

### 5.1 The tick guard — mechanically enforcing deterministic mutation

The queue-pop rule has a precondition that is easy to violate by accident: the membership-MUTATING calls
must run INSIDE the deterministic step, so they fire at agreed ticks and re-run on rollback. If a dev
calls `discoverParticipants` / `getStoppedParticipating` from a UI event handler or a wall-clock timer,
the mutation happens at a non-deterministic moment, is NOT re-run on rollback, and silently desyncs. We
make that misuse impossible to do quietly.

**Mechanism.** The engine raises an `inTick` flag for the duration of each step (set true at the top of
the tick, false at the bottom). The flag gates the two kinds of determinism-sensitive calls:

- **Membership MUTATIONS** — `discoverParticipants`, `getStoppedParticipating` (it consumes → denounces),
  `releaseParticipant`, `stopParticipating` — require `inTick === true`. Called outside the loop: **dev
  throws** (*"…must be called inside the game step; it mutates rolled-back state and must run
  deterministically"*); **prod is a no-op** (returns `null`/`[]`), sync-safe because every node skips it
  identically. This is the §5 rule turned from a convention into a guardrail.
- **READS are NOT gated** — `isParticipant`, `isDisconnected`, `players`/`participants`,
  `queryDisconnected`, `reconstructedValueAt` must work ANYWHERE, because a game reads membership every
  frame outside the loop (rendering, lobby UI, "who is here"). A pure read at the current tick can never
  desync; only mutation timing can.

**The same flag, inverted, as a determinism sandbox (dev-mode, optional).** Non-deterministic sources —
`Math.random`, `Date.now`, `performance.now` — are fine OUTSIDE the loop but poison the simulation INSIDE
it. In dev mode the engine can trap them while `inTick` is true (route to a loud-fail stub: *"use the
injected deterministic rng / the `tick` argument instead"*) and pass them through to the real globals when
`inTick` is false. (The engine already injects a deterministic `rng`, so this is a teaching/safety net,
not the determinism mechanism.) Caveat: globally swapping `Math.random`/`Date` touches ALL code that runs
during the step, including third-party libraries, so it is dev-only, opt-in, and save/restore-bracketed.

The guard is a v1-candidate but has a migration cost worth stating: any caller (including tests) that
mutates membership from outside the step must move that call inside it. That is the more realistic
shape anyway — real games change membership in their loop — but it is not free.

## 6. `IsDisconnected` — liveness, orthogonal to membership

> **MECHANISM SUPERSEDED (2026-06-09) by [`LIVENESS_CORROBORATION.md`](./LIVENESS_CORROBORATION.md).** The
> *model* in this section still holds — liveness is network state ORTHOGONAL to membership, participant-gated,
> with all four membership×liveness combinations real, reconnect as the symmetric true→false, and two layers
> (direct beat vs corroborated cross-graph). What is RETIRED is the **mechanism**: the grow-only-max
> `lastAttendanceTick` *scalar* / piggybacked **beat-vector gossip** described below cannot carry a
> disconnect→reconnect GAP and has no corroboration, so a liveness-only divergence cannot resolve
> deterministically. Liveness is now a **first-class corroborated stream** — a sparse alive-RANGE history per
> player (`assertMsg.liveness`, reconciled by `mergeRuns` UNION), with accept/grace windows, a corr/raw wire
> split, a versioned per-peer `known` gate, and §13.6 canon resolution — exactly like inputs. Direct-peer
> membership presence is the transport's job (`getPeers`); the old engine `peerHeartbeat` pulse was REMOVED. Read
> that doc for the live design; the
> prose below (grow-only-max scalar, beat-vector gossip, "never forwarded") is the original sketch, kept for
> history.

`IsDisconnected` is NETWORK state, derived from the attendance/B7 disconnect-tick machinery
(`canonicalDisconnectTick = lastAttendanceTick + timeoutTicks`, grow-only-max — PROTOCOL_SPEC.md B7).
It is **participant-gated**: only a participant's liveness is tracked, because only a participant can
affect the simulation. A spectator's connection state is irrelevant and untracked.

All four membership×liveness combinations are real and must stay distinguishable:

| | participant | non-participant |
|---|---|---|
| **connected** | active player | spectator / voluntary leaver still watching |
| **disconnected** | blip — slot held (reservation) | gone |

Collapsing `IsDisconnected` into "any non-participant" is a bug — it would mislabel spectators and
voluntary leavers as disconnected.

### Reservation = stay-announced + `IsDisconnected`, NOT denounce-then-re-admit

Under Flag 1 OFF, a blipped player is **never un-admitted** — its id never leaves the announced set,
which IS the reservation. `discoverParticipants` fills only genuinely empty slots, so it cannot give
the held slot away. Reconnect therefore needs no re-admission, because there was no un-admission. "Stop
the game if everyone has dropped" falls out for free: check `IsDisconnected` across the still-announced
participants.

### Reconnect = the symmetric `IsDisconnected` true→false on the liveness channel

- It is NOT an input arrival and NOT a `startParticipating` (that explicit-signal idea is retired —
  see below). Detecting reconnect by "an input arrived" FAILS the idle-reconnect case: a reconnected
  but silent player sends nothing on the game channel, so they would be stuck "waiting" forever.
  Liveness must ride the heartbeat channel, because a connected player legitimately sends no input.
- Under Flag 1 OFF a disconnect emits ONLY `IsDisconnected = true`, never `stopParticipating` — a
  `stopParticipating` IS the leave, and OFF's whole point is that there is no leave. Under Flag 1 ON
  you get both, and re-entry is a genuine re-join through discovery/announce (an id-pinned predicate
  for a competitive fixed roster, an open predicate for battle-royale "rejoin if room").
- `startParticipating` is retired and should not be resurrected: under OFF there is nothing to
  "start" (the id never left the announced set; re-entry = `IsDisconnected → false`); under ON
  re-entry is discovery. Neither branch needs it.

**Two agreed sub-moments** (game-layer nuance, both deterministic): `IsDisconnected → false` (network
restored, but inputs from the stale epoch are still finalized-floor-rejected while the node
re-syncs via background-join — see DESIGN_BACKGROUND_JOIN.md) versus the first accepted live input
(demonstrably re-synced). A UI renders these as "reconnecting…" versus "active".

**Determinism requirement to verify in B7 code:** `IsDisconnected → false` MUST use the same agreed
declaration mechanism as the disconnect tick — one authoritative agreed reactivation tick injected at
tick T — NOT each node's local "I started hearing X's attendance again", whose per-node observations
differ and would diverge the reactivation tick.

### 6.1 Attendance wire protocol (supersedes "attendance are never forwarded")

This replaces the old DECISIONS #30 / PROTOCOL_SPEC B7 stance ("attendance are NEVER proactively
forwarded; convergence is pull-on-suspicion"). The pull-on-suspicion probe is **deleted** — reasoning
in §6.2. Liveness now propagates as a reliably-gossiped, grow-only-max **beat vector** riding the same
id'd envelope as inputs.

**A beat is one tick number per player — last-write-wins.** Semantically: *"P vouches it was alive as
of tick N."* Only the maximum matters; `DisconnectTracker.noteAttendance` already discards anything not
strictly newer. This is the crucial contrast with inputs:

| | shape | delivery semantics |
|---|---|---|
| **inputs** | append-only log, keyed by (player, tick) | every entry must arrive or replay diverges → retransmit unacked entries (ackId watermark) |
| **beats** | overwrite map, one scalar per player | only the max matters → self-healing: a lost beat needs no retransmit, the next message's value supersedes it |

So they sit in two different shapes inside one envelope — a **list** and a **map**:

```
// Msg #1   from A   (msgId 100, ackId 87)
inputs: [ {B, t50, intent:…}, {B, t52, intent:…} ]
beats:  { A:54, B:52, C:50 }

// Msg #2   from A   (msgId 101, ackId 88)          ←— LOST in transit
inputs: [ {C, t53, intent:…} ]
beats:  { A:55, B:52, C:53 }

// Msg #3   from A   (msgId 102, ackId 88)
inputs: [ {C, t53, intent:…}, {B, t55, intent:…} ]  // C@t53 RE-SENT: recipient still at ackId 88
beats:  { A:56, B:55, C:53 }                         // current snapshot; lost A:55 irrelevant — A:56 supersedes
```

The input list self-heals by **redelivery**, the beats map by **always sending current** (no retransmit
bookkeeping at all — overwrite semantics make a dropped beat a non-event). The sketch shows a
re-included input only to make the log-vs-map contrast concrete; the actual input redelivery
mechanism is **bounded last-N-message redundancy + an RTO fallback, NOT the cumulative "everything since
ackId" this simplified sketch suggests** — see §10.1.

**Bandwidth — delta the map against the watermark.** Instead of the full beats snapshot, send only the
entries that *increased* since the recipient's `ackId`. Same watermark, same idempotence (a re-sent or
stale beat is a grow-only-max no-op). Full snapshot is the simple version; delta is the optimization,
and it makes liveness bandwidth scale with *beats that actually changed*, not participants × messages —
the same change-only principle as sparse inputs.

**Loss tolerance — value-keyed repeat, NOT message-keyed.** Because a beat is overwrite (only the newest
matters), it is made loss-tolerant differently from inputs (§10.1, which are a log and use message-keyed
redundancy). When a node learns P's beat advanced, it arms a small per-player counter (≈3) and attaches
P's *current* beat to its next 3 outbound messages; if a newer beat for P arrives before the counter
drains, it switches to the newer value and **resets the counter to 3**. A node never re-sends an OLD
beat — there is no point. This rides out up to 2 consecutive message losses with zero round-trips, and
needs **no ack/retransmit machinery at all**: beats converge by overwrite — the per-player repeat,
gossip-on-new-max across the mesh, and each node continuously re-stamping its own current tick together
guarantee the latest value wins. (Inputs, being a log, cannot tolerate a permanent hole and so DO get
the ack/RTO fallback — §10.1. The one pathological beat case — an interior link drops all repeats of a
node's *final* beat and that node then dies — is rare and caught by B8.)

**Send triggers — the old "attendance interval" is now only a FLOOR.** A node sends when *either*
(a) it has urgent payload this tick (its own input changed, or an input to forward), or (b) the
interval floor has elapsed since its last send to anyone. Beats never trigger an *early* send on their
own — they only ride whatever is already going out. Your own beat is just your current tick, restamped
on every outbound message. A dedicated beats-only message therefore happens only in genuinely quiet
stretches (no input traffic for a whole interval). Inputs are latency-sensitive and ride their own
send policy — bundling beats onto the shared envelope must NOT slow inputs down. (The firm rule here is
only that beats never dictate or delay input timing. *What* the input policy itself should be —
forward-ASAP per hop versus coalesce-once-per-tick — is an OPEN question deliberately left OUT of the
attendance protocol: per-hop coalescing adds up to one tick of delay PER HOP, compounding on relay paths
like A↔B↔C, so the tradeoff is unsettled. Decide it on the input side, not here.)

**Source gating — a node beats only while it is a participant or potential participant.** A pure
spectator emits no beat at all, so it is invisible and *cannot disconnect* (never-heard = passivity,
B3, not a disconnect). The on-ramp needs no special case: a `discoverable` input (§3) is itself the
first beat (tick-stamped, id'd, proof-of-life). A node keeps beating until it finalizes as a
*non-participant* — NOT the instant it sends a stop-signal, because the stop itself must propagate and
roll others back.

**Forward gating — gossip only on a new max, only for relevant players.** Forward another player's
beat only when it *advances your local max* for that player (gossip-on-new-info): grow-only-max makes a
re-forwarded stale beat a no-op, so propagation is loop-free and naturally terminating (a genuinely
newer beat travels one hop, then dies once neighbors hold it). And forward only for *relevant* players
(participant or live-discoverable) — the same relevance filter as input-volume filtering, so the large
cheap spectator population costs nothing. On A↔B↔C, A's beat reaches C via B's relay within the
interval, so C is never falsely suspicious of A.

**Well-targeted, not a flood — and the honest cost.** Among participants in a shared rollback sim every
participant is relevant to every other (they all affect the same finalized timeline), so forwarding
beats across the participant set aims them exactly where liveness matters; the far-larger spectator
population is excluded by the same relevance gate. This is NOT free — beats genuinely flow continuously
among active participants — but delta-against-`ackId` and the interval floor bound that to *changed*
beats riding *existing* traffic, so steady-state liveness cost scales with the participant count, not
with ticks × nodes. We pay that bounded, steady cost deliberately, to buy the robustness the one-shot
probe lacked (§6.2).

### 6.2 Why there is no suspicion probe

The earlier design had a pull-on-suspicion probe (`DisconnectProbe`) to converge peers who had heard
*different* last beats without forwarding. Once beats are reliably gossiped (above), that job is gone,
and the probe cannot earn its keep:

- The probe and the gossip ride the **same reliable transport**. A probe is request→response (two
  trips); gossip is one trip. So on any network where gossip is too slow or too lossy to refresh P's
  beat before its disconnect tick, the probe — strictly slower — does *worse*. There is no network
  condition where the probe rescues a case that reliable gossip does not.
- The old probe was also **one-shot per disconnect tick** (`DisconnectProbe` deduped on
  `(playerId, tickY)`), so a single dropped correction → false disconnect → expensive recovery. That
  fragility is the symptom; the structural cause is using a 2-trip request/response as a *primary*
  liveness signal.

**Load-bearing precondition.** This all holds only if `timeoutTicks` spans SEVERAL gossip intervals, so
a reliably-gossiped beat refreshes a player's disconnect tick well before that tick is reached. Sized
that way, ordinary gossip always arrives in time. The only regime where a probe could ever help is a
timeout set TIGHTER than the gossip cadence — and that is a misconfiguration to fix at the cadence (or
the timeout), not to paper over with a 2-trip probe. So if you are ever tempted to probe, the fix is
"gossip a little more often" or "set the timeout to more intervals." **B8 severe-desync recovery**
remains the ultimate backstop for the genuinely-went-wrong case; we are removing only the middle layer
between normal gossip and B8.

### 6.3 Two liveness layers — why this works on CONNECTED, not just COMPLETE, networks

Liveness is answered at two distinct layers; conflating them is the bug:

- **B2 `HeartbeatLiveness` — real-time, direct-neighbor presence.** Wall-clock, transport-level: *"is
  my socket to a directly-connected peer up right now?"* It can see ONLY peers this node is **directly**
  connected to, and it drives the live-connection onJoined/onLeft.
- **B7 `DisconnectTracker` — deterministic, tick-stamped, network-wide.** *"As of which tick do we all
  AGREE player P was last alive?"* Computed identically on every node from the same beat
  (`lastAttendanceTick + timeoutTicks`), so a disconnect is an agreed simulation event.

On a **complete** network (everyone directly connected) B2 alone would suffice — every node sees every
other directly. But the design must also work on a merely **connected** network, where not everyone is
directly connected — e.g. A↔B↔C, in which A and C never talk directly. There, B2 on C sees only B; it
can tell C *nothing* about A. C learns A's liveness SOLELY from A's tick-stamped B7 beat **gossiped
through B** (§6.1). So forwarding is not a mere optimization — it is what makes liveness work across
hops at all, and it is precisely why the deleted "never forward" probe stance was untenable on connected
topologies (without forwarding, C is permanently unsure about A). B2 = direct-neighbor real-time signal;
the gossiped B7 beat vector = network-wide liveness that survives multi-hop relay.

### 6.4 Liveness is rolled-back simulation state — the disconnect/reconnect event log and the agreed reactivation tick

The disconnect tick was easy because it is derived from the **last beat before silence** —
`canonicalDisconnectTick = lastBeat + timeoutTicks`, a **grow-only-max** of a shared datum: every node
that heard that last beat computes the same tick, and a late/reordered older beat can only be ignored
(max never moves backward). Reconnect does **not** get that safety net, and trying to derive "connected"
from a single most-recent-beat scalar is impossible: when a player resumes, its `lastBeat` jumps forward,
so `canonicalDisconnectTick` jumps with it and the entire disconnected window is **retroactively erased**
— a single scalar cannot represent a gap. And the "first beat after the gap" has no grow-only-max
equivalent (a min is not monotone-safe: a later-arriving *earlier* beat would move it backward).

**The model: liveness is SIMULATION STATE that rolls back — a per-player tick-stamped event log.** Each
player carries a small ordered log of liveness transitions `[{tick: D, off}, {tick: R, on}, …]`;
`connected(p, t)` is a pure replay of the events with `tick ≤ t` (latest wins). It is rolled back exactly
like the announced-set and the game state: a rollback **truncates** the log, so re-simulation re-derives
it. (Equivalently: a `connected` boolean snapshotted with state. The event log is just the form that makes
truncate-on-rollback trivial and keeps the gap representable.)

The two transitions are **agreed, tick-stamped events**, never per-node observations:

- **Disconnect (`off`) at `D = lastBeat + timeoutTicks`** — the existing B7 tick (grow-only-max, agreed).
- **Reconnect (`on`) at `R` = the stamp of the FIRST resumed beat after `D`.** A beat is tick-stamped
  ("P alive as of tick N"), so `R` is identical on every node that ever receives that beat — it is the
  beat's *stamp*, NOT the local tick at which a node happens to hear it (which is skewed by latency; that
  skew is the adversarial case). Because a node that hears the resume late sees `R` in its own past, the
  `on` event is **injected via a rollback to `R`** — the very same mechanism the disconnect tick already
  uses to inject a past event. This is why the tracker must keep enough beat **history** to know "a beat
  arrived after a disconnect was in effect" (not just the running max); that is the concrete code change.

**Reconnect rides the BEAT channel, not input.** A reconnected-but-silent player sends no game input, so
detecting reconnect by "an input arrived" strands it forever (§6). Liveness rides attendance. The first
*input* is a separate, later signal:

- **Two agreed sub-moments.** `reactivationTick(p) = R` (network restored → render *"reconnecting…"*) is
  strictly earlier than `firstLiveInputTick(p)` = the first *accepted* live input after `R` (demonstrably
  re-synced → render *"active"*). Both are agreed: `R` from the shared beat stream, the input tick from the
  shared input log. Both are rolled-back state and re-derive identically; neither is a local observation.

**Determinism corollary.** Because `connected` is rolled-back state driven by agreed ticks, two nodes with
wildly different *observation* timing (e.g. P's resume beats reach A immediately but B 150 ms late) still
agree on `reactivationTick(p)` and on `queryDisconnected(p, t)` at every `t` — the per-node skew never
leaks into membership or the simulation. (Multi-hop: the resume beat reaches a far node by **forwarding**,
§6.1, so the same `R` is learned across the connected graph, not just by direct neighbours.)

**GC — the liveness log is finalized-floor-bounded, exactly like inputs.** The event log is not kept
forever; it is truncated on the SAME grace/finalization horizon the input arrays use (§9, the
`collectAnchored` carry-forward policy). Keep the most recent entry **at-or-before the finalized floor**
(the carry-forward anchor — the liveness state a rollback to the floor must start from) plus every entry
after it; drop everything older. So the log's size is bounded by the non-finalized window, not the session
length — a player that churned connect/disconnect/reconnect for hours costs only its current liveness
state + the sub-second tail. Liveness is just another per-player rolled-back stream, GC'd like the rest.

**The connection log MUST ride the state transfer (B8 recovery + B10 bootstrap).** Because
`queryDisconnected` is rolled-back state a re-simulation depends on — when liveness alters game state (a
disconnected participant contributes nothing) — a node that adopts a full transfer and re-simulates the
non-finalized window needs the connection log, NOT just the grow-only-max `lastAttendanceTicks` scalar
(which cannot represent a disconnect→reconnect GAP). Send the per-player connection log over the retained
(post-floor) window plus the latest beat per player; the adopter seeds its tracker from it (union /
grow-only-max with any live beats it already heard). Omitting it is a real desync: a joiner that
re-simulates a gap it cannot reconstruct counts a participant that was actually disconnected and lands in
a different state (pinned by `tests/connection-log-transfer.test.js`). The latest beat alone suffices only
for a participant that was *continuously connected* across the window; any gap needs the log.

## 7. Polling contract — loud-fail on a departed query

The bug worth catching is **polling the input of an id that is no longer a (live) participant** — that
is never legitimate, so it never false-positives. (The earlier "you disconnected X but never released
it" warning is retired: a game is allowed to stop caring about a player and never poll them again, so
warning about an un-released-but-unused id nags about correct behavior.)

Three poll outcomes, distinguishable because the engine tracks BOTH membership and `IsDisconnected`:

- `poll(connected participant)` → their input (which may legitimately be null/idle).
- `poll(disconnected participant)` → ERROR (the game forgot to check `IsDisconnected`).
- `poll(non-participant)` → ERROR (the game forgot to check membership).

Dev mode THROWS on an error poll; prod returns the default (null) input. Prod-default is SYNC-safe
(every node agrees on the tick-derived denounce tick, so all return default identically — the ball
just drops everywhere, sim stays synced) but LOGIC-unsafe (it needs every input type to have a
canonical zero, else "synced garbage"), so dev-throw must actually be exercised in tests.

**Precondition this creates:** loud-fail is only ergonomic if membership is cheap to pre-check →
implies a deterministic `isParticipant(id)` (tick-derived, same announced-set on rollback). Per-tick
contract:

```js
if (isParticipant(id) && !isDisconnected(id)) poll(id); else handleDeparture();
```

## 8. Reading participants — the query API

> **Implementation status (built — §7 + §8).** `poll(id)` reads a connected participant's reconstructed
> input; for anything that is not a connected participant it DEV-throws (the contract guard was skipped) or
> PROD-returns a hardcoded `null` ("absent / gone") — NOT the configurable `defaultIntent`: polling a
> departed player is a contract-violation fallback, not passive play, so it must not read as actively idle
> nor depend on a dev config. `null` is the honest signal and is unconditionally sync-safe (the game maps it
> in `handleDeparture()`). The dev/prod distinction is the dedicated `devMode` constructor flag (decoupled
> from `debug`).
> `players()` returns per-tick `ParticipantHandle`s in dev (they stringify to their id but loud-fail if used
> past their issuing tick or hashed) and bare ids in prod; `readVia(handle)` resolves a this-tick handle and
> `hashValue(value)` guards a handle out of hashed checkpoint state. Internals read a private `_playerIds()`,
> so handles never enter engine logic. `releaseParticipant(id)` is an immediate, unconditional same-tick
> denounce; `players()` returns a fresh snapshot copy each call. (Cat 24/34 green.)

Membership CHANGE flows through the three verbs (§2). Membership READ — "give me participant X's
input this tick" — is a separate, deliberately narrow API.

**Read path = direct id-query, never a general predicate query.** The discovery predicate (§2.1)
exists ONLY to promote spectators; it is NOT how you read a known participant. The earlier "general
query" idea had two uses (promote + search-among-known); the second is cut. Reading a known
participant is a direct lookup keyed by that participant's id. This also keeps the predicate cheap —
it ranges over the handful of discoverable inputs at a tick, never the full roster.

**Ids are capability-gated, not raw.** A game may only read a participant through a handle the ENGINE
issued — it can never fabricate or guess an id. This prevents id-guessing and, more importantly,
cross-timeline contamination: a handle from one timeline must not silently resolve in another after a
rollback.

The handle is an OPAQUE wrapper around the id — NOT a raw object pointer. The rule that makes it safe
is simple: **store the id, never the handle.** The id is a plain, replay-stable value (it hashes
identically on every re-simulation), so it is the durable key a game keeps on its avatar. The handle
is only a momentary, per-tick permission: each tick you trade the id for this tick's handle
(`players()` / a direct lookup), use it, and discard it. Because the handle never lives in saved game
state, it never enters the hash — so it cannot cause a divergence, and there is no stale reference to
carry across a rollback. A handle that is not from this tick is therefore always a misuse, caught
loudly in dev.

This is why a handle exists at all rather than passing the raw id: the per-tick issue-and-revoke is
what structurally enforces §7 (you can only read a participant you re-confirmed exists THIS tick) and
what catches cross-timeline contamination (a handle stashed outside rolled-back state is stale on the
next tick → dev loud-fail).

**Dev vs production.** In dev the handle is a thin validating wrapper around the id, so the engine can
revoke it at the tick boundary and throw on a stale or foreign one. It is also deliberately
**un-hashable / un-serializable**: the dev-mode hash function recognizes a handle and throws a
localized error (*"a participant handle was found in hashed state — store the id instead"*) rather
than letting it silently corrupt the checkpoint. This enforces "store the id, never the handle"
MECHANICALLY, and catches even the store-but-never-use case that the per-tick staleness check would
miss — a stray handle surfaces as a clear error at hash time, not as a confusing desync. In
production the handle simply IS the bare id — no wrapper, no allocation, no loud-fail, freely
hashable; correctness falls back to §7's tick-derived prod-default, which is already sync-safe (every
node agrees on the denounce tick → identical default). So even a dev who wrongly persists a handle
is, in prod, just persisting the id — harmless.

(We *could* instead make a handle safe to STORE in state — a deterministic `{id, gen}` whose `gen`
comes from the join's finalized position — but that supports a pattern we don't want, persisting
handles. We make storing a handle FAIL LOUDLY instead. The durable key is the id; storing the handle
is never necessary.)

### `players()` — the engine-provided membership view

The engine exposes `players()` (a.k.a. `participants()`) rather than making every game re-derive
membership. It returns a SNAPSHOT of the current participant set (as handles) at the moment of call,
and it RESPECTS the consumption rules:

```js
var a = players();          // N participants
discoverParticipants(...);  // promotes one
var b = players();          // N+1 — b reflects the just-consumed announce
```

The same holds for departures under consume-gated demotion (Flag 2 = CONSUME): a departed id stays in
`players()` until the game consumes `getStoppedParticipating()`; under AUTO it drops at the signal
tick. This is consistent with §4 — under consume-gating the announced-set transition IS the consume,
so `players()` and the announced-set never disagree. `players()` reads the derived, rolled-back
announced-set, NEVER a mutable side register (§5), so it is identical on every node and across
re-simulation.

Mid-tick mutation of `players()` is intended, not a footgun: it is the deterministic, immediate
effect of your own consume calls. A game that caches the snapshot and then mutates membership in the
same tick simply holds an older snapshot — the normal "don't hold a stale view of a mutable
collection" caveat.

## 9. Retention / GC — the input array is welded to the finalized floor, not to membership

The array's lifetime is governed by ROLLBACK alone, not by membership:

- **Finalized participant:** keep the baseline (announced status + last finalized input value) plus
  the ENTIRE non-finalized window — every non-finalized input now matters, it moves an avatar.
- **Non-participant with a pending attempt:** keep everything from the node's OLDEST non-finalized
  discoverable input onward (a rollback can revisit any non-finalized attempt). Inputs *before* the
  oldest live discoverable — even un-finalized spectator inputs — are droppable; they can never
  promote.
- **Non-participant, no pending attempt:** drop the whole array → full GC back to spectator.

The promoting discoverable is NOT hoarded after it finalizes — once past grace the promotion cannot
be undone, so "N is a participant" rides forward in the finalized **announced-set baseline**, not in a
kept input. Removing a participant's input array does NOT drop their avatar: the avatar is hashed
game state ("ball held by player 3"), not the input array — the "removing the array drops the ball"
intuition is false.

Dropping is **SENDER-SIDE ONLY**. A receiver must NOT drop on sight: messages reorder, so a non-
discoverable input that looks dead on arrival (no live discoverable seen yet) can turn out to be
needed once its discoverable lands later — a receiver-side drop is irreversible and would desync.
The sender, by contrast, knows its own participation state authoritatively, so suppressing at the
source is safe. It is sync-safe because STATE is hashed while raw inputs are not — a segment the
sender legitimately never emits can never affect a checkpoint.

**Logical release vs physical GC are separate.** The denounce (announced-set transition) can happen at
the `stopParticipating` tick T while T is still non-finalized — rollback re-derives it. But the
physical byte-deletion of the array must wait until T is past grace, because a rollback to before T
must re-simulate the ticks where the player was active. Logical at T; byte-GC only when finalized.

## 10. Self-input dropping & input delay

**Input delay** is a separate engine knob (not participation-specific, but it interacts here). With
delay `D`, an input produced when the local sim is at tick `T` is stamped for tick `T+D` and
broadcast immediately, giving the network `D` ticks to deliver it before it is needed → fewer
rollbacks at the cost of input-to-screen latency. **Discoverable inputs share the same delay `D`** —
otherwise there would be an input gap between your un-delayed join and your delayed first gameplay
input.

**Implementation status (built — opt-in `inputDelay` ticks, default 0).** The local sample stamps every
own change at `T+D` (the encoder still sees its normal monotonic sequence; only the outbound stamp shifts),
applies it to the own decoder at `T+D` (rollback-free — it is above the current sim tick) and broadcasts
immediately. A future-stamped change is accepted at the remote (negative age → acceptance tier), so it is
buffered and consumed when the sim reaches `T+D` with no rollback. The uniform shift makes discoverable
inputs share `D` for free (no special case), preserving adjacency. `D` is a per-node knob — determinism
holds with MIXED `D` across peers, since every node reconstructs the same `{author-stamp, intent}` set.
Measured: `D ≥ link-RTT-in-ticks` drives rollbacks to zero (240-trial seed×D×latency fuzz converges at
settled ticks). Self-input *dropping* (Option 3 below) is the separate, already-implicit retention rule;
the speculative Options 1 & 2 remain future opt-ins. Cat 30's 4 `[T]` reds are green.

*Future-stamp / present-tick interaction (verified safe, pinned by `tests/input-delay-join-lag.test.js`).*
A `T+D` stamp is a future tick, so it could in principle leak into a node's present-tick estimate. It does
not, on either path: (1) a late joiner's catch-up clamps to the present, so it converges to the live state
at the same tick (no D-tick over-advance), even though the join-buffer folds inbound input ticks into
`_maxSeenTick`; (2) the LIVE `MSG_INPUT` path never feeds `_maxSeenTick` (only the join-buffer / asserts /
attendance do, and those carry present ticks), so a live node with `lagThresholdTicks < D` does not falsely
conclude it has fallen behind and reset.

The retention rules (§9) tell a node which of its STORED inputs are dead. Self-input dropping asks
the symmetric SENDER question: which of my own inputs should I even transmit? Naive rollback netcode
broadcasts every input from every node always; participation lets a node that won't be promoted skip
transmitting its useless follow-ups.

**The wrinkle input delay creates:** when you emit a discoverable input stamped `T+D`, your local sim
is only at `T` — you have NOT simulated `T+D`, so you do not yet know whether that discoverable wins
promotion. Yet you must keep emitting follow-ups (`T+1+D`, `T+2+D`, …) before your sim reaches `T+D`.
So how can a node decline to send its own follow-ups on the belief it won't be promoted, when it
cannot yet know?

Three strategies on a (compute / bandwidth / risk-of-unjust-drop) tradeoff:

- **Option 1 — simulate `D` ticks ahead to predict promotion.** Drops earliest → least bandwidth. But
  it re-creates the very rollbacks input delay exists to delete (now INVISIBLE — future ticks the
  user never sees — but still full compute), and the far-future tick is the least-settled, so its
  prediction is the least reliable → HIGHEST misprediction. *Compute-heavy, least bandwidth, highest
  risk.*
- **Option 2 — don't look ahead; keep sending until your current sim tick reaches the discoverable's
  tick, then judge.** Saves the post-loss bandwidth tail, no future compute. Risk: your local "I
  lost" can be overturned by a later rollback (e.g. a contended slot frees and you actually win), and
  you have already stopped sending → a gap. *Not compute-heavy, some bandwidth saving, some risk.*
- **Option 3 — don't guess at all; keep sending until your discoverable is FINALIZED.** Past the
  finalized floor the outcome cannot change, so if you are still not a participant you can now drop
  with ZERO risk. *Not compute-heavy, biggest bandwidth tail of the three, no risk.*

**Decision: IMPLEMENT OPTION 3 — and note it is not even a new mechanism.** The finalized floor is by
definition the boundary before which a win is still possible; §9's retention already keeps a follow-up
live only while it trails a non-finalized discoverable. Option 3 is just "transmit exactly what §9
says to keep" — send up to the floor, then stop. Because you keep sending through the entire window
where a win is still possible, a surprise late promotion always has your inputs → no gap, no risk. It
already follows from the rules; it is the default and needs no flag.

**Options 1 & 2 are FUTURE FEATURES, opt-in, not v1** — they guess AHEAD of the finalized floor, the
one move the finalized-floor discipline warns against:

- *Sparse-input softener (makes option 2 more palatable than the raw "risk" label):* inputs are
  change-only, so a gap means "unchanged", not "lost". A mispredicted-then-reversed promotion does
  NOT desync — it briefly freezes the avatar at its last value until sending resumes; only the dropped
  DELTAS are lost, and only unrecoverably if they fall past the acceptance window. Cosmetic stall, not
  a broken sim.
- *Hard constraint on 1 & 2:* speculative dropping applies ONLY to non-discoverable follow-ups, NEVER
  to a discoverable input — dropping a discoverable throws away a promotion chance you cannot predict
  away.
- *Buffer-and-resend* (store dropped inputs, resend on a surprise promotion) is possible but
  self-defeating: replayed past inputs cause bigger rollbacks and risk falling outside the acceptance
  window → desync repair.

**Flag shape (do NOT conflate — same lesson as §4's autoRelease split):** option 3 is default-on free
behavior, no flag. A separate opt-in (working name `dropOwnUnpromotedInputs`, default OFF) enables
speculative early-drop; option 1 vs 2 is a compute-vs-accuracy sub-choice under it.

### 10.1 Input wire protocol — reliable delivery without cumulative resend

The §10 question is *which* inputs to send; this is *how* to deliver them reliably and cheaply. The
naive reliable scheme — "every message re-bundles everything the recipient hasn't acked" — is
Go-Back-N, and it is quadratic exactly where it hurts most. Forwarding 5 peers' inputs immediately in
one tick:

```
msg1:[p1]  msg2:[p1,p2]  msg3:[p1,p2,p3]  msg4:[p1,p2,p3,p4]  msg5:[p1,p2,p3,p4,p5]   = 15 inputs for ONE tick
```

and it gets worse as the ack lag grows. Two changes fix this.

**Enabler 1 — a reorder-tolerant decoder (a required code change).** Inputs are already absolute
`{tick, intent}` full snapshots inserted into a tick-sorted array, so the data model is order-free in
principle — and `SimulationEngine._onInput` already *assumes* this ("tick-stamped, so reorder-safe").
But today it is NOT true: `SparseInputDecoder.applyChange`'s redundancy suppression (SparseInput.js
~L111-116) reads the neighbor present at insert time, so out-of-order arrival can silently drop a real
change — e.g. `(t10:A),(t20:B),(t30:A)` with t30 arriving before t20 leaves `valueAt(30)=B` forever.
The cumulative resend currently MASKS this (a later resend re-applies against the corrected neighbor).
Before we stop cumulative-resending, the decoder must be made genuinely order-independent: dedupe on
*exact tick only*, always insert otherwise. (Falsifying test: the reorder sequence above must converge
regardless of arrival order.)

**Enabler 2 — a monotonic message-id stream.** Each node emits messages numbered 1, 2, 3, …; a message
bundles whatever changes/acks existed at that send. Redundancy is keyed on the **message id**, NOT on
per-player change history (a passive player has no recent messages to protect → costs nothing).

With both, the protocol is **proactive bounded redundancy (primary) + RTO-triggered targeted resend
(rare fallback):**

- **Primary, zero-RTT:** each new message also re-carries the new content of the **last N−1 message
  ids** (N≈3, flattened, not nested). Any single loss inside an N-window is recovered by the next
  message with no request and no wait. Duplicates are decoder no-ops (idempotent on exact tick).
  Bounded at `O(N × per-send payload)`, independent of RTT — unlike cumulative resend.
- **Fallback, rare:** a cumulative `ackId` (+ a small **SACK** bitmap of ids received beyond the
  contiguous point) is piggybacked on return traffic; the attendance floor (§6.1) guarantees acks keep
  flowing even from an otherwise-idle peer. A message is presumed lost **only when it has been un-acked
  longer than `RTO` (≈ measured RTT + margin) — a TIME threshold, never a message count.** Then resend
  ONLY the SACK holes.

**Why the trigger is time, not count (the subtle part):** with a large RTT you always have many
un-acked messages legitimately in flight, so "N newer messages sent" or "not yet acked" does NOT mean
lost. A big RTT simply yields a big RTO; in-flight ≠ lost. Two independent knobs result: **N** = how
much *consecutive burst* loss you absorb for free (RTT-independent); **RTO** = when to give up on a
specific hole (RTT-scaled). Most losses never reach the RTO path.

Worked trace (A→B, RTT≈100ms, msg≈every 25ms, N=3, RTO≈150ms):

- *Normal:* at t=100 A has sent #10–#14 but only just got ack(10) — 4 in flight. A does nothing: #11's
  age (75ms) < RTO. In-flight, not lost.
- *Small burst (≤N):* #11 lost; B receives #12, which carries #11's content → B reconstructs #11, ack
  jumps to 12. A never learns of the loss. Zero RTT, no ack dependence.
- *Big burst (>N):* #11,#12,#13 lost; #14 carries {#14,#13,#12} → B recovers #12,#13 but #11 is outside
  the window. B's contiguous ack stays at 10, SACK={12,13,14}. When #11's age exceeds RTO, A resends
  **#11 only** → ack jumps to 14.
- *Lost ack:* harmless — acks are cumulative and re-sent on every return message; the next one
  supersedes. A acts only on RTO age, so a single dropped ack triggers nothing.

**Relevance gate:** run the ack/resend machinery only against peers you owe reliable delivery —
participants and live candidates. A pure receiver (spectator) never acks and needs none; don't retain a
send buffer for it. Same relevance filter as everywhere else.

**Inputs vs beats — two redundancy models on one envelope:**

| | redundancy | reliability | why |
|---|---|---|---|
| **inputs** | message-keyed (last N msgs) | ack + RTO + SACK resend | a LOG — no permanent hole allowed |
| **beats** | value-keyed (current value × N, §6.1) | none — converge by overwrite + gossip | OVERWRITE — only the newest matters |

**Implementation status (built — opt-in `inputRedundancy: N`, default off).** Enabler 1 (reorder-tolerant
decoder) shipped earlier. Enabler 2 is now implemented in `SimulationEngine`, with one deliberate
simplification from the sketch above:

- **Keyed on change-tick, not a message-id stream.** The sender is authoritative over its OWN change log,
  so the receiver SACKs *the change-ticks it currently holds for that sender* (`MSG_INPUT_ACK`, a snapshot
  not a delta — a dropped ack self-heals on the next one) and the sender diffs that against its own log.
  This needs no separate msgId sequence/buffer for the direct-link case. (The per-link msgId stream of
  §10.2 is still the right model once a node forwards OTHERS' inputs — there the forwarder is not the
  author and cannot diff against an own-log.)
- **Primary (zero-RTT):** each own-change message piggybacks the last N changes (`changes[]`); a burst of
  ≤ N−1 consecutive losses is healed by the next change. Decoder-idempotent on exact tick.
- **Fast path = SACK-triggered retransmit:** an own change *below the receiver's received frontier* that
  the receiver lacks is a DEFINITE hole (a later tick arrived) → resend at once, throttled to ~one per ack
  cycle. A tick *at/above* the frontier is in-flight, never resent — which is exactly why a high-RTT,
  loss-free link issues ZERO spurious resends (the "time not count" invariant, realized structurally).
- **Fallback = RTO:** un-acked non-finalized changes past an adaptive RTO (`≈ 2 × max observed ack-latency`,
  **capped below ½ the grace window** so a hole always gets several attempts before its tick finalizes — an
  uncapped grow-only RTT poisoned by one late confirmation would otherwise silently halt all retransmit).
- **Finalization is the deadline.** Resend candidates are only *non-finalized* own changes (tick above the
  grace floor); the send/ack bookkeeping GCs on that same floor. A hole that survives to finalization is by
  contract a **B8 full-state recovery** concern, not an input-resend one.

**Convergence envelope (measured, 2-engine seed×loss fuzz, grace = 40 ticks).** Light loss (10%): 100%.
Heavy loss (25%): ~93%. The residual is the classic **tail** case — a sender's *highest* change has no
later change to piggyback redundancy AND can never be "a hole below the receiver's frontier" (the frontier
cannot exceed it), so only the few RTO attempts cover it; if they and their acks all drop within the grace
window the change finalizes missing. That finalized tail-loss is precisely B8 recovery's remit (a separate
layer, deliberately not enabled in the isolated Cat 27 unit tests). Cat 27's 4 `[T]` reds are green.

### 10.2 Per-peer knowledge tracking — forward only the delta (connected-graph optimization)

§10.1 sends every neighbour the same redundant bundle. On a CONNECTED graph that over-sends: if A and B
both hear C's input, each relays it to the other AND echoes it back to C, who already has it. The
optimization: a node tracks, per neighbour, what that neighbour *already knows*, and sends only the
**delta**. This generalizes §10.1 (input redundancy/RTO) and §6.1 (beat forwarding) from
"broadcast-the-same-to-all" to "per-neighbour delta," and is what makes forwarding cheap at scale.

**The knowledge base — ONE table, keyed by (neighbour, source-player).** For each DIRECT neighbour X and
each source player P, A keeps a compact summary of what A believes X holds:
- `known[X][P]` = a tick **watermark** (X has all of P's input changes up to tick T) + a small SACK-style
  set of out-of-order extras beyond T (for the reorder/loss case);
- `knownBeat[X][P]` = the max beat for P that A believes X holds (a scalar; beats are grow-only-max).

State is O(neighbours × participants) — bounded, because on a connected graph "neighbours" are the DIRECT
links (not all nodes), and it is GC'd at the finalized floor (the floor is the baseline everyone shares,
so nothing below it is tracked). One table with per-neighbour markers (the shared player axis) beats N
separate per-peer maps — same information, cache-friendlier.

**`known[X]` advances on EVIDENCE only — never on an optimistic send (the load-bearing rule).** Two
evidence sources:
1. **Provenance** — data that arrived FROM X (X is the source, or X relayed it) is known by X. A raises
   `known[X]` immediately, no round-trip. (C's input is known by C; a change B relayed is known by B.)
2. **Advertised frontier** — every message X sends carries X's own knowledge frontier (its per-player
   watermark, "I have P up to T"). A folds it into `known[X]`. This frontier IS the ack.

**It must NOT advance on "A sent it."** Marking X as knowing a change merely because A transmitted it is
the trap: the one copy may be lost, A then believes X has it, stops sending, and X is permanently missing
it → the §10.1 permanent-hole desync. A lost change must always remain eligible to resend, so `known[X]`
moves only on the two evidence sources above.

**Sending to X = the delta.** A sends X each input change it holds that `known[X]` does not cover, and each
beat for P above `knownBeat[X][P]`. Provenance kills the echo automatically (A never sends X data that came
from X). The worked example, made precise:
- A hears C's input from C → `known[C]` covers it → A sends C nothing back.
- A checks `known[B]`; B does not (yet) cover it → A relays it to B (and decrements the per-link redundancy
  counter for it — see below — but does NOT yet mark `known[B]`, since the relay could be lost).
- B receives A's relay → raises `known[A][C]` by **provenance** (A relayed it, so A has it).
- When C's own copy later reaches B, `known[A]` already covers it → B sends A nothing. The redundant echo
  is gone; B confirms to A only via its normal advertised frontier.

**Redundancy + RTO become PER-LINK, and the ×N repeat stops early on evidence.** Since each neighbour gets
a different delta, the §10.1 message-id stream and ack/RTO are per-link (A→X has its own msgId sequence,
its own ackId). The §6.1 value-keyed ×N beat repeat and §10.1 last-N input redundancy run per neighbour:
A re-sends X a change up to N times for free burst-loss tolerance — but the instant X's advertised frontier
shows X has it, A stops early (skip the remaining copies). The RTO timer (time-based, §10.1) covers the
case where every copy was lost and no frontier ever confirmed delivery: resend on timeout. The two layers
stay distinct — **N = free consecutive-loss tolerance; RTO = the give-up timer** — now scoped per peer.

**Determinism is untouched.** This is a transport-routing optimization: it changes HOW input changes reach
each node, not WHICH changes a node ends with. Every node still converges to the same reconstructed input
set, and the deterministic sim consumes that set identically — *provided* convergence is preserved, which
is exactly why `known[X]` may advance only on evidence (a lost change is always eventually resent).

**Open efficiency questions (deliberately unsettled):**
- *Frontier-advertisement cost.* A full per-player frontier on every message is O(participants) overhead.
  Likely fix: advertise only the frontier entries that CHANGED since X last reflected A's frontier (delta
  the frontier itself), or advertise sparsely (on change / every k messages). The frontier is itself
  grow-only, so a delta encoding is safe.
- *Watermark vs hole-set sizing.* In the non-finalized window the per-player stream is near-contiguous, so
  watermark + a few holes is small; pathological reorder/loss grows the hole set (still bounded by the
  window).
- *Bootstrap of `known`.* A new link starts with `known[X]` empty (assume X knows nothing → send it the
  whole non-finalized window once, then converge to deltas). The finalized floor is the shared baseline, so
  "the whole window" is bounded.

**Implementation status (built — opt-in `inputForwarding`, default off; generalizes the §10.1 layer).**
The §10.1 self-only protocol was generalized from "resend MY changes a peer lacks" to "send each reachable
neighbour the DELTA of ANY source's changes it lacks":
- **Knowledge base `known[peer][source]`** (a tick set in the non-finalized window; GC'd at the floor) plus a
  grow-only watermark backing the `knowledgeFrontier(peer)` read API (`{input, beat}`). Advances on
  **evidence only**: PROVENANCE (a change received FROM X marks `known[X]`; a source trivially holds its own
  inputs, killing the echo back to the author) and X's **advertised frontier** (`MSG_INPUT_ACK` now carries a
  multi-source `frontier` — the ack). Never on an optimistic send, so a lost sole copy stays a resend
  candidate (the load-bearing invariant).
- **Delta send** to each reachable neighbour: any held change `known[peer][source]` does not cover, throttled
  per **(peer, source, tick)** — which also fixed §10.1 limitation (a) (a hole shared by several neighbours
  now repairs to each independently). SACK fast-retransmit + adaptive RTO are reused per-link.
- **Reachable neighbours only.** Forwarding targets peers we have actually received a message from
  (`_heardFrom`), never an unreachable peer in the room roster — otherwise the evidence-only retry would
  storm a black hole (a lesson the bandwidth test forces). Frontier advertisement still goes to all peers (to
  bootstrap contact); it is cheap and does not carry input copies.
- **Convergence envelope (measured, line/chain seed×loss fuzz).** Connected line A–B–C and 3-hop A–B–C–D
  converge 100% with no loss and ~97% at 20% loss; heavy loss degrades into the same finalized-tail residual
  as §10.1 (B8's remit). Cat 37's 5 `[T]` reds are green; determinism is untouched (routing changes, not the
  reconstructed input set). NOT yet built: the msgId-stream framing, frontier-delta compression, and beat
  forwarding folded into the same `known` table (beats still ride the separate §6.1 `beatForwarding`).

## 11. Caps and floods

- Keep `max_participants` (an admission limit, enforced inside the discovery predicate's `limit`).
- Do NOT build a simulation-affecting `max_candidates` eviction: to be deterministic it would need
  global knowledge (defeating its purpose), and the engine does not know the game's candidate-priority
  (the dev's predicate), so any engine-side eviction order could drop a real winner → desync.
- A non-discoverable input not following a live discoverable is dropped on sight (§3) — this alone
  kills the benign spectator flood. A *discoverable* flood (1000 nodes press `joinGame`) is only
  TRANSIENT: discovery announces the winners, losers finalize non-promoting and GC at grace, and
  `players.length < max_participants` stops losers re-spamming.
- A persistent flood survives only under MALICE → Byzantine, scoped OUT for now. A receiver-side flood
  cap stays a DEFERRED best-effort DoS valve; if ever built it must evict in the EXACT INVERSE of the
  discovery selection order, with margin above `max_participants`, so only provable no-hopers drop.

## 12. Why this is efficient → moved to §14

*(The efficiency summary now lives near the end as §14, after the desync design it depends on. §13's number
is load-bearing — referenced throughout — so it stays put and §12 is intentionally vacant.)*

## 13. Desync detection & repair — history window, corroboration, integer-product canon

Finalization makes state immutable, but two peers can still finalize *different* state (a lost input, a
late arrival past grace). Detecting that after the fact — and **repairing it by replay (cheap) rather than
a full state transfer (expensive)** — needs (a) more retained history than the grace window, (b) a payload
that proves *how* one finalized state became the next, and (c) a deterministic, cycle-free rule for which
finalized state becomes canon. This section is that design. Most of it is now BUILT (behind `recovery:true`);
the main piece left is **Build B — the assert payload** (§13.3). Full status in §13.9.

### 13.1 Snapshot vs checkpoint, and `snapshotInterval`

Two things the engine produces in `_stepOne`, easy to conflate:

- **snapshot** — the FULL cloned state, kept **local**, used as the **rollback anchor** (restore + re-sim).
- **checkpoint** — a **hash** of the state on a fixed grid (`checkpointInterval` = `snapshotIntervalTicks`),
  **broadcast**, used for cross-peer **detection**. A checkpoint is just `hash(snapshot at a grid tick)`.

`snapshotInterval` (dev knob, **default 1**) decouples snapshot cost from tick cost: take a full snapshot
every K ticks instead of every tick. To roll back to tick T you restore the **nearest snapshot ≤ T** and
re-tick forward to T. Rationale: cloning a large state can cost more than re-simulating simple logic, so a
game with a heavy state but light step trades a few cheap re-ticks for far fewer expensive clones. Default
1 is byte-for-byte today's behaviour (nearest snapshot = exact snapshot, zero re-ticks). Every
`_snapshots.get(tick)` site (rollback, the recovery-transfer edge, the §9 prune anchor) becomes
"nearest-≤, then re-tick" under K > 1.

**Hash every snapshot, not just every checkpoint.** Hashing is cheap; cloning is the expense. So hash each
snapshot even when it is not broadcast. The extra (un-broadcast) snapshot-hashes *between* checkpoints let a
receiver **narrow a divergence to a sub-range** and recompute / swap inputs only from the latest snapshot-
hash it still agrees on — instead of recomputing the whole checkpoint interval.

**A checkpoint anchors to the most-recent snapshot.** A checkpoint you cannot re-simulate from is useless
for repair, so the checkpoint on the grid hashes the state at the most-recent available snapshot (with
`snapshotInterval=7, checkpointInterval=20`, the "tick-20" checkpoint lands on snapshot 14). This works
cross-peer only because **every peer shares `snapshotInterval` + `checkpointInterval`** and so anchors to
the identical tick.

### 13.2 The history window (`historicCheckpoints`)

**Finalization (grace) decides when history is immutable; retention (history) decides how long the immutable
copy is kept.** They are different concerns, and `history > grace`. The motivating bug: a peer broadcasts a
finalized checkpoint hash; by the time it arrives, the receiver has GC'd that tick at the grace floor and
cannot compare → the desync goes undetected. (Today this is masked by an accidental *leak* — the
checkpoint-hash map was never pruned — which is exactly what this replaces.)

Parameterise retention as a **count**, not a time: `historicCheckpoints = N` keeps the **N most-recent
FINALIZED checkpoints** — and all the data back to the oldest of them. The retention floor is the oldest of
those N; it slides forward as each new checkpoint finalizes, dropping everything below.

**One store, not two.** The retention floor simply moves out from grace to the history floor — snapshots,
inputs, query log, beats, and the checkpoint-hash map all GC there. There is no separate "historic store."
Two safety rails keep it sound:

1. **Normal rollback is explicitly grace-clamped.** A late input never rewrites finalized history. (It is
   already bounded by the acceptance gate — a beyond-grace input is rejected, never applied — so the clamp
   makes the bound a matter of *intent* rather than relying on snapshot-absence, which no longer holds once
   we retain snapshots past grace.)
2. **Repair-rollback is a separate operation that IS allowed to cross the grace floor** — that is precisely
   the "rollback beyond grace" §13.7 needs, reusing the same snapshots and re-tick machinery.

### 13.3 What an assert sends — the transition, not the tail

> **Model note (2026-06, as built).** A node no longer tracks a per-peer *checkpoint object* and acks it by
> content hash. Agreement with each peer is collapsed to a single integer **watermark — `agreeUntil[P]`**
> (the §13.3.2 sparse protocol below), and a "checkpoint" is *derived on demand* at a grid tick: the hash
> `window` for **detection** plus a `canon` payload for **repair**. So read §13.3 as the **payload shape**
> (still the Build B target), and §13.3.2 as how reliability/silence actually works — it **supersedes** the
> §13.3.1 content-hash dirty bit.

Old payload: the most-recent finalized hash + every hash after it + all inputs from it onward. Both
wasteful (the after-part) and incomplete (enough to *detect*, not *repair*). New payload:

- the **two** most-recent finalized checkpoints `C_prev`, `C_recent`,
- the **intermediate snapshot-hashes** between them (for narrowing, §13.1),
- the **inputs that drive `C_prev → C_recent`** (the window `(C_prev, C_recent]`, plus the input values in
  effect *entering* `C_prev` so the receiver can reconstruct the stream).

That is the complete proof of *how* you moved from one finalized state to the next. The receiver's own
history window is what lets a *delayed* assert still find an overlapping checkpoint to compare.

**Two refinements (efficiency).** (1) Each of the holder's OWN inputs in the payload carries a **per-input
`corroborated` flag** (§13.6) — *not* a single count: scoring under per-holder keeps only corroborated
inputs, so it must know *which* ones survive. Non-own inputs need no flag (a non-author holding them is proof).
(2) **Delta the input payload against `known[peer][source]`:** for an input the peer is already evidenced to
hold, send only its `(source, tick)` **reference**, never the value; the receiver resolves it from its own
decoder by tick if it ever has to roll back. Only inputs the peer provably lacks carry full data. A reference goes stale **only when the receiver *removed* that input** (a reconcile dropped an uncorroborated
change, §13.7). To keep references resolvable, **removed inputs are TOMBSTONED — retained but marked inactive —
within the history window** (not hard-deleted), so the receiver still resolves them by tick and a later
checkpoint that needs one finds it. The request-fallback (ask for the value, §13.5.2) then fires only on
history-window *skew* (the reference predates the receiver's floor) — rare. **Reference-disambiguation
invariant:** the input stream holds *exactly one authored value per `(player, tick)`*; liveness/disconnect is a
**separate** grow-only-max scalar (B7), **never** a stream entry — so a node that believes "A disconnected at
200" does *not* put a competing change at `(A, 200)`, and A's real input there doubles as the proof-of-life that
corrects the mistaken disconnect tick. On a connected mesh the delta shrinks a checkpoint to roughly what each
peer is missing — the same win as §10.2 input forwarding.

> **SUPERSEDED (2026-06-09) by [`LIVENESS_CORROBORATION.md`](./LIVENESS_CORROBORATION.md).** The
> grow-only-max `lastAttendanceTick` *scalar* below cannot carry a disconnect→reconnect GAP through the
> lightweight checkpoint (a resume erases it), and it had no corroboration, so a liveness-only divergence could
> not resolve deterministically. The current design makes liveness a **first-class corroborated stream** — a
> sparse RANGE history per player with accept/grace windows, a corr-vs-raw wire split, a versioned per-peer
> `known` gate, and §13.6 canon resolution (compare/union) — exactly like inputs. Read that doc; the paragraph
> below is the original (scalar) sketch, kept for history.

**The checkpoint also carries the per-player LIVENESS signal — reconciled separately.** Wherever the engine
lets participation feed the *simulated* (hashed) state — e.g. v1's `disconnect → leave` (Flag 1 ON, §4) demotes
a player at its attendance-derived disconnect tick — that disconnect tick is a **state determinant**, not mere
presentation. A peer that replayed only the inputs would derive a *different* membership → a different state → a
**phantom desync it cannot repair** (nothing in the checkpoint told it the tick). So the checkpoint carries, per
player, the **last-heard / disconnect tick** it recorded (B7) beside the inputs. It reconciles by **grow-only-max** —
the later proof-of-life always wins — which **always composes** (a mini-UNION, never a COMPARE), so a node
holding a stale (too-low) value rolls back and re-derives membership with the maxed value. This is the
lightweight extension of what `Recovery.js` already does in a full transfer (`mergeLastAttendanceTicks`). It does
**not** break the one-value-per-`(player, tick)` stream invariant above: liveness rides as its **own checkpoint
field**, never as an input change — two reconcilers (input canon §13.6 + liveness grow-only-max) over two kinds
of data. (When a game ignores membership in `step()`, liveness can't move the hash and a checkpoint may omit it —
an optimization, not a requirement.)

#### 13.3.1 Making checkpoints reliable — the same `known` ledger, marked DIRTY on change

> **SUPERSEDED (2026-06) by §13.3.2.** The steady-state *outcome* below still holds — checkpoint traffic goes
> silent once peers agree and never blind-broadcasts. But the *mechanism* described here, a per-peer
> **content-hash dirty bit** (`ackCheckpoint`/`checkpointKnown`: track which `(checkpoint, hash)` each peer
> acked, re-send until the *current* hash is acked), is **not** how the built engine reaches silence. The dirty
> gate is now derived from the per-peer **`agreeUntil` watermark** (§13.3.2): you are dirty to P iff you hold a
> finalized input P lacks in `(agreeUntil[P], horizon]`. The content-hash ledger code
> (`_ckptAcks`/`_ckptKnownHash`, `MSG_ASSERT_ACK`/`_sendAssertAck`) still exists but is **write-only/vestigial**
> and flagged for removal. The paragraphs below are kept for design history.

A checkpoint must reach peers to be useful, and its message drops like any other. Reuse the §13.5.2 model
verbatim with checkpoints as the items: the `known` ledger gains, **per peer, which checkpoints that peer
holds AT THEIR CURRENT HASH**. A checkpoint is **dirty (unknown) to a peer** in exactly two cases — (1) it is
**new** (never sent), or (2) **its hash just changed** (a reconciliation recomputed it), which **invalidates
any prior ack** (the peer confirmed the OLD hash). A peer's ack of a specific `(checkpoint, hash)` clears the
dirty bit; a hash change re-sets it. **That `hash` is over the checkpoint's full CONTENT — its corroborated
inputs *and* the state hashes, not just the resulting state.** Two different corroborated-input sets can produce
the *identical* result yet score differently (`left+right` and nothing both net-zero, but count 2 vs 0), so the
bare result hash would let a stale-input ack clear the bit and leave a peer scoring your checkpoint wrong. (It is
still narrow, not 'hash all transient junk': a finalized checkpoint's durable content *is* its corroborated
inputs + state.) There is no explicit "rebroadcast" — you *mark dirty*, and the send loop
does the rest.

On a **large interval — but flushed immediately whenever a checkpoint becomes available or changes** — a node
sends each peer only its dirty checkpoints (delta), on the per-item backoff, until acked. The bound is the
**history window** (not grace): a checkpoint GC'd out of the window (§13.2) stops being a candidate (a peer
still lacking it reconciles off a later overlapping checkpoint, or falls to a B8 transfer). So checkpoint
traffic, like input traffic, goes **silent in steady state** and never blind-broadcasts.

**Reconciliation dirties a forward range, not one checkpoint.** A repair (§13.7) that recomputes from an
earliest-affected tick changes **every checkpoint from there forward** -> all re-marked dirty to **all** peers.
The lone exception: the anchor checkpoint you reconciled against peer P may be marked **known to P** *iff your
post-recompute hash matches P's* (you provably converged there) — but the checkpoints *after* the anchor stay
dirty even to P, since your kept post-range inputs may differ from P's.

#### 13.3.2 The agreement watermark — per-peer `agreeUntil`, derived not stored (as built)

> **Authoritative spec: [`SPARSE_CHECKPOINT_PROTOCOL.md`](./SPARSE_CHECKPOINT_PROTOCOL.md)** (co-designed
> 2026-06-08). This section is the in-context summary; that doc carries the full pseudocode (the back-walk
> comparison, the watermark update rules, tombstones, the worked examples). Read it for the details.

This is what actually ships. Instead of tracking a checkpoint object per peer and acking it by content hash
(§13.3.1), each node keeps **one integer per peer**:

> **`agreeUntil[P]` = the most-recent finalized tick at which this node has CONFIRMED its state equals P's.**

There is no stored "checkpoint" and no assert-ack roundtrip. Detection still rides the hash **`window`** every
assert already carries; agreement is *read off* that window, and both the silence gate and the asserted send
window are *derived* from the watermark.

**How it moves (`_updateAgreeUntil`).** On any assert from P, intersect my hash window with P's over their
shared grid ticks and walk back from the **newest** shared tick to the most-recent **matching** hash (the
*anchor*):
- **newest shared tick matches → RISE:** `agreeUntil[P] = max(agreeUntil[P], that tick)`. Monotonic; agreement
  never lowers the watermark, and history the two have *since* reconverged past is ignored.
- **newest shared tick mismatches → FALL to the anchor**, but **only from a FRESH (higher-seq) assert.** A
  stale or reordered compare must not drag the watermark back below a divergence the two have since repaired.

**Freshness is a monotonic `seq`, not a timestamp.** `_checkpointSeq` bumps whenever this node's latest
finalized checkpoint hash changes; every assert carries it as `seq`, and `_lastSeenSeq[P]` lets a FALL happen
only on a strictly newer assert. That is the whole "ignore since-reconverged history" guard — no clocks, and
deterministic across nodes. (This is the one still-live use of the old `checkpointVersionHash`.)

**The silence gate is derived from the watermark (`_recoveryHasDirtyCheckpoint`).** A node is **dirty to P** iff
it holds a finalized input in `(agreeUntil[P], horizon]` that P does **not** hold (`_known`). Below the
watermark the input's effect is already baked into the agreed state, so it is skipped. The reason this is
sound: only **inputs and liveness** can break agreement — deterministic state evolution from an agreed anchor
cannot — so the gate is exactly the §10.2 "does P lack an input I hold" test, restricted to the un-agreed
range. Two things fall out for free: a game that animates every tick with **no inputs** stays **silent**
(nothing past the watermark is un-held), and a **void straggler** (an own uncorroborated input no peer holds)
stays dirty until it resolves and tombstones that input away.

**A passive canonical peer must still speak — `_divergedPeers`.** The input-lack gate alone would keep a node
that holds *no input the other lacks* silent, even after it has DETECTED that the peer finalized different
state. So a finalized `desync` adds P to `_divergedPeers` — forcing dirty, so it ships its window and the
diverged side gets something to reconcile against — and the set is cleared on the next `agree` from P. (Set on
the finalized-divergence path only; a not-yet-finalized diverge tick is ignored.)

**The send window is delta-sized off the watermark (`_sendWindowAnchor`).** The asserted window's oldest tick
is the grid tick just **below the lowest un-agreed input** across peers (or `agreeUntil[P]` for a diverged
peer), clamped to the history floor — so it spans only the contested range, never the whole retained history.
After a long idle `agreeUntil` is lazily stale (say tick 20) but the floor has marched up (say 940) and
everything below it is GC'd, so the resumed assert is just `[940, finalized]` — bounded, plus the lone new
input. The peer agrees at the clamped bottom (idle ⇒ same state) and jumps `agreeUntil` to the top in one round.

**Aged-out agreement → state transfer, and the trap not to fall into (Rob).** If `agreeUntil[P]` has aged
**below the retained history floor**, the divergent range starts before anything you still hold — you cannot
replay it, so it escalates to a **B8 state transfer** (§13.7). But the test is **not** "the watermark fell below
the floor" alone: you must also confirm your **oldest still-retained hash genuinely diverges** from the same
tick's hash in P's window. The trap: `agreeUntil[P]` may have been **retreated** for an unrelated reason — you
reconciled a tick because of a THIRD peer C, or an assert from P crossed your retreat in flight — so a mismatch
*at* the retreated watermark can be a transient race, not an unrecoverable GC'd divergence. Declaring a transfer
on that race is a false (expensive) escalation. So transfer **only** when the oldest hash you *retain* actually
disagrees with P's at that tick (no agreeing anchor left in your window); otherwise let the next assert
re-narrow it. (Convergence-despite-different-history is fine — only the most-recent finalized agreement matters,
never how each side got there.)

**Why this replaces the §13.3.1 dirty bit.** One integer per peer, read from the hash window every assert
*already* sends, gives the same steady-state silence with no per-checkpoint hash bookkeeping: "I agree up to T"
is just `agreeUntil` rising, and "I disagree" makes the disagreeing node dirty so it asserts its *own* window —
the divergence is self-driving.

**The silence-confirmation ack — BUILT (`MSG_AGREE_ACK = 'em-agree-ack'`).** One tiny ack satisfies the silence
requirement: when an incoming assert **advances** the receiver's `agreeUntil[P]` (`_updateAgreeUntil` returns the
rise), the receiver replies with just `{tick, hash, seq}` (`_sendWatermarkAck`); the asserter (`_onAgreeAck`)
raises its OWN `agreeUntil` for that peer **iff** its state hash there still matches (a stale hash ⇒ ignored —
the headline invariant, now keyed on the watermark rather than a checkpoint content-hash). Sent only on an
advance, it is nearly free, and it lets an idle+agreed pair fall to **literally zero** traffic (neither is dirty,
so neither asserts; the last ack confirmed the match). Disagreement carries no ack — the disagreeing side
counter-asserts. The ack **complements**, not replaces, the §13.5.2 per-input `known` gate: the watermark ack
raises the `agreeUntil` floor of the dirty scan even when the input wire never delivered an explicit per-tick ack
(e.g. the peer got an input straight from its author), while `_known` (armed also from any received assert's
`usedInputs`/`canon` via `_armKnownFromWindow`/`_armKnownFromCheckpoint`) covers the input-holding half. The old
heavy `_sendAssertAck` + content-hash ledger (`_ckptAcks`/`_ckptKnownHash`, `ackCheckpoint`/`checkpointKnown`/
`markCheckpointDirty`, `MSG_ASSERT_ACK`) is **retired**; only `checkpointVersionHash`/`checkpointContentHash`
survive, to drive the `seq` bump.

### 13.4 Finalized state is canon — pick one, never merge

Once finalized, one state becomes canon; you do **not** merge/union divergent states. Pairwise unions do not
compose in a connected mesh — `A∪B` carries an input `C∪B` lacks and vice-versa, so the network never
reaches one state. Merging needs *global* agreement on the merge, i.e. re-opening finalized state for
everyone, which is the one thing finalization exists to prevent. **Pick-one-canon is the only rule that
converges**; the network adapts to the winner — by **repair (replay)** if it can, by **state transfer** if
it cannot.

**OPTION UNION is not a counterexample to "never merge."** What does not compose is merging arbitrary
divergent *states*, or unioning *uncorroborated* inputs (those are holder-specific — A's and C's never agree).
OPTION UNION (§13.6) unions only **corroborated input SETS** — a globally well-defined set operation, because
corroboration is a global, verifiable property — and then **deterministically re-simulates**. It is "union
inputs + re-sim," never "merge states," and it converges (set-union composes; everyone reaches the same
global union). So *both* resolution variants converge: COMPARE is the default for speed (one winner's set,
fewest rollbacks); UNION trades messages for completeness (no known-corroborated input lost).

**Trust model is fair-peer, and always was.** Any quorum/majority rule is defeatable by a peer inventing
sockpuppet peers, so the whole protocol already assumes honest peers. Corroboration (§13.5) — which leans on
acks that a peer could fake — therefore forfeits nothing that was still held.

### 13.5 Corroboration — which inputs are canon-eligible

**An input is canon-eligible only once ≥ 2 nodes hold it.** An input only one node holds is **void**:
excluded from *everyone's* count, as if it never happened. This is a durability threshold — an input you
failed to replicate to anyone is not durable, so it is not history (which is the correct semantics for "your
packet reached no one," and lines up with the acceptance window: an input has until grace to reach ≥ 1 other
node).

The simplification that makes it cheap: **only your OWN input can ever be uncorroborated.** For any
input `x = (author P, tick T, value)`, the holders are always `P` plus whoever received P's broadcast — so
if you hold `x` and you are not `P`, that is already 2 (P + you), corroborated automatically and visible
locally (the input carries its author id). The single-holder case is exactly "P's own input that reached no
one." So a node only needs corroboration data about **its own** inputs.

**Determined from acks already on the wire — but from HOLDING, not RECEIPT.** Every `inputAckEveryTicks`
(default 2) a node sends each neighbour an `MSG_INPUT_ACK` (§10.1 `{of, haveTicks}` / §10.2 `frontier`)
saying *which of that source's ticks it HOLDS IN ITS DECODER*. This is the crucial distinction: a peer can
**receive** your late input and **reject** it (raw, past acceptance → `REJECT_RAW_IN_GRACE`; or past grace →
`REJECT_BEYOND_GRACE`) — it is never applied, so it is **not** in that peer's frontier. So corroboration must
be read as *"a peer's frontier attests it HOLDS the input,"* **never** *"a message was delivered/acked."* A
mere receipt ack would falsely corroborate a rejected input. The moment any neighbour's frontier *lists* your
tick `T` for `source = you`, input@T is corroborated; record the bit before it ages past grace and carry it
into the history window. No new message type.

This is grounded and non-circular: an input is held by a peer only if that peer accepted it **directly, in
its own acceptance window** (the grace tier accepts nothing un-corroborated). So corroboration always
originates from ≥ 1 *direct* acceptance, which the frontier then advertises, which lets *others* grace-accept
it. An input no one accepts directly is never in any frontier → never corroborated → void. **This holds only
while the frontier is honest** — an un-corroborated self-resend that got grace-accepted would put a void input
into a peer's frontier and **manufacture** false corroboration. That is exactly what the `_sendOne`
corroboration gate (BUILT, behind `knowledgeDrivenSend`) prevents. Corroboration-from-frontier and
grace-is-corroboration-gated are therefore a single invariant: *a peer's frontier must contain only inputs that
were corroborated when accepted.*

This is what lets the **network beat the late straggler**: A's own input that reached no one is void → in
nobody's count → the canon is computed as if it never existed → A (who *included* it) is the only node whose
hash differs, so A is the one that rolls back and drops it. Cheap; only the straggler adapts. It is also the
*only* thing that separates "A's input was late (A loses)" from "B dropped a real input (B adapts)", which
are otherwise byte-for-byte identical to A and B — the disambiguator is whether anyone *else* holds the
input, i.e. a third node's witness (for your own inputs, learned for free via acks).

#### 13.5.1 The send model — the `corroborated` flag; an own input *graduates* into the forward fabric

Corroboration also rewrites what a node SENDS. A node's own stream splits in two:
- **own UNcorroborated input** — only you hold it. Broadcast it **raw** (`corroborated` unset), so peers
  accept it only inside their *acceptance* window — its "pending" phase. If no one ever holds it, it stays void.
- **own CORROBORATED input** — the moment any peer's frontier attests it holds your input@T, that input is no
  different from *anyone else's* input: it is durable, grace-acceptable, and **forwarded just like a §10.2
  relay** — pushed to exactly the peers your knowledge base says still lack it, never to those who already
  acked it. Its "committed" phase. The author stamps `corroborated:true` on it. (Under §13.5.2 this
graduation needs **no separate forward step** — the per-input backoff already re-sends only to peers that lack
it, now stamped corroborated. The sole special handling: the moment an input graduates, if it is at/over the
acceptance edge, reset its backoff to *due-now* so the newly grace-acceptable copy goes out on the very next
check instead of waiting out the current interval. There is no longer a manual resend path.)

So `corroborated` is the corroboration mark on the wire (one bit, no separate `relayed` field), honest because
the author sets it **iff** corroborated: for an own input, only once a peer's frontier proves a second holder.
A FORWARD carries **no flag at all** — the receiver derives corroboration from `from !== playerId` (a non-author
holding it ⇒ author + forwarder = 2 holders), so the only message that ever actually sets `corroborated:true` is
the author's own resend of an input it has proof a second node holds. This is the missing send-side half of
§13.5 — and it makes the whole thing nearly free in *this* engine, which already:
- treats **self as a source** in the forward/resend loop (`_driveForwardWire` iterates every decoder) and
  delta-forwards per the `_known[peer][source]` knowledge base (skip ticks a peer is
  known to hold, skip finalized ticks) — exactly the "A forwards corroborated-x to B but not C (C already
  acked)" behaviour; and
- already *has* the corroboration signal: input@T is corroborated **iff** some `_known[peer≠me][self]`
  contains `T`, and `_known` is populated straight from the frontier acks.

**BUILT (behind `knowledgeDrivenSend`).** The old `_sendOne` shortcut hard-coded the grace-acceptance flag
`true` for *every* resend, including an own UNcorroborated one — letting a lone input win via resend. It is now
a single gate: a self-resend sets `corroborated:true` only when a second holder is known, else stays raw.

Two honest edges. (1) Your corroboration *knowledge* can lag reality — if a holder's ack is lost you keep
treating a truly-corroborated input as pending; harmless and self-correcting (continued frontier exchange
updates `_known`; at worst a momentarily-pessimistic canon count that recovery fixes). (2) Loops are already
prevented — `_known` provenance never forwards a source's input back to that source, nor to a peer known to
hold it.

**The grace tier IS this same gate — accept a late input past acceptance only if it is CORROBORATED.** The
two windows (AcceptanceWindows.js) were always meant as: *acceptance* = accept directly; *grace* = accept
**relays only**, because a relay is proof that *someone else already accepted it* — you must never accept a
still-uncorroborated late input directly, since other nodes may have rejected it and you would desync against
them. That "relay = proof someone accepted it" is exactly corroboration. Operationally the receiver never computes
"≥ 2 holders" itself — it simply trusts the **corroborated flag on the incoming message**, which a sender sets
honestly in **exactly two cases**: (1) a **forward** (author ≠ sender — the forwarder is the second holder),
or (2) the **author's own resend** once it holds proof of a second holder (§13.5.2). A raw message with the
flag unset is acceptance-window-only. So the precise rule is: **past the
acceptance window, accept a late input iff ≥ 2 nodes already hold it.**
- A §10.2 **forward** (a node relaying *another* player's input) inherently satisfies it — the forwarder is a
  second holder, so the input is corroborated. ✔
- A §10.1 **self-resend** (the author resending its *own* input) does **not** inherently satisfy it. The old
  shortcut flagged *every* self-resend grace-acceptable, so a straggler's *lone, uncorroborated* input won via
  resend — the exact opposite of "network beats the straggler." A self-resend now earns grace acceptance
  **only once the author has an ack proving a second holder** (it is corroborated); otherwise it stays a void
  single-holder input and the author drops it. *(BUILT behind `knowledgeDrivenSend`: the `_sendOne`
  corroboration gate; flag off ⇒ legacy always-corroborated.)*

So the one sentence that covers both the count and the acceptance gate: **do not accept (or count) a
non-corroborated input past the acceptance window.**

#### 13.5.2 Knowledge-driven send — per-input backoff replaces the frontier broadcast

§10.1/§13.5.1 as first sketched still leaned on two blunt instruments: a **periodic full-frontier** ack
(every `inputAckEveryTicks`, to every neighbour, re-stating every tick you hold of every source) and a
**global RTO/SACK** gap-detector. Both over-send — the frontier keeps re-asserting what a peer already
provably holds, and one RTO timer fits all inputs. The revision: **a node sends each peer only the inputs it
can prove that peer still lacks, on a per-input exponential backoff.** In steady state (everything acked) it
sends *nothing*: no periodic frontier, no global gap scan. This revises the §13.5 ack *transport* (targeted
per-input, not a periodic snapshot) while preserving its *semantics* exactly — an ack still attests **HOLDING,
not receipt**.

**The unified ledger — `known`, with the old `sentTicks` folded in; `_changes` kept separate.** `_changes`
(the `SparseInputDecoder` contents — the actual intents) stays apart: it is the payload-heavy reconstruction
substrate. Everything about *delivery knowledge* folds into one table, keyed by `(peer P, source player X)`:

```
known[P][X] = [ { tick, known, lastSentTick, sendCount }, … ]
```
- `known[self][X]` = what *you* hold of X (the tick-list view; intents stay in `_changes`).
- `known[P][X][tick].known = true` ⇔ "P has **accepted** X@tick" (evidence — from P's ack, or from
  provenance: a copy that came *from* P, or X's own authorship).
- a `known:false` entry carries `lastSentTick` + `sendCount` — the backoff state for an input you believe P
  still lacks. A `known:true` entry is done and needs no backoff. Storing **ticknumbers only** (never the
  intent) is sufficient; the content is one `_changes` lookup away.

**The send routine (each interval, per peer P):** (1) candidates = inputs in `_changes` whose `(P, X, tick)`
entry is not `known` AND still inside its send-window (the gate below); (2) due iff `now ≥ lastSentTick + backoff(sendCount)`; (3) if due, include `(X, tick,
intent)` in P's coalesced message and bump `lastSentTick = now`, `sendCount++`. For your OWN inputs the
message also carries the `corroborated` flag (below); a forward of someone else's input carries no flag (it is
corroborated by construction — `from !== playerId`).

**Acks are per-input and mean ACCEPTED — never a message-id.** P replies with the `(source, tick)` pairs it
accepted. A message-id ack is unsafe: one message can carry x and y at different ages where P accepts x
(acceptance window) but rejects y (uncorroborated in grace), and a message-level ack would falsely mark y.
P's ack sets `known[P][source][tick].known = true`, which both **stops the backoff** and, for your own inputs,
**is the corroboration witness** (§13.5: a second holder). A peer **re-acks on every receipt** — whether it
just accepted the input or already held it — so a corroboration witness gets many independent delivery
chances; the author's view lags reality only if *every* one of those acks is lost (then it merely under-counts,
harmlessly per §13.6).

**The SEND-GATE is the cap — backoff needs no separate finalization limit (Rob).** One universal rule bounds
all transmission, and it is checked *before* the backoff timer is ever consulted (so it stops an input being
selected as a candidate at all):

> **Never send / resend / relay an UNcorroborated input past the *acceptance* window, nor a CORROBORATED input
> past the *grace* window.** (Acking what you hold is always allowed — the gate restricts *sending*, never
> *acknowledging*.)

So the backoff is an ordinary small-base geometric series with **no explicit cap**: the gate removes an input
from the candidate set the moment it ages out of its window — strictly before any backoff arithmetic runs — so
there is nothing to clamp. Two things fall out for free, *not* as separate mechanisms:
- An own *uncorroborated* input that reached no one is **void at the acceptance window**: past it the gate
  forbids resending, so it can never gain a second holder (correct — "your packet reached no one"). Its only
  shot at corroboration is the retries a small base interval naturally lands *inside* acceptance.
- A *corroborated* input still un-acked by some peer at the **grace** edge simply stops being pushed; if it
  truly never arrived, that peer finalizes missing → a B8 full-state-recovery concern, never an endless resend.

**Finalization is the GC.** A finalized input leaves the candidate set (a B8 full-state concern thereafter)
and its `known[P][X]` row may be pruned — but only once the input ages out past the history floor, where its
corroboration can no longer change (§13.6 "natural freeze"), never before.

**Forwarding is the same routine over `source ≠ self`, and it self-suppresses.** Because a forwarded copy
proves the forwarder also holds it, a forward is **corroborated by construction** (author + forwarder = 2
holders): it carries **no flag** — the receiver derives it from `from !== playerId` — and is grace-acceptable
with no flag arithmetic. And the per-input backoff
*suppresses the redundant forward automatically*: if the first forward interval ≥ one RTT, the target's
direct-delivery ack lands first, flips `known[P][X].known`, and the forward is never sent — so on a full mesh
(where every forward is redundant) forwarding mostly evaporates with **no topology knowledge**.

**Liveness is NOT on this path.** Quiet-when-all-known means input traffic ceases in steady state, so
peer-gone detection must ride the independent §6.1 beat attendance, never the (now-absent) periodic frontier.

### 13.6 The canon decision — per-holder corroboration score, two resolution variants

**EPOCH GATE (a hard precondition for everything below).** The canon decision is **intra-epoch only**. Inputs,
checkpoints, asserts, and acks all carry their `epoch`; a message from a *different* simulation is **never** fed
into `resolve()` — a foreign-epoch input is dropped (or buffered if you are background-joining that frame), and a
foreign-epoch checkpoint is ignored. Cross-epoch divergence is **not** a state disagreement: two epochs index
their ticks against different origins, so corroboration counts and hash windows are *incomparable*. It is a
timer mismatch (a younger group vs an older one) and resolves by **seniority — the older epoch wins** (the
existing authority/`resolveDesync` path), never by the §13.6 score. The engine already enforces this for inputs
(`_routeInput`: foreign epoch → drop) and asserts (`_onAssert`: completeness only when `sameEpoch`); the **new
checkpoint/canon flows MUST inherit the identical gate** — feeding a cross-epoch checkpoint into the per-holder
score would compare two different simulations and corrupt state.

**Corroboration is the HOLDER's, verifiable from the checkpoint — not the author's to assert.** An input
`x = (author P, tick T)` is corroborated iff **some non-author holds it** (§13.5) — and that is *self-evident
from any non-author's checkpoint*: if `x` sits in node N's state and `P ≠ N`, then `x` is held by at least
`{P, N}` → corroborated, and **anyone** examining N's checkpoint verifies it (the input carries its author id).
The author P needs **no ack**. P's own knowledge of x's corroboration is in fact the *weakest* source — it
rides on acks, which are exactly what gets lost — and it matters in only one place: scoring P's **own**
checkpoint, where x is P's own input and the checkpoint alone cannot reveal whether anyone else holds it.

**Scoring is PER-CHECKPOINT, from the holder's seat — a fixed, verifiable function.** To score a checkpoint
held by node H, build a per-player count over the divergent range:
- an input authored by `p ≠ H` that H holds → **corroborated** (verifiable from H's checkpoint) → counts;
- an input authored by H itself → corroborated per **H's self-asserted flag** (the one un-verifiable bit,
  carried in the checkpoint; whether it is *locked* at checkpoint creation is the **optional** freeze below).

> **`score(checkpoint) = ∏_p (1 + corroboratedCount_p)`** (BigInt); ties → **content-hash, then id**.

Every observer computes `score(H's checkpoint)` from H's checkpoint content + H's carried self-assertion, so
the score is **observer-independent**: A, B and a bystander C all compute the identical number for H's
checkpoint. That makes it a fixed total order → a global argmax → convergence. And it is why a
corroborated-but-unprovable input survives: B holding A's input `a2` scores `a2` corroborated (verifiable),
even though A — ack lost — scores its own `a2` as 0. **B is a stronger witness to A's inputs than A is**; B's
checkpoint outscores A's on player A, B's state wins, `a2` lives. The author never needed the ack.

Why the product form (unchanged):
- **Breadth-weighted, deterministically.** `∏(1+count)` orders identically to `Σ log(1+count)` (concave): a
  brand-new player (`0→1` adds `log 2`) outweighs deepening one you already know — a little about several beats
  a lot about one (soft; enough depth still overtakes). `Math.log` is **not** bit-identical across engines, so
  the *decision* is the exact integer product (BigInt); `log` is only the mental model.
- **Pareto structure does real work.** A player you know nothing about contributes a factor of 1 and drops
  out; a broad-support state is un-dominatable from below (a zero coordinate is a protected ceiling), so the
  scalar only has to adjudicate the **incomparable frontier** — exactly where corroboration acts.

**NO AUGMENTATION — but the line is PER-CHECKPOINT, not PER-OBSERVER.** The forbidden move is an **observer
re-scoring someone else's checkpoint using the observer's own private holdings** — *that* makes a checkpoint's
score depend on who is looking → Condorcet cycles → divergence (A scores its state high because A holds `z`; B
scores A's state low because B lacks `z`; each is sure it won; never repairs). The **allowed** thing — and what
the rule above *is* — is each checkpoint scored from **its own holder's seat**, which every observer reproduces
identically from that checkpoint's content. Same raw material (holder knowledge), applied to the checkpoint you
are scoring, never to your private view of it. (This corrects an earlier "advertise one self-count, others use
it" framing: a checkpoint effectively carries the corroboration of *everything it holds* — non-own inputs
verifiably, own inputs by self-assertion — not a single number.)

**CORROBORATION UPDATES — freeze is now OPTIONAL (Rob).** A node's self-asserted bit for its own inputs can
still *change* after a checkpoint is made — H learns a late ack and flips its own input to corroborated. In the
original one-shot-assert model that was fatal: A would decide with the new value, B with the old, and they would
deadlock (each sure a *different* node won, no way to re-align). **As built, that hazard is handled two ways,
neither needing a freeze of the assert path:**
- **(1) Frozen at finalization (`_ownCorrFrozen`).** Once an own input ages to the grace horizon its
  corroboration is *captured* and no longer re-read live, so for a given *finalized* checkpoint both peers score
  the identical snapshot. This is the freeze, but applied to the decision input, not the broadcast.
- **(2) Re-learned from any assert's `canon` payload, before finalization.** Every assert carries its holder's
  corroborated inputs; the receiver folds them (`_armKnownFromCheckpoint`, agree OR desync) into its own
  corroboration evidence (`_ownCorrLive`). So a real divergence or the §6.1 safety-net re-assert re-propagates
  the new value and the peers re-score.

**Note this changed from the original §13.3.1 plan**, where a corroboration flip tripped the **content-hash**
dirty bit and re-broadcast on its own. Under §13.3.2's input-lack silence gate a corroboration-only change does
**not** trip the gate (the same inputs are held — it is pure metadata), so re-propagation now rides assert
traffic plus the frozen snapshot, rather than a dedicated content-hash resend. Two supporting facts still hold:
(a) corroboration is **metadata** — it never moves the *result-state* hash, so an update never triggers a
phantom desync or a false B8 escalation; it only re-runs the canon decision. (b) corroboration settles by the
**history floor** (no more acks once an input is GC'd), which is its *natural* freeze.

Freeze therefore survives only as an **optional optimization**: locking corroboration at checkpoint creation
gives a single stable decision (no re-flip as late acks trickle in) at a small completeness cost (a late ack is
ignored); *not* locking is more complete (captures late corroboration) but can cost a few extra reconcile rounds
until corroboration settles. Pick per taste; correctness holds either way.

**Two resolution variants — a session-wide constant, default COMPARE.** Both first **drop every uncorroborated
input** (a genuinely-lone input, held by no non-author, is void per §13.5; dropping it is safe *precisely
because no one else holds it*). They differ only in how the surviving corroborated inputs combine:

- **OPTION COMPARE (default).** Pick the single highest-scoring checkpoint (score, then content-hash, then id);
  **everyone adopts THAT checkpoint's corroborated input set** and recomputes. One node's set wins; a *loser's*
  unique corroborated inputs are lost (the "never merge" cost, §13.4). Fastest — one winning state propagates,
  fewest rollbacks.
- **OPTION UNION.** Take the **union of all corroborated inputs** across the comparing checkpoints and recompute
  over it. **No known-corroborated input is ever lost** — every holder folds in the held-non-own inputs it can
  verify, and a corroborated input is never lone to its holder. More complete; more rollbacks/messages (the
  unioned set must propagate, and the result is a state no single node held).

**The variant MUST be network-uniform.** It is part of the canonical ruleset: a COMPARE node targets one
winner's set while a UNION node targets the union, so a mixed network never agrees → permanent divergence. Ship
it as a **session-wide constant** (per epoch / config), never a per-node runtime toggle.

**Verifiability — neither variant gets a free authoritative hash.** Dropping uncorroborated inputs changes the
simulation, so the converged (corroborated-only) state's hash is **not** the winner's published full-state
hash. Both variants need a *recompute-then-reverify* step: the recomputed hash is exchanged and, if all agree,
the desync is repaired; a residual mismatch ⇒ identical inputs produced different state ⇒ nondeterministic game
logic ⇒ escalate to a full **B8 state transfer**. A nondeterminism bug therefore surfaces one round later under
**both** — neither Compare nor Union pre-verifies.

**The benign residue.** A genuinely-lone input (its author the sole holder) is dropped under both variants —
correctly ("the network beats the straggler," §13.5) and **safely**, since no other node holds it so its
removal conflicts with nobody. The earlier worry that dropping uncorroborated inputs could destroy a
*replicated* input was an artifact of the wrong (author-asserts-everywhere) scoring; under per-holder scoring a
replicated input is always championed by a holder and survives — always under UNION, and under COMPARE unless
its entire side loses the comparison.

#### 13.6.1 Pseudocode (both variants behind one flag)

```
# CONFIG.canonResolution in { COMPARE (default), UNION } — a session-wide constant, identical on every node.
# A "checkpoint" carries: the input changes over the divergent range (each tagged with its author),
# the holder's id, and the holder's self-assertion for its OWN inputs (corroborated yes/no; freeze optional, §13.6).

# corroboration of one input WITHIN a checkpoint held by H — verifiable, observer-independent
isCorroborated(input, cp):
    if input.author != cp.holder: return true                 # a non-author holds it -> proof, from content
    else:                         return cp.selfAsserted[input] # H's own input -> H's self-assertion (freeze optional)

corroboratedSet(cp):  return { i for i in cp.inputs if isCorroborated(i, cp) }   # uncorroborated dropped

score(cp):                                                    # BigInt; never observer-dependent
    counts = tally author over corroboratedSet(cp)            # {player -> #corroborated}
    return product over players of (1 + counts[player])

better(a, b):                                                 # fixed total order
    if score(a) != score(b):                       return higher-score
    if hash(corroboratedSet(a)) != hash(corroboratedSet(b)):  return lower-content-hash
    return lower holder id

# resolve a divergence from an agreed anchor S, given my checkpoint + peers' checkpoints
resolve(S, mine, peers):
    all = [mine] + peers
    if CONFIG.canonResolution == UNION:
        canon = union of corroboratedSet(cp) for cp in all    # every known-corroborated input survives
    else: # COMPARE
        winner = reduce(better, all)
        canon  = corroboratedSet(winner)                      # one winner's corroborated set

    if myInputs == canon: return                              # already EXACTLY canon (no uncorroborated extras, nothing missing)

    # Reconcile to canon through the NORMAL input-apply path — do NOT blindly roll back to S.
    # Dropping my uncorroborated extras + adding any missing canon inputs are ordinary change-applications;
    # the decoder reports the EARLIEST tick whose reconstruction actually MOVES (often none, often well after S).
    earliest = applyInputDiff(myInputs -> canon)              # predicate-based, identical to receiving new inputs
    if earliest != NONE:
        rollbackTo(earliest); resimulateForward()             # may start > S, or not roll back at all if all moot

    # No explicit rebroadcast — the recompute lowers agreeUntil[P] for the affected peers, so the §13.3.2 gate
    # re-ships the contested window on the next assert (the old §13.3.1 dirty-bit re-mark is superseded).
    markDirtyToAllPeers(checkpoints changed by the reconcile)   # their hashes moved -> every peer's prior ack is stale

    # Fast desync check (BOTH modes) — ANY checkpoint whose FULL input set already equals canon has a published
    # hash that IS the canon hash. Compare; a match also lets us mark that anchor KNOWN to that peer.
    # (COMPARE: usually the winner. UNION: any peer that already held the whole union. Often nobody qualifies.)
    ref = any cp in all where cp.startHash == S and cp.inputs == canon   # same start anchor too: same inputs on a different start != same result
    if ref exists:
        if myHashAt(ref.tick) == ref.publishedHash:
            markKnown(peer=ref.holder, checkpoint=ref.tick)     # provably converged here -> not dirty to ref.holder
        else:
            request B8 full state transfer                      # same canon inputs, different state => nondeterministic logic
    # else: no reference this round -> stays dirty everywhere; a real desync resurfaces at the next compare.
```

### 13.7 Repair — replay across grace, transfer as the fallback

Two peers disagree on `C_recent` but agree on `C_prev` (or, after narrowing with the intermediate snapshot-
hashes, on some later agreed hash `S`). From `S`, run `resolve()` (§13.6.1): drop uncorroborated inputs and
compute the canon input set (the winner's corroborated set under COMPARE, the union under UNION). If your
**full** input set already equals canon, you are done — *note this erases your own uncorroborated inputs even
when you "won," since canon never contains them*. Otherwise **reconcile to canon through the ordinary
input-apply path, not a blind roll back to `S`**: dropping your uncorroborated extras and adding any missing
canon inputs are normal change-applications, so the decoder's predicate reports the EARLIEST tick whose
reconstruction actually moves — often none (the changes were moot), often well *after* `S`. Roll back only
from there (beyond grace if it falls there), resimulate, and **re-broadcast your now-changed checkpoint**.
Verification then rides the normal hash-compare loop rather than a bespoke barrier: a real nondeterminism
(identical canon inputs → different state) surfaces when the reconciled checkpoints are compared next round and
escalates to a full **state transfer**. You *can* detect it a round early under COMPARE — but only when the
winner carried **no** uncorroborated inputs, because only then is the winner's published hash the canon hash;
if the winner had uncorroborated inputs, a mismatch against its hash is expected and ignored.

One implementation note: reconcile can now **remove** inputs, not only add them (erasing your uncorroborated
extras), so the decoder's predicate must report the earliest-affected tick for a *removal* symmetrically to an
insertion — and a removal that reverts to the value held *entering* that tick is moot, exactly like an equal
insertion.

Convergence is guaranteed because `score()` is a fixed, observer-independent total order (§13.6): the canon is
well-defined and every peer climbs to the same global argmax (COMPARE) or the same union (UNION, order-
independent because set-union is commutative). Both first drop uncorroborated inputs, so both may roll back
when more than one node carried uncorroborated inputs — the price of "the network beats the straggler."

### 13.8 New dev knobs

- `snapshotInterval` (default 1) — ticks between full snapshots (§13.1).
- `historicCheckpoints` (default 4) — number of most-recent finalized checkpoints retained (§13.2).
- `canonResolution` (default `COMPARE`; alt `UNION`) — §13.6 desync resolution. **Session-wide constant** (a mixed network diverges).
- `inputResendBackoff` (§13.5.2) — base interval + growth for the per-input knowledge-driven resend. No
  explicit cap: the send-gate (never resend uncorroborated past acceptance / corroborated past grace)
  terminates it. Keep the base small enough that an own uncorroborated input gets ≥ 1 retry inside the
  acceptance window. Supersedes the periodic full-frontier ack + global RTO/SACK.
- Existing, all dev-settable: `acceptanceWindowMs`, `graceWindowMs`, `checkpointInterval` (= `snapshotIntervalTicks`).
- Invariant: **acceptance ≤ grace < history**.

### 13.9 Status — built vs. design

*(Updated 2026-06-09. The whole §13 recovery machinery is gated behind `recovery:true`. The knowledge-driven
send is now **unconditional** — its old `knowledgeDrivenSend` flag was removed, not a toggle anymore.)*

- **BUILT:** `historicCheckpoints` + the checkpoint-hash-map bound (`HashWindowBuilder.pruneBefore`,
  `_historicFloor`) — fixes the unbounded-hash leak and keeps delayed-desync *detection* working; the hash
  map is safe past grace because it is only compared, never a rollback anchor.
- **BUILT + WIRED:** the pure **`research/synced-clock/CanonDecision.js`** (`canonProduct` = BigInt `∏(1+count)`,
  `canonCompare` = shared-factor cancellation, `canonWinner` = larger product / tie → lower id,
  `corroboratedCountVector` / `corroboratedSet`, `canonResolve` COMPARE+UNION). No longer just an isolated
  module: the engine assembles the per-holder corroborated vector from the contested range and reconciles to the
  canon set in repair (`_reconcileToCanon(canonResolve([myCp, peerCp], …))`), with repair-rollback allowed to
  cross the grace floor.
- **BUILT — §13.1 snapshot/checkpoint grid:** `snapshotInterval` (fine grid, default 1 = every tick),
  hash-every-snapshot, checkpoint-anchors-to-nearest-snapshot, nearest-snapshot rollback, and the
  coarse-checkpoint / fine-snapshot decoupling. Full snapshot + input + checkpoint-hash retention to the history
  floor (with the normal-rollback grace-clamp) is in.
- **BUILT — the knowledge-driven send model (§13.5.2), now DEFAULT:** the unified `known` ledger (old
  `sentTicks` folded in, `_changes` kept separate); per-input geometric backoff (send-gate-bounded, no separate
  cap) replacing the periodic full-frontier ack + global RTO/SACK; per-input **ACCEPTED** acks (not message-id);
  intrinsic-corroboration forwarding with redundant-forward suppression; the honest `corroborated` wire flag and
  its `_sendOne` gate (an own resend is flagged corroborated only once a second holder is known, else raw).
- **BUILT — §13.6 per-holder scoring:** verifiable per-holder scoring under the no-augmentation invariant
  (non-own inputs corroborated by construction, own inputs by self-assertion, frozen at finalization via
  `_ownCorrLive` / `_ownCorrFrozen`); the COMPARE/UNION resolution variants.
- **BUILT — §13.3.2 the `agreeUntil` watermark (supersedes §13.3.1):** per-peer agreement is one integer
  `agreeUntil[P]`, derived from the hash-window compare (`_updateAgreeUntil`: RISE on a top match, FALL to the
  anchor only on a fresh higher-`seq` assert); the silence gate (`_recoveryHasDirtyCheckpoint`) and the
  delta-sized send window (`_sendWindowAnchor`) are both derived from it, plus `_divergedPeers` so a passive
  canonical peer still ships its window to a diverged side. Input traffic goes silent in steady state, so
  liveness rides the §6.1 beat attendance. The SPARSE §6 **watermark ack is BUILT** (`MSG_AGREE_ACK`,
  `_sendWatermarkAck`/`_onAgreeAck`): an advance of `agreeUntil` replies `{tick, hash, seq}`, and the asserter
  raises its own watermark iff the hash still matches — the silence proof that complements the input-`known`
  gate. **The old §13.3.1 content-hash dirty-bit ledger (`_ckptAcks`/`_ckptKnownHash`,
  `MSG_ASSERT_ACK`/`_sendAssertAck`, `checkpointKnown`/`ackCheckpoint`/`markCheckpointDirty`) is RETIRED** (its
  `desync-canon-reliability` dirty-bit tests migrated to the watermark-ack invariant); only
  `checkpointVersionHash`/`checkpointContentHash` survive, driving the `seq` bump.
- **BUILT — §13.6.1 the entering-value CLAMP + reconcile bounds (receiver-side).** The canon reconcile
  materialises each player's value ENTERING the agreed anchor `S` as a phantom change at `S` (mirrors clipping a
  liveness run to a window), so a stale per-player baseline hidden under a LOSSY anchor-state collision is
  corrected rather than rebuilt on a blank baseline. Scoring stays on the OBSERVER-INDEPENDENT in-range
  checkpoints (the anchor is pairwise — clamping the scored set would make a score observer-dependent → Condorcet
  cycles); the phantom rides the reconcile only, and only when we LOSE (adopt the winner's baseline), with a
  redundant-phantom dedup. The REMOVAL step is bounded to `min(our horizon, the sender's canonHorizon)` so a loser
  never deletes its OWN newer finalized inputs the sender's older checkpoint hadn't covered. Phantoms are
  `known`-reset (recorded as held by the SOURCE only) so a future delta-encoder never references them. Tests:
  `desync-canon-input-clamp`, `desync-canon-end-clamp`, `desync-canon-known-reset`, `desync-canon-clamp-multipeer`.
- **DESIGN (not built) — Build B, the assert payload DELTA-ENCODING (§13.3).** Today the assert *broadcasts the
  whole canon checkpoint, un-delta-encoded* (`assertMsg.canon = _canonCheckpoint()`; the code says "§13.3 trims
  it"). What is LEFT is the per-recipient bandwidth shaping: the **two-checkpoint transition** (`C_prev`,
  `C_recent` + the inputs driving the window — the intermediate snapshot-hashes it references already exist,
  §13.1); **per-recipient delta-encoding** (a `(source, tick)` reference for an input `known[peer][source]`
  already holds, full value only for what a peer provably lacks) — turning the assert from broadcast → per-peer;
  **tombstoning** removed inputs within the history window so references stay resolvable, with the
  request-the-value fallback firing only on history-window skew. The §13.6.1 `known`-reset above is the hook the
  delta-encoder must honour (a phantom is sent full, never referenced). Test family: `canon-delta-inputs` (the
  §13.4 `canon-decision` cases ride on it).
  **NO LONGER part of Build B (superseded this cycle):** the per-input `corroborated` flag — now CONTENT-driven
  (§13.6: non-own inputs corroborated by construction, own by self-assertion); and the per-player liveness carry —
  REBUILT as corroborated alive-RANGES that ALREADY ride the assert (`assertMsg.liveness`, reconciled by
  `mergeRuns` union — see `LIVENESS_CORROBORATION.md`), retiring the old grow-only-max last-heard SCALAR.

## 14. Why this is efficient — netcode optimizations vs naive rollback

Baseline = naive rollback netcode (GGPO-style): a fixed roster, every participant broadcasts an input
every tick to every peer, a rollback fires whenever a remote input lands later than predicted, and
inputs are buffered per-player across the rollback window. Each choice below cuts bandwidth, compute,
or storage — and several REDUCE rollbacks rather than just surviving them.

The impact ratings are **design-time estimates, not measurements** (CLAUDE.md §4). They swing with
the game profile — input churn, roster size, audience size, join frequency — so treat them as
order-of-magnitude intuition. They also compound rather than add independently.

1. **Sparse change-only inputs — bandwidth/storage: HIGH.** Send an input only when intent CHANGES;
   silence = unchanged (null = passive). *Example:* at 20 TPS a player holding "thrust" for 3 s
   naively sends 60 messages; sparse sends 2 (press + release) → ~97% fewer. This is the dominant
   bandwidth win for any game where inputs persist across ticks (almost all of them). It doesn't
   reduce rollbacks itself — it reduces the messages that could cause them.

2. **Input delay — rollbacks: HIGH; compute: MEDIUM (at a latency cost).** Stamp inputs `D` ticks
   ahead so they arrive BEFORE they're needed; the prediction window shrinks toward zero. *Example:*
   `D = 5` at 20 TPS (0.25 s) gives a remote input 5 ticks of travel before the sim reaches it — on a
   low-RTT link almost nothing arrives late → near-zero rollbacks → near-zero re-sim compute. Naive
   GGPO predicts-then-corrects every late input. This is the primary rollback-avoidance lever; it
   trades input-to-screen latency for it.

3. **Conditional joining — rollbacks: HIGH (eliminates a whole class); architecture: HIGH.** A join
   is one discoverable input decided by a deterministic predicate + `limit`, recomputed on rollback
   like any input — no lobby, no handshake, no out-of-band slot authority. *Example:* 20 players grab
   for 2 ball-slots on the same tick; every node independently and identically selects the 2
   lowest-id valid grabbers. No node coordinates; a late lower-id grab arriving in the non-finalized
   window just re-decides on the ordinary rollback — no special "join desync", no pause-the-world.
   Naive systems treat a mid-game join as a special event (connection negotiation + state transfer +
   roster resync), often the worst stutter source. Here it deletes the entire join-race special case;
   the saved handshake bandwidth is a bonus. (The elegance is that determinism is what makes it free.)

4. **Cheap spectators (input-volume filtering) — bandwidth/storage/compute: HIGH at scale.** A
   non-discoverable input that doesn't follow a live discoverable is dropped on sight; a pure
   watcher's arrow-keys are never stored or relayed. *Example:* a 5-player game watched by 2000
   people — the 2000 cost ≈0 input bandwidth/storage; only the 5 participants' deltas flow. Naive
   netcode has no cheap-spectator notion. Turns "thousands of churning nodes" from impossible into
   routine.

5. **GC-by-absence / finalized-floor retention — storage: MEDIUM-HIGH.** Input arrays live only as far
   back as the non-finalized window needs for rollback; past grace they're GC'd; a participant's
   identity rides in the finalized baseline, not a hoarded log; a departed player costs zero.
   *Example:* a 4-hour session with 500 players cycling through keeps state for the current roster + a
   sub-second input tail, not 4 h × 500 input logs. Memory is bounded regardless of session length or
   churn.

6. **Knowledge-driven send — steady-state SILENCE (§13.5.2) — bandwidth: HIGH.** A node sends each
   peer only the change-ticks it can *prove* that peer still lacks (the per-(peer, source) `known` ledger), on
   a per-input exponential backoff that STOPS the instant an ack confirms the hold. Once everything is acked the
   input wire goes *completely silent* — no periodic frontier, no keep-alive resend; the send-gate (never
   resend uncorroborated past acceptance / corroborated past grace) bounds it with no separate timer. *Example:*
   a 4-player mesh settles to ZERO input/ack traffic between intent changes. Supersedes our own earlier §10.1
   design (periodic full-frontier ack + global RTO/SACK), which re-stated what peers already held every cadence.
   *(BUILT, §13.5.2 — now the default send path.)*

7. **Self-input dropping (option 3, §10) — bandwidth: MEDIUM in contested joins, ≈0 otherwise.** A
   node whose discoverable finalizes as a loss stops transmitting its follow-ups (it sends right up to
   the finalized floor, never speculatively, so no risk). *Example:* in the 20-grab-for-2 race the 18
   losers each stop sending gameplay inputs once their grab finalizes non-promoting, instead of
   broadcasting forever. Complements conditional join.

8. **Checkpoint-grid hashing (not per-frame) — bandwidth/compute: LOW-MEDIUM.** State hashes exist
   only on the interval grid; finalized hashes drive desync detection. *Example:*
   `snapshotIntervalTicks = 30` hashes 1 frame in 30, not every frame. Keeps desync detection cheap.

9. **Piggybacked grow-only-max beat vector + deterministic disconnect tick (B7) — bandwidth: LOW;
   rollbacks: LOW (avoids divergence).** Beats ride the existing id'd envelope as an overwrite map
   (§6.1), delta'd against each peer's `ackId`, with a dedicated beats-only message only on the
   interval floor; the disconnect tick is a deterministic grow-only-max so every node agrees → no
   disconnect-induced desync. Source- and forward-gating keep spectators free; gossip-on-new-max is
   loop-free. *Example:* a 50-node game never sends a 50×50 dedicated attendance mesh — beats free-ride
   on input traffic, and only beats that actually advanced are sent. (Supersedes the deleted
   pull-on-suspicion probe — §6.2.)

10. **Zero-handshake completeness-scored recovery (B8) — bandwidth: LOW (rare but spiky).** On a
    catastrophic desync both peers compute the SAME winner from already-shared hash-window data; only
    the loser requests state, point-to-point. No challenge round-trip to decide WHO wins. Saves a
    negotiation RTT and a broadcast when recovery (rare) does fire.

11. **Participant-gated liveness — compute: LOW.** Only participants' connection state is tracked; a
    spectator's liveness is untracked, so liveness cost scales with participant count, not total
    nodes.

12. **Forward self-suppression on connected graphs (§13.5.2) — bandwidth: MEDIUM-HIGH at mesh scale.**
    Relaying others' inputs is the same delta-send over `source ≠ self`; because per-input backoff lets a
    peer's direct-delivery ack arrive before the first forward fires, redundant forwards *evaporate with no
    topology knowledge*. *Example:* in a full triangle where everyone hears everyone directly, the
    A→(B-relays-to-C) echo is suppressed once C's own ack proves it already holds A's input — killing the
    echo storm a naive 'relay everything to everyone' mesh produces.

13. **Corroboration as a free durability threshold (§13.5) — compute/bandwidth: LOW; rollbacks: avoids
    divergence.** An input is canon only once ≥ 2 nodes hold it — learned for free from the acks already
    flowing (HOLDING, not a new message). A lone input that reached no one is *void*, excluded from everyone's
    count, so 'the network beats the late straggler' costs only the straggler's own rollback, never a
    negotiation. *Example:* a packet that dies in flight makes solely its author roll back and drop it; every
    other node was already correct.

14. **One-scalar, BigInt-exact canon decision (§13.6) — bandwidth: LOW; determinism: HIGH.** When histories
    diverge, each node ships ONE extra number (its frozen corroborated-own-count for the checkpoint window);
    the winner is a single integer-product comparison `∏(1+count)` — BigInt-exact (no float `Math.log`
    desync), computed identically by all from already-exchanged data, so there is no tiebreak round-trip.
    Freezing the count at the checkpoint makes the decision re-derivation-free. Complements item 10's
    zero-handshake recovery: both decide WHO wins from shared data, never a challenge RTT.

**Net effect:** bandwidth scales with `changes × participants` (not `ticks × nodes`); storage with
the non-finalized window (not session length); compute with the actual late-input rate (not a fixed
rollback budget). Input traffic falls to ZERO in steady state (not a per-tick frontier). And the events that are special-cased, stutter-prone rollback sources in naive
netcode — mid-game join, disconnect, recovery — become ordinary deterministic inputs here.

## 15. v1 scope (what to build first)

- **Hardcode Flag 1 = ON** (a disconnect becomes a leave). Keep the disconnect *distinguishable* from
  a voluntary leave at the signal layer so Flag 1 = OFF (reservation) can be added later without a
  data-model change.
- **Defer the held-spot / reservation apparatus** (Flag 1 OFF, `IsDisconnected`-flip reconnect). It is
  fully specified above but not needed for the first cut.
- **Flag 2:** lean CONSUME-gated for v1, to mirror discovery's consume-is-announce pattern and keep
  membership and cleanup in lockstep. Revisit if a Model-A lazy-dev path wants auto.
- **One announce path** (`discoverParticipants`); no `rediscover` primitive.
- Ship a deterministic `isParticipant(id)` and `isDisconnected(id)` for the §7 polling contract.
- **Self-input dropping = option 3 only** (§10) — it falls out of the §9 retention rules for free; the
  speculative `dropOwnUnpromotedInputs` knob (options 1/2) is deferred.
- **Read API:** ship `players()` + per-tick capability handles (§8): store the id, re-fetch the
  handle each tick. In prod the handle is just the bare id.

## 16. What gets DELETED / not built

- `startParticipating` as an explicit signal — retired (§6).
- `RediscoverParticipantIfPossible` as a separate primitive — it is sugar over an id-pinned predicate
  (§2.1); not built.
- A persistent "has-left" registry and a never-released warning — retired (§7).
- A simulation-affecting `max_candidates` eviction — not built (§11).
- "participant = has an input array" as the definition — retired (§1).

KEEP (still load-bearing):
- The discovery query / dev predicate / `limit` (conditional join, slot enforcement).
- The `discoverable` flag and input-volume filtering.
- B7 attendance-derived disconnect tick and its grow-only-max agreement (also reused for reconnect).
- Finalized-floor GC; STATE-hashed / inputs-not-hashed safety of receiver-side dropping.

## 17. Scope / honesty (CLAUDE.md §4)

- **Not implemented.** This is a design consolidation; no engine code exists for the announced-set
  model yet. The current engine still uses the older participant handling.
- **What is reasoned, not tested.** Every determinism claim here (rollback re-derives the same
  announced-set; consume-gating is idempotent under re-sim; reconnect-tick agreement) is design-level
  reasoning. None of it is yet pinned by a falsifying test.
- **The first tests to write (each must try to BREAK the model):**
  1. **Announced-set under adversarial ordering** — feed a late, lower-id `joinGame` into the
     non-finalized window and assert the rollback recomputes a *different* winner identically on all
     nodes (catches a mutable side-register / queue-pop).
  2. **Consume-gated idempotence** — roll back across a `getStoppedParticipating` consume and assert
     the demotion re-derives (catches queue-drain implementations).
  3. **Reconnect-tick agreement** — give two nodes skewed local attendance-resume observations and
     assert `IsDisconnected → false` lands on the same agreed tick (catches per-node liveness leaking
     into membership).
  4. **Departed-query loud-fail** — assert dev-mode THROWS on `poll(non-participant)` and
     `poll(disconnected participant)`, and that prod returns an identical default on every node.
- **Browser-only unknowns.** All of the above is validated on the deterministic `VirtualClock` +
  `MemoryTransport` harness. Real WebRTC attendance timing of disconnect/reconnect detection remains
  browser-only (HONEST_LIMITS.md).
  5. **Self-input dropping = option 3 fidelity** — assert a node keeps transmitting its follow-ups
     right up to the finalized floor and stops only after its discoverable finalizes as a loss; assert
     a surprise late promotion (contended slot frees in the non-finalized window) leaves NO gap,
     because sending never stopped while a win was possible.
  6. **Handle hygiene (§8)** — assert a handle is per-tick: a handle held past its issuing tick
     dev-loud-fails, while storing the plain ID across a rollback re-resolves to the same player
     identically on every node. Assert prod resolves a stale handle via the bare id to the sync-safe
     §7 default.
  7. **`players()` respects consumption (§8)** — assert `players()` before vs after a promoting
     `discoverParticipants` differs by exactly the announced id, that the result is identical on every
     node and re-derives on rollback, and that under consume-gating a departed id persists in
     `players()` until `getStoppedParticipating()` is consumed.
  8. **Decoder reorder-tolerance (§10.1, the load-bearing one for the messaging redesign)** — feed the
     three changes `(t10:A), (t20:B), (t30:A)` into a `SparseInputDecoder` in EVERY arrival permutation
     and assert `valueAt` over the whole range is identical for all six orderings. This must FAIL on
     today's code: `applyChange`'s "equal to held-before value" suppression (SparseInput.js ~L111-116)
     drops the `t30:A` change when `t20:B` has not yet arrived, so the late-arriving `t20:B` then
     reconstructs the wrong stream. The fix under test: suppress only on an *exact-tick* equal value,
     otherwise always insert — making reconstruction a pure function of the change SET, not its arrival
     order. (This is the single concrete code change the §10.1 transport redesign actually requires;
     everything else in §10.1 is design-stage.)
  9. **Bounded redundancy absorbs a burst with zero extra RTT (§10.1)** — with last-N piggyback
     redundancy (N=3) and a drop pattern that kills ≤ N-1 *consecutive* messages, assert every change
     still reconstructs WITHOUT any retransmit request firing (the loss is masked by the next in-flight
     message, not recovered a round-trip later). Then assert that a burst of ≥ N consecutive losses
     DOES eventually reconcile via the RTO/SACK targeted resend — and, critically, that the RTO trigger
     is TIME-based: a large RTT that legitimately keeps > N messages in flight must NOT be mistaken for
     loss (catches a message-count-based "slid out of window" detector, which was the wrong first
     design — see §10.1).
 10. **Beat convergence is order- and loss-free (§6.1)** — deliver per-player beats (grow-only-max tick
     scalars) in scrambled order with arbitrary drops and assert every node's `DisconnectTracker`
     converges to the same max per player with NO ack machinery, purely from overwrite + gossip +
     re-stamp. Assert a stale (lower) beat arriving late is discarded, and that the value-keyed
     ×3 repeat counter resets when a newer beat is learned mid-burst.
 11. **Knowledge-driven send stops in steady state (§13.5.2)** — on a clean link, assert input traffic goes
     to ZERO once every change is acked (no periodic frontier re-broadcast), and that a single dropped copy
     is re-sent on the per-input backoff and STOPS the instant its `(peer, source, tick)` ack arrives. Assert
     the SEND-GATE holds: an own *uncorroborated* input is never sent past the *acceptance* window, and a
     *corroborated* input is never sent past the *grace* window (while acking either is still allowed). Assert
     a small base interval lands ≥ 1 retry *inside* acceptance for an uncorroborated input, and that an input
     un-acked at its window edge simply stops (not resent forever, and not capped by special backoff math —
     the gate, checked before the backoff, is the only bound). Assert a redundant forward is suppressed when
     the direct-delivery ack beats the first forward interval. Catches: a frontier broadcast that never
     quiets; a send that escapes its window; an own uncorroborated input "winning via resend" past acceptance.
 12. **No-augmentation canon accounting (§13.6)** — three nodes A, B, C where A's own input x is held by B
     (so genuinely corroborated) but A lost ALL of B's acks (A believes x uncorroborated). Assert A and C
     reach the **same** winner when comparing histories — A's x is judged by A's frozen self-count
     identically by every node, so x is dropped *consistently*, never split-brain. Then assert that a judge
     which *augments* A's asserted count with its own knowledge of A's inputs (the rejected "union") can be
     driven into a Condorcet cycle across the three nodes (the falsification that justifies no-augmentation).
- **Open knobs not yet decided:** Flag 2 default per source (uniform vs split); whether `rediscover`
  sugar is ever warranted; the receiver-side flood cap (deferred Byzantine valve); the speculative
  `dropOwnUnpromotedInputs` knob (options 1/2 in §10) and, under it, option 1 vs 2.
