# Wire Protocol Specification — v2

**Status:** Draft. To be filled in concretely during Phase B as each message type is implemented.

## Overview

Wire-level format for messages exchanged between peers. Lives above the Transport layer (see `TRANSPORT_SPEC.md`). All messages are JS objects (structured-clone safe). Field names use camelCase.

Tunable parameters (windows, intervals, cadences) are documented per message type and configurable per-game (`DECISIONS.md` #19). The library will ship reasonable defaults — choosing them is `KNOWN_ISSUES.md` open question #4.

---

## Delta-Sync Envelope

The network is CONNECTED, not complete (`DECISIONS.md` #39): messages may be relayed, delayed, duplicated, or lost. To synchronize knowledge with minimal traffic, layer-2 messages carry an identity/ack envelope on top of their type-specific payload:

```js
{
  // ... type-specific fields below ...
  id:    <int>,   // monotonic per-sender message id
  ackId: <int>    // highest id this sender has received FROM the recipient
}
```

A sender transmits only the DELTA the recipient has not yet acknowledged (`id > recipient.ackId`). Each node keeps ONE global watermark store `{ id -> knowledge-diff }` (NOT a per-peer log); a `knowledge-diff` records what changed at that id (new/changed inputs, an advanced finalized frame, new frame-hash updates). Bringing a peer up to date = replay every diff with `id` above their `ackId`. Memory is bounded to the unacknowledged tail; per-peer bookkeeping is a single integer. Conceptual narrative: `easy_multiplayer_redesign_concretized_architecture.md` § *Connected-Network Delta Sync*.
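
The bookkeeping above can be sketched roughly as follows. This is an illustration under two assumptions, not the library's implementation: acks are treated as cumulative (receiving id N implies every diff up to N was delivered), and the store is pruned to the lowest ack among known peers. `DeltaStore`, `record`, `deltaFor`, and `noteAck` are hypothetical names.

```js
// Hypothetical sketch: one global diff store plus a single ack integer per peer.
class DeltaStore {
  constructor() {
    this.nextId = 1;          // monotonic per-sender message id
    this.diffs = new Map();   // id -> knowledge-diff (insertion order is id order)
    this.peerAck = new Map(); // peerId -> highest of OUR ids that peer has acked
  }

  // Record a local knowledge change and return the id assigned to it.
  record(diff) {
    const id = this.nextId++;
    this.diffs.set(id, diff);
    return id;
  }

  // Every retained diff the peer has not acknowledged yet, in id order.
  deltaFor(peerId) {
    const ackId = this.peerAck.get(peerId) ?? 0;
    const out = [];
    for (const [id, diff] of this.diffs) {
      if (id > ackId) out.push({ id, diff });
    }
    return out;
  }

  // Fold in an ackId (grow-only, so duplicates and reordering are harmless),
  // then drop diffs that every known peer has acknowledged (assumed pruning rule).
  noteAck(peerId, ackId) {
    const prev = this.peerAck.get(peerId) ?? 0;
    this.peerAck.set(peerId, Math.max(prev, ackId));
    const floor = Math.min(...this.peerAck.values());
    for (const id of this.diffs.keys()) {
      if (id <= floor) this.diffs.delete(id);
    }
  }
}
```

Per-peer state is only the integer in `peerAck`; the diffs themselves are stored once and shared by every recipient.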

---

## Message Types

### `intent` — Sparse Input Change (Goal B1)

```js
{
  type: 'intent',
  fromPeer: <opaque>,
  participantId: <opaque>,
  tick: <int>,
  intent: <game-defined object> | null   // null = transition to passive
}
```

Semantics: silence between intent messages means "unchanged". Receiver reconstructs the continuous input stream. May be retransmitted within the acceptance window.
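
As a rough illustration of "silence means unchanged", the sketch below rebuilds a per-tick stream from sparse `intent` messages for a single participant. `reconstructInputs` is a hypothetical helper, and it assumes the participant is passive (`null`) before its first message.

```js
// Hypothetical helper: expand sparse intent messages into one entry per tick.
// Assumes all messages belong to one participant and that the input is
// passive (null) until the first change arrives.
function reconstructInputs(intentMessages, fromTick, toTick) {
  // Key by tick so a retransmitted message does not create a second change.
  const byTick = new Map();
  for (const msg of intentMessages) byTick.set(msg.tick, msg.intent);

  const changeTicks = [...byTick.keys()].sort((a, b) => a - b);
  const stream = [];
  let current = null;
  let next = 0;
  for (let tick = fromTick; tick <= toTick; tick++) {
    // Apply every change at or before this tick; otherwise carry the last value.
    while (next < changeTicks.length && changeTicks[next] <= tick) {
      current = byTick.get(changeTicks[next]);
      next++;
    }
    stream.push({ tick, intent: current });
  }
  return stream;
}
```

For example, changes at ticks 2 (`{ dx: 1 }`) and 4 (`null`) yield passive input for ticks 0 and 1, `{ dx: 1 }` for ticks 2 and 3, and passive input again from tick 4.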

### ~~`attendance`~~ — NOT a Layer-2 message

Heartbeats are **transport-internal**, not a wire message at this layer. Per `TRANSPORT_SPEC.md` resolved question #5 and `DECISIONS.md` #22, liveness lives entirely inside Layer 1: `TrysteroTransport` uses WebRTC connection state, `MemoryTransport` is driven by test controls, and a hypothetical raw-datagram transport would run its own internal heartbeat. Layer 2 sees only the *results* — `onPeerJoined` / `onPeerLeft` / `getPeers()`.

Goal B2 ("transport-level heartbeat") therefore refers to that internal mechanism realized inside transport implementations that need it — it does **not** add a message type here. The shape below is retained only as a reference for an implementation that chooses an app-level heartbeat internally; it never reaches `onMessage`.

```js
// transport-internal only — never surfaced to Layer 2
{ type: 'attendance', fromPeer: <opaque>, transportTime: <int> }
```

Absence of attendance (not absence of intents) drives a transport's own `onPeerLeft`.

**Implementation (Goal B2).** A reusable `transports/HeartbeatLiveness.js` component encapsulates this for any transport that needs it. It emits beats on a fixed cadence via an injected `sendHeartbeat`, tracks each peer's last-seen time from `noteHeartbeat(peerId)`, and on a periodic sweep fires `onLeft` for peers whose last beat is older than the timeout (and `onJoined` on a peer's first beat or post-timeout rejoin). It is pure and clock-driven (VirtualClock `now()/schedule()` contract), so liveness is deterministic and testable. `MemoryTransport` composes it under an opt-in `{ heartbeat }` option; production transports (`TrysteroTransport`) may instead use native connection state.

**Detection bound.** A gone peer is reported within `timeoutMs + sweepMs` (the timeout plus up to one sweep period). `sweepMs` defaults to the attendance interval (`intervalMs`), so with a steady-beat sender the general bound is `timeoutMs + intervalMs`. That satisfies the Goal B2 "≤ 2× attendance interval" bound only when `timeoutMs ≤ intervalMs`. In practice `timeoutMs` should be set to a small multiple of the interval to tolerate the loss of a few beats, in which case the detection bound exceeds 2× the interval.
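
The detection bound described above is simple arithmetic, sketched here for concreteness. `livenessBounds` is a hypothetical helper, not part of `HeartbeatLiveness.js`, and the lost-beat count is idealized (it ignores network latency and jitter).

```js
// Hypothetical helper: derive the liveness bounds from the three tunables.
function livenessBounds({ intervalMs, timeoutMs, sweepMs = intervalMs }) {
  // Worst case: the timeout expires just after a sweep, so the next sweep reports it.
  const detectionBoundMs = timeoutMs + sweepMs;
  // Largest k such that the (k+1)-th beat still arrives within the timeout.
  const toleratedLostBeats = Math.max(0, Math.floor(timeoutMs / intervalMs) - 1);
  // Goal B2 asks for detection within 2x the attendance interval.
  const meetsTwoIntervalGoal = detectionBoundMs <= 2 * intervalMs;
  return { detectionBoundMs, toleratedLostBeats, meetsTwoIntervalGoal };
}
```

With a 500 ms interval and a 2000 ms timeout this gives a 2500 ms detection bound and about 3 tolerated lost beats, and the 2× goal is not met. With `timeoutMs` equal to `intervalMs` the bound is exactly 2× the interval, but no lost beat is tolerated.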

### `hashWindow` — Rolling Hash Broadcast (Goal B5)

```js
{
  type: 'hashWindow',
  fromPeer: <opaque>,
  oldestTick: <int>,
  interval: <int>,
  stateHashes: [<hash>, ...],     // hashes at oldestTick, oldestTick+interval, oldestTick+2*interval, ...
  usedInputs: [...]               // relevant queried inputs since oldestTick
}
```

Cadence and `interval` are tunable per `DECISIONS.md` #19.

**Finalized vs non-finalized hashes (`DECISIONS.md` #39).** Hashes exist only at checkpoints (the `interval` grid), never per-frame. The recovery-relevant *finalized frame* is therefore the latest CHECKPOINT at-or-before the grace horizon (`currentTick - graceWindowTicks`) — a frame past grace that is not on the grid is never hashed nor reported as finalized. **Finalized** checkpoint hashes (checkpoint ≤ horizon) are ALWAYS broadcast; they feed catastrophic-desync detection and the B8 completeness score. **Non-finalized** checkpoint hashes (checkpoints still inside the grace window) are broadcast ONLY when the opt-in flag `sendNonFinalizedHashes` is set — they can reveal only a rollback-able / non-deterministic divergence that a transfer cannot repair, so their value is unproven and they are off by default.
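
A small sketch of the finalized-checkpoint rule above, assuming the checkpoint grid is anchored at tick 0. Both function names and the `gridOrigin` parameter are hypothetical; the message fields are the `hashWindow` fields defined in this section.

```js
// Latest checkpoint at or before the grace horizon, or null if none exists yet.
// gridOrigin is an assumed parameter: the tick where the checkpoint grid starts.
function latestFinalizedCheckpoint(currentTick, graceWindowTicks, interval, gridOrigin = 0) {
  const horizon = currentTick - graceWindowTicks;
  if (horizon < gridOrigin) return null;
  return gridOrigin + Math.floor((horizon - gridOrigin) / interval) * interval;
}

// Split a hashWindow message into finalized and non-finalized checkpoint hashes.
function splitHashWindow(msg, currentTick, graceWindowTicks) {
  const horizon = currentTick - graceWindowTicks;
  const finalized = [];
  const nonFinalized = [];
  msg.stateHashes.forEach((hash, i) => {
    const tick = msg.oldestTick + i * msg.interval;
    (tick <= horizon ? finalized : nonFinalized).push({ tick, hash });
  });
  return { finalized, nonFinalized };
}
```

At `currentTick = 107` with a 20-tick grace window and `interval = 10`, the horizon is tick 87 and the finalized frame is checkpoint 80; ticks 81 through 87 are past grace but off the grid, so they are never hashed.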

### `relay` — Already-Accepted Input Relay (Goal B6)

*(format TBD; carries an already-accepted intent for grace-window convergence; receiver accepts only because some other peer has already accepted it)*

### `bootstrapRequest` / `bootstrapResponse` — Joining (Goal B10)

*(format TBD; bootstrapResponse carries grace-window state + sparse intent log + current-per-participant intent; serving peer chosen randomly per `DECISIONS.md` #18)*

### `stateChallenge` / `stateTransfer` — Severe Desync Recovery (Goal B8)

Severe (CATASTROPHIC) desync = a FINALIZED checkpoint hash differs (history below the grace window can no longer be reconciled by accepting inputs). Resolution is COMPLETENESS-SCORED with an authority tiebreaker, and ZERO-HANDSHAKE: both peers independently compute the same verdict from data they already share via the hash-window / delta-sync stream, so no `stateChallenge` round-trip is needed to decide *who* wins. Only the LOSER acts.

**Completeness score** (`DECISIONS.md` #39), evaluated at `min(myFinalizedFrame, theirFinalizedFrame)`, summed over players:

```text
+1   a player I have finalized that they do not
-1   a player they have finalized that I do not
+1   a shared player whose most-recent finalizing input I hold is NEWER
-1   a shared player whose most-recent finalizing input THEY hold is newer
 0   a player both finalize identically (or known to have no input)
```

(Inputs are accepted only in order — no gaps — so the most-recent input before the finalized frame is a sufficient completeness summary for a shared player.)

```text
score > 0  -> I WIN:  do nothing (serve on request)
score < 0  -> I LOSE: request the winner's full state
score = 0  -> authority tiebreaker: older simulationAge, then lower peerId
```
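
The scoring and verdict rules can be sketched as below. The data shape is an assumption made for illustration: each side is summarized as a plain object mapping a player id to the tick of that player's most-recent finalizing input, with a missing key meaning "not finalized by this side". The sketch also assumes a larger `simulationAge` means older and that `peerId` values are comparable with `<`; the "known to have no input" case is not modeled separately.

```js
// Hypothetical sketch of the completeness score, from "my" point of view.
function completenessScore(mine, theirs) {
  let score = 0;
  const players = new Set([...Object.keys(mine), ...Object.keys(theirs)]);
  for (const p of players) {
    const a = mine[p];
    const b = theirs[p];
    if (a !== undefined && b === undefined) score += 1;      // only I finalized p
    else if (a === undefined && b !== undefined) score -= 1; // only they finalized p
    else if (a > b) score += 1;                              // my finalizing input is newer
    else if (a < b) score -= 1;                              // theirs is newer
    // equal ticks contribute 0
  }
  return score;
}

// Verdict with the authority tiebreaker: older simulationAge, then lower peerId.
function verdict(score, me, them) {
  if (score > 0) return 'win';
  if (score < 0) return 'lose';
  if (me.simulationAge !== them.simulationAge) {
    return me.simulationAge > them.simulationAge ? 'win' : 'lose';
  }
  return me.peerId < them.peerId ? 'win' : 'lose';
}
```

Because `completenessScore(a, b)` is always the negation of `completenessScore(b, a)`, two peers holding the same summaries reach opposite verdicts without exchanging a message, which is what makes the scheme zero-handshake.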

**`stateTransfer`** — the full finalized snapshot the loser requests from the exact peer it lost to (point-to-point, not broadcast). Mirrors the B10 bootstrap payload (`DECISIONS.md` #18) plus disconnect-tick reconciliation:

```js
{
  type: 'stateTransfer',
  fromPeer: <opaque>,
  tick: <int>,                    // finalized tick of the snapshot
  snapshot: <game-defined state>, // opaque, by reference — never cloned
  inputLog: [...],                // since-edge sparse intent log
  baselineInputs: {...},          // per-participant baseline-at-edge
  lastAttendanceTicks: {...}      // folded grow-only-max (#30 slow-path fallback)
}
```

**`stateChallenge`** is the loser's point-to-point REQUEST for the `stateTransfer` above. The single in-flight guard must not wedge if the request or its reply is lost: the loser RE-REQUESTS on a rate-limited timeout until the transfer lands. This behavior is pinned by the currently RED test `tests/recovery-lost-request.test.js`. The re-request cadence is an open question in `KNOWN_ISSUES.md` ("something that makes sense").

```js
{
  type: 'stateChallenge',
  fromPeer: <opaque>,
  finalizedFrame: <int>           // the contested finalized frame the loser is requesting state for
}
```
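
One way the non-wedging in-flight guard could look, as a clock-driven sketch. `StateRequester`, its methods, and `retryMs` are hypothetical; the actual re-request cadence is still an open question, so `retryMs` is only a placeholder.

```js
// Hypothetical sketch: single in-flight stateChallenge with rate-limited re-request.
class StateRequester {
  constructor({ send, retryMs }) {
    this.send = send;       // injected: send(toPeer, message)
    this.retryMs = retryMs; // placeholder cadence, not a chosen default
    this.pending = null;    // at most one request in flight
  }

  // Call when this node loses the verdict against winnerPeer.
  request(winnerPeer, finalizedFrame, myPeer, now) {
    if (this.pending) return; // single in-flight guard
    this.pending = { winnerPeer, finalizedFrame, myPeer, lastSentAt: -Infinity };
    this.tick(now);
  }

  // Call periodically; re-sends if no transfer has landed within retryMs.
  tick(now) {
    const p = this.pending;
    if (!p || now - p.lastSentAt < this.retryMs) return;
    p.lastSentAt = now;
    this.send(p.winnerPeer, {
      type: 'stateChallenge',
      fromPeer: p.myPeer,
      finalizedFrame: p.finalizedFrame,
    });
  }

  // Call on an incoming stateTransfer; only the peer we asked clears the guard.
  onStateTransfer(msg) {
    if (this.pending && msg.fromPeer === this.pending.winnerPeer) this.pending = null;
  }
}
```

The guard is released only by an arriving `stateTransfer`, never by the act of sending, so a lost request or a lost reply leads to another attempt after `retryMs` rather than a wedge.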

**Non-finalized** desync (a non-finalized checkpoint differing under EQUAL inputs) is NOT healed here — it indicates non-deterministic game code that a transfer cannot fix (re-running the same inputs re-diverges). See `KNOWN_ISSUES.md` (whether to scrap non-finalized healing entirely).

### Disconnect-Tick Convergence (Goal B7 local; cross-network via beat forwarding — SUPERSEDES the B7.1 probe)

Local disconnect tick = `lastAttendanceTick + timeoutTicks` (B7, deterministic grow-only-max). Cross-network convergence is **beat forwarding (grow-only-max gossip)**, not a pull-on-suspicion probe.

> **SUPERSEDED (2026-06-05).** The original `disconnectSuspicion` / `attendanceCorrection` pull-on-suspicion probe (`DECISIONS.md` #30, implemented in B7.1 as `DisconnectProbe.js` with the rule "attendance beats are NEVER proactively forwarded") has been **DELETED** (code and tests removed). The probe rode the same reliable transport as an attendance beat but paid a 2-trip request/response, so it could never rescue a convergence case that 1-trip reliable beat gossip cannot. It was also one-shot per `(playerId, tickY)`, so a single dropped correction caused a false disconnect. See `DESIGN_PARTICIPATION.md` §6.2 for the full reductio and §6.1 for the replacement.

A **beat** is one grow-only-max tick number per player ("P was alive as of tick N"). A node **forwards** a beat whenever it advances its local max for a relevant player (gossip-on-new-info → loop-free; grow-only-max absorbs duplicates and reorder). A lost beat needs no retransmit — the next message's value supersedes it (self-healing map, not a log). The fallback when gossip has not yet converged is a B5 desync → B8 severe-desync recovery carrying last-attendance-tick in the #18 bootstrap payload (the ultimate backstop; only the *middle* probe layer between gossip and B8 was removed).

The hardening precondition (`DESIGN_PARTICIPATION.md` §6.2): `timeoutTicks` must span several gossip intervals so a beat propagates network-wide before any node would time the player out. Forwarding itself (the `relevantPlayers` forward-gate) is design-stage — see `DESIGN_PARTICIPATION.md` §6.1.
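
The grow-only-max beat rule and the local disconnect tick can be sketched as follows. `BeatTable` and its methods are hypothetical names, and the sketch omits the `relevantPlayers` forward-gate since that part is still design-stage.

```js
// Hypothetical sketch: grow-only-max beat table with gossip-on-new-info.
class BeatTable {
  constructor(forward) {
    this.maxTick = new Map(); // playerId -> highest "alive as of" tick seen
    this.forward = forward;   // injected: forward(playerId, tick) to neighbours
  }

  // Returns true when the beat advanced the local max (and was forwarded).
  noteBeat(playerId, tick) {
    const known = this.maxTick.get(playerId) ?? -Infinity;
    if (tick <= known) return false; // duplicate or reordered beat: absorbed
    this.maxTick.set(playerId, tick);
    this.forward(playerId, tick);    // forward only on new information
    return true;
  }

  // Local disconnect tick = lastAttendanceTick + timeoutTicks.
  disconnectTick(playerId, timeoutTicks) {
    const last = this.maxTick.get(playerId);
    return last === undefined ? undefined : last + timeoutTicks;
  }
}
```

Forwarding only when the max advances is what keeps the gossip loop-free: a beat that returns to a node that already holds it changes nothing and is dropped.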

---

## Open

- Versioning scheme (per-message `version` field vs handshake)
- Compression / encoding (JSON? msgpack? structured-clone-only?)
- Maximum message size assumptions

---

## Defaults (placeholder)

Sensible starting values to be chosen and recorded here under `KNOWN_ISSUES.md` open question #4:

- `acceptanceWindowMs` — TBD
- `graceWindowMs` — TBD (must be > acceptanceWindowMs)
- `snapshotIntervalTicks` — TBD
- `hashWindowBroadcastIntervalMs` — TBD
- `sendNonFinalizedHashes` — **false** (opt-in; broadcast non-finalized checkpoint hashes — see B8 / `DECISIONS.md` #39)
- `attendanceIntervalMs` — **500** (HeartbeatLiveness default; tunable per transport)
- `livenessTimeoutMs` — **2000** (HeartbeatLiveness default; tolerates ~3 lost beats at the 500 ms cadence)
- `livenessSweepMs` — defaults to `attendanceIntervalMs` (the periodic timeout-check cadence)
