Lean nodes that fall behind cannot recover. The only block-fetch primitive is `BlocksByRoot`, so closing an 872-slot gap takes ~872 sequential round trips while gossip orphans pile up faster than they resolve. This issue adds a chunked `BlocksByRange` protocol, mirroring the established beacon design.
## Problem
`src/lean_spec/subspecs/networking/reqresp/message.py` defines only `BlocksByRootRequest`. When a gossip block arrives with a missing parent, the sync layer queues a single-root fetch; the parent's parent is also missing, so another fetch follows, and so on.

`src/lean_spec/subspecs/sync/backfill_sync.py` already batches per call (`MAX_BLOCKS_PER_REQUEST = 10`), but the caller pattern feeds one orphan in at a time, so batching never engages.

With ranges, an 872-slot gap closes in `ceil(872 / 1024) = 1` round trip.

The caller-pattern fan-in (the gossip handler enqueues one orphan at a time) is partly to blame and worth fixing in passing, but the structural fix is ranges.
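To make the round-trip claim concrete, here is the arithmetic as a quick sketch (the two functions are illustrative, not repo code; the constants are the existing limits cited in this issue):

```python
import math

MAX_BLOCKS_PER_REQUEST = 10   # existing backfill batch size
MAX_REQUEST_BLOCKS = 1024     # existing per-request cap, reused by the proposal

def round_trips_by_root(gap_slots: int) -> int:
    # Parent discovery is inherently sequential: each response reveals
    # only the next missing parent root, so one RPC per slot.
    return gap_slots

def round_trips_by_range(gap_slots: int) -> int:
    # Each range request covers up to MAX_REQUEST_BLOCKS consecutive slots.
    return math.ceil(gap_slots / MAX_REQUEST_BLOCKS)

print(round_trips_by_root(872))   # 872
print(round_trips_by_range(872))  # 1
```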
## Proposed protocol
Mirror beacon `BeaconBlocksByRange` under the lean namespace. The new protocol is added alongside `BlocksByRoot`, not replacing it.

```python
# src/lean_spec/subspecs/networking/reqresp/message.py
BLOCKS_BY_RANGE_PROTOCOL_V1: Final = ProtocolId(
    "/leanconsensus/req/blocks_by_range/1/ssz_snappy"
)

class BlocksByRangeRequest(Container):
    start_slot: Slot
    count: Uint64
```

- No `step` field. Deprecated in beacon v2; no validator client used it. If a use case appears, bump to `/2/`.
- Response framing: chunked `ssz_snappy`, one `SignedBlock` per chunk, identical to `BlocksByRoot`.
- Reuse existing limits: `MAX_REQUEST_BLOCKS = 1024`, `MAX_PAYLOAD_SIZE = 10 MiB`, `TTFB_TIMEOUT = 5.0s`, `RESP_TIMEOUT = 10.0s`. No new constants.
- Status codes: `0 = SUCCESS`, `1 = INVALID_REQUEST`, `2 = SERVER_ERROR`, `3 = RESOURCE_UNAVAILABLE`.
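A requester splitting a gap into valid requests under these limits might look like the following sketch (`BlocksByRangeRequest` is a plain dataclass stand-in for the SSZ container, and `split_gap` is a hypothetical helper, not repo code):

```python
from dataclasses import dataclass

MAX_REQUEST_BLOCKS = 1024  # reused limit; no new constants

@dataclass(frozen=True)
class BlocksByRangeRequest:
    start_slot: int  # Slot
    count: int       # Uint64

def split_gap(first_missing: int, head_slot: int) -> list[BlocksByRangeRequest]:
    """Cover [first_missing, head_slot] with requests of <= MAX_REQUEST_BLOCKS slots."""
    requests = []
    slot = first_missing
    while slot <= head_slot:
        count = min(MAX_REQUEST_BLOCKS, head_slot - slot + 1)
        requests.append(BlocksByRangeRequest(start_slot=slot, count=count))
        slot += count
    return requests

# An 872-slot gap (slots 100..971 inclusive) fits in a single request.
reqs = split_gap(first_missing=100, head_slot=971)
```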
## Responder rules

- MUST serve blocks canonical on the responder's current fork choice.
- MUST return blocks at consecutive slots; missing slots omitted, order preserved.
- MUST ensure each block's `parent_root` matches the previous returned block's root, or links to a known ancestor.
- MUST return `INVALID_REQUEST` if `count == 0` or `count > MAX_REQUEST_BLOCKS`.
- MUST return `RESOURCE_UNAVAILABLE` if `start_slot` predates the retained history window.
- For slots `<= finalized.slot` from the most recent `Status`, returned blocks MUST lead to that finalized root.
- MAY stop early on fork-choice change, load shedding, or reaching head.
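The validation and skip-slot rules above can be sketched as a minimal responder (illustrative only: `get_canonical_block` is a hypothetical store lookup returning the canonical block at a slot, or `None` for a skip slot; serving from the canonical chain is what makes the `parent_root` linkage rule hold):

```python
# Status codes from the proposal.
SUCCESS, INVALID_REQUEST, SERVER_ERROR, RESOURCE_UNAVAILABLE = 0, 1, 2, 3
MAX_REQUEST_BLOCKS = 1024

def serve_range(start_slot, count, earliest_retained_slot, get_canonical_block):
    """Return (status, blocks) for a BlocksByRange request."""
    if count == 0 or count > MAX_REQUEST_BLOCKS:
        return INVALID_REQUEST, []
    if start_slot < earliest_retained_slot:
        return RESOURCE_UNAVAILABLE, []
    blocks = []
    for slot in range(start_slot, start_slot + count):
        block = get_canonical_block(slot)
        if block is not None:      # missing (skip) slots are simply omitted
            blocks.append(block)
    return SUCCESS, blocks
```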
## Requester rules

- Verify slots strictly increasing and `block.parent_root == prev_block.hash_tree_root()` before importing.
- On a `parent_root` or slot-monotonicity violation: drop the response, downscore the peer.
- During initial sync, request overlapping ranges from at least two peers; cross-check `hash_tree_root` agreement at overlap slots.
- Treat `RESOURCE_UNAVAILABLE` as non-punitive.
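The first two rules reduce to a pairwise check over the response chunks, sketched below (`Block` is a minimal stand-in whose `root` field plays the role of `hash_tree_root()`; checking the first block's parent against the local store is omitted here):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    slot: int
    root: bytes        # stand-in for hash_tree_root()
    parent_root: bytes

def verify_chunks(blocks: list[Block]) -> bool:
    """True iff slots strictly increase and each block links to the previous one."""
    for prev, block in zip(blocks, blocks[1:]):
        if block.slot <= prev.slot:
            return False  # slot-monotonicity violation: drop response, downscore peer
        if block.parent_root != prev.root:
            return False  # parent_root violation: drop response, downscore peer
    return True
```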
## Implementation checklist

Stage 1 — Protocol surface
- Add `BLOCKS_BY_RANGE_PROTOCOL_V1` and `BlocksByRangeRequest` to `networking/reqresp/message.py`.
- Range handler in `networking/reqresp/handler.py`, mirroring the existing single-root handler.
- Client method in `networking/client/reqresp_client.py`, mirroring `request_blocks_by_root`.

Stage 2 — Sync integration
- `BackfillSync` uses ranges first; falls back to `BlocksByRoot` only for residual missing roots.

Stage 3 — Peer scoring hooks
- No penalty for `RESOURCE_UNAVAILABLE` or empty post-head ranges.
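The Stage 2 flow, sketched end to end (illustrative only: `fetch_range` and `fetch_by_root` stand in for the req/resp client calls, and blocks are represented by their roots for brevity):

```python
MAX_REQUEST_BLOCKS = 1024  # reused per-request cap

def backfill(first_missing_slot, head_slot, wanted_roots, fetch_range, fetch_by_root):
    """Ranges first; BlocksByRoot only as a targeted sweep-up for residual roots."""
    got = []
    slot = first_missing_slot
    while slot <= head_slot:
        count = min(MAX_REQUEST_BLOCKS, head_slot - slot + 1)
        got.extend(fetch_range(slot, count))   # roots received for this range
        slot += count
    # Anything still missing (e.g. a peer stopped early) goes through BlocksByRoot.
    residual = [r for r in wanted_roots if r not in set(got)]
    if residual:
        got.extend(fetch_by_root(residual))
    return got
```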
## Test plan

`tests/lean_spec/subspecs/networking/reqresp/test_message.py`
- `BlocksByRangeRequest` SSZ round-trip: encode then decode yields identical container
- `hash_tree_root` stable across re-encodings of equal requests
- `count == 0` with `INVALID_REQUEST`
- `count > MAX_REQUEST_BLOCKS` (boundary: MAX, MAX+1)
- `start_slot` at `Slot(0)` and at `Uint64.MAX` decode cleanly

`tests/lean_spec/subspecs/networking/reqresp/test_handler.py`
- `count` consecutive blocks from `start_slot` when all retained
- Fewer than `count` when the range overruns head, no error
- `RESOURCE_UNAVAILABLE` when `start_slot` predates retained history
- `INVALID_REQUEST` on `count == 0`, `count > MAX_REQUEST_BLOCKS`, malformed SSZ

`tests/lean_spec/subspecs/networking/client/test_reqresp_client.py`
- Validates `parent_root` against the previous chunk; downscores on violation
- Returned slots fall within `[start_slot, start_slot + count)`

`tests/lean_spec/subspecs/sync/test_backfill_sync.py`
- An 872-slot gap closes in `ceil(872 / MAX_REQUEST_BLOCKS)` RPCs

`tests/consensus/` (spec fixtures)
- `state_transition_test`: range-based catch-up applies a contiguous block batch and yields expected post-state
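The count-boundary cases in the plan reduce to a small pure check, sketched here in pytest style (the `validate_count` helper is illustrative, not the repo's actual handler or fixtures):

```python
MAX_REQUEST_BLOCKS = 1024
INVALID_REQUEST = 1  # status code from the proposal

def validate_count(count: int):
    """Return INVALID_REQUEST for out-of-range counts, None when acceptable."""
    if count == 0 or count > MAX_REQUEST_BLOCKS:
        return INVALID_REQUEST
    return None

def test_count_boundaries():
    assert validate_count(0) == INVALID_REQUEST
    assert validate_count(1) is None
    assert validate_count(MAX_REQUEST_BLOCKS) is None
    assert validate_count(MAX_REQUEST_BLOCKS + 1) == INVALID_REQUEST
```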
## Out of scope

- No V2 with skip-slot semantics or `step` field.
- No blob-range framing (lean has no blobs).
- No peer-scoring overhaul beyond the localized hooks above.
- No changes to `BlocksByRoot`; it remains for targeted recovery.
- No backward-compatibility shims (per repo policy).
- Validator-side mitigations (sync-lag duty gate, fork-choice attestation filter) are tracked in a separate issue.
## Open questions

- Should the request or each response chunk carry an explicit `head_root` to disambiguate forks the responder is on, or is `Status` sufficient?
- For non-finalized slots, MUST the responder serve only its canonical fork, or MAY it serve any fork on explicit requester opt-in?
- Rate-limit policy (per-peer requests/sec, max in-flight)? Beacon leaves this client-defined; should lean spec a floor?
- During catch-up, should an empty range count against peer score, or is it expected near the responder's head?
## References

- Beacon `BeaconBlocksByRange`: `/eth2/beacon_chain/req/beacon_blocks_by_range/1/` in `consensus-specs/specs/phase0/p2p-interface.md`.
- Existing `BlocksByRoot`: `src/lean_spec/subspecs/networking/reqresp/message.py`.
- Networking constants: `src/lean_spec/subspecs/networking/config.py`.
- Sync layer: `src/lean_spec/subspecs/sync/backfill_sync.py`, `src/lean_spec/subspecs/sync/service.py`.