feat(l1): parallel body download from multiple peers by azteca1998 · Pull Request #6490 · lambdaclass/ethrex

azteca1998 · 2026-04-16T10:42:05Z

Summary

Adds request_block_bodies_parallel() to PeerHandler that splits headers into 128-block chunks and dispatches each to a different peer via tokio::spawn
Adds download_bodies_from_peer() helper for single-peer body fetch within spawned tasks
Replaces the sequential body download loop in sync_cycle_full with the parallel version
Failed chunks are automatically retried with other peers; bodies are validated against headers before acceptance
Est. +30-50% body download speed

Test plan

Run full sync on hoodi with FULL_SYNC_BLOCK_LIMIT=50000 and verify bodies download correctly
Compare body download time vs sequential approach
Verify correct ordering: blocks in same order as input headers
Test with limited peers to verify retry/fallback behavior

Split block body requests into chunks of MAX_BLOCK_BODIES_TO_REQUEST and dispatch each chunk to a different peer in parallel using tokio::spawn, following the same channel-based pattern used by the existing parallel header download. Failed chunks are automatically re-queued and retried with other peers, and partial responses are handled by re-queuing the remainder. This replaces the sequential loop that sent 128-block requests to one peer at a time, significantly improving body download throughput when multiple peers are available.

github-actions · 2026-04-16T10:44:38Z

🤖 Codex Code Review

High: partial body responses are reassembled in the wrong global order. When a peer returns only a prefix of a chunk, the remainder is pushed as a brand-new result slot in peer_handler.rs:685, and final output is flattened by iterating results in slot order in peer_handler.rs:737. sync_cycle_full then zips that flattened body list with headers in original order in full.rs:137. If chunk 0 returns 64/128 bodies, bodies 65..128 get moved after later chunks, so headers and bodies become mismatched. This is a correctness bug for block execution. Keep original offsets for each subtask, or write into a pre-sized per-header result buffer instead of appending remainders as new chunk indices.
High: the new parallel downloader can hang forever on persistent failures. Failed/invalid chunks are always re-queued in peer_handler.rs:646 and peer_handler.rs:671, but the loop in peer_handler.rs:640 has no retry budget or terminal error path. The old path bounded this with REQUEST_RETRY_ATTEMPTS in peer_handler.rs:555. With the new call site in full.rs:132, a malicious or just unhealthy peer set can stall full sync indefinitely instead of surfacing BodiesNotFound. Add per-chunk attempt counters or an overall deadline and fail once exhausted.

I did not spot any EVM/gas/consensus-rule changes in this PR beyond the sync transport path. I did not run tests locally.

Automated review by OpenAI Codex · gpt-5.4 · custom prompt

github-actions · 2026-04-16T10:44:38Z

🤖 Claude Code Review

Now I have a complete picture. Here is my review:

PR Review: feat(p2p): parallel body download from multiple peers

Overview

This PR replaces the sequential body download loop in sync_cycle_full with a parallel multi-peer approach. The design mirrors the existing parallel header download (request_block_headers_parallel) and follows the same channel + tokio::spawn pattern. The structural idea is sound, but there are correctness and robustness issues that must be addressed before merging.

Critical Bug: Partial-result ordering is broken

File: peer_handler.rs, lines 676–691

When a peer returns fewer bodies than requested, the remainder is pushed to a new slot appended to the end of results:

let new_idx = results.len();   // always at the tail
results.push(None);
tasks_queue.push_back((new_idx, remaining_headers));

The final reassembly iterates results in index order:

for slot in results {
    all_bodies.extend(bodies);
}

So the remainder of chunk i ends up after all original chunks, not right after its partial sibling. Example with 256 headers (two chunks of 128):

Chunk 0 → peer returns only B0..B63 (partial). New task gets idx=2.
Chunk 1 → peer returns B128..B255 fully. results[1] set.
New task (idx=2) → peer returns B64..B127. results[2] set.
Reassembly: [B0..B63, B128..B255, B64..B127]

The zip in sync_cycle_full then pairs B128..B255 with headers H64..H127, producing silently wrong blocks. This is a data-corruption bug on any partially-responding peer.

Fix: track a (chunk_idx, sub_offset) or maintain a flat sorted list of (global_start, bodies) entries that are stitched together in order at the end, similar to how the header download tracks startblock.

No retry limit — potential infinite loop

File: peer_handler.rs, loop body

The sequential request_block_bodies enforced REQUEST_RETRY_ATTEMPTS. The parallel version has no bound. A chunk for which every peer persistently fails (network partition, all peers serve invalid data for those blocks) will be re-queued forever, hanging the sync indefinitely. At minimum, add a per-chunk attempt counter and fail-fast with an error after N attempts.

`_peer_id` is dead parameter

File: peer_handler.rs line 587

async fn download_bodies_from_peer(
    _peer_id: H256,   // <-- unused
    connection: &mut PeerConnection,
    block_hashes: Vec<H256>,
) -> Result<Vec<BlockBody>, PeerHandlerError> {

The leading underscore signals intent to suppress the warning, but the analogous download_chunk_from_peer uses peer_id for a debug! log. Drop the parameter or add a log line — either way it shouldn't be a silent no-op in the signature.

`dec_requests` leak on channel-send failure

File: peer_handler.rs, tokio::spawn block (lines 720–732)

inc_requests is called before spawning (line 718). dec_requests is called only inside the rx.try_recv() handler. If the channel send in the spawned task fails (e.g., because the outer loop returned early on a get_best_peer error), dec_requests is never called and the peer's request counter stays elevated permanently. This matches the pattern already present in request_block_headers_parallel, so it's a pre-existing wart — but worth fixing in both places with a guard or by restructuring so the spawn always delivers to the channel.

Unnecessary `get_best_peer` call when queue is empty

File: peer_handler.rs, lines 699–712

When the task queue is empty but in-flight tasks haven't reported back yet, the loop still calls get_best_peer() on every tick before discovering tasks_queue.pop_front() returns None. This wastes an async call (and possibly a lock acquisition) every 10 ms purely to discover no work is queued. Guard with an explicit check:

if tasks_queue.is_empty() {
    tokio::time::sleep(Duration::from_millis(10)).await;
    continue;
}

before the get_best_peer call.

Channel capacity may be under-provisioned after growth

File: peer_handler.rs, line 638

let (tx, mut rx) = tokio::sync::mpsc::channel::<(...)>(total_chunks + 1);

The channel is sized to total_chunks + 1. But every partial response pushes a new entry into results (and dispatches another task), so the total number of in-flight tasks can exceed total_chunks. If the channel fills up, spawned tasks will block on send, potentially stalling. Size the channel to accommodate the maximum possible concurrent tasks or use an unbounded channel given results are drained on every iteration.

Minor style issues

Missing blank line between request_block_bodies_parallel's closing brace and the /// Returns the PeerData… doc comment (peer_handler.rs, line 750–751).
The comment on line 694 says "Check if all original data has been collected" but results now contains dynamically added slots — the comment is misleading.
bodies variable in the for loop on line 658 shadows the outer bodies from try_recv; consider renaming to body since it's a single item from zip.

What's good

The overall architecture (channel + tokio::spawn, inc_requests before spawn, dec_requests on receipt, get_best_peer for load-aware peer selection) correctly mirrors the proven header-download pattern.
Validation against headers (validate_block_body) is performed before accepting bodies, consistent with the sequential path.
Partial-peer responses trigger re-queueing rather than failing the whole sync — correct defensive behavior (once the ordering bug is fixed).
The simplification in sync_cycle_full is clean and easier to follow than the old while !headers.is_empty() loop.

Automated review by Claude (Anthropic) · sonnet · custom prompt

github-actions · 2026-04-16T10:44:52Z

Lines of code report

Total lines added: 126
Total lines removed: 5
Total lines changed: 131

Detailed view

+----------------------------------------------+-------+------+
| File                                         | Lines | Diff |
+----------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/peer_handler.rs | 722   | +126 |
+----------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/sync/full.rs    | 287   | -5   |
+----------------------------------------------+-------+------+

…wnload # Conflicts: # crates/networking/p2p/peer_handler.rs

github-actions Bot assigned azteca1998 Apr 16, 2026

azteca1998 changed the title ~~feat(p2p): parallel body download from multiple peers~~ feat(l1): parallel body download from multiple peers Apr 16, 2026

github-actions Bot added the L1 Ethereum client label Apr 16, 2026

github-project-automation Bot added this to ethrex_l1 Apr 16, 2026

azteca1998 added 3 commits April 16, 2026 13:06

fix(l1): resolve clippy warnings in parallel body download

cdd1eb6

Merge remote-tracking branch 'origin/main' into feat/parallel-body-do…

dfa9a13

…wnload # Conflicts: # crates/networking/p2p/peer_handler.rs

style(l1): fix rustfmt in parallel body download

4d15a83

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(l1): parallel body download from multiple peers#6490

feat(l1): parallel body download from multiple peers#6490
azteca1998 wants to merge 4 commits into
mainfrom
feat/parallel-body-download

azteca1998 commented Apr 16, 2026

Uh oh!

github-actions Bot commented Apr 16, 2026

Uh oh!

github-actions Bot commented Apr 16, 2026

Uh oh!

github-actions Bot commented Apr 16, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

azteca1998 commented Apr 16, 2026

Summary

Test plan

Uh oh!

github-actions Bot commented Apr 16, 2026

🤖 Codex Code Review

Uh oh!

github-actions Bot commented Apr 16, 2026

🤖 Claude Code Review

PR Review: feat(p2p): parallel body download from multiple peers

Overview

Critical Bug: Partial-result ordering is broken

No retry limit — potential infinite loop

_peer_id is dead parameter

dec_requests leak on channel-send failure

Unnecessary get_best_peer call when queue is empty

Channel capacity may be under-provisioned after growth

Minor style issues

What's good

Uh oh!

github-actions Bot commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Lines of code report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

`_peer_id` is dead parameter

`dec_requests` leak on channel-send failure

Unnecessary `get_best_peer` call when queue is empty

github-actions Bot commented Apr 16, 2026 •

edited

Loading