Conversation
Greptile Summary

This PR addresses a TOCTOU race in `forkchoice_update`.

Confidence Score: 3/5 — Not safe to merge: the `delete_range` fix reintroduces canonical-chain corruption for reorgs spanning a byte boundary in LE-encoded block numbers. The mutex approach correctly serializes `forkchoice_update` callers, and the TOCTOU race is fixed at the concurrency level. However, the new `delete_range` on LE-encoded keys is semantically incorrect under RocksDB's default lexicographic comparator, leaving orphan canonical entries for block numbers ≥ 256 in certain reorg scenarios. The regression test's small block numbers (≤ 15) cannot catch this. This is a P1 data-integrity issue that needs to be resolved before merging. Both `crates/storage/backend/rocksdb.rs` and `crates/storage/backend/in_memory.rs` need the `delete_range` key ordering fixed, and `test/tests/storage/fcu_race_tests.rs` should be extended to cover block numbers above 256.
| Filename | Overview |
|---|---|
| crates/storage/store.rs | Adds fcu_lock mutex to serialize concurrent forkchoice_update calls and removes the TOCTOU-prone load_latest_block_number read. The mutex approach is correct, but the delete_range bounds rely on LE key ordering which is incorrect for RocksDB's lexicographic comparator. |
| crates/storage/backend/rocksdb.rs | Implements delete_range via WriteBatch::delete_range_cf, but CANONICAL_BLOCK_HASHES uses little-endian u64 keys while RocksDB's default comparator is lexicographic — the range will silently miss entries whose LE first byte falls below start_key's first byte. |
| crates/storage/backend/in_memory.rs | Implements delete_range using Rust slice lexicographic comparison via retain; suffers the same LE byte-ordering bug as the RocksDB path for block numbers ≥ 256. |
| crates/storage/api/mod.rs | Adds delete_range to the StorageWriteBatch trait with a clear half-open [start, end) contract; clean API surface. |
| test/tests/storage/fcu_race_tests.rs | Good regression test for the TOCTOU race; however all block numbers are ≤ 15 (BASE+EXT), so the LE byte-ordering bug in delete_range is never exercised. |
| crates/storage/Cargo.toml | Adds sync feature to the tokio dependency to enable tokio::sync::Mutex; correct and minimal change. |
| test/tests/storage/mod.rs | Registers the new fcu_race_tests module; trivial plumbing change. |
Sequence Diagram

```mermaid
sequenceDiagram
    participant CallerA
    participant CallerB
    participant fcu_lock as fcu_lock (tokio::Mutex)
    participant Store
    participant DB as RocksDB / InMemory
    CallerA->>fcu_lock: lock().await → _guard
    Note over fcu_lock: CallerB blocks here
    CallerA->>Store: latest_block_header.update(new_head_A)
    CallerA->>Store: forkchoice_update_inner(...)
    Store->>DB: txn.put(new_canonical_blocks)
    Store->>DB: txn.delete_range([head+1, u64::MAX)) ⚠️ LE key ordering
    Store->>DB: txn.put(LatestBlockNumber = head_A)
    Store->>DB: txn.commit()
    CallerA->>fcu_lock: drop(_guard)
    CallerB->>fcu_lock: lock().await → _guard
    CallerB->>Store: latest_block_header.update(new_head_B)
    CallerB->>Store: forkchoice_update_inner(...)
    Store->>DB: txn.put / delete_range / commit
    CallerB->>fcu_lock: drop(_guard)
```
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/storage/backend/rocksdb.rs
Line: 376-382
Comment:
**Little-endian keys are not monotone under lexicographic order — `delete_range` will silently skip entries**
RocksDB's default comparator is lexicographic (byte-by-byte, left to right). Little-endian u64 encoding does not produce a lexicographically sorted sequence: for example, 256 encodes as `[0x00, 0x01, 0, …]`, which sorts *before* 11's `[0x0B, 0, …]` in lex order. As a result, `delete_range_cf` with `start = (head+1).to_le_bytes()` will miss any canonical entry whose LE first byte is smaller than the first byte of `head+1`, leaving orphan canonical entries—exactly the corruption this PR is trying to eliminate.
Concrete failure scenario: chain extends to block 65536 (`[0x00, 0x00, 0x01, 0, …]`), then reorgs to `head = 50000` (delete_start `[0x51, 0xC3, 0, …]`). Lex: `[0x00, 0x00, 0x01, …] < [0x51, 0xC3, …]`, so block 65536 survives the delete_range and the canonical table remains inconsistent.
The pre-existing loop-based deletion was correct because it generated each exact key. The fix is to either use big-endian encoding (which *is* lexicographically monotone) or restore the enumeration approach.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: crates/storage/backend/in_memory.rs
Line: 197
Comment:
**Same LE byte-ordering bug as the RocksDB path**
Rust's `PartialOrd` on `&[u8]` is also lexicographic. The `retain` predicate `key.as_slice() < start_key || key.as_slice() >= end_key` applies the same byte-by-byte ordering and will preserve entries whose LE first byte is smaller than `start_key`'s first byte, even if their numeric value is above the new head. The regression test uses block numbers ≤ 15, which all fit in the first byte, so the test never exercises numbers ≥ 256 where the bug manifests.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: crates/storage/store.rs
Line: 1034-1036
Comment:
**`u64::MAX.to_le_bytes()` is not a safe upper-bound sentinel for LE-encoded RocksDB keys**
Even if the byte-order issue were resolved, using `u64::MAX.to_le_bytes()` (`[0xFF; 8]`) as the exclusive end of a `delete_range` assumes that `[0xFF; 8]` is the lexicographic maximum 8-byte key. With big-endian encoding that assumption holds; with little-endian it does not, so this sentinel is conceptually tied to byte ordering. Additionally, `delete_range` semantics are `[start, end)`, meaning a canonical entry at block `u64::MAX` (key `[0xFF; 8]`) would not be removed — but that's an academic concern for this context.
How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(l1,l2): serialize forkchoice_update"
Snippet from `crates/storage/backend/rocksdb.rs` (lines 376–382):

```rust
        .cf_handle(table)
        .ok_or_else(|| StoreError::Custom(format!("Table {} not found", table)))?;

    self.batch.delete_range_cf(&cf, start_key, end_key);
    Ok(())
}
```
Snippet from `crates/storage/backend/in_memory.rs` (line 197):

```rust
let db_mut = Arc::make_mut(&mut *db);
if let Some(table_ref) = db_mut.get_mut(table) {
    table_ref.retain(|key, _| key.as_slice() < start_key || key.as_slice() >= end_key);
```
Snippet from `crates/storage/store.rs` (lines 1034–1036):

```rust
let delete_start = (head_number + 1).to_le_bytes();
let delete_end = u64::MAX.to_le_bytes();
txn.delete_range(CANONICAL_BLOCK_HASHES, &delete_start, &delete_end)?;
```
🤖 Kimi Code Review

This is a high-quality fix for a subtle TOCTOU race condition in the forkchoice update path. The implementation correctly addresses the safety issue while maintaining good documentation.

- Critical Correctness: Race Condition Fix
- Storage Layer Safety: Delete Range Logic
- Concurrency Design
- Test Coverage
- Minor Suggestions

Summary: The fix is correct, well-documented, and properly tested. The endianness comment is particularly valuable for preventing future regressions.

Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt
🤖 Claude Code Review

PR #6508 Review:
| Finding | Severity | Pre-existing? |
|---|---|---|
| `forkchoice_update_inner` is pub, bypasses lock | Medium | No |
| Cache updated before DB commit; error leaves state dirty | Low–Medium | Yes, but PR touches this code |
| Test doesn't cover cache consistency | Low | No |
| `unwrap_or(0)` edge case undocumented | Very low | Yes |
The core fix (tokio Mutex serialization) is correct and the regression test is solid. The main actionable item is restricting forkchoice_update_inner visibility so the lock invariant is enforced rather than documented.
Automated review by Claude (Anthropic) · sonnet · custom prompt
🤖 Codex Code Review

Findings

Notes: The per-key delete in … I wasn't able to run the tests in this environment because …

Automated review by OpenAI Codex · gpt-5.4 · custom prompt
ElFantasma
left a comment
Solid TOCTOU fix. The mutex serializes the cache update + DB write so they can't interleave. The cache rollback on DB-write failure is the right complement — without it, a failed FCU would leave the cache pointing at a non-canonical head.
The regression test is excellent: the BASE=250, EXT=10 choice deliberately spans block 256 to catch any future refactor that tries to replace the per-block delete loop with a delete_range, since LE-encoded keys aren't lexicographically monotone. The threshold calibration (4999/5000 hits pre-fix) gives confidence the test would actually catch a regression.
The inline comment about why delete_range isn't safe is gold — exactly the kind of trap that bites people six months later.
Snippet from `crates/storage/store.rs`:

```rust
}

// Delete canonical entries above the new head by enumerating each key.
// `delete_range` is not safe here: keys are `u64::to_le_bytes()`, and
```
This comment is gold — exactly the kind of trap that bites people six months later.
Motivation
Currently, if `forkchoice_update` is called from two call sites simultaneously, we can end up with a corrupted canonical chain.
Description
There are two underlying problems:

- `forkchoice_update` reads the latest block number (`load_latest_block_number`) and then writes based on that value, leaving a TOCTOU window in which a concurrent caller can move the head.
- Two concurrent calls can interleave their cache update and DB write, so the cached head and the canonical table can end up disagreeing.

This PR fixes them by serializing writes with a mutex.
It also adds a regression test.