feat(l1): replace trie layer cache with LRU cache for full sync #6492
azteca1998 wants to merge 9 commits into `main`.
Conversation
… mode (#6480)

During full sync (`add_blocks_in_batch`), the `TrieLayerCache`'s diff-layer chain, bloom filter, and RCU cloning are pure overhead, since full sync never reorgs. This commit adds a `FlatTrieCache` backed by an LRU that bypasses all of that machinery:

- New `FlatTrieCache` in `layering.rs`: a simple LRU keyed by trie node path; no layers, no bloom filter, no parent pointers.
- New `TrieCacheRef` enum: lets `TrieWrapper` use either the layered cache (normal CL processing) or the flat LRU cache (batch mode) transparently.
- In batch mode, `apply_trie_updates` writes trie nodes directly to disk and populates the LRU, instead of accumulating diff layers and doing periodic commits with bloom rebuilds.
- `add_blocks_in_batch` enables the flat cache before execution and disables it after, so normal block processing is completely unchanged.
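The core idea can be sketched as a minimal LRU keyed by node path. This is an illustrative stand-in, not the actual ethrex `FlatTrieCache`: the struct shape, the `get`/`insert` signatures, and the O(n) scan eviction are assumptions made for brevity (a production cache would use an O(1) recency list).

```rust
use std::collections::HashMap;

// Hypothetical sketch of a flat LRU trie-node cache. Keys are trie node
// paths, values are encoded nodes. Recency is tracked with a monotonic
// counter; eviction scans for the least-recently-used entry.
struct FlatTrieCache {
    capacity: usize,
    tick: u64,
    entries: HashMap<Vec<u8>, (Vec<u8>, u64)>, // path -> (node, last_used)
}

impl FlatTrieCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, tick: 0, entries: HashMap::new() }
    }

    fn get(&mut self, path: &[u8]) -> Option<Vec<u8>> {
        self.tick += 1;
        let tick = self.tick;
        self.entries.get_mut(path).map(|(node, last)| {
            *last = tick; // refresh recency on read
            node.clone()
        })
    }

    fn insert(&mut self, path: Vec<u8>, node: Vec<u8>) {
        self.tick += 1;
        if self.entries.len() >= self.capacity && !self.entries.contains_key(&path) {
            // Evict the least-recently-used entry (O(n) scan, sketch only).
            if let Some(lru) = self
                .entries
                .iter()
                .min_by_key(|(_, (_, last))| *last)
                .map(|(k, _)| k.clone())
            {
                self.entries.remove(&lru);
            }
        }
        self.entries.insert(path, (node, self.tick));
    }
}
```

No layers, no bloom filter, no parent pointers: a lookup is a single hash-map probe, which is the entire point of the batch-mode fast path.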
🤖 Kimi Code Review

Overall Assessment: The PR introduces a well-structured optimization for full sync batch processing. The abstraction using …

Issues and Suggestions:
1. Magic numbers for trie key lengths (…)
2. Early success signal before disk persistence (…)
3. Unused method (…)
4. Poisoned lock handling (…)
5. Unnecessary fallback in …
Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt
🤖 Codex Code Review

No other EVM opcode or gas-accounting issues stood out in this diff. I couldn't run …

Automated review by OpenAI Codex · gpt-5.4 · custom prompt
🤖 Claude Code Review

Here is the full review of PR #6492:

PR Review: …
The batch path in `apply_trie_updates` was sending `Ok(())` on the result channel before the RocksDB write finished. The caller would then proceed to `disable_batch_trie_cache` (dropping the LRU) and `enable_batch_trie_cache` (creating a fresh, empty LRU). If the disk write hadn't completed, the next batch's `has_state_root` check would find the root node in neither the new LRU nor on disk, causing "state root missing" errors around block 443k. Fix: move the `result_sender` signal to after `write_tx.commit()` so the caller blocks until all trie nodes are persisted.
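The ordering fix can be sketched with an `mpsc` channel and a mutex-guarded `Vec` standing in for the RocksDB write transaction; all names here are hypothetical simplifications, not the ethrex API.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

// Stand-in for write_tx.commit(): append the batch to the "disk".
fn commit_to_disk(disk: &Arc<Mutex<Vec<(Vec<u8>, Vec<u8>)>>>, nodes: Vec<(Vec<u8>, Vec<u8>)>) {
    disk.lock().unwrap().extend(nodes);
}

fn apply_trie_updates_batch(
    disk: Arc<Mutex<Vec<(Vec<u8>, Vec<u8>)>>>,
    nodes: Vec<(Vec<u8>, Vec<u8>)>,
    result_sender: mpsc::Sender<Result<(), String>>,
) {
    thread::spawn(move || {
        // BUG (before): sending Ok(()) here let the caller race ahead and
        // recycle the LRU while trie nodes were still in flight to disk.
        commit_to_disk(&disk, nodes); // write_tx.commit() equivalent
        // FIX: signal success only after the commit completes, so the
        // caller's recv() blocks until everything is persisted.
        result_sender.send(Ok(())).unwrap();
    });
}
```

The caller waits on the receiving end; once `recv()` returns `Ok`, every node of the batch is guaranteed to be on disk.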
When a batch fails with StateRootMismatch (cross-block EVM cache pollution), the fullsync falls back to single-block pipeline execution. The pipeline uses the normal trie path, which stores nodes in the layered diff-chain cache. With commit_threshold=128, the top ~127 layers remain uncommitted (not yet flushed to disk). When the next batch starts in batch mode, trie_cache_ref() previously returned only the flat LRU cache, completely bypassing the layered cache. has_state_root() would miss the flat cache (empty) and fall through to disk, which only had the root from ~128 blocks ago. The hash check failed, producing "state root missing". Fix: introduce TrieCacheRef::FlatWithFallback that checks the flat LRU first, then the layered cache, before falling through to disk.
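The dispatch described above can be sketched with a small enum. The types are simplified (plain `HashMap`s instead of the real caches) and the method shape is an assumption based on the text; only the lookup order — flat LRU first, then the layered cache, then disk — is taken from the commit message.

```rust
use std::collections::HashMap;

// Simplified stand-in for ethrex's TrieCacheRef dispatch.
enum TrieCacheRef<'a> {
    Layered(&'a HashMap<Vec<u8>, Vec<u8>>),
    Flat(&'a HashMap<Vec<u8>, Vec<u8>>),
    FlatWithFallback {
        flat: &'a HashMap<Vec<u8>, Vec<u8>>,
        layered: &'a HashMap<Vec<u8>, Vec<u8>>,
    },
}

impl<'a> TrieCacheRef<'a> {
    fn get(&self, path: &[u8]) -> Option<&'a Vec<u8>> {
        match self {
            TrieCacheRef::Layered(c) | TrieCacheRef::Flat(c) => c.get(path),
            TrieCacheRef::FlatWithFallback { flat, layered } => {
                // Check the flat LRU first, then the layered diff-chain
                // cache; a miss in both falls through to disk (not shown).
                flat.get(path).or_else(|| layered.get(path))
            }
        }
    }
}
```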
…mode

When a batch fails with StateRootMismatch, the pipeline fallback writes trie nodes to the layered cache (`TrieLayerCache`). The next batch writes updates only to disk + flat LRU, but after batch mode ends, the stale layered cache still shadows the fresher disk values. Subsequent reads return stale data, causing a permanent StateRootMismatch loop. Fix: drain all uncommitted layers to disk in `enable_batch_trie_cache()` before creating the flat LRU. This guarantees batch-mode reads (flat LRU → disk) never hit stale in-memory data. Removes the FlatWithFallback variant since the layered cache is now always empty when batch mode is active.
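The drain-then-enable ordering can be sketched with plain maps standing in for the diff layers and the database; the function signature and variable names are assumptions, not the real ethrex code.

```rust
use std::collections::HashMap;

// Sketch: flush every uncommitted diff layer to "disk" before handing out a
// fresh, empty flat cache, so nothing stale can shadow disk in batch mode.
fn enable_batch_trie_cache(
    layers: &mut Vec<HashMap<Vec<u8>, Vec<u8>>>, // oldest -> newest
    disk: &mut HashMap<Vec<u8>, Vec<u8>>,
) -> HashMap<Vec<u8>, Vec<u8>> {
    // Drain oldest-first so newer layers overwrite older values for the
    // same path, matching the layered cache's shadowing semantics.
    for layer in layers.drain(..) {
        disk.extend(layer);
    }
    // Return the fresh, empty flat LRU (a plain map in this sketch).
    HashMap::new()
}
```

With the layers guaranteed empty, the batch-mode read path never needs an in-memory fallback, which is why the FlatWithFallback variant could be removed.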
…trics

Add per-batch and cumulative cache statistics (hits, misses, hit rate, inserts, evictions, fill level) logged at INFO level when each batch completes. This helps evaluate whether the LRU capacity is sufficient or if evictions are causing excess disk I/O. Also doubles the default capacity from 2M to 4M entries to test whether reduced eviction pressure improves fullsync throughput.
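The statistics could look roughly like this; the field names and log format are assumptions, not the actual ethrex metrics code.

```rust
// Hypothetical per-batch cache statistics, reset after each batch.
#[derive(Default)]
struct CacheStats {
    hits: u64,
    misses: u64,
    inserts: u64,
    evictions: u64,
}

impl CacheStats {
    // Hit rate as a fraction in [0, 1]; 0 when there were no lookups.
    fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        if total == 0 { 0.0 } else { self.hits as f64 / total as f64 }
    }

    // In the real code this would be a tracing/log INFO line.
    fn log_batch(&self, len: usize, capacity: usize) -> String {
        format!(
            "[METRICS] hits={} misses={} hit_rate={:.1}% inserts={} evictions={} fill={}/{}",
            self.hits, self.misses, self.hit_rate() * 100.0,
            self.inserts, self.evictions, len, capacity,
        )
    }
}
```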
The flat LRU cache had 0% hit rate because all writes happened in phase 2 (apply_trie_updates) while all reads happened in phase 1 (block execution). Promote disk-fetched trie nodes into the flat cache on read so subsequent traversals within the same batch hit memory instead of disk.
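Read-through promotion can be sketched as: on a flat-cache miss, fetch from disk and insert into the cache before returning. Types are simplified (plain maps, eviction omitted) and the function name is an assumption.

```rust
use std::collections::HashMap;

// Sketch: promote disk-fetched trie nodes into the flat cache so later
// traversals within the same batch hit memory instead of disk.
fn get_node(
    flat: &mut HashMap<Vec<u8>, Vec<u8>>,
    disk: &HashMap<Vec<u8>, Vec<u8>>,
    path: &[u8],
) -> Option<Vec<u8>> {
    if let Some(node) = flat.get(path) {
        return Some(node.clone()); // cache hit
    }
    let node = disk.get(path)?.clone(); // miss: fall through to disk
    // Promote the disk-fetched node into the flat cache.
    flat.insert(path.to_vec(), node.clone());
    Some(node)
}
```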
…overhead

Benchmarked read-through cache: 1.07M hits vs 69.7M misses (1.5% hit rate). The mutex lock + allocation on every disk read made overall sync 10 min slower (3h23m vs 3h13m write-only). Different blocks touch different trie paths, so node reuse within a batch is minimal.
Summary
- `FlatTrieCache`: an LRU-based trie node cache (2M entries) for full sync batch mode, bypassing the diff-layer chain, bloom filter rebuild, and RCU overhead
- `TrieCacheRef` enum so `TrieWrapper` can dispatch to either cache transparently
- `apply_trie_updates` takes a fast path: writes nodes directly to disk and populates the LRU, skipping layer accumulation entirely
- `add_blocks_in_batch` enables the flat cache before execution and disables it after
- `trie_cache_ref()` returns `Layered` when no batch cache is active

Closes #6480
Test plan
- Run a full sync with `FULL_SYNC_BLOCK_LIMIT=50000` and verify completion
- Check the per-batch cache statistics (`[METRICS]` logs)