release: testnet → prod (trie pruning + gas price oracle fix) #136

Merged
lockchainco merged 8 commits into prod from testnet on Apr 21, 2026

@lockchainco
Contributor

Summary

Promotes all testnet-validated changes to mainnet production.

Gas Price Oracle Fix (#135)

  • Fix eth_gasPrice ratcheting from 1 Gwei to 300+ Gwei on sparse-transaction chains
  • Empty blocks no longer inject synthetic lastPrice samples
  • Fix infinite loop in extended scan (currentBlock never advanced)
  • Fall back to the default 1 Gwei when the scan window contains no real transactions
  • Validated: unit tests (18 pass), e2e devnet, deployed to testnet (all 5 nodes, running since 2026-04-21)

Trie Pruning Feature (#133)

  • prune-trie CLI command for offline state trie pruning
  • --prune server flag for pre-startup pruning
  • Hardened swap logic with crash recovery
  • Streaming writes to prevent OOM on large tries
  • Flush error propagation
  • golangci-lint fixes

Commits (8)

  • ba6265be fix(gasprice): prevent gas price oracle from ratcheting upward (#135)
  • f87415b8 fix(prune): propagate flush errors in flushingBatch
  • 2fa8ee26 fix(prune): use streaming writes to prevent OOM on large tries
  • b30a1245 Merge PR #133 (feature/trie-pruning): feat: Offline state trie pruning (--prune flag)
  • 6b52d580 style: fix golangci-lint issues
  • 1d17c58f fix(server): harden --prune swap logic with crash recovery
  • ba5699ad feat(server): add --prune flag for pre-startup trie pruning
  • a1e92074 feat(cli): add prune-trie command for offline state trie pruning

Deployment Plan

  1. Local devnet — gas price fix validated
  2. Testnet — all changes deployed 2026-04-21, running stable
  3. Mainnet RPC nodes (RPC-1/2/3) — rolling deploy
  4. Mainnet validators (Validator-1/2) — rolling deploy

Test Plan

  • Unit tests pass (18/18 gasprice, full suite green)
  • E2e devnet: gas price ratchet confirmed on buggy code, recovery confirmed on fixed code
  • Testnet: 5 validators + RPC running stable since deploy
  • Mainnet RPC: rolling deploy, verify gas price stability
  • Mainnet validators: rolling deploy after RPC verification

🤖 Generated with Claude Code

Hydra Guardian and others added 8 commits April 16, 2026 08:05
a1e92074 feat(cli): add prune-trie command for offline state trie pruning

Implements a safe offline CLI tool that copies only reachable trie nodes
from the latest state root to a new LevelDB, reducing disk usage by
removing orphaned historical state. Source trie is opened read-only.

- `hydra prune-trie run --data-dir <path> --target-path <path>`
- Auto-resolves state root from blockchain DB (or --block N)
- Reuses proven CopyTrie() + HashChecker() from regenesis
- Reports source/dest key counts, size reduction, and validates hash
- 29 tests: property-based, integration, validation, safety

Tested on devnet: 81% reduction (6272→2154 keys), chain continues
producing blocks after trie swap with correct balances.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ba5699ad feat(server): add --prune flag for pre-startup trie pruning

Adds a --prune flag to `hydra server` that prunes historical state trie
data before the node starts. This integrates the prune-trie functionality
directly into the server startup flow:

  hydra server --data-dir ./node-secrets --chain mainnet --prune [...]

The flag triggers a pre-startup phase that:
1. Resolves the latest state root from blockchain DB
2. Copies reachable trie nodes to trie_new/ (source opened read-only)
3. Validates state root integrity via HashChecker
4. Swaps trie/ → trie_old/, trie_new/ → trie/
5. Preserves trie_old/ for rollback safety
6. Continues normal server startup with the pruned trie

If pruning fails at any step, the original trie is untouched and the
server does not start (fail-safe). Operators add --prune once when
needed, then remove it for subsequent restarts.
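
The swap in steps 4 and 5 can be sketched roughly as below. This is a minimal illustration, not the code from this PR: the function name `swapPrunedTrie`, the `MARKER` files, and the demo helper are assumptions made for the example. Only the directory names (trie/, trie_new/, trie_old/) and the rollback behaviour come from the description above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// swapPrunedTrie (hypothetical name) moves trie/ to trie_old/ and trie_new/
// to trie/ inside dataDir. If the second rename fails it tries to put the
// original trie back so the node keeps its previous state.
func swapPrunedTrie(dataDir string) error {
	trie := filepath.Join(dataDir, "trie")
	trieNew := filepath.Join(dataDir, "trie_new")
	trieOld := filepath.Join(dataDir, "trie_old")

	// Drop a backup left by an earlier prune so the rename below can succeed.
	if err := os.RemoveAll(trieOld); err != nil {
		return fmt.Errorf("remove stale trie_old: %w", err)
	}
	if err := os.Rename(trie, trieOld); err != nil {
		return fmt.Errorf("move trie to trie_old: %w", err)
	}
	if err := os.Rename(trieNew, trie); err != nil {
		if rbErr := os.Rename(trieOld, trie); rbErr != nil {
			return fmt.Errorf("swap failed (%v) and rollback failed (%v); manual recovery: mv trie_old trie", err, rbErr)
		}
		return fmt.Errorf("move trie_new to trie: %w", err)
	}
	return nil
}

// demoSwap builds a throwaway data directory, runs the swap, and returns the
// marker contents found in trie/ and trie_old/ afterwards.
func demoSwap() (string, string, error) {
	dir, err := os.MkdirTemp("", "prune-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	for name, marker := range map[string]string{"trie": "original", "trie_new": "pruned"} {
		sub := filepath.Join(dir, name)
		if err := os.Mkdir(sub, 0o755); err != nil {
			return "", "", err
		}
		if err := os.WriteFile(filepath.Join(sub, "MARKER"), []byte(marker), 0o644); err != nil {
			return "", "", err
		}
	}
	if err := swapPrunedTrie(dir); err != nil {
		return "", "", err
	}
	live, err := os.ReadFile(filepath.Join(dir, "trie", "MARKER"))
	if err != nil {
		return "", "", err
	}
	backup, err := os.ReadFile(filepath.Join(dir, "trie_old", "MARKER"))
	if err != nil {
		return "", "", err
	}
	return string(live), string(backup), nil
}

func main() {
	live, backup, err := demoSwap()
	fmt.Println(live, backup, err)
}
```

Keeping trie_old/ in place after a successful swap is what makes a manual rollback a single rename.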

Tested on devnet: prune + server startup + consensus participation
all working in a single command invocation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1d17c58f fix(server): harden --prune swap logic with crash recovery …handling

Addresses code sentinel review findings:

- Add crash-state detector: if trie/ is missing but trie_old/ exists
  (SIGKILL between renames), auto-recovers by renaming trie_old back
- Check rollback rename error: on failure, log CRITICAL with exact
  manual recovery command (mv trie_old trie)
- Check os.RemoveAll(trie_old) error before swap
- Check chainStorage.Close() error on success path
- Check os.Chmod error (log warning)
- Guard against negative reduction percentage
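
A rough sketch of the crash-state detector and the percentage guard from the list above, under stated assumptions: the names `recoverInterruptedSwap`, `reductionPercent`, and `exists` are invented for illustration, and the real checks in the server may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// exists reports whether path is present, surfacing unexpected stat errors.
func exists(path string) (bool, error) {
	_, err := os.Stat(path)
	if err == nil {
		return true, nil
	}
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	return false, err
}

// recoverInterruptedSwap (hypothetical name) restores trie_old/ to trie/ when
// trie/ is missing, which is the state left by a kill between the two renames.
// It returns true only when a recovery rename actually happened.
func recoverInterruptedSwap(dataDir string) (bool, error) {
	trie := filepath.Join(dataDir, "trie")
	trieOld := filepath.Join(dataDir, "trie_old")

	haveTrie, err := exists(trie)
	if err != nil || haveTrie {
		return false, err
	}
	haveOld, err := exists(trieOld)
	if err != nil || !haveOld {
		return false, err
	}
	if err := os.Rename(trieOld, trie); err != nil {
		return false, fmt.Errorf("CRITICAL: recovery failed, run `mv %s %s` manually: %w", trieOld, trie, err)
	}
	return true, nil
}

// reductionPercent never reports a negative value, even if the pruned copy
// turns out larger than the source.
func reductionPercent(before, after int64) float64 {
	if before <= 0 || after >= before {
		return 0
	}
	return float64(before-after) / float64(before) * 100
}

// demoRecover simulates the crash state (only trie_old/ present) in a
// temporary directory and reports what the detector did.
func demoRecover() (bool, string, error) {
	dir, err := os.MkdirTemp("", "recover-demo")
	if err != nil {
		return false, "", err
	}
	defer os.RemoveAll(dir)

	old := filepath.Join(dir, "trie_old")
	if err := os.Mkdir(old, 0o755); err != nil {
		return false, "", err
	}
	if err := os.WriteFile(filepath.Join(old, "MARKER"), []byte("original"), 0o644); err != nil {
		return false, "", err
	}
	recovered, err := recoverInterruptedSwap(dir)
	if err != nil {
		return false, "", err
	}
	marker, err := os.ReadFile(filepath.Join(dir, "trie", "MARKER"))
	if err != nil {
		return false, "", err
	}
	return recovered, string(marker), nil
}

func main() {
	recovered, marker, err := demoRecover()
	fmt.Println(recovered, marker, err)
	fmt.Println(reductionPercent(100, 25), reductionPercent(100, 150))
}
```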

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

6b52d580 style: fix golangci-lint issues

- gofmt -s formatting on prune.go and params.go
- Add blank lines before return statements (nlreturn)
- Use var block for grouped declarations (wsl)
- Break long flag line under 120 chars (lll)

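Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

b30a1245 Merge PR #133 (feature/trie-pruning)

feat: Offline state trie pruning (--prune flag)

2fa8ee26 fix(prune): use streaming writes to prevent OOM on large tries

CopyTrie accumulates the entire reachable trie in a single LevelDB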
batch in memory before flushing. On testnet (24GB trie), this consumed
3.8GB+ RAM and would have OOM'd the 16GB validator box.

Replace with CopyTrieStreaming which uses a flushingBatch that
auto-flushes to disk every 50,000 entries (~50MB memory cap). The
existing trie traversal code is unchanged — only the batch wrapper
is different.

Discovered during first testnet prune attempt. Killed at 3.8GB RSS
before OOM. Node-2 restarted with original trie untouched.
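
The auto-flushing wrapper described above might look roughly like this. It is a sketch under assumptions: `kvBatch` is a hypothetical stand-in for the real storage batch type, `memBatch` is a fake used only to observe flush sizes, and for simplicity `Put` returns the flush error directly, which the real batch interface may not allow.

```go
package main

import "fmt"

// kvBatch is a hypothetical stand-in for the storage batch used by the copy.
type kvBatch interface {
	Put(key, value []byte)
	Write() error // persist buffered entries and reset the batch
}

// memBatch is an in-memory fake that records how many entries each Write
// call would have persisted.
type memBatch struct {
	pending int
	flushes []int
}

func (m *memBatch) Put(key, value []byte) { m.pending++ }

func (m *memBatch) Write() error {
	m.flushes = append(m.flushes, m.pending)
	m.pending = 0
	return nil
}

// flushingBatch caps memory by writing the inner batch out every limit entries.
type flushingBatch struct {
	inner kvBatch
	limit int
	count int
}

// Put buffers one entry and flushes once the cap is reached.
func (b *flushingBatch) Put(key, value []byte) error {
	b.inner.Put(key, value)
	b.count++
	if b.count >= b.limit {
		return b.flush()
	}
	return nil
}

func (b *flushingBatch) flush() error {
	if b.count == 0 {
		return nil
	}
	if err := b.inner.Write(); err != nil {
		return fmt.Errorf("flush to disk failed: %w", err)
	}
	b.count = 0
	return nil
}

// Close writes whatever is left below the cap.
func (b *flushingBatch) Close() error { return b.flush() }

// demoFlushes pushes the given number of entries through the wrapper and
// returns the size of every flush that reached the fake batch.
func demoFlushes(entries, limit int) ([]int, error) {
	sink := &memBatch{}
	b := &flushingBatch{inner: sink, limit: limit}
	for i := 0; i < entries; i++ {
		if err := b.Put([]byte{byte(i)}, []byte("node")); err != nil {
			return nil, err
		}
	}
	if err := b.Close(); err != nil {
		return nil, err
	}
	return sink.flushes, nil
}

func main() {
	flushes, err := demoFlushes(125000, 50000)
	fmt.Println(flushes, err)
}
```

Because only the batch wrapper changes, the traversal code can keep calling `Put` as before while peak memory stays bounded by the cap.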

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

f87415b8 fix(prune): propagate flush errors in flushingBatch

Address sentinel review: store the first flush error on the struct
and return it from Write(), instead of silently swallowing with
nolint:errcheck. Ensures disk-full errors during prune produce a
clear "flush to disk failed" message.

Adds tests for exact-multiple flush boundary and error propagation.
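
One way to picture the first-error pattern: when `Put` has no error return, a failed auto-flush is remembered on the struct and surfaced by the final `Write()`. The sketch below assumes a hypothetical `kvBatch` interface and a fake `fullDisk` batch. Skipping further buffering after a failure is a design choice made for this example, not something the commit states.

```go
package main

import (
	"errors"
	"fmt"
)

// kvBatch is a hypothetical stand-in for the real storage batch interface.
type kvBatch interface {
	Put(key, value []byte)
	Write() error
}

// fullDisk is a fake batch whose Write fails after okWrites successful calls.
type fullDisk struct {
	okWrites int
	writes   int
}

func (f *fullDisk) Put(key, value []byte) {}

func (f *fullDisk) Write() error {
	f.writes++
	if f.writes > f.okWrites {
		return errors.New("no space left on device")
	}
	return nil
}

type flushingBatch struct {
	inner    kvBatch
	limit    int
	count    int
	firstErr error // first flush failure, reported by Write
}

// Put cannot return an error, so a failed auto-flush is recorded instead of
// being dropped. After a failure, further entries are ignored.
func (b *flushingBatch) Put(key, value []byte) {
	if b.firstErr != nil {
		return
	}
	b.inner.Put(key, value)
	b.count++
	if b.count >= b.limit {
		b.flush()
	}
}

func (b *flushingBatch) flush() {
	if b.count == 0 || b.firstErr != nil {
		return
	}
	if err := b.inner.Write(); err != nil {
		b.firstErr = fmt.Errorf("flush to disk failed: %w", err)
		return
	}
	b.count = 0
}

// Write flushes any remainder and returns the first error seen at any point.
func (b *flushingBatch) Write() error {
	b.flush()
	return b.firstErr
}

// demoWrite returns how many inner writes happened and the final error.
func demoWrite(entries, limit, okWrites int) (int, error) {
	disk := &fullDisk{okWrites: okWrites}
	b := &flushingBatch{inner: disk, limit: limit}
	for i := 0; i < entries; i++ {
		b.Put([]byte{byte(i)}, []byte("node"))
	}
	err := b.Write()
	return disk.writes, err
}

func main() {
	fmt.Println(demoWrite(100, 50, 10)) // exact multiple of the cap
	fmt.Println(demoWrite(100, 50, 1))  // second flush hits the fake full disk
}
```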

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ba6265be fix(gasprice): prevent gas price oracle from ratcheting upward (#135)

* fix(gasprice): prevent gas price oracle from ratcheting upward on sparse chains

The gas price oracle had three compounding bugs that caused eth_gasPrice to
drift from 1 Gwei to 300+ Gwei on chains with sparse transactions (like Hydra
mainnet with ~700 txs/day across 216K blocks/day):

1. Empty blocks injected lastPrice as a synthetic sample. Since 99.6% of blocks
   are empty, the sample set was dominated by copies of the previous estimate.
   Any real tx with a slightly higher tip shifted the percentile up, which became
   the new lastPrice, creating a feedback loop that could only increase.

2. The extended scan loop never advanced currentBlock, processing the same block
   repeatedly until minNumOfTx samples were collected (via the lastPrice injection
   from bug #1).

3. When no real transactions existed in the scan window, the oracle returned the
   cached lastPrice from a previous call, preserving the inflated value.

Fix: skip empty blocks entirely (no synthetic injection), advance currentBlock in
the extended scan with a bounded limit, and fall back to the 1 Gwei default when
no real transactions are found.
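
The three parts of the fix can be illustrated with a simplified estimator. Everything here is an assumption made for the sketch: the `block` type, the `estimate` signature, the window sizes, and the index-based percentile are invented and do not reproduce the oracle's actual code. The sketch only shows empty blocks contributing no samples, the scan moving to an older block on every step with a hard bound, and the 1 Gwei default when no samples exist.

```go
package main

import (
	"fmt"
	"sort"
)

const gwei = 1_000_000_000

const defaultPrice uint64 = 1 * gwei

// block is a hypothetical, stripped-down view of a block: just the tips
// (in wei) of the transactions it contains.
type block struct {
	tips []uint64
}

// estimate walks back from the chain head. It always scans `window` blocks,
// then keeps going for at most `maxExtra` more blocks until it has `minTxs`
// samples. Empty blocks add nothing to the sample set.
func estimate(chain []block, window, maxExtra, minTxs, percentile int) uint64 {
	var samples []uint64
	scanned := 0
	// current moves to an older block on every iteration, so the extended
	// scan cannot spin on a single block.
	for current := len(chain) - 1; current >= 0; current-- {
		if scanned >= window+maxExtra {
			break // bounded extended scan
		}
		if scanned >= window && len(samples) >= minTxs {
			break // enough real samples collected
		}
		scanned++
		samples = append(samples, chain[current].tips...)
	}
	if len(samples) == 0 {
		return defaultPrice // no real transactions: fall back to the default
	}
	sort.Slice(samples, func(i, j int) bool { return samples[i] < samples[j] })
	return samples[(len(samples)-1)*percentile/100]
}

func main() {
	quiet := make([]block, 149)
	fmt.Println(estimate(quiet, 20, 100, 3, 60))

	active := []block{{}, {tips: []uint64{3 * gwei, 2 * gwei}}, {}}
	fmt.Println(estimate(active, 20, 100, 3, 60))
}
```

With no synthetic samples in the set, a long run of empty blocks pushes the estimate back to the default instead of locking in the last inflated value.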

Validated with unit tests (17 pass) and e2e on local devnet:
- Buggy code: gas price stayed at 50 Gwei after 34 empty blocks (ratchet confirmed)
- Fixed code: gas price recovered from 50 Gwei to 1 Gwei after 149 empty blocks

Closes #134

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gasprice): address PR review findings

- Restore per-block sample cap (sampleNumber=3) to ensure diversity across blocks.
  A single busy block can no longer dominate the entire gas price estimate.
- Store defaultPrice in GasHelper instance instead of reading the mutable global
  DefaultGasHelperConfig.LastPrice at runtime. This fixes a data race where
  concurrent tests mutating the global corrupted the fallback value.
- Use isolated Config in regression test to avoid pre-existing global mutation.
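
A small sketch of the two review fixes, with assumptions called out: the `config` and `gasHelper` types below are simplified stand-ins for the real ones, prices are plain numbers in Gwei, and choosing the lowest tips within a block is an assumption (the commit only states that the per-block cap is 3).

```go
package main

import (
	"fmt"
	"sort"
)

// config is a simplified stand-in for the oracle configuration.
type config struct {
	LastPrice    uint64
	SampleNumber int
}

// defaultConfig plays the role of a mutable package-level default.
var defaultConfig = config{LastPrice: 1, SampleNumber: 3}

// gasHelper copies what it needs at construction time, so later changes to
// the global cannot alter its fallback value.
type gasHelper struct {
	defaultPrice uint64
	sampleNumber int
}

func newGasHelper(cfg config) *gasHelper {
	return &gasHelper{defaultPrice: cfg.LastPrice, sampleNumber: cfg.SampleNumber}
}

// sampleBlock returns at most sampleNumber tips from one block, lowest first.
func (g *gasHelper) sampleBlock(tips []uint64) []uint64 {
	sorted := append([]uint64(nil), tips...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	if len(sorted) > g.sampleNumber {
		sorted = sorted[:g.sampleNumber]
	}
	return sorted
}

// collect gathers the capped samples of every block into one set.
func (g *gasHelper) collect(blocks [][]uint64) []uint64 {
	var samples []uint64
	for _, tips := range blocks {
		samples = append(samples, g.sampleBlock(tips)...)
	}
	return samples
}

// demoSampling returns the number of samples collected from one busy block
// plus two quiet ones, and the helper's fallback after the global is mutated.
func demoSampling() (int, uint64) {
	saved := defaultConfig
	defer func() { defaultConfig = saved }()

	g := newGasHelper(defaultConfig)
	defaultConfig.LastPrice = 999 // must not leak into g

	busy := make([]uint64, 100)
	for i := range busy {
		busy[i] = 300
	}
	samples := g.collect([][]uint64{busy, {1}, {2}})
	return len(samples), g.defaultPrice
}

func main() {
	n, fallback := demoSampling()
	fmt.Println(n, fallback)
}
```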

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Hydra Guardian <guardian@hydrachain.org>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
lockchainco merged commit b00054c into prod on Apr 21, 2026
5 checks passed