perf(download): cache asset verification by stat signature by AprilNEA · Pull Request #25 · arcboxlabs/boot-assets

AprilNEA · 2026-06-12T12:56:30Z

prepare_binaries and ensure_file SHA-256 every asset on every call. For the ArcBox daemon that meant re-hashing ~230MB of runtime binaries (~410ms measured) on every single boot, plus kernel/rootfs on the prepare() path.

Change

After a successful verification or download, record (size, mtime, sha256) per file in a .verified.json next to the assets. Subsequent calls trust the recorded digest while the stat signature and the expected digest both match — one stat per asset instead of a full hash. Any mismatch (file touched, manifest updated, cache missing/corrupt) falls back to the full re-hash; the cache is written atomically and is purely an optimization.
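The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `StatCache`, `Entry`, `stat_signature`, and the nanosecond mtime field are hypothetical names and choices, and the JSON persistence and atomic save are left out.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::Path;
use std::time::UNIX_EPOCH;

/// One recorded entry: the digest that was verified, plus the stat signature
/// (size, mtime) the file had at that moment. Field layout is an assumption.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Entry {
    sha256: String,
    size: u64,
    mtime_nanos: u128,
}

/// Hypothetical in-memory cache keyed by asset name.
#[derive(Default)]
struct StatCache {
    entries: HashMap<String, Entry>,
}

/// Reads (size, mtime) with a single metadata call; file contents are not read.
fn stat_signature(path: &Path) -> io::Result<(u64, u128)> {
    let meta = fs::metadata(path)?;
    let mtime = meta
        .modified()?
        .duration_since(UNIX_EPOCH)
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?;
    Ok((meta.len(), mtime.as_nanos()))
}

impl StatCache {
    /// Call after a successful full hash or download.
    fn record(&mut self, name: &str, path: &Path, sha256: &str) -> io::Result<()> {
        let (size, mtime_nanos) = stat_signature(path)?;
        let entry = Entry { sha256: sha256.to_string(), size, mtime_nanos };
        self.entries.insert(name.to_string(), entry);
        Ok(())
    }

    /// True only when the expected digest AND the stat signature both match.
    /// Any error or mismatch returns false so the caller re-hashes.
    fn is_verified(&self, name: &str, path: &Path, expected_sha256: &str) -> bool {
        let Some(entry) = self.entries.get(name) else {
            return false;
        };
        if entry.sha256 != expected_sha256 {
            return false; // manifest was updated
        }
        match stat_signature(path) {
            Ok((size, mtime)) => size == entry.size && mtime == entry.mtime_nanos,
            Err(_) => false, // file missing or unreadable
        }
    }
}
```

Returning `false` on every failure path keeps the cache a pure optimization: the worst case is one extra full hash.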

Also streams sha256_file in 1MiB chunks instead of slurping whole files into memory (assets run to hundreds of MB).
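A chunked read loop of this kind might look like the sketch below. To stay dependency-free it feeds a stand-in 64-bit FNV-1a digest, which is not cryptographic; the real `sha256_file` would update a SHA-256 hasher instead. `digest_streaming`, `Fnv1a`, and `CHUNK_SIZE` are illustrative names, not identifiers from this PR.

```rust
use std::io::{self, Read};

const CHUNK_SIZE: usize = 1024 * 1024; // 1 MiB

/// Stand-in digest (64-bit FNV-1a). NOT cryptographic; it only keeps the
/// sketch self-contained. A real implementation would use a SHA-256 hasher.
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf29ce484222325)
    }

    fn update(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hashes a reader in fixed-size chunks, so peak memory is one buffer
/// regardless of how large the underlying file is.
fn digest_streaming<R: Read>(mut reader: R) -> io::Result<u64> {
    let mut hasher = Fnv1a::new();
    let mut buf = vec![0u8; CHUNK_SIZE];
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Only the first `n` bytes of the buffer are valid for this read.
        hasher.update(&buf[..n]);
    }
    Ok(hasher.finish())
}
```

Passing a `std::fs::File` as the reader gives the file-hashing case; the result is the same as hashing the whole byte slice at once.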

Measured in arcbox (isolated daemon, warm boot): daemon-start → VMM-start dropped from 434ms to 4ms once the cache is populated.

Release

Bumped to 0.5.2. After merge this needs a crates.io publish, then arcbox bumps arcbox-boot = "0.5.2" (arcboxlabs/arcbox#301 references this).

prepare_binaries and ensure_file SHA-256 every asset on every call; for the ArcBox daemon that meant re-hashing ~230MB of runtime binaries (~400ms) on each boot. Record (size, mtime, sha256) in a .verified.json next to the assets after a successful verification or download, and trust it while the stat signature and expected digest are unchanged. Missing/corrupt cache degrades to a full re-hash. Also stream sha256_file in 1MiB chunks instead of reading whole files into memory. Bump to 0.5.2.

pullfrog

Important

The cache trades away boot-time tamper detection for the security-critical kernel and rootfs. Please confirm this matches the intended threat model before merging — the change is otherwise clean and well-tested.

Reviewed changes — a stat-signature verification cache so steady-state boots skip re-hashing already-verified assets, plus a streaming hash and a patch release bump.

- Add VerifyCache (src/verify_cache.rs): records (sha256, size, mtime) per asset in .verified.json; is_verified trusts the recorded digest only while both the expected sha256 and the stat signature match, otherwise the caller falls back to a full re-hash. Load tolerates a missing/corrupt file, save is atomic and dirty-gated.
- Thread the cache through ensure_file (src/asset_manager.rs): prepare() loads once, passes &mut cache for kernel and rootfs, and save()s at the end; the prior exists-and-rehash path now also records.
- Thread the cache through prepare_binaries (src/download.rs): same pattern with its own cache in dest_dir.
- Stream sha256_file in 1 MiB chunks: replaces fs::read of the whole file to cap peak memory on hundred-MB assets.
- Release plumbing: filetime/tempfile dev-deps, filetime 0.2.27→0.2.29 in Cargo.lock (drops libredox/redox_syscall/plain transitives), crate 0.5.1→0.5.2.

⚠️ Cache removes boot-time tamper detection for kernel and rootfs

Without the cache, every boot re-hashes each asset and compares it to the manifest digest, so on-disk corruption or tampering of the kernel/rootfs is caught before the asset is used. With the cache, a file modified in place while keeping its size and mtime is trusted on the strength of a stale .verified.json entry, with no re-hash. Preserving mtime is trivial (touch -d/filetime), and the attacker need not touch the cache file at all — the old entry already matches the manifest digest.
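To make the concern concrete, here is a small sketch of how a same-size, in-place modification can leave the stat signature untouched. It uses only the standard library (`File::set_modified`, stable since Rust 1.75) rather than the `filetime` crate; `stat_signature` and `modify_preserving_signature` are hypothetical helpers written for this illustration.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;
use std::time::SystemTime;

/// The (size, mtime) pair a stat-based cache would compare.
fn stat_signature(path: &Path) -> io::Result<(u64, SystemTime)> {
    let meta = fs::metadata(path)?;
    Ok((meta.len(), meta.modified()?))
}

/// Overwrites `path` in place with same-length contents, then restores the
/// original mtime. Afterwards the stat signature is identical even though
/// the bytes (and therefore the real digest) have changed.
fn modify_preserving_signature(path: &Path, new_contents: &[u8]) -> io::Result<()> {
    let (size, mtime) = stat_signature(path)?;
    assert_eq!(size, new_contents.len() as u64, "replacement must keep the size");

    // Open without truncation so the file length never changes.
    let mut file = OpenOptions::new().write(true).open(path)?;
    file.write_all(new_contents)?;

    // Put the old modification time back.
    file.set_modified(mtime)?;
    Ok(())
}
```

A cache that compares only the expected digest, size, and mtime cannot distinguish the file before and after this call; only a content hash can.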

For a pure download-integrity goal (catch corrupt/partial fetches at fetch time) this is fine, and the perf win is real. The open question is whether re-verifying security-critical boot assets on every boot was a deliberate defense-in-depth property you intend to keep.

Technical details

# Cache removes boot-time tamper detection for kernel and rootfs

## Affected sites
- `src/verify_cache.rs:65` — `is_verified` returns true on (expected sha256 + size + mtime) match without reading file contents.
- `src/asset_manager.rs` `ensure_file` — kernel/rootfs now short-circuit on `is_verified`.
- `src/download.rs` `prepare_binaries` — runtime binaries short-circuit on `is_verified`.

## Required outcome
- An explicit decision on whether boot-time re-verification of kernel/rootfs is in-scope for the threat model, documented in the PR or module docs.

## Suggested approach (optional)
- If boot-time tamper detection matters: keep the cache for the bulk runtime binaries but force a full re-hash for `kernel`/`rootfs` (or gate caching of those two behind a flag), since they are the highest-value targets and only two files.
- If it does not matter (assets live in a trusted, integrity-protected store): no code change needed — just record the accepted tradeoff so it isn't silently reintroduced as a "bug" later.
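
One possible shape for the first option, as a sketch only: the policy is reduced to a pure function so the decision is explicit and testable. `AssetKind`, `Verification`, `choose_verification`, and the `cache_boot_assets` flag are all hypothetical names, not part of the current code.

```rust
/// Hypothetical classification of the assets this crate prepares.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AssetKind {
    Kernel,
    Rootfs,
    RuntimeBinary,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verification {
    /// Skip hashing; the stat signature and expected digest matched.
    TrustCache,
    /// Read the whole file and compare against the manifest digest.
    FullRehash,
}

/// `cache_hit` is the result of the stat-signature lookup.
/// `cache_boot_assets` is the assumed opt-in flag that allows caching
/// for kernel and rootfs as well.
fn choose_verification(kind: AssetKind, cache_hit: bool, cache_boot_assets: bool) -> Verification {
    if !cache_hit {
        return Verification::FullRehash;
    }
    let security_critical = matches!(kind, AssetKind::Kernel | AssetKind::Rootfs);
    if security_critical && !cache_boot_assets {
        // Only two files, so the cost of always re-hashing them is bounded.
        Verification::FullRehash
    } else {
        Verification::TrustCache
    }
}
```

With the flag off, runtime binaries keep the fast path while kernel and rootfs are always re-hashed.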

## Open questions for the human
- What is the trust boundary on the cache directory? If anything that can write the asset can also write `.verified.json`, re-hashing still mattered (the attacker cannot forge a manifest-matching digest), so the regression is real regardless of cache-file integrity.

ℹ️ Concurrent `prepare` runs can drop cache entries

Two processes preparing the same directory each load, mutate, and atomically rename their own copy of .verified.json, so the last writer wins and the other's newly recorded entries are lost. The atomic rename prevents a corrupt file, and a lost entry only costs a re-hash on the next run, so this is mergeable as-is — noting it so the last-writer-wins behavior is a known property rather than a surprise.
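The load, mutate, rename sequence behind this can be sketched as below. This is a simplified model, not the PR's implementation: entries are plain lines instead of JSON, and `load`, `save_atomic`, and the `writer_id` suffix for the temp file are assumptions made for the illustration.

```rust
use std::collections::BTreeSet;
use std::fs;
use std::io;
use std::path::Path;

/// Loads the recorded entries; a missing or unreadable cache degrades to empty.
fn load(path: &Path) -> BTreeSet<String> {
    fs::read_to_string(path)
        .map(|s| s.lines().map(str::to_owned).collect())
        .unwrap_or_default()
}

/// Writes the full set to a temp file in the same directory, then renames it
/// over the target. Readers see either the old file or the new one, never a
/// partial write, but whatever another writer saved in between is replaced.
fn save_atomic(path: &Path, entries: &BTreeSet<String>, writer_id: &str) -> io::Result<()> {
    let tmp = path.with_extension(format!("tmp-{writer_id}"));
    let body: String = entries.iter().map(|e| format!("{e}\n")).collect();
    fs::write(&tmp, body)?;
    fs::rename(&tmp, path)
}
```

If two writers both call `load` before either calls `save_atomic`, the second rename replaces the first writer's file wholesale, so the first writer's new entry is gone. Re-loading and merging immediately before the rename would narrow the window but not close it without a lock.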


pullfrog Bot reviewed Jun 12, 2026

AprilNEA wants to merge 1 commit into master from perf/verification-cache
