Feature/sparse mem checkpoints #1176

protoben · 2025-12-18T19:49:21Z

This adds two new "lazy trace iterators" (impls of Iterator<Item = Cycle>). CheckpointingTraceIter executes just like LazyTraceIter, but records in a hashmap the initial value of every memory location it accesses. Checkpoint uses a hashmap-based memory representation that stores the initial values for all memory accesses in a specific subsequence of the trace, which makes it much smaller than a full representation of memory; it can be replayed for the given number of trace steps.

In addition, we modify trace_checkpoints to use a CheckpointingTraceIter for its initial execution and produce a sequence of Checkpoint, rather than cloning the full vector representation of memory.

This adds a `MemoryData` trait for use inside the MMU's `MemoryBackend`. This allows us to use different data structures to represent RAM within the emulator. Specifically, we want a data structure that records the initial state of each accessed entry in memory as a "checkpoint", as well as one that allows replaying from such a checkpoint.
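As a rough illustration of the idea, the sketch below shows how a trait can decouple the backend from the storage layout. Only the names `MemoryData` and `MemoryBackend` come from this PR; the method names (`read_word`, `write_word`, `add_to_word`), the word-indexed addressing, and the zero-default sparse map are assumptions made for the example and do not reflect the actual trait.

```rust
use std::collections::HashMap;

/// Hypothetical, pared-down stand-in for the `MemoryData` trait: word-indexed
/// storage that the backend can use without knowing the layout.
pub trait MemoryData {
    /// Reads take `&mut self` so that an implementation may record the access.
    fn read_word(&mut self, index: usize) -> u64;
    fn write_word(&mut self, index: usize, value: u64);
}

/// Dense representation: one slot per word of RAM.
impl MemoryData for Vec<u64> {
    fn read_word(&mut self, index: usize) -> u64 {
        self[index]
    }
    fn write_word(&mut self, index: usize, value: u64) {
        self[index] = value;
    }
}

/// Sparse representation: only written words are stored. Treating missing
/// words as zero is an assumption of this sketch.
impl MemoryData for HashMap<usize, u64> {
    fn read_word(&mut self, index: usize) -> u64 {
        self.get(&index).copied().unwrap_or(0)
    }
    fn write_word(&mut self, index: usize, value: u64) {
        self.insert(index, value);
    }
}

/// Stand-in for the MMU's `MemoryBackend`, generic over the data structure.
pub struct MemoryBackend<D: MemoryData> {
    pub data: D,
}

impl<D: MemoryData> MemoryBackend<D> {
    /// Adds `delta` to the word at `index` and returns the new value.
    pub fn add_to_word(&mut self, index: usize, delta: u64) -> u64 {
        let updated = self.data.read_word(index).wrapping_add(delta);
        self.data.write_word(index, updated);
        updated
    }
}
```

The same `add_to_word` code runs against either representation, which is the property the emulator needs in order to swap in checkpointing or replayable storage.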

This adds the `CheckpointingMemory` and `ReplayableMemory` structs, which can be used as the underlying memory data structure for the emulator. The former allows saving the initial values of memory accessed during a given chunk of execution as a hashmap, and the latter uses a hashmap to replay those memory accesses.
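The record-then-replay relationship between the two structs can be sketched as follows. This is a simplified model, not the code in this PR: the fields, the `take_checkpoint` helper, and the `String` error type are invented for illustration. The error on an untouched word mirrors the "Error on uninitialized access in replayable memory" commit, but its exact form here is a guess.

```rust
use std::collections::HashMap;

/// Full memory that remembers the value each word held when it was first
/// touched. Field and method names are illustrative only.
pub struct CheckpointingMemory {
    words: Vec<u64>,
    initial_values: HashMap<usize, u64>,
}

impl CheckpointingMemory {
    pub fn new(words: Vec<u64>) -> Self {
        Self { words, initial_values: HashMap::new() }
    }

    /// Records the pre-access value the first time `index` is touched.
    fn record(&mut self, index: usize) {
        let current = self.words[index];
        self.initial_values.entry(index).or_insert(current);
    }

    pub fn read(&mut self, index: usize) -> u64 {
        self.record(index);
        self.words[index]
    }

    pub fn write(&mut self, index: usize, value: u64) {
        self.record(index);
        self.words[index] = value;
    }

    /// Hands out the recorded values and starts a fresh chunk.
    pub fn take_checkpoint(&mut self) -> ReplayableMemory {
        ReplayableMemory { words: std::mem::take(&mut self.initial_values) }
    }
}

/// Sparse memory holding only the words a chunk is known to access.
pub struct ReplayableMemory {
    words: HashMap<usize, u64>,
}

impl ReplayableMemory {
    /// Fails if the recorded chunk never touched `index`.
    pub fn read(&self, index: usize) -> Result<u64, String> {
        self.words
            .get(&index)
            .copied()
            .ok_or_else(|| format!("uninitialized access at word {}", index))
    }

    /// Overwrites a word the chunk touched; fails for any other word.
    pub fn write(&mut self, index: usize, value: u64) -> Result<(), String> {
        match self.words.get_mut(&index) {
            Some(slot) => {
                *slot = value;
                Ok(())
            }
            None => Err(format!("uninitialized access at word {}", index)),
        }
    }

    /// Number of words stored in the checkpoint.
    pub fn len(&self) -> usize {
        self.words.len()
    }
}
```

Because the checkpoint stores the value seen before the first access rather than the latest value, replaying the same accesses against it reproduces what the original run observed.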

For now, we rename Emulator to GeneralizedEmulator<D> and Cpu to GeneralizedCpu<D>, and make Emulator and Cpu type aliases in which D is instantiated as Vec<u64>, as before. This avoids making many upstream changes to jolt-core for the time being. This commit touches many files because generics must be added everywhere Cpu is used, especially in the instruction types.
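The alias pattern can be sketched in a few lines. The struct fields and the two helper functions (`load_at_pc`, `advance`) are hypothetical and exist only to show that code naming `Cpu` keeps compiling unchanged.

```rust
/// Generic over the memory representation `D` (fields are illustrative).
pub struct GeneralizedCpu<D> {
    pub pc: u64,
    pub memory: D,
}

/// Existing call sites keep naming `Cpu` and get the dense representation.
pub type Cpu = GeneralizedCpu<Vec<u64>>;

/// Pre-existing style of code: mentions only `Cpu`, no generics.
pub fn load_at_pc(cpu: &Cpu) -> u64 {
    cpu.memory[cpu.pc as usize]
}

/// New code can stay generic over `D` where it does not care about layout.
pub fn advance<D>(cpu: &mut GeneralizedCpu<D>) {
    cpu.pc += 1;
}
```

Downstream code that only names `Cpu` needs no edits, while code inside the emulator pays the cost of threading `D` through its signatures.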

The checkpoint can be called on an underlying memory capable of saving replayable checkpoints. It clones the corresponding data structure, with the exception of the memory, which it extracts with a memory-efficient hashmap backend.

Now that we need to record accesses to memory in the emulator, the read interface needs to take the memory mutably. This makes it difficult to use in jolt-core, where the memory needs to be read within a ParIter. In addition, allowing external modification of the emulator memory is probably not desirable. Thus, we make the read/write interface private to the crate and add a collection of simple getters.
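A minimal sketch of the borrowing issue, under stated assumptions: the `Memory` struct, the `reads_seen` counter, the `get_word` getter, and `parallel_sum` are all made up for this example, and scoped standard-library threads stand in for the ParIter used in jolt-core.

```rust
use std::thread;

pub struct Memory {
    words: Vec<u64>,
    reads_seen: usize, // stand-in for whatever bookkeeping a read performs
}

impl Memory {
    pub fn new(words: Vec<u64>) -> Self {
        Self { words, reads_seen: 0 }
    }

    /// Crate-internal read: needs `&mut self` because it records the access.
    pub(crate) fn read(&mut self, index: usize) -> u64 {
        self.reads_seen += 1;
        self.words[index]
    }

    /// Hypothetical getter for outside callers: no recording, so `&self`
    /// suffices and the value can be fetched from many threads at once.
    pub fn get_word(&self, index: usize) -> u64 {
        self.words[index]
    }

    pub fn reads_seen(&self) -> usize {
        self.reads_seen
    }
}

/// Sums the first `len` words using `workers` threads that share `&Memory`.
/// The same closure could not call `read`, since that requires `&mut Memory`.
pub fn parallel_sum(memory: &Memory, len: usize, workers: usize) -> u64 {
    thread::scope(|scope| {
        let handles: Vec<_> = (0..workers)
            .map(|w| {
                scope.spawn(move || {
                    (w..len)
                        .step_by(workers)
                        .map(|i| memory.get_word(i))
                        .sum::<u64>()
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    })
}
```

Keeping the recording read private also means external callers cannot perturb the set of recorded accesses.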

This replaces the full clones of `LazyTraceIter` in `trace_checkpoints` with hashmap-based checkpoints, produced using `CheckpointingTraceIter`. This also entails a few changes to the `MemoryData` interface and associated interfaces:

- We modify `CheckpointingMemory` to not track memory accesses initially. The `start_recording_checkpoints` function starts the tracking of memory accesses. This is in order to avoid tracking the initial access that set up the bytecode values in memory, since tracking these causes zeros (the pre-setup value) to be recorded for the bytecode.
- We remove the `Default` constraint for `MemoryData` and impl for `MemoryBackend` to prevent `std::mem::take` being called on the `MemoryBackend`. Doing so causes values such as the memory capacity to be zeroed out, which interferes with checkpoint collection. Instead, we add a method `take_as_vec_memory` to take the `Vec<u64>` while leaving the rest of the data intact.
- We modify the `Iterator` impls of `CheckpointingMemory` and `Checkpoint` to track trace steps, rather than the notion of a "cycle", as tracked by `cycle_count`. This allows us to ensure that a checkpoint will produce exactly the desired number of entries in its trace.
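The first two changes can be illustrated with a toy model. The names `start_recording_checkpoints` and `take_as_vec_memory` come from this PR, but their signatures, the struct fields, and the `recorded` helper below are assumptions made for the sketch.

```rust
use std::collections::HashMap;

pub struct CheckpointingMemory {
    words: Vec<u64>,
    initial_values: HashMap<usize, u64>,
    recording: bool,
}

impl CheckpointingMemory {
    /// Starts zeroed and with recording switched off.
    pub fn new(len: usize) -> Self {
        Self { words: vec![0; len], initial_values: HashMap::new(), recording: false }
    }

    /// Accesses made before this call (e.g. loading bytecode) are not tracked.
    pub fn start_recording_checkpoints(&mut self) {
        self.recording = true;
    }

    pub fn write(&mut self, index: usize, value: u64) {
        if self.recording {
            let before = self.words[index];
            self.initial_values.entry(index).or_insert(before);
        }
        self.words[index] = value;
    }

    /// Hypothetical inspection helper: the recorded initial value, if any.
    pub fn recorded(&self, index: usize) -> Option<u64> {
        self.initial_values.get(&index).copied()
    }
}

/// Simplified backend: `capacity` must survive taking the memory out.
pub struct MemoryBackend {
    pub capacity: u64,
    pub data: Vec<u64>,
}

impl MemoryBackend {
    /// Moves the words out and leaves an empty `Vec` behind, without
    /// resetting `capacity` the way `std::mem::take` on the whole struct would.
    pub fn take_as_vec_memory(&mut self) -> Vec<u64> {
        std::mem::take(&mut self.data)
    }
}
```

With recording off during setup, the first tracked access to a bytecode word sees the loaded value rather than the pre-setup zero.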

This commit re-implements `LazyTraceIterator` as a newtype wrapper around the `LazyTracer` trait. This allows us to have one implementation of `Iterator<Item = Cycle>`, rather than 3.
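One way to picture the newtype approach is sketched below. The `Cycle` placeholder, the `next_cycle` method, and the two toy tracers are hypothetical; the PR does not show the real definition of `LazyTracer`, so treat this as a guess at the general shape.

```rust
/// Placeholder for the real `Cycle` type.
#[derive(Debug, Clone, PartialEq)]
pub struct Cycle(pub u64);

/// Hypothetical shape of the `LazyTracer` trait: anything that can produce
/// the next trace entry on demand.
pub trait LazyTracer {
    fn next_cycle(&mut self) -> Option<Cycle>;
}

/// Newtype wrapper that carries the single `Iterator` implementation.
pub struct LazyTraceIterator<T: LazyTracer>(pub T);

impl<T: LazyTracer> Iterator for LazyTraceIterator<T> {
    type Item = Cycle;

    fn next(&mut self) -> Option<Cycle> {
        self.0.next_cycle()
    }
}

/// Toy tracer that generates cycles `next..end`.
pub struct CountingTracer {
    pub next: u64,
    pub end: u64,
}

impl LazyTracer for CountingTracer {
    fn next_cycle(&mut self) -> Option<Cycle> {
        if self.next >= self.end {
            return None;
        }
        let cycle = Cycle(self.next);
        self.next += 1;
        Some(cycle)
    }
}

/// Toy tracer that replays previously recorded cycles.
pub struct ReplayTracer {
    pub recorded: Vec<Cycle>,
    pub position: usize,
}

impl LazyTracer for ReplayTracer {
    fn next_cycle(&mut self) -> Option<Cycle> {
        let cycle = self.recorded.get(self.position).cloned();
        self.position += 1;
        cycle
    }
}
```

Each tracer implements only the trait; the iterator plumbing lives in one place, so adding a further tracer does not add another `Iterator` impl.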

protoben added 13 commits December 16, 2025 13:23

Implement checkpoint saving for mmu, cpu, and emulator

d3125f5

Implement checkpoint-generating and replaying trace iterators

d453c07

Run cargo fmt

29bb0b5

Error on uninitialized access in replayable memory

fc1b984

Re-enable tests for trace iterator

56e41b8

Add trace length test for CheckpointingMemory

7d8072c

Checkpointing WIP

6c6428b

Merge branch 'main' into feature/sparse-mem-checkpoints

b25c35e

protoben changed the title ~~Draft: Feature/sparse mem checkpoints~~ Feature/sparse mem checkpoints Jan 3, 2026

protoben added 3 commits January 5, 2026 13:11

Eliminate code duplication in multiple iterators of Iterator<Cycle>

91b0c83

Fix some typoes

709766d

Eliminate the Vec<u64> memory backend in favor of `CheckpointingMemory`

598e508
