Contributor

@protoben commented Dec 18, 2025

Closes #1137

This adds two new "lazy trace iterators" (impls of `Iterator<Item = Cycle>`):
- CheckpointingTraceIter executes just like LazyTraceIter, but records the initial values of all memory accesses in a hashmap.
- Checkpoint uses a hashmap-based memory representation that stores the initial values for all memory accesses in a specific subsequence of the trace. This makes it much smaller than a full representation of memory. It can be replayed for the given number of trace steps.

In addition, we modify trace_checkpoints to use a CheckpointingTraceIter for its initial execution and produce a sequence of Checkpoint, rather than cloning the full vector representation of memory.
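
As a rough illustration of why a per-chunk hashmap can be much smaller than a full memory clone, here is a minimal sketch. The `Access` and `SparseCheckpoint` types and the `sparse_checkpoints` function are hypothetical stand-ins, not the PR's actual types; `chunk_len` is assumed to be non-zero.

```rust
use std::collections::HashMap;

/// Hypothetical memory access performed by one trace step.
#[derive(Clone, Copy)]
pub enum Access {
    Read(usize),
    Write(usize, u64),
}

/// Hypothetical sparse checkpoint: the value each touched address held
/// when the chunk started, plus the number of steps the chunk covers.
pub struct SparseCheckpoint {
    pub initial: HashMap<usize, u64>,
    pub steps: usize,
}

/// Runs `trace` against `mem`, emitting one sparse checkpoint per chunk
/// of at most `chunk_len` steps (`chunk_len` must be non-zero).
pub fn sparse_checkpoints(
    mem: &mut [u64],
    trace: &[Access],
    chunk_len: usize,
) -> Vec<SparseCheckpoint> {
    trace
        .chunks(chunk_len)
        .map(|chunk| {
            let mut initial = HashMap::new();
            for access in chunk {
                let addr = match *access {
                    Access::Read(a) | Access::Write(a, _) => a,
                };
                // Only the first touch of an address in this chunk records a value.
                initial.entry(addr).or_insert(mem[addr]);
                if let Access::Write(a, v) = *access {
                    mem[a] = v;
                }
            }
            SparseCheckpoint { initial, steps: chunk.len() }
        })
        .collect()
}
```

In this sketch each checkpoint holds at most `chunk_len` entries, regardless of how large `mem` is.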

This adds a `MemoryData` trait for use inside the MMU's `MemoryBackend`.
This allows us to use different data structures to represent RAM within
the emulator. Specifically, we want a data structure that records the
initial state of each accessed entry in memory as a "checkpoint", as well
as one that allows replaying from such a checkpoint.
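
A trait of this kind might look roughly like the sketch below. The `WordMemory` trait, its method names, and the `Backend` struct are assumptions made for illustration; the actual `MemoryData` and `MemoryBackend` signatures are not shown in this description.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a `MemoryData`-style trait: word-indexed RAM
/// whose reads take `&mut self` so that implementations may record accesses.
pub trait WordMemory {
    fn read_word(&mut self, index: usize) -> u64;
    fn write_word(&mut self, index: usize, value: u64);
}

/// Dense representation: one slot per word.
impl WordMemory for Vec<u64> {
    fn read_word(&mut self, index: usize) -> u64 {
        self[index]
    }
    fn write_word(&mut self, index: usize, value: u64) {
        self[index] = value;
    }
}

/// Sparse representation: absent entries read as zero (an assumption of
/// this sketch, not necessarily the PR's behavior).
impl WordMemory for HashMap<usize, u64> {
    fn read_word(&mut self, index: usize) -> u64 {
        self.get(&index).copied().unwrap_or(0)
    }
    fn write_word(&mut self, index: usize, value: u64) {
        self.insert(index, value);
    }
}

/// Hypothetical backend that is generic over its data structure.
pub struct Backend<D: WordMemory> {
    pub data: D,
}

impl<D: WordMemory> Backend<D> {
    /// Read-modify-write that works identically for every representation.
    pub fn add_in_place(&mut self, index: usize, delta: u64) -> u64 {
        let value = self.data.read_word(index).wrapping_add(delta);
        self.data.write_word(index, value);
        value
    }
}
```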

This adds the `CheckpointingMemory` and `ReplayableMemory` structs, which
can be used as the underlying memory data structure for the emulator. The
former allows saving the initial values of memory accessed during a given
chunk of execution as a hashmap, and the latter uses a hashmap to replay
those memory accesses.
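
The record-then-replay idea can be sketched as follows. `Recorder` and `Replayer` are hypothetical names standing in for the two roles described above; their fields and methods are assumptions, not the PR's API.

```rust
use std::collections::HashMap;

/// Hypothetical recorder: full RAM plus a log of first-touch values.
pub struct Recorder {
    ram: Vec<u64>,
    initial: HashMap<usize, u64>,
}

impl Recorder {
    pub fn new(ram: Vec<u64>) -> Self {
        Self { ram, initial: HashMap::new() }
    }

    pub fn read(&mut self, index: usize) -> u64 {
        let value = self.ram[index];
        // Keep the value seen at the first access in this chunk.
        self.initial.entry(index).or_insert(value);
        value
    }

    pub fn write(&mut self, index: usize, value: u64) {
        let old = self.ram[index];
        // A write also counts as an access: remember what was there before.
        self.initial.entry(index).or_insert(old);
        self.ram[index] = value;
    }

    /// Hands back the recorded initial values and starts a fresh chunk.
    pub fn take_checkpoint(&mut self) -> HashMap<usize, u64> {
        std::mem::take(&mut self.initial)
    }
}

/// Hypothetical replayer: backed only by a checkpoint's hashmap.
pub struct Replayer {
    cells: HashMap<usize, u64>,
}

impl Replayer {
    pub fn new(checkpoint: HashMap<usize, u64>) -> Self {
        Self { cells: checkpoint }
    }

    /// Returns `None` for an address the recorded chunk never touched.
    pub fn read(&self, index: usize) -> Option<u64> {
        self.cells.get(&index).copied()
    }

    pub fn write(&mut self, index: usize, value: u64) {
        self.cells.insert(index, value);
    }
}
```

Replaying the same sequence of accesses against a `Replayer` built from the checkpoint yields the same values the `Recorder` returned, without needing the full `ram` vector.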

For now, we rename Emulator to GeneralizedEmulator<D> and Cpu to
GeneralizedCpu<D>, and make Emulator and Cpu type aliases that
instantiate D as Vec<u64>, as before. This avoids making many upstream
changes to jolt-core, for the time being.

This commit touches many files, due to needing to add generics to all of
the many places Cpu is used, especially the instruction types.
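
The alias pattern can be sketched in a few lines. `GenericCpu`, its fields, and `run_two_steps` are hypothetical and much simpler than the real `GeneralizedCpu<D>`; the point is only that code written against the old name keeps compiling.

```rust
/// Hypothetical generic CPU parameterized over its memory data type `D`.
pub struct GenericCpu<D> {
    pub pc: u64,
    pub memory: D,
}

/// The old name keeps working as an alias for the previous memory layout.
pub type Cpu = GenericCpu<Vec<u64>>;

impl<D> GenericCpu<D> {
    pub fn with_memory(memory: D) -> Self {
        Self { pc: 0, memory }
    }

    /// Toy step: advance the program counter by one 4-byte instruction.
    pub fn step(&mut self) {
        self.pc = self.pc.wrapping_add(4);
    }
}

/// Downstream code written against `Cpu` compiles unchanged.
pub fn run_two_steps(cpu: &mut Cpu) -> u64 {
    cpu.step();
    cpu.step();
    cpu.pc
}
```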

The checkpoint method can be called when the underlying memory is capable
of saving replayable checkpoints. It clones the corresponding data
structure, with the exception of the memory, which it extracts into a
memory-efficient hashmap backend.

Now that we need to record accesses to memory in the emulator, the read
interface needs to take the memory mutably. This makes it difficult to use
in jolt-core, where the memory needs to be read within a ParIter. In
addition, allowing external modification of the emulator memory is
probably not desirable. Thus, we make the read/write interface private to
the crate and add a collection of simple getters.
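
The split between a crate-private recording read and public `&self` getters can be sketched as below. `TrackedMemory` and its methods are hypothetical, and `std::thread::scope` is used here as a dependency-free stand-in for a rayon `ParIter`.

```rust
use std::collections::HashSet;
use std::thread;

/// Hypothetical memory that records which words were read.
pub struct TrackedMemory {
    words: Vec<u64>,
    touched: HashSet<usize>,
}

impl TrackedMemory {
    pub fn new(words: Vec<u64>) -> Self {
        Self { words, touched: HashSet::new() }
    }

    /// Emulator-internal read: it records the access, so it needs `&mut self`.
    #[allow(dead_code)]
    pub(crate) fn read(&mut self, index: usize) -> u64 {
        self.touched.insert(index);
        self.words[index]
    }

    /// Public getter: no recording, so `&self` suffices and the memory can
    /// be shared across threads.
    pub fn get_word(&self, index: usize) -> u64 {
        self.words[index]
    }

    pub fn touched_count(&self) -> usize {
        self.touched.len()
    }
}

/// Sums the first `len` words from two threads sharing `&TrackedMemory`.
pub fn parallel_sum(mem: &TrackedMemory, len: usize) -> u64 {
    let mid = len / 2;
    thread::scope(|s| {
        let left = s.spawn(|| (0..mid).map(|i| mem.get_word(i)).sum::<u64>());
        let right = s.spawn(|| (mid..len).map(|i| mem.get_word(i)).sum::<u64>());
        left.join().unwrap() + right.join().unwrap()
    })
}
```

Calling `read` from inside the spawned closures would not compile, since both threads would need `&mut TrackedMemory` at once; the getter avoids that.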

This replaces the full clones of `LazyTraceIter` in `trace_checkpoints`
with hashmap-based checkpoints, produced using `CheckpointingTraceIter`.
This also entails a few changes to the `MemoryData` interface and
associated interfaces:
- We modify `CheckpointingMemory` to not track memory accesses initially.
  The `start_recording_checkpoints` function starts the tracking of memory
  accesses. This avoids tracking the initial accesses that set up the
  bytecode values in memory, since tracking these causes zeros (the
  pre-setup value) to be recorded for the bytecode.
- We remove the `Default` constraint on `MemoryData` and the corresponding
  impl for `MemoryBackend` to prevent `std::mem::take` being called on the
  `MemoryBackend`. Doing so causes values such as the memory capacity to be
  zeroed out, which interferes with checkpoint collection. Instead, we add a
  method `take_as_vec_memory` to take the `Vec<u64>` while leaving the rest
  of the data intact.
- We modify the `Iterator` impls of `CheckpointingMemory` and `Checkpoint`
  to track trace steps, rather than the notion of a "cycle", as tracked by
  `cycle_count`. This allows us to ensure that a checkpoint will produce
  exactly the desired number of entries in its trace.
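
The `std::mem::take` pitfall from the second bullet can be reproduced with a small sketch. The `Backend` struct, its fields, and `take_data` are hypothetical; `take_data` only illustrates the idea behind `take_as_vec_memory` and is not its actual signature.

```rust
/// Hypothetical backend with a `Default` impl, to show what the PR avoids.
#[derive(Default, Debug)]
pub struct Backend {
    pub data: Vec<u64>,
    pub capacity: u64,
}

impl Backend {
    /// Moves the word vector out, leaving `capacity` untouched.
    pub fn take_data(&mut self) -> Vec<u64> {
        std::mem::take(&mut self.data)
    }
}

/// Returns the `capacity` left behind by each approach.
pub fn capacities_after_take() -> (u64, u64) {
    let mut whole = Backend { data: vec![1, 2, 3], capacity: 3 };
    // Taking the whole struct replaces every field with its default.
    let _moved = std::mem::take(&mut whole);

    let mut partial = Backend { data: vec![1, 2, 3], capacity: 3 };
    // Taking only the vector resets `data` and nothing else.
    let _data = partial.take_data();

    (whole.capacity, partial.capacity)
}
```

Dropping the `Default` impl, as the bullet describes, turns the first form into a compile error instead of a silent reset.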

@protoben changed the title from "Draft: Feature/sparse mem checkpoints" to "Feature/sparse mem checkpoints" on Jan 3, 2026

This commit re-implements `LazyTraceIterator` as a newtype wrapper around
the `LazyTracer` trait. This allows us to have one implementation of
`Iterator<Item = Cycle>`, rather than three.
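
The newtype-over-trait pattern can be sketched as follows. `Tracer`, `TraceIter`, `Countdown`, and `Repeat` are hypothetical names; the real `LazyTracer` trait and its methods are not shown in this description.

```rust
/// Hypothetical stand-in for a `LazyTracer`-style trait.
pub trait Tracer {
    type Item;
    fn next_step(&mut self) -> Option<Self::Item>;
}

/// Newtype wrapper that carries the single `Iterator` impl.
pub struct TraceIter<T>(pub T);

impl<T: Tracer> Iterator for TraceIter<T> {
    type Item = T::Item;

    fn next(&mut self) -> Option<Self::Item> {
        self.0.next_step()
    }
}

/// First toy tracer: counts down to zero.
pub struct Countdown(pub u32);

impl Tracer for Countdown {
    type Item = u32;

    fn next_step(&mut self) -> Option<u32> {
        if self.0 == 0 {
            return None;
        }
        self.0 -= 1;
        Some(self.0)
    }
}

/// Second toy tracer: yields the same value a fixed number of times.
pub struct Repeat {
    pub value: u32,
    pub remaining: usize,
}

impl Tracer for Repeat {
    type Item = u32;

    fn next_step(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        Some(self.value)
    }
}
```

A blanket `impl<T: Tracer> Iterator for T` is rejected by Rust's coherence rules, which is one common reason to reach for a newtype here; whether that was the motivation in this PR is not stated.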
Development

Successfully merging this pull request may close these issues.

Compact memory representation for streaming checkpoints
