-
Notifications
You must be signed in to change notification settings - Fork 279
Feature/sparse mem checkpoints #1176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
protoben
wants to merge
16
commits into
a16z:main
Choose a base branch
from
GaloisInc:feature/sparse-mem-checkpoints
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
+1,998
−1,135
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This adds a `MemoryData` trait for use inside the MMU's `MemoryBackend`. This allows us to use different data structures to represent RAM within the emulator. Specifically, we want a data structure that records the initial state of each accessed entry in memory as a "checkpoint", as well as one that allows replaying from such a checkpoint.
This adds the `CheckpointingMemory` and `ReplayableMemory` structs, which can be used as the underlying memory data structure for the emulator. The former allows saving the initial values of memory accessed during a given chunk of execution as a hashmap, and the latter uses a hashmap to replay those memory accesses.
For now, we rename Emulator to GeneralizedEmulator<D> and Cpu to GeneralizedCpu<D> and make Emulator and Cpu type aliases, where D is instantiated as Vec<u64>, as before. This avoids making many upstream changes to jolt-core, for the time being. This commit touches many files, due to needing to add generics to all of the many places Cpu is used, especially the instruction types.
The checkpoint can be called on an underlying memory capable of saving replayable checkpoints. It clones the corresponding data structure, with the exception of the memory, which it extracts with a memory-efficient hashmap backend.
Now that we need to record accesses to memory in the emulator, the read interface needs to take the memory mutably. This makes it difficult to use in jolt-core, where the memory needs to be read within a ParIter. In addition, allowing external modification of the emulator memory is probably not desirable. Thus, we make the read/write interface private to the crate and add a collection of simple getters.
This replaces the full clones of `LazyTraceIter` in `trace_checkpoints` with hashmap-based checkpoints, produced using `CheckpointingTraceIter`. This also entails a few changes to the `MemoryData` interface and associated interfaces: - We modify `CheckpointingMemory` to not track memory accesses initially. The `start_recording_checkpoints` function starts the tracking of memory accesses. This is in order to avoid tracking the initial access that set up the bytecode values in memory, since tracking these causes zeros (the pre-setup value) to be recorded for the bytecode. - We remove the `Default` constraint for `MemoryData` and impl for `MemoryBackend` to prevent `std::mem::take` being called on the `MemoryBackend`. Doing so causes values such as the memory capacity to be zeroed out, which interferes with checkpoint collection. Instead, we add a method `take_as_vec_memory` to take the `Vec<u64>` while leaving the rest of the data intact. - We modify the `Iterator` impls of `CheckpointingMemory` and `Checkpoint` to track trace steps, rather than the notion of a "cycle", as tracked by `cycle_count`. This allows us to ensure that a checkpoint will produce exactly the desired number of entries in its trace.
This commit re-implements `LazyTraceIterator` as a newtype wrapper around the `LazyTracer` trait. This allows us to have one implementation of `Iterator<Item = Cycle>`, rather than 3.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Closes #1137
This add two new "lazy trace iterators" (impls of
Iterator<Item = Cycle>):CheckpointingTraceIterexecutes just likeLazyTraceIter, but records the initial values of all memory accesses in a hashmap.Checkpointuses a hashmap-based memory representation that stores the initial values for all memory accesses in a specific subsequence of the trace, allowing it to be much smaller than a full representation of memory; it can be replayed for the given number of trace steps.In addition, we modify
trace_checkpointsto use aCheckpointingTraceIterfor its initial execution and produce a sequence ofCheckpoint, rather than cloning the full vector representation of memory.