Refactor bulk parallelism#221

Merged
noajshu merged 14 commits into quantumlib:main from noajshu:main
Mar 24, 2026


@noajshu noajshu commented Mar 24, 2026

The combination of an atomic counter and termination signaling is slightly non-obvious, and the same pattern is currently used in an ad hoc way in both the simplex and tesseract mains. The goal is to also reuse it in the Python API, as an alternative to sinter.

The API looks like this:

size_t parallel_for_shots_in_order(size_t num_shots, size_t num_threads, ProcessShot&& process_shot,
                                   ConsumeShot&& consume_shot)

Here, process_shot is called on a worker thread for each shot, and consume_shot is called on the main thread for each shot. consume_shot should return a bool; returning false signals termination.
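For readers unfamiliar with the pattern, here is a minimal sketch of how a function with this signature could combine an atomic counter with a stop flag. It is not the code merged in this PR: the template parameters, the mutex and condition variable used for ordering, and the choice to count the shot on which consume_shot returned false are all assumptions made for illustration.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Illustrative sketch only (assumed implementation, not the merged one).
// process_shot(thread_index, shot_index) runs on worker threads.
// consume_shot(shot_index) runs on the calling thread in order 0, 1, 2, ...
// and returns false to request early termination.
template <typename ProcessShot, typename ConsumeShot>
std::size_t parallel_for_shots_in_order(std::size_t num_shots,
                                        std::size_t num_threads,
                                        ProcessShot&& process_shot,
                                        ConsumeShot&& consume_shot) {
  if (num_threads == 0) num_threads = 1;

  std::atomic<std::size_t> next_shot{0};  // next unclaimed shot index
  std::atomic<bool> stop{false};          // termination signal for workers
  std::vector<char> done(num_shots, 0);   // per-shot completion flags (guarded by mu)
  std::mutex mu;
  std::condition_variable cv;

  std::vector<std::thread> workers;
  workers.reserve(num_threads);
  for (std::size_t t = 0; t < num_threads; ++t) {
    workers.emplace_back([&, t] {
      while (!stop.load()) {
        // Claim the next shot; each index is handed to exactly one worker.
        const std::size_t shot = next_shot.fetch_add(1);
        if (shot >= num_shots) break;
        process_shot(t, shot);
        {
          std::lock_guard<std::mutex> lock(mu);
          done[shot] = 1;
        }
        cv.notify_one();  // only the calling thread ever waits on cv
      }
    });
  }

  std::size_t num_consumed = 0;
  for (std::size_t shot = 0; shot < num_shots; ++shot) {
    {
      // Block until this specific shot is finished, which enforces in-order consumption.
      std::unique_lock<std::mutex> lock(mu);
      cv.wait(lock, [&] { return done[shot] != 0; });
    }
    ++num_consumed;  // assumption: the shot that triggers the stop is still counted
    if (!consume_shot(shot)) {
      stop.store(true);  // workers finish their current shot, then exit
      break;
    }
  }

  for (auto& w : workers) w.join();
  return num_consumed;
}
```

In this sketch, workers claim indices in increasing order and the stop flag is only set by the consumer, so the shot the consumer is waiting on is always claimed and finished. After an early stop, workers may already have processed shots beyond the returned count, so any per-thread statistics can include shots that were never consumed.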

Example usage:

std::vector<std::unique_ptr<Decoder>> decoders(args.num_threads);
std::vector<std::vector<size_t>> error_use_per_thread(
    args.num_threads, std::vector<size_t>(num_error_terms));
std::vector<Result> results(shots.size());

size_t num_consumed = parallel_for_shots_in_order(
    shots.size(),
    args.num_threads,

    // Process shot runs in parallel, potentially out of order.
    [&](size_t thread_index, size_t shot_index) {
        if (!decoders[thread_index]) {
            decoders[thread_index] = std::make_unique<Decoder>(config);
        }
        auto& decoder = *decoders[thread_index];
        auto& error_use = error_use_per_thread[thread_index];

        results[shot_index] = decoder.decode(shots[shot_index]);

        if (results[shot_index].count_for_stats) {
            for (size_t ei : decoder.predicted_errors_buffer) {
                ++error_use[ei];
            }
        }
    },

    // Consume shot runs on the caller thread, strictly in shot order: 0, 1, 2, ...
    [&](size_t shot_index) {
        emit_result(results[shot_index]);
        return !should_stop_early(results[shot_index]);
    });

// Optional: merge per-thread scratch after all workers have joined.
std::vector<size_t> error_use_totals(num_error_terms);
for (const auto& error_use : error_use_per_thread) {
    for (size_t ei = 0; ei < num_error_terms; ++ei) {
        error_use_totals[ei] += error_use[ei];
    }
}

@noajshu noajshu requested a review from a team as a code owner March 24, 2026 02:30
@noajshu noajshu requested review from LalehB and removed request for a team March 24, 2026 02:30
@noajshu noajshu merged commit fff3c75 into quantumlib:main Mar 24, 2026
7 checks passed