
fix: replace blocking-in-async ops in validator, remote prover, and ntx-builder #2041

Open
Ollie202 wants to merge 8 commits into 0xMiden:main from Ollie202:fix/blocking-async-validator

Conversation

@Ollie202 commented May 4, 2026

Partially addresses #1976 (store blocking ops remain in #2008)

Problem

Proof verification, VM execution, local signing, local batch/block proving, local transaction
proving, and ntx-builder VM execution were running CPU-intensive work directly on Tokio worker
threads. As noted in the issue (with dial9 profiling evidence), this starves the async runtime
and degrades throughput.

Changes

crates/validator/src/tx_validation/mod.rs

  • TransactionVerifier::verify (proof verification) moved to spawn_blocking
  • executor.execute_transaction (VM execution) moved to spawn_blocking with a dedicated
    current_thread runtime since the validator data store makes no real async I/O calls

crates/validator/src/signers/mod.rs

  • Local SecretKey::sign (sync ECDSA wrapped in async fn) moved to spawn_blocking
  • KMS signing left untouched -- it performs real async I/O

bin/remote-prover/src/server/prover.rs

  • LocalBatchProver::prove was already fixed; LocalBlockProver::prove moved to spawn_blocking
  • LocalTransactionProver::prove left as-is -- delegates to the VM async API

crates/ntx-builder/src/actor/execute.rs

  • LocalTransactionProver::prove (ZK proof generation) moved to spawn_blocking
  • filter_notes + execute (VM execution with data-store callbacks) moved to spawn_blocking
    with Handle::current().block_on() so gRPC data-store calls use the existing I/O driver

Not covered

Store blocking ops (RocksDB) are being addressed separately in #2008.

…wn_blocking/block_in_place

Proof verification, VM execution, local signing, and local proving are
CPU-bound operations that were running directly on Tokio worker threads,
stalling the async runtime. Wrap them in block_in_place or spawn_blocking
to isolate blocking work from the async executor.

- validator: wrap proof verification and VM execution in block_in_place
- validator: wrap local SecretKey signing in block_in_place
- remote-prover: wrap LocalBatchProver::prove in spawn_blocking
- remote-prover: wrap LocalBlockProver::prove in block_in_place

Closes 0xMiden#1976
Collaborator

@Mirko-von-Leipzig Mirko-von-Leipzig left a comment


Please don't use block_in_place

Contributor

@kkovaacs kkovaacs left a comment


This definitely doesn't fix all issues outlined in #1976.

Replaces all uses of tokio::task::block_in_place with spawn_blocking
across the validator and remote-prover crates, as requested in review.
spawn_blocking is preferred because it works with both multi-threaded
and current_thread runtimes (including #[tokio::test]).

For the TX executor which uses an async API that may not yield,
a dedicated current_thread runtime is created inside the spawn_blocking
closure to drive the future to completion.
Author

Ollie202 commented May 5, 2026

This definitely doesn't fix all issues outlined in #1976.

On it👍🏻

@Ollie202 Ollie202 changed the title fix: wrap blocking ops in validator and remote prover with spawn_blocking/block_in_place fix: replace blocking-in-async ops in validator, remote prover, and ntx-builder May 5, 2026
Ollie202 added 2 commits May 5, 2026 11:46
LocalTransactionProver::prove generates a ZK proof which is CPU-intensive
and does not yield. Wrap it in spawn_blocking with a current_thread runtime
when no remote prover is configured.
The filter_notes and execute steps call executor.execute_transaction /
NoteConsumptionChecker.check_notes_consumability, which are CPU-intensive
and may not yield between await points.

Move data-store creation + filter + execute onto a dedicated blocking
thread via spawn_blocking. Use Handle::current().block_on() (not a new
runtime) so that the gRPC data-store callbacks made by the VM are
driven by the existing I/O driver on the parent runtime threads.
Author

Ollie202 commented May 5, 2026

Updated across three commits -- ready for re-review:

Mirko (@Mirko-von-Leipzig): All block_in_place calls have been replaced with spawn_blocking. For the validator TX executor and ntx-builder VM execution, Handle::current().block_on() is used inside the blocking thread to drive async data-store callbacks through the existing runtime I/O driver.

Krisztian (@kkovaacs): The PR now covers all the blocking-in-async areas from #1976 except the store (which is in #2008):

  • Validator: proof verification, VM re-execution, local signing
  • Remote prover: local block and batch proving
  • ntx-builder: local transaction proving (ZK), VM execution (filter_notes + execute)

The only remaining item is store RocksDB blocking ops, already tracked in #2008.

Comment thread on crates/ntx-builder/src/actor/execute.rs (Outdated)
Move the tokio runtime handle capture to before the spawn_blocking
closure so it is explicit that the same runtime is used for execution.
Author

Ollie202 commented May 5, 2026

Moved Handle::current() outside of spawn_blocking so it's explicit we're using the existing runtime handle.

Contributor

@kkovaacs kkovaacs left a comment


The ntx-builder execution part with the async-in-blocking context is a bit ugly but I don't have a better suggestion for now...
