fix: serialize DPB access across pipelined encode slots #1
Open
urwrstkn8mare wants to merge 2 commits into
Author
there might be a better way to go about this so i'll try implementing that - then you can have a look at it because you probably know a lot more about it. also lmk if ur benchmark suite is ready so i can test with that.
Author
just tested, no more frame freezing after a while 🎊
urwrstkn8mare added a commit to urwrstkn8mare/pixelforge that referenced this pull request Jun 11, 2026
Author
@porkloin just added some benchmarks to the PR desc. - don't see any performance regressions from the fixes.
Follow-up against the branch for hgaiser#12 (`pipelining`). This branch is based on `porkloin:pipelining`, not my fork `dev`.

Problem
The depth-2 pipelining change gives each encode slot its own input image, command buffer, bitstream buffer, fence, and query pool. However, the DPB images and reference-tracking state still live on the encoder (`dpb_images`, `current_dpb_slot`, reference lists, etc.).

With two slots in flight, the next frame can be submitted before the previous frame has finished using or updating that shared DPB state. That can race reference reads/writes across submits. In a live stream this can show up as accumulating artifacts/stutter, and if the encode work wedges, the later drain waits forever on `wait_for_fences(..., u64::MAX)`, which results in a frozen stream that only updates when resumed.

Fix
Chain pipelined encode submits with a per-encoder timeline semaphore. Each encode submit signals the next timeline value, and each following encode submit waits on the previous value before its command buffer can execute.
That keeps DPB/reference access ordered on the GPU without blocking the CPU between submissions. Bitstream readback is still delayed and drained through the slot pipeline, so the useful part of the depth-2 pipeline remains: submit/readback overlap without racing the shared DPB.
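To make the ordering concrete, here is a minimal CPU-only sketch of the chaining idea. It is not the code in this PR and does not call Vulkan: `Timeline` is a hypothetical stand-in for a timeline semaphore built from a `Mutex` and `Condvar`, threads stand in for queue submits, and `encode_chained` and `dpb_log` are illustrative names. Each simulated submit waits for the previous timeline value, touches the shared "DPB", then signals the next value, while the submitting loop never blocks.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// CPU-side stand-in for a timeline semaphore (assumption, not Vulkan):
/// a monotonically increasing counter that waiters can block on.
struct Timeline {
    value: Mutex<u64>,
    cond: Condvar,
}

impl Timeline {
    fn new() -> Self {
        Timeline { value: Mutex::new(0), cond: Condvar::new() }
    }

    /// Block until the counter reaches at least `target`.
    fn wait(&self, target: u64) {
        let mut v = self.value.lock().unwrap();
        while *v < target {
            v = self.cond.wait(v).unwrap();
        }
    }

    /// Advance the counter to `target` and wake all waiters.
    fn signal(&self, target: u64) {
        let mut v = self.value.lock().unwrap();
        *v = target;
        self.cond.notify_all();
    }
}

/// Submit `frames` simulated encodes without blocking between submissions.
/// Frame N waits for timeline value N and signals N + 1, so accesses to the
/// shared DPB stand-in are serialized. Returns the observed access order.
fn encode_chained(frames: u64) -> Vec<u64> {
    let timeline = Arc::new(Timeline::new());
    let dpb_log = Arc::new(Mutex::new(Vec::new()));
    let mut in_flight = Vec::new();

    for frame in 0..frames {
        let timeline = Arc::clone(&timeline);
        let dpb_log = Arc::clone(&dpb_log);
        // Spawning returns immediately, like a queue submit.
        in_flight.push(thread::spawn(move || {
            timeline.wait(frame); // value N was signaled by frame N - 1
            dpb_log.lock().unwrap().push(frame); // "use" the shared DPB
            timeline.signal(frame + 1); // release frame N + 1
        }));
    }

    // Drain: the only point where the host blocks.
    for handle in in_flight {
        handle.join().unwrap();
    }
    let order = dpb_log.lock().unwrap().clone();
    order
}
```

Calling `encode_chained(8)` returns the frames in submission order regardless of how the threads are scheduled, because each one can only proceed once its predecessor has signaled.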
This does not create true encode/encode overlap for normal dependent P-frames; `N+1` still cannot safely encode before `N` has produced its reconstructed reference. It just expresses that dependency to Vulkan instead of relying on host-side waits.

Benchmarks
Benchmarked with moonshine's new `moonshine-bench` tool (hgaiser/moonshine#107): h264, vkcube as the test app, 40 s runs with 10 s warmup, RTX 5070 (driver 610.43.02). To build against current moonshine, each pixelforge variant had upstream `main` (v0.5.0) merged in locally; all three merge cleanly.

Variants compared: `main` (v0.5.0), `pipelining` (hgaiser#12), and `pipelining` + this PR.

Multi-run averages; run-to-run spread is ~±50 µs. All variants sustained the target frame rate (60 / 120 fps).
Takeaways:
main.