perf: coarse-grained parallel encoding with fused SIMD (closes #31) by MavenRain · Pull Request #36 · itzmeanjan/rlnc

MavenRain · 2026-04-02T12:37:34Z

Replace fine-grained rayon par_iter (one task per piece, per-piece allocation, two-pass multiply then add) with coarse-grained chunking across threads. Each thread accumulates its piece range using the fused multiply-and-add SIMD operation in a single memory pass with zero per-piece allocation. Reduces allocation from O(piece_count) to O(num_threads) and halves memory bandwidth per piece.
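For reviewers, here is a minimal sketch of the coarse-grained shape described above. It is not the code in this PR. Several things are assumptions made so the example is self-contained: `std::thread::scope` stands in for rayon, a scalar GF(2^8) multiply over the polynomial 0x11B stands in for the SIMD kernel, and the names `gf256_mul`, `mul_add_into`, and `encode_coarse` are hypothetical.

```rust
use std::thread;

/// Scalar GF(2^8) multiply, reduced by x^8 + x^4 + x^3 + x + 1 (0x11B).
/// Assumption: stand-in for the SIMD kernel; the crate's field polynomial may differ.
fn gf256_mul(mut a: u8, mut b: u8) -> u8 {
    let mut acc = 0u8;
    while b != 0 {
        if b & 1 != 0 {
            acc ^= a;
        }
        let carry = a & 0x80 != 0;
        a <<= 1;
        if carry {
            a ^= 0x1B;
        }
        b >>= 1;
    }
    acc
}

/// Fused multiply-and-add: dst[i] ^= coeff * src[i] in one pass,
/// with no temporary buffer for the scaled piece.
fn mul_add_into(dst: &mut [u8], src: &[u8], coeff: u8) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d ^= gf256_mul(*s, coeff);
    }
}

/// Hypothetical encoder: splits the pieces into at most `num_threads`
/// contiguous ranges, one accumulator per range, then XORs the partials.
fn encode_coarse(pieces: &[Vec<u8>], coeffs: &[u8], num_threads: usize) -> Vec<u8> {
    assert_eq!(pieces.len(), coeffs.len());
    if pieces.is_empty() {
        return Vec::new();
    }
    let piece_len = pieces[0].len();
    let workers = num_threads.max(1);
    // Ceiling division so every piece lands in exactly one range.
    let chunk = (pieces.len() + workers - 1) / workers;

    let partials: Vec<Vec<u8>> = thread::scope(|s| {
        let handles: Vec<_> = pieces
            .chunks(chunk)
            .zip(coeffs.chunks(chunk))
            .map(|(ps, cs)| {
                s.spawn(move || {
                    // The only allocation this worker makes.
                    let mut acc = vec![0u8; piece_len];
                    for (p, &c) in ps.iter().zip(cs) {
                        mul_add_into(&mut acc, p, c);
                    }
                    acc
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Final reduction over at most `num_threads` partial accumulators.
    let mut out = vec![0u8; piece_len];
    for partial in &partials {
        for (o, x) in out.iter_mut().zip(partial) {
            *o ^= x;
        }
    }
    out
}
```

Calling `encode_coarse(&pieces, &coeffs, n)` should give the same coded piece for any `n`, since addition in GF(2^8) is XOR and the order of accumulation does not matter. The number of accumulator buffers is bounded by the thread count rather than the piece count, which is the O(num_threads) allocation claim in the description.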

MavenRain wants to merge 1 commit into itzmeanjan:main from MavenRain:perf/coarse-grained-parallel-encode
