Add interleave() matching std::simd API, faster than zip_low/zip_high on AVX2#206

Open
Shnatsel wants to merge 11 commits into linebender:main from Shnatsel:interleave
@Shnatsel Shnatsel commented Apr 12, 2026

Adds a new method with API matching std::simd's interleave method.

The primary motivation is performance: on AVX2, zip_low followed by zip_high requires 6 instructions, while a combined interleave function needs only 4. (With AVX-512 we could do it in 2 instructions on 256-bit vectors, but fearless_simd does not support that yet and #201 seems stalled.)

This also improves API compatibility with std::simd as a nice bonus.
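
For reference, here is a scalar sketch of the lane semantics involved. It is not the code from this PR and does not use fearless_simd types: plain arrays stand in for vectors, the function names only mirror the methods discussed above, and an even lane count is assumed. It models the results, not the AVX2 instruction sequence.

```rust
// Scalar model only: plain arrays stand in for SIMD vectors.
// Assumes N is even; names mirror the methods discussed in this PR.

/// Interleaves the low halves of `a` and `b`: [a0, b0, a1, b1, ...].
fn zip_low<T: Copy, const N: usize>(a: [T; N], b: [T; N]) -> [T; N] {
    let mut out = a;
    for i in 0..N / 2 {
        out[2 * i] = a[i];
        out[2 * i + 1] = b[i];
    }
    out
}

/// Interleaves the high halves of `a` and `b`.
fn zip_high<T: Copy, const N: usize>(a: [T; N], b: [T; N]) -> [T; N] {
    let mut out = a;
    for i in 0..N / 2 {
        out[2 * i] = a[N / 2 + i];
        out[2 * i + 1] = b[N / 2 + i];
    }
    out
}

/// Combined form: returns both halves at once, in the same order as
/// std::simd's `interleave` (low-half result first, high-half second).
fn interleave<T: Copy, const N: usize>(a: [T; N], b: [T; N]) -> ([T; N], [T; N]) {
    (zip_low(a, b), zip_high(a, b))
}
```

With a = [0, 1, 2, 3] and b = [4, 5, 6, 7], this model returns ([0, 4, 1, 5], [2, 6, 3, 7]), the same pair as the example in the std::simd documentation. The point of a single combined method is that a backend can compute both outputs together and share intermediate work between them.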

AI use disclosure: this work was assisted by Claude Opus 4.5 for the initial commit and 4.6 for the rest. I have manually reviewed the code and take full responsibility for it.

@LaurenzV LaurenzV self-requested a review April 13, 2026 04:44