Add interleave() matching std::simd API, faster than zip_low/zip_high on AVX2 #206
Open
Shnatsel wants to merge 11 commits into linebender:main from
Adds a new method with an API matching std::simd's `interleave` method.

The primary motivation is performance: on AVX2, `zip_low` followed by `zip_high` requires 6 instructions, while a combined `interleave` function only needs 4 instructions. (With AVX-512 we'd be able to do it in 2 instructions on 256-bit vectors, but that's not supported by fearless_simd yet and #201 seems stalled.)

This also improves API compatibility with `std::simd` as a nice bonus.

AI use disclosure: this work was assisted by Claude Opus 4.5 for the initial commit and 4.6 for the rest. I have manually reviewed the code and take full responsibility for it.
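For reviewers unfamiliar with the semantics, here is a rough scalar sketch of what the combined operation is expected to return. It is not the code in this PR: plain arrays stand in for SIMD vectors, and the helper functions below are hypothetical models that merely reuse the names `zip_low`, `zip_high`, and `interleave`. It assumes `zip_low` and `zip_high` interleave the lower and upper halves of their inputs respectively, and that `interleave` returns both results as a pair, as `std::simd`'s `interleave` does.

```rust
/// Scalar model (assumption): interleave the lower halves of `a` and `b`.
fn zip_low<const N: usize>(a: [u32; N], b: [u32; N]) -> [u32; N] {
    let mut out = [0u32; N];
    for i in 0..N / 2 {
        out[2 * i] = a[i];
        out[2 * i + 1] = b[i];
    }
    out
}

/// Scalar model (assumption): interleave the upper halves of `a` and `b`.
fn zip_high<const N: usize>(a: [u32; N], b: [u32; N]) -> [u32; N] {
    let mut out = [0u32; N];
    for i in 0..N / 2 {
        out[2 * i] = a[N / 2 + i];
        out[2 * i + 1] = b[N / 2 + i];
    }
    out
}

/// Scalar model of the combined operation: both halves in one call.
/// A real SIMD implementation can share work between the two outputs,
/// which this scalar version does not attempt to show.
fn interleave<const N: usize>(a: [u32; N], b: [u32; N]) -> ([u32; N], [u32; N]) {
    (zip_low(a, b), zip_high(a, b))
}
```

Under this model, `interleave([0, 1, 2, 3], [4, 5, 6, 7])` yields `([0, 4, 1, 5], [2, 6, 3, 7])`, which matches the example in the `std::simd` documentation.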