@alamb alamb commented Nov 21, 2025

Which issue does this PR close?

TODOs

  • Make a PR to extract the RowSelection backed code into its own module

Rationale for this change

Make Parquet predicate evaluation faster by reducing the number of conversions back and forth between BooleanArray and RowSelection.

What changes are included in this PR?

  1. Change RowSelection to have two possible backings
  2. Add a BooleanArray-backed implementation, based on @XiangpengHao's code from [Parquet] Add BooleanArray based row selection #6624
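
For readers unfamiliar with the idea, here is a minimal sketch of what "two possible backings" could look like. It is not the code in this PR: `Selection`, `Run`, `to_mask`, and `selected_rows` are hypothetical names, and a plain `Vec<bool>` stands in for arrow's BooleanArray so the example has no dependencies.

```rust
/// Hypothetical run-length entry: select or skip `row_count` consecutive rows.
#[derive(Debug, Clone, PartialEq)]
pub struct Run {
    pub row_count: usize,
    pub skip: bool,
}

/// Hypothetical selection with two backings (illustrative only, not the PR's types).
#[derive(Debug, Clone, PartialEq)]
pub enum Selection {
    /// Run-length backing: compact when selected rows form long runs.
    Runs(Vec<Run>),
    /// Mask backing: one flag per row, cheap to combine with predicate results.
    Mask(Vec<bool>),
}

impl Selection {
    /// Expand either backing into a per-row mask.
    pub fn to_mask(&self) -> Vec<bool> {
        match self {
            Selection::Mask(mask) => mask.clone(),
            Selection::Runs(runs) => runs
                .iter()
                .flat_map(|run| std::iter::repeat(!run.skip).take(run.row_count))
                .collect(),
        }
    }

    /// Count selected rows without converting between backings.
    pub fn selected_rows(&self) -> usize {
        match self {
            Selection::Mask(mask) => mask.iter().filter(|selected| **selected).count(),
            Selection::Runs(runs) => runs
                .iter()
                .filter(|run| !run.skip)
                .map(|run| run.row_count)
                .sum(),
        }
    }
}
```

The point of the enum is that callers holding a mask-backed selection can keep working on the mask, rather than paying for a round trip through the run-length form after every predicate.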

Are these changes tested?

TBD

Are there any user-facing changes?

Internal notes for myself

  • Get it compiling, not actually routing through BooleanSelection
  • Start routing through BooleanSelection

@github-actions github-actions bot added the parquet Changes to the parquet crate label Nov 21, 2025
/// Optimized version of `boolean_buffer_and_then` using BMI2 PDEP instructions.
/// This function performs the same operation but uses bit manipulation instructions
/// for better performance on supported x86_64 CPUs.
pub fn boolean_buffer_and_then(left: &BooleanBuffer, right: &BooleanBuffer) -> BooleanBuffer {
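
The function body is not shown in this hunk. As a reference for the intended behavior, here is a scalar sketch under the assumption that the function follows the semantics of `RowSelection::and_then`: `right` has one entry per set bit of `left`, and the result keeps a row only if both select it. `and_then_scalar` and `pdep_u64` are hypothetical helper names, and `pdep_u64` is a software emulation of the deposit step, not the BMI2 intrinsic.

```rust
/// Scalar reference (hypothetical): `right` is defined only over the rows
/// that `left` selects, and the result has the same length as `left`.
pub fn and_then_scalar(left: &[bool], right: &[bool]) -> Vec<bool> {
    let selected = left.iter().filter(|bit| **bit).count();
    assert_eq!(selected, right.len(), "right must have one entry per set bit in left");

    let mut right_bits = right.iter().copied();
    left.iter()
        .map(|&keep| {
            if keep {
                // Consume the next bit of `right` for each selected row.
                right_bits.next().expect("length checked above")
            } else {
                false
            }
        })
        .collect()
}

/// Software emulation of a 64-bit parallel bit deposit: the low bits of `src`
/// are scattered, in order, to the positions of the set bits in `mask`.
pub fn pdep_u64(src: u64, mut mask: u64) -> u64 {
    let mut out = 0u64;
    let mut src_bit = 0u32;
    while mask != 0 {
        // Isolate the lowest set bit of the mask.
        let lowest = mask & mask.wrapping_neg();
        if (src >> src_bit) & 1 == 1 {
            out |= lowest;
        }
        // Clear that bit and move to the next source bit.
        mask &= mask - 1;
        src_bit += 1;
    }
    out
}
```

With `left` as the mask and `right` as the source, one deposit per 64-bit word produces the same bits as the scalar loop, which is why a PDEP-based version can process 64 rows at a time on CPUs that support it.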
Development

Successfully merging this pull request may close these issues.

[Parquet] Avoid Mask --> RowSelection --> Mask conversion to improve predicate pushdown performance