feat[array]: pushdown struct validity on write #5923
Merging this PR will degrade performance by 33.7%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | WallTime | u16_FoR[10M] | 6.5 µs | 9.7 µs | -33.7% |
| ⚡ | Simulation | canonical_into_non_nullable[(10000, 100, 0.01)] | 3 ms | 2.1 ms | +37.96% |
| ⚡ | Simulation | canonical_into_non_nullable[(10000, 100, 0.0)] | 2.7 ms | 1.9 ms | +42.61% |
| ⚡ | Simulation | canonical_into_non_nullable[(10000, 100, 0.1)] | 4.5 ms | 3.7 ms | +22.33% |
| ❌ | Simulation | canonical_into_nullable[(10000, 10, 0.0)] | 445.4 µs | 528.7 µs | -15.75% |
| ❌ | Simulation | canonical_into_nullable[(10000, 100, 0.0)] | 4.1 ms | 4.9 ms | -16.39% |
| ⚡ | Simulation | into_canonical_non_nullable[(10000, 100, 0.01)] | 3 ms | 2.2 ms | +36.73% |
| ⚡ | Simulation | into_canonical_non_nullable[(10000, 100, 0.1)] | 4.6 ms | 3.7 ms | +21.46% |
| ⚡ | Simulation | into_canonical_non_nullable[(10000, 100, 0.0)] | 2.7 ms | 1.9 ms | +41.84% |
| ⚡ | Simulation | into_canonical_nullable[(10000, 100, 0.0)] | 5.2 ms | 4.4 ms | +18.5% |
Comparing ji/struct-val-push-downn (a43054c) with develop (a095ca7)
Footnotes
1290 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, they can be archived to remove them from the performance reports.
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Benchmarks: Random Access
Benchmarks: TPC-H SF=1 on NVME
Benchmarks: FineWeb NVMe
Benchmarks: TPC-H SF=1 on S3
Benchmarks: TPC-DS SF=1 on NVME
Benchmarks: TPC-H SF=10 on NVME
Benchmarks: FineWeb S3
Benchmarks: Statistical and Population Genetics
Benchmarks: TPC-H SF=10 on S3
Benchmarks: Clickbench on NVME
Benchmarks: Compression
#[derive(Clone, prost::Message)]
pub struct StructMetadata {
    /// If true, child validity is a superset of struct validity (validity was pushed down).
    /// For nullable children, their validity already includes struct nulls. For non-nullable
Why only push this into nullable children? Why not all of them?
Pushing it into non-nullable children is unneeded (the struct validity is free to add on read), and it would change the dtype of the child.
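To make the reply above concrete, here is a minimal sketch of the read side, assuming validity is modelled as a plain slice of booleans (true = valid). The function name `effective_child_validity` and its parameters are hypothetical and are not taken from the PR; the real arrays use their own validity types.

```rust
/// Hypothetical helper: compute the validity a reader sees for one struct child.
///
/// - `struct_validity`: one flag per row for the struct itself.
/// - `child_validity`: `None` for a non-nullable child, `Some(mask)` for a nullable one.
/// - `pushed_down`: whether struct validity was already intersected into nullable
///   children on write (an assumption mirroring the metadata flag in the PR).
fn effective_child_validity(
    struct_validity: &[bool],
    child_validity: Option<&[bool]>,
    pushed_down: bool,
) -> Vec<bool> {
    match child_validity {
        // Non-nullable child: nothing to intersect, the struct validity is reused as-is.
        None => struct_validity.to_vec(),
        // Nullable child, pushed down on write: the child mask already covers struct nulls.
        Some(mask) if pushed_down => mask.to_vec(),
        // Nullable child, not pushed down: the reader has to intersect the two masks.
        Some(mask) => mask
            .iter()
            .zip(struct_validity)
            .map(|(&c, &p)| c && p)
            .collect(),
    }
}
```

In this model only the last branch does per-row work, which is the case the pushdown is meant to avoid. A non-nullable child never reaches it, so pushing validity into it would gain nothing and would force the child to become nullable.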
Add a new validity_pushed_down field to the struct metadata.
On write, intersect the struct validity with the validity of its nullable children, which allows us to skip the validity intersection on read.
This is tracked with metadata on the struct array/layout.
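The write-side step described above can be sketched as follows. This is an illustrative model only: `Child`, `StructColumn`, and `push_down_validity` are hypothetical names, and validity is represented as `Vec<bool>` (true = valid) rather than the library's actual validity type.

```rust
/// Hypothetical, simplified stand-in for one child column of a struct.
#[derive(Debug, Clone, PartialEq)]
struct Child {
    /// `None` means the child is non-nullable; `Some(mask)` has one flag per row.
    validity: Option<Vec<bool>>,
}

/// Hypothetical, simplified stand-in for a struct column plus its metadata flag.
#[derive(Debug, Clone, PartialEq)]
struct StructColumn {
    validity: Vec<bool>,
    children: Vec<Child>,
    /// Mirrors the validity_pushed_down field named in the PR description.
    validity_pushed_down: bool,
}

/// Intersect the struct validity into every nullable child and record that fact.
/// Non-nullable children are left untouched so their dtype does not change.
fn push_down_validity(mut s: StructColumn) -> StructColumn {
    for child in s.children.iter_mut() {
        if let Some(mask) = child.validity.as_mut() {
            for (c, &p) in mask.iter_mut().zip(s.validity.iter()) {
                // A row stays valid in the child only if the struct row is valid too.
                *c = *c && p;
            }
        }
    }
    s.validity_pushed_down = true;
    s
}
```

For example, a struct with validity `[true, false, true]` and a nullable child with validity `[true, true, false]` yields a child validity of `[true, false, false]` after pushdown, while a non-nullable sibling keeps `validity: None`.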