Conversation

@pitrou
Member

@pitrou pitrou commented Nov 10, 2025

Rationale for this change

Some parts of the Parquet C++ codebase are not fuzzed yet, since they are not exercised in the fuzz target.

What changes are included in this PR?

Add the following tasks when fuzzing an input:

  • decode and read column statistics
  • decode and read bloom filters
  • decode and read per-page indexes (column index, offset index)
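
For illustration, the tasks above could be chained in a fuzz target roughly as follows. This is a minimal sketch that assumes libFuzzer's `LLVMFuzzerTestOneInput` entry point; the three `Check*` functions are hypothetical placeholders and are not the Parquet C++ API or the code in this PR. The point is only that each task runs independently and that a rejected input (signalled here by an exception) is not treated as a crash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <stdexcept>

namespace {

// Hypothetical placeholder tasks. A real fuzz target would decode the
// corresponding Parquet structures here; these stubs only reject empty input.
void CheckColumnStatistics(const uint8_t* data, size_t size) {
  (void)data;
  if (size == 0) throw std::runtime_error("no statistics to decode");
}

void CheckBloomFilters(const uint8_t* data, size_t size) {
  (void)data;
  if (size == 0) throw std::runtime_error("no bloom filter to decode");
}

void CheckPageIndexes(const uint8_t* data, size_t size) {
  (void)data;
  if (size == 0) throw std::runtime_error("no page index to decode");
}

using Task = void (*)(const uint8_t*, size_t);

// Runs one task. Returns true if it completed, false if it rejected the
// input by throwing. Rejecting malformed input is expected behaviour.
bool RunTask(Task task, const uint8_t* data, size_t size) {
  try {
    task(data, size);
    return true;
  } catch (const std::exception&) {
    return false;
  }
}

}  // namespace

// libFuzzer entry point: every task sees the same input, and a failure in
// one task does not prevent the following tasks from running.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  assert(data != nullptr || size == 0);
  constexpr Task kTasks[] = {CheckColumnStatistics, CheckBloomFilters,
                             CheckPageIndexes};
  for (Task task : kTasks) {
    RunTask(task, data, size);
  }
  return 0;
}
```

Only genuine defects (crashes, sanitizer reports, timeouts) are then surfaced by the fuzzer, while ordinary "invalid file" errors are swallowed.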

Are these changes tested?

By OSS-Fuzz.

Are there any user-facing changes?

No.

@pitrou
Member Author

pitrou commented Nov 10, 2025

@github-actions crossbow submit fuzz

@github-actions

Revision: b663073

Submitted crossbow builds: ursacomputing/crossbow @ actions-ed4742dbd2

Task Status
test-build-cpp-fuzz GitHub Actions

@pitrou pitrou force-pushed the gh48089-pq-fuzz-read-metadata branch from b663073 to a1d2895 Compare November 10, 2025 15:57
@pitrou
Member Author

pitrou commented Nov 10, 2025

@github-actions crossbow submit fuzz

@github-actions

Revision: a1d2895

Submitted crossbow builds: ursacomputing/crossbow @ actions-e898f807d8

Task Status
test-build-cpp-fuzz GitHub Actions

@pitrou pitrou force-pushed the gh48089-pq-fuzz-read-metadata branch from a1d2895 to 2ec30dd Compare November 12, 2025 14:29
@pitrou
Member Author

pitrou commented Nov 12, 2025

@github-actions crossbow submit fuzz

@github-actions

Revision: 2ec30dd

Submitted crossbow builds: ursacomputing/crossbow @ actions-ce7ba541c2

Task Status
test-build-cpp-fuzz GitHub Actions

@pitrou pitrou marked this pull request as ready for review November 12, 2025 15:16
@pitrou pitrou requested a review from wgtmac as a code owner November 12, 2025 15:16
@pitrou
Member Author

pitrou commented Nov 12, 2025

@wgtmac @mapleFU @adamreeve @EnricoMi Are you available to take a look at this?

     throw ParquetException("bloom_filter_length less than 0");
   }
-  if (*bloom_filter_length + *bloom_filter_offset > file_size) {
+  if (*bloom_filter_length > file_size - *bloom_filter_offset) {
Member

Why change this? Could the addition overflow?

Member Author

Yes, it could. The alternative is to call AddWithOverflow.
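
To spell out the reasoning: with 64-bit signed integers, `length + offset` can wrap around (undefined behaviour in C++) when both values come from untrusted metadata, whereas `file_size - offset` cannot overflow once `offset` and `file_size` are known to be non-negative. The sketch below illustrates this with a hypothetical helper, `RangeFitsInFile`, which is not part of the Parquet C++ API; the function name and the choice of `int64_t` are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not the Parquet C++ API): returns true if the byte
// range [offset, offset + length) lies entirely within a file of
// `file_size` bytes. It never computes offset + length, so it cannot
// overflow even for adversarial values.
constexpr bool RangeFitsInFile(int64_t offset, int64_t length,
                               int64_t file_size) {
  if (offset < 0 || length < 0 || file_size < 0) {
    return false;
  }
  // Both operands are non-negative, so the difference lies in
  // [-INT64_MAX, INT64_MAX] and the subtraction is always well defined.
  return length <= file_size - offset;
}

constexpr int64_t kMax = std::numeric_limits<int64_t>::max();

static_assert(RangeFitsInFile(10, 20, 100), "range inside the file");
static_assert(RangeFitsInFile(80, 20, 100), "range ending exactly at EOF");
static_assert(!RangeFitsInFile(90, 20, 100), "range running past EOF");
// offset + length would overflow here; the subtraction form stays safe.
static_assert(!RangeFitsInFile(kMax - 5, 10, 100), "huge offset rejected");
```

An explicit overflow-checked addition, as mentioned in the reply above, would reach the same result; the subtraction form simply avoids the need for it.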

@github-actions github-actions bot added the "awaiting committer review" label and removed the "awaiting review" label Nov 12, 2025
@pitrou pitrou merged commit 934554d into apache:main Nov 12, 2025
42 of 45 checks passed
@pitrou pitrou removed the "awaiting committer review" label Nov 12, 2025
@pitrou pitrou deleted the gh48089-pq-fuzz-read-metadata branch November 12, 2025 16:51
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 934554d.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 5 possible false positives for unstable benchmarks that are known to sometimes produce them.
