Refactor benchmark jobs and remove hardcoded paths (Fixes #418) #718

Open
makarandhinge wants to merge 5 commits into apache:main from makarandhinge:fix/issue-418-refactor-benchmark-jobs

makarandhinge (Contributor) commented Mar 11, 2026

Description

This PR revisits and cleans up benchmark jobs in the wayang-benchmark module as requested in issue #418.

Based on maintainer feedback, the original JavaNativeAPI examples were preserved and equivalent implementations using JavaPlanBuilder were added instead of replacing them. This keeps both APIs available for benchmarking and comparison.

Changes Made

Benchmark Improvements

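  • Replaced the hardcoded input and output paths in Grep.java with command-line arguments; the output path defaults to <input-file>.out when it is not given.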

Added JavaPlanBuilder Implementations

  • Added TPCHQ1WithPlanBuilder.java as a JavaPlanBuilder version of TPCHQ1WithJavaNative.java.
  • Added WordCountWithPlanBuilder.java as a JavaPlanBuilder version of WordCountWithJavaNativeAPI.java.

Code Cleanup

  • Cleaned up Main.java by removing unused imports and adding Configuration.
  • Cleaned up WordCountParquet.java by removing unused imports and adding Configuration.
  • Cleaned up WordCount.java by removing commented code.

Current Benchmark Structure

wayang-benchmark/src/main/java/org/apache/wayang/apps/
├── grep/
│   └── Grep.java
├── tpch/
│   ├── TPCHQ1WithJavaNative.java
│   └── TPCHQ1WithPlanBuilder.java
└── wordcount/
    ├── Main.java
    ├── WordCount.java
    ├── WordCountParquet.java
    ├── WordCountWithJavaNativeAPI.java
    └── WordCountWithPlanBuilder.java

Testing

  • Verified that all benchmark jobs compile successfully.
  • Ran benchmark examples locally to ensure functionality remains unchanged.

Related Issue

Fixes #418

- Replace hardcoded input path with configurable argument
- Replace hardcoded output path 'lala.out' with configurable argument
- Add default output path as <input-file>.out if not specified
- Add usage message and argument validation
- Addresses issue apache#418
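
For reviewers, the argument handling described in the commit message above could look roughly like the following. This is a minimal sketch, not the code in this PR: the class name `GrepArgs`, the `parse` helper, and the exact usage text are hypothetical. Only the behavior (required input path, optional output path, default of `<input-file>.out`, usage message on invalid arguments) comes from the commit message.

```java
import java.util.Optional;

/** Hypothetical helper for illustration; names are not taken from Grep.java. */
final class GrepArgs {
    // Assumed wording; the actual usage message in the PR may differ.
    static final String USAGE = "Usage: Grep <input-file> [<output-file>]";

    final String input;
    final String output;

    private GrepArgs(String input, String output) {
        this.input = input;
        this.output = output;
    }

    /** Returns the parsed arguments, or empty if the arguments are invalid. */
    static Optional<GrepArgs> parse(String[] args) {
        // Exactly one or two arguments are accepted, and the input must be non-empty.
        if (args.length < 1 || args.length > 2 || args[0].isEmpty()) {
            return Optional.empty();
        }
        String input = args[0];
        // Fall back to <input-file>.out when no output path is given.
        String output = args.length == 2 ? args[1] : input + ".out";
        return Optional.of(new GrepArgs(input, output));
    }

    public static void main(String[] args) {
        Optional<GrepArgs> parsed = parse(args);
        if (!parsed.isPresent()) {
            System.err.println(USAGE);
            System.exit(1);
        }
        GrepArgs grepArgs = parsed.get();
        System.out.println("input=" + grepArgs.input + " output=" + grepArgs.output);
    }
}
```

Keeping the parsing in a small pure method makes the default-path rule easy to check without running a benchmark job.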

zkaoudi (Contributor) commented Mar 11, 2026

Thanks @makarandhinge. It would be better if we keep the JavaNativeAPI examples and add the respective ones with the planbuilder instead of replacing them.

makarandhinge (Contributor, Author) commented

Thank you for the suggestion @zkaoudi.

I will restore the original JavaNativeAPI examples and add separate implementations using JavaPlanBuilder instead of replacing them. This way both approaches will remain available in the benchmark module.

I will update the PR accordingly.

@makarandhinge makarandhinge force-pushed the fix/issue-418-refactor-benchmark-jobs branch from fdc7ca9 to bf7a708 Compare March 12, 2026 12:48

makarandhinge (Contributor, Author) commented

Hi @zkaoudi,

I have updated the PR accordingly. The original JavaNativeAPI examples are now preserved, and separate implementations using JavaPlanBuilder have been added.

Please let me know if any further adjustments are needed. Thanks!

Development

Successfully merging this pull request may close these issues.

Revisit benchmark jobs

2 participants