Extract environment boilerplate code from within the device interfaces to a separate header #6622

gonidelis · 2025-11-13T19:18:52Z

Boilerplate code for extracting type information (stream, mr, tuning_t, etc.) is too large and repetitive across the new environment-based device interfaces we introduced. This PR extracts that code into a separate function and reuses it in the existing environment-based device APIs (DeviceScan and DeviceReduce).

Some design considerations for the reviewers:

Each device primitive has its own quirks regarding which determinism_t values it supports. For example, DeviceReduce::Reduce can support both gpu_to_gpu and run_to_run determinism, while DeviceReduce::ArgMax/ArgMin and DeviceScan only support run_to_run at the moment. That means the determinism heuristics cannot be folded into the shared boilerplate code. Future environment-based APIs must evaluate each algorithm individually to decide which determinism types it can support.
The existing boilerplate code takes a lambda callable that selects the specific deterministic algorithm implementation and forwards the remaining arguments to it as a pack.

      auto reduce_callable = [&](auto tuning, void* storage, size_t& bytes, auto... args) {
        using tuning_t = decltype(tuning);
        return reduce_impl<tuning_t>(storage, bytes, args...);
      };

      // Dispatch with environment - handles all boilerplate
      return detail::dispatch_with_env(
        env, determinism_t{}, reduce_callable, d_in, d_out, num_items, reduction_op, ::cuda::std::identity{}, init);
    }

I need some feedback on whether the interface of dispatch_with_env() looks sane.
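To make the shape of the interface easier to discuss, here is a minimal host-only sketch of the pattern described above. Everything in it is a hypothetical stand-in: `env_sketch`, `default_tuning`, `reduce_impl_sketch` and `dispatch_with_env_sketch` are illustrative names, not the CUB types or the helper added in this PR. It assumes the usual two-phase convention (a null storage pointer means "report the required bytes") and uses a `std::vector` where the real code would presumably go through the memory resource from the environment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tuning policy and environment; not the real CUB types.
struct default_tuning {
  static constexpr std::size_t scratch_ints = 4;
};

template <class Tuning = default_tuning>
struct env_sketch {
  using tuning_t = Tuning;
};

// Hypothetical two-phase algorithm: a null storage pointer means "size query".
template <class TuningT>
int reduce_impl_sketch(void* storage, std::size_t& bytes, const int* in, int* out, int n) {
  if (storage == nullptr) {
    bytes = sizeof(int) * TuningT::scratch_ints;
    return 0;
  }
  int sum = 0;
  for (int i = 0; i < n; ++i) sum += in[i];
  *out = sum;
  return 0;
}

// Sketch of the shared helper: extract the tuning type from the environment,
// query the temporary storage size, allocate, then run the algorithm.
template <class Env, class Callable, class... Args>
int dispatch_with_env_sketch(Env, Callable callable, Args... args) {
  using tuning_t = typename Env::tuning_t;
  std::size_t bytes = 0;
  int error = callable(tuning_t{}, nullptr, bytes, args...);  // phase 1: size query
  if (error != 0) return error;
  std::vector<unsigned char> storage(bytes);  // assumption: stands in for the env's allocator
  return callable(tuning_t{}, storage.data(), bytes, args...);  // phase 2: run
}

// Caller side: the lambda binds the algorithm, the helper supplies the tuning.
int sum_with_env(const int* in, int n) {
  int out = 0;
  auto reduce_callable = [](auto tuning, void* storage, std::size_t& bytes, auto... args) {
    using tuning_t = decltype(tuning);
    return reduce_impl_sketch<tuning_t>(storage, bytes, args...);
  };
  dispatch_with_env_sketch(env_sketch<>{}, reduce_callable, in, &out, n);
  return out;
}
```

In this sketch the determinism choice stays with the caller, because the caller decides which implementation the lambda forwards to, while the helper only handles the environment queries and the storage lifetime.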

gonidelis · 2025-11-13T19:20:13Z

cub/cub/device/device_reduce.cuh


    // Initial value
-    OutputExtremumT initial_value{::cuda::std::numeric_limits<InputValueT>::max()};
+    OutputExtremumT initial_value{::cuda::std::numeric_limits<InputValueT>::lowest()};


I need to figure out why this bug was not caught by the tests.
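For context on why the initial value matters, here is a small host-side illustration, assuming the hunk above seeds a max-style reduction (the surrounding function is not shown in the diff). It is plain standard C++ and not the CUB code path; `max_reduce` is a hypothetical helper. Seeding a max reduction with `numeric_limits<T>::max()` makes the seed win against every input, while `lowest()` behaves as the identity. It also shows why `lowest()` rather than `min()` is the right choice for floating-point types.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical helper: fold a vector with max, starting from `init`.
template <class T>
T max_reduce(const std::vector<T>& v, T init) {
  return std::accumulate(v.begin(), v.end(), init,
                         [](T a, T b) { return std::max(a, b); });
}

// Seeding with lowest() returns the true maximum of the input.
inline int max_with_lowest(const std::vector<int>& v) {
  return max_reduce(v, std::numeric_limits<int>::lowest());
}

// Seeding with max() always returns the seed, whatever the input is.
inline int max_with_max(const std::vector<int>& v) {
  return max_reduce(v, std::numeric_limits<int>::max());
}
```

Note that for floating-point types `numeric_limits<T>::min()` is the smallest positive normal value, so it would also be a wrong seed for inputs that are all negative.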

github-actions · 2025-11-13T23:41:08Z

😬 CI Workflow Results

🟥 Finished in 3h 00m: Pass: 28%/81 | Total: 2d 04h | Max: 2h 59m | Hits: 81%/22346

See results here.

NaderAlAwar · 2025-11-14T15:35:28Z

cub/cub/device/device_reduce.cuh

-
-      return deallocate_error;
+      // Lambda that calls reduce_impl with the right overload based on determinism
+      auto reduce_callable = [&](auto tuning, void* storage, size_t& bytes, auto... args) {


Suggestion: it seems we are not using anything from the enclosing scope, so we don't need to capture by reference with [&]. This comment applies throughout the PR.

NaderAlAwar · 2025-11-14T15:40:39Z

cub/cub/device/device_reduce.cuh

-
-      return deallocate_error;
+      // Lambda that calls reduce_impl with the right overload based on determinism
+      auto reduce_callable = [&](auto tuning, void* storage, size_t& bytes, auto... args) {


Question: do we need a lambda to make this work? Can we just pass reduce_impl<tuning_t> directly?
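One language-level consideration that bears on this question, sketched with hypothetical names (`impl`, `dispatch`, `tuning_a` are not from the PR): if the implementation has further template parameters that are deduced from the call arguments, then `impl<tuning_t>` still names a function template rather than a single function, and it cannot be deduced as a generic `Callable` argument. In addition, when the tuning type is only determined inside the dispatcher, the caller has no concrete `tuning_t` to spell. A generic lambda avoids both issues by deferring instantiation to the call site. Whether this applies to `reduce_impl` depends on its actual signature, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>

struct tuning_a {
  static constexpr int factor = 2;
};

// Hypothetical implementation: TuningT is explicit, T is deduced from the call.
template <class TuningT, class T>
int impl(void* /*storage*/, std::size_t& bytes, T value) {
  bytes = sizeof(T);
  return static_cast<int>(value) * TuningT::factor;
}

// Hypothetical dispatcher: it picks the tuning type and invokes the callable.
template <class Callable, class... Args>
int dispatch(Callable callable, Args... args) {
  std::size_t bytes = 0;
  return callable(tuning_a{}, nullptr, bytes, args...);
}

int call_through_lambda(int value) {
  // dispatch(impl<tuning_a>, value);
  // The line above would not compile: T is still undeduced, so impl<tuning_a>
  // is an overload set and Callable cannot be deduced from it.
  auto wrapper = [](auto tuning, void* storage, std::size_t& bytes, auto... args) {
    return impl<decltype(tuning)>(storage, bytes, args...);
  };
  return dispatch(wrapper, value);
}
```

The wrapper here uses an empty capture list, which is sufficient because it refers only to its own parameters and a namespace-scope function template.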

gonidelis requested a review from a team as a code owner November 13, 2025 19:18

gonidelis requested a review from NaderAlAwar November 13, 2025 19:18

github-project-automation bot added this to CCCL Nov 13, 2025

github-project-automation bot moved this to Todo in CCCL Nov 13, 2025

cccl-authenticator-app bot moved this from Todo to In Review in CCCL Nov 13, 2025

gonidelis requested review from bernhardmgruber and srinivasyadav18 November 13, 2025 19:19

Extract env code boilerplate from device interfaces to a separate header

03c1ee9

gonidelis force-pushed the refactor_env_boilerplate branch from c72dd09 to 03c1ee9 Compare November 13, 2025 20:38

NaderAlAwar reviewed Nov 14, 2025
