The new env-based CUB APIs repeat a lot of template code. As discussed here, this code can be extracted into a common header to avoid duplication. Move the shared code into a single header and use it from the API implementations.
Also try consolidating the two underlying env-based cub::DeviceReduce implementations by templating away the Dispatch type.
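
A minimal host-only sketch of what "templating away the Dispatch type" could look like. Every name here (`hypothetical_dispatch_a`, `hypothetical_dispatch_b`, `reduce_impl`, `reduce_a`, `reduce_b`) is an illustrative assumption and not part of CUB; the real dispatch layers and env handling have different signatures. The idea is only that the shared body is written once and parameterized on the dispatch type, while the public entry points become thin wrappers.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one dispatch layer (NOT a real CUB type).
struct hypothetical_dispatch_a
{
  template <class It, class Out, class Op, class T>
  static int dispatch(It first, std::size_t n, Out out, Op op, T init)
  {
    *out = std::accumulate(first, first + n, init, op);
    return 0; // 0 plays the role of a success status
  }
};

// Hypothetical stand-in for a second dispatch layer (NOT a real CUB type).
struct hypothetical_dispatch_b
{
  template <class It, class Out, class Op, class T>
  static int dispatch(It first, std::size_t n, Out out, Op op, T init)
  {
    T acc = init;
    for (std::size_t i = 0; i < n; ++i)
    {
      acc = op(acc, first[i]);
    }
    *out = acc;
    return 0;
  }
};

// Shared implementation, written once and parameterized on the dispatch type.
// In a real refactor, the common env-handling code would live here.
template <class DispatchT, class It, class Out, class Op, class T>
int reduce_impl(It first, std::size_t n, Out out, Op op, T init)
{
  return DispatchT::dispatch(first, n, out, op, init);
}

// Thin entry points that differ only in the dispatch type they select.
template <class It, class Out, class Op, class T>
int reduce_a(It first, std::size_t n, Out out, Op op, T init)
{
  return reduce_impl<hypothetical_dispatch_a>(first, n, out, op, init);
}

template <class It, class Out, class Op, class T>
int reduce_b(It first, std::size_t n, Out out, Op op, T init)
{
  return reduce_impl<hypothetical_dispatch_b>(first, n, out, op, init);
}
```

With this shape, anything that is identical between the two implementations is maintained in `reduce_impl` only, and a new dispatch variant needs just one more wrapper.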