@tenpercent commented Jan 16, 2026

Standalone PR

This PR is independent and can be merged separately from the main optimization stack.

Related stack: #3585, #3588, #3589, #3590, #3596


Summary

  • Replace recursive template metaprogramming with a simple C-array-based struct
  • Add constexpr conversion constructors for seamless Tuple interoperability
  • Add arithmetic operators using C++20 concepts
  • Add container helper overloads for StaticallyIndexedArray

Build Time Improvement

| Metric | Before | After | Improvement |
|---|---|---|---|
| Wall-Clock | 19.0s | 18.4s | 3.2% reduction |
| Cumulative | 26.2s | 24.5s | 6.5% reduction |

Stacked on: #3585

Test plan

  • Build example_grouped_conv_fwd_xdl_fp16
  • Run verification with example_grouped_conv_fwd_xdl_fp16 1 1 1

// This avoids deep template instantiation while maintaining the same interface
template <typename T, index_t N>
struct StaticallyIndexedArrayImpl
struct StaticallyIndexedArray
Collaborator

What we are doing here is essentially a vector of a vector, no? Maybe we can refactor this into the vector_type class.

Contributor Author

I think the major problem with this class right now is that it has to be interface-compatible with Tuple, so we need to be careful with the call sites.

Collaborator

I think we can retire StaticallyIndexedArray and replace it with StaticallyIndexedArray_v2.

Replace the recursive template metaprogramming implementation of
StaticallyIndexedArray with a simple C-array based struct. This avoids
deep template instantiation while maintaining the same interface.

Key changes:
- StaticallyIndexedArray now stores `T data_[N]` instead of inheriting from Tuple
- Added constexpr conversion constructor to convert from any indexed container (Tuple, etc.)
- Added arithmetic operators (+, -, *, +=, -=) using C++20 concepts
- Added overloads for container_reorder_given_new2old/old2new
- Added overloads for get_container_subset and set_container_subset
- Specialization for empty array (N=0)

Co-Authored-By: Claude <[email protected]>
@tenpercent tenpercent force-pushed the tenpercent/statically-indexed-array-rewrite branch from 1b33b98 to aef254c Compare January 16, 2026 20:16
@tenpercent tenpercent changed the base branch from tenpercent/old-ck-pack-rewrites to develop January 16, 2026 20:17