Rewrite O(N) recursive templates with O(1) pack expansion #3596

tenpercent · 2026-01-16T17:28:25Z

Summary

- Rewrite sequence_map_inverse using O(1) depth pack expansion
- Replace O(N) recursive calculate_element_space_size with fold expression
- Replace sequence_merge O(log N) recursion with O(1) fold expression

Results

| Pattern | Before | After | Reduction |
|---|---|---|---|
| `sequence_map_inverse` | 45 inst, 187ms | 10 inst, 10ms | 95% |
| `calculate_element_space_size` | 24 inst, 35ms | 10 inst, 9ms | 73% |
| `sequence_merge` | O(log N) depth | O(1) depth | - |

Test Plan

Waiting for full CI

PR Stack

| # | PR | Description |
|---|---|---|
| 1 | #3585 | sequence_gen with `__make_integer_seq` |
| 2 | #3588 | generate_identity_sequences helper |
| 3 | #3589 | Named functors in transform_tensor_descriptor |
| 4 | #3590 | container_concat optimization |
| 5 | #3596 | O(1) pack expansion rewrites |
| 6 | #3600 | TensorDescriptor/TensorAdaptor lambda elimination |

Replace the O(N) recursive template sequence_map_inverse_impl with a constexpr function and pack expansion for O(1) template depth.

Results:
- sequence_map_inverse: 45 instances, 187ms → 7 instances, 10ms (95% reduction)
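The general shape of this rewrite can be sketched as follows. This is a minimal C++17 illustration, not the code in this PR: `Sequence` here is a hypothetical stand-in for `ck::Sequence`, and `invert_permutation`, `inverse_impl`, `inverse` and `sequence_map_inverse_t` are names invented for the example. It assumes the input is a valid permutation of 0..N-1 and does no validation.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for ck::Sequence (the real class has a larger interface).
template <int... Is>
struct Sequence
{
};

// Ordinary constexpr loop: computes the inverse permutation without
// instantiating one template per element.
template <std::size_t N>
constexpr std::array<int, N> invert_permutation(const std::array<int, N>& map)
{
    std::array<int, N> inv{};
    for(std::size_t i = 0; i < N; ++i)
        inv[static_cast<std::size_t>(map[i])] = static_cast<int>(i);
    return inv;
}

// A single pack expansion over Ns turns the computed array back into a type.
template <int... Is, std::size_t... Ns>
constexpr auto inverse_impl(Sequence<Is...>, std::index_sequence<Ns...>)
{
    constexpr auto inv = invert_permutation(std::array<int, sizeof...(Is)>{Is...});
    return Sequence<inv[Ns]...>{};
}

template <int... Is>
constexpr auto inverse(Sequence<Is...> s)
{
    return inverse_impl(s, std::make_index_sequence<sizeof...(Is)>{});
}

// Hypothetical alias for the example; the PR's actual interface may differ.
template <typename Seq>
using sequence_map_inverse_t = decltype(inverse(Seq{}));
```

The number of instantiations here does not grow with the sequence length: the element-by-element work happens inside the constexpr loop, and the only template-level step is the single expansion `inv[Ns]...`.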

Use pack expansion with a fold expression to compute the element space size instead of a recursive template or recursive lambda.

Results:
- calculate_element_space_size: 24 instances, 35ms → 10 instances, 9ms
- Max template depth: 24 → 23
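As a rough illustration of the fold-expression approach, the sketch below computes an element space size for a strided layout in one expression. It assumes the usual definition, 1 + Σ (length_i − 1) · stride_i, which this PR does not spell out; `Lengths`, `Strides` and `element_space_size` are hypothetical names for the example and are not the library's API.

```cpp
// Hypothetical tag types carrying compile-time lengths and strides.
template <long... Ls>
struct Lengths
{
};

template <long... Ss>
struct Strides
{
};

// Binary left fold over '+': both packs expand in lockstep inside the pattern
// ((Ls - 1) * Ss), so no recursive helper template or lambda is needed.
template <long... Ls, long... Ss>
constexpr long element_space_size(Lengths<Ls...>, Strides<Ss...>)
{
    static_assert(sizeof...(Ls) == sizeof...(Ss), "lengths and strides must have the same rank");
    return (1 + ... + ((Ls - 1) * Ss));
}
```

For a packed 2×3×4 layout with strides (12, 4, 1) this evaluates to 1 + 12 + 8 + 3 = 24, matching the element count; with padding (for example lengths (2, 3) and strides (4, 1)) it gives 7, the offset of the last element plus one.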

Use operator| with a fold expression (Seqs{} | ...) to merge sequences in O(1) template depth instead of O(log N) binary tree recursion.

- Reduces sequence_merge instantiations from 449 to 167 (63% reduction)
- Total template instantiations: 47,186 → 46,974 (-212)
- ADL finds operator| since Sequence is in the ck namespace

tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 59f0c32 to 5190578 Compare January 16, 2026 17:34

tenpercent force-pushed the mpodkory/recursive-to-pack-expansion branch from 6d792da to f5ada17 Compare January 16, 2026 20:16

tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 5190578 to 887bdf2 Compare January 16, 2026 20:16

tenpercent mentioned this pull request Jan 16, 2026: Replace nested static_for lambdas with compile-time search helper #3600 (Open, 1 task)

tenpercent marked this pull request as ready for review January 17, 2026 03:41

tenpercent requested review from Snektron, ThomasNing, afagaj, andriy-ca, aosewski, asleepzzz, bartekxk, carlushuang, cgmillette, coderfeli, geyyer, illsilin, poyenc, qianfengz, shumway, vidyasagar-amd and vpietila-amd as code owners January 17, 2026 03:41

tenpercent added 3 commits January 16, 2026 21:46

Rewrite sequence_map_inverse using O(1) depth pack expansion

a8c9be9


tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 887bdf2 to 02e42dc Compare January 17, 2026 03:51

tenpercent force-pushed the mpodkory/recursive-to-pack-expansion branch from f5ada17 to 9942fd6 Compare January 17, 2026 03:51
