Simplify IdModel options #5808

naoyam · 2026-01-13T02:18:26Z

I used to use selective options to enable part of the new indexer such as NVFUSER_ENABLE=id_model(producer_index), but I rarely use the selective options anymore.

Before globally enabling the new indexer, this PR just simplifies the option configuration. Previously, it has to be NVFUSER_ENABLE=id_model(all). With this PR, it's just NVFUSER_ENABLE=id_model.

naoyam · 2026-01-13T02:18:53Z

!test --diff

greptile-apps · 2026-01-13T02:21:52Z

Greptile Overview

Greptile Summary

This PR significantly simplifies the IdModel configuration system by consolidating multiple granular options into a single unified approach. The primary goal is to make the IdModel easier to enable globally before it becomes the default indexing path.

Key Simplifications

Option Configuration

Environment variable syntax simplified from NVFUSER_ENABLE=id_model(all) to just NVFUSER_ENABLE=id_model
Removed 7 boolean flags (build_id_model_, build_tensor_indexer_, consumer_index_, producer_index_, inline_predicate_, unswitch_predicate_, loop_)
Replaced with single tensor_indexer_enabled_ flag in IdModelOptions
Removed complex ensureConsistency() logic that managed flag dependencies
Removed DisableOption::IdModel - IdModel can no longer be disabled

Code Architecture

Eliminated IdModelEnableOption enum and all related parsing functions from id_model/utils.h
All usage sites now check single isTensorIndexerEnabled() method instead of individual flags
Updated 60+ test files to use simplified option syntax

Behavioral Changes

Always-Built Components (important for understanding resource usage):

IdModel is now always built during lowering (previously conditional)
TensorIndexer is now always built (previously conditional)
NonDivisiblePredicateInfo is now always built (previously conditional)

These are constructed regardless of whether TensorIndexer will be used, which increases baseline memory/compute overhead but simplifies the code paths.

Automatic TensorIndexer Enabling (IMPORTANT - functional change):
The logic for when TensorIndexer is automatically enabled was modified:

Removed: Automatic enabling for MmaOp operations
Removed: Automatic enabling for CpAsyncBulkTensorTile (only CpAsyncBulk remains)
Removed: Automatic enabling for tensors with non-linear transformations (the hasRootToLoopLinearTransformations check)

This means MMA operations and certain complex tensor transformations will use the legacy indexer by default unless IdModel is explicitly enabled or the fusion contains other triggering operations (ArgsortOp, PadOp, ScanOp, ScatterOp, SliceOp, TopKOp, CpAsyncBulk, or reshape with expanded broadcast).

Test Updates

All tests updated to remove argument passing to EnableOption::IdModel (changed from .set(EnableOption::IdModel, {"all"}) to .set(EnableOption::IdModel)). Some tests have updated variable name expectations due to IdModel changes (e.g., i98→i114 in test_indexing.cpp).

Confidence Score: 4/5

This PR is generally safe but contains functional changes beyond the stated scope that warrant discussion
The code simplification itself is well-executed and consistent across all 63 files. However, the score is 4 rather than 5 because: (1) The PR removes automatic TensorIndexer enabling for MmaOp, CpAsyncBulkTensorTile, and non-linear transformations without documenting this in the PR description, which could be a significant behavioral change. (2) IdModel and TensorIndexer are now always built, increasing baseline overhead. (3) These changes go beyond "simplifying the option configuration" as stated in the PR description. The test suite passes with these changes, suggesting intentional behavior, but explicit confirmation would be valuable.
csrc/device_lower/lower2device.cpp - Contains the removed logic for automatic TensorIndexer enabling (MmaOp, CpAsyncBulkTensorTile, non-linear transformations). Verify this behavioral change is intentional and doesn't break MMA or complex tensor operations.

Important Files Changed

File Analysis

Filename	Score	Overview
csrc/device_lower/id_model_options.h	5/5	Simplified from 7 boolean flags to single tensor_indexer_enabled_ flag, removed complex ensureConsistency logic
csrc/options.h	5/5	Removed DisableOption::IdModel enum value as IdModel can no longer be disabled
csrc/options.cpp	5/5	Removed "id_model" from disable options map, no longer supporting NVFUSER_DISABLE=id_model
csrc/id_model/utils.h	5/5	Removed IdModelEnableOption enum and all related parsing functions (getIdModelEnabledOptions, isIdModelOptionEnabled)
csrc/device_lower/lower2device.cpp	3/5	Major changes: IdModel/TensorIndexer always built now; removed automatic TensorIndexer enabling for MmaOp, CpAsyncBulkTensorTile, and non-linear transformations
csrc/index_compute.cpp	5/5	Replaced specific flag checks (consumerIndex, producerIndex) with unified isTensorIndexerEnabled() check
csrc/predicate_compute.cpp	5/5	Replaced specific flag checks (inlinePredicate, unswitchPredicate) with unified isTensorIndexerEnabled() check

Sequence Diagram

sequenceDiagram
    participant User
    participant ENV as Environment Variable
    participant Options as EnableOptions
    participant IdModelOpts as IdModelOptions
    participant Lower as GpuLower::analysis
    participant IdModel
    participant TensorIndexer
    participant Indexer as Index/Predicate Compute

    User->>ENV: Set NVFUSER_ENABLE=id_model
    Note over ENV: OLD: id_model(all)<br/>NEW: id_model
    
    ENV->>Options: Parse environment variable
    Options->>IdModelOpts: Create with isOptionEnabled(IdModel)
    Note over IdModelOpts: OLD: 7 boolean flags<br/>NEW: 1 tensor_indexer_enabled_
    
    IdModelOpts->>Lower: getIdModelOptions(fusion)
    Note over Lower: Check for specific ops:<br/>- CpAsyncBulk<br/>- ArgsortOp, PadOp, etc.<br/>- ReshapeOp with expanded broadcast
    
    alt TensorIndexer needed for ops
        Lower->>IdModelOpts: setTensorIndexer(true)
    end
    
    alt TensorIndexer not supported
        Lower->>IdModelOpts: setTensorIndexer(false)
    end
    
    Lower->>IdModel: Always build IdModel
    Note over IdModel: OLD: Conditional on buildIdModel()<br/>NEW: Always built
    
    Lower->>TensorIndexer: Always build TensorIndexer
    Note over TensorIndexer: OLD: Conditional on buildTensorIndexer()<br/>NEW: Always built
    
    alt isTensorIndexerEnabled()
        Lower->>IdModel: allocateLoopIndexVariables()
    end
    
    Indexer->>IdModelOpts: isTensorIndexerEnabled()?
    alt Enabled
        Indexer->>TensorIndexer: Use TensorIndexer for indexing/predicates
    else Disabled
        Indexer->>Indexer: Use legacy indexer
    end

greptile-apps

_{No files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

naoyam · 2026-01-13T02:55:00Z

!test

github-actions · 2026-01-13T02:56:27Z

Review updated until commit be3515d

Description

Simplify IdModel options from selective flags to single enable flag
Replace complex option parsing with straightforward tensor indexer toggle
Update all test files to use simplified option format
Remove unused option parsing logic and disable option

Changes walkthrough

Relevant files

Enhancement

5 files

id_model_options.h `Simplify IdModelOptions class structure`	+11/-120
lower2device.cpp `Streamline IdModel option logic`	+17/-46
utils.h `Remove complex option parsing functions`	+0/-68
options.cpp `Remove id_model disable option`	+0/-1
options.h `Remove IdModel disable option enum`	+0/-1

Tests

1 files

utils.cpp `Update test setup to use simplified option`	+1/-1

Additional files

57 files

utils.cpp	+1/-1
id_model.cpp	+1/-1
index_compute.cpp	+5/-5
predicate_compute.cpp	+2/-2
test_abstract_tensor.cpp	+1/-1
test_allocation_domain.cpp	+1/-1
test_allocation_order_inference.cpp	+1/-1
test_argsort.cpp	+1/-1
test_bfs.cpp	+1/-1
test_circular_buffering.cpp	+8/-8
test_circular_buffering_ping_pong.cpp	+1/-1
test_cluster.cpp	+1/-1
test_combined_inner_outer_reduction.cpp	+1/-1
test_compute_with.cpp	+1/-1
test_contiguity_id_model.cpp	+1/-1
test_gather.cpp	+1/-1
test_gpu1.cpp	+4/-4
test_gpu2.cpp	+1/-1
test_gpu3.cpp	+2/-2
test_greedy.cpp	+1/-1
test_index_put.cpp	+1/-1
test_index_select.cpp	+1/-1
test_indexing.cpp	+34/-33
test_indexing_advanced.cpp	+2/-2
test_inlining.cpp	+1/-1
test_layout_op.cpp	+1/-1
test_loop_domain_scheduling.cpp	+1/-1
test_matmul.cpp	+2/-2
test_matmul_scheduler.cpp	+4/-4
test_memory.cpp	+3/-3
test_mma.cpp	+4/-4
test_moe.cpp	+1/-1
test_move_pad.cpp	+1/-1
test_move_repeat_forward.cpp	+1/-1
test_move_split_cat.cpp	+1/-1
test_outer_reduction.cpp	+1/-1
test_persistent_buffer.cpp	+2/-2
test_pointwise.cpp	+2/-2
test_predicate_elimination.cpp	+1/-1
test_reduction.cpp	+1/-1
test_reduction_pointwise.cpp	+1/-1
test_remove_bcast_squeeze.cpp	+1/-1
test_replay.cpp	+1/-1
test_reshape.cpp	+2/-2
test_resize.cpp	+40/-40
test_rng.cpp	+1/-1
test_rope.cpp	+1/-1
test_scatter.cpp	+2/-2
test_select.cpp	+1/-1
test_serial_gridreduce.cpp	+1/-1
test_stream.cpp	+1/-1
test_topk.cpp	+1/-1
test_transpose.cpp	+2/-2
test_unary.cpp	+1/-1
test_vectorization.cpp	+3/-3
test_welford.cpp	+1/-1
utils.h	+2/-2

PR Reviewer Guide

Here are some key observations to aid the review process:

🧪 PR contains tests
⚡ Recommended focus areas for review

greptile-apps

_{No files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

greptile-apps

_{2 files reviewed, 2 comments}

_{Edit Code Review Agent Settings | Greptile}

csrc/device_lower/lower2device.cpp

greptile-apps · 2026-01-13T07:08:07Z

csrc/device_lower/lower2device.cpp

+  tensor_indexer_ = std::make_unique<TensorIndexer>(info().idModel());
+
+  non_divisible_predicate_info_ =
+      std::make_unique<NonDivisiblePredicateInfo>(fusion_);


Behavioral Change: TensorIndexer and NonDivisiblePredicateInfo are now always constructed, regardless of whether TensorIndexer is enabled for use.

Previously, these were only constructed when buildTensorIndexer() was true. This means:

Additional memory allocation occurs in all cases now

Construction overhead happens even when not used

Any code checking GpuLower::current()->isTensorIndexerEnabled() (which checks if the pointer is non-null) will now always return true

The decision to use TensorIndexer is now solely controlled by idModelOptions().isTensorIndexerEnabled() (the boolean flag), but the object itself always exists. This separation is important to understand for maintenance.

jjsjann123

Looks straightforward to me.

naoyam · 2026-01-14T00:24:07Z

!build

Simplify IdModel options

0578f53

greptile-apps bot reviewed Jan 13, 2026

View reviewed changes

naoyam added 2 commits January 12, 2026 18:54

update

1d7b328

fix

ae0dfac

greptile-apps bot reviewed Jan 13, 2026

View reviewed changes

naoyam requested a review from jjsjann123 January 13, 2026 05:18

Merge branch 'main' into id_model_option_cleanup

be3515d

greptile-apps bot reviewed Jan 13, 2026

View reviewed changes

jjsjann123 approved these changes Jan 14, 2026

View reviewed changes

naoyam merged commit 9881db4 into main Jan 14, 2026
15 of 16 checks passed

naoyam deleted the id_model_option_cleanup branch January 14, 2026 00:32

Simplify IdModel options #5808

Simplify IdModel options #5808

Uh oh!

Conversation

naoyam commented Jan 13, 2026

Uh oh!

naoyam commented Jan 13, 2026

Uh oh!

greptile-apps bot commented Jan 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Overview

Greptile Summary

Key Simplifications

Behavioral Changes

Test Updates

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

naoyam commented Jan 13, 2026

Uh oh!

github-actions bot commented Jan 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Changes walkthrough

PR Reviewer Guide

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

greptile-apps bot Jan 13, 2026

Choose a reason for hiding this comment

Uh oh!

jjsjann123 left a comment

Choose a reason for hiding this comment

Uh oh!

naoyam commented Jan 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

greptile-apps bot commented Jan 13, 2026 •

edited

Loading

github-actions bot commented Jan 13, 2026 •

edited

Loading