Description
Describe the bug
During AOTAutograd backward compilation, DeepSpeed’s reduce_grad op returns an undefined tensor, but the graph rewrite pass rewires all downstream gradient usages to this output.
As a result, Inductor/FakeTensor sees None as input to ops like aten.sum or reshape, causing compilation failure.
Error
torch._inductor.exc.InductorError: RuntimeError:
Expected a proper Tensor but got None (or an undefined Tensor in C++) for argument #0 'self'
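For context, the same class of failure can be reproduced outside the compiler by feeding an undefined at::Tensor into an aten op. The snippet below is a minimal standalone sketch (not taken from DeepSpeed), and the exact error message may vary across PyTorch versions.

```cpp
// Minimal standalone repro sketch: consuming an undefined (default-constructed)
// at::Tensor in an aten op throws, which is the same failure mode FakeTensor
// hits when reduce_grad's output is undefined. Not DeepSpeed code.
#include <torch/torch.h>
#include <iostream>

int main() {
    at::Tensor undefined;  // default-constructed -> undefined tensor ("None" on the Python side)
    std::cout << "defined? " << undefined.defined() << std::endl;  // prints 0
    try {
        at::sum(undefined);  // downstream op consuming the undefined tensor
    } catch (const c10::Error& e) {
        // Exact wording differs by PyTorch version, but it is the same
        // "undefined tensor / None" complaint seen in the Inductor error above.
        std::cout << e.what() << std::endl;
    }
    return 0;
}
```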
Trigger path
- Backward graph: each parameter-grad node is rewritten to torch.ops.dc.reduce_grad.default(grad)
- All uses of the original grad are replaced by the output of this op
- The FX trace shows downstream ops (e.g., aten.sum(..., [0, 1]), reshape) consuming the output of reduce_grad.
- The C++ implementation returns at::Tensor() (an undefined tensor) in both (see the sketch below):
- reduce_grad()
- reduce_grad_meta()
This breaks FakeTensor propagation and Inductor lowering.
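For reference, the pattern described above looks roughly like the following. This is an illustrative sketch only; the argument list is an assumption and is not copied from DeepSpeed/csrc/compile/deepcompile.cpp.

```cpp
// Illustrative sketch of the current behavior (assumed signatures, not the
// actual deepcompile.cpp code): both the real kernel and the meta kernel
// return a default-constructed (undefined) at::Tensor.
#include <torch/extension.h>

// Hypothetical stand-in for the real kernel: the gradient reduction runs as a
// side effect, and the returned tensor is undefined.
at::Tensor reduce_grad(at::Tensor grad_tensor, long graph_id, long param_id) {
    // ... reduction / bucketing work happens here as a side effect ...
    return at::Tensor();  // undefined -> shows up as None to downstream ops
}

// Meta kernel used for FakeTensor/shape propagation during compilation.
at::Tensor reduce_grad_meta(at::Tensor grad_tensor, long graph_id, long param_id) {
    return at::Tensor();  // also undefined, so FakeTensor propagation breaks
}
```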
Root Cause
reduce_grad is treated as a functional node in the graph, but its C++ kernel and meta kernel return an undefined tensor, which cannot be consumed by downstream ops.
Since the compiler rewrites all gradient uses to this output, the output must be a valid Tensor.
Question for maintainers
In DeepSpeed/csrc/compile/deepcompile.cpp, both reduce_grad(...) and reduce_grad_meta(...) currently return an undefined tensor (at::Tensor()).
Given that the graph rewrite redirects all downstream gradient uses to the output of this op, should these two functions instead return the input grad_tensor?
This would allow downstream ops (e.g., aten.sum, reshape) to receive a valid tensor and avoid FakeTensor/Inductor errors during compilation. Is returning grad_tensor the correct fix here, or are the intended semantics different?
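For clarity, a sketch of the change the question is asking about is shown below, using the same assumed signatures as the earlier sketch. Whether the meta kernel should alias its input or instead return something like at::empty_like(grad_tensor) (to avoid an aliasing output) is part of what needs maintainer confirmation.

```cpp
// Sketch of the proposed change (assumed signatures; not a patch against the
// actual deepcompile.cpp): return the input gradient instead of an undefined
// tensor so downstream ops receive a valid at::Tensor.
#include <torch/extension.h>

at::Tensor reduce_grad(at::Tensor grad_tensor, long graph_id, long param_id) {
    // ... reduction work unchanged, still runs as a side effect ...
    return grad_tensor;  // pass the gradient through instead of at::Tensor()
}

at::Tensor reduce_grad_meta(at::Tensor grad_tensor, long graph_id, long param_id) {
    // The meta kernel must produce matching metadata so FakeTensor propagation
    // and Inductor lowering see a valid shape/dtype. Returning the input aliases
    // it; at::empty_like(grad_tensor) would avoid that if aliasing is undesired.
    return grad_tensor;
}
```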