
Fix gradient propagation for in-place array assignments #1306

Open

shivansh023023 wants to merge 4 commits into NVIDIA:main from shivansh023023:feat/autograd-inplace-assign

Conversation


@shivansh023023 shivansh023023 commented Mar 19, 2026

Description

This PR fixes a bug where gradient propagation for in-place array assignments on vector and matrix types (e.g., arr[i] = wp.vec3(1.0, 2.0, 3.0) and arr[i] += wp.vec3(1.0, 2.0, 3.0)) failed to build correct wp.Tape nodes and to backpropagate gradients.

The bug lay in the fact that intermediate vector/matrix values inside arrays default to Reference types during code generation. By explicitly stripping the Reference type inside vector_assign_dispatch_func, vector_assign_copy_value_func, matrix_assign_dispatch_func, and matrix_assign_copy_value_func, we cleanly register the backward assignment node for autodifferentiation. Additionally, emit_AugAssign in codegen.py now correctly redirects augmented array assignments to the standard base assignment protocol.
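The stripping pattern can be sketched in plain Python. This is an illustrative stand-in, not Warp's actual source: the `Reference`, `strip_reference`, and `vector_assign_dispatch` names below are simplified stand-ins for the internal types and helpers the paragraph above refers to.

```python
# Illustrative sketch of the fix (assumed stand-ins, not Warp internals):
# array element access yields a Reference-wrapped type, so type-based
# dispatch must unwrap it before looking up the vec/mat overload.

class vec3:  # stand-in for wp.vec3
    pass

class Reference:
    """Marks an l-value produced during code generation."""
    def __init__(self, value_type):
        self.value_type = value_type

def strip_reference(t):
    # Recover the concrete element type from a Reference wrapper.
    return t.value_type if isinstance(t, Reference) else t

def vector_assign_dispatch(container_type):
    # Normalize first, as the PR does in the four assign helpers, so the
    # overload lookup keys on vec3 rather than on Reference.
    return strip_reference(container_type)

assert vector_assign_dispatch(Reference(vec3)) is vec3
assert vector_assign_dispatch(vec3) is vec3
```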

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes. (N/A)

Test plan

  • Uncommented the existing TODO gradient propagation assertions for array assignments in warp/tests/test_vec.py and warp/tests/matrix/test_mat.py.
  • Ran both test files end to end via python warp/tests/test_vec.py and python warp/tests/matrix/test_mat.py; all tests passed.
  • Created and validated custom reproduction scripts exercising x[0] = y[0] and x[0][0] += 5.0 with gradients, confirming that tape.backward(loss=x) correctly sets y.grad[0] to 1.0.

Bug fix

import warp as wp
import numpy as np
wp.init()

@wp.kernel
def assign_kernel(x: wp.array(dtype=float), y: wp.array(dtype=float)):
    x[0] = y[0] # In-place array assignment

@wp.kernel
def aug_assign_kernel(x: wp.array(dtype=wp.vec3)):
    x[0][0] += 5.0 # In-place augmented assignment on a vector component

y = wp.array([10.0], dtype=float, requires_grad=True)
x = wp.array([0.0], dtype=float, requires_grad=True)
vec = wp.zeros(1, dtype=wp.vec3, requires_grad=True)

tape = wp.Tape()
with tape:
    wp.launch(assign_kernel, dim=1, inputs=[x, y])
    wp.launch(aug_assign_kernel, dim=1, inputs=[vec])

tape.backward(loss=x)
# Previously, y.grad[0] would be 0.0 or trigger a Codegen KeyError
# With this PR, it properly propagates the gradient: 1.0
print(f"y.grad: {y.grad.numpy()}")


<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **Bug Fixes**
  * More robust vector/matrix assignment and indexing: container types are normalized, partial vs. full indexing for multi-dimensional assignments is handled correctly, and augmented-assignment/subscript behavior improved; kernel error messages shortened.

* **Tests**
  * Re-enabled and updated array-assignment tests, including restored gradient assertion and an exposed kernel for matrix array assignment.

* **Documentation**
  * Added Apache 2.0 license headers and formatting cleanups.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->


copy-pr-bot bot commented Mar 19, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Mar 19, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Normalize reference/container types in vector/matrix assignment builtins, add/restore Apache‑2.0 headers and re-enable tests, and add an allow_partial flag plus subscript-splitting and adjoint/eval adjustments in codegen.

Changes

Cohort / File(s) Summary
Builtins
warp/_src/builtins.py
Apply strip_reference(...) to normalize container types in vector_assign_* and matrix_assign_* dispatch and copy helpers; add Apache‑2.0 header; minor formatting edits.
Codegen
warp/_src/codegen.py
Introduce allow_partial to recurse_subscript()/eval_subscript(); when LHS indexing is over-dimensioned, split indices into array vs. vector/matrix components, use assign_copy for component update and array_store for writeback; ensure mark_write(...) and early return for specialized path; adjust augmented-assignment lowering for reference-like targets; remove unconditional pre-eval of rhs for subscript cases; Adjoint.eval() returns immediately for Var; shorten list-literal error text.
Tests — Matrix
warp/tests/matrix/test_mat.py
Add Apache‑2.0 header and re-enable previously string-commented @wp.kernel (mat_array_assign_element) by converting the block to a single-line TODO.
Tests — Vector
warp/tests/test_vec.py
Add Apache‑2.0 header and re-enable gradient assertion in test_vec_array_assign (now asserts x.grad.numpy() == [[6.0]]).
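The split-index, read-modify-write lowering summarized in the Codegen row above can be sketched in plain NumPy. This is illustrative only, not Warp's codegen; `lowered_component_assign` is a hypothetical name for the sequence of `eval_subscript` split, `assign_copy`, and `array_store` steps.

```python
# Sketch (plain NumPy, not Warp codegen) of the over-indexed LHS lowering:
# for arr[i][j] = v on a 1-D array of vec elements, indices past the array's
# ndim address a component, so the element is loaded, a copy is updated, and
# the whole element is stored back (making the write differentiable).
import numpy as np

def lowered_component_assign(arr, indices, value, ndim=1):
    array_idx = tuple(indices[:ndim])   # indices addressing the array element
    comp_idx = tuple(indices[ndim:])    # indices into the vec/mat component
    elem = arr[array_idx].copy()        # load (the assign_copy source)
    elem[comp_idx] = value              # component update
    arr[array_idx] = elem               # array_store: write back whole element

a = np.zeros((2, 3))                    # two "vec3" elements
lowered_component_assign(a, [0, 1], 5.0)
assert a[0, 1] == 5.0 and a[1].sum() == 0.0
```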

Sequence Diagram(s)

sequenceDiagram
    participant Source as Source AST / Kernel
    participant Codegen as Codegen
    participant Subscript as Subscript Resolver
    participant Assign as Assign Lowering
    participant Runtime as Runtime / Array Store

    Source->>Codegen: emit assignment lhs[...] = rhs
    Codegen->>Subscript: eval_subscript(lhs, allow_partial=false)
    Subscript-->>Codegen: resolved -> {array_part, component_part}
    alt over-indexed (array + vector/matrix component)
        Codegen->>Assign: assign_copy(component_view, rhs)
        Assign-->>Codegen: component_value
        Codegen->>Runtime: array_store(array_ref, array_part_index, component_value)
    else standard element/region
        Codegen->>Runtime: direct_store(resolved_target, rhs)
    end
    Note over Codegen,Assign: augmented-inplace on reference-like targets diverted to non-inplace assign path
    Note over Codegen,Runtime: Adjoint.eval() returns Var early for Var nodes

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check (✅ Passed): check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check (✅ Passed): the pull request title accurately captures the main objective, fixing gradient propagation for in-place array assignments, which is the core issue addressed across all modified files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR fixes gradient propagation for in-place array assignments of vector and matrix types (e.g. arr[i][j] = val and arr[i][j] += val). The fix has two complementary parts: (1) builtins.py strips Reference wrappers when looking up vec/mat types in the four assign dispatch/copy-value helpers, so the correct builtin overload is resolved when the array element is a reference; (2) codegen.py changes eval_subscript/recurse_subscript to support an allow_partial=False mode, then uses that in emit_Assign to collect all indices together and emit an assign_copy + array_store read-modify-write sequence instead of the previous broken single-store path. emit_AugAssign receives a matching deep-index branch that manually reads, applies the operator, and writes back using the same pattern.

Key changes:

  • strip_reference() applied to args["a"].type in all four vector/matrix assign helpers in builtins.py
  • New allow_partial parameter on recurse_subscript/eval_subscript controls whether intermediate array partial indexing happens during subscript recursion
  • emit_Assign now splits over-indexed subscripts into array_indices + vec_indices and uses assign_copy + array_store for correctness and differentiability
  • emit_AugAssign redirects reference subscripts to make_new_assign_statement() for simple cases, and to an explicit read-modify-write sequence for deep array-component indexing
  • A defensive if isinstance(node, Var): return node guard is added to eval()
  • Previously disabled gradient assertions in test_vec.py and test_mat.py are restored
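The redirect of augmented assignments to the plain assignment path can be sketched as a stand-alone AST rewrite. This is a simplified, assumed shape, not Warp's actual emit_AugAssign/make_new_assign_statement(), which operate inside the full adjoint builder; the sketch only shows the x[i] += v to x[i] = x[i] + v transformation those helpers rely on.

```python
# Simplified sketch (not Warp's code) of redirecting an augmented assignment
# on a subscript target to the ordinary assignment lowering, so the standard
# store path (and therefore its adjoint) handles the write.
import ast

class AugAssignRewriter(ast.NodeTransformer):
    def visit_AugAssign(self, node):
        # x[i] += v  ->  x[i] = x[i] + v
        load = ast.Subscript(value=node.target.value,
                             slice=node.target.slice, ctx=ast.Load())
        rhs = ast.BinOp(left=load, op=node.op, right=node.value)
        return ast.copy_location(
            ast.Assign(targets=[node.target], value=rhs, type_comment=None),
            node)

tree = ast.parse("x[0] += 5.0")
tree = ast.fix_missing_locations(AugAssignRewriter().visit(tree))
print(ast.unparse(tree))  # x[0] = x[0] + 5.0
```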

Several issues flagged in prior review rounds (missing adj.reads attribute, double-evaluation of index sub-expressions, emit_indexing with over-counted indices) remain open and are not addressed in this iteration.

Confidence Score: 2/5

  • Not yet safe to merge — multiple pre-existing but still-open critical bugs in codegen.py (missing adj.reads attribute causing AttributeError, double-evaluated index sub-expressions, and over-counted indices passed to emit_indexing) remain unaddressed.
  • The builtins.py strip_reference fixes are correct and targeted. However, codegen.py carries forward unresolved issues from prior review rounds: adj.reads is never initialized (AttributeError on first use), subscript index expressions are evaluated twice when make_new_assign_statement() is called after eval_subscript has already run, and the AugAssign emit_indexing call may receive over-counted indices. These were flagged in previous review threads and are not yet fixed. Additionally, the new emit_AugAssign deep-index path has no dedicated test, and a stale # TODO comment remains in the test file.
  • warp/_src/codegen.py requires the most attention — particularly the adj.reads attribute initialization, the double-evaluation side effect in make_new_assign_statement, and the index-split handling in emit_AugAssign.

Important Files Changed

Filename Overview
warp/_src/builtins.py Adds strip_reference() calls in vector_assign_dispatch_func, vector_assign_copy_value_func, matrix_assign_dispatch_func, and matrix_assign_copy_value_func so that container type lookups work correctly when the incoming a argument is a reference type (as happens for array element access). The fix is minimal and targeted; also adds the Apache 2.0 license body and one blank-line formatting tweak.
warp/_src/codegen.py Core code-generation changes: eval_subscript/recurse_subscript gain an allow_partial flag (defaulting to True) that, when set to False, prevents intermediate array partial indexing so that the full index list (array + component indices) is available together. emit_Assign uses allow_partial=False and implements a new read-modify-write path (assign_copy + array_store) for array-level vector/matrix component assignments. emit_AugAssign adds a parallel deep-index path, redirecting simple-reference cases back to make_new_assign_statement(). A guard if isinstance(node, Var): return node is added to eval(). Several issues flagged in previous review threads remain unresolved (missing adj.reads attribute, double-evaluation of subscript index expressions, index over-count in emit_indexing).
warp/tests/matrix/test_mat.py Restores the mat_array_assign_element, mat_array_assign_row, and test_mat_array_assign test block that was previously disabled with a triple-quoted comment. The gradient assertions are now active. A stale # TODO comment remains at line 1668 and should be removed.
warp/tests/test_vec.py Re-enables the previously commented-out gradient assertion in test_vec_array_assign. No test for the augmented-assignment (arr[i][j] +=) path that the PR's AugAssign changes target.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["emit_Assign / emit_AugAssign\nlhs is ast.Subscript"] --> B["eval_subscript(lhs,\nallow_partial=False)"]
    B --> C{"is_array(target_type)\nAND len(indices) > ndim?"}

    C -- Yes --> D["Split indices:\narray_indices = indices[:ndim]\nvec_indices = indices[ndim:]"]
    D --> E["emit_indexing(target, array_indices)\n→ vec_target (reference)"]
    E --> F["assign_copy(vec_target, vec_indices, rhs)\n→ new_vec"]
    F --> G["array_store(target, array_indices, new_vec)"]

    C -- No --> H{"is_array(target_type)"}
    H -- Yes --> I["eval_indices → array_store\n(whole-element write)"]
    H -- No --> J{"is_reference(target.type)?"}
    J -- Yes / AugAssign --> K["make_new_assign_statement()\n→ recurse into emit_Assign"]
    J -- No --> L["assign / store / assign_copy\nfor tiles, vecs, mats"]

    subgraph builtins_fix["builtins.py fix"]
        M["vector/matrix_assign_dispatch_func\nvector/matrix_assign_copy_value_func"]
        M --> N["strip_reference(args['a'].type)\nbefore type lookup"]
    end

    G -.->|"enables differentiable\narray[i][j] = val"| builtins_fix

Last reviewed commit: "Address P1: Fix emit..."

Comment on lines +235 to +241
doc="Compute the cosh of ``x``.",
group="Scalar Math",
)
add_builtin(
"copysign",
input_types={"x": Float, "y": Float},
value_func=sametypes_create_value_func(Float),

P0 Missing native C++ implementation for copysign

The new copysign builtin uses the default namespace "wp::", which means Warp's codegen will emit wp::copysign(x, y). However, no such function exists in the native headers (builtin.h, etc.). Searching the entire warp/native/ directory finds no definition of wp::copysign.

Without the corresponding C++ implementation, any kernel that calls wp.copysign(...) will fail to compile with a linker/compiler error. Either:

  • A wp::copysign C++ implementation must be added to builtin.h (e.g. wrapping std::copysign / CUDA copysignf), or
  • The builtin should specify namespace="" to fall through to the standard library's copysign.
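Whichever option is chosen, the native function has to match IEEE 754 copysign semantics. Python's math.copysign follows the same rules as std::copysign, so it serves as a quick reference for the expected behavior (magnitude of x, sign of y, signed zero honored):

```python
# Reference semantics any wp::copysign implementation must match
# (IEEE 754 copySign, as implemented by std::copysign / math.copysign).
import math

assert math.copysign(3.0, -1.0) == -3.0
assert math.copysign(-3.0, 2.0) == 3.0
# The sign bit of zero is honored, which a naive abs/negate version would miss.
assert math.copysign(1.0, -0.0) == -1.0
```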

Comment on lines 1018 to 1038
@@ -1009,7 +1033,8 @@ def get_diag_value_func(arg_types: Mapping[str, type], arg_values: Mapping[str,
hidden=True,
group="Scalar Math",
export=False,
namespace="wp::" if t is not bool else "",
namespace="wp::" if t is not bool and not safe_native else "",
native_func=safe_native if safe_native else t.__name__,
)

P0 safe_native function names don't exist in native headers

The PR generates native_func names like float32_to_uint8, float16_to_uint32, float64_to_uint64, etc. using:

safe_native = f"{float_src_types[u]}_to_{t.__name__}"

These function names (e.g. float32_to_uint8, float32_to_uint16, float32_to_uint32, float32_to_uint64, and their float16/float64 variants) do not exist in warp/native/builtin.h or any other native header. A thorough search of warp/native/ confirms this.

Any kernel that performs a float-to-unsigned-int cast (e.g. wp.uint8(x) where x is a float) will fail to compile with an "undefined function" error. The safe cast helper functions need to first be implemented in the native C++ layer before being referenced here.
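For illustration, a "safe" float-to-unsigned helper of the kind these names imply would clamp before truncating; the following is a hypothetical Python sketch of the intended semantics (the native float32_to_uint8 etc. do not exist yet, per the comment above):

```python
# Hypothetical sketch of safe float -> uint8 conversion semantics: clamp to
# the representable range before truncating toward zero, avoiding the
# undefined behavior a raw C cast has for negative or overflowing floats.
def float_to_uint8(x: float) -> int:
    return int(min(max(x, 0.0), 255.0))

assert float_to_uint8(-3.7) == 0      # negative input clamps to 0
assert float_to_uint8(300.0) == 255   # overflow clamps to the max
assert float_to_uint8(42.9) == 42     # in-range values truncate, as in C
```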

Comment on lines +462 to +475
@@ -462,10 +472,7 @@ def set_primitive_value(inst, value):
else:
setattr(inst._ctype, field, value)

# Re-wrap in the Warp scalar type so the Python attribute preserves
# the declared type (e.g. wp.uint8) instead of decaying to plain
# int/float, but only when the caller passed a Warp scalar.
cls.__setattr__(inst, field, var_type(value) if is_warp_scalar else value)
cls.__setattr__(inst, field, value)

P1 Struct primitive field set to None on zero-initialization

After the refactor, cls.__setattr__(inst, field, value) is called unconditionally at the end of set_primitive_value, including when value is None. This means the Python-side attribute is set to None instead of to the zero-initialized type instance (var_type()).

Before this PR, the value is None branch explicitly called cls.__setattr__(inst, field, var_type()) so that s.my_field returned wp.uint8(0) (or similar). After this change, s.my_field returns None for a default-constructed struct, which will break downstream code that reads primitive struct fields.

A minimal fix would be to restore the zero-init path:

        if value is None:
            setattr(inst._ctype, field, var_type._type_())
            cls.__setattr__(inst, field, var_type())   # restore
        else:
            ...
        # Remove the unconditional cls.__setattr__ below

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
warp/_src/builtins.py (1)

9808-9823: ⚠️ Potential issue | 🟡 Minor

Use the stripped vector type in the length-mismatch message.

The new strip_reference() on Line 9808 is right, but Line 9822 still formats args["value"].type._length_ directly. When value is a referenced vector from an array element, that branch can raise AttributeError instead of the intended ValueError.

Proposed fix
-                        f"The length of the provided vector ({args['value'].type._length_}) isn't compatible with the given slice (expected {length})."
+                        f"The length of the provided vector ({value_type._length_}) isn't compatible with the given slice (expected {length})."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@warp/_src/builtins.py` around lines 9808 - 9823, The length-mismatch error
message uses args["value"].type._length_ which can be a referenced type and
cause AttributeError after you introduced strip_reference; update the message to
use the already-stripped value_type (i.e., value_type._length_) and keep the
same surrounding text/formatting so the ValueError always formats safely; locate
the check in the slice_t branch where vec, idx, value_type are used and replace
the args["value"].type._length_ reference with value_type._length_.
warp/_src/codegen.py (2)

3530-3535: ⚠️ Potential issue | 🟠 Major

Avoid evaluating the RHS before deciding to rewrite.

Any branch that falls back to make_new_assign_statement() has already emitted adj.eval(node.value) on Line 3534, and the synthetic ast.Assign reevaluates node.value again. That makes lhs += expr run expr twice on the rewritten paths, which is observable for expensive or side-effecting RHS expressions.
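A minimal stand-alone illustration of the hazard (plain Python, not Warp code): if the lowering pre-evaluates the RHS and the synthetic Assign evaluates it again, side effects run twice.

```python
# Demonstrates the observable double evaluation: rhs() is called once by the
# pre-eval and once more by the synthetic x[0] = x[0] + rhs() rewrite.
calls = []

def rhs():
    calls.append(1)
    return 2.0

x = [1.0]
_ = rhs()                # pre-eval emitted before the rewrite decision
x[0] = x[0] + rhs()      # synthetic Assign re-evaluates the RHS
assert len(calls) == 2   # the side effect ran twice
assert x[0] == 3.0
```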

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@warp/_src/codegen.py` around lines 3530 - 3535, The code currently calls
adj.eval(node.value) before deciding to rewrite, then constructs a synthetic
ast.Assign that re-evaluates node.value, causing the RHS to run twice; fix by
evaluating the RHS exactly once and reusing that result: call rhs_eval =
adj.eval(node.value) a single time (or defer that call until you know you need
it) and use rhs_eval when building the ast.BinOp inside
make_new_assign_statement() (replace adj.eval(node.value) and any direct uses of
node.value in the synthetic ast.Assign with rhs_eval), referencing
make_new_assign_statement(), adj.eval, lhs, node.value, rhs (or rhs_eval),
new_node, ast.Assign and ast.BinOp so the RHS side-effects/expense occur only
once.

466-475: ⚠️ Potential issue | 🟠 Major

Keep struct primitive attributes synchronized with the coerced storage.

After writing to inst._ctype, this caches the pre-conversion Python value on the instance. For narrowed fields like float16 or small integer types, the Python attribute can now diverge from the actual stored value, so __repr__(), assign(), and to() may expose/copy something different from what kernels see.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@warp/_src/codegen.py` around lines 466 - 475, The code currently writes a
coerced value into inst._ctype (handling e.g. float16 via float_to_half_bits)
but then caches the original Python value by calling cls.__setattr__(inst,
field, value), causing divergence; change this to read back the actual stored
value from inst._ctype and set that on the instance instead (e.g. use
stored_value = getattr(inst._ctype, field) and then cls.__setattr__(inst, field,
stored_value)) so that primitive attributes (field) always reflect the
coerced/storage representation for var_type checks like warp.float16 and
operations that use inst attributes.
warp/tests/matrix/test_mat.py (1)

3485-3485: ⚠️ Potential issue | 🟠 Major

Test registration is still commented out, so test_mat_array_assign won't run.

The PR objective states that gradient propagation assertions for array assignments were re-enabled, and the kernels at lines 1668-1685 are now active. However, this test registration remains commented out, meaning the test will never execute.

To validate the gradient propagation fix claimed by this PR, uncomment this line:

-# add_function_test(TestMat, "test_mat_array_assign", test_mat_array_assign, devices=devices)
+add_function_test(TestMat, "test_mat_array_assign", test_mat_array_assign, devices=devices)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@warp/tests/matrix/test_mat.py` at line 3485, Uncomment the test registration
so the array-assignment gradient test runs: remove the leading "#" from the line
that calls add_function_test(TestMat, "test_mat_array_assign",
test_mat_array_assign, devices=devices) in warp/tests/matrix/test_mat.py so the
test_mat_array_assign test (registered under TestMat) is enabled and executed;
ensure no trailing comment characters remain and run the test suite to verify it
executes.
🧹 Nitpick comments (1)
warp/tests/test_vec.py (1)

982-996: Please add a regression for array-component += as well.

This assertion now locks down the direct assignment path, but the new branch in warp/_src/codegen.py is specifically for reference-backed augmented assignments. A companion case like y[i, j][0] += x[i, j] / y[i, j].x += ... would protect the actual emit_AugAssign() path from regressing. The TODO on Line 995 also looks stale now.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@warp/tests/test_vec.py` around lines 982 - 996, Add a companion regression
that covers reference-backed augmented assignment for array components: create a
new test (or extend test_vec_array_assign) which uses an augmented assignment
like y[0,0][0] += x[0,0] or y[0,0].x += x[0,0] inside the launched kernel, then
run the same backward pass and assert both the updated y value and the
propagated gradient on x match expected results; update identifiers in the test
(y, x, tape, wp.launch) to mirror the existing pattern and remove the stale TODO
about in-place array assignment once the augmented-assignment case is added.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@warp/_src/builtins.py`:
- Around line 238-244: The new copysign builtin was added via add_builtin in
builtins.py but the public stub wasn't regenerated; run the documentation/stub
generation script (execute build_docs.py) to regenerate warp/__init__.pyi so
that copysign is included in the public typing stub and autocomplete, then
verify the generated stub exposes the copysign signature consistent with the
add_builtin call (input_types={"x": Float, "y": Float}, return type matching
sametypes_create_value_func(Float)).

In `@warp/_src/codegen.py`:
- Around line 3293-3307: The synthetic load created by calling
adj.emit_indexing(target, array_indices) marks the LHS as read and triggers
spurious read/write diagnostics when verify_autograd_array_access is enabled; to
fix, snapshot the target's read flags before calling adj.emit_indexing (using
the same read-flag save/restore pattern used in emit_BinOp for LHS handling),
perform vec_target = adj.emit_indexing(...), then restore the saved read flags
before proceeding to vec_indices, new_vec, adj.add_builtin_call("array_store",
...) and the subsequent target.mark_write(...) call so the temporary
preserve-read does not generate a false read diagnostic for target. Ensure you
reference adj.eval_indices, adj.emit_indexing, vec_target/vec_target_type, and
target.mark_write when locating and applying the change.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: c42f3ba3-41b4-43bb-aa19-09a44fb1319c

📥 Commits

Reviewing files that changed from the base of the PR and between fe1742c and 496a899.

📒 Files selected for processing (4)
  • warp/_src/builtins.py
  • warp/_src/codegen.py
  • warp/tests/matrix/test_mat.py
  • warp/tests/test_vec.py

Comment on lines +238 to +244
add_builtin(
"copysign",
input_types={"x": Float, "y": Float},
value_func=sametypes_create_value_func(Float),
doc="Create a value with the magnitude of ``x`` and the sign of ``y``.",
group="Scalar Math",
)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

if [ ! -f warp/__init__.pyi ]; then
  echo "warp/__init__.pyi not found"
  exit 1
fi

rg -n '\bcopysign\b' warp/__init__.pyi

Repository: NVIDIA/warp

Length of output: 37


Run build_docs.py to regenerate warp/__init__.pyi with the new copysign builtin.

The copysign function is missing from warp/__init__.pyi. Per the coding guidelines, modifying warp/_src/builtins.py requires regenerating the public stub so that typing and autocomplete stay in sync with the runtime API.

🧰 Tools
🪛 Ruff (0.15.6)

[error] 240-240: Float may be undefined, or defined from star imports

(F405)


[error] 240-240: Float may be undefined, or defined from star imports

(F405)


[error] 241-241: Float may be undefined, or defined from star imports

(F405)


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
warp/_src/builtins.py (1)

9801-9831: ⚠️ Potential issue | 🟡 Minor

Use normalized value_type in the length-mismatch error path.

Line 9815 still uses args["value"].type._length_. With reference-wrapped values this can raise an AttributeError instead of the intended ValueError.

Suggested fix
-                        f"The length of the provided vector ({args['value'].type._length_}) isn't compatible with the given slice (expected {length})."
+                        f"The length of the provided vector ({value_type._length_}) isn't compatible with the given slice (expected {length})."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@warp/_src/builtins.py` around lines 9801 - 9831, The length-mismatch error
message uses the raw args["value"].type._length_ which can be a
reference-wrapped type and raise AttributeError; update the error to use the
already-normalized value_type (e.g., value_type._length_) in the branch inside
the slice_t path where you check value_type._length_ != length so the ValueError
message reports the provided vector length safely (reference vec and length and
keep template_args handling unchanged).
warp/_src/codegen.py (1)

3154-3180: ⚠️ Potential issue | 🟠 Major

allow_partial=False also blocks valid array-view chaining.

For writes like a[1:][0] = v, the second subscript still targets an array view, not a vector/matrix lane. This change keeps both indices attached to the root array, so the assignment path later treats the extra index as a composite component and lowers it through assign_copy(). Only stop partial flattening once the intermediate target is no longer an array.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@warp/_src/codegen.py` around lines 3154 - 3180, The current guard "if
is_array(target_type) and allow_partial:" prevents flattening indices for
chained array-views when allow_partial is False; change the condition in
recurse_subscript so flattening only stops when the intermediate target is no
longer an array. Concretely, in recurse_subscript (and where flat_indices is
computed) remove the allow_partial requirement and use "if
is_array(target_type):" so both indices remain attached to the root array
(keeping the later assign_copy lowering path intact); keep the rest of the logic
(adj.eval, adj.recurse_subscript, strip_reference) unchanged.
♻️ Duplicate comments (2)
warp/_src/codegen.py (2)

3548-3551: ⚠️ Potential issue | 🟠 Major

This eval_subscript() probe reevaluates the LHS.

Line 3548 is not a pure type check. For x[idx()][0] += 1, it can materialize x[idx()] before make_new_assign_statement() reevaluates the same subscript in the synthesized BinOp, and again in the store path. That duplicates index side effects and also leaves behind a synthetic read for verify_autograd_array_access. Please avoid probing with eval_subscript() here; use a type-only walk or carry the resolved base/index vars into the lowered assignment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@warp/_src/codegen.py` around lines 3548 - 3551, The probe using
eval_subscript(lhs) causes re-evaluation and side effects; instead avoid calling
eval_subscript() here and either perform a type-only walk (inspect AST/Type info
without executing/evaluating lhs) or propagate the already-resolved base/index
variables into the lowering so make_new_assign_statement() receives the concrete
target and indices; update the logic around the eval_subscript call in
codegen.py so you do not re-materialize x[idx()] (reference eval_subscript,
make_new_assign_statement, and verify_autograd_array_access) — either replace
the probe with a pure type-check helper or thread the resolved target/indices
through to the synthesized BinOp/store path.

3300-3302: ⚠️ Potential issue | 🔴 Critical

adj.reads will crash here.

Line 3300 reads adj.reads, but Adjoint only tracks access state on each Var (is_read / is_write). The first x[i][j] = ... codegen path will raise AttributeError before this fix can run. Reuse the same arg read-flag snapshot/restore pattern used in emit_BinOp().

🐛 Proposed fix
-                    reads_saved = dict(adj.reads)
-                    vec_target = adj.emit_indexing(target, array_indices)
-                    adj.reads = reads_saved
+                    if warp.config.verify_autograd_array_access:
+                        read_states = [arg.is_read for arg in adj.args]
+                    vec_target = adj.emit_indexing(target, array_indices)
+                    if warp.config.verify_autograd_array_access:
+                        for arg, is_read in zip(adj.args, read_states):
+                            arg.is_read = is_read
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@warp/_src/codegen.py` around lines 3300 - 3302, The code attempts to snapshot
adj.reads which doesn't exist on Adjoint and will raise AttributeError; change
to snapshot/restore the read-flag state on the Var arguments like emit_BinOp
does: before calling adj.emit_indexing(target, array_indices) capture the
current is_read flags for each Var used in the target/index args (e.g., iterate
the Vars to build reads_saved), call adj.emit_indexing(...), then restore those
Var.is_read flags from reads_saved afterward; update the logic around
adj.emit_indexing and the reads_saved variable name to operate on
Var.is_read/is_write state rather than adj.reads.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@warp/tests/test_vec.py`:
- Around line 995-996: The comment "TODO: gradient propagation for in-place
array assignment" is stale because the following assertion asserts the fixed
behavior (assert_np_equal(x.grad.numpy(), np.array([[6.0]], dtype=float)));
remove or update that TODO near the test in warp/tests/test_vec.py so it either
is deleted or changed to a positive note indicating gradient propagation for
in-place array assignment is now implemented; reference the test/assertion
involving x.grad.numpy() to locate the exact place to edit.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: c669072e-a798-4709-a882-a52bb97c13b3

📥 Commits

Reviewing files that changed from the base of the PR and between 496a899 and e35721b.

📒 Files selected for processing (3)
  • warp/_src/builtins.py
  • warp/_src/codegen.py
  • warp/tests/test_vec.py

Comment on lines +3300 to +3302
reads_saved = dict(adj.reads)
vec_target = adj.emit_indexing(target, array_indices)
adj.reads = reads_saved


P0 adj.reads is never initialized — AttributeError at runtime

adj.reads is accessed here, but it is never assigned in Adjoint.__init__ or anywhere else in the codebase. A grep across all of warp/_src/ confirms these two lines are the only occurrences of adj.reads / self.reads.

The first time a kernel that contains arr[i][j] = val (or arr[i,j][0] = val) is compiled, reads_saved = dict(adj.reads) will raise AttributeError: 'Adjoint' object has no attribute 'reads', preventing the kernel from building.

It looks like the intent was to save/restore whichever set or dict tracks which arrays were read (to avoid the temporary emit_indexing call for vec_target from being treated as a real read for verify_autograd_array_access). However, Warp tracks reads via Var.is_read / Var.mark_read() — not through a dict on the Adjoint instance.

The reads_saved / adj.reads = reads_saved save-restore block appears to be dead code (even if the attribute were initialised, emit_indexing doesn't write to adj.reads). Simply removing these two lines and calling emit_indexing directly should be correct:

Suggested change
-reads_saved = dict(adj.reads)
-vec_target = adj.emit_indexing(target, array_indices)
-adj.reads = reads_saved
+vec_target = adj.emit_indexing(target, array_indices)

If the goal truly is to suppress the "array read before write" warning for target, the correct approach would be to save/restore target.is_read rather than a non-existent adj.reads dict.
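For reference, the save/restore pattern suggested above (mirroring what emit_BinOp does) might look roughly like the following. `Var` here is a minimal stand-in for Warp's codegen variable, and `emit_indexing_stub` only models the side effect in question; the real Adjoint/Var API may differ:

```python
class Var:
    """Minimal stand-in for warp's codegen Var: tracks read/write access flags."""

    def __init__(self, label):
        self.label = label
        self.is_read = False
        self.is_write = False


def emit_indexing_stub(target):
    # a synthetic element load marks the target as read as a side effect
    target.is_read = True
    return Var(target.label + "[i]")


def indexed_store(target):
    # snapshot the read flag, do the synthetic load, then restore it so the
    # compiler-generated read doesn't trip verify_autograd_array_access
    saved = target.is_read
    vec_target = emit_indexing_stub(target)
    target.is_read = saved
    target.is_write = True
    return vec_target


arr = Var("arr")
indexed_store(arr)
print(arr.is_read, arr.is_write)  # False True
```

The point is that only `target.is_read` needs to be snapshotted around the synthetic load; no `adj.reads` dict is involved.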

Comment on lines +3548 to +3551
target, indices = adj.eval_subscript(lhs)
if is_reference(target.type):
make_new_assign_statement()
return


P1 Double-evaluation of subscript index expressions

adj.eval_subscript(lhs) is called on line 3548 to inspect the target type (and thus check is_reference). This call evaluates all index sub-expressions as a side effect, emitting SSA variables into the forward body. When make_new_assign_statement() is then called on line 3550, it builds a fresh ast.Assign node that contains the original lhs AST node, which is then re-evaluated in full by emit_Assign — causing every index expression (e.g., wp.tid(), array loads used as indices) to be emitted twice in the generated kernel code.

For simple index expressions like integer literals or wp.tid(), this produces redundant but semantically equivalent code. For index expressions that involve function calls with side effects or array reads, it can produce incorrectly duplicated operations.

This is a pre-existing pattern in some other make_new_assign_statement() call sites (e.g. the non_atomic_types path), but this PR introduces a new instance of it. Consider inspecting the target type without fully evaluating the subscript, or structuring the dispatch so that the full evaluation only happens in the path that actually uses it.
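One way to inspect the target without evaluating it, as suggested above, is a pure AST walk. This toy probe (not Warp's actual codegen) counts subscript depth on the LHS without executing any index expressions, so calls like `idx()` are never emitted:

```python
import ast

def subscript_depth(node):
    """Count how many Subscript levels wrap the base name, without
    evaluating any of the index expressions (so no side effects)."""
    depth = 0
    while isinstance(node, ast.Subscript):
        depth += 1
        node = node.value
    return depth

# probe the LHS of an augmented assignment purely syntactically
stmt = ast.parse("x[idx()][0] += 1").body[0]
print(subscript_depth(stmt.target))  # 2
```

A real fix would consult the resolved type of the base symbol alongside such a syntactic walk, but the key property is the same: the probe is read-only on the AST, so nothing is materialized twice.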

Comment on lines +3549 to +3568
if is_reference(target.type):
if is_array(target_type) and len(indices) > target_type.ndim:
rhs = adj.eval(node.value)

old_val = adj.emit_indexing(target, indices)
op_name = builtin_operators[type(node.op)]
new_val = adj.add_builtin_call(op_name, [old_val, rhs])

array_indices = indices[:target_type.ndim]
vec_indices = indices[target_type.ndim:]

array_indices = adj.eval_indices(target_type, array_indices)
vec_target = adj.emit_indexing(target, array_indices)
vec_target_type = strip_reference(vec_target.type)

vec_indices = adj.eval_indices(vec_target_type, vec_indices)

new_vec = adj.add_builtin_call("assign_copy", [vec_target, *vec_indices, new_val])
adj.add_builtin_call("array_store", [target, *array_indices, new_vec])
return


P1 emit_indexing with over-counted indices in AugAssign

At line 3553, adj.emit_indexing(target, indices) is called where target is an array (reference) and indices contains both array-level and vector/component-level indices — more than target_type.ndim. Inside emit_indexing, this hits the else branch (since len(indices) != target_type.ndim) and emits a view builtin call instead of a scalar address load. That is incorrect semantics for the intended "read the current scalar component, compute new value, write back" operation.

The correct approach is to split the indices first (as is done later for the write-back path at lines 3557–3567) and index step by step:

if is_array(target_type) and len(indices) > target_type.ndim:
    rhs = adj.eval(node.value)

    array_indices = indices[:target_type.ndim]
    vec_indices = indices[target_type.ndim:]

    array_indices = adj.eval_indices(target_type, array_indices)
    vec_target = adj.emit_indexing(target, array_indices)
    vec_target_type = strip_reference(vec_target.type)

    vec_indices = adj.eval_indices(vec_target_type, vec_indices)
    old_val = adj.emit_indexing(vec_target, vec_indices)

    op_name = builtin_operators[type(node.op)]
    new_val = adj.add_builtin_call(op_name, [old_val, rhs])

    new_vec = adj.add_builtin_call("assign_copy", [vec_target, *vec_indices, new_val])
    adj.add_builtin_call("array_store", [target, *array_indices, new_vec])
    return

Note: while in practice eval_subscript with allow_partial=True will have already collapsed array indices into a vector reference (making this branch unreachable for typical kernels), the explicit call to emit_indexing with combined indices would be wrong if the branch were ever reached, and the intent clearly should be to read a single component.

Signed-off-by: shivansh023023 <singhshivansh023@gmail.com>
Signed-off-by: shivansh023023 <singhshivansh023@gmail.com>
…ve stale TODO, remove undefined adj.reads

Signed-off-by: shivansh023023 <singhshivansh023@gmail.com>
Signed-off-by: shivansh023023 <singhshivansh023@gmail.com>
@shivansh023023 force-pushed the feat/autograd-inplace-assign branch from 63a8b6a to d195b7b on March 19, 2026 17:00

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
warp/tests/matrix/test_mat.py (1)

1668-1715: ⚠️ Potential issue | 🟠 Major

test_mat_array_assign remains disabled from the test registry.

This section is now active, but the test is still not executed because registration is commented at Line 3485. That leaves the in-place matrix array assignment gradient regression path uncovered in CI.

✅ Suggested fix
-# add_function_test(TestMat, "test_mat_array_assign", test_mat_array_assign, devices=devices)
+add_function_test(TestMat, "test_mat_array_assign", test_mat_array_assign, devices=devices)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@warp/tests/matrix/test_mat.py` around lines 1668 - 1715, The test function
test_mat_array_assign is defined but not registered with the test registry, so
it's not run in CI; fix this by locating the test registration block where tests
are listed and uncomment (or add) the registration entry for
test_mat_array_assign (the same symbol name) so the test is included in the
suite, then run the test runner to confirm the in-place matrix array assignment
gradient path is covered.
♻️ Duplicate comments (1)
warp/_src/codegen.py (1)

3298-3312: ⚠️ Potential issue | 🟡 Minor

Synthetic element load can still trigger a false read→write warning on Line 3300.

Line 3300 (emit_indexing(target, array_indices)) marks target as read, then Line 3312 marks it written. With verify_autograd_array_access, this can produce a spurious warning for compiler-generated preserve/read logic. This was previously flagged in this area and appears to be reintroduced in the new split-index path.

💡 Suggested patch
-                    vec_target = adj.emit_indexing(target, array_indices)
+                    if warp.config.verify_autograd_array_access:
+                        is_read_states = [arg.is_read for arg in adj.args]
+
+                    vec_target = adj.emit_indexing(target, array_indices)
                     vec_target_type = strip_reference(vec_target.type)
+
+                    if warp.config.verify_autograd_array_access:
+                        for i, arg in enumerate(adj.args):
+                            arg.is_read = is_read_states[i]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@warp/_src/codegen.py` around lines 3298 - 3312, The synthetic indexing done
in adj.emit_indexing(...) causes a spurious read mark before the later
target.mark_write(...) under verify_autograd_array_access; change the flow so
synthetic/compiler-generated element loads do not mark the original array as a
user read. Concretely, update emit_indexing (or its callers) to accept a
synthetic/preserve flag (e.g., synthetic=True) when computing vec_target from
target and array_indices, and have emit_indexing skip any call that marks the
source as read (or annotate the returned vec_target with a synthetic flag that
suppresses read→write warnings), then ensure the place that calls mark_write
(the block using adj.fun_name/adj.filename/adj.lineno and target.mark_write)
remains unchanged so true writes are still recorded.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: b9f7fb6e-c355-4b04-8ead-d3e12485060c

📥 Commits

Reviewing files that changed from the base of the PR and between 63a8b6a and d195b7b.

📒 Files selected for processing (4)
  • warp/_src/builtins.py
  • warp/_src/codegen.py
  • warp/tests/matrix/test_mat.py
  • warp/tests/test_vec.py
✅ Files skipped from review due to trivial changes (1)
  • warp/_src/builtins.py
