ci: add graphify structural impact analysis to PR review and structure audit by andreatgretel · Pull Request #567 · NVIDIA-NeMo/DataDesigner

andreatgretel · 2026-04-21T20:42:23Z

📋 Summary

Add a graphify-based AST analysis tool that builds a
directed graph of the codebase (~6s, zero LLM cost) to surface architectural risk before the
review agent starts. Integrates into both the PR review workflow (pre-computed context for the
agent) and the Wednesday structure audit (week-over-week graph diff).

Hat tip to @nabinchha for suggesting graphify as the right tool for this.

🔗 Related Issue

Closes #472 (Phase 4 — structural analysis)

🔄 Changes

✨ Added

structural_impact.py —
standalone analysis tool with two modes:
- --changed-files (PR review): extracts changed AST entities against the full codebase graph,
  reports risk level (LOW/MEDIUM/HIGH), god nodes affected, import direction violations, and
  cross-package dependencies
- --full (structure audit): god node rankings, cross-package edge summary table, import
  violation detection, and graph diff against the previous week's cached graph
Step 3.5 in the
review-code skill —
agent reads the pre-computed structural impact to calibrate review depth (HIGH risk = extra
scrutiny on blast radius and backward compatibility)
graphify-out/ added to .gitignore (9MB graph cache is CI-only, persisted via actions/cache)

🔧 Changed

agentic-ci-pr-review.yml —
new step installs graphifyy in a lightweight temp venv and runs structural analysis before the
agent starts. Tool is read from the base-branch checkout (base-agents/) so fork PRs cannot
inject code that runs with GH_TOKEN in scope. Base-branch checkout expanded from
.agents/recipes to also include .agents/tools
agentic-ci-daily.yml —
conditional graphifyy install for the structure suite; graphify-out/ added to the cache path
for week-over-week graph diffs
pr-review recipe —
agent reads /tmp/structural-impact-{{pr_number}}.md and appends findings to the review
structure recipe —
new Section 4 with graphify instructions, baseline tracking, and week-over-week diff guidance

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

agentic-ci-pr-review.yml —
the security model: structural_impact.py runs from base-agents/ (base-branch sparse
checkout), not from the PR's files, to prevent fork PRs from injecting code that executes
with secrets in scope. The --repo-root flag ensures the tool still analyzes the PR's
source code correctly despite living in a different checkout
structural_impact.py —
new 361-line tool; pure AST analysis, no DD imports at runtime. Depends on
graphifyy (tree-sitter based) and networkx (transitive)

📊 Retroactive validation on 3 merged PRs

Ran graphify retroactively against PRs #502, #545, and #557 to validate the signal quality:

PR	Graphify risk	Key signal	Actual critical findings
#502 (skip.when, 22 files)	HIGH — 5 god nodes, 9 violations, 12/73 clusters	`DatasetBuilder` (88 deps) + `ExecutionGraph` (89 deps) flagged as highest-risk	allow_resize propagation bug and positional merge divergence — both in the flagged entities
#557 (jinja engine, 20 files)	HIGH — config→engine violation, 52-dep env class	`config.__getattr__()` calls into engine	jsonpath filter bug at exactly that cross-package seam
#545 (async bridge, 6 files)	MEDIUM — 9 clusters for 6 files	Error hierarchy connectivity (52 deps)	Fragile error string matching — in the flagged error class

In all 3 cases, the most critical review findings correlated with the highest-connectivity
entities in the graphify report. The risk level (LOW/MEDIUM/HIGH) matched actual review
effort needed.

🧪 Testing

structural_impact.py smoke-tested in both modes (--changed-files, --full)
against the live DD codebase
--repo-root flag tested from external checkout directory (simulates CI base-agents/ path)
graphify API signatures verified against graphifyy==0.4.23
Pre-commit hooks pass (ruff, yaml, merge conflict check)
Unit tests added/updated — N/A: CI tool, no testable library code
E2E tests added/updated — validated retroactively against 3 merged PRs (see above)

✅ Checklist

Follows commit message conventions
Commits are signed off (DCO)
Architecture docs updated — N/A: .agents/ tooling, not package architecture

…e audit Add a graphify-based AST analysis tool that builds a directed graph of the codebase (~2s, no LLM calls) to detect architectural impact. Integrates into both the PR review workflow (pre-computed before claude runs) and the Wednesday structure audit (with week-over-week diff). PR review: extracts changed files against the full codebase graph, reports risk level (LOW/MEDIUM/HIGH), god nodes affected, import direction violations, and cross-package dependencies. Output saved to /tmp and read by the review agent. Structure audit: produces god node rankings, cross-package edge summary table, import violation detection, and graph diff against previous week's cached graph. Baselines saved for runner memory trend tracking.

- Fix KeyError: god_nodes() returns 'degree' not 'edges' (3 call sites) - Fix deduped vs raw violation count inconsistency in baselines.json - Security: run structural_impact.py from base-branch checkout so fork PRs cannot inject code that executes with GH_TOKEN in scope - Add --repo-root flag so the tool resolves package paths correctly when invoked from a different checkout directory - Replace make install-dev + .venv with lightweight /tmp/graphify-venv (only graphifyy needed, saves ~2min CI per PR review) - Add graphify-out/ to .gitignore (9MB graph cache is CI-only)

github-actions · 2026-04-21T20:44:11Z

Summary

PR #567 adds a graphify-based AST structural-impact tool and wires it into two CI workflows:

agentic-ci-pr-review.yml: a pre-review step installs graphifyy into a temp venv, runs the new .agents/tools/structural_impact.py against changed Python files, and writes /tmp/structural-impact-{PR}.md for the review agent to consume.
agentic-ci-daily.yml (structure suite): conditionally installs graphify and caches graphify-out/ for week-over-week graph diffs.
New tool (.agents/tools/structural_impact.py, 361 LOC): builds a directed AST graph of the three DD packages, flags god nodes, import-direction violations (interface → engine → config), cross-package edges, and emits risk levels (LOW/MEDIUM/HIGH).
Recipe/skill updates: pr-review recipe reads the pre-computed report and appends it to the review; structure recipe adds a Section 4 with baseline tracking; review-code skill adds Step 3.5 describing how to calibrate review depth based on risk.

Retroactive validation against 3 merged PRs is cited in the description; signal correlation with actual critical findings looks plausible.

Findings

Security — correctly handled, one residual concern

The critical security story (tool must run from trusted base-branch code, not PR code) is executed well:

Sparse checkout expanded from .agents/recipes to also include .agents/tools; the script is invoked from base-agents/.agents/tools/structural_impact.py, not from the PR checkout.
--repo-root "${{ github.workspace }}" cleanly separates trusted tool location from untrusted analysis target.
continue-on-error: true on the structural step means a tool regression or graphify pin drift will not block PR review.

One residual concern in agentic-ci-pr-review.yml:

CHANGED_PY=$(gh pr diff "$PR_NUMBER" --name-only | grep '\.py$' || true)
...
--changed-files $CHANGED_PY \

$CHANGED_PY is deliberately unquoted to word-split into multiple args, but that also subjects the values to IFS word-splitting and glob expansion. A PR introducing a .py file whose name contains whitespace or glob characters would produce incorrect arg boundaries. Git allows such filenames. Low risk (the tool treats each path as a file and filters by Path.exists()/.suffix), but consider reading into an array:

mapfile -t CHANGED_PY < <(gh pr diff "$PR_NUMBER" --name-only | grep '\.py$' || true)
...
--changed-files "${CHANGED_PY[@]}"

This also fixes a latent bug: if CHANGED_PY is only newlines (no matches), the unquoted expansion produces zero args and nargs="+" will error — but the early if [ -z "$CHANGED_PY" ] guard covers that case today.

Correctness

_dedup key truncation (structural_impact.py:60): dedup key uses label[:30] on each side. Two distinct entities that share the first 30 characters of their label would be merged. Probably rare in practice but not obviously intended — prefer the full label, or document the truncation as intentional.
pip install ... | tail -3 (both workflows): piping suppresses the pip exit code under the default set +o pipefail. A failed install would proceed to run structural_impact.py, which would then ImportError on graphify. continue-on-error catches this, but the log would be confusing. Either set pipefail for these steps or drop | tail -3.
Global _REPO_ROOT mutation (structural_impact.py:21, 308): the global _REPO_ROOT pattern works but is fragile — _PACKAGE_SUBDIRS is resolved at import time against the default, while _collect_source_files reads _REPO_ROOT at call time. If anyone later changes _PACKAGE_SUBDIRS to absolute paths, the override breaks silently. Pass repo_root explicitly through the call chain, or evaluate subdirs inside the function.
_get_package interface detection (structural_impact.py:53-57): matches on the literal directory segment data-designer followed by src. This correctly skips data-designer-engine / data-designer-config because the earlier substring checks win, but the fallthrough logic is subtle. A brief comment explaining the ordering dependency would save a future reader.
NetworkX API compatibility (structural_impact.py:273-277): the try: node_link_graph(..., edges="links") / except TypeError: node_link_graph(...) fallback covers a real API change across NetworkX versions. Good defensive coding; worth a one-line comment referencing the NetworkX version boundary so it isn't mistaken for dead code later.

Style / project conventions

Missing type annotations: project requires full annotations (AGENTS.md). _build_graph, _cross_package_edges, _changed_files_mode, _full_mode, main are missing return types on some paths; _build_graph returns dict which hides its structure — consider a TypedDict or a small @dataclass. The returned G is a NetworkX DiGraph; typing it as dict defeats downstream type checking.
from __future__ import annotations: present at top of file. Good.
Absolute imports only: satisfied.
SPDX header: present.
Lazy heavy imports (AGENTS.md "Fast imports" invariant): graphify.* and networkx are imported at module top-level. For a standalone CI tool invoked directly this is fine — the invariant is about keeping data_designer import time low, and this file is in .agents/tools/, not in the package. Worth confirming the tool isn't auto-imported from anywhere in data_designer.

Test coverage

The checklist marks unit tests as "N/A: CI tool, no testable library code." This is partially true but several pure helpers are clearly testable and carry real logic: _get_package, _dedup, the risk-level decision tree in _changed_files_mode, and the direction-violation classifier in _cross_package_edges. A small test module (even a single pytest file in .agents/tools/tests/) would protect against silent regressions — especially since graphify is pinned loosely (graphifyy, no version) and API drift is the most likely failure mode.
Retroactive validation on 3 merged PRs is good signal-quality evidence, but it isn't a regression guard.

Performance

~6s per PR and ~2s for full audit (per description) against a small codebase is acceptable.
The full graph is built even for PR review (only changed files would not be enough to compute cross-package context). That is the correct trade-off.
graphify-out/ at ~9MB adds to the cache footprint but only for the structure suite. Fine.

Minor / nits

agentic-ci-pr-review.yml:166: pip install graphifyy --quiet with no version pin. Recommend pinning (e.g., graphifyy==0.4.23, the version the PR says was tested against) to match the existing graphify API signatures verified against graphifyy==0.4.23 claim in the description. Same for the daily workflow.
structural_impact.py:9-12: docstring usage examples are helpful; consider adding --full --previous-graph ... example too.
_fmt_gods mixes bullet-list output (affected) and table output (full) in one function via a flag — two small formatters would read more cleanly, but this is purely cosmetic.
Recipe change in pr-review/recipe.md asks the agent to append structural impact before the Verdict section. The skill's Step 3.5 also tells the agent to use it to calibrate review depth. These two instructions are compatible but slightly duplicative — confirm the agent applies both (use during review, then append the raw report). Not a blocker.

Verdict

Approve with minor follow-ups. The core design — trusted-path execution of a pure-AST tool with --repo-root pointing at the untrusted checkout — is sound, the fallback story (continue-on-error) is correct, and the integration into both review and audit surfaces is clean.

Recommended before merge:

Quote/array-expand $CHANGED_PY in the workflow (latent filename-handling bug).
Pin graphifyy version in both pip install steps.
Either set pipefail or drop | tail -3 around pip install so install failures surface cleanly.

Recommended after merge:
4. Add a minimal pytest for _get_package, _dedup, and the risk-level branches in _changed_files_mode — graphify API drift is the most likely future break.
5. Replace the dict return type of _build_graph with a TypedDict/dataclass for proper typing.

No blocking issues; security model is correct; retroactive validation supports the signal claim.

greptile-apps · 2026-04-21T20:45:11Z

Greptile Summary

This PR adds a graphify-based structural impact analysis tool that builds a directed AST graph of the codebase and surfaces architectural risk (LOW/MEDIUM/HIGH) before the review agent runs, integrating into both the PR-review workflow and the weekly structure audit. The security model is sound — structural_impact.py always executes from the base-branch checkout so fork PRs cannot inject code that runs with secrets in scope.

P1 — silent empty analysis: _changed_files_mode calls extract(changed_files) independently from the full graph build to obtain changed_node_ids. If graphify's node IDs are not stable across separate invocations, every downstream lookup (affected_gods, affected_communities, cross-package edges) silently returns empty, and all PRs report LOW risk with zero findings. Deriving the ID set from G.nodes() filtered by source_file removes this implicit dependency.

Confidence Score: 4/5

Safe to merge after confirming the changed_node_ids derivation is robust across graphify invocations.

All three previously raised issues (version pinning, _dedup truncation, MEDIUM risk reason) were addressed. One new P1 remains: the correctness of the analysis depends on graphify producing stable node IDs across independent extract() calls. If that assumption holds, the tool is functionally correct as validated on three retroactive PRs. The fix is a small, self-contained change and should be confirmed before merge.

structural_impact.py line 196 — changed_node_ids derivation via separate extract call.

Important Files Changed

Filename	Overview
.agents/tools/structural_impact.py	New 361-line AST analysis tool using graphify; implements PR (--changed-files) and full audit (--full) modes with risk scoring, god-node detection, and import-violation reporting. MEDIUM risk reason still reports cluster count even when the high-connectivity trigger fires (open prior comment).
.github/workflows/agentic-ci-pr-review.yml	Adds a pre-agent structural analysis step that installs graphifyy==0.4.23 in a throw-away venv and runs the tool from the base-branch checkout; security model for fork-PR isolation is correct, continue-on-error: true ensures graceful degradation.
.github/workflows/agentic-ci-daily.yml	Adds conditional graphifyy==0.4.23 install for the structure suite and extends the cache path to include graphify-out/ for week-over-week graph diffs; changes are minimal and correct.
.agents/recipes/pr-review/recipe.md	Step 1 now reads the pre-computed structural impact file if present and appends it under a ### Structural Impact heading; wording is clear and the conditional guards are correct.
.agents/recipes/structure/recipe.md	Adds Section 4 (graphify structural analysis) with bash snippet, god-node tracking guidance, and baseline persistence instructions; output template extended with graph summary metrics.
.agents/skills/review-code/SKILL.md	Adds Step 3.5 that reads the structural impact file and calibrates review depth; instructions are well-structured with clear HIGH/LOW risk guidance.
.gitignore	Adds graphify-out/ to gitignore to exclude the 9MB CI-only graph cache; correct and minimal.

Sequence Diagram

sequenceDiagram
    participant GH as GitHub Actions
    participant Base as base-agents/ (base branch)
    participant Venv as /tmp/graphify-venv
    participant Tool as structural_impact.py
    participant WS as github.workspace (PR code)
    participant TMP as /tmp/structural-impact-{PR}.md
    participant Agent as Claude Agent

    GH->>Base: sparse-checkout .agents/recipes + .agents/tools
    GH->>WS: checkout PR HEAD (full)
    GH->>GH: gh pr diff --name-only | grep .py
    GH->>Venv: python -m venv + pip install graphifyy==0.4.23
    Venv->>Tool: run base-agents/.agents/tools/structural_impact.py
    Note over Tool: --repo-root=WS --changed-files=[...]
    Tool->>WS: _collect_source_files() → all .py in packages/
    Tool->>Tool: extract(all_files) → build_from_json → G
    Tool->>Tool: extract(changed_files) → changed_node_ids
    Tool->>Tool: god_nodes, cluster, cross_package_edges
    Tool->>TMP: write risk report (LOW/MEDIUM/HIGH)
    GH->>Agent: start with recipe prompt
    Agent->>TMP: read structural impact
    Agent->>Agent: calibrate review depth by risk level
    Agent->>GH: post review with Structural Impact section

Prompt To Fix All With AI

This is a comment left during a code review.
Path: .agents/tools/structural_impact.py
Line: 196

Comment:
**`extract(changed_files)` called independently — nodes may not exist in `G`**

`changed_node_ids` is populated by calling `extract` on only the changed files, but `G` was built from `extract(all_files)` inside `_build_graph`. Every downstream lookup — `affected_gods`, `affected_communities`, `high_impact`, `_cross_package_edges(G, changed_node_ids)` — silently produces empty results if graphify assigns IDs that are not stable across independent invocations (e.g. sequential integers scoped to a single extraction run). The analysis would always report `LOW` risk with zero findings and no error.

A more robust approach is to derive `changed_node_ids` from the graph that was already built, matching on `source_file`:

```python
changed_paths = {str(p) for p in changed_files}
changed_node_ids = {
    nid for nid in G.nodes()
    if G.nodes[nid].get("source_file") in changed_paths
}
```

This removes the implicit assumption on graphify's ID stability and eliminates the redundant second extraction pass.

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (3): Last reviewed commit: "fix(ci): use array expansion for changed..." | Re-trigger Greptile}

Pin graphifyy==0.4.23 in both CI workflows to prevent breaking changes from unpinned installs. Fix _dedup() label truncation at 30 chars that could merge distinct entities sharing a common prefix.

…filenames Replace unquoted $CHANGED_PY word-split with mapfile + array expansion to prevent glob expansion and correctly handle filenames containing spaces or special characters.

Derive changed_node_ids from the already-built graph by matching source_file paths instead of running a separate extraction pass. Removes implicit dependency on graphify ID stability across independent extractions. Fix MEDIUM risk reason to reflect the actual trigger (cluster spread vs high-connectivity entity) instead of always reporting cluster count.

greptile-apps · 2026-04-21T21:16:52Z

Greptile encountered an error while reviewing this PR. Please reach out to support@greptile.com for assistance.

…stale artifacts Split the workflow step to isolate GH_TOKEN from graphifyy execution, preventing a compromised package release from exfiltrating write-scoped tokens. Scan both edge directions in _cross_package_edges so inbound dependents and violations where the changed node is the target are visible. Detect deleted files and report them as a risk signal. Include relation type in dedup key so distinct edge types between the same labels are not collapsed. Clean stale /tmp artifacts before running analysis to prevent reruns from reading old reports.

andreatgretel added 2 commits April 21, 2026 20:14

andreatgretel requested a review from a team as a code owner April 21, 2026 20:42

andreatgretel temporarily deployed to agentic-ci April 21, 2026 20:42 — with GitHub Actions Inactive

greptile-apps Bot reviewed Apr 21, 2026

View reviewed changes

Comment thread .github/workflows/agentic-ci-pr-review.yml Outdated

Comment thread .github/workflows/agentic-ci-daily.yml

Comment thread .agents/tools/structural_impact.py Outdated

fix: pin graphifyy version and fix dedup truncation

bfc5db6

Pin graphifyy==0.4.23 in both CI workflows to prevent breaking changes from unpinned installs. Fix _dedup() label truncation at 30 chars that could merge distinct entities sharing a common prefix.

greptile-apps Bot reviewed Apr 21, 2026

View reviewed changes

Comment thread .agents/tools/structural_impact.py Outdated

fix(ci): use array expansion for changed-files arg to handle special …

913cb28

…filenames Replace unquoted $CHANGED_PY word-split with mapfile + array expansion to prevent glob expansion and correctly handle filenames containing spaces or special characters.

greptile-apps Bot reviewed Apr 21, 2026

View reviewed changes

Comment thread .agents/tools/structural_impact.py Outdated

github-actions Bot mentioned this pull request Apr 27, 2026

Agentic CI: Issue & PR Triage Tracker #562

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ci: add graphify structural impact analysis to PR review and structure audit#567

ci: add graphify structural impact analysis to PR review and structure audit#567
andreatgretel wants to merge 6 commits intomainfrom
andreatgretel/feat/graphify-structural-rebase

andreatgretel commented Apr 21, 2026

Uh oh!

github-actions Bot commented Apr 21, 2026

Uh oh!

greptile-apps Bot commented Apr 21, 2026 •

edited

Loading

Confidence Score: 4/5

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

greptile-apps Bot commented Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

andreatgretel commented Apr 21, 2026

📋 Summary

🔗 Related Issue

🔄 Changes

✨ Added

🔧 Changed

🔍 Attention Areas

📊 Retroactive validation on 3 merged PRs

🧪 Testing

✅ Checklist

Uh oh!

github-actions Bot commented Apr 21, 2026

Summary

Findings

Security — correctly handled, one residual concern

Correctness

Style / project conventions

Test coverage

Performance

Minor / nits

Verdict

Uh oh!

greptile-apps Bot commented Apr 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

greptile-apps Bot commented Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

greptile-apps Bot commented Apr 21, 2026 •

edited

Loading