ci: add graphify structural impact analysis to PR review and structure audit#567
ci: add graphify structural impact analysis to PR review and structure audit#567andreatgretel wants to merge 6 commits intomainfrom
Conversation
…e audit Add a graphify-based AST analysis tool that builds a directed graph of the codebase (~2s, no LLM calls) to detect architectural impact. Integrates into both the PR review workflow (pre-computed before claude runs) and the Wednesday structure audit (with week-over-week diff). PR review: extracts changed files against the full codebase graph, reports risk level (LOW/MEDIUM/HIGH), god nodes affected, import direction violations, and cross-package dependencies. Output saved to /tmp and read by the review agent. Structure audit: produces god node rankings, cross-package edge summary table, import violation detection, and graph diff against previous week's cached graph. Baselines saved for runner memory trend tracking.
- Fix KeyError: god_nodes() returns 'degree' not 'edges' (3 call sites) - Fix deduped vs raw violation count inconsistency in baselines.json - Security: run structural_impact.py from base-branch checkout so fork PRs cannot inject code that executes with GH_TOKEN in scope - Add --repo-root flag so the tool resolves package paths correctly when invoked from a different checkout directory - Replace make install-dev + .venv with lightweight /tmp/graphify-venv (only graphifyy needed, saves ~2min CI per PR review) - Add graphify-out/ to .gitignore (9MB graph cache is CI-only)
SummaryPR #567 adds a graphify-based AST structural-impact tool and wires it into two CI workflows:
Retroactive validation against 3 merged PRs is cited in the description; signal correlation with actual critical findings looks plausible. FindingsSecurity — correctly handled, one residual concernThe critical security story (tool must run from trusted base-branch code, not PR code) is executed well:
One residual concern in CHANGED_PY=$(gh pr diff "$PR_NUMBER" --name-only | grep '\.py$' || true)
...
--changed-files $CHANGED_PY \
mapfile -t CHANGED_PY < <(gh pr diff "$PR_NUMBER" --name-only | grep '\.py$' || true)
...
--changed-files "${CHANGED_PY[@]}"This also fixes a latent bug: if Correctness
Style / project conventions
Test coverage
Performance
Minor / nits
VerdictApprove with minor follow-ups. The core design — trusted-path execution of a pure-AST tool with Recommended before merge:
Recommended after merge: No blocking issues; security model is correct; retroactive validation supports the signal claim. |
Greptile SummaryThis PR adds a
|
| Filename | Overview |
|---|---|
| .agents/tools/structural_impact.py | New 361-line AST analysis tool using graphify; implements PR (--changed-files) and full audit (--full) modes with risk scoring, god-node detection, and import-violation reporting. MEDIUM risk reason still reports cluster count even when the high-connectivity trigger fires (open prior comment). |
| .github/workflows/agentic-ci-pr-review.yml | Adds a pre-agent structural analysis step that installs graphifyy==0.4.23 in a throw-away venv and runs the tool from the base-branch checkout; security model for fork-PR isolation is correct, continue-on-error: true ensures graceful degradation. |
| .github/workflows/agentic-ci-daily.yml | Adds conditional graphifyy==0.4.23 install for the structure suite and extends the cache path to include graphify-out/ for week-over-week graph diffs; changes are minimal and correct. |
| .agents/recipes/pr-review/recipe.md | Step 1 now reads the pre-computed structural impact file if present and appends it under a ### Structural Impact heading; wording is clear and the conditional guards are correct. |
| .agents/recipes/structure/recipe.md | Adds Section 4 (graphify structural analysis) with bash snippet, god-node tracking guidance, and baseline persistence instructions; output template extended with graph summary metrics. |
| .agents/skills/review-code/SKILL.md | Adds Step 3.5 that reads the structural impact file and calibrates review depth; instructions are well-structured with clear HIGH/LOW risk guidance. |
| .gitignore | Adds graphify-out/ to gitignore to exclude the 9MB CI-only graph cache; correct and minimal. |
Sequence Diagram
sequenceDiagram
participant GH as GitHub Actions
participant Base as base-agents/ (base branch)
participant Venv as /tmp/graphify-venv
participant Tool as structural_impact.py
participant WS as github.workspace (PR code)
participant TMP as /tmp/structural-impact-{PR}.md
participant Agent as Claude Agent
GH->>Base: sparse-checkout .agents/recipes + .agents/tools
GH->>WS: checkout PR HEAD (full)
GH->>GH: gh pr diff --name-only | grep .py
GH->>Venv: python -m venv + pip install graphifyy==0.4.23
Venv->>Tool: run base-agents/.agents/tools/structural_impact.py
Note over Tool: --repo-root=WS --changed-files=[...]
Tool->>WS: _collect_source_files() → all .py in packages/
Tool->>Tool: extract(all_files) → build_from_json → G
Tool->>Tool: extract(changed_files) → changed_node_ids
Tool->>Tool: god_nodes, cluster, cross_package_edges
Tool->>TMP: write risk report (LOW/MEDIUM/HIGH)
GH->>Agent: start with recipe prompt
Agent->>TMP: read structural impact
Agent->>Agent: calibrate review depth by risk level
Agent->>GH: post review with Structural Impact section
Prompt To Fix All With AI
This is a comment left during a code review.
Path: .agents/tools/structural_impact.py
Line: 196
Comment:
**`extract(changed_files)` called independently — nodes may not exist in `G`**
`changed_node_ids` is populated by calling `extract` on only the changed files, but `G` was built from `extract(all_files)` inside `_build_graph`. Every downstream lookup — `affected_gods`, `affected_communities`, `high_impact`, `_cross_package_edges(G, changed_node_ids)` — silently produces empty results if graphify assigns IDs that are not stable across independent invocations (e.g. sequential integers scoped to a single extraction run). The analysis would always report `LOW` risk with zero findings and no error.
A more robust approach is to derive `changed_node_ids` from the graph that was already built, matching on `source_file`:
```python
changed_paths = {str(p) for p in changed_files}
changed_node_ids = {
nid for nid in G.nodes()
if G.nodes[nid].get("source_file") in changed_paths
}
```
This removes the implicit assumption on graphify's ID stability and eliminates the redundant second extraction pass.
How can I resolve this? If you propose a fix, please make it concise.Reviews (3): Last reviewed commit: "fix(ci): use array expansion for changed..." | Re-trigger Greptile
Pin graphifyy==0.4.23 in both CI workflows to prevent breaking changes from unpinned installs. Fix _dedup() label truncation at 30 chars that could merge distinct entities sharing a common prefix.
…filenames Replace unquoted $CHANGED_PY word-split with mapfile + array expansion to prevent glob expansion and correctly handle filenames containing spaces or special characters.
Derive changed_node_ids from the already-built graph by matching source_file paths instead of running a separate extraction pass. Removes implicit dependency on graphify ID stability across independent extractions. Fix MEDIUM risk reason to reflect the actual trigger (cluster spread vs high-connectivity entity) instead of always reporting cluster count.
|
Greptile encountered an error while reviewing this PR. Please reach out to support@greptile.com for assistance. |
…stale artifacts Split the workflow step to isolate GH_TOKEN from graphifyy execution, preventing a compromised package release from exfiltrating write-scoped tokens. Scan both edge directions in _cross_package_edges so inbound dependents and violations where the changed node is the target are visible. Detect deleted files and report them as a risk signal. Include relation type in dedup key so distinct edge types between the same labels are not collapsed. Clean stale /tmp artifacts before running analysis to prevent reruns from reading old reports.
📋 Summary
Add a graphify-based AST analysis tool that builds a
directed graph of the codebase (~6s, zero LLM cost) to surface architectural risk before the
review agent starts. Integrates into both the PR review workflow (pre-computed context for the
agent) and the Wednesday structure audit (week-over-week graph diff).
Hat tip to @nabinchha for suggesting graphify as the right tool for this.
🔗 Related Issue
Closes #472 (Phase 4 — structural analysis)
🔄 Changes
✨ Added
structural_impact.py—standalone analysis tool with two modes:
--changed-files(PR review): extracts changed AST entities against the full codebase graph,reports risk level (LOW/MEDIUM/HIGH), god nodes affected, import direction violations, and
cross-package dependencies
--full(structure audit): god node rankings, cross-package edge summary table, importviolation detection, and graph diff against the previous week's cached graph
review-codeskill —agent reads the pre-computed structural impact to calibrate review depth (HIGH risk = extra
scrutiny on blast radius and backward compatibility)
graphify-out/added to.gitignore(9MB graph cache is CI-only, persisted via actions/cache)🔧 Changed
agentic-ci-pr-review.yml—new step installs graphifyy in a lightweight temp venv and runs structural analysis before the
agent starts. Tool is read from the base-branch checkout (
base-agents/) so fork PRs cannotinject code that runs with
GH_TOKENin scope. Base-branch checkout expanded from.agents/recipesto also include.agents/toolsagentic-ci-daily.yml—conditional graphifyy install for the structure suite;
graphify-out/added to the cache pathfor week-over-week graph diffs
pr-reviewrecipe —agent reads
/tmp/structural-impact-{{pr_number}}.mdand appends findings to the reviewstructurerecipe —new Section 4 with graphify instructions, baseline tracking, and week-over-week diff guidance
🔍 Attention Areas
agentic-ci-pr-review.yml—the security model:
structural_impact.pyruns frombase-agents/(base-branch sparsecheckout), not from the PR's files, to prevent fork PRs from injecting code that executes
with secrets in scope. The
--repo-rootflag ensures the tool still analyzes the PR'ssource code correctly despite living in a different checkout
structural_impact.py—new 361-line tool; pure AST analysis, no DD imports at runtime. Depends on
graphifyy(tree-sitter based) andnetworkx(transitive)📊 Retroactive validation on 3 merged PRs
Ran graphify retroactively against PRs #502, #545, and #557 to validate the signal quality:
DatasetBuilder(88 deps) +ExecutionGraph(89 deps) flagged as highest-riskconfig.__getattr__()calls into engineIn all 3 cases, the most critical review findings correlated with the highest-connectivity
entities in the graphify report. The risk level (LOW/MEDIUM/HIGH) matched actual review
effort needed.
🧪 Testing
structural_impact.pysmoke-tested in both modes (--changed-files,--full)against the live DD codebase
--repo-rootflag tested from external checkout directory (simulates CIbase-agents/path)graphifyy==0.4.23✅ Checklist
.agents/tooling, not package architecture