Add graphify skill, Graph Researcher agent, and graphify instructions.#1518
Add graphify skill, Graph Researcher agent, and graphify instructions.#1518TechPreacher wants to merge 5 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Adds an experimental “Graph Research” workflow to the experimental collection, including a graphify skill (wrapping the upstream graphifyy package), a Graph Researcher agent that queries the graph via MCP tools, and supporting instructions/docs/collection + plugin metadata updates.
Changes:
- Introduces the
graphifyskill with a Python subprocess wrapper, uv/pyproject config, and pytest + fuzz harness tests. - Adds the
Graph Researcheragent andgraphify-out/**instructions for evidence-tagged reporting conventions. - Updates
experimentalandhve-core-allcollection manifests/docs and regenerates plugin README metadata; adds a new docs section underdocs/agents/graphify/.
Reviewed changes
Copilot reviewed 33 out of 36 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| plugins/hve-core-all/README.md | Generated plugin index updated to include graph-researcher agent, graphify skill, and graphify instructions |
| plugins/experimental/README.md | Generated plugin index updated for the experimental collection additions |
| docs/agents/graphify/category.json | Adds a new Docusaurus sidebar category for Graph Research docs |
| docs/agents/graphify/README.md | New documentation page describing the Graph Research workflow and setup |
| docs/agents/README.md | Adds “Graph Research” section to the agents catalog |
| collections/hve-core-all.collection.yml | Adds new experimental agent/instruction/skill to the all-up collection manifest |
| collections/hve-core-all.collection.md | Documents the new agent/instruction/skill in the all-up collection description |
| collections/experimental.collection.yml | Adds new agent/instruction/skill entries to the experimental manifest |
| collections/experimental.collection.md | Documents the new agent/instruction/skill in the experimental collection description |
| CLAUDE.md | Adds repo guidance for Claude Code sessions (project structure, commands, conventions) |
| .vscode/settings.json | Updates Copilot Chat skill discovery locations (currently has issues) |
| .markdownlint-cli2.jsonc | Ignores .venv/ folders for markdownlint scanning |
| .github/skills/experimental/graphify/uv.lock | Adds locked Python dependency resolution for the new skill |
| .github/skills/experimental/graphify/tests/test_graphify_wrapper.py | Adds pytest coverage for the CLI wrapper behavior |
| .github/skills/experimental/graphify/tests/fuzz_harness.py | Adds a fuzz harness for wrapper validation (currently has issues) |
| .github/skills/experimental/graphify/scripts/graphify_wrapper.py | Implements a thin subprocess wrapper around the graphify CLI |
| .github/skills/experimental/graphify/pyproject.toml | Defines Python project config, dependency groups, ruff/pytest config |
| .github/skills/experimental/graphify/SKILL.md | Skill documentation for building graphs + MCP server registration + operational notes |
| .github/instructions/experimental/graphify.instructions.md | Adds instructions governing graphify-out/** output handling and evidence reporting |
| .github/agents/experimental/graph-researcher.agent.md | Adds agent behavior for MCP tool selection and audit-tagged reporting |
| .cspell.json | Adds allowlist entries for graphify-related terminology |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1518 +/- ##
==========================================
- Coverage 85.46% 83.61% -1.85%
==========================================
Files 82 60 -22
Lines 11802 7503 -4299
==========================================
- Hits 10087 6274 -3813
+ Misses 1715 1229 -486
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
07ee4b4 to
15ffabb
Compare
bindsi
left a comment
There was a problem hiding this comment.
Thanks for adding graphify — the skill, agent, and instructions are well-structured and the upstream attribution / pinning discipline is solid. A few items to address before merge:
High priority
CLAUDE.mdat the repo root and the unrelatedchat.agentSkillsLocationsadditions in.vscode/settings.jsonare out of scope for a PR titled "Add graphify skill, Graph Researcher agent, and graphify instructions." Recommend splitting them into their own PR(s) so each can be evaluated independently —CLAUDE.mdat the root ofmicrosoft/hve-corein particular feels like a separate governance decision.- The fuzz harness's docstring promises it never spawns a subprocess, but the implementation will spawn
graphify .for valid modes if the CLI is onPATH. In deep mode that uploads cwd contents to Claude. Easy fix; details inline. - The SKILL lists 7
mcp_graphify_*tools; upstreamgraphifyydocuments 4. Please verify the extras exist in pinned0.5.4or trim the table — otherwise the agent's tool-mapping is broken by construction.
Lower priority
pyproject.tomlrequires Python 3.11 whileSKILL.mdand upstream both say 3.10+.graphify_wrapper.pyhas no in-tree consumer — either point at it from SKILL.md or drop it.Graph ResearcherPhase 1 prescribes an MCP-tool-enumeration check Copilot Chat can't actually perform; reword as reactive.
PR is also currently behind main and will need a rebase. Nothing here blocks the direction of the change — it's a useful addition under experimental — just want the scope and the fuzz/docs details tightened up.
| "servers": { | ||
| "graphify": { | ||
| "command": "python3", | ||
| "args": ["-m", "graphify.serve", "graphify-out/graph.json"], | ||
| "type": "stdio" | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| Reload the VS Code window. Copilot Chat surfaces these tools (names follow GHCC's `mcp_<server>_<tool>` convention): | ||
|
|
||
| | Tool | Purpose | |
There was a problem hiding this comment.
graphifyy==0.5.4
This table lists 7 tools (query_graph, get_node, get_neighbors, get_community, god_nodes, graph_stats, shortest_path). The upstream graphifyy README on PyPI documents only 4: query_graph, get_node, get_neighbors, shortest_path.
If get_community, god_nodes, and graph_stats are not actually exposed by the pinned 0.5.4 MCP server, the Graph Researcher agent's Phase 2 tool-mapping table will keep selecting tools that produce "tool not found", which directly violates the agent's stated quality contract ("name the MCP tool(s) used and the node IDs touched").
Suggested resolution: Verify against the 0.5.4 source/changelog, not the latest upstream. Either link evidence that these four extras existed in 0.5.4, or trim the SKILL table and the corresponding rows in graph-researcher.agent.md to fall back to query_graph for those question shapes.
| @@ -0,0 +1,128 @@ | |||
| --- | |||
There was a problem hiding this comment.
CLAUDE.md at the Microsoft repo root
This file (128 lines, largely overlapping .github/copilot-instructions.md) is added at the repo root but is not mentioned in the PR description or "Type of Change" checklist. The PR is scoped to "graphify skill, Graph Researcher agent, and graphify instructions"; introducing a tool-specific guidance file at the root of microsoft/hve-core is a separate governance decision and should land in its own commit/PR so it can be evaluated and reverted independently of the graphify feature.
Suggested resolution: Split into a separate PR (docs: add CLAUDE.md guidance for Claude Code sessions) and either link the prior governance discussion that authorized it, or drop it from this PR.
| ".agents/skills": true, | ||
| ".claude/skills": true, | ||
| ".copilot/skills": true, | ||
| ".github/skills/shared": true, | ||
| ".github/skills/coding-standards": true, | ||
| ".github/skills/experimental": true, | ||
| ".github/skills/github": true, | ||
| ".github/skills/gitlab": true, | ||
| ".github/skills/jira": true, | ||
| ".github/skills/security": true, |
There was a problem hiding this comment.
chat.agentSkillsLocations additions
The .agents/skills, .claude/skills, and .copilot/skills entries are not used by anything else in this PR and are unrelated to graphify. Riding unrelated workspace-config changes inside a feature PR makes rollback/bisection harder.
Suggested resolution: Move into a separate chore(settings): broaden agent skills locations PR, or call them out explicitly in this PR's description so reviewers can opt in deliberately.
Also note the trailing comma on line 79 ("security": true,) which is now dangling after the trailing entry was removed.
|
|
||
| ## Required Phases | ||
|
|
||
| ### Phase 1: Verify the graph is available | ||
|
|
||
| Before answering any question: | ||
|
|
||
| 1. Confirm `graphify-out/graph.json` exists at the workspace root. | ||
| 2. Confirm at least one `mcp_graphify_*` tool is available in the current Copilot Chat session. | ||
| 3. If either check fails, stop and report exactly one of: | ||
| * "No `graphify-out/graph.json` found. Build the graph first: `graphify . --mode standard --update`." | ||
| * "Graphify MCP server not registered. Add the snippet from the [graphify skill Quick Start](../../skills/experimental/graphify/SKILL.md#quick-start) to `.vscode/mcp.json` and reload the window." | ||
|
|
There was a problem hiding this comment.
💡 Phase 1 step 2 prescribes a check the agent cannot actually perform
Confirm at least one
mcp_graphify_*tool is available in the current Copilot Chat session.
There is no portable way for an agent to enumerate registered MCP tools from inside Copilot Chat. The agent can call a tool and observe a "tool not found" failure, but it cannot pre-flight check tool registration. As written, this step encourages the agent to fabricate a confirmation.
Suggested resolution: Reword as a reactive check, e.g.:
- If a subsequent
mcp_graphify_*call fails with "tool not found", stop and surface the MCP-server-not-registered remediation (the second bullet in step 3).
That matches what the agent can actually observe.
| fdp = atheris.FuzzedDataProvider(data) | ||
| mode = fdp.ConsumeUnicodeNoSurrogates(16) | ||
| else: | ||
| mode = data.decode("utf-8", errors="ignore")[:16] | ||
| with suppress(ValueError): | ||
| build_graph(Path("."), mode=mode, update=False) | ||
|
|
||
|
|
||
| def test_fuzz_build_graph_mode() -> None: | ||
| """Pytest entry point — feeds a small fixed corpus through the harness.""" |
There was a problem hiding this comment.
🔒 Fuzz harness can invoke subprocess.run against the host
The docstring promises:
The wrapper must reject unknown modes with
ValueErrorwithout ever invoking subprocess.
But only ValueError is suppressed. When the fuzzer (or the pytest seed corpus that includes b"fast", b"deep") hits a valid mode, build_graph clears mode validation, calls graphify_executable(), and reaches subprocess.run([..., "."]):
- If
graphifyis not onPATH,GraphifyNotInstalledErroris raised — not suppressed — sotest_fuzz_build_graph_modefails with an unexpected exception. - If
graphifyis installed (developer machines, future CI images), the harness will actually executegraphify .against the current working directory. With--mode standard/deepandANTHROPIC_API_KEYset, this would upload cwd contents to the Claude API — a real safety risk for a fuzz harness.
Suggested fix (short-circuit to the validation surface only):
def fuzz_build_graph_mode(data: bytes) -> None:
"""Fuzz build_graph mode validation. The wrapper must reject unknown modes
with ValueError without ever invoking subprocess."""
if FUZZING:
fdp = atheris.FuzzedDataProvider(data)
mode = fdp.ConsumeUnicodeNoSurrogates(16)
else:
mode = data.decode("utf-8", errors="ignore")[:16]
if mode in {"fast", "standard", "deep"}:
return # don't spawn the CLI from a fuzz harness
with suppress(ValueError):
build_graph(Path("."), mode=mode, update=False)Or, if exercising the happy path under fuzzing is intended, monkey-patch graphify_wrapper.shutil.which and graphify_wrapper.subprocess.run the way test_graphify_wrapper.py does.
| [project] | ||
| name = "graphify-skill" | ||
| version = "0.0.0" | ||
| requires-python = ">=3.11" |
There was a problem hiding this comment.
💡 Python pin disagrees with skill compatibility statement
requires-python = ">=3.11" and target-version = "py311" here, but SKILL.md frontmatter says "Requires Python 3.10+" and upstream graphifyy itself only requires Python 3.10+. A 3.10 user will install graphifyy successfully and then fail to uv sync this skill. The wrapper code uses no 3.11-only syntax that I can see.
Suggested resolution: Lower the floor to >=3.10 and target-version = "py310", or update SKILL.md compatibility to "Requires Python 3.11+" with a one-line rationale.
| # Copyright (c) Microsoft Corporation. | ||
| # SPDX-License-Identifier: MIT | ||
| """Thin wrapper around the upstream `graphify` CLI. | ||
|
|
||
| This skill orchestrates the upstream graphifyy package; it does not reimplement | ||
| graph construction. The wrapper exists so the skill exposes a stable subprocess | ||
| entry point and so tests can mock the CLI boundary. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import shutil | ||
| import subprocess | ||
| from pathlib import Path | ||
|
|
||
|
|
||
| class GraphifyNotInstalledError(RuntimeError): | ||
| """Raised when the `graphify` CLI binary cannot be located on PATH.""" | ||
|
|
||
|
|
||
| def graphify_executable() -> str: | ||
| """Return the absolute path to the `graphify` CLI or raise.""" | ||
| path = shutil.which("graphify") | ||
| if path is None: | ||
| raise GraphifyNotInstalledError( | ||
| "graphify CLI not found on PATH. Install with `pip install graphifyy`." | ||
| ) | ||
| return path | ||
|
|
||
|
|
||
| def build_graph(target: Path, mode: str = "standard", update: bool = True) -> int: | ||
| """Invoke `graphify <target> --mode <mode> [--update]` and return its exit code.""" | ||
| if mode not in {"fast", "standard", "deep"}: | ||
| raise ValueError(f"unsupported mode: {mode!r}") | ||
| cmd = [graphify_executable(), str(target), "--mode", mode] | ||
| if update: | ||
| cmd.append("--update") | ||
| return subprocess.run(cmd, check=False).returncode |
There was a problem hiding this comment.
💡 Wrapper has no in-tree consumer
Nothing in this PR calls build_graph() or graphify_executable() outside the wrapper's own test file. SKILL.md instructs users to invoke graphify directly on the command line, and the agent uses mcp_graphify_* tools.
This is acceptable as scaffolding to satisfy validate:skills, but the rationale comment ("so the skill exposes a stable subprocess entry point and so tests can mock the CLI boundary") leaves a future reader wondering who the consumer is.
Suggested resolution: Either add a one-line "Programmatic Usage" subsection to SKILL.md pointing at this module and its function signatures, or drop the wrapper and replace its test with a --dry-run smoke gated on graphify being installed. Either choice is fine; the current state is the awkward middle.
|
Follow-up on Claude Code does read Better patterns, in order of preference:
Worth noting that the new Happy to follow up with a separate issue/PR proposing |
37d0425 to
3452a69
Compare
|
Re-reviewed at Resolved
Still open
Minor follow-up nit (non-blocking) In Once RI-4 is addressed I'm happy to mark this approved. |
This introduces a new agent and supporting skill for structural codebase analysis using knowledge graphs. The `Graph Researcher` agent answers complex structural questions by querying a `graphify`-built knowledge graph, while the `Graphify` skill provides the capability to build and manage these graphs. Key changes include: - Added `Graph Researcher` agent definition and `Graphify` instruction file. - Introduced `graphify` Python skill, including its wrapper, dependencies (`graphifyy`), and fuzz tests. - Updated `experimental` and `hve-core-all` collections to include the new agent and skill. - Added `graphify` and `graphifyy` to the spell check dictionary. feat(graphify): Update uv.lock for Python 3.11 compliance This commit regenerates `uv.lock` to fully align the `graphify` skill's dependencies with the Python 3.11 requirement. It removes Python 3.10-specific package versions, conditional dependencies, and packages that are now part of the Python 3.11 standard library (e.g., `exceptiongroup`, `tomli`, and associated `typing-extensions`). fix(graphify): stub subprocess in fuzz harness so valid modes never spawn graphify CLI The harness docstring promised it never invokes subprocess, but build_graph only raises ValueError for unknown modes; valid modes (fast/standard/deep) fell through to subprocess.run([graphify, '.', '--mode', ...]). With the seed corpus including 0_fast/0_standard/0_deep, atheris would rapidly spawn the upstream CLI against cwd — and 'deep' mode uploads cwd contents to a remote service. Scope the stub to the call duration (not module import) so wrapper unit tests still observe the real subprocess boundary.
…P availability check - SKILL.md: state Python 3.11+ (matching pyproject.toml/uv.lock) and note that upstream graphifyy itself supports 3.10+. Resolves a doc/code mismatch. - SKILL.md: add a Programmatic API section pointing at scripts/graphify_wrapper.py so the wrapper has at least one in-tree consumer reference. The wrapper is the boundary the skill's tests mock. - graph-researcher.agent.md: rewrite Phase 1 step 2 as reactive. Copilot Chat has no API to enumerate available MCP tools at session start, so the previous proactive check was unimplementable. Now the agent attempts the first MCP call and reports unreachability if it fails.
Reviewer flagged that the MCP-tool table lists 7 tools while the upstream README documents only 4. The remaining 3 (get_community, god_nodes, graph_stats) are registered by serve.py in the pinned version but absent from the README's brief tool list. Add an inline citation pointing at upstream serve.py at tag v0.5.4 so reviewers and future maintainers can verify the surface area without taking my word for it.
…mespace Replace manual setattr/try/finally with unittest.mock.patch.object so the stub is bound to the call's context manager. Rebind graphify_wrapper's 'subprocess' attribute to a SimpleNamespace fake instead of mutating the real stdlib subprocess.run, so the patch never touches global module state and is safe under pytest-xdist or other in-process parallel runners that share an interpreter.
a015dcf to
ae98325
Compare
Pull Request
Description
Adds an experimental
graphifyskill, aGraph Researcheragent, and agraphifyinstructions file under theexperimentalcollection. The skill wraps the upstream MIT-licensedgraphifyyPyPI package (pinned0.5.4) to build knowledge graphs over a codebase and expose them via MCP. The agent answers structural questions ("what depends on X", "what connects A and B") by querying the resulting graph throughmcp_graphify_*tools, with audit-tag (EXTRACTED/INFERRED/AMBIGUOUS) reporting and confidence-score handling. Includespyproject.toml, Atheris fuzz harness,pytesttests, Docusaurus docs page, cspell allowlist entries, collection manifest updates, and regenerated plugin outputs forexperimentalandhve-core-all.Related Issue(s)
N/A.
Type of Change
Code & Documentation:
AI Artifacts:
.github/instructions/*.instructions.md).github/agents/*.agent.md).github/skills/*/SKILL.md)Other:
.ps1,.sh,.py) - Python wrapper + tests for the skillSample Prompts (for AI Artifact Contributions)
User Request:
"Use the Graph Researcher to tell me which modules are implicitly affected if I change
auth_middleware.py."Execution Flow:
graphify-out/graph.jsonexists andmcp_graphify_*tools are registered. If missing, surfaces the exact remediation from the skill.mcp_graphify_get_neighborsrather than the broaderquery_graph.EXTRACTEDedges are stated as fact;INFERREDare hedged with confidence;AMBIGUOUSare surfaced as open questions.Output Artifacts:
The skill generates (under target repo's
graphify-out/):The agent itself returns chat output, not files.
Success Indicators:
npm run validate:skillspasses for.github/skills/experimental/graphify/.Testing
npm run validate:skills- skill structure,pyproject.toml, fuzz harness layout.npm run test:py- runstest_graphify_wrapper.py.npm run lint:frontmatter- schema-validates new agent, instructions, skill frontmatter.npm run lint:md+npm run spell-check- markdown + cspell over new docs.npm run plugin:generate+npm run plugin:validate- verify regeneratedplugins/experimental/andplugins/hve-core-all/outputs match collection manifests.npm run lint:md-links- link integrity across new docs.Checklist
Required Checks
docs/agents/graphify/README.md,docs/agents/README.md)*.agent.md,*.instructions.md,SKILL.md,<collection>/<skill>/)experimentalcollection entries)tests/test_graphify_wrapper.py,tests/fuzz_harness.py)AI Artifact Contributions
/prompt-analyzeto review contributionprompt-builderreviewRequired Automated Checks
npm run lint:mdnpm run spell-checknpm run lint:frontmatternpm run validate:skillsnpm run lint:md-linksnpm run lint:psnpm run plugin:generatenpm run docs:testSecurity Considerations
graphifyy==0.5.4reviewed; upstream is MIT-licensed (Safi Shamsi). Skill orchestrates upstream CLI/MCP server only - no vendored source.--mode fast(AST-only, no LLM) for sensitive trees.ANTHROPIC_API_KEYconsumed via env only; never written to artifacts.graphify-out/(incl.cache/of hashed file snapshots) documented as gitignored build output.Additional Notes
experimentalcollection - opt-in for users.plugins/experimental/andplugins/hve-core-all/are generated; do not edit by hand. Regenerate vianpm run plugin:generateafter any artifact or collection manifest change.graphifyyPyPI package name uses doubley; CLI binary is single-ygraphify. Pin matters - upstream iterates fast.