feat(map-review): cross-AI peer review (--cross-ai <runtime>) - slice 1 of #288 #295
Merged
Dispatch /map-review to an INDEPENDENT external AI CLI (codex/gemini/claude/
opencode) for a true second opinion — a different model/vendor with fresh
context. All subprocess interaction, envelope parsing, finding normalization,
and the untrusted boundary live in the Python step runner (run_cross_ai_review /
dispatch_cross_ai_review, producer-owns-parse); the skill only handles consent
and presentation.
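
A minimal sketch of what "producer-owns-parse" could look like, so the skill never touches raw CLI output. Everything here is an assumption for illustration: the helper names `parse_envelope` and `fence_untrusted`, the envelope shape (a JSON object with a `findings` list of objects carrying a `message`), and the exact fence delimiters. Only the `EXTERNAL UNTRUSTED REFERENCE` label and the `source: cross_ai` marker come from this change.

```python
from __future__ import annotations

import json
from typing import Any

# Label taken from the PR text; the delimiter format around it is assumed.
FENCE_LABEL = "EXTERNAL UNTRUSTED REFERENCE"


def parse_envelope(stdout: str) -> list[dict[str, Any]] | None:
    """Return normalized findings, or None when the output is not usable JSON."""
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    raw = payload.get("findings")  # hypothetical envelope key
    if not isinstance(raw, list):
        return None
    findings: list[dict[str, Any]] = []
    for item in raw:
        if not isinstance(item, dict):
            continue  # drop malformed entries instead of failing the whole review
        findings.append({
            "message": str(item.get("message", "")),
            "source": "cross_ai",  # advisory-only marker
        })
    return findings


def fence_untrusted(text: str) -> str:
    """Wrap external text in a fence so it is never read as instructions."""
    return f"<<< {FENCE_LABEL} >>>\n{text}\n<<< END {FENCE_LABEL} >>>"
```

Returning `None` for anything that is not the expected JSON shape gives the caller a single signal to map onto the non-JSON fallback path.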
Egress is double-consent: the per-run --cross-ai flag AND
review.cross_ai.enabled: true (off by default) — the diff/code leaves the
machine. Guardrails: a high-confidence outbound secret scan blocks dispatch
before the subprocess (pattern name only, never value); shell=False literal-argv
with a configurable timeout; returned findings always enter context behind an
EXTERNAL UNTRUSTED REFERENCE fence (link/injection scan, applied in Python so the
model cannot skip it) and are advisory-only (source: cross_ai); same-vendor
runtimes are honestly labeled independent_vendor: false. Any failure (disabled /
CLI missing / unauthenticated / timeout / non-JSON / secret-blocked) degrades
non-blockingly and falls back to the in-session review.
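
The guardrail ordering described above (scan first, then a literal-argv subprocess with a timeout, with every failure mapped to a status) can be sketched as follows. This is an illustration under stated assumptions, not the runner's actual code: the names `scan_for_secrets`, `build_argv`, and `dispatch`, the two example secret patterns, and the returned dict shape are all hypothetical. The status strings and the `{prompt}` token are the ones named in this change.

```python
from __future__ import annotations

import re
import shutil
import subprocess

# Hypothetical high-confidence patterns; the real pattern set is not shown in the PR.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_for_secrets(text: str) -> str | None:
    """Return the NAME of the first matching pattern, never the matched value."""
    for name, pattern in SECRET_PATTERNS.items():
        if pattern.search(text):
            return name
    return None


def build_argv(argv_template: list[str], prompt: str) -> list[str]:
    """Replace the {prompt} token wholesale, so the prompt stays one argv element."""
    return [prompt if arg == "{prompt}" else arg for arg in argv_template]


def dispatch(argv_template: list[str], prompt: str, timeout_seconds: float) -> dict[str, str]:
    blocked_by = scan_for_secrets(prompt)
    if blocked_by is not None:
        # Blocked before any subprocess is created.
        return {"status": "secret_blocked", "pattern": blocked_by}
    if shutil.which(argv_template[0]) is None:
        return {"status": "unavailable"}
    try:
        proc = subprocess.run(
            build_argv(argv_template, prompt),
            shell=False,  # no shell, so prompt text is never interpreted
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
            check=False,
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout"}
    except OSError:
        return {"status": "error"}
    if proc.returncode != 0:
        return {"status": "error"}
    return {"status": "ok", "stdout": proc.stdout}
```

A caller would treat any status other than `ok` as the cue to fall back to the in-session review.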
Config keys review.cross_ai.{enabled,runtime,timeout_seconds}. map-review's
per-skill SKILL.md line budget raised (three review modes: normal + adversarial
+ cross-ai); detail lives in review-reference.md. Single-runtime slice;
--cross-ai all consensus deferred to a follow-up. Design was llm-council-reviewed.
Part of #288
What & why
`/map-review --cross-ai <runtime>` dispatches the review to an independent external AI CLI (codex/gemini/claude/opencode) for a true second opinion: a different model/vendor with fresh context and no shared session. Same-model review is "inbred"; an independent reviewer catches model-specific blind spots. Slice 1 of #288 (single-runtime dispatch).
Design (llm-council-reviewed, conv 92a7f159)
- Producer-owns-parse: all subprocess interaction, envelope parsing, finding normalization, and the untrusted boundary live in the Python step runner (`run_cross_ai_review` / `dispatch_cross_ai_review`). The skill only handles consent + presentation. Mirrors the `skills_eval/dispatcher.py` and `--adversarial` precedents.
- A `{binary, argv, envelope, independent_vendor}` entry per runtime.
- Double consent: the per-run `--cross-ai` flag AND `review.cross_ai.enabled: true` are both required, because the diff/code leaves the machine (mirrors the SOFA opt-in posture).
Security guardrails (all enforced in Python, not prompt text)
- A high-confidence outbound secret scan blocks dispatch before the subprocess starts and reports the pattern name only, never the value: `status: secret_blocked`.
- `shell=False` literal-argv invocation (`{prompt}` token replaced wholesale, injection-proof) with a configurable timeout.
- Returned findings always enter context behind an `EXTERNAL UNTRUSTED REFERENCE` fence (link allowlist + injection scan, SOFA semantics), applied deterministically in Python so the model cannot "forget" to fence. Findings are advisory-only (`source: cross_ai`).
- Same-vendor runtimes (e.g. `claude` reviewing a Claude session) are labeled `independent_vendor: false`.
Failure degradation (non-blocking)
`disabled` / `unavailable` / `timeout` / `error` / `unparsed` / `secret_blocked` all degrade gracefully and fall back to the in-session review. Cross-AI is a supplement, never a hard gate.
Config
Config keys: `review.cross_ai.{enabled,runtime,timeout_seconds}`; `enabled` is off by default.
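
The keys `review.cross_ai.{enabled,runtime,timeout_seconds}` and the double-consent rule could be modeled roughly like this. It is a sketch under assumptions: `CrossAiConfig`, `load_cross_ai_config`, and `egress_allowed` are hypothetical names, the 300-second default timeout is invented for illustration, and only `enabled` defaulting to off is stated by this change.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Mapping

KNOWN_RUNTIMES = ("codex", "gemini", "claude", "opencode")  # runtimes named in the PR
DEFAULT_TIMEOUT_SECONDS = 300  # hypothetical default; the real value is not shown


@dataclass(frozen=True)
class CrossAiConfig:
    enabled: bool = False  # off by default, as stated in the PR
    runtime: str | None = None
    timeout_seconds: int = DEFAULT_TIMEOUT_SECONDS


def load_cross_ai_config(flat: Mapping[str, Any]) -> CrossAiConfig:
    """Read dotted keys, falling back to the default for any invalid value."""
    enabled = flat.get("review.cross_ai.enabled")
    runtime = flat.get("review.cross_ai.runtime")
    timeout = flat.get("review.cross_ai.timeout_seconds")
    timeout_ok = (
        isinstance(timeout, int)
        and not isinstance(timeout, bool)  # bool is a subclass of int
        and timeout > 0
    )
    return CrossAiConfig(
        enabled=enabled if isinstance(enabled, bool) else False,
        runtime=runtime if runtime in KNOWN_RUNTIMES else None,
        timeout_seconds=timeout if timeout_ok else DEFAULT_TIMEOUT_SECONDS,
    )


def egress_allowed(cli_flag_given: bool, config: CrossAiConfig) -> bool:
    """Double consent: the per-run flag AND the config switch must both be set."""
    return cli_flag_given and config.enabled
```

Falling back to safe defaults on bad input keeps a typo in the config from accidentally enabling egress.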
Testing
- `make check` green locally: ruff ✅, mypy ✅, pyright `0 errors / 0 warnings / 0 informations` ✅, 2880 passed, `check-render` ✅ (generated trees == `templates_src`).
- `TestCrossAi*` in `tests/test_map_step_runner.py` (wrap/detect/dispatch across success / claude-json envelope / timeout / non-zero / unparsed / secret-blocked / shell=False+literal-argv) + `tests/test_cross_ai_config.py` (dataclass defaults, dotted-key aliasing, validation fallbacks, default-config doc). The secret-blocked test asserts `subprocess.run` is not called.
- Detail lives in `review-reference.md`; map-review's per-skill SKILL.md line budget raised deliberately (it now hosts three review modes: normal + adversarial + cross-ai).
Scope
Part of #288: single-runtime dispatch. `--cross-ai all` multi-runtime consensus/disagreement aggregation is a deferred follow-up slice (the `Finding`/`source` shape is designed to support it without a schema migration). Issue stays open.
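
To illustrate why a `source` field on each finding could be enough for later consensus aggregation, here is a hedged sketch. The `Finding` fields other than `source`, the per-runtime source values such as `cross_ai:codex`, and the helper `group_by_claim` are all hypothetical; the deferred `--cross-ai all` slice may be designed quite differently.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    path: str     # hypothetical field
    message: str  # hypothetical field
    source: str   # e.g. "cross_ai"; per-runtime values here are assumed


def group_by_claim(findings: list[Finding]) -> dict[tuple[str, str], set[str]]:
    """Map each (path, message) claim to the set of sources that reported it."""
    grouped: dict[tuple[str, str], set[str]] = defaultdict(set)
    for finding in findings:
        grouped[(finding.path, finding.message)].add(finding.source)
    return dict(grouped)
```

Claims reported by more than one source would count as consensus, and claims with a single source as disagreement, with no change to the finding schema itself.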