add PLAN_ROLLOUT proposal — PR-stack-aware planning#1192
Open
mastermanas805 wants to merge 2 commits into garrytan:main
Proposes two new skills + a declarative schema to address the gap between plan approval and shipping:

- /plan-rollout: decomposes an approved plan into a reviewable PR stack and a rollout plan. Outputs decomposition.md + rollout.md consumed by /ship, /review, /spill-check, /land-and-deploy.
- /spill-check: detects scope creep mid-implementation by comparing the current diff against the declared PR unit.
- SYSTEM.md: repo-root declarative semantic contract graph (components, roles, role-level contracts with rollout-edge semantics). Reconciled against the LLM-discovered import graph at runtime.

Includes a CEO plan (full spec), SKILL.md drafts, schema documentation, usage guide, integration notes for /ship and /review, and a TypeScript parser stub.

The design was stress-tested end-to-end by simulating the workflow against honojs/hono issue #4633. 8 concrete design gaps surfaced by the dogfood are folded into v1 scope; documented in the CEO plan.

Filing as a proposal doc in docs/designs/ to get directional feedback before opening the 4-PR implementation stack; see the attached issue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mastermanas805 (Author) commented: @garrytan Requesting your suggestion
What's the problem
gstack has great plan-time skills (/plan-eng-review, /plan-ceo-review) and great ship-time skills (/ship, /review). There's a gap between them. Nothing asks "is this one PR or three?" or "in what order should these units ship?"

The pain is LLM-specific. A single Claude Code session produces a 2,000-line diff across 15 files. Reviewers drown. Scope creep hides. Bugs ship under LGTM pressure. I feel this every time I use LLMs for non-trivial work. I suspect others do too.
What I'm proposing
- /plan-rollout runs after plan approval. Reads the plan plus SYSTEM.md (a new repo-root semantic contract graph) plus the discovered import graph. Produces decomposition.md (PR stack with reader guides, dep ordering, time-budget estimates) and rollout.md (rollout strategy with inverse rollback auto-generated per step).
- /spill-check runs during implementation. Compares the current diff against the declared PR unit. Flags undeclared files. Adaptive: strict for code, soft for infra/meta files like CLAUDE.md, package.json, bun.lock.
- SYSTEM.md is the interesting primitive. Human-declared role contracts (auth mints session tokens, middleware enforces; breaks if format changes without middleware redeploy; rollout-edge hard). Separate from the package/import graph, which the LLM discovers at runtime via AST and grep. Reconciled jointly: declared contracts give the why, discovered imports give the what, disagreements surface for human resolution.

Does it actually work
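The adaptive /spill-check behaviour described above (strict for code, soft for infra/meta files) could be sketched roughly as follows. This is only an illustration: the type and function names (`SpillVerdict`, `classifyFile`, `checkSpill`) and the example source paths are hypothetical, and the soft list contains just the three files the proposal names.

```typescript
// Hypothetical sketch: none of these names come from the proposal's code.
type SpillVerdict = "declared" | "soft-spill" | "hard-spill";

// Infra/meta files that get a soft warning. These three are the examples the
// proposal mentions; a real list would presumably be configurable.
const SOFT_FILES = new Set(["CLAUDE.md", "package.json", "bun.lock"]);

// Final path segment, e.g. "packages/a/package.json" -> "package.json".
function baseName(path: string): string {
  return path.slice(path.lastIndexOf("/") + 1);
}

// Classify one changed file against the files declared for the current PR unit.
function classifyFile(path: string, declared: Set<string>): SpillVerdict {
  if (declared.has(path)) return "declared";
  return SOFT_FILES.has(baseName(path)) ? "soft-spill" : "hard-spill";
}

// Classify a whole diff and group the paths by verdict.
function checkSpill(
  changed: string[],
  declaredFiles: string[],
): Record<SpillVerdict, string[]> {
  const declared = new Set(declaredFiles);
  const report: Record<SpillVerdict, string[]> = {
    "declared": [],
    "soft-spill": [],
    "hard-spill": [],
  };
  for (const path of changed) {
    report[classifyFile(path, declared)].push(path);
  }
  return report;
}

// Example with made-up paths: one declared file, one undeclared code file,
// and one undeclared meta file.
const spillReport = checkSpill(
  ["src/router.ts", "src/auth.ts", "package.json"],
  ["src/router.ts"],
);
console.log(spillReport);
```

In this sketch a hard spill would block or prompt, while a soft spill would only be listed; how each verdict is surfaced is left open here.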
Dogfooded the design end to end against honojs/hono#4633 (405 Method Not Allowed). Authored SYSTEM.md for Hono's 8 components. Decomposed the issue into a 3-PR stack with graceful dep relaxation (PR-3 can merge without PR-2 via feature detection on an optional interface method).
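To illustrate what "feature detection on an optional interface method" can mean in practice, here is a minimal sketch. It assumes, purely for illustration, that one PR adds an optional method to a router interface and another PR consumes it. The `Router` interface, `allowedMethods`, and `statusFor` are hypothetical names and are not Hono's actual API.

```typescript
// Hypothetical interface: not Hono's real router API.
interface Router {
  match(method: string, path: string): boolean;
  // Optional capability that, in this sketch, a separate PR would add.
  allowedMethods?(path: string): string[];
}

// Decide the response status for a request. The consumer checks for the
// optional method at runtime, so it works with routers that lack it.
function statusFor(
  router: Router,
  method: string,
  path: string,
): { status: number; allow?: string } {
  if (router.match(method, path)) return { status: 200 };
  if (typeof router.allowedMethods === "function") {
    const methods = router.allowedMethods(path);
    if (methods.length > 0) return { status: 405, allow: methods.join(", ") };
  }
  // Without the capability (or with no methods for this path), fall back to 404.
  return { status: 404 };
}

// A router without the optional method, and one with it.
const legacyRouter: Router = {
  match: (m, p) => m === "GET" && p === "/users",
};
const upgradedRouter: Router = {
  match: (m, p) => m === "GET" && p === "/users",
  allowedMethods: (p) => (p === "/users" ? ["GET"] : []),
};

console.log(statusFor(legacyRouter, "POST", "/users").status);   // 404
console.log(statusFor(upgradedRouter, "POST", "/users").status); // 405
```

Because the consumer degrades to the old behaviour when the method is absent, the two changes can merge in either order, which is the dependency relaxation described above.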
Implemented what would be PR-1 locally. 171 LOC, 3 files, 86/86 tests pass, zero regressions across the 4 router implementations not touched.
8 design gaps surfaced during the dogfood. All folded into v1 scope. Highlights:
- kind field (component | leaf-util | types-only) so shared utility dirs don't force awkward fits.
- package-type field because library rollouts (npm publish + revert) differ materially from service rollouts (coordinated deploy + state restore).

The 4 PRs if this lands
1. lib/plan-rollout/system-map-*.ts + tests + docs/SYSTEM-MD.md. Standalone, no skills modified.
2. /plan-rollout skill + the helpers the SKILL.md calls. Depends on #1.
3. /spill-check skill + spill classifier. Independent of #2.
4. /ship, /review, /plan-ceo-review, /plan-eng-review. Zero-regression gated on decomposition.md existence.

~75 min cumulative review time. PR-1 is low-risk standalone and should land first.
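The dep ordering for a stack like this can be derived mechanically. Below is a small sketch using a layered topological sort. The `PrUnit` shape and `landingOrder` function are hypothetical, and only two edges come from the text above (PR-1 is standalone, PR-2 depends on PR-1); the dependencies shown for PR-3 and PR-4 are assumptions made so the example has something to sort.

```typescript
// Hypothetical shape for one unit of a PR stack.
interface PrUnit {
  id: string;
  dependsOn: string[];
}

// Layered topological sort: each round emits every unit whose dependencies
// have all landed in earlier rounds. Throws on unknown ids or cycles.
function landingOrder(units: PrUnit[]): string[] {
  const ids = new Set(units.map((u) => u.id));
  for (const u of units) {
    for (const d of u.dependsOn) {
      if (!ids.has(d)) throw new Error(`${u.id} depends on unknown unit ${d}`);
    }
  }
  const order: string[] = [];
  let pending = units.slice();
  while (pending.length > 0) {
    const ready = pending
      .filter((u) => u.dependsOn.every((d) => order.indexOf(d) !== -1))
      .sort((a, b) => a.id.localeCompare(b.id)); // deterministic within a round
    if (ready.length === 0) throw new Error("dependency cycle in PR stack");
    for (const u of ready) order.push(u.id);
    pending = pending.filter((u) => ready.indexOf(u) === -1);
  }
  return order;
}

const prStack: PrUnit[] = [
  { id: "PR-4", dependsOn: ["PR-2", "PR-3"] }, // assumption, not stated above
  { id: "PR-3", dependsOn: ["PR-1"] },         // assumption; only "independent of PR-2" is stated
  { id: "PR-2", dependsOn: ["PR-1"] },         // stated above
  { id: "PR-1", dependsOn: [] },               // stated above: standalone
];

console.log(landingOrder(prStack).join(" -> ")); // PR-1 -> PR-2 -> PR-3 -> PR-4
```

Units that become ready in the same round (PR-2 and PR-3 here) have no ordering constraint between them, so they could be reviewed in parallel.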
What I want from you
Does this shape fit gstack? In-tree or separate plugin? Any expansions I've got wrong? Convention checks: artifacts in ~/.gstack/projects/ vs .gstack/ in-repo? SYSTEM.md format?

If this is a nope, tell me, saves us both time. If it's yes-but-shape-it, I'll rework. If yes, I'll open the 4-PR stack.
No rush. I know the queue is deep.