diff --git a/SKILL.md b/SKILL.md index d4130d1db9..cfbe00ce41 100644 --- a/SKILL.md +++ b/SKILL.md @@ -3,10 +3,7 @@ name: gstack preamble-tier: 1 version: 1.1.0 description: | - Fast headless browser for QA testing and site dogfooding. Navigate pages, interact with - elements, verify state, diff before/after, take annotated screenshots, test responsive - layouts, forms, uploads, dialogs, and capture bug evidence. Use when asked to open or - test a site, verify a deployment, dogfood a user flow, or file a bug with screenshots. (gstack) + gstack browser workflow for opening, testing, and dogfooding web pages with screenshots and evidence. allowed-tools: - Bash - Read diff --git a/SKILL.md.tmpl b/SKILL.md.tmpl index a248cbfa32..3dc20fe358 100644 --- a/SKILL.md.tmpl +++ b/SKILL.md.tmpl @@ -3,10 +3,7 @@ name: gstack preamble-tier: 1 version: 1.1.0 description: | - Fast headless browser for QA testing and site dogfooding. Navigate pages, interact with - elements, verify state, diff before/after, take annotated screenshots, test responsive - layouts, forms, uploads, dialogs, and capture bug evidence. Use when asked to open or - test a site, verify a deployment, dogfood a user flow, or file a bug with screenshots. (gstack) + gstack browser workflow for opening, testing, and dogfooding web pages with screenshots and evidence. allowed-tools: - Bash - Read diff --git a/autoplan/SKILL.md b/autoplan/SKILL.md index 6a8ad3b278..a9c95c65e1 100644 --- a/autoplan/SKILL.md +++ b/autoplan/SKILL.md @@ -3,14 +3,7 @@ name: autoplan preamble-tier: 3 version: 1.0.0 description: | - Auto-review pipeline — reads the full CEO, design, eng, and DX review skills from disk - and runs them sequentially with auto-decisions using 6 decision principles. Surfaces - taste decisions (close approaches, borderline scope, codex disagreements) at a final - approval gate. One command, fully reviewed plan out. 
- Use when asked to "auto review", "autoplan", "run all reviews", "review this plan - automatically", or "make the decisions for me". - Proactively suggest when the user has a plan file and wants to run the full review - gauntlet without answering 15-30 intermediate questions. (gstack) + gstack auto-review pipeline that runs CEO, eng, design, and DX reviews with final taste decisions. Voice triggers (speech-to-text aliases): "auto plan", "automatic review". benefits-from: [office-hours] triggers: diff --git a/autoplan/SKILL.md.tmpl b/autoplan/SKILL.md.tmpl index 6577a6725c..8a7eb955f9 100644 --- a/autoplan/SKILL.md.tmpl +++ b/autoplan/SKILL.md.tmpl @@ -3,14 +3,7 @@ name: autoplan preamble-tier: 3 version: 1.0.0 description: | - Auto-review pipeline — reads the full CEO, design, eng, and DX review skills from disk - and runs them sequentially with auto-decisions using 6 decision principles. Surfaces - taste decisions (close approaches, borderline scope, codex disagreements) at a final - approval gate. One command, fully reviewed plan out. - Use when asked to "auto review", "autoplan", "run all reviews", "review this plan - automatically", or "make the decisions for me". - Proactively suggest when the user has a plan file and wants to run the full review - gauntlet without answering 15-30 intermediate questions. (gstack) + gstack auto-review pipeline that runs CEO, eng, design, and DX reviews with final taste decisions. voice-triggers: - "auto plan" - "automatic review" diff --git a/benchmark-models/SKILL.md b/benchmark-models/SKILL.md index b152301baa..a153855008 100644 --- a/benchmark-models/SKILL.md +++ b/benchmark-models/SKILL.md @@ -3,12 +3,7 @@ name: benchmark-models preamble-tier: 1 version: 1.0.0 description: | - Cross-model benchmark for gstack skills. Runs the same prompt through Claude, - GPT (via Codex CLI), and Gemini side-by-side — compares latency, tokens, cost, - and optionally quality via LLM judge. 
Answers "which model is actually best - for this skill?" with data instead of vibes. Separate from /benchmark, which - measures web page performance. Use when: "benchmark models", "compare models", - "which model is best for X", "cross-model comparison", "model shootout". (gstack) + gstack model benchmark for comparing Claude, GPT, and Gemini on a skill or prompt. Voice triggers (speech-to-text aliases): "compare models", "model shootout", "which model is best". triggers: - cross model benchmark diff --git a/benchmark-models/SKILL.md.tmpl b/benchmark-models/SKILL.md.tmpl index 034cda1824..5e36fd2299 100644 --- a/benchmark-models/SKILL.md.tmpl +++ b/benchmark-models/SKILL.md.tmpl @@ -3,12 +3,7 @@ name: benchmark-models preamble-tier: 1 version: 1.0.0 description: | - Cross-model benchmark for gstack skills. Runs the same prompt through Claude, - GPT (via Codex CLI), and Gemini side-by-side — compares latency, tokens, cost, - and optionally quality via LLM judge. Answers "which model is actually best - for this skill?" with data instead of vibes. Separate from /benchmark, which - measures web page performance. Use when: "benchmark models", "compare models", - "which model is best for X", "cross-model comparison", "model shootout". (gstack) + gstack model benchmark for comparing Claude, GPT, and Gemini on a skill or prompt. voice-triggers: - "compare models" - "model shootout" diff --git a/benchmark/SKILL.md b/benchmark/SKILL.md index 0a01897b03..6c784deee6 100644 --- a/benchmark/SKILL.md +++ b/benchmark/SKILL.md @@ -3,11 +3,7 @@ name: benchmark preamble-tier: 1 version: 1.0.0 description: | - Performance regression detection using the browse daemon. Establishes - baselines for page load times, Core Web Vitals, and resource sizes. - Compares before/after on every PR. Tracks performance trends over time. - Use when: "performance", "benchmark", "page speed", "lighthouse", "web vitals", - "bundle size", "load time". 
(gstack) + gstack performance benchmark for page load, Core Web Vitals, bundle size, and regression baselines. Voice triggers (speech-to-text aliases): "speed test", "check performance". triggers: - performance benchmark diff --git a/benchmark/SKILL.md.tmpl b/benchmark/SKILL.md.tmpl index 038f16f5fb..e187c8637e 100644 --- a/benchmark/SKILL.md.tmpl +++ b/benchmark/SKILL.md.tmpl @@ -3,11 +3,7 @@ name: benchmark preamble-tier: 1 version: 1.0.0 description: | - Performance regression detection using the browse daemon. Establishes - baselines for page load times, Core Web Vitals, and resource sizes. - Compares before/after on every PR. Tracks performance trends over time. - Use when: "performance", "benchmark", "page speed", "lighthouse", "web vitals", - "bundle size", "load time". (gstack) + gstack performance benchmark for page load, Core Web Vitals, bundle size, and regression baselines. voice-triggers: - "speed test" - "check performance" diff --git a/browse/SKILL.md b/browse/SKILL.md index 7b89fa5c99..556616f886 100644 --- a/browse/SKILL.md +++ b/browse/SKILL.md @@ -3,12 +3,7 @@ name: browse preamble-tier: 1 version: 1.1.0 description: | - Fast headless browser for QA testing and site dogfooding. Navigate any URL, interact with - elements, verify page state, diff before/after actions, take annotated screenshots, check - responsive layouts, test forms and uploads, handle dialogs, and assert element states. - ~100ms per command. Use when you need to test a feature, verify a deployment, dogfood a - user flow, or file a bug with evidence. Use when asked to "open in browser", "test the - site", "take a screenshot", or "dogfood this". (gstack) + gstack headless browser for navigating pages, clicking UI, checking state, and capturing screenshots. 
triggers: - browse a page - headless browser diff --git a/browse/SKILL.md.tmpl b/browse/SKILL.md.tmpl index ec4fcad706..52ee238e59 100644 --- a/browse/SKILL.md.tmpl +++ b/browse/SKILL.md.tmpl @@ -3,12 +3,7 @@ name: browse preamble-tier: 1 version: 1.1.0 description: | - Fast headless browser for QA testing and site dogfooding. Navigate any URL, interact with - elements, verify page state, diff before/after actions, take annotated screenshots, check - responsive layouts, test forms and uploads, handle dialogs, and assert element states. - ~100ms per command. Use when you need to test a feature, verify a deployment, dogfood a - user flow, or file a bug with evidence. Use when asked to "open in browser", "test the - site", "take a screenshot", or "dogfood this". (gstack) + gstack headless browser for navigating pages, clicking UI, checking state, and capturing screenshots. triggers: - browse a page - headless browser diff --git a/canary/SKILL.md b/canary/SKILL.md index 4f79a02104..070d35b778 100644 --- a/canary/SKILL.md +++ b/canary/SKILL.md @@ -3,11 +3,7 @@ name: canary preamble-tier: 2 version: 1.0.0 description: | - Post-deploy canary monitoring. Watches the live app for console errors, - performance regressions, and page failures using the browse daemon. Takes - periodic screenshots, compares against pre-deploy baselines, and alerts - on anomalies. Use when: "monitor deploy", "canary", "post-deploy check", - "watch production", "verify deploy". (gstack) + gstack post-deploy canary monitor for production page failures, console errors, and performance regressions. allowed-tools: - Bash - Read diff --git a/canary/SKILL.md.tmpl b/canary/SKILL.md.tmpl index d1eb2950ab..62253f39ef 100644 --- a/canary/SKILL.md.tmpl +++ b/canary/SKILL.md.tmpl @@ -3,11 +3,7 @@ name: canary preamble-tier: 2 version: 1.0.0 description: | - Post-deploy canary monitoring. Watches the live app for console errors, - performance regressions, and page failures using the browse daemon. 
Takes - periodic screenshots, compares against pre-deploy baselines, and alerts - on anomalies. Use when: "monitor deploy", "canary", "post-deploy check", - "watch production", "verify deploy". (gstack) + gstack post-deploy canary monitor for production page failures, console errors, and performance regressions. allowed-tools: - Bash - Read diff --git a/careful/SKILL.md b/careful/SKILL.md index 91a5776e30..6b12d89f83 100644 --- a/careful/SKILL.md +++ b/careful/SKILL.md @@ -2,11 +2,7 @@ name: careful version: 0.1.0 description: | - Safety guardrails for destructive commands. Warns before rm -rf, DROP TABLE, - force-push, git reset --hard, kubectl delete, and similar destructive operations. - User can override each warning. Use when touching prod, debugging live systems, - or working in a shared environment. Use when asked to "be careful", "safety mode", - "prod mode", or "careful mode". (gstack) + gstack safety guardrails that warn before destructive shell commands in risky environments. triggers: - be careful - warn before destructive diff --git a/careful/SKILL.md.tmpl b/careful/SKILL.md.tmpl index 9d83411f83..f10a42849c 100644 --- a/careful/SKILL.md.tmpl +++ b/careful/SKILL.md.tmpl @@ -2,11 +2,7 @@ name: careful version: 0.1.0 description: | - Safety guardrails for destructive commands. Warns before rm -rf, DROP TABLE, - force-push, git reset --hard, kubectl delete, and similar destructive operations. - User can override each warning. Use when touching prod, debugging live systems, - or working in a shared environment. Use when asked to "be careful", "safety mode", - "prod mode", or "careful mode". (gstack) + gstack safety guardrails that warn before destructive shell commands in risky environments. 
triggers: - be careful - warn before destructive diff --git a/claude/SKILL.md.tmpl b/claude/SKILL.md.tmpl index 94552cbe4e..1348efe12e 100644 --- a/claude/SKILL.md.tmpl +++ b/claude/SKILL.md.tmpl @@ -3,11 +3,7 @@ name: claude preamble-tier: 3 version: 1.0.0 description: | - Claude Code CLI wrapper for non-Claude hosts - three modes. Review: independent - diff review via claude -p. Challenge: adversarial failure-mode review. Consult: - ask Claude about the repo with read-only file tools. Use when asked for "claude - review", "claude challenge", "ask claude", "second opinion from claude", or - "outside voice". (gstack) + gstack Claude CLI wrapper for review, challenge, and read-only consultation from non-Claude hosts. triggers: - claude review - claude challenge diff --git a/codex/SKILL.md b/codex/SKILL.md index e90ec7e89e..5b4d7164d3 100644 --- a/codex/SKILL.md +++ b/codex/SKILL.md @@ -3,11 +3,7 @@ name: codex preamble-tier: 3 version: 1.0.0 description: | - OpenAI Codex CLI wrapper — three modes. Code review: independent diff review via - codex review with pass/fail gate. Challenge: adversarial mode that tries to break - your code. Consult: ask codex anything with session continuity for follow-ups. - The "200 IQ autistic developer" second opinion. Use when asked to "codex review", - "codex challenge", "ask codex", "second opinion", or "consult codex". (gstack) + gstack Codex CLI wrapper for review, adversarial challenge, and session-continuity consultation. Voice triggers (speech-to-text aliases): "code x", "code ex", "get another opinion". triggers: - codex review diff --git a/codex/SKILL.md.tmpl b/codex/SKILL.md.tmpl index c311fc80b7..b1efc332d9 100644 --- a/codex/SKILL.md.tmpl +++ b/codex/SKILL.md.tmpl @@ -3,11 +3,7 @@ name: codex preamble-tier: 3 version: 1.0.0 description: | - OpenAI Codex CLI wrapper — three modes. Code review: independent diff review via - codex review with pass/fail gate. Challenge: adversarial mode that tries to break - your code. 
Consult: ask codex anything with session continuity for follow-ups. - The "200 IQ autistic developer" second opinion. Use when asked to "codex review", - "codex challenge", "ask codex", "second opinion", or "consult codex". (gstack) + gstack Codex CLI wrapper for review, adversarial challenge, and session-continuity consultation. voice-triggers: - "code x" - "code ex" diff --git a/context-restore/SKILL.md b/context-restore/SKILL.md index 6cb5236593..ac9807e797 100644 --- a/context-restore/SKILL.md +++ b/context-restore/SKILL.md @@ -3,13 +3,7 @@ name: context-restore preamble-tier: 2 version: 1.0.0 description: | - Restore working context saved earlier by /context-save. Loads the most recent - saved state (across all branches by default) so you can pick up where you - left off — even across Conductor workspace handoffs. - Use when asked to "resume", "restore context", "where was I", or - "pick up where I left off". Pair with /context-save. - Formerly /checkpoint resume — renamed because Claude Code treats /checkpoint - as a native rewind alias in current environments. (gstack) + gstack context restore loads saved session state so work can resume across branches or workspaces. allowed-tools: - Bash - Read diff --git a/context-restore/SKILL.md.tmpl b/context-restore/SKILL.md.tmpl index 1fe9f938a2..9c60500941 100644 --- a/context-restore/SKILL.md.tmpl +++ b/context-restore/SKILL.md.tmpl @@ -3,13 +3,7 @@ name: context-restore preamble-tier: 2 version: 1.0.0 description: | - Restore working context saved earlier by /context-save. Loads the most recent - saved state (across all branches by default) so you can pick up where you - left off — even across Conductor workspace handoffs. - Use when asked to "resume", "restore context", "where was I", or - "pick up where I left off". Pair with /context-save. - Formerly /checkpoint resume — renamed because Claude Code treats /checkpoint - as a native rewind alias in current environments. 
(gstack) + gstack context restore loads saved session state so work can resume across branches or workspaces. allowed-tools: - Bash - Read diff --git a/context-save/SKILL.md b/context-save/SKILL.md index 972f5b561e..e67888fdcb 100644 --- a/context-save/SKILL.md +++ b/context-save/SKILL.md @@ -3,13 +3,7 @@ name: context-save preamble-tier: 2 version: 1.0.0 description: | - Save working context. Captures git state, decisions made, and remaining work - so any future session can pick up without losing a beat. - Use when asked to "save progress", "save state", "context save", or - "save my work". Pair with /context-restore to resume later. - Formerly /checkpoint — renamed because Claude Code treats /checkpoint as a - native rewind alias in current environments, which was shadowing this skill. - (gstack) + gstack context save records git state, decisions, and remaining work for later resume. allowed-tools: - Bash - Read diff --git a/context-save/SKILL.md.tmpl b/context-save/SKILL.md.tmpl index 8343873f09..d41c015e55 100644 --- a/context-save/SKILL.md.tmpl +++ b/context-save/SKILL.md.tmpl @@ -3,13 +3,7 @@ name: context-save preamble-tier: 2 version: 1.0.0 description: | - Save working context. Captures git state, decisions made, and remaining work - so any future session can pick up without losing a beat. - Use when asked to "save progress", "save state", "context save", or - "save my work". Pair with /context-restore to resume later. - Formerly /checkpoint — renamed because Claude Code treats /checkpoint as a - native rewind alias in current environments, which was shadowing this skill. - (gstack) + gstack context save records git state, decisions, and remaining work for later resume. allowed-tools: - Bash - Read diff --git a/cso/SKILL.md b/cso/SKILL.md index f4ce42d542..c057b8a973 100644 --- a/cso/SKILL.md +++ b/cso/SKILL.md @@ -3,12 +3,7 @@ name: cso preamble-tier: 2 version: 2.0.0 description: | - Chief Security Officer mode. 
Infrastructure-first security audit: secrets archaeology, - dependency supply chain, CI/CD pipeline security, LLM/AI security, skill supply chain - scanning, plus OWASP Top 10, STRIDE threat modeling, and active verification. - Two modes: daily (zero-noise, 8/10 confidence gate) and comprehensive (monthly deep - scan, 2/10 bar). Trend tracking across audit runs. - Use when: "security audit", "threat model", "pentest review", "OWASP", "CSO review". (gstack) + gstack security audit for secrets, dependencies, CI/CD, LLM risks, OWASP, and threat modeling. Voice triggers (speech-to-text aliases): "see-so", "see so", "security review", "security check", "vulnerability scan", "run security". allowed-tools: - Bash diff --git a/cso/SKILL.md.tmpl b/cso/SKILL.md.tmpl index 2f849ee006..bd24bbcef1 100644 --- a/cso/SKILL.md.tmpl +++ b/cso/SKILL.md.tmpl @@ -3,12 +3,7 @@ name: cso preamble-tier: 2 version: 2.0.0 description: | - Chief Security Officer mode. Infrastructure-first security audit: secrets archaeology, - dependency supply chain, CI/CD pipeline security, LLM/AI security, skill supply chain - scanning, plus OWASP Top 10, STRIDE threat modeling, and active verification. - Two modes: daily (zero-noise, 8/10 confidence gate) and comprehensive (monthly deep - scan, 2/10 bar). Trend tracking across audit runs. - Use when: "security audit", "threat model", "pentest review", "OWASP", "CSO review". (gstack) + gstack security audit for secrets, dependencies, CI/CD, LLM risks, OWASP, and threat modeling. 
voice-triggers: - "see-so" - "see so" diff --git a/design-consultation/SKILL.md b/design-consultation/SKILL.md index 3ccd0140f3..80cc9b57d9 100644 --- a/design-consultation/SKILL.md +++ b/design-consultation/SKILL.md @@ -3,13 +3,7 @@ name: design-consultation preamble-tier: 3 version: 1.0.0 description: | - Design consultation: understands your product, researches the landscape, proposes a - complete design system (aesthetic, typography, color, layout, spacing, motion), and - generates font+color preview pages. Creates DESIGN.md as your project's design source - of truth. For existing sites, use /plan-design-review to infer the system instead. - Use when asked to "design system", "brand guidelines", or "create DESIGN.md". - Proactively suggest when starting a new project's UI with no existing - design system or DESIGN.md. (gstack) + gstack design consultation for creating a product design system and DESIGN.md source of truth. allowed-tools: - Bash - Read diff --git a/design-consultation/SKILL.md.tmpl b/design-consultation/SKILL.md.tmpl index a4eba48fc5..23952fbaa2 100644 --- a/design-consultation/SKILL.md.tmpl +++ b/design-consultation/SKILL.md.tmpl @@ -3,13 +3,7 @@ name: design-consultation preamble-tier: 3 version: 1.0.0 description: | - Design consultation: understands your product, researches the landscape, proposes a - complete design system (aesthetic, typography, color, layout, spacing, motion), and - generates font+color preview pages. Creates DESIGN.md as your project's design source - of truth. For existing sites, use /plan-design-review to infer the system instead. - Use when asked to "design system", "brand guidelines", or "create DESIGN.md". - Proactively suggest when starting a new project's UI with no existing - design system or DESIGN.md. (gstack) + gstack design consultation for creating a product design system and DESIGN.md source of truth. 
allowed-tools: - Bash - Read diff --git a/design-html/SKILL.md b/design-html/SKILL.md index 844b9d9c96..5f8a2b6c90 100644 --- a/design-html/SKILL.md +++ b/design-html/SKILL.md @@ -3,14 +3,7 @@ name: design-html preamble-tier: 2 version: 1.0.0 description: | - Design finalization: generates production-quality Pretext-native HTML/CSS. - Works with approved mockups from /design-shotgun, CEO plans from /plan-ceo-review, - design review context from /plan-design-review, or from scratch with a user - description. Text actually reflows, heights are computed, layouts are dynamic. - 30KB overhead, zero deps. Smart API routing: picks the right Pretext patterns - for each design type. Use when: "finalize this design", "turn this into HTML", - "build me a page", "implement this design", or after any planning skill. - Proactively suggest when user has approved a design or has a plan ready. (gstack) + gstack design-html turns approved designs or plans into production-quality Pretext-native HTML/CSS. Voice triggers (speech-to-text aliases): "build the design", "code the mockup", "make it real". triggers: - build the design diff --git a/design-html/SKILL.md.tmpl b/design-html/SKILL.md.tmpl index 3cdec9a14d..5e3a56ec53 100644 --- a/design-html/SKILL.md.tmpl +++ b/design-html/SKILL.md.tmpl @@ -3,14 +3,7 @@ name: design-html preamble-tier: 2 version: 1.0.0 description: | - Design finalization: generates production-quality Pretext-native HTML/CSS. - Works with approved mockups from /design-shotgun, CEO plans from /plan-ceo-review, - design review context from /plan-design-review, or from scratch with a user - description. Text actually reflows, heights are computed, layouts are dynamic. - 30KB overhead, zero deps. Smart API routing: picks the right Pretext patterns - for each design type. Use when: "finalize this design", "turn this into HTML", - "build me a page", "implement this design", or after any planning skill. 
- Proactively suggest when user has approved a design or has a plan ready. (gstack) + gstack design-html turns approved designs or plans into production-quality Pretext-native HTML/CSS. voice-triggers: - "build the design" - "code the mockup" diff --git a/design-review/SKILL.md b/design-review/SKILL.md index 43aec13e0c..b81359b181 100644 --- a/design-review/SKILL.md +++ b/design-review/SKILL.md @@ -3,13 +3,7 @@ name: design-review preamble-tier: 4 version: 2.0.0 description: | - Designer's eye QA: finds visual inconsistency, spacing issues, hierarchy problems, - AI slop patterns, and slow interactions — then fixes them. Iteratively fixes issues - in source code, committing each fix atomically and re-verifying with before/after - screenshots. For plan-mode design review (before implementation), use /plan-design-review. - Use when asked to "audit the design", "visual QA", "check if it looks good", or "design polish". - Proactively suggest when the user mentions visual inconsistencies or - wants to polish the look of a live site. (gstack) + gstack live design audit that finds visual issues, fixes code, and verifies with screenshots. allowed-tools: - Bash - Read diff --git a/design-review/SKILL.md.tmpl b/design-review/SKILL.md.tmpl index bdcda48e29..28ff214cfd 100644 --- a/design-review/SKILL.md.tmpl +++ b/design-review/SKILL.md.tmpl @@ -3,13 +3,7 @@ name: design-review preamble-tier: 4 version: 2.0.0 description: | - Designer's eye QA: finds visual inconsistency, spacing issues, hierarchy problems, - AI slop patterns, and slow interactions — then fixes them. Iteratively fixes issues - in source code, committing each fix atomically and re-verifying with before/after - screenshots. For plan-mode design review (before implementation), use /plan-design-review. - Use when asked to "audit the design", "visual QA", "check if it looks good", or "design polish". - Proactively suggest when the user mentions visual inconsistencies or - wants to polish the look of a live site. 
(gstack) + gstack live design audit that finds visual issues, fixes code, and verifies with screenshots. allowed-tools: - Bash - Read diff --git a/design-shotgun/SKILL.md b/design-shotgun/SKILL.md index a9f1625b23..a6b77dd7e3 100644 --- a/design-shotgun/SKILL.md +++ b/design-shotgun/SKILL.md @@ -3,12 +3,7 @@ name: design-shotgun preamble-tier: 2 version: 1.0.0 description: | - Design shotgun: generate multiple AI design variants, open a comparison board, - collect structured feedback, and iterate. Standalone design exploration you can - run anytime. Use when: "explore designs", "show me options", "design variants", - "visual brainstorm", or "I don't like how this looks". - Proactively suggest when the user describes a UI feature but hasn't seen - what it could look like. (gstack) + gstack design shotgun generates multiple design variants, compares them, and iterates from feedback. triggers: - explore design variants - show me design options diff --git a/design-shotgun/SKILL.md.tmpl b/design-shotgun/SKILL.md.tmpl index f78070edd1..81ce94e7be 100644 --- a/design-shotgun/SKILL.md.tmpl +++ b/design-shotgun/SKILL.md.tmpl @@ -3,12 +3,7 @@ name: design-shotgun preamble-tier: 2 version: 1.0.0 description: | - Design shotgun: generate multiple AI design variants, open a comparison board, - collect structured feedback, and iterate. Standalone design exploration you can - run anytime. Use when: "explore designs", "show me options", "design variants", - "visual brainstorm", or "I don't like how this looks". - Proactively suggest when the user describes a UI feature but hasn't seen - what it could look like. (gstack) + gstack design shotgun generates multiple design variants, compares them, and iterates from feedback. 
triggers: - explore design variants - show me design options diff --git a/devex-review/SKILL.md b/devex-review/SKILL.md index 57bcba04a5..9400000602 100644 --- a/devex-review/SKILL.md +++ b/devex-review/SKILL.md @@ -3,13 +3,7 @@ name: devex-review preamble-tier: 3 version: 1.0.0 description: | - Live developer experience audit. Uses the browse tool to actually TEST the - developer experience: navigates docs, tries the getting started flow, times - TTHW, screenshots error messages, evaluates CLI help text. Produces a DX - scorecard with evidence. Compares against /plan-devex-review scores if they - exist (the boomerang: plan said 3 minutes, reality says 8). Use when asked to - "test the DX", "DX audit", "developer experience test", or "try the - onboarding". Proactively suggest after shipping a developer-facing feature. (gstack) + gstack live developer-experience audit for docs, onboarding, CLI help, and integration friction. Voice triggers (speech-to-text aliases): "dx audit", "test the developer experience", "try the onboarding", "developer experience test". triggers: - live dx audit diff --git a/devex-review/SKILL.md.tmpl b/devex-review/SKILL.md.tmpl index 081d4f35bb..fe24aada54 100644 --- a/devex-review/SKILL.md.tmpl +++ b/devex-review/SKILL.md.tmpl @@ -3,13 +3,7 @@ name: devex-review preamble-tier: 3 version: 1.0.0 description: | - Live developer experience audit. Uses the browse tool to actually TEST the - developer experience: navigates docs, tries the getting started flow, times - TTHW, screenshots error messages, evaluates CLI help text. Produces a DX - scorecard with evidence. Compares against /plan-devex-review scores if they - exist (the boomerang: plan said 3 minutes, reality says 8). Use when asked to - "test the DX", "DX audit", "developer experience test", or "try the - onboarding". Proactively suggest after shipping a developer-facing feature. (gstack) + gstack live developer-experience audit for docs, onboarding, CLI help, and integration friction. 
 voice-triggers:
   - "dx audit"
   - "test the developer experience"
diff --git a/docs/designs/SKILL_CONTEXT_BUDGET.md b/docs/designs/SKILL_CONTEXT_BUDGET.md
new file mode 100644
index 0000000000..8d9358285c
--- /dev/null
+++ b/docs/designs/SKILL_CONTEXT_BUDGET.md
@@ -0,0 +1,371 @@
+# Skill Context Budget Plan
+
+Status: proposed
+Date: 2026-04-28
+Branch: `chore/skill-context-budget-plan`
+
+## Problem
+
+gstack's skill surface is doing two different jobs with the same files:
+
+1. **Routing/discovery**: the host needs enough metadata to decide whether a
+   skill applies.
+2. **Execution**: once selected, the model needs the actual workflow.
+
+Today, discovery points at full `SKILL.md` files whose frontmatter descriptions
+are long, and execution files inline large shared preambles plus large workflow
+sections. That burns context before the user task has started and makes some
+skills expensive to read even after they are correctly selected.
+
+Measured on this clone:
+
+| Metric | Current |
+|---|---:|
+| Visible generated `SKILL.md` files | 47 |
+| Total visible `SKILL.md` bytes | 2,297,236 |
+| Approx visible body tokens | 574,309 |
+| Skills over 50 KB | 18 |
+| Frontmatter description chars | 20,951 |
+| Approx frontmatter description tokens | 5,238 |
+
+Largest bodies:
+
+| Skill | Bytes |
+|---|---:|
+| `ship/SKILL.md` | 145,370 |
+| `plan-ceo-review/SKILL.md` | 119,001 |
+| `office-hours/SKILL.md` | 103,944 |
+| `plan-design-review/SKILL.md` | 94,388 |
+| `plan-devex-review/SKILL.md` | 94,240 |
+| `design-review/SKILL.md` | 88,647 |
+| `plan-eng-review/SKILL.md` | 85,742 |
+| `land-and-deploy/SKILL.md` | 82,818 |
+| `autoplan/SKILL.md` | 79,479 |
+| `review/SKILL.md` | 78,992 |
+
+There is already a concrete symptom in `test/skill-e2e-workflow.test.ts`: the
+Codex E2E test extracts only the review-relevant section because the full
+`codex/SKILL.md` is large enough to exhaust turns.
+
+## Goals
+
+- Reduce eager skill discovery context without making skills harder to invoke.
+- Keep skill behavior intact by loading detailed workflow text only after routing.
+- Add budget tests so future growth is visible and eventually blocked.
+- Make the change host-aware: Claude, Codex, OpenCode, OpenClaw, Factory, and
+  other generated hosts should all benefit without one-off patches.
+
+## Non-Goals
+
+- Do not build a runtime tool-output compactor. `docs/designs/GCOMPACTION.md`
+  covers that separate problem and is blocked on host API support.
+- Do not rewrite all skill workflows in one PR.
+- Do not hard-fail existing large skills before the repo has a migration path.
+- Do not remove behavior solely to hit byte targets; move it behind lazy loading
+  or shared references first.
+
+## Design Principles
+
+1. **Discovery is not execution.** Discovery metadata should be a compact routing
+   index; full workflow instructions should be read only for selected skills.
+2. **Budgets must ratchet.** Start with measured warn-only thresholds, then lower
+   and harden once the first slimming pass lands.
+3. **Reference files are acceptable.** A skill can instruct the agent to read
+   `references/...` only when that branch of the workflow is reached.
+4. **Behavioral invariants get tests.** Any slimming pass needs static checks and
+   at least targeted E2E coverage for the affected skill family.
+
+## Proposed Architecture
+
+### 1. Add a Skill Context Budget Reporter
+
+Create `scripts/skill-context-budget.ts` with two modes:
+
+- `--report`: print a table and JSON summary.
+- `--check`: enforce configured thresholds.
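The two modes could share one evaluation pass and differ only in exit behavior. The sketch below is a minimal illustration under stated assumptions, not the planned implementation: `SkillSize`, `approxTokens`, and `runBudget` are hypothetical names, and the bytes-divided-by-four token estimate is an assumption inferred from the measurements in the Problem section (2,297,236 bytes against roughly 574,309 tokens).

```typescript
// Hypothetical sketch of the reporter's two modes. Every name here is an
// illustrative assumption, not an existing gstack API.

type Mode = "report" | "check";

interface SkillSize {
  path: string;
  bytes: number;
}

interface BudgetRow extends SkillSize {
  approxTokens: number;
  overCeiling: boolean;
}

interface BudgetResult {
  rows: BudgetRow[];
  exitCode: number;
}

// Assumption: about 4 bytes per token, which reproduces the figures in the
// "Measured on this clone" table.
function approxTokens(bytes: number): number {
  return Math.round(bytes / 4);
}

function runBudget(mode: Mode, skills: SkillSize[], ceilingBytes: number): BudgetResult {
  const rows: BudgetRow[] = skills
    .map((s) => ({
      ...s,
      approxTokens: approxTokens(s.bytes),
      overCeiling: s.bytes > ceilingBytes,
    }))
    .sort((a, b) => b.bytes - a.bytes); // largest skills first

  // "report" never fails; "check" fails when any row exceeds the ceiling.
  const failed = mode === "check" && rows.some((r) => r.overCeiling);
  return { rows, exitCode: failed ? 1 : 0 };
}

// Usage with two sizes taken from the "Largest bodies" table. The ceiling
// value is passed in by the caller; 160 * 1024 is only an example.
const sample: SkillSize[] = [
  { path: "review/SKILL.md", bytes: 78992 },
  { path: "ship/SKILL.md", bytes: 145370 },
];
console.log(JSON.stringify(runBudget("report", sample, 160 * 1024), null, 2));
```

Keeping the evaluation pure and returning an exit code, rather than exiting inside the function, would let the proposed unit test call the same code path as the CLI.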
+
+Metrics:
+
+- generated skill body bytes, lines, and approximate tokens
+- frontmatter description chars and approximate tokens
+- eager catalog estimate: one line per skill with name, short description, and path
+- largest skills
+- largest descriptions
+- per-host generated totals when host output directories exist
+- hidden/generated duplicate totals under host subdirectories
+
+Initial thresholds should be warn-only except where clearly safe:
+
+| Budget | Initial | Enforcement |
+|---|---:|---|
+| Per-description target | 180 chars | warn |
+| Per-description hard limit | 360 chars | fail for new or edited templates |
+| Eager catalog target | 12,000 chars | warn |
+| Individual skill target | 50 KB | warn |
+| Individual skill hard ceiling | 160 KB | fail, matching current generator ceiling |
+| Preamble target for tier >= 2 | 22 KB | warn |
+
+Wire it into:
+
+- `bun run skill:budget`
+- `bun run skill:budget:check`
+- `bun test` via a new free unit test
+- `bun run skill:check` summary output
+
+This should reuse `scripts/discover-skills.ts` and the existing frontmatter
+parser logic from `scripts/gen-skill-docs.ts` or move that parser into a shared
+helper.
+
+### 2. Split Routing Metadata From Long Descriptions
+
+The frontmatter `description` field should become short enough to be safe for
+eager catalogs. Long explanations should move to the body or to references.
+
+Template convention:
+
+```yaml
+---
+name: ship
+description: "Ship workflow: test, review, version, changelog, commit, push, and open a PR."
+triggers:
+  - ship it
+  - create a pr
+  - push this branch
+---
+```
+
+Rules:
+
+- `description` is one sentence, preferably <= 180 chars.
+- `triggers` carries invocation phrases, not prose.
+- Host-specific `openai.yaml` keeps using `short_description`.
+- Existing long "Use when..." text moves into a body section named
+  `## Routing Notes` if the workflow still needs it.
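The description rules and the two per-description budgets can be read as a three-way classification. The following is a rough sketch, assuming that "chars" means JavaScript string length and that untouched templates over the hard limit only warn. Neither assumption is stated in the plan, and `classifyDescription` is a hypothetical name.

```typescript
// Hypothetical sketch: classify a frontmatter description against the
// proposed budgets. Names are illustrative, not existing gstack code.

type Verdict = "ok" | "warn" | "fail";

const DESCRIPTION_TARGET_CHARS = 180; // per-description target (warn above this)
const DESCRIPTION_HARD_LIMIT_CHARS = 360; // hard limit for new or edited templates

function classifyDescription(description: string, isNewOrEdited: boolean): Verdict {
  // Assumption: "chars" is measured as UTF-16 code units via String.length.
  const chars = description.trim().length;
  if (chars <= DESCRIPTION_TARGET_CHARS) return "ok";
  // Assumption: only new or edited templates hard-fail; untouched templates
  // over the limit still warn so existing skills have a migration path.
  if (chars > DESCRIPTION_HARD_LIMIT_CHARS && isNewOrEdited) return "fail";
  return "warn";
}

// Usage: the example description from the template convention is well under
// the 180-char target.
console.log(
  classifyDescription(
    "Ship workflow: test, review, version, changelog, commit, push, and open a PR.",
    true,
  ),
);
```

Passing `isNewOrEdited` in explicitly leaves open how the reporter would detect edits, for example from the diff against the base branch, which the plan does not specify.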
+ +Expected result: + +- 47-skill description total falls from 20,951 chars to <= 8,500 chars. +- The active catalog with paths should stay under about 11,000 chars. + +### 3. Make the Shared Preamble Load-Bearing But Smaller + +Current `scripts/resolvers/preamble.ts` composes useful sections, but many of +them are inlined into every large skill. Keep only session-critical instructions +inline: + +- update/session/config echo block +- routing prefix rules +- user-decision and AskUserQuestion contract +- completion/status bookkeeping + +Move expanded guidance into references: + +- `references/preamble/voice.md` +- `references/preamble/writing-style.md` +- `references/preamble/context-recovery.md` +- `references/preamble/search-before-building.md` +- `references/preamble/completeness.md` + +Generated skills should say when to read those references. Example: + +```md +For substantial implementation or review work, read +`$GSTACK_ROOT/references/preamble/context-recovery.md` before starting. +``` + +This preserves behavior for complex tasks while keeping every skill's default +read smaller. + +### 4. Split Mega Workflows Into Router + References + +For the top 10 skills, keep `SKILL.md.tmpl` as the routing and phase skeleton. +Move rarely used branches into explicit reference files. + +Priority order: + +1. `codex`: extract review, consult, challenge, and session-continuity modes. +2. `ship`: extract coverage audit, plan validation, review-army, Greptile, and + document-release handoff sections. +3. `review`: extract specialist checklists and report templates. +4. `plan-ceo-review`, `plan-eng-review`, `plan-design-review`, + `plan-devex-review`: extract scoring rubrics and outside-voice protocols. +5. `qa` and `design-review`: extract bug report templates, browser command + recipes, and fix-loop rubrics. 
+ +Target body budgets after migration: + +| Skill family | Target | +|---|---:| +| `codex` | <= 30 KB | +| `review` | <= 45 KB | +| `ship` | <= 70 KB | +| plan-review skills | <= 60 KB each | +| QA/design-review skills | <= 55 KB each | + +The generated workflow can still load multiple references during execution, but +only after the model has selected the relevant mode. + +### 5. Host Output Hygiene + +Generated host variants should not accidentally become discoverable by unrelated +hosts. Add a budget reporter check that flags `SKILL.md` files under hidden host +subdirectories, and document expected install layout for each host. + +Candidate rules: + +- Claude global install should contain Claude-facing skills only. +- Codex global install should contain Codex-facing skills only. +- Repo-local generated host directories should stay outside other hosts' + discovery paths or include a host-specific ignore/sentinel when supported. + +This is mostly a packaging/install concern, not a prompt-writing concern. 
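
One way the reporter might flag stray host variants is a path-based check like the sketch below. It assumes that "hidden host subdirectory" means any dot-prefixed directory in the relative path. The helper names and the `.codex/...` example path are hypothetical and are not confirmed by the repo layout.

```typescript
// Hypothetical sketch: "hidden" is assumed to mean a dot-prefixed directory segment.
function isHiddenHostSkill(relativePath: string): boolean {
  // Accept both POSIX and Windows separators, and ignore empty segments.
  const segments = relativePath.split(/[\\/]/).filter((segment) => segment.length > 0);
  const fileName = segments[segments.length - 1];
  if (fileName !== "SKILL.md") {
    return false;
  }
  // "." and ".." are navigation segments, not hidden directories.
  return segments
    .slice(0, -1)
    .some((dir) => dir.charAt(0) === "." && dir !== "." && dir !== "..");
}

// Return only the paths the reporter should flag.
function findHiddenHostSkills(paths: string[]): string[] {
  return paths.filter(isHiddenHostSkill);
}

// Usage with illustrative paths only:
const flagged = findHiddenHostSkills([
  "ship/SKILL.md",
  ".codex/skills/ship/SKILL.md",
  ".codex/README.md",
]);
```

A pure path check keeps the rule cheap enough to run inside the free unit test. Deciding whether a flagged file is actually discoverable by another host would still need host-specific knowledge.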
+ +## Implementation Sequence + +### Phase 0: Metrics and Guardrails + +Files: + +- `scripts/skill-context-budget.ts` +- `test/skill-context-budget.test.ts` +- `package.json` +- `scripts/skill-check.ts` + +Deliverables: + +- report current body and discovery budgets +- fail only for parser errors and the existing 160 KB hard ceiling +- warn on large descriptions, large preambles, and >50 KB skills + +Validation: + +```bash +bun run skill:budget +bun run skill:budget:check +bun test test/skill-context-budget.test.ts +``` + +### Phase 1: Description Slimming + +Files: + +- every `*/SKILL.md.tmpl` frontmatter description +- `test/gen-skill-docs.test.ts` + +Deliverables: + +- one-sentence descriptions +- invocation phrases moved to `triggers` +- test asserts all generated descriptions are <= 360 chars +- warning target <= 180 chars + +Validation: + +```bash +bun run gen:skill-docs +bun test test/gen-skill-docs.test.ts test/skill-validation.test.ts +``` + +### Phase 2: Shared Preamble Slim + +Files: + +- `scripts/resolvers/preamble.ts` +- `scripts/resolvers/preamble/*` +- `references/preamble/*.md` +- `test/gen-skill-docs.test.ts` +- golden fixture updates + +Deliverables: + +- tier >= 2 preamble target reduced from the current 33 KB guard toward 22 KB +- voice/writing/context sections moved to lazy references where safe +- existing tests assert the core voice and AskUserQuestion contracts remain + +Validation: + +```bash +bun run gen:skill-docs +bun test test/gen-skill-docs.test.ts test/preamble-compose.test.ts +``` + +### Phase 3: Split One Mega Skill First + +Start with `codex` because the E2E test already works around body size. 
+ +Files: + +- `codex/SKILL.md.tmpl` +- `codex/references/*.md` +- `test/skill-e2e-workflow.test.ts` + +Deliverables: + +- `codex/SKILL.md` <= 30 KB +- E2E test reads the full generated skill, not a sliced section +- no loss of review/challenge/consult mode coverage + +Validation: + +```bash +bun run gen:skill-docs +bun test test/gen-skill-docs.test.ts test/skill-e2e-workflow.test.ts +``` + +### Phase 4: Ratchet and Repeat + +After `codex` proves the pattern: + +- split `review` +- split `ship` +- split the plan-review family +- lower body warnings from 50 KB to 40 KB +- convert the per-description 360 char limit from "new/edited templates" to all + templates + +## Acceptance Criteria + +- `bun run skill:budget:check` passes. +- Total frontmatter description chars <= 8,500. +- Eager catalog estimate <= 11,000 chars including paths. +- No generated visible `SKILL.md` exceeds 160 KB. +- `codex/SKILL.md` no longer needs section slicing in E2E. +- Tier >= 2 preamble guard is ratcheted below 25 KB. +- Behavior-preserving tests pass for generator, validation, and the first split + mega-skill. + +## Risks + +- **Behavior drift from slimming.** Mitigate with golden fixture diffs and E2E + tests for each split skill. +- **Host-specific routing regressions.** Mitigate by testing generated output for + Claude and Codex first, then all hosts. +- **Too many lazy reads.** A split skill can become slower if it loads many + references unconditionally. Each reference must be mode- or phase-gated. +- **Budget gaming.** Byte limits alone can remove useful instruction. Pair every + budget ratchet with behavior invariants. + +## Open Questions + +- Does Claude Code's eager skill catalog consume full `description` only, or also + other frontmatter fields like `triggers`? We should verify empirically before + relying on trigger-heavy metadata. +- Which hosts support ignore files or non-discoverable reference directories? 
+- Should shared references live under root `references/` or under each skill + directory? Root references reduce duplication; per-skill references make + ownership clearer. +- Should `skill:budget:check` compare against an explicit JSON baseline to catch + percentage growth, or only enforce absolute limits? + +## Recommended First PR + +Ship Phase 0 and Phase 1 together: + +1. Add `scripts/skill-context-budget.ts`. +2. Add free budget tests. +3. Add `skill:budget` scripts. +4. Slim descriptions in templates only. +5. Regenerate `SKILL.md` files. + +This creates immediate context savings with low behavior risk, and it gives later +preamble/workflow refactors a measurable gate. diff --git a/document-release/SKILL.md b/document-release/SKILL.md index 7d049b195b..1558bdd21b 100644 --- a/document-release/SKILL.md +++ b/document-release/SKILL.md @@ -3,11 +3,7 @@ name: document-release preamble-tier: 2 version: 1.0.0 description: | - Post-ship documentation update. Reads all project docs, cross-references the - diff, updates README/ARCHITECTURE/CONTRIBUTING/CLAUDE.md to match what shipped, - polishes CHANGELOG voice, cleans up TODOS, and optionally bumps VERSION. Use when - asked to "update the docs", "sync documentation", or "post-ship docs". - Proactively suggest after a PR is merged or code is shipped. (gstack) + gstack document-release updates project docs, changelog, TODOs, and version notes after shipping. allowed-tools: - Bash - Read diff --git a/document-release/SKILL.md.tmpl b/document-release/SKILL.md.tmpl index 0fd08eac73..e7f535ffb3 100644 --- a/document-release/SKILL.md.tmpl +++ b/document-release/SKILL.md.tmpl @@ -3,11 +3,7 @@ name: document-release preamble-tier: 2 version: 1.0.0 description: | - Post-ship documentation update. Reads all project docs, cross-references the - diff, updates README/ARCHITECTURE/CONTRIBUTING/CLAUDE.md to match what shipped, - polishes CHANGELOG voice, cleans up TODOS, and optionally bumps VERSION. 
Use when - asked to "update the docs", "sync documentation", or "post-ship docs". - Proactively suggest after a PR is merged or code is shipped. (gstack) + gstack document-release updates project docs, changelog, TODOs, and version notes after shipping. allowed-tools: - Bash - Read diff --git a/freeze/SKILL.md b/freeze/SKILL.md index 2f034500c9..2183438978 100644 --- a/freeze/SKILL.md +++ b/freeze/SKILL.md @@ -2,11 +2,7 @@ name: freeze version: 0.1.0 description: | - Restrict file edits to a specific directory for the session. Blocks Edit and - Write outside the allowed path. Use when debugging to prevent accidentally - "fixing" unrelated code, or when you want to scope changes to one module. - Use when asked to "freeze", "restrict edits", "only edit this folder", - or "lock down edits". (gstack) + gstack freeze restricts file edits to one directory for scoped debugging or guarded work. triggers: - freeze edits to directory - lock editing scope diff --git a/freeze/SKILL.md.tmpl b/freeze/SKILL.md.tmpl index 85e646ed88..ca03d1c6bd 100644 --- a/freeze/SKILL.md.tmpl +++ b/freeze/SKILL.md.tmpl @@ -2,11 +2,7 @@ name: freeze version: 0.1.0 description: | - Restrict file edits to a specific directory for the session. Blocks Edit and - Write outside the allowed path. Use when debugging to prevent accidentally - "fixing" unrelated code, or when you want to scope changes to one module. - Use when asked to "freeze", "restrict edits", "only edit this folder", - or "lock down edits". (gstack) + gstack freeze restricts file edits to one directory for scoped debugging or guarded work. triggers: - freeze edits to directory - lock editing scope diff --git a/gstack-upgrade/SKILL.md b/gstack-upgrade/SKILL.md index 81bb1228c8..e9a45eda7f 100644 --- a/gstack-upgrade/SKILL.md +++ b/gstack-upgrade/SKILL.md @@ -2,9 +2,7 @@ name: gstack-upgrade version: 1.1.0 description: | - Upgrade gstack to the latest version. Detects global vs vendored install, - runs the upgrade, and shows what's new. 
Use when asked to "upgrade gstack", - "update gstack", or "get latest version". + gstack upgrade updates the installed workflow, applies migrations, and reports what changed. Voice triggers (speech-to-text aliases): "upgrade the tools", "update the tools", "gee stack upgrade", "g stack upgrade". triggers: - upgrade gstack diff --git a/gstack-upgrade/SKILL.md.tmpl b/gstack-upgrade/SKILL.md.tmpl index 5402a1da3c..36d863dc25 100644 --- a/gstack-upgrade/SKILL.md.tmpl +++ b/gstack-upgrade/SKILL.md.tmpl @@ -2,9 +2,7 @@ name: gstack-upgrade version: 1.1.0 description: | - Upgrade gstack to the latest version. Detects global vs vendored install, - runs the upgrade, and shows what's new. Use when asked to "upgrade gstack", - "update gstack", or "get latest version". + gstack upgrade updates the installed workflow, applies migrations, and reports what changed. voice-triggers: - "upgrade the tools" - "update the tools" diff --git a/guard/SKILL.md b/guard/SKILL.md index 9da5e21cb9..2de1c03f04 100644 --- a/guard/SKILL.md +++ b/guard/SKILL.md @@ -2,11 +2,7 @@ name: guard version: 0.1.0 description: | - Full safety mode: destructive command warnings + directory-scoped edits. - Combines /careful (warns before rm -rf, DROP TABLE, force-push, etc.) with - /freeze (blocks edits outside a specified directory). Use for maximum safety - when touching prod or debugging live systems. Use when asked to "guard mode", - "full safety", "lock it down", or "maximum safety". (gstack) + gstack guard combines destructive-command warnings with directory-scoped edit restrictions. triggers: - full safety mode - guard against mistakes diff --git a/guard/SKILL.md.tmpl b/guard/SKILL.md.tmpl index 1f3c6575a5..8c0a3c55a6 100644 --- a/guard/SKILL.md.tmpl +++ b/guard/SKILL.md.tmpl @@ -2,11 +2,7 @@ name: guard version: 0.1.0 description: | - Full safety mode: destructive command warnings + directory-scoped edits. - Combines /careful (warns before rm -rf, DROP TABLE, force-push, etc.) 
with - /freeze (blocks edits outside a specified directory). Use for maximum safety - when touching prod or debugging live systems. Use when asked to "guard mode", - "full safety", "lock it down", or "maximum safety". (gstack) + gstack guard combines destructive-command warnings with directory-scoped edit restrictions. triggers: - full safety mode - guard against mistakes diff --git a/health/SKILL.md b/health/SKILL.md index f9ab5c2259..0d1fdbebfe 100644 --- a/health/SKILL.md +++ b/health/SKILL.md @@ -3,11 +3,7 @@ name: health preamble-tier: 2 version: 1.0.0 description: | - Code quality dashboard. Wraps existing project tools (type checker, linter, - test runner, dead code detector, shell linter), computes a weighted composite - 0-10 score, and tracks trends over time. Use when: "health check", - "code quality", "how healthy is the codebase", "run all checks", - "quality score". (gstack) + gstack health dashboard for running project quality checks and tracking a weighted score over time. triggers: - code health check - quality dashboard diff --git a/health/SKILL.md.tmpl b/health/SKILL.md.tmpl index ca70c665de..0b45cf1376 100644 --- a/health/SKILL.md.tmpl +++ b/health/SKILL.md.tmpl @@ -3,11 +3,7 @@ name: health preamble-tier: 2 version: 1.0.0 description: | - Code quality dashboard. Wraps existing project tools (type checker, linter, - test runner, dead code detector, shell linter), computes a weighted composite - 0-10 score, and tracks trends over time. Use when: "health check", - "code quality", "how healthy is the codebase", "run all checks", - "quality score". (gstack) + gstack health dashboard for running project quality checks and tracking a weighted score over time. 
triggers: - code health check - quality dashboard diff --git a/investigate/SKILL.md b/investigate/SKILL.md index b9a8fa0a7b..f1df57ac9b 100644 --- a/investigate/SKILL.md +++ b/investigate/SKILL.md @@ -3,13 +3,7 @@ name: investigate preamble-tier: 2 version: 1.0.0 description: | - Systematic debugging with root cause investigation. Four phases: investigate, - analyze, hypothesize, implement. Iron Law: no fixes without root cause. - Use when asked to "debug this", "fix this bug", "why is this broken", - "investigate this error", or "root cause analysis". - Proactively invoke this skill (do NOT debug directly) when the user reports - errors, 500 errors, stack traces, unexpected behavior, "it was working - yesterday", or is troubleshooting why something stopped working. (gstack) + gstack root-cause debugging workflow for errors, regressions, and unexpected behavior. allowed-tools: - Bash - Read diff --git a/investigate/SKILL.md.tmpl b/investigate/SKILL.md.tmpl index fc8e931260..651ff4d05f 100644 --- a/investigate/SKILL.md.tmpl +++ b/investigate/SKILL.md.tmpl @@ -3,13 +3,7 @@ name: investigate preamble-tier: 2 version: 1.0.0 description: | - Systematic debugging with root cause investigation. Four phases: investigate, - analyze, hypothesize, implement. Iron Law: no fixes without root cause. - Use when asked to "debug this", "fix this bug", "why is this broken", - "investigate this error", or "root cause analysis". - Proactively invoke this skill (do NOT debug directly) when the user reports - errors, 500 errors, stack traces, unexpected behavior, "it was working - yesterday", or is troubleshooting why something stopped working. (gstack) + gstack root-cause debugging workflow for errors, regressions, and unexpected behavior. 
allowed-tools: - Bash - Read diff --git a/land-and-deploy/SKILL.md b/land-and-deploy/SKILL.md index 55a86d2d40..2cd29d3cc0 100644 --- a/land-and-deploy/SKILL.md +++ b/land-and-deploy/SKILL.md @@ -3,10 +3,7 @@ name: land-and-deploy preamble-tier: 4 version: 1.0.0 description: | - Land and deploy workflow. Merges the PR, waits for CI and deploy, - verifies production health via canary checks. Takes over after /ship - creates the PR. Use when: "merge", "land", "deploy", "merge and verify", - "land it", "ship it to production". (gstack) + gstack land-and-deploy merges a PR, waits for CI and deployment, then verifies production. allowed-tools: - Bash - Read diff --git a/land-and-deploy/SKILL.md.tmpl b/land-and-deploy/SKILL.md.tmpl index a08debea7c..a7db69f613 100644 --- a/land-and-deploy/SKILL.md.tmpl +++ b/land-and-deploy/SKILL.md.tmpl @@ -3,10 +3,7 @@ name: land-and-deploy preamble-tier: 4 version: 1.0.0 description: | - Land and deploy workflow. Merges the PR, waits for CI and deploy, - verifies production health via canary checks. Takes over after /ship - creates the PR. Use when: "merge", "land", "deploy", "merge and verify", - "land it", "ship it to production". (gstack) + gstack land-and-deploy merges a PR, waits for CI and deployment, then verifies production. allowed-tools: - Bash - Read diff --git a/landing-report/SKILL.md b/landing-report/SKILL.md index 4a04d77f76..61f4c82218 100644 --- a/landing-report/SKILL.md +++ b/landing-report/SKILL.md @@ -2,11 +2,7 @@ name: landing-report version: 0.1.0 description: | - Read-only queue dashboard for workspace-aware ship. Shows which VERSION slots - are currently claimed by open PRs, which sibling Conductor workspaces have - WIP work likely to ship soon, and what slot /ship would pick next. No - mutations — just a snapshot. Use when asked to "landing report", "what's in - the queue", "show me open PRs", or "which version do I claim next". 
(gstack) + gstack landing report shows claimed VERSION slots and likely sibling workspace conflicts before ship. triggers: - landing report - version queue diff --git a/landing-report/SKILL.md.tmpl b/landing-report/SKILL.md.tmpl index 32a8cc1ab0..b067d2609a 100644 --- a/landing-report/SKILL.md.tmpl +++ b/landing-report/SKILL.md.tmpl @@ -2,11 +2,7 @@ name: landing-report version: 0.1.0 description: | - Read-only queue dashboard for workspace-aware ship. Shows which VERSION slots - are currently claimed by open PRs, which sibling Conductor workspaces have - WIP work likely to ship soon, and what slot /ship would pick next. No - mutations — just a snapshot. Use when asked to "landing report", "what's in - the queue", "show me open PRs", or "which version do I claim next". (gstack) + gstack landing report shows claimed VERSION slots and likely sibling workspace conflicts before ship. triggers: - landing report - version queue diff --git a/learn/SKILL.md b/learn/SKILL.md index d6cacddb97..d43c402522 100644 --- a/learn/SKILL.md +++ b/learn/SKILL.md @@ -3,11 +3,7 @@ name: learn preamble-tier: 2 version: 1.0.0 description: | - Manage project learnings. Review, search, prune, and export what gstack - has learned across sessions. Use when asked to "what have we learned", - "show learnings", "prune stale learnings", or "export learnings". - Proactively suggest when the user asks about past patterns or wonders - "didn't we fix this before?" + gstack learn manages saved project learnings for review, search, pruning, and export. triggers: - show learnings - what have we learned diff --git a/learn/SKILL.md.tmpl b/learn/SKILL.md.tmpl index 8a0a7572c5..a3f392d58a 100644 --- a/learn/SKILL.md.tmpl +++ b/learn/SKILL.md.tmpl @@ -3,11 +3,7 @@ name: learn preamble-tier: 2 version: 1.0.0 description: | - Manage project learnings. Review, search, prune, and export what gstack - has learned across sessions. 
Use when asked to "what have we learned", - "show learnings", "prune stale learnings", or "export learnings". - Proactively suggest when the user asks about past patterns or wonders - "didn't we fix this before?" + gstack learn manages saved project learnings for review, search, pruning, and export. triggers: - show learnings - what have we learned diff --git a/make-pdf/SKILL.md b/make-pdf/SKILL.md index 538797ff78..e94c3504a0 100644 --- a/make-pdf/SKILL.md +++ b/make-pdf/SKILL.md @@ -3,11 +3,7 @@ name: make-pdf preamble-tier: 1 version: 1.0.0 description: | - Turn any markdown file into a publication-quality PDF. Proper 1in margins, - intelligent page breaks, page numbers, cover pages, running headers, curly - quotes and em dashes, clickable TOC, diagonal DRAFT watermark. Not a draft - artifact — a finished artifact. Use when asked to "make a PDF", "export to - PDF", "turn this markdown into a PDF", or "generate a document". (gstack) + gstack make-pdf turns markdown into polished PDFs with TOC, page numbers, links, and print styling. Voice triggers (speech-to-text aliases): "make this a pdf", "make it a pdf", "export to pdf", "turn this into a pdf", "turn this markdown into a pdf", "generate a pdf", "make a pdf from", "pdf this markdown". triggers: - markdown to pdf diff --git a/make-pdf/SKILL.md.tmpl b/make-pdf/SKILL.md.tmpl index 0827492a85..4f887a3157 100644 --- a/make-pdf/SKILL.md.tmpl +++ b/make-pdf/SKILL.md.tmpl @@ -3,11 +3,7 @@ name: make-pdf preamble-tier: 1 version: 1.0.0 description: | - Turn any markdown file into a publication-quality PDF. Proper 1in margins, - intelligent page breaks, page numbers, cover pages, running headers, curly - quotes and em dashes, clickable TOC, diagonal DRAFT watermark. Not a draft - artifact — a finished artifact. Use when asked to "make a PDF", "export to - PDF", "turn this markdown into a PDF", or "generate a document". 
(gstack) + gstack make-pdf turns markdown into polished PDFs with TOC, page numbers, links, and print styling. voice-triggers: - "make this a pdf" - "make it a pdf" diff --git a/office-hours/SKILL.md b/office-hours/SKILL.md index 952eafff12..be3d90bdfb 100644 --- a/office-hours/SKILL.md +++ b/office-hours/SKILL.md @@ -3,17 +3,8 @@ name: office-hours preamble-tier: 3 version: 2.0.0 description: | - YC Office Hours — two modes. Startup mode: six forcing questions that expose - demand reality, status quo, desperate specificity, narrowest wedge, observation, - and future-fit. Builder mode: design thinking brainstorming for side projects, - hackathons, learning, and open source. Saves a design doc. - Use when asked to "brainstorm this", "I have an idea", "help me think through - this", "office hours", or "is this worth building". - Proactively invoke this skill (do NOT answer directly) when the user describes - a new product idea, asks whether something is worth building, wants to think - through design decisions for something that doesn't exist yet, or is exploring - a concept before any code is written. - Use before /plan-ceo-review or /plan-eng-review. (gstack) + gstack office-hours reframes startup or builder ideas through forcing questions + before code. allowed-tools: - Bash - Read diff --git a/office-hours/SKILL.md.tmpl b/office-hours/SKILL.md.tmpl index 5b9f762e7a..03844af44c 100644 --- a/office-hours/SKILL.md.tmpl +++ b/office-hours/SKILL.md.tmpl @@ -3,17 +3,8 @@ name: office-hours preamble-tier: 3 version: 2.0.0 description: | - YC Office Hours — two modes. Startup mode: six forcing questions that expose - demand reality, status quo, desperate specificity, narrowest wedge, observation, - and future-fit. Builder mode: design thinking brainstorming for side projects, - hackathons, learning, and open source. Saves a design doc. - Use when asked to "brainstorm this", "I have an idea", "help me think through - this", "office hours", or "is this worth building". 
- Proactively invoke this skill (do NOT answer directly) when the user describes - a new product idea, asks whether something is worth building, wants to think - through design decisions for something that doesn't exist yet, or is exploring - a concept before any code is written. - Use before /plan-ceo-review or /plan-eng-review. (gstack) + gstack office-hours reframes startup or builder ideas through forcing questions + before code. allowed-tools: - Bash - Read diff --git a/open-gstack-browser/SKILL.md b/open-gstack-browser/SKILL.md index 5c91e63d26..c765ef8bf3 100644 --- a/open-gstack-browser/SKILL.md +++ b/open-gstack-browser/SKILL.md @@ -2,11 +2,7 @@ name: open-gstack-browser version: 0.2.0 description: | - Launch GStack Browser — AI-controlled Chromium with the sidebar extension baked in. - Opens a visible browser window where you can watch every action in real time. - The sidebar shows a live activity feed and chat. Anti-bot stealth built in. - Use when asked to "open gstack browser", "launch browser", "connect chrome", - "open chrome", "real browser", "launch chrome", "side panel", or "control my browser". + gstack visible browser launcher opens Chromium with the sidebar extension and live activity feed. Voice triggers (speech-to-text aliases): "show me the browser". triggers: - open gstack browser diff --git a/open-gstack-browser/SKILL.md.tmpl b/open-gstack-browser/SKILL.md.tmpl index ef91a52789..8e156e5f48 100644 --- a/open-gstack-browser/SKILL.md.tmpl +++ b/open-gstack-browser/SKILL.md.tmpl @@ -2,11 +2,7 @@ name: open-gstack-browser version: 0.2.0 description: | - Launch GStack Browser — AI-controlled Chromium with the sidebar extension baked in. - Opens a visible browser window where you can watch every action in real time. - The sidebar shows a live activity feed and chat. Anti-bot stealth built in. 
- Use when asked to "open gstack browser", "launch browser", "connect chrome", - "open chrome", "real browser", "launch chrome", "side panel", or "control my browser". + gstack visible browser launcher opens Chromium with the sidebar extension and live activity feed. voice-triggers: - "show me the browser" triggers: diff --git a/package.json b/package.json index 5326f31190..ce9b735c44 100644 --- a/package.json +++ b/package.json @@ -28,6 +28,8 @@ "test:gemini": "EVALS=1 bun test test/gemini-e2e.test.ts", "test:gemini:all": "EVALS=1 EVALS_ALL=1 bun test test/gemini-e2e.test.ts", "skill:check": "bun run scripts/skill-check.ts", + "skill:budget": "bun run scripts/skill-context-budget.ts --report", + "skill:budget:check": "bun run scripts/skill-context-budget.ts --check", "dev:skill": "bun run scripts/dev-skill.ts", "start": "bun run browse/src/server.ts", "eval:list": "bun run scripts/eval-list.ts", diff --git a/pair-agent/SKILL.md b/pair-agent/SKILL.md index 3351915071..22809f2aa9 100644 --- a/pair-agent/SKILL.md +++ b/pair-agent/SKILL.md @@ -2,12 +2,7 @@ name: pair-agent version: 0.1.0 description: | - Pair a remote AI agent with your browser. One command generates a setup key and - prints instructions the other agent can follow to connect. Works with OpenClaw, - Hermes, Codex, Cursor, or any agent that can make HTTP requests. The remote agent - gets its own tab with scoped access (read+write by default, admin on request). - Use when asked to "pair agent", "connect agent", "share browser", "remote browser", - "let another agent use my browser", or "give browser access". (gstack) + gstack pair-agent lets another AI agent connect to the shared browser through a scoped session key. Voice triggers (speech-to-text aliases): "pair agent", "connect agent", "share my browser", "remote browser access". 
triggers: - pair with agent diff --git a/pair-agent/SKILL.md.tmpl b/pair-agent/SKILL.md.tmpl index 75ed42d590..df1655a7b1 100644 --- a/pair-agent/SKILL.md.tmpl +++ b/pair-agent/SKILL.md.tmpl @@ -2,12 +2,7 @@ name: pair-agent version: 0.1.0 description: | - Pair a remote AI agent with your browser. One command generates a setup key and - prints instructions the other agent can follow to connect. Works with OpenClaw, - Hermes, Codex, Cursor, or any agent that can make HTTP requests. The remote agent - gets its own tab with scoped access (read+write by default, admin on request). - Use when asked to "pair agent", "connect agent", "share browser", "remote browser", - "let another agent use my browser", or "give browser access". (gstack) + gstack pair-agent lets another AI agent connect to the shared browser through a scoped session key. voice-triggers: - "pair agent" - "connect agent" diff --git a/plan-ceo-review/SKILL.md b/plan-ceo-review/SKILL.md index 1a745695c9..b31c255e92 100644 --- a/plan-ceo-review/SKILL.md +++ b/plan-ceo-review/SKILL.md @@ -4,14 +4,7 @@ preamble-tier: 3 interactive: true version: 1.0.0 description: | - CEO/founder-mode plan review. Rethink the problem, find the 10-star product, - challenge premises, expand scope when it creates a better product. Four modes: - SCOPE EXPANSION (dream big), SELECTIVE EXPANSION (hold scope + cherry-pick - expansions), HOLD SCOPE (maximum rigor), SCOPE REDUCTION (strip to essentials). - Use when asked to "think bigger", "expand scope", "strategy review", "rethink this", - or "is this ambitious enough". - Proactively suggest when the user is questioning scope or ambition of a plan, - or when the plan feels like it could be thinking bigger. (gstack) + gstack CEO plan review challenges scope, strategy, ambition, and product leverage before implementation. 
benefits-from: [office-hours] allowed-tools: - Read diff --git a/plan-ceo-review/SKILL.md.tmpl b/plan-ceo-review/SKILL.md.tmpl index 45648f8001..864dd1cc6d 100644 --- a/plan-ceo-review/SKILL.md.tmpl +++ b/plan-ceo-review/SKILL.md.tmpl @@ -4,14 +4,7 @@ preamble-tier: 3 interactive: true version: 1.0.0 description: | - CEO/founder-mode plan review. Rethink the problem, find the 10-star product, - challenge premises, expand scope when it creates a better product. Four modes: - SCOPE EXPANSION (dream big), SELECTIVE EXPANSION (hold scope + cherry-pick - expansions), HOLD SCOPE (maximum rigor), SCOPE REDUCTION (strip to essentials). - Use when asked to "think bigger", "expand scope", "strategy review", "rethink this", - or "is this ambitious enough". - Proactively suggest when the user is questioning scope or ambition of a plan, - or when the plan feels like it could be thinking bigger. (gstack) + gstack CEO plan review challenges scope, strategy, ambition, and product leverage before implementation. benefits-from: [office-hours] allowed-tools: - Read diff --git a/plan-design-review/SKILL.md b/plan-design-review/SKILL.md index 6a2807d95d..20dd05cc8c 100644 --- a/plan-design-review/SKILL.md +++ b/plan-design-review/SKILL.md @@ -4,13 +4,7 @@ preamble-tier: 3 interactive: true version: 2.0.0 description: | - Designer's eye plan review — interactive, like CEO and Eng review. - Rates each design dimension 0-10, explains what would make it a 10, - then fixes the plan to get there. Works in plan mode. For live site - visual audits, use /design-review. Use when asked to "review the design plan" - or "design critique". - Proactively suggest when the user has a plan with UI/UX components that - should be reviewed before implementation. (gstack) + gstack design plan review fixes missing UI/UX decisions before implementation starts. 
allowed-tools: - Read - Edit diff --git a/plan-design-review/SKILL.md.tmpl b/plan-design-review/SKILL.md.tmpl index e44ba7da3b..2e43b6311c 100644 --- a/plan-design-review/SKILL.md.tmpl +++ b/plan-design-review/SKILL.md.tmpl @@ -4,13 +4,7 @@ preamble-tier: 3 interactive: true version: 2.0.0 description: | - Designer's eye plan review — interactive, like CEO and Eng review. - Rates each design dimension 0-10, explains what would make it a 10, - then fixes the plan to get there. Works in plan mode. For live site - visual audits, use /design-review. Use when asked to "review the design plan" - or "design critique". - Proactively suggest when the user has a plan with UI/UX components that - should be reviewed before implementation. (gstack) + gstack design plan review fixes missing UI/UX decisions before implementation starts. allowed-tools: - Read - Edit diff --git a/plan-devex-review/SKILL.md b/plan-devex-review/SKILL.md index 5c00d00752..e7c0ee4de6 100644 --- a/plan-devex-review/SKILL.md +++ b/plan-devex-review/SKILL.md @@ -4,14 +4,7 @@ preamble-tier: 3 interactive: true version: 2.0.0 description: | - Interactive developer experience plan review. Explores developer personas, - benchmarks against competitors, designs magical moments, and traces friction - points before scoring. Three modes: DX EXPANSION (competitive advantage), - DX POLISH (bulletproof every touchpoint), DX TRIAGE (critical gaps only). - Use when asked to "DX review", "developer experience audit", "devex review", - or "API design review". - Proactively suggest when the user has a plan for developer-facing products - (APIs, CLIs, SDKs, libraries, platforms, docs). (gstack) + gstack developer-experience plan review improves API, CLI, SDK, docs, and onboarding plans. Voice triggers (speech-to-text aliases): "dx review", "developer experience review", "devex review", "devex audit", "API design review", "onboarding review". 
benefits-from: [office-hours] allowed-tools: diff --git a/plan-devex-review/SKILL.md.tmpl b/plan-devex-review/SKILL.md.tmpl index bd824dc2bf..1a6f6d2450 100644 --- a/plan-devex-review/SKILL.md.tmpl +++ b/plan-devex-review/SKILL.md.tmpl @@ -4,14 +4,7 @@ preamble-tier: 3 interactive: true version: 2.0.0 description: | - Interactive developer experience plan review. Explores developer personas, - benchmarks against competitors, designs magical moments, and traces friction - points before scoring. Three modes: DX EXPANSION (competitive advantage), - DX POLISH (bulletproof every touchpoint), DX TRIAGE (critical gaps only). - Use when asked to "DX review", "developer experience audit", "devex review", - or "API design review". - Proactively suggest when the user has a plan for developer-facing products - (APIs, CLIs, SDKs, libraries, platforms, docs). (gstack) + gstack developer-experience plan review improves API, CLI, SDK, docs, and onboarding plans. voice-triggers: - "dx review" - "developer experience review" diff --git a/plan-eng-review/SKILL.md b/plan-eng-review/SKILL.md index a5a5f4fc22..efedf1794d 100644 --- a/plan-eng-review/SKILL.md +++ b/plan-eng-review/SKILL.md @@ -4,12 +4,7 @@ preamble-tier: 3 interactive: true version: 1.0.0 description: | - Eng manager-mode plan review. Lock in the execution plan — architecture, - data flow, diagrams, edge cases, test coverage, performance. Walks through - issues interactively with opinionated recommendations. Use when asked to - "review the architecture", "engineering review", or "lock in the plan". - Proactively suggest when the user has a plan or design doc and is about to - start coding — to catch architecture issues before implementation. (gstack) + gstack engineering plan review locks architecture, data flow, edge cases, tests, and performance. Voice triggers (speech-to-text aliases): "tech review", "technical review", "plan engineering review". 
benefits-from: [office-hours] allowed-tools: diff --git a/plan-eng-review/SKILL.md.tmpl b/plan-eng-review/SKILL.md.tmpl index 2d26783771..27f78f0776 100644 --- a/plan-eng-review/SKILL.md.tmpl +++ b/plan-eng-review/SKILL.md.tmpl @@ -4,12 +4,7 @@ preamble-tier: 3 interactive: true version: 1.0.0 description: | - Eng manager-mode plan review. Lock in the execution plan — architecture, - data flow, diagrams, edge cases, test coverage, performance. Walks through - issues interactively with opinionated recommendations. Use when asked to - "review the architecture", "engineering review", or "lock in the plan". - Proactively suggest when the user has a plan or design doc and is about to - start coding — to catch architecture issues before implementation. (gstack) + gstack engineering plan review locks architecture, data flow, edge cases, tests, and performance. voice-triggers: - "tech review" - "technical review" diff --git a/plan-tune/SKILL.md b/plan-tune/SKILL.md index f89e61b85a..b6e00b394d 100644 --- a/plan-tune/SKILL.md +++ b/plan-tune/SKILL.md @@ -3,18 +3,7 @@ name: plan-tune preamble-tier: 2 version: 1.0.0 description: | - Self-tuning question sensitivity + developer psychographic for gstack (v1: observational). - Review which AskUserQuestion prompts fire across gstack skills, set per-question preferences - (never-ask / always-ask / ask-only-for-one-way), inspect the dual-track - profile (what you declared vs what your behavior suggests), and enable/disable - question tuning. Conversational interface — no CLI syntax required. - - Use when asked to "tune questions", "stop asking me that", "too many questions", - "show my profile", "what questions have I been asked", "show my vibe", - "developer profile", or "turn off question tuning". (gstack) - - Proactively suggest when the user says the same gstack question has come up before, - or when they explicitly override a recommendation for the Nth time. 
+ gstack plan-tune adjusts question sensitivity and reviews declared vs observed developer profile. triggers: - tune questions - stop asking me that diff --git a/plan-tune/SKILL.md.tmpl b/plan-tune/SKILL.md.tmpl index f31bd9f436..f6c7385dd0 100644 --- a/plan-tune/SKILL.md.tmpl +++ b/plan-tune/SKILL.md.tmpl @@ -3,18 +3,7 @@ name: plan-tune preamble-tier: 2 version: 1.0.0 description: | - Self-tuning question sensitivity + developer psychographic for gstack (v1: observational). - Review which AskUserQuestion prompts fire across gstack skills, set per-question preferences - (never-ask / always-ask / ask-only-for-one-way), inspect the dual-track - profile (what you declared vs what your behavior suggests), and enable/disable - question tuning. Conversational interface — no CLI syntax required. - - Use when asked to "tune questions", "stop asking me that", "too many questions", - "show my profile", "what questions have I been asked", "show my vibe", - "developer profile", or "turn off question tuning". (gstack) - - Proactively suggest when the user says the same gstack question has come up before, - or when they explicitly override a recommendation for the Nth time. + gstack plan-tune adjusts question sensitivity and reviews declared vs observed developer profile. triggers: - tune questions - stop asking me that diff --git a/qa-only/SKILL.md b/qa-only/SKILL.md index 17d766dea5..624369b28d 100644 --- a/qa-only/SKILL.md +++ b/qa-only/SKILL.md @@ -3,11 +3,7 @@ name: qa-only preamble-tier: 4 version: 1.0.0 description: | - Report-only QA testing. Systematically tests a web application and produces a - structured report with health score, screenshots, and repro steps — but never - fixes anything. Use when asked to "just report bugs", "qa report only", or - "test but don't fix". For the full test-fix-verify loop, use /qa instead. - Proactively suggest when the user wants a bug report without any code changes. 
(gstack) + gstack report-only QA tests a web app and produces evidence without changing code. Voice triggers (speech-to-text aliases): "bug report", "just check for bugs". allowed-tools: - Bash diff --git a/qa-only/SKILL.md.tmpl b/qa-only/SKILL.md.tmpl index 75c4123cc5..d2ae4cf7e5 100644 --- a/qa-only/SKILL.md.tmpl +++ b/qa-only/SKILL.md.tmpl @@ -3,11 +3,7 @@ name: qa-only preamble-tier: 4 version: 1.0.0 description: | - Report-only QA testing. Systematically tests a web application and produces a - structured report with health score, screenshots, and repro steps — but never - fixes anything. Use when asked to "just report bugs", "qa report only", or - "test but don't fix". For the full test-fix-verify loop, use /qa instead. - Proactively suggest when the user wants a bug report without any code changes. (gstack) + gstack report-only QA tests a web app and produces evidence without changing code. voice-triggers: - "bug report" - "just check for bugs" diff --git a/qa/SKILL.md b/qa/SKILL.md index 1f8e3116a7..1b1c697a04 100644 --- a/qa/SKILL.md +++ b/qa/SKILL.md @@ -3,14 +3,7 @@ name: qa preamble-tier: 4 version: 2.0.0 description: | - Systematically QA test a web application and fix bugs found. Runs QA testing, - then iteratively fixes bugs in source code, committing each fix atomically and - re-verifying. Use when asked to "qa", "QA", "test this site", "find bugs", - "test and fix", or "fix what's broken". - Proactively suggest when the user says a feature is ready for testing - or asks "does this work?". Three tiers: Quick (critical/high only), - Standard (+ medium), Exhaustive (+ cosmetic). Produces before/after health scores, - fix evidence, and a ship-readiness summary. For report-only mode, use /qa-only. (gstack) + gstack QA tests a web app, fixes bugs, re-verifies behavior, and reports ship readiness. Voice triggers (speech-to-text aliases): "quality check", "test the app", "run QA". 
allowed-tools: - Bash diff --git a/qa/SKILL.md.tmpl b/qa/SKILL.md.tmpl index 62081d2c19..8a50ffbf58 100644 --- a/qa/SKILL.md.tmpl +++ b/qa/SKILL.md.tmpl @@ -3,14 +3,7 @@ name: qa preamble-tier: 4 version: 2.0.0 description: | - Systematically QA test a web application and fix bugs found. Runs QA testing, - then iteratively fixes bugs in source code, committing each fix atomically and - re-verifying. Use when asked to "qa", "QA", "test this site", "find bugs", - "test and fix", or "fix what's broken". - Proactively suggest when the user says a feature is ready for testing - or asks "does this work?". Three tiers: Quick (critical/high only), - Standard (+ medium), Exhaustive (+ cosmetic). Produces before/after health scores, - fix evidence, and a ship-readiness summary. For report-only mode, use /qa-only. (gstack) + gstack QA tests a web app, fixes bugs, re-verifies behavior, and reports ship readiness. voice-triggers: - "quality check" - "test the app" diff --git a/retro/SKILL.md b/retro/SKILL.md index 08361de4a9..01b44de99b 100644 --- a/retro/SKILL.md +++ b/retro/SKILL.md @@ -3,11 +3,7 @@ name: retro preamble-tier: 2 version: 2.0.0 description: | - Weekly engineering retrospective. Analyzes commit history, work patterns, - and code quality metrics with persistent history and trend tracking. - Team-aware: breaks down per-person contributions with praise and growth areas. - Use when asked to "weekly retro", "what did we ship", or "engineering retrospective". - Proactively suggest at the end of a work week or sprint. (gstack) + gstack retro summarizes recent engineering work, contributors, quality trends, and follow-ups. allowed-tools: - Bash - Read diff --git a/retro/SKILL.md.tmpl b/retro/SKILL.md.tmpl index 0f5894ecf3..bcb673c6c5 100644 --- a/retro/SKILL.md.tmpl +++ b/retro/SKILL.md.tmpl @@ -3,11 +3,7 @@ name: retro preamble-tier: 2 version: 2.0.0 description: | - Weekly engineering retrospective. 
Analyzes commit history, work patterns, - and code quality metrics with persistent history and trend tracking. - Team-aware: breaks down per-person contributions with praise and growth areas. - Use when asked to "weekly retro", "what did we ship", or "engineering retrospective". - Proactively suggest at the end of a work week or sprint. (gstack) + gstack retro summarizes recent engineering work, contributors, quality trends, and follow-ups. allowed-tools: - Bash - Read diff --git a/review/SKILL.md b/review/SKILL.md index f21a401213..c8da13a0e3 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -3,10 +3,7 @@ name: review preamble-tier: 4 version: 1.0.0 description: | - Pre-landing PR review. Analyzes diff against the base branch for SQL safety, LLM trust - boundary violations, conditional side effects, and other structural issues. Use when - asked to "review this PR", "code review", "pre-landing review", or "check my diff". - Proactively suggest when the user is about to merge or land code changes. (gstack) + gstack pre-landing review checks branch diffs for structural bugs before merge. allowed-tools: - Bash - Read diff --git a/review/SKILL.md.tmpl b/review/SKILL.md.tmpl index fada691125..dd87ce513d 100644 --- a/review/SKILL.md.tmpl +++ b/review/SKILL.md.tmpl @@ -3,10 +3,7 @@ name: review preamble-tier: 4 version: 1.0.0 description: | - Pre-landing PR review. Analyzes diff against the base branch for SQL safety, LLM trust - boundary violations, conditional side effects, and other structural issues. Use when - asked to "review this PR", "code review", "pre-landing review", or "check my diff". - Proactively suggest when the user is about to merge or land code changes. (gstack) + gstack pre-landing review checks branch diffs for structural bugs before merge. 
allowed-tools: - Bash - Read diff --git a/scripts/skill-check.ts b/scripts/skill-check.ts index 9182737ee1..5f6d9d73b4 100644 --- a/scripts/skill-check.ts +++ b/scripts/skill-check.ts @@ -9,7 +9,9 @@ */ import { validateSkill } from '../test/helpers/skill-parser'; +import { ALL_HOST_CONFIGS, getExternalHosts, getHostConfig } from '../hosts/index'; import { discoverTemplates, discoverSkillFiles } from './discover-skills'; +import { collectSkillContextBudget, evaluateSkillContextBudget, formatBytes } from './skill-context-budget'; import * as fs from 'fs'; import * as path from 'path'; import { execSync } from 'child_process'; @@ -64,14 +66,26 @@ for (const file of SKILL_FILES) { console.log('\n Templates:'); const TEMPLATES = discoverTemplates(ROOT); +const CLAUDE_SKIPPED_SKILL_DIRS = new Set(getHostConfig('claude').generation.skipSkills ?? []); + +function templateSkillDir(tmpl: string): string { + const dir = path.dirname(tmpl); + return dir === '.' ? '' : dir; +} for (const { tmpl, output } of TEMPLATES) { const tmplPath = path.join(ROOT, tmpl); const outPath = path.join(ROOT, output); + const skippedForClaude = CLAUDE_SKIPPED_SKILL_DIRS.has(templateSkillDir(tmpl)); + if (!fs.existsSync(tmplPath)) { console.log(` \u26a0\ufe0f ${output.padEnd(30)} — no template`); continue; } + if (skippedForClaude) { + console.log(` - ${tmpl.padEnd(30)} — skipped for Claude host`); + continue; + } if (!fs.existsSync(outPath)) { hasErrors = true; console.log(` \u274c ${output.padEnd(30)} — generated file missing! 
Run: bun run gen:skill-docs`); @@ -88,9 +102,51 @@ for (const file of SKILL_FILES) { } } -// ─── External Host Skills (config-driven) ─────────────────── +// ─── Context Budget ───────────────────────────────────────── + +console.log('\n Context Budget:'); +const budgetReport = collectSkillContextBudget(ROOT); +const budgetEvaluation = evaluateSkillContextBudget(budgetReport); + +console.log( + ` Visible: ${budgetReport.visibleSkills.length} skills, ` + + `${formatBytes(budgetReport.totals.visibleBytes)} ` + + `(~${budgetReport.totals.visibleApproxTokens} tokens)`, +); +console.log( + ` Discovery: ${budgetReport.totals.visibleDescriptionChars} description chars, ` + + `${budgetReport.eagerCatalog.chars} catalog chars`, +); +console.log( + ` Hidden host outputs: ${budgetReport.hiddenHostSkills.length} skills, ` + + `${formatBytes(budgetReport.totals.hiddenHostBytes)}`, +); + +for (const error of budgetEvaluation.errors) { + hasErrors = true; + console.log(` \u274c ${error.path ?? error.code} — ${error.message}`); +} -import { getExternalHosts } from '../hosts/index'; +const hiddenHostWarningCount = budgetEvaluation.warnings.filter(warning => warning.path?.startsWith('.')).length; +const budgetWarnings = budgetEvaluation.warnings.filter(warning => !warning.path?.startsWith('.')); +const warningPreview = budgetWarnings.slice(0, 8); +for (const warning of warningPreview) { + console.log(` \u26a0\ufe0f ${warning.path ?? 
warning.code} — ${warning.message}`); +} +if (hiddenHostWarningCount > 0) { + console.log( + ` \u26a0\ufe0f ${budgetReport.hiddenHostSkills.length} hidden host generated skill file(s) present ` + + `(${hiddenHostWarningCount} warning(s)); run: bun run skill:budget`, + ); +} +if (budgetWarnings.length > warningPreview.length) { + console.log(` \u26a0\ufe0f ${budgetWarnings.length - warningPreview.length} more budget warning(s); run: bun run skill:budget`); +} +if (budgetEvaluation.errors.length === 0) { + console.log(' \u2705 Hard budget checks pass'); +} + +// ─── External Host Skills (config-driven) ─────────────────── for (const hostConfig of getExternalHosts()) { const hostDir = path.join(ROOT, hostConfig.hostSubdir, 'skills'); @@ -130,8 +186,6 @@ for (const hostConfig of getExternalHosts()) { // ─── Freshness (config-driven) ────────────────────────────── -import { ALL_HOST_CONFIGS } from '../hosts/index'; - for (const hostConfig of ALL_HOST_CONFIGS) { const hostFlag = hostConfig.name === 'claude' ? '' : ` --host ${hostConfig.name}`; console.log(`\n Freshness (${hostConfig.displayName}):`); diff --git a/scripts/skill-context-budget.ts b/scripts/skill-context-budget.ts new file mode 100644 index 0000000000..a9d7029c3f --- /dev/null +++ b/scripts/skill-context-budget.ts @@ -0,0 +1,583 @@ +#!/usr/bin/env bun +/** + * Skill context budget reporter. + * + * Measures eager discovery cost (frontmatter descriptions and catalog lines) + * separately from execution cost (generated SKILL.md body size and preamble). 
+ */ + +import * as fs from 'fs'; +import * as path from 'path'; +import { execFileSync } from 'child_process'; +import { ALL_HOST_CONFIGS } from '../hosts/index'; +import { discoverTemplates } from './discover-skills'; + +const ROOT = path.resolve(import.meta.dir, '..'); + +export const SKILL_CONTEXT_BUDGETS = { + descriptionTargetChars: 180, + descriptionHardChars: 360, + eagerCatalogTargetChars: 12_000, + skillTargetBytes: 50_000, + skillHardBytes: 160_000, + preambleTargetBytesForTier2Plus: 22_000, +} as const; + +const SKIP_DIRS = new Set([ + '.git', + 'node_modules', + 'dist', + 'coverage', + '.gstack', +]); + +export interface FrontmatterInfo { + name: string; + description: string; + preambleTier?: number; + bodyStart: number; +} + +export interface SkillBudgetEntry { + path: string; + name: string; + description: string; + descriptionChars: number; + bytes: number; + lines: number; + approxTokens: number; + preambleTier?: number; + preambleBytes?: number; + hidden: boolean; + host?: string; + generated: boolean; +} + +export interface TemplateDescriptionEntry { + path: string; + name: string; + description: string; + descriptionChars: number; + changed: boolean; +} + +export interface HostBudgetSummary { + host: string; + path: string; + exists: boolean; + count: number; + bytes: number; + approxTokens: number; +} + +export interface SkillContextBudgetReport { + root: string; + visibleSkills: SkillBudgetEntry[]; + hiddenHostSkills: SkillBudgetEntry[]; + templateDescriptions: TemplateDescriptionEntry[]; + hostSummaries: HostBudgetSummary[]; + parseErrors: Array<{ path: string; message: string }>; + eagerCatalog: { + chars: number; + approxTokens: number; + lines: string[]; + }; + totals: { + visibleBytes: number; + visibleLines: number; + visibleApproxTokens: number; + visibleDescriptionChars: number; + visibleDescriptionApproxTokens: number; + hiddenHostBytes: number; + hiddenHostApproxTokens: number; + }; +} + +export interface BudgetFinding { + level: 
'warning' | 'error'; + code: string; + path?: string; + message: string; +} + +export interface BudgetEvaluation { + warnings: BudgetFinding[]; + errors: BudgetFinding[]; +} + +function byteLength(value: string): number { + return Buffer.byteLength(value, 'utf8'); +} + +function approxTokens(charsOrBytes: number): number { + return Math.ceil(charsOrBytes / 4); +} + +function collapseWhitespace(value: string): string { + return value.replace(/\s+/g, ' ').trim(); +} + +function frontmatterEnd(content: string): number { + if (!content.startsWith('---\n')) return -1; + return content.indexOf('\n---', 4); +} + +function extractFrontmatterField(frontmatter: string, field: string): string { + const lines = frontmatter.split('\n'); + const fieldPattern = new RegExp(`^${field}:\\s*(.*)$`); + + for (let index = 0; index < lines.length; index++) { + const match = lines[index].match(fieldPattern); + if (!match) continue; + + const rest = match[1].trim(); + if (rest && rest !== '|' && rest !== '>') { + return rest.replace(/^['"]|['"]$/g, ''); + } + + const blockLines: string[] = []; + for (let blockIndex = index + 1; blockIndex < lines.length; blockIndex++) { + const line = lines[blockIndex]; + if (line.trim() !== '' && !/^\s/.test(line)) break; + blockLines.push(line.replace(/^ /, '')); + } + return blockLines.join('\n').trim(); + } + + return ''; +} + +export function parseSkillFrontmatter(content: string, relPath: string): FrontmatterInfo { + const fmEnd = frontmatterEnd(content); + if (fmEnd === -1) { + throw new Error(`${relPath} is missing YAML frontmatter`); + } + + const frontmatter = content.slice(4, fmEnd); + const name = extractFrontmatterField(frontmatter, 'name'); + const description = extractFrontmatterField(frontmatter, 'description'); + const tierRaw = extractFrontmatterField(frontmatter, 'preamble-tier'); + const preambleTier = tierRaw ? 
Number.parseInt(tierRaw, 10) : undefined; + + if (!name) throw new Error(`${relPath} frontmatter is missing name`); + if (!description) throw new Error(`${relPath} frontmatter is missing description`); + + return { + name, + description, + preambleTier: Number.isFinite(preambleTier) ? preambleTier : undefined, + bodyStart: fmEnd + '\n---'.length, + }; +} + +function estimatePreambleBytes(content: string, bodyStart: number, preambleTier?: number): number | undefined { + if (!preambleTier || preambleTier < 2) return undefined; + + const body = content.slice(bodyStart); + let inFence = false; + let offset = 0; + + for (const line of body.split('\n')) { + if (line.startsWith('```')) { + inFence = !inFence; + } else if (!inFence && line.startsWith('# ')) { + return byteLength(body.slice(0, offset)); + } + offset += line.length + 1; + } + + return undefined; +} + +function shouldSkipDir(name: string, includeHidden: boolean): boolean { + if (SKIP_DIRS.has(name)) return true; + if (!includeHidden && name.startsWith('.')) return true; + return false; +} + +function walkSkillFiles(root: string, includeHidden: boolean): string[] { + const results: string[] = []; + + function walk(dir: string): void { + let entries: fs.Dirent[]; + try { + entries = fs.readdirSync(dir, { withFileTypes: true }); + } catch { + return; + } + + for (const entry of entries) { + if (entry.isDirectory()) { + if (shouldSkipDir(entry.name, includeHidden)) continue; + walk(path.join(dir, entry.name)); + continue; + } + + if (entry.isFile() && entry.name === 'SKILL.md') { + results.push(path.relative(root, path.join(dir, entry.name))); + } + } + } + + walk(root); + return results.sort(); +} + +function collectChangedTemplates(root: string): Set<string> { + const changed = new Set<string>(); + const commands: string[][] = [ + ['diff', '--name-only', '--diff-filter=ACMRT', 'HEAD', '--', '*.tmpl'], + ['ls-files', '--others', '--exclude-standard', '--', '*.tmpl'], + ]; + + for (const args of commands) { + try { + const
output = execFileSync('git', args, { cwd: root, encoding: 'utf8', stdio: ['ignore', 'pipe', 'ignore'] }); + for (const line of output.split('\n')) { + const trimmed = line.trim(); + if (trimmed) changed.add(trimmed); + } + } catch { + // Non-git checkouts still get absolute limits and parser checks. + } + } + + return changed; +} + +function skillEntryFromFile( + root: string, + relPath: string, + hidden: boolean, + host: string | undefined, + parseErrors: Array<{ path: string; message: string }>, +): SkillBudgetEntry | null { + const fullPath = path.join(root, relPath); + const content = fs.readFileSync(fullPath, 'utf8'); + const stats = fs.statSync(fullPath); + + try { + const frontmatter = parseSkillFrontmatter(content, relPath); + return { + path: relPath, + name: frontmatter.name, + description: frontmatter.description, + descriptionChars: frontmatter.description.length, + bytes: stats.size, + lines: content.split('\n').length, + approxTokens: approxTokens(stats.size), + preambleTier: frontmatter.preambleTier, + preambleBytes: estimatePreambleBytes(content, frontmatter.bodyStart, frontmatter.preambleTier), + hidden, + host, + generated: content.includes('AUTO-GENERATED from SKILL.md.tmpl'), + }; + } catch (err) { + parseErrors.push({ path: relPath, message: (err as Error).message }); + return null; + } +} + +function hostSkillFiles(root: string, hostSubdir: string): string[] { + const hostRoot = path.join(root, hostSubdir, 'skills'); + if (!fs.existsSync(hostRoot)) return []; + + const rootRealPath = fs.realpathSync(root); + try { + if (fs.realpathSync(hostRoot) === rootRealPath) return []; + } catch { + return []; + } + + return walkSkillFiles(hostRoot, true) + .map(rel => path.join(hostSubdir, 'skills', rel)) + .sort(); +} + +export function collectSkillContextBudget(root: string = ROOT): SkillContextBudgetReport { + const parseErrors: Array<{ path: string; message: string }> = []; + const visibleSkills = walkSkillFiles(root, false) + .map(rel => 
skillEntryFromFile(root, rel, false, undefined, parseErrors)) + .filter((entry): entry is SkillBudgetEntry => entry !== null); + + const hiddenHostSkills: SkillBudgetEntry[] = []; + const hostSummaries: HostBudgetSummary[] = []; + + for (const hostConfig of ALL_HOST_CONFIGS) { + const relHostDir = path.join(hostConfig.hostSubdir, 'skills'); + const files = hostSkillFiles(root, hostConfig.hostSubdir); + const entries = files + .map(rel => skillEntryFromFile(root, rel, true, hostConfig.name, parseErrors)) + .filter((entry): entry is SkillBudgetEntry => entry !== null); + + hiddenHostSkills.push(...entries); + hostSummaries.push({ + host: hostConfig.name, + path: relHostDir, + exists: fs.existsSync(path.join(root, relHostDir)), + count: entries.length, + bytes: entries.reduce((sum, entry) => sum + entry.bytes, 0), + approxTokens: entries.reduce((sum, entry) => sum + entry.approxTokens, 0), + }); + } + + const changedTemplates = collectChangedTemplates(root); + const templateDescriptions = discoverTemplates(root).map(({ tmpl }) => { + const content = fs.readFileSync(path.join(root, tmpl), 'utf8'); + try { + const frontmatter = parseSkillFrontmatter(content, tmpl); + return { + path: tmpl, + name: frontmatter.name, + description: frontmatter.description, + descriptionChars: frontmatter.description.length, + changed: changedTemplates.has(tmpl), + }; + } catch (err) { + parseErrors.push({ path: tmpl, message: (err as Error).message }); + return { + path: tmpl, + name: '', + description: '', + descriptionChars: 0, + changed: changedTemplates.has(tmpl), + }; + } + }); + + const catalogLines = visibleSkills.map(entry => + `${entry.name}: ${collapseWhitespace(entry.description)} (${entry.path})` + ); + const catalogChars = catalogLines.reduce((sum, line) => sum + line.length + 1, 0); + + return { + root, + visibleSkills, + hiddenHostSkills, + templateDescriptions, + hostSummaries, + parseErrors, + eagerCatalog: { + chars: catalogChars, + approxTokens: 
approxTokens(catalogChars), + lines: catalogLines, + }, + totals: { + visibleBytes: visibleSkills.reduce((sum, entry) => sum + entry.bytes, 0), + visibleLines: visibleSkills.reduce((sum, entry) => sum + entry.lines, 0), + visibleApproxTokens: visibleSkills.reduce((sum, entry) => sum + entry.approxTokens, 0), + visibleDescriptionChars: visibleSkills.reduce((sum, entry) => sum + entry.descriptionChars, 0), + visibleDescriptionApproxTokens: approxTokens( + visibleSkills.reduce((sum, entry) => sum + entry.descriptionChars, 0), + ), + hiddenHostBytes: hiddenHostSkills.reduce((sum, entry) => sum + entry.bytes, 0), + hiddenHostApproxTokens: hiddenHostSkills.reduce((sum, entry) => sum + entry.approxTokens, 0), + }, + }; +} + +export function evaluateSkillContextBudget(report: SkillContextBudgetReport): BudgetEvaluation { + const warnings: BudgetFinding[] = []; + const errors: BudgetFinding[] = []; + const allSkillEntries = [...report.visibleSkills, ...report.hiddenHostSkills]; + + for (const parseError of report.parseErrors) { + errors.push({ + level: 'error', + code: 'frontmatter-parse', + path: parseError.path, + message: parseError.message, + }); + } + + for (const entry of allSkillEntries) { + if (entry.bytes > SKILL_CONTEXT_BUDGETS.skillHardBytes) { + errors.push({ + level: 'error', + code: 'skill-hard-ceiling', + path: entry.path, + message: `${entry.path} is ${formatBytes(entry.bytes)}, above ${formatBytes(SKILL_CONTEXT_BUDGETS.skillHardBytes)}`, + }); + } else if (entry.bytes > SKILL_CONTEXT_BUDGETS.skillTargetBytes) { + warnings.push({ + level: 'warning', + code: 'skill-target', + path: entry.path, + message: `${entry.path} is ${formatBytes(entry.bytes)}, above target ${formatBytes(SKILL_CONTEXT_BUDGETS.skillTargetBytes)}`, + }); + } + + if (entry.descriptionChars > SKILL_CONTEXT_BUDGETS.descriptionTargetChars) { + warnings.push({ + level: 'warning', + code: 'description-target', + path: entry.path, + message: `${entry.path} description is 
${entry.descriptionChars} chars, target ${SKILL_CONTEXT_BUDGETS.descriptionTargetChars}`, + }); + } + + if ( + entry.preambleTier !== undefined && + entry.preambleTier >= 2 && + entry.preambleBytes !== undefined && + entry.preambleBytes > SKILL_CONTEXT_BUDGETS.preambleTargetBytesForTier2Plus + ) { + warnings.push({ + level: 'warning', + code: 'preamble-target', + path: entry.path, + message: `${entry.path} tier ${entry.preambleTier} preamble is ${formatBytes(entry.preambleBytes)}, target ${formatBytes(SKILL_CONTEXT_BUDGETS.preambleTargetBytesForTier2Plus)}`, + }); + } + + if (entry.hidden) { + warnings.push({ + level: 'warning', + code: 'hidden-host-skill', + path: entry.path, + message: `${entry.path} is a generated host skill under ${entry.host ?? 'unknown host'} output`, + }); + } + } + + for (const template of report.templateDescriptions) { + if (template.changed && template.descriptionChars > SKILL_CONTEXT_BUDGETS.descriptionHardChars) { + errors.push({ + level: 'error', + code: 'changed-template-description-hard-limit', + path: template.path, + message: `${template.path} changed description is ${template.descriptionChars} chars, above ${SKILL_CONTEXT_BUDGETS.descriptionHardChars}`, + }); + } + } + + if (report.eagerCatalog.chars > SKILL_CONTEXT_BUDGETS.eagerCatalogTargetChars) { + warnings.push({ + level: 'warning', + code: 'eager-catalog-target', + message: `eager catalog estimate is ${report.eagerCatalog.chars} chars, target ${SKILL_CONTEXT_BUDGETS.eagerCatalogTargetChars}`, + }); + } + + return { warnings, errors }; +} + +export function formatBytes(bytes: number): string { + if (bytes < 1024) return `${bytes} B`; + if (bytes < 1024 * 1024) return `${(bytes / 1024).toFixed(1)} KB`; + return `${(bytes / 1024 / 1024).toFixed(2)} MB`; +} + +function topBy<T>(items: T[], pick: (item: T) => number, limit: number): T[] { + return [...items].sort((a, b) => pick(b) - pick(a)).slice(0, limit); +} + +export function summarizeSkillContextBudget(report: 
SkillContextBudgetReport): Record { + return { + visible_skills: report.visibleSkills.length, + visible_bytes: report.totals.visibleBytes, + visible_approx_tokens: report.totals.visibleApproxTokens, + visible_description_chars: report.totals.visibleDescriptionChars, + visible_description_approx_tokens: report.totals.visibleDescriptionApproxTokens, + eager_catalog_chars: report.eagerCatalog.chars, + eager_catalog_approx_tokens: report.eagerCatalog.approxTokens, + hidden_host_skills: report.hiddenHostSkills.length, + hidden_host_bytes: report.totals.hiddenHostBytes, + hidden_host_approx_tokens: report.totals.hiddenHostApproxTokens, + largest_skills: topBy(report.visibleSkills, entry => entry.bytes, 10).map(entry => ({ + path: entry.path, + bytes: entry.bytes, + approx_tokens: entry.approxTokens, + })), + largest_descriptions: topBy(report.visibleSkills, entry => entry.descriptionChars, 10).map(entry => ({ + path: entry.path, + chars: entry.descriptionChars, + })), + hosts: report.hostSummaries, + }; +} + +function table(rows: string[][]): string { + const widths = rows[0].map((_, index) => Math.max(...rows.map(row => row[index].length))); + return rows + .map(row => row.map((cell, index) => cell.padEnd(widths[index])).join(' ')) + .join('\n'); +} + +export function renderSkillContextBudgetReport( + report: SkillContextBudgetReport, + evaluation: BudgetEvaluation = evaluateSkillContextBudget(report), +): string { + const largestSkills = topBy(report.visibleSkills, entry => entry.bytes, 10); + const largestDescriptions = topBy(report.visibleSkills, entry => entry.descriptionChars, 10); + const hostRows = report.hostSummaries + .filter(host => host.exists || host.count > 0) + .map(host => [host.host, host.path, String(host.count), formatBytes(host.bytes), `~${host.approxTokens}`]); + + const sections = [ + 'Skill Context Budget', + '', + `Visible skills: ${report.visibleSkills.length}`, + `Visible bytes: ${formatBytes(report.totals.visibleBytes)} 
(~${report.totals.visibleApproxTokens} tokens)`, + `Visible description chars: ${report.totals.visibleDescriptionChars} (~${report.totals.visibleDescriptionApproxTokens} tokens)`, + `Eager catalog estimate: ${report.eagerCatalog.chars} chars (~${report.eagerCatalog.approxTokens} tokens)`, + `Hidden host duplicate bytes: ${formatBytes(report.totals.hiddenHostBytes)} (~${report.totals.hiddenHostApproxTokens} tokens)`, + '', + 'Largest skills:', + table([ + ['path', 'bytes', 'tokens', 'lines'], + ...largestSkills.map(entry => [entry.path, formatBytes(entry.bytes), `~${entry.approxTokens}`, String(entry.lines)]), + ]), + '', + 'Largest descriptions:', + table([ + ['path', 'chars'], + ...largestDescriptions.map(entry => [entry.path, String(entry.descriptionChars)]), + ]), + ]; + + if (hostRows.length > 0) { + sections.push('', 'Host generated outputs:', table([ + ['host', 'path', 'skills', 'bytes', 'tokens'], + ...hostRows, + ])); + } + + if (evaluation.errors.length > 0) { + sections.push('', 'Errors:', ...evaluation.errors.map(error => + `- [${error.code}] ${error.path ? `${error.path}: ` : ''}${error.message}` + )); + } + + if (evaluation.warnings.length > 0) { + sections.push('', 'Warnings:', ...evaluation.warnings.map(warning => + `- [${warning.code}] ${warning.path ? `${warning.path}: ` : ''}${warning.message}` + )); + } + + sections.push('', 'JSON summary:', JSON.stringify(summarizeSkillContextBudget(report), null, 2)); + return sections.join('\n'); +} + +function printUsageAndExit(): never { + console.error('Usage: bun run scripts/skill-context-budget.ts [--report|--check]'); + process.exit(2); +} + +if (import.meta.main) { + const mode = process.argv.includes('--check') + ? 'check' + : process.argv.includes('--report') || process.argv.length <= 2 + ? 
'report' + : 'unknown'; + + if (mode === 'unknown') printUsageAndExit(); + + const report = collectSkillContextBudget(ROOT); + const evaluation = evaluateSkillContextBudget(report); + console.log(renderSkillContextBudgetReport(report, evaluation)); + + if (mode === 'check' && evaluation.errors.length > 0) { + process.exit(1); + } +} diff --git a/setup-browser-cookies/SKILL.md b/setup-browser-cookies/SKILL.md index 8c2b65a399..067ce643fc 100644 --- a/setup-browser-cookies/SKILL.md +++ b/setup-browser-cookies/SKILL.md @@ -3,10 +3,7 @@ name: setup-browser-cookies preamble-tier: 1 version: 1.0.0 description: | - Import cookies from your real Chromium browser into the headless browse session. - Opens an interactive picker UI where you select which cookie domains to import. - Use before QA testing authenticated pages. Use when asked to "import cookies", - "login to the site", or "authenticate the browser". (gstack) + gstack cookie setup imports real-browser Chromium cookies into the headless browse session. triggers: - import browser cookies - login to test site diff --git a/setup-browser-cookies/SKILL.md.tmpl b/setup-browser-cookies/SKILL.md.tmpl index f812d9f56f..0630564425 100644 --- a/setup-browser-cookies/SKILL.md.tmpl +++ b/setup-browser-cookies/SKILL.md.tmpl @@ -3,10 +3,7 @@ name: setup-browser-cookies preamble-tier: 1 version: 1.0.0 description: | - Import cookies from your real Chromium browser into the headless browse session. - Opens an interactive picker UI where you select which cookie domains to import. - Use before QA testing authenticated pages. Use when asked to "import cookies", - "login to the site", or "authenticate the browser". (gstack) + gstack cookie setup imports real-browser Chromium cookies into the headless browse session. 
triggers: - import browser cookies - login to test site diff --git a/setup-deploy/SKILL.md b/setup-deploy/SKILL.md index 415181f4de..430563e114 100644 --- a/setup-deploy/SKILL.md +++ b/setup-deploy/SKILL.md @@ -3,12 +3,7 @@ name: setup-deploy preamble-tier: 2 version: 1.0.0 description: | - Configure deployment settings for /land-and-deploy. Detects your deploy - platform (Fly.io, Render, Vercel, Netlify, Heroku, GitHub Actions, custom), - production URL, health check endpoints, and deploy status commands. Writes - the configuration to CLAUDE.md so all future deploys are automatic. - Use when: "setup deploy", "configure deployment", "set up land-and-deploy", - "how do I deploy with gstack", "add deploy config". + gstack deploy setup detects platform, production URL, health checks, and persists deploy config. triggers: - configure deploy - setup deployment diff --git a/setup-deploy/SKILL.md.tmpl b/setup-deploy/SKILL.md.tmpl index 587a993c01..5f7fd37d55 100644 --- a/setup-deploy/SKILL.md.tmpl +++ b/setup-deploy/SKILL.md.tmpl @@ -3,12 +3,7 @@ name: setup-deploy preamble-tier: 2 version: 1.0.0 description: | - Configure deployment settings for /land-and-deploy. Detects your deploy - platform (Fly.io, Render, Vercel, Netlify, Heroku, GitHub Actions, custom), - production URL, health check endpoints, and deploy status commands. Writes - the configuration to CLAUDE.md so all future deploys are automatic. - Use when: "setup deploy", "configure deployment", "set up land-and-deploy", - "how do I deploy with gstack", "add deploy config". + gstack deploy setup detects platform, production URL, health checks, and persists deploy config. 
triggers: - configure deploy - setup deployment diff --git a/setup-gbrain/SKILL.md b/setup-gbrain/SKILL.md index 1ee78dac5e..e9bb1917fe 100644 --- a/setup-gbrain/SKILL.md +++ b/setup-gbrain/SKILL.md @@ -3,11 +3,7 @@ name: setup-gbrain preamble-tier: 2 version: 1.0.0 description: | - Set up gbrain for this coding agent: install the CLI, initialize a - local PGLite or Supabase brain, register MCP, capture per-remote trust - policy. One command from zero to "gbrain is running, and this agent - can call it." Use when: "setup gbrain", "connect gbrain", "start - gbrain", "install gbrain", "configure gbrain for this machine". (gstack) + gstack gbrain setup installs and registers the coding-agent memory CLI and trust policy. triggers: - setup gbrain - install gbrain diff --git a/setup-gbrain/SKILL.md.tmpl b/setup-gbrain/SKILL.md.tmpl index 3bbf9b12ef..4fbbce3c7a 100644 --- a/setup-gbrain/SKILL.md.tmpl +++ b/setup-gbrain/SKILL.md.tmpl @@ -3,11 +3,7 @@ name: setup-gbrain preamble-tier: 2 version: 1.0.0 description: | - Set up gbrain for this coding agent: install the CLI, initialize a - local PGLite or Supabase brain, register MCP, capture per-remote trust - policy. One command from zero to "gbrain is running, and this agent - can call it." Use when: "setup gbrain", "connect gbrain", "start - gbrain", "install gbrain", "configure gbrain for this machine". (gstack) + gstack gbrain setup installs and registers the coding-agent memory CLI and trust policy. triggers: - setup gbrain - install gbrain diff --git a/ship/SKILL.md b/ship/SKILL.md index 1030ef9938..aabf012267 100644 --- a/ship/SKILL.md +++ b/ship/SKILL.md @@ -3,11 +3,7 @@ name: ship preamble-tier: 4 version: 1.0.0 description: | - Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, - update CHANGELOG, commit, push, create PR. Use when asked to "ship", "deploy", - "push to main", "create a PR", "merge and push", or "get it deployed". 
- Proactively invoke this skill (do NOT push/PR directly) when the user says code - is ready, asks about deploying, wants to push code up, or asks to create a PR. (gstack) + gstack ship workflow tests, reviews, versions, changelog-updates, commits, pushes, and opens a PR. allowed-tools: - Bash - Read diff --git a/ship/SKILL.md.tmpl b/ship/SKILL.md.tmpl index b6a19bcbab..afa4fce3df 100644 --- a/ship/SKILL.md.tmpl +++ b/ship/SKILL.md.tmpl @@ -3,11 +3,7 @@ name: ship preamble-tier: 4 version: 1.0.0 description: | - Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, - update CHANGELOG, commit, push, create PR. Use when asked to "ship", "deploy", - "push to main", "create a PR", "merge and push", or "get it deployed". - Proactively invoke this skill (do NOT push/PR directly) when the user says code - is ready, asks about deploying, wants to push code up, or asks to create a PR. (gstack) + gstack ship workflow tests, reviews, versions, changelog-updates, commits, pushes, and opens a PR. 
allowed-tools: - Bash - Read diff --git a/test/gen-skill-docs.test.ts b/test/gen-skill-docs.test.ts index 4c20343581..704a05e2f1 100644 --- a/test/gen-skill-docs.test.ts +++ b/test/gen-skill-docs.test.ts @@ -6,7 +6,7 @@ import * as path from 'path'; import * as os from 'os'; const ROOT = path.resolve(import.meta.dir, '..'); -const MAX_SKILL_DESCRIPTION_LENGTH = 1024; +const MAX_SKILL_DESCRIPTION_LENGTH = 360; function extractDescription(content: string): string { const fmEnd = content.indexOf('\n---', 4); @@ -190,8 +190,8 @@ describe('gen-skill-docs', () => { } }); - test('every Codex SKILL.md description stays under 900-char warning threshold', () => { - const WARN_THRESHOLD = 900; + test('every Codex SKILL.md description stays under compact routing threshold', () => { + const WARN_THRESHOLD = MAX_SKILL_DESCRIPTION_LENGTH; const agentsDir = path.join(ROOT, '.agents', 'skills'); if (!fs.existsSync(agentsDir)) return; const violations: string[] = []; @@ -1176,7 +1176,13 @@ describe('DESIGN_SKETCH resolver', () => { describe('CODEX_SECOND_OPINION resolver', () => { const content = fs.readFileSync(path.join(ROOT, 'office-hours', 'SKILL.md'), 'utf-8'); - const codexContent = fs.readFileSync(path.join(ROOT, '.agents', 'skills', 'gstack-office-hours', 'SKILL.md'), 'utf-8'); + const codexOfficeHoursPath = path.join(ROOT, '.agents', 'skills', 'gstack-office-hours', 'SKILL.md'); + if (!fs.existsSync(codexOfficeHoursPath)) { + Bun.spawnSync(['bun', 'run', 'scripts/gen-skill-docs.ts', '--host', 'codex'], { + cwd: ROOT, stdout: 'pipe', stderr: 'pipe', + }); + } + const codexContent = fs.readFileSync(codexOfficeHoursPath, 'utf-8'); test('Phase 3.5 section appears in office-hours SKILL.md', () => { expect(content).toContain('Phase 3.5: Cross-Model Second Opinion'); @@ -1741,7 +1747,7 @@ describe('Codex generation (--host codex)', () => { }); test('multiline descriptions preserved in Codex output', () => { - // office-hours has a multiline description — verify it survives the 
frontmatter transform + // office-hours has a compact multiline description; verify it survives the frontmatter transform. const content = fs.readFileSync(path.join(AGENTS_DIR, 'gstack-office-hours', 'SKILL.md'), 'utf-8'); const fmEnd = content.indexOf('\n---', 4); const frontmatter = content.slice(4, fmEnd); @@ -1749,7 +1755,7 @@ describe('Codex generation (--host codex)', () => { const descLines = frontmatter.split('\n').filter(l => l.startsWith(' ')); expect(descLines.length).toBeGreaterThan(1); // Verify key phrases survived - expect(frontmatter).toContain('YC Office Hours'); + expect(frontmatter).toContain('gstack office-hours'); }); test('hook skills have safety prose and no hooks: in frontmatter', () => { diff --git a/test/skill-context-budget.test.ts b/test/skill-context-budget.test.ts new file mode 100644 index 0000000000..3d827734a3 --- /dev/null +++ b/test/skill-context-budget.test.ts @@ -0,0 +1,60 @@ +import { describe, expect, test } from 'bun:test'; +import * as path from 'path'; +import { + collectSkillContextBudget, + evaluateSkillContextBudget, + parseSkillFrontmatter, + SKILL_CONTEXT_BUDGETS, +} from '../scripts/skill-context-budget'; + +const ROOT = path.resolve(import.meta.dir, '..'); + +describe('skill context budget', () => { + test('collects visible skill execution and discovery metrics', () => { + const report = collectSkillContextBudget(ROOT); + + expect(report.visibleSkills.length).toBeGreaterThanOrEqual(40); + expect(report.totals.visibleBytes).toBeGreaterThan(1_000_000); + expect(report.totals.visibleApproxTokens).toBeGreaterThan(200_000); + expect(report.totals.visibleDescriptionChars).toBeGreaterThan(1_000); + expect(report.eagerCatalog.chars).toBeGreaterThan(report.totals.visibleDescriptionChars); + expect(report.eagerCatalog.lines.length).toBe(report.visibleSkills.length); + expect(report.totals.visibleDescriptionChars).toBeLessThanOrEqual(8_500); + expect(report.eagerCatalog.chars).toBeLessThanOrEqual(11_000); + }); + + 
test('current generated skills stay below the hard execution ceiling', () => { + const report = collectSkillContextBudget(ROOT); + const evaluation = evaluateSkillContextBudget(report); + + const hardErrors = evaluation.errors.filter(error => error.code === 'skill-hard-ceiling'); + expect(hardErrors).toEqual([]); + for (const skill of report.visibleSkills) { + expect(skill.bytes).toBeLessThanOrEqual(SKILL_CONTEXT_BUDGETS.skillHardBytes); + } + }); + + test('check mode has no hard errors in the current checkout', () => { + const report = collectSkillContextBudget(ROOT); + const evaluation = evaluateSkillContextBudget(report); + + expect(evaluation.errors).toEqual([]); + expect(evaluation.warnings.length).toBeGreaterThan(0); + }); + + test('frontmatter parser handles inline and block descriptions', () => { + const inline = parseSkillFrontmatter( + `---\nname: demo\ndescription: Small demo skill.\npreamble-tier: 2\n---\n# Demo\n`, + 'inline/SKILL.md', + ); + expect(inline.name).toBe('demo'); + expect(inline.description).toBe('Small demo skill.'); + expect(inline.preambleTier).toBe(2); + + const block = parseSkillFrontmatter( + `---\nname: demo\ndescription: |\n First line.\n Second line.\n---\n# Demo\n`, + 'block/SKILL.md', + ); + expect(block.description).toBe('First line.\nSecond line.'); + }); +}); diff --git a/test/skill-validation.test.ts b/test/skill-validation.test.ts index 23b909ae86..64851f17a7 100644 --- a/test/skill-validation.test.ts +++ b/test/skill-validation.test.ts @@ -1417,9 +1417,8 @@ describe('Codex skill', () => { // --- Trigger phrase validation --- describe('Skill trigger phrases', () => { - // Skills that must have "Use when" trigger phrases in their description. - // Excluded: root gstack (browser tool), gstack-upgrade (gstack-specific), - // humanizer (text tool) + // Skills that must have compact routing metadata. Long "Use when" prose moved + // out of descriptions so eager skill catalogs stay small. 
const SKILLS_REQUIRING_TRIGGERS = [ 'qa', 'qa-only', 'ship', 'review', 'investigate', 'office-hours', 'plan-ceo-review', 'plan-eng-review', 'plan-design-review', @@ -1428,18 +1427,19 @@ describe('Skill trigger phrases', () => { ]; for (const skill of SKILLS_REQUIRING_TRIGGERS) { - test(`${skill}/SKILL.md has "Use when" trigger phrases`, () => { + test(`${skill}/SKILL.md has trigger metadata outside the description`, () => { const skillPath = path.join(ROOT, skill, 'SKILL.md'); if (!fs.existsSync(skillPath)) return; const content = fs.readFileSync(skillPath, 'utf-8'); - // Extract description from frontmatter const frontmatterEnd = content.indexOf('---', 4); const frontmatter = content.slice(0, frontmatterEnd); - expect(frontmatter).toMatch(/Use when/i); + expect(frontmatter).toMatch(/^triggers:\n(?:\s+-\s+.+\n?)+/m); + expect(frontmatter).not.toMatch(/Use when/i); }); } - // Skills with proactive triggers should have "Proactively suggest" in description + // Proactive routing also lives in explicit trigger metadata, not long + // frontmatter descriptions. 
const SKILLS_REQUIRING_PROACTIVE = [ 'qa', 'qa-only', 'ship', 'review', 'investigate', 'office-hours', 'plan-ceo-review', 'plan-eng-review', 'plan-design-review', @@ -1447,13 +1447,14 @@ describe('Skill trigger phrases', () => { ]; for (const skill of SKILLS_REQUIRING_PROACTIVE) { - test(`${skill}/SKILL.md has proactive routing phrase`, () => { + test(`${skill}/SKILL.md keeps proactive routing out of description prose`, () => { const skillPath = path.join(ROOT, skill, 'SKILL.md'); if (!fs.existsSync(skillPath)) return; const content = fs.readFileSync(skillPath, 'utf-8'); const frontmatterEnd = content.indexOf('---', 4); const frontmatter = content.slice(0, frontmatterEnd); - expect(frontmatter).toMatch(/Proactively (suggest|invoke)/i); + expect(frontmatter).toMatch(/^triggers:\n(?:\s+-\s+.+\n?)+/m); + expect(frontmatter).not.toMatch(/Proactively (suggest|invoke)/i); }); } }); diff --git a/unfreeze/SKILL.md b/unfreeze/SKILL.md index 379ea52f7c..b899b9fa88 100644 --- a/unfreeze/SKILL.md +++ b/unfreeze/SKILL.md @@ -2,10 +2,7 @@ name: unfreeze version: 0.1.0 description: | - Clear the freeze boundary set by /freeze, allowing edits to all directories - again. Use when you want to widen edit scope without ending the session. - Use when asked to "unfreeze", "unlock edits", "remove freeze", or - "allow all edits". (gstack) + gstack unfreeze clears the active edit boundary so files can be changed anywhere. triggers: - unfreeze edits - unlock all directories diff --git a/unfreeze/SKILL.md.tmpl b/unfreeze/SKILL.md.tmpl index 83e2827c87..97dcf72213 100644 --- a/unfreeze/SKILL.md.tmpl +++ b/unfreeze/SKILL.md.tmpl @@ -2,10 +2,7 @@ name: unfreeze version: 0.1.0 description: | - Clear the freeze boundary set by /freeze, allowing edits to all directories - again. Use when you want to widen edit scope without ending the session. - Use when asked to "unfreeze", "unlock edits", "remove freeze", or - "allow all edits". 
(gstack) + gstack unfreeze clears the active edit boundary so files can be changed anywhere. triggers: - unfreeze edits - unlock all directories
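
The routing rule the updated tests enforce (a compact description, trigger phrases in a `triggers:` list, no "Use when" or "Proactively ..." prose in the frontmatter) can be sketched as a standalone check. This is a minimal illustration, not the repository's `parseSkillFrontmatter` or budget script: `checkFrontmatter`, `FrontmatterCheck`, and the two sample strings are hypothetical names introduced here. The 360-character limit and the `triggers:` regex are copied from the diff above.

```typescript
// Hypothetical helper, not part of the gstack scripts. It mirrors, in
// simplified form, the assertions in test/skill-validation.test.ts.
const MAX_SKILL_DESCRIPTION_LENGTH = 360; // value taken from the diff

interface FrontmatterCheck {
  descriptionChars: number;
  hasTriggers: boolean;
  problems: string[];
}

function checkFrontmatter(content: string): FrontmatterCheck {
  // Frontmatter sits between the leading '---' line and the next '\n---'.
  const end = content.indexOf('\n---', 4);
  const frontmatter = end === -1 ? content.slice(4) : content.slice(4, end);
  const lines = frontmatter.split('\n');

  // Collect either an inline description or the indented block under `description: |`.
  const start = lines.findIndex(line => line.startsWith('description:'));
  const descLines: string[] = [];
  if (start !== -1) {
    const inline = (lines[start] ?? '').slice('description:'.length).trim();
    if (inline !== '' && inline !== '|') descLines.push(inline);
    for (let i = start + 1; i < lines.length; i++) {
      const line = lines[i] ?? '';
      if (!line.startsWith(' ')) break; // the block ends at the next top-level key
      descLines.push(line.trim());
    }
  }
  const description = descLines.join('\n');

  // Same pattern the updated tests use for explicit trigger metadata.
  const hasTriggers = /^triggers:\n(?:\s+-\s+.+\n?)+/m.test(frontmatter);

  const problems: string[] = [];
  if (description.length > MAX_SKILL_DESCRIPTION_LENGTH) {
    problems.push('description exceeds compact routing threshold');
  }
  if (!hasTriggers) problems.push('missing triggers list');
  if (/Use when/i.test(frontmatter) || /Proactively (suggest|invoke)/i.test(frontmatter)) {
    problems.push('routing prose found in frontmatter');
  }
  return { descriptionChars: description.length, hasTriggers, problems };
}

// Sample inputs (assumed shapes, modeled on the unfreeze skill in the diff).
const compactSample = [
  '---',
  'name: unfreeze',
  'version: 0.1.0',
  'description: |',
  '  gstack unfreeze clears the active edit boundary so files can be changed anywhere.',
  'triggers:',
  '  - unfreeze edits',
  '  - unlock all directories',
  '---',
  '# Unfreeze',
].join('\n');

const legacySample = [
  '---',
  'name: unfreeze',
  'description: |',
  '  Clear the freeze boundary set by /freeze.',
  '  Use when asked to "unfreeze" or "unlock edits". (gstack)',
  '---',
  '# Unfreeze',
].join('\n');

console.log(checkFrontmatter(compactSample).problems);
console.log(checkFrontmatter(legacySample).problems);
```

Scanning the whole frontmatter for routing prose, rather than only the description, follows the tests in the diff, which match `Use when` against the full frontmatter slice.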