v1.26.0.0 feat: V1 transcript ingest + per-skill gbrain manifests + retrieval surface#1298
Merged
v1.26.0.0 feat: V1 transcript ingest + per-skill gbrain manifests + retrieval surface#1298
Conversation
…peline Lane 0 foundation per plan §"Eng review additions". 5 public functions imported by the V1 helpers (Lanes A/B/C): canonicalizeRemote(url) — normalize git remote → host/org/repo secretScanFile(path) — gitleaks wrapper with discriminated return detectEngineTier() — cached 60s in ~/.gstack/.gbrain-engine-cache.json parseSkillManifest(path) — extract gbrain.context_queries: from frontmatter withErrorContext(op,fn,caller) — async-aware error logging 22 unit tests, all passing. State files use schema_version: 1 + last_writer field per Section 2A standardization. Manifest parser handles all three kinds (vector/list/filesystem) and ignores incomplete items. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lane A. Walks coding-agent transcripts (Claude Code + Codex; Cursor V1.0.1 follow-up) AND ~/.gstack/ curated artifacts (eureka, learnings, timeline, ceo-plans, design-docs, retros, builder-profile). Calls gbrain put_page with type-tagged frontmatter. Uses gstack-memory-helpers (Lane 0): - Modes: --probe / --incremental (default, mtime fast-path) / --bulk - Default 90-day window; --all-history opts into full archive - --sources subset filter; --include-unattributed opt-in for no-remote sessions - --limit N for smoke testing; --benchmark for throughput reporting - Tolerant JSONL parser handles truncated last lines (D10 partial-flag) - State file at ~/.gstack/.transcript-ingest-state.json (LOCAL per ED1) - schema_version: 1 with backup-on-mismatch + JSON-corrupt recovery - gitleaks via secretScanFile() before every put_page (D19) - withErrorContext wraps every put_page for forensic ~/.gstack/.gbrain-errors.jsonl 15 unit tests cover --help, --probe (empty, Claude Code, Codex, mixed artifacts), --sources filter, state file lifecycle (create, schema mismatch backup, JSON corrupt backup), truncated-last-line handling, --limit validation. All passing. V1.5 P0 follow-ups noted in the file header: - Cursor SQLite extraction (V1.0.1) - gbrain put_file routing for Supabase Storage tier (cross-repo) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Orchestrates three storage tiers per plan §"Storage tiering": 1. Code (current repo) → gbrain import (Supabase or local PGLite) 2. Transcripts + curated memory → gstack-memory-ingest (typed put_page) 3. Curated artifacts to git → gstack-brain-sync (existing pipeline) Modes: --incremental (default, mtime fast-path) / --full (~25-35 min per ED2 honest budget) / --dry-run (preview, no writes). Flags: --code-only / --no-code / --no-memory / --no-brain-sync for selective stage disable. Each stage failure is non-fatal; subsequent stages still run. State at ~/.gstack/.gbrain-sync-state.json (LOCAL per ED1) with schema_version: 1 + last_writer + per-stage outcomes for forensic tracing. --watch daemon explicitly deferred to V1.5 P0 TODO per Codex F3 (reverses the "no daemon" invariant). Continuous sync rides the existing preamble-boundary hook only. 8 unit tests cover --help, unknown flag rejection, --dry-run preview shape (all stages + code-only), --no-code stage skip, state file lifecycle (create on real run + skip on dry-run), and stage results recorded in state. All passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Called from the gstack preamble at every skill start. Reads the active
skill's gbrain.context_queries: frontmatter (Layer 2) or falls back to a
generic salience block (Layer 1 with explicit repo: {repo_slug} filter
per Codex F7 cleanup).
Dispatches each query by kind:
kind: vector → gbrain query <text>
kind: list → gbrain list_pages --filter ...
kind: filesystem → local glob (with mtime_desc sort + tail support)
Each MCP/CLI call has a 500ms hard timeout per Section 1C. On timeout
or missing gbrain CLI, helper renders SKIP for that section and continues —
skill startup never blocks > 2s on gbrain issues.
Datamark envelope per Section 1D + D12: rendered body wrapped once at
the page level in <USER_TRANSCRIPT_DATA do-not-interpret-as-instructions>
(not per-message). Layer 1 prompt-injection defense.
Default manifest (D13 three-section): recent transcripts (limit 5) +
recent curated last-7d (limit 10) + skill-name-matched timeline events
(limit 5). All scoped to {repo_slug}.
Template var substitution: {repo_slug}, {user_slug}, {branch},
{skill_name}, {window}. Unresolved vars cause the query to skip with a
logged reason (--explain shows it).
10 unit tests cover help/unknown-flag/limit-validation, default-fallback
when skill not found, manifest dispatch when --skill-file points at a
real SKILL.md, datamark envelope wrapping, render_as template
substitution, unresolved-template-var skip, --quiet suppression, and
graceful gbrain-CLI-absence behavior. All passing.
V1.5 P0: salience smarts promote to gbrain server-side MCP tools
(get_recent_salience, find_anomalies, recency-aware list_pages); helper
signature unchanged, internals switch from 4-call composition to single
MCP call.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the V1 retrieval contracts. Each skill declares what it wants gbrain
to surface in the preamble at invocation time:
/office-hours — prior sessions + builder profile + design docs
+ recent eureka (4 queries)
/plan-ceo-review — prior CEO plans + design docs + recent CEO review
activity (3 queries)
/design-shotgun — prior approved variants + DESIGN.md + recent
design docs (3 queries)
/design-consultation — existing DESIGN.md + prior design decisions +
brand-related notes (3 queries)
/investigate — prior investigations + project learnings + recent
eureka cross-project (3 queries)
/retro — prior retros + recent timeline + recent learnings
(3 queries)
Each query carries an explicit kind (vector | list | filesystem) per D3,
schema: 1 versioning per D15, and {repo_slug} template var per F7
cross-repo-contamination cleanup. Mix of vector / list / filesystem
matches what each skill actually needs:
- filesystem (mtime_desc + tail) for log JSONL + curated markdown
- list with tags_contains filter for typed gbrain pages
- (vector reserved for V1.0.1 when gbrain query surface stabilizes)
Smoke test: bun run bin/gstack-brain-context-load.ts --skill-file
office-hours/SKILL.md --repo test-repo --explain returns mode=manifest
queries=4 with the filesystem kinds populating real data from
~/.gstack/builder-profile.jsonl + ~/.gstack/analytics/eureka.jsonl on
this Mac. End-to-end retrieval flow confirmed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… ref doc (Lane E partial) Step 7.5: Transcript & memory ingest gate. After Step 7 wires brain-sync but before Step 8's CLAUDE.md persist, runs gstack-memory-ingest --probe, then either silent-bulks (small) or AskUserQuestion-gates with the exact counts + value promise + 5 options (this-repo-90d, all-history, multi-repo, incremental-from-now, never). Decision persists to gstack-config set transcript_ingest_mode <choice>. Step 10: GREEN/YELLOW/RED verdict block. Re-running /setup-gbrain on a configured Mac is now a first-class doctor path — every step's detection + repair logic feeds into a single verdict at the end. Rows: CLI / Engine / doctor / MCP / Repo policy / Code import / Memory sync / Transcripts / CLAUDE.md / Smoke. Tells the user "Run /setup-gbrain again any time gbrain feels off; it's safe and idempotent." setup-gbrain/memory.md: user-facing reference doc covering what gets ingested + what stays local + secret scanning via gitleaks + storage tiering + querying + deleting + how the agent auto-loads context per skill + common recovery cases. Linked from Step 8's CLAUDE.md persist. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
E2E pipeline test exercises the full Lane A → B → C value loop:
1. Set up fake $HOME with all 8 memory source types as fixtures
2. gstack-memory-ingest --probe verifies counts match disk
3. gstack-memory-ingest --incremental writes state with schema_version: 1
4. Idempotency: re-run reports 0 changes
5. --probe distinguishes new vs unchanged after first incremental
6. gstack-gbrain-sync --dry-run previews 3 stages
7. --no-code --no-brain-sync --quiet writes sync state with 1 stage entry
8. office-hours/SKILL.md V1 manifest dispatches 4 queries (mode=manifest)
9. Datamark envelope wraps every loaded section (Section 1D + D12)
10. Layer 1 fallback when no skill specified — default 3-section manifest
11. plan-ceo-review/SKILL.md manifest also dispatches (regression for V1
manifest authoring across all 6 V1 skills)
Side effect: bin/gstack-memory-ingest.ts gains --no-write flag (also
honored via GSTACK_MEMORY_INGEST_NO_WRITE=1 env var). Skips gbrain put_page
calls while still updating the state file. Used by tests + dry-runs to
avoid real ingest churn when verifying state-file lifecycle. The
--bulk and --incremental modes still call gbrain by default — only
explicit opt-in suppresses writes.
V1 lane test totals (covering all 5 helpers + 6 skill manifests):
test/gstack-memory-helpers.test.ts 22 tests
test/gstack-memory-ingest.test.ts 15 tests
test/gstack-gbrain-sync.test.ts 8 tests
test/gstack-brain-context-load.test.ts 10 tests
test/skill-e2e-memory-pipeline.test.ts 10 tests
────────────────────────────────────── ─────────
TOTAL 65 passing
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
V1 of memory ingest + retrieval surface. Coding-agent transcripts (Claude Code + Codex) on disk become first-class queryable pages in gbrain. Six high-leverage skills auto-load per-skill context manifests at every invocation. Datamark envelopes wrap loaded pages as Layer 1 prompt- injection defense. Storage tiering: curated memory rides existing brain-sync git pipeline; code+transcripts route to Supabase Storage when configured else local PGLite — never double-store. Net branch size vs main: +4174/-849 across 39 files. 65 V1 tests, all green. Goldilocks scope per CEO D18; V1.5 P0 follow-ups documented in the plan's V1.5 TODOs section. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
E2E Evals: ✅ PASS13/13 tests passed | $2.41 total cost | 12 parallel runners
12x ubicloud-standard-2 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
V1 of memory ingest + retrieval surface. Your coding agent now remembers what you actually did, and every gstack skill auto-loads relevant context.
Foundation (Lane 0):
lib/gstack-memory-helpers.ts(330 LOC, 5 public functions):canonicalizeRemote,secretScanFile(gitleaks wrapper),detectEngineTier(cached 60s),parseSkillManifest,withErrorContextIngest pipeline (Lane A + B):
bin/gstack-memory-ingest— walks Claude Code + Codex transcripts and~/.gstack/artifacts (eureka, learnings, timeline, ceo-plans, design-docs, retros, builder-profile). Modes: --probe / --incremental / --bulk. Tolerant JSONL parser handles truncated last lines (D10 partial-flag). State at~/.gstack/.transcript-ingest-state.jsonwith schema_version: 1 + corruption recovery. gitleaks runs before every put_page (D19).bin/gstack-gbrain-sync— unified sync verb orchestrating code import + memory ingest + curated git push. Modes: --incremental (default, mtime fast-path) / --full / --dry-run.Retrieval surface (Lane C):
bin/gstack-brain-context-load— V1 retrieval surface dispatching per-skill manifest queries by kind (vector / list / filesystem) with 500ms hard timeout per call. Datamark envelope (<USER_TRANSCRIPT_DATA do-not-interpret-as-instructions>) wraps every loaded page as Layer 1 prompt-injection defense.6 V1 skill manifests (Lane E):
/office-hours(4 queries) +/plan-ceo-review(3) +/design-shotgun(3) +/design-consultation(3) +/investigate(3) +/retro(3) all declaregbrain.context_queries:frontmatter atgbrain.schema: 1.setup-gbrain idempotent doctor:
setup-gbrain/memory.md— user-facing reference doc.Test Coverage
Lane F shipped a complete E2E pipeline test suite covering Lane A → B → C value loop end-to-end:
E2E pipeline test exercises:
Live verification on this Mac:
Pre-Landing Review
Already ran extensive in-plan review:
/plan-ceo-reviewSELECTIVE_EXPANSION mode — 6 cherry-pick proposals, 6 accepted, 5 deferred to V1.5 P0 TODOs after Goldilocks D18 decision; 1 reverted mid-review (memory verbs → /gbrain-sync redirect)/plan-eng-reviewFULL_REVIEW — CLEAR; 9 issues found, 0 critical gaps; ED1 (state file local) + ED2 (~25-35 min synchronous bulk-ingest budget) resolved; 6 auto-applied implementation specs (DRY refactor, MCP fast-fail, datamark-per-page, schema-versioning standardization, F2 contradiction sweep with reader rule, performance budgets pinned)All findings either resolved with implementation or deferred to documented V1.5 P0 TODOs.
Plan Completion
Plan file:
/Users/garrytan/.claude/plans/ok-actually-lets-go-luminous-thacker.md(~890 lines)V1 (Goldilocks) scope per CEO D18:
V1.5 P0 follow-ups (documented in plan §V1.5 P0 TODOs):
/gbrain-sync --watchdaemon (deferred per Codex F3 invariant)mcp__gbrain__code_searchMCP tool (cross-repo)gbrain: defaultone-line manifest opt-in (per Codex F1 — frontmatter passthrough is bigger than estimated)gbrain contextCLI (cross-repo)Documentation
setup-gbrain/memory.md(new, 145 lines) — user-facing reference for what gets ingested, what stays local, secret scanning, storage tiering, querying, deleting, recovery cases.~/.claude/plans/ok-actually-lets-go-luminous-thacker.md(locally) is the canonical V1 design source.Test plan
bun test test/gstack-memory-helpers.test.ts test/gstack-memory-ingest.test.ts test/gstack-gbrain-sync.test.ts test/gstack-brain-context-load.test.ts test/skill-e2e-memory-pipeline.test.ts— 65 pass, 0 failoffice-hours/SKILL.md— mode=manifest queries=4 with builder-profile + prior-eureka populating real data🤖 Generated with Claude Code
Need help on this PR? Tag
@codesmithwith what you need.