Simulation MCP tools, game mode, and streaming by freesig · Pull Request #17 · essential-contributions/spec-forest

freesig · 2026-04-06T23:17:36Z

Summary

Add persistent notification system with session backgrounding
Add shadow answer support for implementation status review
Add ? help popup with full keybinding reference
Expose simulation via 10 MCP tools for agent-driven interaction
Pre-computed interaction tree for instant simulation responses
Game mode for spec-updating play-through simulation
Switch simulation runner to stream-json for real-time tool call visibility
Breadcrumb navigation trail for simulation interactions

Stack

PR 1/5: TUI Infrastructure
PR 2/5: Simulation Core & UX
PR 3/5 ← you are here
PR 4/5: Members Screen
PR 5/5: Lean Game Mode

Test plan

Notification bar appears when backgrounding a simulation session
Help popup shows with ? key
MCP simulation tools work from an external agent
Game mode updates spec based on play-through decisions
Streaming shows real-time tool call progress

…grounding Simulations can now be backgrounded (Esc) instead of destroyed, with a global notification bar showing when responses arrive on any screen. Ctrl+s opens a session picker overlay to switch between background sessions. Q permanently ends a simulation.

…n ready The bar now always displays when any simulation is backgrounded, showing the count and status (ready/running) with a Ctrl+s hint. Previously it only appeared when a response notification fired.

Replace verbose footer bars (15+ keys per line) with minimal badge-styled hints showing only 3-5 essential keys. Press ? on any non-text-input screen to open a context-aware help popup listing all available key bindings.

…ction Extract shared orchestration logic from TUI into spec-forest lib so both TUI and MCP can drive simulations. Add fire-and-poll MCP tools: sim_create_session, sim_start, sim_send_input, sim_ask_report, sim_update_scenario, sim_get_status, sim_get_channels, sim_get_report, sim_list_sessions, sim_end.
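The fire-and-poll shape described above can be sketched with the standard library alone. This is only an illustration of the pattern, not the spec-forest implementation: `SimSession`, `SimStatus`, and the thread-based worker are hypothetical names and choices made for this example, and the real tools may well be async.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical status a polling tool such as `sim_get_status` might report.
#[derive(Clone, Debug, PartialEq)]
enum SimStatus {
    Running,
    Ready(String),
}

/// Hypothetical session handle: work runs in the background, callers poll.
struct SimSession {
    status: Arc<Mutex<SimStatus>>,
    worker: Option<JoinHandle<()>>,
}

impl SimSession {
    /// "Fire": start the work and return immediately.
    fn start(work: impl FnOnce() -> String + Send + 'static) -> Self {
        let status = Arc::new(Mutex::new(SimStatus::Running));
        let shared = Arc::clone(&status);
        let worker = thread::spawn(move || {
            let output = work();
            *shared.lock().unwrap() = SimStatus::Ready(output);
        });
        SimSession { status, worker: Some(worker) }
    }

    /// "Poll": cheap, non-blocking snapshot of the current status.
    fn get_status(&self) -> SimStatus {
        self.status.lock().unwrap().clone()
    }

    /// Block until the background work finishes (used here only for the demo).
    fn wait(&mut self) {
        if let Some(handle) = self.worker.take() {
            handle.join().unwrap();
        }
    }
}
```

A caller would invoke `start`, return a session id to the agent right away, and let the agent call the status tool until it sees `Ready`.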

When explore_code is true, sim_start loads the spec's project directory from the database so the simulation agent can read the codebase. Errors if the spec has no directory set.

Replace sequential request-response simulation with a pre-computed interaction tree. The AI now generates a tree of likely user interactions and their resulting outputs in a single call, so picking a predicted interaction is instant (no AI latency). Custom input falls back to AI generation. When the user reaches a leaf node, the next tree is pre-generated in the background. New MCP tool: sim_get_interactions returns predicted choices at the current position. sim_send_input now returns tree_hit/at_leaf fields. sim_create_session accepts tree_depth (1-3) and tree_branching (2-4).
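As a rough sketch of how a tree lookup could yield the `tree_hit` and `at_leaf` fields, consider the following. Apart from `PredictedInteraction.result`, `tree_hit`, and `at_leaf`, every name here is hypothetical, and the exact meaning of `at_leaf` (here: the matched node has no pre-computed children) is an assumption.

```rust
/// Assumed shape of a predicted interaction; `input` and `children`
/// are illustrative field names.
#[derive(Debug)]
struct PredictedInteraction {
    input: String,
    result: Option<String>,
    children: Vec<PredictedInteraction>,
}

/// Hypothetical outcome of resolving user input against the tree.
#[derive(Debug, PartialEq)]
struct InputOutcome<'a> {
    tree_hit: bool,
    at_leaf: bool,
    output: Option<&'a str>,
}

/// Look for `input` among the predicted choices at the current position.
/// A match with a pre-computed result is served instantly; anything else
/// is reported as a miss so the caller can fall back to AI generation.
fn resolve_input<'a>(choices: &'a [PredictedInteraction], input: &str) -> InputOutcome<'a> {
    match choices.iter().find(|c| c.input == input) {
        Some(hit) if hit.result.is_some() => InputOutcome {
            tree_hit: true,
            at_leaf: hit.children.is_empty(),
            output: hit.result.as_deref(),
        },
        _ => InputOutcome { tree_hit: false, at_leaf: false, output: None },
    }
}
```

When `at_leaf` comes back true, the caller would be the one to kick off background generation of the next tree.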

Add an interactions panel between decisions and the input area that displays the pre-computed interaction choices from the tree. Users can navigate with arrow keys and press Enter to select, getting an instant response without AI latency. The 'i' key enters custom input mode for interactions not in the tree.

…on, and leaf interactions

- Make PredictedInteraction.result optional so leaf-depth nodes can suggest interactions without pre-computed results
- Change defaults to depth=4, branching=2 for deeper exploration (31 nodes)
- Add Backspace back-navigation through the interaction tree
- Eagerly pre-generate subtrees when the most likely path leads to a dead end, grafting results onto the existing tree instead of replacing it
- Show spinner + "Expanding tree..." during background pregeneration
- Dim shallow interactions with "(generates)" suffix in the interactions panel
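The 31-node figure matches a full tree with a root plus four levels below it at branching factor 2: 1 + 2 + 4 + 8 + 16 = 31. A small helper makes the arithmetic checkable; `tree_node_count` is a hypothetical name, and counting the root as level 0 is an assumption about how depth is defined.

```rust
/// Number of nodes in a full tree with `depth` levels below the root,
/// where every non-leaf node has `branching` children.
fn tree_node_count(depth: u32, branching: u64) -> u64 {
    (0..=depth).map(|level| branching.pow(level)).sum()
}
```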

Game mode lets users play through their spec by choosing interaction+outcome pairs. Each choice feeds back into the spec DAG, iteratively refining the specification through play. Players can also reject outcomes with corrections.

- New data types: GameChoiceGroup, GameOutcome, GameTreeRoot, GameSpecUpdate
- Game-specific AI prompts presenting alternative outcomes per interaction
- Backend orchestration for game turns with background spec updates
- 4 new MCP tools: game_get_choices, game_select_outcome, game_reject_outcome, game_get_spec_updates
- TUI: grouped choices panel, reject overlay, spec update log overlay
- Channel picker toggle (g key) to enable game mode

Update game mode cardinal rule and tree rules to steer the AI toward interactions that expose genuine spec ambiguity rather than obvious outcomes. At least half of interactions should target decision points where the player's choice resolves a meaningful specification question.

Show a breadcrumb bar (Start > Login > Email > Submit) in the TUI simulation view, allowing users to see their full interaction path and jump to any previous point. Press 'b' to focus the trail, use left/right to select, Enter to jump. Also exposes breadcrumbs via the sim_get_status MCP tool and adds a sim_navigate_to tool.
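A minimal sketch of a breadcrumb trail with jump-back follows. The `Breadcrumbs` type and its methods are hypothetical, and dropping the later crumbs on a jump is an assumption; the real `sim_navigate_to` may keep forward history.

```rust
/// Hypothetical breadcrumb trail: the labels of each step taken so far.
struct Breadcrumbs {
    trail: Vec<String>,
}

impl Breadcrumbs {
    fn new(start: &str) -> Self {
        Breadcrumbs { trail: vec![start.to_string()] }
    }

    /// Record a new interaction at the end of the trail.
    fn push(&mut self, label: &str) {
        self.trail.push(label.to_string());
    }

    /// Jump back to the crumb at `index`, discarding everything after it.
    /// Returns false (and changes nothing) if `index` is out of range.
    fn navigate_to(&mut self, index: usize) -> bool {
        if index >= self.trail.len() {
            return false;
        }
        self.trail.truncate(index + 1);
        true
    }

    /// Render the trail in the "Start > Login > Email" style.
    fn render(&self) -> String {
        self.trail.join(" > ")
    }
}
```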

LLM responses sometimes include prose or markdown code fences around JSON, causing parse failures in simulation and game mode. Strengthen all prompts to explicitly forbid non-JSON output and extract a shared generic `extract_json<T>()` helper to replace duplicated 4-step extraction logic across all four parse functions.
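To illustrate the kind of cleanup such a helper needs, here is a standard-library-only sketch that isolates the likely JSON payload from a response wrapped in prose or code fences. It is not the PR's `extract_json<T>()`: `json_candidate` is a hypothetical name, the first-opener-to-last-closer heuristic is an assumption, and the typed deserialization step (presumably via serde) is left out.

```rust
/// Return the slice of `response` that most plausibly holds the JSON value:
/// everything from the first `{` or `[` to the last matching closer.
/// Surrounding prose and markdown fences fall outside that range.
fn json_candidate(response: &str) -> Option<&str> {
    let start = response.find(|c: char| c == '{' || c == '[')?;
    let closer = if response[start..].starts_with('{') { '}' } else { ']' };
    let end = response.rfind(closer)?;
    if end < start {
        return None;
    }
    // `closer` is a single-byte ASCII character, so `..=end` is a valid boundary.
    Some(&response[start..=end])
}
```

The heuristic is deliberately crude: prose that itself contains braces would confuse it, which is one reason a real helper would try several strategies and validate each candidate by parsing it.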

Simulation init was slow with no way to see what Claude was doing. Switch from --output-format json to stream-json and parse the NDJSON output to extract the same final result, logging tool calls at info level and all intermediate events at debug level.

Replace cmd.output() with spawn + BufReader line-by-line streaming so stream-json events are logged as they arrive, not after the full response completes. Tool use events now appear immediately in logs.
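The difference from `cmd.output()` is that each line is handled as soon as the child writes it. A minimal sketch using only `std::process` and `std::io` is below; the function names and the callback style are hypothetical, and the actual runner presumably logs through `tracing` rather than a closure.

```rust
use std::io::{self, BufRead, BufReader, Read};
use std::process::{Command, ExitStatus, Stdio};

/// Call `on_line` for every line read from `reader`, as each line arrives.
fn for_each_line<R: Read>(reader: R, mut on_line: impl FnMut(&str)) -> io::Result<()> {
    for line in BufReader::new(reader).lines() {
        on_line(&line?);
    }
    Ok(())
}

/// Spawn `cmd` with piped stdout and stream its output line by line,
/// instead of buffering everything until exit as `Command::output` does.
fn stream_stdout_lines(mut cmd: Command, on_line: impl FnMut(&str)) -> io::Result<ExitStatus> {
    let mut child = cmd.stdout(Stdio::piped()).spawn()?;
    let stdout = child.stdout.take().expect("stdout was configured as piped");
    for_each_line(stdout, on_line)?;
    child.wait()
}
```

With NDJSON output, each line handed to `on_line` is one complete JSON event, so the callback can parse and log it independently.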

Byte-index slicing panicked on multi-byte UTF-8 characters (e.g. '…') when truncating log lines to 200 bytes.
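For context, `&s[..200]` panics whenever byte 200 falls inside a multi-byte character. One common way to truncate safely is to back up to the nearest character boundary, sketched below; `truncate_on_char_boundary` is a hypothetical name and the PR's actual fix may differ.

```rust
/// Truncate `s` to at most `max_bytes` bytes without splitting a UTF-8 character.
fn truncate_on_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    // Walk back until the cut lands on a boundary; index 0 always is one,
    // so the loop terminates.
    let mut end = max_bytes;
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

With the input `"ab…cd"`, where `…` occupies bytes 2 to 4, a 3-byte limit yields `"ab"` instead of panicking.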

LLMs struggle to produce valid deeply-nested JSON at scale (~32KB), causing consistent parse failures like "key must be a string at column 18790". Replace the nested tree format with a flat {nodes, edges} adjacency list that eliminates nesting entirely.

- Add FlatTree/FlatNode/FlatEdge wire types for Claude's output
- Add flat_to_sim_tree and flat_to_game_tree conversion functions
- Parse functions try flat format first, fall back to nested (legacy)
- Update prompt schemas for both sim and game mode
- Include serde error and full response in parse failure messages
- Add tracing to extract_json for step-by-step diagnostics
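The conversion from a flat adjacency list back to a nested tree can be sketched as follows. The type names `FlatTree`, `FlatNode`, and `FlatEdge` come from the commit message, but every field (`root`, `id`, `label`, `from`, `to`), the `TreeNode` output type, and the `flat_to_tree` function are assumptions made for illustration. The real wire types presumably derive serde traits, which are left out to keep the example dependency-free.

```rust
use std::collections::HashMap;

// Assumed wire shapes; field names are hypothetical.
struct FlatNode { id: String, label: String }
struct FlatEdge { from: String, to: String }
struct FlatTree { root: String, nodes: Vec<FlatNode>, edges: Vec<FlatEdge> }

/// Hypothetical nested form used after parsing.
#[derive(Debug, PartialEq)]
struct TreeNode { label: String, children: Vec<TreeNode> }

/// Rebuild the nested tree from the root id, or return None if an edge
/// points at an unknown node or the edges contain a cycle.
fn flat_to_tree(flat: &FlatTree) -> Option<TreeNode> {
    let labels: HashMap<&str, &str> = flat
        .nodes
        .iter()
        .map(|n| (n.id.as_str(), n.label.as_str()))
        .collect();
    let mut children: HashMap<&str, Vec<&str>> = HashMap::new();
    for edge in &flat.edges {
        children.entry(edge.from.as_str()).or_default().push(edge.to.as_str());
    }
    build(flat.root.as_str(), &labels, &children, &mut Vec::new())
}

fn build<'a>(
    id: &'a str,
    labels: &HashMap<&'a str, &'a str>,
    children: &HashMap<&'a str, Vec<&'a str>>,
    path: &mut Vec<&'a str>,
) -> Option<TreeNode> {
    if path.contains(&id) {
        return None; // cycle: this id is already an ancestor
    }
    let label = *labels.get(id)?;
    path.push(id);
    let mut kids = Vec::new();
    if let Some(child_ids) = children.get(id) {
        for child in child_ids {
            kids.push(build(*child, labels, children, path)?);
        }
    }
    path.pop();
    Some(TreeNode { label: label.to_string(), children: kids })
}
```

Because the model only has to emit two flat arrays, a malformed entry affects one node or edge rather than unbalancing the nesting of the whole document.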

freesig added 18 commits April 1, 2026 09:02

fix: show notification bar whenever background sims exist, not just o…

aec5353

feat: add Alt+S keybinding to regenerate shadow answers from TUI

4b06bc0

feat: add ? help popup and clean up footer key hints

8a337ba

feat: add explore_code param to sim_start for code-aware simulation mode

eb24264

fix: add --verbose flag required by stream-json output format

108730b

feat: stream Claude CLI stdout for real-time tool call logging

7fb3744

fix: use char-boundary-safe truncation for stream event logging

27ffd27


freesig wants to merge 18 commits into freesig/sim-2-simulation from freesig/sim-3-sim-tools
