Stage Definitions
A stage definition declares a single stage in the SM's pipeline. It is a markdown file with a YAML frontmatter header. The frontmatter is the stage's contract with the runtime; the body is the system-prompt template sent to the LLM at Init.
Companion page to the State Machines contract and SM Stage Lifecycle. This page carries `contractVersion: 1.0.0` (see Changelog) and has no independent semver; breaking changes are recorded in the State Machines contract's changelog.
```markdown
---
id: plan
name: Plan
allowedTools: ["Read", "Grep", "Glob"]
completionTool: submit_plan
completionSchema:
  type: object
  required: [summary, steps]
  properties:
    summary: { type: string }
    steps:
      type: array
      items: { type: string }
retryPolicy:
  maxAttempts: 2
  backoff: none
turnCap: 20
resolutionPolicy: retry-later
---
You are planning a change to the repository. You have access to read tools only;
you cannot modify files in this stage.

Task: {{ctx.task}}

Produce a plan. When you are done, call `submit_plan` with:
- `summary`: one paragraph summarising the approach.
- `steps`: an ordered list of concrete actions.
```

| Field | Type | Meaning |
|---|---|---|
| `id` | string (kebab-case) | Pipeline-unique ID used by `Next()` and the SM state slot. |
| `name` | string | Human-readable label used in UI, audit, and logs. |
| `allowedTools` | list of tool names | The subset of the orchestrator's live tool set this stage may call. Narrows inheritance; does not re-resolve names. |
| `completionTool` | tool name | The single tool the LLM must call to end `Act` successfully. |
| `completionSchema` | JSON Schema | Validates the `completionTool`'s payload. Parse gate, not truth gate (see SM Stage Lifecycle). |
| `retryPolicy` | `{ maxAttempts, backoff }` | Cap and backoff for `Assert` → retry verdicts within a single stage execution. |
| `turnCap` | positive integer | Hard upper bound on LLM turns inside `Act`. |
| `resolutionPolicy` | enum | How this stage treats `CheckGate` outcomes `defer` and `block`. Values and semantics in SM Stage Lifecycle. |
All eight are required. A stage definition missing any of them fails init on the owning SM.
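A minimal sketch of the required-field rule, assuming the frontmatter has already been parsed into a Python dict. The function names and the use of `ValueError` are illustrative stand-ins; the runtime's actual API and failure type are not specified on this page.

```python
# The eight required frontmatter keys listed in the table above.
REQUIRED_FIELDS = (
    "id", "name", "allowedTools", "completionTool",
    "completionSchema", "retryPolicy", "turnCap", "resolutionPolicy",
)


def missing_required_fields(frontmatter: dict) -> list:
    """Return the required keys that are absent from a parsed frontmatter mapping."""
    # Presence is checked by key, so an empty list such as allowedTools: [] still counts.
    return [key for key in REQUIRED_FIELDS if key not in frontmatter]


def check_stage_definition(frontmatter: dict) -> None:
    """Raise ValueError (a stand-in for the real init failure) when a field is missing."""
    missing = missing_required_fields(frontmatter)
    if missing:
        raise ValueError(f"stage definition missing required fields: {missing}")


plan = {
    "id": "plan",
    "name": "Plan",
    "allowedTools": ["Read", "Grep", "Glob"],
    "completionTool": "submit_plan",
    "completionSchema": {"type": "object"},
    "retryPolicy": {"maxAttempts": 2, "backoff": "none"},
    "turnCap": 20,
    "resolutionPolicy": "retry-later",
}
check_stage_definition(plan)  # complete definition: no exception

# The same definition with turnCap removed would be rejected.
broken = {key: value for key, value in plan.items() if key != "turnCap"}
```

Calling `check_stage_definition(broken)` raises, which mirrors the rule that a definition missing any required field fails init.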
| Field | Type | Meaning |
|---|---|---|
| `description` | string | Free-text description surfaced in `/sm show <id>`. |
| `inputsSchema` | JSON Schema | Schema that each element of `ctx.upstream` must satisfy. For a sequential successor, `ctx.upstream` has one element (its predecessor's `StageResult`); for a join stage, each sibling `StageResult` is validated individually. Validation fires at `Setup`; a failure is an authoring diagnostic audited as `StageInitFailed`. |
| `tags` | list of strings | For SM-authored filtering; not interpreted by core. |
Optional fields are additive; the contract is silent on unknown keys so SM authors may carry metadata. The runtime ignores unknown keys.
The body is plain markdown, sent verbatim as the stage's system prompt, with templated placeholders substituted before send.
Templates render against the stage's StageContext, bound to ctx. Every placeholder resolves inside that namespace; there are no other namespaces except the stage's own frontmatter under stage.*.
| Placeholder | Resolves to |
|---|---|
| `{{ctx.<sm-field>}}` | Any SM-written field under `ctx` (e.g., `{{ctx.task}}`, `{{ctx.branch}}`, `{{ctx.files}}`). The SM owns these names. |
| `{{ctx.upstream[i].*}}` | Predecessor outputs. `ctx.upstream[0]` is the predecessor's `StageResult` on a sequential successor; a join indexes each sibling's `StageResult` in `NextResult.stages` order. Empty on the entry stage and on parallel siblings. Core-provided. See SM Stage Lifecycle § How successors receive StageResult. |
| `{{ctx.workflowRunId}}` / `{{ctx.stageExecutionId}}` | Idempotency keys. Core-provided. |
| `{{stage.id}}` / `{{stage.name}}` | The stage's own frontmatter values. |
ctx.attempts is not template-addressable. Init renders the body once at the start of the stage execution, and Assert → retry re-enters Act in the same transcript without re-rendering — so any {{ctx.attempts.*}} value would always resolve against the first attempt. Per-attempt signalling happens inside the transcript (the retry reason arrives as an extra message to the LLM on retry), not via the rendered body. See SM Stage Lifecycle § Init.
ctx.host is a runtime surface (Host API) and is not template-addressable. The complete StageContext schema — reserved core-provided names and the SM-writable extension space — is defined in SM Stage Lifecycle § StageContext.
The template engine is intentionally narrow. No conditionals, no loops, no arbitrary expressions — every branch point belongs in the SM's pipeline code, not in the prompt. Authors who need conditional system prompts should split the logic into multiple stages or compose the ctx before Setup.
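The substitution rules described above can be sketched as a small renderer. This is an illustration under stated assumptions, not the runtime's engine: `render`, `resolve`, and `TemplateError` are hypothetical names, and treating every unresolvable placeholder as an error is a choice made for the sketch.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")
SEGMENT = re.compile(r"([A-Za-z_][\w-]*)((?:\[\d+\])*)")
NOT_ADDRESSABLE = {"attempts", "host"}  # ctx.attempts and ctx.host are excluded


class TemplateError(ValueError):
    """Hypothetical error type for a placeholder the sketch cannot resolve."""


def resolve(path: str, ctx: dict, stage: dict):
    """Walk a dotted path such as ctx.upstream[0].output.summary."""
    root, *segments = path.split(".")
    # Only the ctx and stage namespaces exist; env.* and anything else is rejected.
    if root not in ("ctx", "stage") or not segments:
        raise TemplateError(f"unsupported placeholder: {path}")
    value = ctx if root == "ctx" else stage
    for position, segment in enumerate(segments):
        match = SEGMENT.fullmatch(segment)
        if match is None:  # expressions, filters, and the like are not lookups
            raise TemplateError(f"not a plain lookup: {path}")
        name, indexes = match.groups()
        if root == "ctx" and position == 0 and name in NOT_ADDRESSABLE:
            raise TemplateError(f"not template-addressable: {path}")
        try:
            value = value[name]
            for index in re.findall(r"\d+", indexes):
                value = value[int(index)]
        except (KeyError, IndexError, TypeError) as exc:
            raise TemplateError(f"cannot resolve: {path}") from exc
    return value


def render(body: str, ctx: dict, stage: dict) -> str:
    """Substitute every placeholder once; no conditionals, loops, or expressions."""
    return PLACEHOLDER.sub(lambda m: str(resolve(m.group(1), ctx, stage)), body)


ctx = {
    "task": "rename the billing module",
    "upstream": [{"output": {"summary": "two call sites found"}}],
    "attempts": [{}],
}
stage = {"id": "plan", "name": "Plan"}
prompt = render("Stage {{stage.id}}. Task: {{ctx.task}}", ctx, stage)
```

Because the sketch only performs lookups, any branching has to happen before `render` is called, which matches the guidance to compose `ctx` ahead of time or split the logic into multiple stages.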
- Env or settings values. `{{env.FOO}}` is not recognised. LLM context isolation applies: secrets do not reach the prompt except through the sanctioned Context Provider path. See LLM Context Isolation.
- Bulk transcripts of prior stages. Stage transcripts are stage-local and in-memory; they never cross the stage boundary. A successor receives only its predecessor's `StageResult`: typed fields (`verdict`, `reason`, `parsed`, `capHit`, `attemptCount`) plus the optional SM-authored `output`. If a downstream stage needs a digest of a prior stage's reasoning, the prior stage's `Exit` code writes a reduced summary to its `StageResult.output` and the successor reads it via `{{ctx.upstream[i].output.<field>}}`. There is no channel, neither the state slot nor `ctx.upstream`, that carries a full transcript. See SM Stage Lifecycle § StageResult and State Machines § State slot.
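The digest pattern described above can be sketched as follows. The field names come from this page; their Python types, the `"success"` verdict label, and the `exit_plan_stage` helper are assumptions made for illustration only.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class StageResult:
    # Field names follow this page; the types are guesses for the sketch.
    verdict: str
    reason: str
    parsed: Optional[dict]
    capHit: bool
    attemptCount: int
    output: Optional[dict] = None  # optional SM-authored payload


def exit_plan_stage(transcript: list, parsed: dict) -> StageResult:
    """Hypothetical Exit code: keep a reduced digest and drop the transcript."""
    digest = {
        "summary": parsed["summary"],
        "stepCount": len(parsed["steps"]),
        "transcriptLength": len(transcript),  # a number, never the transcript itself
    }
    return StageResult(
        verdict="success",  # assumed verdict label
        reason="",
        parsed=parsed,
        capHit=False,
        attemptCount=1,
        output=digest,
    )


transcript = ["<llm turn 1>", "<tool result>", "<llm turn 2>"]
parsed = {"summary": "Rename in two passes.", "steps": ["update imports", "run tests"]}
result = exit_plan_stage(transcript, parsed)

# What a sequential successor would see under ctx.upstream in this sketch.
successor_ctx = {"upstream": [asdict(result)]}
```

A successor template could then read `{{ctx.upstream[0].output.summary}}`, while the transcript itself never leaves the stage that produced it.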
The completionTool names a stage-local completion channel — not a Tool in the Tools registry. Core synthesizes the channel per-stage at Setup and injects it into the stage's LLM tool manifest alongside the tools selected by allowedTools. Its role is to give the LLM a well-shaped way to signal "this stage is done; here is my result."
| Property | Rule |
|---|---|
| Registration | Not registered. Core creates the channel per-stage from completionTool + completionSchema; it does not appear in the Tool registry. |
| Reserved name | The completionTool name must not match any registered tool name (bundled, discovered, or MCP). Core rejects the stage definition at Init on collision; see SM Stage Lifecycle — Init. |
| Visibility | Always injected into the stage's tool manifest. Present even if allowedTools: []. |
| Approval | Bypasses allowedTools, the security-mode gate, and guard hooks. The channel is core-internal; it produces no side effect beyond ending Act. |
| Scope | Stage-local. Discarded at Exit. Not visible to other stages. |
| Persistence | Not written to session message history. The parsed payload is handed to Assert and to the SM's Exit for state-slot persistence if the SM chooses. |
| Audit | Not a ToolInvocation… event. Completion lands in StageAssertOutcome and StageExited. |
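A minimal sketch of how the Reserved name and Visibility rules above might combine when the stage's tool manifest is assembled. `build_stage_manifest` and its parameters are hypothetical, and `ValueError` stands in for whatever rejection core actually raises at Init.

```python
def build_stage_manifest(registered_tools, live_tools, allowed_tools, completion_tool):
    """Return the tool names visible to the stage's LLM, completion channel included."""
    # Reserved name: the completion channel must not shadow any registered tool.
    if completion_tool in registered_tools:
        raise ValueError(f"completionTool collides with a registered tool: {completion_tool}")
    # allowedTools only narrows the live set; a name outside it is an authoring error.
    drifted = [name for name in allowed_tools if name not in live_tools]
    if drifted:
        raise ValueError(f"allowedTools names not in the live tool set: {drifted}")
    # Visibility: the completion channel is always injected, even for allowedTools: [].
    return list(allowed_tools) + [completion_tool]


registered = {"Read", "Grep", "Glob", "Write"}
live = {"Read", "Grep", "Glob"}

plan_manifest = build_stage_manifest(registered, live, ["Read", "Grep"], "submit_plan")
empty_manifest = build_stage_manifest(registered, live, [], "submit_plan")
```

Keeping the collision check ahead of the narrowing step means a bad `completionTool` name is reported regardless of what `allowedTools` contains; the real ordering is not specified here.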
```mermaid
flowchart TD
    Start[Act starts] --> LLM[LLM turn]
    LLM -->|"called completionTool"| Parse{parse completionSchema}
    Parse -->|ok| Done["Act ends: completion ok"]
    Parse -->|fail| Retry[core returns parse-error<br/>as a tool-call result;<br/>Act continues]
    LLM -->|"no tool call<br/>plain-text only"| Steer[core appends a<br/>steering message;<br/>Act continues]
    LLM -->|"called another tool"| RouteTool[dispatch through<br/>allowedTools and gates]
    RouteTool --> LLM
    Steer --> TurnCount{"turn count < turnCap?"}
    Retry --> TurnCount
    TurnCount -->|yes| LLM
    TurnCount -->|no| Cap["Act ends: capHit"]
```
- Success. The LLM calls `completionTool` with a payload that passes `completionSchema`. `Act` ends; `Assert` receives `{ parsed, transcript, attempts }`.
- Parse failure. The LLM called `completionTool` with a payload that fails `completionSchema`. Core synthesizes a tool-call result carrying the schema-validation error (as content the model can read) and returns it on the same transcript. `Act` continues until a successful completion or `turnCap` fires. The parse error is captured on `ctx.attempts[i].parseError` for `Assert` to inspect.
- Plain-text reply. The LLM replied with prose and no tool call. Core appends a short steering message to the transcript (e.g., "call `completionTool` when you are done") and `Act` continues. A run of plain-text replies counts toward `turnCap`.
- Other tool. The LLM called a tool named in `allowedTools` (or covered by an active `grantStageTool`). The call dispatches through the normal gates; the result returns to the transcript; `Act` continues.
- Cap. `turnCap` LLM turns elapsed without a successful completion call. `Act` ends with `capHit: true` and `Assert` receives the raw transcript.
In all cases, Assert sees the raw transcript and the attempt history and has the last word.
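The branches above can be sketched as a loop driven by a scripted stand-in for the LLM. Everything here is illustrative: `run_act`, `ActOutcome`, and the turn dictionaries are hypothetical shapes, `validate_plan` replaces real JSON Schema validation, a tool outside `allowedTools` is simply denied (no `grantStageTool` prompt), and the sketch assumes every LLM response counts as one turn toward `turnCap`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ActOutcome:
    parsed: Optional[dict]
    cap_hit: bool
    transcript: list = field(default_factory=list)
    parse_errors: list = field(default_factory=list)


def run_act(next_turn, completion_tool, allowed_tools, validate, turn_cap):
    """Drive one Act phase; next_turn(transcript) returns the next LLM response."""
    transcript, parse_errors = [], []
    for _ in range(turn_cap):
        turn = next_turn(transcript)
        transcript.append(("llm", turn))
        tool = turn.get("tool")
        if tool is None:  # plain-text reply: steer and continue
            transcript.append(("steer", f"call {completion_tool} when you are done"))
        elif tool == completion_tool:
            error = validate(turn["args"])
            if error is None:  # success: Act ends
                return ActOutcome(turn["args"], False, transcript, parse_errors)
            parse_errors.append(error)  # parse failure: report it back and continue
            transcript.append(("tool_result", error))
        elif tool in allowed_tools:  # other tool: dispatch (faked here)
            transcript.append(("tool_result", f"<output of {tool}>"))
        else:
            transcript.append(("tool_result", f"denied: {tool}"))
    return ActOutcome(None, True, transcript, parse_errors)  # cap hit


def validate_plan(payload):
    """Toy shape check standing in for completionSchema validation."""
    ok = isinstance(payload.get("summary"), str) and isinstance(payload.get("steps"), list)
    return None if ok else "payload needs summary (string) and steps (array)"


def scripted_llm(script):
    turns = iter(script)
    return lambda transcript: next(turns)


SCRIPT = [
    {"text": "Let me think about this."},
    {"tool": "Read", "args": {"path": "README.md"}},
    {"tool": "submit_plan", "args": {"summary": 42}},
    {"tool": "submit_plan", "args": {"summary": "Rename module", "steps": ["edit", "test"]}},
]

done = run_act(scripted_llm(SCRIPT), "submit_plan", ["Read"], validate_plan, turn_cap=20)
capped = run_act(scripted_llm(SCRIPT), "submit_plan", ["Read"], validate_plan, turn_cap=2)
```

With a cap of 20 the fourth scripted turn completes the stage after one recorded parse error; with a cap of 2 the same script ends in a cap hit, and in both cases the transcript is returned for `Assert` to inspect.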
A provider may emit more than one tool call in a single LLM response. The completion channel has a normative rule for these cases:
| Batch shape | Core behavior |
|---|---|
| Exactly one call and it is `completionTool` | Parse against `completionSchema`; the Success, Parse-failure, or normal Act-ending branches above apply. |
| `completionTool` plus one or more sibling tool calls in the same response | Batch rejected. Core does not dispatch any sibling tool call to the approval stack. A synthetic tool-call result is returned on the transcript stating that `completionTool` must be the sole tool call in its response. `Act` continues; the rejected turn counts toward `turnCap`. |
| Two or more `completionTool` calls in the same response | Batch rejected with the same synthetic error. `Act` continues; the turn counts toward `turnCap`. |
| No `completionTool`; one or more sibling tool calls | Dispatch through the normal gates per the Other tool branch. |
Core evaluates the batch before any sibling call reaches allowedTools, mode gate, or guard hooks — a rejected batch never executes any partial work. The determinism contract here is that Act either ends on a single clean completionTool call or continues with no side effects from the rejected turn.
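The batch rule can be expressed as a pure classification step that runs before anything is dispatched. This is a sketch: `classify_batch` and its string labels are hypothetical, and an empty batch is mapped to the plain-text branch for completeness.

```python
def classify_batch(tool_calls, completion_tool):
    """Decide what to do with the tool calls of one LLM response, before any dispatch."""
    if not tool_calls:
        return "steer"  # no tool call at all: the plain-text branch
    completions = tool_calls.count(completion_tool)
    if completions == 0:
        return "dispatch"  # only sibling tools: normal gates apply
    if completions == 1 and len(tool_calls) == 1:
        return "parse"  # sole completionTool call: validate its payload
    return "reject"  # mixed batch or repeated completionTool: nothing executes


sole = classify_batch(["submit_plan"], "submit_plan")
mixed = classify_batch(["Read", "submit_plan"], "submit_plan")
doubled = classify_batch(["submit_plan", "submit_plan"], "submit_plan")
siblings = classify_batch(["Read", "Grep"], "submit_plan")
```

Because the function inspects only tool names and returns before any call runs, a `"reject"` outcome cannot leave partial side effects behind.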
A cap-hit should be rare. If Assert routinely accepts cap-hit outcomes by interpreting transcripts, the stage has regressed to an LLM-orchestration model and the determinism pitch weakens. Cap-hit acceptance is an escape hatch for diagnostics and best-effort salvage, not a normal success path.
The completion channel requires the provider/model to support strict tool calling:
- Tool calls must be emitted as structured calls that core can parse deterministically (not buried inside prose).
- Malformed tool payloads must be routable back to the model as tool-call results, not dropped or re-emitted as plain text.
Every SM contributes toolCalling: hard to the session's required capabilities; see Capability Negotiation — Required capabilities. A provider/model that cannot guarantee strict tool calling (e.g., some weak OpenAI-compatible or local models) is incompatible with SM-driven workflows and fails the capability check at session start or on /sm attach.
- Tools. `allowedTools` names are resolved against the live tool set once, at orchestrator session start. A stage definition's `allowedTools` only narrows inheritance. Name drift is an authoring error caught at stage `Init`.
- Providers. The stage inherits the orchestrator's provider and model. A stage cannot declare its own provider in v1.
- Hooks. Hooks attached to orchestrator-session turn stages fire during stage `Act` on the tool calls. Guard hooks own per-call argument-sensitive deny (the orthogonality is specified in State Machines).
- Keep the `allowedTools` tight. An over-broad tool set wastes tokens without adding safety; a narrow set is easier to audit. When the LLM's plan legitimately needs an out-of-envelope tool, core surfaces exactly one `grantStageTool` prompt, and the LLM receives a typed denied-tool-call result if the user says no; the stage keeps running inside its envelope.
- Write `completionSchema` for shape, not judgment. If the judgment logic belongs in `Assert`, write `Assert`. If the schema rejects half of valid answers, you will retry forever.
- Set `turnCap` realistically. The cap is a safety net, not a budget. A cap hit should be rare; if it is common, the stage is misdesigned.
- Keep `Act` idempotent. Strict cancel (see SM Stage Lifecycle and Concurrency and Cancellation) will not clean up side effects your stage wrote before an abort. Use `ctx.workflowRunId` and `ctx.stageExecutionId` as idempotency keys against external systems so a re-run after abort or crash is a no-op. See Idempotency surface for stage authors.
- Prefer read-only siblings in parallel fan-out. If a parallel sibling must mutate, keep the mutation small and naturally idempotent; broad mutations belong in a sequential stage after the join. See Safe envelope for parallel fan-out.
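The idempotency-key guideline above can be sketched as follows, assuming an external system that deduplicates on a caller-supplied key. `FakeIssueTracker`, `idempotency_key`, and the extra `action` suffix are all hypothetical; only `workflowRunId` and `stageExecutionId` come from this page.

```python
class FakeIssueTracker:
    """Stand-in for an external system that deduplicates on a caller-supplied key."""

    def __init__(self):
        self.issues = {}

    def create_issue(self, idempotency_key, title):
        # A repeated key returns the existing record instead of creating a new one.
        return self.issues.setdefault(idempotency_key, {"title": title})


def idempotency_key(ctx, action):
    """Combine the two core-provided IDs with a per-action label (the label is an assumption)."""
    return f"{ctx['workflowRunId']}:{ctx['stageExecutionId']}:{action}"


def file_followup(ctx, tracker):
    key = idempotency_key(ctx, "file-followup")
    return tracker.create_issue(key, f"Follow-up for {ctx['task']}")


tracker = FakeIssueTracker()
ctx = {"workflowRunId": "run-1", "stageExecutionId": "exec-7", "task": "rename module"}

first = file_followup(ctx, tracker)
second = file_followup(ctx, tracker)  # simulated re-run after an abort or crash
```

The second call finds the existing record, so the re-run leaves the external system unchanged.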
See LLM Authoring Guide for a walk-through of common stage-definition patterns.
When NextResult.execution is 'parallel' and a join stage is declared,
the join stage runs only when all parallel siblings succeed. Any sibling
failure (non-successful Assert verdict or an unhandled ExtensionHost error
from a stage execution) aborts the compound turn immediately with
ExtensionHost/ParallelSiblingFailure. Core does not invoke the join stage for
a partially-completed fan-out; the SM receives the failure event and may inspect
which sibling failed via the event payload.
This is a deterministic safety property: a half-committed fan-out is never
silently patched over by the join stage. SMs that need partial-result handling
must model it explicitly (e.g., a recover stage listed as the sibling's own
terminal branch, not as the join).
NextResult.join naming a stage ID not present in nextStages is rejected at
load time with Validation/StageJoinDangling.
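A minimal sketch of the two rules above: the load-time check for a dangling join and the fail-fast gate in front of the join stage. The exception classes, the `"success"` verdict label, and passing `nextStages` as a plain list are assumptions; the sketch also evaluates collected verdicts in one pass, whereas the contract aborts as soon as a sibling fails.

```python
class StageJoinDangling(Exception):
    code = "Validation/StageJoinDangling"


class ParallelSiblingFailure(Exception):
    code = "ExtensionHost/ParallelSiblingFailure"

    def __init__(self, failed):
        super().__init__(f"{self.code}: {failed}")
        self.failed = failed  # lets the SM see which siblings failed


def check_join_declared(join, next_stages):
    """Load-time check: a join must name a stage ID present in nextStages."""
    if join is not None and join not in next_stages:
        raise StageJoinDangling(f"{StageJoinDangling.code}: {join}")


def gate_join(sibling_verdicts):
    """Allow the join stage only when every parallel sibling succeeded."""
    failed = [stage_id for stage_id, verdict in sibling_verdicts.items() if verdict != "success"]
    if failed:
        raise ParallelSiblingFailure(failed)
    return True


check_join_declared("merge", ["lint", "test", "merge"])  # declared: passes
all_ok = gate_join({"lint": "success", "test": "success"})

try:
    gate_join({"lint": "success", "test": "retry"})
    failure = None
except ParallelSiblingFailure as exc:
    failure = exc  # the join stage would not be invoked
```

Raising instead of returning `False` keeps the partial fan-out from reaching the join by accident; an SM that wants partial-result handling would model it as an explicit stage, as described above.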
Initial definition. Five normative stage-level fields:
| Field | Description |
|---|---|
| `body` | Markdown system-prompt template with `${ctx.*}` placeholders; rendered once at `Init`. |
| `allowedTools` | Optional allow-list narrowing the session tool set for this stage. |
| `turnCap` | Hard upper bound on LLM continuation iterations (must be ≥ 1). |
| `completionTool` / `completionSchema` | Stage-local completion channel; not in the Tool registry; bypasses approval gates. |
| `next(ctx)` | Returns `NextResult` with `execution: 'sequential' \| …` |
Q-4 fail-fast semantics added: parallel siblings must all succeed before join
runs; any sibling failure aborts the compound turn with
ExtensionHost/ParallelSiblingFailure.