
Stage Definitions

Z-M-Huang edited this page Apr 27, 2026 · 4 revisions

A stage definition declares a single stage in a state machine's (SM's) pipeline. It is a markdown file with a YAML frontmatter header. The frontmatter is the stage's contract with the runtime; the body is the system-prompt template sent to the LLM at Init.

This page is a companion to the State Machines contract and SM Stage Lifecycle. It carries contractVersion: 1.0.0 (see Changelog) and has no independent semver; breaking changes are recorded in the State Machines contract's changelog.


File shape

```markdown
---
id: plan
name: Plan
allowedTools: ["Read", "Grep", "Glob"]
completionTool: submit_plan
completionSchema:
  type: object
  required: [summary, steps]
  properties:
    summary: { type: string }
    steps:
      type: array
      items: { type: string }
retryPolicy:
  maxAttempts: 2
  backoff: none
turnCap: 20
resolutionPolicy: retry-later
---

You are planning a change to the repository. You have access to read tools only —
you cannot modify files in this stage.

Task: {{ctx.task}}

Produce a plan. When you are done, call `submit_plan` with:
- `summary`: one paragraph summarising the approach.
- `steps`: an ordered list of concrete actions.
```

Required frontmatter fields

| Field | Type | Meaning |
| --- | --- | --- |
| id | string (kebab-case) | Pipeline-unique ID used by Next() and the SM state slot. |
| name | string | Human-readable label used in UI, audit, and logs. |
| allowedTools | list of tool names | The subset of the orchestrator's live tool set this stage may call. Narrows inheritance; does not re-resolve names. |
| completionTool | tool name | The single tool the LLM must call to end Act successfully. |
| completionSchema | JSON Schema | Validates the completionTool's payload. Parse gate, not truth gate (see SM Stage Lifecycle). |
| retryPolicy | { maxAttempts, backoff } | Cap and backoff for Assert → retry verdicts within a single stage execution. |
| turnCap | positive integer | Hard upper bound on LLM turns inside Act. |
| resolutionPolicy | enum | How this stage treats CheckGate outcomes defer and block. Values and semantics in SM Stage Lifecycle. |

All eight are required. A stage definition missing any of them fails init on the owning SM.
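The required-field rule can be sketched as a check over frontmatter that has already been parsed into a mapping. This is an illustration under stated assumptions, not the runtime's validator: the helper name `missing_required_fields` and the plain-dict representation are hypothetical, and only the eight field names come from the table above.

```python
from __future__ import annotations

# The eight names come from the required-fields table; the rest is illustrative.
REQUIRED_FIELDS = (
    "id", "name", "allowedTools", "completionTool",
    "completionSchema", "retryPolicy", "turnCap", "resolutionPolicy",
)


def missing_required_fields(frontmatter: dict) -> list[str]:
    """Return the required fields absent from a parsed frontmatter mapping."""
    return [field for field in REQUIRED_FIELDS if field not in frontmatter]


# The plan stage from "File shape", written as an already-parsed mapping.
plan_stage = {
    "id": "plan",
    "name": "Plan",
    "allowedTools": ["Read", "Grep", "Glob"],
    "completionTool": "submit_plan",
    "completionSchema": {"type": "object", "required": ["summary", "steps"]},
    "retryPolicy": {"maxAttempts": 2, "backoff": "none"},
    "turnCap": 20,
    "resolutionPolicy": "retry-later",
}

# Dropping any one field is enough to fail the check.
incomplete = {k: v for k, v in plan_stage.items() if k != "turnCap"}
```

A non-empty result corresponds to the case where the stage definition would fail init on the owning SM.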


Optional frontmatter fields

| Field | Type | Meaning |
| --- | --- | --- |
| description | string | Free-text description surfaced in /sm show <id>. |
| inputsSchema | JSON Schema | Schema that each element of ctx.upstream must satisfy. For a sequential successor ctx.upstream has one element (its predecessor's StageResult); for a join stage each sibling StageResult is validated individually. Validation fires at Setup; a failure is an authoring diagnostic audited as StageInitFailed. |
| tags | list of strings | For SM-authored filtering; not interpreted by core. |

Optional fields are additive. The contract does not define any meaning for unknown keys and the runtime ignores them, so SM authors may carry their own metadata in the frontmatter.


Body

The body is plain markdown, sent verbatim as the stage's system prompt, with templated placeholders substituted before send.

Template placeholders

Templates render against the stage's StageContext, bound to ctx. Every placeholder resolves inside that namespace, with one exception: the stage's own frontmatter is available under stage.*. There are no other namespaces.

| Placeholder | Resolves to |
| --- | --- |
| {{ctx.<sm-field>}} | Any SM-written field under ctx (e.g., {{ctx.task}}, {{ctx.branch}}, {{ctx.files}}). The SM owns these names. |
| {{ctx.upstream[i].*}} | Predecessor outputs. ctx.upstream[0] is the predecessor's StageResult on a sequential successor; a join indexes each sibling's StageResult in NextResult.stages order. Empty on the entry stage and on parallel siblings. Core-provided. See SM Stage Lifecycle § How successors receive StageResult. |
| {{ctx.workflowRunId}} / {{ctx.stageExecutionId}} | Idempotency keys. Core-provided. |
| {{stage.id}} / {{stage.name}} | The stage's own frontmatter values. |

ctx.attempts is not template-addressable. Init renders the body once at the start of the stage execution, and Assert → retry re-enters Act in the same transcript without re-rendering — so any {{ctx.attempts.*}} value would always resolve against the first attempt. Per-attempt signalling happens inside the transcript (the retry reason arrives as an extra message to the LLM on retry), not via the rendered body. See SM Stage Lifecycle § Init.

ctx.host is a runtime surface (Host API) and is not template-addressable. The complete StageContext schema — reserved core-provided names and the SM-writable extension space — is defined in SM Stage Lifecycle § StageContext.

The template engine is intentionally narrow. No conditionals, no loops, no arbitrary expressions — every branch point belongs in the SM's pipeline code, not in the prompt. Authors who need conditional system prompts should split the logic into multiple stages or compose the ctx before Setup.
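To make the narrowness concrete, here is a minimal sketch of a renderer that supports only path lookups under ctx.* and stage.*. It is not the runtime's engine. The function names are hypothetical, and raising `KeyError` for an unrecognised or non-addressable placeholder is an assumption, since this page does not say how such placeholders are reported.

```python
import re

# Matches {{ path }} where path is dotted names with optional [i] indexes.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.\[\]]*)\s*\}\}")
# ctx names this page lists as not template-addressable.
NOT_ADDRESSABLE = {"attempts", "host"}


def resolve(path, ctx, stage):
    """Walk a dotted or indexed path under ctx.* or stage.*."""
    parts = re.findall(r"[A-Za-z_]\w*|\[\d+\]", path)
    root, rest = parts[0], parts[1:]
    if root not in ("ctx", "stage"):
        # Assumption: unknown namespaces such as env.* raise instead of rendering.
        raise KeyError("unrecognised namespace: " + path)
    if root == "ctx" and rest and rest[0] in NOT_ADDRESSABLE:
        raise KeyError("not template-addressable: " + path)
    value = ctx if root == "ctx" else stage
    for part in rest:
        # "[3]" indexes a list; anything else is a mapping key.
        value = value[int(part[1:-1])] if part.startswith("[") else value[part]
    return value


def render(body, ctx, stage):
    """Substitute each placeholder once; no conditionals, loops, or expressions."""
    return PLACEHOLDER.sub(lambda m: str(resolve(m.group(1), ctx, stage)), body)


# Illustrative data only; the field values are made up.
example_ctx = {
    "task": "Add a --dry-run flag",
    "upstream": [{"output": {"summary": "ok"}}],
}
example_stage = {"id": "plan", "name": "Plan"}
rendered = render("Task: {{ctx.task}} ({{stage.id}})", example_ctx, example_stage)
```

Because the pattern accepts nothing but a path, there is no syntax in which a branch or loop could be expressed; that logic stays in the SM's pipeline code.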

What does not enter the body

  • Env or settings values. {{env.FOO}} is not recognised. LLM context isolation applies — secrets do not reach the prompt except through the sanctioned Context Provider path. See LLM Context Isolation.
  • Bulk transcripts of prior stages. Stage transcripts are stage-local and in-memory; they never cross the stage boundary. A successor receives only its predecessor's StageResult — typed fields (verdict, reason, parsed, capHit, attemptCount) plus the optional SM-authored output. If a downstream stage needs a digest of a prior stage's reasoning, the prior stage's Exit code writes a reduced summary to its StageResult.output and the successor reads it via {{ctx.upstream[i].output.<field>}}. There is no channel — neither the state slot nor ctx.upstream — that carries a full transcript. See SM Stage Lifecycle § StageResult and State Machines § State slot.
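The digest pattern described in the last bullet can be sketched as follows. The field names on `StageResult` follow this page, but the Python types, the `exit_plan_stage` helper, and the verdict string are assumptions made for illustration; the actual shapes are defined in SM Stage Lifecycle § StageResult.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class StageResult:
    # Field names follow this page; the types are assumptions.
    verdict: str
    reason: str
    parsed: Any
    capHit: bool
    attemptCount: int
    output: dict = field(default_factory=dict)  # optional SM-authored output


def exit_plan_stage(transcript: list[str], parsed: dict) -> StageResult:
    """Hypothetical Exit code: keep a small digest and drop the transcript."""
    digest = {"summary": parsed["summary"], "stepCount": len(parsed["steps"])}
    return StageResult(
        verdict="accepted",  # placeholder value, not a documented verdict name
        reason="",
        parsed=parsed,
        capHit=False,
        attemptCount=1,
        output=digest,
    )


result = exit_plan_stage(
    ["turn 1", "turn 2"],
    {"summary": "Add flag", "steps": ["edit cli.py", "add test"]},
)
# What a sequential successor would see as ctx.upstream.
successor_ctx = {"upstream": [result]}
```

The transcript is an input to Exit but has no field to land in, which mirrors the rule that no channel carries it across the stage boundary.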

Completion channel semantics

The completionTool names a stage-local completion channel — not a Tool in the Tools registry. Core synthesizes the channel per-stage at Setup and injects it into the stage's LLM tool manifest alongside the tools selected by allowedTools. Its role is to give the LLM a well-shaped way to signal "this stage is done; here is my result."

What the channel is

| Property | Rule |
| --- | --- |
| Registration | Not registered. Core creates the channel per-stage from completionTool + completionSchema; it does not appear in the Tool registry. |
| Reserved name | The completionTool name must not match any registered tool name (bundled, discovered, or MCP). Core rejects the stage definition at Init on collision; see SM Stage Lifecycle § Init. |
| Visibility | Always injected into the stage's tool manifest. Present even if allowedTools: []. |
| Approval | Bypasses allowedTools, the security-mode gate, and guard hooks. The channel is core-internal; it produces no side effect beyond ending Act. |
| Scope | Stage-local. Discarded at Exit. Not visible to other stages. |
| Persistence | Not written to session message history. The parsed payload is handed to Assert and to the SM's Exit for state-slot persistence if the SM chooses. |
| Audit | Not a ToolInvocation… event. Completion lands in StageAssertOutcome and StageExited. |
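The Reserved name and Visibility rules can be sketched together. The helper below is hypothetical and folds the Init-time collision check and the Setup-time injection into one function for brevity; `StageDefinitionError` is an invented error type, not a name from the runtime.

```python
from __future__ import annotations


class StageDefinitionError(ValueError):
    """Hypothetical error type for a rejected stage definition."""


def build_tool_manifest(
    registered: set[str], allowed_tools: list[str], completion_tool: str
) -> list[str]:
    """Return the tool names the stage's LLM would see."""
    # Reserved name: the channel must not shadow any registered tool.
    if completion_tool in registered:
        raise StageDefinitionError(
            f"completionTool {completion_tool!r} collides with a registered tool"
        )
    # Visibility: the channel is appended regardless of allowedTools,
    # so it is present even when allowed_tools is empty.
    return [*allowed_tools, completion_tool]


registered_tools = {"Read", "Grep", "Glob", "Write"}
manifest = build_tool_manifest(registered_tools, ["Read", "Grep", "Glob"], "submit_plan")
empty_envelope = build_tool_manifest(registered_tools, [], "submit_plan")
```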

Act-ending branches

```mermaid
flowchart TD
    Start[Act starts] --> LLM[LLM turn]
    LLM -->|"called completionTool"| Parse{parse completionSchema}
    Parse -->|ok| Done["Act ends: completion ok"]
    Parse -->|fail| Retry[core returns parse-error<br/>as a tool-call result;<br/>Act continues]
    LLM -->|"no tool call<br/>plain-text only"| Steer[core appends a<br/>steering message;<br/>Act continues]
    LLM -->|"called another tool"| RouteTool[dispatch through<br/>allowedTools and gates]
    RouteTool --> LLM
    Steer --> TurnCount{"turn count < turnCap?"}
    Retry --> TurnCount
    TurnCount -->|yes| LLM
    TurnCount -->|no| Cap["Act ends: capHit"]
```
  • Success. The LLM calls completionTool with a payload that passes completionSchema. Act ends; Assert receives { parsed, transcript, attempts }.
  • Parse failure. The LLM called completionTool with a payload that fails completionSchema. Core synthesizes a tool-call result carrying the schema-validation error (as content the model can read) and returns it on the same transcript. Act continues until a successful completion or turnCap fires. The parse error is captured on ctx.attempts[i].parseError for Assert to inspect.
  • Plain-text reply. The LLM replied with prose and no tool call. Core appends a short steering message to the transcript (e.g., "call completionTool when you are done") and Act continues. A run of plain-text replies counts toward turnCap.
  • Other tool. The LLM called a tool named in allowedTools (or covered by an active grantStageTool). The call dispatches through the normal gates; the result returns to the transcript; Act continues.
  • Cap. turnCap LLM turns elapsed without a successful completion call. Act ends with capHit: true and Assert receives the raw transcript.

In all cases, Assert sees the raw transcript and the attempt history and has the last word.
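The branches above can be sketched as a loop over LLM turns. This is a simplified model under stated assumptions: `run_act`, `validate_plan`, and the transcript strings are hypothetical, every LLM turn is counted toward turnCap (following the field definition), and dispatch through allowedTools and the gates is reduced to a placeholder line.

```python
from __future__ import annotations


def validate_plan(payload):
    """Tiny stand-in for completionSchema validation; returns an error or None."""
    if not isinstance(payload, dict):
        return "payload must be an object"
    for key in ("summary", "steps"):
        if key not in payload:
            return f"missing required field: {key}"
    return None


def run_act(next_turn, completion_tool, validate, turn_cap):
    """Drive Act until a clean completion or the cap; return what Assert inspects.

    next_turn() returns (tool_name, payload); tool_name None means plain text.
    """
    transcript = []
    for _ in range(turn_cap):
        tool, payload = next_turn()
        if tool == completion_tool:
            error = validate(payload)
            if error is None:
                # Success: Act ends with the parsed payload.
                return {"parsed": payload, "capHit": False, "transcript": transcript}
            # Parse failure: the error goes back as a tool-call result.
            transcript.append(f"tool-result: parse error: {error}")
        elif tool is None:
            # Plain-text reply: core appends a steering message.
            transcript.append(f"steering: call {completion_tool} when you are done")
        else:
            # Other tool: placeholder for dispatch through the normal gates.
            transcript.append(f"tool-result: dispatched {tool}")
    # Cap: turn_cap turns elapsed without a successful completion call.
    return {"parsed": None, "capHit": True, "transcript": transcript}


script = iter([
    (None, "Let me think about this."),
    ("Read", {"path": "cli.py"}),
    ("submit_plan", {"summary": "Add flag"}),  # fails: steps missing
    ("submit_plan", {"summary": "Add flag", "steps": ["edit cli.py"]}),
])
finished = run_act(lambda: next(script), "submit_plan", validate_plan, turn_cap=20)
capped = run_act(lambda: (None, "prose"), "submit_plan", validate_plan, turn_cap=3)
```

In both outcomes the transcript is returned alongside the result, matching the rule that Assert sees the raw transcript and has the last word.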

Multi-tool responses containing completionTool

A provider may emit more than one tool call in a single LLM response. The completion channel has a normative rule for these cases:

| Batch shape | Core behavior |
| --- | --- |
| Exactly one call and it is completionTool | Parse against completionSchema; the Success, Parse-failure, or normal Act-ending branches above apply. |
| completionTool plus one or more sibling tool calls in the same response | Batch rejected. Core does not dispatch any sibling tool call to the approval stack. A synthetic tool-call result is returned on the transcript stating that completionTool must be the sole tool call in its response. Act continues; the rejected turn counts toward turnCap. |
| Two or more completionTool calls in the same response | Batch rejected with the same synthetic error. Act continues; the turn counts toward turnCap. |
| No completionTool; one or more sibling tool calls | Dispatch through the normal gates per the Other tool branch. |

Core evaluates the batch before any sibling call reaches allowedTools, mode gate, or guard hooks — a rejected batch never executes any partial work. The determinism contract here is that Act either ends on a single clean completionTool call or continues with no side effects from the rejected turn.
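The batch rule reduces to a small classification that runs before any call is dispatched. The sketch below assumes a response is represented as a list of tool names; the function name and the returned labels are hypothetical.

```python
from __future__ import annotations


def classify_batch(tool_calls: list[str], completion_tool: str) -> str:
    """Decide how one LLM response is handled, before any call executes."""
    completions = tool_calls.count(completion_tool)
    if completions == 0:
        # No completion call: normal dispatch, or the plain-text branch if empty.
        return "dispatch" if tool_calls else "plain-text"
    if completions == 1 and len(tool_calls) == 1:
        # The only clean shape: a single, sole completionTool call.
        return "parse"
    # completionTool with siblings, or repeated: nothing in the batch runs.
    return "reject"
```

Because the classification looks only at tool names and happens first, a rejected batch cannot leave partial side effects behind.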

Cap-hit is an exception path

A cap-hit should be rare. If Assert routinely accepts cap-hit outcomes by interpreting transcripts, the stage has regressed to an LLM-orchestration model and the determinism pitch weakens. Cap-hit acceptance is an escape hatch for diagnostics and best-effort salvage, not a normal success path.

Provider capability requirements

The completion channel requires the provider/model to support strict tool calling:

  • Tool calls must be emitted as structured calls that core can parse deterministically (not buried inside prose).
  • Malformed tool payloads must be routable back to the model as tool-call results, not dropped or re-emitted as plain text.

Every SM contributes toolCalling: hard to the session's required capabilities; see Capability Negotiation — Required capabilities. A provider/model that cannot guarantee strict tool calling (e.g., some weak OpenAI-compatible or local models) is incompatible with SM-driven workflows and fails the capability check at session start or on /sm attach.


Relationship to other surfaces

  • Tools. allowedTools names are resolved against the live tool set once, at orchestrator session start. A stage definition's allowedTools only narrows inheritance. Name drift is an authoring error caught at stage Init.
  • Providers. The stage inherits the orchestrator's provider and model. A stage cannot declare its own provider in v1.
  • Hooks. Hooks attached to orchestrator-session turn stages fire during stage Act on the tool calls. Guard hooks own per-call argument-sensitive deny (the orthogonality is specified in State Machines).
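The narrowing rule in the Tools bullet can be sketched as a subset check. The helper name and the use of `ValueError` are assumptions; the page only states that name drift is an authoring error caught at stage Init.

```python
from __future__ import annotations


def narrow_allowed_tools(live_tools: set[str], allowed_tools: list[str]) -> list[str]:
    """Return the stage's tool subset, rejecting names absent from the live set."""
    unknown = [name for name in allowed_tools if name not in live_tools]
    if unknown:
        # Name drift: the stage names a tool the session never resolved.
        raise ValueError(f"allowedTools names not in the live tool set: {unknown}")
    # Narrowing only: names are never re-resolved or added here.
    return list(allowed_tools)


live = {"Read", "Grep", "Glob", "Write"}
plan_tools = narrow_allowed_tools(live, ["Read", "Grep", "Glob"])
```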

Authoring guidance

  • Keep allowedTools tight. An over-broad tool set wastes tokens without adding safety; a narrow set is easier to audit. When the LLM's plan legitimately needs an out-of-envelope tool, core surfaces exactly one grantStageTool prompt. If the user says no, the LLM receives a typed denied-tool-call result and the stage keeps running inside its envelope.
  • Write completionSchema for shape, not judgment. Judgment logic belongs in Assert, so write it there. If the schema rejects half of the valid answers, the stage will burn turns on parse failures until turnCap fires.
  • Set turnCap realistically. The cap is a safety net, not a budget. A cap hit should be rare; if it is common, the stage is misdesigned.
  • Keep Act idempotent. Strict cancel (see SM Stage Lifecycle and Concurrency and Cancellation) will not clean up side effects your stage wrote before an abort. Use ctx.workflowRunId and ctx.stageExecutionId as idempotency keys against external systems so a re-run after abort or crash is a no-op. See Idempotency surface for stage authors.
  • Prefer read-only siblings in parallel fan-out. If a parallel sibling must mutate, keep the mutation small and naturally idempotent; broad mutations belong in a sequential stage after the join. See Safe envelope for parallel fan-out.
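The idempotency guidance can be illustrated with a key derived from the two core-provided IDs. This sketch assumes a re-run after abort or crash sees the same workflowRunId and stageExecutionId, and that the external system deduplicates on a caller-supplied key. `FakeTicketSystem`, `idempotency_key`, and the key format are all hypothetical.

```python
from __future__ import annotations


def idempotency_key(ctx: dict, action: str) -> str:
    """Combine the core-provided IDs with a stage-chosen action label."""
    return f"{ctx['workflowRunId']}:{ctx['stageExecutionId']}:{action}"


class FakeTicketSystem:
    """Stand-in for an external system that deduplicates on a supplied key."""

    def __init__(self) -> None:
        self.created: dict[str, str] = {}

    def create_ticket(self, key: str, title: str) -> str:
        # A repeated key returns the existing record instead of creating another.
        return self.created.setdefault(key, title)


run_ctx = {"workflowRunId": "run-1", "stageExecutionId": "exec-7"}
tickets = FakeTicketSystem()
key = idempotency_key(run_ctx, "open-ticket")
tickets.create_ticket(key, "Fix build")
tickets.create_ticket(key, "Fix build")  # simulated re-run after an abort
```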

See LLM Authoring Guide for a walk-through of common stage-definition patterns.


Parallel-sibling fail-fast semantics (Q-4)

When NextResult.execution is 'parallel' and a join stage is declared, the join stage runs only when all parallel siblings succeed. Any sibling failure (non-successful Assert verdict or an unhandled ExtensionHost error from a stage execution) aborts the compound turn immediately with ExtensionHost/ParallelSiblingFailure. Core does not invoke the join stage for a partially-completed fan-out; the SM receives the failure event and may inspect which sibling failed via the event payload.

This is a deterministic safety property: a half-committed fan-out is never silently patched over by the join stage. SMs that need partial-result handling must model it explicitly (e.g., a recover stage listed as the sibling's own terminal branch, not as the join).

NextResult.join naming a stage ID not present in nextStages is rejected at load time with Validation/StageJoinDangling.
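Both rules in this section can be sketched as guards. The exception classes below only stand in for the Validation/StageJoinDangling and ExtensionHost/ParallelSiblingFailure codes, and passing nextStages as a separate list is an assumption, since this page does not show where it is declared. The sketch models only the gating of the join, not the immediate abort of siblings still in flight.

```python
from __future__ import annotations


class StageJoinDangling(ValueError):
    """Stands in for the Validation/StageJoinDangling load-time rejection."""


class ParallelSiblingFailure(RuntimeError):
    """Stands in for ExtensionHost/ParallelSiblingFailure."""


def validate_join(join: str | None, next_stages: list[str]) -> None:
    """Load-time check: a declared join must name a stage in nextStages."""
    if join is not None and join not in next_stages:
        raise StageJoinDangling(join)


def gate_join(sibling_succeeded: dict[str, bool]) -> None:
    """Run-time check: the join may run only if every sibling succeeded."""
    failed = [stage_id for stage_id, ok in sibling_succeeded.items() if not ok]
    if failed:
        # The payload lists the failed siblings so the SM can inspect them.
        raise ParallelSiblingFailure(failed)
```

Raising instead of returning a partial result reflects the safety property: the join never sees a half-committed fan-out.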


Changelog

contractVersion: 1.0.0

Initial definition. Five normative stage-level fields:

| Field | Description |
| --- | --- |
| body | Markdown system-prompt template with {{ctx.*}} placeholders; rendered once at Init. |
| allowedTools | Optional allow-list narrowing the session tool set for this stage. |
| turnCap | Hard upper bound on LLM continuation iterations (must be ≥ 1). |
| completionTool / completionSchema | Stage-local completion channel; not in the Tool registry; bypasses approval gates. |
| next(ctx) | Returns NextResult with `execution: 'sequential' |

Q-4 fail-fast semantics added: parallel siblings must all succeed before join runs; any sibling failure aborts the compound turn with ExtensionHost/ParallelSiblingFailure.
