Interaction Protocol

Z-M-Huang edited this page May 3, 2026 · 7 revisions

contractVersion: 1.1.1

The typed request/response channel between core and active UI interactors. Used for ask-user, approval, and auth/device-code flows. It sits next to the event bus but has a different shape.


Why two channels

stud-cli splits cross-actor communication into two rails.

| Channel | Shape | Direction | Authority |
| --- | --- | --- | --- |
| Event Bus | pub/sub | core → subscribers | None; projection only |
| Interaction Protocol | typed request → response | core ↔ active interactors | Authoritative; core blocks until one answer wins |

Events describe what happened. Interactions ask for a decision core needs before it can continue. Conflating them is how authority leaks into projection and UIs start to depend on event timing for correctness.

See Event and Command Ordering for the wider ordering guarantees.


Roles

classDiagram
    class Core {
        +requestInteraction(req) : Response
    }
    class UI {
        <<interface>>
    }
    class Subscriber {
        +onEvent(event)
    }
    class Interactor {
        +handle(req) : Response
    }
    UI <|.. Subscriber
    UI <|.. Interactor
    Core --> Interactor : fan-out requestInteraction
    Core --> Subscriber : publish events

A UI extension may implement either role or both.

  • Subscriber — observes events. Many can coexist.
  • Interactor — answers interaction requests. Many can be active concurrently (Q-9).
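
The two roles can be sketched as a pair of interfaces. This is a minimal illustration, not the contracted API: only the method names `onEvent` and `handle` come from the diagram, while `BusEvent`, `InteractionRequest`, `InteractionResponse`, `LogPane`, `TerminalUi`, and `isInteractor` are hypothetical names introduced here.

```typescript
// Hypothetical shapes: only onEvent and handle come from the diagram above.
type BusEvent = { name: string; payload?: unknown };
type InteractionRequest = { kind: string; correlationId: string };
type InteractionResponse = {
  correlationId: string;
  status: "accepted" | "cancelled";
  value?: unknown;
};

interface Subscriber {
  onEvent(event: BusEvent): void;
}

interface Interactor {
  handle(req: InteractionRequest): InteractionResponse;
}

// Subscriber only: projects events, never answers a request.
class LogPane implements Subscriber {
  readonly seen: string[] = [];
  onEvent(event: BusEvent): void {
    this.seen.push(event.name);
  }
}

// Both roles: projects events and answers requests.
class TerminalUi implements Subscriber, Interactor {
  readonly seen: string[] = [];
  onEvent(event: BusEvent): void {
    this.seen.push(event.name);
  }
  handle(req: InteractionRequest): InteractionResponse {
    // Echo the correlationId so the response can be matched to its request.
    return { correlationId: req.correlationId, status: "accepted", value: "yes" };
  }
}

// Runtime check a host could use to decide whether a UI joins the interaction rail.
function isInteractor(ui: object): ui is Interactor {
  return typeof (ui as { handle?: unknown }).handle === "function";
}
```

A real interactor would most likely answer asynchronously; the synchronous `handle` signature here simply mirrors the diagram.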

Multiple interactors (Q-9)

Per Q-9, a session may have multiple active interactors concurrently. Each interactor receives the same prompt; the first to respond wins (race-to-answer). The remaining interactors dismiss their own dialogs on the broadcast InteractionAnswered event; late responses are rejected with a typed error.

Per-interactor FIFO. Each interactor serializes its own dialog queue: a single interactor will not see prompt N+1 until prompt N has been answered or cancelled at that interactor. Cross-interactor resolution is race-to-answer, not serialized — the prompt fans out concurrently and the first-completed wins.
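
The per-interactor FIFO rule can be pictured as a small queue owned by each interactor. The sketch below is an assumption about how an interactor might implement it; `DialogQueue` and its methods are hypothetical names, and "settle" stands for any way a prompt closes at this interactor (local answer, dismissal, or cancellation).

```typescript
// One queue per interactor. Only the head prompt is ever shown.
class DialogQueue {
  private pending: string[] = [];

  // Called when a fanned-out request reaches this interactor.
  enqueue(correlationId: string): void {
    this.pending.push(correlationId);
  }

  // The prompt currently on screen, if any.
  visible(): string | undefined {
    return this.pending[0];
  }

  // Removes a prompt wherever it sits in the queue, so a prompt that was
  // resolved before it reached the head never appears at all.
  settle(correlationId: string): void {
    this.pending = this.pending.filter((id) => id !== correlationId);
  }
}
```

Because every interactor keeps its own queue, two interactors may be showing different prompts at the same moment; the ordering guarantee is local to each one.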

InteractionAnswered event. Core emits this event on the event bus immediately after the winning response is accepted. Payload: { correlationId, status, value?, extId } (the answering interactor's extId is included so observers can attribute the decision). It is a core-emitted event, not a subscriber-emitted one — interactors do not synthesize it themselves.
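
A minimal sketch of the race-to-answer bookkeeping, assuming core keeps one winner per correlationId. `RaceArbiter`, `SubmitResult`, and the `published` array (a stand-in for the event bus) are hypothetical; the payload fields and the error name are the ones given in this section. The late-response error is modelled as a result value here purely to keep the example short.

```typescript
type InteractionAnswered = {
  correlationId: string;
  status: string;
  value?: unknown;
  extId: string;
};

type SubmitResult =
  | { ok: true; event: InteractionAnswered }
  | { ok: false; error: "Session/InteractionAlreadyAnswered" };

class RaceArbiter {
  // correlationId -> extId of the interactor whose response was accepted.
  private winners: { [correlationId: string]: string } = {};
  // Stand-in for the event bus: the events core would publish.
  readonly published: InteractionAnswered[] = [];

  submit(correlationId: string, extId: string, value: unknown): SubmitResult {
    if (Object.prototype.hasOwnProperty.call(this.winners, correlationId)) {
      // A winner already exists, so this response lost the race.
      return { ok: false, error: "Session/InteractionAlreadyAnswered" };
    }
    this.winners[correlationId] = extId;
    const event: InteractionAnswered = { correlationId, status: "accepted", value, extId };
    // Emitted by core, never by an interactor.
    this.published.push(event);
    return { ok: true, event };
  }
}
```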

Session/InteractionAlreadyAnswered. The typed error a late-arriving interactor's response receives. Interactors must handle this gracefully: release any held UI state, dismiss the dialog, and treat the error as a normal race outcome. It is not a bug. The expected handler shape is "submit response → on InteractionAlreadyAnswered, dismiss the dialog and return; on any other error, propagate."
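
The handler shape described above might look like the following. This is a sketch under assumptions: the error is assumed to carry its name in a `code` property, and `Dialog`, `isAlreadyAnswered`, and `submitAndSettle` are hypothetical names rather than part of the contract.

```typescript
const ALREADY_ANSWERED = "Session/InteractionAlreadyAnswered";

// Hypothetical dialog handle; a real interactor would close its own widget.
type Dialog = { open: boolean };

// Assumption: the typed error exposes its name on a `code` property.
function isAlreadyAnswered(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === ALREADY_ANSWERED
  );
}

// Returns "won" if this interactor's response was accepted and "lost" if
// another interactor answered first. Any other error propagates.
function submitAndSettle(dialog: Dialog, submit: () => void): "won" | "lost" {
  try {
    submit();
    dialog.open = false;
    return "won";
  } catch (err) {
    if (isAlreadyAnswered(err)) {
      dialog.open = false; // normal race outcome: dismiss and return
      return "lost";
    }
    throw err;
  }
}
```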


The headless behavior (no active interactor) lives in Headless and Interactor.


Request kinds

The protocol is typed. Core and interactors share a fixed schema.

| Kind | Prompts for | Blocks stage | Notes |
| --- | --- | --- | --- |
| Ask | Free-form text from user | Yes | Used by the ask-user tool; see Bundled Tools |
| Approve | Tool-call approval | Yes | Per-call; requests within a parallel batch are serialized |
| Select | Choice from a list | Yes | Used by commands that require disambiguation |
| Auth.DeviceCode | Auth flow completion | Yes | Provider or MCP server auth |
| Auth.Password | Password/secret | Yes | Value is entered via secure input; never logged |
| Confirm | Yes/no with default | Yes | Used by destructive operations |
| grantStageTool | One-shot tool grant inside an SM stage | Yes | Stage-scoped; see below |
| approveSubagentEnvelope | One-shot subagent envelope approval at child-session spawn | Yes | Session-scoped to the child; see § approveSubagentEnvelope |

Response payloads are typed per kind (Approve has an allowOnce flag, Select has an index, etc.). The shapes are normative: extensions speak through the contracted shape and nothing else.

New request kinds are added via a minor contract bump. Removing a kind is a major bump. See Versioning and Compatibility.
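
The enumerated kinds and the bump rule can be expressed in a few lines. The kind names are the ones in the table above; `REQUEST_KINDS`, `isRequestKind`, and `requiredBump` are hypothetical helpers, and the sketch only covers additions and removals of kinds, not payload changes.

```typescript
const REQUEST_KINDS = [
  "Ask",
  "Approve",
  "Select",
  "Auth.DeviceCode",
  "Auth.Password",
  "Confirm",
  "grantStageTool",
  "approveSubagentEnvelope",
] as const;

type RequestKind = typeof REQUEST_KINDS[number];

// Rejects anything outside the enumerated set.
function isRequestKind(kind: string): kind is RequestKind {
  return (REQUEST_KINDS as readonly string[]).indexOf(kind) !== -1;
}

type BumpLevel = "none" | "minor" | "major";

// Removing a kind forces a major bump; adding one needs at least a minor bump.
function requiredBump(before: readonly string[], after: readonly string[]): BumpLevel {
  const removed = before.some((k) => after.indexOf(k) === -1);
  if (removed) return "major";
  const added = after.some((k) => before.indexOf(k) === -1);
  return added ? "minor" : "none";
}
```

Comparing the seven kinds of 1.0.0 against the current eight yields `"minor"`, which agrees with the 1.1.0 entry in the changelog.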

Common attribution fields

Every request kind carries the following attribution fields in addition to its kind-specific payload:

| Field | Meaning |
| --- | --- |
| correlationId | Threads the request and its response across the bus and audit. Present since 1.0.0. |
| parentSessionId | Present when the request originates inside a child session; equals the orchestrator session's id. Absent when the orchestrator session itself raised the request. Added in 1.1.0. |
| subagentId | Present when the request originates inside a child session; identifies which subagent is asking. Added in 1.1.0. Interactor implementations MUST surface this in the dialog (chip + subagent label) so the user can attribute the prompt to the agent that raised it. |

Attribution applies to every kind: Ask, Approve, Select, Auth.DeviceCode, Auth.Password, Confirm, grantStageTool, and approveSubagentEnvelope. For grantStageTool, the fields would be populated when an attached SM's stage runs inside a child session; that case is out of scope in v1 because child sessions do not attach an SM, but the field shape is reserved.
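
As an illustration of how an interactor might use these fields, the sketch below derives the label shown on the dialog. The three field names come from the table above; `Attribution`, `attributionLabel`, and the label text are hypothetical.

```typescript
type Attribution = {
  correlationId: string;
  parentSessionId?: string; // present only for requests raised in a child session
  subagentId?: string;      // present only for requests raised in a child session
};

// Produces the text for the attribution chip on the dialog.
function attributionLabel(a: Attribution): string {
  return a.subagentId !== undefined ? "subagent " + a.subagentId : "orchestrator";
}
```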

grantStageTool

Trigger. Core auto-issues a grantStageTool request during stage Act when the LLM proposes a call to a tool that is neither in the stage's allowedTools nor covered by an already-active grant. The request is not an API the stage author invokes ad-hoc; it is the system's deterministic response to an out-of-envelope proposal. A proposal denied by allowedTools does not reach the security-mode gate or guard hooks unless a matching grant has been issued.

Binding to exact proposal identity. A grant is bound to the specific proposal that triggered it — not to a (stage, tool) pair. Rebinding to a different invocation is rejected, so the LLM cannot launder one approval into subsequent calls with different arguments.

| Request field | Meaning |
| --- | --- |
| stageExecutionId | The stage execution's ID (unique per run, not shared across restarts). |
| attempt | The 0-based index of the current attempt in ctx.attempts[] (equal to ctx.attempts.length - 1 for the duration of the attempt). |
| proposalId | The opaque ID core assigned to the LLM's specific tool-call proposal. |
| tool | The tool flat-name. |
| argsDigest | An opaque, implementation-defined deterministic digest of the raw proposed arguments. Core computes it from a canonicalized form of the proposal's arguments at mint time and compares by equality at consumption time (and only by equality). Stage authors, SMs, and extensions treat it as an equality-only blob: they may store, surface, or compare it, but must not parse it, reconstruct arguments from it, or rely on any specific algorithm or canonicalization. The wiki fixes only the equality semantic; the algorithm is a core implementation detail that may change without a contract bump. |
| argsSummary | A redacted, human-readable summary for the interactor to display. The raw args are held in core and used at call time; only the digest and summary cross the interaction boundary. |
| reason | Optional author-provided explanation attached to the proposal (via the author-supplied reason logic in the stage's Act, if any). Free-text. |

| Response field | Meaning |
| --- | --- |
| outcome | approve, deny, or defer. |
| note | Optional text the user can attach. Surfaces to Assert on the matching ctx.attempts[i].grants[j] entry (and on the top-level grants view for the current attempt). See SM Stage Lifecycle, section Assert. |

Consumption semantics.

  • approve → core mints a single-use grant token bound to {stageExecutionId, attempt, proposalId, tool, argsDigest}. When the stage's next tool call matches all five, the token is consumed. Consuming the token is equivalent to an SM-approve decision: the security-mode gate is bypassed; guard hooks still run (guard-deny wins in any mode, per Hooks). A token whose argsDigest no longer matches at call time is rejected and the call is denied as if no grant existed.
  • deny → the call is refused before the mode gate sees it. A denied-tool-call tool-call result is synthesized back to the LLM on the same transcript so Act can continue or terminate.
  • defer → equivalent to deny for this proposal; the stage's Assert may still return retry (subject to retryPolicy.maxAttempts), and the LLM may re-propose on the next attempt. A deferred grant does not accumulate a pending-review queue — every attempt starts clean.
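
The approve path can be sketched as a five-field equality check followed by a one-time consumption. The five binding fields are the ones named above; `GrantBinding`, `GrantToken`, `mintGrant`, and `tryConsume` are hypothetical names, and the digest is treated purely as an opaque string compared by equality.

```typescript
type GrantBinding = {
  stageExecutionId: string;
  attempt: number;
  proposalId: string;
  tool: string;
  argsDigest: string; // opaque; compared by equality only
};

type GrantToken = GrantBinding & { consumed: boolean };

// Minted when the user approves a grantStageTool request.
function mintGrant(binding: GrantBinding): GrantToken {
  return { ...binding, consumed: false };
}

// Consumes the token only if the call matches all five bound fields and the
// token has not been used before. A false result means the call is denied
// as if no grant existed.
function tryConsume(token: GrantToken, call: GrantBinding): boolean {
  const matches =
    token.stageExecutionId === call.stageExecutionId &&
    token.attempt === call.attempt &&
    token.proposalId === call.proposalId &&
    token.tool === call.tool &&
    token.argsDigest === call.argsDigest;
  if (token.consumed || !matches) return false;
  token.consumed = true;
  return true;
}
```

Because `attempt` is part of the binding, a token minted in one attempt can never match a call made in the next, which is one way to obtain the clean-slate behaviour on retry.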

Grant scope. A minted token is single-use and scoped to the proposal it was issued for. It does not apply to sibling stages, does not outlive the current attempt, and does not survive a crash (stage state is ephemeral; see Persistence and Recovery). A retry (Assert → retry) starts the next attempt with a clean grant state — any out-of-envelope proposal the LLM makes in attempt N+1 triggers a fresh grantStageTool request with a fresh proposalId, even if the arguments are identical. This is intentional: retries reflect a changed model state and the user's prior approval should not leak across that boundary.

Why not SM-attached callbacks? The SM contract does not expose a per-call author callback that could auto-approve proposals; per-call argument-sensitive policy lives in guard hooks (see State Machines — SM authority vs hook authority). grantStageTool is the sole channel that lets an interactor widen a stage's tool envelope for a single call.

The retry-persistence rule and headless fallback are specified in Stage Executions and in Headless and Interactor § Headless emit-and-halt.

approveSubagentEnvelope

Trigger. Core auto-issues an approveSubagentEnvelope request immediately after the bundled delegate tool is called and the requested envelope passes strict-subset validation against the parent session's currently-active tool manifest. The request is not an API the orchestrator's LLM or any extension invokes ad-hoc; it is the system's deterministic response to a delegate call. The child subagent session is not opened until this request resolves.
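
A rough sketch of the validation step that precedes the request, covering only the part this page states explicitly: every requested name must resolve in the parent's active manifest. Whether "strict-subset" imposes further conditions is not specified here, so the sketch does not attempt them. `validateEnvelope` and `EnvelopeCheck` are hypothetical names.

```typescript
type EnvelopeCheck =
  | { ok: true }
  | { ok: false; unresolved: string[] };

// Rejects the envelope if any requested tool is absent from the parent's
// currently active manifest. Only a passing envelope leads to a request.
function validateEnvelope(
  requested: readonly string[],
  parentManifest: readonly string[]
): EnvelopeCheck {
  const unresolved = requested.filter((name) => parentManifest.indexOf(name) === -1);
  return unresolved.length === 0 ? { ok: true } : { ok: false, unresolved };
}
```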

Binding to the spawn boundary. A grant from this request authorizes a single child session's lifetime, not a single named tool. There is no per-call re-evaluation; the user's decision at spawn binds the entire child session.

| Request field | Meaning |
| --- | --- |
| subagentId | The child session's identifier (unique within the parent session). |
| parentSessionId | The orchestrator session's sessionId for audit threading. |
| depth | The child's depth: the orchestrator is depth 0; first-level subagents are depth 1. Distinguishes nested-subagent envelope prompts. |
| requestedEnvelope | The list of tool flat-names the orchestrator's LLM proposed. Each name is guaranteed to resolve in the parent's active tool manifest at request mint time (validation rejects unknowns before the request is issued). |
| promptSummary | A redacted, human-readable summary of the orchestrator-supplied subagent task prompt for the interactor to display. The full prompt is held in core and used only at child-session spawn; only the summary crosses the interaction boundary. |
| model | The resolved (providerId, modelId) the child will run with, either inherited from the parent or overridden by the delegate tool's model arg (see Subagent Sessions § Model selection). Surfaced for user transparency at envelope approval; not user-mutable from this dialog. |

| Response field | Meaning |
| --- | --- |
| outcome | approve or deny. Unlike grantStageTool there is no defer; the child session either spawns or does not. |
| note | Optional text the user can attach. Audited on the SubagentEnvelopeApproved / SubagentEnvelopeDenied record. |

Consumption semantics.

  • approve → core marks the envelope as authorized for this subagentId and proceeds to spawn the child session. Tools whose flat-name matches the envelope bypass the child session's mode gate for the child session's lifetime (mode-gate-bypass equivalent to a stage's in-envelope tool, except scoped to the whole child session, not a single proposal). Guard hooks still run.
  • deny → the child session does not spawn. The parent's delegate tool call returns a typed Subagent/EnvelopeDenied result to the orchestrator's LLM on the same transcript so the orchestrator can continue.

Grant scope. The envelope authorization is bound to the subagentId. It does not apply to other subagents, does not outlive the child session, and is not re-evaluable mid-run. The user cannot grow the envelope after spawn — out-of-envelope tool calls inside the child session follow the inherited mode gate normally (see Tool Approvals § Subagent envelope and child-session approvals).
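
Inside an approved child session, the routing of a tool call reduces to a membership test against the envelope fixed at spawn. The sketch below is illustrative only; `ModeGateRoute` and `routeChildToolCall` are hypothetical names.

```typescript
type ModeGateRoute = "bypass-mode-gate" | "inherited-mode-gate";

// In-envelope tools skip the child's mode gate for the session's lifetime;
// everything else goes through the inherited mode gate as usual. Guard hooks
// are outside this function and still run for in-envelope calls.
function routeChildToolCall(
  approvedEnvelope: readonly string[],
  tool: string
): ModeGateRoute {
  return approvedEnvelope.indexOf(tool) !== -1
    ? "bypass-mode-gate"
    : "inherited-mode-gate";
}
```

The envelope is passed as a read-only list to reflect that it cannot grow after spawn.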

Why session-scoped, not per-call. The trust principal for envelope authorization is the user, not an SM author. Per-call re-prompting (the grantStageTool shape) would defeat the user's pre-decision and turn every subagent run into an interrupt cascade. The single-prompt-per-spawn shape mirrors the user's natural mental model: "I am delegating this task to a subagent; here are the tools I trust it with for this task."

The headless rule and the interaction with --yolo are specified in Headless and Interactor § Headless emit-and-halt.


Core blocks; the first interactor to respond wins

sequenceDiagram
    autonumber
    participant Stage as Message loop stage
    participant Core
    participant Intx as Interaction protocol
    participant UI1 as Interactor A
    participant UI2 as Interactor B

    Stage->>Core: need decision (e.g., tool approval)
    Core->>Intx: Request(kind, payload, correlationId)
    Intx->>UI1: handle(request) [concurrent]
    Intx->>UI2: handle(request) [concurrent]
    UI1-->>Intx: Response(accepted, correlationId)
    Note over Intx: UI1 wins — broadcast InteractionAnswered
    Intx-->>Core: Response
    Core-->>Stage: continue / deny / etc.
    UI2-->>Intx: (late) Response
    Note over Intx: Session/InteractionAlreadyAnswered

Core does not time out by default. An attached SM may impose a timeout through Host API cancellation. A cancelled interaction request completes with cancelled; each interactor should release its held UI state.
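
Cancellation can be modelled as a second way of settling a pending request. The sketch below is an assumption about the bookkeeping, not the Host API: `PendingInteraction`, `onCancelled`, `answer`, and `cancel` are hypothetical names, and there is deliberately no timer, since core does not time out by default.

```typescript
type Settlement =
  | { status: "accepted"; value: unknown }
  | { status: "cancelled" };

class PendingInteraction {
  private settlement: Settlement | undefined = undefined;
  private releaseCallbacks: Array<() => void> = [];

  // Each interactor registers how to release its held UI state.
  onCancelled(release: () => void): void {
    this.releaseCallbacks.push(release);
  }

  // Returns false if the request was already settled.
  answer(value: unknown): boolean {
    if (this.settlement !== undefined) return false;
    this.settlement = { status: "accepted", value };
    return true;
  }

  // Completes the request with "cancelled" and lets every interactor clean up.
  cancel(): boolean {
    if (this.settlement !== undefined) return false;
    this.settlement = { status: "cancelled" };
    this.releaseCallbacks.forEach((release) => release());
    return true;
  }

  result(): Settlement | undefined {
    return this.settlement;
  }
}
```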

For the InteractionAnswered event payload and the Session/InteractionAlreadyAnswered error, see § Multiple interactors above — the canonical definition lives there to avoid drift.


Headless and SM behavior

Interaction-request resolution when no interactor is active — including the grantStageTool auto-deny rule — is specified in Headless and Interactor § Headless emit-and-halt. Parallel tool-call approvals serialize in core by proposal order before each request fans out to active interactors; see Tool Approvals for the full precedence stack.
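
Serialization by proposal order can be sketched as a sort followed by a strictly sequential loop. The names below are hypothetical: `ApprovalProposal`, its `order` field, and the synchronous `ask` callback (standing in for "fan out to interactors and block until one answers") are assumptions made for illustration.

```typescript
type ApprovalProposal = { proposalId: string; order: number; tool: string };
type ApprovalResult = { proposalId: string; approved: boolean };

// Resolves a parallel batch one proposal at a time, in proposal order.
// The next request is not issued until the previous one has an answer.
function resolveBatchInOrder(
  batch: readonly ApprovalProposal[],
  ask: (proposal: ApprovalProposal) => boolean
): ApprovalResult[] {
  const ordered = batch.slice().sort((a, b) => a.order - b.order);
  return ordered.map((proposal) => ({
    proposalId: proposal.proposalId,
    approved: ask(proposal),
  }));
}
```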


Correlation and observability

Every request carries a correlationId that matches the turn's correlation ID. Responses propagate it. Audit Trail records every interaction's request kind, correlation ID, outcome, and wall-clock times.

Secrets collected via Auth.Password or similar are never recorded in the audit trail as cleartext. See Secrets Hygiene.
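
One way to picture the audit rule is a record builder that drops the answered value for secret-bearing kinds. The recorded fields (kind, correlation ID, outcome, wall-clock times) follow the description above; `AuditRecord`, `SECRET_KINDS`, `toAuditRecord`, and the decision to omit the value entirely are assumptions, and the real Audit Trail schema may differ.

```typescript
type AuditRecord = {
  kind: string;
  correlationId: string;
  outcome: string;
  requestedAt: string; // ISO 8601 wall-clock time
  answeredAt: string;  // ISO 8601 wall-clock time
  value?: string;
};

// Kinds whose answered value must never be stored as cleartext.
const SECRET_KINDS: readonly string[] = ["Auth.Password"];

function toAuditRecord(input: {
  kind: string;
  correlationId: string;
  outcome: string;
  requestedAt: Date;
  answeredAt: Date;
  value?: string;
}): AuditRecord {
  const record: AuditRecord = {
    kind: input.kind,
    correlationId: input.correlationId,
    outcome: input.outcome,
    requestedAt: input.requestedAt.toISOString(),
    answeredAt: input.answeredAt.toISOString(),
  };
  // Copy the value only for non-secret kinds.
  if (input.value !== undefined && SECRET_KINDS.indexOf(input.kind) === -1) {
    record.value = input.value;
  }
  return record;
}
```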


What the protocol is not

  • Not an event bus. Subscribers do not see interaction requests; interactors do not see projection events through this channel.
  • Not a general RPC. Kinds are enumerated; extensions may not smuggle arbitrary request types.
  • Not extensible by arbitrary extensions. Only core originates interaction requests. An extension that wants to ask the user uses a known kind (e.g., a tool uses Ask). New kinds go through contract bumps.
  • Not a sandbox. Like everything else in v1, active interactors run in-process. See Extension Isolation.

Changelog

1.0.0 — initial

  • Typed request/response channel between core and one or more active interactors; seven request kinds (Ask, Approve, Select, Auth.DeviceCode, Auth.Password, Confirm, grantStageTool); per-request correlation ID.
  • Multiple interactors may be active concurrently (race-to-answer; first response wins; late responses receive Session/InteractionAlreadyAnswered).
  • InteractionAnswered core event with payload { correlationId, status, value?, extId }; extId attributes the winning interactor for audit.
  • Per-interactor FIFO: each interactor serializes its own dialog queue; cross-interactor resolution is concurrent.

1.1.0 — approveSubagentEnvelope

  • New request kind approveSubagentEnvelope for the bundled delegate tool. Auto-issued at child-session spawn; payload carries subagentId, parentSessionId, depth, requestedEnvelope, promptSummary, model. Response is approve / deny (no defer).
  • Approval grants envelope-bypass of the child session's mode gate for in-envelope tools for the lifetime of that child session; out-of-envelope tools follow the inherited mode gate. Guard hooks still run.
  • Parent-session-level serialization invariant for IP requests originating from concurrent subagents — see Subagent Sessions § Cross-subagent serialization. No change to per-interactor FIFO or cross-interactor race-to-answer.
  • No removal of pre-existing kinds; additive on top of 1.0.0.

1.1.1 — approveSubagentEnvelope.model carries the resolved (providerId, modelId)

  • Clarification of the existing model field on the approveSubagentEnvelope request payload. In 1.1.0 the field always reflected the parent's (providerId, modelId). In 1.1.1, the value is the resolved (providerId, modelId) for the child — either inherited from the parent or overridden by the delegate tool's model arg per Subagent Sessions § Model selection. The wire shape and field name are unchanged; only the semantic widens.
  • No new request kinds. No payload shape changes. Patch-version bump.
