Description
Summary
When the AI AgentAPI (CODER_MCP_AI_AGENTAPI_URL) is configured, the MCP server's WithTaskReporter callback unconditionally overrides all self-reported states from the AI agent to working. This means failure, idle, and complete states reported by the agent via the coder_report_task tool are silently discarded. The system then relies entirely on the screen watcher (SSE subscription to agentapi) to detect StatusStable and report idle. If the screen watcher fails or never fires, the task state gets stuck — either as working forever or as null (never reported).
This results in tasks where:
- current_state is permanently null even though the task is active and waiting for input
- current_state is stuck on working even though the agent has finished
- The POST /tasks/{user}/{task}/send endpoint returns 502 because agentapi's GetStatus() reports running instead of stable
- Terminal states (failure, complete) are unreachable when AgentAPI is enabled
Root Cause
toolsdk.WithTaskReporter(func(args toolsdk.ReportTaskArgs) error {
	// The agent does not reliably report its status correctly. If AgentAPI
	// is enabled, we will always set the status to "working" when we get an
	// MCP message, and rely on the screen watcher to eventually catch the
	// idle state.
	state := codersdk.WorkspaceAppStatusStateWorking
	if s.aiAgentAPIClient == nil {
		state = codersdk.WorkspaceAppStatusState(args.State)
	}
	...

When aiAgentAPIClient != nil, every self-reported state (including failure, idle, complete) is overridden to working. The design intention is to distrust the agent's idle reporting and rely on the screen watcher instead, but the override is too broad.
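To make the effect concrete, the override can be modeled as a pure function. This is a simplified sketch, not the actual code: the State type, its constants, and effectiveState are illustrative stand-ins for codersdk.WorkspaceAppStatusState and the logic inside the callback.

```go
package main

// State is a stand-in for codersdk.WorkspaceAppStatusState (assumption:
// a string-based type with the four values named in this issue).
type State string

const (
	StateWorking  State = "working"
	StateIdle     State = "idle"
	StateComplete State = "complete"
	StateFailure  State = "failure"
)

// effectiveState mirrors the current override: when AgentAPI is enabled,
// whatever the agent reported is replaced with "working".
func effectiveState(reported State, agentAPIEnabled bool) State {
	if agentAPIEnabled {
		return StateWorking
	}
	return reported
}
```

With agentAPIEnabled set to true, effectiveState(StateFailure, true) and effectiveState(StateComplete, true) both return StateWorking, which is the information loss described above.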
Contributing Factors
1. Screen watcher is the only path to idle (when AgentAPI is enabled)
The startWatcher goroutine (L614-663) subscribes to agentapi SSE events and maps StatusStable → idle and StatusRunning → working. If the SSE subscription fails (L617-619), the goroutine returns early and idle is never reported.
2. No periodic fallback/polling
The screen watcher is entirely event-driven via SSE. There is no periodic poll of GetStatus() as a fallback if the SSE connection drops or never connects.
3. Terminal states are impossible
complete and failure can only come from agent self-reports (the screen watcher only knows running/stable). Since all self-reports are overridden to working when AgentAPI is enabled, these terminal states are unreachable.
4. Queue predicate filters duplicate working updates
The queue predicate at ~L452-455 discards non-user-message working updates from the screen watcher after the first report. So if the agent self-reports failure (converted to working), then the watcher reports working, it gets discarded as a duplicate. Then when the watcher finally reports idle, it should go through — but only if the watcher is running at all.
5. Misleading unconditional log line
At L541, cliui.Infof(inv.Stderr, "Failed to watch screen events") is printed unconditionally (not inside an error handler). This is misleading for debugging but doesn't affect behavior.
6. Send endpoint gates on agentapi status, not task current_state
The POST /tasks/{user}/{task}/send endpoint in coderd/aitasks.go ~L766 calls agentAPIClient.GetStatus() directly and requires StatusStable. This is independent of the task's current_state field, so even if current_state were correctly set to idle, the send can still fail if agentapi disagrees.
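The independence of the two signals can be shown with a small model of the gate. This is a sketch based only on the behavior described here, not the handler's actual code: sendGate and its parameters are hypothetical, and the message text is taken from the observed 502 response.

```go
package main

// sendGate models the check described above. currentState is accepted only
// to show that it plays no part in the decision; the outcome depends solely
// on the status agentapi reports.
func sendGate(currentState *string, agentAPIStatus string) (ok bool, message string) {
	if agentAPIStatus != "stable" {
		// The real endpoint responds with HTTP 502 in this case.
		return false, "Task app is not ready to accept input. Status: " + agentAPIStatus
	}
	return true, ""
}
```

Under this model, a task whose current_state is idle is still rejected whenever agentapi reports running.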
Observed Behavior
- Tasks created with trivial prompts (e.g. "Tests") get failure from the agent, which is silently converted to working, and then current_state stays null or working forever
- Tasks doing real work (analysis, code review) also end up with null current_state: the screen watcher either isn't connecting or isn't emitting StatusStable
- Sending input to a task in this state fails with 502: Task app is not ready to accept input. Status: running
Suggested Fixes
- Allow terminal states through the override: The WithTaskReporter should only override idle→working when AgentAPI is enabled, and should pass failure and complete through as-is from agent self-reports
- Add periodic status polling as fallback: If the SSE connection to agentapi drops, periodically poll GetStatus() to catch StatusStable→idle
- Fix the unconditional log line: The "Failed to watch screen events" message at L541 should only print on actual failure
- Consider allowing send when current_state is null: If the task is active and current_state is null (no state has been reported yet), it may be reasonable to attempt the send rather than blocking
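The first suggestion could be sketched as follows. This is one possible shape for the narrowed override, not a tested patch: State, its constants, and resolveState are illustrative stand-ins for codersdk.WorkspaceAppStatusState and the logic inside WithTaskReporter.

```go
package main

// State is a stand-in for codersdk.WorkspaceAppStatusState (assumption:
// a string-based type with the four values named in this issue).
type State string

const (
	StateWorking  State = "working"
	StateIdle     State = "idle"
	StateComplete State = "complete"
	StateFailure  State = "failure"
)

// resolveState keeps the distrust of self-reported idle when AgentAPI is
// enabled, but lets the terminal states through unchanged.
func resolveState(reported State, agentAPIEnabled bool) State {
	if !agentAPIEnabled {
		return reported
	}
	switch reported {
	case StateFailure, StateComplete:
		// Terminal states can only come from the agent, so keep them.
		return reported
	default:
		// idle (and anything else) still defers to the screen watcher.
		return StateWorking
	}
}
```

Only idle remains subject to the override, so the screen watcher stays the source of truth for idleness while failure and complete become reachable again.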
Created on behalf of @mafredri