Bug: Task app state reporting is inconsistent — self-reported failure/idle/complete states are silently overridden when AgentAPI is enabled

## Summary

When the AI AgentAPI (`CODER_MCP_AI_AGENTAPI_URL`) is configured, the MCP server's `WithTaskReporter` callback unconditionally overrides **all** self-reported states from the AI agent to `working`. As a result, `failure`, `idle`, and `complete` states that the agent reports via the `coder_report_task` tool are silently discarded. The system then relies entirely on the screen watcher (an SSE subscription to agentapi) to detect `StatusStable` and report `idle`. If the screen watcher fails or never fires, the task state gets stuck, either as `working` forever or as `null` (never reported).

This results in tasks where:
- `current_state` is permanently `null` even though the task is active and waiting for input
- `current_state` is stuck on `working` even though the agent has finished
- The `POST /tasks/{user}/{task}/send` endpoint returns 502 because agentapi's `GetStatus()` reports `running` instead of `stable`
- Terminal states (`failure`, `complete`) are unreachable when AgentAPI is enabled

## Root Cause

In [`cli/exp_mcp.go` ~L696-706](https://github.com/coder/coder/blob/main/cli/exp_mcp.go#L696):

```go
toolsdk.WithTaskReporter(func(args toolsdk.ReportTaskArgs) error {
    // The agent does not reliably report its status correctly.  If AgentAPI
    // is enabled, we will always set the status to "working" when we get an
    // MCP message, and rely on the screen watcher to eventually catch the
    // idle state.
    state := codersdk.WorkspaceAppStatusStateWorking
    if s.aiAgentAPIClient == nil {
        state = codersdk.WorkspaceAppStatusState(args.State)
    }
    ...
```

When `aiAgentAPIClient != nil`, **every** self-reported state (including `failure`, `idle`, `complete`) is overridden to `working`. The intent is to distrust the agent's `idle` reports and rely on the screen watcher instead, but the override is too broad: it also swallows `failure` and `complete`.
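
A narrower override could look roughly like the sketch below. It is standalone and only illustrative: `State` is a local copy of the state strings named in this issue (the real type is `codersdk.WorkspaceAppStatusState`), and `resolveReportedState` is a hypothetical helper, not an existing function in `cli/exp_mcp.go`.

```go
package main

import "fmt"

// State is a local stand-in for codersdk.WorkspaceAppStatusState so the
// sketch compiles on its own.
type State string

const (
	StateWorking  State = "working"
	StateIdle     State = "idle"
	StateComplete State = "complete"
	StateFailure  State = "failure"
)

// resolveReportedState (hypothetical) decides which state to record for a
// self-report. Without AgentAPI the report is trusted as-is. With AgentAPI,
// only the terminal states pass through; everything else (including "idle")
// is still recorded as "working" so the screen watcher remains the source of
// truth for idleness.
func resolveReportedState(reported State, agentAPIEnabled bool) State {
	if !agentAPIEnabled {
		return reported
	}
	switch reported {
	case StateFailure, StateComplete:
		return reported
	default:
		return StateWorking
	}
}

func main() {
	for _, s := range []State{StateWorking, StateIdle, StateComplete, StateFailure} {
		fmt.Printf("%s -> %s\n", s, resolveReportedState(s, true))
	}
}
```

With AgentAPI enabled, this maps `working` and `idle` to `working` and leaves `complete` and `failure` untouched.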

## Contributing Factors

### 1. Screen watcher is the only path to `idle` (when AgentAPI is enabled)
The `startWatcher` goroutine ([L614-663](https://github.com/coder/coder/blob/main/cli/exp_mcp.go#L614)) subscribes to agentapi SSE events and maps `StatusStable` → `idle` and `StatusRunning` → `working`. If the SSE subscription fails (L617-619), the goroutine returns early and `idle` is **never** reported.

### 2. No periodic fallback/polling
The screen watcher is entirely event-driven via SSE. There is no periodic poll of `GetStatus()` as a fallback if the SSE connection drops or never connects.
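
One possible shape for such a fallback is sketched below. Everything here is an assumption made for illustration: the `statusGetter` interface and its `GetStatus` signature are hypothetical stand-ins for the agentapi client, and the status strings `running` and `stable` are taken from the statuses mentioned in this issue.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// statusGetter is a hypothetical stand-in for the agentapi client; the real
// client's GetStatus signature may differ.
type statusGetter interface {
	GetStatus(ctx context.Context) (string, error)
}

// mapStatus mirrors the watcher's mapping: stable -> idle, running -> working.
func mapStatus(status string) (string, bool) {
	switch status {
	case "stable":
		return "idle", true
	case "running":
		return "working", true
	default:
		return "", false
	}
}

// pollStatus polls on a fixed interval until ctx is done and calls report
// only when the mapped state changes.
func pollStatus(ctx context.Context, interval time.Duration, g statusGetter, report func(state string)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	last := ""
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			status, err := g.GetStatus(ctx)
			if err != nil {
				continue // transient error: try again on the next tick
			}
			state, ok := mapStatus(status)
			if !ok || state == last {
				continue
			}
			last = state
			report(state)
		}
	}
}

// fakeGetter replays a fixed list of statuses, repeating the last one.
type fakeGetter struct {
	statuses []string
	calls    int
}

func (f *fakeGetter) GetStatus(context.Context) (string, error) {
	i := f.calls
	if i >= len(f.statuses) {
		i = len(f.statuses) - 1
	}
	f.calls++
	return f.statuses[i], nil
}

// runPollDemo polls the fake until "idle" is reported (or one second passes)
// and returns every state that was reported.
func runPollDemo(statuses []string) []string {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	var reported []string
	pollStatus(ctx, time.Millisecond, &fakeGetter{statuses: statuses}, func(state string) {
		reported = append(reported, state)
		if state == "idle" {
			cancel()
		}
	})
	return reported
}

func main() {
	fmt.Println(runPollDemo([]string{"running", "running", "stable"})) // prints [working idle]
}
```

In a real implementation the poller would probably run only while the SSE subscription is down, so that the two sources do not report competing states.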

### 3. Terminal states are impossible
`complete` and `failure` can only come from agent self-reports (the screen watcher only knows `running`/`stable`). Since all self-reports are overridden to `working` when AgentAPI is enabled, these terminal states are unreachable.

### 4. Queue predicate filters duplicate `working` updates
The queue predicate at [~L452-455](https://github.com/coder/coder/blob/main/cli/exp_mcp.go#L452) discards non-user-message `working` updates from the screen watcher after the first report. So if the agent self-reports `failure` (converted to `working`) and the watcher then reports `working`, the watcher's update is discarded as a duplicate. When the watcher finally reports `idle`, that update should go through, but only if the watcher is running at all.
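
The interaction can be modeled with a deliberately simplified filter. This is not the actual predicate from `cli/exp_mcp.go`; the `report` struct, its fields, and the `dedupe` type are hypothetical and capture only the single rule described above.

```go
package main

import "fmt"

// report is a simplified, hypothetical model of a queued task report.
type report struct {
	state        string
	selfReported bool // true when it came from the agent via MCP
	userMessage  bool // true when it is tied to a user message
}

// dedupe models only the rule described above: once any report has been
// accepted, a "working" update from the watcher is dropped.
type dedupe struct{ seen bool }

func (d *dedupe) accept(r report) bool {
	if d.seen && !r.userMessage && !r.selfReported && r.state == "working" {
		return false
	}
	d.seen = true
	return true
}

func main() {
	d := &dedupe{}
	steps := []report{
		{state: "working", selfReported: true}, // agent said "failure", overridden to "working"
		{state: "working"},                     // watcher: dropped as a duplicate
		{state: "idle"},                        // watcher: accepted, if the watcher is running
	}
	for _, r := range steps {
		fmt.Printf("%+v accepted=%t\n", r, d.accept(r))
	}
}
```

The model makes the dependency explicit: the only update that can move the task off `working` in this sequence is the watcher's `idle`.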

### 5. Misleading unconditional log line
At [L541](https://github.com/coder/coder/blob/main/cli/exp_mcp.go#L541), `cliui.Infof(inv.Stderr, "Failed to watch screen events")` is printed **unconditionally** (not inside an error handler). This is misleading for debugging but doesn't affect behavior.
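
A conditional version might look like the following sketch. It uses only the standard library in place of `cliui`, and `startWatcher`'s signature and the `subscribe` callback are assumptions made for illustration rather than the real function shape.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"strings"
)

// startWatcher (hypothetical signature) logs the failure message only when
// the subscription actually fails, and reports whether the watcher started.
func startWatcher(w io.Writer, subscribe func() error) bool {
	if err := subscribe(); err != nil {
		fmt.Fprintf(w, "Failed to watch screen events: %v\n", err)
		return false
	}
	return true
}

// demoLog returns whatever startWatcher writes for a failing or succeeding
// subscription.
func demoLog(fail bool) string {
	var sb strings.Builder
	startWatcher(&sb, func() error {
		if fail {
			return errors.New("subscribe failed")
		}
		return nil
	})
	return sb.String()
}

func main() {
	startWatcher(os.Stderr, func() error { return nil }) // writes nothing
	fmt.Print(demoLog(true))
}
```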

### 6. Send endpoint gates on agentapi status, not task `current_state`
The `POST /tasks/{user}/{task}/send` endpoint in [`coderd/aitasks.go` ~L766](https://github.com/coder/coder/blob/main/coderd/aitasks.go#L766) calls `agentAPIClient.GetStatus()` directly and requires `StatusStable`. This is independent of the task's `current_state` field, so even if `current_state` were correctly set to `idle`, the send can still fail if agentapi disagrees.
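
The mismatch can be shown with a small model of the gate. `checkSendable` is a hypothetical function, not the handler in `coderd/aitasks.go`; the error text is copied from the 502 response quoted in this issue, and `currentState` is accepted only to show that it plays no part in the decision.

```go
package main

import "fmt"

// checkSendable (hypothetical) models the gate described above: only the
// agentapi status matters, and the task's current_state is ignored.
func checkSendable(agentStatus string, currentState *string) error {
	_ = currentState // intentionally unused: the gate never consults it
	if agentStatus != "stable" {
		return fmt.Errorf("Task app is not ready to accept input. Status: %s", agentStatus)
	}
	return nil
}

func main() {
	idle := "idle"
	// current_state says idle, but agentapi says running: the send is rejected.
	fmt.Println(checkSendable("running", &idle))
	// current_state is null, but agentapi says stable: the send is allowed.
	fmt.Println(checkSendable("stable", nil))
}
```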

## Observed Behavior

- Tasks created with trivial prompts (e.g. "Tests") get `failure` from the agent, which is silently converted to `working`, and then `current_state` stays `null` or `working` forever
- Tasks doing real work (analysis, code review) also end up with `null` `current_state`, which suggests the screen watcher either isn't connecting or isn't emitting `StatusStable`
- Sending input to a task in this state fails with 502: `Task app is not ready to accept input. Status: running`

## Suggested Fixes

1. **Allow terminal states through the override**: When AgentAPI is enabled, the `WithTaskReporter` callback should override only `idle` → `working` and pass `failure` and `complete` through as-is from agent self-reports
2. **Add periodic status polling as fallback**: If the SSE connection to agentapi drops, periodically poll `GetStatus()` to catch `StatusStable` → `idle`
3. **Fix the unconditional log line**: The `"Failed to watch screen events"` message at L541 should only print on actual failure
4. **Consider allowing send when `current_state` is null**: If the task is active and `current_state` is null (no state has been reported yet), it may be reasonable to attempt the send rather than blocking

---

Created on behalf of @mafredri
