fix(gastown): prevent duplicate session leak when restarting a starting agent #1341

@jrf0110

Context

When startAgent is called for an agent that's already in starting status (between sessionCount++ and session.create()), the current code calls stopAgent() to kill the old session before proceeding. However, stopAgent() cannot actually cancel a startup that hasn't created a sessionId yet. The original startAgent() call keeps going, subscribes to events, and can leave an extra live session that is no longer tracked in the agents map.

This was identified during PR #1336 review (comment by kilo-code-bot).

Current behavior

In container/src/process-manager.ts:startAgent():

if (existing && (existing.status === 'running' || existing.status === 'starting')) {
  await stopAgent(request.agentId).catch(...);
}

If existing.status === 'starting', stopAgent tries to kill it but has no sessionId to target. The original startup continues in the background, creating an orphaned session.
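The race can be reproduced in a small, self-contained model. This is only a sketch: the function bodies, the liveSessions set, the createSession helper, and the timings are hypothetical stand-ins, not the real process-manager.ts code. Only the shape of the check and the ordering (sessionCount++ before the session is created) follow the description above.

```typescript
// Hypothetical, simplified model of the race described in this issue.
type Status = 'starting' | 'running' | 'stopped';
interface Agent { status: Status; sessionId?: string }

const agents = new Map<string, Agent>();
const liveSessions = new Set<string>(); // stand-in for sessions alive in the SDK
let sessionCount = 0;

// Stand-in for session.create(): resolves after a delay with a live session.
async function createSession(n: number, delayMs: number): Promise<string> {
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  const id = `session-${n}`;
  liveSessions.add(id);
  return id;
}

async function stopAgent(agentId: string): Promise<void> {
  const agent = agents.get(agentId);
  if (!agent) return;
  // While the agent is 'starting' there is no sessionId, so nothing is killed
  // and the in-flight startup keeps running.
  if (agent.sessionId) liveSessions.delete(agent.sessionId);
  agent.status = 'stopped';
  agents.delete(agentId);
}

async function startAgent(agentId: string, delayMs = 20): Promise<void> {
  const existing = agents.get(agentId);
  if (existing && (existing.status === 'running' || existing.status === 'starting')) {
    await stopAgent(agentId).catch(() => {});
  }
  const agent: Agent = { status: 'starting' };
  agents.set(agentId, agent);
  const n = ++sessionCount;
  agent.sessionId = await createSession(n, delayMs); // the race window
  agent.status = 'running';
}

// Two overlapping startAgent calls for the same agent.
async function demoLeak(): Promise<{ live: number; tracked: number }> {
  await Promise.all([startAgent('a'), startAgent('a')]);
  return { live: liveSessions.size, tracked: agents.size };
}
```

In this model demoLeak() resolves to { live: 2, tracked: 1 }: the first startup finishes after it was "stopped", so its session stays alive but is no longer reachable through the agents map.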

Expected behavior

When a new startAgent request arrives for an agent in starting status, the system should either:

  1. Wait for the existing startup to complete (with a timeout), then stop the resulting session
  2. Use an AbortController to cancel the in-flight startup
  3. Track the pending startup promise and await it before stopping
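Options 1 and 3 could look roughly like the following sketch, in which each agent record carries a promise for its pending startup and stopAgent awaits it before killing the session. The ready field, the liveSessions set, and the createSession helper are hypothetical and stand in for the real session API; the timeout mentioned in option 1 is omitted for brevity.

```typescript
// Hypothetical sketch: track the pending startup and await it before stopping.
interface TrackedAgent {
  status: 'starting' | 'running';
  sessionId?: string;
  ready: Promise<void>; // settles once the startup has a sessionId (or failed)
}

const agents = new Map<string, TrackedAgent>();
const liveSessions = new Set<string>(); // stand-in for sessions alive in the SDK
let sessionCount = 0;

// Stand-in for session.create().
async function createSession(n: number, delayMs: number): Promise<string> {
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  const id = `session-${n}`;
  liveSessions.add(id);
  return id;
}

async function stopAgent(agentId: string): Promise<void> {
  const agent = agents.get(agentId);
  if (!agent) return;
  // Wait for the in-flight startup so there is a sessionId to target.
  await agent.ready.catch(() => {});
  if (agent.sessionId) liveSessions.delete(agent.sessionId);
  // Only remove the map entry if it still points at the agent we stopped.
  if (agents.get(agentId) === agent) agents.delete(agentId);
}

async function startAgent(agentId: string, delayMs = 20): Promise<void> {
  // Loop because another caller may have registered a new agent while we waited.
  while (agents.has(agentId)) await stopAgent(agentId);
  const agent: TrackedAgent = { status: 'starting', ready: Promise.resolve() };
  agent.ready = createSession(++sessionCount, delayMs).then((id) => {
    agent.sessionId = id;
    agent.status = 'running';
  });
  agents.set(agentId, agent);
  await agent.ready;
}

// Three overlapping starts for the same agent; only the last session survives.
async function demoTracked(): Promise<{ live: string[]; tracked: number }> {
  await Promise.all([startAgent('a'), startAgent('a'), startAgent('a')]);
  return { live: [...liveSessions], tracked: agents.size };
}
```

The cost of this approach is latency: a restart has to wait out the full startup of the session it is about to discard, which is why a timeout would be needed in practice.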

Option 2 is the cleanest: thread an AbortController through the startup sequence so that stopAgent can signal cancellation before session.create() completes.
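A minimal sketch of the AbortController approach follows. The abort field, the destroySession and createSession helpers, and the liveSessions set are hypothetical. Whether session.create() accepts an AbortSignal is not stated here, so this sketch assumes it does not: the startup checks the signal once creation returns and tears the session down itself if it was superseded.

```typescript
// Hypothetical sketch: every startup owns an AbortController that stopAgent can fire.
interface PendingAgent {
  status: 'starting' | 'running';
  sessionId?: string;
  abort: AbortController;
}

const agents = new Map<string, PendingAgent>();
const liveSessions = new Set<string>(); // stand-in for sessions alive in the SDK
let sessionCount = 0;

// Stand-in for session.create().
async function createSession(n: number, delayMs: number): Promise<string> {
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  const id = `session-${n}`;
  liveSessions.add(id);
  return id;
}

function destroySession(id: string): void {
  liveSessions.delete(id);
}

async function stopAgent(agentId: string): Promise<void> {
  const agent = agents.get(agentId);
  if (!agent) return;
  agent.abort.abort(); // reaches a startup that has no sessionId yet
  if (agent.sessionId) destroySession(agent.sessionId);
  agents.delete(agentId);
}

// Resolves to true if this call ended up owning a running session.
async function startAgent(agentId: string, delayMs = 20): Promise<boolean> {
  if (agents.has(agentId)) await stopAgent(agentId);
  const agent: PendingAgent = { status: 'starting', abort: new AbortController() };
  agents.set(agentId, agent);
  const n = ++sessionCount;
  const sessionId = await createSession(n, delayMs);
  if (agent.abort.signal.aborted) {
    // Superseded during creation: clean up and skip event subscription.
    destroySession(sessionId);
    return false;
  }
  agent.sessionId = sessionId;
  agent.status = 'running';
  return true;
}

async function demoAbort(): Promise<{ results: boolean[]; live: string[] }> {
  const results = await Promise.all([startAgent('a'), startAgent('a')]);
  return { results, live: [...liveSessions] };
}
```

Here the superseded startup still pays for session creation but cleans up after itself, so no orphan survives. If the real session.create() can take a signal, passing agent.abort.signal into it would let the startup bail out earlier.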

Impact

Low. This race requires two startAgent calls for the same agent within the roughly 1-2 second window between sessionCount++ and session.create(). In practice this is rare because the reconciler runs every 5s and the DO serializes RPC calls, but it could still happen during rapid container eviction/restart cycles.

Parent: #204

Metadata

Assignees

No one assigned

    Labels

    P0 (Blocks soft launch), bug (Something isn't working), gt:container (Container management, agent processes, SDK, heartbeat), kilo-auto-fix (Auto-generated label by Kilo), kilo-triaged (Auto-generated label by Kilo)
