IoContext Timeout During MCP Initialization with McpAgent #640

@jezweb

Summary

When implementing an MCP server using the McpAgent base class, we see "IoContext timed out due to inactivity" errors during the MCP protocol initialization handshake. The error occurs before any tools are called, which suggests a transport-level timeout.

This may be an error in our implementation. We're reporting it to find out whether we're missing required infrastructure or configuration.

Environment

  • Package: agents v0.1.0 (from Cloudflare npm registry)
  • MCP SDK: @modelcontextprotocol/sdk v1.0.7
  • Cloudflare Workers: Deployed to production
  • Durable Objects: Using McpAgent<Env, State> base class
  • Test Client: @modelcontextprotocol/inspector

Reproduction

Our Implementation Pattern

Entry Point (src/index.ts):

import { ImageGeneratorAgent } from "./agents/image-generator";

export default {
  fetch: async (req: Request, env: Env, ctx: ExecutionContext) => {
    const url = new URL(req.url);

    // Simple Bearer token auth
    const authHeader = req.headers.get("Authorization");
    if (!authHeader?.startsWith("Bearer ") || authHeader.split(" ")[1] !== env.API_TOKEN) {
      return new Response("Unauthorized", { status: 401 });
    }

    if (url.pathname === "/sse") {
      return ImageGeneratorAgent.serveSSE("/sse")(req, env, ctx);
    }

    if (url.pathname === "/mcp") {
      return ImageGeneratorAgent.serve("/mcp")(req, env, ctx);
    }

    return new Response("Not found", { status: 404 });
  }
};
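The Bearer check in the entry point can be pulled into a small pure function so it can be tested without deploying. This is a minimal sketch, not part of our deployed code: `isAuthorized` is a hypothetical helper name, and unlike the `split(" ")[1]` version above it compares everything after the `Bearer ` prefix rather than only the first space-separated segment.

```typescript
// Hypothetical helper (not in the deployed Worker): returns true only when
// the header has the form "Bearer <token>" and the token matches exactly.
function isAuthorized(authHeader: string | null, expectedToken: string): boolean {
  const prefix = "Bearer ";
  if (!authHeader || !authHeader.startsWith(prefix)) {
    return false;
  }
  // Everything after the prefix is treated as the presented token.
  const presented = authHeader.slice(prefix.length);
  return presented.length > 0 && presented === expectedToken;
}

console.log(isAuthorized("Bearer secret", "secret")); // true
console.log(isAuthorized("Basic secret", "secret")); // false
console.log(isAuthorized(null, "secret")); // false
```

In the fetch handler this would be called as `isAuthorized(req.headers.get("Authorization"), env.API_TOKEN)`.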

Agent Class (src/agents/image-generator.ts):

import { McpAgent } from "agents/mcp";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { Database } from "../database";

export class ImageGeneratorAgent extends McpAgent<Env, State> {
  private server!: McpServer;
  private db!: Database;

  constructor(ctx: DurableObjectState, env: Env) {
    super(ctx, env);
    this.init();
  }

  private init() {
    this.server = new McpServer({
      name: "Cloudflare Image Generator",
      version: "1.0.0",
    }, {
      capabilities: {
        tools: {},
      },
    });

    this.db = new Database(this.ctx);
    this.registerTools();
  }

  private registerTools() {
    // 4 tools registered with this.server.tool(...)
    // All tools return quickly or use ctx.waitUntil() for async work
  }
}

Test Command

npx @modelcontextprotocol/inspector \
  https://mcp-image-generator.webfonts.workers.dev/sse \
  -H "Authorization: Bearer <token>" \
  --timeout 30000

Observed Behavior

Wrangler Tail Logs:

POST https://mcp-image-generator.webfonts.workers.dev/mcp - Ok @ 06/11/2025, 12:15:47 am

GET http://dummy-example.cloudflare.com/cdn-cgi/partyserver/set-name/ - Ok @ 06/11/2025, 12:17:05 am
  (log) ImageGeneratorAgent started: streamable-http:01da0455de4e3ad08c7af90e586edca1d77d2e7c312915b39db96044dab1639b

ImageGeneratorAgent.setInitializeRequest - Ok @ 06/11/2025, 12:17:05 am
ImageGeneratorAgent.getInitializeRequest - Ok @ 06/11/2025, 12:17:05 am
ImageGeneratorAgent.updateProps - Ok @ 06/11/2025, 12:17:05 am

GET https://mcp-image-generator.webfonts.workers.dev/mcp - Canceled @ 06/11/2025, 12:17:05 am
POST https://mcp-image-generator.webfonts.workers.dev/mcp - Canceled @ 06/11/2025, 12:17:05 am
  (warn) IoContext timed out due to inactivity, waitUntil tasks were cancelled without completing.

Key Observations:

  1. Agent starts successfully - Durable Object initializes
  2. Internal methods work - setInitializeRequest, getInitializeRequest, updateProps all complete
  3. ~78 second gap - Between initial POST (12:15:47) and internal calls (12:17:05)
  4. Requests canceled - Both GET and POST to /mcp canceled
  5. IoContext timeout - "timed out due to inactivity, waitUntil tasks were cancelled"
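
The gap between the initial POST and the internal calls can be computed directly from the two tail timestamps. The sketch below is a throwaway helper, not part of the Worker. It assumes both timestamps fall on the same calendar day and reads only the time-of-day portion; `secondsSinceMidnight` and `gapSeconds` are hypothetical names.

```typescript
// Parse the time-of-day part of a wrangler tail timestamp such as
// "06/11/2025, 12:15:47 am" into seconds since midnight.
// Assumption: both timestamps being compared are on the same day.
function secondsSinceMidnight(stamp: string): number {
  const match = /(\d{1,2}):(\d{2}):(\d{2})\s*(am|pm)/i.exec(stamp);
  if (!match) {
    throw new Error(`Unrecognized timestamp: ${stamp}`);
  }
  const [, h, m, s, meridiem] = match;
  // 12 am maps to hour 0 and 12 pm maps to hour 12.
  const hour = (Number(h) % 12) + (meridiem.toLowerCase() === "pm" ? 12 : 0);
  return hour * 3600 + Number(m) * 60 + Number(s);
}

function gapSeconds(first: string, second: string): number {
  return secondsSinceMidnight(second) - secondsSinceMidnight(first);
}

const gap = gapSeconds("06/11/2025, 12:15:47 am", "06/11/2025, 12:17:05 am");
console.log(`${gap} seconds`); // 78 seconds
```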

Comparison with Official Examples

We noticed the official Cloudflare MCP servers use different patterns:

1. Different Server Implementation

Official examples (browser-rendering, auditlogs, etc.):

this.server = new CloudflareMCPServer({
  userId,
  wae: this.env.MCP_METRICS,
  serverInfo: {
    name: this.env.MCP_SERVER_NAME,
    version: this.env.MCP_SERVER_VERSION,
  },
})

Our implementation:

this.server = new McpServer({
  name: "Cloudflare Image Generator",
  version: "1.0.0",
}, {
  capabilities: { tools: {} },
});

2. Different Entry Pattern

Official examples: Wrap in OAuthProvider from @cloudflare/workers-oauth-provider

Our implementation: Direct routing with Bearer token auth

Questions

  1. Is CloudflareMCPServer required for production use with McpAgent, or should standard McpServer work?

  2. Is there a timeout configuration we're missing for the IoContext during MCP initialization?

  3. Does the OAuthProvider wrapper provide necessary keep-alive or timeout handling that we're missing?

  4. What causes the delay of roughly 78 seconds between the initial request (12:15:47) and the internal agent methods being called (12:17:05)?

What We've Tried

  • ✅ Implemented async pattern with ctx.waitUntil() for long-running operations
  • ✅ Background job processing to avoid blocking tool responses
  • ✅ Tested both /mcp and /sse endpoints
  • ✅ Verified tools return quickly (< 1s) with job tracking
  • Refactored entire codebase to match Cloudflare MCP patterns:
    • Restructured files (*.app.ts, *.context.ts, agents/, tools/)
    • Added error recording helpers
    • Separated tool registration into dedicated file
    • Result: the same timeout persists, which suggests the file layout and code organization are not the cause

The timeout occurs before any tools are called, during the MCP handshake itself.
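
To take the Inspector out of the picture, the handshake can be driven by hand with a single `initialize` POST to /mcp. The sketch below only builds the request. It assumes the JSON-RPC `initialize` shape from the MCP specification and the Streamable HTTP requirement that the client's `Accept` header lists both `application/json` and `text/event-stream`. The `protocolVersion` value, the `handshake-probe` client name, and both function names are placeholders chosen for illustration, not values taken from our setup.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

interface FetchInit {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Build the MCP "initialize" request that opens the handshake.
// Assumption: protocolVersion should be whatever revision the server supports.
function buildInitializeRequest(id: number): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: {
      protocolVersion: "2025-03-26",
      capabilities: {},
      clientInfo: { name: "handshake-probe", version: "0.0.1" },
    },
  };
}

// Build fetch options for a Streamable HTTP POST carrying that request.
function buildFetchInit(token: string, body: JsonRpcRequest): FetchInit {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify(body),
  };
}

const probe = buildFetchInit("<token>", buildInitializeRequest(1));
console.log(probe.headers.Accept);
```

Passing `probe` to `fetch` against the /mcp URL and timing the response should show whether the delay appears on the very first message, independent of any client-side behavior.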

Possible Causes

  1. We're using the wrong server class (McpServer vs CloudflareMCPServer)
  2. We're missing required wrapper infrastructure (OAuthProvider)
  3. We're missing timeout configuration for Durable Object IoContext
  4. We need to implement custom keep-alive during initialization

Request

Could you help us understand:

  • Are we implementing this correctly?
  • What infrastructure is required for production MCP servers on Cloudflare?
  • How can we prevent the IoContext timeout during MCP initialization?

We're happy to refactor our implementation to match best practices - just want to understand what we're missing!

Repository

Our implementation: https://github.com/jezweb/mcp-cloudflare-image-generator

Full investigation details: TIMEOUT_INVESTIGATION.md

Related Issues

Possibly related to:

However, our issue appears to be specifically during initialization before any tools are called.

Additional Context

This is a learning project to understand MCP protocol implementation on Cloudflare Workers. We're trying to build a simple image generation MCP server using Workers AI and Durable Objects for state management.

We've documented our complete investigation including timing analysis, comparison with official examples, and refactoring attempts in the repository.

Thank you for any guidance!
