Skip to content

Add DoS protection and idle timeout to vMCP session manager#4047

Draft
yrobla wants to merge 1 commit intomainfrom
issue-3874
Draft

Add DoS protection and idle timeout to vMCP session manager#4047
yrobla wants to merge 1 commit intomainfrom
issue-3874

Conversation

@yrobla
Copy link
Contributor

@yrobla yrobla commented Mar 9, 2026

Summary

Implements resource-exhaustion protections for the session-scoped backend lifecycle (SessionManagementV2), resolving issue #3874.

Global session limit: new session requests (no Mcp-Session-Id header) receive HTTP 503 with a Retry-After header when the server-wide cap is reached. Default: 100 sessions.

Per-client session limit: CreateSession enforces a maximum number of concurrent sessions per auth.Identity.Subject. Anonymous clients are exempt. The counter is rolled back on all failure paths. Default: 10 sessions per identity.

Idle session timeout: a background reaper goroutine terminates sessions that have had no CallTool activity for longer than the configured threshold. The idle clock resets on every tool call and is initialised when a session is fully established. The reaper is wired into shutdownFuncs so it stops cleanly on server shutdown. Default: 5 min.

All three limits are configurable via Config fields; zero values select the defaults. The Limits struct is passed to sessionmanager.New() and all existing call sites are updated.

Fixes #3874

Type of change

  • Bug fix
  • New feature
  • Refactoring (no behavior change)
  • Dependency update
  • Documentation
  • Other (describe):

Test plan

  • Unit tests (task test)
  • E2E tests (task test-e2e)
  • Linting (task lint-fix)
  • Manual testing (describe below)

Does this introduce a user-facing change?

No

Special notes for reviewers

@github-actions github-actions bot added the size/M Medium PR: 300-599 lines changed label Mar 9, 2026
@yrobla yrobla requested a review from Copilot March 9, 2026 11:55
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 88deb977d6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements resource-exhaustion protections for SessionManagementV2 by adding configurable session limits and idle-session cleanup to reduce risk of backend connection/memory/FD exhaustion.

Changes:

  • Add per-client (identity subject) session limiting and idle-session tracking/reaping to sessionmanager.Manager.
  • Add server-level global session limit middleware returning HTTP 503 + Retry-After for new sessions when capped.
  • Extend/adjust unit tests to cover per-client limits and idle reaping behaviors.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

File Description
pkg/vmcp/server/sessionmanager/session_manager.go Adds limits struct, per-client counters, idle tracking + reaper, and integrates cleanup into session termination/tool calls.
pkg/vmcp/server/server.go Adds config knobs/defaults, global session-limit middleware (503 + Retry-After), and wires idle reaper lifecycle into server start/shutdown.
pkg/vmcp/server/sessionmanager/session_manager_test.go Updates constructors for new Limits parameter and adds tests for per-client limit + idle reaper behavior.
Comments suppressed due to low confidence (1)

pkg/vmcp/server/server.go:607

  • The new session-limit middleware is applied before AuthMiddleware in code, but due to wrapper ordering the auth middleware becomes the outermost handler and runs first. This contradicts the comment and means over-limit requests will still pay auth cost. Apply AuthMiddleware first and then wrap with sessionLimitMiddleware (so the limit check runs before auth).
	// Apply session limit middleware when V2 session management is active.
	// Runs before auth so over-limit requests are rejected early without auth overhead.
	if s.vmcpSessionMgr != nil && s.config.MaxSessions > 0 {
		mcpHandler = s.sessionLimitMiddleware(mcpHandler)
		slog.Info("session limit middleware enabled", "max_sessions", s.config.MaxSessions)
	}

	// Apply authentication middleware if configured (runs first in chain)
	if s.config.AuthMiddleware != nil {
		mcpHandler = s.config.AuthMiddleware(mcpHandler)
		slog.Info("authentication middleware enabled for MCP endpoints")

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov
Copy link

codecov bot commented Mar 9, 2026

Codecov Report

❌ Patch coverage is 70.62937% with 42 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.63%. Comparing base (85c5f3e) to head (ce8810c).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pkg/vmcp/server/sessionmanager/session_manager.go 73.00% 23 Missing and 4 partials ⚠️
pkg/vmcp/server/server.go 65.11% 13 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4047      +/-   ##
==========================================
+ Coverage   68.61%   68.63%   +0.01%     
==========================================
  Files         445      445              
  Lines       45374    45516     +142     
==========================================
+ Hits        31135    31240     +105     
- Misses      11841    11867      +26     
- Partials     2398     2409      +11     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Implements resource-exhaustion protections for the session-scoped
backend lifecycle (SessionManagementV2), resolving issue #3874.

Global session limit: new session requests (no Mcp-Session-Id header)
receive HTTP 503 with a Retry-After header when the server-wide cap is
reached. Default: 100 sessions.

Per-client session limit: CreateSession enforces a maximum number of
concurrent sessions per auth.Identity.Subject. Anonymous clients are
exempt. The counter is rolled back on all failure paths. Default: 10
sessions per identity.

Idle session timeout: a background reaper goroutine terminates sessions
that have had no CallTool activity for longer than the configured
threshold. The idle clock resets on every tool call and is initialised
when a session is fully established. The reaper is wired into
shutdownFuncs so it stops cleanly on server shutdown. Default: 5 min.

All three limits are configurable via Config fields; zero values select
the defaults. The Limits struct is passed to sessionmanager.New() and
all existing call sites are updated.

Closes: #3874
@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/M Medium PR: 300-599 lines changed labels Mar 9, 2026
Copy link
Contributor

@jerm-dro jerm-dro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these changes are required for the session v2 work. Would you be okay setting this work aside for now?

I think we'll tackle a related problem as part of the upcoming scale work, where we'll need to implement an LRU for the session storage so not all sessions are held in memory.

Comment on lines +631 to +660
// sessionLimitMiddleware is a best-effort fast-fail for new session requests
// (no Mcp-Session-Id header): it returns HTTP 503 + Retry-After before the
// request reaches the SDK when the global session cap appears to be reached.
// Existing sessions (with a valid Mcp-Session-Id) are never affected.
//
// This check is intentionally optimistic (non-atomic): it avoids the overhead
// of routing and SDK processing for clearly-over-limit requests, but it does
// not guarantee strict enforcement under concurrent load. Strict enforcement
// is provided atomically by sessionmanager.Manager.Generate(), which uses an
// increment-first reservation to prevent races between concurrent initialize
// requests.
func (s *Server) sessionLimitMiddleware(next http.Handler) http.Handler {
// Resolve the concrete manager once so we can call ActiveSessionCount().
mgr, _ := s.vmcpSessionMgr.(*sessionmanager.Manager)
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Header.Get("Mcp-Session-Id") == "" && mgr != nil {
if mgr.ActiveSessionCount() >= s.config.MaxSessions {
w.Header().Set("Retry-After", strconv.Itoa(defaultRetryAfterSeconds))
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusServiceUnavailable)
_, _ = w.Write([]byte(
`{"error":{"code":-32000,"message":"Maximum concurrent sessions exceeded. ` +
`Please try again later or contact administrator."}}`,
))
return
}
}
next.ServeHTTP(w, r)
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocker: can we remove the rate limiting behavior from this PR?

It's not really needed today, so I'd prefer to avoid adding it.

Comment on lines +115 to +128
// perClientMu guards perClientCounts and sessionSubject.
perClientMu sync.Mutex
perClientCounts map[string]int // subject → active session count
sessionSubject map[string]string // sessionID → subject (for decrement on Terminate)

// idleActivityMu guards idleActivity.
idleActivityMu sync.RWMutex
idleActivity map[string]time.Time // sessionID → last active time

// activeSessionCount tracks sessions that have been generated but not yet
// terminated, excluding terminated placeholders left for TTL cleanup.
// This gives an accurate count for global limit enforcement, unlike
// storage.Count() which includes those terminated placeholders.
activeSessionCount atomic.Int64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocker: this is adding a lot of complexity to Manager. We should refactor the Manager struct first so we don't have to mix all these concerns together in one class. That would also make unit testing much simpler. I would expect a change like this doesn't require the integration_tests at all.

@yrobla
Copy link
Contributor Author

yrobla commented Mar 10, 2026

ok setting as draft for now. I know that is not essential but i was speeding up with pending tasks while the main code was reviewed...

@yrobla yrobla marked this pull request as draft March 10, 2026 15:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Large PR: 600-999 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[vMCP] Implement resource exhaustion and DoS protection for session creation

4 participants