Skip to content

fix(api): prevent idle TCP connections from blocking the accept loop#3303

Merged
luispater merged 2 commits into
router-for-me:devfrom
lihan3238:fix-3267
May 11, 2026
Merged

fix(api): prevent idle TCP connections from blocking the accept loop#3303
luispater merged 2 commits into
router-for-me:devfrom
lihan3238:fix-3267

Conversation

@lihan3238
Copy link
Copy Markdown
Contributor

@lihan3238 lihan3238 commented May 9, 2026

Summary

  • Move per-connection protocol detection (TLS handshake, reader.Peek) out of the accept loop and into a per-connection goroutine
  • Add regression test TestAcceptMuxNotBlockedByIdleConnection

Problem

acceptMuxConnections performs protocol sniffing inline: after accepting a connection, it calls tlsConn.Handshake() and reader.Peek(1) before returning to listener.Accept(). If a client opens a TCP connection and never sends any bytes, Peek(1) blocks indefinitely, preventing all subsequent connections from being accepted. This makes the entire management/API server unresponsive while the process remains alive.

Reproduction (from #3267):

Terminal 1: open idle TCP connection
bash -c "exec 3<>/dev/tcp/127.0.0.1/8317; sleep 60"

Terminal 2: HTTP request times out
curl -v --max-time 5 http://127.0.0.1/8317/

Fix

Extract the protocol detection logic into routeMuxConnection() and call it via a goroutine. The accept loop now only calls listener.Accept() and immediately dispatches each connection, so a single slow/idle connection cannot block the loop.

The extracted method uses return instead of the original continue since it runs in its own goroutine.

Test Plan

  • TestAcceptMuxNotBlockedByIdleConnection -- opens an idle TCP connection, then sends a real HTTP request and verifies it succeeds within 3 seconds
  • All existing tests in internal/api/... pass
  • go build ./internal/api/... compiles cleanly

Closes #3267

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the protocol multiplexer to handle connections asynchronously, preventing the main accept loop from being blocked by idle or slow clients. This is achieved by moving the TLS handshake and protocol sniffing logic into a separate goroutine. A new test case has been added to verify that idle connections do not block subsequent requests. Feedback was provided regarding a potential resource leak, as the new worker goroutines lack I/O deadlines, which could be exploited by slowloris-style attacks.

}

// routeMuxConnection performs per-connection protocol detection and routing.
func (s *Server) routeMuxConnection(conn net.Conn, httpListener *muxListener) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The routeMuxConnection goroutine performs network I/O (TLS handshake and reader.Peek) without a deadline. While this change correctly prevents the main accept loop from blocking, an idle client that connects but never sends data will still cause this worker goroutine to hang indefinitely. This can lead to a resource leak (goroutines and file descriptors) under a slowloris-style attack or with many naturally idle clients. It is recommended to set a short read deadline on the connection before starting the protocol detection logic and clear it once the connection is successfully routed to its respective handler.

lihan3238 added a commit to lihan3238/CLIProxyAPI that referenced this pull request May 9, 2026
Set a 10s read deadline on connections entering routeMuxConnection so
that idle clients which never send bytes are cleaned up instead of
leaking goroutines and file descriptors. The deadline is cleared once
the connection is successfully routed to its handler.

Addresses gemini-code-assist review feedback on PR router-for-me#3303.
@lihan3238
Copy link
Copy Markdown
Contributor Author

Good catch. Added a 10s read deadline in routeMuxConnection that is cleared once the connection is successfully routed to its handler. Idle connections will now time out and release the goroutine + fd. Pushed in 4ee690b.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4ee690b41d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

continue
return
}
s.handleRedisConnection(conn, reader)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Clear the sniff deadline before Redis handling

When management Redis/RESP is enabled and a client keeps a connection open, this path inherits the 10-second read deadline set before sniffing and never clears it before entering handleRedisConnection. Because that handler loops on blocking RESP reads, an authenticated Redis client that is idle or waits more than 10 seconds after accept before its next command will get disconnected by an i/o timeout, unlike the HTTP paths which clear the deadline after routing.

Useful? React with 👍 / 👎.

Move per-connection protocol detection (TLS handshake, reader.Peek) out
of the accept loop and into a per-connection goroutine. An idle TCP
connection that never sends bytes would previously block Peek(1)
indefinitely, preventing all subsequent connections from being accepted
and making the management/API server unresponsive.

Closes router-for-me#3267
Clear the 10s read deadline before calling handleRedisConnection so
that authenticated Redis clients are not disconnected by an i/o timeout
after 10 seconds of idle time. HTTP paths already clear the deadline
after routing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@lihan3238
Copy link
Copy Markdown
Contributor Author

Good catch, thanks @chatgpt-codex-connector! Added deadline clear before entering in the latest push (c5596e0).

@chatgpt-codex-connector
Copy link
Copy Markdown

To use Codex here, create a Codex account and connect to github.

Copy link
Copy Markdown
Collaborator

@luispater luispater left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary:
This change keeps the shared listener responsive by moving TLS/protocol sniffing out of the accept loop and into per-connection routing. The added sniff deadline also addresses the follow-up resource-leak risk for clients that connect and never send bytes.

Key findings:

  • No blocking findings.

Test plan:

  • Reviewed the diff for internal/api/protocol_multiplexer.go and internal/api/protocol_multiplexer_test.go.
  • Verified PR checks are passing: pr-test-build, translator-path-guard, and agents-md-guard.
  • Locally ran go build -o test-output ./cmd/server && rm test-output on PR HEAD c5596e09.

@luispater luispater merged commit 041ccf0 into router-for-me:dev May 11, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants