fix(api): prevent idle TCP connections from blocking the accept loop#3303
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors the protocol multiplexer to handle connections asynchronously, preventing the main accept loop from being blocked by idle or slow clients. This is achieved by moving the TLS handshake and protocol sniffing logic into a separate goroutine. A new test case has been added to verify that idle connections do not block subsequent requests. Feedback was provided regarding a potential resource leak, as the new worker goroutines lack I/O deadlines, which could be exploited by slowloris-style attacks.
| } | ||
|
|
||
| // routeMuxConnection performs per-connection protocol detection and routing. | ||
| func (s *Server) routeMuxConnection(conn net.Conn, httpListener *muxListener) { |
There was a problem hiding this comment.
The routeMuxConnection goroutine performs network I/O (TLS handshake and reader.Peek) without a deadline. While this change correctly prevents the main accept loop from blocking, an idle client that connects but never sends data will still cause this worker goroutine to hang indefinitely. This can lead to a resource leak (goroutines and file descriptors) under a slowloris-style attack or with many naturally idle clients. It is recommended to set a short read deadline on the connection before starting the protocol detection logic and clear it once the connection is successfully routed to its respective handler.
Set a 10s read deadline on connections entering routeMuxConnection so that idle clients which never send bytes are cleaned up instead of leaking goroutines and file descriptors. The deadline is cleared once the connection is successfully routed to its handler. Addresses gemini-code-assist review feedback on PR router-for-me#3303.
|
Good catch. Added a 10s read deadline in |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4ee690b41d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| continue | ||
| return | ||
| } | ||
| s.handleRedisConnection(conn, reader) |
There was a problem hiding this comment.
Clear the sniff deadline before Redis handling
When management Redis/RESP is enabled and a client keeps a connection open, this path inherits the 10-second read deadline set before sniffing and never clears it before entering handleRedisConnection. Because that handler loops on blocking RESP reads, an authenticated Redis client that is idle or waits more than 10 seconds after accept before its next command will get disconnected by an i/o timeout, unlike the HTTP paths which clear the deadline after routing.
Useful? React with 👍 / 👎.
Move per-connection protocol detection (TLS handshake, reader.Peek) out of the accept loop and into a per-connection goroutine. An idle TCP connection that never sends bytes would previously block Peek(1) indefinitely, preventing all subsequent connections from being accepted and making the management/API server unresponsive. Closes router-for-me#3267
Clear the 10s read deadline before calling handleRedisConnection so that authenticated Redis clients are not disconnected by an i/o timeout after 10 seconds of idle time. HTTP paths already clear the deadline after routing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Good catch, thanks @chatgpt-codex-connector! Added deadline clear before entering in the latest push (c5596e0). |
|
To use Codex here, create a Codex account and connect to github. |
luispater
left a comment
There was a problem hiding this comment.
Summary:
This change keeps the shared listener responsive by moving TLS/protocol sniffing out of the accept loop and into per-connection routing. The added sniff deadline also addresses the follow-up resource-leak risk for clients that connect and never send bytes.
Key findings:
- No blocking findings.
Test plan:
- Reviewed the diff for
internal/api/protocol_multiplexer.goandinternal/api/protocol_multiplexer_test.go. - Verified PR checks are passing:
pr-test-build,translator-path-guard, andagents-md-guard. - Locally ran
go build -o test-output ./cmd/server && rm test-outputon PR HEADc5596e09.
Summary
Problem
acceptMuxConnections performs protocol sniffing inline: after accepting a connection, it calls tlsConn.Handshake() and reader.Peek(1) before returning to listener.Accept(). If a client opens a TCP connection and never sends any bytes, Peek(1) blocks indefinitely, preventing all subsequent connections from being accepted. This makes the entire management/API server unresponsive while the process remains alive.
Reproduction (from #3267):
Terminal 1: open idle TCP connection
bash -c "exec 3<>/dev/tcp/127.0.0.1/8317; sleep 60"
Terminal 2: HTTP request times out
curl -v --max-time 5 http://127.0.0.1/8317/
Fix
Extract the protocol detection logic into routeMuxConnection() and call it via a goroutine. The accept loop now only calls listener.Accept() and immediately dispatches each connection, so a single slow/idle connection cannot block the loop.
The extracted method uses return instead of the original continue since it runs in its own goroutine.
Test Plan
Closes #3267