
Conversation

Contributor

redbaron commented Nov 6, 2025

Change the deadline delay for the default DeadlineContextWatcherHandler to prevent a snowballing effect where connections are established in excess of the configured connection pool maximum size in the pathological case of many context cancellations.

Previously, with the default deadline of 0, on context cancellation pgx immediately marked the connection as closed, even though pgConn.asyncClose still needs time to complete and PG needs time to act on the cancellation request. The connection pooler (pgx pool or stdlib sql) checks `IsClosed()` on Release, recognizes the connection as closed, and immediately removes it from the pool, allowing a new connection to be established.

If establishing a new connection takes less time than a full end-to-end query cancellation, including the time the server needs to act on the cancellation request, and many context cancellation events happen per second, then the number of connections seen by the PostgreSQL server starts to accumulate, exceeding the configured total pool size.

It eventually reaches the server's `max_connections` limit, with the majority of connections being cancelled ones. No new connection can be established, bringing the app to a halt.

The new default deadline delay of 50ms is chosen to be larger than a pessimistic estimate of the time for a full query cancellation to complete. With the new deadline, pgx keeps the connection in the pool for up to 50ms after cancellation, effectively throttling how quickly cancelled queries can be replaced with new connections. Because new connection creation is now throttled, it can't outrun query cancellations, preventing the snowball effect.

It also has the nice side effect of increasing the chance of connection reuse if the query that would otherwise be cancelled completes in less than 50ms.
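
For reference, a minimal sketch of setting this delay explicitly when building a pool, assuming the pgx v5 `BuildContextWatcherHandler` hook and `DeadlineContextWatcherHandler` type; the DSN and the 50ms value are placeholders:

```go
package main

import (
	"context"
	"log"
	"time"

	"github.com/jackc/pgx/v5/pgconn"
	"github.com/jackc/pgx/v5/pgconn/ctxwatch"
	"github.com/jackc/pgx/v5/pgxpool"
)

// newPool builds a pool whose connections, on context cancellation, set a
// deadline 50ms in the future on the underlying net.Conn instead of tearing
// the connection down immediately.
func newPool(ctx context.Context, dsn string) (*pgxpool.Pool, error) {
	cfg, err := pgxpool.ParseConfig(dsn)
	if err != nil {
		return nil, err
	}
	cfg.ConnConfig.BuildContextWatcherHandler = func(conn *pgconn.PgConn) ctxwatch.Handler {
		return &pgconn.DeadlineContextWatcherHandler{
			Conn:          conn.Conn(),
			DeadlineDelay: 50 * time.Millisecond,
		}
	}
	return pgxpool.NewWithConfig(ctx, cfg)
}

func main() {
	// Placeholder DSN for illustration only.
	pool, err := newPool(context.Background(), "postgres://localhost/app?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Close()
}
```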

Owner

jackc commented Nov 9, 2025

I built DeadlineContextWatcherHandler to support this behavior, but I'm not sure I want it as the default.

My approach to date has been that by default context cancellation should cause the canceled operation to return as quickly as possible.

Regardless, it seems that the underlying issue is the pool creating more connections than it has capacity for. I have some recollection of that being an issue in the past, but I thought it was resolved.

Contributor Author

redbaron commented Nov 9, 2025

> My approach to date has been that by default context cancellation should cause the canceled operation to return as quickly as possible.

Yes, but it effectively misinforms the application layer: the connection is still very much alive, but the app thinks it is gone for good. This creates the conditions for opening more connections than the app intends to.

> underlying issue is the pool creating more connections than it has capacity for.

The pool is doing the best it can: if pgConn.IsClosed() is true, it must free up the slot and allow a new connection to be created.

Ideally IsClosed() should be accurate and faithfully wait for the in-flight query to be truly aborted by the PG server. That can currently be achieved by setting DeadlineContextWatcherHandler.DeadlineDelay to infinity, but async cancellation can fail for various reasons, so there must be a time limit beyond which we just close the TCP connection as a last resort. 50ms is a reasonable limit for the async close to fully complete.

Owner

jackc commented Nov 9, 2025

> Ideally IsClosed() should be accurate and faithfully wait for the in-flight query to be truly aborted by the PG server.

This role is filled by reading from PgConn.CleanupDone(). That's what I thought was already preventing a new connection from being created while connections that are being cleaned up are still active. If that's not working then that is an underlying bug that should be fixed, rather than masking the bug with what is essentially a sleep.
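
For illustration only, waiting on that channel with a cap might look like the fragment below, where `conn` is a `*pgconn.PgConn` and the 5-second cap is an arbitrary assumption:

```go
// Wait for pgx to finish asynchronous cleanup of a cancelled connection
// before treating its pool slot as free. CleanupDone() is closed once
// asyncClose has finished; the 5s cap is an arbitrary safety valve.
select {
case <-conn.CleanupDone():
case <-time.After(5 * time.Second):
	// Give up waiting; the connection is already closed on the client side.
}
```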

Where / how are you measuring the connection count? Is it possible that it is including the temporary connections made to issue the "CancelQuery"? Those aren't full connections and aren't considered by the pool, but may be impacting this situation.

Contributor Author

redbaron commented Nov 9, 2025

> This role is filled by reading from PgConn.CleanupDone().

The cleanup channel is closed in a goroutine started by pgConn.asyncClose immediately after the query cancellation request is acknowledged by the PG server. As I understand it, that doesn't guarantee that the connection being closed has actually been closed by the server.

> Where / how are you measuring the connection count? Is it possible that it is including the temporary connections made to issue the "CancelQuery"? Those aren't full connections and aren't considered by the pool, but may be impacting this situation.

Connection count is measured on the PG side using pg_stat_activity. All the excess connections have the application_name we configure for the pgx pools. On the pgx side we don't use a custom context cancellation handler, and pgx connects directly to PG without any intermediaries. As I understand it, the auxiliary connections pgx opens to issue the CancelQuery command are not counted towards the real connection count by the PG server: they don't even authenticate the way normal connections do.

The scenario that maxes out server connections beyond the pgx pool max size (in our case max_connections was 6x the pool size) is the following (a code sketch follows the list):

  • The number of active pool.Acquire goroutines >> pool size (200-300 times more in our case). Acquires should never dry up, always applying pressure. Acquire should use a cancellable context; ours had a 2s timeout.
  • SQL queries on acquired connections should take less time than the context timeout.
  • The majority of Acquires will time out without success, but a few will be lucky enough to run a query.
  • Some of those that do run a query will have it cancelled mid-query, triggering pgConn.asyncClose.
  • If the PG server is overloaded with queries, cancellation processing gets delayed, building up connections unaccounted for by pgxpool.
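
A rough reproduction sketch of the scenario above, assuming a pool built roughly as in the earlier snippet; the goroutine count, 2s timeout, and pg_sleep duration are illustrative, not the actual production values:

```go
import (
	"context"
	"sync"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"
)

// hammer keeps far more Acquire goroutines in flight than the pool has
// capacity for. Each attempt shares a single 2s budget for Acquire plus the
// query, so queries that start late in that budget are cancelled mid-flight.
func hammer(ctx context.Context, pool *pgxpool.Pool) {
	var wg sync.WaitGroup
	for i := 0; i < 5000; i++ { // >> pool max size
		wg.Add(1)
		go func() {
			defer wg.Done()
			for ctx.Err() == nil {
				attemptCtx, cancel := context.WithTimeout(ctx, 2*time.Second)
				conn, err := pool.Acquire(attemptCtx)
				if err == nil {
					// The query is shorter than the timeout, but may still be
					// cancelled if Acquire consumed most of the 2s budget.
					_, _ = conn.Exec(attemptCtx, "select pg_sleep(0.5)")
					conn.Release()
				}
				cancel()
			}
		}()
	}
	wg.Wait()
}
```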

Owner

jackc commented Nov 28, 2025

> The cleanup channel is closed in a goroutine started by pgConn.asyncClose immediately after the query cancellation request is acknowledged by the PG server. As I understand it, that doesn't guarantee that the connection being closed has actually been closed by the server.

I looked at all the locations where the cleanup channel is closed. In every case the net.Conn is closed first. So at least on the client side the connection is entirely terminated. But I suppose that on the server side the connection could still be open for a little while. 🤔

Looking at the PG docs it doesn't seem there is a good way for the client to know when the server has actually finished cleaning up the connection. From https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-TERMINATION

> The normal, graceful termination procedure is that the frontend sends a Terminate message and immediately closes the connection. On receipt of this message, the backend closes the connection and terminates.

We are already doing that.

But a simple change would be to add the sleep / delay to asyncClose. That would preserve the behavior of the caller of an interrupted query getting control back as quickly as possible while delaying the closing of the cleanup channel for a while.

It also might be possible to use CloseWrite and Read to wait a little while for the server. With that we could wait up to a few seconds if the server never closed the connection, but also close the cleanup channel immediately if the server did close it. Might need to review #637 though.
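
A purely illustrative fragment of that CloseWrite / Read idea; the function name, buffer size, and 3-second read deadline are assumptions, not an actual pgconn change:

```go
import (
	"net"
	"time"
)

// waitForServerClose half-closes our side of the connection, then reads
// until the server closes its end or the read deadline expires, and finally
// closes the socket. The 3s deadline is an arbitrary assumption.
func waitForServerClose(conn net.Conn) {
	if tcpConn, ok := conn.(*net.TCPConn); ok {
		_ = tcpConn.CloseWrite() // signal we are done sending
	}
	_ = conn.SetReadDeadline(time.Now().Add(3 * time.Second))
	buf := make([]byte, 1024)
	for {
		if _, err := conn.Read(buf); err != nil {
			// io.EOF means the server closed its side; a timeout means we give up.
			break
		}
	}
	_ = conn.Close()
}
```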
