PoolInner::acquire() does not try the idle queue after a transient connection failure

While thinking about potential sources of the infamous `PoolTimedOut` error, I realized that there's an interesting failure mode in `acquire()`.

Once it decides to open a new connection, that's all it tries to do: https://github.com/launchbadge/sqlx/blob/e1ac3881734293cb33674a8b0b1d983132b9c2b1/sqlx-core/src/pool/inner.rs#L283-L284

If a nonfatal connection error happens, it just continues in the backoff loop in `connect()` and never touches the idle queue again: https://github.com/launchbadge/sqlx/blob/e1ac3881734293cb33674a8b0b1d983132b9c2b1/sqlx-core/src/pool/inner.rs#L348

It will continue to do this until the timeout if the transient error does not resolve itself.
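As a rough illustration of the loop described above, here is a minimal synchronous model. It is not sqlx's actual code: `Conn`, `ConnectError`, `acquire_connect_only`, and the `max_attempts` counter standing in for the acquire deadline are all hypothetical names made up for this sketch.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;

/// Hypothetical stand-in for a pooled connection.
#[derive(Debug, PartialEq)]
struct Conn(u32);

/// Hypothetical error type; its only variant models a transient failure
/// such as Postgres' 53300 (too_many_connections).
#[derive(Debug)]
enum ConnectError {
    Transient,
}

/// Simplified model of the behavior described above: once the task has
/// decided to connect, it only retries the connection attempt and never
/// looks at the idle queue again. `max_attempts` stands in for the timeout.
fn acquire_connect_only(
    _idle: &RefCell<VecDeque<Conn>>,
    mut try_connect: impl FnMut() -> Result<Conn, ConnectError>,
    max_attempts: u32,
) -> Option<Conn> {
    for _ in 0..max_attempts {
        match try_connect() {
            Ok(conn) => return Some(conn),
            // Transient error: back off (omitted here) and connect again.
            Err(ConnectError::Transient) => continue,
        }
    }
    None // models PoolTimedOut
}
```

In this model, a connection that another task returns to `_idle` while `try_connect` keeps failing is never picked up, so the caller still ends up with `None`.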

Right now, only the Postgres driver overrides `DatabaseError::is_transient_in_connect_phase()`, but one of the error codes it considers transient is the "too many connections" error: https://github.com/launchbadge/sqlx/blob/e1ac3881734293cb33674a8b0b1d983132b9c2b1/sqlx-postgres/src/error.rs#L192-L195

This means that if the `max_connections` of the pool exceeds what is currently available on the server, tasks can get stuck in a loop trying to open new connections despite there being idle connections available, leading to surprising `PoolTimedOut` errors.

This is potentially the cause of some of the `PoolTimedOut` issues being reported, although it's only likely to occur with the Postgres driver.
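One possible direction, sketched here as a synchronous model and not as a proposed patch, is to re-check the idle queue before each connection retry. Everything in it is an assumption for illustration: `Conn`, `ConnectError`, `acquire_with_idle_recheck`, and the `max_attempts` counter that stands in for the acquire deadline do not exist in sqlx.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;

/// Hypothetical stand-in for a pooled connection.
#[derive(Debug, PartialEq)]
struct Conn(u32);

/// Hypothetical error type; its only variant models a transient failure
/// such as Postgres' 53300 (too_many_connections).
#[derive(Debug)]
enum ConnectError {
    Transient,
}

/// Simplified retry loop that consults the idle queue before every
/// connection attempt, so a connection released by another task is
/// picked up instead of waiting for the server to accept a new one.
fn acquire_with_idle_recheck(
    idle: &RefCell<VecDeque<Conn>>,
    mut try_connect: impl FnMut() -> Result<Conn, ConnectError>,
    max_attempts: u32,
) -> Option<Conn> {
    for _ in 0..max_attempts {
        // Prefer an idle connection if one has become available.
        if let Some(conn) = idle.borrow_mut().pop_front() {
            return Some(conn);
        }
        match try_connect() {
            Ok(conn) => return Some(conn),
            // Transient error: back off (omitted here) and loop again.
            Err(ConnectError::Transient) => continue,
        }
    }
    None // models PoolTimedOut
}
```

With this shape, a task that keeps hitting a transient connect error still returns as soon as any connection lands in `idle`, at the cost of one extra queue check per retry.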

	// too_many_connections
	// This may be returned if we just un-gracefully closed a connection,
	// give the database a chance to notice it and clean it up.
	"53300",

PoolInner::acquire() does not try the idle queue after a transient connection failure #2848

	// Attempt to connect...
	return self.connect(deadline, guard).await;
