Description
While thinking about potential sources of the infamous `PoolTimedOut` error, I realized that there's an interesting failure mode in `acquire()`.
Once it decides to open a new connection, that's all it tries to do (`sqlx/sqlx-core/src/pool/inner.rs`, lines 283–284 at `e1ac388`).
If a nonfatal connection error happens, it just continues in the backoff loop in `connect()` and never touches the idle queue again (`sqlx/sqlx-core/src/pool/inner.rs`, line 348 at `e1ac388`). It will keep retrying this way until the acquire timeout elapses if the transient error does not resolve itself.
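To make the failure mode concrete, here is a minimal, self-contained sketch of the flow described above. All names (`PoolInner`, `try_pop_idle`, `try_connect`) are hypothetical stand-ins, and the real code is async; this shows the shape of the problem, not sqlx's actual implementation:

```rust
use std::time::{Duration, Instant};

#[derive(Debug)]
struct Connection;
struct PoolInner;

#[derive(Debug)]
enum Error {
    PoolTimedOut,
    // Nonfatal on the server side, e.g. "too many connections".
    TransientConnect,
}

impl PoolInner {
    fn try_pop_idle(&self) -> Option<Connection> {
        None // details elided
    }

    fn try_connect(&self) -> Result<Connection, Error> {
        Err(Error::TransientConnect) // details elided
    }

    fn acquire(&self, deadline: Instant) -> Result<Connection, Error> {
        if let Some(conn) = self.try_pop_idle() {
            return Ok(conn);
        }
        // Once we commit to opening a new connection, the idle queue
        // is never consulted again before the deadline.
        self.connect(deadline)
    }

    fn connect(&self, deadline: Instant) -> Result<Connection, Error> {
        let mut backoff = Duration::from_millis(10);
        while Instant::now() < deadline {
            match self.try_connect() {
                Ok(conn) => return Ok(conn),
                // A transient error loops straight back into the backoff;
                // connections released to the idle queue in the meantime
                // are never picked up.
                Err(Error::TransientConnect) => {
                    std::thread::sleep(backoff);
                    backoff *= 2;
                }
                Err(e) => return Err(e),
            }
        }
        Err(Error::PoolTimedOut)
    }
}

fn main() {
    let pool = PoolInner;
    let deadline = Instant::now() + Duration::from_secs(1);
    // With the idle queue empty at the moment of the check, this spins
    // in `connect()` and ends in `PoolTimedOut`.
    println!("{:?}", pool.acquire(deadline));
}
```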
Right now, only the Postgres driver overrides `DatabaseError::is_transient_in_connect_phase()`, but one of the error codes it considers transient is the "too many connections" error (`sqlx/sqlx-postgres/src/error.rs`, lines 192–195 at `e1ac388`).
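For reference, the check amounts to something like the following (a sketch, not the exact sqlx source; the real method lives on `DatabaseError` and may cover additional codes). SQLSTATE `53300` is Postgres's `too_many_connections` error:

```rust
// Hypothetical free-function version of the check, for illustration only.
fn is_transient_in_connect_phase(sqlstate: &str) -> bool {
    // 53300 = too_many_connections: retrying only helps if the server
    // is expected to free up a connection slot before the deadline.
    sqlstate == "53300"
}
```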
This means that if the `max_connections` of the pool exceeds what is currently available on the server, tasks can get stuck in a loop trying to open new connections despite there being idle connections available, leading to surprising `PoolTimedOut` errors.
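A hypothetical reproduction sketch (untested; the connection string and limits are made up, and the race means it may not fail deterministically) would be a pool whose `max_connections` exceeds the server's limit, e.g. against a Postgres server configured with `max_connections = 10`:

```rust
use std::time::Duration;
use sqlx::postgres::PgPoolOptions;

#[tokio::main]
async fn main() -> Result<(), sqlx::Error> {
    // Pool allows far more connections than the server does.
    let pool = PgPoolOptions::new()
        .max_connections(100)
        .acquire_timeout(Duration::from_secs(5))
        .connect("postgres://localhost/test")
        .await?;

    // Burst of concurrent acquires: some tasks win server slots, the
    // rest decide to open new connections, hit "too many connections",
    // and keep retrying in connect() even as the winners drop their
    // connections back into the idle queue.
    let tasks: Vec<_> = (0..100)
        .map(|_| {
            let pool = pool.clone();
            tokio::spawn(async move {
                match pool.acquire().await {
                    Ok(_conn) => {} // dropped immediately, returns to idle
                    Err(e) => eprintln!("acquire failed: {e}"), // PoolTimedOut
                }
            })
        })
        .collect();

    for t in tasks {
        let _ = t.await;
    }
    Ok(())
}
```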
This is potentially the cause of some of the `PoolTimedOut` issues being reported, although it's only likely to occur with the Postgres driver.