database/sql: DB full resetterCh causes driver.ErrBadConn error #31480
Comments
Experiencing the same issue. |
Change https://golang.org/cl/174122 mentions this issue: |
I can see two solutions to this issue:
I'll look into this more. I'm not ready for a final fix (the referenced CL isn't quite right I think). If someone wants to submit a CL in the next couple of days that would be great. We are really close to the 1.13 freeze. |
I think you'd need to do both. This is a bit of a black-box issue that sporadically affects production services. Is there an ETA here, or would a submitted CL be processed fairly quickly? Thanks. |
I think it is too late in the cycle to submit the CL, but if you want to, you could cherry-pick the change locally and try it out. |
Would 1.12.x be game, or does the 1.13 freeze affect 1.12.x updates? |
@kardianos if the change is too invasive or subtle for 1.13 at this point, then it's almost certainly not a candidate for a 1.12 backport either, especially given that there is a workaround (namely, setting the connection limit ≤ 50). |
This would be a problem for larger services where 50 concurrent connections isn't adequate, however. The CL builds and tests out on my end under pretty good stress: 500 goroutines spamming local and remote MySQL instances. I patched 1.12.7, and the two versions of Go performed the same. That being said, I'm not able to reproduce the reported issue outside of prod, and it would be difficult to deploy this in our production without a bit of a gamut of new Docker images, not to mention that we're trying to not have this issue affect us poorly again. :) |
I patched 1.12.7, and it builds and tests out on my end. I put some fairly heavy load on both 1.12.7 and patched 1.12.7, basically the OP's 500-worker use case. I wasn't able to reproduce the issue locally, but both versions performed fine and comparably. The built-in Go tests passed as well, obviously. It would be difficult to deploy this in our production to reproduce the issue without undoing workaround code and a gamut of new Docker images, not to mention that we're trying to not have this issue affect us poorly again. :) All that said, the CL seems worthy, IMO. |
Any chance https://golang.org/cl/174122 can be pushed into 13.++ before it's forgotten? :-) |
@wekb, last I checked CL 174122 was awaiting a bit of rework to avoid blocking indefinitely in When the fix is ready, I suspect it will not be eligible for backporting: to my knowledge it is not a regression, and there is a workaround today (setting |
Thanks. I just wanted to represent that there's ongoing interest and that it makes it into a release in due time. |
Reverting commit in CL https://go-review.googlesource.com/c/go/+/223958. |
Change https://golang.org/cl/242102 mentions this issue: |
Change https://golang.org/cl/242522 mentions this issue: |
Manually backported the subject CLs because of a lack of Gerrit "forge-author" permissions, and also because the prior cherry-picks didn't apply cleanly due to a tight relation chain.

The backport comprises:
* CL 174122
* CL 216197
* CL 223963
* CL 216240
* CL 216241

Note: Because we cannot retroactively introduce API changes to Go1.13.13 that weren't in Go1.13, the Conn.Validator interface (from CL 174122, CL 223963) isn't exposed; drivers are instead inspected for an implemented IsValid() bool method.

For a description of the content of each CL:

* CL 174122: database/sql: process all Session Resets synchronously

Adds a new interface, driver.ConnectionValidator, to allow drivers to signal they should not be used again, separately from the session resetter interface. This is done now that the session reset is done after the connection is put into the connection pool.

Previous behavior attempted to run Session Resets in a background worker. This implementation had two problems: untested performance gains for additional complexity, and failures when the pool size exceeded the connection reset channel buffer size.

* CL 216197: database/sql: check conn expiry when returning to pool, not when handing it out

With the original connection reuse strategy, it was possible that when a new connection was requested, the pool would wait for an existing connection to return for re-use in a full connection pool, and then it would check if the returned connection was expired. If the returned connection expired while awaiting re-use, it would return an error to the location requesting the new connection. The existing call sites requesting a new connection were often the last attempt at returning a connection for a query. This would then result in a failed query.

This change ensures that the expiry check is performed right before a connection is inserted back into the connection pool while a new connection is being requested. A request for a new connection will then no longer fail due to the connection expiring.

* CL 216240: database/sql: prevent Tx statement from committing after rollback

It was possible for a Tx that was aborted for rollback asynchronously to execute a query after the rollback had completed on the database, which often would auto commit the query outside of the transaction.

W-locking tx.closemu before issuing the rollback on the connection ensures that any Tx query either fails or finishes on the Tx, and never runs after the Tx has rolled back.

* CL 216241: database/sql: on Tx rollback, retain connection if driver can reset session

Previously the Tx would drop the connection after rolling back from a context cancel. Now if the driver can reset the session, keep the connection.

* CL 223963: database/sql: add test for Conn.Validator interface

This addresses comments made by Russ after https://golang.org/cl/174122 was merged. It adds a test for the connection validator and renames the interface to just "Validator".

Updates #31480
Updates #32530
Updates #32942
Updates #34775
Fixes #40205

Change-Id: I6d7307180b0db0bf159130d91161764cf0f18b58
Reviewed-on: https://go-review.googlesource.com/c/go/+/242522
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Theophanes <kardianos@gmail.com>
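The note about inspecting drivers for an `IsValid() bool` method can be illustrated with a small sketch. Everything here is hypothetical: `fakeConn`, `plainConn`, and the `usable` helper are stand-ins for illustration, not the actual database/sql source. The only detail taken from the commit message is the method signature `IsValid() bool`.

```go
package main

import "fmt"

// fakeConn is a hypothetical driver connection that can tell the pool
// it should not be reused.
type fakeConn struct {
	broken bool
}

// IsValid reports whether the connection may be handed out again.
func (c *fakeConn) IsValid() bool { return !c.broken }

// plainConn is a hypothetical connection from a driver that does not
// implement IsValid at all.
type plainConn struct{}

// usable sketches a duck-typed check: it looks for an IsValid() bool method
// without naming an exported interface, in the spirit of the backport note.
func usable(conn interface{}) bool {
	if v, ok := conn.(interface{ IsValid() bool }); ok {
		return v.IsValid()
	}
	// The driver expressed no opinion, so assume the connection is reusable.
	return true
}

func main() {
	fmt.Println(usable(&fakeConn{}))             // true
	fmt.Println(usable(&fakeConn{broken: true})) // false
	fmt.Println(usable(plainConn{}))             // true
}
```

Checking for the method through an anonymous interface is what lets a point release honor it without adding a new exported name to the API.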
Manually backported the subject CLs because of a lack of Gerrit "forge-author" permissions, and also because the prior cherry-picks didn't apply cleanly due to a tight relation chain.

The backport comprises:
* CL 174122
* CL 216197
* CL 223963
* CL 216240
* CL 216241

Note: Because we cannot retroactively introduce API changes to Go1.14.6 that weren't in Go1.14, the Conn.Validator interface (from CL 174122, CL 223963) isn't exposed; drivers are instead inspected for an implemented IsValid() bool method.

For a description of the content of each CL:

* CL 174122: database/sql: process all Session Resets synchronously

Adds a new interface, driver.ConnectionValidator, to allow drivers to signal they should not be used again, separately from the session resetter interface. This is done now that the session reset is done after the connection is put into the connection pool.

Previous behavior attempted to run Session Resets in a background worker. This implementation had two problems: untested performance gains for additional complexity, and failures when the pool size exceeded the connection reset channel buffer size.

* CL 216197: database/sql: check conn expiry when returning to pool, not when handing it out

With the original connection reuse strategy, it was possible that when a new connection was requested, the pool would wait for an existing connection to return for re-use in a full connection pool, and then it would check if the returned connection was expired. If the returned connection expired while awaiting re-use, it would return an error to the location requesting the new connection. The existing call sites requesting a new connection were often the last attempt at returning a connection for a query. This would then result in a failed query.

This change ensures that the expiry check is performed right before a connection is inserted back into the connection pool while a new connection is being requested. A request for a new connection will then no longer fail due to the connection expiring.

* CL 216240: database/sql: prevent Tx statement from committing after rollback

It was possible for a Tx that was aborted for rollback asynchronously to execute a query after the rollback had completed on the database, which often would auto commit the query outside of the transaction.

W-locking tx.closemu before issuing the rollback on the connection ensures that any Tx query either fails or finishes on the Tx, and never runs after the Tx has rolled back.

* CL 216241: database/sql: on Tx rollback, retain connection if driver can reset session

Previously the Tx would drop the connection after rolling back from a context cancel. Now if the driver can reset the session, keep the connection.

* CL 223963: database/sql: add test for Conn.Validator interface

This addresses comments made by Russ after https://golang.org/cl/174122 was merged. It adds a test for the connection validator and renames the interface to just "Validator".

Updates #31480
Updates #32530
Updates #32942
Updates #34775
Fixes #39101

Change-Id: I043d2d724a367588689fd7d6f3cecb39abeb042c
Reviewed-on: https://go-review.googlesource.com/c/go/+/242102
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Theophanes <kardianos@gmail.com>
What version of Go are you using (`go version`)?
Does this issue reproduce with the latest release?
yes
What operating system and processor architecture are you using (`go env`)?
`go env` output:
GOHOSTARCH="amd64" GOHOSTOS="linux" GOOS="linux"
What did you do?
I am running a stress test against a MySQL server using Go.
I create a sql.DB and set
Then I create 500 goroutines (500 clients) and send 1000000 queries to the MySQL server.
After I run the program, it sometimes reports the error "driver: bad connection" (`driver.ErrBadConn`).
I found that `sql.OpenDB` creates a `*sql.DB` struct with:
resetterCh: make(chan *driverConn, 50)
In `func (db *DB) putConn(dc *driverConn, err error, resetSession bool)`, if `db.resetterCh` is full, the connection is marked as bad. If the number of connections exceeds the maximum, the pool then reuses an old connection that is marked as bad and returns `driver.ErrBadConn`.
I can work around it by setting the maximum number of connections to less than 50 (the size of `db.resetterCh`).
Why is the size of `db.resetterCh` hard-coded to 50?
Should it be set to the maximum number of connections?
https://play.golang.org/p/phUILuRV3hJ
What did you expect to see?
What did you see instead?