Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Version 1.15.0 causes errors server closed the connection unexpectedly when closing down Sidekiq worker dynos #150

Closed
damhonglinh opened this issue Jun 2, 2021 · 5 comments · Fixed by #169

Comments

@damhonglinh
Copy link

damhonglinh commented Jun 2, 2021

After yesterday deploy, we started to get a lot of ActiveRecord errors server closed the connection unexpectedly.

After hours of investigating, narrowing down the cause, we found out it's because of the new version 1.15.0 of pgbouncer (or v0.8.0 of this buildpack), and the errors happen when we stop (scale down) or restart a Sidekiq worker dyno.

From the logs when restarting a dyno, it looks like pgbouncer process (in the new version) gets killed too soon, at the same time when Sidekiq processes are stopping so any running SQL queries are abruptly stopped. While for the previous version, pgbouncer process seems to get killed after Sidekiq processes have stoped, as expected.

Our questions:

  • Any ideas why the new version 1.15.0 causes these errors? Because checking changelog of the buildpack and of pgbouncer, we cannot find any suspects.
  • Any ideas how to fix this issue?
  • Is there a way for us to lock in version 0.7.0 of this buildpack (v1.14.0 of pgbouncer) as we can't find any documentation about this.

Variants of these errors:

  • ActiveRecord::StatementInvalid: PG::UnableToSend: server closed the connection unexpectedly. This probably means the server terminated abnormally before or while processing the request.
  • ActiveRecord::StatementInvalid: PG::ConnectionBad: PQconsumeInput() server closed the connection unexpectedly. This probably means the server terminated abnormally before or while processing the request
  • ActiveRecord::StatementInvalid: PG::ConnectionBad: PQconsumeInput() SSL connection has been closed unexpectedly
@honigc
Copy link

honigc commented Jun 2, 2021

We're also encountering this same problem today. It's possible to pin an unofficial buildpack to a specific tag or commit, and using https://github.com/heroku/heroku-buildpack-pgbouncer#v0.7.0 seems to work as a temporary fix.

@edmorley
Copy link
Member

Did anyone figure out the cause of the issues with pgbouncer 1.15?

In order to fix #164 (for Heroku-22) we now need to upgrade to pgbouncer 1.17, and unless 1.16 or 1.17 happened to fix the issue here, it's going to occur again. One option would be to only use 1.17 on Heroku-22, however that means more differences between stacks which isn't ideal, as it adds more variables to consider when upgrading stack.

cc @beanieboi

edmorley added a commit that referenced this issue Jun 20, 2022
To fix crashes/seg faults seen on Heroku-22 due to Ubuntu 22.04
now shipping with OpenSSL 3.

See:
https://www.pgbouncer.org/2022/03/pgbouncer-1-17-0
https://www.pgbouncer.org/changelog.html#pgbouncer-117x

The other stacks remain on pgbouncer 1.14.0 for now, until #150 (the reason
an earlier upgrade to pgbouncer was rolled back) can be investigated.

#165 is also open for improving the CI of this repo (since it didn't catch these
crashes), though fixing that is out of the scope of this PR, as we just need to
stop the crashes for now.

Fixes #164.
GUS-W-11319779.
@ezekg
Copy link

ezekg commented Sep 5, 2022

Any update on this?

edmorley added a commit that referenced this issue Sep 6, 2022
* Adds the missing patchfile for 1.17.0, meaning sigterm is now
   correctly patched for that version too.
* Removes the "default" patchfiles directory and makes the
   per-version directories mandatory, so future releases can't
   accidentally forget to add a new patchfile.
* Removes the patchfile for pgbouncer 1.7 since that version is
   no longer built.

Closes #150.
Closes #168.
edmorley added a commit that referenced this issue Sep 6, 2022
* Adds the missing patchfile for 1.17.0, meaning sigterm is now correctly patched for that version too.
* Removes the "default" patchfiles directory and makes the per-version directories mandatory, so future releases can't accidentally forget to add a new patchfile.
* Removes the patchfile for pgbouncer 1.7 and 1.13.0 since those versions are no longer built.

Closes #150.
Closes #168.
GUS-W-11700066.
@edmorley
Copy link
Member

edmorley commented Sep 6, 2022

So it turns out that this buildpack patches pgbouncer to adjust its shutdown behaviour, and that the patch was missing for both the earlier 1.15.0 upgrade attempt, and the more recent 1.17.0 release. The reason for this is that the build script had a silent "fall back to no patches" behaviour if a version-specific patch was not found.

In #169 I've added the missing patchfile for 1.17.0, and also adjusted the build script such that it no longer has a silent fallback to no patches - which will prevent future pgbouncer upgrades from accidentally not including the patch.

@edmorley
Copy link
Member

edmorley commented Sep 6, 2022

(But ultimately the real issue here is that this buildpack is only informally maintained by people who don't actively use it, and the buildpack has insufficient test coverage)

edmorley added a commit that referenced this issue Sep 9, 2022
For parity with Heroku-22, now that the cause of #150 is
understood and fixed.
edmorley added a commit that referenced this issue Sep 9, 2022
For parity with Heroku-22 (thereby reducing differences between stacks, so
there is one fewer thing to think about when upgrading stacks), now that the
cause of #150 is understood and fixed.

https://www.pgbouncer.org/changelog.html#pgbouncer-117x

Binaries generated using `make build-heroku-18 build-heroku-20`.
shrolox added a commit to klaxit/heroku-buildpack-pgbouncer that referenced this issue Oct 5, 2022
* Add support for Heroku-22 (heroku#161)

Binary generated using `make build-heroku-22`.

GUS-W-10346720.

* Release v0.10.0 (heroku#162)

To pick up heroku#161.

GUS-W-10346720.

* Update to pgbouncer v1.17.0 on Heroku-22 (heroku#166)

To fix crashes/seg faults seen on Heroku-22 due to Ubuntu 22.04
now shipping with OpenSSL 3.

See:
https://www.pgbouncer.org/2022/03/pgbouncer-1-17-0
https://www.pgbouncer.org/changelog.html#pgbouncer-117x

The other stacks remain on pgbouncer 1.14.0 for now, until heroku#150 (the reason
an earlier upgrade to pgbouncer was rolled back) can be investigated.

heroku#165 is also open for improving the CI of this repo (since it didn't catch these
crashes), though fixing that is out of the scope of this PR, as we just need to
stop the crashes for now.

Fixes heroku#164.
GUS-W-11319779.

* Patch sigterm for version 1.17.0 too (heroku#169)

* Adds the missing patchfile for 1.17.0, meaning sigterm is now correctly patched for that version too.
* Removes the "default" patchfiles directory and makes the per-version directories mandatory, so future releases can't accidentally forget to add a new patchfile.
* Removes the patchfile for pgbouncer 1.7 and 1.13.0 since those versions are no longer built.

Closes heroku#150.
Closes heroku#168.
GUS-W-11700066.

* Update pgbouncer to v1.17.0 for Heroku-18 and Heroku-20 (heroku#170)

For parity with Heroku-22 (thereby reducing differences between stacks, so
there is one fewer thing to think about when upgrading stacks), now that the
cause of heroku#150 is understood and fixed.

https://www.pgbouncer.org/changelog.html#pgbouncer-117x

Binaries generated using `make build-heroku-18 build-heroku-20`.

* Convert to Github Actions (heroku#172)

* Allow SSL certificate usage for database connections

* remake binaries

Co-authored-by: Ed Morley <501702+edmorley@users.noreply.github.com>
Co-authored-by: Ken Barber <ken@bob.sh>
Co-authored-by: teckwan <wong.teck.wan@outlook.com>
Co-authored-by: Pierre-Olivier Maugis <pierre-olivier.maugis@klaxit.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants