Timeout errors #389
Comments
I would try increasing the |
Hey @joshrotenberg, I forgot to mention but increasing the I also noticed this other error that I believe was not happening on 0.10, but not sure if it's related or not:
|
Is it possible that the ssl config changed? Also, would it be possible to try different versions of Erlang? Can you try 22.1? I vaguely remember running into some ssl-related issues with different erlang versions but I don’t remember the details :( |
Which config do you mean exactly? But I'd say no, we haven't changed any config in the app nor in the Kafka cluster.
Actually, we are already on 22.1.6. I've just tried 22.1.8 as well, but the result is the same. To confirm that the Erlang version is related, I downgraded to the initial version and it works perfectly, with no timeouts at all.
@ elixir_buildpack.config:1 @
-erlang_version=22.1.8
+erlang_version=21.3.7 |
While 21.3.7 works fine, the next closest version available in the Heroku buildpack (22.0.3) already raises the timeouts, so the problem is potentially related to the major upgrade. |
Confirming the issue: we had to downgrade to the same version (21.3.7) to avoid timeout issues. |
Hi, any news with that? |
FYI, I created an issue on the OTP project: https://bugs.erlang.org/browse/ERL-1213 I'm not sure what we can really do in the meantime. We could try to silently re-create the socket every time there's a timeout, but that sounds like a bad idea. |
I tested many versions of OTP. The problem appears in OTP 21.3. With OTP 21.2 and lower, things are fine: https://github.com/jbruggem/kafka_ex_ssl_bug#run-many-times |
Thank you for following up on this so thoroughly! |
@jbruggem what’s the conclusion here? I’ll get Kayrock patched ASAP, but my understanding was that that’s not the bug but rather a compilation problem you discovered while hunting the bug. |
Nothing changes here, still waiting for somebody to dig into this. It will take a few days at least. I'm not sure you need to patch Kayrock, as your dependency definition allows for |
Should we re-open the issue then? |
My bad. I closed this by mistake and didn't even realize. |
News from upstream: after much back-and-forth to help a very dedicated OTP dev figure out the issue, 🎉 she did 🎉 ! From the issue:
There's nothing more to do on our side, except to update the documentation to mention this bug and the CI to use the latest versions of OTP releases. |
Thanks @jbruggem ! |
Amazing! Thank you! |
If anyone's still having the same issue, you can try upgrading OTP to 22.3.4.12 or beyond. There were several fixes to ssl recv after the major fix in 22.3.3. https://www.erlang.org/patches/otp-22.3.3#ssl-9.6.2 |
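For anyone who wants to check which OTP patch release a node is actually running, here is a minimal sketch. `OtpVersionCheck` is a hypothetical helper, not part of kafka_ex, and the default threshold is simply the 22.3.4.12 suggested above. It assumes a standard OTP install where the `OTP_VERSION` file exists under `releases/`, which may not hold for every packaging.

```elixir
defmodule OtpVersionCheck do
  @moduledoc false
  # Hypothetical helper (not part of kafka_ex). The threshold below is
  # the version suggested in this thread, used here as an assumption.
  @min_version "22.3.4.12"

  # Reads the full OTP version (e.g. "22.3.4.12") from the OTP_VERSION file.
  # System.otp_release/0 only returns the major release, so it is not enough.
  def running_version do
    [:code.root_dir(), "releases", :erlang.system_info(:otp_release), "OTP_VERSION"]
    |> Path.join()
    |> File.read!()
    |> String.trim()
  end

  # Turns "22.3.4.12" into [22, 3, 4, 12]. Non-numeric suffixes on a
  # component (such as "0-rc1") are dropped; unparsable parts count as 0.
  def parse(version) do
    version
    |> String.split(".")
    |> Enum.map(fn part ->
      case Integer.parse(part) do
        {n, _rest} -> n
        :error -> 0
      end
    end)
  end

  # True when `version` is the same as or newer than `minimum`.
  # Components are compared numerically, so "22.3.4.2" < "22.3.4.12".
  def at_least?(version, minimum \\ @min_version) do
    a = parse(version)
    b = parse(minimum)
    len = max(length(a), length(b))
    pad = fn list -> list ++ List.duplicate(0, len - length(list)) end
    # Integer lists compare element by element in Erlang term order.
    pad.(a) >= pad.(b)
  end
end
```

One way to use it would be to call `OtpVersionCheck.at_least?(OtpVersionCheck.running_version())` at application start and log a warning when it returns `false`.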
Hi all,
It all started in the Heroku review-envs when we upgraded our application from (Erlang 21.3.7 and Elixir 1.8.2) to (Erlang 22.0.7 and Elixir 1.9.1).
We noticed, for some reason, that downgrading Erlang back to 21.3.7 solved the timeouts.
At first, messages were not being consumed or produced and thousands of timeout errors were flooding the logs. Then we bumped to `kafka_ex` 0.10, which seemed to improve things a bit (fewer timeouts in the logs and messages being consumed).
However, in the last week, we started to get the same timeouts in production, though at a lower frequency (~10 timeouts/day).
Some extra information:
I also tried `kafka_ex` on the master branch and with the new `kayrock` API, but it doesn't seem to have helped. Lots of timeouts keep flooding the logs.
The error on 0.10:
And the error now with master:
The `Successfully connected...` and `Receiving data from broker...` messages are also logged indefinitely, many per second.