Bazel hangs when using GRPC cache #10731
Comments
/cc @buchgr
Does it hang "forever" or terminate after a long period of time?
There is a known problem: if the TCP connection dies, Bazel might not be able to detect it for ~20 minutes or so. We have a solution but never got around to implementing it: https://docs.google.com/document/d/13vXJTFHJX_hnYUxKdGQ2puxbNM3uJdZcEO5oNGP-Nnc/edit
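To make the failure mode above concrete, here is a minimal sketch of a caller-side deadline around a call that blocks on a dead connection. It is an illustration only, not Bazel's or gRPC's implementation: `call_with_deadline`, `stalled_upload`, and `connection_recovered` are hypothetical names, and the stalled write is simulated with a `threading.Event` rather than a real socket.

```python
import concurrent.futures
import threading


def call_with_deadline(fn, timeout_s):
    """Run fn() on a worker thread and give up waiting after timeout_s seconds.

    Hypothetical helper for illustration; not part of Bazel or gRPC.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn)
    try:
        # Raises concurrent.futures.TimeoutError if fn() has not finished in time.
        return future.result(timeout=timeout_s)
    finally:
        # Return immediately instead of joining a worker that may be stuck.
        executor.shutdown(wait=False)


# Stand-in for a write that blocks on a connection the peer has silently dropped.
connection_recovered = threading.Event()


def stalled_upload():
    connection_recovered.wait()
    return "uploaded"


try:
    call_with_deadline(stalled_upload, timeout_s=0.2)
    outcome = "finished"
except concurrent.futures.TimeoutError:
    outcome = "timed out"
finally:
    # Release the simulated connection so the worker thread can exit.
    connection_recovered.set()

print(outcome)
```

Note that a deadline like this only unblocks the caller. The worker stays blocked until the underlying call returns, which is why a transport-level liveness check is still needed to notice the dead connection itself.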
In our case it seems to hang forever, or at least for far more than 20 minutes.
We now have evidence of this happening to us as well. It's not too frequent, thankfully. But when it hits, people cancel the build well before waiting 20 minutes.
Here's a JVM dump for our case: https://gist.github.com/kastiglione/151a3b5723daef7e968610ee38c8d569
What is the reason that a timeout isn't happening for this?
Small update: this only occurs for us with writes. If we set the client to read-only, we never end up in this deadlocked state. The frontend causing the issues is nginx; when we switch to a different one that uses haproxy, we do not see any issues.
This issue should be fixed. Please check #11782 for more details. Closing.
Description of the problem / feature request:
Bazel hangs sometimes when using a remote GRPC cache. Console / BEP output shows:
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
bazel build --remote_timeout=300 --remote_download_minimal --remote_upload_results=true --remote_cache=grpcs://remote.cache:443
What operating system are you running Bazel on?
Mac 10.14.6
What's the output of bazel info release?
release 2.0.0
Have you found anything relevant by searching the web?
#5112
Any other information, logs, or outputs that you want to share?
Stack trace of the bazel server while stuck:
stack.txt