race condition during blocking query? #300
Comments
I just noticed the bug is even easier to reproduce if I set longer wait times (like 5 seconds).
This may be a dup of #279. There does seem to be some race in the index calculation.
Can you provide any scripts you use to trigger this?
Fixed by 3330956!
Hey @armon, thank you for the prompt support! Sorry for not putting it together as a test.go file, but here is a gist that reproduces the bug: https://gist.github.com/imkira/f40e64d00583160c0c9f
As initially reported, I get two bug patterns:
Let me know if you can fix this.
@imkira I am unable to reproduce the error with your test code running against master. Can you verify that you rebuilt against master? Also, a gist of the error output could be useful.
Yes, I am running the latest version of consul.
I built the app as shown above and I ran it with
As for the output, I ran the gist program 10 times and posted the results as a comment to the gist: https://gist.github.com/imkira/f40e64d00583160c0c9f
Let me know of any other way I can help you track down this bug.
Oh my oh my! My bad!
No worries! Glad it is working.
I am running a cluster but I am able to reproduce (using just a single node) what appears to be a race condition during a blocking query.
the setup
So, in that node I run a consul agent in client mode, and in the same node I run 2 processes:
http://127.0.0.1:8500/v1/health/service/myservice?consistent=&index=<last_index>&wait=1ms
First I run process1 and leave it running. I never stop it.
Then, I run process2. Then I kill it (gracefully by sending it a signal to quit). Then run it again. I repeat this process until I reproduce the bug which happens after a few runs (few < 10).
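For readers who want to picture what process1 is doing, here is a minimal sketch of one blocking-query round trip against the URL shown above. It is not the code from the gist: the function names `blockingURL`, `parseIndex`, and `pollOnce` are hypothetical, and the only Consul-specific details assumed are the endpoint from this report and the `X-Consul-Index` response header that carries the index to pass on the next call.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strconv"
)

// blockingURL builds the health query URL used in this report.
// The name and signature are hypothetical, for illustration only.
func blockingURL(base, service string, index uint64, wait string) string {
	q := url.Values{}
	q.Set("consistent", "")
	q.Set("index", strconv.FormatUint(index, 10))
	q.Set("wait", wait)
	// Encode sorts keys, giving consistent=&index=...&wait=...
	return base + "/v1/health/service/" + url.PathEscape(service) + "?" + q.Encode()
}

// parseIndex converts the X-Consul-Index header value into a number.
func parseIndex(header string) (uint64, error) {
	return strconv.ParseUint(header, 10, 64)
}

// pollOnce performs a single blocking query and returns the raw body
// together with the index to use for the next call.
func pollOnce(base, service string, index uint64, wait string) ([]byte, uint64, error) {
	resp, err := http.Get(blockingURL(base, service, index, wait))
	if err != nil {
		return nil, index, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, index, err
	}
	next, err := parseIndex(resp.Header.Get("X-Consul-Index"))
	if err != nil {
		return nil, index, err
	}
	return body, next, nil
}

func main() {
	// Only prints the first URL; no request is made here, so the
	// sketch runs without a Consul agent.
	fmt.Println(blockingURL("http://127.0.0.1:8500", "myservice", 0, "1ms"))
}
```

In a real loop, process1 would call `pollOnce` repeatedly, feeding each returned index into the next call and logging every body it receives.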
the bug
It happens on process1.
Sometimes the index doesn't change between responses (which is expected), but the response's Checks structure does change. I observe two changes in the response's Checks: first, a service check is added (with the same checkID I am using on process2, and status="unknown"); second, its status changes to "passing". I am logging all responses I get from the API, which is how I noticed this. Even after these 2 changes, there are times when response.index does not change.
Is this intended? Is this some race condition?
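The pattern described above (same index, different payload) can be checked mechanically when logging responses. The following is only a sketch of that comparison: the `observation` type and the `sameIndexDifferentBody` helper are hypothetical names, and the sample bodies in `main` are made-up placeholders, not output captured from Consul.

```go
package main

import (
	"bytes"
	"fmt"
)

// observation is a hypothetical record of one logged response:
// the index reported by the server and the raw response body.
type observation struct {
	Index uint64
	Body  []byte
}

// sameIndexDifferentBody reports whether two consecutive responses
// carry the same index while their bodies differ, which is the
// symptom described in this report.
func sameIndexDifferentBody(prev, cur observation) bool {
	return prev.Index == cur.Index && !bytes.Equal(prev.Body, cur.Body)
}

func main() {
	// Placeholder data for illustration only.
	prev := observation{Index: 10, Body: []byte(`{"Checks":[]}`)}
	cur := observation{Index: 10, Body: []byte(`{"Checks":[{"Status":"unknown"}]}`)}

	fmt.Println(sameIndexDifferentBody(prev, cur)) // true
}
```

A byte-level comparison is deliberately strict; it would also flag harmless differences such as key ordering, so a real check might decode the JSON and compare only the Checks field.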
Thank you in advance and sorry for the long problem report.
notes