consistency issues #587
Comments
Besides, we also found one consistency issue where the history is not read committed: https://github.com/jerrytesting/inconsistent-bugs-canonical-raft/tree/main/not_read_committed
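For context, one anomaly that read committed rules out is an aborted read (G1a in Adya's terminology): a successful read returns a value that was only ever written by an operation that definitely failed. The sketch below illustrates that single check. It is not the checker used in this report, and which anomaly the linked logs actually show is not stated here. The `Op` record and the `aborted_reads` function are hypothetical names for illustration; only the `ok` / `fail` / `info` outcome labels follow Jepsen's convention, where `info` means the outcome is unknown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Op:
    """Hypothetical, simplified completion record (not Jepsen's real format)."""
    type: str             # "ok", "fail", or "info" (outcome unknown)
    f: str                # "write" or "read"
    key: int
    value: Optional[int]


def aborted_reads(history: List[Op]) -> List[Op]:
    """Return successful reads that observed a value written only by failed writes."""
    # A write that completed "ok" or "info" may have been committed.
    maybe_committed = {
        (op.key, op.value)
        for op in history
        if op.f == "write" and op.type in ("ok", "info")
    }
    # A value is considered aborted only if no possibly-committed write produced it.
    aborted = {
        (op.key, op.value)
        for op in history
        if op.f == "write" and op.type == "fail"
    } - maybe_committed
    return [
        op
        for op in history
        if op.f == "read" and op.type == "ok" and (op.key, op.value) in aborted
    ]


# Made-up history for illustration only; these are not values from the report.
history = [
    Op("ok", "write", 1, 10),
    Op("fail", "write", 1, 20),
    Op("ok", "read", 1, 20),   # observes a value that was never committed
    Op("ok", "read", 1, 10),
]
violations = aborted_reads(history)
```

Treating `info` writes as possibly committed keeps the check conservative: it only flags reads whose value cannot have come from any write that might have succeeded.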
Thanks for the reports! Are you able to share any information about the Jepsen setup that produced these results? We'd love to incorporate any changes into https://github.com/canonical/jepsen.dqlite that are needed to reproduce them in our CI.
Sorry, our Jepsen setup is not open source yet, but it will be soon. For now, I can only share bug reports that I hope are detailed enough to help you figure out the issues.
@jerrytesting No problem, we appreciate your work on this. Does your setup include an "assertion checker", like this in jepsen.dqlite?
Yes, it includes the assertion checker, and we reported two assertion failures in ISSUE-291. We are still running tests to see whether we can capture other issues.
hey @jerrytesting do you have the
Also, which version of raft are you using? Based on the log messages, you are not on v0.17.1, which contains quite a few bug fixes.
What is also strange is e.g. in An example of an inconsistent read is key 55 around timestamp 2023-02-13 14:59:41, that's also a time during which node n1 is accepting requests as a single member and leader of a cluster, but during that time Is it possible you are not setting up the cluster correctly in your tests?
Hey, we are running v0.14.0. I don't think it is a setup issue, since we use your Jepsen harness. We will upload the full logs soon (some logs are very large, so they are missing).
Looking at the logs from a Jepsen test perspective, it's hard to make sense of what's happening. From looking at the log messages, the The test framework
Running the current Note: Jepsen provides extension mechanisms, e.g.
Yeah, there was a bug in our tool: the stop-partition nemesis was never invoked, and so never executed. Thanks for finding it. However, the other points you raised are not issues. Our tool only decides how to dispatch the nemesis; the actual generators still come from Jepsen. I think this consistency violation really exists but only happens in an extreme environment, for example when a network partition is not healed for a long time.
We found a consistency issue (not read uncommitted) in the latest raft version. The bug log is here: https://github.com/jerrytesting/inconsistent-bugs-canonical-raft/tree/main/not_read_uncommitted. We hope it helps you debug.