etcd test write timeout #10
Comments
The specification is correct. When the server receives a write operation, it performs the write. From the client's perspective, however, an operation can time out. When that happens, there are two possibilities that the client can't tell apart: either the request was dropped by the network before reaching the server, or the request was received and processed but the response was dropped by the network before reaching the client. After a timed-out write operation, the client performs no more operations (you can verify this in that example history: client 4 does no more operations after that point).

How do we represent timed-out write operations as part of the history? They are operations with a known invocation time but no upper bound on when the response was received (or whether the operation was received at all), which we can model with an upper bound of infinity. (If the operation wasn't received at all, we can choose its linearization point to be after all reads, at which point it doesn't matter how we order writes.) Here is where we handle all the write/CAS operations that time out, by appending their return (with unknown value) to the end of the history: https://github.com/anishathalye/porcupine/blob/master/porcupine_test.go#L285-L287
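To make the "upper bound of infinity" idea concrete, here is a minimal sketch in Go. It is not the code from porcupine_test.go: the `Event` type, its fields, and the `closeTimedOut` helper are hypothetical names invented for this illustration. It only shows the general shape of the technique, which is to give every write/CAS call that never got a response a synthetic return, with unknown outcome, at the very end of the history.

```go
package main

import "fmt"

// EventKind distinguishes an invocation from its response.
type EventKind int

const (
	Call EventKind = iota
	Return
)

// Event is a simplified, hypothetical history entry (not porcupine's own type).
type Event struct {
	Kind    EventKind
	Process int
	ID      int    // pairs a Call with its Return
	Op      string // "read", "write", or "cas"
	Unknown bool   // true when the outcome was never observed
}

// closeTimedOut returns a copy of history in which every write/cas call
// without a matching return gets a synthetic return appended at the end.
// Placing the return last models "no upper bound on the response time".
func closeTimedOut(history []Event) []Event {
	returned := make(map[int]bool)
	for _, e := range history {
		if e.Kind == Return {
			returned[e.ID] = true
		}
	}
	out := append([]Event(nil), history...)
	for _, e := range history {
		pending := e.Kind == Call && !returned[e.ID]
		if pending && (e.Op == "write" || e.Op == "cas") {
			out = append(out, Event{
				Kind: Return, Process: e.Process, ID: e.ID, Op: e.Op, Unknown: true,
			})
		}
	}
	return out
}

func main() {
	history := []Event{
		{Kind: Call, Process: 4, ID: 0, Op: "write"}, // times out, never returns
		{Kind: Call, Process: 1, ID: 1, Op: "read"},
		{Kind: Return, Process: 1, ID: 1, Op: "read"},
	}
	closed := closeTimedOut(history)
	last := closed[len(closed)-1]
	fmt.Println(len(closed), last.ID, last.Unknown) // prints: 4 0 true
}
```

Because the synthetic return comes after every other event, the timed-out write overlaps with everything that follows its invocation, so a checker is free to linearize it anywhere after the call, including after all reads.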
Great observation. I added an assertion to make sure that a process does not perform any operations after a timeout and it looks like the test dataset follows that rule.
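An assertion of that kind could look roughly like the sketch below. This is not the assertion that was actually added to the repository; the `Entry` type, its `Status` strings, and `firstOpAfterTimeout` are assumptions made up for this example. The idea is simply to remember which processes have timed out and flag any later entry from the same process.

```go
package main

import "fmt"

// Entry is a hypothetical, simplified log line: which process produced it
// and what kind of line it is ("invoke", "ok", "fail", or "timeout").
type Entry struct {
	Process int
	Status  string
}

// firstOpAfterTimeout returns the index of the first entry issued by a
// process that has already timed out, or (-1, false) if there is none.
func firstOpAfterTimeout(entries []Entry) (int, bool) {
	timedOut := make(map[int]bool)
	for i, e := range entries {
		if timedOut[e.Process] {
			return i, true
		}
		if e.Status == "timeout" {
			timedOut[e.Process] = true
		}
	}
	return -1, false
}

func main() {
	entries := []Entry{
		{Process: 4, Status: "invoke"},
		{Process: 4, Status: "timeout"},
		{Process: 1, Status: "invoke"},
		{Process: 1, Status: "ok"},
	}
	fmt.Println(firstOpAfterTimeout(entries)) // prints: -1 false

	// A process that keeps going after its timeout violates the rule.
	entries = append(entries, Entry{Process: 4, Status: "invoke"})
	fmt.Println(firstOpAfterTimeout(entries)) // prints: 4 true
}
```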
I found some counterexamples:
porcupine/test_data/jepsen/etcd_100.log Lines 20 to 31 in 7cbc834
Also in etcd_101.log and etcd_102.log.
OK, I understand now. Setting a really large return time does make sense, since it models both cases: a) the write was performed, and b) the write was not performed, and its return is too far in the future for any read to invalidate it. Thanks a lot!

Ah, another clarification: you only parse timed-out reads in place (Lines 203 to 209 in 7cbc834). Therefore, all timed-out writes and timed-out CAS operations are appended at the end (Lines 285 to 287 in 7cbc834), after all the reads, which are parsed and appended at their correct positions.
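The two cases can be checked by hand with a toy register model. The sketch below is an illustration under stated assumptions, not porcupine's etcd model: `Op`, `step`, and `validOrder` are hypothetical names, and the step function always applies a write, as the specification under discussion does. It replays candidate linearization orders and reports whether each one is consistent with the observed reads.

```go
package main

import "fmt"

// Op is a hypothetical, simplified operation on a single integer register.
type Op struct {
	Kind  string // "read" or "write"
	Value int    // value read, or value written
}

// step applies op to state. A write is always applied; a read is valid
// only if it observed the current state.
func step(state int, op Op) (bool, int) {
	switch op.Kind {
	case "read":
		return op.Value == state, state
	case "write":
		return true, op.Value
	}
	return false, state
}

// validOrder replays ops in the given order and reports whether every
// step is consistent with the model.
func validOrder(initial int, ops []Op) bool {
	state := initial
	for _, op := range ops {
		ok, next := step(state, op)
		if !ok {
			return false
		}
		state = next
	}
	return true
}

func main() {
	w1 := Op{Kind: "write", Value: 1}
	w2 := Op{Kind: "write", Value: 2} // the write that timed out
	r1 := Op{Kind: "read", Value: 1}

	// Case b: the write never reached the server. Linearizing it after all
	// reads keeps every read valid, even though the model applies it.
	late := validOrder(0, []Op{w1, r1, r1, w2})

	// Forcing the timed-out write before the reads contradicts what the
	// reads observed, so the checker must not pick this order.
	early := validOrder(0, []Op{w1, w2, r1, r1})

	fmt.Println(late, early) // prints: true false
}
```

If the reads had instead observed 2 (case a), the order with the timed-out write placed before them would be the valid one. An unbounded return time leaves both placements available to the checker.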
I think the etcd specification is incorrect:
porcupine/porcupine_test.go
Lines 151 to 152 in 7cbc834
A write can time out:
porcupine/test_data/jepsen/etcd_000.log
Line 61 in 7cbc834
In this case, the Step function should not apply the write.