Gorums cannot handle nil for non-PerNodeArg Gorums RPC calls. #35
Comments
I'm not sure I understand the issue here. I think that passing nil values should panic, as this is the developer's mistake. |
The quorum call can actually invoke gRPC calls with a nil message, and the quorum function can receive enough replies (also nil) to obtain a quorum. The reply returned by the quorum call is then nil. However, if the servers fail and no quorum is obtained, the returned reply is also nil. So in my tests these two cases return the same value (nil): one is not an error (err is nil, according to Hein), and the other is caused by server failure, so I cannot tell them apart. |
I think Hein has a solution about this issue. ;) |
TestPerNodeArg is left failing to demonstrate the issue. See #35.
A quorum call works with nil values as we circumvent the actual gRPC call:

    if arg == nil {
        // send a nil reply to the for-select-loop
        replyChan <- internalWriteResponse{node.id, nil, nil}
        return
    }

An actual gRPC call would panic with:
@meling: If you look at config_qc_test.go#L928, a request is ignored but the nil value is still added to the reply slice, i.e., we are allowing a write request that needs a quorum of at least 3 to succeed with only 2 actual responses. The call succeeds because len(replies) > 2 (the nil value is counted). See PR #36 for a potential fix. |
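The counting problem described above can be sketched as follows. This is not the change made in PR #36; it is a minimal illustration, with hypothetical names (`writeQF`, `WriteResponse`), of a quorum function that counts only non-nil replies so that an ignored request cannot contribute a vote.

```go
package main

import "fmt"

// WriteResponse is a hypothetical stand-in for the generated reply type.
type WriteResponse struct{ New bool }

// writeQF is a hypothetical quorum function. It counts only non-nil replies,
// so a nil placeholder for an ignored request does not count toward the quorum.
func writeQF(replies []*WriteResponse, quorumSize int) (*WriteResponse, bool) {
	var last *WriteResponse
	count := 0
	for _, r := range replies {
		if r == nil {
			continue // ignored request or nil reply: not a vote
		}
		count++
		last = r
	}
	if count < quorumSize {
		return nil, false
	}
	return last, true
}

func main() {
	// Two real replies plus one nil placeholder: len(replies) == 3,
	// but only 2 replies are actual votes.
	replies := []*WriteResponse{{New: true}, nil, {New: true}}
	_, ok := writeQF(replies, 3)
	fmt.Println("quorum:", ok) // quorum: false
}
```

A check based on len(replies) alone would report a quorum here, which is the behavior the comment above objects to.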
It seems my quick and dirty initial solution was a bit too simplistic; I was hoping to avoid the extra noise around the |
Added another commit to the |
The goroutine that caused the panic needs to invoke recover, or we might never learn about the panic (except if we sync with the goroutine at some point). |
My initial thoughts on this is that it seems to be a bit much hassle for handling |
Yes, that's what I'm thinking as well: we should not try to code ourselves around developer error, let it panic. However, we should recover a crashing goroutine and communicate that back to the calling quorum call, so that the panic is actually caught. I.e., the panic needs to be raised on the main thread so that the program actually panics. |
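The idea of recovering in the per-node goroutine and raising the panic again in the caller could look roughly like this. It is a sketch under assumptions: `callNode`, `quorumCall`, and `result` are hypothetical names, and a nil pointer dereference stands in for whatever the real gRPC call does with a nil message.

```go
package main

import "fmt"

// result is a hypothetical per-node reply that also carries a recovered
// panic value (nil if the goroutine did not panic).
type result struct {
	nodeID   int
	panicVal interface{}
}

// callNode stands in for a per-node RPC invocation. Dereferencing a nil
// argument panics; the deferred function recovers and reports it.
func callNode(id int, arg *string, out chan<- result) {
	defer func() {
		// recover only works in the goroutine that panicked.
		r := recover()
		out <- result{nodeID: id, panicVal: r}
	}()
	_ = len(*arg) // panics when arg is nil
}

// quorumCall collects one result per node and re-panics in the caller's
// goroutine if any node goroutine panicked.
func quorumCall(arg *string, nodes int) {
	out := make(chan result, nodes)
	for id := 0; id < nodes; id++ {
		go callNode(id, arg, out)
	}
	for i := 0; i < nodes; i++ {
		if r := <-out; r.panicVal != nil {
			panic(fmt.Sprintf("node %d: %v", r.nodeID, r.panicVal))
		}
	}
}

func main() {
	defer func() {
		r := recover()
		fmt.Println("caught in caller:", r != nil)
	}()
	quorumCall(nil, 3)
}
```

Because the panic surfaces in the calling goroutine, a test can wrap the quorum call in its own deferred recover and assert that it panicked.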
Don't you think it is a bit of overkill to complicate things by recovering individual goroutines in order to pass the panic on to the main goroutine? I'm not sure that helps debug the actual problem, especially if the panic is caused by another issue that we haven't thought about yet. I think I prefer to let it panic as it does now. Though we cannot write a test for |
For my read-write distributed storage implemented with Gorums,
if I send a nil message through either the read or the write quorum call,
the quorum call returns a nil reply with a nil error.
The reason for this issue is the callGRPC methods generated by Gorums.
For example, in the method callGRPCWrite:
This allows the write quorum call to send a nil write request and return nil as the write response, with a nil error. (The quorum function can still receive enough replies to obtain a quorum.)
So, in my quorum functions, I currently have some code to handle nil replies. Otherwise they can cause a panic, since there is no value or timestamp to read if the replies are nil.