Batching RPCs over TLS causes stuck RPCs #44

Open
xqrzd opened this issue Feb 2, 2020 · 1 comment
Labels
bug Something isn't working

Comments

Owner

xqrzd commented Feb 2, 2020

Very obscure bug. If two RPCs are sent in the same buffer to Socket.WriteAsync(), we only receive the data for the first RPC from Socket.ReceiveAsync(). Kudu does receive both RPCs: if the client disconnects while waiting for the second RPC's response, the server prints this:
W0130 19:40:45.970429 934 connection.cc:554] Connection torn down before Call kudu.master.MasterService.GetTableLocations from 10.0.2.2:51671 (request call id 3) could send its response
The second RPC can be unstuck by sending an additional RPC, which suggests that a buffer somewhere isn't being flushed in this scenario.

For now there is a hack-fix in place that prevents multiple RPCs from being sent in a single call to Socket.WriteAsync(), at the cost of some efficiency.
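
To make the trade-off concrete, here is a minimal sketch of the two write strategies. It is written in Python purely for illustration and is not the client's actual code. The names `send_rpcs`, `allow_batching`, and `RecordingWriter` are hypothetical, and the `write` callable stands in for a socket write such as Socket.WriteAsync().

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, Tuple

# Hypothetical stand-in for a socket write such as Socket.WriteAsync().
WriteFn = Callable[[bytes], Awaitable[None]]


async def send_rpcs(
    frames: Sequence[bytes], write: WriteFn, allow_batching: bool = False
) -> int:
    """Send already-serialized RPC frames; return the number of write calls made."""
    if not frames:
        return 0
    if allow_batching:
        # Efficient path: coalesce all pending frames into one buffer and
        # hand it to the socket in a single write call.
        await write(b"".join(frames))
        return 1
    # Workaround path: one write call per RPC, so two RPCs never share
    # a buffer. This costs extra write calls.
    for frame in frames:
        await write(frame)
    return len(frames)


class RecordingWriter:
    """Test double that records each buffer passed to it."""

    def __init__(self) -> None:
        self.writes: List[bytes] = []

    async def __call__(self, data: bytes) -> None:
        self.writes.append(bytes(data))


def demo() -> Tuple[List[bytes], List[bytes]]:
    frames = [b"rpc-1", b"rpc-2"]
    batched = RecordingWriter()
    unbatched = RecordingWriter()
    asyncio.run(send_rpcs(frames, batched, allow_batching=True))
    asyncio.run(send_rpcs(frames, unbatched))
    return batched.writes, unbatched.writes


if __name__ == "__main__":
    print(demo())
```

With batching allowed, both frames reach the writer as one buffer. With the workaround, each frame gets its own write call, which is the efficiency cost mentioned above.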

xqrzd added the bug and help wanted labels on Feb 2, 2020
Owner Author

xqrzd commented Feb 6, 2021

This no longer reproduces on Kudu 1.12+. My guess is that this is the commit that fixed it: apache/kudu@1a21e92

Will need to look into options for removing the hack-fix on Kudu 1.12 and newer.
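
One possible option, sketched here only as an assumption, is to keep the workaround for TLS connections to servers older than 1.12 and allow batching otherwise. The sketch is in Python for illustration. The helper names are hypothetical, and it assumes the client can obtain the server's version string, which this thread does not establish.

```python
import re
from typing import Tuple

# Assumption taken from the comment above: the problem no longer
# reproduces on Kudu 1.12 and newer.
FIXED_IN: Tuple[int, int] = (1, 12)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.12.0'."""
    match = re.match(r"\s*(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def needs_unbatched_writes(server_version: str, uses_tls: bool) -> bool:
    """Return True if each RPC should get its own write call (hypothetical policy)."""
    if not uses_tls:
        # The issue was only reported for batching over TLS.
        return False
    # Tuple comparison orders (1, 9) before (1, 12), unlike a string compare.
    return parse_major_minor(server_version) < FIXED_IN
```

Comparing parsed integer tuples rather than raw strings avoids treating "1.9" as newer than "1.12".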

xqrzd removed the help wanted label on Jul 25, 2021