Connection leak from watch timeout. #20

Closed · jmlMetaswitch opened this issue Nov 14, 2017 · 5 comments

@jmlMetaswitch

I'm using a watch with the timeout set to 10s. If we haven't received a change notification by then, I ping the downstream service anyway and then restart the watch. Unfortunately, this leaks a connection every 10s when there is no activity in etcd.
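For concreteness, here's a rough sketch of that loop. Hedged: `kv::watch` and `WatchOptions` are from rust-etcd's kv module, but the exact 0.8 signatures are from memory and may differ, and `ping_downstream` is a hypothetical stand-in for our notification call.

```rust
extern crate etcd;
extern crate tokio_core;

use std::time::Duration;

use etcd::Client;
use etcd::kv::{self, WatchOptions};
use tokio_core::reactor::Core;

// Hypothetical stand-in for notifying the downstream service.
fn ping_downstream() {}

fn main() {
    let mut core = Core::new().unwrap();
    let client = Client::new(&core.handle(), &["http://etcd:2379"], None).unwrap();

    loop {
        // Long-poll watch with a 10s timeout (field names assumed from the
        // crate's documented WatchOptions).
        let watch = kv::watch(
            &client,
            "/root",
            WatchOptions {
                index: None,
                recursive: true,
                timeout: Some(Duration::from_secs(10)),
            },
        );

        match core.run(watch) {
            Ok(_response) => { /* a key under /root changed */ }
            Err(_timeout_or_error) => ping_downstream(),
        }
        // Either way the loop restarts the watch. With no etcd activity,
        // each timed-out iteration leaves one leaked connection behind.
    }
}
```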

I assume this is due to rust-etcd's watch integration with hyper. Let me know if you would like any specific diagnostics.

@jimmycuadra (Owner)

Can you clarify what you mean by leaking a connection?

@jmlMetaswitch (Author)

Sure, sorry.

As above, I'm setting a 10s timeout on a watch and restarting it after each timeout. This issues a GET to `http://etcd:2379/v2/keys/root?wait=true` (from memory) every 10s, which receives no response, as expected. If you run `netstat -anp | grep -cE ":2379.*ESTABLISHED"` you can see that the number of ESTABLISHED connections to etcd increments every 10s. Each connection is allocated its own outbound TCP port, and at that rate it takes about a week to run out of ports to connect from.
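(For scale: even the full 16-bit port space of 65,535 ports, consumed at one connection per 10s, lasts only 65,535 × 10s ≈ 655,000s ≈ 7.6 days, which is where "about a week" comes from. With Linux's default ephemeral range of 32768–60999 it would be closer to three days.)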

My understanding is that this is because the timeout configured in https://github.com/jimmycuadra/rust-etcd/blob/master/src/kv.rs#L531 has no way of cancelling the hyper connection, and the latter never times out on its own. I've not really looked into hyper, but perhaps they don't think there is a problem: hyperium/hyper#1097.
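To illustrate the failure mode, here's a hedged sketch of the same racing pattern against hyper 0.11 directly (not rust-etcd's actual code): the timer wins the select, the request future is dropped, but before hyper 0.11.10 the pooled connection underneath it was never closed.

```rust
extern crate futures;
extern crate hyper;
extern crate tokio_core;
extern crate tokio_timer;

use std::time::Duration;

use futures::Future;
use futures::future::Either;
use hyper::Client;
use tokio_core::reactor::Core;
use tokio_timer::Timer;

fn main() {
    let mut core = Core::new().unwrap();
    let client = Client::new(&core.handle());
    let timer = Timer::default();

    // Long-poll GET that may never respond, like the etcd v2 watch.
    let uri = "http://etcd:2379/v2/keys/root?wait=true".parse().unwrap();
    let request = client.get(uri);
    let timeout = timer.sleep(Duration::from_secs(10));

    let raced = request.select2(timeout).then(|result| match result {
        // The response arrived before the timer fired.
        Ok(Either::A((response, _timer))) => Ok(Some(response)),
        // The timer fired first. The request future is dropped here, but
        // in hyper < 0.11.10 dropping it did not close the underlying
        // pooled connection, leaking one socket per timeout.
        Ok(Either::B((_, _dropped_request))) => Ok(None),
        Err(_) => Err(()),
    });

    let _ = core.run(raced);
}
```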

I'm using (from Cargo.lock) etcd 0.8.0 and hyper 0.11.6.

@mthebridge

As linked above, I think this is a hyper bug, which I've raised with them. If it isn't, then hyper needs a better explanation of how to use timeouts for long-poll GETs!

@mthebridge

The underlying hyper bug is fixed as of v0.11.10, so taking that version should fix this problem. Could we update to it? Happy to submit a PR if that would help.
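If it helps, the change should just be a version bump in Cargo.toml (hedged: I haven't checked what constraint rust-etcd currently declares for hyper):

```toml
[dependencies]
# hyper 0.11.10 includes the fix for hyperium/hyper#1097
hyper = "0.11.10"
```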

@jimmycuadra (Owner)

Should be fixed by f2574a2.
