fix retry mechanism #362
err = c.tunnel.BindSSH(ctx, sshConn, reqs, chans)
if n, ok := err.(net.Error); ok && !n.Temporary() {
	retry = false
Removing the retry flag altogether seems pretty hacky; better to get to the bottom of the issue. Maybe n.Temporary isn’t accurate…
We saw this issue when the remote host was unavailable for some reason (easy to reproduce: just block it with iptables).
When it became available again, chisel did not reconnect...
Yep, I think I’ve seen it before too. I'm just hoping to keep the retry flags and instead find the false-negative case.
@jpillora Any chance you can merge this please?
Thanks
> Removing the retry flag altogether seems pretty hacky
Can we keep the retry variable? I think the fix is just to remove this block:
if n, ok := err.(net.Error); ok && !n.Temporary() {
	retry = false
}
For more info, see golang/go#45729
@jpillora I removed only the condition you mentioned, and tried the following simple scenario:
- run chisel client -> connected to host
- blocked the host in iptables (with DROP rule)
- chisel client disconnected after a few minutes:
2022/08/14 04:43:02 client: Connection error: read tcp ...: read: connection timed out
2022/08/14 04:43:02 client: Give up
2022/08/14 04:43:02 client: tun: Unbound proxies
- removed the DROP rules from iptables, i.e. the host is accessible again
- chisel client did not retry and remained disconnected (I think this is the "Give up" message).
So it looks like the retry flag doesn't really work... I think it should be removed.
With respect, retry definitely works under certain circumstances:
Server:
$ docker run --name chisel -p 8080:8080 --rm -it jpillora/chisel server socks
2022/08/14 14:52:32 server: Fingerprint WtF6u45xORDyMD+kLAZc42H33hijiPhjou4IOp8Ssvo=
2022/08/14 14:52:32 server: Listening on http://0.0.0.0:8080
2022/08/14 14:52:45 server: session#6: Client version (1.7.7) differs from server version (v1.7.7)
-ctrl-c
$ docker run --name chisel -p 8080:8080 --rm -it jpillora/chisel server socks
2022/08/14 14:52:53 server: Fingerprint IZvG3hhx/9GqHSrfq1eI6WDgeAD1cZiJzSvP4wdWeFE=
2022/08/14 14:52:53 server: Listening on http://0.0.0.0:8080
2022/08/14 14:52:53 server: session#1: Client version (1.7.7) differs from server version (v1.7.7)
Client:
$ chisel client http://127.0.0.1:8080 socks
2022/08/14 15:52:45 client: Connecting to ws://127.0.0.1:8080
2022/08/14 15:52:45 client: tun: proxy#127.0.0.1:1080=>socks: Listening
2022/08/14 15:52:45 client: Connected (Latency 1.3523ms)
2022/08/14 15:52:50 client: Disconnected
2022/08/14 15:52:50 client: Connection error: websocket: close 1006 (abnormal closure): unexpected EOF
2022/08/14 15:52:50 client: Retrying in 100ms...
2022/08/14 15:52:50 client: Connection error: dial tcp 127.0.0.1:8080: connect: connection refused (Attempt: 1)
2022/08/14 15:52:50 client: Retrying in 200ms...
2022/08/14 15:52:50 client: Connection error: dial tcp 127.0.0.1:8080: connect: connection refused (Attempt: 2)
2022/08/14 15:52:50 client: Retrying in 400ms...
2022/08/14 15:52:50 client: Connection error: dial tcp 127.0.0.1:8080: connect: connection refused (Attempt: 3)
2022/08/14 15:52:50 client: Retrying in 800ms...
2022/08/14 15:52:51 client: Connection error: dial tcp 127.0.0.1:8080: connect: connection refused (Attempt: 4)
2022/08/14 15:52:51 client: Retrying in 1.6s...
2022/08/14 15:52:53 client: Connected (Latency 1.2117ms)
As you can see, the client disconnects when the server isn't there, retries till it's back and then reconnects.
Perhaps it responds to the iptables DROP rule by assuming it'll never be able to access the server and so gives up...?
I didn't say it never works... But if it only works in certain scenarios, it's a bug.
The DROP rule was just an example, we saw the same issue in other cases as well where the host was temporarily unavailable.
Okay thanks for looking into it
merged
Found an issue with chisel's retry mechanism.
It has a loop that should keep trying to reconnect, but in some cases it stops:
https://github.com/jpillora/chisel/blob/master/client/client_connect.go#L50
This is because some errors set retry = false in the code:
https://github.com/jpillora/chisel/blob/master/client/client_connect.go#L140
So the client gives up instead of trying to re-connect at an increasing interval.
Changes made: when running with unlimited attempts, the client never gives up and always tries to reconnect.