Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SLAVEOF command may block a long time #1170

Closed
2 tasks done
caipengbo opened this issue Dec 8, 2022 · 5 comments · Fixed by #1172
Closed
2 tasks done

SLAVEOF command may block a long time #1170

caipengbo opened this issue Dec 8, 2022 · 5 comments · Fixed by #1172
Labels
bug type bug

Comments

@caipengbo
Copy link
Contributor

caipengbo commented Dec 8, 2022

Search before asking

  • I had searched in the issues and found no similar issues.

Version

2.2.0

Minimal reproduce step

  1. echo 6 > /proc/sys/net/ipv4/tcp_syn_retries
  2. SLAVEOF a non-routable IP address(such as 10.255.255.1)
  3. SLAVEOF new_master_ip port

What did you expect to see?

Step 3 should not block for a long time because we are switching to the new master, which should be a quick operation.

What did you see instead?

Step 3 will block a long time(> 2min based on above reproduced scenario)

Anything Else?

This is because we use blocking IO when use socket connect(2). In some cases, connect(2) will be blocked for a long time.

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@caipengbo caipengbo added the bug type bug label Dec 8, 2022
@git-hulk
Copy link
Member

git-hulk commented Dec 8, 2022

@caipengbo I think the root cause is we didn't set a timeout for the connect operation, so it will backoff retransmit the syn packet(1s,2s,4s,8s,16s,32s,64s) until reaching the tcp_syn_retries(default is 6 for most Linux distro).

@caipengbo
Copy link
Contributor Author

@caipengbo I think the root cause is we didn't set a timeout for the connect operation, so it will backoff retransmit the sync packet(1s,2s,4s,8s,16s,32s,64s) until reaching the tcp_syn_retries(default is 6 for most Linux distro).

Yes, we should add a timeout.

@git-hulk
Copy link
Member

git-hulk commented Dec 8, 2022

Cool, got it.

@caipengbo
Copy link
Contributor Author

Let's set it a little longer than 3 seconds, so we can tolerate 2 times lost sync packages? @git-hulk

@git-hulk
Copy link
Member

git-hulk commented Dec 8, 2022

Yes, sounds good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug type bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants