system-health.service takes 100s to go down and delays warm shutdown causing LAGs flap #14964
vaibhavhd changed the title from "system-health.service cannot be killed with SIGTERM : causes warmreboot to fail" to "system-health.service takes 100s to go down and delays warm shutdown causing LAGs flap" on May 5, 2023
Syslog containing both good and bad case runs is attached.
This PR (#15212) will address the issue of the system-health service taking 100s to go down.
All fixes merged.
Description
Steps to reproduce the issue:
PR causing it: #4835
Describe the results you received:
Issue seen in the 202205 branch, but it may also be impacting other branches.
Good case:
system-health.service takes ~2s to shut down. IMO 2s is also too long in this time-critical path, but it is better than the bad case, which takes ~100s.
Bad case:
system-health.service takes 100s to go down and keeps the system in a waiting state. This causes warmreboot to fail: the 90s LACP session window is exceeded, LAGs flap, and traffic is dropped.
Describe the results you expected:
system-health.service should go down as fast as possible (~0.1s).
Output of show version:
Output of show techsupport:
Additional information you deem important (e.g. issue happens only occasionally):
syslog_healthd_fail.log
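For illustration only, here is a minimal Python sketch of the kind of shutdown behavior expected above: a periodic loop that exits promptly once SIGTERM arrives. This is not the actual system-health code and not the fix in #15212. The names `run_until_stopped`, `install_sigterm_handler`, `do_check`, and the 60s poll interval are all hypothetical. The idea is to wait on a `threading.Event` instead of sleeping, so a stop request interrupts the wait right away.

```python
import signal
import threading
import time


def run_until_stopped(stop_event, poll_interval_s=60.0, do_check=lambda: None):
    """Hypothetical periodic loop: run one check, then wait interruptibly.

    Returns the number of checks performed before the stop request.
    """
    iterations = 0
    while not stop_event.is_set():
        do_check()  # placeholder for the periodic work (assumption)
        iterations += 1
        # Event.wait() returns as soon as the event is set, so a stop
        # request does not have to sit out the rest of the poll interval.
        stop_event.wait(timeout=poll_interval_s)
    return iterations


def install_sigterm_handler(stop_event):
    """Make SIGTERM set the stop event (must be called from the main thread)."""
    signal.signal(signal.SIGTERM, lambda signum, frame: stop_event.set())
```

In a daemon's entry point one would create the event, call `install_sigterm_handler(event)`, and then call `run_until_stopped(event)`. With this pattern the time to stop is bounded by the duration of one `do_check` call rather than by the poll interval.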