[image_config] add rasdaemon.timer #14300
Merged
@@ -0,0 +1,10 @@
[Unit]
Description=Delays rasdaemon until SONiC has started

[Timer]
OnUnitActiveSec=0 sec
OnBootSec=1min 30 sec
Unit=rasdaemon.service

[Install]
WantedBy=multi-user.target
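To make the effect of the timer concrete, here is a minimal Python sketch that reads the unit text from the diff and works out the boot delay in seconds. It is only an illustration: `parse_time_span` is a hypothetical helper that handles a small subset of the systemd time-span syntax (milliseconds through hours), not systemd's own parser, and `TIMER_UNIT` is just the diff content pasted into a string.

```python
import configparser
import re

# The timer unit from the diff, pasted as a string for illustration.
TIMER_UNIT = """\
[Unit]
Description=Delays rasdaemon until SONiC has started

[Timer]
OnUnitActiveSec=0 sec
OnBootSec=1min 30 sec
Unit=rasdaemon.service

[Install]
WantedBy=multi-user.target
"""

# Subset of systemd time units, expressed in seconds (assumption: this
# sketch deliberately ignores days, weeks, months, years and microseconds).
_UNIT_SECONDS = {
    "": 1,  # a bare number is interpreted as seconds
    "ms": 0.001, "msec": 0.001,
    "s": 1, "sec": 1, "second": 1, "seconds": 1,
    "m": 60, "min": 60, "minute": 60, "minutes": 60,
    "h": 3600, "hr": 3600, "hour": 3600, "hours": 3600,
}

_TOKEN = re.compile(r"(\d+(?:\.\d+)?)\s*([A-Za-z]*)")


def parse_time_span(span: str) -> float:
    """Convert a span such as '1min 30 sec' to seconds (hypothetical helper)."""
    tokens = _TOKEN.findall(span)
    # Reject input that contains anything besides value/unit pairs.
    if not tokens or _TOKEN.sub("", span).strip():
        raise ValueError(f"cannot parse time span: {span!r}")
    total = 0.0
    for value, unit in tokens:
        if unit not in _UNIT_SECONDS:
            raise ValueError(f"unsupported unit: {unit!r}")
        total += float(value) * _UNIT_SECONDS[unit]
    return total


parser = configparser.ConfigParser(interpolation=None)
parser.optionxform = str  # keep systemd's case-sensitive key names
parser.read_string(TIMER_UNIT)

boot_delay = parse_time_span(parser["Timer"]["OnBootSec"])
target_unit = parser["Timer"]["Unit"]
print(f"{target_unit} is activated {boot_delay:.0f} s after boot")
```

Under these assumptions the sketch reports a 90-second delay before `rasdaemon.service` is activated, which is the window the review discussion is about.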
Is there a downside here as well? Can the device potentially lose hardware/memory errors?
I think the answer is yes. In that case, I think we should do this only where it is strictly necessary. Is there any merit in doing this for anything other than warmboot?
For the other cases (cold/fast/load-mg/config-reload): memory/hardware errors are more likely to be hit during bootup and may indicate bad hardware. We want such errors to be logged while the system is booting up.
@vaibhavhd
load-mg/config-reload: rasdaemon is not touched.
warm/fast: we must also delay it for fast reboot. It is the same kexec operation, and the same operation from a CPU/memory perspective.
Please note that we don't have a way to apply the delay per boot type; e.g. pmon, snmp, and lldp are delayed regardless of the boot type.
rasdaemon was added as a replacement for mcelog. MCE exceptions are recorded into a kernel ring buffer, so they aren't lost if the daemon reads the buffer later (or at least this is how mcelog worked). rasdaemon states that it reads not just /dev/mce but several sources: EDAC, MCE, PCI, ... I don't know how to test whether events are lost, because I don't know of a way to generate these exceptions. As I understand it, at least MCE exceptions should not be missed.
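The buffering argument in the reply can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel's implementation: `BoundedEventLog`, its capacity, and the policy of dropping new events when the buffer is full are all hypothetical. It only shows the general point that a reader starting late still sees every buffered event, provided fewer events arrive during the delay than the buffer can hold.

```python
from collections import deque


class BoundedEventLog:
    """Toy fixed-size event buffer (hypothetical; not the kernel's code)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._events = deque()
        self.dropped = 0

    def record(self, event: str) -> None:
        # Assumption: once the buffer is full, new events are dropped.
        if len(self._events) >= self.capacity:
            self.dropped += 1
            return
        self._events.append(event)

    def drain(self) -> list:
        """Return all buffered events and empty the buffer (the late reader)."""
        events = list(self._events)
        self._events.clear()
        return events


# Case 1: a few events occur before the delayed reader starts.
log = BoundedEventLog(capacity=4)
for i in range(3):
    log.record(f"event-{i}")
late_read = log.drain()
print(late_read, "dropped:", log.dropped)

# Case 2: more events than the buffer holds occur during the delay.
busy_log = BoundedEventLog(capacity=4)
for i in range(6):
    busy_log.record(f"event-{i}")
busy_read = busy_log.drain()
print(busy_read, "dropped:", busy_log.dropped)
```

In the first case nothing is lost despite the late start; in the second, two events are dropped. Whether the sources rasdaemon reads behave like this, and how large their buffers are, is exactly what the thread leaves open.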