@marc-hb sof-kernel-log-check: narrower ignore_str for ICL reboot issue #372
So we can start ignoring known issues on a per platform basis. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
This PR does not add any shellcheck warning in https://travis-ci.org/github/thesofproject/sof-test/builds/725765539, they were all already there. I didn't need to stress test ICL to test this change, #361 was enough. Please help review #361 too (merged, thx @xiulipan )
@marc-hb I will close this PR. The issue can be seen on any device, but the reproduction rate on ICL is very high. So we should ignore the info for all platforms.
You merged #365 before all discussions were finished and now you're closing this one before any discussion has started.
I can't see how the second sentence relates to the first one. ("very high" doesn't mean much without some numbers, thesofproject/sof#3395 has some)
#365 is only a workaround. I did not think more discussion was needed for that. I think we have a plan to use the log level to filter warning and error messages and avoid keyword detection. As the panic log is not platform-specific, I think we should ignore this info message for all platforms.
I've never seen it outside ICL across many PRs and the description of thesofproject/sof#3395 is very specifically about ICL. If/when it happens elsewhere then we want to know exactly when and how and not ignore failures "just in case they might happen". This code makes it trivial to add other platforms later, it's designed for that.
@marc-hb It also happened to CML_HEL_RT5682, check the failure on CML_HEL_RT5682 in https://sof-ci.sh.intel.com/#/result/planresultdetail/319
de61027 to a0bf35c
Thanks @aiChaoSONG, I just added I also updated thesofproject/sof#3395
@marc-hb As the panic info may show on any platform, why do we need to narrow the ignore list? I do not want to add a new platform name here once we see it on another platform.
People interested in finding and fixing issues are interested in when, where, how and how often they happened. They don't mind adding 4 characters in one file for that sort of information, that's a small price to pay.
Anyway this is not even the main point of this PR. The main purpose here is not to miss OTHER issues unrelated to thesofproject/sof#3395
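To illustrate the point about not missing other issues, here is a minimal sketch of a per-platform ignore list. It is not the actual tools/sof-kernel-log-check.sh: the helper names `build_ignore_str` and `is_ignored`, the placeholder pattern, and the platform name `other_platform` are all hypothetical, and the sample log lines are made up for the example.

```shell
#!/bin/sh
# Hypothetical sketch, not the real tools/sof-kernel-log-check.sh.

# build_ignore_str PLATFORM: print an extended regex of log lines to ignore.
build_ignore_str() {
    # Placeholder standing in for whatever is ignored on every platform.
    ignore_str='placeholder pattern ignored everywhere'
    case "$1" in
    icl|cml)
        # Only the all-zero status/panic line is ignored, and only here.
        ignore_str="$ignore_str"'|sof-audio-pci 0000:00:1f\.3: status = 0x[0]{8} panic = 0x[0]{8}'
        ;;
    esac
    printf '%s\n' "$ignore_str"
}

# is_ignored PLATFORM LINE: succeed when LINE would be filtered out.
is_ignored() {
    printf '%s\n' "$2" | grep -E -q -- "$(build_ignore_str "$1")"
}

# Made-up sample lines for the demonstration.
known='sof-audio-pci 0000:00:1f.3: status = 0x00000000 panic = 0x00000000'
other='sof-audio-pci 0000:00:1f.3: status = 0x00000006 panic = 0x0000dead'

is_ignored icl "$known" && echo "icl: all-zero line ignored"
is_ignored icl "$other" || echo "icl: non-zero panic still reported"
is_ignored other_platform "$known" || echo "other_platform: still reported"
```

With a narrow pattern like this, a different panic code or a platform outside the list still fails the check, and supporting another platform means extending the `case` label.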
looks good. the commit title only talks of ICL when the patch is for ICL and CML. only minor though
tools/sof-kernel-log-check.sh
Outdated
ignore_str="$ignore_str"'|sof-audio-pci 0000:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]: status = 0x[0-9a-f]{8} panic = 0x[0-9a-f]{8}'
case "$platform" in
icl|cml)
ignore_str="$ignore_str"'|sof-audio-pci 0000:00:1f.3: status = 0x[0]{8} panic = 0x[0]{8}'
don't you need a \ before the .?
It won't make any difference in this very particular case because the kernel will always print the exact same character in this position. However it is a good reminder for other cases when matching an actual dot matters, so I'm adding it.
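For reference, a small sketch of what the backslash changes in an extended regex as used by `grep -E`. The `matches` helper is hypothetical and only exists for this demonstration, and the `1fX3` input is an invented line that the kernel would not print.

```shell
#!/bin/sh
# Hypothetical helper, not part of sof-kernel-log-check.sh:
# succeeds when the line ($2) matches the extended regex ($1).
matches() {
    printf '%s\n' "$2" | grep -E -q -- "$1"
}

unescaped='0000:00:1f.3'
escaped='0000:00:1f\.3'

# Both patterns match the address as it appears in the log.
matches "$unescaped" 'sof-audio-pci 0000:00:1f.3: status' && echo "unescaped: match"
matches "$escaped" 'sof-audio-pci 0000:00:1f.3: status' && echo "escaped: match"

# Only the unescaped dot also accepts any other character in that position.
matches "$unescaped" 'sof-audio-pci 0000:00:1fX3: status' && echo "unescaped: also matches 1fX3"
matches "$escaped" 'sof-audio-pci 0000:00:1fX3: status' || echo "escaped: rejects 1fX3"
```

An unescaped `.` matches any single character, so the escaped form is the stricter of the two even though both behave the same on the real log line.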
Fixes ac415de ("tools: ignore a false error message with 'panic'") which was ignoring way too many errors. See previous discussion in corresponding PR and long source code comment. Generally speaking, ignoring errors ("green failures") is extremely dangerous and should be as narrow as possible. More specifically here, not even knowing which platforms experience the issue and what code they print is really not going to help fix it. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
a0bf35c to 952b8cd
Merge this for |
2 commits. The main one:
Fixes ac415de ("tools: ignore a false error message with 'panic'")
which was ignoring way too many errors. See previous PR and long source
code comment.
Generally speaking, ignoring errors ("green failures") is extremely
dangerous and should be as narrow as possible.
More specifically here, not even knowing which platforms experience the
issue and what code they print is really not going to help fix it.