Check for a valid trace for ALL tests #297
Comments
First part submitted in #373, please help review.
I think what you're asking is: "is the logger a low priority feature"? I read the description of this request as a pretty clear "no" but I'd like to hear more from others.
It's risky to rely on a single test: if the test is broken then the feature can be broken and no one will notice. Even if the test is not broken, the logger could be broken only in some scenarios not covered by the logger test. I think the links above are real-world examples of either.

A test is never meant to test only one thing: the more extras each test can check (in no extra time), the better. Unexpected failures mean extra bugs are found, which is great.

These questions sound strange to me at a time when the logger seems to be working fine: all tests in #373 pass right now. It sounds very strange to worry about failures we don't even have right now (this reminds me of the recent #372). It almost sounds like a desire to find FEWER bugs and have more "green failures"?
@marc-hb What I want to clarify is the criterion for our test cases to pass or fail.
I do not think we add test cases this way. Each test case in this repo focuses on a single feature, which helps us narrow down the issue.
This is :
|
They didn't pass, they only appear to pass because I misused
Now we have one: pretty much all tests involving the logger fail to kill -INT it at the end, see new bug https://github.com/thesofproject/sof/issues/3433
multipipeline tests returning an empty trace: #380 (comment)
Does not completely fix thesofproject#297 but goes a long way. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
@aiChaoSONG does this seem related to your #359 PR?
... because it makes no sense. When a test using the logger fails it produces a confusing error message like this:

    2020-09-18 10:17:25 UTC [REMOTE_INFO] Starting /usr/bin/sof-logger -l /etc/sof/sof-byt.ldc -o /home/ubuntu/sof-test/logs/check-alsabat/<date>/etrace.txt
    error: in logger_read(), fread(..., /sys/kernel/debug/sof/etrace) failed: Inappropriate ioctl for device(25)

Reported by Pierre in #384

This mistake seems to have been there since the dawn of time however no one noticed because most things logger-related (and others) have been silenced so far - which is changing now with thesofproject#297.

The problem this fixes can be reproduced trivially with this one-line patch:

    --- a/case-lib/lib.sh
    +++ b/case-lib/lib.sh
    @@ -125,6 +125,7 @@ func_lib_start_log_collect()
         dlogi "Starting $loggerCmd"
         # Cleaned up by func_exit_handler() in hijack.sh
         sudo "$loggerCmd" &
    +    exit 1
     }

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
Probably the main change is fixing the huge etrace test gaps thesofproject#321 and thesofproject/sof#3281.

Also fixes DMA trace gaps thesofproject#297 and thesofproject#298.

I initially tried to preserve some of the existing code but it was just too bad. PR thesofproject#161 / commit 7274f49 seemed especially bad:

- It tried to ignore a specific `ll drift` error but instead it filtered almost every log statement out of... stderr, which does not show log statements!! (Just for the record, this `ll drift` error has been downgraded to a warning now, see thesofproject/sof#2686 and thesofproject/sof#3854)
- That same commit also added code that merely starts the DMA trace with "there is an error below" (without failing the test) but that's eclipsed by the entire log that follows. Later, the firmware started printing ERROR every single time when the ERROR FW ABI prefix was introduced, yet no one ever noticed, which proves how useless this prefix was.

So remove this DMA trace prefix as the purpose of this test is - as clearly stated in thesofproject#167 - not to find firmware errors but errors with the sof-logger itself (even though we never had anything looking at firmware errors so far).

Don't grep for "error" on stderr: anything on stderr is a logger failure (not a firmware failure).

Don't require whitespace before the TIMESTAMP header.

Add set -e. Use shell functions.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
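The two rules in that commit message (anything on stderr is a logger failure, and the TIMESTAMP header need not be preceded by whitespace) could be sketched roughly as below. This is only an illustration, not the code from the commit: the function names `stderr_is_clean` and `has_timestamp_header` are hypothetical, and so is the assumption that the logger's stdout and stderr were saved to separate files.

```shell
#!/bin/bash
# Hypothetical helpers, not taken from sof-test. They assume the logger's
# stdout and stderr were captured into two separate files.

# Anything at all on the logger's stderr counts as a logger failure,
# so there is no grep for "error": a non-empty file is enough.
# Note: a missing file is treated like an empty one in this sketch.
stderr_is_clean()
{
    local stderr_file="$1"
    if [ -s "$stderr_file" ]; then
        printf 'unexpected logger stderr output:\n' >&2
        cat "$stderr_file" >&2
        return 1
    fi
}

# Accept the TIMESTAMP header with or without leading whitespace.
has_timestamp_header()
{
    local stdout_file="$1"
    grep -q -E '^[[:space:]]*TIMESTAMP' "$stdout_file"
}
```

In a script running under `set -e`, calling either helper as a plain command aborts the test as soon as the check fails.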
After finally testing this a bit (#666) and performing some overdue "data-mining" in test results it's very clear the DMA trace has been very unreliable since forever: |
If sof-logger is started (or restarted) while DSP is running, the initial traces may come in incorrect order or are incomplete. This is important to note when parsing the logger results. BugLink: thesofproject/sof-test#297 Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
If sof-logger is started (or restarted) while DSP is running, the initial traces may be incomplete. Document the limitation and give a brief explanation of the current ringbuffer design and how it affects the start-up behaviour. BugLink: thesofproject/sof-test#297 BugLink: thesofproject/linux#3275 Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Lowering priority to P2 as SOF is going to use the native Zephyr log implementation. @marc-hb @plbossart
I don't understand, Zephyr bugs can be solved without any logs?
... should finish this but it's currently blocking, triggering some failures.
Blocked by mtrace enabling atm.
This has finally been fixed by @kv2019i: And it does catch mtrace issues like thesofproject/linux#4618 and others. The test works, closing.
We need to double-check that the trace is functional and that sof-logger reports at least the firmware details and a dai trigger.
If there is no trace then there's no point in checking results and debugging further.
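A minimal sketch of such a check, assuming the trace has already been written to a text file. The helper name `trace_is_valid` is hypothetical, and the patterns to look for are passed in by the caller because the exact wording sof-logger uses for the firmware details and the DAI trigger is not given here.

```shell
#!/bin/bash
# Hypothetical helper, not part of sof-test: fail early when a collected
# trace is missing, empty, or lacks the content the caller expects.
trace_is_valid()
{
    local trace_file="$1"; shift
    local pattern

    # A missing or empty trace means there is nothing to debug with.
    [ -s "$trace_file" ] || {
        printf 'empty or missing trace: %s\n' "$trace_file" >&2
        return 1
    }

    # Every pattern given by the caller must appear at least once.
    for pattern in "$@"; do
        grep -q -E -- "$pattern" "$trace_file" || {
            printf 'pattern not found in %s: %s\n' "$trace_file" "$pattern" >&2
            return 1
        }
    done
}
```

A test could then call something like `trace_is_valid "$trace" 'FW ABI' 'trigger'` before looking at any other result; both patterns are placeholders to be replaced with what the logger actually prints.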