Check for a valid trace for ALL tests #297

plbossart · 2020-07-28T15:32:29Z

We need to double-check that the trace is functional and that sof-logger reports at least the firmware details and a dai trigger.

If there is no trace then there's no point in checking results and debugging further.

cc:

many others, follow the links.

marc-hb · 2020-09-11T00:00:06Z

First part submitted in #373, please help review.

marc-hb · 2020-09-15T05:19:50Z

From @xiulipan in #373

Is that behavior right? Do we need to fail the case if we could not get the logger? As we have standalone test case for logger.

marc-hb · 2020-09-15T05:33:52Z

Do we need to fail the case if we could not get the logger?

I think what you're asking is: "is the logger a low priority feature"? I read the description of this request as a pretty clear "no" but I'd like to hear more from others.

As we have standalone test case for logger.

It's risky to rely on a single test: if the test is broken then the feature can be broken and no one will notice. Even if the test is not broken, the logger could be broken only in some scenarios not covered by the logger test. I think the links above are real-world examples of both.

A test is never meant to test only one thing: the more extras each test can test (in no extra time), the better. Unexpected failures mean extra bugs are found, which is great.

These questions sound strange to me at a time when the logger seems to be working fine, I mean all tests in #373 pass right now. I mean it sounds very strange to worry about failures we don't even have right now (this reminds me of recent #372). It almost sounds like a desire to find FEWER bugs and have more "green failures"?

xiulipan · 2020-09-15T09:05:32Z

@marc-hb What I want to clarify is the criterion for our test case pass/fail.
Should we make aplay test fail if a debug feature is not working?

A test is never meant to test only one thing: the more extras each test can test (in no extra time), the better.

I do not think we are adding test cases in this way. Each test case in this repo focuses on a single feature. This helps us narrow down the issue.

marc-hb · 2020-09-15T14:19:36Z

I do not think we are adding test cases in this way. Each test case in this repo focuses on a single feature. This helps us narrow down the issue.

This is:

neither possible, because sof-test tests are not unit tests: they all run from user space, so each test involves many lines of code;
nor desirable, unless you want to find fewer bugs, as happened in the links above and elsewhere.

marc-hb · 2020-09-18T02:51:23Z

I mean all tests in #373 pass right now.

They didn't pass: they only appeared to pass because I misused exit_failure=2 in #373, where 2 means "Not Applicable" and Not Applicable is green. I didn't know 2 was a special value.
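The pitfall can be sketched roughly as follows. Only the meaning of 2 ("Not Applicable", shown green) comes from this thread. The meanings of 0 and 1 follow the usual shell convention, and the function name `describe_exit_status` is hypothetical, not part of sof-test.

```shell
#!/bin/sh
# Hypothetical helper: translate a test's exit status into the label a
# results page would show. Only "2 = Not Applicable" is taken from the
# discussion above; 0 and 1 follow the usual shell convention.
describe_exit_status()
{
    case "$1" in
        0) echo "PASS" ;;
        2) echo "N/A" ;;   # shown green, so a real failure stays hidden
        *) echo "FAIL" ;;
    esac
}

# A logger check that exits with 2 on error is reported as "N/A";
# exiting with 1 makes the same problem visible as "FAIL".
```

Under these assumptions, `describe_exit_status 2` prints `N/A` while `describe_exit_status 1` prints `FAIL`, which is why a failure path that returns 2 looks like a pass.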

I mean it sounds very strange to worry about [logger] failures we don't even have right now

Now we have one: pretty much all tests involving the logger fail to kill -INT it at the end, see new bug https://github.com/thesofproject/sof/issues/3433 [BUG] sof-logger resisted pkill -INT, using -KILL
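For readers unfamiliar with the "-INT first, -KILL as a fallback" pattern named in that bug title, here is a minimal sketch. It is not the sof-test implementation: the function name `stop_gracefully`, the one-second polling and the default timeout are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not the sof-test code: ask process $1 to stop
# with SIGINT, poll for up to $2 seconds (default 3), then fall back to
# SIGKILL. Prints "stopped" or "killed" and returns 1 when SIGKILL was
# needed, so that the caller can report it as a bug.
# Caveat: kill -0 also succeeds for an exited child that has not been
# reaped yet, and fails for processes the caller may not signal.
stop_gracefully()
{
    pid=$1
    remaining=${2:-3}
    kill -INT "$pid" 2>/dev/null || true
    while [ "$remaining" -gt 0 ]; do
        # kill -0 sends no signal, it only tests that the process exists
        if ! kill -0 "$pid" 2>/dev/null; then
            echo "stopped"
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -KILL "$pid" 2>/dev/null || true
        echo "killed"
        return 1
    fi
    echo "stopped"
}
```

Returning a non-zero status when SIGKILL was needed keeps the escalation visible instead of silently hiding a process that ignores SIGINT.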

marc-hb · 2020-09-18T03:29:09Z

multipipeline tests returning an empty trace: #380 (comment)

Does not completely fixes thesofproject#297 but goes a long way. Signed-off-by: Marc Herbert <marc.herbert@intel.com>

marc-hb · 2020-09-18T05:31:53Z

multiple-pipeline-capture.sh and multiple-pipeline-playback.sh have a consistently empty trace on about half the platforms: https://sof-ci.01.org/softestpr/PR383/build194/devicetest/

2020-09-18 05:13:41 UTC [REMOTE_INFO] nlines=1 /home/ubuntu/sof-test/logs/multiple-pipeline-playback/2020-09-18-05:13:29-16938/slogger.txt
2020-09-18 05:13:41 UTC [REMOTE_ERROR] Empty logger trace
2020-09-18 05:13:41 UTC [REMOTE_INFO] Test Result: FAIL!
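A check producing output like the excerpt above could look roughly like this. It is a sketch, not the actual hijack.sh code: the function name `check_trace_not_empty` and the default threshold of two lines are assumptions, the latter inferred only from the `nlines=1` case being reported as empty.

```shell
#!/bin/sh
# Hypothetical helper, not the actual hijack.sh code: fail when a logger
# output file has too few lines to contain any trace.
# $1: path to the logger output
# $2: minimum number of lines (assumed default: 2)
check_trace_not_empty()
{
    trace_file=$1
    min_lines=${2:-2}
    if [ ! -f "$trace_file" ]; then
        echo "Missing logger trace: $trace_file" >&2
        return 1
    fi
    # Some wc implementations pad the count with spaces; strip them
    nlines=$(wc -l < "$trace_file" | tr -d '[:space:]')
    echo "nlines=$nlines $trace_file"
    if [ "$nlines" -lt "$min_lines" ]; then
        echo "Empty logger trace" >&2
        return 1
    fi
}
```

Counting lines is deliberately crude: it says nothing about whether the trace content is valid, only that the logger wrote something.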

@aiChaoSONG does this seem related to your #359 PR?

aiChaoSONG · 2020-09-18T05:40:23Z

@marc-hb Yes, they are related. Without #359, these two cases are actually testing nothing and always pass. It is the sof-logger issue that makes multiple-pipeline-playback/capture fail after your #373 patch.

... because it makes no sense.

When a test using the logger fails it produces a confusing error message like this:

2020-09-18 10:17:25 UTC [REMOTE_INFO] Starting /usr/bin/sof-logger -l /etc/sof/sof-byt.ldc -o /home/ubuntu/sof-test/logs/check-alsabat/<date>/etrace.txt
error: in logger_read(), fread(..., /sys/kernel/debug/sof/etrace) failed: Inappropriate ioctl for device(25)

Reported by Pierre in #384

This mistake seems to have been there since the dawn of time however no one noticed because most things logger-related (and others) have been silenced so far - which is changing now with thesofproject#297.

The problem this fixes can be reproduced trivially with this one-line patch:

--- a/case-lib/lib.sh
+++ b/case-lib/lib.sh
@@ -125,6 +125,7 @@ func_lib_start_log_collect()
     dlogi "Starting $loggerCmd"
     # Cleaned up by func_exit_handler() in hijack.sh
     sudo "$loggerCmd" &
+    exit 1
 }

Signed-off-by: Marc Herbert <marc.herbert@intel.com>

Probably the main change is fixing the huge etrace test gaps thesofproject#321 and thesofproject/sof#3281

Also fixes DMA trace gaps thesofproject#297 and thesofproject#298

I initially tried to preserve some of the existing code but it was just too bad. PR thesofproject#161 / commit 7274f49 seemed especially bad:

- It tried to ignore a specific `ll drift` error but instead it filtered out almost every log statement out of... stderr, which does not show log statements!! (Just for the record this `ll drift` error has been downgraded to warning now, see thesofproject/sof#2686 and thesofproject/sof#3854)

- That same commit also added code that merely starts the DMA trace with "there is an error below" (without failing the test) but that's eclipsed by the entire log that follows. Later, the firmware started printing ERROR every single time when the ERROR FW ABI prefix was introduced yet no one ever noticed, which proves how useless this prefix was.

So remove this DMA trace prefix as the purpose of this test is - as clearly stated in thesofproject#167 - not to find firmware errors but errors with the sof-logger itself (even though we never had anything looking at firmware errors so far)

Don't grep for "error" on stderr: anything on stderr is a logger failure (not a firmware failure).

Don't require whitespace before the TIMESTAMP header.

Add set -e.

Use shell functions.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
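The "anything on stderr is a logger failure" rule from this commit message can be illustrated with a short sketch. This is not the code of that commit: the function name `run_failing_on_stderr` and its arguments are hypothetical, and the command passed to it is whatever the caller wants to check.

```shell
#!/bin/sh
# Hypothetical helper, not the code from the commit above: run a
# command, save its stdout to file $1, and treat ANY output on stderr
# as a failure instead of grepping stderr for keywords like "error".
run_failing_on_stderr()
{
    out_file=$1
    shift
    err_file=$(mktemp) || return 1
    status=0
    "$@" > "$out_file" 2> "$err_file" || status=$?
    # -s is true when the file exists and is not empty
    if [ -s "$err_file" ]; then
        echo "Unexpected output on stderr:" >&2
        cat "$err_file" >&2
        status=1
    fi
    rm -f "$err_file"
    return "$status"
}
```

A call such as `run_failing_on_stderr trace.txt some-command` fails either when the command exits with a non-zero status or when it prints anything at all to stderr, so no keyword matching is involved.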

marc-hb · 2021-06-12T04:02:28Z

After finally testing this a bit (#666) and performing some overdue "data-mining" in test results, it's very clear the DMA trace has been very unreliable since forever, see "Empty DMA trace" thesofproject/sof/issues/4333. It is much more unreliable on some platforms than on others.

If sof-logger is started (or restarted) while DSP is running, the initial traces may come in incorrect order or are incomplete. This is important to note when parsing the logger results. BugLink: thesofproject/sof-test#297 Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>

If sof-logger is started (or restarted) while DSP is running, the initial traces may be incomplete. Document the limitation and give a brief explanation of the current ringbuffer design and how it affects the start-up behaviour. BugLink: thesofproject/sof-test#297 BugLink: thesofproject/linux#3275 Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>

marc-hb · 2022-01-21T17:57:42Z

More DMA unreliability:

mengdonglin · 2022-03-20T08:07:47Z

Lowering priority to P2 as SOF is going to use the native Zephyr log implementation. @marc-hb @plbossart

marc-hb · 2022-03-29T21:52:42Z

I don't understand: can Zephyr bugs be solved without any logs?

marc-hb · 2022-07-26T00:02:05Z

Enable cavstool #897

... should finish this but it's currently blocking, triggering some failures.

mengdonglin · 2022-09-21T04:32:29Z

Blocked by mtrace enabling atm.

Enable mtrace #956

marc-hb · 2023-09-28T21:34:22Z

This has been finally fixed by @kv2019i:

check-sof-logger: fix the DSP boot check #1095

And it does catch mtrace issues like thesofproject/linux#4618 and others.

The test works, closing.

plbossart mentioned this issue Jul 30, 2020

Missing FW logs on CI failures #294

Closed

mengdonglin added the P1 (Blocker bugs or important features) and type:enhancement (New framework feature or request) labels Jul 31, 2020

mengdonglin mentioned this issue Jul 31, 2020

check-sof-logger need to report fail is no valid log shows #298

Closed

mengdonglin assigned marc-hb Jul 31, 2020

marc-hb mentioned this issue Jul 31, 2020

[BUG] error trace debugfs entry (etrace) is always empty thesofproject/sof#3265

Closed

This was referenced Aug 13, 2020

SoC: intel: sof_sdw: Add support for product Ripto thesofproject/linux#2357

Merged

Up2-nocodec: sof-logger / ldc failure in capture/playback test thesofproject/linux#2361

Closed

marc-hb mentioned this issue Sep 10, 2020

Stop ignoring various logger failures #373

Merged

marc-hb mentioned this issue Sep 18, 2020

test-case: add missing pipeline count acquisition logic #380

Merged

marc-hb added a commit to marc-hb/sof-test that referenced this issue Sep 18, 2020

hijack.sh: fail if the logger trace is empty

b43659e

Does not completely fixes thesofproject#297 but goes a long way. Signed-off-by: Marc Herbert <marc.herbert@intel.com>

marc-hb added a commit to marc-hb/sof-test that referenced this issue Sep 18, 2020

hijack.sh: fail if the logger trace is empty

88adae1

Does not completely fixes thesofproject#297 but goes a long way. Signed-off-by: Marc Herbert <marc.herbert@intel.com>

marc-hb mentioned this issue Sep 18, 2020

hijack.sh: fail if the logger trace is empty #383

Closed

marc-hb added a commit to marc-hb/sof-test that referenced this issue Sep 18, 2020

hijack.sh: fail if the logger trace is empty

c3982e1

Does not completely fixes thesofproject#297 but goes a long way. Signed-off-by: Marc Herbert <marc.herbert@intel.com>

marc-hb mentioned this issue Sep 19, 2020

hijack.sh: don't start a second logger in the exit handler #386

Closed

marc-hb mentioned this issue Nov 10, 2020

Failure to start logger does not fail immediately, only at the end of the test #506

Closed

marc-hb mentioned this issue Mar 13, 2021

check-sof-logger: fail on empty logs #629

Closed

marc-hb mentioned this issue Apr 23, 2021

hijack.sh: fail if the logger trace is empty #662

Merged

marc-hb closed this as completed in #662 Apr 23, 2021

marc-hb added a commit that referenced this issue Apr 23, 2021

hijack.sh: fail if the logger trace is empty

a207ff7

Does not completely fixes #297 but goes a long way. Signed-off-by: Marc Herbert <marc.herbert@intel.com>

marc-hb mentioned this issue May 26, 2021

[BUG]ERROR ewt in etrace thesofproject/sof#4248

Closed

marc-hb added the area:logs (Log and results collection, storage, etc.) and False Pass / green failure labels Jul 3, 2021

This was referenced Jul 6, 2021

Re-enable DMA trace initialization in Zephyr thesofproject/sof#4452

Closed

[TEST] disable DMA trace to see if underflows happen thesofproject/linux#3021

Closed

marc-hb mentioned this issue Nov 9, 2021

[BUG] Irrelevant tests should be completely skipped #804

Closed

kv2019i mentioned this issue Nov 10, 2021

debugability: logger: add note on DMA trace limitations thesofproject/sof-docs#381

Merged

kv2019i mentioned this issue Nov 16, 2021

[FEATURE] Ability to start DMA-trace while DSP is running thesofproject/linux#3275

Open

mengdonglin added the P2 (Critical bugs or normal features) label and removed the P1 (Blocker bugs or important features) label Mar 20, 2022

marc-hb added the P1 (Blocker bugs or important features) label and removed the P2 (Critical bugs or normal features) label Mar 31, 2022

mengdonglin added the state:blocked label Sep 21, 2022

marc-hb mentioned this issue Jun 30, 2023

Catch firmware errors #1075

Merged

marc-hb closed this as completed Sep 28, 2023

marc-hb mentioned this issue Dec 14, 2023

debug_overlay.conf: temporarily disable SOF_BOOT_TEST thesofproject/sof#8624

Merged

marc-hb mentioned this issue Apr 17, 2024

Catch firmware errors reported in logs #1173

Closed
