Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SoundWire clock stop support #1586

Conversation

plbossart
Copy link
Member

This is a proposal to allow the clock stop behavior to be modifiable as needed, depending on the DSP driver requirements

@RanderWang can you try and add support for the clock stop mode with the bus reset on restart? We can enable the other modes when we have D0i3 support, or figure out a means to keep the parent active when the clock stops.

Comments welcome.

@bardliao
Copy link
Collaborator

bardliao commented Dec 4, 2019

@plbossart Can we use different clock_stop_quirk for each link?

@plbossart
Copy link
Member Author

@plbossart Can we use different clock_stop_quirk for each link?

I thought about it but I am not sure if it's useful. The power dependencies are really between the DSP and the SoundWire IPs, if you want to enable fast restart on one link then the DSP needs to remain active (D0/D0i3), so you might as well enable fast restart on all links. Conversely if you do want to turn the DSP off for power consumption reasons, then all links needs to restart with a bus reset.

The only thing that might be useful is e.g. if there are latency requirements for specific links, but the platform driver does not know this, it's handled at the machine driver level which is the only entity that might know about such requirements.

we can change this later if needed, the code in intel.c does not presume anything about other links, it'd just be the interface between SOF/hda and intel_init that would change. I think it's already be very good if we have several knobs to play with.

@RanderWang
Copy link

This is a proposal to allow the clock stop behavior to be modifiable as needed, depending on the DSP driver requirements

@RanderWang can you try and add support for the clock stop mode with the bus reset on restart? We can enable the other modes when we have D0i3 support, or figure out a means to keep the parent active when the clock stops.

I also did some tests. I tried to add a flag PCI_DEV_FLAGS_NO_D3 to make audio pci device in D0 when it is inactive, but master was still powered off. Later I found master will be powered off if dsp core is reset and powered down. I don't why, maybe it is a hw design.

I will finish the clock stop mode tomorrow for I am blocked by other issues.

@plbossart
Copy link
Member Author

This is a proposal to allow the clock stop behavior to be modifiable as needed, depending on the DSP driver requirements
@RanderWang can you try and add support for the clock stop mode with the bus reset on restart? We can enable the other modes when we have D0i3 support, or figure out a means to keep the parent active when the clock stops.

I also did some tests. I tried to add a flag PCI_DEV_FLAGS_NO_D3 to make audio pci device in D0 when it is inactive, but master was still powered off. Later I found master will be powered off if dsp core is reset and powered down. I don't why, maybe it is a hw design.

I will finish the clock stop mode tomorrow for I am blocked by other issues.

Thanks @RanderWang. I'd like to provide an update for the SoundWire interfaces later this week or early next week, so if you can check the definitions that would be nice.

@plbossart plbossart force-pushed the integration/soundwire-latest branch from f6e2dba to add887e Compare December 4, 2019 23:22
@RanderWang
Copy link

@plbossart I updated a prototype plbossart#7. Thanks!

@plbossart plbossart force-pushed the integration/soundwire-latest branch 2 times, most recently from d03ccc7 to d64c4a3 Compare December 5, 2019 17:38
@plbossart plbossart force-pushed the fix/clock-stop-definitions branch from b95b6d3 to 5aef8fe Compare December 5, 2019 22:02
@plbossart
Copy link
Member Author

plbossart commented Dec 5, 2019

@RanderWang @bardliao @ranj063 @YvonneYang2 can you take a look at this updated PR? Basically it's a merge/cleanup of Rander's work with my initial definitions.

I tested the NOT_ALLOWED mode, which keeps the master active.
I tested the TEARDOWN mode, which essentially prevents any jack detection (same as before)
I was not successful on Olympic with the BUS_RESET mode. Somehow the master-2 fails, I am not sure why the configuration to accept CMD_IGNORED doesn't seem to work.

[   24.007288] intel-sdw sdw-master-2: intel_suspend_runtime start
[   24.007297] intel-sdw sdw-master-2: sdw_cdns_clock_stop: start
[   24.007303] intel-sdw sdw-master-2: sdw_cdns_clock_stop: slave attached 0
[   24.007498] intel-sdw sdw-master-2: Msg ignored for Slave 15
[   24.007501] intel-sdw sdw-master-2: ClockStopNow Broadcast message failed -61
[   24.007505] intel-sdw sdw-master-2: bus clock stop failed -61
[   24.007508] intel-sdw sdw-master-2: cannot enable clock stop on suspend

@RanderWang did you test on RVPs or on Olympic devices? @bardliao this probably works on other devices with 2 amps or the 3-in-1 board.

Edit: if I mask the link 2 with
options soundwire_intel_init sdw_link_mask=0xB
and change the acpi matching table, I am able to test jack detection while we are in D3/pm_runtime_suspend. We just need to figure out how to stop the clock on a link that have no slaves (that may be already solved with newer BIOS but it's still a problem to deal with).

Also I am not sure I understand why we added a 'block wake' parameter, the documentation states that the block_wake needs to be set prior to entering clock_stop mode?

Last, I added a tentative support for the regular mode where we keep the PCI device active. I didn't test since I am not sure the sequences are fine.

I think we are very close to the goal here, hopefully we've got enough configurations to please everyone, and we'll have even better results when D0i3 is supported

To test, you can use

options snd-sof-intel-hda-common sdw_clock_stop_quirks=0x2 for teardown
options snd-sof-intel-hda-common sdw_clock_stop_quirks=0x1 for not allowed
options snd-sof-intel-hda-common sdw_clock_stop_quirks=0x8 for bus reset
options snd-sof-intel-hda-common sdw_clock_stop_quirks=0x0 for normal mode

please do not play with the sof-debug options or disable pm_runtime by hand with module parameters.

@plbossart plbossart changed the title clock stop definitions SoundWire clock stop support Dec 5, 2019
@YvonneYang2
Copy link

YvonneYang2 commented Dec 6, 2019

With options snd-sof-intel-hda-common sdw_clock_stop_quirks=0x8 for bus reset, switch output source to headphone in sound setting during paplay will print dmesg like:

Oct 16 06:54:32 ubuntu-Latitude-11111 kernel: [   52.229348] intel-sdw sdw-master-0: intel_prepare: SDW0 Pin2: start
Oct 16 06:54:32 ubuntu-Latitude-11111 kernel: [   52.229353] rt711 sdw:0:25d:711:0: DPN_PortCtrl register write failed for port 3
Oct 16 06:54:32 ubuntu-Latitude-11111 kernel: [   52.229355] intel-sdw sdw-master-0: Program transport params failed: -22
Oct 16 06:54:32 ubuntu-Latitude-11111 kernel: [   52.229357] intel-sdw sdw-master-0: Program params failed: -22
Oct 16 06:54:32 ubuntu-Latitude-11111 kernel: [   52.229359] sdw_prepare_stream: SDW0 Pin2-Playback: done
Oct 16 06:54:32 ubuntu-Latitude-11111 kernel: [   52.229360] intel-sdw sdw-master-0: intel_prepare: SDW0 Pin2: done
Oct 16 06:54:32 ubuntu-Latitude-11111 kernel: [   52.229362] intel-sdw sdw-master-0: ASoC: cpu DAI prepare error: -22
Oct 16 06:54:32 ubuntu-Latitude-11111 kernel: [   52.229364]  SDW0-Playback: ASoC: backend prepare failed -22
Oct 16 06:54:32 ubuntu-Latitude-11111 kernel: [   52.229390]  Headphone: ASoC: hw_free FE Headphone
Oct 16 06:54:32 ubuntu-Latitude-11111 kernel: [   52.229392] sof-audio-pci 0000:00:1f.3: pcm: free stream 0 dir 0
Oct 16 06:54:32 ubuntu-Latitude-11111 kernel: [   52.229395] sof-audio-pci 0000:00:1f.3: ipc tx: 0x60030000: GLB_STREAM_MSG: PCM_FREE

Tested it with plbossart#8, issue cannot be reproduced.

Logs:
source_switch.zip

Note:
After suspend/resume stress test 300 times, then aplay0,2(RT1308) will print logs seems same as above. Will test it again with plbossart#8 to check if it can fix the issue.

Log of aplay0,2 failed after suspend/resume stress test:
dmesg.log

@RanderWang
Copy link

@plbossart if clock_stop_quirks == 0, our driver doesn't work. According to spec, we need glue hw involved to enable clock stop. First I tuned the driver without glue hw for a long time, but I still get io timeout. And with glue hw involved, there is no problem. And for this case, pci driver is in D0, we need to process wakeen interrupt in sof pci thread.

I update all my change in plbossart#8. please check and merge some patch into one. Thanks!

intel_shim_wake(sdw, false);

/*
* wake up master and slave so that slave can notify master
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're only waking up the slave here right? So the comment "wake up master ..." is not correct as you dont wake master up explicitly here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RanderWang this piece of code is indeed troubling.

I am not sure it's quite right to test for the slave status when handling the wakeen events, since the IP may be lost and the slave status is meaningless until the bus has actually restarted.

I think what you were trying to do here is to only resume actual devices that can generate a wake and ignore both the devices that cannot generate a wake and the ghost devices present in ACPI tables only, but the use of the status is very dangerous and racy.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're only waking up the slave here right? So the comment "wake up master ..." is not correct as you dont wake master up explicitly here?

yes, not explicitly, just implicitly. If slave is waken up, the parent master should be waken up.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RanderWang this piece of code is indeed troubling.

I am not sure it's quite right to test for the slave status when handling the wakeen events, since the IP may be lost and the slave status is meaningless until the bus has actually restarted.

If the slave is unattached before clock stop, in my opinion, It can't wake up master with clock stop.

I think what you were trying to do here is to only resume actual devices that can generate a wake and ignore both the devices that cannot generate a wake and the ghost devices present in ACPI tables only, but the use of the status is very dangerous and racy.

yes, you got my idea. I don't want to wake up any ghost device.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart I create a PR: plbossart#9 to remove these check. There is no regression issue. Do you prefer this defensive code or no check ? Thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RanderWang your idea of not waking up ghost devices does not work with this code.

when a device is probe, we read the properties, and that's how we set wake-capable.
/* device is probed so let's read the properties now */
if (slave->ops && slave->ops->read_prop)
slave->ops->read_prop(slave);

In other words, even ghost devices are seen as capable of generating wakes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart Now I check slave->dev_num to skip ghost devices, please check plbossart#11. Thanks!

@@ -1239,20 +1239,27 @@ static int sdw_handle_slave_alerts(struct sdw_slave *slave)

sdw_modify_slave_status(slave, SDW_SLAVE_ALERT);

ret = pm_runtime_get_sync(&slave->dev);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also typos and grammar mistakes in the commit message.
"There are two types of io errors when processing alert event.
The first one is: master receives an alert event for jack event
and invokes implement_def function in slave to check jack status.
At this time codec is just suspended, so io registers can't be
accessed. Another one is: when woken or waking up from clock stop state
and bus needs a complete re-enumeration for synchronization issue
, slave register can't be accessed."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not modify this commit message since it's a controversial patch, with errors reported in #1597

@RanderWang we really need to figure out if this is needed or not.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart I showed the details in #1589. Thanks!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implement_def

Thanks! @ranj063 I fixed in #1589

plbossart and others added 19 commits December 11, 2019 20:13
…mode

When none of the clock stop quirks is specified, the Master IP will
assume the context is preserved and will not reset the Bus and restart
enumeration. Due to power rail dependencies, the HDaudio controller
needs to remain powered and prevented from executing its pm_runtime
suspend routine.

This choice of course has a power impact, and this mode should only be
selected when latency requirements are critical or the parent device
can enter D0ix modes.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
In this mode, on restart the bus restarts immediately, the Slaves
remain synchronized and all context is kept intact.

Signed-off-by: Rander Wang <rander.wang@intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
When a SoundWire link is in clock stop state, a Slave device may wake
up the Master for some events such as jack detection. The WAKEEN
interrupt will be triggered and processed by the audio pci device.

If audio device is in D3, the interrupt will be routed to PME, or
aggregated at cAVS level as interrupt when audio device is in D0. This
patch only supports D3 case, where the audio pci device will be
resumed by a PME event and the WAKEEN interrupt will be processed
after audio pci device is powered up and ROM is initialized
successfully.

The WAKEEN handling is only enabled after the first boot due to
dependencies on a shim_lock mutex being initialized.

Signed-off-by: Rander Wang <rander.wang@intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
…quirks

Add module parameter so that the different modes can be quickly tested.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
If pci device is in D0, wakeen interrupt will be
aggregated at cAVS level as interrupt. This commit
check the wakeen status and process it in irq thread

Signed-off-by: Rander Wang <rander.wang@intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
There are two types of io error when processing alert event.
The first one is: master receives a alert event for jack event
and invokes implement_def function in slave to check jack status.
At this time codec is just suspended, so io registers can't be
accessed. Another one is: when waken up from clock stop state
and bus needs a complete re-enumeration for synchronization issue
, slave register can't be accessed.

This patch wakes up codec and wait for initialization complete
when processing slave alert event, so that io registers can be
accessed.

Signed-off-by: Rander Wang <rander.wang@intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
If master is in clock stop state, driver can't modify registers
in master except the registers for clock stop setting.

Signed-off-by: Rander Wang <rander.wang@intel.com>
is set

If two masters are working and one of them become suspended and activated
again, this master will be failed to resumed in SDW_INTEL_CLK_STOP_BUS_RESET
mode. The reason is: on intel platform, there are four masters in the same
power domain. If one master is active, all other masters are always powered on.
Each master will be set to clock stop when it is inactive. And in resume function
each master will be initialized because it asummes it is powered off. But if the
master is not powered off, it should be in clock stop state, and the normal
initialization will be failed in this state.

Now check clock status and don't initialize master if it is in clock
stop mode.

Signed-off-by: Rander Wang <rander.wang@intel.com>
Somehow the master-2 fails to stop the clock

[   24.007288] intel-sdw sdw-master-2: intel_suspend_runtime start
[   24.007297] intel-sdw sdw-master-2: sdw_cdns_clock_stop: start
[   24.007303] intel-sdw sdw-master-2: sdw_cdns_clock_stop: slave attached 0
[   24.007498] intel-sdw sdw-master-2: Msg ignored for Slave 15
[   24.007501] intel-sdw sdw-master-2: ClockStopNow Broadcast message failed -61
[   24.007505] intel-sdw sdw-master-2: bus clock stop failed -61
[   24.007508] intel-sdw sdw-master-2: cannot enable clock stop on suspend

there are not slaves attached, so the command ignored should be accepted

what's not clear is what the CMD_ACCEPT (1) is already the default?

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
These traces show the suspend can happen before jack detection is complete

[  184.045885] rt711 sdw:0:25d:711:0: [rt711_sdw_read] 7520 85a0 9c20 aca0 => 00000404
[  184.056884] rt711 sdw:0:25d:711:0: [rt711_sdw_read] 7520 85a0 9c20 aca0 => 00000404
[  184.067823] rt711 sdw:0:25d:711:0: [rt711_sdw_read] 7520 85a0 9c20 aca0 => 00000404
[  184.078824] rt711 sdw:0:25d:711:0: [rt711_sdw_read] 7520 85a0 9c20 aca0 => 00000404
[  184.089928] rt711 sdw:0:25d:711:0: [rt711_sdw_read] 7520 85a0 9c20 aca0 => 00000404
[  184.100949] rt711 sdw:0:25d:711:0: [rt711_sdw_read] 7520 85a0 9c20 aca0 => 00000404
[  184.111990] rt711 sdw:0:25d:711:0: [rt711_sdw_read] 7520 85a0 9c20 aca0 => 00000404
[  184.121175] rt711 sdw:0:25d:711:0: rt711_dev_suspend start
[  184.121183] rt711 sdw:0:25d:711:0: rt711_dev_suspend end
[  184.121205] Failed to get private value: 752046 => 0000 ret=-16
[  184.121214] IO error in rt711_headset_detect, ret -16
[  184.121221] IO error in rt711_jack_detect_handler, ret -16

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
…esent

If there aren't any Slaves present (ATTACHED or ALERT status), we can
safely infore CMD_IGNORED/-ENODATA error codes. The Cadence IP is
configured to handle such cases as success.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
If there are no Slaves in ATTACHED or ALERT mode, the
CMD_IGNORED/-ENODATA error code is perfectly legit. Filter this case
to report errors and let the caller deal with the CMD_IGNORED case.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
cdns_updatel() applies its parameter assuming it's already shifted by
the correct amount. Using '1' instead of BIT(1) is incorrect, fix.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Add the type of command, device number, register offset and length to
reverse engineer what caused the issue.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
if a slave has been attached to a bus, the slave->dev_num_sticky
should be non-zero, so we can check this value to skip the
ghost devices defined in ACPI table but not populated in hardware.

Signed-off-by: Rander Wang <rander.wang@intel.com>
@plbossart plbossart force-pushed the fix/clock-stop-definitions branch from d158e21 to 6fa65e2 Compare December 12, 2019 02:14
Copy link
Collaborator

@bardliao bardliao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RanderWang @plbossart Minor comments from my side.

val = sdw_bread_no_pm(bus, dev_num, SDW_SCP_STAT) &
SDW_SCP_STAT_CLK_STP_NF;
if (!val)
break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we return 0 here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you'd miss the dev_info log below

ret = sdw_slave_clk_stop_prepare(slave,
slave_mode, true);
if (ret < 0) {
dev_err(&slave->dev,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do something when sdw_slave_clk_stop_callback() succeed, but sdw_slave_clk_stop_prepare() fail? And the case that sdw_slave_clk_stop_callback() fail for a slave, but succeed for previous slaves.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we could.

continue;

/* Identify if Slave(s) are available on Bus */
is_slave = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we set is_slave = true; after

if (slave->status != SDW_SLAVE_ATTACHED &&
		    slave->status != SDW_SLAVE_ALERT)
			continue;

?


if (slave->status != SDW_SLAVE_ATTACHED &&
slave->status != SDW_SLAVE_ALERT)
continue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we move is_slave = true; here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes


if (is_slave && !simple_clk_stop)
sdw_bus_wait_for_clk_prep_deprep(bus, SDW_BROADCAST_DEV_NUM);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to test is_slave before list_for_each_entry(slave, &bus->slaves, node) {}?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

continue;

slave_mode = slave->curr_clk_stop_mode;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to test is_slave before list_for_each_entry(slave, &bus->slaves, node) {}? ie. Don't do list_for_each_entry(slave, &bus->slaves, node) {} if is_slave is false.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

When all the links are suspended, the HDaudio controller may suspend
and the power rails to the SoundWire IP may be disabled, requiring a
complete re-initialization/enumeration on resume. However, if one or
more Masters remained active, the HDaudio controller will remain active
and the power rails will remain enabled. As a result, during the link
resume step we can check if the context was preserved by verifying if
the clock was stopped, and avoid doing a complete bus reset and
re-enumeration.

Signed-off-by: Rander Wang <rander.wang@intel.com>
So we don't need extra if condition outside the while loop.

Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
We don't need to do anything for the slave if it is unattached.

Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
…es, node)

We don't need to test each slave's status if we already know there
is no slave attached.

Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
@plbossart plbossart force-pushed the integration/soundwire-latest branch from e888e6e to 56220b2 Compare December 13, 2019 04:49
@plbossart
Copy link
Member Author

all merged in various branches already

@plbossart plbossart closed this Dec 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants