Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix HDA DMA stop sequence #4844

Merged
merged 12 commits into from
Oct 21, 2021
Merged

Conversation

ranj063
Copy link
Collaborator

@ranj063 ranj063 commented Oct 5, 2021

The first 5 commits are from #4820

Corresponding kernel PR: thesofproject/linux#3167

int (*release)(struct dma_chan_data *channel);
int (*post_release)(struct dma_chan_data *channel);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to commit description: if you only need to add actions when DMA has been stopped, why are you also adding post_start and post_resume?

@@ -609,27 +609,6 @@ static int dai_config_prepare(struct comp_dev *dev)
return 0;
}

static void dai_config_reset(struct comp_dev *dev)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in a way this commit seems to limit dai_config_reset() to IPC3 only, while the description says, that it should become common. Wouldn't it be better to just make it global here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its not longer limited to IPC3 only with my new changes

src/audio/dai.c Outdated
* Do not reset DAI config for HDA DAI's. It will be reset when the DAI_CONFIG IPC is sent
* with the flag SOF_DAI_CONFIG_FLAGS_HW_FREE
*/
if (ipc_config->type != SOF_DAI_INTEL_HDA)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this too Intel-HDA-specific to be here? In general, would it be possible to call dai_config_reset() later during DAI_CONFIG IPC similar to HDA? Otherwise maybe we need some late_reset DAI flag?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is HDA specific and no I cant move it to the DAI_CONFIG IPC for all types as there is no DAI_CONFIG IPC for all types ex: even with the Intel DAI's we do not send it for the DMIC's. I'm not sure if the NXP/AMD DAI's send the DAI_CONFIG in the BE trigger in the kernel.

break;
}

return ret;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're always returning 0 here, so you can use return 0 here and avoid initialising ret

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ABI process note

#define SOF_DAI_CONFIG_FLAGS_NONE (0 << 0) /**< DAI_CONFIG sent without stage information */
#define SOF_DAI_CONFIG_FLAGS_HW_PARAMS (1 << 0) /**< DAI_CONFIG sent during hw_params stage */
#define SOF_DAI_CONFIG_FLAGS_HW_FREE (2 << 0) /**< DAI_CONFIG sent during hw_free stage */
#define SOF_DAI_CONFIG_FLAGS_RFU (3 << 0) /**< not used, reserved for future use */
#define SOF_DAI_CONFIG_FLAGS_PAUSE (3 << 0) /**< DAI_CONFIG sent during pause */
#define SOF_DAI_CONFIG_FLAGS_PAUSE_RELEASE (4 << 0) /**< DAI_CONFIG during pause release */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add "(ABI3.20)" to the field documentation.

@kv2019i kv2019i added the ABI ABI change involved label Oct 5, 2021
@lgirdwood lgirdwood added this to the ABI-3.20 milestone Oct 5, 2021
Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only downside I see here is that we are adding more ops to the DMA and this seems unavoidable due to the recommended HW programming flow (I may have meaning wrong here, but teh intent is to show host/FW action sync at stop)

  1. FW stop DMA
  2. Host stop DMA
  3. FW disable DMA
    This applies to both host and link DMA for STOP, but does it also apply to PAUSE ?

src/ipc/ipc3/dai.c Show resolved Hide resolved
src/ipc/ipc3/dai.c Outdated Show resolved Hide resolved

if (ret < 0)
return ret;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for my education, what does a 'reset' have to do with 'stop'? Or is this yet another abuse of language and 'reset' is the dual of 'params' so reset means 'hw_free'?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is not clear in my commit message and I'll fix that. But basically, we need to stop the DMA in the FW after the host has cleared the RUN bit. This applies to both the host DMA and link DMA. For the link DMA, we use the DAI_CONFIG IPC which is send after the RUN bit is cleared to achieve this. For the host DMA, we use the PCM_FREE IPC to do the same thing. A PCM_FREE IPC results in resetting all components in the pipeline and I use this reset op for the host to stop the DMA. Does that make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes ok

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ranj063 @plbossart I think you need to update src/include/sof/lib/dma.h as well. We need to be clear on reset() and stop() semantics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we add a new call as second step stop, then should not be added to other places before dma_channel_put() is called? But if they are coming in pairs why not handle it directly in dma_channel_put() as @kv2019i suggested?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ujfalusi so this has the "pause problem" (when pausing, there will not be a call to channeLput() but stop needs to be two-step still)

@ranj063
Copy link
Collaborator Author

ranj063 commented Oct 5, 2021

The only downside I see here is that we are adding more ops to the DMA and this seems unavoidable due to the recommended HW programming flow (I may have meaning wrong here, but teh intent is to show host/FW action sync at stop)

  1. FW stop DMA
  2. Host stop DMA
  3. FW disable DMA
    This applies to both host and link DMA for STOP, but does it also apply to PAUSE ?

Yes @lgirdwood it applies to both pause and stop. We perform the same actions for stop and pause ie clear the RUN bit in the host and clear the GEN bit in the DSP. So it has to be done in the same order, otherwise there will be issues with the DMA hanging. The reason I added the new ops is because we don't want to modify the other DMA's ex: GPDMA. They do not have have the same need as the HDA-DMA and should not be changed.

@ranj063 ranj063 force-pushed the fix/hda_dma_stop branch 4 times, most recently from 2939098 to 96e36b7 Compare October 5, 2021 19:00
@ranj063 ranj063 changed the title Fix HDA DMA stop sequence + add trace free IPC Fix HDA DMA stop sequence Oct 5, 2021
Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments/improvements suggested below, but in general this looks much better without an explicit 'HDAudio' reference.

src/include/sof/lib/dma.h Show resolved Hide resolved
src/include/sof/lib/dma.h Outdated Show resolved Hide resolved
src/include/sof/lib/dma.h Outdated Show resolved Hide resolved
src/include/ipc/stream.h Outdated Show resolved Hide resolved

if (ret < 0)
return ret;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes ok

src/drivers/intel/hda/hda-dma.c Show resolved Hide resolved
*
* To start a stream (Playback):
* 1. Host sends the PCM_PARAMS IPC with two_step_stop flag set
* 2. Host sets RUN bit for link DMA
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens in the case of a mixer where the link DMA might already be running?
Also is there an order between 1 and 2? I would think not, but maybe the order is between 2 and 3?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ordering of 1, 2 and 3 is correct for pipelines with the mixer, isnt it? There's always a PCM_PARAMS before trigger start and thats when this flag would get set. And The ordering between 2 and 3 depends on the trigger order we set in the host. For playback, this order is correct.

In the case of mixer, we send the PCM_PARAMS but since the link DMA is already running, 2 would be skipped. But, even in the FW, the PCM_PARAMS for the second pipeline are not propagated beyond the mixer to the DAI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe for the first stream it's valid, but when a second stream gets mixed we wouldn't start the link DMA?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, do you want me to mention this here? Maybe we need a separate section for mixer-based streams that we can add with the deep-buffer patches?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it would be good to mention here, and we'll revisit this with the deep-buffer cases...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. sure

src/drivers/intel/hda/hda-dma.c Outdated Show resolved Hide resolved
src/drivers/intel/hda/hda-dma.c Outdated Show resolved Hide resolved
src/drivers/intel/hda/hda-dma.c Outdated Show resolved Hide resolved
@ranj063 ranj063 force-pushed the fix/hda_dma_stop branch 4 times, most recently from 5c7747a to f9a9802 Compare October 5, 2021 23:18
src/include/sof/lib/dma.h Show resolved Hide resolved
/*
* Stream quirks. Only supported with ABI 3.20 onwards
* The two-step stop will happen only when both the kernel and FW support ABI3.20 or greater.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 2 step stop does.....

@@ -94,7 +100,9 @@ struct sof_ipc_stream_params {
uint32_t host_period_bytes;
uint16_t no_stream_position; /**< 1 means don't send stream position */

uint16_t reserved[3];
uint16_t quirks; /**< ABI3.20 */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be in dai instead ?

src/audio/host.c Outdated
if (ret < 0)
comp_err(dev, "host_trigger(): dma stop failed: %d",
ret);
if (!(config->quirks & SOF_PCM_STREAM_TWO_STEP_STOP)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to say

if (TWO_STOP) {
}

Since it's a quirk flow and wont be the default flow.

src/audio/dai.c Outdated
@@ -665,7 +670,8 @@ static int dai_reset(struct comp_dev *dev)

comp_info(dev, "dai_reset()");

dai_config_reset(dev);
if (!(config->quirks & SOF_PCM_STREAM_TWO_STEP_STOP))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, default flow is no quirk.

src/audio/dai.c Outdated
dai_trigger(dd->dai, cmd, dev->direction);
#else
dai_trigger(dd->dai, cmd, dev->direction);
ret = dma_stop(dd->chan);
if (!config->two_step_stop)
ret = dma_stop(dd->chan);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can create a common_trigger() and a quirk_two_step_trigger()

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the interface bits still make my head spin, but I think we've done enough search for cleaner solutions, so I'm willing to stamp this myself.

}

/* with two-step stop, DMA will be stopped during reset */
if (!dd->two_step_stop)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ujfalusi So it turns out this won't work for pause (two-step stop is needed but channel should not be released).

@marc-hb

This comment has been minimized.

@marc-hb
Copy link
Collaborator

marc-hb commented Oct 15, 2021

Thanks @fredoh9 for extending the number of suspend/resume cycles in PR testing from 1 (!!) to 5.

Testing in https://sof-ci.01.org/sofpr/PR4844/build10721/devicetest/ did not reproduce #4558 which is good HOWEVER
testing was unfortunately partially blocked by much older bug suspend/resume TIMEOUT bug thesofproject/sof-test#650

In https://sof-ci.01.org/sofpr/PR4844/build10721/devicetest/?model=CML_SKU0955_HDA&testcase=multiple-pause-resume-5, I find this a bit worrisome because I don't remember seeing it before:

[  493.322398] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx error for 0x60060000 (msg/reply size: 12/12): -16
[  493.322451] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ASoC: error at soc_component_trigger on 0000:00:1f.3: -16
[  493.322468] kernel:  HDMI3: ASoC: trigger FE cmd: 3 failed: -16
[  493.323703] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx error for 0x60070000 (msg/reply size: 12/12): -16
[  493.323726] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ASoC: error at soc_component_trigger on 0000:00:1f.3: -16
[  493.323740] kernel:  HDMI3: ASoC: trigger FE cmd: 4 failed: -16

Copy link
Contributor

@ujfalusi ujfalusi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ranjani, this looks good, I only have some concerns around the flag naming.
Not going to block the PR, but not going to approve either for now.

src/ipc/ipc3/dai.c Outdated Show resolved Hide resolved
if (dd->chan) {
/* remove callback */
notifier_unregister(dev, dd->chan, NOTIFIER_ID_DMA_COPY);
dma_channel_put(dd->chan);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a chance that the DMA channel was running when we dma_channel_put() it? Is that a allowed (releasing a running channel)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think it can be running at this point but if we want to add that check it should be a separate PR.

* DAI pause occurs during STREAM_TRIG_PAUSE IPC and DMA pause during DAI_CONFIG IPC with
* the SOF_DAI_CONFIG_FLAGS_PAUSE flag.
*/
bool two_step_stop;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is still a bit of misleading as we have one step to stop the DMA (and this flag refers to the DMA), it is just it happens at different point in time.
This flag in here I think should be delayed_dma_stop

As for the flag in the IPC: i think it can remain SOF_DAI_CONFIG_FLAGS_2_STEP_STOP_PAUSE, but I would just drop the _PAUSE. or use SOF_DAI_CONFIG_FLAGS_SPLIT_STOP_SEQUENCE

Move the function dai_config_reset to the common DAI code so it can be
reused. Also, rename it to dai_dma_release as it does not really reset the
DAI config but rather releases the DMA channel.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Some DAI components, such as HDaudio, need to be stopped in two steps
a) stop the DAI component
b) stop the DAI DMA

This patch enables this two-step stop by expanding the DAI_CONFIG
IPC flags and split them into 2 parts.

The 4 LSB bits indicate when the DAI_CONFIG IPC is sent, ex: hw_params,
hw_free or pause. The 4 MSB bits are used as the quirk flags to be used
along with the command flags. The quirk flag called
SOF_DAI_CONFIG_FLAGS_2_STEP_STOP shall be set along with the HW_PARAMS
command flag, i.e. before the pipeline is started so that the stop/pause
trigger op in the FW can take the appropriate action to either
perform/skip the DMA stop. If set, the DMA stop will be executed when the
DAI_CONFIG IPC is sent during hw_free. In the case of pause, DMA pause
will be handled when the DAI_CONFIG IPC is sent with the PAUSE command
flag.

Along with this, modify the signature for the hda_ctrl_dai_widget_setup/
hda_ctrl_dai_widget_free() functions to take additional flags as an
argument and modify all users to pass the appropriate quirk flags. Only
the HDA DAI's need to pass the SOF_DAI_CONFIG_FLAGS_2_STEP_STOP quirk
flag during hw_params to indicate that it supports two-step stop and
pause.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Add a new field, delayed_dma_stop to struct dai_data that will be set by
the host when the DAI_CONFIG IPC is sent during hw_params.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
In preparation for adding a reset op that could be used in place of the
stop op for DAI's that support the two-step stop option.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Add a new optional op for DMA that can be used to stop the DMA after the
pipeline has been reset.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
If the two_step_stop flag is set for a DAI, DMA reset will be skipped
during DAI reset and it will be done when the DAI_CONFIG IPC is sent
during hw_free. Also, for the pause push case, the DAI_CONFIG IPC will
be sent with the flag SOF_DAI_CONFIG_FLAGS_2_STEP_STOP_PAUSE which
call dma_reset to stop the DMA.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Some DMA's like the HD-Audio DMA need to be stopped ater the pipeline
has been reset. So call dma_stop_delayed op in host reset so that the DMA
can be stopped during the PCM_FREE IPC when the host gets reset.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Add a pointer to save the device-specific data for the client device that
requests the DMA channel in struct dma_chan_data.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
This can be used by the DMA drivers to access device-specific data.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
The recommended programming sequence for HD-Audio DMA's is to clear the
GEN bit after the host has cleared the RUN bit. For host DMA, we can
replace the stop op with the stop_delayed op so that the GEN bit can be
cleared during host reset. For link DMA, add the stop_delayed op to
perform DMA stop during DAI_CONFIG IPC with hw_free if the two-step stop
flag is set for the DAI. For single-step op, there's no change in
sequence.

The two-step sequence for pause is similar to stop. The DMA will be
stopped with dma_stop_delayed when the DAI_CONFIG IPC with the
SOF_DAI_CONFIG_FLAGS_2_STEP_STOP_PAUSE flag after the host has cleard
the RUN bit.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Make sure the HDA DMA channel is idle after stopping by checking the
GBUSY bit.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
…quence

Spell out the HD-Audio DMA programming sequence to make it
easier to follow.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
@ranj063
Copy link
Collaborator Author

ranj063 commented Oct 15, 2021

@ujfalusi i've addressed your comments o change flag and functions names now

@lgirdwood
Copy link
Member

@ranj063 I assume you have just done some local testing only ? @XiaoyunWu6666 @mengdonglin can we schedule a stress test with the kernel PR (sorry dont know the kernel PR #) on CML hardware. Thanks !

@ranj063
Copy link
Collaborator Author

ranj063 commented Oct 18, 2021

@ranj063 I assume you have just done some local testing only ? @XiaoyunWu6666 @mengdonglin can we schedule a stress test with the kernel PR (sorry dont know the kernel PR #) on CML hardware. Thanks !

@lgirdwood No, I have run the daily test with both the SOF and kernel PR's last week with no regressions. I've kicked off another daily test with this one and the kernel PR thesofproject/linux#3167 again today.

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one part where I didn't follow the code.

src/ipc/ipc3/dai.c Show resolved Hide resolved
Copy link
Contributor

@ujfalusi ujfalusi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ranj063, looks nice, I don't think there is a better way to do this within the constraints we have.

@lgirdwood lgirdwood added the P1 Blocker bugs or important features label Oct 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ABI ABI change involved P1 Blocker bugs or important features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants