Audio: Volume: Add HiFi5 implementation. #9419

singalsu · 2024-08-29T11:17:40Z

Add a HiFi5 implementation of the volume functions. Compared with the HiFi3 version, it reduces the cycle count by about 28%.

singalsu · 2024-08-29T11:20:15Z

This is a new version of Andrula's #8900 with the merge conflicts fixed. I've run it successfully in the testbench HiFi5 environment at 48 kHz and 44.1 kHz rates with s16 and s32 formats. However, the testbench is currently IPC3, so the test didn't exercise the IPC4 code parts.

lgirdwood · 2024-09-02T12:40:12Z

src/audio/volume/volume.c

+	const uint32_t byte_align = 16;
+
+	/*There is no limit for frame number, so both source and sink set it to be 1*/
+	const uint32_t frame_align_req = 1;


Should this come from Kconfig?
I.e., the selection of HiFi3, HiFi4, HiFi5, AVX, etc. would set a generic CONFIG_FRAME_BYTE_ALIGN macro that could be used everywhere?
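
A minimal sketch of what such a compile-time selection might look like. The `EXAMPLE_*` names are hypothetical and are not SOF or Kconfig symbols; the alignment values (16 bytes for HiFi5, 8 for HiFi3/HiFi4, 4 for generic C) are the ones quoted elsewhere in this review.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical selector for illustration. A real build would derive it
 * from Kconfig or from the toolchain's core configuration.
 * 0 means generic C, 3/4/5 mean HiFi3/HiFi4/HiFi5.
 */
#ifndef EXAMPLE_HIFI_LEVEL
#define EXAMPLE_HIFI_LEVEL 5
#endif

/* Frame alignment in bytes, set by the widest load/store in use. */
#if EXAMPLE_HIFI_LEVEL == 5
#define EXAMPLE_FRAME_BYTE_ALIGN 16	/* 128-bit vector access */
#elif EXAMPLE_HIFI_LEVEL == 3 || EXAMPLE_HIFI_LEVEL == 4
#define EXAMPLE_FRAME_BYTE_ALIGN 8	/* 64-bit vector access */
#else
#define EXAMPLE_FRAME_BYTE_ALIGN 4	/* one 32-bit sample */
#endif

/* Single place that processing code would query for the alignment. */
static inline uint32_t example_frame_byte_align(void)
{
	return EXAMPLE_FRAME_BYTE_ALIGN;
}
```

With a single macro like this, each component would call one helper instead of hard-coding 8 or 16 locally.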

singalsu · 2024-10-28T16:04:29Z

src/audio/volume/volume.c

 	 * xtensa intrinsics ask for 8-byte aligned. 5.1 format SSE audio
 	 * requires 16-byte aligned.
 	 */
-	const uint32_t byte_align = audio_stream_get_channels(source) == 6 ? 16 : 8;
+	const uint32_t byte_align = audio_stream_get_channels(source) == 6 ?
+		SOF_FRAME_BYTE_ALIGN_6CH : SOF_FRAME_BYTE_ALIGN;


@lgirdwood I don't remember where this alignment requirement for 6ch comes from. I could not find any discussion of it in #5266. Is it specific to peakvolume, or is it generic for loading/storing the format in 64-bit or 128-bit chunks? If it is internal to peakvolume, then this SOF_FRAME_BYTE_ALIGN_6CH in common.h would make no sense.

6ch is 5.1 via DisplayPort.

singalsu · 2024-10-28T16:07:16Z

src/include/sof/common.h

 #  else
 #    define SOF_MAX_XCHAL_HIFI NONE
 #  endif
 #endif

+#if SOF_MAX_XCHAL_HIFI == NONE
+#  ifndef SOF_FRAME_BYTE_ALIGN
+#    define SOF_FRAME_BYTE_ALIGN	4


If this is in bytes, we should state that in a comment next to the definition. It should never be 1 if it is bytes.

I think one is the default when it's not set, leaving the component free to provide any number of frames, but it makes no sense as an alignment constraint. Also, I think the #if SOF_MAX_XCHAL_HIFI == NONE above could be left out as redundant. There could be a macros section for SSE or AVX specific definitions before this.

singalsu · 2024-10-29T08:18:05Z

src/include/sof/common.h

 #  elif XCHAL_HAVE_HIFI4
 #    define SOF_MAX_XCHAL_HIFI 4
-#  elif XCHAL_HAVE_HIFI3
+#    define SOF_FRAME_BYTE_ALIGN	8
+#    define SOF_FRAME_BYTE_ALIGN_6CH	16


I'll move this 6ch-specific definition into volume; I can't see how 6ch alignment would be a special case in general. Every combination of word length and channel count has some number of frames that does not align to 8 or 16 bytes (64 or 128 bits).
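
To illustrate the point about frame counts and byte alignment, here is a small sketch that computes the smallest number of frames whose total size is a multiple of a given byte alignment. The function names are hypothetical and not part of SOF.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor by Euclid's algorithm. */
uint32_t example_gcd(uint32_t a, uint32_t b)
{
	while (b) {
		uint32_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* Smallest frame count N such that N * frame_bytes is a multiple of
 * byte_align. This equals lcm(frame_bytes, byte_align) / frame_bytes,
 * which simplifies to byte_align / gcd(frame_bytes, byte_align).
 */
uint32_t example_min_aligned_frames(uint32_t channels, uint32_t sample_bytes,
				    uint32_t byte_align)
{
	uint32_t frame_bytes = channels * sample_bytes;

	assert(frame_bytes > 0 && byte_align > 0);
	return byte_align / example_gcd(frame_bytes, byte_align);
}
```

For a 16-byte alignment this gives 4 frames for both 2ch s16 and 6ch s16, and 2 frames for both 2ch s32 and 6ch s32, so by this measure 6ch is no different from stereo.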

lyakh · 2024-10-29T08:05:48Z

src/audio/volume/volume.c

+	/* Both source and sink buffer in HiFi5  processing version,
+	 * xtensa intrinsics ask for 16-byte aligned.
+	 *
+	 * Both source and sink buffer in HiFi 3 or HiFi4 processing version,


Maybe we can converge one way or the other: with or without a space in "HiFi N".

Yep, changing to without space.

lyakh · 2024-10-29T08:10:27Z

src/audio/volume/volume_hifi5.c

+{
+	int32_t i;
+
+	/* using for loop instead of memcpy_s(), because for loop costs less cycles */


This loop can be replaced with a single memcpy(), and you've found that the loop is faster? Interesting. Then we have a problem with our memcpy().

It was a finding by Andrula that I haven't verified.

lyakh · 2024-10-29T08:11:24Z

src/audio/volume/volume_hifi5.c

+		cd->vol[i] = cd->volume[i];
+		cd->vol[i + channels_count * 1] = cd->volume[i];
+		cd->vol[i + channels_count * 2] = cd->volume[i];
+		cd->vol[i + channels_count * 3] = cd->volume[i];


Although it looks like it wouldn't be a single memcpy(), but a loop of them. So you actually mean that 4 assignments are faster than a memcpy()? That would be logical.

I can remove the comment to avoid confusion. Most use cases have just two channels. There is more overhead, especially with the recommended memcpy_s().
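
For reference, the two alternatives discussed above can be sketched as follows. The loop form mirrors the quoted diff; the memcpy() form needs one call per repeat rather than a single call. The function names and the standalone `vol`/`volume` arrays are hypothetical stand-ins for the `cd->vol` and `cd->volume` members, and no cycle counts are claimed here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_REPEATS 4

/* Loop form, as in the quoted diff: each channel gain is written at
 * four offsets that are channels_count apart.
 */
void example_replicate_loop(int32_t *vol, const int32_t *volume,
			    int channels_count)
{
	int i;

	for (i = 0; i < channels_count; i++) {
		vol[i] = volume[i];
		vol[i + channels_count * 1] = volume[i];
		vol[i + channels_count * 2] = volume[i];
		vol[i + channels_count * 3] = volume[i];
	}
}

/* memcpy() form: one copy of the whole gain array per repeat. */
void example_replicate_memcpy(int32_t *vol, const int32_t *volume,
			      int channels_count)
{
	int k;

	for (k = 0; k < EXAMPLE_REPEATS; k++)
		memcpy(vol + k * channels_count, volume,
		       (size_t)channels_count * sizeof(*volume));
}
```

Both forms fill `vol` with the same contents. With two channels the loop form is eight plain stores, while the memcpy() form pays the call overhead four times, which is consistent with the reasoning in the thread.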

lyakh · 2024-10-29T08:18:14Z

src/audio/volume/volume_hifi5.c

+	const int inc = sizeof(ae_int32x4);
+	int samples = channels_count * frames;
+
+	/** to ensure the adsress is 16-byte aligned and avoid risk of


Is this really supposed to be a Doxygen comment? There are more below.

Fixed in the next version.

lyakh · 2024-10-29T08:23:17Z

src/audio/volume/volume_hifi5.c

+		m = audio_stream_samples_without_wrap_s32(sink, out);
+		n = MIN(m, n);
+		inu = AE_LA128_PP(in);
+		/* process four continuous samples once */


"once?" Did you mean "per iteration?"

kv2019i

Only minor notes, looks good!
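
As background for the "process four continuous samples" comment quoted above, here is a portable scalar sketch of the same loop structure: the chunk length is limited to the contiguous region before either circular buffer wraps, and four samples are handled per iteration. It does not use HiFi5 intrinsics. All names are hypothetical, and the Q16.16 gain format is an assumption chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Number of samples that can be processed before either the source or
 * the sink circular buffer wraps.
 */
int example_chunk_len(int remaining, int src_no_wrap, int sink_no_wrap)
{
	int n = EXAMPLE_MIN(remaining, src_no_wrap);

	return EXAMPLE_MIN(n, sink_no_wrap);
}

/* Apply a gain with saturation. Q16.16 gain is an assumption here.
 * The right shift of a negative value is assumed to be arithmetic,
 * which holds for mainstream compilers.
 */
static inline int32_t example_gain(int32_t x, int32_t gain_q16)
{
	int64_t y = ((int64_t)x * gain_q16) >> 16;

	if (y > INT32_MAX)
		y = INT32_MAX;
	if (y < INT32_MIN)
		y = INT32_MIN;
	return (int32_t)y;
}

/* Process n contiguous samples: four per iteration, then a scalar tail. */
void example_process_chunk(int32_t *out, const int32_t *in, int n,
			   int32_t gain_q16)
{
	int i;

	for (i = 0; i + 4 <= n; i += 4) {
		out[i] = example_gain(in[i], gain_q16);
		out[i + 1] = example_gain(in[i + 1], gain_q16);
		out[i + 2] = example_gain(in[i + 2], gain_q16);
		out[i + 3] = example_gain(in[i + 3], gain_q16);
	}
	for (; i < n; i++)
		out[i] = example_gain(in[i], gain_q16);
}
```

In a vectorized version the four assignments in the main loop would become one 128-bit load, multiply, and store, which is where the 16-byte alignment requirement comes from.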

kv2019i · 2024-10-29T11:20:45Z

src/audio/volume/volume.h

@@ -112,6 +112,9 @@ struct sof_ipc_ctrl_value_chan;
 #define VOL_S16_SAMPLES_TO_BYTES(s)	((s) << 1)
 #define VOL_S32_SAMPLES_TO_BYTES(s)	((s) << 2)

+/** \brief PCM samples align requirement for HiFi3 an Hifi4 for volume component */
+#define VOLUME_HIFI3_HIFI4_FRAME_BYTE_ALIGN_6CH	16


The comment says "PCM samples", which is a bit misleading as this is an alignment in bytes (as the define name says, FRAME_BYTE).

Oops yes, it looks quite confusing.
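
A short sketch of the bytes-versus-samples distinction. The two `VOL_*_SAMPLES_TO_BYTES` macros are copied from the quoted volume.h context lines; `example_is_byte_aligned()` is a hypothetical helper, not an SOF function.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the volume.h context lines quoted in the review. */
#define VOL_S16_SAMPLES_TO_BYTES(s)	((s) << 1)
#define VOL_S32_SAMPLES_TO_BYTES(s)	((s) << 2)

/* Return nonzero when addr is a multiple of byte_align.
 * byte_align must be a power of two.
 */
int example_is_byte_aligned(uintptr_t addr, uint32_t byte_align)
{
	assert(byte_align && (byte_align & (byte_align - 1)) == 0);
	return (addr & (byte_align - 1)) == 0;
}
```

A 16-byte alignment therefore corresponds to 8 s16 samples or 4 s32 samples, so a value named FRAME_BYTE_ALIGN cannot be read as a sample count without knowing the format.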

kv2019i · 2024-10-29T11:21:56Z

src/audio/volume/volume_hifi5.c

+	const int inc = sizeof(ae_int32x4);
+	int samples = channels_count * frames;
+
+	/* to ensure the adsress is 16-byte aligned and avoid risk of


typo: adsress

I added a commit to fix the same typos in the HiFi3 and HiFi4 versions.

kv2019i · 2024-10-31T08:34:09Z

The sof-docs failure and the Intel LNL failures are all known and tracked in https://github.com/thesofproject/sof/issues?q=is%3Aissue+is%3Aopen+label%3ACI

singalsu mentioned this pull request Aug 29, 2024

Audio: Volume: Add HiFi5 implementation. #8900

Closed

singalsu requested a review from ShriramShastry August 29, 2024 11:23

lgirdwood reviewed Sep 2, 2024

singalsu force-pushed the volume_hifi5_rebase branch from 9e4ec96 to 394c08d Compare October 28, 2024 15:57

singalsu commented Oct 28, 2024

singalsu force-pushed the volume_hifi5_rebase branch from 394c08d to 69e2d33 Compare October 28, 2024 16:06

singalsu commented Oct 28, 2024

singalsu commented Oct 29, 2024

lyakh reviewed Oct 29, 2024

singalsu force-pushed the volume_hifi5_rebase branch from 69e2d33 to 07df9a2 Compare October 29, 2024 09:25

thesofproject deleted a comment from lyakh Oct 29, 2024

singalsu force-pushed the volume_hifi5_rebase branch from 07df9a2 to 5b345b7 Compare October 29, 2024 10:13

singalsu marked this pull request as ready for review October 29, 2024 10:19

singalsu requested review from kv2019i, iuliana-prodan, dbaluta, abonislawski, plbossart, mmaka1 and lbetlej as code owners October 29, 2024 10:19

kv2019i approved these changes Oct 29, 2024

andrula-song and others added 2 commits October 29, 2024 15:47

Audio: Volume: Add HiFi5 implementation.

256b68c

Add HiFi5 implementation of volume functions, compared with HiFi3 version, can reduce about 28% cycles.

Signed-off-by: Andrula Song <andrula.song@intel.com>
Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>

Audio: Volume: Fix some typos in comment texts

1050ffc

Changed adsress -> address, also the comments are edited to avoid to be mistaken as Doxygen.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>

singalsu force-pushed the volume_hifi5_rebase branch from 5b345b7 to 1050ffc Compare October 29, 2024 13:50

lgirdwood approved these changes Oct 29, 2024

abonislawski approved these changes Oct 30, 2024

singalsu requested a review from lyakh October 30, 2024 08:45

lyakh approved these changes Oct 31, 2024

kv2019i merged commit 2f4efa5 into thesofproject:main Oct 31, 2024
42 of 47 checks passed
