Audio: Multiband DRC: Use optimized 4th order IIR filter version #9808
WIP - I'll see if I can further improve the HiFi4 and HiFi5 IIR versions.
The 4th order filter with two biquads in series is commonly used in crossover and multiband DRC components. The omitting of outer loop for parallel biquads and check for null coefficients and use of fixed loop count of two makes the critical code faster.
Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
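As a rough illustration of the idea in this commit message, here is a minimal floating-point sketch of a 4th order IIR built from exactly two biquads in series. It is not the SOF implementation: the real code is fixed-point with HiFi intrinsics, and the names `biquad_coef`, `iir4_state`, `iir4_sample` and `iir4_process` are hypothetical. It uses the textbook transposed direct form II with `a0` normalized to 1. The point it shows is the fixed loop count of two, with no outer loop over parallel sections and no null-coefficient check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
#define IIR4_NUM_BIQUADS 2 /* 4th order = two 2nd order sections in series */

/* Textbook biquad: y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2]
 *                         - a1 y[n-1] - a2 y[n-2], with a0 == 1.
 */
struct biquad_coef {
	float b0, b1, b2, a1, a2;
};

/* Two delay elements per section (transposed direct form II). */
struct iir4_state {
	float z[IIR4_NUM_BIQUADS][2];
};

/* Process one sample through both sections. The loop count is a
 * compile-time constant, so the compiler is free to unroll it.
 */
static float iir4_sample(const struct biquad_coef coef[IIR4_NUM_BIQUADS],
			 struct iir4_state *st, float x)
{
	for (int i = 0; i < IIR4_NUM_BIQUADS; i++) {
		const struct biquad_coef *c = &coef[i];
		float *z = st->z[i];
		float y = c->b0 * x + z[0];

		z[0] = c->b1 * x - c->a1 * y + z[1];
		z[1] = c->b2 * x - c->a2 * y;
		x = y; /* output of this section feeds the next one */
	}
	return x;
}

/* Process a block of n samples; in and out may point to the same buffer. */
static void iir4_process(const struct biquad_coef coef[IIR4_NUM_BIQUADS],
			 struct iir4_state *st, const float *in, float *out,
			 size_t n)
{
	assert(coef && st && in && out);

	for (size_t k = 0; k < n; k++)
		out[k] = iir4_sample(coef, st, in[k]);
}
```

A caller would zero an `iir4_state` once, then call `iir4_process()` per period with the same state. A generic version would additionally loop over parallel sections and skip null coefficient sets, which is the overhead this specialization avoids.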
This patch changes the crossover component to use the optimized 4th order IIR function. The LR4 (Linkwitz-Riley 4th order) filter bank is hard-coded to 4th order, so this change does not add restrictions. The filter bank is used by the multiband DRC component. The saving in a three-band configuration on a HiFi5 platform is 5.2 MCPS, from 90.36 MCPS to 85.17 MCPS.
Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
This patch changes the emphasis and de-emphasis IIR filters in the multiband DRC component to use the optimized 4th order IIR code. The patch for crossover already covered the band filter bank. This change saves an additional 1.7 MCPS or so in a HiFi5 build of the component, from 85.17 MCPS to 83.44 MCPS. The change does not restrict the configuration. The existing filters are hard-coded to 4th order (SOF_EMP_DEEMP_BIQUADS).
Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
Just for the record, I tried this change for the inner loop, but in the practical application it is a bit slower (1.35 MCPS, while a separate test code indicated 296 cycles vs. the earlier 310 cycles) than this proposed version:
So the current version with the 7-word coefficient set becomes the proposal. Padding it to 8 words for two 128-bit loads didn't improve performance. In addition, it needed a separate new function to copy and align the coefficients from the existing format.
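For context, the rejected padding experiment would need a helper along these lines. This is only a sketch under assumptions: the function name `pad_biquad_coefs` and the two macros are hypothetical, the coefficients are assumed to be packed 32-bit words, and the destination buffer is assumed to be allocated by the caller with 16-byte alignment so that each padded set can be fetched with two 128-bit loads.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
#define COEF_WORDS   7 /* words per biquad in the existing packed format */
#define PADDED_WORDS 8 /* words per biquad after padding (2 x 128 bits) */

/* Copy num_biquads coefficient sets from the packed 7-word layout in src
 * to an 8-word stride in dst, zeroing the one pad word in each set.
 * dst must hold num_biquads * PADDED_WORDS words; its alignment is the
 * caller's responsibility.
 */
static void pad_biquad_coefs(const int32_t *src, int32_t *dst, int num_biquads)
{
	for (int i = 0; i < num_biquads; i++) {
		memcpy(&dst[i * PADDED_WORDS], &src[i * COEF_WORDS],
		       COEF_WORDS * sizeof(int32_t));
		dst[i * PADDED_WORDS + COEF_WORDS] = 0; /* pad word */
	}
}
```

The extra copy and the second coefficient buffer are a cost of their own, which is part of why keeping the 7-word format is the simpler choice when the padded loads bring no measured gain.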
So the current version with the 7-word coefficient set becomes the proposal. Padding it to 8 words for two 128-bit loads didn't improve performance. In addition, it needed a separate new function to copy and align the coefficients from the existing format.
IIUC the coefficient pattern of EQIIR isn't changed by this migration, so the current config blobs will still be compatible. Is that correct?
If that is the case then LGTM
@@ -13,6 +13,7 @@
#include <sof/common.h>
Minor: @singalsu "The omitting of outer loop for parallel biquads and check for null coefficients and use of fixed loop count of two" very complex language is, comments Yoda.
The simpler 4th order hard-coded IIR version saves MCPS in the band-split filter bank and in the emphasis/de-emphasis IIR filters.