Audio: DRC: Change DRC to use lookup table based sine function#8491
Audio: DRC: Change DRC to use lookup table based sine function#8491lgirdwood merged 3 commits intothesofproject:mainfrom
Conversation
src/math/lut_trig.c
Outdated
| #define SOFM_LUT_SINE_SIZE (SOFM_LUT_SINE_NQUART + 1) | ||
|
|
||
| /* An 1/4 period of sine wave as Q1.31 */ | ||
| const int32_t sofm_lut_sine_table[SOFM_LUT_SINE_SIZE] = { |
There was a problem hiding this comment.
TODO: Should test if this could be int16_t and size 257 for 16 bit sine.
There was a problem hiding this comment.
one is check wether can use 16-bit, another thing is whether can shorten the size? do we really need 512?
such a small usage cost 2k bytes table, it is big cost.
Also, could you add comments for how this table calculated? sin(i/2pi)? then we can further figure out whether 512 is must.
There was a problem hiding this comment.
to ensure the accuracy,and I guess maybe it is because the latency is 512 point.
There was a problem hiding this comment.
I tried 256 size but the quality was worse than with default cordic algorithm based 16 bit sine. So, the table still has 512 elements but uint16_t type was sufficient, so the table is now half of previous.
I added also static.
src/include/sof/audio/drc/drc_math.h
Outdated
| int32_t sin_val = sin_fixed_16b(denorm_x); | ||
|
|
||
| return sin_val << 16; | ||
| return sofm_lut_sin_fixed_32b(denorm_x); |
There was a problem hiding this comment.
some of the tooling might complain about a missing line between variable definitions and statements - same below
There was a problem hiding this comment.
Yep, I didn't notice first. You are right.
btian1
left a comment
There was a problem hiding this comment.
do we have a float version drc source code? I want to first check float version, the fixed version, then optimized version.
src/math/lut_trig.c
Outdated
| #define SOFM_LUT_SINE_SIZE (SOFM_LUT_SINE_NQUART + 1) | ||
|
|
||
| /* An 1/4 period of sine wave as Q1.31 */ | ||
| const int32_t sofm_lut_sine_table[SOFM_LUT_SINE_SIZE] = { |
There was a problem hiding this comment.
one is check wether can use 16-bit, another thing is whether can shorten the size? do we really need 512?
such a small usage cost 2k bytes table, it is big cost.
Also, could you add comments for how this table calculated? sin(i/2pi)? then we can further figure out whether 512 is must.
src/math/Kconfig
Outdated
| Select this to enable sin(), cos(), asin(), acos(), | ||
| and cexp() functions as 16 bit and 32 bit versions. | ||
|
|
||
| config LUT_TRIG_FIXED |
There was a problem hiding this comment.
We need a cleanup at some point where we have MATH_TRIG_ prefix and likewise convention for all maths APIs, macros, Kconfigs etc.
src/math/lut_trig.c
Outdated
| #define SOFM_LUT_SINE_SIZE (SOFM_LUT_SINE_NQUART + 1) | ||
|
|
||
| /* An 1/4 period of sine wave as Q1.31 */ | ||
| const int32_t sofm_lut_sine_table[SOFM_LUT_SINE_SIZE] = { |
src/math/Kconfig
Outdated
| Select this to enable sofm_lut_sin_fixed_32b() function. The | ||
| calculation is using 1/4 wave lookup and interpolation. | ||
| This option consumes 2052 bytes .bss RAM for the lookup | ||
| table. |
There was a problem hiding this comment.
Can we offer advice on when each trig type should be used. i.e. I would expect we export the same public API for all maths, but the internal calculations will depend on which Kconfig is selected by the user at build time.
There was a problem hiding this comment.
I added some text to Kconfig about preferring the lookup sine when used in hot code parts.
8bc42e4 to
02d7c32
Compare
02d7c32 to
7c8ae5f
Compare
There used to be long ago in git first version that was float C by Sebastiano and Johny but it was replaced by fixed point code when the work proceeded. A float version should be found from ChromeOS sources. The float code and fixed conversion work for this contribution is owned by team Google so we don't review it here. I've used the scripts in tools/test/audio to evaluate objective steady signal audio parameters for DRC and we've not seen difference in team Intel's optimizations. DRC has complex transient signal characteristics, and we currently don't have other but subjective expert listening test method for that. It means for me to listen an album of music with this processing in DUT and try to spot any issues. |
zephyr/CMakeLists.txt
Outdated
| zephyr_library_sources_ifdef(CONFIG_MATH_LUT_TRIG_FIXED | ||
| ${SOF_MATH_PATH}/lut_trig.c | ||
| ) | ||
|
|
There was a problem hiding this comment.
Please move this next to the other SOF_MATH_PATH (#8620)
There was a problem hiding this comment.
Wrong PR? There's no change to CMakeLists.txt in that.
ShriramShastry
left a comment
There was a problem hiding this comment.
I have reviewed the changes and they appear to be in good standing. I hope that LUT's size is good to others.
src/audio/drc/Kconfig
Outdated
| config COMP_DRC | ||
| bool "Dynamic Range Compressor component" | ||
| select CORDIC_FIXED | ||
| select MATH_LUT_TRIG_FIXED |
There was a problem hiding this comment.
Can it be MATH_LUT_SINE_FIXED instead of MATH_LUT_TRIG_FIXED?
There was a problem hiding this comment.
Yep, I'll change the config name. There is no need for other functions now.
src/math/Kconfig
Outdated
| Select this to enable sin(), cos(), asin(), acos(), | ||
| and cexp() functions as 16 bit and 32 bit versions. | ||
|
|
||
| config MATH_LUT_TRIG_FIXED |
There was a problem hiding this comment.
Can it be MATH_LUT_SINE_FIXED instead of MATH_LUT_TRIG_FIXED?
I tried 256 size LUT but the quality dropped a lot. With 512 the quality is a tiny bit better than in 16 bit cordic, so there should be no negative audio quality impact from this. |
This patch adds function sofm_lut_sin_fixed_16b(). It was used earlier in SOF with name sin_fixed() but was remove at add of Cordic trigonometric library. This sine function can be used in hot code parts. Due to look-up table usage it consumes more .bss RAM than cordic version. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
7c8ae5f to
13c1725
Compare
| /* Q4.28 x Q12.20 -> Q16.48 --> Q16.31*/ | ||
| idx_tmp = ((int64_t)w * SOFM_LUT_SINE_C_Q20) >> 17; | ||
| idx = (idx_tmp >> 31); /* Shift to Q0 */ | ||
| frac = (int32_t)(idx_tmp - (idx << 31)); /* Get fraction Q1.31*/ |
There was a problem hiding this comment.
that seems to boil down to
idx_tmp - (idx_tmp & 0xffffffff80000000ULL) ==
idx_tmp & 0x7fffffff
would the compiler optimise that out by itself?
There was a problem hiding this comment.
I was thinking that but it looks awkward in arithmetic that is not bit-banging to HW registers etc. The shifts have association to Qx.y format. But if that gives MCPS advantage I can change, and comment what happens, I'll try.
There was a problem hiding this comment.
With this modification 69.976 to 69.925 MCPS, not worth it I think, because of bit-and awkwardness here. I think our perf measurement works down to 0.1 MCPS level, below that it's probably noise.
| int theta; | ||
|
|
||
| for (theta = 0; theta < 360; ++theta) { | ||
| double rad = _M_PI * (theta / 180.0); |
There was a problem hiding this comment.
hopefully the compiler will calculate _M_PI / 180.0 at compile time, but parentheses might actually prevent it from doing that and make it a (redundant) run-time calculation
There was a problem hiding this comment.
It's cmocka test code so we don't care about performance even if there would be emulated floats.
The test function is based on test function for the cordic sine function. The error tolerance is adjusted to just pass. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
This change saves in TGL platform about 13 MPCS, from 83 to 70 MCPS. In MTL platform the saving is 12 MCPS, from 46 to 34 MCPS. The .bss RAM usage increases by 1 kB from selecting CONFIG_MATH_LUT_SINE_FIXED. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
13c1725 to
3d1d453
Compare
|
@singalsu can you check CI. Thanks ! |
This change saves in TGL platform about 13 MPCS, from 83 to 70 MCPS. In MTL platform the saving is 12 MCPS, from 46 to 34 MCPS.
The .bss RAM usage increases by 1 kB from selecting CONFIG_MATH_LUT_TRIG_FIXED.