smart_amp: revamp to two-layer modular design structure#8036
smart_amp: revamp to two-layer modular design structure#8036lgirdwood merged 1 commit intothesofproject:mainfrom johnylin76:smart_amp_revamp
Conversation
lgirdwood
left a comment
There was a problem hiding this comment.
Good stuff @johnylin76 ! Just some minor comments.
@mengdonglin @alex-cri are testing pass thru smart amp in CI today ?
src/audio/smart_amp/smart_amp.c
Outdated
| } | ||
|
|
||
| /* set frame formats to inner model */ | ||
| ret = mod->mod_ops->set_fmts(mod, sad->ff_mod.frame_fmt, sad->fb_mod.frame_fmt); |
There was a problem hiding this comment.
I'm guessing we are missing a module_set_fmt() wrapper for this in module core ?
There was a problem hiding this comment.
I'm not sure I get the idea. Do you mean we need the ops set_fmt() to set the format of core?
The design now has ops set_fmts() to set the format of feed-forward (FF) data path and feedback (FB) path respectively. Indeed the core format can be independent of input/ouput data format, however I wonder if that has the real need (in fact I think FF and FB formats could align each other (as well as core format) then set_fmts() could be replaced with set_fmt()). Since the design now is proposed to handle input/output data format conversion in generic layer, it should be fair to require the format alignment.
There was a problem hiding this comment.
Sorry, I meant an inline static wrapper func around the API ops dereference.
There was a problem hiding this comment.
I see. Anyway it triggered my re-consideration of the format for module core. I revised to unify the module formats which include core, ff-input and fb-input previously. Now the module ops will only set one format .
| for (fr = 0 ; fr < frames ; fr++) { | ||
| for (ch = 0 ; ch < src_mod->channels; ch++) { | ||
| if (chan_map[ch] == -1) { | ||
| *mod_ptr = 0; | ||
| } else { | ||
| src_idx = src_frag + chan_map[ch]; | ||
| src_ptr = audio_stream_read_frag_s32(src, src_idx); | ||
| *mod_ptr = *src_ptr; | ||
| } | ||
| mod_ptr++; | ||
| } | ||
| src_frag += src_ch; | ||
| } |
There was a problem hiding this comment.
@singalsu @andrula-song is this using the optimal method now ?
There was a problem hiding this comment.
It would be better to write with frames without wrap helper to avoid audio_stream_read_frag_s32() usage for every sample read.
There was a problem hiding this comment.
the most waste part is if else, take 48 frame, 2 channel as example, if need 96 check for if/else, this is big cost.
better to use channel in first loop, and then check, chan_map[ch], if == -1, with one loop, else with another loop, like below:
for (ch = 0 ; ch < src_mod->channels; ch++) {
if (chan_map[ch] == -1) {
for (fr = 0 ; fr < frames ; fr++) {
*mod_ptr = 0;
mod_ptr += src_mod->channels;
}
} else {
for (fr = 0 ; fr < frames ; fr++) {
src_idx = src_frag + chan_map[ch];
src_ptr = audio_stream_read_frag_s16(src, src_idx);
*mod_ptr = *src_ptr;
mod_ptr += src_mod->channels;
}
src_frag += src_ch;
}
}
this will have only 2 if/else, will save cycles a lot.
Note: this is just reference code, please have more investigation to get correct offset calculation.
There was a problem hiding this comment.
@btian1 I don't think that's equivalent to the above: in the original code src_frag is incremented for each frame and kept the same across channels.
There was a problem hiding this comment.
Thanks for the advice. I will do the follow-up optimization
There was a problem hiding this comment.
it is just for demo purpose for save if/else, as I said in the note, "need more investigation to get correct offset"
There was a problem hiding this comment.
just updated the commit for better buffer read/write strategy.
| } else { | ||
| src_idx = src_frag + chan_map[ch]; | ||
| src_ptr = audio_stream_read_frag_s32(src, src_idx); | ||
| *mod_ptr = *src_ptr << 8; |
There was a problem hiding this comment.
here too, use audio_stream_read_frag_s32(), there is an example in aria_algo_get_data() in aria_generic.c
|
@lgirdwood how much power (and would it be even worth it) if userspace opened the PCM in non-blocking so we can just never yield samples to userspace allowing both CRAS and the DMA to sit idle? |
@cujomalainey not sure - ideally if we do nothing then we should not consume power. However, this proposal may enable DMA channels and maybe other IPs in preparation for stream propagation (and consume some power, but I would think less than the full stream). I assume you want to minimize the number of PCMs for smart amp when not needed ?I think @ranj063 has recently been looking at a way to get rid of the second PCM when its not needed (IIRC for AEC?). |
lgirdwood
left a comment
There was a problem hiding this comment.
@johnylin76 This looks ready to land, I think we just need the optimal copy() code and a few minor comments and we are good.
| The selection for Smart Amplifier component implementation | ||
| will depend on the Amplifier solution supplier. It is fair | ||
| to treat the supported solutions as mutually exclusive ones. | ||
| There should be no more than one solution selected per build |
There was a problem hiding this comment.
Why can't we setup a system similar to modules and have support for all devices at once? You could also do composition and just have this by a stand in for default implementation in a module?
There was a problem hiding this comment.
We are able to do it for sure. The code implementation will be a bit more complicated (as how module_adapter is implemented https://github.com/thesofproject/sof/blob/main/src/include/sof/audio/module_adapter/module/generic.h
I didn't implement that way due to the proposed plan in #7801 said the long-term goal is to use module API for all components, so I would defer that to the future task for converting smart_amp to use module API.
src/audio/smart_amp/smart_amp.c
Outdated
|
|
||
| /* available formats: SOF_IPC_FRAME_S16_LE, SOF_IPC_FRAME_S24_4LE, SOF_IPC_FRAME_S32_LE */ | ||
| for (i = 0; i < num_ff_fmts; i++) { | ||
| if (ff_fmts[i] >= ff_src_fmt && ff_fmts[i] <= SOF_IPC_FRAME_S32_LE) { |
There was a problem hiding this comment.
comparing formats for greater or less looks a bit suspicious... Currently we don't do that, do we?
There was a problem hiding this comment.
yep I agree with that. The comparison will be easy for audio_format in IPC4 by comparing the depth attribute.
The design now (in IPC3 format) relies on the background knowledge that the enum value is the same order as the depth among {S16_LE=0, S24_4LE=1, S32_LE=2}, which is implicit.
To elaborate the current design, I would propose to wrap the implicit comparison into helper functions with comments for the background. Or maybe add get_depth(fmt) helper and compare by get_depth(fmt). get_depth(fmt) can also help smart_amp_generic implementation where we need to specify the shift bit for each conversion pair, e.g. S16-to-S24, S16-to-S32.
There was a problem hiding this comment.
updated the commit. I added a helper function get_sample_bitdepth() in include/sof/audio/format.h. I'm not sure if it's appropriate, we can also put it inside of smaty_amp if that is the case. WDYT?
src/audio/smart_amp/smart_amp.c
Outdated
| buffer_release(buf_c); | ||
| } else { | ||
| /* should not be used anyway */ | ||
| fb_src_fmt = 0; |
There was a problem hiding this comment.
0 is SOF_IPC_FRAME_S16_LE. Maybe -1?
There was a problem hiding this comment.
just revised the code to prevent the redundant code like this.
kv2019i
left a comment
There was a problem hiding this comment.
My first-pass review, a few comments inline but this seems solid. With 2.8 opened for development, this would be a good time to get bigger changes like this in. We do have a couple of big changes ongoing (one is the move to module adapter) and now with 2.8 we will revamp the buffer structure (#8106 ), so some coordination is needed to avoid too many rebase hassles.
src/audio/smart_amp/smart_amp.c
Outdated
| size = SMART_AMP_FB_BUF_DB_SZ * sizeof(int32_t); | ||
| ret = smart_amp_buf_alloc(&sad->fb_mod.buf, size); | ||
| if (ret < 0) | ||
| return ret; |
There was a problem hiding this comment.
This looks a bit suspicious how ff_mod.buf memory is freed if error happens here. It seems this is ok, you free all on free_data_buffers, but it's a bit unconventional to have the caller free partial resources, so a comment would be nice.
There was a problem hiding this comment.
I would actually make this even a bit stronger: I find it significantly better to process all function-internal errors within that same function. In this case this would result in a duplicated call to the clean up function, but I think it would be better than relying on the caller to clean up.
There was a problem hiding this comment.
Thanks. I updated the commit as the suggested by @lyakh
https://github.com/thesofproject/sof/pull/8036/files#diff-4a0050312896c64976b6e80b0d7fc01348b8457d831698766e49900c05ee85ffR68
src/audio/smart_amp/smart_amp.c
Outdated
| ret = smart_amp_buf_alloc(&sad->ff_mod.buf, size); | ||
| if (ret < 0) | ||
| return ret; | ||
| total_size += size; |
There was a problem hiding this comment.
nitpick: you can remove initialisation and just do total_size = size here
src/audio/smart_amp/smart_amp.c
Outdated
| size = SMART_AMP_FB_BUF_DB_SZ * sizeof(int32_t); | ||
| ret = smart_amp_buf_alloc(&sad->fb_mod.buf, size); | ||
| if (ret < 0) | ||
| return ret; |
There was a problem hiding this comment.
I would actually make this even a bit stronger: I find it significantly better to process all function-internal errors within that same function. In this case this would result in a duplicated call to the clean up function, but I think it would be better than relying on the caller to clean up.
| return NULL; | ||
| if (frames == 0) { | ||
| comp_warn(dev, "smart_amp_copy(): feedback frame size zero warning."); | ||
| return 0; |
There was a problem hiding this comment.
returning 0 means no error, and a warning message is issued. And this is a processing function. Can it happen that this warning will flood the log?
There was a problem hiding this comment.
this warning message is as it was actually (just moved here due to the revamp.
Based on my previous debugging experience, this can be reproduced every now and then with no harm (well... at least as far as I've met), but is not an expected behavior.
| { | ||
| enum DSM_API_MESSAGE retcode; | ||
| struct dsm_api_memory_size_ext_t memsize; | ||
| int *circularbuffersize = hspk->circularbuffersize; |
There was a problem hiding this comment.
a general comment: such names would be easier to read with underscores, e.g. circular_buffer_size. Can be a follow up
There was a problem hiding this comment.
this is the same as the original code (authored by Maxim) before the revamp (just moved from elsewhere). So I would keep it as is. And I think it is probably named that way on purposes to be aligned with the pre-built libraries for Maxim
|
@johnylin76 if you have time now to do a rebase after #8217 is merged, we could now push this in. |
@johnylin76 #827 now merged, good to go ? |
Thanks. I just addressed the 2nd-round comments and rebased the commit. |
The motivation is depicted in #7801 This commit revamps smart_amp component design to two-layer structure, i.e. generic layer and inner model layer. Generic layer is the common part of smart amp process which can be regarded as the glue code interacting between SOF component ops and inner model. While inner model may have various implementations respectively for solution suppliers in a modular way. Signed-off-by: Pin-chih Lin <johnylin@google.com>
kv2019i
left a comment
There was a problem hiding this comment.
Thanks @johnylin76 , looks good now!
The motivation is depicted in #7801
This PR revamps smart_amp component design to two-layer structure, i.e. generic layer and inner model layer.
Generic layer is the common part of smart amp process which can be regarded as the glue code interacting between SOF component ops and inner model. While inner model may have various implementations respectively for solution suppliers in a modular way.
This PR revamps Smart Amplifier component design to a two-layer structured design, i.e. generic layer and inner model layer. The latter can have various implementations respectively for Amplifier solution suppliers, while the former is the common part of Smart Amp process adaptable for all solutions.
In structural aspect, one can regard generic layer as the glue code that wraps inner model in a SOF component. Ops are defined for interaction between two layers. Inner model is the solution-specific modular code, which may have static libraries linked as needed. The structure is figured below:
As illustrated above, generic layer handles the cross-communication for inner model and SOF pipeline flow, as well as the smart-amp common tasks including: