-
Notifications
You must be signed in to change notification settings - Fork 325
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
memory: Use cached memory for frequently used objects #527
Conversation
Hi all, I submitted this patch. This patch can fix the issue #443 on UP2 initialization crash issue. When I was debugging this issue, I found after the dcache flash, the issue will happen. So I tried to invalid the dcache every time I allocate the memory. This crash issue disappears. However, I think some of the allocated memories are frequently used by the FW. I changed the mode to L1 $-able memory. |
@libinyang sorry I'm not following how changing memory attributes on the DSP will fix the stream tag on the host for #443 ? Yes, the stream tag should be validated on the DSP (but this PR does not even do that). Looking at the driver hda-trace.c
Surely the stream tag will be fixed here ? |
@lgirdwood |
I think I found the root cause of why dcache flush randomly changes the memory incorrectly. I will make a patch for the bug #539 DSP panic after system booted for topology load issue on up2 But I do think this is patch is useful to improve the performance and should be merged. What do you guys think? Thanks ;-) |
@mmaka1 can you comment on @libinyang PR for change memory attributes. |
This patch allocates the frequently used normal objects with cached memory to improve the performance. Signed-off-by: Libin Yang <libin.yang@intel.com>
18fdddb
to
a191a82
Compare
any comments? |
add more objects to use cachable memory to improve the performance. |
@libinyang I think we need the uncache flag so that these peripherals can be shared between multiple cores. Where is the crash location from the stack trace ? |
@lgirdwood |
@libinyang ok, if this PR does not fix the issue what is it needed for ? |
@lgirdwood Let's looking into the code of free memory: it doesn't handle the cache flush/invalidate. Let's image we free a memory which is cachable. And later, we allocate a un-cachable memory which is aligned to the same memory we just freed. Some time later, HW may decide to flush the dcache and the new allocated memory's contents will be changed randomly. Such bugs will be hard to debug. So let's prevent such issues in advance. My suggestion is that in the same block of memory, we allocate them in the same behavior, either always cachable or always un-cachable. And I think for the frequently objects, cachable should be more reasonable. For the multi-cores, I think HW should already handle the synchronization. But If QA can do the test on multi-core, it will be helpful and more persuasive. @mmaka1 What do you think of it? Please help to correct me if I miss something. |
Ok, but the correct fix here is to invalidate and writeback the memory region on free() within the allocator. |
Yes, invalidate cache in free() is a good choice. This can make sure that the memories are safe. And this is can help to handle the DMA memory at the same time. My plan is to handle the dma memory with a flag and invalidate the cache in allocate just like linux kernel. However, caching the frequently used objects will improve the performance. I think we had better cache them all. What do you think? |
@libinyang Making those objects cacheable will break memory sychronization between cores. Of course, the best way would be to have dedicated shared memory space, but right now it would require huge effort to accomplish. There are simply too many objects, that needs sharing in order to run workload on slave cores.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SMP will be broken.
Thanks for explanation. What I understand is that HW will handle such cache operation automatically, won't it? Do you have some test cases for SMP to produce such sychronization issue? I would like to have a test and try to find the reason. |
@libinyang All cores share the same piece of SRAM, but have separate caches. HW won't handle such synchronization, so we need to make sure that coreX will flush its own data to SRAM in order for the coreY to have it up to date. |
I went through our spec about the cachable memory and didn't find requirement of manually sync between multiple cores. I only found multiple microprocessors need to manually handle this. Maybe this is our HW/xtensa special requirement? Besides, I found in our code sometimes it allocates memory in cache, and sometimes not. For example: Is there any special reason? |
@libinyang Only structures, which are used by slave cores are allocated in uncached memory. |
This patch allocates the frequently used normal objects with cached
memory.
Signed-off-by: Libin Yang libin.yang@intel.com