Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arm: core: cortex_m: add a barrier before the dummy FP instruction #8373

Merged
merged 1 commit into from
Jun 14, 2018

Conversation

aurel32
Copy link
Collaborator

@aurel32 aurel32 commented Jun 13, 2018

On Cortex-M7 CPU (at least on STM32F723 with patches from #7284), the dummy move FPU instruction is executed before the FPU lazy state preservation is disabled. This patch adds a barrier before it to avoid that.

This can be easily reproduced with a simple example, like hello_world, by setting CONFIG_FLOAT=y. The system goes in an endless loop very early, before outputting anything on the console. This is something difficult to debug, as executing the instructions step by step in GDB hides the problem. I guess the other Cortex-M7 based platforms are also affected: NXP i.MX RT and Microchip SAM E70. I guess @MaureenHelm can test the first one.

@codecov-io
Copy link

codecov-io commented Jun 13, 2018

Codecov Report

Merging #8373 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #8373   +/-   ##
=======================================
  Coverage   64.61%   64.61%           
=======================================
  Files         423      423           
  Lines       40293    40293           
  Branches     6801     6801           
=======================================
  Hits        26034    26034           
  Misses      11126    11126           
  Partials     3133     3133

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a55c72d...b7335c8. Read the comment docs.

@galak
Copy link
Collaborator

galak commented Jun 14, 2018

Do we need both isb & dsb?

@ioannisg ioannisg requested a review from agross-oss June 14, 2018 07:16
@ioannisg
Copy link
Member

Good point, I would think that only the instruction barrier is required.

@aurel32
Copy link
Collaborator Author

aurel32 commented Jun 14, 2018

Do we need both isb & dsb?

I have to say I copied the barriers from below. From what I understand we need both:

  • dsb to make sure the write to FPCCR is finished
  • isb to empty the pipeline and execute vmov using the new FPU configuration

@ioannisg
Copy link
Member

IMHO, a DSB instruction would be needed if we are doing read/write to RAM or system registers and then use that value (so make sure the value is updated before performing new access).
Here, all we need to guarantee is that the FPCCR is updated before we do the dummy write, so we only need to preserve execution synchronization (but we do not use the value of the FPCCR anywhere).

@aurel32
Copy link
Collaborator Author

aurel32 commented Jun 14, 2018

Ok, I think I got it. Given your explanation I guess it also means the dsb after the vmov is useless.

@aurel32
Copy link
Collaborator Author

aurel32 commented Jun 14, 2018

I have just pushed a new version without the dsb instruction.

Copy link
Collaborator

@agross-oss agross-oss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the dsb removal this looks ok

Copy link
Member

@ioannisg ioannisg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK to me.

@ioannisg
Copy link
Member

I think the commit message prefix should read: "arch: arm: ......."

On Cortex-M7 CPU (at least on STM32F723), the dummy move FPU instruction
is executed before the FPU lazy state preservation is disabled. Add an
instruction synchronization barrier before it to avoid that.

At the same time, remove the data synchronization barrier after the
dummy move as it does not have any effect on RAM or registers.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
@aurel32
Copy link
Collaborator Author

aurel32 commented Jun 14, 2018

I have just updated the patch to add the "arch:" prefix.

@galak galak merged commit bb55155 into zephyrproject-rtos:master Jun 14, 2018
@aurel32 aurel32 deleted the cortex-m7-fpu branch June 14, 2018 20:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants