Add cache maintenance functions for ARMv7-M #40

Merged: 16 commits merged into rust-embedded:master from cache_control on Jun 15, 2017

Conversation


@adamgreig (Member) commented Jun 9, 2017

This PR adds the cache control primitives and the higher-level cache maintenance operations that are also present in CMSIS. Cortex-M7 devices generally have a DCache and an ICache, which can be useful (mostly if you have external RAM), but you need to explicitly clean and invalidate cache entries to ensure coherency between the CPU and DMA or other bus masters.

Specifically, this PR:

  • Extends the config flags to include armv7m and armv7em (with armv7m also being set for ARMv7E-M parts)
  • Puts the existing CLIDR, CTR, CCSIDR, and CSSELR CPUID registers behind cfg(armv7m), as they are only present from ARMv7-M onwards.
    • CSSELR is changed to RW: though Table B4-1 says it's RO, the detailed description in B4.8.3 says it is RW, and it is used by writing to it.
  • Adds methods on CPUID to access those cache-related registers and retrieve the number of sets/ways in a cache
  • Adds the CBP memory-mapped operations block from Table B2-1, which contains operations to clean and invalidate the instruction and data caches and branch predictor on ARMv7-M parts that include them.
  • Adds methods for each of the memory locations in the CBP block to actually perform those operations, with appropriate arguments.
    • Though note the set/way functions are specific to Cortex-M7 and might not apply to a future ARMv7-M implementation with a different cache setup (see the comment in dcisw())
  • Adds new methods to SCB that mirror the equivalent SCB_* cache functions in CMSIS (see core_cm7.h).

All table/section references refer to the ARMv7-M architecture reference manual.
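
For a sense of how the new SCB methods are meant to be used, here is a minimal coherency sketch. It assumes the method names and u32 address/size signatures described above; the Scb type, the way it is obtained, and the DMA helpers are hypothetical placeholders, not part of this PR.

// Minimal sketch: receive data into `buf` via DMA on a Cortex-M7 with the
// DCache enabled. `start_dma_read`/`wait_for_dma_complete` are hypothetical.
fn receive_via_dma(scb: &Scb, buf: &mut [u8; 64]) {
    let addr = buf.as_ptr() as u32;
    let size = buf.len() as u32;

    start_dma_read(addr, size);   // a DMA master writes into `buf` behind the cache
    wait_for_dma_complete();

    // Drop any cache lines covering `buf` so subsequent CPU reads fetch the
    // freshly written data from RAM instead of stale cached values.
    scb.invalidate_dcache_by_address(addr, size);

    let first = buf[0];           // now guaranteed to come from main memory
    let _ = first;
}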

Testing that these worked as expected was as fun as you'd imagine, but I'm confident they do what's expected and match the CMSIS implementation.

Running cache test...
Operation               Cache  RAM
----------------------------------
Initial....................A    A
Increment..................B    B
Enable DCache..............B    B
Increment..................C    B
Clean DCache...............C    C
Increment..................D    C
Invalidate DCache..........C    C
Increment..................D    C
CleanInvalidate DCache.....D    D
Increment..................E    D
Clean DCache wrong addr....E    D
Clean DCache right addr....E    E
Increment..................F    E
Inval DCache wrong addr....F    E
Inval DCache right addr....E    E
Increment..................F    E
CI DCache wrong addr.......F    E
CI DCache right addr.......F    F
Increment..................G    F
Disable DCache.............G    G

A few open questions...

  • I'm not super happy at how you have to pass &cpuid into some of the SCB cache methods, but I can't see any nice way around this. The functions are in SCB to match the SCB_CleanDCache() macros from CMSIS and because the key cache enable bits are in SCB, but you could imagine moving these to methods on CBP instead.
  • The SCB methods currently access the CBP block with get(), but as this whole block is WO, stateless, and just consists of memory-mapped operations, I assume this isn't a problem.
  • Calling invalidate_dcache while the dcache is enabled is generally a poor idea: your stack, including the call stack, could be reset to some older state, among other problems. If you build this with optimisations, invalidate_dcache gets inlined and uses registers enough that it will complete and do what you expect, but if the cache contains anything that should have been cleaned you will get some unexpected results. If you build without optimisations, it will just get stuck in the invalidate loop as the loop counter update gets wiped out. I wrote an inlineable asm version which solves this specific problem, but it still means your stack will probably not be correct, and it's quite a bit messier, so I think this simple non-asm version is the one to use. Since disabling the DCache also does a clean and invalidate afterwards, it's not clear when you'd actually want to call invalidate_dcache (outside of enable_dcache which calls it before enabling the cache). I included it because it's in CMSIS.

@japaric (Member) left a comment:

@adamgreig Thanks for working on this! I had no idea these registers even existed. I left some comments about the Rusty-ness of the proposed API.

About your questions:

Testing that these worked as expected was as fun as you'd imagine, but I'm confident they do what's expected and match the CMSIS implementation.

As I have no idea what this is really doing and have no way to test this I'll trust you ;-)

I'm not super happy at how you have to pass &cpuid into some of the SCB cache methods, but I can't see any nice way around this.

I suppose we could merge the two register blocks into a single one and have all methods on that struct.

The SCB methods currently access the CBP block with get(), but as this whole block is WO, stateless, and just consists of memory-mapped operations, I assume this isn't a problem

Seems fine to me.

Calling invalidate_dcache while the dcache is enabled is generally a poor idea

Perhaps we can return a Result::Err if invalidate_dcache is called when the DCache was enabled (assuming there's some way of checking that the DCache is enabled at runtime).

it's not clear when you'd actually want to call invalidate_dcache (outside of enable_dcache which calls it before enabling the cache). I included it because it's in CMSIS

In that case perhaps we can keep that method private until someone specifically requests it.

@@ -100,13 +104,65 @@ pub struct Cpuid {
pub isar: [RO<u32>; 5],
reserved1: u32,
/// Cache Level ID
#[cfg(armv7m)]
@japaric (Member):

observation: technically a breaking change, even if it probably made no sense to use these on ARMv6-M.

@adamgreig (Member Author):

Is that a problem? I can drop this cfg gate if you want, but it will never have done anything useful on ARMv6-M and cortex-m is still <v1.0 too.

@japaric (Member):

I'm fine with making breaking changes, but I'd like to minimize the number of minor version bumps when possible, especially when those changes require breaking changes in other crates. There's already one breaking change scheduled for v0.3.0 (#41), and I have another in mind. How about postponing this one so it lands with the other two?

@adamgreig (Member Author):

Yep sure, makes sense to combine the breaking changes into one release. Want to postpone this whole set of features or just the cfg flag on these registers?

@japaric (Member):

just the cfg flags

}

#[cfg(armv7m)]
const CSSELR_IND_POS: u32 = 0;
@japaric (Member):

could these constants be moved into a cfg'd module and then imported, to reduce the number of cfg attributes? Something like this:

// Sub-Architecture Specific
#[cfg(armv7m)]
mod sas {
    pub const FOO: u32 = 0;
}

#[cfg(not(armv7m))]
mod sas {}

use self::sas::*;

@adamgreig (Member Author):

Perhaps we can just drop the cfg entirely. They're const, they're only used in methods that are behind cfg gates, they're not pub, so they simply won't appear when not being used.

/// * `level`: the required cache level minus 1, e.g. 0 for L1, 1 for L2
/// * `ind`: select instruction cache (1) or data/unified cache (0)
#[cfg(armv7m)]
pub fn select_cache(&self, level: u32, ind: u32) {
@japaric (Member):

This should probably accept enums as inputs to avoid passing nonsensical values like level = 9_001. Either that or assert the valid range.

Same with the functions below.

@adamgreig (Member Author):

I'll add a panic for level and make ind an enum, which I think best matches what they're used for.
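
A sketch of that direction for reference (the CsselrCacheType name matches the later diff; the variant names are illustrative, with values matching the CSSELR InD bit documented above):

// Encode `ind` as an enum so callers can't pass arbitrary integers.
pub enum CsselrCacheType {
    DataOrUnified = 0, // InD = 0: data or unified cache
    Instruction = 1,   // InD = 1: instruction cache
}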

///
/// * `level`: the required cache level minus 1, e.g. 0 for L1, 1 for L2
/// * `ind`: select instruction cache (1) or data/unified cache (0)
#[cfg(armv7m)]
@japaric (Member):

The cfg attribute on all these methods can be converted into a single cfg attribute on the enclosing impl block.

@adamgreig (Member Author):

Sure, will do. I avoided it because of "what if you wanted some other non-ARMv7-specific methods on CPUID?", but of course those can go in another impl block.


/// Returns the number of sets in the selected cache
#[cfg(armv7m)]
pub fn cache_num_sets(&self, level: u32, ind: u32) -> u32 {
@japaric (Member):

Perhaps change the return type to u16 since the return value can't exceed u16::MAX AFAICT.

Same comment for the method below.

@adamgreig (Member Author):

I've changed this to u16, but it has basically just meant a bunch of casting from u32 to u16 when we read it, and from u16 back to u32 when we use it (passing a set/way to the cache maintenance operations). Not sure it's really worthwhile?
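
For reference, a sketch of the extraction and cast under discussion (a hypothetical free function; the field layout follows the ARMv7-M CCSIDR, where NumSets occupies bits [27:13] and stores the number of sets minus one):

const CCSIDR_NUMSETS_POS: u32 = 13;
const CCSIDR_NUMSETS_MASK: u32 = 0x7FFF << 13;

// Number of sets in the currently selected cache, from a raw CCSIDR read.
fn num_sets(ccsidr: u32) -> u16 {
    (((ccsidr & CCSIDR_NUMSETS_MASK) >> CCSIDR_NUMSETS_POS) + 1) as u16
}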

/// `size`: size of the memory block, in number of bytes, a multiple of 32
#[cfg(armv7m)]
#[inline]
pub fn invalidate_dcache_by_address(&self, addr: u32, size: u32) {
@japaric (Member):

The requirements stated in the documentation should be asserted in the body of this method; otherwise (I suppose) Bad Stuff could happen if the user calls this with a non-aligned address or a size that's not a multiple of 32.

Alternatively, you could say that size has units of "lines", so a size of 1 equals 32 bytes.

Same comment for the methods below.

@adamgreig (Member Author):

I believe the architecture actually masks the lower bits of the address to enforce 32-byte alignment, though I can't find a reference for this. We could enforce it ourselves by just masking the address. You'd usually want to use this method like invalidate_dcache_by_address(&mybuf, mybuf.len()), so I think it makes sense to accept non-32-byte-aligned addresses and sizes and just ensure you invalidate at least what was asked for. That behaviour would match the CMSIS version too.

pub struct Cbp {
/// I-cache invalidate all to PoU
pub iciallu: WO<u32>,
reserved0: RW<u32>,
@japaric (Member):

nit: no need to wrap reserved0 in RW; it can just have type u32

@adamgreig (Member Author):

👍

/// I-cache invalidate by MVA to PoU
#[inline(always)]
pub fn icimvau(&self, mva: u32) {
unsafe { self.icimvau.write(mva); }
@japaric (Member):

Is the whole u32 range a valid input here? What happens if I call cbp.icimvau(u32::MAX)?

Some comments about the inputs of the other methods below.

@adamgreig (Member Author):

All u32 values are valid here; it's a "modified virtual address", aka the address of whatever you want to invalidate. Invalidating an address that's not in the cache isn't a problem (since you couldn't have known whether it was cached anyway).

build.rs Outdated
println!("cargo:rustc-cfg=armv7m");
} else if target.starts_with("thumbv7em-") {
println!("cargo:rustc-cfg=armv7m");
println!("cargo:rustc-cfg=armv7em");
@japaric (Member):

I'd prefer not to enable this cfg if we are not using it right now. It can be left commented out until the need for it arises.

@adamgreig (Member Author):

Sure, I'll comment it out.

@homunkulus (Contributor):

☔ The latest upstream changes (presumably 3951582) made this pull request unmergeable. Please resolve the merge conflicts.

@adamgreig (Member Author) commented Jun 12, 2017:

Thank you for the feedback!

I suppose we could merge the two register blocks into a single one and have all methods on that struct.

I think having CPUID available separately is nice because you often want to read things from there separately from using SCB. All of CPUID is read-only except this one CSSELR, which changes what you read from the other CS* registers. We could conceivably move the CS* registers into SCB, but that seems a bit weird too. I'm not sure.

Perhaps we can return a Result::Err if invalidate_dcache is called when the DCache was enabled (assuming there's some way of checking that the DCache is enabled at runtime).
In that case perhaps we can keep that method private until someone specifically requests it

Yes, on reflection I think keeping invalidate_dcache private makes sense. You can still invalidate by MVA (which is what you'd usually be doing anyway) so I don't think it's a problem.

I'll respond to the other comments in their threads and push some commits with changes.

@adamgreig (Member Author):

I think that's all the points addressed (though still a somewhat outstanding question on needing to pass cpuid in, and the breaking change of hiding the registers that don't appear on ARMv6-M).

Let me know if you'd rather have those commits squashed.

@adamgreig (Member Author) commented Jun 13, 2017:

If you hold off on this for a day, I'll add a couple of methods to check if the caches are enabled or not, and make enabling-while-enabled/disabling-while-disabled a no-op.

At the moment, if you called disable_dcache while it was already disabled, you'd cause the (not at all up-to-date) DCache to be cleaned back into main memory, with bad consequences, but there's no convenient way for the user to check whether it's enabled.

The rest of the changes will stay the same so do let me know if you're happy with them.

@adamgreig (Member Author):

Okay, done. That should prevent the more obvious problems and let you just call disable_dcache to ensure it's turned off, without having to care whether it was previously turned on, and vice versa.


let mut addr = addr & 0xFFFF_FFE0;

for _ in 0..(size/LINESIZE) {
@japaric (Member):

hmm, if the address is 32-byte aligned and size is 31, this would be a no-op, which sounds like not what you want. Sounds like this should be ((size - 1) / LINESIZE) + 1 or something like that, but with a check so that size == 0 is a no-op.

@adamgreig (Member Author):

Yes, good point.
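
For reference, a sketch of the fixed loop bound (standalone function; LINESIZE = 32 as in the PR):

const LINESIZE: u32 = 32;

// Number of 32-byte cache lines needed to cover `size` bytes; size == 0 is a no-op.
fn lines_to_cover(size: u32) -> u32 {
    if size == 0 { 0 } else { ((size - 1) / LINESIZE) + 1 }
}

// lines_to_cover(31) == 1, lines_to_cover(32) == 1, lines_to_cover(33) == 2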

/// Invalidates cache starting from the lowest 32-byte aligned address represented by `addr`,
/// in blocks of 32 bytes until at least `size` bytes have been invalidated.
#[inline]
pub fn invalidate_dcache_by_address(&self, addr: u32, size: u32) {
@japaric (Member):

You'd usually want to use this method like invalidate_dcache_by_address(&mybuf, mybuf.len())

In that case these should probably be addr: usize and size: usize. That should require fewer casts. Maybe even make addr: *const T (or directly accept a single argument: buf: &[T]), but I'm not sure about that one.

@adamgreig (Member Author):

Hmm, I wonder about this. How about keeping the current functions (though changing to usize), and then adding invalidate_slice(slice: &[T]) and invalidate(&T) that use core::mem::size_of to get the size?
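
Roughly along these lines (free functions purely for illustration; they assume the usize-based by-address method proposed above and use core::mem::size_of_val for the slice case):

use core::mem;

// Derive (addr, size) from a slice or a single value, then forward to the
// proposed by-address invalidate.
fn invalidate_dcache_slice<T>(scb: &Scb, slice: &[T]) {
    scb.invalidate_dcache_by_address(slice.as_ptr() as usize, mem::size_of_val(slice));
}

fn invalidate_dcache_ref<T>(scb: &Scb, val: &T) {
    scb.invalidate_dcache_by_address(val as *const T as usize, mem::size_of::<T>());
}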

/// * `level`: the required cache level minus 1, e.g. 0 for L1, 1 for L2
/// * `ind`: select instruction cache or data/unified cache
pub fn select_cache(&self, level: u8, ind: CsselrCacheType) {
assert!(level<8);
@japaric (Member):

Thinking about this a bit more: maybe the assert is not necessary given the mask below? We do it like that in the svd2rust API: all inputs get masked to make them valid and we don't have assertions to check their range. However, the svd2rust documentation indicates that masking is going on. If we drop the assertion here, then we should add a note about level getting masked to always be < 8.

@adamgreig (Member Author):

👍

pub fn select_cache(&self, level: u8, ind: CsselrCacheType) {
assert!(level<8);
unsafe { self.csselr.write(
(((level as u32) << CSSELR_LEVEL_POS) & CSSELR_LEVEL_MASK) |
@japaric (Member):

just a comment: I prefer to mask the value before shifting it to the left as that avoids overflows (which turn into panics in debug mode). Now that may or may not occur in this case; I'm just pointing out this problem in the general case.

@adamgreig (Member Author):

Hmm, sure. It can't happen here because level is a u8, but it could happen for way in the CBP methods. I would rather keep the _MASK constants as shifted, but we can just shift them down again before masking and it'll compile to the same thing.
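
A sketch of that shift-the-mask-down approach (standalone function; the constant values follow the CSSELR layout, with Level in bits [3:1]):

const CSSELR_LEVEL_POS: u32 = 1;
const CSSELR_LEVEL_MASK: u32 = 0x7 << 1;

// Mask the value within its own range first, then shift it into position, so
// an out-of-range input can never overflow during the shift.
fn csselr_level_bits(level: u8) -> u32 {
    ((level as u32) & (CSSELR_LEVEL_MASK >> CSSELR_LEVEL_POS)) << CSSELR_LEVEL_POS
}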


/// D-cache clean by set-way
#[inline(always)]
pub fn dccsw(&self, set: u16, way: u16) {
@japaric (Member):

Perhaps it would be worthwhile to note the valid input range of set and way, and that anything outside that range will be masked back into the valid range.

@adamgreig (Member Author):

👍

@japaric (Member) commented Jun 14, 2017:

Thanks for the changes, @adamgreig. I left some more comments about things I didn't notice in the first review, sorry :-).

This prevents these changes being breaking changes as these fields used to be
exposed. To be re-added in a future breaking release.
@adamgreig force-pushed the cache_control branch 2 times, most recently from 4e3ddbb to 73ebe63, on June 14, 2017 at 23:06
@adamgreig (Member Author):

Thanks! This definitely started life as a very C-like API and I think it will be much nicer as a Rusty one. I've updated everything except the (now marked as outdated) thread about the higher level functions:

Hmm, I wonder about this. How about keeping the current functions (though changing to usize), and then adding invalidate_slice(slice: &[T]) and invalidate(&T) that use core::mem::size_of to get the size?

@japaric (Member) commented Jun 15, 2017:

I've updated everything except the (now marked as outdated) thread about the higher level functions:

Yeah, I think we can go ahead and land the usize, usize version now and then revisit later whether the convenience functions for &T and &[T] make sense. Could you open an RFC issue about that?

This definitely started life as a very C-like API and I think it will be much nicer as a Rusty one

👍

Thanks again for working on this. Let's see what the bot has to say.

@homunkulus r+

@homunkulus (Contributor):

📌 Commit 8f42e8f has been approved by japaric

@homunkulus (Contributor):

⌛ Testing commit 8f42e8f with merge 8f42e8f...

@homunkulus (Contributor):

💔 Test failed - status-travis

@adamgreig (Member Author):

Aw. You can't take the cfg flag off all those other constants, because then they're unused on the other architectures.

Could:

  • Put individual cfg on each constant. Messy.
  • Put them in a flagged module as you suggest. I'm not sure about the name sas as it's a bit.. weird? Perhaps a module per peripheral for architecture-specific constants.
  • Put them as consts inside each method that actually uses them (a minimal sketch follows below). Many of these constants are only used by a single method.
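
A minimal illustration of option 3 (hypothetical free function; the constant names are the ones from the existing code):

// Constants used only by one cfg-gated function live inside it, so nothing is
// left unused on architectures where the gate is off.
#[cfg(armv7m)]
fn csselr_value(level: u32, ind: u32) -> u32 {
    const CSSELR_IND_POS: u32 = 0;
    const CSSELR_LEVEL_POS: u32 = 1;
    (level << CSSELR_LEVEL_POS) | (ind << CSSELR_IND_POS)
}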

@japaric (Member) commented Jun 15, 2017:

Or a mixture of options 3 and 2.

I'm not sure about the name sas as it's a bit.. weird?

It was just an example; you could use mod armv7m or something like that.

@adamgreig (Member Author):

How does this look? Now builds for me on all architectures.

@japaric (Member) commented Jun 15, 2017:

LGTM

@homunkulus r+

@homunkulus (Contributor):

📌 Commit 0947e74 has been approved by japaric

@homunkulus (Contributor):

⌛ Testing commit 0947e74 with merge 0947e74...

@homunkulus (Contributor):

☀️ Test successful - status-travis
Approved by: japaric
Pushing 0947e74 to master...

@homunkulus merged commit 0947e74 into rust-embedded:master on Jun 15, 2017
@adamgreig deleted the cache_control branch on June 15, 2017 at 03:21
@adamgreig (Member Author):

🎉 thanks for the review!

adamgreig pushed a commit that referenced this pull request Jan 12, 2022
drop the /DISCARD/ section