-
-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement a sync
API for working with global mutable state.
#107
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, as much of it as I understood.
src/sync.rs
Outdated
/// Marks that a given pointer is read by volatile means without actually | ||
/// reading it. | ||
#[inline(always)] | ||
pub fn volatile_mark_ro<T>(val: *const T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how this is doing what it claims to be doing. I'd like more of an explanation in the rustdoc comment here. Do I just need to go read the inline assembly docs? If so you can just say that and give the link to those. If the explanation is something else we should write that out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm kinda very suspicious that this even works myself.
The documentation for asm!
states that "by default, an inline assembly block is treated the same way as an external FFI function call with a custom calling convention: it may read/write memory, have observable side effects, etc." This should work well enough, since an external FFI call is about the right thing for representing e.g. what the DMA unit does. (i.e. reads from the input *const
and writes out to the *mut
pointer -- if in a concurrent way that's hard to represent in Rust's type system.)
The thing that makes me doubt myself a lot is that this code is basically the same as std::hint::black_box
, except it uses the asm!
macro instead of llvm_asm!
.
... which is concerning because the documentation for that function says "black_box is only (and can only be) provided on a 'best-effort' basis. The extent to which it can block optimisations may vary depending upon the platform and code-gen backend used. Programs cannot rely on black_box for correctness in any way.". I'm not sure if that's there for future-proofing or if there's a known way that LLVM can optimize across black_box
in an unwanted way already.
We could always just use compiler_fence
instead of having these functions, honestly, which will work for sure at the cost of emitting calls to the empty __sync_synchronize
function until that bug is fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this fixes the bug in the hello_world
example in practice, but I don't know if it's guaranteed to always do so in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Loka asked me to take a look. I know nothing about embedded or GBA, but have written a lot of concurrent and lock-free code. My comments are mostly under the assumption that the concerns between sharing data between interrupts and normal code are basically the same as with multithreaded, although I know in practice they are somewhat less. The fence I'm referring to is compiler_fence, not a real fence, since I know you can't use one of those.
* Added needed compiler fences to `InitOnce::try_get`. * Change the error in `RawMutex::raw_unlock` to better reflect the cause.
What's left to do in this PR? |
I think we're good. @Lymia i'll merge if you're done here. |
I can't think of anything else I want to do here. Should be good to merge. |
This PR is an alternative to #103 that tries to maintain correctness for all types (by disabling IRQs and doing a plain copy) and use inline assembly to provide fast paths for small (and aligned) types. This avoids the need for separate types or any runtime checks, as the only problem if you use a type that doesn't work with the fast path is that... well. You use a slow path that can cause the audio to skip. (But that isn't memory unsafely.)
The PR also adds some helper types based on
Static
for some common other uses of global variables:RawMutex
andMutex
provide a way to prevent an IRQ and normal code from running the same thing at once when this would cause safety issues.InitOnce
implements something pretty similar tolazy_static!
, allowing you to run some computation only once, after which the result is cached.Implementation notes:
Static::write
andStatic::read
useLDR
/STR
(and their half-word and byte variants) for small types, and useSTMIA
/LDMIA
to save/load the entire contents of theStatic
without a chance for an IRQ to interrupt it for larger types.Static::replace
uses theSWP
andSWPB
instructions for its fast path. It works for any type that is 4-byte aligned and less than 4 bytes, or any type that is only one byte.InitOnce
andRawMutex
are both based onStatic::swap
, allowing them to atomically set and check status bytes, even in the presence of IRQs.compiler_fence
entirely. As such, I've added a stub for the__sync_synchronize
function. This causes an unneeded function call to an empty function every time a compiler fence is used, but this is a minimal cost. It doesn't need to actually do anything, because there's no actual memory barriers needed on GBA.Finally, this fixes a few minor bugs that are a bit too small to put in a separate PR:
hello_world
example that was causing it to panic in debug builds.hello_world
example, theMode3::dma_clear_to
was actually copying uninitialized memory into VRAM instead of the intended color value. It only seemed to work before because the value happened to end up in zeroed memory.)