[BEAM] Critical sections #114
> being reordered to outside the critical section but AFAIK they don't tell the compiler that `X` may change outside the critical section -- could the program cache the value of `X` on the stack? That would change the semantics of the program.
Actually that is the whole point of a compiler fence: it tells the compiler that other threads may have modified shared data (i.e. globals + any locals whose address has been made available globally) and that it should reload those at the point of the fence.
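A minimal sketch of that effect, assuming a `SHARED` static that an interrupt handler elsewhere may write (the names and setup are illustrative, not from this thread):

```rust
use core::sync::atomic::{compiler_fence, Ordering};

static mut SHARED: u32 = 0;

fn poll_twice() -> (u32, u32) {
    unsafe {
        let before = SHARED;              // may come from a cached register
        compiler_fence(Ordering::SeqCst); // an interrupt may have run here...
        let after = SHARED;               // ...so this read must reload from memory
        (before, after)
    }
}
```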
It also flushes writes made before the fence, which is why the `memory` clobber is required on the second `asm!` block (the one that re-enables interrupts).
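For context, the pattern under discussion looks roughly like this: a sketch assuming Cortex-M (`cpsid i`/`cpsie i` disable and enable interrupts) and written in the pre-2021 `asm!` syntax used elsewhere in this thread:

```rust
// sketch only: illustrative names, not code from the RFC
static mut COUNTER: u32 = 0;

unsafe fn increment_in_critical_section() {
    // disable interrupts; the "memory" clobber keeps earlier accesses from sinking below it
    asm!("cpsid i" : : : "memory" : "volatile");
    COUNTER += 1;
    // re-enable interrupts; the "memory" clobber here flushes the write above first
    asm!("cpsie i" : : : "memory" : "volatile");
}
```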
@Amanieu is that documented somewhere? That's not what I understood from the `atomic::compiler_fence` documentation; that doc only talks about preventing the compiler from reordering memory operations across the fence; it says nothing about forcing shared / static variables to be reloaded.

I can certainly see the behavior you mention arising from `asm!("" : : : "memory")`, where the compiler has to assume that the assembly block may modify any memory, but the text you have commented on is asking about the program that exclusively uses `atomic::compiler_fence`. Or do you mean to say that I should consider `asm!("" : : : "memory")` and `atomic::compiler_fence` to be equivalent? IME, they do have observable differences on optimizations.
If the source code loads a variable twice but the emitted assembly only loads it once, the compiler has effectively reordered the second load up to the same point as the first load.

`atomic::compiler_fence` emits an LLVM `fence` instruction with `syncscope` set to `singlethread`. As mentioned in a different thread, it's also used for C's `atomic_signal_fence`, which is made specifically for the interrupts use case:

(It says "signal handler", but from the compiler's perspective, a Unix signal handler in userland and an interrupt handler on bare metal have basically the same semantics; they both interrupt execution at an arbitrary instruction and move control flow to somewhere else.)

I think `asm!("" : : : "memory")` should be largely if not entirely equivalent to `atomic::compiler_fence` with either `AcqRel` or `SeqCst` (as opposed to `Acquire` and `Release`, which only create one-sided barriers). I'd be interested to see what code creates differences in assembly output between the two. Since they are different LLVM constructs, it's possible that LLVM's optimizer could just coincidentally generate different code (in particular, it might be being overly conservative with `asm!`), but it's also possible that I'm forgetting something.
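For illustration, the signal/interrupt use case mentioned above typically looks something like this (a sketch, not code from the thread; the one-sided `Release` and `Acquire` fences suffice here because each side only needs ordering in one direction):

```rust
use core::sync::atomic::{compiler_fence, AtomicBool, AtomicU32, Ordering};

static DATA: AtomicU32 = AtomicU32::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn produce() {
    DATA.store(42, Ordering::Relaxed);
    // Release-style compiler fence: keeps the DATA store from sinking below the flag store
    compiler_fence(Ordering::Release);
    READY.store(true, Ordering::Relaxed);
}

// runs as an interrupt / signal handler on the same core
fn consume() {
    if READY.load(Ordering::Relaxed) {
        // Acquire-style compiler fence: keeps the DATA load from hoisting above the flag load
        compiler_fence(Ordering::Acquire);
        let _data = DATA.load(Ordering::Relaxed);
    }
}
```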
> That's not what I understood from the `atomic::compiler_fence` documentation; that doc only talks about preventing the compiler from reordering memory operations across the fence; it says nothing about forcing shared / static variables to be reloaded.

The RFC that added these describes their semantics as equivalent to C11's `atomic_signal_fence`. The current docs do not make that clear, but I'd argue that this is a documentation bug.
> If the source code loads a variable twice but the emitted assembly only loads it once, the compiler has effectively reordered the second load up to the same point as the first load.

That clarifies things; thanks.

(Thinking of #112) Is removing a store operation considered a re-ordering?

> I'd be interested to see what code creates differences in assembly output between the two

`compiler_fence` prevents memory operations on stack variables from being merged / reordered. The "memory" clobber seems to have no effect on stack variables. See the code below:
use core::{ptr, sync::atomic::{self, Ordering}};

fn SysTick() {
    let mut x = 0;
    atomic::compiler_fence(Ordering::AcqRel);
    x += 1;
    unsafe {
        // prevent LLVM from optimizing away `x`
        ptr::read_volatile(&x);
    }
}
SysTick:
    sub sp, #4
    movs r0, #0
    str r0, [sp]   ; x = 0
    ldr r0, [sp]
    adds r0, #1
    str r0, [sp]   ; x += 1
    ldr r0, [sp]   ; read_volatile(&x)
    add sp, #4
    bx lr
vs
use core::ptr;

fn SysTick() {
    let mut x = 0;
    unsafe { asm!("" : : : "memory" : "volatile") }
    x += 1;
    unsafe {
        // prevent LLVM from optimizing away `x`
        ptr::read_volatile(&x);
    }
}
SysTick:
    sub sp, #4
    movs r0, #1
    str r0, [sp]   ; x = 1
    ldr r0, [sp]   ; read_volatile(&x)
    add sp, #4
    bx lr
If you replace the stack variable with a static variable then both the compiler
fence and the "memory" clobber prevent merging / re-ordering of memory
operations on the static variable.
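Concretely, the static-variable variant being described might look like this (a sketch; `X` is an illustrative name):

```rust
use core::sync::atomic::{self, Ordering};

static mut X: u32 = 0;

fn SysTick() {
    unsafe {
        X = 0;
        atomic::compiler_fence(Ordering::AcqRel); // or the asm! "memory"-clobber variant
        // with a static, the store above and the read-modify-write below must both be
        // emitted; they cannot be merged into a single `X = 1`
        X += 1;
    }
}
```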
> (Thinking of #112) Is removing a store operation considered a re-ordering?

Not sure what you are asking? Strictly speaking it is not, but once you have reordered enough to get two adjacent stores you can obviously remove the first.
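A tiny illustration of that argument (illustrative names, not the code from #112):

```rust
static mut FLAG: u32 = 0;

unsafe fn example(work: u32) {
    FLAG = 1;              // with nothing ordering it, this store can sink past the
    let result = work * 2; // unrelated computation until it sits next to the store below,
    FLAG = result;         // at which point the first store is dead and gets removed
}
```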
> compiler_fence prevents memory operations on stack variables from being merged / reordered. The "memory" clobber seems to have no effect on stack variables. See code below:

To me this looks like an arbitrary LLVM limitation and not a fundamental difference. All clobbers and orderings aside, the compiler can in both cases assume that `x` is unobserved by the outside world because its address is not leaked to anything. So IMO in both cases it would be legal to optimize away all the writes, and just leave the volatile read.
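By the same reasoning, the picture changes once the address of `x` escapes; a sketch of that contrast (the `publish` extern function is an assumption for illustration):

```rust
use core::{ptr, sync::atomic::{self, Ordering}};

extern "C" {
    // hypothetical function that stashes the pointer somewhere globally visible
    fn publish(p: *const u32);
}

fn escaped() {
    let mut x = 0;
    unsafe { publish(&x) };                   // the address of `x` is now observable, so...
    atomic::compiler_fence(Ordering::AcqRel); // ...the fence must assume `x` may have changed
    x += 1;                                   // this must reload and store `x` for real
    unsafe {
        ptr::read_volatile(&x); // keep `x` observable, as in the examples above
    }
}
```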
> - I have observed that an `asm!("")` expression with *no* clobbers prevents operations on `static mut` variables from being merged and reordered. See example below. Is this intended behavior?
I don't think that this is intended. LLVM may apply stronger constraints than necessary on asm, but that is not something that we should rely on.
> that is not something that we should rely on

Rather than rely on this, I would actually like to see instructions like `asm!("NOP" :::: "volatile")` not preventing optimizations.

(At some point I thought that the `sideeffect` attribute that's attached to the `call asm` in the LLVM IR was preventing the optimizations, but after double-checking today by running `opt` on the IR with the `sideeffect` attribute removed, I see that that doesn't help.)
That's probably related to this issue: LLVM is weird and expects the memory clobber to be partly handled by the frontend, by attaching `readnone`/`readonly` to the `call asm` instructions; Clang does this but rustc doesn't, so in effect, every Rust `asm!` statement is treated as if it had the `memory` clobber.

But also note that Clang (but not GCC) treats `volatile` as implying a `memory` clobber, and both Clang and GCC treat any `asm` statement with no output operands as implicitly `volatile`. Whether Rust `asm!` does the same is... not fully defined, I guess; rustc does mark `asm!` without output operands as `sideeffect`, but the current implementation doesn't have an opinion on whether `volatile` should imply clobbering memory, since it treats everything as clobbering memory. :p
> I wish we had an `atomic::compiler_fence_val(&x, ORDERING)` function that prevented only memory operations on `x` from being reordered -- though I don't know if LLVM supports that kind of fine-grained compiler fence -- that would result in better optimizations.
If `x` were to be a pointer here, that would not prevent re-orderings on memory that can be reached from `x`, only on `x` itself. Would that be OK?
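Purely as a sketch of the wished-for shape, not a real API (the stub below just falls back to a whole-memory `compiler_fence`; and as noted above, a fence scoped to `x` would not cover memory reachable *through* `x`):

```rust
use core::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical per-object compiler fence: only operations on `*val` would be
/// prevented from moving across it, not operations on memory reachable from `*val`.
fn compiler_fence_val<T>(_val: *const T, order: Ordering) {
    // stand-in only; a real implementation would need dedicated compiler support
    compiler_fence(order);
}

fn demo(p: &*const u32) {
    // fencing `p` itself would not cover the memory that `*p` points to
    compiler_fence_val(p as *const *const u32, Ordering::SeqCst);
}
```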
Closing due to inactivity, see #169 for tracking.
Please read #111 first, if you haven't already.
Rendered discussion document