[BEAM] Critical sections #114

Closed

Conversation

@japaric (Member) commented on Apr 14, 2019:

Please read #111 first, if you haven't already.

Rendered discussion document

A review thread on the following passage of the discussion document:

> being reordered to outside the critical section but AFAIK they don't tell the
> compiler that `X` may change outside the critical section -- could the program
> cache the value of `X` on the stack? That would change the semantics of the
> program.

@Amanieu (Member) replied:

Actually that is the whole point of a compiler fence: it tells the compiler that other threads may have modified shared data (i.e. globals + any locals whose address has been made available globally) and that it should reload those at the point of the fence.
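
For illustration, an assumed minimal sketch of that reload effect (the names `FLAG` and `sample_twice` are made up; this pretends `FLAG` is also written by an interrupt handler):

```rust
use core::sync::atomic::{compiler_fence, Ordering};

static mut FLAG: bool = false; // hypothetically also written by an interrupt handler

fn sample_twice() -> (bool, bool) {
    let before = unsafe { FLAG };
    compiler_fence(Ordering::SeqCst); // an interrupt may have fired here
    // the second read cannot be moved above the fence, so it cannot be merged
    // with the first one: FLAG must be reloaded from memory
    let after = unsafe { FLAG };
    (before, after)
}
```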

Another commenter replied:

It also flushes writes made before the fence, which is why the `"memory"` clobber is required on the second `asm!` block (the one that re-enables interrupts).
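
As a hedged sketch of where the two barriers sit in such a critical section (not the actual cortex-m implementation; `cortex_m::interrupt::{disable, enable}` stand in for the two `asm!` blocks, the fences stand in for their `"memory"` clobbers, and `COUNTER` / `with_interrupts_off` are made-up names):

```rust
use core::sync::atomic::{compiler_fence, Ordering};

static mut COUNTER: u32 = 0; // hypothetical state shared with an interrupt handler

fn with_interrupts_off() {
    cortex_m::interrupt::disable();   // first asm! block ("cpsid i")
    compiler_fence(Ordering::SeqCst); // keep the access below from floating above the disable

    unsafe { COUNTER += 1 };          // touch the shared state

    compiler_fence(Ordering::SeqCst); // flush the write before interrupts come back
    unsafe { cortex_m::interrupt::enable() }; // second asm! block ("cpsie i")
}
```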

@japaric (Member, author) replied:

@Amanieu is that documented somewhere? That's not what I understood from the
`atomic::compiler_fence` documentation; that doc only talks about preventing
the compiler from reordering memory operations across the fence; it says
nothing about forcing shared / static variables to be reloaded.

I can certainly see the behavior you mention arising from `asm!("" : : : "memory")`,
where the compiler has to assume that the assembly block may modify any memory,
but the text you have commented on is asking about a program that exclusively
uses `atomic::compiler_fence`. Or do you mean to say that I should consider
`asm!("" : : : "memory")` and `atomic::compiler_fence` to be equivalent? IME,
they have observable differences in how they affect optimization.

@comex replied:

If the source code loads a variable twice but the emitted assembly only loads it once, the compiler has effectively reordered the second load up to the same point as the first load.

`atomic::compiler_fence` emits an LLVM fence instruction with syncscope set to singlethread. As mentioned in a different thread, it's also used for C's `atomic_signal_fence`, which is made specifically for the interrupt use case:

> Establishes memory synchronization ordering of non-atomic and relaxed atomic accesses, as instructed by order, between a thread and a signal handler executed on the same thread.

(It says "signal handler", but from the compiler's perspective, a Unix signal handler in userland and an interrupt handler on bare metal have basically the same semantics; they both interrupt execution at an arbitrary instruction and move control flow to somewhere else.)

I think `asm!("" : : : "memory")` should be largely if not entirely equivalent to `atomic::compiler_fence` with either `AcqRel` or `SeqCst` (as opposed to `Acquire` and `Release`, which only create one-sided barriers). I'd be interested to see what code creates differences in assembly output between the two. Since they are different LLVM constructs, it's possible that LLVM's optimizer just happens to generate different code (in particular, it might be overly conservative with `asm!`), but it's also possible that I'm forgetting something.

A contributor replied:

@japaric

> That's not what I understood from the `atomic::compiler_fence` documentation;
> that doc only talks about preventing the compiler from reordering memory
> operations across the fence; it says nothing about forcing shared / static
> variables to be reloaded.

The RFC that added these describes their semantics as equivalent to C11's atomic_signal_fence. The current docs do not make that clear, but I'd argue that this is a documentation bug.

@japaric (Member, author) replied:

@comex

> If the source code loads a variable twice but the emitted assembly only loads
> it once, the compiler has effectively reordered the second load up to the same
> point as the first load.

That clarifies things; thanks.

(Thinking of #112) Is removing a store operation considered a re-ordering?

> I'd be interested to see what code creates differences in assembly output
> between the two

`compiler_fence` prevents memory operations on stack variables from being
merged / reordered. The `"memory"` clobber seems to have no effect on stack
variables. See code below:

```rust
use core::ptr;
use core::sync::atomic::{self, Ordering};

fn SysTick() {
    let mut x = 0;

    atomic::compiler_fence(Ordering::AcqRel);

    x += 1;

    unsafe {
        // prevent LLVM from optimizing away `x`
        ptr::read_volatile(&x);
    }
}
```
```asm
SysTick:
        sub     sp, #4
        movs    r0, #0
        str     r0, [sp] ; x = 0
        ldr     r0, [sp]
        adds    r0, #1
        str     r0, [sp] ; x += 1
        ldr     r0, [sp] ; read_volatile(&x)
        add     sp, #4
        bx      lr
```

vs

```rust
// 2019-era `asm!` syntax; requires `#![feature(asm)]` on nightly
use core::ptr;

fn SysTick() {
    let mut x = 0;

    unsafe { asm!("" : : : "memory" : "volatile") }

    x += 1;

    unsafe {
        // prevent LLVM from optimizing away `x`
        ptr::read_volatile(&x);
    }
}
```
```asm
SysTick:
        sub     sp, #4
        movs    r0, #1
        str     r0, [sp] ; x = 1
        ldr     r0, [sp] ; read_volatile(&x)
        add     sp, #4
        bx      lr
```

If you replace the stack variable with a static variable then both the compiler
fence and the "memory" clobber prevent merging / re-ordering of memory
operations on the static variable.
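
A sketch of that static-variable variant (assumed, not taken from the original comment; `X` is a made-up `static mut` presumed to also be written by an interrupt handler):

```rust
use core::sync::atomic::{self, Ordering};

static mut X: u32 = 0; // hypothetically also written by an interrupt handler

fn SysTick() {
    unsafe { X = 0 };

    // or: asm!("" : : : "memory")
    atomic::compiler_fence(Ordering::AcqRel);

    // observed behavior per the comment above: X is reloaded after the
    // barrier and neither store is merged away
    unsafe { X += 1 };
}
```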

A member replied:

> (Thinking of #112) Is removing a store operation considered a re-ordering?

I'm not sure what you are asking. Removing a store is not itself a reordering, but once you have reordered enough to get two adjacent stores to the same location, you can obviously remove the first one.
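
For instance, the end state looks roughly like this (an assumed illustration; `Y` is a made-up `static mut`):

```rust
static mut Y: u32 = 0;

fn example() {
    unsafe {
        // two stores with nothing between them: the first one is dead and,
        // absent a fence or clobber, the compiler may drop it
        Y = 1;
        Y = 2;
    }
}
```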

A member replied:

> `compiler_fence` prevents memory operations on stack variables from being
> merged / reordered. The `"memory"` clobber seems to have no effect on stack
> variables. See code below:

To me this looks like an arbitrary LLVM limitation and not a fundamental difference. All clobbers and orderings aside, the compiler can in both cases assume that `x` is unobserved by the outside world because its address is not leaked to anything. So IMO in both cases it would be legal to optimize away all the writes and just leave the volatile read.


Another review thread, on this passage of the document:

> - I have observed that an `asm!("")` expression with *no* clobbers prevents
>   operations on `static mut` variables from being merged and reordered. See
>   example below. Is this intended behavior?
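
A hedged reconstruction of the kind of example the quoted passage refers to (the original snippet is not reproduced in this excerpt; the identifiers and the 2019-era `asm!` syntax are assumptions):

```rust
#![feature(asm)] // 2019-era inline assembly (nightly)

static mut X: u32 = 0; // hypothetical shared variable

fn SysTick() {
    unsafe {
        X = 1;
        asm!(""); // empty asm! with *no* clobbers
        X = 2;    // observed: the `X = 1` store above is still emitted
    }
}
```
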
A member replied:

I don't think that this is intended. LLVM may apply stronger constraints than necessary on asm, but that is not something that we should rely on.

@japaric (Member, author) replied:

> that is not something that we should rely on

Rather than rely on this, I would actually like instructions like `asm!("NOP" :::: "volatile")` to *not* prevent these optimizations.

(At some point I thought that the sideeffect attribute attached to the call asm in the LLVM IR was preventing the optimizations, but after double-checking today, by running opt on the IR with the sideeffect attribute removed, I see that it doesn't help.)

@comex commented on Apr 16, 2019:

That's probably related to this issue: LLVM is weird and expects the memory clobber to be partly handled by the frontend, by attaching readnone/readonly to the call asm instructions; Clang does this but rustc doesn't, so in effect, every Rust asm! statement is treated as if it had the memory clobber.

But also note that Clang (but not GCC) treats volatile as implying a memory clobber, and both Clang and GCC treat any asm statement with no output operands as implicitly volatile. Whether Rust asm! does the same is... not fully defined, I guess; rustc does mark asm! without output operands as sideeffect, but the current implementation doesn't have an opinion on whether volatile should imply clobbering memory, since it treats everything as clobbering memory. :p

Another review thread, on this passage of the document:

> I wish we had an `atomic::compiler_fence_val(&x, ORDERING)` function that
> prevented only memory operations on `x` from being reordered -- though I don't
> know if LLVM supports that kind of fine-grained compiler fence -- that would
> result in better optimizations.
A contributor replied:

If `x` were a pointer here, that would not prevent reorderings of memory reachable through `x`, only of `x` itself. Would that be OK?

@RalfJung (Member) commented:

Closing due to inactivity, see #169 for tracking.

@RalfJung closed this on Jul 18, 2019.