Arm64: Memory barrier improvements #62895

kunalspathak · 2021-12-16T07:49:29Z

Generate store barriers wherever possible. Currently, we generate full barriers for stores.
Generate one-way barriers for volatile variables, which speeds up accesses to variables declared volatile. The gap was observed in "[arm64] Volatile.Read/Write is 2x faster than "volatile" loads/stores" #60232. Thanks @EgorBo for suggesting the solution as well.

ghost · 2021-12-16T07:49:35Z

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	kunalspathak
Assignees:	-
Labels:	`area-CodeGen-coreclr`
Milestone:	-

alexrp · 2021-12-16T08:12:40Z

I haven't reviewed this PR in detail, but I would just advise caution when it comes to memory barriers on ARM64:

Basically, some of the instructions don't quite have the semantics you might expect.

kunalspathak · 2021-12-17T00:57:39Z

I see approx. 0.58% and 0.28% improvements in RPS in the mvc and json benchmarks, respectively. I believe these are within the error range.

kunalspathak · 2021-12-17T00:58:34Z

> I haven't reviewed this PR in detail, but I would just advise caution when it comes to memory barriers on ARM64:

Thanks @alexrp. This PR just extends the existing optimal memory barriers to the volatile keyword, and to one scenario where we can use dmb ishst.

kunalspathak · 2022-01-03T20:32:49Z

@dotnet/jit-contrib

EgorBo · 2022-01-03T20:42:55Z

LGTM, @VSadov could you please take a quick look if you have time
tldr:

volatileVariable = 42;

used to emit a full memory barrier, now it emits a store-only one.

Also, in most cases, stores and loads of variables marked "volatile" now emit no memory barrier at all and instead use, e.g., stlr for a store, which is basically the same as what Volatile.Write emits today; see #60232
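For readers more familiar with C++ atomics, the difference described above can be sketched as follows. This is only an analogue, not the JIT's code: the function names are hypothetical, and the instruction names in the comments are what GCC and Clang typically emit for AArch64, stated here as an assumption rather than a guarantee.

```cpp
#include <atomic>

// Stand-in for a field declared volatile in C# (hypothetical name).
std::atomic<int> volatileVariable{0};

// Roughly the old shape: a full barrier followed by a plain store.
// Assumption: typically lowered to `dmb ish` + `str` on AArch64.
void store_with_full_barrier(int value) {
    std::atomic_thread_fence(std::memory_order_seq_cst);
    volatileVariable.store(value, std::memory_order_relaxed);
}

// Roughly the new shape: a single store-release, no separate barrier.
// Assumption: typically lowered to `stlr` on AArch64.
void store_release(int value) {
    volatileVariable.store(value, std::memory_order_release);
}

// The matching load-acquire.
// Assumption: typically lowered to `ldar` (or `ldapr` where available).
int load_acquire() {
    return volatileVariable.load(std::memory_order_acquire);
}
```

Both store functions publish the same value; the difference is only in how much reordering the hardware is forbidden from doing around the store.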

VSadov · 2022-01-03T21:08:36Z

src/coreclr/jit/codegenarm64.cpp

@@ -3280,7 +3280,7 @@ void CodeGen::genCodeForStoreInd(GenTreeStoreInd* tree)
            else
            {
                // issue a full memory barrier before a volatile StInd
-                instGen_MemoryBarrier();
+                instGen_MemoryBarrier(BARRIER_STORE_ONLY);


What is the typical scenario when this branch is taken? When value is a struct?

As I understand the purpose of this barrier is to have release semantics for storing into volatile variable that does not fit into a single register (otherwise stlr could be used).

I think this needs a full barrier, since unlike stlr, dmb ishst only waits for stores in progress and has no effect on loads.

Basically, ldar can be replaced with `ldr; dmb ishld`, but there is no such equivalence between stlr and `dmb ishst; str`, because ishst is too weak.
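The same asymmetry shows up in how C++ fences are usually compiled, which may help as a cross-check. The sketch below uses hypothetical function names, and the lowering noted in the comments is an assumption about what GCC and Clang typically emit for AArch64: an acquire fence becomes the load-only barrier, while a release fence becomes a full `dmb ish` rather than `dmb ishst`, since a release must also order earlier loads before the store.

```cpp
#include <atomic>

std::atomic<int> flag{0};

// Acquire built from a plain load plus a fence.
// Assumption: typically `ldr` + `dmb ishld` on AArch64.
int load_then_acquire_fence() {
    int value = flag.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    return value;
}

// Release built from a fence plus a plain store.
// Assumption: typically `dmb ish` + `str` on AArch64; a store-only
// barrier would not keep earlier loads from moving past the store.
void release_fence_then_store(int value) {
    std::atomic_thread_fence(std::memory_order_release);
    flag.store(value, std::memory_order_relaxed);
}
```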

I am trying to understand what you stated, going by https://developer.arm.com/documentation/100941/0100/Barriers.

Basically, with ishst, loads can still be reordered around the barrier, so it is weaker than ishld, where loads and stores need to wait until the barrier is complete.

If we change from a full barrier to ishst, a load that should already have completed might get reordered and end up reading the wrong value (the value from before the update). Is my understanding correct?

Yes, loads that appear after ishst, in program order, may speculatively happen ahead of the store.

static volatile int x;
static volatile int y;
static int xx;
static int yy;

---- one thread:
x = 42;
yy = y;

---- another thread:
y = 42;
xx = x;

Can both xx and yy end up 0?
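To make the question concrete, here is a toy, single-threaded enumeration of the two-thread example above. It is not a model of ARM64 hardware or of the .NET memory model, and every name in it is hypothetical; it only shows that the "both zero" outcome hinges on whether a thread's load is allowed to run ahead of its own earlier store.

```cpp
#include <array>
#include <set>
#include <utility>

// Toy model: memory is two ints; each thread runs two operations in order.
struct Op {
    bool is_store;  // true: write 42 to var; false: read var into the thread's result
    int var;        // 0 = x, 1 = y
};

using Thread = std::array<Op, 2>;
using Outcome = std::pair<int, int>;  // (xx, yy)

// Execute one interleaving; schedule[i] names the thread that runs step i.
Outcome run_schedule(const Thread& t0, const Thread& t1,
                     const std::array<int, 4>& schedule) {
    int mem[2] = {0, 0};     // x, y
    int loaded[2] = {0, 0};  // loaded[0] = yy (thread 0), loaded[1] = xx (thread 1)
    int pc[2] = {0, 0};
    for (int tid : schedule) {
        const Op& op = (tid == 0 ? t0 : t1)[pc[tid]++];
        if (op.is_store) {
            mem[op.var] = 42;
        } else {
            loaded[tid] = mem[op.var];
        }
    }
    return Outcome(loaded[1], loaded[0]);
}

// Collect every reachable (xx, yy). When load_may_pass_store is true, each
// thread's load is additionally allowed to run before its own earlier store.
std::set<Outcome> reachable_outcomes(bool load_may_pass_store) {
    const Op st_x{true, 0}, ld_y{false, 1}, st_y{true, 1}, ld_x{false, 0};
    // All six ways to interleave two threads of two operations each.
    const std::array<std::array<int, 4>, 6> schedules = {{
        {{0, 0, 1, 1}}, {{0, 1, 0, 1}}, {{0, 1, 1, 0}},
        {{1, 0, 0, 1}}, {{1, 0, 1, 0}}, {{1, 1, 0, 0}}}};
    const int variants = load_may_pass_store ? 2 : 1;
    std::set<Outcome> seen;
    for (int swap0 = 0; swap0 < variants; ++swap0) {
        for (int swap1 = 0; swap1 < variants; ++swap1) {
            const Thread t0 = swap0 ? Thread{{ld_y, st_x}} : Thread{{st_x, ld_y}};
            const Thread t1 = swap1 ? Thread{{ld_x, st_y}} : Thread{{st_y, ld_x}};
            for (const auto& schedule : schedules) {
                seen.insert(run_schedule(t0, t1, schedule));
            }
        }
    }
    return seen;
}
```

In this model, `reachable_outcomes(false)` contains (42, 0), (0, 42) and (42, 42) but never (0, 0), while `reachable_outcomes(true)` also contains (0, 0).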

Makes sense. I will revert the change related to ishst then.

VSadov

LGTM!

EgorBo · 2022-01-20T16:11:05Z

Improvement on ubuntu-arm64: dotnet/perf-autofiling-issues#2981
and win-arm64: dotnet/perf-autofiling-issues#2977

kunalspathak added 2 commits December 13, 2021 10:28

Use ishst instead of ish

74f24f3

Do not contain address of volatile fields

d2d6d70

dotnet-issue-labeler bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Dec 16, 2021

ghost assigned kunalspathak Dec 16, 2021

Do not contain address only for Arm64

687277b

kunalspathak marked this pull request as ready for review January 3, 2022 20:32

EgorBo approved these changes Jan 3, 2022

VSadov reviewed Jan 3, 2022

Remove ishst

c9267a5

VSadov approved these changes Jan 5, 2022

kunalspathak merged commit 4427c56 into dotnet:main Jan 5, 2022

This was referenced Jan 13, 2022

[Perf] Changes at 1/5/2022 11:37:43 PM dotnet/perf-autofiling-issues#2836

Closed

[Perf] Changes at 1/5/2022 11:37:43 PM dotnet/perf-autofiling-issues#2840

Closed

JulieLeeMSFT mentioned this pull request Jan 25, 2022

What's new in .NET 7 Preview 1 [WIP] dotnet/core#7106

Closed

EgorBo mentioned this pull request Jan 26, 2022

ARM64: Avoid LEA for volatile IND #64354

Merged

EgorBo mentioned this pull request Feb 9, 2022

AddRemoveFromDifferentThreads<string>.ConcurrentStack benchmark hangs on ARM64 #64980

Closed

ghost locked as resolved and limited conversation to collaborators Feb 19, 2022
