Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<atomic>: Consider using __atomic_load_n/__atomic_store_n for Clang #3659

Open
StephanTLavavej opened this issue Apr 18, 2023 · 6 comments
Open
Labels
performance Must go faster

Comments

@StephanTLavavej
Copy link
Member

In llvm/llvm-project#62103 (comment) , @efriedma-quic asked us to consider an alternative to the ARM64 __load_acquire/__stlr intrinsics we've started using for MSVC (see #3399 and #3651):

If possible, I'd strongly prefer if you could change the MS STL to use __atomic_load_n/__atomic_store_n. It's not clear to me what the semantics of the target-specific intrinsics are supposed to be, and LLVM optimizations already know how to optimize the existing atomic intrinsics.

(ldapr is part of armv8.3, so I assume MSVC won't generate it unless you pass flags that indicate the target supports it.)

On the one hand, this would mean maintaining dedicated codepaths for Clang. On the other hand, getting compilers to do our work for us is awesome.

@StephanTLavavej StephanTLavavej added performance Must go faster decision needed We need to choose something before working on this labels Apr 18, 2023
@StephanTLavavej
Copy link
Member Author

@barcharcraz says "not now / not yet" for using such intrinsics for Clang, as they would need to match MSVC's atomics ABI, and we need to fully understand any ABI compatibility issues that could arise from use of the Clang intrinsics. We don't want to break ABI by accident. It would be possible to accept such changes if any ABI impact were thoroughly reviewed (as it was when we changed to the __load_acquire/__stlr intrinsics). For example, if __atomic_load_n/__atomic_store_n use a locktable, that is inherently forever super not compatible with what we've been doing in the STL.

@StephanTLavavej StephanTLavavej removed the decision needed We need to choose something before working on this label Apr 19, 2023
@bharsaklemukesh975
Copy link

Can I work on this?

@efriedma-quic
Copy link

efriedma-quic commented Apr 24, 2023

In clang 16 and newer, clang builtins should be compatible with the sequences MSVC emits, at least for sizes 1/2/4/8. (We did recently fix an ABI issue; see https://reviews.llvm.org/D141748 .)

@StephanTLavavej
Copy link
Member Author

@bharsaklemukesh975 - go ahead, AFAIK nobody else is.

@bharsaklemukesh975
Copy link

Thanks, @StephanTLavavej ! Can you please also suggest steps to fix this issue, this is my first issue, Thanks!

@StephanTLavavej
Copy link
Member Author

Ah, this is a pretty advanced issue, involving compiler intrinsics. I definitely wouldn't have labeled it as "good first issue". Most of our issues aren't simple to resolve, or they would have been done already.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

No branches or pull requests

3 participants