-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Atomic CAS with pad (#23, P0528R3) #1029
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Now the test pass, I also improved a bit coverage by using more interesting bit pattern, and filed LLVM-46685. It is ready for review. |
|
I use |
StephanTLavavej
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last batch of questions, then I think this will be ready - thanks!
Co-authored-by: Stephan T. Lavavej <stl@nuwen.net>
Co-authored-by: Stephan T. Lavavej <stl@nuwen.net>
BillyONeal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is ready to merge once the default constructed _Ty issue is resolved.
| // Padding bits should not participate in cmpxchg comparison starting in C++20. | ||
| // Clang does not have __builtin_zero_non_value_bits to exclude these bits to implement this C++20 feature. | ||
| // The EDG front-end substitutes everything and runs into incomplete types passed to atomic<T>. | ||
| #if _HAS_CXX20 && !defined(__clang__) /* TRANSITION, LLVM-46685 */ && !defined(__EDG__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Sure that's not a normal use case problem, but neither is this whole feature)
Indeed.
I'm OK with leaving it '20 only.
…orage_for formerly of <execution>.
BillyONeal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This passed in the DevDiv system.
stl/inc/atomic
Outdated
| __builtin_zero_non_value_bits(_Ptr()); | ||
| } | ||
|
|
||
| _Ty& _Ref() noexcept { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pre-existing:
| _Ty& _Ref() noexcept { | |
| _NODISCARD _Ty& _Ref() noexcept { |
stl/inc/atomic
Outdated
| return reinterpret_cast<_Ty&>(_Storage); | ||
| } | ||
|
|
||
| _Ty* _Ptr() noexcept { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this function is newly added, let's make it nodiscard:
| _Ty* _Ptr() noexcept { | |
| _NODISCARD _Ty* _Ptr() noexcept { |
stl/inc/atomic
Outdated
| } | ||
|
|
||
| _Ty* _Ptr() noexcept { | ||
| return reinterpret_cast<_Ty*>(_STD addressof(_Storage)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_Storage is unsigned char[N] so it's immune to operator& evil:
| return reinterpret_cast<_Ty*>(_STD addressof(_Storage)); | |
| return reinterpret_cast<_Ty*>(_Storage); |
stl/inc/atomic
Outdated
| _STD_BEGIN | ||
| // STRUCT TEMPLATE _Storage_for | ||
| struct _Form_mask_t {}; | ||
| _INLINE_VAR constexpr _Form_mask_t _Form_mask; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C++14 had an annoying rule about default-init of const objects. That was fixed in C++17, but (1) even if it's not causing compiler errors in the tests, I'm worried about our compiler support matrix and (2) there are 0 occurrences of _INLINE_VAR constexpr \w+ \w+; in the STL, but 6 of _INLINE_VAR constexpr \w+ \w+\{\}; (namely adopt_lock, defer_lock, try_to_lock, ignore, piecewise_construct, and allocator_arg). There are an additional 5 occurrences of inline constexpr \w+ \w+\{\}; (all in C++17+ code) where the braces aren't necessary, but provided. There are 5 occurrences of inline constexpr \w+ \w+; all in ranges (ranges::cbegin etc.) where we omit the braces because it's C++20.
| _INLINE_VAR constexpr _Form_mask_t _Form_mask; | |
| _INLINE_VAR constexpr _Form_mask_t _Form_mask{}; |
stl/inc/atomic
Outdated
| template <class _Ty> | ||
| inline constexpr bool _Cmpxchg_has_padding_bits_v = | ||
| !has_unique_object_representations_v<_Ty> && !is_floating_point_v<_Ty>; | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing #endif comment.
stl/inc/atomic
Outdated
| break; | ||
| if constexpr (_Cmpxchg_has_padding_bits_v<_Ty>) { | ||
| _Storage_for<_Ty> _Mask{_Form_mask}; | ||
| const long long _Mask_val = _Atomic_reinterpret_as<long long>(_Mask._Ref()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
long long here appears to be incorrect, since this is for 1-byte _Ty. (Indeed, 2-byte below uses short.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦♂️ Sorry for murdering your poor boy @AlexGuteniev
| }; | ||
| #pragma warning(pop) | ||
|
|
||
| #pragma pack(push) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is packing being pushed/popped here, when it's apparently not being altered?
|
|
||
| void operator&() const = delete; | ||
|
|
||
| void set(char v) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: Some, but not all, of the set() functions are being changed to take their parameters by const value.
| template <class X, std::size_t S> | ||
| void test() { | ||
| static_assert(sizeof(X) == S, "Unexpected size"); | ||
| static_assert(!std::has_unique_object_representations_v<X>, "No padding type"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test should directly include <type_traits> for std::has_unique_object_representations_v.
| template <class X, std::size_t S> | ||
| void test() { | ||
| static_assert(sizeof(X) == S, "Unexpected size"); | ||
| static_assert(!std::has_unique_object_representations_v<X>, "No padding type"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this static_assert message hard to understand (it is actually less clear than the condition being tested). Something like "Expected type to contain padding" would make sense.
CaseyCarter
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find anything Stephan missed; LGTM after fixing his comments.
Co-authored-by: Casey Carter <cartec69@gmail.com>
|
@BillyONeal, have you noticed a comment where I am proposing to get back to the almost original design except handling atomic_ref differently - clear pading with atomic and after first fail? |
That looks OK to me (although since this is ~ready to land and atomic_ref is not yet ready to land I would prefer this land first) |
|
Ok, let's have this, and I'll propose another option with another PR after that (and by having PR on top of this, it would be easier to evaluate just the difference) |
Co-authored-by: Stephan T. Lavavej <stl@nuwen.net>
|
@AlexGuteniev I guess I misunderstood then? Since it seems your proposal there only affects atomic_ref, not atomic. |
The proposal
Unfortunately, yes, this proposal is:
The good side is avoiding inner CAS loop while still keeping |
|
@AlexGuteniev: If it's more complex than the current plan of record and is only optimizing for the exceedingly rare |
|
Thanks for your contribution! 🎉 |
Yes, for the simplicity, the current one is the best bet I think. |
|
And thanks for landing your first feature :D |
|
And thank you for your patience :) |
Co-authored-by: Stephan T. Lavavej <stl@nuwen.net> Co-authored-by: Billy Robert O'Neal III <bion@microsoft.com> Co-authored-by: Casey Carter <cartec69@gmail.com>
__builtin_zero_non_value_bitsincompare_exchange_strong:atomic_refstoreandexchangewhich is sub-optiomal