`vector_algorithms.cpp`: `minmax` for 64-bit elements: replace ugly x86 workaround with a nice one #4661
This piece, `STL/stl/src/vector_algorithms.cpp` lines 1116 to 1123 in 8dc4faa, works around the oddity of not having `_mm_cvtsi128_si64` on 32-bit x86.

It has been problematic: see "`min/max/minmax_element` for 64-bit types on x86" (#2821).

I have discovered a nicer workaround!
If we spill the register to the stack, the compiler optimizes the spill away.
On 32-bit x86 with at least `/arch:SSE2`, it even produces better code than the existing workaround.

Demo: https://godbolt.org/z/ErGWz8GYT
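
For readers who don't want to open the demo, here is a minimal sketch of the spill idea. It is not the code from this PR. The helper name `low_lane_via_spill` is hypothetical, and the sketch assumes the goal is to read the low 64-bit lane of an `__m128i` using only SSE2 intrinsics that are available on both x86 and x64.

```cpp
#include <emmintrin.h> // SSE2 intrinsics: __m128i, _mm_storeu_si128

#include <cstdint>

// Hypothetical helper (name is illustrative, not from the STL sources).
// Reads the low 64-bit lane of a vector without _mm_cvtsi128_si64,
// which 32-bit x86 targets do not provide.
inline std::int64_t low_lane_via_spill(const __m128i vec) noexcept {
    // "Spill" the whole register to a local array on the stack.
    std::int64_t lanes[2];
    _mm_storeu_si128(reinterpret_cast<__m128i*>(lanes), vec);

    // Element 0 is the low lane. An optimizing compiler may turn the
    // store followed by this load into a direct register move.
    return lanes[0];
}
```

Whether the store actually disappears depends on the compiler and target options, so the generated assembly is worth checking, as in the demo link.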
It still performs the actual spill with `/arch:IA32`. But given that this path is executed only once per function call (there are no intermediate reductions for 64-bit elements), and there's a plan to lift the baseline to `/arch:SSE2`, I think that's fine.