Moving all elements out of a Vec generates worse assembly since 1.45 #78918
Comments
@rustbot modify labels: regression-untriaged
Other than Godbolt, I also tested on my machine, running Windows 10. I ran rustc 1.46.0:
sub rsp, 24
mov rax, rcx
mov qword ptr [rcx], 4
xorps xmm0, xmm0
movups xmmword ptr [rcx + 8], xmm0
mov rcx, qword ptr [rcx + 16]
mov qword ptr [rsp + 16], rcx
mov rcx, qword ptr [rax]
mov qword ptr [rsp], rcx
mov rcx, qword ptr [rax + 8]
mov qword ptr [rsp + 8], rcx
mov rcx, qword ptr [rdx + 16]
mov qword ptr [rax + 16], rcx
movups xmm0, xmmword ptr [rdx]
movups xmmword ptr [rax], xmm0
mov rcx, qword ptr [rsp + 16]
mov qword ptr [rdx + 16], rcx
mov rcx, qword ptr [rsp]
mov qword ptr [rdx], rcx
mov rcx, qword ptr [rsp + 8]
mov qword ptr [rdx + 8], rcx
add rsp, 24
ret
rustc 1.44.0:
sub rsp, 24
mov rax, rcx
mov rcx, qword ptr [rdx]
movups xmm0, xmmword ptr [rdx + 8]
movaps xmmword ptr [rsp], xmm0
mov qword ptr [rdx], 4
xorps xmm0, xmm0
movups xmmword ptr [rdx + 8], xmm0
mov qword ptr [rax], rcx
movaps xmm0, xmmword ptr [rsp]
movups xmmword ptr [rax + 8], xmm0
add rsp, 24
ret
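Reading the two listings side by side, the 1.46.0 output appears to write an empty Vec into the return slot, copy it to a stack temporary, and then swap it with the source, while the 1.44.0 output reads the source once and overwrites it with the empty value. A rough Rust-level analogy of the two shapes is sketched below. The function names `take_via_swap` and `take_direct` and the `Vec<i32>` element type are assumptions made for illustration; this is not a claim about what the standard library or the optimizer does internally.

```rust
/// Hypothetical helper: mirrors the "create empty value, then swap" shape
/// that the 1.46.0 listing seems to follow.
fn take_via_swap(v: &mut Vec<i32>) -> Vec<i32> {
    let mut out = Vec::new();
    // A swap conceptually goes through a temporary: three copies in total.
    std::mem::swap(&mut out, v);
    out
}

/// Hypothetical helper: mirrors the "read source, overwrite source" shape
/// that the 1.44.0 listing seems to follow.
fn take_direct(v: &mut Vec<i32>) -> Vec<i32> {
    // SAFETY: `v` points to a valid, initialized Vec. The old value is read
    // out exactly once and the slot is immediately overwritten without being
    // dropped, so ownership moves exactly once. `Vec::new()` cannot panic.
    unsafe {
        let out = std::ptr::read(v);
        std::ptr::write(v, Vec::new());
        out
    }
}
```

Both functions have the same observable behavior: the caller gets the old contents and the source is left empty. Only the number of intermediate copies differs.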
Bisected this regression to nightly-2020-05-22; alas CI artefacts from back then are no longer available (to me?), but looking at the commits that made it into that nightly release (c7813ff to 9310e3b, inclusive) the one that most immediately jumps out as likely responsible is 82911b3, which was the merge of #67759 (Update to LLVM 10).
I was looking for the best way to move all elements out of a Vec and return them, basically what I'd write in C++ as:
This code generates very good assembly with clang (link to Godbolt), so I tried to do the equivalent in Rust.
Code
NOTE: I always used rustc -O for all the following tests.
I tried this code:
I expected to see assembly similar to the C++ version.
Instead, this was the output: (link to Godbolt)
Version it worked on
The same code in 1.44.0 generates much better assembly (only if you mark the method as inline): (link to Godbolt)
Version with regression
From 1.45.0 until current nightly, the generated code is worse (inline doesn't affect the output).
Notes
I also tried using drain and split_off to achieve the same thing, but they both generate far worse assembly, with jumps and all.
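For reference, here is a minimal sketch of three ways to move every element out of a Vec that lives behind a `&mut self`, including the drain and split_off variants mentioned above. The `Holder` type, its `items` field, the method names, and the `i32` element type are all hypothetical and chosen only for illustration; the snippet from the original report is not reproduced here.

```rust
/// Hypothetical wrapper type owning a Vec.
pub struct Holder {
    items: Vec<i32>,
}

impl Holder {
    /// Replaces the field with an empty Vec and returns the old one.
    /// No elements are copied; only the (pointer, capacity, length) triple moves.
    pub fn take_all(&mut self) -> Vec<i32> {
        std::mem::take(&mut self.items)
    }

    /// Drains every element into a newly collected Vec.
    /// This goes through the iterator machinery.
    pub fn drain_all(&mut self) -> Vec<i32> {
        self.items.drain(..).collect()
    }

    /// Splits at index 0, so the returned Vec holds all elements
    /// and `self.items` is left empty.
    pub fn split_all(&mut self) -> Vec<i32> {
        self.items.split_off(0)
    }
}
```

All three leave the source empty and return the same elements, so any difference shows up only in the generated code, which can be compared by compiling each method with rustc -O.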