rustc 1.34.0-nightly (da6ab956e 2019-01-27) and later make encoding_rs with packed_simd go from one .s to 31 .rcgu.s files #58023
Comments
|
@hsivonen do you have a |
I don't have a |
In the case of |
One |
The most suspicious commit in that range seems to be the new cargo version. Maybe the compiler didn't change but the way cargo invokes it did. |
Well, in that range cargo turned on incremental compilation by default. |
Link to the PR for turning on incremental compilation. |
Indeed, |
Adding |
Since in the |
You need to set incremental = false for the bench profile.
On Fri 1. Feb 2019 at 12:05, Henri Sivonen wrote:
Since in the incremental = true world --emit asm no longer shows the true
output (i.e. ThinLTO isn't done), I guess I should use an external
disassembler to examine if post-ThinLTO inlining is still bad in the Thumb2
case. (I don't know at what stage of compilation Thumb trampolines are
generated in the ThinLTO case, so I don't know if Thumb trampolines could
interfere with ThinLTO inlining in a manner that's irrelevant to e.g.
x86_64.)
|
I tried again with [profile.bench]
incremental = false in both the I intend to disassemble the bench binaries next. |
It seems that incremental compilation isn't the cause of the regression. The instructions generated for horizontal reductions are. |
Steps to reproduce
cd encoding_rs
git checkout simd
rustup default 1.32.0
rustup target add armv7-unknown-linux-gnueabihf
RUSTC_BOOTSTRAP=1 RUSTFLAGS='-C target_feature=+neon,+thumb-mode,+thumb2' cargo rustc --target armv7-unknown-linux-gnueabihf --features simd-accel --release -- --emit asm
find target | grep -c '\.s$'
rm -rf target
git checkout packed_simd
rustup default nightly
rustup target add thumbv7neon-unknown-linux-gnueabihf
cargo rustc --target thumbv7neon-unknown-linux-gnueabihf --features simd-accel --release -- --emit asm
find target | grep -c '\.s$'
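The comments in this thread attribute the extra files to Cargo enabling incremental compilation by default. A minimal sketch for checking that, assuming the CARGO_INCREMENTAL environment variable is honored by the Cargo version in use; the helper names count_asm_files and count_rcgu_files are hypothetical and only wrap the find step above:

```shell
# Hypothetical helpers: count emitted assembly files under a directory
# (defaults to ./target). Not part of cargo or rustc.
count_asm_files() {
    find "${1:-target}" -type f -name '*.s' | wc -l | tr -d ' '
}

# Count only the per-codegen-unit files (*.rcgu.s).
count_rcgu_files() {
    find "${1:-target}" -type f -name '*.rcgu.s' | wc -l | tr -d ' '
}

# Assumed usage: rebuild with incremental compilation disabled, then compare.
#   rm -rf target
#   CARGO_INCREMENTAL=0 cargo rustc --target thumbv7neon-unknown-linux-gnueabihf \
#       --features simd-accel --release -- --emit asm
#   count_asm_files target
#   count_rcgu_files target
```

If the diagnosis in the comments holds, the count with CARGO_INCREMENTAL=0 should be lower than the count from the default nightly build; the exact numbers are not confirmed here.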
Actual results
In the simd + Rust 1.32 case, there is one .s file. In the packed_simd + Rust 1.34 case, encoding_rs is split across 31 .s files. These are all .rcgu.s files. Examining these files suggests less inlining within encoding_rs, although code from packed_simd and core::arch appears to have gotten inlined.
Expected result
Expected one .s file with the same level of inlining in the packed_simd + Rust 1.34 case as with the simd + Rust 1.32 case.
Additional info
When building an actual binary from a different top-level crate (encoding_bench), the packed_simd + Rust 1.34 case regresses performance relative to the simd + Rust 1.32 case on Exynos 5.