Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i686 target spec disagrees with downstream definitions #82435

Closed
sanxiyn opened this issue Feb 23, 2021 · 19 comments
Closed

i686 target spec disagrees with downstream definitions #82435

sanxiyn opened this issue Feb 23, 2021 · 19 comments
Labels
A-target-specs Area: Compile-target specifications O-x86_32 Target: x86 processors, 32 bit (like i686-*) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@sanxiyn
Copy link
Member

sanxiyn commented Feb 23, 2021

See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=973414.

@sanxiyn sanxiyn added the O-x86 label Feb 23, 2021
@jyn514
Copy link
Member

jyn514 commented Feb 23, 2021

IIUC, the issue is that rustc generates SSE2 instructions on a CPU that doesn't support SSE2? What would the fix be, to add a new target triple for Geode?

@sanxiyn
Copy link
Member Author

sanxiyn commented Feb 23, 2021

I think what's going on is that Debian and Rust disagree on what i686 is. Debian is clearly right and Rust is clearly wrong (80686 lacks SSE2), but it's what it is and neither here nor there. The issue is analogous to ARM v6/v7 situation.

Rust's i686-unknown-linux-gnu target requires SSE2, so can't run on Geode. Rust's i586-unknown-linux-gnu will run on Geode, but that's unfortunate, because Geode is 686 and not 586 and will miss 686 instructions (including CMOV).

@jyn514 jyn514 added A-target-specs Area: Compile-target specifications T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 23, 2021
@cuviper
Copy link
Member

cuviper commented Feb 23, 2021

Debian could further patch rustc_target to disable target-features on i686 -- it seems they already lowered the cpu target from "pentium4" to "pentiumpro".

@steveklabnik steveklabnik changed the title Rust breaks AMD Geode i686 target spec may disagree with downstream definitions Feb 23, 2021
@steveklabnik
Copy link
Member

I've changed the title to something more descriptive.

@steveklabnik steveklabnik changed the title i686 target spec may disagree with downstream definitions i686 target spec disagrees with downstream definitions Feb 23, 2021
@Shnatsel
Copy link
Member

Shnatsel commented Feb 23, 2021

Geode does not support some instructions that Pentium Pro does, so this behavior seems WAI:

Debian's rustc has a patch to reduce the i386 baseline from upstream's pentium4 to pentiumpro
https://sources.debian.org/src/rustc/1.48.0+dfsg1-1/debian/patches/d-i686-baseline.patch/
but apparently that's not sufficient for a Geode LX. i686 is in a weird situation where the Pentium Pro was the first i686 CPU, but is not a baseline i686 CPU.

from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=973414#61

I believe the resolution would be dropping the target from pentiumpro to Geode in Debian.

@sanxiyn
Copy link
Member Author

sanxiyn commented Feb 26, 2021

I would like to note that FreeBSD also changes i686 baseline from pentium4 to pentiumpro: https://github.com/freebsd/freebsd-ports/blob/master/lang/rust/files/patch-compiler_rustc__target_src_spec_i686__unknown__freebsd.rs.

Ignoring Geode question for a moment, does it make sense for Rust to set i686 baseline to pentium4? Who is i686 for anyway? Wouldn't people use x86_64 for new hardwares?

@negge
Copy link

negge commented Mar 1, 2021

Just ran into this with gentoo on a Pentium III today (yes in 2021):

https://bugs.gentoo.org/741708

Cross compiling your own dev-lang/rust-bin is not an option for everyone, and as more system packages like librsvg are written in Rust this cuts off more and more systems from reaping the benefit of a safe systems language.

I am one who would appreciate if i686 definition matched what distro's use. Certainly the rust compiler can emit SSE2 instructions on those systems with these instructions present, even if it does not itself contain SSE2 instructions.

@cuviper
Copy link
Member

cuviper commented Mar 1, 2021

I am one who would appreciate if i686 definition matched what distro's use.

There's no consensus between distros either. Fedora now builds C and C++ with -march=i686 -mtune=generic -msse2 -mfpmath=sse -mstackrealign, and that change called out the fact that i686 is primarily only used for multi-lib compatibility on x86_64 hosts. My impression is that Ubuntu is similar, but I'm not sure about that.

@sanxiyn
Copy link
Member Author

sanxiyn commented Mar 2, 2021

@cuviper linked to a change proposal accepted in Fedora 29, but that's misleading, since Fedora 31 dropped i686 install anyway: https://fedoraproject.org/wiki/Changes/Stop_Building_i686_Kernels. Of course Fedora's i686 target is solely used for multi-lib since Fedora can't be installed on i686.

@thomcc
Copy link
Member

thomcc commented Mar 2, 2021

Ignoring Geode question for a moment, does it make sense for Rust to set i686 baseline to pentium4? Who is i686 for anyway? Wouldn't people use x86_64 for new hardwares?

Anecdotally, I've seen most use it as "32-bit x86 on more modernish hardware", and the older i586 is "32-bit x86 on older hardware". I think this distinction is somewhat nice, even if it means the targets are misnamed.

Users of machines where this isn't viable might be better off with i586... But I guess that doesn't solve things for distros...

@negge
Copy link

negge commented Mar 2, 2021

Users of machines where this isn't viable might be better off with i586... But I guess that doesn't solve things for distros...

At least on gentoo this would help for a large set of impacted users.

Just having a dev-lang/rust-bin that runs on the system is enough to build and install packages that require rust. This would unblock a number of users who are stuck back-porting older versions of gnome-base/librsvg, sys-auth/polkit, etc. which do not have dependencies that require rust.

But moreover, in gentoo there is a USE=system-bootstrap on the dev-lang/rust package. This means once you have a working rustc, you can build and install a bespoke i686 compiler that matches your exact i686 CPU architecture (MMX, SSE, or SSE2). This also provides an upgrade path when newer rust versions are released.

goshhhy added a commit to goshhhy/rust that referenced this issue Aug 24, 2022
this commit lowers the baseline for i686 from 'pentium4' to 'pentiumpro'
on the following targets to match their expected baseline for i686:

- i686-unknown-linux-gnu
- i686-unknown-linux-musl
- i686-unknown-freebsd
- i686-unknown-openbsd
- i686-unknown-netbsd
- i686-unknown-haiku

the pentium4 is not i686-class, and setting it as baseline results
in Rust using SSE2, which i686-class processors never had. this
change enables users of these platforms, with their expected
baseline, to use the upstream Rust host tools, and binaries built
for their platform. I fall into this category of users.

at least Debian and Gentoo have issues relating to this, and at least
Debian and FreeBSD patch Rust in order to make a similar change and
enable Rust to work correctly on their expected baseline. Users with
i686 hardware running distributions that do not distribute their own
modified toolchain are left unable to use Rust or software written
in it. as an aside, Rust is now a de-facto requirement to have a usable
GNU/Linux installation with a GUI.

Rust has an existing issue rust-lang#82435, which this patch
resolves; this patch also obviates the need for the MCP i filed as
rust-lang/compiler-team#543.

NetBSD (and i believe OpenBSD) normally targets i486 where possible,
and i586 where atomics make that difficult (which includes all Rust
programs); if targeting i686, these operating systems would also
expect a lower baseline than pentium4.

Haiku normally targets i586, but would also expect a lower baseline
than pentium4 for an i686 target.

Distributions such as Fedora which do not have a 32-bit version and
only use i686 for multilib can easily alter their build system to
enable SSE2, to match their expected baseline of x86_64 processors
running in 32-bit compatibility mode (all of which have SSE2).
@cuviper has expressed willingness to make this patch in Fedora.

Targets not changed are left as-is for the following reasons:

- Rust supports no earlier than Windows 7; Windows 7 and later all
  require SSE2, and so a minimum of Pentium 4 or similar.
- UEFI was not introduced until 2004 and not common until ~2011,
  well after the Pentium 4's heyday.
- I am aware of no scenario under which VxWorks would be running
  on a 32-bit x86 processor with a lower featureset than pentium4,
  unless it is a Vortex86, which is i586-class. even then, these
  would not be compiling programs for themselves, and the party
  building programs for that platform would have the ability to
  set target flags that make sense for their situation.
@workingjubilee
Copy link
Member

We discussed this as part of these issues:

We reviewed the evidence presented here and considered our alternatives. It was found that a frankly ridiculous number of entities, including not only Fedora but multiple Linux distributions, as well as Clang, disagree on what "i686" actually means. It is more of a "branding" than a clearly-defined, well-nailed-down feature level or target, partly due to the "i786" name never catching on like i686 did. In some cases OS, including some Linux distributions, will use "i586" or "i386" to refer to Pentium 4 feature levels!

If there was more consensus on what i686 means, it would be clear we should fix this. Even with less consensus, it might have been worth the breakage if done years ago. But now the Rust ecosystem has built heavily around this assumption, and staying in agreement on our defaults with other LLVM-based compilers is very valuable. Since the main goal is to improve support for lower-end machines, especially in the from-bootstrap case, it was decided that bumping some tier 2 i586 targets to having host tools would be preferred.

So this will, unfortunately, not be fixed.
Or, depending on your perspective, it will be fixed in another way.

@workingjubilee workingjubilee closed this as not planned Won't fix, can't repro, duplicate, stale Mar 3, 2023
@Fierelier
Copy link

Fierelier commented Mar 11, 2023

It is clear you should fix it, because all normal compilers consider i686 as Pentium Pro. Why do they do this? If I had a Pentium Pro, I compile for i686, and then the Pentium III comes out, and I compile again, should the compile then break? Of course not.

i686 is technically all of this: https://en.wikipedia.org/wiki/P6_(microarchitecture)

Also see:

The logic of the people who want this is absolutely sound. i686 should be, and always has been Pentium Pro+.

#548
#1196
#100969

@workingjubilee
Copy link
Member

Yes, Clang does recognize "i686" can mean "the Pentium Pro".

However, that does not mean that the Pentium Pro defines the resulting architectural features that are actually selected when you pass the argument of --target=i686-unknown-linux-gnu to Clang. Please consider this Godbolt: https://llvm.godbolt.org/z/PP69efvfx

In particular, the attributes of the emitted IR, which I will indent here for better reading:

attributes #0 = {
    noinline nounwind optnone uwtable
    "frame-pointer"="all"
    "min-legal-vector-width"="0"
    "no-trapping-math"="true"
    "stack-protector-buffer-size"="8"
    "target-cpu"="pentium4"
    "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87"
    "tune-cpu"="generic"
}

As seen here, Clang absolutely does pick the SSE2 feature for the i686-unknown-linux-gnu target and does pick the Pentium 4 as that target's baseline, not the Pentium Pro. So unfortunately, it seems it is not enough for logic to be sound. The LLVM toolchain simultaneously knows what is correct and also chooses to selectively disregard it. I assure you, I also find this vexing.

@workingjubilee
Copy link
Member

When the i686 targets were first minted, they did not:
https://github.com/rust-lang/rust/blob/6b130e3dd9547e233dca2cfd72a5968891672d9c/src/librustc_back/target/i686_unknown_linux_gnu.rs

However, when they did get adjusted, they were later specifically adjusted to match Clang:

This is because when the "flexible targets" were implemented, they came with explicitly setting the "generic" CPU target, which could affect even baseline features like SSE2 on x86-64 due to LLVM's peculiar interpretation of the word "generic". Thus, the overrides were added in, but overriding them to what the values were expected to be.

fn default() -> TargetOptions {
TargetOptions {
linker: "cc".to_string(),
pre_link_args: Vec::new(),
post_link_args: Vec::new(),
cpu: "generic".to_string(),

@Fierelier
Copy link

Fierelier commented Mar 17, 2023

But as I already stated, Clang wrongly considers i386, i486, i586 triplets as Pentium 4, this was talked about in llvm/llvm-project#61347 (comment). Rust fixes Clang's definitions for these targets by overriding CPU. i686, which is also wrongly a Pentium 4 in Clang, does not get the same treatment.

@Fierelier
Copy link

You assured me you find this vexing (admitting there is an issue), I've outlined a double standard of Rust, you've acknowledged it, yet you lack any initiative to want to set it right. If you are the position of the entire Rust team, then I really hope gcc's Rust compiler catches up soon.

@jayaddison
Copy link
Contributor

I feel like I'm at some slight risk of losing my sanity (not entirely related to i686), and despite that I would like to register my alignment with @Fierelier on this topic, even if it's a losing battle.

There seems to be momentum in both LLVM and Rust to increment the definition of i686 beyond what it originally referred to -- and in parallel, there are inbound efforts by Intel to improve the apparent security properties of both x86 and x86-64 architectures, some of which leverages (at baseline) an instruction from the "incremented" definitions of i686.

Maybe that's worthwhile overall in terms of user safety -- if said safety measures do indeed prove worthwhile and reliable.

But there's another risk, which is a more pernicious and long-lasting one: if we begin to lose clear definitions of what architectures and instruction sets refer to, then it begins to require increasing levels of effort and research to understand what someone is talking about when they refer to an architecture.

This is already true of computing in general - we add additional layers, iterate on technologies and rename things or repurpose names. And to some extent we can use the layers and technological improvements themselves to help us forget details. But the details are still there, and they do become relevant sometimes.

That's my attempt to restate and agree with the 'problem may accumulates over time' message echoed over on a similar LLVM thread.

I'm still not even entirely sure whether I really understand what's going on here: but that's precisely part of the problem.

@Noratrieb Noratrieb added O-x86_32 Target: x86 processors, 32 bit (like i686-*) and removed O-x86-all labels Oct 25, 2023
davidben added a commit to google/boringssl that referenced this issue Jan 31, 2024
Update-Note: Building for 32-bit x86 may require fixing your builds to
pass -msse2 to the compiler. This will also speed up the rest of the
code in your project. If your project needs to support the Pentium III,
please contact BoringSSL maintainers.

As far as I know, all our supported 32-bit x86 consumers require SSE2.
I think, in the past, I've asserted that our assembly skips SSE2
capability detection. Looking at it again, I don't think that's true.
OPENSSL_IA32_SSE2 means to enable runtime detection of SSE2, not
compile-time.

Additionally, I don't believe we have *ever* tested the non-SSE2
assembly codepaths. Also, now that we want to take the OPENSSL_ia32cap_P
accesses out of assembly, those runtime checks are problematic, as we'd
need to bifurcafe functions all the way down to bn_mul_words.

Unfortunately, the situation with compilers is... complicated. Ideally,
everyone would build with the equivalent of -msse2. 32-bit x86 is so
register-poor that running without SSE2 statically available seems
especially foolish. However, per
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9868, while
Clang defaults to enabling SSE2, GCC does not.

We once broke gRPC's build, in
grpc/grpc#17540, by inadvertently assuming
SSE2. In that discussion, gRPC maintainers were okay requiring Pentium 4
as the minimum CPU, but it's unclear if they actually changed their
build. That discussion also said GCC 8 assumes SSE2, but I'm not able to
reproduce this.

LLVM does indeed interpret "i686" as implying SSE2:
llvm/llvm-project#61347
rust-lang/rust#82435

However, Debian LLVM does *not*. Debian carries a patch to turn this
off!
https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/snapshot/debian/patches/disable-sse2-old-x86.diff?ref_type=heads

Meanwhile, Fedora fixed their baseline back in 2018.
https://fedoraproject.org/wiki/Changes/Update_i686_architectural_baseline_to_include_SSE2

So let's start by detecting builds that forgot to pass -msse2 and see if
we can get them fixed. If this sticks, I'll follow up by unwinding all
the SSE2 branches.

Bug: 673
Change-Id: I851184b358aaae2926c3e3fe618f3155e71c2f71
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/65875
Reviewed-by: Bob Beck <bbe@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
davidben added a commit to davidben/rust-openssl that referenced this issue May 2, 2024
Newer BoringSSL requires SSE2 on 32-bit x86, but I opted to apply it to
all the OpenSSLs that CI builds. This allows the compiler to vectorize
random bits of the library, so hopefully it'll run faster on CI.
Especially with 32-bit x86 being so register-poor.

Additionally, Rust's definition of i686 already assumes SSE2 (see
rust-lang/rust#82435), so there's no sense in
pretending to test pre-SSE2 targets.
davidben added a commit to davidben/rust-openssl that referenced this issue May 3, 2024
Newer BoringSSL requires SSE2 on 32-bit x86, but I opted to apply it to
all the OpenSSLs that CI builds. This allows the compiler to vectorize
random bits of the library, so hopefully it'll run faster on CI.
Especially with 32-bit x86 being so register-poor.

Additionally, Rust's definition of i686 already assumes SSE2 (see
rust-lang/rust#82435), so there's no sense in
pretending to test pre-SSE2 targets.
justsmth pushed a commit to justsmth/aws-lc that referenced this issue Aug 26, 2024
Update-Note: Building for 32-bit x86 may require fixing your builds to
pass -msse2 to the compiler. This will also speed up the rest of the
code in your project. If your project needs to support the Pentium III,
please contact BoringSSL maintainers.

As far as I know, all our supported 32-bit x86 consumers require SSE2.
I think, in the past, I've asserted that our assembly skips SSE2
capability detection. Looking at it again, I don't think that's true.
OPENSSL_IA32_SSE2 means to enable runtime detection of SSE2, not
compile-time.

Additionally, I don't believe we have *ever* tested the non-SSE2
assembly codepaths. Also, now that we want to take the OPENSSL_ia32cap_P
accesses out of assembly, those runtime checks are problematic, as we'd
need to bifurcafe functions all the way down to bn_mul_words.

Unfortunately, the situation with compilers is... complicated. Ideally,
everyone would build with the equivalent of -msse2. 32-bit x86 is so
register-poor that running without SSE2 statically available seems
especially foolish. However, per
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9868, while
Clang defaults to enabling SSE2, GCC does not.

We once broke gRPC's build, in
grpc/grpc#17540, by inadvertently assuming
SSE2. In that discussion, gRPC maintainers were okay requiring Pentium 4
as the minimum CPU, but it's unclear if they actually changed their
build. That discussion also said GCC 8 assumes SSE2, but I'm not able to
reproduce this.

LLVM does indeed interpret "i686" as implying SSE2:
llvm/llvm-project#61347
rust-lang/rust#82435

However, Debian LLVM does *not*. Debian carries a patch to turn this
off!
https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/snapshot/debian/patches/disable-sse2-old-x86.diff?ref_type=heads

Meanwhile, Fedora fixed their baseline back in 2018.
https://fedoraproject.org/wiki/Changes/Update_i686_architectural_baseline_to_include_SSE2

So let's start by detecting builds that forgot to pass -msse2 and see if
we can get them fixed. If this sticks, I'll follow up by unwinding all
the SSE2 branches.

Bug: 673
Change-Id: I851184b358aaae2926c3e3fe618f3155e71c2f71
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/65875
Reviewed-by: Bob Beck <bbe@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
(cherry picked from commit 56d3ad9d23bc130aa9404bfdd1957fe81b3ba498)
justsmth pushed a commit to justsmth/aws-lc that referenced this issue Aug 26, 2024
Update-Note: Building for 32-bit x86 may require fixing your builds to
pass -msse2 to the compiler. This will also speed up the rest of the
code in your project. If your project needs to support the Pentium III,
please contact BoringSSL maintainers.

As far as I know, all our supported 32-bit x86 consumers require SSE2.
I think, in the past, I've asserted that our assembly skips SSE2
capability detection. Looking at it again, I don't think that's true.
OPENSSL_IA32_SSE2 means to enable runtime detection of SSE2, not
compile-time.

Additionally, I don't believe we have *ever* tested the non-SSE2
assembly codepaths. Also, now that we want to take the OPENSSL_ia32cap_P
accesses out of assembly, those runtime checks are problematic, as we'd
need to bifurcafe functions all the way down to bn_mul_words.

Unfortunately, the situation with compilers is... complicated. Ideally,
everyone would build with the equivalent of -msse2. 32-bit x86 is so
register-poor that running without SSE2 statically available seems
especially foolish. However, per
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9868, while
Clang defaults to enabling SSE2, GCC does not.

We once broke gRPC's build, in
grpc/grpc#17540, by inadvertently assuming
SSE2. In that discussion, gRPC maintainers were okay requiring Pentium 4
as the minimum CPU, but it's unclear if they actually changed their
build. That discussion also said GCC 8 assumes SSE2, but I'm not able to
reproduce this.

LLVM does indeed interpret "i686" as implying SSE2:
llvm/llvm-project#61347
rust-lang/rust#82435

However, Debian LLVM does *not*. Debian carries a patch to turn this
off!
https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/snapshot/debian/patches/disable-sse2-old-x86.diff?ref_type=heads

Meanwhile, Fedora fixed their baseline back in 2018.
https://fedoraproject.org/wiki/Changes/Update_i686_architectural_baseline_to_include_SSE2

So let's start by detecting builds that forgot to pass -msse2 and see if
we can get them fixed. If this sticks, I'll follow up by unwinding all
the SSE2 branches.

Bug: 673
Change-Id: I851184b358aaae2926c3e3fe618f3155e71c2f71
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/65875
Reviewed-by: Bob Beck <bbe@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
(cherry picked from commit 56d3ad9d23bc130aa9404bfdd1957fe81b3ba498)
justsmth pushed a commit to justsmth/aws-lc that referenced this issue Aug 27, 2024
Update-Note: Building for 32-bit x86 may require fixing your builds to
pass -msse2 to the compiler. This will also speed up the rest of the
code in your project. If your project needs to support the Pentium III,
please contact BoringSSL maintainers.

As far as I know, all our supported 32-bit x86 consumers require SSE2.
I think, in the past, I've asserted that our assembly skips SSE2
capability detection. Looking at it again, I don't think that's true.
OPENSSL_IA32_SSE2 means to enable runtime detection of SSE2, not
compile-time.

Additionally, I don't believe we have *ever* tested the non-SSE2
assembly codepaths. Also, now that we want to take the OPENSSL_ia32cap_P
accesses out of assembly, those runtime checks are problematic, as we'd
need to bifurcafe functions all the way down to bn_mul_words.

Unfortunately, the situation with compilers is... complicated. Ideally,
everyone would build with the equivalent of -msse2. 32-bit x86 is so
register-poor that running without SSE2 statically available seems
especially foolish. However, per
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9868, while
Clang defaults to enabling SSE2, GCC does not.

We once broke gRPC's build, in
grpc/grpc#17540, by inadvertently assuming
SSE2. In that discussion, gRPC maintainers were okay requiring Pentium 4
as the minimum CPU, but it's unclear if they actually changed their
build. That discussion also said GCC 8 assumes SSE2, but I'm not able to
reproduce this.

LLVM does indeed interpret "i686" as implying SSE2:
llvm/llvm-project#61347
rust-lang/rust#82435

However, Debian LLVM does *not*. Debian carries a patch to turn this
off!
https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/snapshot/debian/patches/disable-sse2-old-x86.diff?ref_type=heads

Meanwhile, Fedora fixed their baseline back in 2018.
https://fedoraproject.org/wiki/Changes/Update_i686_architectural_baseline_to_include_SSE2

So let's start by detecting builds that forgot to pass -msse2 and see if
we can get them fixed. If this sticks, I'll follow up by unwinding all
the SSE2 branches.

Bug: 673
Change-Id: I851184b358aaae2926c3e3fe618f3155e71c2f71
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/65875
Reviewed-by: Bob Beck <bbe@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
(cherry picked from commit 56d3ad9d23bc130aa9404bfdd1957fe81b3ba498)
justsmth pushed a commit to justsmth/aws-lc that referenced this issue Aug 27, 2024
Update-Note: Building for 32-bit x86 may require fixing your builds to
pass -msse2 to the compiler. This will also speed up the rest of the
code in your project. If your project needs to support the Pentium III,
please contact BoringSSL maintainers.

As far as I know, all our supported 32-bit x86 consumers require SSE2.
I think, in the past, I've asserted that our assembly skips SSE2
capability detection. Looking at it again, I don't think that's true.
OPENSSL_IA32_SSE2 means to enable runtime detection of SSE2, not
compile-time.

Additionally, I don't believe we have *ever* tested the non-SSE2
assembly codepaths. Also, now that we want to take the OPENSSL_ia32cap_P
accesses out of assembly, those runtime checks are problematic, as we'd
need to bifurcafe functions all the way down to bn_mul_words.

Unfortunately, the situation with compilers is... complicated. Ideally,
everyone would build with the equivalent of -msse2. 32-bit x86 is so
register-poor that running without SSE2 statically available seems
especially foolish. However, per
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9868, while
Clang defaults to enabling SSE2, GCC does not.

We once broke gRPC's build, in
grpc/grpc#17540, by inadvertently assuming
SSE2. In that discussion, gRPC maintainers were okay requiring Pentium 4
as the minimum CPU, but it's unclear if they actually changed their
build. That discussion also said GCC 8 assumes SSE2, but I'm not able to
reproduce this.

LLVM does indeed interpret "i686" as implying SSE2:
llvm/llvm-project#61347
rust-lang/rust#82435

However, Debian LLVM does *not*. Debian carries a patch to turn this
off!
https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/snapshot/debian/patches/disable-sse2-old-x86.diff?ref_type=heads

Meanwhile, Fedora fixed their baseline back in 2018.
https://fedoraproject.org/wiki/Changes/Update_i686_architectural_baseline_to_include_SSE2

So let's start by detecting builds that forgot to pass -msse2 and see if
we can get them fixed. If this sticks, I'll follow up by unwinding all
the SSE2 branches.

Bug: 673
Change-Id: I851184b358aaae2926c3e3fe618f3155e71c2f71
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/65875
Reviewed-by: Bob Beck <bbe@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
(cherry picked from commit 56d3ad9d23bc130aa9404bfdd1957fe81b3ba498)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-target-specs Area: Compile-target specifications O-x86_32 Target: x86 processors, 32 bit (like i686-*) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests