Using MMX, SSE, SSE2 requires prescott #960

Closed


fingolfin
Member

If we detect that a 32-bit x86 binary is using MMX, SSE, or SSE2, then -march=i686 is not sufficient, so suggest -march=prescott instead.

... well, at least based on how I understand things. Perhaps I got this completely backwards, though: I am extrapolating from very localized knowledge of the codebase, and I certainly don't understand the "big picture" fully.

There are certainly alternatives to this PR: e.g. one could teach BB more 32-bit architectures and refine this. Or one could decide that, since Julia itself requires SSE2 for its atomics support, the base architecture shouldn't be i686 (i.e., Pentium Pro) but rather pentium4.

This came to my attention in the context of libjulia_jll, see the discussion starting at this comment.
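
To make the idea concrete, here is a minimal Python sketch of "pick the oldest 32-bit -march that covers the instruction sets a binary uses". It is not the code in this PR: the table and the `minimum_march` helper are hypothetical names, and the feature lists are transcribed from the GCC x86 Options page with FXSR left out for brevity. Note that, going by that page, pentium4 already covers MMX, SSE and SSE2, while prescott additionally brings SSE3.

```python
# Instruction sets enabled by each 32-bit -march value, following the
# descriptions on GCC's x86 Options page (FXSR omitted for brevity).
# Entries are ordered from oldest to newest. Hypothetical table for
# illustration only, not taken from BinaryBuilder.
MARCH_FEATURES = [
    ("i686", frozenset()),  # alias for pentiumpro: no MMX, no SSE
    ("pentium2", frozenset({"MMX"})),
    ("pentium3", frozenset({"MMX", "SSE"})),
    ("pentium4", frozenset({"MMX", "SSE", "SSE2"})),
    ("prescott", frozenset({"MMX", "SSE", "SSE2", "SSE3"})),
]


def minimum_march(used):
    """Return the oldest -march value whose feature set covers `used`."""
    needed = frozenset(name.upper() for name in used)
    for march, features in MARCH_FEATURES:
        if needed <= features:
            return march
    raise ValueError(f"no known 32-bit -march covers {sorted(needed)}")


print(minimum_march(set()))            # i686
print(minimum_march({"mmx", "sse2"}))  # pentium4
print(minimum_march({"sse3"}))         # prescott
```

Keeping the table ordered means the first match is also the least demanding one, so the suggestion never asks for more than the binary needs.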

If we detect that a 32bit x86 binary is using MMX, SSE, SSE2, then -march=i686
is not sufficient, so use -march=prescott instead.
@giordano
Member

Aren't MMX and SSE always part of i686? That's my understanding of https://en.wikipedia.org/wiki/P6_(microarchitecture). It would also be good to have some references to write down in the comments, since all this instruction set information seems to be oral tradition.

@fingolfin
Member Author

fingolfin commented Oct 26, 2020

Nope, neither MMX nor SSE is part of -march=i686. Actually if you read the page you linked to, you'll find it says this:

  • New instructions in Pentium II Deschutes core: MMX, FXSAVE, FXRSTOR.
  • New instructions in Pentium III: SSE.

But the primary reference here is https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html which states that i686 is an alias for pentiumpro (that's in line with marketing names back then: 586 was Pentium, 686 was Pentium Pro, and then they dropped the x86 naming). It also does not list any of MMX, SSE etc.; it explicitly lists this for architectures that support these instruction sets (e.g. compare to pentium4 or prescott entries).

You can also cross check this with e.g. Intel spec sheets but that's really cumbersome.

@giordano
Member

> Actually if you read the page you linked to, you'll find it says this:

I was looking at the P6 table

> But the primary reference here is https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

I know that page well, and how little use it is for base architectures 🙂

> which states that i686 is an alias for pentiumpro (that's in line with marketing names back then: 586 was Pentium, 686 was Pentium Pro, and then they dropped the x86 naming).

This bit isn't very useful if you don't know which instruction sets pentiumpro covers.

> It also does not list any of MMX, SSE etc.

Nor does the x86-64 architecture: "A generic CPU with 64-bit extensions."

> You can also cross check this with e.g. Intel spec sheets but that's really cumbersome.

Yeah, I'm asking for human-readable references 🙂

@staticfloat
Member

Good catch; it appears that the confusion is that the "base 32-bit intel" architecture for Julia is implicitly pentium4 because (according to Yichao) we require MMX and SSE2 to run.

So the proper fix is to not use the name i686 as the "base 32-bit intel" arch, and just use pentium4 everywhere.

@fingolfin
Member Author

> > Actually if you read the page you linked to, you'll find it says this:

> I was looking at the P6 table

I am not sure which table you refer to, but I find that Wikipedia page actually quite clear: Yes, the Pentium Pro was the first using the P6 chip design, but that was extended over time (just as is done these days), and specifically MMX and SSE were only added later (

> > But the primary reference here is https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

> I know that page well, and how little use it is for base architectures 🙂

?? I actually find it pretty useful, and clear? Perhaps we are looking for different kinds of data, though? But for the question of whether MMX, SSE etc. are supported, it is 100% useful, and in fact the only relevant reference, because it spells out explicitly how they (the GCC team) treat each -march option. I mean, even if P6 already supported MMX and SSE, it wouldn't matter as long as GCC explicitly considers -march=i686 to not cover MMX and SSE (which is precisely what that page states).

> > which states that i686 is an alias for pentiumpro (that's in line with marketing names back then: 586 was Pentium, 686 was Pentium Pro, and then they dropped the x86 naming).

> This bit isn't very useful if you don't know which instruction sets pentiumpro covers.

> > It also does not list any of MMX, SSE etc.

> Nor does the x86-64 architecture: "A generic CPU with 64-bit extensions."

Agreed, that is a weak point: somewhere on the page they should point out that the 64-bit extension subsumes MMX and SSE. But I think that's the one thing missing in this regard on that page? Other than that, I think all -march variants list all the CPU instruction extensions they support explicitly and exhaustively? E.g.

> ‘prescott’
> Improved version of Intel Pentium 4 CPU with MMX, SSE, SSE2 and SSE3 instruction set support.

or

> ‘cooperlake’
> Intel cooperlake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F, CLWB, AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VNNI and AVX512BF16 instruction set support.

> > You can also cross check this with e.g. Intel spec sheets but that's really cumbersome.

> Yeah, I'm asking for human-readable references 🙂

Well, then my revised suggestion for such a reference is: https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html plus the information "64 bit implies MMX and SSE". (I'd offer to submit a PR to GCC to add this to that webpage or the relevant manpage, but last I looked they don't take PRs and require signing forms before they accept contributions, so I'll pass on that; but perhaps I can at least submit an issue).
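
As a rough illustration of using that page as the reference, the Python sketch below pulls the extension list out of one -march description and then applies the extra rule that a 64-bit target implies MMX and SSE. The helper names are hypothetical, the parsing only handles the "with A, B and C instruction set support" phrasing seen in the entries quoted in this thread, and adding SSE2 to the implied set is my own addition (it is part of the x86-64 baseline).

```python
import re

# Extensions implied by "64-bit extensions". MMX and SSE come from the
# thread; SSE2 is added here because it is part of the x86-64 baseline.
X86_64_IMPLIED = ["MMX", "SSE", "SSE2"]


def listed_extensions(description):
    """Extract the extension names spelled out in one -march description."""
    match = re.search(r"with (.+?) instruction set support", description)
    if match is None:
        return []
    names = re.split(r",\s*|\s+and\s+", match.group(1))
    return [n for n in names if n and n != "64-bit extensions"]


def effective_extensions(description):
    """Listed extensions, plus what a 64-bit target implies."""
    names = listed_extensions(description)
    if "64-bit extensions" in description:
        names = X86_64_IMPLIED + [n for n in names if n not in X86_64_IMPLIED]
    return names


prescott = ("Improved version of Intel Pentium 4 CPU with MMX, SSE, SSE2 "
            "and SSE3 instruction set support.")
x86_64 = "A generic CPU with 64-bit extensions."

print(effective_extensions(prescott))  # ['MMX', 'SSE', 'SSE2', 'SSE3']
print(effective_extensions(x86_64))    # ['MMX', 'SSE', 'SSE2']
```

The second call shows the gap discussed above: the x86-64 entry lists nothing explicitly, so the implied set has to come from outside that description.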

@fingolfin
Member Author

@staticfloat

> Good catch; it appears that the confusion is that the "base 32-bit intel" architecture for Julia is implicitly pentium4 because (according to Yichao) we require MMX and SSE2 to run.

Yeah, discovering this the hard way (by having the compiler reject the code unless sse2 was enabled) is how we got here ;-)

> So the proper fix is to not use the name i686 as the "base 32-bit intel" arch, and just use pentium4 everywhere.

OK, that's what I'd prefer, too. I submitted JuliaPackaging/BinaryBuilderBase.jl#65 to this end.

@giordano
Member

giordano commented Oct 27, 2020

> I am not sure which table you refer to

[Image: Screenshot_20201026_234851]

> > I know that page well, and how little use it is for base architectures 🙂

> ??

> > Nor does the x86-64 architecture: "A generic CPU with 64-bit extensions."

> Agreed, that is a weak point

Well, that's exactly what I was referring to with "how little use it is for base architectures" 🤷

> Other than that, I think all -march variants list all the CPU instruction extensions they support explicitly and exhaustively?

As I said already, I know that page well [1] [2] [3]

> then my revised suggestion for such a reference is: https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html plus the information "64 bit implies MMX and SSE".

To be clear, I never said you're wrong, I just asked for references. Everything is confusing, or implicit at best (see Julia's i686). That's what I was referring to with "oral tradition". I never found a single source with clear information.

@fingolfin
Member Author

@giordano so that Wikipedia page's summary box is misleading at best, or just wrong. I consider that par for the course for Wikipedia ;-). The actual content of the page does describe the situation correctly, though. And in any case, this is not the fault of the GCC page.

And I also never doubted that you are familiar with https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html -- I just don't understand why you seem to consider it that useless? I guess I am confused about what you consider confusing besides the incomplete information for the x86-64 entry (to which I already agreed it should be fixed/improved).

Ironically, it does list this information, but in a place where nobody would find it other than by chance or by reading everything (haha right): The documentation for -mfpmath=unit states this:

> For the x86-32 compiler, you must use -march=cpu-type, -msse or -msse2 switches to enable SSE extensions and make this option effective. For the x86-64 compiler, these extensions are enabled by default.

I also just looked it up, and they only ask for signing legal stuff etc. for "large" contributions, so I'll look into submitting a patch to that page which improves the march=x86-64 entry accordingly.
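
The rule in that -mfpmath passage can be written down as a small Python sketch. The function name and its parameters are hypothetical; only the flags themselves (-march=, -msse2, -mfpmath=sse) are real GCC options. On 32-bit targets the sketch always passes -msse2 explicitly, so it does not depend on whether the chosen -march happens to enable SSE.

```python
def fpmath_sse_flags(word_size, march=None):
    """Return GCC flags under which -mfpmath=sse actually takes effect.

    Hypothetical helper: on x86-64 the SSE extensions are on by default,
    on x86-32 they have to be requested explicitly.
    """
    if word_size == 64:
        return ["-mfpmath=sse"]
    if word_size != 32:
        raise ValueError("word_size must be 32 or 64")
    flags = []
    if march is not None:
        flags.append(f"-march={march}")
    # Always request SSE2 explicitly on 32-bit, regardless of -march.
    flags += ["-msse2", "-mfpmath=sse"]
    return flags


print(" ".join(fpmath_sse_flags(32, "pentium4")))
print(" ".join(fpmath_sse_flags(64)))
```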

@fingolfin
Member Author

Submitted https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97585 (submitting even a trivial patch seems to require editing multiple files and complex procedures. Ugh)

@fingolfin
Member Author

I guess this is obsolete now?

@staticfloat
Member

Yes, I think we've fixed things by renaming i686 to pentium4.

@staticfloat staticfloat closed this Nov 2, 2020
@fingolfin fingolfin deleted the mh/detect-mmx-sse-sse2 branch October 21, 2021 18:38