Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cpuid checks behind an HAVE_CPUID check. #116

Merged
merged 2 commits into from
May 9, 2024

Conversation

jkbonfield
Copy link
Collaborator

Fixes #115.

Testing compiler versions is tricky, especially given some vendors wrap up standard compilers as their own, but change version numbers (Apple, Intel) and compilers also often masquerade as another (eg setting GNUC).

Instead we just check if it links. Note there's no guard around this here for x86-64, but that's done in the code itself. (In theory we may have cpuid on another architecture that we wish to use at some point.)

Fixes samtools#115.

Testing compiler versions is tricky, especially given some vendors
wrap up standard compilers as their own, but change version numbers
(Apple, Intel) and compilers also often masquerade as another (eg
setting GNUC).

Instead we just check if it links.  Note there's no guard around this
here for x86-64, but that's done in the code itself.  (In theory we may
have cpuid on another architecture that we wish to use at some point.)
@jkbonfield
Copy link
Collaborator Author

jkbonfield commented Apr 22, 2024

Problems still to consider.

How does this work from htslib, which although having this as a submodule doesn't run the configure script ind instead builds the source direct.

  1. Htslib's ./configure - it can just copy this logic and add HAVE_CPUID into its own config.h
  2. Htslib's Makefile (configureless) could just copy the existing ifdef magic to the config.h stub it creates, so it replicates current behaviour (with the same limitations on portability). Or we just define it to be true and accept if it fails then don't do that - run configure!

Htslib builds from source have htscodecs as a submodule, so both can update in sync.
Htslib can also link against an external htscodecs library, but the configure issue is then solved there already, so it's not a problem.

@jmarshall
Copy link
Member

I don't have an x86 machine handy to check on, but could this just be done with the built-in generic tests? e.g.

AC_CHECK_DECLS([__get_cpuid_max, __cpuid_count], [], [], [[#include <cpuid.h>]])

@jkbonfield
Copy link
Collaborator Author

Good thought. Possibly. It'd need validating whether it also solves the problem on the Mac, but given it was a compiler warning about missing prototype as well as a link error it seems likely. I'll try it out.

@jkbonfield
Copy link
Collaborator Author

So... pros and cons. The configure.ac is simpler: just 1 line as you suggest. However the code that uses it is now ugly, made more so by the 0/1 nature rather than defined/not-defined as used elsewhere in autoconf.

#if defined(__x86_64__) && \
    defined(HAVE_DECL___CPUID_COUNT)   && HAVE_DECL___CPUID_COUNT && \
    defined(HAVE_DECL___GET_CPUID_MAX) && HAVE_DECL___GET_CPUID_MAX
...

Plus these will need to be added to the htslib Makefile too for when not using configure.

Thoughts?

Switch to use AC_CHECK_DECLS as suggested by jm18.  It makes the C
code a bit more complex, but is less code overall when we include the
configure script.
@jkbonfield
Copy link
Collaborator Author

I added those changes as a separate commit anyway, so we can either squash them together which means we get the new mechanism, or drop that commit and keep my previous method. Both appear to work fine under linux. I'll validate MinGW

@jmarshall
Copy link
Member

Thoughts?

If you fill in the actions-if-found and/or actions-if-not-found arguments, you can have your own simpler choice of macros to be defined instead of getting the default actions.

@jkbonfield
Copy link
Collaborator Author

I did initially try as you suggest, but the if-found action is called for each in turn, so it ends up as an OR instead of an AND operation which is what I need. You can do it with two queries, to create two simpler looking macros, which also means it can be turned into a if-defined query rather than "defined and !zero" check, but then you're just doubling up code somewhere else so it doesn't feel like a big win and is non-standard to boot.

Anyway, given I cannot validate any of this as fixing the initial issue, I'll wait for feedback from @ryandesign.

@daviesrob
Copy link
Member

I've tested this and can confirm that it fixes the build on MacOS X 10.7 (Lion) with a contemporary xcode, all running in a UTM virtual machine.

configure reports:

checking whether __get_cpuid_max is declared... no
checking whether __cpuid_count is declared... no
checking C compiler flags needed for sse4.1... unsupported
checking C compiler flags needed for avx2... unsupported
checking C compiler flags needed for avx512f... unsupported

which I guess is all we can expect with the missing declarations.

@jkbonfield
Copy link
Collaborator Author

What's the plan for htslib?

We can put in a similar autoconf rule easily enough as it's minimal code to add. There's the Makefile though for when configure isn't used. If we do nothing it should continue to build, just defaulting without optimised implementations. I'm not sure I like that (but frankly I've never liked our ability to build without first running configure either as it's just a needless pitfall). Instead I would advise hard coding the macros to be "on", so the default non-configured is just to assume this stuff works. That's a reasonable assumption which should hold true everywhere given it still has an x86-64 CPU check anyway.

@daviesrob
Copy link
Member

I expect htslib make will probably end up setting these on x86_64. It makes quite a few assumptions already, so I think this would be reasonable given that we can let configure override it if necessary.

@jkbonfield
Copy link
Collaborator Author

My point was we don't need to have x86_64 checks in the Makefile as they're already in the source files. We can just define these to 1 on all platforms and it'll have no consequences elsewhere. We don't need extra fragile or complex code for platform detection when not using autoconf.

@daviesrob daviesrob merged commit a7ce404 into samtools:master May 9, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Undefined symbols ___cpuid_count ___get_cpuid_max building on Mac OS X 10.7
3 participants