Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graviton2-b #330

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Graviton2-b #330

wants to merge 5 commits into from

Conversation

accopeland
Copy link

Another attempt at an aarch64 support. I made some changes to CMakelists.txt to prevent errors at config time. Added some ifdefs around the arch-specific parts so you should get identical x86_64 performance now and working binaries on aarch64 (only graviton2 tested so far).

#include <x86intrin.h>
#if defined(__GNUC__) && defined(__aarch64__)
#define SIMDE_ENABLE_NATIVE_ALIASES
#include "../simde/x86/avx2.h"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR.

Where is ../simde/x86/avx2.h?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why limit it to aarch64? ppc64* could probably be fixed by this as well.

@accopeland
Copy link
Author

accopeland commented Apr 5, 2022 via email

@voutcn
Copy link
Owner

voutcn commented Apr 5, 2022

Found that with this PR HasPopCnt and HasBmi2 will always return false on aarch64. And the megahit python script will always choose the megahit_core binary without popcnt and BMI2. Thus you don't need to include any simde headers.

#if !defined(__aarch64__)
  #include <x86intrin.h>
#endif

and change

#if defined(__BMI2__) && defined(USE_BMI2)

to

defined(__BMI2__) && defined(USE_BMI2) && !defined(__aarch64__)

should be sufficient to get it compiled.

@voutcn
Copy link
Owner

voutcn commented Apr 5, 2022

FYI. We have three megahit_core binary compiled: one for popcnt + BMI2, one for popcnt only and one for none of them. And the megahit python script will choose the correct binary (through the functions in cpu_dispatch.h) here:

megahit/src/megahit

Lines 613 to 630 in f8afe5d

def cpu_dispatch():
if not opt.hw_accel:
logger.info('Using megahit_core without POPCNT and BMI2 support, '
'because --no-hw-accel option manually specified')
else:
has_hw_accel = check_output([opt.megahit_core, 'checkcpu'])
if has_hw_accel == '1':
logger.info('Using megahit_core with POPCNT and BMI2 support')
else:
opt.hw_accel = False
has_popcnt = check_output([opt.megahit_core, 'checkpopcnt'])
if has_popcnt == '1':
opt.has_popcnt = True
logger.info('Using megahit_core with POPCNT support')
else:
logger.info('Using megahit_core without POPCNT and BMI2 support, '
'because the features not detected by CPUID ')

If we have a good enough way to check POPCNT or BMI2 support on ARM we can revisit the POPCNT and BMI2 hardware acceleration options for ARM.

@@ -5,6 +5,11 @@
#ifndef MEGAHIT_CPU_DISPATCH_H
#define MEGAHIT_CPU_DISPATCH_H

#if defined(__GNUC__) && defined(__aarch64__)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, why only aarch64?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants