Add flags to support OpenBSD #157

ron-at-swgy · 2024-06-20T17:57:13Z

Summary

There are two primary changes included here. This commit aims to address #156

Type length

This commit includes OpenBSD in the platform checks used to check for size_t, uint64_t, unsigned long and unsigned long long all being 64 bits in length.

Consider the following C code compiled and run on the target platform

#include <unistd.h>
#include <stdio.h>

int
main(int argc, char **argv)
{
        printf("sizeof(unsigned long) = %d\n", sizeof(unsigned long));
        printf("sizeof(long) = %d\n", sizeof(long));
        printf("sizeof(int64_t = %d\n", sizeof(int64_t));
        printf("sizeof(unsigned long long) = %d\n", sizeof(unsigned long long));
        printf("sizeof(uint64_t) = %d\n", sizeof(uint64_t));
        printf("sizeof(long *) = %d\n", sizeof(long *));
        printf("sizeof(uint64_t *) = %d\n", sizeof(uint64_t *));
        return 0;
}

When compiled and run:

 $ ./sizes 
sizeof(unsigned long) = 8
sizeof(long) = 8
sizeof(int64_t = 8
sizeof(unsigned long long) = 8
sizeof(uint64_t) = 8
sizeof(long *) = 8
sizeof(uint64_t *) = 8

Invalid CPU Feature String

Addtionally, the checks for some avx513* flags are removed, as avx512fp16 yield an error: "invalid cpu feature string for builtin" when compiling with clang.

Possibly related to llvm/llvm-project#65320

Testing

The compilation step succeeds and results in the following:

host$ pwd
/home/user/x86-simd-sort/builddir
host$ ls -l lib/
total 144
-rw-r--r--  1 user  user  21368 Jun 20 13:43 liblibavx.a
drwxr-xr-x  2 user  user    512 Jun 20 13:43 liblibavx.a.p
-rw-r--r--  1 user  user   2506 Jun 20 13:43 liblibicl.a
drwxr-xr-x  2 user  user    512 Jun 20 13:42 liblibicl.a.p
-rw-r--r--  1 user  user  20704 Jun 20 13:43 liblibskx.a
drwxr-xr-x  2 user  user    512 Jun 20 13:43 liblibskx.a.p
-rw-r--r--  1 user  user   1378 Jun 20 13:43 liblibspr.a
drwxr-xr-x  2 user  user    512 Jun 20 13:42 liblibspr.a.p

I attempted to build numpy updating the submodule for this project to use my fork. It passes the original failure I encountered but still fails. This indicates I have other dependencies to validate. I can conclusively say the changes here do not make the situation worse!

r-devulap

LGTM, thanks for fixing this! Best way to test the changes is by adding a CI run on an openBSD docker container.

lib/x86simdsort.cpp

ron-at-swgy · 2024-06-20T21:59:46Z

Thank you for your review, I have added the clang version 18 condition as requested. I'm not familiar with any OpenBSD Docker images. I'll try swapping the numpy submodule to use my fork and see if it works

lib/x86simdsort.cpp

r-devulap

Looks good. thanks!

This commit includes OpenBSD in the platform checks used to check for size_t, uint64_t, unsigned long and unsigned long long all being 64 bits in length. Addtionally, the checks for some `avx513*` flags are removed, as `avx512fp16` yield an error: "invalid cpu feature string for builtin" when compiling with clang. This commit aims to address intel#156

ron-at-swgy · 2024-06-21T01:52:45Z

I amended the commit so the changes would be formated with clang-format-16. Hopefully this will pass CI now

r-devulap · 2024-06-21T04:50:50Z

Could you confirm that this fixes the problem?

ron-at-swgy · 2024-06-21T14:14:21Z

Could you confirm that this fixes the problem?

Ragu, thank you for following up. My goal is to build numpy 2.0 - so this change fixes one of the errors I encounter along the way:

host$ make
c++ -o kvsort -mavx512vl -mavx512dq -I../src -std=c++17 -O3 avx512-kv.cpp
c++ -o qsortavx2 -mavx2 -I../src -std=c++17 -O3 skx-avx2.cpp
c++ -o qsortavx512 -mavx512vl -mavx512dq -I../src -std=c++17 -O3 skx-avx2.cpp
c++ -o qsortspr -mavx512vl -mavx512dq -mavx512vbmi2 -mavx512fp16  -I../src -std=c++17 -O3 spr-16bit.cpp
c++ -o qsorticl -mavx512vl -mavx512bw -mavx512dq -mavx512vbmi2 -I../src -std=c++17 -O3 icl-16bit.cpp
host$ for file in !(Makefile|*.cpp)      
> do                            
> echo "Running example $file"  
> ./$file                       
> echo "Result was $?"          
> done
Running example kvsort
Illegal instruction (core dumped) 
Result was 132
Running example qsortavx2
Result was 0
Running example qsortavx512
Illegal instruction (core dumped) 
Result was 132
Running example qsorticl
Illegal instruction (core dumped) 
Result was 132
Running example qsortspr
Illegal instruction (core dumped) 
Result was 132

When trying to build and run the /examples, they compile but only qsortavx2 exits successfully. The others dump cores with illegal hardware instruction errors. This may be expected as all the other examples expect avx512 support.

The same behavior was found in both of the following environments:

# host 1, a Dell running in a datacenter
machdep.cpuvendor=GenuineIntel
machdep.cpuid=0x306c3
machdep.cpufeature=0xbffbfbff

# host 2, a small hp computer
machdep.cpuvendor=GenuineIntel
machdep.cpuid=0x506e3
machdep.cpufeature=0xbffbfbff

Again, this is probably expected. If there is other code I could run to exercise the xf86-simd-sort functionality, I can do so, but I think this changeset accomplishes its goal. What is your opinion?

r-devulap · 2024-06-21T16:49:23Z

you can configue meson to build a test suite with -Dbuild_tests=true, and then run meson test in the builddir to run the test suite.

ron-at-swgy · 2024-06-21T19:04:51Z

host$ meson test
ninja: Entering directory `/home/user/x86-simd-sort/builddir'
[18/18] Linking target testexe
test-keyvalue.cpp(libtests_kvsort.a.p/test-keyvalue.cpp.o:(std::__1::vector<double, std::__1::allocator<double>> get_array<double>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, unsigned long, double, double)) in archive tests/libtests_kvsort.a): warning: rand() may return deterministic values, is that what you want?
1/1 x86 simd sort tests        TIMEOUT        30.02s   killed by signal 15 SIGTERM
>>> MALLOC_PERTURB_=76 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 LD_LIBRARY_PATH=/home/user/x86-simd-sort/builddir/ ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 /home/user/x86-simd-sort/builddir/testexe


Ok:                 0   
Expected Fail:      0   
Fail:               0   
Unexpected Pass:    0   
Skipped:            0   
Timeout:            1   

Full log written to /home/user/x86-simd-sort/builddir/meson-logs/testlog.txt

The test times out on both of my available machines.

r-devulap · 2024-06-21T19:54:48Z

sorry, didn't realize meson test times out in 30s by default. You could directly just run ./testexe by itself or use meson test -t 100 (on my SKX it needs about < 3 min to run the whole thing).

ron-at-swgy · 2024-06-21T20:30:20Z

success! 183 seconds on one platform, 189 on the other

r-devulap · 2024-06-21T20:36:02Z

Perfect, will merge this then. And will update NumPy soon after. Thanks for your contribution @ron-at-swgy!

r-devulap · 2024-06-25T22:48:09Z

@ron-at-swgy Porting this to numpy. See numpy/numpy#26797

ron-at-swgy · 2024-06-26T00:08:44Z

I just checked out and installed your branch of numpy and I'm excited to report total success!!

In [5]: np.version.full_version
Out[5]: '2.1.0.dev0+git20240625.f160b2a'

In [6]: arr = np.array(range(100))

In [7]: arr
Out[7]: 
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
       68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])

In [9]: arr.std()
Out[9]: np.float64(28.86607004772212)

In [10]: arr.mean()
Out[10]: np.float64(49.5)

In [11]:                                                                          
Do you really want to exit ([y]/n)? y

r-devulap requested changes Jun 20, 2024

View reviewed changes

lib/x86simdsort.cpp Outdated Show resolved Hide resolved

ron-at-swgy force-pushed the main branch from a723fea to c7d3465 Compare June 20, 2024 21:54

ron-at-swgy marked this pull request as ready for review June 20, 2024 21:59

r-devulap reviewed Jun 20, 2024

View reviewed changes

lib/x86simdsort.cpp Outdated Show resolved Hide resolved

ron-at-swgy force-pushed the main branch from c7d3465 to 3fba97b Compare June 20, 2024 22:15

r-devulap previously approved these changes Jun 20, 2024

View reviewed changes

ron-at-swgy dismissed r-devulap’s stale review via 3bb9296 June 21, 2024 01:51

ron-at-swgy force-pushed the main branch from 3fba97b to 3bb9296 Compare June 21, 2024 01:51

r-devulap approved these changes Jun 21, 2024

View reviewed changes

r-devulap merged commit 9a1b616 into intel:main Jun 21, 2024
11 checks passed

This was referenced Jun 25, 2024

BLD: Fix x86-simd-sort build failure on openBSD numpy/numpy#26797

Merged

BUG: CPU Feature Checks Fail on OpenBSD 7.5 numpy/numpy#26785

Closed

charris mentioned this pull request Jul 1, 2024

BLD: Fix x86-simd-sort build failure on openBSD numpy/numpy#26821

Merged

r-devulap mentioned this pull request Jul 4, 2024

Adds support for accelerated sorting with x86-simd-sort pytorch/pytorch#127936

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add flags to support OpenBSD #157

Add flags to support OpenBSD #157

ron-at-swgy commented Jun 20, 2024 •

edited

Loading

r-devulap left a comment

ron-at-swgy commented Jun 20, 2024

r-devulap left a comment

ron-at-swgy commented Jun 21, 2024

r-devulap commented Jun 21, 2024

ron-at-swgy commented Jun 21, 2024

r-devulap commented Jun 21, 2024

ron-at-swgy commented Jun 21, 2024

r-devulap commented Jun 21, 2024

ron-at-swgy commented Jun 21, 2024

r-devulap commented Jun 21, 2024

r-devulap commented Jun 25, 2024

ron-at-swgy commented Jun 26, 2024

Add flags to support OpenBSD #157

Add flags to support OpenBSD #157

Conversation

ron-at-swgy commented Jun 20, 2024 • edited Loading

Summary

Type length

Invalid CPU Feature String

Testing

r-devulap left a comment

Choose a reason for hiding this comment

ron-at-swgy commented Jun 20, 2024

r-devulap left a comment

Choose a reason for hiding this comment

ron-at-swgy commented Jun 21, 2024

r-devulap commented Jun 21, 2024

ron-at-swgy commented Jun 21, 2024

r-devulap commented Jun 21, 2024

ron-at-swgy commented Jun 21, 2024

r-devulap commented Jun 21, 2024

ron-at-swgy commented Jun 21, 2024

r-devulap commented Jun 21, 2024

r-devulap commented Jun 25, 2024

ron-at-swgy commented Jun 26, 2024

ron-at-swgy commented Jun 20, 2024 •

edited

Loading