Compilation with target POWER8 fails in AIX 7.2 #1997

Closed
ayappanec opened this issue Feb 4, 2019 · 44 comments

@ayappanec
Contributor

I am trying to compile OpenBLAS (develop branch) on AIX 7.2 with TARGET=POWER8.
The compiler is GCC 8.1.0.

#oslevel -s
7200-03-00-0000

#prtconf
System Model: IBM,8286-42A
Machine Serial Number: 1005CEV
Processor Type: PowerPC_POWER8
Processor Implementation Mode: POWER 8

#make BINARY=64 CC="gcc -maix64" FC="gfortran -maix64" CXX="g++ -maix64" TARGET=POWER8
......
......
gcc -maix64 -c -Ofast -mcpu=power8 -mtune=power8 -mvsx -malign-power -DUSE_OPENMP -fno-fast-math -fopenmp -DMAX_STACK_ALLOC=2048 -fopenmp -Wall -DF_INTERFACE_GFORT
-fPIC -DSMP_SERVER -DUSE_OPENMP -DNO_WARMUP -DMAX_CPU_NUMBER=2 -DMAX_PARALLEL_NUMBER=1 -DVERSION="0.3.6.dev" -mpowerpc64 -maix64 -DASMNAME=idamax_k -DASMFNAME=i
damax_k_ -DNAME=idamax_k_ -DCNAME=idamax_k -DCHAR_NAME="idamax_k_" -DCHAR_CNAME="idamax_k" -DNO_AFFINITY -I.. -DDOUBLE -UCOMPLEX -UCOMPLEX -DDOUBLE -DUSE_ABS
-UUSE_MIN ../kernel/power/idamax.c -o idamax_k.o
gcc -maix64 -c -Ofast -mcpu=power8 -mtune=power8 -mvsx -malign-power -DUSE_OPENMP -fno-fast-math -fopenmp -DMAX_STACK_ALLOC=2048 -fopenmp -Wall -DF_INTERFACE_GFORT
-fPIC -DSMP_SERVER -DUSE_OPENMP -DNO_WARMUP -DMAX_CPU_NUMBER=2 -DMAX_PARALLEL_NUMBER=1 -DVERSION="0.3.6.dev" -mpowerpc64 -maix64 -DASMNAME=idamin_k -DASMFNAME=i
damin_k_ -DNAME=idamin_k_ -DCNAME=idamin_k -DCHAR_NAME="idamin_k_" -DCHAR_CNAME="idamin_k" -DNO_AFFINITY -I.. -DDOUBLE -UCOMPLEX -UCOMPLEX -DDOUBLE -DUSE_ABS
-DUSE_MIN ../kernel/power/idamin.c -o idamin_k.o
../kernel/power/idamin.c: In function 'idamin_k':
../kernel/power/idamin.c:52:5: error: 'asm' operand has impossible constraints
asm(
^~~~~~~
../kernel/power/idamax.c: In function 'idamax_k':
../kernel/power/idamax.c:52:5: error: 'asm' operand has impossible constraints
asm(
^~~~~~~
make[1]: *** [Makefile.L1:564: idamax_k.o] Error 1

It seems that something in the inline asm code is not valid on AIX.

@martin-frbg
Collaborator

What does as --version return on that system? (The most recent change to that file only replaced some comparison mnemonics, so if there is anything problematic about the constraints it must have been in the code for the past eleven months already. Before that, the KERNEL.POWER8 file used to point to the generic C implementations of the two functions, i.e. ../arm/idamin.c and ../arm/idamax.c)

@ayappanec
Contributor Author

We are using the AIX native assembler and linker, not the GNU assembler and linker.
The reason is that the GNU assembler and linker are not well ported to AIX and have issues.

Most likely nobody has tried an AIX build with TARGET=POWER8 before.

@martin-frbg
Collaborator

Heh, sorry, did not recognise you right away. I suspect we never hit this in #1803 as TeejIBM kept building for POWER5

@ayappanec
Contributor Author

Yes, that's right.

@martin-frbg
Collaborator

Could this be similar to #1699 (AIX native tools apparently not accepting register names "vs") ?

@ayappanec
Contributor Author

That issue mainly concerns a problem with the XLC compiler.
In this case, I am using GCC (version 8.1.0) with the AIX native assembler and linker.

At a high level, the error seems to come from the gcc compiler itself, not from the assembler. But maybe the two are internally related?

@martin-frbg
Collaborator

If it was gcc then I think @quickwritereader would have encountered the problem already(?).
Can you try changing the entries for IDAMINKERNEL and IDAMAXKERNEL in kernel/power/KERNEL.POWER8 so that they point at the generic C kernels ../arm/idamin.c and ../arm/idamax.c ,
just to see if any similar errors pop up with the other kernel files ?

@quickwritereader
Contributor

I have no idea what causes that issue. gcc 7.2 did not complain when I used it.
If the problem cannot be handled by a small modification and those kernels are in demand, I could try to supply a vectorized C version too.

 gcc --version
gcc (Ubuntu 7.2.0-1ubuntu1~16.04) 7.2.0
Copyright (C) 2017 Free Software Foundation, Inc.

If anyone has an idea, here are the constraints:

            : [maxf] "=m"(*maxf),[ptr_tmp] "+&b"(x),[index] "=r"(index), [n] "+&r"(n)
            : [mem] "m"(*(const double (*)[n])x), [ptr_x] "b"(x), [ptr_maxf] "b"(maxf) ,
            [i16] "b"(16), [i32] "b"(32), [i48] "b"(48),
            [i64] "b"(64), [i80] "b"(80), [i96] "b"(96), [i112] "b"(112),
            [start] "v"(start),  [adder] "v"(temp_add_index)
            : "cc", "vs0", "vs1","vs2","vs3", "vs4","vs5","vs32", "vs33", "vs34", "vs35", "vs36",
            "vs37", "vs38", "vs39", "vs40", "vs41", "vs42", "vs43", "vs44", "vs45", "vs46", "vs47", "vs48", "vs49", "vs50", "vs51"
            );

@ayappanec
Contributor Author

@martin-frbg There is no idamax.c & idamin.c inside kernel/arm.

@martin-frbg
Collaborator

Indeed these are ../arm/iamax.c and ../arm/iamin.c , sorry.

@ayappanec
Contributor Author

With the above-mentioned changes, compilation proceeded further and finally stopped with an assembler error. Debugging the error reveals that the AIX assembler does not understand the instruction "xxswapd".

After some searching, it appears to me that this instruction is used to convert the big-endian word format to little-endian. Since AIX is big-endian, it may not need this, so I am thinking of removing this line from all the files.

@martin-frbg
Collaborator

This could point to a bigger problem unfortunately. Actually I am not sure about the status of Power8BE support - I suspect all the recent work by quickwritereader was done on and for LITTLE-ENDIAN systems.

@martin-frbg
Collaborator

Actually looking at the code, the "xxswapd" instruction is used only in the older Power8 kernels written by wernsaar and only some of them have any if defined(AIX) - notably not those which use xxswapd.

@ayappanec
Contributor Author

Okay. Not sure why the AIX assembler is unable to recognize "xxswapd". Maybe it is not implemented? I will check with the AIX assembler team.

@brada4
Contributor

brada4 commented Feb 5, 2019

Try prepending GNU binutils to the path, or try any other assembler you may have around.
gcc may be generating input that is correct only for its own backend (even outside the OpenBLAS asm blocks).

@martin-frbg
Collaborator

@brada4 he mentioned above that he is not using gnu assembler on purpose, as it still has problems on AIX. Also the problem appears to be with wernsaar's handcoded assembly files, so code generation by gcc is probably not involved. Also note that he works for IBM (and IIRC on the team that ports open source software to AIX).

@brada4
Contributor

brada4 commented Feb 5, 2019

But it is worth knowing whether gas works in this context, i.e. whether the assembly files are completely wrong for modern AIX or need only a small improvement to be usable.

@ayappanec
Contributor Author

According to the AIX assembler team, "xxswapd" is not implemented in the AIX assembler. They mentioned that it is an extended mnemonic and that "xxpermdi" (with the proper arguments) can be used as an alternative.

@brada4 We can't use gas (the GNU assembler) on AIX right now. Binutils was ported to AIX a long time ago, but not properly. The XCOFF object files created by gas are still not recognized by the GNU linker (or the AIX native linker), so that route is broken.

I will try the alternative mnemonic "xxpermdi", but as @martin-frbg pointed out, the work was done for little-endian systems, so I may encounter runtime issues on AIX (which is big-endian).
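
A minimal sketch of that substitution, assuming the kernels spell the instruction as `xxswapd XT, XA` with simple register operands. In the Power ISA, `xxswapd XT,XA` is an extended mnemonic for `xxpermdi XT,XA,XA,2`, so the rewrite is mechanical. The helper name `expand_xxswapd` and the operand pattern are illustrative and not part of OpenBLAS, and the rewrite only addresses what the assembler accepts, not any endianness assumptions in the surrounding code.

```python
import re

# Matches "xxswapd <target>, <source>". The operand pattern (register names
# or numbers such as vs8, 32, %x3) is an assumption about the source style.
_XXSWAPD = re.compile(r"\bxxswapd(\s+)([%\w]+)\s*,\s*([%\w]+)")


def expand_xxswapd(asm_source):
    """Replace each xxswapd with the equivalent base xxpermdi form."""
    def _rewrite(match):
        gap, target, source = match.groups()
        # xxswapd XT,XA is defined as xxpermdi XT,XA,XA,2.
        return f"xxpermdi{gap}{target}, {source}, {source}, 2"

    return _XXSWAPD.sub(_rewrite, asm_source)


if __name__ == "__main__":
    print(expand_xxswapd("\txxswapd\tvs8, vs40"))
```

Lines that do not contain `xxswapd` pass through unchanged, so the function could be applied to whole files.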

@martin-frbg
Collaborator

Actually I am still unclear about the implementation status - the presence of #if defined(AIX) in some of the files suggests that they were written with both LE and BE in mind but all the IBM-sponsored work done by wernsaar in 2016 appears to have been for Linux.

@martin-frbg
Collaborator

Perhaps the quickest approach is to simply comment out all occurrences of xxswapd on the assumption that they appear in LE-specific code. At the very least this would allow other syntax problems to show,
and if we are very lucky perhaps some functions will even return meaningful results.

@ayappanec
Contributor Author

@martin-frbg Right. That is what I am going to do now.

@martin-frbg
Collaborator

martin-frbg commented Feb 26, 2019

Any progress ? If the presence of xxswapd can be used as an indicator for "problematic" files, the majority appear to be microkernels that are conditionally included from related C files that - I think - all have a fallback implementation. So it might be possible to just add an ifndef AIX there, which could reduce the problem to DTRSM,ZGEMM and ZTRMM. If this does not make sense, perhaps the
cpu autodetection code should be made to report all recent cpus as POWER5 on AIX for now ?

@ayappanec
Contributor Author

ayappanec commented Feb 26, 2019

I took a different approach.
I tried building with TARGET=POWER8 on a RHEL 7.4 POWER8 big-endian machine to confirm whether it works there. It failed at this point:

OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=1 ./dblat3 < ./dblat3.dat

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x3fff95befab7 in ???
#1 0x3fff95ce0477 in ???
#2 0x100832a8 in ???
#3 0x100181b3 in ???
#4 0x1000f9c3 in ???
#5 0x10005b87 in dchk3_
at /root/OpenBLAS/OpenBLAS/test/dblat3.f:1059
#6 0x1000e777 in dblat3
at /root/OpenBLAS/OpenBLAS/test/dblat3.f:292
#7 0x10001a5b in main
at /root/OpenBLAS/OpenBLAS/test/dblat3.f:355
OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=1 ./cblat2 < ./cblat2.dat
OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=1 ./zblat2 < ./zblat2.dat
rm -f ?BLAT2.SUMM
OMP_NUM_THREADS=2 ./sblat2 < ./sblat2.dat
/bin/sh: line 1: 7779 Segmentation fault (core dumped) OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=1 ./dblat3 < ./dblat3.dat
make[1]: *** [level3] Error 139

Then I tried TARGET=POWER7, which falls back to POWER6, and the build was successful. It ran openblas_utest as part of the build and all tests passed.

I tried the same on AIX and the build is successful, but openblas_utest does not run as part of the build. Invoking it manually, I don't see any tests run.
./openblas_utest
RESULTS: 0 tests (0 ok, 0 failed, 0 skipped) ran in 0 ms

Any idea on this ?

@martin-frbg
Collaborator

I am not sure if I understand the inner workings of ctest.h, but I see nothing specific to or against AIX there. openblas_utest not running at all during the build should only happen when the build system is
assuming cross-compilation. (Can you get the utest_main2.c to build manually, it uses a slightly different approach to define and execute the tests that is used for cmake/clang builds on Windows ?)
As the utests "only" check specific corner cases, getting the main BLAS tests to pass is more important - obviously one would need to build with -g (or DEBUG=1) to get a more meaningful backtrace from the dblat3 segfault but line 1059 in dblat3.f is where it tests DTRMM.

@ayappanec
Contributor Author

Okay. I see CROSS is set to 1 during the AIX build, and that is why the main BLAS tests are not running.
I manually set cross = 0 in c_check at the appropriate place and now the tests run.
All tests pass (will the build fail if any test fails?)

But openblas_utest is still not executed properly.
./openblas_utest
RESULTS: 0 tests (0 ok, 0 failed, 0 skipped) ran in 0 ms

You suggested building utest_main2.c manually, but I see the binary is made from multiple object files. Let me dig more into this.

@martin-frbg
Collaborator

utest_main2.c has all the tests included, so something like this should work:
gcc -I.. -I. utest_main2.c ../libopenblas.a -o utest2

@ayappanec
Contributor Author

It worked. Earlier I was looking at utest_main.c instead of utest_main2.c. My mistake.
gcc -maix64 -I.. -I. utest_main2.c ../libopenblas.a -lgomp -lpthread -lm -o utest2
./utest2
TEST 1/20 amax:samax [OK]
TEST 2/20 drotmg:rotmg [OK]
TEST 3/20 drotmg:rotmg_issue1452 [OK]
TEST 4/20 drotmg:rotmg_D1eqD2_X1eqX2 [OK]
TEST 5/20 drotmg:drotmg_D1_big_D2_big_flag_zero [OK]
TEST 6/20 axpy:daxpy_inc_0 [OK]
TEST 7/20 axpy:zaxpy_inc_0 [OK]
TEST 8/20 axpy:saxpy_inc_0 [OK]
TEST 9/20 axpy:caxpy_inc_0 [OK]
TEST 10/20 zdotu:zdotu_n_1 [OK]
TEST 11/20 zdotu:zdotu_offset_1 [OK]
TEST 12/20 dsdot:dsdot_n_1 [OK]
TEST 13/20 rot:drot_inc_0 [OK]
TEST 14/20 rot:zdrot_inc_0 [OK]
TEST 15/20 rot:srot_inc_0 [OK]
TEST 16/20 rot:csrot_inc_0 [OK]
TEST 17/20 swap:dswap_inc_0 [OK]
TEST 18/20 swap:zswap_inc_0 [OK]
TEST 19/20 swap:sswap_inc_0 [OK]
TEST 20/20 swap:cswap_inc_0 [OK]
RESULTS: 20 tests (20 ok, 0 failed, 0 skipped) ran in 0 ms

@martin-frbg
Collaborator

That does look encouraging, so at least POWER6 appears to be working for AIX. (The check for cross-compilation in c_check makes some doubtful assumptions based on the presence of dashes in the compiler name and path if I remember correctly)

@ayappanec
Contributor Author

So it seems that with TARGET=POWER6 the code works on Power big-endian as well.

@ayappanec
Contributor Author

ayappanec commented Feb 27, 2019

@martin-frbg Thanks for all your support.
The reason I skipped building the POWER8 code on AIX is that there are more extended mnemonics (apart from xxswapd) that the AIX assembler does not implement, and AIX is big-endian as well. So I decided to try Linux big-endian first.
I will try the debug build and analyze the dblat3 segfault.

@ayappanec
Contributor Author

I am trying to create a shared object on AIX from the library archive "libopenblas_power6p-r0.3.5.a".
On Linux, the Makefile does this with the linker option "--whole-archive". The AIX linker does not understand this option, so I went ahead and created an export file using the "CreateExportList" utility from xlc. But then I realized that some symbols are ignored, which causes problems when compiling scipy. I am exploring other options.
Any ideas?

@brada4
Contributor

brada4 commented Mar 12, 2019

The AIX ld manual says it defaults to linking in all unique symbols from command-line archives unless told otherwise. The local manual page says the same as the one I found on the internet.

@michelmno

FYI, there is a similar segfault with the Linux GNU compiler for ppc64 (BE), as reported for openSUSE:
https://bugzilla.suse.com/show_bug.cgi?id=1129160
Does it need another issue?

@ayappanec
Contributor Author

I checked the Bugzilla report. The stack trace looks similar.

@brada4
Contributor

brada4 commented Mar 14, 2019

The stack traces need function names. I can only guess that the 0x1... addresses belong to the program and the 0x3f... addresses to something else, like a vdso-style stub, libc, or the kernel.

@martin-frbg
Collaborator

martin-frbg commented Mar 14, 2019

I now remember that the PPC64BE failure was also mentioned in passing in #1469 (RedHat/Fedora); I understood they normally build for POWER6 as the least common denominator. This should probably be made the default in the build system for big-endian until the current POWER8 kernels are modified to work on big-endian as well. (Can I rely on uname -m output not having le attached on big-endian systems,
or is there a better, more general way to distinguish the two?)
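
On Linux, `uname -m` reports `ppc64le` for little-endian POWER and `ppc64` for big-endian, so a name check is workable there. A minimal sketch, in which the function name `power_byte_order_from_machine` and the sets of accepted strings are assumptions; other systems may report something else entirely, in which case the sketch returns `None` rather than guessing.

```python
import platform

# Machine strings treated as little- or big-endian POWER. These sets are
# assumptions for illustration and may not cover every operating system.
_LITTLE = {"ppc64le", "powerpc64le"}
_BIG = {"ppc", "ppc64", "powerpc", "powerpc64"}


def power_byte_order_from_machine(machine):
    """Map a `uname -m` style string to 'little', 'big', or None."""
    name = machine.strip().lower()
    if name in _LITTLE:
        return "little"
    if name in _BIG:
        return "big"
    return None  # unknown string: the caller needs another probe


if __name__ == "__main__":
    # platform.machine() returns the same value as `uname -m` on Unix.
    print(power_byte_order_from_machine(platform.machine()))
```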

@ayappanec
Contributor Author

On AIX, uname -m shows the machine ID number.
From the man page:
"Displays the machine ID number of the hardware running the system"

So it is not a good option. I am checking for more generic options.

@ayappanec
Contributor Author

How about this ?
gcc -E -dM - </dev/null | grep BIG_ENDIAN
#define __BIG_ENDIAN__ 1
#define __FLOAT_WORD_ORDER__ __ORDER_BIG_ENDIAN__
#define _BIG_ENDIAN 1
#define __VEC_ELEMENT_REG_ORDER__ __ORDER_BIG_ENDIAN__
#define __ORDER_BIG_ENDIAN__ 4321
#define __BYTE_ORDER__ __ORDER_BIG_ENDIAN__

But this works only on systems with gcc installed.
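
One caveat with grepping for BIG_ENDIAN: GCC defines `__ORDER_BIG_ENDIAN__` on every target, so the grep prints at least one line on little-endian systems too. The value of `__BYTE_ORDER__` is what distinguishes the two. A minimal sketch of that check, assuming the macro dump has already been captured as text; the function name `byte_order_from_macros` is hypothetical.

```python
import re


def byte_order_from_macros(dump):
    """Return 'big', 'little', or None from a `-E -dM` style macro dump."""
    # Collect "#define NAME VALUE" pairs; this also copes with a dump
    # whose line breaks were lost.
    macros = dict(re.findall(r"#define\s+(\w+)\s+(\S+)", dump))
    order = macros.get("__BYTE_ORDER__")
    if order == "__ORDER_BIG_ENDIAN__":
        return "big"
    if order == "__ORDER_LITTLE_ENDIAN__":
        return "little"
    return None  # macro missing: the compiler needs a different probe


if __name__ == "__main__":
    sample = (
        "#define __ORDER_BIG_ENDIAN__ 4321\n"
        "#define __BYTE_ORDER__ __ORDER_BIG_ENDIAN__\n"
    )
    print(byte_order_from_macros(sample))
```

The input could come from `gcc -E -dM - </dev/null`; compilers that do not define `__BYTE_ORDER__` would yield `None` here.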

@martin-frbg
Collaborator

Sorry, I assume in AIX Big Endian is a given, so this would matter "only" for Linux (and perhaps any BSD) ?

@ayappanec
Contributor Author

Okay. In that case, I think "uname -m" would be sufficient. Not sure about BSD though.

@brada4
Contributor

brada4 commented Mar 14, 2019

It is somewhere else like sysctl hw
https://www.freebsd.org/cgi/man.cgi?query=uname&sektion=1

@ayappanec
Contributor Author

Regarding the suggestion that AIX ld by default links in all unique symbols from command-line archives:

I checked it and did some testing. It never worked. On AIX, creating a shared object from an archive of object files always requires an export file. If no export file is given, no symbols are exported. I even checked with the AIX linker team here.

@ayappanec
Contributor Author

Because of the above situation, I have to depend on scripts like "CreateExportList" to create the export file beforehand. The AIX-specific asm syntax in common_power.h also needs some changes, for which I will create a PR.
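
A rough sketch of what such a script has to do, assuming symbol listings in BSD `nm` format (address, type letter, name) and assuming that the symbols to export are the global text, data, and BSS entries whose names do not start with a dot. Both the filter and the function name `export_list_from_nm` are illustrative assumptions, not the behaviour of "CreateExportList" itself; the output is one symbol name per line.

```python
def export_list_from_nm(nm_output, wanted_types="TDB"):
    """Build export-file text (one symbol per line) from BSD-style nm output."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank lines and per-object header lines
        # Undefined symbols have no address column, so read from the end.
        sym_type, name = fields[-2], fields[-1]
        if len(sym_type) != 1 or sym_type not in wanted_types:
            continue  # skips undefined (U) and local (lowercase) symbols
        if name.startswith("."):
            continue  # assumption: dot-prefixed entry points are not exported
        symbols.add(name)
    return "\n".join(sorted(symbols)) + "\n"


if __name__ == "__main__":
    sample = (
        "0000000000000000 T .dgemm_\n"
        "0000000000000100 D dgemm_\n"
        "                 U printf\n"
    )
    print(export_list_from_nm(sample), end="")
```

Comparing the generated list against the symbols that scipy fails to resolve would show whether the filter is too strict.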

@martin-frbg
Collaborator

This should be fixed now with kavanabhat's #2338
