compile OpenBLAS-0.2.20 issues #2655

Closed
SUNLOVEERIN opened this issue Jun 10, 2020 · 15 comments

Comments

@SUNLOVEERIN

Hi, when I compile OpenBLAS on Ubuntu, the following error is printed:
Backtrace for this error:
#0 0x7f6de75b9cd1 in ???
#1 0x7f6de75b8ea5 in ???
#2 0x7f6de727c20f in ???
#3 0x562d50403763 in ???
#4 0x562d50403b60 in ???
#5 0x562d503d0087 in ???
#6 0x562d503ca947 in ???
#7 0x562d503cf6ea in ???
#8 0x562d503c064e in ???
#9 0x7f6de725d0b2 in ???
#10 0x562d503c06dd in ???
#11 0xffffffffffffffff in ???
OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=1 ./zblat3 < ./zblat3.dat
Segmentation fault (core dumped)
make[1]: *** [Makefile:36: level2] Error 139
make[1]: *** Waiting for unfinished jobs....
rm -f ?BLAT3.SUMM
OPENBLAS_NUM_THREADS=2 ./sblat3 < ./sblat3.dat
OPENBLAS_NUM_THREADS=2 ./dblat3 < ./dblat3.dat
OPENBLAS_NUM_THREADS=2 ./cblat3 < ./cblat3.dat
OPENBLAS_NUM_THREADS=2 ./zblat3 < ./zblat3.dat
make[1]: Leaving directory '/home/jiayusun/Downloads/cp2k-6.1/tools/toolchain/build/OpenBLAS-0.2.20/test'
make: *** [Makefile:117: tests] Error 2

How can I handle this problem?
Thank you so much

Sincerely
SUN

@martin-frbg
Collaborator

What kind of CPU is this? And are you really required to use 0.2.20 from three years ago, or could you try the current 0.3.9 release instead?

@SUNLOVEERIN
Author

The CPU is GUN. I am installing cp2k now, and this version is the default one it installs, but when compiling cp2k, the following error appears:

ERROR: (/home/jiayusun/Downloads/cp2k-6.1/tools/toolchain/scripts/install_openblas.sh, line 58) Non-zero exit code detected.

So, I'd like to install this package by myself to get through this compile process. If I can figure out the above issues, then there is no need to install it separately.

I am a new coder, so it's kind of hard for me.
Do you have some tips?

Thank you so much

@martin-frbg
Collaborator

The OpenBLAS build script will normally figure out what model of CPU you have (at least if it is compatible with some known Intel, AMD or ARM model; I have no idea what GUN is).
Perhaps you could install the 7.1 release of cp2k instead of 6.1 (or just copy the toolchain scripts for building OpenBLAS from the new version). The newer one will at least install OpenBLAS 0.3.6.
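
For reference, one way to answer the CPU question on Linux is to read the model name from /proc/cpuinfo. This is only a minimal sketch, assuming a Linux system whose /proc/cpuinfo has a "model name" field (typical on x86); the function name `cpu_model` is made up for illustration and is not part of OpenBLAS or cp2k.

```shell
#!/bin/sh
# cpu_model: print the first "model name" entry from a cpuinfo-style file.
# Defaults to /proc/cpuinfo (Linux). Illustrative helper, not an OpenBLAS tool.
cpu_model() {
    file="${1:-/proc/cpuinfo}"
    # Drop everything up to and including the first ":" plus following blanks,
    # then print the remainder of the first matching line and stop.
    awk '/^model name/ { sub(/^[^:]*:[ \t]*/, ""); print; exit }' "$file"
}
```

Calling `cpu_model` with no argument prints the model string of the first processor listed, which is usually enough to tell an Intel part from an AMD one.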

@SUNLOVEERIN
Author

Sorry, the CPU is Intel, but I use gfortran to compile.
Thank you so much, I will check the new version and have a try.

@SUNLOVEERIN
Author

Hi, sorry to bother you, but the error also appears when I install cp2k 7.1 with OpenBLAS 0.3.6. The error messages are listed below:

==================== Installing OpenBLAS ====================
OpenBLAS-0.3.6.tar.gz is found
Installing from scratch into /home/jiayusun/Downloads/cp2k-7.1/tools/toolchain/install/openblas-0.3.6
patching file kernel/x86_64/KERNEL.SKYLAKEX
ERROR: (/home/jiayusun/Downloads/cp2k-7.1/tools/toolchain/scripts/install_openblas.sh, line 72) Non-zero exit code detected.
ERROR: (./scripts/install_mathlibs.sh, line 34) Non-zero exit code detected.

What can I do in this case?
What is the meaning of the sentence "Non-zero exit code detected."? When I compile other packages, this error occurs many times.

Thank you for your help or advice.

Sincerely

@martin-frbg
Collaborator

The message only means "something went wrong, the script failed". Are there any logs written by this install_openblas script that would tell what actually went wrong? (It could simply be that you do not have a Fortran compiler installed, or that some other piece of software is missing.)
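
To illustrate the idea: every command hands a status number back to the shell when it finishes, 0 for success and anything else for failure. The sketch below shows how a script might detect that; `run_step` is a hypothetical helper written for this example and is not taken from the cp2k toolchain scripts.

```shell
#!/bin/sh
# run_step: run a command and report a non-zero exit status on stderr.
# Hypothetical helper for illustration only.
run_step() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "ERROR: '$*' exited with code $status" >&2
    fi
    return "$status"
}

run_step true                                   # status 0: nothing is reported
run_step sh -c 'exit 3' || echo "step failed"   # status 3: reported as an error
```

A missing program is reported the same way: the shell itself returns status 127 when it cannot find the command it was asked to run.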

@martin-frbg
Collaborator

According to https://github.com/cp2k/cp2k/blob/master/tools/toolchain/scripts/install_openblas.sh there should be a file named install.serial.log with the messages from the attempt to install OpenBLAS. Could you upload this here? Seeing that it now fails in the "install" step, the original problem with the old version is indeed fixed, and it could simply be that you do not have write permissions where the script wants to install OpenBLAS (maybe you need to run this script with "sudo" to have administrator rights to install libraries).

@SUNLOVEERIN
Author

Thank you so much, I have installed the gfortran package. Also, the file named install.serial.log cannot be found, so some files related to OpenBLAS are attached; please have a look.
openblas.zip

@brada4
Contributor

brada4 commented Jun 12, 2020

             make -j $NPROCS \
                   MAKE_NB_JOBS=0 \
                   TARGET=NEHALEM \
                   USE_THREAD=0 \
                   CC="${CC}" \
                   FC="${FC}" \
                   PREFIX="${pkg_install_dir}" \
                   > make.serial_nehalem.log 2>&1 \
            ) <---- line 72
            make -j $NPROCS \
                 MAKE_NB_JOBS=0 \
                 USE_THREAD=0 \
                 CC="${CC}" \
                 FC="${FC}" \
                 PREFIX="${pkg_install_dir}" \
                 install > install.serial.log 2>&1

We need make.serial_nehalem.log; maybe inspecting it gives you hints to fix this without posting.
MAKE_NB_JOBS should be negative, see #829.
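
A quick way to inspect such a log is to pull out the first lines that look like failures. This is a rough sketch: the function name `first_errors` is invented here, and the search pattern is only a guess at common failure wording, not an exhaustive list.

```shell
#!/bin/sh
# first_errors: print up to five numbered lines of a log that look like failures.
# The pattern is an assumption about typical error wording.
first_errors() {
    log="$1"
    grep -n -i -E 'error|not found|no such file' "$log" | head -n 5
}
```

For example, `first_errors make.serial_nehalem.log` run in the directory holding that log would list the matching lines with their line numbers.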

@brada4
Contributor

brada4 commented Jun 12, 2020

If by GUN you mean the Chinese-produced AMD-Hygon Zen, then you need TARGET=HASWELL for 0.2.20.
It is probably worth building OpenBLAS without any parameters to determine whether the build script outside OpenBLAS breaks it or whether OpenBLAS itself is broken.
The Ubuntu version always helps, btw.

@martin-frbg
Collaborator

@brada4 this already moved past the original problem of building 0.2.20 and I do not think there is a reason to go back to that old version. In 0.3.6, the Hygon processor should be autodetected already.

@brada4
Contributor

brada4 commented Jun 12, 2020

The thing is, OpenBLAS builds on Ubuntu in general, and AVX512 is automatically disabled on Ubuntu 16 with its old compiler.
The picture presented is too incomplete to understand what is broken. I suspect one of two things:

  • New CPUID
  • Wrapper script either broken or adversely acting on new CPUID

Less suspect:

  • Broken toolchain, like diverging CC and FC
  • Bad virtualisation

@SUNLOVEERIN
Author

@brada4 Hi, in make.serial_nehalem.log, only these words are printed:

/home/jiayusun/Downloads/cp2k-7.1/tools/toolchain/scripts/install_openblas.sh: line 55: make: command not found

Do you mean I should insert the text you typed here into make.serial_nehalem.log?

@martin-frbg
Collaborator

You need to install the make package then. This program is used by most source code packages to drive their compilation: the "Makefiles" distributed with the code contain the instructions on which options to use and which source files to compile into the desired program(s).
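
A check along the following lines can show up front whether the build tools are present. It is a sketch only: the tool list (make, gcc, gfortran) is an assumption about what this particular build needs, and `missing_tools` is an illustrative name.

```shell
#!/bin/sh
# missing_tools: print each named program that cannot be found on PATH.
# Illustrative helper; the tool list below is an assumption for this build.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools make gcc gfortran)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "All build tools found"
fi
```

On Ubuntu, the `make` and `gfortran` packages (or `build-essential`, which pulls in make and gcc) can be installed with apt to supply anything reported as missing.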

@SUNLOVEERIN
Author

@brada4 @martin-frbg
Hi, thank you so much for your help.
Now OpenBLAS has been installed. I don't know why, but it's good news anyway.
Thank you again.
