Some tests fail when built with gfortran and run with 2 or more OpenMP threads #121

Open
1 of 3 tasks
kohei-noda-qcrg opened this issue Dec 5, 2023 · 2 comments
Labels
bug Something isn't working

Comments

kohei-noda-qcrg commented Dec 5, 2023

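What is the bug?

Describe the bug clearly and concisely.

Some tests fail with the error shown below, which is raised outside the dirac_caspt2 code itself.
The backtrace mentions dlopen, so could the problem be related to opening files?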

________________________________________________________ test_c1_methane_slow ________________________________________________________
/home/noda/develop/dirac_caspt2/test/slow/c1_methane_slow/test_c1_methane_slow.py:19: in test_c1_methane_slow
    run_test_dcaspt2(test_command)
/home/noda/develop/dirac_caspt2/test/module_testing.py:59: in run_test_dcaspt2
    process.check_returncode()
/home/noda/.pyenv/versions/3.9.15/lib/python3.9/subprocess.py:460: in check_returncode
    raise CalledProcessError(self.returncode, self.args, self.stdout,
E   subprocess.CalledProcessError: Command '/home/noda/develop/dirac_caspt2/bin/dcaspt2 -i /home/noda/develop/dirac_caspt2/test/slow/c1_methane_slow/active.inp -o /home/noda/develop/dirac_caspt2/test/slow/c1_methane_slow/c1_methane_slow.caspt2.out  --omp 2' returned non-zero exit status 139.
-------------------------------------------------------- Captured stdout call --------------------------------------------------------
/home/noda/develop/dirac_caspt2/test/slow/c1_methane_slow test start
Script path :  /home/noda/develop/dirac_caspt2/bin/dcaspt2
Command submitted directory :  /home/noda/develop/dirac_caspt2/test/slow/c1_methane_slow
Scratch directory : /home/noda/dcaspt2_scratch/active_2023-12-05_18-17-10_y88idwht
log : Copied 1-2 integral files to the scratch directory
Created command :  /home/noda/develop/dirac_caspt2/bin/r4dcascicoexe &&  /home/noda/develop/dirac_caspt2/bin/r4dcaspt2ocoexe

================= Standard error =================

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7f6e1b67d960 in ???
#1  0x7f6e1b67cac5 in ???
#2  0x7f6e1b32551f in ???
        at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#3  0x7f6e228a42c5 in _dl_open
        at ./elf/dl-open.c:824
#4  0x7f6e1b37363b in dlopen_doit
        at ./dlfcn/dlopen.c:56
#5  0x7f6e1b457c87 in __GI__dl_catch_exception
        at ./elf/dl-error-skeleton.c:208
#6  0x7f6e1b457d52 in __GI__dl_catch_error
        at ./elf/dl-error-skeleton.c:227
#7  0x7f6e1b37312d in _dlerror_run
        at ./dlfcn/dlerror.c:138
#8  0x7f6e1b3736c7 in dlopen_implementation
        at ./dlfcn/dlopen.c:71
#9  0x7f6e1b3736c7 in ___dlopen
        at ./dlfcn/dlopen.c:81
#10  0x7f6e1d8da141 in ???
#11  0x7f6e20c6f7b4 in ???
#12  0x7f6e1c90d1c8 in ???
Segmentation fault

================= Calculation finished ================
User Command : /home/noda/develop/dirac_caspt2/bin/dcaspt2 -i /home/noda/develop/dirac_caspt2/test/slow/c1_methane_slow/active.inp -o /home/noda/develop/dirac_caspt2/test/slow/c1_methane_slow/c1_methane_slow.caspt2.out --omp 2
Auto-created Command :  /home/noda/develop/dirac_caspt2/bin/r4dcascicoexe &&  /home/noda/develop/dirac_caspt2/bin/r4dcaspt2ocoexe
Scratch directory : /home/noda/dcaspt2_scratch/active_2023-12-05_18-17-10_y88idwht
Output file : /home/noda/develop/dirac_caspt2/test/slow/c1_methane_slow/c1_methane_slow.caspt2.out
Calculation started at : 2023-12-05 18:17:10
Calculation finished at : 2023-12-05 18:17:11
Elapsed time: 0.7717 sec

ERROR: dirac-caspt2 calculation failed with return code 139
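
For reading the log above: the return code 139 is consistent with the SIGSEGV in the backtrace, assuming the usual POSIX shell convention that a status above 128 means "killed by signal (status - 128)". The helper below is only a sketch for decoding such codes; the function name `describe_exit_status` is hypothetical and is not part of dirac_caspt2.

```python
import signal


def describe_exit_status(code: int) -> str:
    """Describe a process exit status in readable form.

    Assumption: POSIX shell convention, where a status above 128 means the
    process was killed by signal (status - 128). Python's subprocess module
    reports a directly killed child as a negative value instead, so both
    forms are handled.
    """
    if code < 0:
        signum = -code
    elif code > 128:
        signum = code - 128
    else:
        return f"exited with status {code}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Not a signal number known on this platform.
        name = f"signal {signum}"
    return f"killed by {name}"


# 139 - 128 = 11, which is SIGSEGV on Linux.
print(describe_exit_status(139))
```

The status here is 139 rather than -11 presumably because the executables are launched through a shell-style command chain (`... && ...`), so the wrapper sees the shell's encoding of the signal.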

Which tests are affected

  • test_c1_methane_slow
  • test_cs_methanol_slow

Expected behavior

Describe what should happen once the bug is fixed.

  • The tests pass.

Reproducibility

If the bug is reproducible, describe the steps to reproduce it.

git clone https://github.com/RQC-HU/dirac_caspt2.git
cd dirac_caspt2
git checkout -b refactor
./setup --build --fc=gfortran -j --omp
pytest --omp 10 test/slow/c1_methane_slow/ --all

Environment

Select and describe your build environment.

  • gfortran
  • Intel Fortran
  • mkl

Write the DIRAC version and similar information below.
(e.g.) DIRAC ver 21.1
parallel build

(optional) inputs

Describe the inputs used for the calculation.

(optional) Screenshots

If applicable, add screenshots to help explain your problem.

(optional) Additional context

Add any other context about the problem here.

@kohei-noda-qcrg kohei-noda-qcrg added the bug Something isn't working label Dec 5, 2023
kohei-noda-qcrg commented Dec 5, 2023

The crash occurs inside LAPACK's ZHEEV. We may not be able to fix this on our side.

Desktop.2023.12.05.-.18.41.19.01.-.Trim.mp4

kohei-noda-qcrg added a commit that referenced this issue Dec 5, 2023
c1_methane_slow, cs_methanol_slow tests failed with --omp > 1
ref: #121
kohei-noda-qcrg commented Dec 5, 2023

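This looks like a fairly complicated problem, so to keep the CI from failing
I decided to work around it for now by not passing the --omp 2 option
to the tests that fail. This is only a stopgap.

Once this issue is resolved, I plan to add the --omp 2 option back
to the --all and --slowonly test runs.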

- name: Run unittest(serial, run slowonly tests, pull_request)
  if: ${{ github.event_name == 'pull_request' }}
  run: |
    pytest --slowonly
- name: Run unittest(serial, run normal and slow tests, push to other than main branch)
  if: ${{ github.ref_name != 'main' && github.event_name == 'push' }}
  run: |
    pytest --omp=2
- name: Run unittest(serial, run all tests, push to main branch)
  if: ${{ github.ref_name == 'main' && github.event_name == 'push' }}
  run: |
    pytest --all
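
As an alternative to dropping --omp 2 from whole CI steps, the same stopgap could in principle be applied per test when the command line is built. The following is only a sketch under assumptions: `build_test_command` and `OMP_UNSAFE_TESTS` are hypothetical names that do not exist in the repository, and only the `-i`, `-o`, and `--omp` flags are taken from the log in this issue.

```python
from typing import List, Optional

# Hypothetical list: the tests reported in this issue as crashing
# when run with more than one OpenMP thread.
OMP_UNSAFE_TESTS = {"c1_methane_slow", "cs_methanol_slow"}


def build_test_command(
    dcaspt2: str,
    test_name: str,
    input_file: str,
    output_file: str,
    omp_threads: Optional[int] = None,
) -> List[str]:
    """Build the dcaspt2 command line for one test.

    The --omp option is omitted for tests listed in OMP_UNSAFE_TESTS,
    so those tests fall back to the program's default threading.
    """
    cmd = [dcaspt2, "-i", input_file, "-o", output_file]
    if omp_threads is not None and test_name not in OMP_UNSAFE_TESTS:
        cmd += ["--omp", str(omp_threads)]
    return cmd


# The affected test is run without --omp even though 2 threads were requested.
print(build_test_command("bin/dcaspt2", "c1_methane_slow", "active.inp", "out.caspt2.out", 2))
```

Keeping the exclusion list in one place would make it easy to delete once this issue is resolved, at the cost of hiding the fact that those tests no longer exercise the OpenMP path.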
