Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing response SCF test #487

Open
stigrj opened this issue May 30, 2024 · 6 comments
Open

Failing response SCF test #487

stigrj opened this issue May 30, 2024 · 6 comments
Labels

Comments

@stigrj
Copy link
Contributor

stigrj commented May 30, 2024

Random segfault in Projecting occupied space in response SCF test

@stigrj stigrj added the bug label May 30, 2024
@stigrj
Copy link
Contributor Author

stigrj commented Jun 4, 2024

@gitpeterwind I see you have written this comment in MRCPP/src/utils/ComplexFunction.cpp :

//For some unknown reason the h2_mag_lda test sometimes fails when schedule(dynamic) is chosen

I have traced a failing test li_pol_lda to the same OMP loop, even with schedule(static), so there seems to be some issue here.

@stigrj
Copy link
Contributor Author

stigrj commented Jun 4, 2024

The issue appears when compiling with pure OMP, no MPI, and is randomly triggered when running the test case as is, but seems to always trigger if I change world_prec from 1e-3 to 1e-2.

@gitpeterwind
Copy link
Member

There seems to be an obvious bug: the line 1804:
Sreal(orbVecBra[i], orbVecKet[j]) += S_temp(i, j);
should be replaced in the same way as is done for the real case (20 lines higher):

                        // must ensure that threads are not competing
                        double &Srealij = Sreal(orbVecBra[i], orbVecKet[j]);
                        double &Stempij = S_temp(i, j);
#pragma omp atomic
                        Srealij += Stempij;

If you have a setup where you can test this easily, that would be nice (even push a fix?)

@stigrj
Copy link
Contributor Author

stigrj commented Jun 4, 2024

Yes, I will test it later 🙂

@stigrj
Copy link
Contributor Author

stigrj commented Jun 4, 2024

Hmm, but I'm now running in the if (serial) branch where the omp atomic is already added 🤔 But for some reason I'm not able to trigger the error anymore, even without changing the code... Need to look closer tomorrow.

I remember having some issues with the += operator in Eigen before, though.

@gitpeterwind
Copy link
Member

Sorry, I read wrongly the if-else block. So my previous comment was meaningless.
It is difficult with these non-reproducible bugs. But I cannot see another unsafe part than that +=. We can put that in a safer way, but I cannot do that now.

@stigrj stigrj mentioned this issue Sep 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants