VP exact restart and other nonBFB problems #518
Comments
/cc @JFLemieux73
I will try to take a look at some of those this week.
I'm getting back to this now (!). I ran the
and hopefully fix all of the above.
I'll document the failures in phil-blain#39
The new omp_suite also shows non-bfb results with the dynpicard setting: FAIL cheyenne_intel_smoke_gx3_18x1_cmplogrest_dynpicard_reprosum_run10day_thread bfbcomp cheyenne_intel_smoke_gx3_6x4_dynpicard_reprosum_run10day different-data. Different decomps are producing different answers; this test happens to also exercise omp.
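As general background on why a different decomp can change answers at all, here is a minimal Python sketch of one well-known mechanism: floating-point addition is not associative, so per-block partial sums depend on how the data is split. This is not a diagnosis of the failure above and it is not CICE code; `naive_sum`, `blocked_sum`, and the sample values are hypothetical, and `math.fsum` only stands in for a reproducible global sum.

```python
import math

def naive_sum(values):
    """Left-to-right floating-point accumulation (no compensation)."""
    total = 0.0
    for v in values:
        total += v
    return total

def blocked_sum(values, nblocks):
    """Sum each block separately, then sum the partial results.

    Assumes len(values) is divisible by nblocks (illustration only).
    """
    size = len(values) // nblocks
    partials = [naive_sum(values[b * size:(b + 1) * size]) for b in range(nblocks)]
    return naive_sum(partials)

# Hypothetical data chosen so that rounding is visible.
values = [1e16, 1.0, -1e16, 1.0]

one_block = blocked_sum(values, 1)   # one decomposition
two_blocks = blocked_sum(values, 2)  # another decomposition of the same data
exact = math.fsum(values)            # correctly rounded, independent of order

print(one_block, two_blocks, exact)
```

The two blocked results differ from each other even though the input is the same, while the correctly rounded sum does not depend on the split.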
Hi Tony, I did some more runs of the [...] The [...] In the meantime, maybe we could retitle this issue since there are no restart issues anymore?
I retitled it. Let me know if you'd prefer a different title.
The 'pgmres' subroutine implements a separate GMRES solver and is used as a preconditioner for the FGMRES linear solver. Since it is only a preconditioner, it was decided to skip the halo updates after computing the matrix-vector product (in 'matvec'), for efficiency. This leads to non-reproducibility, since the content of the non-updated halos depends on the block / MPI distribution. Add the required halo updates, but only perform them when we are explicitly asking for bit-for-bit global sums, i.e. when 'bfbflag' is set to something other than 'not'. Adjust the interfaces of 'pgmres' and 'precondition' (from which 'pgmres' is called) to accept 'halo_info_mask', since it is needed for masked updates. Closes CICE-Consortium#518
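To illustrate the mechanism described in the commit message, here is a toy 1D Python sketch, not the CICE Fortran code. It applies a 3-point stencil twice on a field split into blocks with one halo cell per side. All names (`split`, `matvec`, `halo_update`, `two_matvecs`) and the stencil itself are assumptions made for illustration only.

```python
def split(field, nblocks):
    """Split a global field into blocks of [left halo, owned cells, right halo].

    Cells outside the domain are zero. Assumes len(field) is divisible by nblocks.
    """
    size = len(field) // nblocks
    padded = [0.0] + list(field) + [0.0]
    return [padded[b * size:b * size + size + 2] for b in range(nblocks)]

def matvec(block):
    """Apply a 3-point stencil to the owned cells; halo cells keep their old values."""
    out = list(block)
    for i in range(1, len(block) - 1):
        out[i] = 2.0 * block[i] - block[i - 1] - block[i + 1]
    return out

def halo_update(blocks):
    """Refresh each block's halo cells from the neighbouring blocks' edge cells."""
    for b in range(len(blocks)):
        blocks[b][0] = blocks[b - 1][-2] if b > 0 else 0.0
        blocks[b][-1] = blocks[b + 1][1] if b < len(blocks) - 1 else 0.0

def gather(blocks):
    """Reassemble the global field from the owned cells of every block."""
    return [v for blk in blocks for v in blk[1:-1]]

def two_matvecs(field, nblocks, update_halos):
    """Apply the stencil twice, optionally updating halos in between."""
    blocks = [matvec(blk) for blk in split(field, nblocks)]
    if update_halos:
        halo_update(blocks)
    blocks = [matvec(blk) for blk in blocks]
    return gather(blocks)

field = [float(i * i) for i in range(1, 9)]
reference = two_matvecs(field, 1, True)
print(two_matvecs(field, 2, True) == reference)   # halos refreshed
print(two_matvecs(field, 2, False) == reference)  # stale halos
```

With the halo update, every owned cell sees the same neighbour values no matter how the field is split, so the result is identical across decompositions. Without it, cells next to a block boundary read stale halo values, and the answer changes with the number of blocks.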
See also #491.
I ran a decomp test suite with evp, eap, and vp-picard. I did not test vp-anderson as that is not ready to use out of the box. Below are the results. evp and eap pass all tests with the various decomps. However, vp-picard is more of a mixed bag. Some configurations don't run at all and some don't restart exactly. The test suite is running the same tests just changing the dynamics option. Looking at one of the failed runs, I see