Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mixed Precision Gmres #640

Closed
wants to merge 78 commits into from
Closed

Mixed Precision Gmres #640

wants to merge 78 commits into from

Conversation

thoasm
Copy link
Member

@thoasm thoasm commented Sep 14, 2020

This PR adds the compressed basis GMRES.

TODO:

@thoasm thoasm added type:solver This is related to the solvers 1:ST:WIP This PR is a work in progress. Not ready for review. mod:all This touches all Ginkgo modules. labels Sep 14, 2020
@thoasm thoasm self-assigned this Sep 14, 2020
@tcojean
Copy link
Member

tcojean commented Sep 14, 2020

First, nice that you have several examples. Quick question: do you think it makes sense to split this PR into one purely for the accessor, and another one using the accessor in the mixed precision GMRES? 8.5K lines is a bit much, though we've been having multiple such PR recently.

@thoasm thoasm force-pushed the gmres_mixed_accessor branch 8 times, most recently from c786679 to 5a7be65 Compare September 15, 2020 04:15
@thoasm
Copy link
Member Author

thoasm commented Sep 16, 2020

@tcojean thanks for the suggestion, it would definitively make sense.
I will look into extracting the accessor itself, so it is easier to discuss the changes.
It might also make sense to split the GMRES into a reference (maybe OpenMP as well) PR and a CUDA / HIP PullRequest, which should reduce the sizes as well.

@tcojean
Copy link
Member

tcojean commented Sep 17, 2020

Yes I guess we could also split the GMRES itself into at least Reference/base structure and then a PR for all the kernels. I don't think we would have to split the kernels PR though into OpenMP and another one HIP/CUDA, but we'll see depending on the size of each PR.

aliaga@uji.es and others added 14 commits October 22, 2020 20:43
  * Only core and reference executors.
  * test files don't compile, due to t problem related the macros of gtest (TEST_F -> TYPED_TEST).
  * MGS, MGS with reorthogonalization and CGS with reorthogonalization are considered.
  * Norms are still created in the internal routines.
  * Now, the norms are properly created in the main class.
  * The test files are not repaired yet.
    * For CGS, a loop of kernels is used instead of a kernel with a loop.
    * The test files are not repaired yet.
    * The messages have to be removed.
    * For CGS, a loop of kernels is used instead of a kernel with a loop.
    * For CGS, a loop of kernels is used instead of a kernel with a loop.
    * Consider another base_types for ValueTypeKrylovBases
…some errors which

were detected during the testing process in the repository. The previous value was float
whereas the original results were executed by default_precision, and these are the reason
of the errors. Now, the default value is also default_precision.
…nd cuda executors,

as a first step in the optimization process.
Also the calls for the timing are included.
…cuda executors.

For omp, the omp is trying to move to the outer loop
For cuda, the loop of kernels is change to a kernels with a loop.
  * The main routines (loop of dots and loop of axpy) are still too expensive.
…done.

Also timing instructions are included, whose management is made by some define's.

The next step will be to improve the update kernels.
Added an accessor header file (name might have to change in future)
and used it in all mixed precision kernels (but for now only for the
reduced precision accesses).

Also adds some minor fixes:
- removed unused code in the example in hopes that it compiles on
  windows
- added HIP stubs to allow HIP compilation
… close to 75s for 6221 iters.

Next steps should be:
  * Add the computation of the inf-norm for the next_krylov_basis.
  * Merge updating and norms computations.
The specialization is currently set to only work with float storage
type to test the pipeline, but it can easily be modified to work with
all integer types.
The Accessor was also moved from a shared header to a gmres_mixed
exclusive header.
Thomas Grützmacher added 22 commits October 22, 2020 20:43
Also add instantiation macro for ConstAccessors
Currently, only core is adapted with the reference test started
(not all precision combinations are tested properly).
Also CUDA and OpenMP compiles now for the new accessor layout.
Benchmarks is still TODO.
Also add instantiation for single precision floating point
Also adjust test precision to be more accurate.
Make GmresMixed reference test work on CI.
Make accessor references work with older CUDA versions by having a
conditional constexpr qualifier and by forcing it to use the overloaded
cast operation (when present).
@ginkgo-bot
Copy link
Member

Error: The following files need to be formatted:

benchmark/solver/solver.cpp
core/base/extended_float.hpp
core/base/utils.hpp
core/solver/gmres_mixed.cpp
core/solver/gmres_mixed_accessor.hpp
core/test/base/range_accessors.cpp
core/test/reorder/rcm.cpp
core/test/solver/gmres_mixed.cpp
include/ginkgo/core/base/range.hpp
include/ginkgo/core/base/range_accessors.hpp
include/ginkgo/core/base/types.hpp
include/ginkgo/core/reorder/reordering_base.hpp
include/ginkgo/core/solver/gmres_mixed.hpp
omp/solver/gmres_mixed_kernels.cpp
omp/test/reorder/rcm_kernels.cpp
omp/test/solver/gmres_mixed_kernels.cpp
reference/solver/gmres_mixed_kernels.cpp
reference/stop/residual_norm_kernels.cpp
reference/test/reorder/rcm.cpp

You can find a formatting patch under Artifacts here or run format! if you have write access to Ginkgo

@thoasm thoasm closed this Jan 23, 2021
@thoasm thoasm deleted the gmres_mixed_accessor branch February 24, 2021 06:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1:ST:WIP This PR is a work in progress. Not ready for review. mod:all This touches all Ginkgo modules. type:solver This is related to the solvers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants