Mixed Precision Gmres #640

Closed
wants to merge 78 commits into from
c1b0cc0
First commit for the GmresMixed class:
Apr 27, 2020
5eb6bd7
Inclusion of the omp executor in the repository:
Apr 27, 2020
c90e9d8
Inclusion of the cuda executor in the repository:
Apr 29, 2020
c7557d9
The test files are finally included, but:
Apr 29, 2020
3606b68
The first unoptimized version is done. Next tasks:
Apr 30, 2020
639975b
The default value for ValueTypeKrylovBasis has been changed to avoid …
josealiaga Apr 30, 2020
ff91868
Definition of the CG2 variant of the finish_arnoldi routine for omp a…
josealiaga May 3, 2020
7c2dacc
Add GMRES_mixed to benchmark
May 7, 2020
2405483
Definition of the CGS2 version of finish_arnoldi method, for omp and …
josealiaga May 7, 2020
2b7f889
Finally a good implementation of the multidot_kernels_num_iters_1 is …
josealiaga May 12, 2020
410b8bc
Add Accessor support and extend reference test
May 13, 2020
6051cbf
Made GmresMixed compile with complex types
May 13, 2020
03e1b4f
The update routines have been improved. Now the computational time is…
josealiaga May 19, 2020
0a87e51
Add specialization for integer types for Accessor
May 19, 2020
0ef41f5
Make the scale work with integer types
May 20, 2020
8ce8ede
Add helper to determine if we need a scale or not
May 20, 2020
7ff29bf
Add a helper structure to manage the scale writing in common
May 20, 2020
30a612a
Testing the push command
josealiaga May 20, 2020
b94aef7
Definition of norm2 and norminf routines in CUDA. Only the first one …
josealiaga May 22, 2020
15f915c
remove_complex has been added to the norms variables, and multinorm2_…
josealiaga May 24, 2020
275541f
Fixed cuda step2 to take a view into account
May 25, 2020
25e58aa
Change const accessor to non-const in check_arnoldi_norms_new
May 25, 2020
950f12e
The set_scale method finally works!!
josealiaga May 27, 2020
f19fbc6
Change storage layout of krylov_bases
Jun 1, 2020
d0afd08
Make memory access to krylov_bases coalesced again
Jun 2, 2020
68aba6b
Transpose grid when launching singledot kernel
Jun 4, 2020
0bee884
Add jacobi with block size 1 to benchmark
Jun 4, 2020
88dd368
Reversed the transpose of the grid dim
Jun 4, 2020
6c64a52
Add benchmark option to set RHS to 1
Jun 10, 2020
425adea
Improve argument names of run_all_benchmarks.sh
Jun 15, 2020
9190d9c
Add half precision support to GmresMixed
Jun 15, 2020
74005ea
Hopefully improve singledot performance
Jun 15, 2020
ba058ff
Fix GmresMixed int64 benchmark to use correct type
Jun 16, 2020
da07f93
Infinity norm only computed when scale is present
Jun 17, 2020
7c6e0af
Change the non-random RHS generation
Jun 23, 2020
76e1974
Change the initial guess x0 in benchmarks
Jun 23, 2020
eddf86b
Fix residual_norm calculation in GmresMixed
Jun 23, 2020
236052c
Improve printing of rel_res_goal in benchmark
Jun 24, 2020
062b31f
Change the initial guess x0 back to 0 in benchmark
Jun 24, 2020
cbc35fa
Change generation of non-random RHS
Jun 25, 2020
63856ab
Make sure GmresMixed does not exit early
Jun 25, 2020
2c675fe
Add benchmark parameter for krylov_dim
Jun 26, 2020
352b5ca
Add forced iterations when convergence is detected
Jun 26, 2020
c4f3acf
Add debug output to forced iterations
Jun 26, 2020
945a2f2
Add more jacobi preconditioner in benchmark
Jun 29, 2020
eb52310
Fix reference bug in GmresMixed
Jul 8, 2020
30d792f
DEBUG: Add write output for integral accessor
Jul 30, 2020
a19013e
DEBUG: Move towards `at` with accessor
Aug 14, 2020
694a889
Remove Accessor3dConst
Aug 17, 2020
b22ecc1
Adopt OpenMP support to new Accessor
Aug 17, 2020
244b9c8
Remove unused GMRES_mixed code from Ref & OMP
Aug 17, 2020
769012e
Adopt CUDA to the new accessor format (NOT `at`)
Aug 17, 2020
23df295
Make HIP and CUDA work with new accessor (NOT at)
Aug 17, 2020
f81b0bb
Remove unused code from CUDA
Aug 17, 2020
e490aba
CUDA implementation is now using `at`
Aug 18, 2020
56a63d5
Re-add ConstAccessor
Aug 18, 2020
3fcaee7
Fix accessor by adding additional __restrict__
Aug 19, 2020
dd899cf
GmresMixed storage prec is now a factory parameter
Aug 25, 2020
32770bf
Improve reference test and include the enum there
Aug 25, 2020
544782b
Fix the reference test to pass
Aug 26, 2020
b5ddef7
Adopt to new parameter macros
Sep 1, 2020
5d2a106
Update the helper to throw when complex
Sep 7, 2020
56d1f90
Make GmresMixed work properly with multiple RHS
Sep 9, 2020
f67b5b3
Fix benchmark to work with new GmresMixed layout
Sep 10, 2020
8adf5d3
Move GmresMixed accessor to range_accessors.hpp
Sep 11, 2020
f67e5ee
Make accessor range compatible
Sep 14, 2020
8b27cc4
Remove unnecessary code from CUDA GmresMixed
Sep 14, 2020
5247306
Add HIP kernels
Sep 14, 2020
6b47fef
Half-way of integrating proper const support
Sep 15, 2020
928c8bd
Finish proper const-type support
Sep 15, 2020
84238d5
Add constexpr everywhere in accessor
Sep 15, 2020
f0bbead
Attempt to fix thrust::complex conversion issue
Sep 15, 2020
475d96f
Add workaround for CUDA for reference casting
Sep 17, 2020
7f17f6d
Use better workaround for CUDA references
Sep 17, 2020
1b3bb52
Fix GmresMixed core problem
Sep 17, 2020
d3487c0
REORDER to first of this PR: Add specialization for gko::half in std:…
Oct 22, 2020
9ad48aa
Add TODO list text-file
Dec 2, 2020
70e558d
Improve force-reset behavior
Dec 12, 2020
10 changes: 10 additions & 0 deletions TODO_CB_GMRES.txt
@@ -0,0 +1,10 @@
TODO CB-GMRES for Paper review

1. change stopping criterion: 1e-12
2. Use cuSPARSE ILU preconditioner
3. Always run GmresMixed<float, float> in addition to the other benchmarks
4. Run both Classical and Modified Gram-Schmidt and compare them
(Add regular Ginkgo Gmres<double> to the benchmark and compare MGS with CGS)
5. Run everything with Restart = 300
6. Remove matrices from same group (e.g. af_*) from the plots
7. Run MGS vs CGS without preconditioner
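
Items 4 and 7 compare the two Gram-Schmidt variants inside the Arnoldi step. As a reminder, here is the usual textbook formulation (not taken from this PR's kernels). It assumes an orthonormal Krylov basis $v_1, \dots, v_j$ and a new candidate vector $w = A v_j$; the symbols $\hat{w}$ and $w^{(i)}$ are notation introduced here for illustration.

```latex
\begin{aligned}
\textbf{CGS:}\quad & h_{i,j} = v_i^{H} w \quad (i = 1,\dots,j), \qquad
  \hat{w} = w - \sum_{i=1}^{j} h_{i,j}\, v_i \\
\textbf{MGS:}\quad & w^{(0)} = w, \qquad
  h_{i,j} = v_i^{H} w^{(i-1)}, \qquad
  w^{(i)} = w^{(i-1)} - h_{i,j}\, v_i \quad (i = 1,\dots,j), \qquad
  \hat{w} = w^{(j)} \\
\textbf{both:}\quad & h_{j+1,j} = \lVert \hat{w} \rVert_2, \qquad
  v_{j+1} = \hat{w} / h_{j+1,j}
\end{aligned}
```

In CGS all dot products use the same vector $w$, so they can be computed in one batched operation. MGS updates $w$ after every projection, which serializes the dot products but is generally more robust against loss of orthogonality. The CGS2 variant named in the commit list is commonly understood as CGS followed by a second orthogonalization pass.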
41 changes: 38 additions & 3 deletions benchmark/run_all_benchmarks.sh
@@ -11,6 +11,11 @@ if [ ! "${DRY_RUN}" ]; then
DRY_RUN="false"
fi

if [ ! "${KEEP_MTX_FILES}" ]; then
echo "KEEP_MTX_FILES environment variable not set - assuming \"false\"" 1>&2
KEEP_MTX_FILES="false"
fi

if [ ! "${EXECUTOR}" ]; then
echo "EXECUTOR environment variable not set - assuming \"cuda\"" 1>&2
EXECUTOR="cuda"
@@ -50,6 +55,11 @@ if [ ! "${SOLVERS_MAX_ITERATIONS}" ]; then
SOLVERS_MAX_ITERATIONS=10000
fi

if [ ! "${SOLVERS_KRYLOV_DIM}" ]; then
echo "SOLVERS_KRYLOV_DIM environment variable not set - assuming \"100\"" 1>&2
SOLVERS_KRYLOV_DIM=100
fi

if [ ! "${SYSTEM_NAME}" ]; then
echo "SYSTEM_NAME environment variable not set - assuming \"unknown\"" 1>&2
SYSTEM_NAME="unknown"
@@ -60,6 +70,11 @@ if [ ! "${DEVICE_ID}" ]; then
DEVICE_ID="0"
fi

if [ ! "${SOLVERS_NUM_RHS}" ]; then
echo "SOLVERS_NUM_RHS environment variable not set - assuming \"1\"" 1>&2
SOLVERS_NUM_RHS="1"
fi
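
The new `KEEP_MTX_FILES`, `SOLVERS_KRYLOV_DIM` and `SOLVERS_NUM_RHS` blocks all repeat the same "warn on stderr, then assign a default" pattern. A minimal sketch of how that pattern could be factored out follows; `set_default` is a hypothetical helper that is not part of this PR, and it uses `eval` only so that it also works in a plain POSIX shell.

```shell
#!/bin/sh
# Hypothetical helper (not in the PR): if the variable named by $1 is unset
# or empty, warn on stderr and assign the default given in $2.
set_default() {
    eval "_current=\${$1}"
    if [ -z "${_current}" ]; then
        echo "$1 environment variable not set - assuming \"$2\"" 1>&2
        eval "$1=\$2"
    fi
}

# Same defaults as the blocks added in this diff.
set_default KEEP_MTX_FILES "false"
set_default SOLVERS_KRYLOV_DIM 100
set_default SOLVERS_NUM_RHS 1
```

The explicit `if` blocks in the script are more verbose but avoid `eval`, which is a reasonable trade-off for a benchmark driver.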

# Control whether to run detailed benchmarks or not.
# Default setting is detailed=false. To activate, set DETAILED=1.
if [ ! "${DETAILED}" ] || [ "${DETAILED}" -eq 0 ]; then
@@ -68,6 +83,22 @@ else
DETAILED_STR="--detailed=true"
fi

# Control whether to print additional preconditioner information for solver
# benchmarks. Requires `DETAILED` to be set to have an effect.
if [ ! "${SOLVERS_PRINT_PRECOND}" ] || [ "${SOLVERS_PRINT_PRECOND}" -ne 0 ]; then
PRINT_PRECOND_STR="--print_preconditioner_information=true"
else
PRINT_PRECOND_STR="--print_preconditioner_information=false"
fi

# Control whether to randomize the right hand side.
# Default setting is randomize=true. To deactivate, set SOLVERS_RANDOMIZE_RHS=0.
if [ ! "${SOLVERS_RANDOMIZE_RHS}" ] || [ "${SOLVERS_RANDOMIZE_RHS}" -ne 0 ]; then
SOLVERS_RND_STR="--randomize_rhs=true"
else
SOLVERS_RND_STR="--randomize_rhs=false"
fi
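
Both new switches map an optional integer environment variable to a `--flag=true|false` string, with "unset" treated as true. The sketch below isolates that mapping so the edge cases are easy to see; `bool_flag` is a hypothetical helper name, not something the script defines.

```shell
#!/bin/sh
# Hypothetical helper (not in the PR): $1 is the flag name, $2 the raw value.
# Unset/empty or any non-zero integer selects "true"; 0 selects "false".
bool_flag() {
    if [ ! "$2" ] || [ "$2" -ne 0 ]; then
        echo "--$1=true"
    else
        echo "--$1=false"
    fi
}

SOLVERS_RND_STR="$(bool_flag randomize_rhs "${SOLVERS_RANDOMIZE_RHS}")"
PRINT_PRECOND_STR="$(bool_flag print_preconditioner_information "${SOLVERS_PRINT_PRECOND}")"
echo "${SOLVERS_RND_STR} ${PRINT_PRECOND_STR}"
```

Note that a non-integer value such as `yes` makes the `-ne` test fail with an error, so the condition falls through to the `false` branch. Only `0` and other integers are handled cleanly.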

# This allows using a matrix list file for benchmarking.
# The file should contain a SuiteSparse matrix on each line.
# The allowed formats to target a SuiteSparse matrix are:
@@ -166,8 +197,11 @@ run_solver_benchmarks() {
./solver/solver --backup="$1.bkp" --double_buffer="$1.bkp2" \
--executor="${EXECUTOR}" --solvers="${SOLVERS}" \
--preconditioners="${PRECONDS}" \
- --max_iters=${SOLVERS_MAX_ITERATIONS} --rel_res_goal=${SOLVERS_PRECISION} \
- ${DETAILED_STR} --device_id="${DEVICE_ID}" \
+ --max_iters="${SOLVERS_MAX_ITERATIONS}" \
+ --rel_res_goal="${SOLVERS_PRECISION}" \
+ --krylov_dim="${SOLVERS_KRYLOV_DIM}" "${SOLVERS_RND_STR}" \
+ --nrhs="${SOLVERS_NUM_RHS}" "${DETAILED_STR}" \
+ --device_id="${DEVICE_ID}" "${PRINT_PRECOND_STR}" \
<"$1.imd" 2>&1 >"$1"
keep_latest "$1" "$1.bkp" "$1.bkp2" "$1.imd"
}
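
The redirection order `2>&1 >"$1"` in `run_solver_benchmarks` is easy to misread: stderr is first pointed at whatever stdout currently is, and only then is stdout sent to the result file. So progress messages stay visible while the benchmark output lands in the file. A small self-contained sketch of that behavior, in which `emit` and the temporary file are illustrative stand-ins and not part of the script:

```shell
#!/bin/sh
# Stand-in for the benchmark binary: one line on stdout, one on stderr.
emit() {
    echo "to stdout"
    echo "to stderr" 1>&2
}

out_file="$(mktemp)"
# Inside $(...), "current stdout" is the capture pipe, so stderr is captured
# while stdout goes to the file.
captured="$(emit 2>&1 >"${out_file}")"
file_content="$(cat "${out_file}")"
rm -f "${out_file}"

echo "file:     ${file_content}"   # file:     to stdout
echo "captured: ${captured}"       # captured: to stderr
```

Writing `>"$1" 2>&1` instead would send both streams to the file.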
@@ -380,7 +414,8 @@ for bsize in ${BLOCK_SIZES}; do
run_preconditioner_benchmarks "${RESULT_FILE}"

echo -e "${PREFIX}Cleaning up problem ${GROUP}/${NAME}" 1>&2
- [ "${DRY_RUN}" != "true" ] && rm -r "/tmp/${GROUP}/${NAME}.mtx"
+ [ "${DRY_RUN}" != "true" ] && [ "${KEEP_MTX_FILES}" != "true" ] && \
+ rm -r "/tmp/${GROUP}/${NAME}.mtx"
done
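
A minimal sketch of the cleanup guard as a truth table, assuming the intended semantics are that a downloaded matrix is deleted only when neither `DRY_RUN` nor `KEEP_MTX_FILES` is `"true"`. `should_remove_mtx` is a hypothetical name used only for this illustration.

```shell
#!/bin/sh
# Hypothetical predicate (not in the PR).
# $1: DRY_RUN value, $2: KEEP_MTX_FILES value.
# Succeeds only when the .mtx file should be deleted.
should_remove_mtx() {
    [ "$1" != "true" ] && [ "$2" != "true" ]
}

for dry in true false; do
    for keep in true false; do
        if should_remove_mtx "${dry}" "${keep}"; then
            echo "DRY_RUN=${dry} KEEP_MTX_FILES=${keep} -> remove"
        else
            echo "DRY_RUN=${dry} KEEP_MTX_FILES=${keep} -> keep"
        fi
    done
done
```

Under that assumption only the combination `DRY_RUN=false KEEP_MTX_FILES=false` removes the file, which matches the script's default of `KEEP_MTX_FILES="false"`.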
if [ "${ID}" -ge "${LOOP_END}" ]; then
break