Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make the type in deferred_factory more explicit #1439

Merged
merged 8 commits into from
Nov 4, 2023
Merged

Conversation

yhmtsai
Copy link
Member

@yhmtsai yhmtsai commented Oct 23, 2023

This PR makes the template type of deferred_factory_parameter more explicit.
That is,
deferred_factory_paramter<const FactoryType>(..).on(exec) returns shared_ptr<const FactoryType> and
deferred_factory_paramter<FactoryType>(..).on(exec) returns shared_ptr<FactoryType>
not deferred_factory_paramter<FactoryType>(..).on(exec) returns shared_ptr<const FactoryType>
which is used in File Config such that users can get the non-const LinOpFactory from deferred_factory<LinOpFactory>

@yhmtsai yhmtsai self-assigned this Oct 23, 2023
@ginkgo-bot ginkgo-bot added mod:core This is related to the core module. type:solver This is related to the solvers type:preconditioner This is related to the preconditioners labels Oct 23, 2023
@yhmtsai yhmtsai added the 1:ST:ready-for-review This PR is ready for review label Oct 23, 2023
@yhmtsai yhmtsai requested review from a team October 23, 2023 18:49
@tcojean tcojean added this to the Release 1.7.0 milestone Oct 24, 2023
@MarcelKoch
Copy link
Member

perhaps a general question, what is the point of a non-const factory? The generate and access to the parameters is const, so I don't know what can be done with a non-const factory.

@yhmtsai
Copy link
Member Author

yhmtsai commented Oct 24, 2023

When it is const type, users can not change its parameters unless they cast const away.
It is also related to the file config if we would like to move out the exec arguments.
build_from_config().on(exec) should give LinOpFactory like Solver::build().on(exec) not const LinOpFactory
For me, the explicit type should be considered prior to const internal because it directly indicates the return type and the explicit type can still handle the original cases.

@MarcelKoch
Copy link
Member

Even for a non-const factory, the user can't change the parameters, since they are returned as a const reference. Anyway, I think modifying the parameters of a created factory doesn't fit into our workflow. They would need to create a new factory for that.

But I can still see the sense in changing it, only I think some of the use-cases are not valid.

@upsj
Copy link
Member

upsj commented Oct 24, 2023

If we were to turn the parameters from const LinOpFactory to LinOpFactory, we would lose const-correctness in general, because pointers don't preserve constness. So I tend to agree that this might loosen restrictions too much. Also modifying something statically that can be controlled from a config file by runtime seems the wrong approach. Maybe instead modify the parsed config?

@MarcelKoch MarcelKoch self-requested a review October 24, 2023 14:15
@yhmtsai
Copy link
Member Author

yhmtsai commented Oct 24, 2023

the parameters still use the deferred_factory<const LinOpFactory> or deferred_factory<const CriterionFactory>.
There should not be any changes to the normal factory setup. If you notice something might affect or destroy the constness in setup, I will try to fix that.

Copy link
Member

@MarcelKoch MarcelKoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we explicitly stored the constness of the parameters before changing to the deferred style I think it's fine to now denote that explicitly.

include/ginkgo/core/base/abstract_factory.hpp Outdated Show resolved Hide resolved
Copy link
Member

@upsj upsj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I'm not opposed to the change, only wanted to clarify the motivations.

typename = std::enable_if_t<std::is_convertible<
FactoryType, deferred_factory_parameter<
const stop::CriterionFactory>>::value>>
Parameters& with_criteria(const std::vector<FactoryType>& criteria_vec)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did I miss this? Can you add a test for it?

@yhmtsai yhmtsai force-pushed the explicit_deferred branch 8 times, most recently from 8764a7f to 4591e66 Compare October 28, 2023 21:46
@yhmtsai
Copy link
Member Author

yhmtsai commented Oct 28, 2023

@upsj @MarcelKoch I have added tests to check whether the constructor and function can be compiled or not.
The checking for non-compiled function/constructor is only based on the signature, template and SFINAE.
If the compiliation error is in the function/constructor definition, the checking can not detect that.
I can only make the compilation failure for checking the missing allowed input in the original call. (first commit of this branch 6df7a42 and the corresponding failed tests https://gitlab.com/ginkgo-project/ginkgo-public-ci/-/jobs/5403290151)
we use template <typename ... Args> such that the function declaration can allow anything, but it will throw the compiling error when passing vector to deferred_factory_parameters
After main fix, second commit eb1aba1 can compile it in https://gitlab.com/ginkgo-project/ginkgo-public-ci/-/jobs/5404992486. After that, it is replaced by the deferred_factory test.

Other than that, I also make mutable out which should stillbe the unintended usage mentioned in #1336. It's the last commit, so I can also move it to another pull request if needed.
I replace is_base_of ... && (std::is_const<FactoryType>::value || !std::is_const<ConcreteFactoryType>::value) by is_pointer_convert_to which should be the same. I also reuse the macro in the solver_base.hpp.

Do we put deferred_factory_parameter type into detail?

Copy link
Member

@MarcelKoch MarcelKoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the tests. I only have smaller nits for those.

But I'm not sure that one of the changes to the parameter macro is possible.

\
template <typename... Args> \
auto with_##_name(Args&&... _value)->std::decay_t<decltype(*this)>& \
parameters_type& with_##_name(Args&&... _value) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this implicitly assumes that the name of the parameter struct is parameters_type. But we can't guarantee that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I go this way because deferred_factory macro already assume that. With that change, I can reuse the macro into mixin part with using parameters_type = ....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But shouldn't we rather fix the deferred_factory macro then? I think the old way should be usable as mixin without the type alias.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to keep the new approach, then you must also modify the GKO_CREATE_FACTORY_PARAMETERS and GKO_ENABLE_LIN_OP_FACTORY macro to fix the name of the parameters type to parameters_type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because this is not actual parameter type in mixed, we have self to cast this to the actual parameter type.
self() is not available in normal pr. we can either make the self available in the factory parameter or contain using parameters_type if the type name is not parameters_type.
I have tried additional parameter in macro which denotes how to return the type reference, but it will change many macros inputs unless I also add the overload macro.
Do you have another idea without these requirements/assumption?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I understand the issue, my mind was at a different place. I think then changing the macros would be the best approach. But I think the whole change might be interface breaking. I would be fine with that, since, as you mentioned, the current implementation is not correct for the mixins anyway.

Copy link
Member Author

@yhmtsai yhmtsai Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I check some base type of parameter_types, self is introduced in enable_parameter_type protected scope already.
All parameter types are inherited from that, so we always have self() actually.
I just need to delete the GKO_ENABLE_SELF() in the two parameters collection in solver_bases.hpp and then I can use self()
I am trying this approach based on self()

*/
struct iterative_solver_factory_parameters {
template <typename Parameters, typename Factory>
struct enable_iterative_solver_factory_parameters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this change, but I think @upsj wanted to keep our usual mixin structure of Type, template<ConcreteType> EnableType.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, with this change there is no single type that represents an interative solver's parameters, because all enable_... types are distinct due to their template parameters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What can the interactive solver's parameters be used for? Because the parameter_type is not passed as pointer, so we do not use the base type. and we can use template to handle the common operation if need.
I prefer this way because it reuse the macro such that I do not need to create the test for these class or modify the same code in two place in the future.

Copy link
Member

@upsj upsj Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

void parse_iterative_solver_config(iterative_solver_factory_parameters& params, const config& c);
void dump_iterative_solver_config(const iterative_solver_factory_parameters& params, config& c);

No need to use templates in this case, that means our headers are slimmer and compile times are smaller

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can not do it just in this way, because there's no concrete type out of them.
we still need to know the what's the actual type, so it will still require the template.
I do not think two/three parameter affect the compile time a lot and the part is not in public interface

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you can pass that like this way. like I said, it requires you know the actual type but it is done by the previous template from scratch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following does not work.

iterative_parameter a;
parse(a, config);
a->generate() -> what type is it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not the purpose of the type. It is meant to be usable by us and by users to generically access or set parameters for iterative solvers without knowing the actual type and without having to use templates.

In general, this PR is already going beyond what it aims to do (prevent future interface breaks for deferred_factory_parameter).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, the previous approach lead to a rather complicated hierarchy with very little gain from that. We don't currently need the common type of the parameters, and I don't expect that we will in the near future. Mike's approach is a lot cleaner, since the hierarchy is a single chain, while in the other case it's a tree.
Regarding the compile times, as long as a user compiles GPU kernels, this will have barely any impact.

core/test/base/deferred_factory.cpp Outdated Show resolved Hide resolved
core/test/base/deferred_factory.cpp Outdated Show resolved Hide resolved
core/test/base/deferred_factory.cpp Outdated Show resolved Hide resolved
core/test/base/deferred_factory.cpp Outdated Show resolved Hide resolved
core/test/base/deferred_factory.cpp Outdated Show resolved Hide resolved
@@ -435,14 +443,15 @@ class deferred_factory_parameter {
* @ingroup LinOp
*/
#define GKO_FACTORY_PARAMETER(_name, ...) \
mutable _name{__VA_ARGS__}; \
_name{__VA_ARGS__}; \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess now this cannot be modified from a const Factory ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. with_* and the parameters only allow changes in non-const

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, with_* returns self() , which can be const or non-const. .on() also has been changed to return FactoryType which is always non-const ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because with_* is not const function, it can only be called by non-const and return non-const.
Yes for last question, .on() always return FactoryType, which is a different type than FactoryType::parameters_type

@yhmtsai yhmtsai added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels Nov 2, 2023
@yhmtsai
Copy link
Member Author

yhmtsai commented Nov 2, 2023

rebase!

upsj
upsj previously requested changes Nov 2, 2023
Copy link
Member

@upsj upsj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep the separation into mixin and parameter type, the other change is less of a blocker to me, but I would prefet to make the type a macro parameter instead of inferring it via decltype. This also makes the documentation harder to generate.

@@ -94,7 +94,8 @@ class Schwarz
/**
* Local solver factory.
*/
GKO_DEFERRED_FACTORY_PARAMETER(local_solver, LinOpFactory);
std::shared_ptr<const LinOpFactory> GKO_DEFERRED_FACTORY_PARAMETER(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to go back to this pattern? The macro becomes much more complex due to all the decltypes, I deliberately did not go this way with the new macros.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both are fine with me. I think it is good because it gives the same pattern like the other macro.
You only mean the decltype of the distinct parameter not the decltype(self()) for the reference return, right?
decltype(self()) is necessary to avoid additional input argument of macro or assumption of the type name.

Copy link
Member

@upsj upsj Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _type parameter in the original GKO_DEFERRED_FACTORY_PARAMETER macro occurs 4 times. We need to replace all of those by decltype(_name_), which complicates the macro. I dislike the current approach, because it makes the macro look like a member definition, but does much more (e.g. add mutable and declare the function). By making the type a macro parameter, it is clearer that this is fully generated code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it only need once. we have using _name_type = ... and then all _type just use that.
mutable is not longer there as we said it is not intended behavior in the #1336
changing the member and using with_* are only available in non-const object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solution is still more complex than leaving the type as a macro parameter, you need to use _name##_type instead of _type everywhere and add the type alias.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me the main reason to use this style is that it's consistent with the non-deferred parameters. I think Tobias raises a good point that the other style makes it clearer that all the code gets generated. It might also help to document the functions, although we still might not be able to rely on generated documentation.
But for now I would prefer this style for consistency, and then with an interface break change to the other style for all parameters.

*/
struct iterative_solver_factory_parameters {
template <typename Parameters, typename Factory>
struct enable_iterative_solver_factory_parameters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, with this change there is no single type that represents an interative solver's parameters, because all enable_... types are distinct due to their template parameters.

@yhmtsai yhmtsai added 1:ST:ready-for-review This PR is ready for review and removed 1:ST:ready-to-merge This PR is ready to merge. labels Nov 2, 2023
@upsj upsj dismissed their stale review November 3, 2023 00:19

won't hold up the release over this

@yhmtsai yhmtsai added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels Nov 3, 2023
@yhmtsai
Copy link
Member Author

yhmtsai commented Nov 3, 2023

rebase!

@yhmtsai yhmtsai merged commit 4747692 into develop Nov 4, 2023
12 of 15 checks passed
@yhmtsai yhmtsai deleted the explicit_deferred branch November 4, 2023 03:42
Copy link

sonarcloud bot commented Nov 4, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 9 Code Smells

25.7% 25.7% Coverage
2.6% 2.6% Duplication

warning The version of Java (11.0.3) you have used to run this analysis is deprecated and we will stop accepting it soon. Please update to at least Java 17.
Read more here

@tcojean tcojean mentioned this pull request Nov 6, 2023
tcojean added a commit that referenced this pull request Nov 10, 2023
Release 1.7.0 to master

The Ginkgo team is proud to announce the new Ginkgo minor release 1.7.0. This release brings new features such as:
- Complete GPU-resident sparse direct solvers feature set and interfaces,
- Improved Cholesky factorization performance,
- A new MC64 reordering,
- Batched iterative solver support with the BiCGSTAB solver with batched Dense and ELL matrix types,
- MPI support for the SYCL backend,
- Improved ParILU(T)/ParIC(T) preconditioner convergence,
and more!

If you face an issue, please first check our [known issues page](https://github.com/ginkgo-project/ginkgo/wiki/Known-Issues) and the [open issues list](https://github.com/ginkgo-project/ginkgo/issues) and if you do not find a solution, feel free to [open a new issue](https://github.com/ginkgo-project/ginkgo/issues/new/choose) or ask a question using the [github discussions](https://github.com/ginkgo-project/ginkgo/discussions).

Supported systems and requirements:
+ For all platforms, CMake 3.16+
+ C++14 compliant compiler
+ Linux and macOS
  + GCC: 5.5+
  + clang: 3.9+
  + Intel compiler: 2019+
  + Apple Clang: 14.0 is tested. Earlier versions might also work.
  + NVHPC: 22.7+
  + Cray Compiler: 14.0.1+
  + CUDA module: CMake 3.18+, and CUDA 10.1+ or NVHPC 22.7+
  + HIP module: ROCm 4.5+
  + DPC++ module: Intel oneAPI 2022.1+ with oneMKL and oneDPL. Set the CXX compiler to `dpcpp` or `icpx`.
  + MPI: standard version 3.1+, ideally GPU Aware, for best performance
+ Windows
  + MinGW: GCC 5.5+
  + Microsoft Visual Studio: VS 2019+
  + CUDA module: CUDA 10.1+, Microsoft Visual Studio
  + OpenMP module: MinGW.

### Version support changes

+ CUDA 9.2 is no longer supported and 10.0 is untested [#1382](#1382)
+ Ginkgo now requires CMake version 3.16 (and 3.18 for CUDA) [#1368](#1368)

### Interface changes

+ `const` Factory parameters can no longer be modified through `with_*` functions, as this breaks const-correctness [#1336](#1336) [#1439](#1439)

### New Deprecations

+ The `device_reset` parameter of CUDA and HIP executors no longer has an effect, and its `allocation_mode` parameters have been deprecated in favor of the `Allocator` interface. [#1315](#1315)
+ The CMake parameter `GINKGO_BUILD_DPCPP` has been deprecated in favor of `GINKGO_BUILD_SYCL`. [#1350](#1350)
+ The `gko::reorder::Rcm` interface has been deprecated in favor of `gko::experimental::reorder::Rcm` based on `Permutation`. [#1418](#1418)
+ The Permutation class' `permute_mask` functionality. [#1415](#1415)
+ Multiple functions with typos (`set_complex_subpsace()`, range functions such as `conj_operaton` etc). [#1348](#1348)

### Summary of previous deprecations
+ `gko::lend()` is not necessary anymore.
+ The classes `RelativeResidualNorm` and `AbsoluteResidualNorm` are deprecated in favor of `ResidualNorm`.
+ The class `AmgxPgm` is deprecated in favor of `Pgm`.
+ Default constructors for the CSR `load_balance` and `automatical` strategies
+ The PolymorphicObject's move-semantic `copy_from` variant
+ The templated `SolverBase` class.
+ The class `MachineTopology` is deprecated in favor of `machine_topology`.
+ Logger constructors and create functions with the `executor` parameter.
+ The virtual, protected, Dense functions `compute_norm1_impl`, `add_scaled_impl`, etc.
+ Logger events for solvers and criterion without the additional `implicit_tau_sq` parameter.
+ The global `gko::solver::default_krylov_dim`, use instead `gko::solver::gmres_default_krylov_dim`.

### Added features

+ Adds a batch::BatchLinOp class that forms a base class for batched linear operators such as batched matrix formats, solver and preconditioners [#1379](#1379)
+ Adds a batch::MultiVector class that enables operations such as dot, norm, scale on batched vectors [#1371](#1371)
+ Adds a batch::Dense matrix format that stores batched dense matrices and provides gemv operations for these dense matrices. [#1413](#1413)
+ Adds a batch::Ell matrix format that stores batched Ell matrices and provides spmv operations for these batched Ell matrices. [#1416](#1416) [#1437](#1437)
+ Add a batch::Bicgstab solver (class, core, and reference kernels) that enables iterative solution of batched linear systems [#1438](#1438).
+ Add device kernels (CUDA, HIP, and DPCPP) for batch::Bicgstab solver. [#1443](#1443).
+ New MC64 reordering algorithm which optimizes the diagonal product or sum of a matrix by permuting the rows, and computes additional scaling factors for equilibriation [#1120](#1120)
+ New interface for (non-symmetric) permutation and scaled permutation of Dense and Csr matrices [#1415](#1415)
+ LU and Cholesky Factorizations can now be separated into their factors [#1432](#1432)
+ New symbolic LU factorization algorithm that is optimized for matrices with an almost-symmetric sparsity pattern [#1445](#1445)
+ Sorting kernels for SparsityCsr on all backends [#1343](#1343)
+ Allow passing pre-generated local solver as factory parameter for the distributed Schwarz preconditioner [#1426](#1426)
+ Add DPCPP kernels for Partition [#1034](#1034), and CSR's `check_diagonal_entries` and `add_scaled_identity` functionality [#1436](#1436)
+ Adds a helper function to create a partition based on either local sizes, or local ranges [#1227](#1227)
+ Add function to compute arithmetic mean of dense and distributed vectors [#1275](#1275)
+ Adds `icpx` compiler supports [#1350](#1350)
+ All backends can be built simultaneously [#1333](#1333)
+ Emits a CMake warning in downstream projects that use different compilers than the installed Ginkgo [#1372](#1372)
+ Reordering algorithms in sparse_blas benchmark [#1354](#1354)
+ Benchmarks gained an `-allocator` parameter to specify device allocators [#1385](#1385)
+ Benchmarks gained an `-input_matrix` parameter that initializes the input JSON based on the filename [#1387](#1387)
+ Benchmark inputs can now be reordered as a preprocessing step [#1408](#1408)


### Improvements

+ Significantly improve Cholesky factorization performance [#1366](#1366)
+ Improve parallel build performance [#1378](#1378)
+ Allow constrained parallel test execution using CTest resources [#1373](#1373)
+ Use arithmetic type more inside mixed precision ELL [#1414](#1414)
+ Most factory parameters of factory type no longer need to be constructed explicitly via `.on(exec)` [#1336](#1336) [#1439](#1439)
+ Improve ParILU(T)/ParIC(T) convergence by using more appropriate atomic operations [#1434](#1434)

### Fixes

+ Fix an over-allocation for OpenMP reductions [#1369](#1369)
+ Fix DPCPP's common-kernel reduction for empty input sizes [#1362](#1362)
+ Fix several typos in the API and documentation [#1348](#1348)
+ Fix inconsistent `Threads` between generations [#1388](#1388)
+ Fix benchmark median condition [#1398](#1398)
+ Fix HIP 5.6.0 compilation [#1411](#1411)
+ Fix missing destruction of rand_generator from cuda/hip [#1417](#1417)
+ Fix PAPI logger destruction order [#1419](#1419)
+ Fix TAU logger compilation [#1422](#1422)
+ Fix relative criterion to not iterate if the residual is already zero [#1079](#1079)
+ Fix memory_order invocations with C++20 changes [#1402](#1402)
+ Fix `check_diagonal_entries_exist` report correctly when only missing diagonal value in the last rows. [#1440](#1440)
+ Fix checking OpenMPI version in cross-compilation settings [#1446](#1446)
+ Fix false-positive deprecation warnings in Ginkgo, especially for the old Rcm (it doesn't emit deprecation warnings anymore as a result but is still considered deprecated) [#1444](#1444)


### Related PR: #1451
tcojean added a commit that referenced this pull request Nov 10, 2023
Release 1.7.0 to develop

The Ginkgo team is proud to announce the new Ginkgo minor release 1.7.0. This release brings new features such as:
- Complete GPU-resident sparse direct solvers feature set and interfaces,
- Improved Cholesky factorization performance,
- A new MC64 reordering,
- Batched iterative solver support with the BiCGSTAB solver with batched Dense and ELL matrix types,
- MPI support for the SYCL backend,
- Improved ParILU(T)/ParIC(T) preconditioner convergence,
and more!

If you face an issue, please first check our [known issues page](https://github.com/ginkgo-project/ginkgo/wiki/Known-Issues) and the [open issues list](https://github.com/ginkgo-project/ginkgo/issues) and if you do not find a solution, feel free to [open a new issue](https://github.com/ginkgo-project/ginkgo/issues/new/choose) or ask a question using the [github discussions](https://github.com/ginkgo-project/ginkgo/discussions).

Supported systems and requirements:
+ For all platforms, CMake 3.16+
+ C++14 compliant compiler
+ Linux and macOS
  + GCC: 5.5+
  + clang: 3.9+
  + Intel compiler: 2019+
  + Apple Clang: 14.0 is tested. Earlier versions might also work.
  + NVHPC: 22.7+
  + Cray Compiler: 14.0.1+
  + CUDA module: CMake 3.18+, and CUDA 10.1+ or NVHPC 22.7+
  + HIP module: ROCm 4.5+
  + DPC++ module: Intel oneAPI 2022.1+ with oneMKL and oneDPL. Set the CXX compiler to `dpcpp` or `icpx`.
  + MPI: standard version 3.1+, ideally GPU Aware, for best performance
+ Windows
  + MinGW: GCC 5.5+
  + Microsoft Visual Studio: VS 2019+
  + CUDA module: CUDA 10.1+, Microsoft Visual Studio
  + OpenMP module: MinGW.

### Version support changes

+ CUDA 9.2 is no longer supported and 10.0 is untested [#1382](#1382)
+ Ginkgo now requires CMake version 3.16 (and 3.18 for CUDA) [#1368](#1368)

### Interface changes

+ `const` Factory parameters can no longer be modified through `with_*` functions, as this breaks const-correctness [#1336](#1336) [#1439](#1439)

### New Deprecations

+ The `device_reset` parameter of CUDA and HIP executors no longer has an effect, and its `allocation_mode` parameters have been deprecated in favor of the `Allocator` interface. [#1315](#1315)
+ The CMake parameter `GINKGO_BUILD_DPCPP` has been deprecated in favor of `GINKGO_BUILD_SYCL`. [#1350](#1350)
+ The `gko::reorder::Rcm` interface has been deprecated in favor of `gko::experimental::reorder::Rcm` based on `Permutation`. [#1418](#1418)
+ The Permutation class' `permute_mask` functionality. [#1415](#1415)
+ Multiple functions with typos (`set_complex_subpsace()`, range functions such as `conj_operaton` etc). [#1348](#1348)

### Summary of previous deprecations
+ `gko::lend()` is not necessary anymore.
+ The classes `RelativeResidualNorm` and `AbsoluteResidualNorm` are deprecated in favor of `ResidualNorm`.
+ The class `AmgxPgm` is deprecated in favor of `Pgm`.
+ Default constructors for the CSR `load_balance` and `automatical` strategies
+ The PolymorphicObject's move-semantic `copy_from` variant
+ The templated `SolverBase` class.
+ The class `MachineTopology` is deprecated in favor of `machine_topology`.
+ Logger constructors and create functions with the `executor` parameter.
+ The virtual, protected, Dense functions `compute_norm1_impl`, `add_scaled_impl`, etc.
+ Logger events for solvers and criterion without the additional `implicit_tau_sq` parameter.
+ The global `gko::solver::default_krylov_dim`, use instead `gko::solver::gmres_default_krylov_dim`.

### Added features

+ Adds a batch::BatchLinOp class that forms a base class for batched linear operators such as batched matrix formats, solver and preconditioners [#1379](#1379)
+ Adds a batch::MultiVector class that enables operations such as dot, norm, scale on batched vectors [#1371](#1371)
+ Adds a batch::Dense matrix format that stores batched dense matrices and provides gemv operations for these dense matrices. [#1413](#1413)
+ Adds a batch::Ell matrix format that stores batched Ell matrices and provides spmv operations for these batched Ell matrices. [#1416](#1416) [#1437](#1437)
+ Add a batch::Bicgstab solver (class, core, and reference kernels) that enables iterative solution of batched linear systems [#1438](#1438).
+ Add device kernels (CUDA, HIP, and DPCPP) for batch::Bicgstab solver. [#1443](#1443).
+ New MC64 reordering algorithm which optimizes the diagonal product or sum of a matrix by permuting the rows, and computes additional scaling factors for equilibriation [#1120](#1120)
+ New interface for (non-symmetric) permutation and scaled permutation of Dense and Csr matrices [#1415](#1415)
+ LU and Cholesky Factorizations can now be separated into their factors [#1432](#1432)
+ New symbolic LU factorization algorithm that is optimized for matrices with an almost-symmetric sparsity pattern [#1445](#1445)
+ Sorting kernels for SparsityCsr on all backends [#1343](#1343)
+ Allow passing pre-generated local solver as factory parameter for the distributed Schwarz preconditioner [#1426](#1426)
+ Add DPCPP kernels for Partition [#1034](#1034), and CSR's `check_diagonal_entries` and `add_scaled_identity` functionality [#1436](#1436)
+ Adds a helper function to create a partition based on either local sizes, or local ranges [#1227](#1227)
+ Add function to compute arithmetic mean of dense and distributed vectors [#1275](#1275)
+ Adds `icpx` compiler supports [#1350](#1350)
+ All backends can be built simultaneously [#1333](#1333)
+ Emits a CMake warning in downstream projects that use different compilers than the installed Ginkgo [#1372](#1372)
+ Reordering algorithms in sparse_blas benchmark [#1354](#1354)
+ Benchmarks gained an `-allocator` parameter to specify device allocators [#1385](#1385)
+ Benchmarks gained an `-input_matrix` parameter that initializes the input JSON based on the filename [#1387](#1387)
+ Benchmark inputs can now be reordered as a preprocessing step [#1408](#1408)


### Improvements

+ Significantly improve Cholesky factorization performance [#1366](#1366)
+ Improve parallel build performance [#1378](#1378)
+ Allow constrained parallel test execution using CTest resources [#1373](#1373)
+ Use arithmetic type more inside mixed precision ELL [#1414](#1414)
+ Most factory parameters of factory type no longer need to be constructed explicitly via `.on(exec)` [#1336](#1336) [#1439](#1439)
+ Improve ParILU(T)/ParIC(T) convergence by using more appropriate atomic operations [#1434](#1434)

### Fixes

+ Fix an over-allocation for OpenMP reductions [#1369](#1369)
+ Fix DPCPP's common-kernel reduction for empty input sizes [#1362](#1362)
+ Fix several typos in the API and documentation [#1348](#1348)
+ Fix inconsistent `Threads` between generations [#1388](#1388)
+ Fix benchmark median condition [#1398](#1398)
+ Fix HIP 5.6.0 compilation [#1411](#1411)
+ Fix missing destruction of rand_generator from cuda/hip [#1417](#1417)
+ Fix PAPI logger destruction order [#1419](#1419)
+ Fix TAU logger compilation [#1422](#1422)
+ Fix relative criterion to not iterate if the residual is already zero [#1079](#1079)
+ Fix memory_order invocations with C++20 changes [#1402](#1402)
+ Fix `check_diagonal_entries_exist` report correctly when only missing diagonal value in the last rows. [#1440](#1440)
+ Fix checking OpenMPI version in cross-compilation settings [#1446](#1446)
+ Fix false-positive deprecation warnings in Ginkgo, especially for the old Rcm (it doesn't emit deprecation warnings anymore as a result but is still considered deprecated) [#1444](#1444)

### Related PR: #1454
@upsj upsj mentioned this pull request Nov 13, 2023
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1:ST:ready-to-merge This PR is ready to merge. mod:core This is related to the core module. type:preconditioner This is related to the preconditioners type:solver This is related to the solvers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants