-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Release v1.1.0 #368
Release v1.1.0 #368
Conversation
- Added used LaTeX packages into `Doxyfile-usr.in` - Improved code coverage by adding tests for + `IteratorFactory` (specifically for the `operator<` in Reference) + `ParIlu` (Added additional test with sorted CSR matrix) - re-added the overview description for `ParIlu`
To prevent conflicts with other users on the CI system, we now restrict it to only use the first GPU (device ID 0) for all tests. Note: That also restricts the CUDA executor copy test to a single GPU, meaning data will be copied internally on a single GPU instead of across devices.
+ Move `matrix_from` to the external loop. + Benchmark directly into `matrix_to`.
+ The same `grid_dim` was used for the reduction and for the `calculate_nnz_per_row` kernel + The `grid_dim` was limited to maximum default_block_size^2 elements. + For bigger matrices, the extracted `max_nnz_per_row` could be wrong, due to omitted values.
When compiling Ginkgo with gcc 6.4.0, there was an `internal compiler error` when compiling the `IteratorFactory`. This was resolved by changing the return type of the helper functions to `const &` instead of a value copy.
+ Move the construction of `matrix_to` in `convert_matrix` function. This allows to catch the related exceptions. + Do not construct the `matrix_to` object until the use of `copy_from` in the warmup. This allows to catch non-existing conversions quickly, without the overhead of loading big matrix data first. + Catch the `AllocationError` related exceptions for `matrix_from` and add a `completed: false` entry to the results on failure to instantiate a matrix from the data.
+ The previous grid dimensions for `initialize_zero_ell` were `stride * num_rows`, i.e. roughly the dense matrix dimension. + Using `max_nnz_per_row * num_rows` reduces significantly the amount of threads created which makes this kernel call more efficient (less useless thread creation).
- Instead of just computing the nnz of L and U, the CSR `row_ptrs` are computed in the first kernel, allowing for better parallelization - Now checking if the `system_matrix` given to factory is square (including a test that checks if it is working) - Fixed errors in omp test (leading to a lower epsilon in comparison)
- Use a strategy object (the default one that is used if no strategy is provided) in ParIlu - parallized the omp `initialize_row_ptrs_l_u` kernel
- Added additional test with more iterations for OpenMP ParIlu test (Therefore, creating a function `compute_lu` to simplify that) - Renaming variables to better fit their purpose
- Added functions that generate unsorted matrices for Csr tests - Added Omp kernel to check if a CSR matrix is sorted and the sorting kernel itself - Added Omp tests for the newly implemented kernels
See the updated [xSDK template](https://github.com/xsdk-project/xsdk-policy-compatibility/blob/master/template.md). + Add support information in the README.md + Add a CHANGELOG.md file which tracks Ginkgo's changes. The full version 1.0.0 description is provided and changes on top of this base version are detailed.
Additionally, separated the storage of the matrices to a single location to prevent matrix file duplication between CUDA and OpenMP tests.
109ad6b
to
7106bb8
Compare
@tcojean I think the problem persists - at least in the current version. However, I still approve it as I feel this is acceptable. |
I think this is ready. The tools do not show anything terrible. I fixed a few things from clang-tidy and the documentation issue shown by Hartwig. The problem I found was the submenus had different level (black dot vs white dot), but I still kept the submenus and the same layout. The full pipeline from this PR can be accessed here: All the data is available on the dashboard: Here is the updated documentation: The commits which have not been reviewed are the following: |
@tcojean Does it pick the |
@yhmtsai yes this include the commits from the fix jacobi as you can see in the last link |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor typo. LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks really good, I have some minor comments.
I also just realized that I forgot to create an issue for add_new_algorithm.sh
for factorization
, which I have done just now (see #369). I am not sure if we want to have this also as part of this release
(I will work on this tomorrow regardless).
c217c76
I think most of the typos are fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This PR is for pushing the release to the master branch. This brings all commits from develop plus the
fix_jacobi
branch and a few commits for adaptation.DO NOT MERGE THIS. Currently, tests are still ongoing !
https://gitlab.com/ginkgo-project/ginkgo-public-ci/pipelines/89916672