Skip to content

Latest commit

 

History

History
186 lines (167 loc) · 8.7 KB

ChangeLog.md

File metadata and controls

186 lines (167 loc) · 8.7 KB

(changelog)=

Changelog

Version 1.2.0

NDA Version 1.2.0 is a release that

  • Introduces NVidia GPU support for array and view types
  • Adds GPU blas/lapack backends using the CuBLAS and CuSOLVER backend
  • Allows the use of symmetries for initialization and symmetrization of arrays
  • Uses C++20 concepts to constrain generic function implementations
  • Enables sliced hdf5 read/write operations
  • Fixes several library issues

We thank all the people who have contributed to this release: Thomas Hahn, Alexander Hampel, Dominik Kiese, Sergei Iskakoff, Harrison LaBollita, Henri Menke, Miguel Morales, Olivier Parcollet, Dylan Simon, Nils Wentzell

Find below an itemized list of changes in this release.

General

  • Add new test for matmul with permuted view
  • Add flatten function to layout_transforms.hpp
  • Add generic p-norm function
  • Add bindings for batched GEMM through gemm_batch function
  • Add support for slicing in h5_read
  • Enable fast small matrix inverse for matrices of size 1x1, 2x2, 3x3
  • Enable contiguous memory traversal for array iteration
  • Merge reshape_view and reshape, add overload that takes list of integers
  • Allow non-contiguous views in hdf5 read/write
  • Generalize nda::rand for complex value_t
  • Generalize basic_array_view deduction guide from contiguous range
  • Unify public member-types between basic_array and basic_array_view
  • In nda::blas::outer_product check contiguity only at runtime
  • Allow construction of array views from std::array
  • Define algebra of array_adapter as 'A'
  • Generalize get_first_element for scalar types
  • Add test for various nda traits
  • Generalize get_rank for contiguous_range types
  • Enable slicing with ranges that have negative steps + test
  • Make basic_array(idx_map, &&mem_handle) constructor public
  • Allow temporaries in calls to lapack wrapping functions
  • Generlize nda_lapack test to run both double and complex versions
  • Add blas::has_C_layout and blas::has_F_layout traits and use for cleanup
  • Generalize nda_blas and nda_cublas test for various value_t and layout combinations
  • In make_regular do not invoke copy of regular arrays
  • make_regular now returns a decltype(basic_array{...})
  • make_regular converts types with a regular_t member type
  • get_regular_t now uses basic_array{T} instead of make_regular
  • In transpose(A) allow for unary expr_call arguments
  • Allow basic_array rvalues in basic_array_view constructor
  • Extend deduction guides for basic_array and basic_array_view
  • Fix preservation of layout properties in idx_map.transpose(permutation)
  • Rename Layout to LayoutPolicy in array/view template parameters
  • Short-circuit in assign_from_ndarray for empty arrays
  • Generalize most traits to apply equally to A and A&
  • ArrayInitializer concept is now templated on the array type it initializes
  • Add benchmark for array copy operations
  • Allow for copy of block-strided arrays from host/device to host/device + test
  • Restore handle_sso copy constructor
  • Add get_view_t trait
  • Add Automatic include for c2py
  • Add function is_matrix_diagonal
  • Add stack_vector and stack_matrix alias
  • Generalize operator== for idx_maps of different types
  • Allow discarding info return value for lapack functions
  • Generalize nda::diag for types matching the contiguous_range concept
  • Generalize clef expr for multiarg subscript
  • Use range::all over default constructed range
  • Remove REQUIRES macro and use 'requires'
  • Enable slicing also for h5_write operations, assume existing dataset
  • Make pivot array const in getri signature
  • Minor cleanup in nda/h5.hpp template contraints and doc
  • Allow to pass dimensions as integers to factory functions basic_array::ones/zeros/rand
  • Add the 1d array factory nda::arange mimicing numpy arange + test
  • Allow bound checks also for array.extent(int) function
  • Configure and install nda/version.hpp header
  • Synchronize clang-tidy config file with app4triqs
  • Add bugprone checks to clang-tidy
  • Allow multiplication of std::array<T,N> by a T
  • Regenerate GPL copyright headers for C++ files
  • Fix compiler and linter warnings
  • Various documentation improvements
  • Clang-format all source files
  • General cleanup

cmake

  • Find and Link against openmp
  • Add generation of and install nda++ compiler wrapper
  • Do not use Accelerate Framework on OSX
  • Some cleanup in STATUS messages
  • Pick up existing LAPACK_ROOT on OSX builds
  • Only build Benchmarks if not subproject and not sanitizing
  • Do not build documentation as subproject
  • Do not find CUDAToolkit twice in nda-config.cmake
  • Use google-bench main branch
  • Link both cudart and cublas using imported targets provided by CUDAToolkit
  • Update Findsanitizer.cmake to include TSAN and MSAN
  • Disable Python Support by default
  • Only find cpp2py when built with PythonSupport=ON
  • Install cpp2py, needed as a linktime dependency for nda_py
  • Fix llvm package version for ubuntu clang ghactions build
  • In nda-config.cmake.in find Cpp2Py before including exported targets
  • Fix issue in extract_flags.cmake where generator expressions where not properly removed from flags
  • Fix Findsanitizer.cmake for new asan/ubsan runtime location with clang13+
  • Add missing find_dep(Cpp2Py 2.0) to nda-config.cmake.in
  • General cleanup

Concepts

  • Use C++20 concepts to constrain various generic functions and classes
  • Introduce concepts: Array, MemoryArray, Matrix, Vector, Handle
  • Various concept related simplififications and refactorings

GPU Support

  • Introduce GPU support for arrays and views
  • Added traits to check address space compatibility
  • Magma vbatch bindings + test + benchmark
  • Cublas backend for dot, gemm, gemv and ger + test
  • Cusolver backend for gesvd, getrf and getrs + test
  • Add helper functions to_host/to_device/to_unified for the copy of a MemoryArray to different address space
  • Add traits mem::on_host, mem::on_device, mem::on_unified to test memory location
  • Add traits get_regular_host_t, get_regular_device_t and get_regular_unified_t
  • Generic get_addr_space variable template
  • Add variable template have_same_address_space<A0, A1, ..>
  • Allow multiple arguments to on_host, on_device traits. Add on_unified trait
  • Add address space generic memory operations: malloc, free, memset, memcpy

Symmetries

  • Add sym_grp class to perform symmetry operations on arrays
  • Add extensive tests for sym_grp
  • Allow for OpenMP parallelized array initialization
  • 'symmetrize' method to symmetrize an existing array
  • 'init' method to init an array with the minimum number of rhs evaluations

jenkins

  • Specificy LAPACK_ROOT for osx builds
  • Update docker base images
  • Don't keep / publish any nda install
  • Synchronize Jenkinsfile with app4triqs

lapack/blas

  • General cleanup and doc improvements in bindings
  • Use concepts in generic lapack bindings
  • When possible use gemm/gemv with op='C' when passing conj(M)
  • Simplify logic in gemm
  • Rename 'trans' to 'op' in the blas bindings

Doc

  • Add document on design principles for arrays, views and lazy expressions
  • Add link instructions for cmake based projects
  • Provide a link to install instructions in README.md
  • Add additional build options to doc/install.rst

Fixes

  • Fix signature of expr::operator[]
  • Fix bug in lapack::getrs for C-layout matrix input
  • Fix const issue in map_layout_transform for rvalues
  • Fix operation char for non-fortran layout in getrs
  • Fix gcc compilation issue in assignment between tuple and std::array
  • Promote memory layout in matrix multiplication
  • Add Workaround for gcc11 bug
  • In nda::memcpy make sure to take src as a 'const *'
  • Do not create views from temporary arrays in gelss_worker
  • Fix issue in expr_call implementation for the slicing case
  • Fix bug in is_contiguous and is_strided_1d for Fortran layout arrays
  • Fix issue when calling h5::write for array_view
  • Matrix * Vector now returns a Vector and not a 1d array
  • Bugfix in print for arrays of rank>2
  • Avoid narrowing conversions in std::accumulate in multiple places
  • Add missing operator- to stdutil/array.hpp
  • Enable bound-checking for range-based array slicing FIX #22
  • Fix out of bounds error in lapack::gtsv

Version 1.1.0

nda is a C++ library providing an efficient and flexible multi-dimensional array class. It is an essential building-block of the TRIQS project. Some features include

  • coded in C++20 using concepts
  • expressions are implemented lazily for maximum performance
  • flexible and lightweight view-types
  • matrix and vector class with BLAS / LAPACK backend
  • easily store and retrieve arrays to and from hdf5 files using h5
  • common mpi functionality using mpi

This is the initial release for this project.

We thank all the people who have contributed to this release: Philipp Dumitrescu, Alexander Hampel, Olivier Parcollet, Dylan Simon, Hugo U. R. Strand, Nils Wentzell