Releases: ROCm/rocBLAS
rocBLAS-0.10.4.0 release for ROCM 1.7.0
Changelist:
- fix race condition for multi-process and multi-thread
- hipLaunchKernelGGL replaces hipLaunchKernel
- add logging
rocBLAS-0.10.3.0 release for ROCM 1.6.4
Changelist:
- add dgemm assembly from Tensile v3.4.0
- fix packaging install path
- integrate clang-format
rocBLAS-0.10.2.0 release for ROCM 1.6.4
Changelist:
- ported to CentOS
- updated to use Tensile v3.3.7 with v_add_i32->u32 fix and fix for M<4
- refactored code and tests for rocblas_pointer_mode
rocBLAS-0.10.1.0 release for ROCM 1.6.4
Changelist:
- add MI25 tuning for Tensile 3.3.4
- fix sgemm assembly kernels for thread safety
- correct iXamax to use 1-based indexing
- refactor tests
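The 1-based indexing fix for iXamax follows the reference BLAS convention, where the returned index counts from 1 and 0 signals an empty or invalid input. The sketch below is a plain CPU illustration of that convention, not the rocBLAS implementation; `ref_isamax` is a hypothetical helper name introduced only for this example.

```c
#include <stddef.h>

/* Hypothetical CPU reference (not part of rocBLAS): returns the 1-based
 * position of the first element with the largest absolute value, or 0
 * when n < 1 or incx <= 0, as in reference BLAS isamax. */
int ref_isamax(int n, const float *x, int incx)
{
    if (n < 1 || incx <= 0)
        return 0;

    int best = 1; /* 1-based position of the current maximum */
    float best_abs = x[0] < 0.0f ? -x[0] : x[0];

    for (int i = 1; i < n; ++i) {
        float v = x[(size_t)i * (size_t)incx];
        float a = v < 0.0f ? -v : v;
        if (a > best_abs) { /* strict '>' keeps the first occurrence on ties */
            best_abs = a;
            best = i + 1;   /* convert the 0-based loop index to 1-based */
        }
    }
    return best;
}
```

Callers that use the result to index a C array need to subtract 1 first.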
Release for ROCM 1.6.4
NOTE: API breaking changes introduced in this release related to: rocblas_iXamax, rocblas_iXamin, complex functions, and half functions.
Changelist:
- correct API: rocblas_samax -> rocblas_isamax, rocblas_damax -> rocblas_idamax
- remove from the API functions for complex and half that have not been implemented
- update to Tensile v3.2.0. This uses sgemm assembly kernels for gfx803 and gfx900
- add rocblas_sgeam and rocblas_dgeam functions
- improve repeatability of rocblas_Xgemm performance tests
- update perf script
Release for ROCM 1.6.3
NOTE: API breaking changes introduced in this release, primarily related to library NAME and SONAME.
Changelist:
- Library file name no longer carries the suffix that annotated the platform (it is now librocblas.so)
- so-name link renamed to reflect the MAJOR version number (currently 0, changed from 1)
- Build system entirely rewritten to simplify the build/install process. A convenience bash script, install.sh, was added to the repository root to automate builds on Ubuntu
- Tensile updated to v3.0.4, which includes fixes for NaN propagating on GEMM calls with beta == 0
- 2 new samples added in samples directory (gemm & strided gemm)
- haxpy implementation added
- extra unit tests added and benchmarking capabilities for axpy, dot, scal
- Improved stability of TRSM unit tests
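The NaN fix for beta == 0 listed above presumably refers to the reference BLAS convention that C is treated as write-only when beta is zero. Under IEEE 754, 0 * NaN is NaN, so a kernel that always computes beta * C lets stale NaN values in C leak into the result. The sketch below illustrates the difference for a single output element; both function names are hypothetical and this is not Tensile's code.

```c
/* alpha_ab stands for the already-computed alpha * (A * B) term of one
 * output element; c is the value currently stored in C. */

/* Naive update: always reads C, so a NaN in C survives even when
 * beta == 0, because 0 * NaN evaluates to NaN. */
float update_naive(float alpha_ab, float beta, float c)
{
    return alpha_ab + beta * c;
}

/* BLAS-style update: when beta is exactly zero, C is not read at all,
 * so whatever it held before cannot affect the result. */
float update_blas(float alpha_ab, float beta, float c)
{
    return (beta == 0.0f) ? alpha_ab : alpha_ab + beta * c;
}
```

This matters in practice because callers often pass an uninitialized C buffer together with beta == 0.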
rocBLAS-0.4.3.0 release for ROCM 1.6
Library release associated with ROCM v1.6 release.
Library tuned for Fiji family hardware.
rocBLAS-0.4.2.3 release for ROCM 1.5
Library release associated with the ROCm v1.5 platform release.
Library tuned for Fiji family hardware.
API change: the order parameter has been removed from the gemm functions, which now support only column-major ordering. If your matrices are row-major, swap the arguments in each of the following pairs: transa and transb, m and n, A and B, lda and ldb.
Below is the rocblas_sgemm function prototype.
rocblas_status rocblas_sgemm(
    rocblas_handle handle,
    rocblas_operation transa, rocblas_operation transb,
    rocblas_int m, rocblas_int n, rocblas_int k,
    const float *alpha,
    const float *A, rocblas_int lda,
    const float *B, rocblas_int ldb,
    const float *beta,
    float *C, rocblas_int ldc);
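The argument swap for row-major matrices works because a row-major buffer, read as column-major, is the transpose of the matrix, and C^T = B^T * A^T. The sketch below shows the swap for the non-transposed case only. To stay runnable without a GPU it uses a hypothetical CPU stand-in, `ref_sgemm_colmajor_nn`, in place of rocblas_sgemm; both function names are illustrative assumptions and not part of rocBLAS.

```c
#include <stddef.h>

/* Hypothetical CPU stand-in for a column-major, non-transposed sgemm:
 * C = alpha * A * B + beta * C, with A (m x k), B (k x n), C (m x n). */
void ref_sgemm_colmajor_nn(int m, int n, int k, float alpha,
                           const float *A, int lda,
                           const float *B, int ldb,
                           float beta, float *C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += A[i + (size_t)p * lda] * B[p + (size_t)j * ldb];
            /* C is not read when beta == 0 */
            float prev = (beta == 0.0f) ? 0.0f : beta * C[i + (size_t)j * ldc];
            C[i + (size_t)j * ldc] = alpha * acc + prev;
        }
    }
}

/* Row-major C (m x n) = alpha * A (m x k) * B (k x n) + beta * C.
 * Implemented by computing C^T = B^T * A^T in column-major terms:
 * m and n, A and B, and lda and ldb each trade places. */
void sgemm_rowmajor_nn(int m, int n, int k, float alpha,
                       const float *A, int lda,
                       const float *B, int ldb,
                       float beta, float *C, int ldc)
{
    ref_sgemm_colmajor_nn(n, m, k, alpha, B, ldb, A, lda, beta, C, ldc);
}
```

For example, with row-major A = [[1, 2, 3], [4, 5, 6]] and B = [[7, 8], [9, 10], [11, 12]], calling `sgemm_rowmajor_nn(2, 2, 3, 1.0f, A, 3, B, 2, 0.0f, C, 2)` fills C with the row-major product [[58, 64], [139, 154]]. When transposes are involved, the transa and transb values trade places along with A and B.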
rocBLAS-0.4.2.0 release for ROCM 1.6
Library release associated with ROCM v1.6 platform release.
Library tuned for Fiji family hardware.
rocBLAS-0.4.0.2 release for ROCm 1.5
Library release associated with the ROCm v1.5 platform release.
Library tuned for Fiji family hardware.
Known issue: at the time of release, the following unit tests fail:
- rocblas_trsm_matrix_size/trsm_gtest.trsm_gtest_float/12
- other tests related to the TRSM family
These failures have been traced to an issue in the software stack below the library, and a fix should be forthcoming. We will update the release notes when the fix is available.