hipBLASLt 0.2.0 for ROCm 5.6.1

rocm-ci released this 29 Aug 20:11

Added

  • Added CI tests for tensilelite
  • Added initial grouped GEMM extension APIs (FP16 only)
  • Added grouped GEMM sample app: example_hipblaslt_groupedgemm

Fixed

  • Fixed incorrect results from the ScaleD kernel

Optimizations

  • Tuned equality sizes for HHS data type
  • Reduced host side overhead for hipblasLtMatmul()
  • Removed unused kernel arguments
  • Scheduled VALU setup before the first s_waitcnt
  • Refactored tensilelite host code
  • Optimized build time