OpenBLAS 0.3.5 version
martin-frbg released this 31 Dec 22:17
common:
- loop unrolling in TRMV has been enabled again.
- A domain error in the thread workload distribution for SYRK
has been fixed.
- gmake builds will now automatically add -fPIC to the build
options if the platform requires it.
- a pthreads key leakage (and associated crash on dlclose) in
the USE_TLS codepath was fixed.
- building of the utest cases on systems that do not provide
an implementation of complex.h was fixed.
x86_64:
- the SkylakeX code was changed to compile on OSX.
- unwanted application of the -march=skylake-avx512 option
to the common code parts of a DYNAMIC_ARCH build was fixed.
- improved performance of SGEMM for small workloads on Skylake X.
- performance of SGEMM and DGEMM was improved on Haswell.
ARMV8:
- a configuration error that broke the CNRM2 kernel was corrected.
- compilation of the GEMM kernels with CMAKE was fixed.
- DYNAMIC_ARCH builds are now available with CMAKE as well.
- using CMAKE for cross-compilation to the new cpu TARGETs
introduced in 0.3.4 now works.
POWER:
- a problem in cpu autodetection for AIX has been corrected.
md5sum
ec0353a397ad3dbf2b28046b12cec1ae OpenBLAS-0.3.5.zip
579bda57f68ea6e9074bf5780e8620bb OpenBLAS-0.3.5.tar.gz
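
The checksums above can be compared against a downloaded archive before building. The following is a minimal sketch in Python rather than part of the release itself; the helper names `md5_of_file` and `verify` are hypothetical, and it assumes the archive keeps the file name listed above.

```python
import hashlib
from pathlib import Path

# Checksums copied from the md5sum list in these release notes.
EXPECTED_MD5 = {
    "OpenBLAS-0.3.5.zip": "ec0353a397ad3dbf2b28046b12cec1ae",
    "OpenBLAS-0.3.5.tar.gz": "579bda57f68ea6e9074bf5780e8620bb",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the checksum listed for its name."""
    name = Path(path).name
    if name not in expected:
        raise KeyError(f"no published checksum for {name}")
    return md5_of_file(path) == expected[name]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python verify_md5.py OpenBLAS-0.3.5.tar.gz
    for arg in sys.argv[1:]:
        print(arg, "OK" if verify(arg) else "MISMATCH")
```

A mismatch usually indicates an incomplete or corrupted download. MD5 only guards against accidental corruption, not deliberate tampering.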