Commit

advance to version 0.1.3
giaf committed Feb 10, 2022
1 parent d6d2a8d commit 4668125
Showing 3 changed files with 13 additions and 4 deletions.
12 changes: 10 additions & 2 deletions Changelog.txt
@@ -3,8 +3,8 @@ BLASFEO ChangeLog


====================================================================
-Version 0.1.3-master
-23-Dec-2020
+Version 0.1.3
+10-Feb-2022

BLASFEO_API:
* use macros in REFERENCE backend to allow column- and panel-major formats
@@ -16,6 +16,13 @@ BLAS_API:
* spotrf for all targets (partially optimized for avx2 and armv8a, generic for others)
* dgemm: optimize switching algorithm for Intel Haswell and ARM Cortex A57
* dgemm: some work on cache blocking: k-block (all targets), m- and n-block (haswell, sandybridge, cortexa76, cortexa73, cortexa57, cortexa55, cortexa53)
+* cache-blocking for dgemm, sgemm, dsyrk, dtrmm (llnn, lltn, rlnn), dtrsm (rlnn, lutn, rltn), dpotrf (l); fully optimized for haswell and cortexa57 targets.
+* dgetr: optimize for haswell, sandybridge, cortexa57, cortexa53
+
+x64:
+* Intel Skylake X:
+    - optimize panel-major routines needed in HPIPM
+    - optimize column-major dgemm

ARMv8A:
* add kernel sgemm nt {8x4,8x8} lib44cc & some relative spotrf kernels
@@ -26,6 +33,7 @@ ARMv8A:
* add Cortex A73 target (makefile only for now)
* add Cortex A55 target (makefile only for now)
* add Cortex A76 target (makefile only for now)
+* add Apple M1 target (makefile only for now)

====================================================================
Version 0.1.2
3 changes: 2 additions & 1 deletion README.md
@@ -15,10 +15,11 @@ The API is non-destructive, and compared to the BLAS API it has an additional ma
| ----------------------------- | -------------------------------------------------- |
| BLASFEO <br> (small matrices) | dgemm, dsyrk, dtrmm, dtrsm, dpotrf, dgetrf, dgeqrf, dgelqf, <br> sgemm, ssyrk, strmm, strsm, spotrf |
| BLAS <br> (small matrices) | dgemm, dsyrk, dtrmm, dtrsm, dpotrf, dgetrf <br> sgemm, strsm, spotrf |
-| BLAS <br> (large matrices) | dgemm, dsyrk <br> sgemm |
+| BLAS <br> (large matrices) | dgemm, dsyrk, dtrmm*, dtrsm*, dpotrf*, <br> sgemm |

Note: BLASFEO is currently under active development.
Some of the routines listed in the previous table may only be optimized for some variants, and provide reference implementations for other variants.
+E.g. only some variants of the routines marked with '*' are optimized for large matrices.

## Supported Computer Architectures

2 changes: 1 addition & 1 deletion version
@@ -1 +1 @@
-0.1.1-1
+0.1.3
