Merge develop into 0.3.0 for release 0.3.16 #3304

Merged 154 commits on Jul 11, 2021
Commits
ed47326
Merge pull request #1 from xianyi/develop
damonyu1989 Apr 27, 2021
ceb44be
update the intrinsic api to the official name.
damonyu1989 Apr 27, 2021
aa7b3dc
GEMM: skylake: improve the performance when m is small
guowangy Apr 28, 2021
c59652f
optimize on sgemv_n for small n
Apr 30, 2021
3d4ccd2
fix for build error
Apr 30, 2021
49d18e6
Merge pull request #3217 from xianyi/release-0.3.0
martin-frbg May 2, 2021
380f955
Update version to 0.3.15.dev
martin-frbg May 2, 2021
9721b57
Update version to 0.3.15.dev
martin-frbg May 2, 2021
8b59983
Add error message token for SBGEMM in gemm.c
austinpagan May 4, 2021
206e03f
Delete lapack_wrappers.c.orig
drhpc May 4, 2021
f86b1bc
Merge pull request #3220 from drhpc/drhpc-fixup
martin-frbg May 5, 2021
f497bb9
Merge pull request #3219 from austinpagan/Gemm.ErrorFix
martin-frbg May 5, 2021
c0ca63e
Fix missing conditionals for non-SKX kernels
martin-frbg May 5, 2021
bda8820
Use percent instead of ampersand as placeholder for substitutions
martin-frbg May 6, 2021
c90c23e
Merge pull request #3223 from martin-frbg/develop
martin-frbg May 7, 2021
ec7d6c0
Add an Android crossbuild on OSX to Azure CI (#3224)
martin-frbg May 10, 2021
37ea870
Merge pull request #3192 from damonyu1989/develop
martin-frbg May 11, 2021
bd60fb6
filter out -mavx flag on zgemm kernels as it can cause problems with …
martin-frbg May 13, 2021
8b90e5f
Drop redundant inclusion of complex.h
martin-frbg May 14, 2021
73f637e
Support compilation with pre-C99 versions of MSVC
martin-frbg May 14, 2021
eef1c42
Convert ?chkaa to use dynamic allocation for the larger arrays
martin-frbg May 14, 2021
2c7d4a7
Delete cchkaa.f
martin-frbg May 14, 2021
93cc066
Delete dchkaa.f
martin-frbg May 14, 2021
f7bcd96
Delete schkaa.f
martin-frbg May 14, 2021
15b9d6b
Delete zchkaa.f
martin-frbg May 14, 2021
26e87ac
Support Intel Ice Lake SP as Cooper Lake
martin-frbg May 14, 2021
cbfd3c8
Recognize Intel Ice Lake SP as Cooper Lake
martin-frbg May 14, 2021
c4da892
Only filter out -mavx on Sandybridge ZGEMM/ZTRMM kernels
martin-frbg May 14, 2021
310b76a
Merge pull request #3231 from martin-frbg/issue3227
martin-frbg May 14, 2021
164551d
Merge pull request #3232 from martin-frbg/lapack553
martin-frbg May 14, 2021
5af5100
Merge pull request #3233 from martin-frbg/issue3230
martin-frbg May 14, 2021
4ecf631
Merge pull request #3228 from martin-frbg/issue3226
martin-frbg May 15, 2021
32264ba
Update Makefile.arm64
dnoan May 16, 2021
26ccf64
Add -lm for FreeBSD on ARM/ARM64
martin-frbg May 16, 2021
8f33da4
Merge pull request #3235 from dnoan/develop
martin-frbg May 16, 2021
e1911b2
Merge pull request #3236 from martin-frbg/issue3234
martin-frbg May 16, 2021
5c729c6
Correct function name in error message from SLASQ2 (Reference-LAPACK …
martin-frbg May 17, 2021
03b4d79
Merge pull request #3238 from martin-frbg/lapack555
martin-frbg May 17, 2021
02087a6
Merge pull request #3205 from intelmy/sgemv_n_opt
martin-frbg May 17, 2021
0e73d20
Handle inadvertent use of DYNAMIC_ARCH=0
martin-frbg May 22, 2021
3a53207
Fix spurious error exit test failures in the ?chktsqr tests (LAPACK564)
martin-frbg May 22, 2021
04c60ce
Merge pull request #3242 from martin-frbg/issue3239
martin-frbg May 22, 2021
5f677e7
Merge pull request #3196 from guowangy/skylakex-gemm-batch-k
martin-frbg May 22, 2021
2d8d0af
Merge pull request #3243 from martin-frbg/lapack564
martin-frbg May 22, 2021
03297ff
Add fast path for small xSYR with INCX==1
martin-frbg May 22, 2021
d747260
Merge pull request #3244 from martin-frbg/issue3237
martin-frbg May 22, 2021
4fbc077
Fix typo
May 26, 2021
42f048c
Merge pull request #3249 from MikaelUrankar/develop
martin-frbg May 26, 2021
f0e7345
Add shortcut for small-size gemv_n with increments of one
martin-frbg May 26, 2021
d6d7a66
Add shortcuts for (small) cases that do not need expensive buffer all…
martin-frbg May 27, 2021
1217eb9
Fix copy-paste errors in variables used
martin-frbg May 28, 2021
734bd26
revert symv changes for now
martin-frbg May 29, 2021
f84197c
Add shortcuts for (small) cases that do not need expensive buffer all…
martin-frbg May 29, 2021
8c25b44
revert "try to work around gcc update problems"
martin-frbg Jun 6, 2021
fe9aff1
Merge pull request #3258 from martin-frbg/hbaction
martin-frbg Jun 6, 2021
1e0192a
riscv64/imin: Fix wrong comparison
zhaofengli Jun 7, 2021
3521cd4
RISCV64_GENERIC: Use generic kernel for DSDOT for better precision
zhaofengli Jun 7, 2021
590be3f
riscv64: Add Makefile
zhaofengli Jun 7, 2021
9f3d903
Merge pull request #3259 from zhaofengli/riscv64-fixes
xianyi Jun 8, 2021
706a08d
Optimized sgemv_t for small N based on AVX512
intelmy Jun 8, 2021
cbb7043
POWER10: Fixes for sbgemm kernel
Jun 9, 2021
7fb6e57
Removed use of non portable '-p' arg to install
TAAPArthur Jun 10, 2021
7dfc45e
Remove casts for PPC/POWER and complete parameters for POWER3/4
martin-frbg Jun 10, 2021
7a48247
fix c/zrot and sgemv for POWER5
martin-frbg Jun 10, 2021
dc4fcb4
Fix inverted conditional for caxpy/zaxpy
martin-frbg Jun 10, 2021
fb9e678
Fix caxpy/zaxpy for big-endian
martin-frbg Jun 10, 2021
08e2e60
Add prefetch values for power3
martin-frbg Jun 10, 2021
8adf097
Add prefetch values for power3
martin-frbg Jun 10, 2021
3906ef3
Add prefetch values for power3
martin-frbg Jun 10, 2021
efdbdd8
Add prefetch values for power3
martin-frbg Jun 10, 2021
f61991d
Merge pull request #3264 from RajalakshmiSR/sbgemmp10
martin-frbg Jun 10, 2021
dbba381
Merge pull request #3260 from intelmy/sgemv_t_opt
martin-frbg Jun 10, 2021
2e8ff4a
Merge pull request #3266 from martin-frbg/powerparam
martin-frbg Jun 10, 2021
9d292d3
arm64: add the missing d9 register to the clobber list
ggouaillardet Jun 14, 2021
29417ad
Merge pull request #3270 from ggouaillardet/topic/dznrm2_tx2
martin-frbg Jun 14, 2021
7aab5e8
Merge pull request #3250 from martin-frbg/gemv-shortcut
martin-frbg Jun 15, 2021
baf03a0
Merge pull request #3252 from martin-frbg/more_shortcuts
martin-frbg Jun 15, 2021
e6dd44d
Power10: Fix for SBGEMM
Jun 15, 2021
c4b464c
Merge pull request #3273 from austinpagan/sbgemm_gcc10_fix
martin-frbg Jun 15, 2021
92e024b
Declare SCASUM as EXTERNAL
martin-frbg Jun 16, 2021
5269348
Declare CSROT as EXTERNAL
martin-frbg Jun 16, 2021
9e1b43e
Declare DROT as EXTERNAL
martin-frbg Jun 16, 2021
e2621ef
Declare SROT as EXTERNAL
martin-frbg Jun 16, 2021
cd0e4aa
Declare ZDROT as EXTERNAL
martin-frbg Jun 16, 2021
5958ffc
Declare DZASUM as EXTERNAL
martin-frbg Jun 16, 2021
13fa9f7
Modify defines for CR and RC to work around name collision on Windows
martin-frbg Jun 16, 2021
e83df93
Work around another recent macro name collision with winnt.h
martin-frbg Jun 16, 2021
307c4c0
Fix typo
martin-frbg Jun 16, 2021
9499ab0
Merge pull request #3275 from martin-frbg/lapack580
martin-frbg Jun 16, 2021
a7627c5
Merge pull request #3276 from martin-frbg/issue3274
martin-frbg Jun 16, 2021
b7da75e
WiP CORTEX A55 support
Jun 19, 2021
39ef088
copy conf
Jun 19, 2021
9335d42
add gcc8 version matching
Jun 19, 2021
6423b28
dynamic_arch
Jun 20, 2021
548aa52
remove misplaced file
Jun 20, 2021
91e2b11
add to cmake listings too
Jun 20, 2021
7507195
bugz
Jun 20, 2021
130327e
OK
Jun 22, 2021
f0b822a
Update cpuid_arm64.c
martin-frbg Jun 23, 2021
1a8b613
Merge pull request #3278 from brada4/A55
martin-frbg Jun 23, 2021
3be660c
Add interface declarations for ?potri
martin-frbg Jun 26, 2021
1f8bda7
Add OPENBLAS_LOOPS support to potrf/potrs/potri benchmark
martin-frbg Jun 26, 2021
1b5620b
Add lower threshold for multithreading in ?potrf and ?potri
martin-frbg Jun 26, 2021
6ebcce2
Work around current conda/tqdm auto-update problem
martin-frbg Jun 29, 2021
7ddc9d3
Merge pull request #3287 from martin-frbg/appveyor-conda
martin-frbg Jun 29, 2021
623be66
Merge pull request #3284 from martin-frbg/potrf_potri
martin-frbg Jun 30, 2021
06e3b07
Handle OPENBLAS_LOOPS and OPENBLAS_TEST options
martin-frbg Jul 1, 2021
dcfc5cf
Handle OPENBLAS_LOOPS for more stable results
martin-frbg Jul 1, 2021
726c442
Add lower threshold for multithreading
martin-frbg Jul 1, 2021
4620f98
Mention availability of the Windows binaries in the Releases section
martin-frbg Jul 1, 2021
2376aa1
Merge pull request #3289 from martin-frbg/issue3283
martin-frbg Jul 1, 2021
a4543e4
Handle OPENBLAS_LOOP
martin-frbg Jul 4, 2021
8186963
Add lower limit for multithreading
martin-frbg Jul 4, 2021
3cfdb17
Remove code that disabled EXTRALIB on RISCV C910V
martin-frbg Jul 6, 2021
f20c4ed
Merge pull request #3288 from martin-frbg/getrf-2
martin-frbg Jul 7, 2021
4ed99c2
Merge pull request #3292 from martin-frbg/syrk_limit
martin-frbg Jul 7, 2021
25b602d
Merge pull request #3293 from martin-frbg/issue3290
martin-frbg Jul 7, 2021
40caaef
Merge pull request #3265 from TAAPArthur/improve_portability
martin-frbg Jul 7, 2021
0d8d261
Recognize newer Zhaoxin/Centaur cpus as Nehalem
martin-frbg Jul 8, 2021
eb2fdd3
Recognize newer Zhaoxin/Centaur processors as Nehalem
martin-frbg Jul 8, 2021
da623ae
Add vendor string Shanghai as the successor to Centaur
martin-frbg Jul 8, 2021
8f22ac5
Add vendor string Shanghai as successor to Centaur
martin-frbg Jul 8, 2021
c0d0406
Merge pull request #3296 from martin-frbg/issue3295
martin-frbg Jul 8, 2021
2f6326a
Remove <linux/unistd.h>
Jul 10, 2021
220f6a1
Add feature test macro for proper inclusion of <sched.h>
Jul 10, 2021
cecc2c6
Add test of installed <openblas_config.h>
Jul 10, 2021
ddb6cee
Contribution note
Jul 10, 2021
4f4e286
Fix copy-paste error in LIBCORE assignment for Tiger Lake
martin-frbg Jul 10, 2021
d511063
Move Alpine Linux build job from Travis to Azure
martin-frbg Jul 10, 2021
89429fd
fix typo
martin-frbg Jul 10, 2021
d86290e
add sudo for install in Alpine
martin-frbg Jul 10, 2021
c930419
Update azure-pipelines.yml
martin-frbg Jul 10, 2021
993e56b
Merge pull request #3299 from martin-frbg/issue3298
martin-frbg Jul 10, 2021
db57c44
Update azure-pipelines.yml
martin-frbg Jul 10, 2021
14e33e0
Handle OPENBLAS_LOOPS in SYR2 benchmark
martin-frbg Jul 10, 2021
7e09570
Update azure-pipelines.yml
martin-frbg Jul 10, 2021
0266ba7
Update azure-pipelines.yml
martin-frbg Jul 10, 2021
b2319fd
Merge pull request #3301 from martin-frbg/syr2bench
martin-frbg Jul 11, 2021
69560ad
Update azure-pipelines.yml
martin-frbg Jul 11, 2021
a27a61b
Update azure-pipelines.yml
martin-frbg Jul 11, 2021
c47e35a
Update azure-pipelines.yml
martin-frbg Jul 11, 2021
8acb6fe
Update azure-pipelines.yml
martin-frbg Jul 11, 2021
d2693ea
Update azure-pipelines.yml
martin-frbg Jul 11, 2021
836c7fb
Revert addition of test_install target
martin-frbg Jul 11, 2021
eba2cd9
Revert addition of test_install
martin-frbg Jul 11, 2021
7bb59fc
Clean up some warnings
martin-frbg Jul 11, 2021
be1a425
Merge pull request #3297 from outerpassage/develop
martin-frbg Jul 11, 2021
b4cbfe6
Update azure-pipelines.yml
martin-frbg Jul 11, 2021
498479b
Update azure-pipelines.yml
martin-frbg Jul 11, 2021
e008646
Merge pull request #3302 from martin-frbg/small_cleanup
martin-frbg Jul 11, 2021
19c81a0
Merge pull request #3300 from martin-frbg/AzureAlpine
martin-frbg Jul 11, 2021
239ff33
Update Changelog for 0.3.16
martin-frbg Jul 11, 2021
ed3eb18
Merge pull request #3303 from martin-frbg/changelog16
martin-frbg Jul 11, 2021
847607c
Merge branch 'release-0.3.0' into develop
martin-frbg Jul 11, 2021
Handle OPENBLAS_LOOPS in SYR2 benchmark
martin-frbg authored Jul 10, 2021
commit 14e33e0f7e05e26b2b1cc2ced015c7722b0adc31
14 changes: 10 additions & 4 deletions benchmark/syr2.c
@@ -46,14 +46,17 @@ int main(int argc, char *argv[]){

  if ((p = getenv("OPENBLAS_UPLO"))) uplo=*p;

- blasint m, i, j;
+ blasint m, i, j, l;
  blasint inc_x= 1;
  blasint inc_y= 1;
  int from = 1;
  int to = 200;
  int step = 1;
+ int loops = 1;

- double time1;
+ if ((p = getenv("OPENBLAS_LOOPS"))) loops=*p;
+
+ double time1,timeg;

  argc--;argv++;

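A note on the `OPENBLAS_LOOPS` line in this hunk: as written, `loops=*p` stores the character code of the first byte of the variable's value (for example, `'5'` is 53 in ASCII) rather than the number it spells. If the intent is a numeric repetition count, a parsing step along the following lines may be closer to it. This is only a minimal sketch, not the code in the commit; `parse_loops` is a hypothetical helper name and the fallback rules are assumptions.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not part of benchmark/syr2.c): turn the text of an
 * environment variable into a loop count. Returns `fallback` when the text
 * is missing, empty, not a decimal number, below 1, or out of int range. */
static int parse_loops(const char *text, int fallback)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return fallback;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || errno == ERANGE || value < 1 || value > INT_MAX)
        return fallback;

    return (int)value;
}
```

In the benchmark this could be called as `loops = parse_loops(getenv("OPENBLAS_LOOPS"), 1);`, which keeps the default of a single run when the variable is unset.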
@@ -85,8 +88,9 @@ int main(int argc, char *argv[]){

  for(m = from; m <= to; m += step)
  {
-
+ timeg = 0.;
  fprintf(stderr, " %6d : ", (int)m);
+ for (l = 0; l < loops; l++) {
  for(i = 0; i < m * COMPSIZE * abs(inc_x); i++){
  x[i] = ((FLOAT) rand() / (FLOAT) RAND_MAX) - 0.5;
  }
@@ -107,8 +111,10 @@ int main(int argc, char *argv[]){

  end();

- time1 = getsec();
+ timeg += getsec();
+ } // loops

+ time1 = timeg/(double)loops;
  fprintf(stderr,
  " %10.2f MFlops\n",
  COMPSIZE * COMPSIZE * 2. * (double)m * (double)m / time1 * 1.e-6);
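
The last two hunks accumulate the time of each repetition in `timeg` and report the mean over `loops` runs, which is what makes the printed figure more stable for small `m`. The pattern can be sketched in isolation as follows. `mean_seconds`, `count_call` and `syr2_mflops` are hypothetical names introduced for illustration, and the standard `clock()` stands in for the benchmark's own `end()`/`getsec()` helpers, whose definitions are not shown in this diff.

```c
#include <time.h>

/* Hypothetical helper: call kernel(arg) `loops` times and return the mean
 * CPU time per call in seconds. clock() is an assumption standing in for
 * the benchmark's own timing helpers. */
static double mean_seconds(void (*kernel)(void *), void *arg, int loops)
{
    double total = 0.0;
    int l;

    if (loops < 1)
        loops = 1;                      /* guard against division by zero */

    for (l = 0; l < loops; l++) {
        clock_t start = clock();
        kernel(arg);
        total += (double)(clock() - start) / CLOCKS_PER_SEC;
    }
    return total / (double)loops;
}

/* Stand-in kernel for demonstration: counts how often it was called. */
static void count_call(void *arg)
{
    ++*(int *)arg;
}

/* The MFlops expression printed by the benchmark, as a function:
 * COMPSIZE * COMPSIZE * 2 * m * m / seconds * 1e-6. */
static double syr2_mflops(int compsize, int m, double seconds)
{
    return (double)compsize * (double)compsize * 2.0
           * (double)m * (double)m / seconds * 1.e-6;
}
```

Note that a very fast kernel can yield a mean of zero seconds at `clock()` resolution, so a caller would want to check the result before dividing by it.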