Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more negative stride support to BLAS Level 1/2 functions #42957

Merged
merged 7 commits into from
Feb 20, 2022

Conversation

N5N3
Copy link
Member

@N5N3 N5N3 commented Nov 5, 2021

This PR could be regarded as the follow-up attempt of #41513 and #39751

In general, negative stride support is added to the following Level 1 functions:
(Let st denotes the input vector's stride)

  1. nrm2, asum
    Openblas always return 0 if st < 0, so I pass in abs(st).
  2. scal!
    st and -st give the same result, so I pass in abs(st).
  3. axpy(by)!, BLAS.copyto!
    st does affect the results' order so I have to pass negative st directly. At least OpenBLAS and MKL could handle these calls.
    (API with stide input, like rot!(n, X, incx, Y, incy, c, s) is not changed in this PR. User should pass in correct pointer if inc < 0)

Besides, st < 0 is supported by MKL and openblas for all the implemented Level 2 functions, so all of them are extended in this PR.

Test Added.

Edit: spr! added

Edit2: commits have been cleaned for easier review.

@sostock sostock added the linear algebra Linear algebra label Nov 5, 2021
@N5N3
Copy link
Member Author

N5N3 commented Nov 6, 2021

iamax with st < 0 failed on linuxaarch64.
Based on MKL's document:
"If either n or incx are not positive, the routine returns 0."
and OpenBLAS x86-64 source code:
https://github.com/xianyi/OpenBLAS/blob/2d7ca63e21e97dcd2db5c3aa7658f6cc2a504e55/kernel/x86_64/iamax.S#L72-L75
0 shoud be the expected return for negetive st.

Looks like openblas on arm only test n < 0: https://github.com/xianyi/OpenBLAS/blob/2d7ca63e21e97dcd2db5c3aa7658f6cc2a504e55/kernel/arm64/iamax_thunderx2t99.c#L215
Should we threat this inconsistency as "bug"?

Anyway, iamax should be seldomly used so I'll add a negetive stride check on Julia side.

stdlib/LinearAlgebra/src/blas.jl Outdated Show resolved Hide resolved
stdlib/LinearAlgebra/src/blas.jl Outdated Show resolved Hide resolved
stdlib/LinearAlgebra/src/blas.jl Show resolved Hide resolved
stdlib/LinearAlgebra/src/blas.jl Outdated Show resolved Hide resolved
@N5N3 N5N3 force-pushed the neg_stride_BLAS_Level1&2 branch 3 times, most recently from 7ea2135 to ca0e655 Compare November 10, 2021 09:06
@N5N3 N5N3 force-pushed the neg_stride_BLAS_Level1&2 branch 2 times, most recently from 5dca9de to d86b66e Compare November 24, 2021 08:35
Copy link
Member

@dkarrasch dkarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, just one very minor comment.

stdlib/LinearAlgebra/test/blas.jl Outdated Show resolved Hide resolved
@dkarrasch
Copy link
Member

BTW, does this also fix #41513 (comment)? Does it therefore "replace" #42054 on master, so #42054 should be only backported to v1.7 and v1.6, but should not be used on master?

@N5N3
Copy link
Member Author

N5N3 commented Feb 8, 2022

BTW, does this also fix #41513 (comment)? Does it therefore "replace" #42054 on master, so #42054 should be only backported to v1.7 and v1.6, but should not be used on master?

No, that comment is not fixed in this PR.

@dkarrasch
Copy link
Member

@N5N3 When you find the time, could you please resolve the merge conflict (on a quick glance, I couldn't find pA, the pointer to A defined in your version?)? Once tests pass, we can merge.

N5N3 and others added 4 commits February 9, 2022 17:17
Define `vec_pointer_stride`

Blas use the beginning in memory as the vector ptr.
Also add test.

Add neg-stride support to `scal!`

test added

Simplify `dot` implement

Add neg-stride support to `nrm2`/`asum`/`iamax`

test added
some test clean

Add neg-stride support to `axp(b)y!`

And some code clean.
tesst added

Add neg-stride support to `symv`/`hemv`

test added

Add neg-stride support to `gbmv`/`hbmv`/`sbmv`

test added

Add neg-stride support to `hpmv`/`spmv`

test added (and some code clean)

Add neg-stride support to `ger`/`her`/`syr`

test added

Add neg-stride support to `trmv`/`trsv`

test added

Add neg-stride support to `spr!`

Add neg-stride support to `copyto!`

test added

add some lazy

some test clean

Co-Authored-By: Daniel Karrasch <daniel.karrasch@posteo.de>
Only allow 0-stride inputs in `dot`/`axp(b)y`
Also call BLAS for negative `lda` (if possible)
@N5N3
Copy link
Member Author

N5N3 commented Feb 9, 2022

The unreviewed changes are sepetated.
Note: LinearAlgebra.gemv! won't call BLAS if it found a 0-stride src vector.
I'm not sure whether OpenBLAS's axpy/dot work well with 0-stride input on all platform. Let's check it with CI.

@N5N3
Copy link
Member Author

N5N3 commented Feb 20, 2022

@dkarrasch All tests are done. Error on linux32/64 seems unrelated. (BLAS test passed).
I guess it OK to merge?

@dkarrasch dkarrasch merged commit 6409a8a into JuliaLang:master Feb 20, 2022
@N5N3 N5N3 deleted the neg_stride_BLAS_Level1&2 branch February 20, 2022 10:31
staticfloat pushed a commit to JuliaCI/julia-buildkite-testing that referenced this pull request Mar 2, 2022
@ViralBShah ViralBShah added the backport 1.8 Change should be backported to release-1.8 label Mar 6, 2022
ViralBShah pushed a commit that referenced this pull request Mar 6, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
KristofferC pushed a commit that referenced this pull request Mar 15, 2022
KristofferC pushed a commit that referenced this pull request Mar 16, 2022
@KristofferC KristofferC removed the backport 1.8 Change should be backported to release-1.8 label Mar 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants