Add rot to BLAS in stdlib/LinearAlgebra #35124
I don't really mind adding this, but notice that the functionality is already covered by
julia> LinearAlgebra.Givens(1, 3, 0.3, 0.5)*[1, 1, 1]
3-element Array{Float64,1}:
  0.8
  1.0
 -0.2
julia> lmul!(LinearAlgebra.Givens(1, 3, 0.3, 0.5), [1.0, 1, 1])
3-element Array{Float64,1}:
  0.8
  1.0
 -0.2
and it's probably more efficient than calling out to BLAS.
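To make the equivalence concrete, here is a minimal sketch that compares `lmul!` with a `Givens` object against the plane-rotation update written out by hand. The variable names `v`, `w`, and `G` are illustrative only, and the sketch assumes real `c` and `s` as in the REPL session above.

```julia
using LinearAlgebra, Test

c, s = 0.3, 0.5
v = [1.0, 1.0, 1.0]

# Givens(i1, i2, c, s) only touches entries i1 and i2 of the vector.
G = LinearAlgebra.Givens(1, 3, c, s)
w = lmul!(G, copy(v))

# The same update written out explicitly (real c and s assumed).
@test w[1] ≈ c*v[1] + s*v[3]
@test w[2] == v[2]
@test w[3] ≈ -s*v[1] + c*v[3]
```

Entry 2 is left untouched, which is the point of the comparison: a BLAS-style `rot!` on two separate vectors and a `Givens` acting on two rows perform the same arithmetic.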
Maybe worth a speed comparison then?
@andreasnoack
with
I update ẘₖ₋₁ and w̄ₖ without allocating extra memory. Does And in general we like to use Givens reflections (symmetric) instead of Givens rotations.
stdlib/LinearAlgebra/test/blas.jl
Outdated
x2, y2 = BLAS.rot!(n,copy(x),1,copy(y),1,c,s)
@test x2 ≈ c*x + s*y
@test y2 ≈ -s*x + c*y
I'd suggest also testing that `x` and `y` are actually overwritten, something along the lines of
- x2, y2 = BLAS.rot!(n,copy(x),1,copy(y),1,c,s)
- @test x2 ≈ c*x + s*y
- @test y2 ≈ -s*x + c*y
+ x2 = copy(x)
+ y2 = copy(y)
+ BLAS.rot!(n, x, 1, y, 1, c, s)
+ @test x ≈ c*x2 + s*y2
+ @test y ≈ -s*x2 + c*y2
and similarly below.
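For context, a self-contained sketch of such an in-place check might look as follows. It assumes a Julia version in which `BLAS.rot!(n, X, incx, Y, incy, c, s)` is available (1.5 or later), and the vector length `n = 5` is an arbitrary choice for illustration.

```julia
using LinearAlgebra, Test

n = 5
x, y = rand(n), rand(n)
c, s = rand(), rand()

# Keep pristine copies so the expected values do not depend on the mutated inputs.
x2, y2 = copy(x), copy(y)

# Passing x and y directly (not copies) exercises the in-place behaviour.
rx, ry = BLAS.rot!(n, x, 1, y, 1, c, s)

@test rx === x                  # the returned arrays are the inputs themselves
@test ry === y
@test x ≈ c*x2 + s*y2
@test y ≈ -s*x2 + c*y2
```

Calling the function on `copy(x)` and `copy(y)` only checks the returned values; it would still pass if the function allocated fresh outputs and left its arguments alone.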
The current methods for
Indeed, I found and adapted this one (julia/stdlib/LinearAlgebra/src/givens.jl, lines 339 to 351 in 2a5bb59) to this:
function rot!(x::AbstractVector, y::AbstractVector, c, s)
    LinearAlgebra.require_one_based_indexing(x, y)
    n = length(x)
    n == length(y) || throw(DimensionMismatch("different lengths"))
    @inbounds for i = 1:n
        xi, yi = x[i], y[i]
        x[i] = c*xi + s*yi
        y[i] = -conj(s)*xi + c*yi
    end
    return x, y
end
which passes the tests. This is quick-and-dirty; one could be more permissive in terms of lengths (like pass an extra
using LinearAlgebra, BenchmarkTools, Test
c, s = rand(), rand()
x0, y0 = rand(1000), rand(1000);
@btime rot!(x, y, $c, $s) setup=(x = copy(x0); y = copy(y0)); # 245 ns
@btime lmul!(LinearAlgebra.Givens(1, 2, $c, $s), A) setup=(A = [reshape(x0, 1, :); reshape(y0, 1, :)]); # 721 ns
x, y = rand(1000), rand(1000);
A = [reshape(x, 1, :); reshape(y, 1, :)];
rot!(x, y, c, s)
lmul!(LinearAlgebra.Givens(1, 2, c, s), A)
@test A ≈ [reshape(x, 1, :); reshape(y, 1, :)]
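One property the benchmark above does not exercise is that a genuine plane rotation preserves the combined norm of the two vectors; there `c` and `s` are independent random numbers, so `c^2 + abs2(s) == 1` does not hold. The following sketch checks that property with a real `c` and a complex `s`, which also shows why the `conj(s)` in the second update matters. The function name `plane_rot!` and the angles `0.7` and `0.3` are hypothetical choices for this illustration; the loop body restates the one from the comment.

```julia
using LinearAlgebra, Test

# Local restatement of the loop under a hypothetical name.
function plane_rot!(x::AbstractVector, y::AbstractVector, c, s)
    length(x) == length(y) || throw(DimensionMismatch("different lengths"))
    @inbounds for i in eachindex(x, y)
        xi, yi = x[i], y[i]
        x[i] = c*xi + s*yi
        y[i] = -conj(s)*xi + c*yi
    end
    return x, y
end

θ = 0.7
c, s = cos(θ), sin(θ) * cis(0.3)   # real c, complex s, with c^2 + |s|^2 = 1

x, y = rand(ComplexF64, 100), rand(ComplexF64, 100)
n0 = hypot(norm(x), norm(y))       # combined 2-norm before the rotation

plane_rot!(x, y, c, s)
@test hypot(norm(x), norm(y)) ≈ n0 # unchanged after the rotation
```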
Sorry for the delay, I did some benchmarks @dkarrasch and your The BLAS version could still be relevant if you use About the allocation, it's the pointers for
stdlib/LinearAlgebra/src/generic.jl
Outdated
Returns `x` and `y`.

!!! compat "Julia 1.5"
    `rot!` requires at least Julia 1.5.
rot! -> ref!
stdlib/LinearAlgebra/test/generic.jl
Outdated
x2 = copy(x)
y2 = copy(y)
rot!(n, x, y, c, s)
- rot!(n, x, y, c, s)
+ rot!(x, y, c, s)
stdlib/LinearAlgebra/test/generic.jl
Outdated
x3 = copy(x)
y3 = copy(y)
ref!(n, x, y, c, s)
- ref!(n, x, y, c, s)
+ ref!(x, y, c, s)
Great with generic implementations. Since they are no longer wrappers of BLAS functions, I'll suggest that the names are made more informative since the F77 variable name restrictions don't apply. So maybe
LGTM
This seems good. @dkarrasch, anyone else you'd want to have look it over?
Applies the plane rotation :