Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

conversion to pointer not defined for SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true} #219

Closed
ForceBru opened this issue Aug 4, 2021 · 4 comments
Labels

Comments

@ForceBru
Copy link

ForceBru commented Aug 4, 2021

I'm trying to run clustering on a view with indices of type UInt64, but getting the error in the title.

Code:

import Pkg
Pkg.activate(temp=true)
Pkg.add("Clustering", io=devnull)
Pkg.status()

import Clustering

data = rand(100)
my_view = @view data[1:UInt64(50)]

Clustering.kmeans(reshape(my_view, 1, :), 2)

Output:

~/test $ julia uint64_view.jl
  Activating new environment at `/var/folders/ys/3h0gnqns4b98zb66_vl_m35m0000gn/T/jl_TpqK3f/Project.toml`
      Status `/private/var/folders/ys/3h0gnqns4b98zb66_vl_m35m0000gn/T/jl_TpqK3f/Project.toml`
  [aaaa29a8] Clustering v0.14.2
ERROR: LoadError: conversion to pointer not defined for SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true}
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] unsafe_convert(#unused#::Type{Ptr{Float64}}, a::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true})
    @ Base ./pointer.jl:67
  [3] unsafe_convert(#unused#::Type{Ptr{Float64}}, a::Base.ReshapedArray{Float64, 2, SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true}, Tuple{}})
    @ Base ./reshapedarray.jl:281
  [4] gemm!(transA::Char, transB::Char, alpha::Float64, A::Matrix{Float64}, B::Base.ReshapedArray{Float64, 2, SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true}, Tuple{}}, beta::Float64, C::Matrix{Float64})
    @ LinearAlgebra.BLAS /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/blas.jl:1458
  [5] gemm_wrapper!(C::Matrix{Float64}, tA::Char, tB::Char, A::Matrix{Float64}, B::Base.ReshapedArray{Float64, 2, SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true}, Tuple{}}, _add::LinearAlgebra.MulAddMul{true, true, Bool, Bool})
    @ LinearAlgebra /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/matmul.jl:671
  [6] mul!
    @ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/matmul.jl:377 [inlined]
  [7] mul!
    @ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/matmul.jl:444 [inlined]
  [8] mul!
    @ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/matmul.jl:275 [inlined]
  [9] _pairwise!(r::Matrix{Float64}, dist::Distances.SqEuclidean, a::Matrix{Float64}, b::Base.ReshapedArray{Float64, 2, SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true}, Tuple{}})
    @ Distances ~/.julia/packages/Distances/gnt89/src/metrics.jl:616
 [10] pairwise!(r::Matrix{Float64}, metric::Distances.SqEuclidean, a::Matrix{Float64}, b::Base.ReshapedArray{Float64, 2, SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true}, Tuple{}}; dims::Int64)
    @ Distances ~/.julia/packages/Distances/gnt89/src/generic.jl:271
 [11] pairwise(metric::Distances.SqEuclidean, a::Matrix{Float64}, b::Base.ReshapedArray{Float64, 2, SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true}, Tuple{}}; dims::Int64)
    @ Distances ~/.julia/packages/Distances/gnt89/src/generic.jl:321
 [12] _kmeans!(X::Base.ReshapedArray{Float64, 2, SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true}, Tuple{}}, weights::Nothing, centers::Matrix{Float64}, maxiter::Int64, tol::Float64, displevel::Int64, distance::Distances.SqEuclidean)
    @ Clustering ~/.julia/packages/Clustering/tt9vc/src/kmeans.jl:138
 [13] kmeans!(X::Base.ReshapedArray{Float64, 2, SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true}, Tuple{}}, centers::Matrix{Float64}; weights::Nothing, maxiter::Int64, tol::Float64, display::Symbol, distance::Distances.SqEuclidean)
    @ Clustering ~/.julia/packages/Clustering/tt9vc/src/kmeans.jl:70
 [14] kmeans(X::Base.ReshapedArray{Float64, 2, SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true}, Tuple{}}, k::Int64; weights::Nothing, init::Symbol, maxiter::Int64, tol::Float64, display::Symbol, distance::Distances.SqEuclidean)
    @ Clustering ~/.julia/packages/Clustering/tt9vc/src/kmeans.jl:112
 [15] kmeans(X::Base.ReshapedArray{Float64, 2, SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{UInt64}}, true}, Tuple{}}, k::Int64)
    @ Clustering ~/.julia/packages/Clustering/tt9vc/src/kmeans.jl:103
 [16] top-level scope
    @ ~/test/uint64_view.jl:11
in expression starting at /Users/forcebru/test/uint64_view.jl:11
~/test [1] $

/Users/forcebru/test/uint64_view.jl:11 is the last line of the code above.


julia> versioninfo()
Julia Version 1.6.1
Commit 6aaedecc44 (2021-04-23 05:59 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i5-3330S CPU @ 2.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, ivybridge)
@alyst alyst added the bug label Aug 4, 2021
@alyst
Copy link
Member

alyst commented Aug 4, 2021

This is, sort of, reasonable: to efficiently calculate distance, Distances.jl (note that the error is triggered from within that package) uses BLAS matrix multiplication routines. These routines require that the memory layout for the matrix is regular. Which is not the case in your situation. To fix it, try using convert(Matrix, yourmatrix).
I'm not sure what could be improved here. You can report your case to Distances.jl, maybe they can provide better diagnostic in such situations. Any workaround for handing non-standard matrices within the package would be either slow or memory consuming.

@alyst alyst added question and removed bug labels Aug 4, 2021
@ForceBru
Copy link
Author

ForceBru commented Aug 4, 2021

These routines require that the memory layout for the matrix is regular. Which is not the case in your situation.

However, everything works fine when I change the type of the index: @view data[1:UInt64(50)] to @view data[1:50]. Presumably, the memory layout of the data the view refers to doesn't change since it's the same vector data and the same index, but the code that uses UInt64 throws the error above, while @view data[1:50] works fine.

So, I'm very confused about why the type of the index (1:50 vs 1:UInt64(50)) matters so much in this case.

To fix it, try using convert(Matrix, yourmatrix).

That would allocate a new matrix, right? I specifically don't want to allocate a new matrix because I need to extract tens of thousands of different slices of data (a rolling window), and I think that allocating memory each time will be very slow:

julia> using BenchmarkTools

julia> data = rand(1000);

julia> @benchmark data[1:600]
BenchmarkTools.Trial: 10000 samples with 199 evaluations.
 Range (min  max):  520.452 ns  61.369 μs  ┊ GC (min  max):  0.00%  96.91%
 Time  (median):     867.467 ns              ┊ GC (median):     0.00%
 Time  (mean ± σ):     2.051 μs ±  7.376 μs  ┊ GC (mean ± σ):  52.44% ± 14.08%

  █▂▄                                                          ▁
  ███▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▄▁▁▁▁▃▁▄▄▃▅▄▃▅▇▇▆▇ █
  520 ns        Histogram: log(frequency) by time      52.4 μs <

 Memory estimate: 4.81 KiB, allocs estimate: 1.

julia> @benchmark @view data[1:600]
BenchmarkTools.Trial: 10000 samples with 994 evaluations.
 Range (min  max):  33.188 ns   1.336 μs  ┊ GC (min  max): 0.00%  95.67%
 Time  (median):     34.657 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   37.477 ns ± 37.798 ns  ┊ GC (mean ± σ):  3.13% ±  3.04%

  ▄▇█▆▆▄▄▂▄▄▅▄▃▁                                ▁             ▂
  ██████████████▇▇▆▅▆▅▅██▇▇▇▇▇▆▇▇▆▆▆▅▅▄▄▅▁▄▅▇▇▇████▇▇▇▇▆▆▆▆▆▆ █
  33.2 ns      Histogram: log(frequency) by time      59.1 ns <

 Memory estimate: 48 bytes, allocs estimate: 1.

julia> 

@benchmark data[1:1000] gives a median of 1.338 μs per iteration, while for views it remains at 34 nanoseconds.


Initially, I wanted to report this straight to the main Julia repo (since the error appears inside LinearAlgebra), but the issue template there says to report issues with packages to their repos, so here I am, in the repo of the top package in the hierarchy. I guess I'll report this to Distances.jl as well.

@alyst
Copy link
Member

alyst commented Aug 4, 2021

Thanks for the detailed investigation!
If Integer indices are working, you may consider using them instead of UInt. Or maybe it could be fixed by adding Uint support to existing methods.
I also think reporting to Distances.jl is a good idea. The devs of Distances.jl should be more knowledgeable since they also contribute to Base.LinearAlgebra.

@ForceBru
Copy link
Author

ForceBru commented Aug 4, 2021

Yeah, I've already replaced UInts with regular Ints in my code, and now everything works fine. This is my second bug report involving unexpected behavior when switching from Int to UInt64 today...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants