unsupported ARROW:extension:name type: "JuliaLang.Nothing" #132

Closed
ericphanson opened this issue Feb 20, 2021 · 1 comment

ericphanson commented Feb 20, 2021

I ran into a weird situation: I saved an Arrow table consisting of a single column of one custom type on one computer (WSL 2 on Windows, if it matters), then deserialized it on another (M1 MacBook with Julia running under Rosetta), registering the type each time, and got a bunch of warnings:

┌ Warning: unsupported ARROW:extension:name type: "JuliaLang.Nothing"
└ @ Arrow.ArrowTypes ~/.julia/packages/Arrow/Re9EM/src/arrowtypes.jl:174

followed by a related error (ERROR: MethodError: Cannot convert an object of type NamedTuple{(), Tuple{}} to an object of type Symbol).

I believe I used Arrow 1.2.4 in each case. Manually defining

julia> Arrow.ArrowTypes.JULIA_TO_ARROW_TYPE_MAPPING[Nothing] = ("JuliaLang.Nothing", Nothing)
("JuliaLang.Nothing", Nothing)

julia> Arrow.ArrowTypes.ARROW_TO_JULIA_TYPE_MAPPING["JuliaLang.Nothing"] = (Nothing, Nothing)
(Nothing, Nothing)

then led to the deserialization succeeding (without warnings or errors, and the table seems correct).

When I try serializing and deserializing the exact same type with the exact same setup locally to produce a MWE, I don't get this error. The actual table is here: https://www.icloud.com/iclouddrive/0fFusdT5pfUior3qaXJ5u3uxA#all_pkgs_results (3 MB). The code to load it is the following:

using Pkg
pkg"add https://github.com/giordano/AnalyzeRegistry.jl#990686aaaafc0449151f1adb340cba4c29fe1788" # current master
using Arrow, AnalyzeRegistry
Arrow.ArrowTypes.registertype!(AnalyzeRegistry.Package, AnalyzeRegistry.Package)
load(path) = copy(Arrow.Table(path).packages)
results = load("all_pkgs_results.arrow")

Here's the output I get:

julia> 
┌ Warning: unsupported ARROW:extension:name type: "JuliaLang.Nothing"
└ @ Arrow.ArrowTypes ~/.julia/packages/Arrow/Re9EM/src/arrowtypes.jl:174
┌ Warning: unsupported ARROW:extension:name type: "JuliaLang.Nothing"
└ @ Arrow.ArrowTypes ~/.julia/packages/Arrow/Re9EM/src/arrowtypes.jl:174
┌ Warning: unsupported ARROW:extension:name type: "JuliaLang.Nothing"
└ @ Arrow.ArrowTypes ~/.julia/packages/Arrow/Re9EM/src/arrowtypes.jl:174
┌ Warning: unsupported ARROW:extension:name type: "JuliaLang.Nothing"
└ @ Arrow.ArrowTypes ~/.julia/packages/Arrow/Re9EM/src/arrowtypes.jl:174
┌ Warning: unsupported ARROW:extension:name type: "JuliaLang.Nothing"
└ @ Arrow.ArrowTypes ~/.julia/packages/Arrow/Re9EM/src/arrowtypes.jl:174
ERROR: MethodError: Cannot `convert` an object of type NamedTuple{(), Tuple{}} to an object of type Symbol
Closest candidates are:
  convert(::Type{T}, ::T) where T at essentials.jl:205
  Symbol(::Any...) at strings/basic.jl:229
Stacktrace:
  [1] convert(#unused#::Type{Union{Nothing, Symbol}}, x::NamedTuple{(), Tuple{}})
    @ Base ./some.jl:36
  [2] cvt1
    @ ./essentials.jl:322 [inlined]
  [3] macro expansion
    @ ./ntuple.jl:74 [inlined]
  [4] ntuple
    @ ./ntuple.jl:69 [inlined]
  [5] convert(#unused#::Type{Tuple{String, Symbol, Union{Nothing, Symbol}, Int64, Int64, Int64, Int64}}, x::Tuple{String, Symbol, NamedTuple{(), Tuple{}}, Int64, Int64, Int64, Int64})
    @ Base ./essentials.jl:323
  [6] Tuple{String, Symbol, Union{Nothing, Symbol}, Int64, Int64, Int64, Int64}(nt::NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}})
    @ Base ./namedtuple.jl:136
  [7] convert(#unused#::Type{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Nothing, Symbol}, Int64, Int64, Int64, Int64}}}, nt::NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}})
    @ Base ./namedtuple.jl:131
  [8] setindex!
    @ ./array.jl:839 [inlined]
  [9] _unsafe_copyto!(dest::Vector{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Nothing, Symbol}, Int64, Int64, Int64, Int64}}}, doffs::Int64, src::Vector{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}}}, soffs::Int64, n::Int64)
    @ Base ./array.jl:235
 [10] unsafe_copyto!
    @ ./array.jl:289 [inlined]
 [11] _copyto_impl!
    @ ./array.jl:313 [inlined]
 [12] copyto!
    @ ./array.jl:299 [inlined]
 [13] copyto!
    @ ./array.jl:325 [inlined]
 [14] copyto_axcheck!
    @ ./abstractarray.jl:1050 [inlined]
 [15] Vector{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Nothing, Symbol}, Int64, Int64, Int64, Int64}}}(x::Vector{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}}})
    @ Base ./array.jl:540
 [16] convert
    @ ./array.jl:532 [inlined]
 [17] AnalyzeRegistry.Package(name::String, uuid::Base.UUID, repo::String, subdir::String, reachable::Bool, docs::Bool, runtests::Bool, github_actions::Bool, travis::Bool, appveyor::Bool, cirrus::Bool, circle::Bool, drone::Bool, buildkite::Bool, azure_pipelines::Bool, gitlab_pipeline::Bool, license_files::Vector{NamedTuple{(:license_filename, :licenses_found, :license_file_percent_covered), Tuple{String, Vector{String}, Float64}}}, licenses_in_project::Vector{String}, lines_of_code::Vector{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}}})
    @ AnalyzeRegistry ~/.julia/dev/AnalyzeRegistry/src/AnalyzeRegistry.jl:20
 [18] getindex
    @ ~/.julia/packages/Arrow/Re9EM/src/arraytypes/struct.jl:44 [inlined]
 [19] copyto_unaliased!(deststyle::IndexLinear, dest::Vector{AnalyzeRegistry.Package}, srcstyle::IndexLinear, src::Arrow.Struct{AnalyzeRegistry.Package, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.FixedSizeList{Base.UUID, Vector{UInt8}}, Arrow.List{String, Int32, Vector{UInt8}}, Arrow.List{String, Int32, Vector{UInt8}}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.List{Vector{NamedTuple{(:license_filename, :licenses_found, :license_file_percent_covered), Tuple{String, Vector{String}, Float64}}}, Int32, Arrow.Struct{NamedTuple{(:license_filename, :licenses_found, :license_file_percent_covered), Tuple{String, Vector{String}, Float64}}, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.List{Vector{String}, Int32, Arrow.List{String, Int32, Vector{UInt8}}}, Arrow.Primitive{Float64, Vector{Float64}}}}}, Arrow.List{Vector{String}, Int32, Arrow.List{String, Int32, Vector{UInt8}}}, Arrow.List{Vector{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}}}, Int32, Arrow.Struct{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}}, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.List{Symbol, Int32, Vector{UInt8}}, Arrow.DenseUnion{Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Tuple{Arrow.Struct{Union{Missing, NamedTuple{(), Tuple{}}}, Tuple{}}, Arrow.List{Symbol, Int32, Vector{UInt8}}}}, Arrow.Primitive{Int64, Vector{Int64}}, Arrow.Primitive{Int64, Vector{Int64}}, Arrow.Primitive{Int64, Vector{Int64}}, Arrow.Primitive{Int64, Vector{Int64}}}}}}})
    @ Base ./abstractarray.jl:964
 [20] copyto!(dest::Vector{AnalyzeRegistry.Package}, src::Arrow.Struct{AnalyzeRegistry.Package, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.FixedSizeList{Base.UUID, Vector{UInt8}}, Arrow.List{String, Int32, Vector{UInt8}}, Arrow.List{String, Int32, Vector{UInt8}}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.List{Vector{NamedTuple{(:license_filename, :licenses_found, :license_file_percent_covered), Tuple{String, Vector{String}, Float64}}}, Int32, Arrow.Struct{NamedTuple{(:license_filename, :licenses_found, :license_file_percent_covered), Tuple{String, Vector{String}, Float64}}, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.List{Vector{String}, Int32, Arrow.List{String, Int32, Vector{UInt8}}}, Arrow.Primitive{Float64, Vector{Float64}}}}}, Arrow.List{Vector{String}, Int32, Arrow.List{String, Int32, Vector{UInt8}}}, Arrow.List{Vector{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}}}, Int32, Arrow.Struct{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}}, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.List{Symbol, Int32, Vector{UInt8}}, Arrow.DenseUnion{Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Tuple{Arrow.Struct{Union{Missing, NamedTuple{(), Tuple{}}}, Tuple{}}, Arrow.List{Symbol, Int32, Vector{UInt8}}}}, Arrow.Primitive{Int64, Vector{Int64}}, Arrow.Primitive{Int64, Vector{Int64}}, Arrow.Primitive{Int64, Vector{Int64}}, Arrow.Primitive{Int64, Vector{Int64}}}}}}})
    @ Base ./abstractarray.jl:944
 [21] copymutable(a::Arrow.Struct{AnalyzeRegistry.Package, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.FixedSizeList{Base.UUID, Vector{UInt8}}, Arrow.List{String, Int32, Vector{UInt8}}, Arrow.List{String, Int32, Vector{UInt8}}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.List{Vector{NamedTuple{(:license_filename, :licenses_found, :license_file_percent_covered), Tuple{String, Vector{String}, Float64}}}, Int32, Arrow.Struct{NamedTuple{(:license_filename, :licenses_found, :license_file_percent_covered), Tuple{String, Vector{String}, Float64}}, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.List{Vector{String}, Int32, Arrow.List{String, Int32, Vector{UInt8}}}, Arrow.Primitive{Float64, Vector{Float64}}}}}, Arrow.List{Vector{String}, Int32, Arrow.List{String, Int32, Vector{UInt8}}}, Arrow.List{Vector{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}}}, Int32, Arrow.Struct{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}}, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.List{Symbol, Int32, Vector{UInt8}}, Arrow.DenseUnion{Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Tuple{Arrow.Struct{Union{Missing, NamedTuple{(), Tuple{}}}, Tuple{}}, Arrow.List{Symbol, Int32, Vector{UInt8}}}}, Arrow.Primitive{Int64, Vector{Int64}}, Arrow.Primitive{Int64, Vector{Int64}}, Arrow.Primitive{Int64, Vector{Int64}}, Arrow.Primitive{Int64, Vector{Int64}}}}}}})
    @ Base ./abstractarray.jl:1075
 [22] copy(a::Arrow.Struct{AnalyzeRegistry.Package, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.FixedSizeList{Base.UUID, Vector{UInt8}}, Arrow.List{String, Int32, Vector{UInt8}}, Arrow.List{String, Int32, Vector{UInt8}}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.BoolVector{Bool}, Arrow.List{Vector{NamedTuple{(:license_filename, :licenses_found, :license_file_percent_covered), Tuple{String, Vector{String}, Float64}}}, Int32, Arrow.Struct{NamedTuple{(:license_filename, :licenses_found, :license_file_percent_covered), Tuple{String, Vector{String}, Float64}}, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.List{Vector{String}, Int32, Arrow.List{String, Int32, Vector{UInt8}}}, Arrow.Primitive{Float64, Vector{Float64}}}}}, Arrow.List{Vector{String}, Int32, Arrow.List{String, Int32, Vector{UInt8}}}, Arrow.List{Vector{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}}}, Int32, Arrow.Struct{NamedTuple{(:directory, :language, :sublanguage, :files, :code, :comments, :blanks), Tuple{String, Symbol, Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Int64, Int64, Int64, Int64}}, Tuple{Arrow.List{String, Int32, Vector{UInt8}}, Arrow.List{Symbol, Int32, Vector{UInt8}}, Arrow.DenseUnion{Union{Missing, NamedTuple{(), Tuple{}}, Symbol}, Tuple{Arrow.Struct{Union{Missing, NamedTuple{(), Tuple{}}}, Tuple{}}, Arrow.List{Symbol, Int32, Vector{UInt8}}}}, Arrow.Primitive{Int64, Vector{Int64}}, Arrow.Primitive{Int64, Vector{Int64}}, Arrow.Primitive{Int64, Vector{Int64}}, Arrow.Primitive{Int64, Vector{Int64}}}}}}})
    @ Base ./abstractarray.jl:1019
 [23] load(path::String)
    @ Main ./REPL[3]:1
 [24] top-level scope
    @ REPL[55]:1

As I said before, manually defining how to deserialize Nothing gives the right results. Also, if you don't register Package at all, you correctly get a vector of NamedTuples.
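Judging by the stack trace, the `sublanguage` values that were written as `nothing` seem to come back as empty `NamedTuple`s once the "JuliaLang.Nothing" extension name is not recognized, and the `Package` constructor then cannot convert them to `Union{Nothing, Symbol}`. The snippet below is only a minimal sketch of that last step using Base alone. It does not touch Arrow, and the variable name `placeholder` is made up for illustration.

```julia
# An empty NamedTuple has the type shown in the trace: NamedTuple{(), Tuple{}}.
placeholder = NamedTuple()

# Converting it to the field type the struct expects fails, as in frame [1].
err = try
    convert(Union{Nothing, Symbol}, placeholder)
catch e
    e
end
@show err isa MethodError

# A genuine `nothing` or a Symbol converts without trouble.
@show convert(Union{Nothing, Symbol}, nothing) === nothing
@show convert(Union{Nothing, Symbol}, :Markdown) === :Markdown
```

This would be consistent with the manual mapping for Nothing making the error disappear, since the values are then read back as `nothing` rather than as an empty struct.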

(By the way, it's a nice table: info for every package in the General registry. Fun fact: there are 6.9M lines of Julia-language source code in General!)

quinnj commented Mar 25, 2021

Ok, on the #156 PR, this now works for me:

using AnalyzeRegistry, Arrow
# Arrow extension name under which Package values are stored
const NAME = Symbol("JuliaLang.AnalyzeRegistry.Package")
# Writing: map the Julia type to the extension name
ArrowTypes.arrowname(::Type{AnalyzeRegistry.Package}) = NAME
# Reading: map the extension name back to the Julia type
ArrowTypes.JuliaType(::Val{NAME}, S) = AnalyzeRegistry.Package
save(path, packages) = Arrow.write(path, (; packages))
load(path) = copy(Arrow.Table(path).packages)
# personal_token is assumed to be defined beforehand
auth = AnalyzeRegistry.github_auth(personal_token)
results = analyze(find_packages("DataFrames", "Flux"); auth=auth);
save("packages.arrow", results)
res = load("packages.arrow")

quinnj closed this as completed Mar 25, 2021