Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that ArrowTypes.default is defined for Vararg tuples #466

Merged
merged 1 commit into from
Jun 13, 2023
Merged

Conversation

quinnj
Copy link
Member

@quinnj quinnj commented Jun 13, 2023

Fixes #461. This is the other proposal from what I originally suggested. Though Tuple{Varag} is never the concrete type of a value in Julia, it comes up when dealing with structs with fields that have Vararg types; not common at all, but it's allowed and happens.

The reflection code here is a little gross, but it seems to be what Base Julia uses in similar queries to see if a tuple has a vararg element. I've commented out the Arrow/runtests test for now until we bump another version and can then uncomment. Unit tests are added for ArrowTypes in the meantime.

Fixes #461. This is the other proposal from what I originally suggested.
Though `Tuple{Varag}` is never the concrete type of a value in Julia,
it comes up when dealing with structs with fields that have Vararg
types; not common at all, but it's allowed and happens.

The reflection code here is a little gross, but it seems to be what
Base Julia uses in similar queries to see if a tuple has a vararg
element. I've commented out the Arrow/runtests test for now until
we bump another version and can then uncomment. Unit tests are
added for ArrowTypes in the meantime.
@codecov-commenter
Copy link

codecov-commenter commented Jun 13, 2023

Codecov Report

Merging #466 (b56063e) into main (5532270) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #466      +/-   ##
==========================================
+ Coverage   87.53%   87.57%   +0.04%     
==========================================
  Files          26       26              
  Lines        3272     3276       +4     
==========================================
+ Hits         2864     2869       +5     
+ Misses        408      407       -1     
Impacted Files Coverage Δ
src/ArrowTypes/test/tests.jl 100.00% <ø> (ø)
src/ArrowTypes/src/ArrowTypes.jl 87.70% <100.00%> (+0.85%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@Moelf
Copy link
Contributor

Moelf commented Jun 13, 2023

just curious what does this serialize to in terms of Arrow types? I guess logically it's not different from a list since the length is not known (someone can serialize a column with different length of vargs in each row), thus not a fixed size list ni Arrow-land?

@quinnj
Copy link
Member Author

quinnj commented Jun 13, 2023

As I noted above, Tuple{Vararg} actually isn't a concrete type of any kind of value in Julia, so it doesn't really make sense to talk about what it translates to in the arrow format. At runtime, you will always have some kind of concrete NTuple{N, T}, based on how many elements there actually are. Obviously, this could be pretty tricky to serialize to arrow, however, if you had, for example, a Vector{Tuple{Vararg{String}}}, with different sized tuples. We could potentially try to switch to treating those as lists instead of fixed-size lists, but it's such a weird/oddball type that it doesn't seem worth it unless someone has a real use-case for it.

@Moelf
Copy link
Contributor

Moelf commented Jun 13, 2023

Tuple{Vararg} actually isn't a concrete type of any kind of value in Julia, so it doesn't really make sense to talk about what it translates to in the arrow format.

I don't see how this follows, surely we can talk about how much information/promise a type makes, and see what's the most precise Arrow type composition we can map to. In this case, if all you see is a Vararg{T}, it's making the same amount of guarantee (as far as serialization to/from bytes is concerned) as Vector{T} no?


We could potentially try to switch to treating those as lists instead of fixed-size lists

I guess right now we would error when people have a column of Vector{Vararg{T}} of varying length, that's fine, we can throw error and ask user to explicitly convert to Vector, somehow, which would be annoying if it's a field in a struct. For (a contrived) example:

julia> col1 = [v"2.0-rc1.x", v"2.0-rc1.x.y"]
2-element Vector{VersionNumber}:
 v"2.0.0-rc1.x"
 v"2.0.0-rc1.x.y"

Arrow types certainly have enough flexibility to represent these.

@quinnj quinnj merged commit 9c439ba into main Jun 13, 2023
@quinnj quinnj deleted the jq-461-2 branch June 13, 2023 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Issue with Union{Missing, VersionNumber}
3 participants