add Arrow.jl conversion methods so that Samples/SamplesInfo can be (de)serialized as Arrow values #68

Merged (10 commits) on Mar 31, 2021

@jrevels jrevels commented Mar 14, 2021

Originally this required apache/arrow-julia#150. It now requires (and is periodically being updated against) apache/arrow-julia#156, which supersedes apache/arrow-julia#150.

jrevels commented Mar 30, 2021

Looks like upgrading to Arrow 1.3 surfaced a DataFrames bug fixed upstream by JuliaData/DataFrames.jl#2682?

At least, I think that's what's going on...locally using DataFrames' main branch fixes the failure here. I guess we need to wait for an upstream DataFrames tag of that bug fix for CI to pass here...

@jrevels jrevels requested a review from ericphanson March 31, 2021 02:12
@ericphanson ericphanson left a comment

exciting!

So I guess now we can use Arrow as a storage format for signal data in addition to metadata if we wanted...

end

function Arrow.ArrowTypes.fromarrow(::Type{<:Samples}, arrow_data, arrow_info, arrow_encoded)
    info = SamplesInfo(arrow_info; validate=false)
Member

I wonder if validate=false is the right choice? I guess we can assume it's properly formulated since usually we will be validating on construction and then serializing/deserializing

Member Author

> I guess we can assume it's properly formulated since usually we will be validating on construction and then serializing/deserializing

Yeah, this is the assumption we're making
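
For anyone reading along, here is a minimal sketch of that assumption in plain Julia, with no Arrow dependency. `ToyInfo`, `validate_info`, `fields`, and `rebuild` are hypothetical stand-ins invented for illustration; they are not Onda's `SamplesInfo` or this PR's actual code. The idea is only that validation runs when a value is first constructed, so the rebuild step after a round trip can skip it.

```julia
# Hypothetical stand-in for a metadata type (NOT Onda's actual SamplesInfo).
struct ToyInfo
    channels::Vector{String}
    sample_rate::Float64
    # Validation is on by default; a trusted caller can opt out.
    function ToyInfo(channels, sample_rate; validate::Bool=true)
        info = new(channels, sample_rate)
        validate && validate_info(info)
        return info
    end
end

# Hypothetical validity checks, invented for this sketch.
function validate_info(info::ToyInfo)
    info.sample_rate > 0 || throw(ArgumentError("sample_rate must be positive"))
    allunique(info.channels) || throw(ArgumentError("channel names must be unique"))
    return nothing
end

# "Serialize" to plain fields, then rebuild the way a deserializer might.
fields(info::ToyInfo) = (channels=info.channels, sample_rate=info.sample_rate)
rebuild(nt) = ToyInfo(nt.channels, nt.sample_rate; validate=false)

original = ToyInfo(["c3", "c4"], 128.0)      # validated on construction
roundtripped = rebuild(fields(original))     # validation skipped on the way back
```

The trade-off is that a value written by some other producer (or constructed with `validate=false` in the first place) passes through unchecked, so a caller that does not trust the source would need to run the checks explicitly after deserializing.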

jrevels commented Mar 31, 2021

> So I guess now we can use Arrow as a storage format for signal data in addition to metadata if we wanted...

Yup :) Though this is more useful for enabling convenient (de)serialization of individual sample data segments to/from Arrow (either for storage or IPC)

If you wanted to use Arrow as a storage format for whole sample data files w/ Onda, it'd make more sense to add an AbstractLPCMFormat for your Arrow <-> LPCM mapping (this would be especially useful for storing in planar format; xref https://github.com/beacon-biosignals/OndaFormat/issues/8)
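
To make the planar point concrete, here is a small sketch in plain Julia of the two sample orderings. It assumes a channels-by-samples matrix and an interleaved default layout; the variable names are made up for illustration, and it does not use or represent the `AbstractLPCMFormat` interface itself.

```julia
# Toy channels-by-samples matrix: 2 channels, 3 sample points.
data = Int16[1 2 3; 10 20 30]

# Interleaved: all channels for sample 1, then all channels for sample 2, ...
# Julia arrays are column-major, so `vec` of a channels-by-samples matrix gives this order.
interleaved = vec(data)

# Planar: every sample of channel 1, then every sample of channel 2, ...
planar = vec(permutedims(data))

# Recover the matrix from the planar buffer, given the channel count.
n_channels = size(data, 1)
restored = permutedims(reshape(planar, :, n_channels))
```

In a planar buffer each channel occupies one contiguous run, which is why it maps naturally onto a column-per-channel layout such as an Arrow table.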
