On a platform I work on, I decided to write Avro log files so I could easily close and append binary files to S3. I didn't want to bother transforming them to another format using Spark, which is the thing I wanted to drop in the first place, so I started writing what's required to read Avro as a data source in DataFusion.
Here is the branch on my fork (I merged the nested field PR into it, but that can be removed):
https://github.com/Igosuki/arrow-datafusion/tree/avro2_m
I converted all the Parquet test files to Avro and plan to add a test case for each of them.
My question: is Avro support desirable for DataFusion, or should I just make a sidecar crate on my own?
Describe alternatives you've considered
Transforming the data to JSON or Parquet to reuse the existing code.
Additional context
I'm new to the new Arrow data types, and it's been a challenge to figure out what to do with Avro union types that only express a nullable field. Ultimately I decided to map them to nullable fields and drop the union, but I had to add special cases here and there because of that.
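To make the union handling concrete, here is a minimal sketch of the mapping described above. It uses simplified stand-in types (`AvroSchema`, `DataType`, `Field`) and a hypothetical `to_field` helper, not the real Avro or Arrow crate APIs, and it is not the code from the branch. It assumes a union of `null` and exactly one other type collapses to a nullable field of that type, while any other union is kept as a union.

```rust
// Simplified stand-ins for illustration only; these are NOT the real
// Avro or Arrow types, just enough structure to show the mapping.
#[derive(Debug, Clone, PartialEq)]
enum AvroSchema {
    Null,
    Long,
    String,
    Union(Vec<AvroSchema>),
}

#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Null,
    Int64,
    Utf8,
    Union(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

// Hypothetical helper: convert one Avro schema node into a field.
fn to_field(name: &str, schema: &AvroSchema) -> Field {
    let plain = |data_type: DataType, nullable: bool| Field {
        name: name.to_string(),
        data_type,
        nullable,
    };
    match schema {
        AvroSchema::Null => plain(DataType::Null, true),
        AvroSchema::Long => plain(DataType::Int64, false),
        AvroSchema::String => plain(DataType::Utf8, false),
        AvroSchema::Union(variants) => {
            let non_null: Vec<&AvroSchema> = variants
                .iter()
                .filter(|v| **v != AvroSchema::Null)
                .collect();
            let has_null = non_null.len() < variants.len();
            if has_null && non_null.len() == 1 {
                // ["null", T]: drop the union and mark T as nullable.
                let mut field = to_field(name, non_null[0]);
                field.nullable = true;
                field
            } else {
                // Any other union is kept as a union of child fields.
                let children = variants
                    .iter()
                    .enumerate()
                    .map(|(i, v)| to_field(&format!("{}_{}", name, i), v))
                    .collect();
                plain(DataType::Union(children), has_null)
            }
        }
    }
}
```

With this shape, the special cases mostly come from the first branch: downstream code that reads values has to unwrap the union on the Avro side while treating the column as a plain nullable type on the Arrow side.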