Research: Arvo format #857
Replies: 3 comments
-
Hi, I just stumbled upon frictionlessdata. I had been thinking about an idea similar to datapackage.json for a while, for a pet project of mine (a data exchange platform). It would be good if datapackage could support Avro as an option for a resource, since Avro's schema evolution helps with keeping long-term data whose schema might change over time. Avro is also less error-prone than CSV when dealing with dirty data. Additionally, Avro schemas support complex nested map/array types. Thanks.
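To make the schema evolution point concrete, here is a minimal sketch of how a field added later with a default stays readable against older data. The two schemas, the `Measurement` record and the `resolve_record` helper are hypothetical, written for illustration only. The helper imitates just one part of Avro's schema resolution rules (flat records, added fields with defaults, dropped fields), not a real Avro library.

```python
# Hypothetical schemas for illustration; not taken from any real dataset.
# The "writer" schema is what old data was written with.
WRITER_SCHEMA = {
    "type": "record",
    "name": "Measurement",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "value", "type": "double"},
    ],
}

# The "reader" schema is the newer version: it adds a field with a default.
READER_SCHEMA = {
    "type": "record",
    "name": "Measurement",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "value", "type": "double"},
        {"name": "unit", "type": "string", "default": "unknown"},
    ],
}


def resolve_record(record, writer_schema, reader_schema):
    """Project a decoded flat record onto the reader schema.

    Simplified sketch: fields missing from the writer schema take the
    reader's default, and fields unknown to the reader are dropped.
    """
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            resolved[name] = record[name]
        elif "default" in field:
            resolved[name] = field["default"]
        else:
            raise ValueError(f"no value or default for field {name!r}")
    return resolved


old_record = {"id": 1, "value": 2.5}
new_view = resolve_record(old_record, WRITER_SCHEMA, READER_SCHEMA)
```

Here `new_view` carries the `unit` default even though the old record never stored it, which is the property that makes long-lived data easier to keep readable.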
-
@kagesenshi thanks for the feedback; we'll definitely keep this in mind. Note that Data Package already supports Avro resources. The open question is how Avro would integrate with Table Schema.
-
+1 for Frictionless providing guidance on the table schemas (both inlining the schema and pointing to an avsc file by URL may have benefits?) (nit: fixing the typo "Arvo" in this issue's title would aid its discoverability)
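As a starting point for such guidance, here is a rough sketch of deriving Table Schema fields from the contents of an avsc file. The type mapping is an assumption made for illustration and is not an official Frictionless mapping; `AVRO_TO_TABLE_SCHEMA`, `field_type` and `avro_to_table_schema` are hypothetical names. Nested records and maps are collapsed to `object`, so nested structure is lost in this direction.

```python
# Assumed mapping from Avro primitive types to Table Schema types.
# This is an illustrative guess, not an official Frictionless mapping.
AVRO_TO_TABLE_SCHEMA = {
    "boolean": "boolean",
    "int": "integer",
    "long": "integer",
    "float": "number",
    "double": "number",
    "string": "string",
    "bytes": "string",  # assumption: bytes would need an encoding such as base64
}


def field_type(avro_type):
    """Return (table_schema_type, nullable) for one Avro type."""
    if isinstance(avro_type, list):  # Avro union, e.g. ["null", "string"]
        non_null = [t for t in avro_type if t != "null"]
        nullable = len(non_null) < len(avro_type)
        if len(non_null) == 1:
            return field_type(non_null[0])[0], nullable
        return "any", nullable  # multi-branch unions have no direct equivalent
    if isinstance(avro_type, dict):  # complex or annotated type
        kind = avro_type["type"]
        if kind in ("record", "map"):
            return "object", False
        if kind == "array":
            return "array", False
        return field_type(kind)
    # Unknown names (enum, fixed, named references) fall back to "any".
    return AVRO_TO_TABLE_SCHEMA.get(avro_type, "any"), False


def avro_to_table_schema(avro_schema):
    """Build a Table Schema descriptor from an Avro record schema (a dict)."""
    fields = []
    for avro_field in avro_schema["fields"]:
        ts_type, nullable = field_type(avro_field["type"])
        field = {"name": avro_field["name"], "type": ts_type}
        if not nullable:
            field["constraints"] = {"required": True}
        fields.append(field)
    return {"fields": fields}


# Hypothetical avsc content, as it would look after json.load().
example_avsc = {
    "type": "record",
    "name": "Example",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "tags", "type": {"type": "array", "items": "string"}},
        {"name": "note", "type": ["null", "string"]},
    ],
}
table_schema = avro_to_table_schema(example_avsc)
```

Whether the avsc content is inlined in the descriptor or fetched from a URL, the same function applies once the JSON is loaded into a dict.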
-
http://avro.apache.org/docs/current/
New data format. Supported by e.g. Google BigQuery (in beta). Some similarities with TDP, e.g. a JSON schema format. Overall not that similar: it is focused on optimizations, especially ones designed for HDFS (e.g. block size).
Schema
Types
https://avro.apache.org/docs/1.8.1/trevni/spec.html