[SPARK-24800][SQL] Refactor Avro Serializer and Deserializer #21762
Test build #92975 has finished for PR 21762 at commit
  def deserialize(data: Any): Any = converter(data)

  /**
   * Creates a writer to writer avro values to Catalyst values at the given ordinal with the given
nit: "a writer to write"
  def setInt(ordinal: Int, value: Int): Unit = set(ordinal, value)
  def setLong(ordinal: Int, value: Long): Unit = set(ordinal, value)
  def setDouble(ordinal: Int, value: Double): Unit = set(ordinal, value)
  def setFloat(ordinal: Int, value: Float): Unit = set(ordinal, value)
seems we don't need these default implementations
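For readers weighing this comment, here is a minimal sketch of the trade-off, not the code under review. `FieldUpdater`, `GenericUpdater`, and `LongArrayUpdater` are hypothetical names, and plain arrays stand in for Catalyst storage. The defaults let a generic subclass implement only `set`, while a specialised subclass overrides the primitive setters it can serve without boxing.

```scala
// Hypothetical names throughout; a sketch, not Spark's implementation.
trait FieldUpdater {
  // Generic setter: primitives are boxed when passed as Any.
  def set(ordinal: Int, value: Any): Unit

  // Default primitive setters delegate to the generic one, so a
  // subclass only overrides them when it can store values unboxed.
  def setInt(ordinal: Int, value: Int): Unit = set(ordinal, value)
  def setLong(ordinal: Int, value: Long): Unit = set(ordinal, value)
}

// Relies entirely on the defaults.
final class GenericUpdater(slots: Array[Any]) extends FieldUpdater {
  def set(ordinal: Int, value: Any): Unit = { slots(ordinal) = value }
}

// Overrides one primitive setter to write into unboxed storage.
final class LongArrayUpdater(slots: Array[Long]) extends FieldUpdater {
  def set(ordinal: Int, value: Any): Unit = value match {
    case l: Long => slots(ordinal) = l
    case other =>
      throw new IllegalArgumentException(s"expected Long, got $other")
  }
  override def setLong(ordinal: Int, value: Long): Unit = { slots(ordinal) = value }
}
```

If every concrete updater ends up overriding all primitive setters, the defaults are dead code and could be left abstract, which may be what the comment is getting at.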
   * This function takes an avro schema and returns a sql schema.
   */
-  def toSqlType(avroSchema: Schema): SchemaType = {
+  def toCatalystType(avroSchema: Schema): SchemaType = {
maybe we don't need to rename it?
Test build #93026 has finished for PR 21762 at commit
thanks, merging to master!
Currently the Avro Deserializer converts input Avro format data to `Row`, and then converts the `Row` to `InternalRow`, while the Avro Serializer converts `InternalRow` to `Row`, and then outputs Avro format data. This PR allows direct conversion between `InternalRow` and Avro format data.

Unit test

Author: Gengliang Wang <gengliang.wang@databricks.com>

Closes apache#21762 from gengliangwang/avro_io.

(cherry picked from commit 9603087)
What changes were proposed in this pull request?
Currently in the Avro data source module:
- The Avro Deserializer converts input Avro format data to `Row`, and then converts the `Row` to `InternalRow`.
- The Avro Serializer converts `InternalRow` to `Row`, and then outputs Avro format data.
This PR allows direct conversion between `InternalRow` and Avro format data.
How was this patch tested?
Unit test
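The direct-conversion idea in the summary can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration: `FieldType`, `TargetRow`, and `DirectConverter` are hypothetical stand-ins, not Spark's or Avro's real types. The point is only the shape: per-field writers are resolved once from the schema, and each record is then written straight into the target row with no intermediate `Row`-like object.

```scala
// Hypothetical stand-ins; not Spark's or Avro's actual classes.
sealed trait FieldType
case object IntField extends FieldType
case object TextField extends FieldType

// Plays the role of the internal row that is written to directly.
final class TargetRow(size: Int) {
  private val slots = new Array[Any](size)
  def update(ordinal: Int, value: Any): Unit = { slots(ordinal) = value }
  def get(ordinal: Int): Any = slots(ordinal)
}

object DirectConverter {
  type Writer = (TargetRow, Int, Any) => Unit

  // Resolve the per-field conversion once, from the schema,
  // rather than inspecting types again for every record.
  private def newWriter(fieldType: FieldType): Writer = fieldType match {
    case IntField =>
      (row, ordinal, value) => row.update(ordinal, value.asInstanceOf[Int])
    case TextField =>
      // Sources may hand over a CharSequence-like value; normalise to String.
      (row, ordinal, value) => row.update(ordinal, value.toString)
  }

  // Returns a function that converts one record directly into a TargetRow.
  def forSchema(schema: Seq[FieldType]): Seq[Any] => TargetRow = {
    val writers = schema.map(newWriter).toArray
    record => {
      val row = new TargetRow(writers.length)
      var i = 0
      while (i < writers.length) {
        writers(i)(row, i, record(i))
        i += 1
      }
      row
    }
  }
}
```

A converter built with `DirectConverter.forSchema(Seq(IntField, TextField))` can be reused for every record that shares that schema, so the type dispatch happens once rather than per value.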