Use Iceberg generic record to replace Avro IndexedRecord and add a translation to/from data type in Flink #870

@waterlx

Description

The current implementation uses Avro as the intermediate format: the input arrives as a Java Map, is serialized to an Avro IndexedRecord (the intermediate format), and Parquet.write (the Iceberg API) is then called to write it into a Parquet file.
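For reference, the Parquet.write call mentioned above can consume Iceberg generic records directly, which is what makes the Avro intermediate step removable. Below is a minimal sketch of that target flow, assuming Iceberg's data API and a local output file; the schema, values, and path are illustrative, and deliberately exercise the built-in types listed below (a decimal, a date, and a map with non-string keys):

```java
import java.io.File;
import java.math.BigDecimal;
import java.time.LocalDate;
import java.util.Collections;

import org.apache.iceberg.Files;
import org.apache.iceberg.Schema;
import org.apache.iceberg.data.GenericRecord;
import org.apache.iceberg.data.Record;
import org.apache.iceberg.data.parquet.GenericParquetWriter;
import org.apache.iceberg.io.FileAppender;
import org.apache.iceberg.parquet.Parquet;
import org.apache.iceberg.types.Types;

import static org.apache.iceberg.types.Types.NestedField.required;

public class GenericRecordWriteSketch {
  public static void main(String[] args) throws Exception {
    // Schema exercising built-in types the Avro path handled poorly:
    // a decimal, a date, and a map with non-string (int) keys.
    Schema schema = new Schema(
        required(1, "id", Types.LongType.get()),
        required(2, "amount", Types.DecimalType.of(10, 2)),
        required(3, "event_date", Types.DateType.get()),
        required(4, "counts", Types.MapType.ofRequired(
            5, 6, Types.IntegerType.get(), Types.LongType.get())));

    // Build an Iceberg generic record directly, with no Avro IndexedRecord in between.
    GenericRecord record = GenericRecord.create(schema);
    record.setField("id", 1L);
    record.setField("amount", new BigDecimal("19.99"));
    record.setField("event_date", LocalDate.of(2019, 3, 1));
    record.setField("counts", Collections.singletonMap(7, 42L));

    // Hand the record straight to Parquet.write; GenericParquetWriter supplies
    // the writer function for Iceberg generic records.
    try (FileAppender<Record> appender =
        Parquet.write(Files.localOutput(new File("/tmp/sketch.parquet")))
            .schema(schema)
            .createWriterFunc(GenericParquetWriter::buildWriter)
            .build()) {
      appender.add(record);
    }
  }
}
```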

  1. We need to remove that dependency and use Iceberg generic records/built-in types (maps with non-string keys, decimals, date/time types, ...) as the input (addressed; see the sketch above).
  • Remove AvroSerializer and its implementations (renamed to RecordSerializer instead, since we still need a serializer to transform the input format into an Iceberg Record; a possible shape is sketched after this list).
  • Remove AvroUtils (addressed).
  • Remove the serializer setting in IcebergSinkAppender (kept, as we still need a serializer here).
  2. Add the translation to/from Flink native formats, like we did with InternalRow for Spark (will be addressed later; a hypothetical direction is sketched after this list).

  3. FileWriter#write() could accept other convenient input types in addition to Iceberg generic records. See this comment. (Will be addressed later; currently the code only supports Iceberg Record as input.)
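
Since the serializer is kept but renamed, a plausible shape for RecordSerializer, sketched here hypothetically since the issue doesn't show the interface, is a small serializable type that turns the sink's input into an Iceberg Record:

```java
import java.io.Serializable;
import java.util.Map;

import org.apache.iceberg.Schema;
import org.apache.iceberg.data.GenericRecord;
import org.apache.iceberg.data.Record;

// Hypothetical replacement for AvroSerializer: instead of producing an
// Avro IndexedRecord, it turns the sink's input type T into an Iceberg Record.
public interface RecordSerializer<T> extends Serializable {
  Record serialize(T input, Schema schema);
}

// Example implementation for the Java Map input mentioned above.
class MapRecordSerializer implements RecordSerializer<Map<String, Object>> {
  @Override
  public Record serialize(Map<String, Object> input, Schema schema) {
    Record record = GenericRecord.create(schema);
    input.forEach(record::setField); // copy each map entry into the record by field name
    return record;
  }
}
```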
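For item 2, a hypothetical direction (not part of this change; the class and method names here are invented for illustration) would mirror the Spark InternalRow path by translating a Flink Row into an Iceberg Record positionally:

```java
import org.apache.flink.types.Row;
import org.apache.iceberg.Schema;
import org.apache.iceberg.data.GenericRecord;
import org.apache.iceberg.data.Record;

// Hypothetical sketch of the deferred Flink translation: copy a Flink Row
// into an Iceberg Record field by field, by position.
public class FlinkRowConverter {
  public static Record toIcebergRecord(Row row, Schema schema) {
    Record record = GenericRecord.create(schema);
    for (int i = 0; i < schema.columns().size(); i++) {
      // Assumes the Row's value types already match Iceberg's generic
      // representations (e.g. BigDecimal for decimals, LocalDate for dates).
      record.set(i, row.getField(i));
    }
    return record;
  }
}
```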
