@the-other-tim-brown
Contributor

the-other-tim-brown commented Dec 29, 2025

Describe the issue this Pull Request addresses

Part 1 of #14280.
This PR removes direct usage of the Avro schema in the Flink client, with the exception of the Flink Record objects, which will be covered in #17689.

Summary and Changelog

  • Removes the AvroSchemaConverter and updates all calls to go through HoodieSchemaConverter
  • Updates the RowDataToAvroConverters converter interface to use HoodieSchema
  • Updates RowDataAvroQueryContexts to use HoodieSchema and renames the class to RowDataQueryContexts

Impact

Updates the client to use our new schema system, which lets us add new types that are not available in Avro.

Risk Level

Low

Documentation Update

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

github-actions bot added the size:L label (PR with lines of changes in (300, 1000]) on Dec 29, 2025
}

@Test
void testUnionSchemaWithMultipleRecordTypes() {
Contributor Author

These tests are copied from TestAvroSchemaConverter.

github-actions bot added the size:XL label (PR with lines of changes > 1000) and removed the size:L label (PR with lines of changes in (300, 1000]) on Dec 29, 2025
return rowDataQueryContext.getFieldQueryContext(column).getValAsJava(data, allowsNull);
}

private Object getColumnValue(Schema recordSchema, String column, Properties props) {
Collaborator

Can we just change the parameter to take a HoodieSchema, so that we don't have to call fromAvroSchema on line 204? Or does that result in large changes?

Contributor Author

This is a private method, so it is fine, but as mentioned in the description, the record classes are handled in a separate ticket.

return getColumnValueAsJava(recordSchema, column, props, true);
}

private Object getColumnValueAsJava(Schema recordSchema, String column, Properties props, boolean allowsNull) {
Collaborator

}

@Override
public Option<HoodieAvroIndexedRecord> toIndexedRecord(Schema recordSchema, Properties props) {
Collaborator

Contributor Author

As mentioned in the PR description, the Record interface will not be updated as part of this.

the-other-tim-brown marked this pull request as ready for review on December 29, 2025 19:16
@the-other-tim-brown
Contributor Author

@danny0405 can you give this a review?

the-other-tim-brown force-pushed the flink-client-schema-migration branch from 02ba0c6 to 6bc7f5c on December 31, 2025 19:06
@hudi-bot
Collaborator

hudi-bot commented Jan 1, 2026

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build
