Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: iceberg schema airbyte metadata #48604

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

jdpgrailsdev
Copy link
Contributor

What

  • Add Airbyte metadata columns to schema
  • Actually detect insert/update/delete operation

How

  • Add Airbyte metadata columns to schema
  • Detect insert/update/delete when converting to Iceberg record

Review guide

  1. AirbyteTypeToIcebergSchema.kt
  2. IcebergUtil.kt

Can this PR be safely reverted and rolled back?

  • YES 💚
  • NO ❌

@jdpgrailsdev jdpgrailsdev requested a review from a team as a code owner November 21, 2024 19:27
Copy link

vercel bot commented Nov 21, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Nov 21, 2024 8:32pm

@octavia-squidington-iii octavia-squidington-iii added area/connectors Connector related issues CDK Connector Development Kit connectors/destination/iceberg-v2 labels Nov 21, 2024
@edgao
Copy link
Contributor

edgao commented Nov 21, 2024

}
}

private fun createAirbyteMetadataFields(): List<NestedField> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of these columns are being added to the schema using the mapper architecture. if you take a look at

val schema = pipeline.finalSchema.withAirbyteMeta(true).toIcebergSchema(primaryKeys)

In IcebergV2Writer. The withAirbyteMeta method is adding a bunch of em. If we want to add more like the generation id , etc. we should use the same design.

@jdpgrailsdev
Copy link
Contributor Author

this looks a lot like https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/bulk/core/load/src/main/kotlin/io/airbyte/cdk/load/data/AirbyteTypeToAirbyteTypeWithMeta.kt ? which we can maybe reuse in some way

@edgao It does and I wasn't sure how to reuse that given that it returns Airbyte types and not Iceberg types. I suspect what is really being pointed out here is some refactor to this in the CDK to make the common parts more common.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues CDK Connector Development Kit connectors/destination/iceberg-v2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants