docs(samples): Dataflow to Apache Iceberg with dynamic destinations #9645
Here is the summary of changes. You are about to add 2 region tags.
This comment is generated by snippet-bot.
@code-review-assist review
This pull request adds a snippet demonstrating dynamic destinations with Apache Iceberg in Dataflow and refactors integration tests. The changes are well-structured and address a relevant documentation bug. Here are a few suggestions based on the Google Java Style Guide:
- Use diamond operators for type inference.
- Declare `Map` and `ImmutableMap` using their interfaces.
- Consider using `List.of()` for creating immutable lists when possible.
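The three suggestions above can be sketched together as follows. This is a minimal illustration, not code from the pull request: the class name, method name, and sample values are hypothetical, and `List.of()` assumes Java 9 or later.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example class; nothing here is taken from the pull request.
final class StyleSuggestionsSketch {

  static Map<String, List<String>> namesByAirport() {
    // Declared through the Map interface; the diamond operator lets the
    // compiler infer the type arguments of the HashMap.
    Map<String, List<String>> result = new HashMap<>();
    // List.of() returns an unmodifiable list.
    result.put("ORD", List.of("Alice", "Bob"));
    result.put("SYD", List.of("Carol"));
    return result;
  }

  public static void main(String[] args) {
    System.out.println(namesByAirport());
  }
}
```

Declaring the variable as `Map` rather than `HashMap` keeps callers independent of the concrete implementation, which is the point of the second suggestion.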
@code-review-assist walkthrough
Dataflow to Apache Iceberg Samples Walkthrough
This repository contains Java code samples demonstrating how to interact with Apache Iceberg tables using Apache Beam and Google Cloud Dataflow. The primary goal is to provide clear, concise examples for developers to integrate Dataflow pipelines with Iceberg for data ingestion and processing.
Repository Structure
The repository is structured as follows:
Code Walkthrough
Let's trace the execution flow of the
Integration Tests
The integration tests in
Pull Request Impact
This pull request introduces a significant enhancement by adding support for dynamic destinations in Apache Iceberg. The updated
Potential Improvements
While the PR is well-structured, adding more comprehensive error handling and logging could further improve its robustness. Consider adding checks for null or empty values in the input data and handling potential exceptions during table creation or data writing. More detailed logging would aid in debugging and troubleshooting.
Description
Add snippet for Iceberg dynamic destinations
Add a snippet that shows the use of dynamic destinations (apache/beam#32365, "[Task]: Add utilities to easily implement portable dynamic destinations") when writing to Apache Iceberg from Dataflow. This is a new feature in Beam 2.60.
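To make the idea of a dynamic destination concrete, here is a rough, standard-library-only sketch of routing a record to a table whose name is derived from one of its fields. It assumes the destination is written as a template string with `{field}` placeholders; the template `flights-{airport}`, the field names, and the `resolveTable` helper are all hypothetical and are not Beam APIs or code from this pull request. Beam's own implementation may differ.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of dynamic destinations; not Beam's implementation.
final class DynamicDestinationSketch {

  // Matches a placeholder such as {airport} and captures the field name.
  private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^{}]+)\\}");

  // Replaces every {field} placeholder in the template with that field's
  // value from the record, producing the destination table name.
  static String resolveTable(String template, Map<String, Object> record) {
    Matcher matcher = PLACEHOLDER.matcher(template);
    StringBuilder resolved = new StringBuilder();
    while (matcher.find()) {
      String field = matcher.group(1);
      Object value = record.get(field);
      if (value == null) {
        throw new IllegalArgumentException("Record has no value for field: " + field);
      }
      // quoteReplacement keeps '$' and '\' in the value from being
      // interpreted as regex replacement syntax.
      matcher.appendReplacement(resolved, Matcher.quoteReplacement(value.toString()));
    }
    matcher.appendTail(resolved);
    return resolved.toString();
  }

  public static void main(String[] args) {
    // Assumed sample record; the field names are made up for illustration.
    Map<String, Object> record = Map.of("id", 0, "name", "Alice", "airport", "ORD");
    System.out.println(resolveTable("flights-{airport}", record)); // prints "flights-ORD"
  }
}
```

Because each record can resolve to a different table name, a pipeline that writes this way can produce any number of tables, which is why tests should not assume exactly one destination.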
Refactor integration tests to remove the assumption of exactly 1 destination table.
Relevant doc bug: b/371047621
Checklist
- pom.xml parent set to latest shared-configuration
- mvn clean verify (required)
- mvn -P lint checkstyle:check (required)
- mvn -P lint clean compile pmd:cpd-check spotbugs:check (advisory only)