-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #5448: Implement initial Iceberg Connector using PyIceberg #14825
Fixes #5448: Implement initial Iceberg Connector using PyIceberg #14825
Conversation
The Python checkstyle failed. Please run You can install the pre-commit hooks with |
...c/src/main/resources/json/schema/entity/services/connections/database/icebergConnection.json
Outdated
Show resolved
Hide resolved
...c/src/main/resources/json/schema/entity/services/connections/database/icebergConnection.json
Outdated
Show resolved
Hide resolved
}, | ||
"warehouseLocation": { | ||
"title": "Warehouse Location", | ||
"description": "Warehouse Location. Used to specify a custom warehouse location.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe put an example for each value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of making IcebergCatalogFactory
either an instantiable class or just breaking it down with catalog_type_map
being a global and from_connection
being a simple function? I understand the intent but we don't really need that object if we won't use it as an object.
Do you see any advantage in that context of defining this piece of code directly in _init__.py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, to be honest I don't have a really strong opinion... Been messing around with both ways actually xD
I believe that trying to wrap things with specific classes tends to make the code a bit more readable because it gives you context straight away, but if we avoid doing that in this project I wouldn't mind having just a from_connection
function.
About defining it on init.py, for me it made sense as the IcebergCatalogFactory
is basically the entrypoint for the sub module.
Again if it's against our common practices I wouldn't mind defining it on another file and to make the import statement more directy we can "re-export" it on the init.py.
The Python checkstyle failed. Please run You can install the pre-commit hooks with |
The Python checkstyle failed. Please run You can install the pre-commit hooks with |
The Java checkstyle failed. Please run You can install the pre-commit hooks with |
Quality Gate passed for 'open-metadata-ui'Kudos, no new issues were introduced! 0 New issues |
Quality Gate passed for 'open-metadata-ingestion'Kudos, no new issues were introduced! 0 New issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @IceS2, for the effort here. Not an easy first connector to tackle.
Congratulations on your first contribution
Describe your changes:
Fixes #5448
Initial implementation on the Iceberg Connector, leveraging PyIceberg to connect to supported catalogs and extract Iceberg table Metadata.
Forcing some requirements to make it work with the rest of the dependencies. It's important that someone double checks it 👀
Notes
Covers
Does not (yet) Support
Type of change:
Checklist:
Fixes <issue-number>: <short explanation>
or decision-making process is reflected in the issue.