Destination S3: add delta lake/delta table support #16322
Comments
Hi @mustafa-rmd, could you please edit your request to follow our feature request template? This will ensure all details are understood clearly. I've copied it below. Thank you!
Tell us about the problem you're trying to solve: What are you trying to do, and why is it hard? A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
Describe the solution you’d like: A clear and concise description of what you want to see happen, or the change you would like to see.
Describe the alternative you’ve considered or used: A clear and concise description of any alternative solutions or features you've considered or are using today.
Additional context: Add any other context or screenshots about the feature request here.
Are you willing to submit a PR? Remove this with your answer :-) |
@dennyglee Noted in your discussion that you're adding this to your roadmap. Just wanted to confirm that you're planning to contribute here? |
@misteryeo Yes, we are planning to contribute here - it may or may not be me personally, but feel free to ping me on this until we figure this out :) |
Hey @dennyglee is there any update on that? |
@dennyglee @mustafa-rmd Any updates on this by any chance? |
Hi @dennyglee @mustafa-rmd Any updates on this feature request? I am using Airbyte and Delta Lake in production, so I would love to see this destination connector become available as soon as possible. I'm willing to lend a hand if needed. |
Just want to chime in that I'm also interested in this! Edited to add that I'm interested in writing a delta table to S3. I'm not sure I'll end up making a PR for this, but for anyone else who wants the same thing it looks like a PR would have to be made here: https://github.com/airbytehq/airbyte/tree/0e9fdba1181b2d302b81a057f6fa16a198925eaa/airbyte-integrations/bases/base-java-s3/src/main/java/io/airbyte/integrations/destination/s3 You'd also have to make a PR here: https://github.com/airbytehq/airbyte/blob/0e9fdba1181b2d302b81a057f6fa16a198925eaa/airbyte-integrations/connectors/destination-s3/src/main/resources/spec.json |
Do we have any update on this feature request? |
My current requirement is to have the following data pipeline:
PostgreSQL (Source)
Airbyte
MinIO - S3 storage (Destination)
Apache Spark, configured with MinIO and Delta Lake formatting, since Spark on its own doesn't support ACID transactions.
The goal is to have Airbyte move data from PostgreSQL (Source) to MinIO storage (Destination), saved in Delta format. Spark will then read the data from S3, which it expects to be in Delta format.
My main issue is with the output format of the Airbyte S3 connector. Currently it only supports three output formats: CSV, Avro, and JSON Lines (JSONL).
What is the recommended way to solve this problem? I think many companies are trying to build this kind of data pipeline.
Is there a plan to release this feature in an upcoming release?
Should we implement this feature ourselves? If so, is there good documentation on how to get started?
Or is there another way of going about it?
Thanks,
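On the question of another method: until the connector can write Delta directly, one possible interim approach is to let Airbyte write JSONL to MinIO and have a Spark job rewrite those files as a Delta table. The following is only a minimal sketch under stated assumptions, not an Airbyte feature. The function names, endpoint, credentials, and bucket paths are hypothetical, and it assumes `pyspark` is installed and the `delta-spark` and `hadoop-aws` packages are on Spark's classpath.

```python
def delta_minio_spark_conf(endpoint, access_key, secret_key):
    """Build Spark settings for Delta Lake on an S3-compatible store such as MinIO.

    Hypothetical helper: the keys are standard Delta Lake and Hadoop S3A
    settings, but the values are placeholders to adapt to your deployment.
    """
    return {
        # Enable Delta Lake SQL support and the Delta catalog.
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
        # Point the S3A filesystem at MinIO instead of AWS.
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # MinIO is usually addressed with path-style URLs.
        "spark.hadoop.fs.s3a.path.style.access": "true",
    }


def convert_jsonl_to_delta(spark, source_path, target_path, mode="overwrite"):
    """Read JSONL files (as written by the S3 destination) and save them as a Delta table.

    `spark` is an existing SparkSession; both paths are assumed s3a:// URIs.
    """
    df = spark.read.json(source_path)  # Spark reads JSON Lines by default
    df.write.format("delta").mode(mode).save(target_path)


# Example wiring (commented out so the sketch can be imported without Spark):
# from pyspark.sql import SparkSession
# builder = SparkSession.builder.appName("jsonl-to-delta")
# for key, value in delta_minio_spark_conf(
#         "http://minio:9000", "ACCESS_KEY", "SECRET_KEY").items():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()
# convert_jsonl_to_delta(spark, "s3a://airbyte-raw/my_stream/", "s3a://lake/my_stream_delta/")
```

The default `overwrite` mode keeps reruns from duplicating rows, at the cost of rewriting the table each time. An incremental sync would need `append` or a Delta merge instead, which this sketch does not cover.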