feat: merge operation #1522
ACTION NEEDED delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. |
@Blajda @wjones127 does that mean the full table will be loaded in memory first so it can rewrite the whole table? |
Yes since this is using a nested loop join both the source and target need to be fully loaded into memory. There would need to be follow up improvements to perform pruning and to use a Hash Join / Merge Join. |
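To illustrate the difference mentioned above, here is a minimal plain-Python sketch of the two join strategies. It is not the DataFusion or delta-rs implementation; the function names and the row-as-dict representation are assumptions made for illustration only.

```python
from collections import defaultdict


def nested_loop_join(source, target, key):
    """Compare every source row with every target row.

    Cost grows with len(source) * len(target), and both inputs
    must be fully available before the join can finish.
    """
    return [(s, t) for s in source for t in target if s[key] == t[key]]


def hash_join(source, target, key):
    """Build a hash table on one side, then probe it with the other.

    Cost grows roughly with len(source) + len(target).
    """
    index = defaultdict(list)
    for t in target:
        index[t[key]].append(t)
    # .get avoids inserting empty lists for keys that have no match.
    return [(s, t) for s in source for t in index.get(s[key], [])]


source = [{"id": 1, "v": "new-1"}, {"id": 3, "v": "new-3"}]
target = [{"id": 1, "v": "old-1"}, {"id": 2, "v": "old-2"}]

# Both strategies produce the same matched pairs; only the cost differs.
print(nested_loop_join(source, target, "id") == hash_join(source, target, "id"))
```

Both functions return the same pairs for an equality predicate, so switching strategies would be a performance change rather than a behavioural one.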
Good to know, still useful for datasets that fit in memory then. Hopefully it can be |
I don't think a feature flag is necessary, but we could label it "experimental" maybe. I think as long as the docs give up-to-date notes on limitations this should be fine. |
Thanks @wjones127 for taking the time to review this. I'll break future enhancements into smaller chunks to make it easier. |
Hoping that merging main will resolve some of the failures, which seem to be related to recently added deletion vector fields... |
Head branch was pushed to by a user without write access
Updated the code to add the new dv field. Hopefully it passes all tests now. |
There is one more failure - probably related to a mypy update or similar - we could just Other than that we are good to go :) |
😕 Seems to fail with the ignore and without. check-python passes locally for me |
yea, this is a bit annoying :( - unfortunately I do not have permissions to disable or bypass required checks ... Should we try once more without? Local checks are passing for me as well without the comment. Alternatively we could try |
or maybe even both, not sure how make works. i.e. would it fail early if now the ruff check is complaining? otherwise |
Head branch was pushed to by a user without write access
@roeap @wjones127 Please provide approval on the workflows. The lint change should hopefully allow everything to pass now. |
@roeap @wjones127 can you approve the workflow, so this can get merged in the next release? |
@wjones127 Please approve the workflow. Merged with the latest changes in main. |
@wjones127 Thanks for reviewing and merging. |
Description
Implement the Merge operation using Datafusion.
Currently the implementation rewrites the entire DeltaTable. Limiting the files that are rewritten will be done in future work.
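
As a rough illustration of what "rewrites the entire DeltaTable" means for an upsert-style merge, here is a minimal plain-Python sketch. It is not the code in this PR: `merge_upsert` is a hypothetical helper, rows are modelled as dicts, and source keys are assumed to be unique.

```python
def merge_upsert(target, source, key):
    """Return a full new copy of the table after an upsert-style merge.

    Every target row is written to the output, including rows that did
    not match anything in the source, which mirrors a full rewrite.
    Assumes each key appears at most once in `source`.
    """
    source_by_key = {row[key]: row for row in source}
    matched = set()
    output = []
    for row in target:
        k = row[key]
        if k in source_by_key:
            # Matched: source values overwrite the target values.
            output.append({**row, **source_by_key[k]})
            matched.add(k)
        else:
            # Not matched: the row is unchanged but still rewritten.
            output.append(row)
    # Source rows with no match in the target are inserted.
    for k, row in source_by_key.items():
        if k not in matched:
            output.append(row)
    return output


target = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
source = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
print(merge_upsert(target, source, "id"))
```

In this sketch the row with `id` 1 is untouched by the merge yet still appears in the rewritten output; skipping such rows (or whole files of them) is the kind of pruning the description defers to future work.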
Related Issue(s)
Documentation