[Feature Request] Enable Clone of Delta Lake tables #1387
Comments
Great that this is getting visibility, thank you @dennyglee. I think specifically That said, does this feature request cover porting the existing functionality from the core Databricks offering to OSS, or rather a new implementation from scratch? |
I think so @p2bauer - I think there is still an open debate on which one makes more sense (port or design). Any particular thoughts on approach? |
It would be great to have the |
Hello, I see on the roadmap (#1307) that shallow clones have been added in 2.3 - are there still plans to add deep clones? Edit: removed alternative question. I believe for the time being we are going to utilize something like:
|
What about
This is a manual alternative to DEEP COPY for now. It's not a complete solution; for example, we don't need to copy the entire _delta_log directory. |
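As a rough illustration of what a manual whole-directory copy can look like, here is a minimal sketch that assumes the table lives on a local filesystem and that nothing is writing to it during the copy. The function name `copy_table_dir` is hypothetical, and this is not a Delta Lake API. It copies everything, including the full _delta_log, which is exactly the overhead mentioned above.

```python
import shutil
from pathlib import Path


def copy_table_dir(source: str, target: str) -> Path:
    """Copy every file under `source` (data files and _delta_log) to `target`.

    Assumption: a local filesystem path and no concurrent writers.
    """
    src, dst = Path(source), Path(target)
    if not (src / "_delta_log").is_dir():
        raise ValueError(f"{src} does not look like a Delta table (no _delta_log)")
    # copytree raises FileExistsError if the target already exists,
    # which avoids silently merging two tables.
    shutil.copytree(src, dst)
    return dst
```

On object stores the equivalent would be a recursive copy with whatever tooling the store provides; the sketch only shows the shape of the operation.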
Interesting take. An aside for Databricks to consider implementing in Delta (currently our org just doesn't have the manpower to contribute to the project in any meaningful way), and one I might drop hints about at DAIS 2023 this week: I think for most organizations, this is typical, as older data generally becomes stale and is only necessary to keep for CYA and auditing reasons. Thus, we would be looking to implement a fall-off policy, only keeping versions like: 1 version every year for the past 7 years, 1 version every month for the last year, 1 version every week for the last 3 months, 1 version every day for the last 30 days. |
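The fall-off policy above could be sketched roughly as follows. This is only an illustration of the selection logic, not anything Delta Lake provides: the function name `versions_to_keep`, the input shape (pairs of version number and commit time), and the choice to keep the newest version in each bucket are all assumptions, as is approximating "3 months" by 90 days and "7 years" by 7 × 365 days.

```python
from datetime import datetime, timedelta

# Each tier: (maximum age, function mapping a commit time to its bucket key).
# Window lengths are approximations of the policy described above.
TIERS = [
    (timedelta(days=30), lambda t: ("day", t.date())),
    (timedelta(days=90), lambda t: ("week", tuple(t.isocalendar())[:2])),
    (timedelta(days=365), lambda t: ("month", t.year, t.month)),
    (timedelta(days=7 * 365), lambda t: ("year", t.year)),
]


def versions_to_keep(versions, now):
    """Return the sorted version numbers the fall-off policy would retain.

    `versions` is an iterable of (version_number, commit_datetime) pairs.
    Within every bucket of every tier, the newest version is kept.
    """
    newest_per_bucket = {}
    for version, ts in versions:
        age = now - ts
        for max_age, bucket_of in TIERS:
            if timedelta(0) <= age <= max_age:
                key = bucket_of(ts)
                best = newest_per_bucket.get(key)
                if best is None or ts > best[1]:
                    newest_per_bucket[key] = (version, ts)
    return sorted({v for v, _ in newest_per_bucket.values()})
```

Anything older than the longest window falls out of every tier and is dropped. Acting on the result (for example, cloning the kept versions elsewhere before history is cleaned up) is left out of the sketch.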
Hi @dennyglee , Any idea when |
Any news? |
Feature request
Enable Clone of Delta Lake tables
Overview
Clones a source Delta table to a target destination at a specific version. A clone can be either deep or shallow: deep clones copy over the data from the source and shallow clones do not.
Motivation
Cloning of Delta Lake tables supports use cases ranging from business continuity and disaster recovery to streamlining DevOps.
Further details
The context for this functionality can be found at https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-clone.html
Willingness to contribute
The Delta Lake Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature?