Tracking issues of RFC: Object Versioning #2611
Comments
cc @drmingdrmer, would you like to try this feature after we release it?
Hm... I do not see any feature in my schedule that will be using versioning.
I'm working on a Nextcloud alternative (still several months from open sourcing) using OpenDAL, and I'm very interested to see this feature land. I'm primarily interested in the local file system, though. For example, I have servers running ZFS and BTRFS (Synology NAS), so being able to do native versioning on these would be great. https://gist.github.com/CMCDragonkai/1a4860671145b295fe7a4d8bc3968e87
Thanks for using OpenDAL! And looking forward to your project!
I am considering adding version support for local file systems. However, I have encountered a problem: POSIX file systems do not include concepts related to versions. This means that I cannot use the POSIX file system API to read or delete a specific version of a file in the file system.
I don't think POSIX or any other OS will ever support a generic one, since some systems use the concept of versioning while others use the concept of snapshots. ZFS and BTRFS use snapshots. I personally have snapshots running on my machine every 15 minutes, and old snapshots are cleaned up automatically. You can see a sample config for my dev Arch Linux machine here and my Ubuntu server here. Then with tools like httm we can look at different versions. To start, you could assume that admins take care of snapshotting while OpenDAL provides a way to view versions and fetch a file at a particular version. A later feature could be to implement a snapshot capability natively in OpenDAL.
Snapshots and versioning in S3 and other filesystems seem related to me, so it would be good to think about how both could be supported. I want to use my app to store important data and photos, so having some sort of versioning is critical. For now I have been assuming that I, as an admin, can just revert files myself, but being able to expose this natively in the app, if OpenDAL makes it easy, would be great so I don't need to be the middleman :).
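To make the "admins snapshot, the library only views versions" idea concrete, here is a minimal sketch. It relies on ZFS exposing read-only snapshots under `.zfs/snapshot/<name>/` at the dataset mountpoint. The function name `snapshot_versions` and its parameters are made up for illustration and are not part of OpenDAL. BTRFS has no fixed snapshot location, so the directory would have to be passed in via `snapshot_dir`.

```python
from pathlib import Path


def snapshot_versions(dataset_root, rel_path, snapshot_dir=".zfs/snapshot"):
    """Return (snapshot_name, path) pairs for distinct versions of rel_path.

    Hypothetical helper: consecutive snapshots in which the file has the
    same size and mtime are collapsed into the first one.
    """
    root = Path(dataset_root) / snapshot_dir
    if not root.is_dir():
        return []
    versions = []
    last_sig = None
    # Assumption: snapshot names sort chronologically (e.g. timestamped names).
    for snap in sorted(p for p in root.iterdir() if p.is_dir()):
        candidate = snap / rel_path
        if not candidate.is_file():
            continue
        st = candidate.stat()
        sig = (st.st_size, st.st_mtime_ns)
        if sig != last_sig:
            versions.append((snap.name, candidate))
            last_sig = sig
    return versions
```

Reading a given version is then just opening the returned path. Deleting a single version is not possible this way, since snapshots are read-only.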
https://x.com/criccomini/status/1705263488489394470 I think once we support object versioning and expose the corresponding methods in Python, we can provide support.
cc @criccomini Because we lacked clear user requirements, we have made slow progress on this feature so far. If you are willing to provide some suggestions, we will be able to release an initial implementation quickly.
Sure! I'll give you a concrete example. I want to build a storage layer for https://github.com/recap-build/recap. I'd like to provide four operations: ls, get, put, delete.
def ls(self, path: str | None = None, clock: int | None = None) -> list[str]
def write(
    self,
    path: str,
    val: str,
    clock: int | None = None,
)
def read(
    self,
    path: str,
    clock: int | None = None,
) -> str | None
def delete(self, path: str, clock: int | None = None)
The path param is a path like All four operations should support a The When all files in a "directory" are deleted, the "directory" automatically should disappear. This mimics object stores like S3. I'd like this to work for S3, GCS, Azure Blob Store, and local FS. Local FS is particularly complex since it doesn't have versioning. I experimented with this a bit. What I had was a completely flat single directory with URL-quoted and a clock suffix for each file:
NOTE: In the example I'm using URLs as the paths, but that is not a requirement for OpenDAL. This implementation works, and the write operations are O(1). Read operations slow down as the number of files increases, but that is fine for my use case. Caching and binary search (bisect) could be used to speed up the read operations, but I didn't bother with that. An implementation where the clock is ignored on local FS would be acceptable to me if you decide implementing local versioning would be too complex.
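A rough sketch of the flat-directory layout described above, for readers who want to try it. The original file-name example did not survive in this thread, so the `<quoted path>.<clock>` naming, the class name `FlatClockStore`, and the "newest version at or before `clock`" read semantics are all assumptions here, not the actual implementation. `delete` is left out because its clock semantics are not spelled out.

```python
import os
from urllib.parse import quote, unquote


class FlatClockStore:
    """Hypothetical flat-directory store: one file per (path, clock) pair."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _name(self, path, clock):
        # safe="" also quotes "/", so every key lands in the one flat directory.
        return f"{quote(path, safe='')}.{clock}"

    def _entries(self):
        # Linear scan of the directory: reads slow down as files accumulate.
        for name in os.listdir(self.root):
            quoted, _, clock = name.rpartition(".")
            yield unquote(quoted), int(clock), name

    def write(self, path, val, clock):
        # A single file creation, independent of how many versions exist.
        target = os.path.join(self.root, self._name(path, clock))
        with open(target, "w", encoding="utf-8") as f:
            f.write(val)

    def read(self, path, clock=None):
        # Assumed semantics: newest version at or before `clock`
        # (newest overall when clock is None).
        best = None
        for p, c, name in self._entries():
            if p == path and (clock is None or c <= clock):
                if best is None or c > best[0]:
                    best = (c, name)
        if best is None:
            return None
        with open(os.path.join(self.root, best[1]), encoding="utf-8") as f:
            return f.read()

    def ls(self, prefix="", clock=None):
        # Distinct logical paths under `prefix` with a version visible at `clock`.
        return sorted({p for p, c, _ in self._entries()
                       if p.startswith(prefix) and (clock is None or c <= clock)})
```

Because "directories" exist only as prefixes of quoted names, a directory disappears on its own once no file carries its prefix, which matches the object-store behaviour asked for above.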
Hi, @criccomini, thank you for sharing! It is possible to build the API upon OpenDAL, but it may not be related to the Let's take S3 as an example: the object version is generated by the S3 side, like However, As you suggested for fs, it is possible to achieve the same thing by encoding the clock in the file path and working in a similar manner. The benefits of using OpenDAL are that you only need to write that logic once. 😆
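One possible shape for "encoding the clock in the file path" that is independent of the backend: zero-pad the clock so that the lexicographic order of keys matches clock order, then use `bisect` (as mentioned earlier in the thread) on a sorted key listing. This is only a sketch. The `@` separator, `CLOCK_WIDTH`, and the function names are illustrative choices, and it assumes logical paths never contain `@`.

```python
from bisect import bisect_right

CLOCK_WIDTH = 20  # assumed fixed width; enough digits for any unsigned 64-bit clock


def encode_key(path, clock):
    # Zero-padding makes string order agree with numeric clock order.
    return f"{path}@{clock:0{CLOCK_WIDTH}d}"


def decode_key(key):
    path, _, clock = key.rpartition("@")
    return path, int(clock)


def resolve(sorted_keys, path, clock):
    """Return the key of the newest version of `path` at or before `clock`, else None.

    `sorted_keys` is a lexicographically sorted list of encoded keys, such as
    a sorted listing from any storage backend. Assumes no path contains "@".
    """
    # Every key <= encode_key(path, clock) sits to the left of index i.
    i = bisect_right(sorted_keys, encode_key(path, clock))
    if i == 0:
        return None
    candidate = sorted_keys[i - 1]
    return candidate if decode_key(candidate)[0] == path else None
```

The resolved key can then be passed to whatever read call the storage layer provides, so the clock logic is written once and the backend only ever sees plain paths.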
Yep, yep. Makes sense. Thanks for the reply. :)
Hi @suyanhanx, are you still interested in implementing this issue?
Yes. Any additional info?
We can implement support for S3 first and design some behavior tests.
Let's do this.
The current design does not account for
Wow, nice question. The The problem is that most storage services don't support querying metadata during the list operation. So, the mechanism is just a best effort and doesn't work well, leading most users to ignore it or use it incorrectly. I'm considering removing it in favor of other methods. As for
Ok. I'd like to implement it
If versioning is enabled, should we return
Hi, @Xuanwo, what do you think about this issue?
Yep, it's another API change that returns the object meta while Are you interested in submitting an RFC for this? I'm willing to review it and help you implement it.
I'd love to do it! :) But I want to finish this RFC first. In the
Thanks a lot!
Some storage services, such as Amazon S3, have built-in support for versioning.
This is achieved through a feature called ObjectVersion, which allows the same object to exist in multiple versions and be accessed separately even after deletion. With this feature, users can ensure the safety of their data by rolling back to previous versions in case of unintended deletions or changes.
To implement object versioning in OpenDAL, the following tasks need to be done:
- version(bool) in List, to include version during list or not
- version effective in the service
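To illustrate the behaviour these tasks target, here is a toy in-memory model of S3-style versioning. It is not OpenDAL's API and not an S3 client: the class `VersionedBucket`, the `v1`, `v2`, ... ids, and the `version` flag on `list` are stand-ins chosen to mirror the description above. Real services return opaque version ids.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of S3-style versioning (illustrative only)."""

    def __init__(self):
        self._versions = {}            # key -> list of (version_id, data or None)
        self._ids = itertools.count(1)

    def put(self, key, data):
        vid = f"v{next(self._ids)}"    # stand-in for a service-generated id
        self._versions.setdefault(key, []).append((vid, data))
        return vid

    def delete(self, key):
        # Deleting without a version id only adds a delete marker (data=None).
        return self.put(key, None)

    def get(self, key, version=None):
        history = self._versions.get(key, [])
        if version is None:
            return history[-1][1] if history else None
        for vid, data in history:
            if vid == version:
                return data
        return None

    def list(self, version=False):
        if version:
            # One entry per stored version, delete markers included.
            return [(k, vid) for k in sorted(self._versions)
                    for vid, _ in self._versions[k]]
        # Only keys whose latest version is not a delete marker.
        return [k for k in sorted(self._versions)
                if self._versions[k][-1][1] is not None]
```

After `delete`, a plain `get` finds nothing, but `get` with an earlier version id still returns the old data. That is the rollback property described in the issue text.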