files cache: index by physical extents to support reflinks & snapshots #2743
Comments
According to [1] there is no way to find physical extents (the backing element of reflinks) without either risking data corruption (when btrfs compression is used) or writing code that parses btrfs data structures. Apart from that it could be incorporated into the files cache ( |
Would the new reverse mapping (rmapbt) support on XFS be of any help for identifying CoW files? https://lwn.net/Articles/695290/ |
I don't see a problem on XFS, apart from this being a rather fickle business overall. It's the btrfs issue I linked to above that seems problematic to me (there needs to be a reliable way to detect compressed files/extents to work around it). coreutils hints at problems with ext4, though the comments are old, so this is possibly fixed. I'm going to be straight here and say that this won't be implementable casually. I'd estimate that implementing this will take 1-2 developer weeks, i.e. quite an effort. |
Understood. Thanks for taking the time to answer. Those new XFS features (reflink + rmapbt) are still marked as experimental anyway. I'm going to give them a try in a controlled environment and see how they perform. By the time XFS reflink goes primetime, more people might express a need for such a feature in borg. I find it a good balance between btrfs with its flaky RAID support and ZFS, which is not available straight in the kernel. Cheap snapshots + borg for real dedup backup is probably going to be my next workhorse. |
Some more pointers from the xfs folks:
|
With the release of Linux 4.16, the XFS reflink feature is no longer tagged EXPERIMENTAL. |
@jcharaoui Thanks for pointing this out. I have used those features for nearly a year on my photo hard drive and haven't seen any problems. I believe the rmapbt feature is also no longer experimental (I'm running Linux 4.16.9), since I no longer see those red warnings when mounting my external drive, which has both reflink and rmapbt enabled. |
Interested in this while watching the first borg backup of a btrfs-based container pool take forever :) |
https://gist.github.com/charles-dyfis-net/bfb0e30862f04957d020afe0ff8b093b may be of interest to those here -- invoking Not maintained, not recently tested, not documented at time of development and use, very much YMMV. |
For my use case I'm now investigating https://github.com/systemd/casync, which has btrfs reflink support (don't know if it works on xfs.) I hit a few bugs, but worked out fixes for a couple (systemd/casync#235, systemd/casync#237 - now merged) and found work-arounds for the other two (systemd/casync#239, systemd/casync#240.) |
At the risk of plugging a project I'm a contributor to, I strongly suggest also looking into https://github.com/folbricht/desync. casync may have gotten better over time, but back when desync was started, casync's error handling was atrocious; and desync very much does presently support reflinks when content already exists in another, local |
I'd looked at desync and thought it didn't support reflinking, but it seems I might be mistaken .. will revisit! casync .. does have some issues. |
Depending on when you looked it may not have; but it most definitely does today. See https://github.com/folbricht/desync/blob/4a8700c059471d5f005dd7c9a957072bb1fa5c8a/fileseed.go#L123-L126 |
Yup, just found that - I only looked the other day but I think I'd mis-parsed the beginning of the README. Going off topic here, but quickly - desync doesn't seem to support ACLs in the catar stream, but does now implement xattrs - does that mean it can restore ACLs from catar streams that it generates itself? |
Couldn't say. I'm a regular user and occasional contributor, but I don't use that particular functionality. Frank does have a gitter chat room, though -- I'd suggest asking in there. |
XFS has implemented some exciting new (experimental) features, such as reflink, which allows instant CoW snapshots similar to those found on btrfs. I don't think there is a plan to support send/receive commands like on ZFS, so the dedup function is pretty much limited to the local filesystem.
I'm planning to use the following scheme for my external USB drive, which contains my photos and which I usually back up to a second drive or network storage using borg. This applies to both btrfs and the new reflink-enabled XFS.
While on the move and without network connectivity or access to my borg drive, I can, before working on the pictures, perform instant pseudo snapshots (not like the snapshots found on btrfs) without eating precious hard drive space.
# cp -r --reflink=always /mnt/usbdrive/photos /mnt/usbdrive/snapshots/snap01
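For anyone who wants to script this rather than shell out to cp, here is a minimal Python sketch of the same idea for a single file. It assumes Linux and the FICLONE ioctl number used on common architectures such as x86 and ARM; `clone_or_copy` is a hypothetical helper name, not part of borg or coreutils. Unlike `--reflink=always`, it falls back to a plain copy when the filesystem cannot share extents, which is closer to `--reflink=auto`.

```python
import fcntl
import shutil

# FICLONE from <linux/fs.h>, i.e. _IOW(0x94, 9, int).
# Assumption: this is the encoding used on x86/ARM; some architectures differ.
FICLONE = 0x40049409


def clone_or_copy(src, dst):
    """Reflink src to dst if possible, otherwise copy the bytes.

    Returns "reflink" when the kernel accepted the clone request and
    "copy" when a regular data copy was made instead.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the filesystem to make dst share src's extents.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return "reflink"
        except OSError:
            # No reflink support here (or src and dst are on different
            # filesystems): fall back to an ordinary copy.
            shutil.copyfileobj(fsrc, fdst)
            return "copy"
```

On a reflink-enabled btrfs or XFS volume the call should report "reflink" and use no extra data blocks; anywhere else it quietly degrades to a normal copy.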
When I'm back home I can connect my photo drive and my backup drive to my PC and initiate a borg backup.
# borg create /mnt/backupdrive/borg::borg-snap1 /mnt/usbdrive
So the borg-snap1 snapshot will contain all the snapshots I made while away from home, plus the working directory. Since borg doesn't know about the reflink feature, it will rescan every file in every snapshot on my photo drive, thinking they are new files. It will ultimately find that the corresponding dedup blocks are already known, so it will effectively not copy each btrfs/xfs pseudo snapshot over again. I tried it: my photo directory plus many snapshots of the same pictures took up pretty much the size of the photo directory alone in the borg snapshot, which is great.
I was wondering if there would be a way to add a feature to borg that detects such reflink-enabled filesystems (btrfs/xfs), so that it would immediately know that a file found in a btrfs/xfs directory is a duplicate of an existing known one and therefore reuse the same dedup blocks.
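To illustrate why the repeated snapshots cost almost nothing in the repository but still cost time to read, here is a toy content-addressed chunk store in Python. It is only a sketch under simplifying assumptions: it uses fixed-size chunks and SHA-256, whereas borg uses content-defined chunking, and `ChunkStore` is a made-up name, not a borg class.

```python
import hashlib

CHUNK_SIZE = 4096  # assumption: fixed-size chunks, chosen only for simplicity


class ChunkStore:
    """Toy deduplicating store: each distinct chunk is kept exactly once."""

    def __init__(self):
        self.chunks = {}  # chunk id (hex digest) -> chunk bytes

    def add(self, data):
        """Store data and return (chunk_ids, new_bytes_written)."""
        ids = []
        new_bytes = 0
        for off in range(0, len(data), CHUNK_SIZE):
            chunk = data[off:off + CHUNK_SIZE]
            cid = hashlib.sha256(chunk).hexdigest()
            if cid not in self.chunks:
                # First time this content is seen: it costs storage.
                self.chunks[cid] = chunk
                new_bytes += len(chunk)
            ids.append(cid)
        return ids, new_bytes
```

Adding the same photo a second time (for example from snap01) returns the same chunk ids and writes zero new bytes, but the whole file still had to be read and hashed to find that out. That reading and hashing is the work a reflink-aware cache would avoid.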
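As a rough sketch of what such a lookup could look like, the cache below is keyed by a file's physical extents instead of its path. Everything here is hypothetical: `ExtentCache` is not borg's files cache, and the extent list is assumed to be supplied by the caller (for example from a FIEMAP query), which per the maintainer's comments above is exactly the part that is hard to do reliably on btrfs with compression.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Extent = Tuple[int, int]                    # (physical_offset, length) in bytes
ExtentKey = Tuple[int, Tuple[Extent, ...]]  # (device id, ordered extents)


class ExtentCache:
    """Map a file's physical extent layout to already-known chunk ids."""

    def __init__(self) -> None:
        self._by_extents: Dict[ExtentKey, List[str]] = {}

    def lookup(self, dev: int, extents: Sequence[Extent]) -> Optional[List[str]]:
        """Return cached chunk ids if this exact extent layout was seen."""
        if not extents:
            # Empty or inline files have no extents to match on.
            return None
        return self._by_extents.get((dev, tuple(extents)))

    def remember(self, dev: int, extents: Sequence[Extent],
                 chunk_ids: Sequence[str]) -> None:
        """Record the chunk ids produced for a file with these extents."""
        if extents:
            self._by_extents[(dev, tuple(extents))] = list(chunk_ids)
```

Two reflinked copies of a photo would report the same extents and hit the same entry, so the second one could reuse the chunk list without being read. A real implementation would also need to invalidate entries when physical blocks are freed and reused, which this sketch does not attempt.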