Getting rid of internally soft-linked hard-links #855
Comments
Guess we need to keep the compat code until we do another major release that requires running an upgrade procedure anyway. So, tag this "2.0"?
Ape had the good idea of doing this via recreate, i.e. removing all the hardlink_master cruft, implementing a clean solution, and only keeping it in recreate (where it's one of the simpler variants, especially compared with diff!). I'd say this would become a feasible option for 1.2 or 1.3 if recreate has proven reliable in 1.1.
See also #1473. One reason why the problem there occurs relatively early is that the chunk list of a file is contained in the ITEM, making the item big (the other reason is having extremely many items). If we moved the chunk list into an INODE (and referenced the inode objects from the item), #1473 would be very much relaxed, as the item metadata stream would shrink a lot. Also, for this ticket here, we could reference the same INODE object from multiple ITEMs to model hardlinks in a natural way.

Note that an INODE cannot just be 1 storage object (MAX_OBJECT_SIZE = 20MiB), as that only stores ~500,000 object references. With ~2MiB per file content chunk this would mean a file size limit of ~1TB, which is too low. So, we could have a primary (small) list of object IDs in the ITEM, where each of these objects contains a secondary list of references to content objects, so we get n * 1TB.

An optimization could be done to avoid the indirection for small files: just have the primary list point directly to content objects (as it is now). This could also be the "compatibility mode".

Note: I talked about INODE above. In UNIX filesystems, usually the metadata of the file (except the name) is also stored in the INODE. We could discuss doing that, or we could just implement the block list part of an INODE.
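To make the two-level idea above concrete, here is a minimal sketch. It is a toy model, not Borg code: `ToyStore`, `write_inode`, `read_inode`, the JSON encoding, and `ASSUMED_REF_SIZE` (about 40 bytes per serialized reference) are all assumptions made for illustration. Only MAX_OBJECT_SIZE = 20MiB and the ~2MiB chunk size come from the comment.

```python
import hashlib
import json

MAX_OBJECT_SIZE = 20 * 1024 * 1024    # 20 MiB, as stated in the comment
ASSUMED_REF_SIZE = 40                 # assumption: bytes per serialized reference
ASSUMED_CHUNK_SIZE = 2 * 1024 * 1024  # ~2 MiB per content chunk

# Back-of-the-envelope limit for a single-object chunk list.
refs_per_object = MAX_OBJECT_SIZE // ASSUMED_REF_SIZE
one_level_limit = refs_per_object * ASSUMED_CHUNK_SIZE  # bytes


class ToyStore:
    """Hypothetical content-addressed object store (stand-in for a repository)."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        oid = hashlib.sha256(data).hexdigest()
        self.objects[oid] = data
        return oid

    def get(self, oid: str) -> bytes:
        return self.objects[oid]


def write_inode(store, chunks, refs_per_list):
    """Store the chunks and return the id of an inode object describing them."""
    chunk_ids = [store.put(c) for c in chunks]
    if len(chunk_ids) <= refs_per_list:
        # Small file: the primary list points directly at content objects.
        inode = {"indirect": False, "ids": chunk_ids}
    else:
        # Large file: the primary list points at secondary lists of chunk ids.
        secondary_ids = [
            store.put(json.dumps(chunk_ids[i:i + refs_per_list]).encode())
            for i in range(0, len(chunk_ids), refs_per_list)
        ]
        inode = {"indirect": True, "ids": secondary_ids}
    return store.put(json.dumps(inode).encode())


def read_inode(store, inode_id):
    """Reassemble the file content referenced by an inode object."""
    inode = json.loads(store.get(inode_id))
    if inode["indirect"]:
        chunk_ids = [cid for sid in inode["ids"] for cid in json.loads(store.get(sid))]
    else:
        chunk_ids = inode["ids"]
    return b"".join(store.get(cid) for cid in chunk_ids)


store = ToyStore()
inode_id = write_inode(store, [b"aa", b"bb", b"cc", b"dd", b"ee"], refs_per_list=2)
# Two items referencing the same inode model a hardlink pair.
items = [{"path": "dir/file", "inode": inode_id},
         {"path": "dir/hardlink", "inode": inode_id}]
```

In this sketch neither item is a "master": any subset of `items` can be read on its own via `read_inode`, which is the property the INODE proposal is after.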
Closing in favour of #2325.
Today in the Borg "Getting rid of…" show: soft-linked hard-links.
This distinction between "regular files" and "regular files with nlink>1" has been a bit of a troublemaker in various places, because it makes it hard to work on subsets of all items. The original solution with the 'source' attribute is nice, because it avoids storing the chunk id list twice, and because it makes it straightforward to link all links together when extracting (the full archive, not a subset).
When working with subsets this solution fails and we kludged stuff together to make it work, but it ain't nice.
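A toy illustration of why subsets are awkward with the 'source' approach. This is a simplified model, not the actual Borg item format: the dict layout, the `path` and `chunks` keys, and `resolve_chunks` are assumptions for illustration; only the 'source' attribute is taken from the description above.

```python
# The first link seen stores the chunk id list; later links only name it via 'source'.
archive = [
    {"path": "a/original", "chunks": ["id1", "id2"]},
    {"path": "b/link", "source": "a/original"},
]


def resolve_chunks(selected):
    """Map each selected path to its chunk ids, or None if they cannot be found."""
    by_path = {item["path"]: item for item in selected}
    resolved = {}
    for item in selected:
        if "chunks" in item:
            resolved[item["path"]] = item["chunks"]
        else:
            # The chunk list lives on the item named by 'source'.
            master = by_path.get(item["source"])
            resolved[item["path"]] = master["chunks"] if master else None
    return resolved


full = resolve_chunks(archive)
subset = resolve_chunks([item for item in archive if item["path"].startswith("b/")])
```

With the full archive both paths resolve to the same chunk list. With the subset, `b/link` has no chunk list available unless the code looks outside the selection, which is the kind of special-casing described above.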
Ideas