
CoW File System for Storing Backups

gaborigloi edited this page Mar 16, 2018 · 5 revisions
| Feature | Zbox | Btrfs |
|---|---|---|
| Deduplication | ✔️ Built-in, ✔️ probably in-band (synchronous) ❔, ✔️ content-based data chunk and whole-file deduplication | Tools for batch (out-of-band), ✔️ content-based data chunk and whole-file deduplication |
| Encryption | ✔️ | |
| COW semantics | ✔️ | ✔️ |
| ACID Transaction | ✔️ | CoW snapshots are atomic |
| Query File Checksum | ✖️ | See https://stackoverflow.com/questions/32761299/btrfs-ioctl-get-file-checksums-from-userspace https://www.spinics.net/lists/linux-btrfs/msg41687.html https://unix.stackexchange.com/questions/191754/how-do-i-view-the-btrfs-checksum-of-a-file |
| Programming Language-Independent | ✖️ | ✔️ |
| Windows Support | ✖️ not yet | ✖️ |
| Sparse Files | ✔️ | yes, using truncate https://wiki.archlinux.org/index.php/Sparse_file |
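
The sparse-file entry above mentions creating them with truncate. As a minimal sketch of what that means in practice (the helper names `create_sparse_file` and `allocated_bytes` are made up for illustration, and the 512-byte unit for `st_blocks` is the usual Linux convention, assumed here), the following extends a file to a given logical size without writing any data:

```python
import os


def create_sparse_file(path: str, size: int) -> None:
    """Create (or overwrite) a file whose logical length is `size` bytes.

    No data is written, so a filesystem that supports sparse files
    does not have to allocate blocks for the hole.
    """
    with open(path, "wb") as f:
        f.truncate(size)


def allocated_bytes(path: str) -> int:
    """Return the space actually allocated on disk for `path`.

    Assumption: st_blocks is counted in 512-byte units, as on Linux.
    """
    return os.stat(path).st_blocks * 512
```

On a filesystem with sparse-file support, `allocated_bytes` should report far less than `os.path.getsize` for a freshly created file; the exact figure depends on the filesystem.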

Advantages of using a CoW filesystem:

  • We do not need to coalesce the incremental backups ourselves. We get it for free.
  • We can easily remove intermediate backups by just deleting the corresponding files.
  • We can checksum the whole VDI immediately, without coalescing first, and compare the result to the checksum on the server side. This gives us an extra check on the integrity of the backups.
  • We get data chunk deduplication for free. This means that if there is some duplication in the VDI's data itself, hopefully we'll use even less space than the physical size of the VDI.
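
The whole-VDI checksum comparison mentioned in the list could look roughly like the sketch below. It assumes the backup is an ordinary file on the CoW filesystem and that the server reports a SHA-256 hex digest. The page does not say which hash is used, so the algorithm and the helper names `file_sha256` and `backup_matches` are illustrative only.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so a large VDI never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(path: str, server_checksum: str) -> bool:
    """Compare the local backup's digest with the (assumed hex) server-side digest."""
    return file_sha256(path) == server_checksum.strip().lower()
```

Because the filesystem presents each backup as a complete file, the hash can be computed directly over it, with no coalescing step beforehand.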