Learn ZFS the easy way with this lab environment where disk failures and corruption can be simulated!
- Clone this repo
- Run `vagrant up`
  - This will create a VM with 2 cores and 1 GB of RAM.
  - 10 1 GB virtual disks will be created in the directory `/disks/`.
- Run `vagrant ssh`
  - All scripts are in `/vagrant/bin/`, which is in the `$PATH`.
  - Optionally run `zfs-lab-create` to create a ZFS pool and filesystem in `/zfs/lab1/`.
This project works by creating 10 1 GB files in the `/disks/` directory on the VM, named `disk0` through `disk9`. Because devices in UNIX have file paths, there's no reason that files themselves cannot be treated as devices. This means that entire ZFS filesystems can be created using these files as stand-ins for multiple physical disks.
Obviously you don't want to use this for production, but having an array of files-pretending-to-be-disks works great for learning how to administer a ZFS filesystem, simulating hardware failures and disk corruption, etc.
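As a rough illustration of the files-as-disks idea, the sketch below creates a few sparse 1 GB files with `truncate`. It is not necessarily how the repo's own `zfs-add-disk-file` script works; the scratch directory and the `DISK_DIR` variable are assumptions, chosen so the lab's real `/disks/` directory is left alone.

```shell
# Hypothetical scratch directory (not the lab's real /disks/).
DISK_DIR=$(mktemp -d)

# Create three sparse 1 GB files named disk0..disk2.
# Sparse files report their full size but only consume space as data is written.
for i in 0 1 2; do
    truncate -s 1G "$DISK_DIR/disk$i"
done

# Show the resulting "disks".
ls -l "$DISK_DIR"
```

Inside the VM, a file created this way could then be handed to `zpool create` by its absolute path, much like a real device.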
This README includes a list of the included utilities and how they are used, along with some sample exercises to help you become more familiar with ZFS.
This repo ships with a number of utilities to automate and semi-automate work related to playing around with ZFS. All scripts are in `/vagrant/bin/` in the VM, which is also in your path:
- `break-disk` - Used to break a specific disk file to simulate disk failure.
- `create` - Create a sample file full of X's. Useful for testing file corruption, as the corruption will be obvious when viewing the file with `less`.
- `corrupt` - Corrupt a file at a certain offset with a certain character for a certain length, optionally repeated a certain number of times. Useful for running against disks in `/disks/` to simulate disk corruption.
- `populate-zfs-filesystem` - Uses `create` to populate a target directory with a series of files and directories with 1 MB files. Useful for testing files for corruption when corrupting a ZFS disk.
- `sha1-save-files` - Compute SHA1 hashes recursively of a directory and its files and save the results in `/data/`.
- `sha1-check-files` - Check against previously computed hashes recursively and look for corruption. Output is written in `diff` format.
- `truncate-disk` - Used to truncate a specific disk image to a specific number of bytes. Useful to simulate disk failures.
- `zfs-lab-create` - Stand up a ZFS pool in `/zfs/`, a ZFS filesystem in `/zfs/lab1/`, and populate it with files with `populate-zfs-filesystem`.
- `zfs-lab-destroy` - Remove a ZFS pool and filesystem created with `zfs-lab-create`.
- Internal Utilities. These are used by the playground itself but could be useful if you want to change the environment substantially:
  - `zfs-add-disk-file` - Creates a file in `/disks/` which can be added to ZFS as a disk. Used during provisioning of the Vagrant instance.
  - `zfs-rm-disk-file` - Removes a file created by `zfs-add-disk-file`.
  - `zfs-create-pool` - Create a ZFS pool of disks. Used by `zfs-lab-create`.
  - `zfs-destroy-pool` - Destroy a ZFS pool of disks.
  - `zfs-destroy-pool-if-exists` - Destroy a ZFS pool only if it already exists.
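To make the corruption and hash-checking utilities more concrete, here is a minimal sketch of the underlying idea using only standard tools (`head`, `tr`, `dd`, `sha1sum`). It is an approximation, not the repo's actual implementation of `create`, `corrupt`, or the `sha1-*` scripts; the file names, the scratch directory, and the `STATUS` variable are all hypothetical.

```shell
WORK=$(mktemp -d)              # hypothetical scratch directory
FILE="$WORK/sample.txt"        # hypothetical sample file

# Build a 1 MB file consisting only of X's.
head -c 1048576 /dev/zero | tr '\0' 'X' > "$FILE"

# Save a SHA1 manifest for the file.
( cd "$WORK" && sha1sum sample.txt > manifest.sha1 )

# Overwrite 16 bytes at offset 1000 with '!' characters.
# conv=notrunc keeps dd from truncating the rest of the file.
printf '%16s' '' | tr ' ' '!' |
    dd of="$FILE" bs=1 seek=1000 conv=notrunc 2>/dev/null

# Re-check against the manifest; a non-zero exit status means a mismatch.
if ( cd "$WORK" && sha1sum -c manifest.sha1 ) >/dev/null 2>&1; then
    STATUS=intact
else
    STATUS=corrupted
fi
echo "sample.txt is $STATUS"
```

Viewing the file with `less` afterwards should show the run of `!` characters standing out among the X's.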
For all exercises, the Zpool should be called `zfspool`. When the pool is created, a ZFS filesystem will be created and mounted at `/zfspool/` by default.
- Create a single disk Zpool with disk `/disks/disk0`. Now destroy it.
- Create a Zpool with disks `/disks/disk0` through `/disks/disk2`.
- Create a ZFS filesystem in the Zpool you just created.
  - Hints:
    - Use `zfs set canmount=off ZPOOL_NAME` to disable the mountpoint on the Zpool itself.
    - ...and try `zfs create` to create a ZFS filesystem under the Zpool.
- Answers
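If you get stuck on the exercises above, the following sketch shows one way the commands might fit together. It only defines helper functions and runs nothing by itself; the function names are hypothetical, and it assumes you are root (or prefix the calls with `sudo`) inside the lab VM.

```shell
POOL=zfspool   # pool name the exercises ask for

# Create a plain (striped, non-redundant) pool from one or more disk files.
# File-backed disks must be given as absolute paths.
create_pool() {
    zpool create "$POOL" "$@"
}

# Stop the pool's root dataset from mounting, then add a child filesystem.
# $1 is a filesystem name of your choosing.
create_filesystem() {
    zfs set canmount=off "$POOL" &&
    zfs create "$POOL/$1"
}

# Tear the pool down again, along with every filesystem in it.
destroy_pool() {
    zpool destroy "$POOL"
}
```

For example, `create_pool /disks/disk0 /disks/disk1 /disks/disk2` followed by `create_filesystem lab1` (a hypothetical name) should leave a filesystem that `zfs list -r zfspool` reports; `destroy_pool` cleans up afterwards.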
- Create a mirrored Zpool with disks `/disks/disk0` and `/disks/disk1`.
  - Hint: For this exercise, the mirror should contain only two disks.
- Create a RAID5/RAIDZ Zpool with disks `/disks/disk0` through `/disks/disk2`.
  - Hint: You should have at least 3 disks.
- Create a RAID6/RAIDZ2 Zpool with disks `/disks/disk0` through `/disks/disk3`.
  - Hint: You should have at least 4 disks.
- Create a RAID7/RAIDZ3 Zpool with disks `/disks/disk0` through `/disks/disk4`.
  - Hint: You should have at least 5 disks.
- Answers
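The redundant layouts above differ only in the keyword passed to `zpool create` (`mirror`, `raidz`, `raidz2`, or `raidz3`) and in how many disks follow it. A small helper can make that pattern visible; `create_redundant_pool` is a hypothetical name, and the sketch assumes the lab's `/disks/diskN` naming and root privileges.

```shell
POOL=zfspool   # pool name the exercises ask for

# Usage: create_redundant_pool LAYOUT FIRST LAST
# LAYOUT is one of: mirror, raidz, raidz2, raidz3.
# FIRST and LAST are disk numbers, e.g. 0 and 2 for disk0 through disk2.
create_redundant_pool() {
    layout=$1
    # Build the list /disks/diskFIRST ... /disks/diskLAST, one path per line.
    disks=$(seq -f '/disks/disk%g' "$2" "$3")
    # $disks is left unquoted on purpose so each path becomes its own argument.
    zpool create "$POOL" "$layout" $disks
}
```

For instance, `create_redundant_pool raidz2 0 3` would expand to `zpool create zfspool raidz2 /disks/disk0 /disks/disk1 /disks/disk2 /disks/disk3`. Running `zpool status zfspool` afterwards shows the resulting layout.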
- Create an unmirrored Zpool called `zfspool`, remove `/disks/disk0`, catch the error in ZFS, and confirm that the pool is utterly broken and that your files are unrecoverable.
  - Hints:
    - Run `populate-zfs-filesystem /zfspool/ 5 5` to create sample files in the ZFS filesystem and save SHA1 hashes of those files.
    - Simulate breaking the disk with the command `break-disk disk0`.
    - Run `sha1-check-files` to verify that files are corrupted.
    - Run `zfs-add-disk-file disk0 1024` to (re)create the `disk0` file when done with this exercise.
- Create a mirrored Zpool called `zfspool`, remove `/disks/disk0`, catch the error in ZFS, fix the pool, and verify that files are unharmed.
  - Hints:
    - Run `populate-zfs-filesystem /zfspool/ 5 5` to create sample files in the ZFS filesystem and save SHA1 hashes of those files.
    - Simulate breaking the disk with the command `break-disk disk0`.
    - After recovery, run `sha1-check-files` to verify that file contents were unharmed.
    - Run `zfs-add-disk-file disk0 1024` to (re)create the `disk0` file when done with this exercise.
- Create a RAIDZ Zpool with 3 disks, corrupt `disk0`, catch the error in ZFS, and repair the pool. Verify the files are unaffected.
  - Hints:
    - Run `populate-zfs-filesystem /zfspool/ 5 5` to create sample files in the ZFS filesystem and save SHA1 hashes of those files.
    - Try corrupting a disk with `corrupt --offset 1000 --repeat 1000000 /disks/disk0`. This will swiss-cheese the disk in a few tens of seconds.
    - Run `sha1-check-files /path/to/zfs/filesystem` to catch corrupted files and verify the repairs were successful.
- Create a RAIDZ Zpool with 3 disks, corrupt `disk0` and `disk1`. Verify the Zpool is unrecoverable.
  - Hints:
    - Run `populate-zfs-filesystem /zfspool/ 5 5` to create sample files in the ZFS filesystem and save SHA1 hashes of those files.
    - Try corrupting a disk with `corrupt --offset 1000 --repeat 1000000 /disks/disk0`. This will swiss-cheese the disk in a few tens of seconds.
    - Run `sha1-check-files /path/to/zfs/filesystem` to catch corrupted files and verify that you will need to restore from a backup.
- Answers
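For the "catch the error" and "fix the pool" steps in the exercises above, the sketch below collects the `zpool` subcommands that are likely to be useful. The function names are hypothetical and nothing runs until you call them. The sketch assumes root privileges, and ZFS may only notice a broken or corrupted disk after a scrub or after it tries to read the affected data. Checksum errors can also appear in `zpool status -v` while the overall health still reads `ONLINE`, so treat `check_pool` as a first look only.

```shell
POOL=zfspool   # pool name the exercises ask for

# Print the pool's overall health (ONLINE, DEGRADED, FAULTED, ...).
pool_health() {
    zpool list -H -o health "$POOL"
}

# Report whether the pool is healthy; returns 1 if it is not ONLINE.
check_pool() {
    health=$(pool_health)
    if [ "$health" = "ONLINE" ]; then
        echo "$POOL is healthy"
    else
        echo "$POOL needs attention: $health"
        return 1
    fi
}

# Show per-disk read/write/checksum error counts and any damaged files.
show_errors() { zpool status -v "$POOL"; }

# Read every block in the pool and repair what redundancy allows.
start_scrub() { zpool scrub "$POOL"; }

# Rebuild onto a disk file that was recreated at the same path, e.g. /disks/disk0.
replace_disk() { zpool replace "$POOL" "$1"; }

# Reset the error counters once the pool has been repaired.
clear_errors() { zpool clear "$POOL"; }
```

One possible sequence for the mirrored exercise is `check_pool`, recreate the disk with `zfs-add-disk-file disk0 1024`, then `replace_disk /disks/disk0` and `check_pool` again. For the RAIDZ corruption exercise, `start_scrub` followed by `show_errors` and `clear_errors` is a reasonable starting point.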
- Play with snapshots and rollbacks.
- Disk quotas for ZFS filesystems
- Create a raw device in the Zpool and put ext4 on it. (rollback from a snapshot)
- Stream one ZFS filesystem to another
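A few of the topics in this list map onto single `zfs` subcommands, sketched below as helper functions that run nothing until called. The filesystem name `zfspool/lab1`, the function names, and the snapshot and quota values in the usage note are all hypothetical; root privileges inside the lab VM are assumed.

```shell
FS=zfspool/lab1   # hypothetical filesystem to experiment on

# Take a named snapshot of the filesystem, e.g. zfspool/lab1@before.
snapshot_fs() { zfs snapshot "$FS@$1"; }

# Roll the filesystem back to a snapshot (the most recent one, unless forced).
rollback_fs() { zfs rollback "$FS@$1"; }

# Cap the space the filesystem may use, e.g. 100M.
set_quota() { zfs set quota="$1" "$FS"; }

# Stream snapshot $1 into a new filesystem $2, e.g. zfspool/copy.
copy_fs() { zfs send "$FS@$1" | zfs receive "$2"; }
```

A quick experiment might be `snapshot_fs before`, deleting some files, and then `rollback_fs before` to bring them back.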
A: It's a UI consideration--I want checksums a little smaller, which will be easier to read. Keep in mind that the context is "simulating a filesystem", versus "code that is being run in production". But hey--if this is useful enough that you're looking at using this in production(!), come talk to me and I'll see what I can do. :-)
- The Z File System - From the BSD Documentation. By far the most useful resource that I have found.
- ZFS 101: Understanding ZFS Storage and Performance
- What Is ZFS?
- OpenZFS Documentation
- Oracle Solaris ZFS Administration Guide
- How to force ZFS to replace a failed drive in place
- The logo was made over at https://www.freelogodesign.org/
- This text to ASCII art generator, for the logo I used in the MOTD message.
My name is Douglas Muth, and I am a software engineer in Philadelphia, PA.
There are several ways to get in touch with me:
Feel free to reach out to me if you have any comments, suggestions, or bug reports.