RSync INCRemental backup: Simple, fast, incremental backups with rsync.
Wrapper around rsync to perform incremental backups by hard-linking unchanged files on the filesystem (a.k.a. rsync --link-dest). Each dated backup directory is a full backup and may be treated independently (deleted, etc.), but files unchanged since the previous backup are hard-linked on the filesystem, so each new backup only uses incremental disk space.
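To illustrate why hard-linked backups stay independent, here is a minimal Python sketch of the underlying filesystem behaviour. It is not rsincr's code (rsync does the linking itself via --link-dest); the directory names, file name, and the `link_unchanged` and `demo` helpers are hypothetical.

```python
import os
import tempfile


def link_unchanged(previous_dir: str, new_dir: str, name: str) -> str:
    """Hard-link `name` from the previous backup into the new backup directory."""
    os.makedirs(new_dir, exist_ok=True)
    target = os.path.join(new_dir, name)
    os.link(os.path.join(previous_dir, name), target)
    return target


def demo() -> dict:
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical dated backup directories.
        previous = os.path.join(root, "2024-01-01")
        current = os.path.join(root, "2024-01-02")
        os.makedirs(previous)
        original = os.path.join(previous, "report.txt")
        with open(original, "w") as fh:
            fh.write("unchanged contents\n")

        linked = link_unchanged(previous, current, "report.txt")
        # Both names point at the same inode, so the data is stored once.
        same_inode = os.stat(original).st_ino == os.stat(linked).st_ino
        link_count = os.stat(linked).st_nlink

        # Deleting the older backup's name leaves the newer backup intact.
        os.remove(original)
        with open(linked) as fh:
            survived = fh.read()
        return {"same_inode": same_inode, "link_count": link_count, "survived": survived}


if __name__ == "__main__":
    print(demo())
```

The data blocks are only freed once the last hard link is removed, which is why any dated directory can be deleted without affecting the others.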
- Backed up directories and files are accessible on the destination server filesystem (no special tools needed to examine, list, or partially/fully-restore a backup)
- After the initial backup, unchanged files never need to be transferred or stored again, even for a 'full' backup
- A 'full' backup simply forces rsync to read the full file contents on both filesystems (source and destination) and compare them by checksum. If the file is unchanged on the destination filesystem, a new hard-link is made to it in the dated backup. If the file is changed or new, it will be created as a new file.
- This makes rsincr extremely well-suited to situations where full transfers or stores of the backup data are costly or time-consuming, e.g. large backup sets at sites with limited upload bandwidth
- By design, backups are not encrypted, de-duplicated, compressed, or signed/sealed
- Compression on-disk may be accomplished at the filesystem or volume level if required
- Encryption on-disk may be accomplished at the volume level
- Be aware that data is accessible while the backup filesystem is mounted and would be compromised in the event of malicious access to the destination server
- Potential future enhancement: Mount/unmount an encrypted volume on the destination server using a passphrase or key saved on the source host config
- File metadata (owner, in particular) may not be faithfully reproduced if the owner does not exist on the remote backup host
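The difference between an incremental and a 'full' backup described above can be sketched as a small function that assembles an rsync command line. This is an illustration under assumptions, not rsincr's actual implementation: the function name, its parameters, the use of `--archive`, and the timestamp format are hypothetical. Only `--link-dest` and `--checksum` are taken from the behaviour described here.

```python
from typing import List, Optional


def build_rsync_argv(
    source_dir: str,
    server: str,
    dest_dir: str,
    timestamp: str,
    previous: Optional[str] = None,
    full: bool = False,
) -> List[str]:
    """Assemble an rsync command for one dated backup (hypothetical helper).

    `dest_dir` is assumed to be an absolute path on the destination server.
    """
    argv = ["rsync", "--archive"]  # --archive is an assumption, not confirmed by rsincr
    if full:
        # 'Full' backup: compare file contents by checksum on both sides.
        argv.append("--checksum")
    if previous is not None:
        # Unchanged files are hard-linked against the previous dated backup.
        argv.append(f"--link-dest={dest_dir}/{previous}")
    argv.append(source_dir.rstrip("/") + "/")
    argv.append(f"{server}:{dest_dir}/{timestamp}/")
    return argv


if __name__ == "__main__":
    print(" ".join(build_rsync_argv(
        "/home/alice", "backup@example.org", "/srv/backups/alice",
        "2024-01-02T00:00:00", previous="2024-01-01T00:00:00", full=True,
    )))
```

Note that even with `full=True` the `--link-dest` option is still present, so files whose checksums match are linked rather than transferred or stored again.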
- Local system (backup source):
- Python 3.6+
- Python modules from requirements.txt
- rsync (any version in recent history, but 3.0+ (2008) recommended)
- Backup destination server:
- rsync (any version in recent history, but 3.0+ (2008) recommended)
- GNU find
- Filesystem must support hard links
git clone https://github.com/reedbug/rsincr.git # Or git@github.com:reedbug/rsincr.git
cd rsincr/
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Minimum configuration items needed to perform a backup:
- Server
- At least one backup job, with:
- Source path
- Destination path

The example config file demonstrates most configuration options, or see configuration reference below.
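As a rough sketch of the minimum requirements listed above, the following check works on a configuration that has already been parsed into a Python dictionary. The flat layout and the `jobs` key are assumptions made for illustration; the real section names are defined by the example config file. Only `server`, `source_dir`, and `dest_dir` come from the configuration reference.

```python
from typing import Any, Dict, List


def find_config_problems(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means usable.

    Assumes a hypothetical layout: {"server": ..., "jobs": [{...}, ...]}.
    """
    problems = []
    if not config.get("server"):
        problems.append("missing 'server'")
    jobs = config.get("jobs") or []
    if not jobs:
        problems.append("at least one backup job is required")
    for index, job in enumerate(jobs):
        for key in ("source_dir", "dest_dir"):
            if not job.get(key):
                problems.append(f"job {index}: missing '{key}'")
    return problems


if __name__ == "__main__":
    example = {
        "server": "backup@example.org",
        "jobs": [{"source_dir": "/home/alice", "dest_dir": "/srv/backups/alice"}],
    }
    print(find_config_problems(example))
```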
- Example (6-hourly backups, outputting to a logfile):
0 */6 * * * cd ~/rsincr; venv/bin/python rsincr.py >> logs/$(date "+\%FT\%T").log
- Failures or other errors will be output as normal and the process will exit with a failure, so it is advisable to configure the cron job / system to email failure outputs to a real person
- It is recommended to schedule using a cron job (in a plain user's crontab), since:
- rsincr requires execution as a real user with an SSH key etc. set up for access to the remote destination server
- systemd has no native functionality to email notifications of failures in service units (even when triggered by systemd timer units), necessitating a workaround 'service' unit for these notifications
- Example systemd service and timer units (including a service for emailing failure notifications) can be found in example_systemd_units/
./rsincr.py [-h] [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
[-c CONFIG_FILE] [-f FORCE_FULL_BACKUP]
optional arguments:
-h, --help show this help message and exit
-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Logging/output verbosity
-c CONFIG_FILE, --config-file CONFIG_FILE
Config file (default: rsincr.toml)
-f FORCE_FULL_BACKUP, --force-full-backup FORCE_FULL_BACKUP
Force a 'full' backup (compare checksums of files on
both sides), regardless of schedule
Config key | Type | Required | Default | Description |
---|---|---|---|---|
lockfile | String | No | .rsincr.lock | Lockfile used to ensure only one instance is running |
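One common way to use a lockfile to prevent overlapping runs is to create it atomically and refuse to start if it already exists. The sketch below shows that general technique; how rsincr itself implements locking is not stated here, and the `single_instance`, `AlreadyRunning`, and `demo` names are hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager


class AlreadyRunning(RuntimeError):
    """Raised when the lockfile already exists."""


@contextmanager
def single_instance(lockfile: str = ".rsincr.lock"):
    try:
        # O_CREAT | O_EXCL fails atomically if the file already exists.
        fd = os.open(lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyRunning(f"{lockfile} exists; another run may be active")
    try:
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lockfile)


def demo():
    path = os.path.join(tempfile.mkdtemp(), ".rsincr.lock")
    with single_instance(path):
        try:
            with single_instance(path):
                second_started = True
        except AlreadyRunning:
            second_started = False
    # Returns (did a second instance start?, does the lockfile still exist?)
    return second_started, os.path.exists(path)


if __name__ == "__main__":
    print(demo())
```

A caveat of this simple approach is that a crashed run can leave a stale lockfile behind, which then has to be removed by hand.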
Config key | Type | Required | Default | Description |
---|---|---|---|---|
bwlimit | String | No | None | Bandwidth limit for rsync; Any string that is interpretable by rsync - see man 1 rsync |
additional_rsync_opts | List of string | No | None | Arbitrary additional options to pass to rsync - see man 1 rsync |
Config key | Type | Required | Default | Description |
---|---|---|---|---|
server | String | Yes | None | Backup destination server in the form of 'hostname' or 'user@hostname' |
Config key | Type | Required | Default | Description |
---|---|---|---|---|
full_backup_week_days | List of integer | No | None | List of week days (0=Sunday) on which to perform a 'full' backup |
full_backup_month_days | List of integer | No | None | List of days of the month on which to perform a 'full' backup |
retention_days | Integer | No | None | Retain backups up to this number of days, and purge older backups |
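The scheduling and retention keys above can be read as two small date calculations. The following sketch is one plausible interpretation, not rsincr's code: the function names are hypothetical, and treating a backup that is exactly `retention_days` old as still retained is an assumption. Python's `date.weekday()` counts from Monday=0, so it is shifted to match the 0=Sunday convention of `full_backup_week_days`.

```python
from datetime import date, timedelta
from typing import List, Optional


def is_full_backup_day(
    today: date,
    week_days: Optional[List[int]] = None,
    month_days: Optional[List[int]] = None,
) -> bool:
    """True if `today` matches a configured week day (0=Sunday) or day of month."""
    sunday_based = (today.weekday() + 1) % 7  # Monday=0 -> Sunday=0 convention
    return sunday_based in (week_days or []) or today.day in (month_days or [])


def expired_backups(
    backup_dates: List[date],
    today: date,
    retention_days: Optional[int] = None,
) -> List[date]:
    """Return the backup dates older than the retention window."""
    if retention_days is None:
        return []  # no retention configured: nothing is purged
    cutoff = today - timedelta(days=retention_days)
    return [d for d in backup_dates if d < cutoff]


if __name__ == "__main__":
    print(is_full_backup_day(date(2024, 1, 7), week_days=[0]))
    print(expired_backups([date(2024, 1, 20), date(2024, 1, 30)], date(2024, 1, 31), 7))
```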
Backup jobs (i.e. source/destination pairings) to back up. At least one backup job must exist.
Config key | Type | Required | Default | Description |
---|---|---|---|---|
source_dir | String | Yes | None | Source directory on local host |
dest_dir | String | Yes | None | Destination directory on backup server (Note that files will be backed up to a separate timestamped subdirectory per backup) |
compress | Boolean | No | false | Compress files in transfer (rsync -z) |
exclude | List of string | No | None | Files or path patterns to exclude - see man 1 rsync for pattern rules |
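To show how the per-job and rsync-related keys might translate into rsync options, here is a small hedged sketch. The function name and the order of the options are hypothetical; `-z`, `--exclude`, and `--bwlimit` are standard rsync options, but the exact way rsincr passes them is an assumption.

```python
from typing import Any, Dict, List, Optional


def job_rsync_options(
    job: Dict[str, Any],
    bwlimit: Optional[str] = None,
    additional_rsync_opts: Optional[List[str]] = None,
) -> List[str]:
    """Translate config values into rsync command-line options (illustrative only)."""
    options = []
    if job.get("compress", False):
        options.append("-z")  # compress file data during the transfer
    for pattern in job.get("exclude") or []:
        options.append(f"--exclude={pattern}")
    if bwlimit:
        options.append(f"--bwlimit={bwlimit}")
    # Arbitrary extra options are passed through unchanged.
    options.extend(additional_rsync_opts or [])
    return options


if __name__ == "__main__":
    job = {
        "source_dir": "/home/alice",
        "dest_dir": "/srv/backups/alice",
        "compress": True,
        "exclude": ["*.tmp", ".cache/"],
    }
    print(job_rsync_options(job, bwlimit="2m", additional_rsync_opts=["--partial"]))
```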
A legacy version of rsincr written in shell (bash) can be found in legacy_shell/. It is unmaintained and should not be used unless the Python version cannot be used (e.g. due to dependencies).