This is my script for unattended host backups using Duplicity that others might find useful. It is configurable through a YAML file, but opinionated in some ways:
- Backups go to S3 using the Standard-Infrequent Access storage class to save money.
- Encryption and signing require a GnuPG keypair with no passphrase. The key should be protected by filesystem permissions anyway, so a passphrase just adds unnecessary complexity.
- The interval between full backups is time-based (e.g. a full backup every N days).
- Purging old backups happens automatically at the end of each run (unless overridden). The script keeps the last N full backups.
Run `duplicity-unattended --help` to see all options or just look at the code.
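In plain duplicity terms, the behavior above corresponds roughly to invocations like the sketch below. This is only an illustration of the idea, not the script's actual command lines; the real arguments are assembled from the YAML config, and the values and placeholders here are assumptions.

```bash
# Rough sketch of what the script automates (values and placeholders are illustrative only).
# Full backup if the last one is older than 30 days, otherwise incremental,
# encrypted and signed with your key, stored in S3 Standard-Infrequent Access.
duplicity --full-if-older-than 30D --s3-use-ia \
    --encrypt-sign-key <key_id> \
    /home s3+http://<bucket>/home

# Afterwards, keep only the most recent 2 full backup chains.
duplicity remove-all-but-n-full 2 --force s3+http://<bucket>/home
```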
- `duplicity-unattended`: Script that runs unattended backups and purges stale backups.
- `systemd/`: Directory containing sample systemd unit files you can customize to run the script periodically.
- `cfn/host-bucket.yaml`: CloudFormation template to set up an S3 bucket and IAM permissions for a new host.
- `cfn/backup-monitor`: CloudFormation (SAM) template and Lambda function to notify you if backups stop working.
- `terraform-gcp`: Terraform template to set up remote backups in Google Cloud Storage (sets up a GCS folder and Service Account).
You can use the script without systemd or CloudFormation if you prefer. They all work independently.
Here are the steps I generally follow to set up backups on a new host.
I use separate keys, buckets, and AWS credentials for each host so that the compromise of one host doesn't affect the others.
First, create an S3 bucket and an IAM user/group/policy with read-write access to it. The included `cfn/host-bucket.yaml` CloudFormation template can do this for you automatically. To apply it from the console (an AWS CLI sketch follows the list):
- Go to CloudFormation in the AWS console and click **Create Stack**.
- Select the option to upload a template to S3 and pick the `cfn/host-bucket.yaml` template.
- Fill in the stack name and bucket name. I suggest including the hostname in both for easy identification.
- Accept the remaining defaults and acknowledge the IAM resource creation warning.
- Wait for stack setup to complete. If it fails, it's likely the S3 bucket name isn't unique. Delete the stack and try again with a different name.
- Go to IAM in the AWS console and click on the new user. The user name is prefixed with the stack name, so you can identify it that way.
- Go to the **Security credentials** tab and click **Create access key**.
- Copy the generated access key ID and secret key. You'll need them later.
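If you prefer the AWS CLI over the console, creating the stack looks roughly like this. Treat it as a sketch: the parameter name below is a placeholder, so check the template's `Parameters` section for the real one.

```bash
# Stack and bucket names are examples; CAPABILITY_IAM acknowledges the IAM resources the stack creates.
aws cloudformation create-stack \
    --stack-name myhost-backups \
    --template-body file://cfn/host-bucket.yaml \
    --parameters ParameterKey=BucketName,ParameterValue=myhost-backups \
    --capabilities CAPABILITY_IAM
```

You still create the access key afterwards, either in the IAM console as described above or with `aws iam create-access-key`.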
Alternatively, you can create the S3 bucket and IAM resources manually. Here are the general steps; modify as you see fit. (Rough AWS CLI equivalents follow the list.)
- Create the S3 bucket. Default settings are fine.
- Create an IAM policy with the following permissions, replacing `<bucket>` with the bucket name:

  ```json
  {
      "Version": "2012-10-17",
      "Statement": [
          {
              "Effect": "Allow",
              "Action": "s3:*",
              "Resource": [
                  "arn:aws:s3:::<bucket>/*",
                  "arn:aws:s3:::<bucket>"
              ]
          }
      ]
  }
  ```

- Create an IAM group with the same name as the policy and assign the policy to it.
- Create IAM user for programmatic access. Add the user to the group. Don't forget to copy the access key ID and secret access key at the end of the wizard.
- Create a file on the host containing the AWS credentials, replacing `<access_key_id>` and `<secret_key>` with the IAM user credentials:

  ```ini
  [Credentials]
  aws_access_key_id = <access_key_id>
  aws_secret_access_key = <secret_key>
  ```

  Put it in a location appropriate for the backup user, such as `/etc/duplicity-unattended/aws_credentials` or `~/.duplicity-unattended/aws_credentials`.
- Make sure only the backup user can access the credentials file. Change ownership if needed.

  ```bash
  chmod 600 aws_credentials
  ```
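If you'd rather script the manual route, the AWS CLI equivalents look roughly like this. All resource names are examples, and `policy.json` is assumed to contain the policy document shown above.

```bash
# Example names only; substitute your own and your AWS account ID.
aws s3 mb s3://myhost-backups
aws iam create-policy --policy-name myhost-backups --policy-document file://policy.json
aws iam create-group --group-name myhost-backups
aws iam attach-group-policy --group-name myhost-backups \
    --policy-arn arn:aws:iam::<account_id>:policy/myhost-backups
aws iam create-user --user-name myhost-backups
aws iam add-user-to-group --group-name myhost-backups --user-name myhost-backups
aws iam create-access-key --user-name myhost-backups   # note the returned key ID and secret
```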
If you want to back up to Google Cloud Storage instead of S3, the following setup applies. Much of it is based on https://systemoverlord.com/2019/09/23/backing-up-to-google-cloud-storage-with-duplicity-and-service-accounts.html
- Create a Google Cloud account at cloud.google.com
- Log into the web console
- Create a project that will house your backups, and make yourself a "Storage Admin" on that project.
The Terraform configuration included in this repository will create everything you need in your GCP project, including the Cloud Storage bucket and all the required permissions for your host machine. Modify the contents of `terraform.tfvars` to match your needs before running the following:
```bash
cd ./terraform
terraform init
terraform apply
```
This will output a message from terraform about success/failure, and the path to your service account credentials file. You'll need this path to finish your host setup later on.
- Install dependencies:
  - Duplicity
  - GnuPG
  - Python 3
  - PyYAML for Python 3
- Create a new RSA 4096 keypair as the user who will perform the backups. If you're backing up system directories, this probably needs to be root. Do NOT set a passphrase; leave it blank.

  ```bash
  gpg --full-generate-key --pinentry-mode loopback
  ```
- Make an off-host backup of the keypair in a secure location. I use my LastPass vault for this. Don't skip this step or you'll be very sad when you realize the keys perished alongside the rest of your data, rendering your backups useless.
  ```bash
  gpg --list-keys   # to get the key ID
  gpg --armor --output pubkey.gpg --export <key_id>
  gpg --armor --output privkey.gpg --export-secret-key <key_id>
  ```
- Delete the exported key files from the filesystem once they're secure.
- Install `gcs-oauth2-boto-plugin`.
- Install the gcloud SDK (see Google's installation instructions).
- Run `gcloud init`. You will be prompted to log in to your Google Cloud account, which authenticates your machine so that we can run the Terraform.
- Run `gsutil config -e` and enter the path to your service account credentials file when prompted. This creates a boto config file at `~/.boto`.
- Create the file `~/.config/boto/plugins/gcs.py` with the following contents:

  ```python
  import gcs_oauth2_boto_plugin
  ```

- Put the following at the bottom of your `~/.boto` file:

  ```ini
  [Plugin]
  plugin_directory = /home/{YOUR_USERNAME}/.config/boto/plugins
  ```
- Copy the `duplicity-unattended` script to a `bin` directory and make sure it's runnable:

  ```bash
  chmod +x duplicity-unattended
  ```

  I usually clone the repo to `/usr/local/share` and add a symlink in `/usr/local/bin`.
- Copy the sample `config.yaml` file to the same directory as the AWS credentials file. (Or you can put it somewhere else. Doesn't matter.)
- Customize the `config.yaml` file for the host. (A rough sketch of what a config captures follows this list.)
- Do a dry-run backup as the backup user to validate most of the configuration:

  ```bash
  duplicity-unattended --config <config_file> --dry-run
  ```

  Replace `<config_file>` with the path to the YAML config file. Among other things, this will tell you how much would be backed up.
- Do an initial backup as the backup user to make sure everything really works:

  ```bash
  duplicity-unattended --config <config_file>
  ```
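For orientation, a host config generally captures the GPG key, the destination, the credentials file, the directories to back up, the full-backup interval, and how many full backups to keep. The sketch below is hypothetical: every key name is invented for illustration, so start from the sample `config.yaml` in the repo rather than copying this.

```yaml
# HYPOTHETICAL sketch only -- key names are invented; use the repo's sample config.yaml.
gpg_key_id: "0123456789ABCDEF"                               # key used for encryption and signing
url: "s3+http://myhost-backups"                              # where backups go
aws_credentials: "/etc/duplicity-unattended/aws_credentials" # credentials file created earlier
backup_dirs:
  - /home
  - /etc
full_interval: 30D                                           # force a full backup after this long
keep_full: 2                                                 # full backups retained when purging
```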
How you schedule backups depends on your OS. I use systemd timers for this. See the `systemd` directory in this repository for sample unit files you can customize. You'll probably need to change `User`, `Group`, and `ExecStart` to match the user who performs the backups and the location of the `duplicity-unattended` script, respectively.
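The sample unit files in the repo are the authoritative starting point; as a rough illustration, the service/timer pair might look something like this (the paths, user, and schedule here are assumptions to adjust):

```ini
# duplicity-unattended.service -- sketch; adjust User, Group, and paths to your setup
[Unit]
Description=Unattended duplicity backup

[Service]
Type=oneshot
User=root
Group=root
ExecStart=/usr/local/bin/duplicity-unattended --config /etc/duplicity-unattended/config.yaml
```

```ini
# duplicity-unattended.timer -- sketch; triggers the service once a day
[Unit]
Description=Daily unattended duplicity backup

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```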
On Arch Linux and similar distros, drop these files into `/etc/systemd/system` and then enable and start the timer with:

```bash
sudo systemctl enable duplicity-unattended.timer
sudo systemctl start duplicity-unattended.timer
```

Make sure the timer is running:

```bash
sudo systemctl status duplicity-unattended.timer
```

And then run the backup once manually and check the output:

```bash
sudo systemctl start duplicity-unattended.service
sudo journalctl -u duplicity-unattended.service
```
You're done! Enjoy your backups.
How do you make sure backups keep working in the future? You can set up systemd to email you if something goes wrong, but I prefer an independent mechanism. The `cfn/backup-monitor` directory contains a CloudFormation template (SAM template, actually) with a Lambda function that monitors a bucket for new backups and emails you if no recent backups have occurred. To set it up for a new host/bucket, follow these steps:
- If you have not used AWS Simple Email Service (SES) before, follow the instructions to verify the sender and recipient email addresses. See the overview documentation for more information.
- Go to duplicity-unattended-monitor in the AWS Serverless Application Repository and click the **Deploy** button.
- Review the template. (You wouldn't deploy a CloudFormation template into your AWS account without knowing what it does first, would you?)
- Change the application/stack name. I suggest a name that includes the host or bucket for easy identification.
- Fill in the remaining function parameters. Make sure the email addresses exactly match the ones you verified in SES.
- Click **Deploy** and wait for AWS to finish creating all the resources.
Now let's test it.
- Click on the function link under **Resources**. This will take you to the Lambda console for the function.
- Click the **Test** button in the upper-right.
- Create a new test event with the following content:

  ```json
  {"testEmail": true}
  ```

  Give it a name like `BackupMonitorTest` and click **Create**.
- Now you should see the new named event next to the **Test** button. Click the **Test** button again.
If all goes well, you will get an email with a summary of the most recent backups found in the bucket.
From now on, the function will run once a day and email you only when there have been no recent backups for the number of days you specified. The function will look for recent backups in any S3 "folder" that contains at least one backup set from any time in the past. You can deploy additional stacks for each bucket you want to monitor.
If you prefer to deploy the CloudFormation template directly from source code instead of from the Serverless Application Repository, you can. The steps are roughly as follows:
- Install Pipenv for Python 3 if you don't already have it.
- From the source repo directory, install the AWS SAM CLI into a virtual environment:

  ```bash
  pipenv install --dev
  ```
- Change to the `cfn/backup-monitor` directory.
- Set up your AWS CLI credentials so SAM can read them (e.g. using the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables).
- Run the SAM command to package the CloudFormation template and upload the Lambda function to S3:

  ```bash
  pipenv run sam package --s3-bucket <code_bucket> --output-template-file packaged.yaml
  ```

  where `<code_bucket>` is an S3 bucket to which the AWS CLI user has write access.
- You can now use the CloudFormation AWS console or the AWS CLI to deploy the `packaged.yaml` stack template that SAM just created.
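For the CLI route, the deploy step might look roughly like the following. The names after `--parameter-overrides` are placeholders; check the template for the actual parameters it expects.

```bash
# Stack name, bucket, and email are examples; CAPABILITY_IAM acknowledges the IAM resources.
aws cloudformation deploy \
    --template-file packaged.yaml \
    --stack-name backup-monitor-myhost \
    --capabilities CAPABILITY_IAM \
    --parameter-overrides BucketName=myhost-backups NotificationEmail=me@example.com
```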
Invoke `duplicity` directly to restore from a backup. The general procedure is as follows:
- If restoring on a new host, import the GPG keypair from its secure backup location:
  ```bash
  gpg --import privkey.gpg
  ```
- List the keys to get the key ID:

  ```bash
  gpg --list-keys
  ```

  Make a note of the ID (long hexadecimal number). You'll need it when you run the `duplicity` command later.
- If you don't have a copy of the original AWS credentials file (e.g. it perished along with your data), create a new one. You can create a new access key from the IAM console following the same procedure as described above for setting up a new host. Don't forget to deactivate the old access key in the IAM console if you no longer need it.
- Point Duplicity to the AWS credentials file by setting the `BOTO_CONFIG` environment variable. In `bash`, you'd run:

  ```bash
  export BOTO_CONFIG=<aws_credentials_file>
  ```

  Replace `<aws_credentials_file>` with the path to the file.
- Run `duplicity` from the command line to restore each source directory. You can browse the source directories by looking inside the S3 bucket in the AWS console. Here's a basic working restore command that restores a source directory to a new target directory called `restored`:

  ```bash
  mkdir restored
  duplicity --encrypt-sign-key <key_id> s3+http://<bucket>/<source_dir> restored
  ```

  Replace `<key_id>` with the GPG key ID, `<bucket>` with the S3 bucket name, and `<source_dir>` with the source directory name (S3 key prefix). You might be asked to provide a passphrase during the restore. Just hit ENTER.
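Duplicity can also restore a single file or directory rather than the whole tree, and can restore as of a particular point in time. For example, using the same placeholders as above:

```bash
# Restore one file from the backup of <source_dir> as it existed three days ago.
# The target path is where the restored file is written (the restored/ dir created above).
duplicity --encrypt-sign-key <key_id> \
    --file-to-restore path/inside/backup/file.txt \
    --time 3D \
    s3+http://<bucket>/<source_dir> restored/file.txt
```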