µbackup is a program/library/Docker image for taking backups of your Docker containers (or traditional applications) 100 % automatically, properly encrypting and uploading them to S3.
Contents:
- Backing up: Docker containers
- Backing up: traditional applications
- Security & encryption
- How to use: Docker image
- How to use: binary installation
- How to use: manual backups / as a library
- Restoring from backup
- Example configuration file
- S3 bucket IAM policy (security + ransomware protection)
- Backup retention
- How can I be sure it keeps working?
(You need to run the µbackup Docker service, or if you run it as an application on your Docker host, configure the Docker integration via µbackup config)
µbackup takes backups from your stateful Docker containers 100 % automatically. Here's how:
+------------+ +-----------------------------+ +------------------------------+ +--------------+
| | | | | | | |
| Once a day +-----> For each container: +------> Compress & encrypt stdout +------> Upload to S3 |
| | | - if backup command defined | | of backup command | | |
+------------+ | | | | +--------------+
+-----------+-----^-----------+ +------------------------------+
| |
| |
| |
+-------------v-----+---------------+
| |
| Execute backup command inside |
| the container, taking its stdout |
| as the backup stream |
| |
+-----------------------------------+
ubackup.command is a container's label (specified during $ docker run -l "ubackup.command=..." for example) that contains the command used to take a backup of the important state inside the container.
For different use cases, specify ubackup.command as follows:
- If you need to back up a single file inside a container, use: cat /yourfile.db
- For databases etc. that support atomic dumping, e.g. PostgreSQL, use: pg_dump -U postgres nameOfYourDatabase
- For a directory, use: tar -cC /yourdirectory -f - . (the - means $ tar will write the archive to stdout, and . just means to process all files in the selected directory)
- If there's no tooling inside the container (think Dockerfile FROM scratch), or if you have a Docker volume you want to archive in its entirety, use: dockervolume:// and µbackup will find the volume source data via Docker and use $ tar (from the host side) to dump the whole directory tree. For this option, if you run µbackup as a container you have to run µbackup with $ docker run -v /var/lib/docker/volumes:/var/lib/docker/volumes
This simple approach is surprisingly flexible, and because everything is streamed it is more efficient than having to write temporary files.
You don't have to compress the output inside the container, since that is taken care of for you.
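For example, a PostgreSQL container could be labeled like this (a sketch only: the container name, database name and image tag are illustrative placeholders, not anything µbackup prescribes):
$ docker run -d --name mydb \
    -l "ubackup.command=pg_dump -U postgres nameOfYourDatabase" \
    postgres:13
On its daily run, µbackup executes that command inside the container and streams its stdout through compression, encryption and the S3 upload.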
First, look at the process above for how Docker containers are backed up. Traditional applications follow the same principle: µbackup just backs up the stdout stream of a command it executes (just not inside a container this time).
Look for the "static targets" configuration keys in the example configuration file.
The backups are encrypted with a per-backup 256-bit AES key (in CTR mode with a random IV), which itself is asymmetrically encrypted with a 4096-bit RSA public key. This means that immediately after the backup is complete, µbackup forgets/loses access to the actual encryption key, and only the user holding the private key will be able to decrypt the backup. This way neither the servers nor Amazon can ever access your backups (as long as you store the private key somewhere else).
If you are serious about security, with this design you could even store the private key in a YubiKey (or some other form of HSM).
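To make the envelope scheme concrete, here's a conceptual sketch with openssl - this is NOT µbackup's actual on-disk format, and public.pem stands for a hypothetical RSA public key in standard PEM format:
$ openssl rand -hex 32 > session.key    # fresh per-backup AES-256 key
$ openssl rand -hex 16 > session.iv     # random IV
$ openssl enc -aes-256-ctr -K "$(cat session.key)" -iv "$(cat session.iv)" -in backup.tar -out backup.tar.aes
$ openssl pkeyutl -encrypt -pubin -inkey public.pem -in session.key -out session.key.rsa
$ rm session.key                        # only the RSA-wrapped key remains; decryption now requires the private key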
First, do the configuration steps from the section "How to use: binary installation" below. You only need them to create a correct config.json file. Now, convert that file for embedding as an ENV variable:
$ cat config.json | base64 -w 0
That value will be your UBACKUP_CONF
ENV var.
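In shell terms that could look like this (just one way to set it; the variable name comes from above):
$ export UBACKUP_CONF="$(base64 -w 0 < config.json)"
$ echo "$UBACKUP_CONF" | base64 -d    # sanity check: should print your config.json back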
Now, just deploy the Docker service as a global service in the cluster (= runs on every node):
$ docker service create --name ubackup --mode global \
--mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
-e UBACKUP_CONF=... \
fn61/ubackup:VERSION
Note: VERSION
is the same as you would find for the binary installation.
Download the appropriate release binary for your platform from the download link.
Create encryption & decryption keys:
$ ./ubackup decryption-key-generate > backups.key
$ ./ubackup decryption-key-to-encryption-key < backups.key > backups.pub
(For security, you should never actually store (or even generate) the decryption key on the same machine that takes the backups; it is generated here only for demonstration purposes.)
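In practice that could mean generating the key pair on a separate, trusted machine and copying only the public half to the backup host (hostname and target path below are illustrative):
$ ./ubackup decryption-key-generate > backups.key                        # keep this offline, or in a YubiKey/HSM
$ ./ubackup decryption-key-to-encryption-key < backups.key > backups.pub
$ scp backups.pub yourserver:~/                                          # only the public key leaves this machine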
Create configuration file stub (and embed encryption key in the config):
$ ./ubackup config example --pubkey-file backups.pub > config.json
Edit the configuration further (specify your S3 bucket details):
$ vim config.json
Install & start the service:
$ ./ubackup scheduler install-systemd-service-file
Wrote unit file to /etc/systemd/system/ubackup.service
Run to enable on boot & to start now:
$ systemctl enable ubackup
$ systemctl start ubackup
$ systemctl status ubackup
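To check that the scheduler is doing its rounds, you can follow its logs with standard systemd tooling (nothing µbackup-specific here):
$ journalctl -u ubackup -f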
In your script, you can call $ ubackup manual-backup ...
Call manual-backup with --help to get the options you need to specify.
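For example (the available flags aren't reproduced here; the command's own help output is authoritative):
$ ubackup manual-backup --help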
Or: use it inside your own application.
See Varasto for an example.
You can pretty much take a data stream from anywhere and give it to µbackup for compression, encryption and storage. In Varasto we take an atomic in-process snapshot of the database and hand the snapshot stream to µbackup.
Varasto even has a UI for displaying/downloading the backups, powered by µbackup library APIs.
Remember to test your backup recovery! Nobody actually wants backups, but everybody wants a restore. Without disaster recovery drills you don't know if your backups work.
Download the .gz.aes backup file from your S3 bucket. You can also do it from the µbackup CLI (just give the service ID the backup was stored under - in this example it's varasto):
$ ubackup storage ls varasto
varasto/2019-07-25 0830Z_joonas_10028.gz.aes
varasto/2019-07-26 0825Z_joonas_10028.gz.aes
The lines output are "backup ID"s, which is enough info to download the backup:
$ ubackup storage get 'varasto/2019-07-26 0825Z_joonas_10028.gz.aes' > '2019-07-26 0825Z_joonas_10028.gz.aes'
Now you have the encrypted and compressed file - you still have to decrypt it.
The decrypt-and-decompress verb of µbackup requires the path to your decryption key, reads the encrypted backup file from stdin and outputs the decrypted, uncompressed file to stdout.
$ ubackup decrypt-and-decompress backups.key < '2019-07-26 0825Z_joonas_10028.gz.aes' > '2019-07-26 0825Z_joonas_10028'
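What you have now depends on what the backup command produced. If the service used the directory/tar approach described above, the decrypted output is a plain tar archive that you can inspect or extract with standard tools:
$ tar -tf '2019-07-26 0825Z_joonas_10028'                                 # list archive contents
$ mkdir restored && tar -xf '2019-07-26 0825Z_joonas_10028' -C restored/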
Run the kitchen sink (= all the possible options) example and you get something like this:
$ ubackup config example --kitchensink
{
"encryption_publickey": "-----BEGIN RSA PUBLIC KEY-----\nMIIBCgKCAQEA+xGZ/wcz9ugFpP07Nspo...\n-----END RSA PUBLIC KEY-----",
"docker_endpoint": "unix:///var/run/docker.sock",
"static_targets": [
{
"service_name": "someapp",
"backup_command": [
"cat",
"/var/lib/someapp/file.log"
]
}
],
"storage": {
"s3": {
"bucket": "mybucket",
"bucket_region": "us-east-1",
"access_key_id": "AKIAUZHTE3U35WCD5...",
"access_key_secret": "wXQJhB..."
}
},
"alertmanager": {
"baseurl": "https://example.com/url-to-my/alertmanager"
}
}
Pro-tip: when writing the config, there's also a config validate command to validate your config.
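For example (the exact invocation isn't documented here, so treat this as an assumption and check config validate --help):
$ ubackup config validate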
You should minimize the attack surface by only allowing the backup program to put objects into the bucket, since read access is not required. If you want the bucket to automatically delete old backups, the backup program should not do it; use S3 lifecycle policies instead to make AWS remove your old backups.
You should also consider enabling bucket versioning, so that if an attacker gained the credentials used by this backup program, she cannot permanently destroy the backups by overwriting old backups with empty content (s3:PutObject can do that). Versioning would allow you to recover these tampered files in the described scenario. This effectively implements ransomware protection. Enabling versioning is a plain S3 bucket setting (see the AWS CLI sketch after the example policy below).
Here's an example IAM policy for the backup program:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:PutObjectAcl"
],
"Resource": "arn:aws:s3:::YOURBUCKET/*"
},
{
"Effect": "Allow",
"Action": [
"s3:ListBucket"
],
"Resource": "arn:aws:s3:::YOURBUCKET*"
}
]
}
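Enabling the versioning discussed above could, for example, be done with the AWS CLI (standard AWS tooling; the bucket name is a placeholder):
$ aws s3api put-bucket-versioning --bucket YOURBUCKET \
    --versioning-configuration Status=Enabled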
This is the great thing about utilizing S3. By default, your backups are stored forever. But you can configure your S3 bucket to expire your objects in e.g. 21 days.
You could even configure automatic transfer of older backups to Glacier (which is cheaper to store but more expensive to retrieve) for backups that have a low chance of being needed.
See Object Lifecycle Management.
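As a sketch, the 21-day expiration mentioned above could be set up with the AWS CLI like this (rule ID and bucket name are placeholders; a Glacier transition would be an additional rule action):
$ aws s3api put-bucket-lifecycle-configuration --bucket YOURBUCKET \
    --lifecycle-configuration '{"Rules": [{"ID": "expire-old-backups", "Status": "Enabled", "Filter": {"Prefix": ""}, "Expiration": {"Days": 21}}]}'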
µbackup optionally integrates with lambda-alertmanager to provide "dead man's switch"-like functionality, in which µbackup reports successful backups to alertmanager. If alertmanager doesn't hear back from µbackup in due time, an alert is raised.
This means that even if µbackup were unable to tell you that something is wrong, an external component will alert you, because it didn't receive a "check-in".
Integration with alertmanager is driven by config (see example config).