Public key encryption support #672

Closed
KenMacD opened this issue Feb 18, 2016 · 27 comments

@KenMacD

KenMacD commented Feb 18, 2016

Storing the key used to encrypt backups on the server used to create the backups is not ideal. It's impossible to tell when it's been stolen, and stealing the key once would provide access to all past and future backup data.

Instead, it would be nice if a new symmetric key were generated for each archive and then encrypted using the public key. That way the private key could be kept safely offline until a restore was required.

Duplicity does something similar in using gpg to protect the files.
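
A minimal sketch of the idea in this comment, not anything Borg implements: a fresh symmetric key per archive, wrapped to a public key so that only the offline private key can unwrap it. It assumes the third-party `cryptography` package and uses X25519 plus HKDF and AES-256-GCM as one possible choice of primitives; the function names, the dictionary layout and the HKDF `info` label are hypothetical.

```python
# Sketch only: requires the third-party "cryptography" package.
import os

from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey,
    X25519PublicKey,
)
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF


def _wrapping_key(shared_secret: bytes) -> bytes:
    # Turn the raw X25519 output into a 256-bit AES key.
    return HKDF(
        algorithm=hashes.SHA256(),
        length=32,
        salt=None,
        info=b"archive-key-wrap (hypothetical label)",
    ).derive(shared_secret)


def seal_archive(recipient_public: X25519PublicKey, plaintext: bytes) -> dict:
    """Runs on the backup client, which holds only the public key."""
    archive_key = AESGCM.generate_key(bit_length=256)  # fresh for every archive
    data_nonce = os.urandom(12)
    ciphertext = AESGCM(archive_key).encrypt(data_nonce, plaintext, None)

    # Wrap the archive key with a key derived from an ephemeral key exchange.
    ephemeral = X25519PrivateKey.generate()
    kek = _wrapping_key(ephemeral.exchange(recipient_public))
    wrap_nonce = os.urandom(12)
    wrapped_key = AESGCM(kek).encrypt(wrap_nonce, archive_key, None)
    return {
        "ephemeral_public": ephemeral.public_key().public_bytes(
            serialization.Encoding.Raw, serialization.PublicFormat.Raw
        ),
        "wrap_nonce": wrap_nonce,
        "wrapped_key": wrapped_key,
        "data_nonce": data_nonce,
        "ciphertext": ciphertext,
    }


def open_archive(recipient_private: X25519PrivateKey, sealed: dict) -> bytes:
    """Runs only at restore time, where the offline private key is available."""
    ephemeral_public = X25519PublicKey.from_public_bytes(sealed["ephemeral_public"])
    kek = _wrapping_key(recipient_private.exchange(ephemeral_public))
    archive_key = AESGCM(kek).decrypt(sealed["wrap_nonce"], sealed["wrapped_key"], None)
    return AESGCM(archive_key).decrypt(sealed["data_nonce"], sealed["ciphertext"], None)
```

The client discards `archive_key` and the ephemeral private key after sealing, so a later compromise of the client would not by itself expose archives sealed earlier under this sketch.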

@jungle-boogie
Contributor

How would automated backups work?

@KenMacD
Author

KenMacD commented Feb 18, 2016

I would hope the same as they currently do, just without the requirement to have the private key material on the server. The public key can exist on all the servers, but with it the data could not be decrypted.

I'm not currently sure what data needs decrypting during future backups, so I'm not sure if this is possible, but it seems like as long as the metadata is available the actual data may not need to be.

@ThomasWaldmann
Member

The problem here is that the code (as is) uses the repository mostly as a key/value store.
The repo manifest (has the list of all archives) is stored at key=0.
When a new archive is created, the manifest is read, the new archive entry is added and the manifest is written back. That does not work if the encryption is not reversible for the client.
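
The read-modify-write problem described here can be illustrated with a toy model. This is a sketch under loud assumptions: the repository is a plain dict, `seal` and `unseal` are placeholders rather than real encryption, and `Client`, `create_archive` and `can_decrypt` are hypothetical names that do not come from Borg's code.

```python
import json

MANIFEST_KEY = 0  # the comment above says the manifest is stored at key=0


class Client:
    """Toy client; seal/unseal are placeholders, NOT real encryption."""

    def __init__(self, repo: dict, can_decrypt: bool):
        self.repo = repo
        self.can_decrypt = can_decrypt  # False models a public-key-only client

    def seal(self, obj) -> bytes:
        return b"sealed:" + json.dumps(obj).encode()

    def unseal(self, blob: bytes):
        if not self.can_decrypt:
            raise PermissionError("private key is not on this machine")
        return json.loads(blob[len(b"sealed:"):])

    def create_archive(self, name: str, archive_id: str) -> None:
        # Read-modify-write: the existing manifest must be decrypted first.
        if MANIFEST_KEY in self.repo:
            manifest = self.unseal(self.repo[MANIFEST_KEY])
        else:
            manifest = {}
        manifest[name] = archive_id
        self.repo[MANIFEST_KEY] = self.seal(manifest)


if __name__ == "__main__":
    repo = {}
    Client(repo, can_decrypt=True).create_archive("monday", "id-1")
    try:
        Client(repo, can_decrypt=False).create_archive("tuesday", "id-2")
    except PermissionError as exc:
        print("cannot add archive:", exc)
```

The second call fails at the read step, before anything is written, which is the point made above: a client that can only encrypt cannot update a manifest it has to read back.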

@KenMacD
Author

KenMacD commented Feb 20, 2016

Would there be any way to keep the manifest/metadata using a symmetric key, but encrypt the data with a public key?

@ThomasWaldmann
Member

of course one could always specialcase the manifest.

but that was just an example, there are also other chunks that need to get read (and not just written), e.g. for delete / prune. this might be a feature rather than a problem though, because if you do not fully trust the client, you may not want prune / delete to work from there.

compaction is also a place where stuff gets read and re-written, it has to be checked - maybe no decryption is needed in this case.

@KenMacD
Author

KenMacD commented Feb 23, 2016

Thanks for the info. So if I'm understanding this correctly, it may not be impossible, but it is certainly not easy or likely to happen any time soon. Is that about right? If so, feel free to close this as wontfix.

@RonnyPfannschmidt
Contributor

How about having an "extend" operation to add to the manifest?

Then knowing the existing content would no longer be needed to add new archives.
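
One way to picture such an "extend" operation is an append-only set of manifest entries, each written under its own key so that nothing has to be read back. This is a sketch under assumptions, not a Borg feature: the `manifest-entry/` key layout and both function names are hypothetical, and `seal` / `unseal` stand for whatever encryption and decryption callables the client has.

```python
import json
import secrets


def extend_manifest(repo: dict, seal, name: str, archive_id: str) -> str:
    """Add one archive entry without reading or decrypting anything."""
    entry_key = "manifest-entry/" + secrets.token_hex(16)  # hypothetical layout
    repo[entry_key] = seal(json.dumps({"name": name, "id": archive_id}).encode())
    return entry_key


def load_manifest(repo: dict, unseal) -> dict:
    """Rebuild the archive list; this side needs the decryption capability."""
    manifest = {}
    for key, blob in repo.items():
        if isinstance(key, str) and key.startswith("manifest-entry/"):
            entry = json.loads(unseal(blob))
            manifest[entry["name"]] = entry["id"]
    return manifest
```

Operations such as delete or prune would still have to read entries, so in this sketch they remain reserved for a party that can call `unseal`.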

@thomsh

thomsh commented Apr 1, 2016

KenMacD is right.
Encryption with a public key should be the rule, and you can rely on GnuPG for it.
I can't imagine a backup job with a static passphrase stored on the server.
:(
The other features are great!

@enkore
Contributor

enkore commented Jul 18, 2016

Crypto roadmap -> #1044. Public-key (PK) encryption is currently not on it, but it is technically possible under the draft DEK spec.

@ghost

ghost commented Oct 29, 2016

We really need this feature! Passwords for critical data are not a serious option...

@enkore
Contributor

enkore commented Nov 11, 2016

Stapling #120

@rugk
Contributor

rugk commented Nov 11, 2016

As stated in #1786 you may also use gpg for signing these backups if you implemented it.

In any case I think that a kind of hybrid encryption would be nice as it combines the advantages of both encryption methods.

  • Encrypt (& possibly sign) a symmetric key/secret (maybe the one you currently use, so that one can still use a password in addition) with a pgp public key
  • Store the encrypted result (and maybe the pgp key) on the server
  • When one wants to create backups one needs to decrypt the secret stored on the server with the private key (& if used - the password) and then data can just be encrypted in the way it is currently done.
  • Maybe - which would be harder to implement, but may be a future goal - you can implement an automated "key rollover" where old symmetric secrets are thrown away and new ones are generated and encrypted with the same public key. This would then also provide forward secrecy.

This would be good as:

  • you would only use (slow) asymmetric encryption for a small secret instead of all files
  • so fast symmetric encryption is used for the actual encryption process
  • you can keep the current encryption model and build the asymmetric encryption around it
  • the secret can be signed, thus providing integrity
  • even if the asymmetric encryption is broken (yeah, quantum computers looking here¹) one could not decrypt the actual data - if the user still uses a password as currently done
  • it's a kind of "two factor authentication" (one thing you need to know: your password; one thing you need to have [on your computer]: the key pair)

In any case I'd argue:

  • to still provide the current model with a symmetric encryption (where the password is needed) and to
  • only use asymmetric encryption to build this "around" the symmetric core

¹ Note that current symmetric encryption is considered secure against attacks by quantum computers, provided the key is long enough (e.g. 256 bits).

@enkore
Contributor

enkore commented Nov 11, 2016

Signing archives and using asymmetric key derivation to encrypt archives are imho different topics.

Signing is relatively simple to bolt on externally (or internally) by signing the tip of the hash tree, i.e. the archive ID. This is pretty much what git and other software do (try git cat-file -p <signed commit>), but it is rather expensive to fully verify, as opposed to only checking that the signature is correct for the archive ID.
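
To make "the tip of the hash tree" concrete, here is a small sketch with a two-level tree built from plain SHA-256. It is an illustration only and not Borg's actual archive format; `chunk_id` and `archive_id` are hypothetical helpers.

```python
import hashlib


def chunk_id(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def archive_id(chunks: list, metadata: bytes = b"") -> str:
    """Hash over the metadata and the ordered chunk IDs (the tip of the tree)."""
    tip = hashlib.sha256()
    # Length-prefix the metadata so it cannot blend into the chunk IDs.
    tip.update(len(metadata).to_bytes(8, "big") + metadata)
    for data in chunks:
        tip.update(chunk_id(data))
    return tip.hexdigest()
```

A detached signature over the short `archive_id` string (for example with `gpg --detach-sign`) would then cover every chunk indirectly. Checking the signature itself is cheap; confirming that the stored data still matches the signed ID means re-reading and re-hashing every chunk, which is the expensive part mentioned above.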

@rugk
Contributor

rugk commented Nov 12, 2016

Okay, then split signing from encryption, but I'd still find it awesome if we get a hybrid encryption method explained above.

@ghost

ghost commented Nov 12, 2016

It is better to use different keys for signing and encryption, for the reasons given here:
http://security.stackexchange.com/questions/1806/why-should-one-not-use-the-same-asymmetric-key-for-encryption-as-they-do-for-sig

I agree that it is better to split these tasks (but gpg can be used for both of them)...

About gpg encryption:
Just try to take this solution and adapt it to borg backup, so something like
PASSPHRASE="passphrase_for_GPG" borg --encrypt-key 4F8A7D0C

https://www.digitalocean.com/community/tutorials/how-to-use-duplicity-with-gpg-to-securely-automate-backups-on-ubuntu is a good article about Duplicity with GPG encryption. Let's just try to repeat this solution, it's good. gpg is a good standard for this type of task; we can just adapt it and not reinvent the wheel.

@jody-frankowski

@lorddaedra
A gpg passphrase would only be needed when you would need to decrypt (i.e. use the private key) some kind of metadata.
Otherwise gpg can encrypt data fine with only the public key.

IMHO a good solution would be to keep some kind of metadata file locally that borg can use to know what it should back up or not, encrypt the actual backup data and the metadata file asymmetrically, and then send all that to the remote.
That way the whole process could be automatic and only require the private key when restoring.

@rugk
Contributor

rugk commented Nov 12, 2016

@lorddaedra Actually the way you propose using gpg is the one I don't want, because there the backups would be encrypted asymmetrically. So I recommend not using (or requiring) a passphrase for the gpg key and using the unlocked key pair for encryption. For some of the reasons I outlined above, symmetric encryption is better and easier to implement here.
The password you enter should still be used for the symmetric secret you have.

I only want to encrypt the symmetrical secret asymmetrically, so that (faster and from a post-quantum perspective more secure) symmetric encryption is used.

@enkore
Contributor

enkore commented Nov 12, 2016

Assuming use of gpg, both schemes are hybrid encryption, but with your idea the session key would be controlled by Borg, allowing alternate access - like you said, a backup password or something similar, whereas just using gpg would mean that the session key is buried somewhere in the ubercomplex PGP format.

(FTR, if public key crypto is done in Borg, it will always be a hybrid scheme, because doing public key cryptography without hybrid encryption practically always means that you've built an insecure system, aka "craptography".)

@ghost

ghost commented Nov 13, 2016

some thoughts...

In my opinion, we can't choose one best solution that will work great for all cases, so it is better to try to make all components customizable. [If a password is used somewhere, I would ideally like to choose between bcrypt, Argon2 or something else / if we use HMAC, I would like to choose SHA-256 or SHA-512 / if we use AES, I would like to choose AES-128 or AES-256, etc.]

It's also a good idea to look at other projects and check how they solve the same problems. For example, I suggest looking at https://www.tarsnap.com/crypto.html, a project very similar to borg...

(And an article with some information about crypto from the author of that project, http://www.daemonology.net/blog/2009-06-11-cryptographic-right-answers.html, which may be a little outdated. Read the last comments too, about GCM vs CTR + HMAC.) Another interesting page: http://security.stackexchange.com/questions/63132/when-to-use-hmac-alongside-aes

We can use https://cryptography.io/en/latest/ to encrypt with AES 256 GCM (which internally uses CTR afaik) or AES 256 CTR + HMAC ...

@enkore
Contributor

enkore commented Nov 13, 2016

(Please try to keep discussion approx on topic - borg has an existing framework for symmetric/secret key encryption, and there are other tickets, linked via #1044, for discussion about that - thanks :)

There's nothing wrong in what Colin Percival says; in fact these were all very good recommendations for the time, and mostly still are today (the very specific RSA-based algorithms he recommends are as far as we know secure, but can be tricky to implement correctly). Today I would not recommend anything RSA for new crypto systems, there are better alternatives that have none of the pitfalls and are designed to enable secure implementations by default. (Well and SHA-3 turned out to be infeasible for general purpose applications, and downright bad for password derivation -- but that was impossible to predict in 2009.)

Whether there will be actually any public key crypto code to write in Borg remains to be seen however; if gpg would handle it, then we wouldn't have anything to do with that.

@intelfx
Contributor

intelfx commented Feb 16, 2017

@enkore

Well and SHA-3 turned out to be infeasible for general purpose applications, and downright bad for password derivation

Could you please elaborate? Never heard of this.

@rugk
Contributor

rugk commented Feb 16, 2017

downright bad for password derivation

Please don't. I think SHA-3 is not intended to be used for passwords…
Currently there is also no need to switch. SHA-256 turned out to be quite strong and everyone relies on it, so no concern yet.

@enkore
Contributor

enkore commented Feb 16, 2017

SHA-3 is optimized for hardware performance (even from a numbers perspective -- it's a lot (on some archs drastically) slower than SHA-2 in software), which is the exact opposite of what we want for Borg* and "an even stronger opposite" for password derivation.

* unless hardware-acceleration becomes widespread, which took Intel like 12 years for AES.
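
The software-speed claim is easy to check locally. The following is a rough micro-benchmark sketch using only the Python standard library; `throughput_mib_s` is a hypothetical helper, and the numbers depend heavily on the CPU and on how the underlying library was built (some CPUs accelerate SHA-256 in hardware), so no particular result is implied.

```python
import hashlib
import timeit


def throughput_mib_s(name: str, size: int = 1 << 20, repeats: int = 20) -> float:
    """Return the approximate hashing throughput of `name` in MiB/s."""
    data = bytes(size)  # `size` zero bytes as dummy input
    seconds = timeit.timeit(lambda: hashlib.new(name, data).digest(), number=repeats)
    return (size * repeats) / (1 << 20) / seconds


if __name__ == "__main__":
    for name in ("sha256", "sha512", "sha3_256", "blake2b"):
        print(f"{name:10s} {throughput_mib_s(name):8.1f} MiB/s")
```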

@enkore
Contributor

enkore commented May 24, 2017

This seems unlikely to happen any time soon. There are no plans to implement this and various hard, technical issues preventing implementation at this time (see above). Therefore:

wontfix: No time table or plans to implement. Does not mean that something is rejected for eternity, since arguments pro/contra something can change with time and context.

(from https://github.com/borgbackup/borg/wiki/Project-management-FAQ-RAQ)

@enkore enkore closed this as completed May 24, 2017
@enkore enkore added the wontfix label May 24, 2017
@jcgruenhage

Sad to hear this. I'll see if I can come up with some way to implement this though, since I'd love for a host to be able to back up its stuff without being able to read backups (possibly of other hosts).

@capi

capi commented May 31, 2017

@jcgruenhage Use separate repositories with different SSH keys for the various hosts.

If multiple hosts back up to the same repository, they need to be able to read parts of the archive content, and therefore other hosts' data, for de-duplication. Checking hashes alone is not enough for de-duplication; see the SHA-1 collision with the two same-sized PDF files and what it caused in e.g. Subversion.

@jcgruenhage

@capi But that means I can't use deduplication across hosts either. About the SHA-1 collision: it is still highly improbable for that to happen by accident, and to avoid it, one could "just" use a more recent hashing algorithm (BLAKE2b-512 for example, if you are that worried about collisions), given that SHA-1 is about 22 years old.

I am rather certain borg relies on the hashes right now too, since without that, borg would need to download most of the content from the remote host for each new archive, and large archives with little changes don't take as long as they would need to if that was the case.
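
Hash-based deduplication of the kind described here can be sketched as a content-addressed store that consults only chunk IDs. This is a toy model and makes no claim about Borg's implementation: `ChunkStore` and its attributes are hypothetical, and the keyed BLAKE2b hash is just one possible ID function.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store; IDs are a keyed BLAKE2b hash of the chunk."""

    def __init__(self, id_key: bytes):
        self.id_key = id_key   # secret shared by every client that deduplicates
        self.chunks = {}       # chunk ID -> stored bytes
        self.uploaded = 0      # bytes actually sent to the "repository"

    def chunk_id(self, data: bytes) -> bytes:
        return hashlib.blake2b(data, key=self.id_key, digest_size=32).digest()

    def add(self, data: bytes) -> bytes:
        cid = self.chunk_id(data)
        if cid not in self.chunks:  # ID lookup only, no content is downloaded
            self.chunks[cid] = data
            self.uploaded += len(data)
        return cid
```

In this sketch every client that wants to share deduplication needs the same `id_key`, which is one reason cross-host deduplication and hosts being unable to learn anything about each other's data pull in opposite directions.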
