Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve qvm-backup key derivation/management #971

Closed
andrewdavidwong opened this issue Apr 26, 2015 · 150 comments
Closed

Improve qvm-backup key derivation/management #971

andrewdavidwong opened this issue Apr 26, 2015 · 150 comments
Labels
C: core cryptography This issue pertains to cryptography. help wanted This issue will probably not get done in a timely fashion without help from community contributors. P: major Priority: major. Between "default" and "critical" in severity. release notes This issue should be mentioned in the release notes. security This issue pertains to the security of Qubes OS.
Milestone

Comments

@andrewdavidwong
Copy link
Member

See:
https://groups.google.com/d/msg/qubes-devel/CZ7WRwLXcnk/u_rZPoVxL5IJ

@marmarek marmarek added this to the Release 3.0 milestone May 3, 2015
@marmarek marmarek modified the milestones: Release 3.1, Release 3.0 May 13, 2015
@marmarek marmarek added the help wanted This issue will probably not get done in a timely fashion without help from community contributors. label Jun 15, 2015
@marmarek
Copy link
Member

We need someone who know cryptography to check discussed key derivation scheme. Especially answer for questions in this message: https://groups.google.com/d/msg/qubes-devel/CZ7WRwLXcnk/1N0sYf6lVvUJ

@andrewdavidwong
Copy link
Member Author

We may never get the assistance of a cryptographer on this. For now, we should at least pass the -md option to openssl and specify sha256 (to match the key length of aes256). In other words:

openssl enc -md sha256 -e -aes-256 [...]

This would be better than nothing. Currently, md5 is used by default, which is actually weakening security for users who select passphrases larger than 128 bits (probably almost all Qubes users).

@marmarek
Copy link
Member

On Thu, Oct 22, 2015 at 02:54:10AM -0700, Axon wrote:

We may never get the assistance of a cryptographer on this. For now, we should at least pass the -md option to openssl and specify sha256 (to match the key length of aes256). In other words:

openssl enc -md sha256 -e -aes-256 [...]

This would be better than nothing. Currently, md5 is used by default, which is actually weakening security for users who select passphrases larger than 128 bits (probably almost all Qubes users).

As the user can (theoretically) freely choose encryption algorithm, just
sha256 may not be an universal option. Do you know any way to get key
length from openssl cipher name? Or just search for some numbers in
it?...

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

@andrewdavidwong
Copy link
Member Author

As the user can (theoretically) freely choose encryption algorithm, just sha256 may not be an universal option.

Two ideas:

  1. Set the default to sha256, but give the user the option to change it. If the user wants to choose a different encryption algorithm, then it's up to her to also choose an appropriate message digest algorithm to accompany it.
  2. Hard code sha512 for everything, then let each encryption algorithm take as many truncated bits as it can (assuming that would work).

Do you know any way to get key length from openssl cipher name? Or just search for some numbers in it?...

I'm not aware of any. That would be my guess too...

@andrewdavidwong
Copy link
Member Author

Based on this and after further research, I think it would be best to go with GPG for encryption (but not for integrity-verification).

Ideally, we would read from the user-defined preferences in ~/.gnupg/gpg.conf in dom0. This would allow users to set the following options (among others):

--s2k-cipher-algo name
      Use name as the cipher algorithm used to protect secret keys.  The default cipher
      is CAST5. This cipher is also used for  conventional  encryption  if  --personal-
      cipher-preferences and --cipher-algo is not given.

--s2k-digest-algo name
      Use  name  as  the  digest algorithm used to mangle the passphrases.  The default
      algorithm is SHA-1.

--s2k-mode n
      Selects how passphrases are mangled. If n is 0 a plain passphrase (which  is  not
      recommended)  will  be  used,  a  1  adds  a  salt to the passphrase and a 3 (the
      default) iterates the whole process a number of times (see --s2k-count).   Unless
      --rfc1991 is used, this mode is also used for conventional encryption.

--s2k-count n
      Specify how many times the passphrase mangling is repeated.  This value may range
      between 1024 and 65011712 inclusive.  The default  is  inquired  from  gpg-agent.
      Note  that  not all values in the 1024-65011712 range are legal and if an illegal
      value is selected, GnuPG will round up to the nearest legal value.   This  option
      is only meaningful if --s2k-mode is 3.

Recommended defaults:

gpg --encrypt --symmetric --s2k-cipher-algo AES256 --s2k-digest-algo SHA512 --s2k-mode 3 --s2k-count 65011712 --compress-algo bzip2

@marmarek
Copy link
Member

This seems to be the best option.
Some minor questions/improvements/thoughts:

  1. Should be no --encrypt as it is for asymmetric encryption
  2. We currently have compression handled independently of encryption, I think we should keep that and use --compress-algo none. Our current approach allows to specify any "filter" program to do the actual compression. Personally I like pigz (parallel gzip) for much better performance on 8-core system :)
  3. Generally I'm fine with --s2k-count 65011712; on my system it needs 1.00s to encrypt just "test" string (so mostly key derivation benchmark), but it may be somehow longer on older systems; do we care? On Thinkpad T61 (Core 2 Duo T7300) - 1.42s. Ok, not bad at all.
  4. What about integrity protection (openssl)? is it a problem that the passphrase is used directly here (at least I think it is)? If yes and if we want to change that, it would require some more work to make it compatible with older formats, because integrity protection of backup header (only after its verification we can get some properties like backup format version, encryption algorithm etc).
  5. Backup data will be integrity-protected twice - once by openssl, and additionally by gpg. I don't think it is a problem, but I'm not a cryptographer.
  6. I don't think we should parse ~/.gnupg/gpg.conf. Either let use the defaults (possibly changed by user in that file), or override some option.

@adrelanos
Copy link
Member

As for most secure gpg settings, check this out.

Quote https://github.com/Whonix/anon-gpg-tweaks

"Security and privacy enhancements for gnupg's config file for user
"user" in /home/user/.gnupg/gpg.conf." See also:
https://raw.github.com/ioerror/torbirdy/master/gpg.conf and
ioerror/torbirdy#11.

  1. Backup data will be integrity-protected twice - once by openssl, and additionally by gpg. I don't think it is a problem, but I'm not a cryptographer.

Not a bad thought. In past there were "known plaintext attacks", where
it was recommended against to encrypt known plaintext as this would
weaken the ciphertext. Any serious algorithm nowadays must pass this.

Normally it should not. If it did, it would be a critical cipher weakness.

I am just worried of the extra complexity of double encryption wrt to
manual backup recovery. But if you think it's fine, go for it.

  1. I don't think we should parse ~/.gnupg/gpg.conf. Either let use the defaults (possibly changed by user in that file), or override some option.

Yes, programs should not rely on user settings in ~/.gnupg/gpg.conf
indeed. Configurable options is a pony.

gpg supports --no-options (skip ~/.gnupg/gpg.conf) and --homedir.

By the way, in Qubes Q3 (and earlier) you cannot continue to backup if
you uncheck the encryption box. [I personally use full disk encryption
for all my disks, so the encryption of QVMM is nice, but I prefer to
disable it to ease manual recovery [in case the backup is ever damaged,
version conflict or w/e).]

@marmarek
Copy link
Member

  1. Backup data will be integrity-protected twice - once by openssl, and additionally by gpg. I don't think it is a problem, but I'm not a cryptographer.

Not a bad thought. In past there were "known plaintext attacks", where
it was recommended against to encrypt known plaintext as this would
weaken the ciphertext. Any serious algorithm nowadays must pass this.

Normally it should not. If it did, it would be a critical cipher weakness.

I am just worried of the extra complexity of double encryption wrt to
manual backup recovery. But if you think it's fine, go for it.

Not encryption, just integrity protection - (.hmac files).

By the way, in Qubes Q3 (and earlier) you cannot continue to backup if
you uncheck the encryption box. [I personally use full disk encryption
for all my disks, so the encryption of QVMM is nice, but I prefer to
disable it to ease manual recovery [in case the backup is ever damaged,
version conflict or w/e).]

Can you create separate ticket for it? I was not aware of this
problem...

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

@andrewdavidwong
Copy link
Member Author

@marmarek:

  1. Should be no --encrypt as it is for asymmetric encryption

Oops. Right.

  1. We currently have compression handled independently of encryption, I think we should keep that and use --compress-algo none. Our current approach allows to specify any "filter" program to do the actual compression. Personally I like pigz (parallel gzip) for much better performance on 8-core system :)

Sounds good to me. :)

  1. Generally I'm fine with --s2k-count 65011712; on my system it needs 1.00s to encrypt just "test" string (so mostly key derivation benchmark), but it may be somehow longer on older systems; do we care? On Thinkpad T61 (Core 2 Duo T7300) - 1.42s. Ok, not bad at all.

I suggested 65011712 as the default because it's the maximum. I think we should assume Qubes users want the "most secure" options by default. If they value backward compatibility more, they can override the defaults. (The security gain of the "most secure defaults" strategy is in some cases debatable, but the choice of defaults acts as an important reputational signal for a security-oriented OS.)

  1. What about integrity protection (openssl)? is it a problem that the passphrase is used directly here (at least I think it is)?

Yes, this is the other part of the problem.

If yes and if we want to change that, it would require some more work to make it compatible with older formats, because integrity protection of backup header (only after its verification we can get some properties like backup format version, encryption algorithm etc).

True...

  1. Backup data will be integrity-protected twice - once by openssl, and additionally by gpg. I don't think it is a problem, but I'm not a cryptographer.

IANAC either. I take it integrity-protection from gpg can't be turned off?

  1. I don't think we should parse ~/.gnupg/gpg.conf.

The only reason I suggested parsing ~/.gnupg/gpg.conf is because that seemed like the most logical place for the user to put their custom settings, but certainly if it's dangerous to parse it (or undesirable for another reason), then let's not do it. I really only meant to suggest that users should be able to specify their own options for s2k-cipher-algo, s2k-digest-algo, s2k-mode, and s2k-count (and any others that should be available).

Either let use the defaults

Sorry, I don't understand. Can you rephrase?

(possibly changed by user in that file),

Which file?

or override some option.

Would it be something like this?

qvm-backup --encrypt --s2k-count 1024 [...]

@andrewdavidwong
Copy link
Member Author

@adrelanos:

Yes, programs should not rely on user settings in ~/.gnupg/gpg.conf indeed.

I'm glad you and @marmarek pointed this out to me. As mentioned in my previous message, my suggestion about reading from ~/.gnupg/gpg.conf was really just an expression of my desire for users to be able to choose their own custom settings for gpg to use.

However, now I'm curious: Why is it such a bad idea for programs to read from/rely on ~/.gnupg/gpg.conf?

Configurable options is a pony.

What does this mean?

@marmarek
Copy link
Member

On Fri, Oct 23, 2015 at 07:16:42PM -0700, Axon wrote:

  1. Backup data will be integrity-protected twice - once by openssl, and additionally by gpg. I don't think it is a problem, but I'm not a cryptographer.

IANAC either. I take it integrity-protection from gpg can't be turned off?

Yes, I think it can't be disabled.

  1. I don't think we should parse ~/.gnupg/gpg.conf.

The only reason I suggested parsing ~/.gnupg/gpg.conf is because that seemed like the most logical place for the user to put their custom settings, but certainly if it's dangerous to parse it (or undesirable for another reason), then let's not do it. I really only meant to suggest that users should be able to specify their own options for s2k-cipher-algo, s2k-digest-algo, s2k-mode, and s2k-count (and any others that should be available).

Either let use the defaults

Sorry, I don't understand. Can you rephrase?

(possibly changed by user in that file),

Which file?

qvm-backup (one program) should not read ~/.gnupg/gpg.conf
(configuration of another program). We have no control over it, syntax
can change, its location may change, some other options may be
introduced (including another file?) etc.

So for each option we should either set it explicitly on command line
(always), or not set at all. Do not try to guess whether user set some
custom value in gpg.conf or not.

or override some option.

Would it be something like this?

qvm-backup --encrypt --s2k-count 1024 [...]

Yes.

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

@adrelanos
Copy link
Member

Axon:

@adrelanos:

Yes, programs should not rely on user settings in ~/.gnupg/gpg.conf indeed.

I'm glad you and @marmarek pointed this out to me. As mentioned in my previous message, my suggestion about reading from ~/.gnupg/gpg.conf was really just an expression of my desire for users to be able to choose their own custom settings for gpg to use.

However, now I'm curious: Why is it such a bad idea for programs to read from/rely on ~/.gnupg/gpg.conf?

It's not super bad, not a huge issue, just bad practice. Because that
file is for user settings. Not arbitrary tools. Avoiding conflicts with
whatever the user did. It's hard to modify a user config file.
Specifically if there is no .d folder and especially in the home
folder. By avoiding it, it gets easier to configure it tailed for the
tool in question.

Configurable options is a pony.

What does this mean?

An optional, non-essential, nice to have, thing if someone contributes it.

@adrelanos
Copy link
Member

Axon:

integrity-protection from gpg can't be turned off?

I don't know if it should be turned off (why?) but it can be configured:

--force-mdc / --disable-mdc

@andrewdavidwong
Copy link
Member Author

qvm-backup (one program) should not read ~/.gnupg/gpg.conf (configuration of another program). We have no control over it, syntax can change, its location may change, some other options may be introduced (including another file?) etc.

So for each option we should either set it explicitly on command line (always), or not set at all. Do not try to guess whether user set some custom value in gpg.conf or not.

Ok, that makes sense. Thank you for explaining it to me. :)

@andrewdavidwong
Copy link
Member Author

It's not super bad, not a huge issue, just bad practice. Because that file is for user settings. Not arbitrary tools. Avoiding conflicts with whatever the user did. It's hard to modify a user config file. Specifically if there is no .d folder and especially in the home folder. By avoiding it, it gets easier to configure it tailed for the tool in question.

Right, makes sense.

An optional, non-essential, nice to have, thing if someone contributes it.

Oh, I see. Thanks. :)

I don't know if it should be turned off (why?) but it can be configured:[...]

Oh, ok. I guess the only reason to turn it off would be if we're worried about double integrity-protection causing problems, but it sounds like that's not really a concern.

@pqg
Copy link

pqg commented Oct 25, 2015

FWIW, I wonder whether the minimum effort variation would be to apply the key stretching just once, to encrypt a random master key.

e.g.:

  1. A random Master Key, MK, of 512 bits is generated.
  2. This key is recorded in a file in the archive, say ./master-key, encrypted under the user's passphrase, using a key stretching method of some variety (I'll get back to the details of such later).
  3. The encryption key EK is derived as EK = HMAC_SHA512(MK, "qubes_backup_encryption_key")
  4. EK, truncated to 256 bits, is used to encrypt files in the current manner, but it is supplied to openssl via the parameter -K (i.e. as a raw key); I suspect this also means that the IV will have to be generated by the backup program and be manually fed in to openssl via -iv.
  5. The Authentication Master Key AMK is derived as AMK = HMAC_SHA512(MK, "qubes_backup_authentication_key")
  6. For each file, an Authentication Key AK_<filepath> is derived as AK_<filepath> = HMAC_SHA512(AMK, "<filepath>")
  7. Each file has it's HMAC taken and stored in an adjacent <blah>.hmac file, in the current manner.

Using a unique key for each file's MAC resolves a possible issue where an attacker could swap files around in the archive, potentially migrating an infection from a lower-security VM to a higher-security VM.

This also avoids having an attacker brute-force the user's passphrase against one/any of the HMACs.

A NIST-specified KDF could be used on the Master Key to derive the other keys instead, but none lend themselves to being specified in an openssl one-liner. Also, 2 of the 3 algos in NIST 800-108 more or less reduce to the above HMAC invocations if we don't need subkeys longer than the HMAC's block size.

Alas, this proposal requires more operations before verifying the ./backup-header file, which is bad both for code exposure concerns and backwards compatibility. However, preserving the ./backup-header.hmac in its current form will leave a method by which, as mentioned, an attacker can bypass the key stretching and brute-force the bare passphrase (HashCat has an "HMAC-SHA256 (key = $pass)" mode already that might be able to be dropped in to serve such a purpose, though I've not tested it).

I've not looked at the code yet, but if GnuPG does not do /too much/ parsing, maybe it's still okay to use it for key stretching?

There was actually a competition held recently to pick a good key stretching ("password hashing") algorithm. The winner was an entrant called Argon2, which beat out other entrants including yescrypt, which is an improved scrypt variant. Note that a key issue with scrypt is that it allows time/memory trade-offs to an extent that's undesirable in an optimal key stretching algorithm, though this issue can be mitigated by tuning the scrypt invocation to require sufficiently large memory. And all of these key stretching algorithms wipe the floor with GnuPG's S2K Mode 3, which is just an iterated hash, and therefore not memory-hard at all (i.e. GPUs and ASICs can attack it much faster than the CPUs that will be computing it under normal circumstances).

Of course, the problem with exotic key stretching algos is that it's hard to use them without making manual backup recovery dependent on non-ubiquitous software. And a million iterations of SHA512 is much better than no stretching at all.

I'll look into this a bit more later, and report back with any developments :)

@marmarek
Copy link
Member

On Sun, Oct 25, 2015 at 10:13:47AM -0700, pqg wrote:

FWIW, I wonder whether the minimum effort variation would be to apply the key stretching just once, to encrypt a random master key.

Generally we want to avoid doing to much low-level crypto handling,
because we are not cryptographers.

Using a unique key for each file's MAC resolves a possible issue where an attacker could swap files around in the archive, potentially migrating an infection from a lower-security VM to a higher-security VM.

This isn't a problem, because paths are stored inside of archive(s). Outer layer is only for convenience (streaming the backup while creating it, partial restore). Take a look at point 7 here:
https://www.qubes-os.org/en/doc/backup-emergency-restore-v3/

Alas, this proposal requires more operations before verifying the ./backup-header file, which is bad both for code exposure concerns and backwards compatibility. However, preserving the ./backup-header.hmac in its current form will leave a method by which, as mentioned, an attacker can bypass the key stretching and brute-force the bare passphrase (HashCat has an "HMAC-SHA256 (key = $pass)" mode already that might be able to be dropped in to serve such a purpose, though I've not tested it).

Yes, IMO this is currently the main concern. When we get to the ./backup-header, we can have KDF paramenters, salt, iteration count, or any other configuration needed for restoring the rest of the backup. But before, we need something working without such knowledge...

I've not looked at the code yet, but if GnuPG does not do /too much/ parsing, maybe it's still okay to use it for key stretching?

I think we previously concluded, that we don't want to expose gpg to any unverified input.

Of course, the problem with exotic key stretching algos is that it's hard to use them without making manual backup recovery dependent on non-ubiquitous software. And a million iterations of SHA512 is much better than no stretching at all.

And also using exotic algorithms isn't a good idea, because those didn't received that much review/analysis/tests.

But generally IMHO scrypt as a KDF (exactly the thing it was designed for, right?) should be ok. If we'd need to employ KDF ourself at all...

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

@andrewdavidwong
Copy link
Member Author

There was actually a competition held recently to pick a good key stretching ("password hashing") algorithm. The winner was an entrant called Argon2, which beat out other entrants including yescrypt, which is an improved scrypt variant. Note that a key issue with scrypt is that it allows time/memory trade-offs to an extent that's undesirable in an optimal key stretching algorithm, though this issue can be mitigated by tuning the scrypt invocation to require sufficiently large memory. And all of these key stretching algorithms wipe the floor with GnuPG's S2K Mode 3, which is just an iterated hash, and therefore not memory-hard at all (i.e. GPUs and ASICs can attack it much faster than the CPUs that will be computing it under normal circumstances).

Of course, the problem with exotic key stretching algos is that it's hard to use them without making manual backup recovery dependent on non-ubiquitous software. And a million iterations of SHA512 is much better than no stretching at all.

Exactly. I'm skeptical that the practical security benefits of using a newer KDF will outweigh the benefits of being able to do manual backup recovery in a variety of disaster scenarios. Security-concious users (which almost by definition includes all Qubes users) ought to be using reasonably long, high-entropy passphrases (I'll refer to these as "good" passphrases for the sake of brevity).

The current system (in which openssl applies a single round of md5) punishes users who pick good passphrases by capping maximum entropy at a measly 128 bits. We should fix that. However, my understanding (correct me if I'm wrong) is that S2K or PBKDF2, applied to good passphrases, will be uncrackable for the foreseeable long-term future (i.e., even if we assume many quadrillions of guesses per second, it would still take at least millions of years). If that's correct, then it seems to me that we should instead select the most trusted, tested, ubiquitous tools available. My backup is no good to me if I can't recover the data because the only software available to me is whatever's on an old Ubuntu disc, for example. (We can't predict in advance what kinds of emergency scenarios Qubes users might find themselves in, and under what conditions they might need manual access to their data.)

@andrewdavidwong
Copy link
Member Author

Having said all that, if scrypt provides exactly what we're looking for, then it may be worthwhile to go with that. (And advise users to store a copy of scrypt with their backups, or something.) If anyone can get this right, it's Colin Percival.

@mfc
Copy link
Member

mfc commented Oct 26, 2015

we should add a crypto tag to this (and any other similar issues) and I will create a thread with Cure53 (or NCC / iSec Partners) about this. They can provide free crypto advice to OTF-funded projects.

@marmarek marmarek added the cryptography This issue pertains to cryptography. label Oct 26, 2015
@marmarek
Copy link
Member

Great idea!

@pqg
Copy link

pqg commented Oct 26, 2015

@axon-qubes

I concur with your position that, if at all possible, the data should be recoverable with an "old Ubuntu disc".

The scrypt binary and python-scrypt module don't seem to be currently packaged with Fedora anyway (they are available on Debian Jessie; I've not checked elsewhere). And keeping scrypt alongside the backup would be problematic if we need to invoke scrypt /before/ we can verify the integrity of anything.

As an aside, I think we should distinguish between 2 uses of "KDF":

  1. A means by which to derive subkeys of arbitrary length from a master key of fixed length.
  2. A means by which to perform "key stretching" (and salting) so as to make brute force attacks more costly on user-supplied keying material.

For instance, scrypt could provide function 2, but not function 1. Function 1 is served by things like NIST 800-108, which in the special case of all subkeys being of equal or shorter length than the underlying hash function, reduces to HMAC(MK, "\x01<subkey_purpose_specifier_string>\x00\x20"), where the \x01 and \x00 are no longer serving a particular purpose in this reduced form, and \x20 is the length of the resulting subkey. More conservative KDFs (purpose 1) use a single keystream that feeds back the result of the previous HMAC invocations or, more conservative still, they create one feedback pipeline whose output is passed to a second HMAC invocation to produce the actual keystream (TLSv1.2 does this, see section 5). However, all of these are prohibitively awkward in the "old Ubuntu disc" scenario. By contrast:

echo -n qubes_backup_authentication_key | openssl dgst -sha256 -hmac <Hex(Master Key)>

...should work almost anywhere, yield the desired different keys for different purposes, and be just as secure as long as SHA-256 is a strong hash function. (The fancier derivation methods are just meant to give some additional protection if it should turn out that the underlying hash function is weaker than expected.) Though note that the Master Key (randomly generated or the result of scrypt or some other key stretching) will be a binary blob, which will be tough to pass via -hmac unless encoded by hex (as above) or base64, which is kind of unsightly protocol-wise. On OpenSSL >= v1.0.0 (Ubuntu since at least 2011) the following invocation could be used instead to pass the master key hex-encoded:

echo -n qubes_backup_authentication_key | openssl dgst -sha256 -mac hmac -macopt hexkey:<Hex(Master Key)>

So that the output is actually HMAC_SHA256(MK, "qubes_backup_authentication_key") rather than HMAC_SHA256(Hex(MK), "qubes_backup_authentication_key") as it would have to be with -hmac.

However, my understanding (correct me if I'm wrong) is that S2K or PBKDF2, applied to good passphrases, will be uncrackable for the foreseeable long-term future (i.e., even if we assume many quadrillions of guesses per second, it would still take at least millions of years).

With no key stretching, e.g. just take the SHA-256 of the passphrase, an attacker capable of 10 quadrillion tries a second for 10 million years would exhaust an ~101-bit keyspace. This represents approximately an 8-word diceware passphrase (~103 bits). With ~1million rounds of key stretching, as in S2K Mode 3 at max settings, ~81 bits of input keyspace could be searched, which requires a diceware passphrase of 7 words (~90 bits) though you would get pretty close with 6 words (~78 bits).

A GPU running hashcat can currently trial on the order of 2 billion SHA-256 sums per second (2*10^9/sec) so 10^16/sec represents a powerful adversary indeed. 10 million years is of course just providing a robust margin of error, since in 30 years the crypto will be completely obsolete, and in 100 years we'll all be dead.

So 128 bits is still enough for anyone. Unless your threat model includes the arrival of quantum computers in the very near term, in which case you should double all your symmetric crypto key length requirements.

That said, as you originally advanced, even if nothing else changes from this discussion, -md sha256 should be added to the openssl enc invocations. It's much more conservative.

None of this, of course, helps with the MACs, where we really want to salt and stretch the user's passphrase before doing anything, even if the same resulting key is used for both verification and encryption.

Adding a salt not only avoids precomputaion attacks, but binds the files to the particular backup. While the renaming attack I previously feared is indeed impossible, I don't /think/ that there's currently any protection against shuffling in an identically-named VM from a different backup, as long as both backups were made with the same passphrase.

However, there then needs to be some way to store the salt and, if variable, any key stretching parameters, and load them before verification. Would it be an acceptable risk to parse just the key=values out of the ./backup-header and interpret those necessary to establish the key, then check the ./backup-header.hmac, and finally interpret the rest of the values (e.g. if, in the future, a timestamp is added to the header, it would be good to not have to parse that until after verification)? Setting hard line and file length limits might help limit the attack surface.

This certainly presents less attack surface than any way I can think of using GPG to do pre-verification stretching. Though, I'm aware I haven't actually proposed another way to perform key stretching yet. Python has baked-in hashlib.pbkdf2_hmac(), but only since v3.4 and v2.7.8, which are far too recent (mid 2014). I'd hoped it might be possible to use gpg --symmetric on a fixed value, hash the resulting "file", and use that as the key, but I've not been able to find a way to pass a salt and/or IV to GPG, so it unfortunately generates a random one on each invocation. The scrypt binary has the same problem, assuming its availability issues could be overcome.

The bcrypt functionality in python-passlib may be a reasonable bet. I can't guarantee how ubiquitous it is, but it's in Debian >=7 and backports for 6, definitely in Ubuntu >= 12.04, and available on Fedora 21. bcrypt has been fairly heavily popularized and so should common enough; I suspect that a working implementation should be more readily obtainable for our hypothetical frazzled power user than scrypt.

On the other hand, the scrypt binary is pretty easy to build if you've got gcc and autotools. As mentioned, it's only good for encrypting/decrypting files, not plain key stretching, so we'd be back to the ./master-key idea or similar, but it still presents a considerably smaller attack surface than GnuPG.

@marmarek
Copy link
Member

Adding a salt not only avoids precomputaion attacks, but binds the files to the particular backup. While the renaming attack I previously feared is indeed impossible, I don't /think/ that there's currently any protection against shuffling in an identically-named VM from a different backup, as long as both backups were made with the same passphrase.

Yes, such attack probably is possible.

However, there then needs to be some way to store the salt and, if variable, any key stretching parameters, and load them before verification. Would it be an acceptable risk to parse just the key=values out of the ./backup-header and interpret those necessary to establish the key, then check the ./backup-header.hmac, and finally interpret the rest of the values (e.g. if, in the future, a timestamp is added to the header, it would be good to not have to parse that until after verification)? Setting hard line and file length limits might help limit the attack surface.

This might be some option if nothing better comes to us. Those
pre-verification options needs to be clearly marked (some prefix in
name?) and reduced to absolutely minimum. If we'd have to go that way, I
think this should include version field, which would determine most of
parameters. For example it would be bad idea to include hash algorithm
in those pre-authentication set, because the attacker would be able to
replace it with some weak value (vide "cipher_null" in LUKS header[1]).

The bcrypt functionality in python-passlib may be a reasonable bet. I can't guarantee how ubiquitous it is, but it's in Debian >=7 and backports for 6, definitely in Ubuntu >= 12.04, and available on Fedora 21. bcrypt has been fairly heavily popularized and so should common enough; I suspect that a working implementation should be more readily obtainable for our hypothetical frazzled power user than scrypt.

And according to previous comments, bcrypt is also somehow sensible
choice, right?

On the other hand, the scrypt binary is pretty easy to build if you've got gcc and autotools. As mentioned, it's only good for encrypting/decrypting files, not plain key stretching, so we'd be back to the ./master-key idea or similar, but it still presents a considerably smaller attack surface than GnuPG.

I think we can simply have hex-encoded, encrypted master key included in
the backup header.

And as the backup format is going to be more and more complex, maybe
it's a good idea to include emergency restoration script as one of
plain text, integrity protected files (similar to backup-header)? So the
emergency procedure would include manual verification of backup header
and that script, and then simply run the script.

[1]
https://github.com/QubesOS/qubes-secpack/blob/master/QSBs/qsb-019-2015.txt

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

@defuse
Copy link

defuse commented Nov 14, 2016

Heads up that I added a feature to crack Qubes v3.1 (and think, but have not verified v3.2) backup encryption (default settings) to CrackStation: https://bqp.io/qubes-backup-encryption-added-to-crackstation.html I hope by doing this I'll attract some more cryptographers in here making sure the next version is rock solid.

In case anyone wants a short-term workaround, what I've done personally is to encrypt my backup disk with LUKS, and then store my Qubes-encrypted backups on there.

@andrewdavidwong
Copy link
Member Author

andrewdavidwong commented Dec 11, 2016

Heads up that I added a feature to crack Qubes v3.1 (and think, but have not verified v3.2) backup encryption (default settings) to CrackStation: https://bqp.io/qubes-backup-encryption-added-to-crackstation.html

Yes, this applies to 3.2 also.

I hope by doing this I'll attract some more cryptographers in here making sure the next version is rock solid.

Thanks, @defuse.

Related question: Theoretically, what's the fastest way to brute force a Qubes backup under the current scheme? There are at least two ways to go, right?

  1. Keep trying 32-digit hexadecimal strings (i.e., possible outputs of md5(passphrase)) as aes-256-cbc keys until one works. (Assuming that the passphrase has >= 128-bits of entropy.)
  2. Keep computing openssl dgst -sha512 -hmac '<passphrase>' backup-header until one matches backup-header.hmac.

Is 1 theoretically faster because it entails a smaller keyspace? Or is there another, faster way?

The reason I ask is because this would seem to affect estimates of how strong/weak the current system is (i.e., how long it would take to brute force given different values of passphrase entropy).

@jpouellet
Copy link
Contributor

jpouellet commented Dec 11, 2016

  1. Keep trying 32-digit hexadecimal strings (i.e., possible outputs of md5(passphrase)) as aes-256-cbc keys until one works. (Assuming that the passphrase has >= 128-bits of entropy.)

No, 128 bits is an unthinkably large state space to try to brute-force. The set of md5(likely_passphrases) is sparse within that space, so you would definitely want to start with likely passphrase candidates and pass those through md5 to generate candidate aes keys, not try to randomly guess a correct 128-bit value (assumed to be so incredibly unlikely that it is why many standards target the 128-bit effective security level).

The larger problem with this approach though is checking whether or not a resulting passphrase is correct. You then need to use the resulting key to try to decrypt the filesystem, and check if the plaintext looks like a filesystem. The HMAC-cracking approach has the benefit that it is immediately obvious whether or not the result is correct.

  1. Keep computing openssl dgst -sha512 -hmac '' backup-header until one matches backup-header.hmac.

Yes. This.

Is 1 theoretically faster because it entails a smaller keyspace? Or is there another, faster way?

No. The question is not about 128-bits vs 256-bits (both are assumed to be sufficient in crypto literature). The design flaw here is that is the HMAC and encryption key are derived from exactly the same input, and an HMAC construction allows fast (cheap to attack) checking of whether or not a key is correct (because it is designed for cheap message authentication using shared secrets), whereas a strong KDF is designed to be expensive-to-compute (and that expense does not need to be paid by an attacker who can bypass it by computing the HMAC for each candidate passphrase instead).

Disclaimer: I am not a cryptographer, I just like reading and thinking about applied crypto, and occasionally have to crack passwords

@jpouellet
Copy link
Contributor

The reason I ask is because this would seem to affect estimates of how strong/weak the current system is (i.e., how long it would take to brute force given different values of passphrase entropy).

In its current (3.2) state, we essentially have the cheapest KDF possible, so I think the only safe way to use it is to have a passphrase that is already sufficiently high entropy for use directly as a key.

When I used qvm-backup (to transfer VMs from old laptop to new laptop), I used a key from dd if=/dev/random, encrypted the result asymmetrically to a key generated on the destination machine, signed that with a key generated on the origin machine, performed the backup & transfer, then decrypted the used symmetric key on the destination machine.

I think such an asymmetric scheme is really the way to go for various reasons, but I've yet to put my arguments together and formally make a case for it. tl;dr - allows to do away with passphrases and have a safely-stored offline backup-recovery key (which is not needed at backup-creation time), and a backup-origin signing key. This would enable the creation of encrypted and authenticated backups in a hostile environment where entering a backup passphrase would be problematic.

@jpouellet
Copy link
Contributor

tl;dr - allows to do away with passphrases and have a safely-stored offline backup-recovery key

That recovery key could (should) of course be protected with a passphrase itself, but it need not be used (or even known) at backup-creation time.

@andrewdavidwong
Copy link
Member Author

  1. Keep trying 32-digit hexadecimal strings (i.e., possible outputs of md5(passphrase)) as aes-256-cbc keys until one works. (Assuming that the passphrase has >= 128-bits of entropy.)

No, 128 bits is an unthinkably large state space to try to brute-force. The set of md5(likely_passphrases) is sparse within that space, so you would definitely want to start with likely passphrase candidates and pass those through md5 to generate candidate aes keys, not try to randomly guess a correct 128-bit value (assumed to be so incredibly unlikely that it is why many standards target the 128-bit effective security level).

Of course. I didn't say anything about randomly guessing 128-bit values. It goes without saying that you would prioritize the possible outputs of likely passphrases before unlikely ones. (This is, of course, an oversimplification of what sophisticated password crackers actually do.)

The larger problem with this approach though is checking whether or not a resulting passphrase is correct. You then need to use the resulting key to try to decrypt the filesystem, and check if the plaintext looks like a filesystem. The HMAC-cracking approach has the benefit that it is immediately obvious whether or not the result is correct.

That's one thing I was wondering about. I suppose it depends on how much additional time it takes to attempt to decrypt the file with each key.

Is 1 theoretically faster because it entails a smaller keyspace? Or is there another, faster way?

No. The question is not about 128-bits vs 256-bits (both are assumed to be sufficient in crypto literature).

You're missing the point. The question is which one is theoretically faster. The question is not whether both would take a sufficiently long time.

The design flaw here is that is the HMAC and encryption key are derived from exactly the same input, and an HMAC construction allows fast (cheap to attack) checking of whether or not a key is correct (because it is designed for cheap message authentication using shared secrets), whereas a strong KDF is designed to be expensive-to-compute (and that expense does not need to be paid by an attacker who can bypass it by computing the HMAC for each candidate passphrase instead).

I know. You just repeated the issue that I originally reported.

@jpouellet
Copy link
Contributor

I know. You just repeated the issue that I originally reported.

Err, oops. Indeed. This thread has been so long I'd forgotten you were the OP. :) My apologies if you feel I've insulted your intelligence by my explanation.

@defuse
Copy link

defuse commented Dec 11, 2016

Related question: Theoretically, what's the fastest way to brute force a Qubes backup under the current scheme? There are at least two ways to go, right?

Keep trying 32-digit hexadecimal strings (i.e., possible outputs of md5(passphrase)) as aes-256-cbc keys until one works. (Assuming that the passphrase has >= 128-bits of entropy.)
Keep computing openssl dgst -sha512 -hmac '<passphrase>' backup-header until one matches backup-header.hmac.

These are about equivalent, if you ignore the difference in time it takes to compute MD5 then AES vs. HMAC-SHA512.

The reason I ask is because this would seem to affect estimates of how strong/weak the current system is (i.e., how long it would take to brute force given different values of passphrase entropy).

The worst problem with the current system is that the password isn't salted. Without a salt, the attacker can perform their brute-force attack once, saving all the (HMAC, password) pairs they compute into a lookup table, then quickly crack any HMAC in that list with a fast (<1 second) lookup. That's a lot cheaper than having to re-run the brute-force attack for each hash (which using salt ensures). Fixing this changes the asymptotics of the best attack -- O(N) precomputation then O(1) for each hash to crack into no precomputation and O(N) for each hash to crack.

The second problem is that we want to use some sort of slow key derivation function on the password, like Argon2, scrypt, bcrypt, etc, to increase the cost of attacks further.

@marmarek
Copy link
Member

See #971 (comment) I think it all have been covered already (mostly by using scrypt utility).

@marmarek
Copy link
Member

One issue with new backup format is that each chunk (currently 100M) has key derived separately - and it takes time (by design). I think the easiest think to do is to increase chunk size to for example 1GB, to minimize the impact.
Other solution would be to derive key once use it for all the chunks, but it will require more work because:

  • cannot use scrypt tool directly - for example go back to openssl enc/openssl dgst - this time with explicit keys (derived using scrypt KDF) instead of very weak built-in KDF; or find some other tool
  • needs some other way to bind encrypted chunk to its name (currently done by prefixing chunk name to the passphrase feed into scrypt KDF)
    And this will also make the manual restore harder.
    Looking at this long discussion, I don't want to go through it all along again...

@andrewdavidwong
Copy link
Member Author

One issue with new backup format is that each chunk (currently 100M) has key derived separately - and it takes time (by design). I think the easiest think to do is to increase chunk size to for example 1GB, to minimize the impact.

Is this "just" a performance issue, or is it also a security issue? How much of a performance impact are we talking about? Why is the current chunk size 100M in the first place?

Other solution would be to derive key once use it for all the chunks, but it will require more work because: [...]

Both of those options sound undesirable. Are there any downsides to just using a 1GB chunk size?

Looking at this long discussion, I don't want to go through it all along again...

Not sure what you mean here. Why would we have to go through it all again?

@marmarek
Copy link
Member

marmarek commented Jan 13, 2017 via email

@andrewdavidwong
Copy link
Member Author

In that case, I agree with you. Better to keep it simple and just increase the chunk size to 1GB (or some similarly suitable size).

@ptitdoc
Copy link

ptitdoc commented Feb 16, 2017

Keep in mind that some people want to backup things because their disk is almost full. That was the reason of using chunks in the first place.

I feel that 1GB chucks are dangerous in such cases. For instance, if I have 5GB free, the backup should not queue more than 4 chunks of 1GB.
Then if the backup fails because there is not enough free space, the files may not be properly removed, and the user ends up with a broken system.

With lvm thin provisioning, I usually ends up with a read-only device with IO errors because of a full disk, with no opportunity to clear some free space without running a live cd.

@ptitdoc
Copy link

ptitdoc commented Feb 16, 2017

Just to add to the discussion. I worked on an alternative version that still uses openssl enc and dgst while using python hashlib key derivation function:

  • perform key derivation using python PBKDF2 hashlib function
  • add a .salt file to the backup file to support PBKDF2
  • use openssl enc and dgst with the derived key as the KEY without using openssl password derivation mechanisms
  • add a random .iv file in the backup for each chuck
  • backup can still be decrypted manually using a small python script to perform PBKDF2 key derivation, reading IVs and feeding it to the openssl enc or dgst tool.

Problems are:

  • in this case we are playing with python random generator os.urandom and python implementation of PBKDF2 (based on the documentation, it could use openssl implementation)
  • it may still be prone to cryptographic attack because of errors in handling or sizes of IVs ...
  • the raw derived backup key is passed as parameters to openssl, and is present everywhere in the memory.

@marmarek
Copy link
Member

Keep in mind that some people want to backup things because their disk is almost full. That was the reason of using chunks in the first place.

Then tmpfs could be used (so RAM + swap). When using 1GB chunks, maximum number of them should be probably decreased to 5 or so (currently it's 10). The number of chunks in the queue is not really important as long as it isn't 1 - it should be only large enough to allow parallel uploading one chunk while preparing the next one. Maybe a good compromise would be 4x500MB?

@ptitdoc
Copy link

ptitdoc commented Feb 16, 2017

Yes, most Qubes users probably use at least 8GB of RAM, and most VMs currently needs to be stopped to perform backups. So we can hope that there is 4GB of free space in dom0, or maybe we can request that before performing a backup, and warn the user about bad performance if this assertion failed ?

@andrewdavidwong
Copy link
Member Author

If the performance impact is significant (e.g., at least several minutes faster per qvm-backup) on systems with sufficient RAM, then I recommend keeping the larger 1GB chunk size. As @ptitdoc pointed out, most Qubes systems will have at least 8GB. The ones that have less will never be running many VMs anyway, so it won't be hard to shut them down for qvm-backup. Meanwhile, the vast majority of users will benefit from performance gains on a task they do (or should be doing) frequently. Across all Qubes users, this seems like it could easily add up to thousands of hours saved over the lifetime of R4.0.

@marmarek
Copy link
Member

It's about 1s for the sole KDF. Lets do some math: assume 100GB backup, with VM sizes being multiply of 1GB. This gives:

  • 1GB chunk: 1m 60s overhead for KDF
  • 500MB chunk: 3m 20s overhead for KDF
  • 100MB chunk: 16m 40s overhead for KDF

Is ~3.5m bad?

@andrewdavidwong
Copy link
Member Author

Suppose a single user makes weekly backups for a year:

(3.5 minutes * 52 weeks) / 60 minutes = ~3 hours saved per year

Suppose a little under 10% of the estimated Qubes userbase does this:

~3 hours * 2,000 users = ~6,000 hours saved per year

So, even with conservative assumptions, we'd be saving users thousands of hours per year, collectively. And the only cost is that users with only 4GB of RAM may have to shut down their VMs before they run qvm-backup, which they'd probably do anyway? Seems worth it to me.

@marmarek
Copy link
Member

marmarek commented Feb 16, 2017

I don't see why to look at collective time of Qubes userbase. It isn't really observable value for anyone. But it does make sense to look at time spent/saved per year.

@grey-olli
Copy link

grey-olli commented Feb 16, 2017 via email

@jpouellet
Copy link
Contributor

I don't see a fundamental reason why the KDF needs to be recomputed for each chunk... the output of the KDF should be safely cacheable, just as we must cache the plaintext passphrase between chunks today.

Do I understand correctly that the current KDF-recomputation-on-each-chunk is just an artifact of wanting to build the backup system out of guaranteed-to-be-easily-obtainable-in-the-future programs and treat each such program as a black box, and the single scrypt binary being responsible for both computing the KDF and doing the bulk encryption, thus having no clean way to cache the KDF output? Or is there some other reason?

@marmarek
Copy link
Member

Yes, exactly this one alone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: core cryptography This issue pertains to cryptography. help wanted This issue will probably not get done in a timely fashion without help from community contributors. P: major Priority: major. Between "default" and "critical" in severity. release notes This issue should be mentioned in the release notes. security This issue pertains to the security of Qubes OS.
Projects
None yet
Development

No branches or pull requests