
Conversation

@mberhault
Contributor

@mberhault mberhault commented Nov 3, 2017

This is the initial RFC for encryption at rest.

The Security considerations section must be expanded and carefully
examined (anyone know a security expert with free time?)

The Unresolved questions must be resolved (ah!)

The Future improvements must be examined for in/out-of scope
decisions.

@mberhault mberhault requested a review from bdarnell November 3, 2017 20:24
@mberhault mberhault requested a review from a team as a code owner November 3, 2017 20:24

@mberhault
Contributor Author

CCing a few people for specific sections:
@dianasaur323 up to and including User level explanation
@dt for Enterprise enforcement
@mjibson and @arjunravinarayan for Other uses of rocksdb

@madelynnblue
Contributor

Review status: 0 of 1 files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


docs/RFCS/20171101_encryption_at_rest.md, line 557 at r1 (raw file):

existing data and re-create the rocksdb instance with preamble support.

In the case of multiple stores with different encryption settings, we must pick one.

Does it make sense to dynamically generate these keys in memory on each start? There's no need to persist them since the old data doesn't need to be read. That would remove some user burden.


Comments from Reviewable

@mberhault
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


docs/RFCS/20171101_encryption_at_rest.md, line 557 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

Does it make sense to dynamically generate these keys in memory on each start? There's no need to persist them since the old data doesn't need to be read. That would remove some user burden.

As long as we don't need to persist the data through node restarts, that's an option. But if we already have keys we should be ok. As long as key reuse is acceptable (eg: not on the same disk as the keys).

At this point I'm more concerned about finding all instances of local disk outside of the normal store rocksdb instance. Could you point me to some code?


Comments from Reviewable

@dt
Contributor

dt commented Nov 3, 2017

Review status: 0 of 1 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


docs/RFCS/20171101_encryption_at_rest.md, line 718 at r1 (raw file):

### Enterprise feature gating

The current proposal does gate encryption on a valid license due to the fact that we cannot check the license

s/does/does not/ ?


docs/RFCS/20171101_encryption_at_rest.md, line 727 at r1 (raw file):

* the license can be passed through `init`

This would still cause issues when removing the license (or errors loading/validating the license).

I assume we'd only check this during startup -- a running/happy node wouldn't suddenly stop serving non-system tables if it had been running just because the license was removed.


Comments from Reviewable

@mberhault
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 3 unresolved discussions.


docs/RFCS/20171101_encryption_at_rest.md, line 718 at r1 (raw file):

Previously, dt (David Taylor) wrote…

s/does/does not/ ?

oops. Done.


docs/RFCS/20171101_encryption_at_rest.md, line 727 at r1 (raw file):

Previously, dt (David Taylor) wrote…

I assume we'd only check this during startup -- a running/happy node wouldn't suddenly stop serving non-system tables if it had been running just because the license was removed.

Right, but startup can have its own errors. If you restart your entire cluster and cannot load the license, you would have the restrictions applied. I didn't define errors, but anything that may cause a wrong license check would be a problem.


Comments from Reviewable

@madelynnblue
Contributor

Review status: 0 of 1 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


docs/RFCS/20171101_encryption_at_rest.md, line 557 at r1 (raw file):

Previously, mberhault (marc) wrote…

As long as we don't need to persist the data through node restarts, that's an option. But if we already have keys we should be ok. As long as key reuse is acceptable (eg: not on the same disk as the keys).

At this point I'm more concerned about finding all instances of local disk outside of the normal store rocksdb instance. Could you point me to some code?

We already delete this directory on shutdown/startup (or maybe just one of those?), so that is fine. Considering that this directory is now configurable to be outside of a store, using randomized keys seems like it would be nice because users wouldn't have yet another disk/key thing to worry about. The only temp engine I'm aware of is

tempEngine, err := engine.NewTempEngine(s.cfg.TempStorageConfig)
which is used by both distsql and import.


Comments from Reviewable

@mberhault
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


docs/RFCS/20171101_encryption_at_rest.md, line 557 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

We already delete this directory on shutdown/startup (or maybe just one of those?), so that is fine. Considering that this directory is now configurable to be outside of a store, using randomized keys seems like it would be nice because users wouldn't have yet another disk/key thing to worry about. The only temp engine I'm aware of is

tempEngine, err := engine.NewTempEngine(s.cfg.TempStorageConfig)
which is used by both distsql and import.

Thanks, that should be easy. You're probably right about temporary keys, they would provide a nicely isolated way of doing it. I've added it to the RFC.


Comments from Reviewable

@knz
Contributor

knz commented Nov 3, 2017

Reviewed 1 of 1 files at r2, 1 of 1 files at r4.
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.


Comments from Reviewable

@petermattis
Collaborator

:lgtm:

Seems very well thought out, though someone with more security chops should scrutinize.


Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


docs/RFCS/20171101_encryption_at_rest.md, line 748 at r4 (raw file):

Some files (eg: backups) generated by rocksdb are not included in the "Live files" or "WAL files".
Those files will still be encrypted with the data keys but will not be included in the key-usage computation,

You mention "key usage computation" here and elsewhere, but I didn't find where you define what that is and how the computation is performed.


docs/RFCS/20171101_encryption_at_rest.md, line 796 at r4 (raw file):

### Garbage collection of old data keys

We would prefer not to keep old data keys forever, but we need to be certain that a key is no longer in use

Seems better to keep old data keys around for a long time than to accidentally delete them and possibly a lot of data. Even if we can accurately detect which data keys are in use, I'd suggest keeping an "old data keys" file that would be encrypted with the store key.


Comments from Reviewable

@mberhault
Contributor Author

I agree. This is a fairly standard way of doing this but I'd like someone to take a look anyway. At the very least, we need to make sure we're aware of the security assumptions we're making (some are mentioned, but most likely not all) and flesh out the recommended configuration for users to use this safely.


Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


docs/RFCS/20171101_encryption_at_rest.md, line 748 at r4 (raw file):

Previously, petermattis (Peter Mattis) wrote…

You mention "key usage computation" here and elsewhere, but I didn't find where you define what that is and how the computation is performed.

Sorry, the subsection for that is titled Reporting encryption status. This boils down to computing number/size of files encrypted per key/cipher. I'll tweak the naming throughout the doc to be more consistent.
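Roughly, the computation amounts to something like this (purely illustrative sketch in Go; the entry type and names are made up, not the actual implementation — the real input would be whatever file registry the encrypted Env maintains):

```go
package main

import "fmt"

// fileEntry is a hypothetical record describing one rocksdb file as seen by the
// encrypted Env: which key ID and cipher it was written with, and its size.
type fileEntry struct {
	keyID  string
	cipher string
	bytes  int64
}

// keyUsage is the per-key/cipher summary: number of files and total bytes.
type keyUsage struct {
	files int64
	bytes int64
}

// computeKeyUsage aggregates encryption usage per (cipher, key ID) pair.
func computeKeyUsage(entries []fileEntry) map[string]keyUsage {
	usage := make(map[string]keyUsage)
	for _, e := range entries {
		k := e.cipher + "/" + e.keyID
		u := usage[k]
		u.files++
		u.bytes += e.bytes
		usage[k] = u
	}
	return usage
}

func main() {
	entries := []fileEntry{
		{keyID: "2", cipher: "AES256-CTR", bytes: 128 << 20},
		{keyID: "1", cipher: "AES256-CTR", bytes: 64 << 20},
		{keyID: "", cipher: "PLAIN", bytes: 32 << 20},
	}
	for k, u := range computeKeyUsage(entries) {
		fmt.Printf("%s: %d files, %d bytes\n", k, u.files, u.bytes)
	}
}
```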


docs/RFCS/20171101_encryption_at_rest.md, line 796 at r4 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Seems better to keep old data keys around for a long time than to accidentally delete them and possibly a lot of data. Even if we can accurately detect which data keys are in use, I'd suggest keeping an "old data keys" file that would be encrypted with the store key.

We can definitely keep them for a very long time; individual key entries would be around 100 bytes each (plus the fixed 4KiB preamble). Even if you rotate them once a week, you have a long way to go before making any sort of dent in disk usage. Memory usage could be reduced by loading keys on-demand (but that would take a long time too). An "old keys" file would work as well, it just adds more complexity.


Comments from Reviewable

@knz
Contributor

knz commented Nov 6, 2017

A security friend of mine is highlighting that this document is hard to review for a security expert, because it does not sufficiently outline its threat model. That needs to be addressed before we can solicit more insight.

@knz
Contributor

knz commented Nov 6, 2017

(i.e. "what are we protecting against")

@rjnn
Contributor

rjnn commented Nov 6, 2017

Reviewed 1 of 1 files at r5.
Review status: all files reviewed at latest revision, 12 unresolved discussions, some commit checks failed.


docs/RFCS/20171101_encryption_at_rest.md, line 67 at r1 (raw file):

Encryption is desired for security reasons (prevent access from other users on the same
machine, prevent data leak through drive theft/disposal) as well as regulatory reasons
(GDPR, HIPPA, OCI DSS).

nit: HIPAA


docs/RFCS/20171101_encryption_at_rest.md, line 172 at r5 (raw file):

This two-level approach is used to allow easy rotation of store keys. To rotate the store key, all we need
to do is re-encrypt the file containing the data keys, leaving the bulk of the data as is.

Do we auto rotate the data key? You say so below, can add a line here saying so.


docs/RFCS/20171101_encryption_at_rest.md, line 187 at r5 (raw file):

The need for encryption entails a few recommended changes in production configuration:
* disable swap: we want to avoid any data hitting disk unencrypted, this includes memory being swapped out.
* run on architectures that support the [AES-NI instruction set](https://en.wikipedia.org/wiki/AES_instruction_set).

this is just for performance right? Useful to separate the "absolutely must" recommendations (for security), v.s. the "preferred" recommendations (for performance assuming encr-at-rest is turned on)


docs/RFCS/20171101_encryption_at_rest.md, line 188 at r5 (raw file):

* disable swap: we want to avoid any data hitting disk unencrypted, this includes memory being swapped out.
* run on architectures that support the [AES-NI instruction set](https://en.wikipedia.org/wiki/AES_instruction_set).
* have a separate are (partition, fuse-filesystem, etc...) to store the store-level keys.

s/are/area/


docs/RFCS/20171101_encryption_at_rest.md, line 549 at r5 (raw file):

* `rocksdb::GetSortedWalFiles`: retrieve the sorted list of all wal files

Reading each file preamble means reading the first 4KiB of each file. To avoid performing this too

After reading this section, I don't understand how RocksDB rotates keys. Can RocksDB accept multiple active data keys, and thus it is on us to take a file, rekey it, and then atomically swap it under the hood, invisible to RocksDB?


docs/RFCS/20171101_encryption_at_rest.md, line 550 at r5 (raw file):

Reading each file preamble means reading the first 4KiB of each file. To avoid performing this too
frequently, we can cache results. We must take particular care with files being overwritten and

RocksDB never overwrites SSTs, so what files are being overwritten?


docs/RFCS/20171101_encryption_at_rest.md, line 558 at r5 (raw file):

This applies to:
* backup and restore

I can imagine there being another set of "backup keys", since a user might want backups to be part of an ETL process, and not have to have their entire system forced to use the exact same keys.


docs/RFCS/20171101_encryption_at_rest.md, line 587 at r5 (raw file):

See [Enterprise feature gating](#enterprise-feature-gating) for possible alternatives.

## Security considerations

This should start with a threat model section, as @knz mentioned.


docs/RFCS/20171101_encryption_at_rest.md, line 760 at r5 (raw file):

a different method to rewrite.

Compaction (of the entire key space, or specific ranges determined through live file metadata) may provide

I would appreciate if you took some of these points and wrote up a "how we do key rotation using RocksDB's existing write patterns" section above.


Comments from Reviewable

@mberhault
Contributor Author

Review status: all files reviewed at latest revision, 12 unresolved discussions, some commit checks failed.


docs/RFCS/20171101_encryption_at_rest.md, line 67 at r1 (raw file):

Previously, arjunravinarayan (Arjun Narayan) wrote…

nit: HIPAA

Done.


docs/RFCS/20171101_encryption_at_rest.md, line 172 at r5 (raw file):

Previously, arjunravinarayan (Arjun Narayan) wrote…

Do we auto rotate the data key? You say so below, can add a line here saying so.

There's a whole subsection titled Rotating data keys. The next line also says "data keys are generated and rotated by cockroach".


docs/RFCS/20171101_encryption_at_rest.md, line 187 at r5 (raw file):

Previously, arjunravinarayan (Arjun Narayan) wrote…

this is just for performance right? Useful to separate the "absolutely must" recommendations (for security), v.s. the "preferred" recommendations (for performance assuming encr-at-rest is turned on)

Yeah, that'll need to be hashed out in the docs: recommendations for security vs. performance reasons.


docs/RFCS/20171101_encryption_at_rest.md, line 188 at r5 (raw file):

Previously, arjunravinarayan (Arjun Narayan) wrote…

s/are/area/

Done.


docs/RFCS/20171101_encryption_at_rest.md, line 549 at r5 (raw file):

Previously, arjunravinarayan (Arjun Narayan) wrote…

After reading this section, I don't understand how RocksDB rotates keys. Can RocksDB accept multiple active data keys, and thus it is on us to take a file, rekey it, and then atomically swap it under the hood, invisible to RocksDB?

rocksdb doesn't know anything about keys. The env_encryption layer only knows about a cipher that encrypts and decrypts. We initialize the cipher with the proper key after reading the preamble. It's up to us to start using a new "active key" when we do rotation.
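To make the flow concrete, a rough sketch of the read side (in Go for brevity; the real code is the C++ env_encryption layer, and the preamble layout here is made up for illustration, not the proposed format):

```go
package encsketch

import (
	"crypto/aes"
	"crypto/cipher"
	"encoding/binary"
	"fmt"
	"os"
)

// preambleSize is the fixed 4KiB block at the start of every file.
const preambleSize = 4096

// filePreamble is a made-up layout for illustration only: an encryption flag,
// the ID of the data key used for this file, and the per-file IV.
type filePreamble struct {
	encrypted bool
	keyID     uint32
	iv        [aes.BlockSize]byte
}

func parsePreamble(buf []byte) filePreamble {
	var p filePreamble
	p.encrypted = buf[0] == 1
	p.keyID = binary.BigEndian.Uint32(buf[1:5])
	copy(p.iv[:], buf[5:5+aes.BlockSize])
	return p
}

// newFileStream reads a file's preamble, looks up the data key by ID, and
// returns a CTR stream for the payload (nil for plaintext files).
func newFileStream(f *os.File, dataKeys map[uint32][]byte) (cipher.Stream, error) {
	buf := make([]byte, preambleSize)
	if _, err := f.ReadAt(buf, 0); err != nil {
		return nil, err
	}
	p := parsePreamble(buf)
	if !p.encrypted {
		return nil, nil
	}
	key, ok := dataKeys[p.keyID]
	if !ok {
		return nil, fmt.Errorf("unknown data key ID %d", p.keyID)
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewCTR(block, p.iv[:]), nil
}
```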


docs/RFCS/20171101_encryption_at_rest.md, line 550 at r5 (raw file):

Previously, arjunravinarayan (Arjun Narayan) wrote…

RocksDB never overwrites SSTs, so what files are being overwritten?

Everything written by rocksdb will be encrypted. This includes things like CURRENT (which points to the current manifest) and IDENTITY.


docs/RFCS/20171101_encryption_at_rest.md, line 558 at r5 (raw file):

Previously, arjunravinarayan (Arjun Narayan) wrote…

I can imagine there being another set of "backup keys", since a user might want backups to be part of an ETL process, and not have to have their entire system forced to use the exact same keys.

This applies to the sstables generated locally by restore and then loaded. I still need to better understand how this works, but it has nothing to do with the external (eg: S3) backup.


docs/RFCS/20171101_encryption_at_rest.md, line 587 at r5 (raw file):

Previously, arjunravinarayan (Arjun Narayan) wrote…

This should start with a threat model section, as @knz mentioned.

Yeah, there's some discussion about why customers want this, and some about the assumptions we make. However, I have no idea how to define a threat model, let alone a "rigorous" one.


docs/RFCS/20171101_encryption_at_rest.md, line 760 at r5 (raw file):

Previously, arjunravinarayan (Arjun Narayan) wrote…

I would appreciate if you took some of these points and wrote up a "how we do key rotation using RocksDB's existing write patterns" section above.

uh, there's not much to say other than "we write new files using the active key". Since we can't rewrite the files ourselves the best we can do is trigger compactions at the rocksdb level. This is all mentioned.


Comments from Reviewable

@rjnn
Contributor

rjnn commented Nov 6, 2017

Review status: 0 of 1 files reviewed at latest revision, 7 unresolved discussions, some commit checks pending.


docs/RFCS/20171101_encryption_at_rest.md, line 549 at r5 (raw file):

Previously, mberhault (marc) wrote…

rocksdb doesn't know anything about keys. The env_encryption layer only knows about a cypher that encrypts, decrypts. We initialize the cipher with the proper key after reading the preamble. It's up to us to start using a new "active key" when we do rotation.

Then I can imagine the following key rotation process: we encrypt all newly created SSTs with the new store key, and trigger compactions to get rid of old SSTs (so that we don't ever have to do an in-place rewrite of a "live" SST). The only trickiness comes with lower levels that may simply be dormant for a long period of time, in which case it might be safe to re-encrypt them after compaction triggers have failed to invalidate them.


docs/RFCS/20171101_encryption_at_rest.md, line 550 at r5 (raw file):

Previously, mberhault (marc) wrote…

Everything written by rocksdb will be encrypted. This includes things like CURRENT (points to the current manifest) and IDENTITY

How do we "pause" RocksDB if we are ripping a file out from under it?


docs/RFCS/20171101_encryption_at_rest.md, line 587 at r5 (raw file):

Previously, mberhault (marc) wrote…

Yeah, there's some discussion about why customers want this, and some about the assumptions we make. However, I have no idea how to define a threat model, let alone a "rigorous" one.

You want two things:

A list of properties we want to ensure, which I imagine roughly as:

  • no attacker who steals the physical machine when powered off can read anything
  • no attacker who steals the physical machine when powered off can write anything (and presumably try to put it back)

A list of restrictions on the attacker, which if broken, all is lost (this is my initial stab at it, I'm sure there are more):

  • The attacker does not have access (root or otherwise) to the running machine
  • The attacker does not have store keys that are either in-use or active. The attacker can have inactive store keys. (writing this out explicitly has reminded me that there is a vector of attack in that old SSTs now have to be cleanly purged from the disk)

Comments from Reviewable

@mberhault
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 7 unresolved discussions, some commit checks pending.


docs/RFCS/20171101_encryption_at_rest.md, line 549 at r5 (raw file):

Previously, arjunravinarayan (Arjun Narayan) wrote…

Then I can imagine the following key rotation process: we encrypt all newly created SSTs with the new store key, and trigger compactions to get rid of old SSTs (so that we don't ever have to do an in-place rewrite of a "live" SST). The only trickiness comes with lower levels that may simply be dormant for a long period of time, in which case it might be safe to re-encrypt them after compaction triggers have failed to invalidate them.

This is addressed in Unresolved questions: forcing re-encryption


docs/RFCS/20171101_encryption_at_rest.md, line 550 at r5 (raw file):

Previously, arjunravinarayan (Arjun Narayan) wrote…

How do we "pause" RocksDB if we are ripping a file out from under it?

We don't, we rely on natural churn. Also addressed in Unresolved questions: forcing re-encryption


Comments from Reviewable

@mberhault
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 7 unresolved discussions, some commit checks failed.


docs/RFCS/20171101_encryption_at_rest.md, line 557 at r1 (raw file):

Previously, mberhault (marc) wrote…

Thanks, that should be easy. You're probably right about temporary keys, they would provide a nicely isolated way of doing it. I've added it to the RFC.

After looking at the code a bit more, I think we need to use the existing keys to write the SST to be ingested. We end up just passing the filename to the rocksdb instance, so the Env in use will need to know the key used to write the file. We may be able to inject temporary keys into the real rocksdb instance, but we may be better off using the real ones for now.


Comments from Reviewable

@tbg
Member

tbg commented Nov 6, 2017

Reviewed 1 of 1 files at r6.
Review status: all files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


docs/RFCS/20171101_encryption_at_rest.md, line 200 at r6 (raw file):

The first step in allowing at-rest encryption is using the preamble data format.
This must be done at node-creation time with the `--rocksdb-preamble-format` flag.

Seems like the Init command is a more natural fit for this since it obviates having to think about re-specifying this after first boot. Worth a sentence how the init command will interact with this, if at all.


Comments from Reviewable

@mberhault
Contributor Author

Review status: all files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


docs/RFCS/20171101_encryption_at_rest.md, line 200 at r6 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Seems like the Init command is a more natural fit for this since it obviates having to think about re-specifying this after first boot. Worth a sentence how the init command will interact with this, if at all.

I'm not sure how passing it through init helps. Does it become a cluster-wide setting? How do we pass that before we initialize a new rocksdb instance? It's simplest to provide it as a permanent flag, much the same way you have to keep specifying --store.
Eventually, this flag should go away and the preamble format will become the default.


Comments from Reviewable

@bdarnell
Contributor

bdarnell commented Nov 6, 2017

Review status: all files reviewed at latest revision, 32 unresolved discussions, some commit checks failed.


docs/RFCS/20171101_encryption_at_rest.md, line 67 at r6 (raw file):

Encryption is desired for security reasons (prevent access from other users on the same
machine, prevent data leak through drive theft/disposal) as well as regulatory reasons
(GDPR, HIPAA, OCI DSS).

s/OCI/PCI/?


docs/RFCS/20171101_encryption_at_rest.md, line 134 at r6 (raw file):

1. cipher: the cipher used. Choice of `PLAIN`, `AES128-CTR`, `AES192-CTR`, `AES256-CTR`.
1. timestamp: the current time in seconds since epoch (eg: `date +%s`)
1. key: a key in hexadecimal format (eg: `openssl rand -hex 32` **WARNING** do not use this to generate keys)

Why not use openssl rand? What should be used instead?


docs/RFCS/20171101_encryption_at_rest.md, line 144 at r6 (raw file):

An example file with the first key for this store would look like:

1;1509649195;AES256-CTR;7acff104117d59ae1a6f997a7cd0d2f348b038d09a47ae8cbaaa6288999fef10

Is this format based on anything in particular? It might be worth looking at tools people use for this kind of key rotation and to support a format that at least one tool understands natively (maybe json instead of an ad-hoc delimited format?).

In particular, I wonder if it might be better supported to use a directory containing an immutable file per key instead of a file that is updated on every rotation.


docs/RFCS/20171101_encryption_at_rest.md, line 167 at r6 (raw file):

### Data keys

Data keys are automatically generated by cockroach. They are stored in the data directory and

How are they stored? In separate files or in the preamble?


docs/RFCS/20171101_encryption_at_rest.md, line 177 at r6 (raw file):

There are two parameters controlling how data keys behave:
* encryption cipher: the cipher in use for data encryption. Possible values: `PLAIN`, `AES128-CTR`, `AES192-CTR`, `AES256-CTR`. Default value: `PLAIN`. This is the same as the active store key.
* rotation period: the time before a new key is generated and used. Default value: 1 week. This can be set through a flag.

Why keep the same key across multiple files? I thought the point of the store/data key split would be to allow a fresh key to be generated for each file. (and then we'd have a separate "max key age" that would force a compaction to a fresh key)


docs/RFCS/20171101_encryption_at_rest.md, line 188 at r6 (raw file):

* disable swap: we want to avoid any data hitting disk unencrypted, this includes memory being swapped out.
* run on architectures that support the [AES-NI instruction set](https://en.wikipedia.org/wiki/AES_instruction_set).
* have a separate area (partition, fuse-filesystem, etc...) to store the store-level keys.

This should be called out separately since it's a must-have: the security of this scheme rests almost entirely on the existence of a separate, more secure filesystem.


docs/RFCS/20171101_encryption_at_rest.md, line 362 at r6 (raw file):

    set prefix flag to 1 (AES-CTR)
    set key ID to current active key ID
    set pseudo-random IV and counter

Don't we need strongly random IVs instead of pseudo-random?


docs/RFCS/20171101_encryption_at_rest.md, line 398 at r6 (raw file):

Possible extensions of the encryption flag include:
* other modes (eg: GCM)
* other ciphers (eg: Triple DES, SkipJack, etc...)

3DES and skipjack are both really old. Chacha20 is a more plausible choice today (based on its inclusion in TLS 1.3).


docs/RFCS/20171101_encryption_at_rest.md, line 414 at r6 (raw file):

We propose:
* `--rocksdb-preamble-format` to start a new node with preamble format enabled. Will fail if the data exists in classical format for any store on the node.

As we discussed, mention that the plan is to make the preamble format the default for newly-created stores in some future version, but we are keeping it off by default for now.


docs/RFCS/20171101_encryption_at_rest.md, line 415 at r6 (raw file):

We propose:
* `--rocksdb-preamble-format` to start a new node with preamble format enabled. Will fail if the data exists in classical format for any store on the node.
* a `PREAMBLE_FORMAT` file written at rocksdb-creation time. Its presence indicates use of the preamble format.

Why a new file instead of a new value in the existing COCKROACH_VERSION file?


docs/RFCS/20171101_encryption_at_rest.md, line 430 at r6 (raw file):

	* provided by the user
	* plaintext
	* should be stored on a separate disk

Should we talk explicitly about the use of keywhiz-style virtual filesystems here?


docs/RFCS/20171101_encryption_at_rest.md, line 495 at r6 (raw file):

* desired cipher (eg: `AES128-CTR`)

If the cipher is other than `PLAIN`, we generate a key of the desired length using the pseudorandom `CryptoPP::OS_GenerateRandomBlock(blocking=false`) (see [Random number generator](#random-number-generator) for alternatives).

We definitely don't want pseudorandomness here.


docs/RFCS/20171101_encryption_at_rest.md, line 599 at r6 (raw file):

We can relax this assumption by adding integrity checking to all files on disk (eg: using GCM).
This would add complexity and cost to filesystem-level operations in rocksdb as we would need to read entire

GCM is not the only way to get confidentiality+integrity: CTR+HMAC would work too and still allow random access.


docs/RFCS/20171101_encryption_at_rest.md, line 602 at r6 (raw file):

files to compute authentication tags.

However, GCM can be cheaply used to encode data keys.

This comes back to our discussion about whether parameters (like key sizes and cipher modes) need to differ between store and data keys. It sounds like there's a reason to configure them separately. Or maybe we should just remove the cipher mode from the key file completely (so it just specifies cipher and key size), since we don't expect to allow much flexibility in cipher modes.


docs/RFCS/20171101_encryption_at_rest.md, line 628 at r6 (raw file):

There is no equivalent in Go so the current approach is to avoid loading keys in Go.
This can become problematic if we want to reuse the keys to encrypt log files written in Go.
No good answer presents itself.

Well, we recommend disabling swap entirely above, which is an answer to this question.


docs/RFCS/20171101_encryption_at_rest.md, line 650 at r6 (raw file):

Alternatively, we could design a system to keep track of encryption status outside the contents
of the files (eg: a list of files and their encryption status, absence of the file denoting
it was written by an older version) but this seems overly complex and fragile.

The current EncryptedEnv cannot handle migrations, but I think we could make one that tracked the presence/absence of the preamble on a per-file basis. I don't think we care enough about the ability to migrate existing stores to actually do that, but changes to EncryptedEnv are an option if we need to.


docs/RFCS/20171101_encryption_at_rest.md, line 700 at r6 (raw file):

Cons:
* more complicated logic (we have two sets of keys to worry about)

I'm worried that this process adds risk of its own: we have to keep the data keys in memory all in one place, and rewrite them periodically.

I was thinking that for a two-level key scheme, we'd have one key per data file (encrypted with the store key and stored in the preamble). This permits a fast rotation process by rewriting the preamble block (although this is enough of a deviation from the normal write-once practice that I'm not sure it makes sense).


docs/RFCS/20171101_encryption_at_rest.md, line 701 at r6 (raw file):

Cons:
* more complicated logic (we have two sets of keys to worry about)
* encryption status is harder to understand for users

Are we making it easier to understand or just masking a problem? With a one-level key scheme, you need multiple in-use keys until all the data has been rewritten, and when the key is no longer in use you know it will be useless to an attacker who subsequently gets data from your disk. With this two-level scheme, the store key will quickly leave in-use status, but it becomes hard to tell when the data protected by that key becomes inaccessible (because it's reasonable to assume that the attacker got the store key and the then-current data keys at the same time).


docs/RFCS/20171101_encryption_at_rest.md, line 734 at r6 (raw file):

* always allow node encryption
* when a node joins, communicate its encryption status and refuse the join if no enterprise license exists
* on bootstrap, an encrypted node will only allow SQL operations on the system tables (to set the license)

Or similarly, encrypted nodes could refuse to acquire leases for non-system ranges unless they see a valid license.


docs/RFCS/20171101_encryption_at_rest.md, line 765 at r6 (raw file):

Some possible solutions to investigate:
* patches to rocksdb to force rotation even if nothing has changed (may be the safest)

I can't find it now, but I thought there already was something for this: changing certain options triggers a compaction of all files that were written under older options (unless I'm misremembering, the compression option works this way).


docs/RFCS/20171101_encryption_at_rest.md, line 782 at r6 (raw file):

It allows encryption on a single store.
* **pros**: more flexibility: can have one encrypted store, one plain, or different ciphers

I think we should support a mix of encrypted and unencrypted stores on the same node. We don't need to support preamble and non-preamble stores on one node, though.


docs/RFCS/20171101_encryption_at_rest.md, line 799 at r6 (raw file):

before deleting it. How feasible this is depends on the accuracy of our encryption status reporting.

### Marking data keys as "exposed" if key list was plaintext at any point

This is a drawback of the data key list format. If we either used a single key level or a key per file stored in its preamble, this wouldn't be an issue. If we do stick with the data key file, I think we need to address this.

One option is to allow the configuration of the data ciphers separately from the store key, so you could say that new data files should be in plaintext even while we have a store key. Then it would only be safe to remove the store key when all the data had been rotated into the plaintext format.


docs/RFCS/20171101_encryption_at_rest.md, line 853 at r6 (raw file):

When to mark a store as "encrypted" is not clear. For example: can we mark it as encrypted just because encryption
is enabled, or should we wait until encryption usage is at 100%?

I think we can mark it as "encrypted" (would this be a store attribute or something else?) as soon as it's enabled, because that means that any newly-written data will be encrypted.


docs/RFCS/20171101_encryption_at_rest.md, line 873 at r6 (raw file):

Crypto++ supports multiple block ciphers. It should be reasonably easy to add support for
other ciphers such as Triple DES, Skipjack (both NIST-recommended), and others.

NIST decertified skipjack in 2016. 3DES is still on the list although it's generally considered too weak to use these days.


Comments from Reviewable

@dianasaur323
Contributor

LGTM -> for user feedback, I'm going to write up a summary of the user facing stuff in a couple bullets. I'll need you to take a quick skim through before I send it out to sanity check it for correctness.


Review status: all files reviewed at latest revision, 36 unresolved discussions, some commit checks failed.


docs/RFCS/20171101_encryption_at_rest.md, line 67 at r1 (raw file):

Encryption is desired for security reasons (prevent access from other users on the same
machine, prevent data leak through drive theft/disposal) as well as regulatory reasons
(GDPR, HIPPA, OCI DSS).

Do you mean HIPAA here? We might also want to add PCI


docs/RFCS/20171101_encryption_at_rest.md, line 90 at r1 (raw file):

The following are unrelated to encryption-at-rest as currently proposed:
* encrypted backup (should be supported regardless of encryption-at-rest status)
* fine-granularity encryption (that cannot use zone configs to select encrypted replicas)

I'm assuming this means table-level partitioning?


docs/RFCS/20171101_encryption_at_rest.md, line 217 at r1 (raw file):

cockroach start --rocksdb-preamble-format
SUCCESS

Do we want to warn if a new node is initialized into a cluster where all other stores have enabled encryption?


docs/RFCS/20171101_encryption_at_rest.md, line 261 at r1 (raw file):

Examine the logs or node debug pages to see that key 2 is now in use. It is now safe to delete key 1 from the file.

This doesn't seem great in terms of usability, but I guess we don't have time to do better in this release. Would you consider that this qualifies as an event that could be surfaced in the event log in the admin UI?


Comments from Reviewable

@mberhault
Contributor Author

I'm seeing one massive question here: do we want to use a single level of keys (user keys are used to encrypt the data) or the dual-level described here? I can expand the "Alternatives: Single level of keys" section with more pros/cons, but ultimately we have to pick one. Code complexity and user-friendliness point towards single-level keys.


Review status: all files reviewed at latest revision, 36 unresolved discussions, some commit checks failed.


docs/RFCS/20171101_encryption_at_rest.md, line 67 at r1 (raw file):

Previously, dianasaur323 (Diana Hsieh) wrote…

Do you mean HIPAA here? We might also want to add PCI

Done.


docs/RFCS/20171101_encryption_at_rest.md, line 90 at r1 (raw file):

Previously, dianasaur323 (Diana Hsieh) wrote…

I'm assuming this means table-level partitioning?

This is only db/table-level encryption if you get said db/table onto encrypted nodes through zone configs.


docs/RFCS/20171101_encryption_at_rest.md, line 217 at r1 (raw file):

Previously, dianasaur323 (Diana Hsieh) wrote…

Do we want to warn if a new node is initialized into a cluster where all other stores have enabled encryption?

The preamble format only says that the node can support encryption, we can still be plaintext. As for checking the rest of the cluster, we can't: this is done before we've talked to any other nodes.


docs/RFCS/20171101_encryption_at_rest.md, line 261 at r1 (raw file):

Previously, dianasaur323 (Diana Hsieh) wrote…

This doesn't seem great in terms of usability, but I guess we don't have time to do better in this release. Would you consider that this qualifies as an event that could be surfaced in the event log in the admin UI?

We'll need a better interface, but that will come later. The proposed status reporting will make it easy to build a user-friendly admin UI page.


docs/RFCS/20171101_encryption_at_rest.md, line 67 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

s/OCI/PCI/?

Wow, I keep having typos in those.


docs/RFCS/20171101_encryption_at_rest.md, line 134 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Why not use openssl rand? What should be used instead?

I think we'll want some examples of how to generate good keys in the docs. For now, I don't want anyone to just copy/paste things in here willy-nilly.
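For the docs, something along these lines would be fine (any OS CSPRNG works; this is just an illustration, not the recommended procedure — the warning above is about pasting the literal example value):

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"log"
)

func main() {
	// 32 bytes from the OS CSPRNG, hex-encoded: a 256-bit key suitable for an
	// AES256-CTR entry in the store key file.
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		log.Fatal(err)
	}
	fmt.Println(hex.EncodeToString(key))
}
```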


docs/RFCS/20171101_encryption_at_rest.md, line 144 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Is this format based on anything in particular? It might be worth looking at tools people use for this kind of key rotation and to support a format that at least one tool understands natively (maybe json instead of an ad-hoc delimited format?).

In particular, I wonder if it might be better supported to use a directory containing an immutable file per key instead of a file that is updated on every rotation.

Not particularly, mostly ease of writing/parsing in C++; I don't particularly want a json library.
The internal format doesn't matter much, but I do need to look a bit more at external key sources (eg: keywhiz) to make sure it's even doable. So far, I haven't found anything particularly standard, usually just plain key files (no ciphers/sizes/IDs or anything). One (I forget which) has a similar looking format.
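For what it's worth, the format parses in a handful of lines (sketch in Go for brevity; the real parsing would live next to the C++ Env, and a real parser would also check key length against the cipher):

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// storeKey mirrors one line of the example file: id;timestamp;cipher;hexkey.
type storeKey struct {
	id        uint64
	createdAt int64
	cipher    string
	key       []byte
}

func parseStoreKey(line string) (storeKey, error) {
	parts := strings.Split(strings.TrimSpace(line), ";")
	if len(parts) != 4 {
		return storeKey{}, fmt.Errorf("expected 4 fields, got %d", len(parts))
	}
	id, err := strconv.ParseUint(parts[0], 10, 64)
	if err != nil {
		return storeKey{}, err
	}
	ts, err := strconv.ParseInt(parts[1], 10, 64)
	if err != nil {
		return storeKey{}, err
	}
	raw, err := hex.DecodeString(parts[3])
	if err != nil {
		return storeKey{}, err
	}
	return storeKey{id: id, createdAt: ts, cipher: parts[2], key: raw}, nil
}

func main() {
	k, err := parseStoreKey("1;1509649195;AES256-CTR;7acff104117d59ae1a6f997a7cd0d2f348b038d09a47ae8cbaaa6288999fef10")
	if err != nil {
		panic(err)
	}
	fmt.Printf("key %d: %s, %d bytes\n", k.id, k.cipher, len(k.key))
}
```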


docs/RFCS/20171101_encryption_at_rest.md, line 167 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

How are they stored? In separate files or in the preamble?

The data keys file uses the same format as above, with one big file for all keys. It is written by a different instance of the Env, one which holds the store keys as opposed to the data keys. Said file has its own preamble, but the namespace for its key IDs is the "store keys" list.


docs/RFCS/20171101_encryption_at_rest.md, line 177 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Why keep the same key across multiple files? I thought the point of the store/data key split would be to allow a fresh key to be generated for each file. (and then we'd have a separate "max key age" that would force a compaction to a fresh key)

We could generate a new key for every file (although at that point that's what the nonce is). But I think you're thinking about storing the file-key in a part of the preamble encrypted with the store key. That would mean losing the quick store key rotation scheme.


docs/RFCS/20171101_encryption_at_rest.md, line 188 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

This should be called out separately since it's a must-have: the security of this scheme rests almost entirely on the existence of a separate, more secure filesystem.

Sure, we'll have to be very careful how we draft the requirements and recommendations in the docs.


docs/RFCS/20171101_encryption_at_rest.md, line 362 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Don't we need strongly random IVs instead of pseudo-random?

Sure, but as discussed in Random number generator, this may not be doable.
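For context, the per-file setup amounts to something like the sketch below (Go for brevity, made-up layout, not the proposed format). The IV is the part that needs a strong randomness source, which is exactly the open question in the Random number generator section:

```go
package encsketch

import (
	"crypto/aes"
	"crypto/rand"
	"encoding/binary"
)

// buildPreamble assembles an illustrative 4KiB preamble for a newly created
// file: a flag byte (1 = AES-CTR), the active data key ID, and a random
// IV/initial counter block. The rest is zero padding.
func buildPreamble(activeKeyID uint32) ([]byte, error) {
	buf := make([]byte, 4096)
	buf[0] = 1
	binary.BigEndian.PutUint32(buf[1:5], activeKeyID)
	// The IV must be unpredictable. crypto/rand is the OS CSPRNG here; whether
	// the C++ side can source strong randomness fast enough is the open
	// question discussed in the "Random number generator" section.
	if _, err := rand.Read(buf[5 : 5+aes.BlockSize]); err != nil {
		return nil, err
	}
	return buf, nil
}
```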


docs/RFCS/20171101_encryption_at_rest.md, line 398 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

3DES and skipjack are both really old. Chacha20 is a more plausible choice today (based on its inclusion in TLS 1.3).

Sure. Dropping all mention of specific ciphers; this isn't relevant for now.


docs/RFCS/20171101_encryption_at_rest.md, line 414 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

As we discussed, mention that the plan is to make the preamble format the default for newly-created stores in some future version, but we are keeping it off by default for now.

Done.


docs/RFCS/20171101_encryption_at_rest.md, line 415 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Why a new file instead of a new value in the existing COCKROACH_VERSION file?

We could change COCKROACHDB_VERSION to be written before opening the DB, in which case we could definitely reuse it. Adding to the doc.


docs/RFCS/20171101_encryption_at_rest.md, line 430 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Should we talk explicitly about the use of keywhiz-style virtual filesystems here?

We should mention it as one of the possible key sources in the docs, but that's irrelevant for this RFC.


docs/RFCS/20171101_encryption_at_rest.md, line 495 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

We definitely don't want pseudorandomness here.

As discussed in the Random number generator section, we can use blocking mode which uses /dev/random, but its low throughput will be an issue. Without an alternative source of randomness, I'm not sure what can be done.


docs/RFCS/20171101_encryption_at_rest.md, line 599 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

GCM is not the only way to get confidentiality+integrity: CTR+HMAC would work too and still allow random access.

True. We need to decide if we want integrity as well, at least for the data keys file (easy since it's small and only used by us). Integrity for all other files is a bit trickier, but may be a reasonable future improvement.


docs/RFCS/20171101_encryption_at_rest.md, line 602 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

This comes back to our discussion about whether parameters (like key sizes and cipher modes) need to differ between store and data keys. It sounds like there's a reason to configure them separately. Or maybe we should just remove the cipher mode from the key file completely (so it just specifies cipher and key size), since we don't expect to allow much flexibility in cipher modes.

I think we still want to control key sizes (mostly because it's trivial). If we fix it at 128, someone will say they want 256 for better security, and the other way around will get complaints about slowness.
We could specify the desired cipher elsewhere (eg: the encryption flag passed in, or a more dynamic setting later), but then we get back to having to specify multiple ciphers and sizes. I'm happy with either, we just need to find a reasonably user-friendly way of doing it.


docs/RFCS/20171101_encryption_at_rest.md, line 628 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Well, we recommend disabling swap entirely above, which is an answer to this question.

Partially. I'm still going to call mlock on the key buffers, and we still don't have a way to do it in go.


docs/RFCS/20171101_encryption_at_rest.md, line 650 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

The current EncryptedEnv cannot handle migrations, but I think we could make one that tracked the presence/absence of the preamble on a per-file basis. I don't think we care enough about the ability to migrate existing stores to actually do that, but changes to EncryptedEnv are an option if we need to.

We may be able to detect if something is an sstable or log, but what about the other rocksdb files (MANIFEST, CURRENT, etc.)? I don't think we have a reliable way of telling whether the file has a preamble or not. We could stick a fixed X-byte string at the beginning, but I'm not a big fan of that.
Honestly, I'm happy enough with the migration requirement.


docs/RFCS/20171101_encryption_at_rest.md, line 700 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I'm worried that this process adds risk of its own: we have to keep the data keys in memory all in one place, and rewrite them periodically.

I was thinking that for a two-level key scheme, we'd have one key per data file (encrypted with the store key and stored in the preamble). This permits a fast rotation process by rewriting the preamble block (although this is enough of a deviation from the normal write-once practice that I'm not sure it makes sense).

I really want to avoid re-writing files when I'm not being asked to. Rewriting the data keys file isn't particularly complicated; we can follow the normal pattern of DATA_KEYS-<sequence> with as many paranoid checks as we like before considering the file valid. Only once it's been validated do we re-read it and use the highest key as the active one.
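Roughly the following pattern (illustrative sketch with made-up names, Go for brevity): write DATA_KEYS-&lt;sequence&gt; to a temp file, sync, re-read and validate, and only then rename it into place and switch the active key.

```go
package encsketch

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// rotateDataKeysFile writes the serialized key registry under the next
// sequence number, verifies it reads back intact and decodes (via verify),
// and only then renames it into place. A real implementation would also fsync
// the directory and keep the previous file around until the swap is durable.
func rotateDataKeysFile(dir string, seq uint64, contents []byte, verify func([]byte) error) (string, error) {
	final := filepath.Join(dir, fmt.Sprintf("DATA_KEYS-%06d", seq))
	tmp := final + ".tmp"

	f, err := os.OpenFile(tmp, os.O_CREATE|os.O_TRUNC|os.O_WRONLY, 0600)
	if err != nil {
		return "", err
	}
	if _, err := f.Write(contents); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Sync(); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}

	// Paranoid checks: re-read the file and make sure it round-trips and
	// decodes before we consider it valid.
	read, err := os.ReadFile(tmp)
	if err != nil {
		return "", err
	}
	if !bytes.Equal(read, contents) {
		return "", fmt.Errorf("re-read of %s does not match what was written", tmp)
	}
	if err := verify(read); err != nil {
		return "", fmt.Errorf("new data keys file failed validation: %v", err)
	}
	if err := os.Rename(tmp, final); err != nil {
		return "", err
	}
	return final, nil
}
```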


docs/RFCS/20171101_encryption_at_rest.md, line 701 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Are we making it easier to understand or just masking a problem? With a one-level key scheme, you need multiple in-use keys until all the data has been rewritten, and when the key is no longer in use you know it will be useless to an attacker who subsequently gets data from your disk. With this two-level scheme, the store key will quickly leave in-use status, but it becomes hard to tell when the data protected by that key becomes inaccessible (because it's reasonable to assume that the attacker got the store key and the then-current data keys at the same time).

Right, and whatever mechanism we would have used to force re-encryption in a single-level solution can still be applied here. Note also that GCing of keys is still an open question, meaning that we may not be able to guarantee that a key is no longer needed (eg: backups).


docs/RFCS/20171101_encryption_at_rest.md, line 734 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Or similarly, encrypted nodes could refuse to acquire leases for non-system ranges unless they see a valid license.

true. We have a number of possibilities here, we just need to be careful about failure modes.
I'm personally happy with a big warning for now. I'm also happy to have my mind changed.


docs/RFCS/20171101_encryption_at_rest.md, line 765 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I can't find it now, but I thought there already was something for this: changing certain options triggers a compaction of all files that were written under older options (unless I'm misremembering, the compression option works this way).

Yeah, I'm still digging through rocksdb for some of these things.


docs/RFCS/20171101_encryption_at_rest.md, line 782 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I think we should support a mix of encrypted and unencrypted stores on the same node. We don't need to support preamble and non-preamble stores on one node, though.

Uh. This is pretty much what I have here. We're agreed that this is reasonable?


docs/RFCS/20171101_encryption_at_rest.md, line 799 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

This is a drawback of the data key list format. If we either used a single key level or a key per file stored in its preamble, this wouldn't be an issue. If we do stick with the data key file, I think we need to address this.

One option is to allow the configuration of the data ciphers separately from the store key, so you could say that new data files should be in plaintext even while we have a store key. Then it would only be safe to remove the store key when all the data had been rotated into the plaintext format.

Yeah, this goes back to the one vs two level. This one is not so much an open question as it is a requirement in the two-level system. Not particularly complicated though, just more confusion for the user.


docs/RFCS/20171101_encryption_at_rest.md, line 853 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I think we can mark it as "encrypted" (would this be a store attribute or something else?) as soon as it's enabled, because that means that any newly-written data will be encrypted.

The simplest solution is to do nothing and have the user set their own attributes.
However, we could dynamically set an encrypted attribute on the store. I haven't poked around to see how difficult that would be, but I expect not too much.


docs/RFCS/20171101_encryption_at_rest.md, line 873 at r6 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

NIST decertified skipjack in 2016. 3DES is still on the list although it's generally considered too weak to use these days.

Also dropped sample ciphers.


Comments from Reviewable

@awoods187
Contributor

Review status: 0 of 1 files reviewed at latest revision, 37 unresolved discussions, all commit checks successful.


docs/RFCS/20171101_encryption_at_rest.md, line 742 at r7 (raw file):

This would still cause issues when removing the license (or errors loading/validating the license).

Less drastic actions may be possible.

Are there other concerns with this approach such as performance concerns? I'd like to see a stronger discussion on why proactively gating is worse for users than the proposed approach


Comments from Reviewable

@mberhault
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 37 unresolved discussions, all commit checks successful.


docs/RFCS/20171101_encryption_at_rest.md, line 742 at r7 (raw file):

Previously, awoods187 (Andy Woods) wrote…

Are there other concerns with this approach such as performance concerns? I'd like to see a stronger discussion on why proactively gating is worse for users than the proposed approach

The concern with blocking anything on encryption is that it has to be done after the fact (when we initialize the encryption state, we have no way of knowing whether we have a license or not). Any type of degradation of service/features after the fact risks being triggered by bugs/license removal/bad error handling etc...


Comments from Reviewable

@mberhault mberhault force-pushed the marc/RFC_encryption_at_rest branch from e81f066 to 94dffd0 Compare November 7, 2017 19:13
@mberhault
Contributor Author

A few major changes since the last round of reviews:

  • protobuf format for preamble
  • protobuf format for data keys
  • security section with a few more details
  • some user recommendations
  • resolved a bunch of questions (not all)
  • split future work into 1) todo before stable release and 2) sometime after (similar to out-of-scope but doable)

@mberhault
Contributor Author

This is ready for another look. I'm not collapsing commits for now so that people can see what changed.
Two things that must be done before this is approved:

  • resolve all listed unresolved questions
  • flesh out the security section

@mberhault
Contributor Author

Thanks. Typos make things less legible so always good to avoid.

I think most of this doc has been pretty well hashed out by now. The main question in my mind is whether to go for the preamble format or the custom env solution.


Review status: 0 of 1 files reviewed at latest revision, 28 unresolved discussions, some commit checks pending.


docs/RFCS/20171101_encryption_at_rest.md, line 217 at r1 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

+1 on enforcing cluster-wide encryption being idiot-proof. A cluster setting of some sort would be nice.

If we want more flexibility, this would also fit nicely in zone configs - we'd make a store's encryption status an attribute on the store, then modify zone configs as desired to require data to be on an encrypted store.

Integration with zone configs is fairly straightforward. The future improvement section proposes a reserved encrypted attribute, making the status available to zone configs.

Cluster settings are about as tricky as license enforcement.


docs/RFCS/20171101_encryption_at_rest.md, line 853 at r6 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

I agree with Ben here.

That's fair. Actual encryption status (how much data is still plaintext or encrypted using old keys) will need to be carefully integrated into the admin-ui/metrics and monitored by DBAs.


docs/RFCS/20171101_encryption_at_rest.md, line 152 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

s/attacked/attacker

Done.


docs/RFCS/20171101_encryption_at_rest.md, line 204 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

s/this done/this is done/

Done.


docs/RFCS/20171101_encryption_at_rest.md, line 205 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

Core dumps are off by default, but in case they get turned on or someone figures out how to trigger them, should we also set MADV_DONTDUMP on the relevant pages?

Good point, I didn't think about that. Added to the memory safety section and the recommendations (just because we don't dump keys doesn't mean it's safe to write core files, they would still contain plaintext data).
Also added a link to the CERT secure coding standard in related resources. Section 3/rec. 08/MEM06 talks about swapping and core dumps.


docs/RFCS/20171101_encryption_at_rest.md, line 218 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

s/used the/used to/

Done.


docs/RFCS/20171101_encryption_at_rest.md, line 238 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

Should this say that an existing store can't be converted?

Done.


docs/RFCS/20171101_encryption_at_rest.md, line 291 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

...or core dumped, although that's disabled by default in go programs.

Done.


docs/RFCS/20171101_encryption_at_rest.md, line 307 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

s/node/store/? It seems like a lot of the use of the word "node" in this document should be "store", assuming we're ok with a node having multiple stores where some are encrypted and some aren't.

Done. I've replaced a bunch of mentions of node with store. The proposal is indeed to allow heterogeneous stores (both for the preamble format and actual encryption).


docs/RFCS/20171101_encryption_at_rest.md, line 338 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

s/have/had

Done.


docs/RFCS/20171101_encryption_at_rest.md, line 633 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

Unless I've missed something, this is the first/only time master keys have been mentioned?

oops. I got my terminology mixed up. This is indeed store keys. Renamed all instances of master keys.


docs/RFCS/20171101_encryption_at_rest.md, line 758 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

This is a pretty ugly problem. As written, this is suggesting that we will encrypt files without having verified that a license is in place, right? That might concern users who are worried about "accidentally" using enterprise features without a valid license. None of our other enterprise features can "accidentally" be used. I don't have a better proposal at the moment, but this probably belongs in the drawbacks section as well.

I'm not too worried about "accidentally" using encryption. You still have to generate keys and specify them through the --enterprise-encryption flag. You would probably also have read the docs and seen the big "enterprise only" labels.
Even then, it's not like we do bad things when you break the license. Even if we were to report license violations, we would probably have a certain amount of leeway.

There's discussion about alternate ways of doing license enforcement in the "future improvements", but I don't know of a safe way of doing it right now.
Anyway, added a section to the drawbacks. It definitely is one.


docs/RFCS/20171101_encryption_at_rest.md, line 819 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

Are we still going to recommend filesystem encryption as a best practice, though? Or will this become our recommendation?

We can still mention filesystem-level encryption if people prefer it, somewhere in some production docs.

This is under "alternatives", and it is a valid alternative for some scenarios so not mentioning it would be have been a rather big omission.


docs/RFCS/20171101_encryption_at_rest.md, line 968 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

s/do to do/to do/

Done.


Comments from Reviewable

@a-robinson
Copy link
Contributor

:lgtm: 👍


Review status: 0 of 1 files reviewed at latest revision, 19 unresolved discussions, all commit checks successful.


docs/RFCS/20171101_encryption_at_rest.md, line 819 at r36 (raw file):

Previously, mberhault (marc) wrote…

We can still mention filesystem-level encryption if people prefer it, somewhere in some production docs.

This is under "alternatives", and it is a valid alternative for some scenarios so not mentioning it would be have been a rather big omission.

I'm more wondering which one we'll recommend first, i.e. as the preferred option.


docs/RFCS/20171101_encryption_at_rest.md, line 213 at r37 (raw file):

At the C++ level, we can control two aspects:
* don't swap to disk: using `mlock` (`man mlock(2)`) on memory holding keys, preventing paging out to disk
* don't code dump: using `madvise` with `MADV_DONTDUMP` (see `man madvise(2)` on Linux) to exclude pages from code dumps.

s/code/core/?
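
As an aside on the quoted excerpt: below is a minimal Go-flavored sketch of the two protections it mentions. The RFC applies them at the C++ level via mlock(2) and madvise(2); the package and function names here are illustrative only, and this is Linux-specific.

```go
package keymem

import "golang.org/x/sys/unix"

// AllocKeyBuffer returns a page-aligned, anonymously mmap'ed buffer for
// key material that is locked into RAM (never swapped) and excluded
// from core dumps. Illustrative sketch; the real work happens in C++.
func AllocKeyBuffer(size int) ([]byte, error) {
	buf, err := unix.Mmap(-1, 0, size,
		unix.PROT_READ|unix.PROT_WRITE,
		unix.MAP_PRIVATE|unix.MAP_ANON)
	if err != nil {
		return nil, err
	}
	// mlock(2): keep the pages resident so keys are never paged to swap.
	if err := unix.Mlock(buf); err != nil {
		unix.Munmap(buf)
		return nil, err
	}
	// madvise(MADV_DONTDUMP): exclude these pages from core dumps.
	if err := unix.Madvise(buf, unix.MADV_DONTDUMP); err != nil {
		unix.Munmap(buf)
		return nil, err
	}
	return buf, nil
}
```

Note that core files written for the rest of the process would still contain plaintext block data, which is why the recommendations also cover not enabling core dumps at all.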


docs/RFCS/20171101_encryption_at_rest.md, line 257 at r37 (raw file):

* restricted access to all cockroach data
* disable swap
* don't enable code dumps

s/code/core/?


Comments from Reviewable

@a-robinson
Copy link
Contributor

Reviewed 1 of 1 files at r37.
Review status: all files reviewed at latest revision, 19 unresolved discussions, all commit checks successful.


Comments from Reviewable

@mberhault mberhault force-pushed the marc/RFC_encryption_at_rest branch from e071dcc to e1f2eb9 Compare November 27, 2017 14:31
@mberhault
Copy link
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 19 unresolved discussions.


docs/RFCS/20171101_encryption_at_rest.md, line 819 at r36 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

I'm more wondering which one we'll recommend first, i.e. as the preferred option.

Well, it depends on what you're already doing. If you have filesystem-level encryption with HSM support, I would honestly go for that rather than our encryption. But some people have made it clear that filesystem-level encryption is not an option for them.


docs/RFCS/20171101_encryption_at_rest.md, line 213 at r37 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

s/code/core/?

wow. I really had a problem with that word for some reason. Fixed all three instances.


docs/RFCS/20171101_encryption_at_rest.md, line 257 at r37 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

s/code/core/?

Done.


Comments from Reviewable

Copy link
Member

@tbg tbg left a comment


Gave this only a shallow review, but generally looks good.

to be used by another rocksdb instance and does not survive node restart. We propose to use dynamically-generated
keys to encrypt the temporary rocksdb instance.
1. sideloading for restore. Local SSTables are generated using an in-memory rocksdb instance then written in go
to local disk. We must change this to either be written directly by rocksdb, or move encryption to Go. The former
Copy link
Member


Not sure what exactly you mean; not putting the sideloaded file on disk directly is likely a non-starter, since avoiding the write amplification associated with putting it into RocksDB was one of the goals of sideloading.

Also, how will RESTORE work in this case? The sideloaded files are linked directly into RocksDB (via DB::IngestExternalFile()). What does that mean for the file? Does it have to be encrypted already?

My assumption on how this would work is that instead of slapping the file on disk, it would pass through something that would add the right preamble and encrypt with the store key, and that that is the right thing to pass to IngestExternalFile later, but I'm not sure.

Could you add more details here?

Copy link
Contributor Author


As mentioned in the RFC, the file is generated in an in-mem rocksdb instance then written to disk in Go. After that, it is ingested.
We need to make sure the ingested file is encrypted using the same set of keys, so it really needs to go through the existing rocksdb instance, or at least the same Env layer.
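
To make the intent concrete, a rough sketch with hypothetical names: the SST bytes produced by the in-memory instance are written through the store's env abstraction (the real one is the RocksDB Env in C++) rather than with plain OS file writes, so the file that is later ingested is already registered and encrypted with the store's keys.

```go
// storeEnv stands in for the store's (C++) RocksDB Env. Writing through
// it applies the store's encryption settings and registry bookkeeping.
type storeEnv interface {
	WriteFile(fname string, data []byte) error
}

// writeSideloadedSST replaces a plain os.WriteFile call for sideloaded
// SSTables; the resulting file can then be handed to
// DB::IngestExternalFile on the C++ side and will match the store's
// encryption state.
func writeSideloadedSST(env storeEnv, path string, sstData []byte) error {
	return env.WriteFile(path, sstData)
}
```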

@mberhault
Copy link
Contributor Author

There are three PRs showing possible approaches:

My preference is for the switching env over the preamble (explained in the alternatives section). I'll rewrite this RFC to switch to it.

@mberhault
Copy link
Contributor Author

This RFC now proposes using a switching env for plaintext vs encryption and a registry to keep information about a file's encryption status.
This drops the preamble format entirely, moving it to a small "alternative" section.
@bdarnell, @a-robinson, @tschottdorf: PTAL
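
For readers catching up, here is a minimal sketch of the dispatch logic being proposed. The actual implementation is a RocksDB Env subclass in C++; the Go package, types, and names below are only illustrative.

```go
package switchenv // illustrative package name

import "io"

// fileEnv is a stand-in for the subset of RocksDB's Env used here.
type fileEnv interface {
	Open(fname string) (io.ReadCloser, error)
	Create(fname string) (io.WriteCloser, error)
}

// switchingEnv routes each operation to the plaintext base env or the
// encrypting env, based on a registry of encrypted files that is
// persisted to disk (COCKROACHDB_REGISTRY in the RFC).
type switchingEnv struct {
	base, encrypted fileEnv
	registry        map[string]bool // filename -> encrypted?
	useEncryption   bool            // from --enterprise-encryption
}

// envFor picks the env for an existing file: plaintext files keep the
// current on-disk format with zero overhead.
func (s *switchingEnv) envFor(fname string) fileEnv {
	if s.registry[fname] {
		return s.encrypted
	}
	return s.base
}

func (s *switchingEnv) Open(fname string) (io.ReadCloser, error) {
	return s.envFor(fname).Open(fname)
}
```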

@petermattis
Copy link
Collaborator

Review status: 0 of 1 files reviewed at latest revision, 21 unresolved discussions, some commit checks pending.


docs/RFCS/20171101_encryption_at_rest.md, line 427 at r39 (raw file):

The state of a file (plaintext or encrypted) is stored in a file registry. This records the list of all
encrypted files by filename and is persisted to disk in a file named `COCKROACHDB_REGISTRY`.

As an alternative to a registry, could we do something with symlinks? For example, if the file is a regular file it is plaintext. If the file is a symlink, read the link name and if it contains the suffix .encrypted it is encrypted. Two benefits to the symlink approach are that it avoids a central registry file that needs to be updated whenever files are written/deleted. Secondly, it makes the encryption status of files more obvious in directory listings. A downside is that it doubles the number of filenames in the storage directory. There are ways to get around that (e.g. instead of using an .encrypted suffix, put the encrypted ssts in a separate directory). Might very well be other downsides to the symlink approach. I've only thought about it very briefly.


Comments from Reviewable

@mberhault
Copy link
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 21 unresolved discussions, some commit checks pending.


docs/RFCS/20171101_encryption_at_rest.md, line 427 at r39 (raw file):

Previously, petermattis (Peter Mattis) wrote…

As an alternative to a registry, could we do something with symlinks? For example, if the file is a regular file it is plaintext. If the file is a symlink, read the link name and if it contains the suffix .encrypted it is encrypted. Two benefits to the symlink approach are that it avoids a central registry file that needs to be updated whenever files are written/deleted. Secondly, it makes the encryption status of files more obvious in directory listings. A downside is that it doubles the number of filenames in the storage directory. There are ways to get around that (e.g. instead of using an .encrypted suffix, put the encrypted ssts in a separate directory). Might very well be other downsides to the symlink approach. I've only thought about it very briefly.

That's something Ben and I have iterated over a little (see the "custom env" section in older revisions). Using the filename to describe encrypted status still needs the registry (or preamble) to store encryption fields.
There are a few other issues (probably not too great but need more investigation):

  • this involves modifying/adding filenames. It may be fine, but I would need to double check this
  • there's no symbolic link support in the existing envs. Adding it for posix is fine, but you're perfectly allowed to use an in-memory or hdfs env with encryption

As for creating a whole new file to safely write, we have to do this for the data keys, so one more doesn't seem like too big of a pain.


Comments from Reviewable

@petermattis
Copy link
Collaborator

Review status: 0 of 1 files reviewed at latest revision, 21 unresolved discussions, some commit checks pending.


docs/RFCS/20171101_encryption_at_rest.md, line 427 at r39 (raw file):

Previously, mberhault (marc) wrote…

That's something Ben and I have iterated over a little (see the "custom env" section in older revisions). Using the filename to describe encrypted status still needs the registry (or preamble) to store encryption fields.
There are a few other issues (probably not too great but need more investigation):

  • this involves modifying/adding filenames. It may be fine, but I would need to double check this
  • there's no symbolic link support in the existing envs. Adding it for posix is fine, but you're perfectly allowed to use an in-memory or hdfs env with encryption

As for creating a whole new file to safely write, we have to do this for the data keys, so one more doesn't seem like too big of a pain.

Ack. Carry on.


Comments from Reviewable

@dianasaur323
Copy link
Contributor

docs/RFCS/20171101_encryption_at_rest.md, line 400 at r39 (raw file):

Specifying the `--enterprise-encryption` flag increases the version to `versionSwitchingEnv`. Downgrades to
binaries that do not support this version is not possible.

If I'm reading this correctly, this means that the previous limitation (where you couldn't encrypt a store that didn't have the preamble format enabled) is now just a flag that can be toggled? That's nice!


Comments from Reviewable

@mberhault
Copy link
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 21 unresolved discussions, all commit checks successful.


docs/RFCS/20171101_encryption_at_rest.md, line 400 at r39 (raw file):

Previously, dianasaur323 (Diana Hsieh) wrote…

If I'm reading this correctly, this means that the previous limitation (where you couldn't encrypt a store that didn't have the preamble format enabled) is now just a flag that can be toggled? That's nice!

That's correct. The remaining limitation is that enabling encryption means you can't go back to a version that does not support encryption. Obviously there's nothing I can do about that.


Comments from Reviewable

@dianasaur323
Copy link
Contributor

docs/RFCS/20171101_encryption_at_rest.md, line 400 at r39 (raw file):

Previously, mberhault (marc) wrote…

That's correct. The remaining limitation is that enabling encryption means you can't go back to a version that does not support encryption. Obviously there's nothing I can do about that.

Woohoo~~~


Comments from Reviewable

@mberhault mberhault force-pushed the marc/RFC_encryption_at_rest branch from 3a279a8 to 7a6cfe5 Compare November 28, 2017 23:20
mberhault pushed a commit to mberhault/cockroach that referenced this pull request Nov 29, 2017
This is to be contrasted to the preamble method in cockroachdb#20124.

This method is discussed in the `Custom env for encryption state`
section of the
[Encryption RFC](cockroachdb#19785)

When encryption is enabled, use a switching env that can redirect
each Env method to one of:
* base env for plaintext (same format as currently, no overhead)
* encrypted env (with or without preamble) for encrypted files

The switching env will hold the list of encrypted files to know which
env to pick.
mberhault pushed a commit to mberhault/cockroach that referenced this pull request Nov 30, 2017
@bdarnell
Copy link
Contributor

bdarnell commented Dec 5, 2017

Reviewed 1 of 1 files at r40.
Review status: all files reviewed at latest revision, 22 unresolved discussions.


docs/RFCS/20171101_encryption_at_rest.md, line 819 at r36 (raw file):

Previously, mberhault (marc) wrote…

Well, it depends on what you're already doing. If you have filesystem-level encryption with HSM support, I would honestly go for that rather than our encryption. But some people have made it clear that filesystem-level encryption is not an option for them.

Personally I'd recommend filesystem-level encryption as long as it meets security and operational requirements.


docs/RFCS/20171101_encryption_at_rest.md, line 441 at r40 (raw file):

    useEncryption = lookup desired encryption (from --enterprise-encryption flag)
    add filename to registry
    persiste registry to disk. Error out on failure.

s/persiste/persist/

Discuss partial failures here: we may persist the write to the registry without creating the file, so the registry contains a superset of the (encrypted) files on disk. Going the other way, when a file is deleted, we must delete the file from disk first, before removing it from the registry.

It is possible for registry entries to "leak" via these partial failures. We could either implement some sort of GC for them (the fact that rocksdb includes a monotonic counter in all filenames should make this straightforward) or just assume that this will be rare enough that we don't have to worry about it.
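
A sketch of the ordering described here, with hypothetical names, so that the registry only ever over-approximates the set of encrypted files actually on disk:

```go
import "io"

// fileEnv is a stand-in for the underlying env (re-declared here so the
// sketch is self-contained).
type fileEnv interface {
	Create(fname string) (io.WriteCloser, error)
	Delete(fname string) error
}

// registry tracks which files are encrypted; entries are persisted
// before the corresponding file exists and removed only after it is
// gone, so a crash never leaves an unregistered encrypted file.
type registry struct {
	entries map[string]bool
	env     fileEnv
}

func (r *registry) createEncrypted(fname string) (io.WriteCloser, error) {
	r.entries[fname] = true
	if err := r.persist(); err != nil { // error out on failure, as in the RFC pseudocode
		return nil, err
	}
	return r.env.Create(fname)
}

func (r *registry) deleteFile(fname string) error {
	// Delete the file first; a crash here leaves a dangling registry
	// entry, which is harmless and can later be GC'ed (rocksdb's
	// monotonic counter in filenames makes stale entries easy to spot).
	if err := r.env.Delete(fname); err != nil {
		return err
	}
	delete(r.entries, fname)
	return r.persist()
}

// persist writes the registry file to disk atomically (details elided).
func (r *registry) persist() error { return nil }
```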


docs/RFCS/20171101_encryption_at_rest.md, line 575 at r40 (raw file):

flag.

Since the keys are externally provided, there is no concept of key rotation.

What does this mean? Do you have to restart to pick up new keys? I think we need to be able to pick up updates to these keyfiles on SIGHUP. (although renaming key to key.old and creating a new key to rotate keys seems fragile. I think a file with multiple keys may be a better way to do this)


Comments from Reviewable

@mberhault
Copy link
Contributor Author

Review status: all files reviewed at latest revision, 22 unresolved discussions.


docs/RFCS/20171101_encryption_at_rest.md, line 819 at r36 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Personally I'd recommend filesystem-level encryption as long as it meets security and operational requirements.

It should definitely be mentioned as an option in the docs. Or we could say that's the only solution for now and I can seriously free up my next few months :)


docs/RFCS/20171101_encryption_at_rest.md, line 441 at r40 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

s/persiste/persist/

Discuss partial failures here: we may persist the write to the registry without creating the file, so the registry contains a superset of the (encrypted) files on disk. Going the other way, when a file is deleted, we must delete the file from disk first, before removing it from the registry.

It is possible for registry entries to "leak" via these partial failures. We could either implement some sort of GC for them (the fact that rocksdb includes a monotonic counter in all filenames should make this straightforward) or just assume that this will be rare enough that we don't have to worry about it.

Fixed the typo. Added some details about non-existent entries and linked to the existing registry GC item under "future improvements, maybe for first version".


docs/RFCS/20171101_encryption_at_rest.md, line 575 at r40 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

What does this mean? Do you have to restart to pick up new keys? I think we need to be able to pick up updates to these keyfiles on SIGHUP. (although renaming key to key.old and creating a new key to rotate keys seems fragile. I think a file with multiple keys may be a better way to do this)

Yup. I have SIGHUP-based reload in the "future improvements, maybe for first version" section.
Loading/reloading/rotating store keys will really be improved once we start adding other mechanisms, be it other config files or third-party integration. I think renaming the keys is moderately safe for now and requires less manual involvement than an arbitrary file format.
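
For illustration, a sketch of what a SIGHUP-triggered reload could look like. This is hypothetical: reload-on-SIGHUP is only listed as a possible future improvement, and the function and parameter names below are made up.

```go
package keys

import (
	"log"
	"os"
	"os/signal"
	"syscall"
)

// watchForKeyReload re-reads the store key files whenever the process
// receives SIGHUP, so that rotating keys by renaming key -> key.old and
// dropping in a new key can be picked up without a node restart.
func watchForKeyReload(keyPath, oldKeyPath string, reload func(key, oldKey string) error) {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGHUP)
	go func() {
		for range ch {
			if err := reload(keyPath, oldKeyPath); err != nil {
				// Keep serving with the previously loaded keys on failure.
				log.Printf("store key reload failed: %v", err)
			}
		}
	}()
}
```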


Comments from Reviewable

@mberhault mberhault force-pushed the marc/RFC_encryption_at_rest branch from 7a6cfe5 to b938c05 Compare December 5, 2017 17:56
@mberhault
Copy link
Contributor Author

No more unresolved questions:

  • rotating non-live files has moved to "drawbacks"
  • CCL trickiness has moved to "drawbacks"
  • instruction set support has been expanded a bit and moved to "improvements", mostly about detection, warnings, documentation.

@mberhault mberhault force-pushed the marc/RFC_encryption_at_rest branch from 39a24b1 to 262d532 Compare December 6, 2017 16:08
@mberhault
Copy link
Contributor Author

There's been no movement in over a week. If there are no objections, I'd like to move to FCP.

@mberhault
Copy link
Contributor Author

Moving to FCP. ETA for close is Wednesday Dec 20th.

This is the initial RFC for encryption at rest.

The `Security considerations` sections must be expanded and carefully
examined.

The `Unresolved questions` must be resolved.

The `Future improvements` must be examined for in/out-of scope
decisions.
@mberhault mberhault force-pushed the marc/RFC_encryption_at_rest branch from 262d532 to 4626d29 Compare December 20, 2017 12:21
@mberhault mberhault merged commit 747c280 into cockroachdb:master Dec 20, 2017
@mberhault mberhault deleted the marc/RFC_encryption_at_rest branch December 20, 2017 12:58