[RFC] v2 cryptographic overview #54

franky47 · 2023-04-13T08:48:14Z

Overview

v1.x used a naive AES-GCM cipher with a single key and random nonces, which does not scale: the risk of a nonce collision grows with the number of encryptions performed under the same key. It was also vulnerable to confused deputy attacks (CDA).

v2 aims to improve the cryptographic layer using the following properties:

AEAD construction where AAD is set to the path of the field (model name + field name)
Key derivation to increase the cipher input entropy (pseudorandom key + random IV)

Caveats

Note that the following operations will still not be supported on encrypted fields in v2, and are not planned to be:

Partial matching (startsWith, endsWith, contains, etc.)
Ordering

AEAD

In order to strongly bind a ciphertext to its storage location - to defend against CDA and field value swapping - the path where a record is stored should be part of the additional authenticated data (AAD).

This path is made of three dimensions:

The table
The column
The row

While it is fairly easy to pin the column (by setting the table and column name in AAD), pinning the row is more challenging. Usually, pinning the row is done by setting the row ID as AAD. However, this does not work in cases where the row ID is not available.

When encrypting a new record, the row ID may be omitted so that the database engine generates it automatically (e.g. auto-incrementing integer or UUID primary keys).

When decrypting a record, the ID may be absent from either the query or the returned data.
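A minimal sketch of column pinning with the Node.js `crypto` module, assuming AES-256-GCM. The helper names (`columnAad`, `encryptField`, `decryptField`) and the `Model.field` AAD layout are illustrative assumptions, not the v2 format. Row pinning is deliberately left out, since the row ID may not be available:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto'

type Sealed = { iv: Buffer; ciphertext: Buffer; tag: Buffer }

// Hypothetical AAD layout: model and field names joined by a dot,
// a character that cannot appear in Prisma identifiers.
function columnAad(model: string, field: string): Buffer {
  return Buffer.from(`${model}.${field}`, 'utf8')
}

function encryptField(key: Buffer, aad: Buffer, cleartext: string): Sealed {
  const iv = randomBytes(12) // 96-bit random nonce
  const cipher = createCipheriv('aes-256-gcm', key, iv)
  cipher.setAAD(aad) // authenticated, but not stored in the ciphertext
  const ciphertext = Buffer.concat([cipher.update(cleartext, 'utf8'), cipher.final()])
  return { iv, ciphertext, tag: cipher.getAuthTag() }
}

function decryptField(key: Buffer, aad: Buffer, sealed: Sealed): string {
  const decipher = createDecipheriv('aes-256-gcm', key, sealed.iv)
  decipher.setAAD(aad)
  decipher.setAuthTag(sealed.tag)
  // final() throws if the key, AAD, ciphertext or tag do not match
  return Buffer.concat([decipher.update(sealed.ciphertext), decipher.final()]).toString('utf8')
}
```

With this sketch, a ciphertext written to `User.email` fails authentication when it is read back as `User.name`, but it still decrypts when moved to the `email` column of another row.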

Options here are:

1. Do not pin the row at all, and only pin the table and column. Pros: simple to implement. Cons: no practical defense against CDA.
2. Use option 1 as the default for auto-incrementing (database-generated) IDs, but pin the row when IDs are generated at runtime (e.g. UUIDs, CUIDs), which can be detected by parsing the Prisma schema. Pros: allows defense against CDA. Cons: no row pinning for auto-incrementing IDs; may be difficult to implement, especially regarding connections and relations; won't work for queries that don't supply the ID (where clause on other @unique fields).
3. Perform operations in multiple steps, e.g. write an empty record to the database to obtain a row and its ID, then encrypt fields with full AAD pinning, and write the ciphertexts, all in a transaction. Pros: can use autogenerated IDs. Cons: performance cost, increased risk of conflicts leaving data in an inconsistent state, possible data race conditions.

Rejected ideas:

Having a separate column managed internally by the middleware to serve as an AAD row reference. Rejected as it would be trivial for an attacker to swap these references along with ciphertext and still cause a valid decryption across rows.

Note: model and column renaming may also cause AAD mismatches. Such cases should be covered by data migrations.

Composite IDs (using @@id) could be supported, with extra care about canonicalisation attacks. For example, with a naive string concatenation, these two rows would produce the same AAD:

model User {
  firstName String
  lastName  String
  @@id([firstName, lastName])
}

| `firstName` | `lastName` | Resulting AAD |
| --- | --- | --- |
| John | Doe | UserJohnDoe |
| Joh | nDoe | UserJohnDoe |
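One common way to avoid this ambiguity is to length-prefix every component before concatenating. The sketch below contrasts the two encodings; `naiveAad` and `canonicalAad` are hypothetical helper names, and the 32-bit big-endian length prefix is an assumption, not a decided v2 format:

```typescript
import { Buffer } from 'node:buffer'

// Naive encoding: component boundaries are lost.
function naiveAad(model: string, idParts: string[]): string {
  return model + idParts.join('')
}

// Length-prefixed encoding: each component is preceded by its byte
// length as a 32-bit big-endian integer, so boundaries are unambiguous.
function canonicalAad(model: string, idParts: string[]): Buffer {
  const chunks: Buffer[] = []
  for (const part of [model, ...idParts]) {
    const bytes = Buffer.from(part, 'utf8')
    const length = Buffer.alloc(4)
    length.writeUInt32BE(bytes.length)
    chunks.push(length, bytes)
  }
  return Buffer.concat(chunks)
}

const collides = naiveAad('User', ['John', 'Doe']) === naiveAad('User', ['Joh', 'nDoe'])
const distinct = !canonicalAad('User', ['John', 'Doe']).equals(canonicalAad('User', ['Joh', 'nDoe']))
console.log({ collides, distinct })
```

The naive encoding maps both rows to the same string, while the length-prefixed buffers differ as soon as the component boundary moves.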

Algorithm selection

The use of AES-GCM with 256-bit keys will be maintained, not for backward compatibility (there won't be any, due to the additional use of AAD), but because it's a common cipher available in most implementations. A non-NIST alternative native to Node.js would be ChaCha20-Poly1305, which conveniently has the same nonce and authentication tag sizes as AES-GCM.

Key derivation

Rather than specifying a single encryption key in the middleware configuration and using random nonces, a root secret will be used to derive individual keys and nonces, using HKDF-SHA-256.

This still assumes a cryptographically strong root secret.
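As a rough illustration, key derivation with HKDF-SHA-256 could look like the sketch below, using Node's built-in `hkdfSync`. The function name `deriveKey`, the use of a random salt stored next to the ciphertext, and the `info` context string are all assumptions made for the example. Only the key is derived here, and the nonce is assumed to remain random:

```typescript
import { hkdfSync, randomBytes } from 'node:crypto'

// Derives a 256-bit key from the root secret.
// `salt` is assumed to be random and stored alongside the ciphertext;
// `context` is a hypothetical label scoping the key (e.g. a model name).
function deriveKey(rootSecret: Buffer, salt: Buffer, context: string): Buffer {
  // hkdfSync returns an ArrayBuffer, wrapped here in a Buffer
  return Buffer.from(hkdfSync('sha256', rootSecret, salt, context, 32))
}

const rootSecret = randomBytes(32) // stands in for the configured root secret
const salt = randomBytes(16)
const userKey = deriveKey(rootSecret, salt, 'User')
const postKey = deriveKey(rootSecret, salt, 'Post')
console.log(userKey.length, userKey.equals(postKey))
```

Derivation is deterministic for a given (root secret, salt, context) triple, so the key can be recomputed at decryption time without being stored.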

Key derivation takes care of reducing the reuse of keys across a database. While a per-field derivation would be possible, it may be costly and redundant with the use of AAD, so a per-row derivation may be preferable. The upper bound for the number of encryptions would then be defined by the number of columns on a particular table, the size of the data to encrypt, and the number of edits per row. In most applications, this is far below the threshold where using random nonces becomes problematic.
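To get a feel for that threshold, the birthday approximation gives the probability of at least one collision among n random nonces of b bits as roughly n² / 2^(b+1), valid while the result is small. The helper below is a hypothetical illustration of that arithmetic, not part of the library:

```typescript
// Birthday approximation of the probability that at least two of
// `encryptions` random nonces collide under a single key.
// Only meaningful while the result is much smaller than 1.
function nonceCollisionProbability(encryptions: number, nonceBits = 96): number {
  return (encryptions * encryptions) / Math.pow(2, nonceBits + 1)
}

// NIST SP 800-38D caps random-nonce GCM at 2^32 encryptions per key.
const atNistLimit = nonceCollisionProbability(Math.pow(2, 32))
// Hypothetical per-row key: 10 encrypted columns, each edited 1000 times.
const perRowKey = nonceCollisionProbability(10 * 1000)
console.log(atNistLimit, perRowKey)
```

At the NIST limit of 2^32 encryptions the approximation gives 2^-33, whereas a per-row key used for ten thousand encryptions stays around 6 × 10^-22.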

Update 2024-06-29: A possibly simpler alternative would be to use XAES-256-GCM, which was recently specified and is designed to be compatible with FIPS 140 requirements.

Key commitment

In order to mitigate the invisible salamanders attack, where multiple keys can decrypt the same ciphertext and verify the authentication tag, the key itself (and probably the nonce too) should be part of the AAD.

Note: AES-GEM does this, but increases the size of the authentication tag. Can we ensure commitment with a standard 16-byte authentication tag?

Edit: no. https://crypto.stackexchange.com/questions/108200/key-commitment-in-gcm-or-aead-in-general
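Since commitment cannot fit in the 16-byte tag, one possible direction is to store a separate commitment value next to the ciphertext, similar in spirit to the key commitment described in the AWS Encryption SDK documentation listed in the resources. The sketch below is an assumption about how this could look, not a decided design: the names `deriveKeyAndCommitment` and `openCommittedKey`, the `info` labels, and the 32-byte commitment size are all hypothetical.

```typescript
import { hkdfSync, randomBytes, timingSafeEqual } from 'node:crypto'

// Derives the encryption key and a commitment value from the same
// inputs, using distinct HKDF `info` labels (hypothetical names).
function deriveKeyAndCommitment(rootSecret: Buffer, salt: Buffer) {
  const encryptionKey = Buffer.from(hkdfSync('sha256', rootSecret, salt, 'encryption', 32))
  const commitment = Buffer.from(hkdfSync('sha256', rootSecret, salt, 'commitment', 32))
  return { encryptionKey, commitment }
}

// Recomputes the commitment before decrypting, and refuses to hand out
// a key when it does not match the value stored with the ciphertext.
function openCommittedKey(rootSecret: Buffer, salt: Buffer, storedCommitment: Buffer): Buffer {
  const { encryptionKey, commitment } = deriveKeyAndCommitment(rootSecret, salt)
  if (
    storedCommitment.length !== commitment.length ||
    !timingSafeEqual(storedCommitment, commitment)
  ) {
    throw new Error('Key commitment mismatch')
  }
  return encryptionKey
}
```

The cost is 32 extra bytes per ciphertext (plus the salt), in exchange for rejecting a ciphertext under any root secret other than the one it was written with.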

Rotation

Rotation of the root derivation secret will be supported, just as multiple keys were supported in v1. Ciphertext rotations will require the same data migration technique.

Ciphertext format

todo: Document v2 ciphertext format

v1 to v2 migration

While both versions may require different configurations (root secrets/keys differences), v2 should provide a data migration strategy to ease adoption.

That being said, the migration workflow may require several deployment phases (expand & contract pattern, akin to database migrations) if a zero-downtime upgrade is desired.

Therefore, v2 will ship with a read-only compatibility layer for v1, which will be removed altogether in a later update.

Resources

https://soatok.blog/2023/03/01/database-cryptography-fur-the-rest-of-us/
https://scottarc.blog/2022/10/17/lucid-multi-key-deputies-require-commitment/
https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/supported-algorithms.html
https://words.filippo.io/dispatches/xaes-256-gcm/

franky47 pinned this issue Apr 13, 2023