Overview
v1.x used a naive AES-GCM cipher with a single key and random nonces, which does not scale: the risk of a nonce collision grows with every encryption performed under the same key. Another issue was a vulnerability to confused deputy attacks (CDA).
v2 aims to improve the cryptographic layer with the following properties:
AEAD construction where AAD is set to the path of the field (model name + field name)
Key derivation to increase the cipher input entropy (pseudorandom key + random IV)
Caveats
Note that the following operations will still not be supported on encrypted fields in v2, and are not planned to be:
String filters (startsWith, endsWith, contains, etc.)
AEAD
In order to strongly bind a ciphertext to its storage location - to defend against CDA and field value swapping - the path where a record is stored should be part of the additional authenticated data (AAD).
This path is made of three dimensions:
The table
The column
The row
While it is fairly easy to pin the column (by setting the table and column name in AAD), pinning the row is more challenging. Usually, pinning the row is done by setting the row ID as AAD. However, this does not work in cases where the row ID is not available.
When encrypting a new record, the row ID may be omitted so that the database engine generates it (e.g. auto-incrementing integer or UUID primary keys).
When decrypting a record, the ID may be absent from either the query or the returned data.
Options here are:
1. Do not pin the row, and only pin the table and column. Pros: simple to implement. Cons: no practical defense against CDA, since ciphertexts can still be swapped between rows.
2. Use option 1 as the default for auto-incrementing (database-generated) IDs, but allow row pinning with runtime-generated IDs (e.g. UUIDs, CUIDs) by parsing the Prisma schema. Pros: allows defense against CDA. Cons: no row pinning with auto-incrementing IDs; may be difficult to implement, especially regarding connections and relations; won't work with queries that don't supply the ID (where clause on other @unique fields).
3. Perform operations in multiple steps. E.g. write an empty record to the database to obtain a row and its ID, then encrypt fields with full AAD pinning, and write the ciphertexts, all in a transaction. Pros: can use autogenerated IDs. Cons: performance cost, increased risk of conflicts leaving data in an inconsistent state, possible data races.
Rejected ideas:
Having a separate column managed internally by the middleware to serve as an AAD row reference. Rejected as it would be trivial for an attacker to swap these references along with ciphertext and still cause a valid decryption across rows.
Note: model and column renaming may also cause AAD mismatches; such cases should be covered by data migrations.
Composite IDs (using @@id) could be supported, with extra care about canonicalisation attacks. For example, with a naive string concatenation of a composite ID on firstName and lastName, two rows whose values split differently across the two columns would have the same AAD data.
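To make the canonicalisation concern concrete, here is a minimal sketch of one way to encode the AAD path unambiguously, by prefixing each component with its byte length. The helper names (`encodeAad`, `naiveAad`), the `User`/`email` path, and the sample row values are all hypothetical and only serve the illustration; this is not the format v2 has settled on.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper: builds AAD bytes from path components
// (table, column, then the row ID parts). Each component is prefixed
// with its byte length (4 bytes, big-endian), so the boundary between
// two components cannot be shifted without changing the output.
function encodeAad(components: readonly string[]): Buffer {
  const parts: Buffer[] = [];
  for (const component of components) {
    const bytes = Buffer.from(component, "utf8");
    const length = Buffer.alloc(4);
    length.writeUInt32BE(bytes.length, 0);
    parts.push(length, bytes);
  }
  return Buffer.concat(parts);
}

// Naive concatenation, shown only to illustrate the collision.
function naiveAad(components: readonly string[]): Buffer {
  return Buffer.from(components.join(""), "utf8");
}

// Two hypothetical rows with a composite ID on (firstName, lastName).
const rowA = ["User", "email", "ab", "c"];
const rowB = ["User", "email", "a", "bc"];

// Both rows concatenate to "Useremailabc", so the naive AAD is identical.
const naiveCollides = naiveAad(rowA).equals(naiveAad(rowB));
// The length prefixes differ, so the encoded AAD is distinct per row.
const encodedCollides = encodeAad(rowA).equals(encodeAad(rowB));
```

Any injective encoding would do here (a delimiter with escaping, or a JSON array); length prefixes are simply easy to reason about.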
Algorithm selection
The use of AES-GCM with 256-bit keys will be maintained, not for backward compatibility (there won't be any, due to the additional use of AAD), but because it's a common cipher available in most implementations. A non-NIST alternative native to Node.js would be ChaCha20-Poly1305, which conveniently has the same nonce and auth tag sizes as AES-GCM.
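As an illustration of AES-256-GCM with AAD in Node.js, here is a minimal sketch using the built-in `node:crypto` module. The `seal` and `open` function names, the `Sealed` shape, and the `"User.email"` AAD string are assumptions made for the example, not the v2 API or ciphertext format.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical container for the pieces of an encrypted field.
interface Sealed {
  nonce: Buffer;
  ciphertext: Buffer;
  tag: Buffer;
}

function seal(key: Buffer, plaintext: string, aad: Buffer): Sealed {
  const nonce = randomBytes(12); // 96-bit nonce
  const cipher = createCipheriv("aes-256-gcm", key, nonce, { authTagLength: 16 });
  cipher.setAAD(aad); // authenticated, but not encrypted
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { nonce, ciphertext, tag: cipher.getAuthTag() };
}

function open(key: Buffer, sealed: Sealed, aad: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, sealed.nonce, { authTagLength: 16 });
  decipher.setAAD(aad);
  decipher.setAuthTag(sealed.tag);
  // final() throws if the key, nonce, AAD, ciphertext or tag do not match.
  return Buffer.concat([decipher.update(sealed.ciphertext), decipher.final()]).toString("utf8");
}

const key = randomBytes(32); // 256-bit key
const aad = Buffer.from("User.email", "utf8"); // placeholder field path
const sealed = seal(key, "alice@example.com", aad);
const roundTrip = open(key, sealed, aad);

// Decrypting the same ciphertext under another field's path is rejected.
let swapRejected = false;
try {
  open(key, sealed, Buffer.from("User.name", "utf8"));
} catch {
  swapRejected = true;
}
```

The rejected second call is the property the AAD is meant to provide: a ciphertext moved to a different location fails authentication instead of decrypting.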
Key derivation
Rather than specifying a single encryption key in the middleware configuration and using random nonces, a root secret will be used to derive individual keys and nonces, using HKDF-SHA-256.
This still assumes a cryptographically strong root secret.
Key derivation takes care of reducing the reuse of keys across a database. While a per-field derivation would be possible, it may be costly and redundant with the use of AAD, so a per-row derivation may be preferable. The upper bound for the number of encryptions would then be defined by the number of columns on a particular table, the size of the data to encrypt, and the number of edits per row. In most applications, this is far below the threshold where using random nonces becomes problematic.
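A per-row derivation along these lines could be sketched with Node's built-in HKDF. The function name `deriveRowKey`, the `"row-key"` label, the empty salt, and the JSON encoding of the `info` parameter are assumptions for illustration only. Note that this sketch needs the row ID at encryption time, so it inherits the availability problem discussed for row pinning.

```typescript
import { hkdfSync, randomBytes } from "node:crypto";

// Hypothetical per-row key derivation from a root secret.
function deriveRowKey(rootSecret: Buffer, table: string, rowId: string): Buffer {
  // A JSON array keeps the boundary between table and row ID unambiguous.
  const info = Buffer.from(JSON.stringify(["row-key", table, rowId]), "utf8");
  const salt = Buffer.alloc(0); // no salt in this sketch
  // hkdfSync returns an ArrayBuffer; wrap it in a Buffer for convenience.
  return Buffer.from(hkdfSync("sha256", rootSecret, salt, info, 32));
}

const rootSecret = randomBytes(32); // stands in for the configured root secret
const keyA = deriveRowKey(rootSecret, "User", "1");
const keyB = deriveRowKey(rootSecret, "User", "2");
const keyAAgain = deriveRowKey(rootSecret, "User", "1");
```

Derivation is deterministic, so the same row always yields the same key, while distinct rows get unrelated keys and each key only ever encrypts the handful of fields in its row.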
Update 2024-06-29: A possibly simpler alternative would be to use XAES-256-GCM, which is designed to be FIPS 140 compliant (see the words.filippo.io link under Resources).
Key commitment
In order to mitigate the invisible salamanders attack, where multiple keys can decrypt the same ciphertext and verify the authentication tag, the key itself (and probably the nonce too) should be part of the AAD.
Note: AES-GEM does this, but increases the size of the authentication tag. Can we ensure commitment with a standard 16-byte authentication tag?
Edit: no, see https://crypto.stackexchange.com/questions/108200/key-commitment-in-gcm-or-aead-in-general
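Since a 16-byte tag alone cannot provide commitment, one alternative is to store a separate commitment value next to the ciphertext, comparable to the committing algorithm suites described in the AWS Encryption SDK link under Resources. The following is a minimal sketch of that idea, not the approach this proposal has chosen: the function names, the HKDF labels, and the 32-byte per-ciphertext salt are all assumptions.

```typescript
import { hkdfSync, randomBytes, timingSafeEqual } from "node:crypto";

interface DerivedKeys {
  encryptionKey: Buffer;
  commitment: Buffer; // stored alongside the ciphertext, not secret
}

// Derive the encryption key and a commitment from the same inputs,
// separated by distinct HKDF info labels (hypothetical labels).
function deriveWithCommitment(rootSecret: Buffer, salt: Buffer): DerivedKeys {
  const encryptionKey = Buffer.from(hkdfSync("sha256", rootSecret, salt, "encryption-key", 32));
  const commitment = Buffer.from(hkdfSync("sha256", rootSecret, salt, "key-commitment", 32));
  return { encryptionKey, commitment };
}

// Before decrypting, re-derive the commitment and compare it with the
// stored one; a different root secret will not reproduce it.
function verifyCommitment(rootSecret: Buffer, salt: Buffer, stored: Buffer): Buffer {
  const { encryptionKey, commitment } = deriveWithCommitment(rootSecret, salt);
  if (stored.length !== commitment.length || !timingSafeEqual(stored, commitment)) {
    throw new Error("key commitment mismatch");
  }
  return encryptionKey;
}

const rootA = randomBytes(32);
const rootB = randomBytes(32);
const salt = randomBytes(32); // would be stored with the ciphertext
const derived = deriveWithCommitment(rootA, salt);

const verifiedKey = verifyCommitment(rootA, salt, derived.commitment);
let wrongKeyRejected = false;
try {
  verifyCommitment(rootB, salt, derived.commitment);
} catch {
  wrongKeyRejected = true;
}
```

The cost is 32 extra bytes per ciphertext for the commitment, plus the salt, which is consistent with the answer above that commitment does not fit in the standard tag.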
Rotation
Rotation of the root derivation secret will be supported, just as multiple keys were supported in v1. Ciphertext rotation will require the same data migration technique.
Ciphertext format
todo: Document v2 ciphertext format
v1 to v2 migration
While both versions may require different configurations (root secrets/keys differences), a data migration strategy should be provided to ease adoption.
That being said, the migration workflow may require several deployment phases (expand & contract pattern, akin to database migrations) if a zero-downtime upgrade is desired.
Therefore, v2 will ship with a read-only compatibility layer for v1, which will be removed altogether in a later update.
Resources
https://soatok.blog/2023/03/01/database-cryptography-fur-the-rest-of-us/
https://scottarc.blog/2022/10/17/lucid-multi-key-deputies-require-commitment/
https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/supported-algorithms.html
https://words.filippo.io/dispatches/xaes-256-gcm/