Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indexed DB performance is slow because of large values #3057

Closed
andybalaam opened this issue Jan 26, 2024 · 5 comments · Fixed by #3073
Closed

Indexed DB performance is slow because of large values #3057

andybalaam opened this issue Jan 26, 2024 · 5 comments · Fixed by #3073
Assignees

Comments

@andybalaam
Copy link
Member

andybalaam commented Jan 26, 2024

We've noticed Element Web R is a little slow at some operations (e.g. resetting key backup, exporting keys, importing keys, migrating Element R schemata, upgrading from legacy to rust crypo).

We tracked the worst slowdowns to slow operations on the inbound_group_sessions2 store in Indexed DB.

I did some research and discovered that there are some rules of thumb we can use to have faster performance in Indexed DB: keep your Indexed DB keys and values small if you want good performance! and don’t store arrays of numbers in Indexed DB – use base64 instead.

So, we should apply these rules to our Indexed DB stores, starting with inbound_group_sessions2.

Currently, serializing an InboundGroupSession looks like:

  • matrix_sdk_store_encryption::StoreCipher::encrypt_value_typed -> EncryptedValue
  • matrix_sdk_store_encryption::StoreCipher::encrypt_value -> Vec<u8> (serialized EncryptedValue)
  • matrix_sdk_indexeddb::crypto_store::indexeddb_serializer::IndexeddbSerializer::serialize_value_as_bytes -> Vec<u8> (serialized EncryptedValue)
  • matrix_sdk_indexeddb::crypto_store::IndexeddbCryptoStore::serialize_inbound_group_session -> JsValue (created from an InboundGroupSessionIndexedDbObject that has the Vec<u8> (serialized EncryptedValue) as a property)

There are two ways we can improve this:

  1. Avoid double-serializing: currently the session is a serialized InboundGroupSessionIndexedDbObject that contains a serialized EncryptedValue. This makes the resulting object stored in IndexedDb much larger than it needs to be.
  2. Avoid use of arrays of ints in the values we store. Instead, encode them as base64 strings.

So I propose to change serializing InboundGroupSession to something like:

  • matrix_sdk_store_encryption::StoreCipher::encrypt_value_base64_typed -> EncryptedValueBase64 (same as EncryptedValue except it contains Strings instead of Vec<u8> and [u8; N]).
  • matrix_sdk_indexeddb::crypto_store::indexeddb_serializer::IndexeddbSerializer::maybe_encrypt_value -> MaybeEncrypted (not-serialized wrapper around EncryptedValue representing the fact that in some cases we don't encrypt.)
  • matrix_sdk_indexeddb::crypto_store::IndexeddbCryptoStore::serialize_inbound_group_session -> JsValue (created from an InboundGroupSessionIndexedDbObject that has the MaybeEncrypted as a property)

This will make the values we store much smaller because it avoids double-encoding, and it uses base64 instead of arrays of ints. Both of these should provide good performance improvements.

Of course, we will need a migration from the old schema to the new one.

In addition, since we are changing the schema, I plan to include other changes that make us compatible with future planned changes for Resetting key backup #26892.

@bnjbvr
Copy link
Member

bnjbvr commented Jan 26, 2024

From a high-level perspective, this all sounds sensible, thanks for the issue.

Both of these should provide good performance improvements.

Could we spend some time having a small benchmark setup for this, so we can have measurements of how it gets better (and we don't have to rely on intuition and hope)? Especially in an actual Web environment, a simplistic setup doing the following would be super useful:

  1. log the time of start,
  2. run a "benchmark", written in JS and using our bindings (to be as close to the real world as possible)
  3. log the time now - the time of start

@richvdh
Copy link
Member

richvdh commented Jan 26, 2024

  • matrix_sdk_store_encryption::StoreCipher::encrypt_value_base64_typed -> EncryptedValueBase64 (same as EncryptedValue except it contains Strings instead of Vec<u8> and [u8; N]).

  • matrix_sdk_indexeddb::crypto_store::indexeddb_serializer::IndexeddbSerializer::maybe_encrypt_value -> MaybeEncrypted (not-serialized wrapper around EncryptedValue representing the fact that in some cases we don't encrypt.)

I wondered if we could skip the first of those steps, and have maybe_encrypt_value base64-encode the values from encrypt_value_typed. (I'm more concerned about the proliferation of types and methods, making things ever more confusing, than the amount of actual work involved.)

@richvdh
Copy link
Member

richvdh commented Jan 26, 2024

  • matrix_sdk_store_encryption::StoreCipher::encrypt_value_base64_typed -> EncryptedValueBase64 (same as EncryptedValue except it contains Strings instead of Vec<u8> and [u8; N]).
  • matrix_sdk_indexeddb::crypto_store::indexeddb_serializer::IndexeddbSerializer::maybe_encrypt_value -> MaybeEncrypted (not-serialized wrapper around EncryptedValue representing the fact that in some cases we don't encrypt.)

I wondered if we could skip the first of those steps, and have maybe_encrypt_value base64-encode the values from encrypt_value_typed. (I'm more concerned about the proliferation of types and methods, making things ever more confusing, than the amount of actual work involved.)

... so MaybeEncrypted becomes something like

enum MaybeEncrypted<T> {
    Unencrypted(T),
    Encrypted{
        iv_base64: String,
        ciphertext_base64: String,
    }
}

@andybalaam
Copy link
Member Author

Could we spend some time having a small benchmark setup for this, so we can have measurements of how it gets better (and we don't have to rely on intuition and hope)? Especially in an actual Web environment, a simplistic setup doing the following would be super useful

Yes, I will think about the best way to do this, thanks!

@andybalaam
Copy link
Member Author

... so MaybeEncrypted becomes something like

Yeah, or maybe even collapse it all into InboundGroupSessionIndexedDbObject to simplify further. I'll try it out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants