feat(event cache): handle linked chunk updates and reload in the sqlite backend #4340

Merged 7 commits on Nov 28, 2024
4 changes: 4 additions & 0 deletions crates/matrix-sdk-base/src/event_cache/store/mod.rs
@@ -138,6 +138,10 @@ pub enum EventCacheStoreError {
     #[error("Error encoding or decoding data from the event cache store: {0}")]
     Codec(#[from] Utf8Error),

+    /// The store failed to serialize or deserialize some data.
+    #[error("Error serializing or deserializing data from the event cache store: {0}")]
+    Serialization(#[from] serde_json::Error),
+
     /// The database format has changed in a backwards incompatible way.
     #[error(
         "The database format of the event cache store changed in an incompatible way, \
5 changes: 5 additions & 0 deletions crates/matrix-sdk-common/src/deserialized_responses.rs
@@ -587,6 +587,11 @@ impl fmt::Debug for TimelineEventKind {
 /// A successfully-decrypted encrypted event.
 pub struct DecryptedRoomEvent {
     /// The decrypted event.
+    ///
+    /// Note: it's not an error that this contains an `AnyMessageLikeEvent`: an
+    /// encrypted payload *always contains* a room id, by the [spec].
+    ///
+    /// [spec]: https://spec.matrix.org/v1.12/client-server-api/#mmegolmv1aes-sha2
     pub event: Raw<AnyMessageLikeEvent>,

     /// The encryption info about the event.
6 changes: 3 additions & 3 deletions crates/matrix-sdk-common/src/linked_chunk/mod.rs
@@ -975,12 +975,12 @@ pub struct ChunkIdentifier(u64);

 impl ChunkIdentifier {
     /// Create a new [`ChunkIdentifier`].
-    pub(super) fn new(identifier: u64) -> Self {
+    pub fn new(identifier: u64) -> Self {
         Self(identifier)
     }

     /// Get the underlying identifier.
-    fn index(&self) -> u64 {
+    pub fn index(&self) -> u64 {
         self.0
     }
 }
@@ -999,7 +999,7 @@ pub struct Position(ChunkIdentifier, usize);

 impl Position {
     /// Create a new [`Position`].
-    pub(super) fn new(chunk_identifier: ChunkIdentifier, index: usize) -> Self {
+    pub fn new(chunk_identifier: ChunkIdentifier, index: usize) -> Self {
         Self(chunk_identifier, index)
     }
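
Making these constructors public lets code outside the `linked_chunk` module rebuild identifiers from stored integers. The following is a minimal sketch of that idea, not the SDK's code: the two structs are local stand-ins that mirror only what the diff shows, and `position_from_row` and `chunk_id_to_column` are hypothetical helpers that assume the columns are read as signed 64-bit integers.

```rust
use std::convert::TryFrom;

// Local stand-ins for the SDK types; only the methods visible in the diff
// are mirrored here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ChunkIdentifier(u64);

impl ChunkIdentifier {
    pub fn new(identifier: u64) -> Self {
        Self(identifier)
    }

    pub fn index(&self) -> u64 {
        self.0
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Position(ChunkIdentifier, usize);

impl Position {
    pub fn new(chunk_identifier: ChunkIdentifier, index: usize) -> Self {
        Self(chunk_identifier, index)
    }
}

/// Hypothetical helper: rebuild a `Position` from two integer columns.
/// Negative values cannot be valid identifiers or indices, so they yield `None`.
pub fn position_from_row(chunk_id: i64, position: i64) -> Option<Position> {
    let chunk_id = u64::try_from(chunk_id).ok()?;
    let index = usize::try_from(position).ok()?;
    Some(Position::new(ChunkIdentifier::new(chunk_id), index))
}

/// Hypothetical helper: convert an identifier into a signed integer column
/// value, returning `None` if it does not fit in an `i64`.
pub fn chunk_id_to_column(id: ChunkIdentifier) -> Option<i64> {
    i64::try_from(id.index()).ok()
}

fn main() {
    let position = position_from_row(3, 7).expect("both values are non-negative");
    println!("{:?}", position);
    println!("{:?}", chunk_id_to_column(ChunkIdentifier::new(3)));
}
```

The checked conversions are a design choice of this sketch: they surface out-of-range values instead of silently wrapping them.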

@@ -0,0 +1,46 @@
CREATE TABLE "linked_chunks" (
-- Identifier of the chunk, unique per room. Corresponds to a `ChunkIdentifier`.
"id" INTEGER,
-- Which room does this chunk belong to? (hashed key shared with the two other tables)
"room_id" BLOB NOT NULL,
Member

I've just learned the difference between BLOB and TEXT in SQLite, and I wonder whether we should have used TEXT here. According to this forum thread:

> BLOBs cannot be searched, compared, or manipulated.

I think we want TEXT here. Thoughts?

Member Author

Uhuh, interesting. That suggests we may have issues when there are multiple rooms, then 👀 I'll do a few experiments and change it if need be; pretty sure we've been using that in other places too.

Member Author

Phew, https://sqlite.org/datatype3.html#sort_order says that BLOBs are compared using memcmp(), so we're fine. I've added a test that uses the same chunk identifier in two different rooms, and it passes, so it's all fine as is. We're using BLOB in many, many places, so if this should have failed, it would have failed elsewhere too ^^

That said, I do agree that TEXT would offer us more, e.g. searching for substrings, prefixes, etc. I don't think searching for a room id substring, be it in the clear or hashed, is a useful use case for us.

Thanks for raising this, though! TIL 😌
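
The point of the test mentioned above can be sketched in memory. This is an illustration only, not the test added in the PR: `LinkedChunkIndex` is a hypothetical stand-in for the unique `(id, room_id)` index, and it relies on Rust comparing byte vectors byte by byte, which is assumed here to be close enough to SQLite's `memcmp()`-based BLOB comparison for the purpose of key equality.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for the unique (id, room_id) index.
#[derive(Default)]
struct LinkedChunkIndex {
    // Key: (chunk id, hashed room key as raw bytes). Value: chunk type, 'E' or 'G'.
    rows: BTreeMap<(u64, Vec<u8>), char>,
}

impl LinkedChunkIndex {
    /// Insert a row. Returns `false` when the (id, room_id) pair already
    /// exists, which is when a unique index would reject the insert.
    fn insert(&mut self, id: u64, room_id: &[u8], chunk_type: char) -> bool {
        let key = (id, room_id.to_vec());
        if self.rows.contains_key(&key) {
            return false;
        }
        self.rows.insert(key, chunk_type);
        true
    }

    /// Look up the chunk type stored for a given (id, room_id) pair.
    fn chunk_type(&self, id: u64, room_id: &[u8]) -> Option<char> {
        self.rows.get(&(id, room_id.to_vec())).copied()
    }
}

fn main() {
    let mut index = LinkedChunkIndex::default();

    // The same chunk identifier in two different rooms does not collide.
    println!("{}", index.insert(0, b"room-a", 'E'));
    println!("{}", index.insert(0, b"room-b", 'G'));

    // Reusing the identifier within the same room is rejected.
    println!("{}", index.insert(0, b"room-a", 'G'));
    println!("{:?}", index.chunk_type(0, b"room-b"));
}
```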


-- Previous chunk in the linked list. Corresponds to a `ChunkIdentifier`.
"previous" INTEGER,
-- Next chunk in the linked list. Corresponds to a `ChunkIdentifier`.
"next" INTEGER,
-- Type of underlying entries: E for events, G for gaps
"type" TEXT CHECK("type" IN ('E', 'G')) NOT NULL
);

CREATE UNIQUE INDEX "linked_chunks_id_and_room_id" ON linked_chunks (id, room_id);
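
Since rows carry `previous` and `next` links rather than an explicit ordering, reloading a room's chunks means walking those links. The sketch below shows one way this could be done; it is an assumption-laden illustration, not the PR's implementation. `ChunkRow` and `order_chunks` are hypothetical names, and the rows are assumed to have already been filtered to a single room.

```rust
use std::collections::HashMap;

/// Hypothetical row shape mirroring the `linked_chunks` columns for one room.
#[derive(Clone, Debug, PartialEq, Eq)]
struct ChunkRow {
    id: u64,
    previous: Option<u64>,
    next: Option<u64>,
    chunk_type: char, // 'E' for events, 'G' for gaps
}

/// Order rows by following `next` links, starting from the only row without
/// a `previous` link. Fails if links dangle, loop, or leave rows unreachable.
fn order_chunks(rows: Vec<ChunkRow>) -> Result<Vec<ChunkRow>, String> {
    if rows.is_empty() {
        return Ok(Vec::new());
    }
    let total = rows.len();

    let first_id = {
        let mut heads = rows.iter().filter(|row| row.previous.is_none());
        let first = heads
            .next()
            .ok_or_else(|| "no chunk without a previous link".to_owned())?;
        if heads.next().is_some() {
            return Err("more than one chunk without a previous link".to_owned());
        }
        first.id
    };

    let mut by_id: HashMap<u64, ChunkRow> =
        rows.into_iter().map(|row| (row.id, row)).collect();
    let mut ordered = Vec::with_capacity(total);
    let mut expected_previous = None;
    let mut cursor = Some(first_id);

    while let Some(id) = cursor {
        // Removing the row also catches cycles: a revisited id is gone from the map.
        let row = by_id
            .remove(&id)
            .ok_or_else(|| format!("chunk {} is missing or visited twice", id))?;
        if row.previous != expected_previous {
            return Err(format!("chunk {} has an inconsistent previous link", id));
        }
        expected_previous = Some(id);
        cursor = row.next;
        ordered.push(row);
    }

    if ordered.len() != total {
        return Err("some chunks are unreachable from the first chunk".to_owned());
    }
    Ok(ordered)
}

fn main() {
    let rows = vec![
        ChunkRow { id: 2, previous: Some(0), next: None, chunk_type: 'E' },
        ChunkRow { id: 0, previous: Some(1), next: Some(2), chunk_type: 'G' },
        ChunkRow { id: 1, previous: None, next: Some(0), chunk_type: 'E' },
    ];
    let ordered = order_chunks(rows).expect("links are consistent");
    let ids: Vec<u64> = ordered.iter().map(|row| row.id).collect();
    println!("{:?}", ids);
}
```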

CREATE TABLE "gaps" (
-- Which chunk does this gap refer to? Corresponds to a `ChunkIdentifier`.
"chunk_id" INTEGER NOT NULL,
-- Which room does this event belong to? (hashed key shared with linked_chunks)
"room_id" BLOB NOT NULL,

-- The previous batch token of a gap (encrypted value).
"prev_token" BLOB NOT NULL,
Member

Maybe we also want a TEXT here?

Member Author

Do you have a use case in mind where we'd need to search by prev_token?

Member Author

Related to the other comment: no need to specify TEXT unless we think we could need operations specific to a text object, which I don't think we need here either.


-- If the owning chunk gets deleted, delete the entry too.
FOREIGN KEY(chunk_id, room_id) REFERENCES linked_chunks(id, room_id) ON DELETE CASCADE
);

-- Items for an event chunk.
CREATE TABLE "events" (
-- Which chunk does this event refer to? Corresponds to a `ChunkIdentifier`.
"chunk_id" INTEGER NOT NULL,
-- Which room does this event belong to? (hashed key shared with linked_chunks)
"room_id" BLOB NOT NULL,

-- `OwnedEventId` for events, can be null if malformed.
"event_id" TEXT,
-- JSON serialized `SyncTimelineEvent` (encrypted value).
"content" BLOB NOT NULL,
-- Position (index) in the chunk.
"position" INTEGER NOT NULL,

-- If the owning chunk gets deleted, delete the entry too.
FOREIGN KEY(chunk_id, room_id) REFERENCES linked_chunks(id, room_id) ON DELETE CASCADE
);
3 changes: 3 additions & 0 deletions crates/matrix-sdk-sqlite/src/error.rs
@@ -104,6 +104,9 @@ pub enum Error {

     #[error("An update keyed by unique ID touched more than one entry")]
     InconsistentUpdate,
+
+    #[error("The store contains invalid data: {details}")]
+    InvalidData { details: String },
 }

 macro_rules! impl_from {
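
One place where an `InvalidData` error could plausibly arise is when decoding the `type` column of `linked_chunks`, which the schema restricts to `'E'` or `'G'`. The following sketch assumes that use; it is not taken from the PR. The `Error` enum is a trimmed-down local mirror with a hand-written `Display` impl in place of the `thiserror` derive, and `ChunkType` and `parse_chunk_type` are hypothetical names.

```rust
use std::fmt;

/// Trimmed-down local mirror of the error variant added in the diff.
#[derive(Debug, PartialEq, Eq)]
enum Error {
    InvalidData { details: String },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::InvalidData { details } => {
                write!(f, "The store contains invalid data: {}", details)
            }
        }
    }
}

impl std::error::Error for Error {}

/// Hypothetical decoded form of the `type` column.
#[derive(Debug, PartialEq, Eq)]
enum ChunkType {
    Events,
    Gap,
}

/// Hypothetical helper: decode the `type` column, reporting anything other
/// than "E" or "G" as invalid stored data.
fn parse_chunk_type(raw: &str) -> Result<ChunkType, Error> {
    match raw {
        "E" => Ok(ChunkType::Events),
        "G" => Ok(ChunkType::Gap),
        other => Err(Error::InvalidData {
            details: format!("unknown chunk type: {:?}", other),
        }),
    }
}

fn main() {
    println!("{:?}", parse_chunk_type("E"));
    match parse_chunk_type("X") {
        Ok(chunk_type) => println!("{:?}", chunk_type),
        Err(error) => println!("{}", error),
    }
}
```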