Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Add experimental support for MSC3391: deleting account data #14714

Merged
merged 15 commits into from
Jan 1, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/14714.feature
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Add experimental support for [MSC3391](https://github.com/matrix-org/matrix-spec-proposals/pull/3391) (removing account data).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks plausible over all. I'm a little skeptical about not deleting the data though (and filling it in with {}). #14244 doesn't really seem to address this approach at all and seems to document a completely different approach?

Would using null as the value for deleted items let us keep track of the stream ID while differentiating "empty" from "deleted"?

The advantage of using {} is that clients will then automatically sync the account data type with a value of {}, which as per MSC3391 implies to the client that the account data entry was deleted. It also comes with the benefit that older clients, which set account data content to {} to "delete" it, would actually start deleting items.

We could use NULL, but I don't see any benefit other than it being a more commonplace value for marking an item as deleted. Also note that the content column of account_data and room_account_data is currently marked as NOT NULL.

The reason for all of the faff with the account_data_undelivered_deletes table is the need to keep track of when we can remove the relevant row from (room_)account_data - after all devices of a user has seen the change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The advantage of using {} is that clients will then automatically sync the account data type with a value of {}, which as per MSC3391 implies to the client that the account data entry was deleted. It also comes with the benefit that older clients, which set account data content to {} to "delete" it, would actually start deleting items.

But we could still provide that to clients if we stored it as NULL by coercing it when passing back to clients.

We could use NULL, but I don't see any benefit other than it being a more commonplace value for marking an item as deleted.

The main benefit is that we know whether the data was actually deleted or just cleared. I'm not sure if this difference actually matters or not though. Maybe we don't want to actually delete rows if the data is just cleared and not deleted?

Also note that the content column of account_data and room_account_data is currently marked as NOT NULL.

I believe making a column non-null is not difficult: I don't think this would be a blocker.

The reason for all of the faff with the account_data_undelivered_deletes table is the need to keep track of when we can remove the relevant row from (room_)account_data - after all devices of a user has seen the change.

Hmm, OK. I don't think I'm following all of the changes being made then. This PR Is only the first of a few which will make these changes, I guess I'm having trouble lining up what parts of #14244 are in this PR and what is a future PR. That has a bunch of things ticked off which I don't see in this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main benefit is that we know whether the data was actually deleted or just cleared. I'm not sure if this difference actually matters or not though. Maybe we don't want to actually delete rows if the data is just cleared and not deleted?

That's true, and could be beneficial in a somewhat niche use case. The plan #14244 talks about involves creating an entry in a account_data_undelivered_deletes table when a delete occurs. That has the side-effect of also letting us know which rows were deleted vs. set to {}.

After this PR, we don't need to differentiate between delete account data entries and those set to {} - as we should consider them the same thing. In either case, clients should see the entry as deleted. And we'll clear rows from the database via the information stored in the account_data_undelivered_deletes table.

Without the account_data_undelivered_deletes table (or some other method to track which devices still need to be informed about an account data entry being deleted) we can't delete rows from (room_)account_data anyway.

And in the case MSC3391 ends up not being accepted, leaving the rows as {} instead of NULL is probably more desirable.

2 changes: 2 additions & 0 deletions docker/complement/conf/workers-shared-extra.yaml.j2
Original file line number Diff line number Diff line change
Expand Up @@ -102,6 +102,8 @@ experimental_features:
{% endif %}
# Filtering /messages by relation type.
msc3874_enabled: true
# Enable removing account data support
msc3391_enabled: true

server_notices:
system_mxid_localpart: _server
Expand Down
2 changes: 1 addition & 1 deletion scripts-dev/complement.sh
Original file line number Diff line number Diff line change
Expand Up @@ -190,7 +190,7 @@ fi

extra_test_args=()

test_tags="synapse_blacklist,msc3787,msc3874"
test_tags="synapse_blacklist,msc3787,msc3874,msc3391"

# All environment variables starting with PASS_ will be shared.
# (The prefix is stripped off before reaching the container.)
Expand Down
3 changes: 3 additions & 0 deletions synapse/config/experimental.py
Original file line number Diff line number Diff line change
Expand Up @@ -136,3 +136,6 @@ def read_config(self, config: JsonDict, **kwargs: Any) -> None:
# Enable room version (and thus applicable push rules from MSC3931/3932)
version_id = RoomVersions.MSC1767v10.identifier
KNOWN_ROOM_VERSIONS[version_id] = RoomVersions.MSC1767v10

# MSC3391: Removing account data.
self.msc3391_enabled = experimental.get("msc3391_enabled", False)
111 changes: 103 additions & 8 deletions synapse/handlers/account_data.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,10 +17,12 @@
from typing import TYPE_CHECKING, Awaitable, Callable, Collection, List, Optional, Tuple

from synapse.replication.http.account_data import (
ReplicationAddRoomAccountDataRestServlet,
ReplicationAddTagRestServlet,
ReplicationAddUserAccountDataRestServlet,
ReplicationRemoveRoomAccountDataRestServlet,
ReplicationRemoveTagRestServlet,
ReplicationRoomAccountDataRestServlet,
ReplicationUserAccountDataRestServlet,
ReplicationRemoveUserAccountDataRestServlet,
)
from synapse.streams import EventSource
from synapse.types import JsonDict, StreamKeyType, UserID
Expand All @@ -41,8 +43,18 @@ def __init__(self, hs: "HomeServer"):
self._instance_name = hs.get_instance_name()
self._notifier = hs.get_notifier()

self._user_data_client = ReplicationUserAccountDataRestServlet.make_client(hs)
self._room_data_client = ReplicationRoomAccountDataRestServlet.make_client(hs)
self._add_user_data_client = (
ReplicationAddUserAccountDataRestServlet.make_client(hs)
)
self._remove_user_data_client = (
ReplicationRemoveUserAccountDataRestServlet.make_client(hs)
)
self._add_room_data_client = (
ReplicationAddRoomAccountDataRestServlet.make_client(hs)
)
self._remove_room_data_client = (
ReplicationRemoveRoomAccountDataRestServlet.make_client(hs)
)
self._add_tag_client = ReplicationAddTagRestServlet.make_client(hs)
self._remove_tag_client = ReplicationRemoveTagRestServlet.make_client(hs)
self._account_data_writers = hs.config.worker.writers.account_data
Expand Down Expand Up @@ -112,7 +124,7 @@ async def add_account_data_to_room(

return max_stream_id
else:
response = await self._room_data_client(
response = await self._add_room_data_client(
instance_name=random.choice(self._account_data_writers),
user_id=user_id,
room_id=room_id,
Expand All @@ -121,15 +133,59 @@ async def add_account_data_to_room(
)
return response["max_stream_id"]

async def remove_account_data_for_room(
self, user_id: str, room_id: str, account_data_type: str
) -> Optional[int]:
"""
Deletes the room account data for the given user and account data type.

"Deleting" account data merely means setting the content of the account data
to an empty JSON object: {}.

Args:
user_id: The user ID to remove room account data for.
room_id: The room ID to target.
account_data_type: The account data type to remove.

Returns:
The maximum stream ID, or None if the room account data item did not exist.
"""
if self._instance_name in self._account_data_writers:
max_stream_id = await self._store.remove_account_data_for_room(
user_id, room_id, account_data_type
)
if max_stream_id is None:
# The referenced account data did not exist, so no delete occurred.
return None

self._notifier.on_new_event(
StreamKeyType.ACCOUNT_DATA, max_stream_id, users=[user_id]
)

# Notify Synapse modules that the content of the type has changed to an
# empty dictionary.
await self._notify_modules(user_id, room_id, account_data_type, {})

return max_stream_id
else:
response = await self._remove_room_data_client(
instance_name=random.choice(self._account_data_writers),
user_id=user_id,
room_id=room_id,
account_data_type=account_data_type,
content={},
)
return response["max_stream_id"]

async def add_account_data_for_user(
self, user_id: str, account_data_type: str, content: JsonDict
) -> int:
"""Add some global account_data for a user.

Args:
user_id: The user to add a tag for.
user_id: The user to add some account data for.
account_data_type: The type of account_data to add.
content: A json object to associate with the tag.
content: The content json dictionary.

Returns:
The maximum stream ID.
Expand All @@ -148,14 +204,53 @@ async def add_account_data_for_user(

return max_stream_id
else:
response = await self._user_data_client(
response = await self._add_user_data_client(
instance_name=random.choice(self._account_data_writers),
user_id=user_id,
account_data_type=account_data_type,
content=content,
)
return response["max_stream_id"]

async def remove_account_data_for_user(
self, user_id: str, account_data_type: str
) -> Optional[int]:
"""Removes a piece of global account_data for a user.

Args:
user_id: The user to remove account data for.
account_data_type: The type of account_data to remove.

Returns:
The maximum stream ID, or None if the room account data item did not exist.
"""

if self._instance_name in self._account_data_writers:
max_stream_id = await self._store.remove_account_data_for_user(
user_id, account_data_type
)
if max_stream_id is None:
# The referenced account data did not exist, so no delete occurred.
return None

self._notifier.on_new_event(
StreamKeyType.ACCOUNT_DATA, max_stream_id, users=[user_id]
)

# Notify Synapse modules that the content of the type has changed to an
# empty dictionary.
await self._notify_modules(user_id, None, account_data_type, {})

return max_stream_id
else:
response = await self._remove_user_data_client(
instance_name=random.choice(self._account_data_writers),
user_id=user_id,
account_data_type=account_data_type,
content={},
)
return response["max_stream_id"]

async def add_tag_to_room(
self, user_id: str, room_id: str, tag: str, content: JsonDict
) -> int:
Expand Down
92 changes: 84 additions & 8 deletions synapse/replication/http/account_data.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@
logger = logging.getLogger(__name__)


class ReplicationUserAccountDataRestServlet(ReplicationEndpoint):
class ReplicationAddUserAccountDataRestServlet(ReplicationEndpoint):
"""Add user account data on the appropriate account data worker.

Request format:
Expand All @@ -49,7 +49,6 @@ def __init__(self, hs: "HomeServer"):
super().__init__(hs)

self.handler = hs.get_account_data_handler()
self.clock = hs.get_clock()

@staticmethod
async def _serialize_payload( # type: ignore[override]
Expand All @@ -73,7 +72,45 @@ async def _handle_request( # type: ignore[override]
return 200, {"max_stream_id": max_stream_id}


class ReplicationRoomAccountDataRestServlet(ReplicationEndpoint):
class ReplicationRemoveUserAccountDataRestServlet(ReplicationEndpoint):
"""Remove user account data on the appropriate account data worker.

Request format:

POST /_synapse/replication/remove_user_account_data/:user_id/:type

{
"content": { ... },
}

"""

NAME = "remove_user_account_data"
PATH_ARGS = ("user_id", "account_data_type")
CACHE = False

def __init__(self, hs: "HomeServer"):
super().__init__(hs)

self.handler = hs.get_account_data_handler()

@staticmethod
async def _serialize_payload( # type: ignore[override]
user_id: str, account_data_type: str
) -> JsonDict:
return {}

async def _handle_request( # type: ignore[override]
self, request: Request, user_id: str, account_data_type: str
) -> Tuple[int, JsonDict]:
max_stream_id = await self.handler.remove_account_data_for_user(
user_id, account_data_type
)

return 200, {"max_stream_id": max_stream_id}


class ReplicationAddRoomAccountDataRestServlet(ReplicationEndpoint):
"""Add room account data on the appropriate account data worker.

Request format:
Expand All @@ -94,7 +131,6 @@ def __init__(self, hs: "HomeServer"):
super().__init__(hs)

self.handler = hs.get_account_data_handler()
self.clock = hs.get_clock()

@staticmethod
async def _serialize_payload( # type: ignore[override]
Expand All @@ -118,6 +154,44 @@ async def _handle_request( # type: ignore[override]
return 200, {"max_stream_id": max_stream_id}


class ReplicationRemoveRoomAccountDataRestServlet(ReplicationEndpoint):
"""Remove room account data on the appropriate account data worker.

Request format:

POST /_synapse/replication/remove_room_account_data/:user_id/:room_id/:account_data_type

{
"content": { ... },
}

"""

NAME = "remove_room_account_data"
PATH_ARGS = ("user_id", "room_id", "account_data_type")
CACHE = False

def __init__(self, hs: "HomeServer"):
super().__init__(hs)

self.handler = hs.get_account_data_handler()

@staticmethod
async def _serialize_payload( # type: ignore[override]
user_id: str, room_id: str, account_data_type: str, content: JsonDict
) -> JsonDict:
return {}

async def _handle_request( # type: ignore[override]
self, request: Request, user_id: str, room_id: str, account_data_type: str
) -> Tuple[int, JsonDict]:
max_stream_id = await self.handler.remove_account_data_for_room(
user_id, room_id, account_data_type
)

return 200, {"max_stream_id": max_stream_id}


class ReplicationAddTagRestServlet(ReplicationEndpoint):
"""Add tag on the appropriate account data worker.

Expand All @@ -139,7 +213,6 @@ def __init__(self, hs: "HomeServer"):
super().__init__(hs)

self.handler = hs.get_account_data_handler()
self.clock = hs.get_clock()

@staticmethod
async def _serialize_payload( # type: ignore[override]
Expand Down Expand Up @@ -186,7 +259,6 @@ def __init__(self, hs: "HomeServer"):
super().__init__(hs)

self.handler = hs.get_account_data_handler()
self.clock = hs.get_clock()

@staticmethod
async def _serialize_payload(user_id: str, room_id: str, tag: str) -> JsonDict: # type: ignore[override]
Expand All @@ -206,7 +278,11 @@ async def _handle_request( # type: ignore[override]


def register_servlets(hs: "HomeServer", http_server: HttpServer) -> None:
ReplicationUserAccountDataRestServlet(hs).register(http_server)
ReplicationRoomAccountDataRestServlet(hs).register(http_server)
ReplicationAddUserAccountDataRestServlet(hs).register(http_server)
ReplicationAddRoomAccountDataRestServlet(hs).register(http_server)
ReplicationAddTagRestServlet(hs).register(http_server)
ReplicationRemoveTagRestServlet(hs).register(http_server)

if hs.config.experimental.msc3391_enabled:
ReplicationRemoveUserAccountDataRestServlet(hs).register(http_server)
ReplicationRemoveRoomAccountDataRestServlet(hs).register(http_server)
Loading