Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Presence should send less presence when joining a room #15818

Closed
wants to merge 8 commits into from
Closed
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/15818.misc
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Reduce the amount of outgoing presence when joining a room.
63 changes: 57 additions & 6 deletions synapse/handlers/presence.py
Original file line number Diff line number Diff line change
Expand Up @@ -1430,37 +1430,88 @@ async def _handle_state_delta(self, room_id: str, deltas: List[JsonDict]) -> Non
# 2. presence states of newly joined users to all remote servers in
# the room.
#
# TODO: Only send presence states to remote hosts that don't already
# However, only send presence states to remote hosts that don't already
# have them (because they already share rooms).

# Get all the users who were already in the room, by fetching the
# current users in the room and removing the newly joined users.
users = await self.store.get_users_in_room(room_id)
prev_users = set(users) - newly_joined_users

async def is_only_one_room_shared(
current_user_id: str, other_user_id: str
) -> bool:
realtyem marked this conversation as resolved.
Show resolved Hide resolved
"""Check to see if there is more than one shared room between these users"""
pair_of_users_to_check = frozenset((current_user_id, other_user_id))
# By learning if these two users share any other room, we can tell
# if user_id's server needs to receive presence or not
rooms_shared = await self.store.get_mutual_rooms_between_users(
pair_of_users_to_check
)

# They always share at least one room, the current one. As such this will
# never be zero
if len(rooms_shared) < 2:
return True

return False

# Construct sets for all the local users and remote hosts that were
# already in the room
prev_local_users = []
prev_remote_hosts = set()
prev_remote_hosts: Set[str] = set()

for user_id in prev_users:
if self.is_mine_id(user_id):
prev_local_users.append(user_id)
else:
prev_remote_hosts.add(get_domain_from_id(user_id))
# Only save a host to send presence to if this user shares no other
# rooms with other users this server knows about
host = get_domain_from_id(user_id)

# Remove current user_id from the set to check against
prev_users_excluding_current_user = prev_users.copy() - {user_id}

for other_user_id in prev_users_excluding_current_user:
# If this host has been seen in a list already, skip it.
if host not in prev_remote_hosts:
if await is_only_one_room_shared(user_id, other_user_id):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following this, why are we comparing all users previously in the room to see if they share a room? Notably, I think both user_id and other_user_id can be remote users?

I'd have thought that at least one iteration here should be across the newly joined users or hosts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following this, why are we comparing all users previously in the room to see if they share a room?

Excellent question. We want to build a full list of all remote servers in the room, not the actual users. (We shortcut out so not to hit the database more than necessary). But, we want the list to excluded some servers, hence the comparison. The list needs to be both as full as possible and as empty as possible at the same time.

Notably, I think both user_id and other_user_id can be remote users?

Not just can be, they are. It means that we know the host from user_id is in this room, but if other_user_id's host(which we know is in this room obviously) is in yet another room that we know about because it's federated to this server, than user_id's host can be excluded, since they are already receiving presence. For example:
@me:myserver, @alice:server1 and @bob:server2 are already in Synapse Admins.
@me:myserver, @bob:server2 and @charlie:server2 are in Matrix HQ.
@alice:server1 is joining Matrix HQ.
Because we know server1 is already receiving presence info about @bob:server2, it can be excluded from the list of prev_remote_hosts.

In retrospect, perhaps an additional list maintaining all the hosts we know can be excluded would also be a good idea, to cut down on database hits? However, I see a potential gotcha here, as what happens to @charlie:server2? If we maintain an excluded list as well, he would never be checked against and therefore(at least at first) not receive the new presence data.(Depending of course on if @bob or @charlie is checked first)

I'd have thought that at least one iteration here should be across the newly joined users or hosts?

Newly joined(either local or remote) is dealt with in the next block below, starting at about line 1496.

It could actually be argued that presence would be retrieved on the first(next?) sync that occurs after the room is fully joined anyways, so why are we bothering with this at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had an epiphany while at work tonight; the prev_remote_hosts only needs to be built if one of the newly joined users is local. If that's what you were getting at with your question, mission accomplished 😉. I'll do a refactor in a bit.

Copy link
Contributor Author

@realtyem realtyem Jul 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had an epiphany while at work tonight; the prev_remote_hosts only needs to be built if one of the newly joined users is local. If that's what you were getting at with your question, mission accomplished 😉. I'll do a refactor in a bit.

I still want to make this optimization a thing, but perhaps it's best left to another PR. If that's ok? Nope, never mind. Still need it after all. Testing is great. Can't whittle down the list if we don't have the data. I'm still gonna ponder on this, but right now it's not easily actionable.

prev_remote_hosts.add(host)
# Since we are focusing on the host from user_id, break out
# of the loop once its found
break

# Similarly, construct sets for all the local users and remote hosts
# that were *not* already in the room. Care needs to be taken with the
# calculating the remote hosts, as a host may have already been in the
# room even if there is a newly joined user from that host.
newly_joined_local_users = []
newly_joined_remote_hosts = set()
newly_joined_remote_hosts: Set[str] = set()

for user_id in newly_joined_users:
if self.is_mine_id(user_id):
newly_joined_local_users.append(user_id)
else:
host = get_domain_from_id(user_id)
if host not in prev_remote_hosts:
newly_joined_remote_hosts.add(host)
# To filter out hosts that are already in other rooms shared with
# this user, use the set constructed to get the full list of users as
# we need to check not just prev_users but newly joined as well
#
# We still remove the current user, as checking yourself is redundant
users_excluding_current_user_id = set(users) - {user_id}
for other_user_id in users_excluding_current_user_id:
# If a host is present in either of these lists, there is no point
# looking for it again. This is the key difference from the
# 'previous users' section above.
if (
host not in prev_remote_hosts
and host not in newly_joined_remote_hosts
):
if await is_only_one_room_shared(user_id, other_user_id):
newly_joined_remote_hosts.add(host)
# Since we are focusing on the host from user_id, break out
# of the loop once its found
break

# Send presence states of all local users in the room to newly joined
# remote servers. (We actually only send states for local users already
Expand Down
272 changes: 272 additions & 0 deletions tests/handlers/test_presence.py
Original file line number Diff line number Diff line change
Expand Up @@ -1118,6 +1118,239 @@ def test_remote_gets_presence_when_local_user_joins(self) -> None:
destinations={"server2", "server3"}, states=[expected_state]
)

def test_remote_joins_with_previously_shared_rooms_do_not_send_presence(
self,
) -> None:
# This test starts with our local user, test:server, presence set to "offline"
# Create two rooms
# Join our second local user, test2:server, to the room
# Set test2:server's presence to "online"
# Verify local user test2:server has presence "online"
# Add remote user alice:server2 to room 1
# Verify that presence was sent to server2
# Add remote user to room 2, alice:server2
# Verify that NO presence was sent
# Add remote user bob:server3 to room 2
# Verify that presence was sent only to server3
# Add remote user bob:server3 to room 1
# Verify that NO presence was sent

# We advance time to something that isn't 0, as we use 0 as a special
# value.
self.reactor.advance(1000000000000)

# Create two rooms with two local users
room_id_1 = self.helper.create_room_as(self.user_id)
room_id_2 = self.helper.create_room_as(self.user_id)

self.helper.join(room_id_1, "@test2:server")
self.helper.join(room_id_2, "@test2:server")

# Mark test2 as online, test will be offline with a last_active of 0
self.get_success(
self.presence_handler.set_state(
UserID.from_string("@test2:server"), {"presence": PresenceState.ONLINE}
)
)
self.reactor.pump([0]) # Wait for presence updates to be handled

self.federation_sender.reset_mock()

# Add a new remote server to the room
self._add_new_user(room_id_1, "@alice:server2")

self.reactor.pump([0]) # Wait for presence updates to be handled

# When new server is joined we send it the local users presence states.
# We expect to only see user @test2:server, as @test:server is offline
# and has a zero last_active_ts
expected_state = self.get_success(
self.presence_handler.current_state_for_user("@test2:server")
)
self.assertEqual(expected_state.state, PresenceState.ONLINE)
self.federation_sender.send_presence_to_destinations.assert_called_once_with(
destinations={"server2"}, states=[expected_state]
)

# Don't forget to reset the Mock or we get old data
self.federation_sender.reset_mock()

# Now add alice to the second room. As she and test2 share a room, no presence
# should be sent
self._add_new_user(room_id_2, "@alice:server2")

self.reactor.pump([0]) # Wait for presence updates to be handled

self.federation_sender.send_presence_to_destinations.assert_not_called()

# Now do the same thing with bob, in reverse order
self.federation_sender.reset_mock()
self._add_new_user(room_id_2, "@bob:server3")

self.reactor.pump([0]) # Wait for presence updates to be handled

# When new server is joined we send it the local users presence states.
# We expect to only see user @test2:server, as @test:server is offline
# and has a zero last_active_ts
expected_state = self.get_success(
self.presence_handler.current_state_for_user("@test2:server")
)
self.assertEqual(expected_state.state, PresenceState.ONLINE)
self.federation_sender.send_presence_to_destinations.assert_called_once_with(
destinations={"server3"}, states=[expected_state]
)

# Now add bob to the first room, no presence should be sent as he is already in
# a room on this server
self.federation_sender.reset_mock()
self._add_new_user(room_id_1, "@bob:server3")

self.reactor.pump([0]) # Wait for presence updates to be handled

self.federation_sender.send_presence_to_destinations.assert_not_called()

def test_remote_gets_presence_after_remote_join_leave_join(self) -> None:
# This test starts with our local user, test:server, presence set to "offline"
# Create one rooms
# Set test:server's presence to "online"
# Verify local user test:server has presence "online"
# Add remote user alice:server2 to room 1
# Verify that presence was sent to server2
# Remove remote user(alice:server2) from room
# Re-add alice to room
# Verify that presence is again sent to server2

# We advance time to something that isn't 0, as we use 0 as a special
# value. Not sure if this is actually relevant for this test, but makes
# last_active not be 0
self.reactor.advance(1000000000000)

# Create a room with our local user
room_id = self.helper.create_room_as(self.user_id)

# Mark test as online
self.get_success(
self.presence_handler.set_state(
UserID.from_string("@test:server"), {"presence": PresenceState.ONLINE}
)
)

# Pump the reactor to give presence a chance to update and reload the Mock
self.reactor.pump([0])
self.federation_sender.reset_mock()

# Add remote user alice to room
self._add_new_user(room_id, "@alice:server2")

self.reactor.pump([0]) # Wait for presence updates to be handled

# When new server is joined we send it the local users presence states.
# We expect it to send to server2.
expected_state = self.get_success(
self.presence_handler.current_state_for_user("@test:server")
)
self.assertEqual(expected_state.state, PresenceState.ONLINE)
self.federation_sender.send_presence_to_destinations.assert_called_once_with(
destinations={"server2"}, states=[expected_state]
)

# Don't forget to reset the Mock or we get old data
self.federation_sender.reset_mock()

# Have the remote user leave the room
self._leave_user_from_room(room_id, "@alice:server2")

# Pump the reactor again, but we should see no new presence as they left
self.reactor.pump([0])
self.federation_sender.send_presence_to_destinations.assert_not_called()

# Re-add remote user alice to room, and pump the reactor again
self._add_new_user(room_id, "@alice:server2")
self.reactor.pump([0])

# This server was here before, but they left the room and came back. We should
# be sending them presence again. There is only the one user, test:server
expected_state = self.get_success(
self.presence_handler.current_state_for_user("@test:server")
)
self.assertEqual(expected_state.state, PresenceState.ONLINE)
self.federation_sender.send_presence_to_destinations.assert_called_once_with(
destinations={"server2"}, states=[expected_state]
)

def test_remote_gets_presence_after_local_join_leave_join(self) -> None:
# This test starts with our local user, test:server, presence set to "offline"
# Create one room
# Join our second local user, test2:server, to the room
# Set test2:server's presence to "online"
# Verify local user test2:server has presence "online"
# Add remote user alice:server2 to room 1
# Verify that presence was sent to server2
# Remove local user from room
# Re-add local user to room
# Verify that presence is again sent to server2

# We advance time to something that isn't 0, as we use 0 as a special
# value.
self.reactor.advance(1000000000000)

# Create a room with two local users
room_id = self.helper.create_room_as(self.user_id)
self.helper.join(room_id, "@test2:server")

# Mark test2 as online, test will be offline with a last_active of 0
self.get_success(
self.presence_handler.set_state(
UserID.from_string("@test2:server"), {"presence": PresenceState.ONLINE}
)
)

self.reactor.pump([0]) # Wait for presence updates to be handled

# Don't forget to reset the Mock or we get old data
self.federation_sender.reset_mock()

# Add remote user alice to room and pump the reactor so presence can update
self._add_new_user(room_id, "@alice:server2")
self.reactor.pump([0])

# When new server is joined we send it the local users presence states.
# We expect to only see user @test2:server, as @test:server is offline
# and has a zero last_active_ts
expected_state = self.get_success(
self.presence_handler.current_state_for_user("@test2:server")
)
self.assertEqual(expected_state.state, PresenceState.ONLINE)
self.federation_sender.send_presence_to_destinations.assert_called_once_with(
destinations={"server2"}, states=[expected_state]
)

# Don't forget to reset the Mock or we get old data
self.federation_sender.reset_mock()

# Have the local user leave the room and pump the reactor again
self.helper.leave(room_id, "@test2:server")
self.reactor.pump([0])

# Should get no new presence updates
self.federation_sender.send_presence_to_destinations.assert_not_called()

# Re-add local user test2 to room, and pump the reactor
self.helper.join(room_id, "@test2:server")
self.reactor.pump([0])

# This user was here before, but they left the room and came back. We should
# be sending presence to the (only) remote server in the room again.
# We expect to only see user @test2:server, as @test:server is offline
# and has a zero last_active_ts
expected_state = self.get_success(
self.presence_handler.current_state_for_user("@test2:server")
)
self.assertEqual(expected_state.state, PresenceState.ONLINE)
self.federation_sender.send_presence_to_destinations.assert_called_once_with(
destinations={"server2"}, states=[expected_state]
)

def _add_new_user(self, room_id: str, user_id: str) -> None:
"""Add new user to the room by creating an event and poking the federation API."""

Expand Down Expand Up @@ -1153,3 +1386,42 @@ def _add_new_user(self, room_id: str, user_id: str) -> None:
# Check that it was successfully persisted.
self.get_success(self.store.get_event(event.event_id))
self.get_success(self.store.get_event(event.event_id))

def _leave_user_from_room(self, room_id: str, user_id: str) -> None:
"""
Remove user to the room by creating an event and poking the federation API.
Copied and modified from _add_new_user(), please check
"""

hostname = get_domain_from_id(user_id)

room_version = self.get_success(self.store.get_room_version_id(room_id))

builder = EventBuilder(
state=self.state,
event_auth_handler=self._event_auth_handler,
store=self.store,
clock=self.clock,
hostname=hostname,
signing_key=self.random_signing_key,
room_version=KNOWN_ROOM_VERSIONS[room_version],
room_id=room_id,
type=EventTypes.Member,
sender=user_id,
state_key=user_id,
content={"membership": Membership.LEAVE},
)

prev_event_ids = self.get_success(
self.store.get_latest_event_ids_in_room(room_id)
)

event = self.get_success(
builder.build(prev_event_ids=prev_event_ids, auth_event_ids=None)
)

self.get_success(self.federation_event_handler.on_receive_pdu(hostname, event))

# Check that it was successfully persisted.
self.get_success(self.store.get_event(event.event_id))
self.get_success(self.store.get_event(event.event_id))