Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Order heroes by stream_ordering (as spec'ed) #17435

Merged
merged 7 commits into from
Jul 17, 2024
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/17435.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Order `heroes` by `stream_ordering` as the Matrix specification states (applies to `/sync`).
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
32 changes: 24 additions & 8 deletions synapse/storage/databases/main/roommember.py
Original file line number Diff line number Diff line change
Expand Up @@ -279,8 +279,19 @@ def _get_users_in_room_with_profiles(

@cached(max_entries=100000) # type: ignore[synapse-@cached-mutable]
async def get_room_summary(self, room_id: str) -> Mapping[str, MemberSummary]:
"""Get the details of a room roughly suitable for use by the room
"""
Get the details of a room roughly suitable for use by the room
summary extension to /sync. Useful when lazy loading room members.

Returns the total count of members in the room by membership type, and a
truncated list of members (the heroes). This will be the first 6 members of the
room:
- We want 5 heroes plus 1, in case one of them is the
calling user.
- They are ordered by `stream_ordering`, which are joined or
invited. When no joined or invited members are available, this also includes
banned and left users.

Args:
room_id: The room ID to query
Returns:
Expand Down Expand Up @@ -308,23 +319,28 @@ def _get_room_summary_txn(
for count, membership in txn:
res.setdefault(membership, MemberSummary([], count))

# we order by membership and then fairly arbitrarily by event_id so
# heroes are consistent
# Note, rejected events will have a null membership field, so
# we we manually filter them out.
# Order by membership (joins -> invites -> leave (former insiders) -> everything else
# including knocks since they are outsiders), then by `stream_ordering` so
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec says:

This should be the first 5 members of the room, ordered by stream ordering, which are joined or invited. The list must never include the client’s own user ID. When no joined or invited members are available, this should consist of the banned and left users.

-- https://spec.matrix.org/v1.10/client-server-api/#_matrixclientv3sync_roomsummary

We're considering banned/knock users to be on the same level (last resort)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

knock members were (potentially) never in the room, though, it seems a bit wrong to include them at all IMO. The spec also doesn't mention them it seems?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably consider leave + ban on the same level TBH. Ban is just a leave with more force.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like knocks are already excluded because when we use extract_heroes_from_room_summary(...) it doesn't take them into account. I also added some tests for this.

I'm considering a leave to be a former insider (alumni) and a ban to be an outsider (no longer welcome, excommunicated).

# the first members in the room show up first and to make the sort stable
# (consistent heroes).
#
# Note: rejected events will have a null membership field, so we we manually
# filter them out.
sql = """
SELECT state_key, membership, event_id
FROM current_state_events
WHERE type = 'm.room.member' AND room_id = ?
AND membership IS NOT NULL
ORDER BY
CASE membership WHEN ? THEN 1 WHEN ? THEN 2 ELSE 3 END ASC,
event_id ASC
CASE membership WHEN ? THEN 1 WHEN ? THEN 2 WHEN ? THEN 3 ELSE 4 END ASC,
event_stream_ordering ASC
Comment on lines +335 to +336
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By changing this now, it will probably cause peoples rooms names to calculate to something different across the board (this function is also used by Sync v2). Are we ok with that? Potentially disruptive if people are really used to their old "flawed" name.

It's already a FIXME comment in extract_heroes_from_room_summary(...) but we still need to pull in the right order from the database.

Related to matrix-org/matrix-spec#1334

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To note the "flaws" with the current implementation sorting by event_id, the heroes will shift around as people join because their join event_id might sort lexicographically before others. The heroes will also shift as people change their display name and avatar because their join event_id will change. It also simply doesn't grab the first 5 members of the room as expected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as you say, it's already unstable. I don't see any harm in changing this.

MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
LIMIT ?
"""

# 6 is 5 (number of heroes) plus 1, in case one of them is the calling user.
txn.execute(sql, (room_id, Membership.JOIN, Membership.INVITE, 6))
txn.execute(
sql, (room_id, Membership.JOIN, Membership.INVITE, Membership.LEAVE, 6)
)
for user_id, membership, event_id in txn:
summary = res[membership]
# we will always have a summary for this membership type at this
Expand Down
21 changes: 9 additions & 12 deletions tests/rest/client/test_sync.py
Original file line number Diff line number Diff line change
Expand Up @@ -2186,18 +2186,15 @@ def test_rooms_meta_heroes_max(self) -> None:

# Room2 doesn't have a name so we should see `heroes` populated
self.assertIsNone(channel.json_body["rooms"][room_id1].get("name"))
# FIXME: Remove this basic assertion and uncomment the better assertion below
# after https://github.com/element-hq/synapse/pull/17435 merges
self.assertEqual(len(channel.json_body["rooms"][room_id1].get("heroes", [])), 5)
# self.assertCountEqual(
# [
# hero["user_id"]
# for hero in channel.json_body["rooms"][room_id1].get("heroes", [])
# ],
# # Heroes should be the first 5 users in the room (excluding the user
# # themselves, we shouldn't see `user1`)
# [user2_id, user3_id, user4_id, user5_id, user6_id],
# )
self.assertCountEqual(
[
hero["user_id"]
for hero in channel.json_body["rooms"][room_id1].get("heroes", [])
],
# Heroes should be the first 5 users in the room (excluding the user
# themselves, we shouldn't see `user1`)
[user2_id, user3_id, user4_id, user5_id, user6_id],
)
self.assertEqual(
channel.json_body["rooms"][room_id1]["joined_count"],
7,
Expand Down
Loading