-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Remove some more joins on room_memberships #5752
Conversation
This will allow us to efficiently filter out rooms that have been forgotten in other queries without having to join against the `room_memberships` table.
Codecov Report
@@ Coverage Diff @@
## develop #5752 +/- ##
==========================================
+ Coverage 63.36% 63.4% +0.03%
==========================================
Files 331 331
Lines 36399 36424 +25
Branches 6007 6009 +2
==========================================
+ Hits 23063 23093 +30
+ Misses 11697 11692 -5
Partials 1639 1639 |
""" | ||
|
||
def _get_forgotten_rooms_for_user_txn(txn): | ||
# This is a slightly convoluted query that first looks up all rooms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so is this in aid of trying to support unforgetting of rooms? is that a thing that's possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, you can just rejoin the room (I believe)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually could you update the comment to make this a bit clearer?
sql = """ | ||
SELECT room_id, ( | ||
SELECT count(*) FROM room_memberships | ||
WHERE room_id = m.room_id AND user_id = m.user_id AND forgotten = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
won't this include membership events that happened before the "forget"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the non-obvious way forgetting works is to mark all room memberships with forgotten = 1
, so to check if a room is forgotten or not is a case of checking if any memberships have forgotten = 0
* limitations under the License. | ||
*/ | ||
|
||
-- We add membership to current state so that we don't need to join against |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment looks a bit lost?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Err, yes :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, once the CI is sorted out
WHERE room_id = m.room_id AND user_id = m.user_id AND forgotten = 0 | ||
) AS count | ||
FROM room_memberships AS m | ||
WHERE user_id = ? AND forgotten = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is going to be very painful until that background reindex completes, isn't it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It takes about ~500ms to 900ms for large accounts, small accounts take a few ms.
The query plan does a bitmap scan across room_id
and user_id
indices:
matrix=> explain analyze SELECT room_id, (
SELECT count(*) FROM room_memberships
WHERE room_id = m.room_id AND user_id = m.user_id AND forgotten = 0
) AS count
FROM room_memberships AS m
WHERE user_id = '@e:matrix.org' AND forgotten = 1
GROUP BY room_id, user_id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Group (cost=2518.90..4262.71 rows=4 width=68) (actual time=0.412..0.412 rows=0 loops=1)
Group Key: m.room_id, m.user_id
-> Sort (cost=2518.90..2518.91 rows=4 width=60) (actual time=0.411..0.411 rows=0 loops=1)
Sort Key: m.room_id
Sort Method: quicksort Memory: 25kB
-> Index Scan using room_memberships_user_id on room_memberships m (cost=0.57..2518.86 rows=4 width=60) (actual time=0.407..0.407 rows=0 loops=1)
Index Cond: (user_id = '@e:matrix.org'::text)
Filter: (forgotten = 1)
Rows Removed by Filter: 46
SubPlan 1
-> Aggregate (cost=435.94..435.95 rows=1 width=8) (never executed)
-> Bitmap Heap Scan on room_memberships (cost=433.92..435.93 rows=1 width=0) (never executed)
Recheck Cond: ((user_id = m.user_id) AND (room_id = m.room_id))
Filter: (forgotten = 0)
-> BitmapAnd (cost=433.92..433.92 rows=1 width=0) (never executed)
-> Bitmap Index Scan on room_memberships_user_id (cost=0.00..32.24 rows=1289 width=0) (never executed)
Index Cond: (user_id = m.user_id)
-> Bitmap Index Scan on room_memberships_room_id (cost=0.00..401.43 rows=17981 width=0) (never executed)
Index Cond: (room_id = m.room_id)
Planning time: 0.144 ms
Execution time: 0.468 ms
(21 rows)
Synapse 1.3.0 (2019-08-15) ========================== Bugfixes -------- - Fix 500 Internal Server Error on `publicRooms` when the public room list was cached. ([\#5851](#5851)) Synapse 1.3.0rc1 (2019-08-13) ========================== Features -------- - Use `M_USER_DEACTIVATED` instead of `M_UNKNOWN` for errcode when a deactivated user attempts to login. ([\#5686](#5686)) - Add sd_notify hooks to ease systemd integration and allows usage of Type=Notify. ([\#5732](#5732)) - Synapse will no longer serve any media repo admin endpoints when `enable_media_repo` is set to False in the configuration. If a media repo worker is used, the admin APIs relating to the media repo will be served from it instead. ([\#5754](#5754), [\#5848](#5848)) - Synapse can now be configured to not join remote rooms of a given "complexity" (currently, state events) over federation. This option can be used to prevent adverse performance on resource-constrained homeservers. ([\#5783](#5783)) - Allow defining HTML templates to serve the user on account renewal attempt when using the account validity feature. ([\#5807](#5807)) Bugfixes -------- - Fix UISIs during homeserver outage. ([\#5693](#5693), [\#5789](#5789)) - Fix stack overflow in server key lookup code. ([\#5724](#5724)) - start.sh no longer uses deprecated cli option. ([\#5725](#5725)) - Log when we receive an event receipt from an unexpected origin. ([\#5743](#5743)) - Fix debian packaging scripts to correctly build sid packages. ([\#5775](#5775)) - Correctly handle redactions of redactions. ([\#5788](#5788)) - Return 404 instead of 403 when accessing /rooms/{roomId}/event/{eventId} for an event without the appropriate permissions. ([\#5798](#5798)) - Fix check that tombstone is a state event in push rules. ([\#5804](#5804)) - Fix error when trying to login as a deactivated user when using a worker to handle login. ([\#5806](#5806)) - Fix bug where user `/sync` stream could get wedged in rare circumstances. ([\#5825](#5825)) - The purge_remote_media.sh script was fixed. ([\#5839](#5839)) Deprecations and Removals ------------------------- - Synapse now no longer accepts the `-v`/`--verbose`, `-f`/`--log-file`, or `--log-config` command line flags, and removes the deprecated `verbose` and `log_file` configuration file options. Users of these options should migrate their options into the dedicated log configuration. ([\#5678](#5678), [\#5729](#5729)) - Remove non-functional 'expire_access_token' setting. ([\#5782](#5782)) Internal Changes ---------------- - Make Jaeger fully configurable. ([\#5694](#5694)) - Add precautionary measures to prevent future abuse of `window.opener` in default welcome page. ([\#5695](#5695)) - Reduce database IO usage by optimising queries for current membership. ([\#5706](#5706), [\#5738](#5738), [\#5746](#5746), [\#5752](#5752), [\#5770](#5770), [\#5774](#5774), [\#5792](#5792), [\#5793](#5793)) - Improve caching when fetching `get_filtered_current_state_ids`. ([\#5713](#5713)) - Don't accept opentracing data from clients. ([\#5715](#5715)) - Speed up PostgreSQL unit tests in CI. ([\#5717](#5717)) - Update the coding style document. ([\#5719](#5719)) - Improve database query performance when recording retry intervals for remote hosts. ([\#5720](#5720)) - Add a set of opentracing utils. ([\#5722](#5722)) - Cache result of get_version_string to reduce overhead of `/version` federation requests. ([\#5730](#5730)) - Return 'user_type' in admin API user endpoints results. ([\#5731](#5731)) - Don't package the sytest test blacklist file. ([\#5733](#5733)) - Replace uses of returnValue with plain return, as returnValue is not needed on Python 3. ([\#5736](#5736)) - Blacklist some flakey tests in worker mode. ([\#5740](#5740)) - Fix some error cases in the caching layer. ([\#5749](#5749)) - Add a prometheus metric for pending cache lookups. ([\#5750](#5750)) - Stop trying to fetch events with event_id=None. ([\#5753](#5753)) - Convert RedactionTestCase to modern test style. ([\#5768](#5768)) - Allow looping calls to be given arguments. ([\#5780](#5780)) - Set the logs emitted when checking typing and presence timeouts to DEBUG level, not INFO. ([\#5785](#5785)) - Remove DelayedCall debugging from the test suite, as it is no longer required in the vast majority of Synapse's tests. ([\#5787](#5787)) - Remove some spurious exceptions from the logs where we failed to talk to a remote server. ([\#5790](#5790)) - Improve performance when making `.well-known` requests by sharing the SSL options between requests. ([\#5794](#5794)) - Disable codecov GitHub comments on PRs. ([\#5796](#5796)) - Don't allow clients to send tombstone events that reference the room it's sent in. ([\#5801](#5801)) - Deny redactions of events sent in a different room. ([\#5802](#5802)) - Deny sending well known state types as non-state events. ([\#5805](#5805)) - Handle incorrectly encoded query params correctly by returning a 400. ([\#5808](#5808)) - Handle pusher being deleted during processing rather than logging an exception. ([\#5809](#5809)) - Return 502 not 500 when failing to reach any remote server. ([\#5810](#5810)) - Reduce global pauses in the events stream caused by expensive state resolution during persistence. ([\#5826](#5826)) - Add a lower bound to well-known lookup cache time to avoid repeated lookups. ([\#5836](#5836)) - Whitelist history visbility sytests in worker mode tests. ([\#5843](#5843))
This involves changing the way we filter out forgotten rooms, which needed a new index.
These commits are independently reviewable.