Conversation
@@ -1462,68 +1468,68 @@ async def have_seen_events(
            event_ids: events we are looking for

        Returns:
-           The set of events we have already seen.
+           The remaining set of events we haven't seen.
Changed from returning the events we saw to the events we didn't see. In big rooms like #matrixhq, we've seen most of the events already; only a small piece of the state is new.
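A minimal illustration of the semantic change (a standalone sketch with hypothetical names, not the real Synapse signatures): returning the unseen set means the common case in a large, mostly-known room returns a small set.

```python
from typing import Collection, Set

# Hypothetical stand-in for "events the server already knows about".
_ALREADY_SEEN: Set[str] = {"$a", "$b", "$c"}


def have_seen_events_old(event_ids: Collection[str]) -> Set[str]:
    """Old behaviour: return the events we *have* seen (usually the big set)."""
    return {eid for eid in event_ids if eid in _ALREADY_SEEN}


def have_seen_events_new(event_ids: Collection[str]) -> Set[str]:
    """New behaviour: return the events we have *not* seen (usually the small set)."""
    return {eid for eid in event_ids if eid not in _ALREADY_SEEN}


print(have_seen_events_old(["$a", "$b", "$new"]))  # {'$a', '$b'} (order may vary)
print(have_seen_events_new(["$a", "$b", "$new"]))  # {'$new'}
```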
        results: Set[str] = set()
        for chunk in batch_iter(event_ids, 500):
            r = await self._have_seen_events_dict(
                [(room_id, event_id) for event_id in chunk]
Just a bunch of avoiding creating new intermediary lists and using `set`s for lookups.
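Roughly what that buys (illustrative only, not the Synapse code path): `set` membership is a hash lookup, while scanning freshly built intermediate lists is linear per check.

```python
# Illustrative only: list membership scans linearly, set membership hashes.
event_ids = [f"$event{i}" for i in range(100_000)]

as_list = list(event_ids)  # `in` walks the list, O(n) per check
as_set = set(event_ids)    # `in` is a single hash lookup, O(1) on average

needle = "$event99999"
assert needle in as_list   # slow path
assert needle in as_set    # fast path
```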
        # we deliberately do *not* query the database for room_id, to make the
        # query an index-only lookup on `events_event_id_key`.
        #
        # We therefore pull the events from the database into a set...

        sql = "SELECT event_id FROM events AS e WHERE "
        clause, args = make_in_list_sql_clause(
-           txn.database_engine, "e.event_id", [eid for (_rid, eid) in remaining]
+           txn.database_engine, "e.event_id", remaining_event_ids
Any random Postgres SQL tricks to make looking up 1000 events faster? Like this sort of stuff:
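For what it's worth, one common Postgres-side approach (and, if memory serves, roughly what `make_in_list_sql_clause` already generates on Postgres) is to bind the whole ID list as a single array parameter and use `= ANY(...)` instead of expanding thousands of placeholders. A hedged sketch, assuming psycopg2 and an illustrative connection string:

```python
import psycopg2

event_ids = ["$a", "$b", "$c"]  # imagine ~1000 of these

conn = psycopg2.connect("dbname=synapse")  # illustrative DSN
with conn, conn.cursor() as cur:
    # psycopg2 adapts a Python list to a Postgres array, so the whole
    # lookup is a single statement with one bound parameter.
    cur.execute(
        "SELECT event_id FROM events WHERE event_id = ANY(%s)",
        (event_ids,),
    )
    seen = {row[0] for row in cur.fetchall()}
print(seen)
```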
    @cachedList(cached_method_name="have_seen_event", list_name="keys")
Is there any max size of a `@cachedList`? It seems like it grows forever.
It ought to be the same cache as `have_seen_event`, which has `max_entries=100000` (before multiplying by the global cache factor).
Some discussion in backend internal. It seems like `have_seen_events` currently uses the default of 1000 from the `DeferredCache`. And it's not one of the `per_cache_factors` that we configure for matrix.org, so it's just some small multiple of that in any case with the `global_factor`.
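Back-of-the-envelope arithmetic for the effective size under those assumptions (the real resizing lives in Synapse's cache code; the factor values here are made up):

```python
from typing import Optional

DEFAULT_MAX_ENTRIES = 1000   # the DeferredCache default mentioned above
GLOBAL_CACHE_FACTOR = 2.0    # hypothetical deployment value


def effective_size(
    max_entries: int, per_cache_factor: Optional[float], global_factor: float
) -> int:
    # Per-cache factor wins if configured; otherwise the global factor applies.
    factor = per_cache_factor if per_cache_factor is not None else global_factor
    return int(max_entries * factor)


print(effective_size(DEFAULT_MAX_ENTRIES, None, GLOBAL_CACHE_FACTOR))  # 2000
print(effective_size(100_000, None, GLOBAL_CACHE_FACTOR))              # 200000
```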
> It ought to be the same cache as have_seen_event, which has max_entries=100000 (before multiplying by the global cache factor).

@squahtx Ahhh, I think I see what you're trying to say. `_have_seen_events_dict()` re-uses the `have_seen_event` cache, which is defined with `max_entries=100000`:
synapse/synapse/storage/databases/main/events_worker.py, lines 1481 to 1482 in f9f0342:

    @cachedList(cached_method_name="have_seen_event", list_name="keys")
    async def _have_seen_events_dict(
synapse/synapse/storage/databases/main/events_worker.py, lines 1523 to 1524 in f9f0342:

    @cached(max_entries=100000, tree=True)
    async def have_seen_event(self, room_id: str, event_id: str) -> bool:
In any case, it seems to have a 0% cache hit rate:
https://grafana.matrix.org/d/000000012/synapse?orgId=1&from=1660891259284&to=1660934459284&viewPanel=1
And the cache size doesn't get above 6k.
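Conceptually, the relationship looks something like this simplified sketch (not the real `@cachedList` machinery): the batched lookup consults the same per-event cache that backs the single lookup, and only the misses go to the database.

```python
from typing import Dict, Iterable

# Simplified stand-in for the shared cache behind have_seen_event.
_have_seen_event_cache: Dict[str, bool] = {}


def _fetch_from_db(event_ids: Iterable[str]) -> Dict[str, bool]:
    # Placeholder for the real batched database query.
    return {eid: True for eid in event_ids}


def have_seen_events_dict(event_ids: Iterable[str]) -> Dict[str, bool]:
    results: Dict[str, bool] = {}
    misses = []
    for eid in event_ids:
        if eid in _have_seen_event_cache:
            results[eid] = _have_seen_event_cache[eid]
        else:
            misses.append(eid)
    if misses:
        fetched = _fetch_from_db(misses)
        _have_seen_event_cache.update(fetched)  # written back into the shared cache
        results.update(fetched)
    return results
```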
This cache still seems so weird to me. Why doesn't it grow bigger? We should at least see 200k state events from Matrix HQ show up and completely fill up the cache sometimes. Or at least the cache should keep growing until full?
https://grafana.matrix.org/d/000000012/synapse?orgId=1&from=1664997921975&to=1666293921975&viewPanel=8
… two events (#13586): Split off from #13561. Part of #13356. Mentioned in [internal doc](https://docs.google.com/document/d/1lvUoVfYUiy6UaHB6Rb4HicjaJAU40-APue9Q4vzuW3c/edit#bookmark=id.2tvwz3yhcafh).
            )
            results.update(eid for ((_rid, eid), have_event) in r.items() if have_event)
        remaining_event_ids: Set[str] = set()
        for chunk in batch_iter(event_ids, 1000):
Why the number 1000 specifically? If it's supposed to match `DeferredCache.max_entries`, could they be made to refer to the same declaration somehow?
It's just arbitrarily doubled to have less round-trip time to the database with the really big rooms. I don't know of a good heuristic to justify a better value, but I know we don't want the SQL queries to get too big, which is why we're batching in the first place.

Good to know that we shouldn't exceed the `max_entries` of the cache though 👍
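For context, the batching just chunks the iterable so a single SQL statement never carries tens of thousands of parameters. A minimal `batch_iter`-style helper (Synapse has its own in its utils; this is a generic sketch):

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def batch_iter(iterable: Iterable[T], size: int) -> Iterator[Tuple[T, ...]]:
    """Yield successive tuples of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, size))
        if not chunk:
            return
        yield chunk


for chunk in batch_iter(range(2500), 1000):
    print(len(chunk))  # 1000, 1000, 500
```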
> Good to know that we shouldn't exceed the max_entries of the cache though

Oh, that was a very tentative "if"; I mentioned it as a guess because it came up in the PR discussion.

If (again, if ;)) they are not supposed to be coupled, perhaps consider moving it to a file-local constant like `EVENT_QUEUE_BATCH_SIZE` or similar and moving its definition next to the existing ones?
        # if the event cache contains the event, obviously we've seen it.
        event_cache_entry_map = self._get_events_from_local_cache(event_ids)
        event_ids_in_cache = event_cache_entry_map.keys()
        remaining_event_ids = {
Are these enumerations a significant part of the runtime? Unless I'm missing something being lazy or whatnot here, this:
`setdiff(events, subset(eventcache, events))`
can be simplified to:
`setdiff(events, eventcache)`
It should only be necessary to perform one full enumeration of the event IDs, in a single pass. That'd require complementing or modifying `_get_events_from_local_cache`, though. Unless you already checked that the interpreter is smart enough about it (I think things have changed a lot there in Python since I actually knew this stuff), inlining that logic could also bring improvements.
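In other words (a hedged sketch of the suggestion, with made-up data), the remaining set can be computed directly against the cache's key view in one conceptual pass:

```python
from typing import Dict, Set

event_ids: Set[str] = {"$a", "$b", "$c", "$d"}
event_cache: Dict[str, object] = {"$a": object(), "$c": object(), "$zzz": object()}

# Current shape: build the cached subset first, then subtract it.
cached_subset = {eid for eid in event_ids if eid in event_cache}
remaining_two_pass = event_ids - cached_subset

# Suggested shape: subtract the cache's key view directly.
remaining_one_pass = event_ids - event_cache.keys()

assert remaining_two_pass == remaining_one_pass == {"$b", "$d"}
```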
> Are these enumerations a significant part of the runtime?

Maybe, but it's practically nothing compared to `@cachedList` being bad. I'm waiting for changes there to see how necessary these kinds of optimizations are.
Fix #13625
Part of #13356
Mentioned in internal doc

Optimize `have_seen_events` because when backfilling #matrixhq, 20s is just calling `have_seen_events` on the 200k state and auth events in the room. And in the future, hopefully a PR to avoid the `have_seen_events` call altogether or do it in the background.

- `have_seen_events` (157 `db.have_seen_events`) takes 6.62s to process 77k events
- `have_seen_events` (246 `db.have_seen_events`) takes 13.19s to process 122k events

With 50k events, for a cold query, we're seeing a 3.7s -> 0.33s improvement. And the new version doesn't even have a cache yet and it still performs almost as good.
Dev notes

Todo
- `have_seen_events`: use the returned remaining events we didn't see (previously, it returned the events we saw)

Pull Request Checklist
- Changelog entry is a short, user-facing description of the change (e.g. not "Moved X method from EventStore to EventWorkerStore.")
- Use markdown where necessary, mostly for code blocks.
- Code style is correct (run the linters)