Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Optimise async get event lookups #13435

Merged
merged 11 commits into from
Aug 4, 2022

Conversation

Fizzadar
Copy link
Contributor

@Fizzadar Fizzadar commented Aug 2, 2022

We turned on Redis get event caching... and it hammered the CPU! The huge number of lookups came from the roomember optimisation lookup (changed in 030888b). I also changed the get events function to prevent double lookups from any external cache there too (unconfirmed whether that had any impact, but seems sensible).

Signed off by Nick @ Beeper (@Fizzadar).

Pull Request Checklist

Still maintains local in memory lookup optimisation, but does any external
lookup as part of the deferred that prevents duplicate lookups for the same
event at once. This makes the assumption that fetching from an external
cache is a non-zero load operation.
@Fizzadar Fizzadar requested a review from a team as a code owner August 2, 2022 14:49
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid I bring problems rather than solutions, but let us know if you need a hand solving them.

synapse/storage/databases/main/events_worker.py Outdated Show resolved Hide resolved
synapse/storage/databases/main/events_worker.py Outdated Show resolved Hide resolved
synapse/storage/databases/main/events_worker.py Outdated Show resolved Hide resolved
synapse/storage/databases/main/events_worker.py Outdated Show resolved Hide resolved
synapse/storage/databases/main/events_worker.py Outdated Show resolved Hide resolved
synapse/storage/databases/main/events_worker.py Outdated Show resolved Hide resolved
changelog.d/13435.misc Outdated Show resolved Hide resolved
Fizzadar and others added 3 commits August 3, 2022 19:31
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
This avoid duplicating lookups, and thus metrics, on the in memory
event cache.
@Fizzadar
Copy link
Contributor Author

Fizzadar commented Aug 3, 2022

@richvdh thanks for the speedy review :) d15b7c3 should address the duplicate work & double metric counting properly.

@Fizzadar Fizzadar requested a review from richvdh August 3, 2022 18:38
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm otherwise!

synapse/storage/databases/main/events_worker.py Outdated Show resolved Hide resolved
synapse/storage/databases/main/events_worker.py Outdated Show resolved Hide resolved
synapse/storage/databases/main/events_worker.py Outdated Show resolved Hide resolved
Fizzadar and others added 2 commits August 4, 2022 13:17
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
@richvdh richvdh merged commit 41320a0 into matrix-org:develop Aug 4, 2022
@Fizzadar Fizzadar deleted the optimise-async-get-event-lookups branch August 4, 2022 15:07
Fizzadar added a commit to beeper/synapse-legacy-fork that referenced this pull request Aug 23, 2022
Still maintains local in memory lookup optimisation, but does any external
lookup as part of the deferred that prevents duplicate lookups for the same
event at once. This makes the assumption that fetching from an external
cache is a non-zero load operation.
Fizzadar added a commit to beeper/synapse-legacy-fork that referenced this pull request Aug 25, 2022
Synapse 1.65.0 (2022-08-16)
===========================

No significant changes since 1.65.0rc2.

Synapse 1.65.0rc2 (2022-08-11)
==============================

Internal Changes
----------------

- Revert 'Remove the unspecced `room_id` field in the `/hierarchy` response. ([\matrix-org#13365](matrix-org#13365))' to give more time for clients to update. ([\matrix-org#13501](matrix-org#13501))

Synapse 1.65.0rc1 (2022-08-09)
==============================

Features
--------

- Add support for stable prefixes for [MSC2285 (private read receipts)](matrix-org/matrix-spec-proposals#2285). ([\matrix-org#13273](matrix-org#13273))
- Add new unstable error codes `ORG.MATRIX.MSC3848.ALREADY_JOINED`, `ORG.MATRIX.MSC3848.NOT_JOINED`, and `ORG.MATRIX.MSC3848.INSUFFICIENT_POWER` described in [MSC3848](matrix-org/matrix-spec-proposals#3848). ([\matrix-org#13343](matrix-org#13343))
- Use stable prefixes for [MSC3827](matrix-org/matrix-spec-proposals#3827). ([\matrix-org#13370](matrix-org#13370))
- Add a new module API method to translate a room alias into a room ID. ([\matrix-org#13428](matrix-org#13428))
- Add a new module API method to create a room. ([\matrix-org#13429](matrix-org#13429))
- Add remote join capability to the module API's `update_room_membership` method (in a backwards compatible manner). ([\matrix-org#13441](matrix-org#13441))

Bugfixes
--------

- Update the version of the LDAP3 auth provider module included in the `matrixdotorg/synapse` DockerHub images and the Debian packages hosted on packages.matrix.org to 0.2.2. This version fixes a regression in the module. ([\matrix-org#13470](matrix-org#13470))
- Fix a bug introduced in Synapse v1.41.0 where the `/hierarchy` API returned non-standard information (a `room_id` field under each entry in `children_state`). ([\matrix-org#13365](matrix-org#13365))
- Fix a bug introduced in Synapse 0.24.0 that would respond with the wrong error status code to `/joined_members` requests when the requester is not a current member of the room. Contributed by @andrewdoh. ([\matrix-org#13374](matrix-org#13374))
- Fix bug in handling of typing events for appservices. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13392](matrix-org#13392))
- Fix a bug introduced in Synapse 1.57.0 where rooms listed in `exclude_rooms_from_sync` in the configuration file would not be properly excluded from incremental syncs. ([\matrix-org#13408](matrix-org#13408))
- Fix a bug in the experimental faster-room-joins support which could cause it to get stuck in an infinite loop. ([\matrix-org#13353](matrix-org#13353))
- Faster room joins: fix a bug which caused rejected events to become un-rejected during state syncing. ([\matrix-org#13413](matrix-org#13413))
- Faster room joins: fix error when running out of servers to sync partial state with, so that Synapse raises the intended error instead. ([\matrix-org#13432](matrix-org#13432))

Updates to the Docker image
---------------------------

- Make Docker images build on armv7 by installing cryptography dependencies in the 'requirements' stage. Contributed by Jasper Spaans. ([\matrix-org#13372](matrix-org#13372))

Improved Documentation
----------------------

- Update the 'registration tokens' page to acknowledge that the relevant MSC was merged into version 1.2 of the Matrix specification. Contributed by @moan0s. ([\matrix-org#11897](matrix-org#11897))
- Document which HTTP resources support gzip compression. ([\matrix-org#13221](matrix-org#13221))
- Add steps describing how to elevate an existing user to administrator by manipulating the database. ([\matrix-org#13230](matrix-org#13230))
- Fix wrong headline for `url_preview_accept_language` in documentation. ([\matrix-org#13437](matrix-org#13437))
- Remove redundant 'Contents' section from the Configuration Manual. Contributed by @dklimpel. ([\matrix-org#13438](matrix-org#13438))
- Update documentation for config setting `macaroon_secret_key`. ([\matrix-org#13443](matrix-org#13443))
- Update outdated information on `sso_mapping_providers` documentation. ([\matrix-org#13449](matrix-org#13449))
- Fix example code in module documentation of `password_auth_provider_callbacks`. ([\matrix-org#13450](matrix-org#13450))
- Make the configuration for the cache clearer. ([\matrix-org#13481](matrix-org#13481))

Internal Changes
----------------

- Extend the release script to automatically push a new SyTest branch, rather than having that be a manual process. ([\matrix-org#12978](matrix-org#12978))
- Make minor clarifications to the error messages given when we fail to join a room via any server. ([\matrix-org#13160](matrix-org#13160))
- Enable Complement CI tests in the 'latest deps' test run. ([\matrix-org#13213](matrix-org#13213))
- Fix long-standing bugged logic which was never hit in `get_pdu` asking every remote destination even after it finds an event. ([\matrix-org#13346](matrix-org#13346))
- Faster room joins: avoid blocking when pulling events with partially missing prev events. ([\matrix-org#13355](matrix-org#13355))
- Instrument `/messages` for understandable traces in Jaeger. ([\matrix-org#13368](matrix-org#13368))
- Remove an unused argument to `get_relations_for_event`. ([\matrix-org#13383](matrix-org#13383))
- Add a `merge-back` command to the release script, which automates merging the correct branches after a release. ([\matrix-org#13393](matrix-org#13393))
- Adding missing type hints to tests. ([\matrix-org#13397](matrix-org#13397))
- Faster Room Joins: don't leave a stuck room partial state flag if the join fails. ([\matrix-org#13403](matrix-org#13403))
- Refactor `_resolve_state_at_missing_prevs` to compute an `EventContext` instead. ([\matrix-org#13404](matrix-org#13404), [\matrix-org#13431](matrix-org#13431))
- Faster Room Joins: prevent Synapse from answering federated join requests for a room which it has not fully joined yet. ([\matrix-org#13416](matrix-org#13416))
- Re-enable running Complement tests against Synapse with workers. ([\matrix-org#13420](matrix-org#13420))
- Prevent unnecessary lookups to any external `get_event` cache. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13435](matrix-org#13435))
- Add some tracing to give more insight into local room joins. ([\matrix-org#13439](matrix-org#13439))
- Rename class `RateLimitConfig` to `RatelimitSettings` and `FederationRateLimitConfig` to `FederationRatelimitSettings`. ([\matrix-org#13442](matrix-org#13442))
- Add some comments about how event push actions are stored. ([\matrix-org#13445](matrix-org#13445), [\matrix-org#13455](matrix-org#13455))
- Improve rebuild speed for the "synapse-workers" docker image. ([\matrix-org#13447](matrix-org#13447))
- Fix `@tag_args` being off-by-one with the arguments when tagging a span (tracing). ([\matrix-org#13452](matrix-org#13452))
- Update type of `EventContext.rejected`. ([\matrix-org#13460](matrix-org#13460))
- Use literals in place of `HTTPStatus` constants in tests. ([\matrix-org#13463](matrix-org#13463), [\matrix-org#13469](matrix-org#13469))
- Correct a misnamed argument in state res v2 internals. ([\matrix-org#13467](matrix-org#13467))
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants