-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Allow background tasks to be run on a separate worker. #8369
Conversation
d690d9c
to
71ade09
Compare
I would like to do a bit more testing locally, but I think this is in a place where some feedback is warranted! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise looks OK (though haven't looked in depth)
Running this manually I've run into two issues:
@erikjohnston Any thoughts on approaches for the above? |
Hmm, I don't really. We do manually call |
Another thought that came up is that we can setup the listener for this, but not actually route anything to it. This wouldn't be ideal, but means we don't need to refactor a bunch of stuff. |
Turns out we already have a mechanism to do this... 🤦 I extended it a bit to include the extra handlers and call it for the background worker. See 2e98b78 for this approach. It somewhat feels like a thing we need to remember in the future, but not sure there's a better way.
This is actually causing me more pain at the moment as it will require moving quite a few methods. I'm not sure how safe this is, but I'll give it a go. |
I believe I've solved the current two issues although I'm still testing this. Please take another look! 👁️ |
I did a bit of testing with this and I'm fairly certain it works. |
Yeah, I think this looks sane |
9a2c42f
to
c015379
Compare
class UIAuthStore(UIAuthWorkerStore): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make more sense to remove the UIAuthWorkerStore
and just use UIAuthStore
everywhere instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially, though its not the worse to keep the separation for consistency.
@erikjohnston I managed to fix the sytests, I think this and matrix-org/sytest#958 are reviewable now! Thanks for your help! We talked about it in #synapse-dev:matrix.org, but to re-iterate here: this backs out the changes to the user directory background task since that already had a separate flag ( |
class UIAuthStore(UIAuthWorkerStore): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially, though its not the worse to keep the separation for consistency.
synapse/app/generic_worker.py
Outdated
# If background tasks are running on this worker, start collecting the phone | ||
# home stats. | ||
if hs.config.run_background_tasks: | ||
start_phone_stats_home(hs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can move this to somewhere in synapse.app._base
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps around
Lines 274 to 275 in c619253
setup_sentry(hs) | |
setup_sdnotify(hs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only downside of this is that it also gets used for synapse.app.admin_cmd
. I don't know how frequently this gets run and if it would cause any issues. Thoughts @erikjohnston?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't the admin command have the run_background_tasks
disabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, there's a section to disable all background tasks anyway, so looks like I should add that there.
@erikjohnston Could you give this another quick look to make sure I didn't do anything silly with moving stuff between data stores? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this still looks sane!
…rd_party_event_rules_public_rooms * 'develop' of github.com:matrix-org/synapse: (109 commits) Add logging on startup/shutdown (#8448) Speed up unit tests when using PostgreSQL (#8450) Allow background tasks to be run on a separate worker. (#8369) move #8444 to 'feature' linkify changelog 1.21.0rc2 1.21.0rc2 Fix bug in remote thumbnail search (#8438) Include a public_baseurl in configs generated by the demo script. (#8443) Fix DB query on startup for negative streams. (#8447) Convert additional templates to Jinja (#8444) Fix malformed log line in new federation "catch up" logic (#8442) Add unit test for event persister sharding (#8433) Add config option for always using "userinfo endpoint" for OIDC (#7658) Do not expose the experimental appservice login flow to clients. (#8440) update changelog fix a logging error in thumbnailer (#8435) changelog fixes fix version number 1.21.0 ...
Synapse 1.22.0rc1 (2020-10-22) ============================== Features -------- - Add a configuration option for always using the "userinfo endpoint" for OpenID Connect. This fixes support for some identity providers, e.g. GitLab. Contributed by Benjamin Koch. ([\#7658](#7658)) - Add ability for `ThirdPartyEventRules` modules to query and manipulate whether a room is in the public rooms directory. ([\#8292](#8292), [\#8467](#8467)) - Add support for olm fallback keys ([MSC2732](matrix-org/matrix-spec-proposals#2732)). ([\#8312](#8312), [\#8501](#8501)) - Add support for running background tasks in a separate worker process. ([\#8369](#8369), [\#8458](#8458), [\#8489](#8489), [\#8513](#8513), [\#8544](#8544), [\#8599](#8599)) - Add support for device dehydration ([MSC2697](matrix-org/matrix-spec-proposals#2697)). ([\#8380](#8380)) - Add support for [MSC2409](matrix-org/matrix-spec-proposals#2409), which allows sending typing, read receipts, and presence events to appservices. ([\#8437](#8437), [\#8590](#8590)) - Change default room version to "6", per [MSC2788](matrix-org/matrix-spec-proposals#2788). ([\#8461](#8461)) - Add the ability to send non-membership events into a room via the `ModuleApi`. ([\#8479](#8479)) - Increase default upload size limit from 10M to 50M. Contributed by @Akkowicz. ([\#8502](#8502)) - Add support for modifying event content in `ThirdPartyRules` modules. ([\#8535](#8535), [\#8564](#8564)) Bugfixes -------- - Fix a longstanding bug where invalid ignored users in account data could break clients. ([\#8454](#8454)) - Fix a bug where backfilling a room with an event that was missing the `redacts` field would break. ([\#8457](#8457)) - Don't attempt to respond to some requests if the client has already disconnected. ([\#8465](#8465)) - Fix message duplication if something goes wrong after persisting the event. ([\#8476](#8476)) - Fix incremental sync returning an incorrect `prev_batch` token in timeline section, which when used to paginate returned events that were included in the incremental sync. Broken since v0.16.0. ([\#8486](#8486)) - Expose the `uk.half-shot.msc2778.login.application_service` to clients from the login API. This feature was added in v1.21.0, but was not exposed as a potential login flow. ([\#8504](#8504)) - Fix error code for `/profile/{userId}/displayname` to be `M_BAD_JSON`. ([\#8517](#8517)) - Fix a bug introduced in v1.7.0 that could cause Synapse to insert values from non-state `m.room.retention` events into the `room_retention` database table. ([\#8527](#8527)) - Fix not sending events over federation when using sharded event writers. ([\#8536](#8536)) - Fix a long standing bug where email notifications for encrypted messages were blank. ([\#8545](#8545)) - Fix increase in the number of `There was no active span...` errors logged when using OpenTracing. ([\#8567](#8567)) - Fix a bug that prevented errors encountered during execution of the `synapse_port_db` from being correctly printed. ([\#8585](#8585)) - Fix appservice transactions to only include a maximum of 100 persistent and 100 ephemeral events. ([\#8606](#8606)) Updates to the Docker image --------------------------- - Added multi-arch support (arm64,arm/v7) for the docker images. Contributed by @maquis196. ([\#7921](#7921)) - Add support for passing commandline args to the synapse process. Contributed by @samuel-p. ([\#8390](#8390)) Improved Documentation ---------------------- - Update the directions for using the manhole with coroutines. ([\#8462](#8462)) - Improve readme by adding new shield.io badges. ([\#8493](#8493)) - Added note about docker in manhole.md regarding which ip address to bind to. Contributed by @maquis196. ([\#8526](#8526)) - Document the new behaviour of the `allowed_lifetime_min` and `allowed_lifetime_max` settings in the room retention configuration. ([\#8529](#8529)) Deprecations and Removals ------------------------- - Drop unused `device_max_stream_id` table. ([\#8589](#8589)) Internal Changes ---------------- - Check for unreachable code with mypy. ([\#8432](#8432)) - Add unit test for event persister sharding. ([\#8433](#8433)) - Allow events to be sent to clients sooner when using sharded event persisters. ([\#8439](#8439), [\#8488](#8488), [\#8496](#8496), [\#8499](#8499)) - Configure `public_baseurl` when using demo scripts. ([\#8443](#8443)) - Add SQL logging on queries that happen during startup. ([\#8448](#8448)) - Speed up unit tests when using PostgreSQL. ([\#8450](#8450)) - Remove redundant database loads of stream_ordering for events we already have. ([\#8452](#8452)) - Reduce inconsistencies between codepaths for membership and non-membership events. ([\#8463](#8463)) - Combine `SpamCheckerApi` with the more generic `ModuleApi`. ([\#8464](#8464)) - Additional testing for `ThirdPartyEventRules`. ([\#8468](#8468)) - Add `-d` option to `./scripts-dev/lint.sh` to lint files that have changed since the last git commit. ([\#8472](#8472)) - Unblacklist some sytests. ([\#8474](#8474)) - Include the log level in the phone home stats. ([\#8477](#8477)) - Remove outdated sphinx documentation, scripts and configuration. ([\#8480](#8480)) - Clarify error message when plugin config parsers raise an error. ([\#8492](#8492)) - Remove the deprecated `Handlers` object. ([\#8494](#8494)) - Fix a threadsafety bug in unit tests. ([\#8497](#8497)) - Add user agent to user_daily_visits table. ([\#8503](#8503)) - Add type hints to various parts of the code base. ([\#8407](#8407), [\#8505](#8505), [\#8507](#8507), [\#8547](#8547), [\#8562](#8562), [\#8609](#8609)) - Remove unused code from the test framework. ([\#8514](#8514)) - Apply some internal fixes to the `HomeServer` class to make its code more idiomatic and statically-verifiable. ([\#8515](#8515)) - Factor out common code between `RoomMemberHandler._locally_reject_invite` and `EventCreationHandler.create_event`. ([\#8537](#8537)) - Improve database performance by executing more queries without starting transactions. ([\#8542](#8542)) - Rename `Cache` to `DeferredCache`, to better reflect its purpose. ([\#8548](#8548)) - Move metric registration code down into `LruCache`. ([\#8561](#8561), [\#8591](#8591)) - Replace `DeferredCache` with the lighter-weight `LruCache` where possible. ([\#8563](#8563)) - Add virtualenv-generated folders to `.gitignore`. ([\#8566](#8566)) - Add `get_immediate` method to `DeferredCache`. ([\#8568](#8568)) - Fix mypy not properly checking across the codebase, additionally, fix a typing assertion error in `handlers/auth.py`. ([\#8569](#8569)) - Fix `synmark` benchmark runner. ([\#8571](#8571)) - Modify `DeferredCache.get()` to return `Deferred`s instead of `ObservableDeferred`s. ([\#8572](#8572)) - Adjust a protocol-type definition to fit `sqlite3` assertions. ([\#8577](#8577)) - Support macOS on the `synmark` benchmark runner. ([\#8578](#8578)) - Update `mypy` static type checker to 0.790. ([\#8583](#8583), [\#8600](#8600)) - Re-organize the structured logging code to separate the TCP transport handling from the JSON formatting. ([\#8587](#8587)) - Remove extraneous unittest logging decorators from unit tests. ([\#8592](#8592)) - Minor optimisations in caching code. ([\#8593](#8593), [\#8594](#8594))
Synapse 1.22.0 (2020-10-27) =========================== No significant changes. Synapse 1.22.0rc2 (2020-10-26) ============================== Bugfixes -------- - Fix bugs where ephemeral events were not sent to appservices. Broke in v1.22.0rc1. ([\#8648](matrix-org/synapse#8648), [\#8656](matrix-org/synapse#8656)) - Fix `user_daily_visits` table to not have duplicate rows per user/device due to multiple user agents. Broke in v1.22.0rc1. ([\#8654](matrix-org/synapse#8654)) Synapse 1.22.0rc1 (2020-10-22) ============================== Features -------- - Add a configuration option for always using the "userinfo endpoint" for OpenID Connect. This fixes support for some identity providers, e.g. GitLab. Contributed by Benjamin Koch. ([\#7658](matrix-org/synapse#7658)) - Add ability for `ThirdPartyEventRules` modules to query and manipulate whether a room is in the public rooms directory. ([\#8292](matrix-org/synapse#8292), [\#8467](matrix-org/synapse#8467)) - Add support for olm fallback keys ([MSC2732](matrix-org/matrix-spec-proposals#2732)). ([\#8312](matrix-org/synapse#8312), [\#8501](matrix-org/synapse#8501)) - Add support for running background tasks in a separate worker process. ([\#8369](matrix-org/synapse#8369), [\#8458](matrix-org/synapse#8458), [\#8489](matrix-org/synapse#8489), [\#8513](matrix-org/synapse#8513), [\#8544](matrix-org/synapse#8544), [\#8599](matrix-org/synapse#8599)) - Add support for device dehydration ([MSC2697](matrix-org/matrix-spec-proposals#2697)). ([\#8380](matrix-org/synapse#8380)) - Add support for [MSC2409](matrix-org/matrix-spec-proposals#2409), which allows sending typing, read receipts, and presence events to appservices. ([\#8437](matrix-org/synapse#8437), [\#8590](matrix-org/synapse#8590)) - Change default room version to "6", per [MSC2788](matrix-org/matrix-spec-proposals#2788). ([\#8461](matrix-org/synapse#8461)) - Add the ability to send non-membership events into a room via the `ModuleApi`. ([\#8479](matrix-org/synapse#8479)) - Increase default upload size limit from 10M to 50M. Contributed by @Akkowicz. ([\#8502](matrix-org/synapse#8502)) - Add support for modifying event content in `ThirdPartyRules` modules. ([\#8535](matrix-org/synapse#8535), [\#8564](matrix-org/synapse#8564)) Bugfixes -------- - Fix a longstanding bug where invalid ignored users in account data could break clients. ([\#8454](matrix-org/synapse#8454)) - Fix a bug where backfilling a room with an event that was missing the `redacts` field would break. ([\#8457](matrix-org/synapse#8457)) - Don't attempt to respond to some requests if the client has already disconnected. ([\#8465](matrix-org/synapse#8465)) - Fix message duplication if something goes wrong after persisting the event. ([\#8476](matrix-org/synapse#8476)) - Fix incremental sync returning an incorrect `prev_batch` token in timeline section, which when used to paginate returned events that were included in the incremental sync. Broken since v0.16.0. ([\#8486](matrix-org/synapse#8486)) - Expose the `uk.half-shot.msc2778.login.application_service` to clients from the login API. This feature was added in v1.21.0, but was not exposed as a potential login flow. ([\#8504](matrix-org/synapse#8504)) - Fix error code for `/profile/{userId}/displayname` to be `M_BAD_JSON`. ([\#8517](matrix-org/synapse#8517)) - Fix a bug introduced in v1.7.0 that could cause Synapse to insert values from non-state `m.room.retention` events into the `room_retention` database table. ([\#8527](matrix-org/synapse#8527)) - Fix not sending events over federation when using sharded event writers. ([\#8536](matrix-org/synapse#8536)) - Fix a long standing bug where email notifications for encrypted messages were blank. ([\#8545](matrix-org/synapse#8545)) - Fix increase in the number of `There was no active span...` errors logged when using OpenTracing. ([\#8567](matrix-org/synapse#8567)) - Fix a bug that prevented errors encountered during execution of the `synapse_port_db` from being correctly printed. ([\#8585](matrix-org/synapse#8585)) - Fix appservice transactions to only include a maximum of 100 persistent and 100 ephemeral events. ([\#8606](matrix-org/synapse#8606)) Updates to the Docker image --------------------------- - Added multi-arch support (arm64,arm/v7) for the docker images. Contributed by @maquis196. ([\#7921](matrix-org/synapse#7921)) - Add support for passing commandline args to the synapse process. Contributed by @samuel-p. ([\#8390](matrix-org/synapse#8390)) Improved Documentation ---------------------- - Update the directions for using the manhole with coroutines. ([\#8462](matrix-org/synapse#8462)) - Improve readme by adding new shield.io badges. ([\#8493](matrix-org/synapse#8493)) - Added note about docker in manhole.md regarding which ip address to bind to. Contributed by @maquis196. ([\#8526](matrix-org/synapse#8526)) - Document the new behaviour of the `allowed_lifetime_min` and `allowed_lifetime_max` settings in the room retention configuration. ([\#8529](matrix-org/synapse#8529)) Deprecations and Removals ------------------------- - Drop unused `device_max_stream_id` table. ([\#8589](matrix-org/synapse#8589)) Internal Changes ---------------- - Check for unreachable code with mypy. ([\#8432](matrix-org/synapse#8432)) - Add unit test for event persister sharding. ([\#8433](matrix-org/synapse#8433)) - Allow events to be sent to clients sooner when using sharded event persisters. ([\#8439](matrix-org/synapse#8439), [\#8488](matrix-org/synapse#8488), [\#8496](matrix-org/synapse#8496), [\#8499](matrix-org/synapse#8499)) - Configure `public_baseurl` when using demo scripts. ([\#8443](matrix-org/synapse#8443)) - Add SQL logging on queries that happen during startup. ([\#8448](matrix-org/synapse#8448)) - Speed up unit tests when using PostgreSQL. ([\#8450](matrix-org/synapse#8450)) - Remove redundant database loads of stream_ordering for events we already have. ([\#8452](matrix-org/synapse#8452)) - Reduce inconsistencies between codepaths for membership and non-membership events. ([\#8463](matrix-org/synapse#8463)) - Combine `SpamCheckerApi` with the more generic `ModuleApi`. ([\#8464](matrix-org/synapse#8464)) - Additional testing for `ThirdPartyEventRules`. ([\#8468](matrix-org/synapse#8468)) - Add `-d` option to `./scripts-dev/lint.sh` to lint files that have changed since the last git commit. ([\#8472](matrix-org/synapse#8472)) - Unblacklist some sytests. ([\#8474](matrix-org/synapse#8474)) - Include the log level in the phone home stats. ([\#8477](matrix-org/synapse#8477)) - Remove outdated sphinx documentation, scripts and configuration. ([\#8480](matrix-org/synapse#8480)) - Clarify error message when plugin config parsers raise an error. ([\#8492](matrix-org/synapse#8492)) - Remove the deprecated `Handlers` object. ([\#8494](matrix-org/synapse#8494)) - Fix a threadsafety bug in unit tests. ([\#8497](matrix-org/synapse#8497)) - Add user agent to user_daily_visits table. ([\#8503](matrix-org/synapse#8503)) - Add type hints to various parts of the code base. ([\#8407](matrix-org/synapse#8407), [\#8505](matrix-org/synapse#8505), [\#8507](matrix-org/synapse#8507), [\#8547](matrix-org/synapse#8547), [\#8562](matrix-org/synapse#8562), [\#8609](matrix-org/synapse#8609)) - Remove unused code from the test framework. ([\#8514](matrix-org/synapse#8514)) - Apply some internal fixes to the `HomeServer` class to make its code more idiomatic and statically-verifiable. ([\#8515](matrix-org/synapse#8515)) - Factor out common code between `RoomMemberHandler._locally_reject_invite` and `EventCreationHandler.create_event`. ([\#8537](matrix-org/synapse#8537)) - Improve database performance by executing more queries without starting transactions. ([\#8542](matrix-org/synapse#8542)) - Rename `Cache` to `DeferredCache`, to better reflect its purpose. ([\#8548](matrix-org/synapse#8548)) - Move metric registration code down into `LruCache`. ([\#8561](matrix-org/synapse#8561), [\#8591](matrix-org/synapse#8591)) - Replace `DeferredCache` with the lighter-weight `LruCache` where possible. ([\#8563](matrix-org/synapse#8563)) - Add virtualenv-generated folders to `.gitignore`. ([\#8566](matrix-org/synapse#8566)) - Add `get_immediate` method to `DeferredCache`. ([\#8568](matrix-org/synapse#8568)) - Fix mypy not properly checking across the codebase, additionally, fix a typing assertion error in `handlers/auth.py`. ([\#8569](matrix-org/synapse#8569)) - Fix `synmark` benchmark runner. ([\#8571](matrix-org/synapse#8571)) - Modify `DeferredCache.get()` to return `Deferred`s instead of `ObservableDeferred`s. ([\#8572](matrix-org/synapse#8572)) - Adjust a protocol-type definition to fit `sqlite3` assertions. ([\#8577](matrix-org/synapse#8577)) - Support macOS on the `synmark` benchmark runner. ([\#8578](matrix-org/synapse#8578)) - Update `mypy` static type checker to 0.790. ([\#8583](matrix-org/synapse#8583), [\#8600](matrix-org/synapse#8600)) - Re-organize the structured logging code to separate the TCP transport handling from the JSON formatting. ([\#8587](matrix-org/synapse#8587)) - Remove extraneous unittest logging decorators from unit tests. ([\#8592](matrix-org/synapse#8592)) - Minor optimisations in caching code. ([\#8593](matrix-org/synapse#8593), [\#8594](matrix-org/synapse#8594))
The overall plan here is to run (some) background tasks on a separate worker, instead of the main process. This causes two different changes:
I'm hoping this will have the following benefits:
I mostly found these be searching for callers oflooping_call
. Note that some of the callers use in process (in-memory) state, while others use database state. Only the latter is applicable to be moved to a separate process.I simplified this to only move a few tasks to the new worker. I think this is clearer than trying to move them all at once. Follow-up PRs can move additional tasks.
By default the main process still runs background tasks.
Related to #8500.