-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Add metrics to track /messages
response time by room size
#13545
Add metrics to track /messages
response time by room size
#13545
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sensible idea otherwise.
@@ -396,6 +397,7 @@ def _get_room_summary_txn( | |||
) | |||
|
|||
@cached() | |||
@trace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish I could put the @trace
on the outside but the cached stuff does not like getting wrapped. Same with the other caching decorators I've noticed (like @cachedList
). It returns a DeferredCacheDescriptor
/DeferredCacheListDescriptor
instead of a wrapped function.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/Users/eric/Documents/github/element/synapse/synapse/app/homeserver.py", line 37, in <module>
from synapse.app import _base
File "/Users/eric/Documents/github/element/synapse/synapse/app/_base.py", line 61, in <module>
from synapse.events.spamcheck import load_legacy_spam_checkers
File "/Users/eric/Documents/github/element/synapse/synapse/events/spamcheck.py", line 35, in <module>
from synapse.rest.media.v1._base import FileInfo
File "/Users/eric/Documents/github/element/synapse/synapse/rest/__init__.py", line 18, in <module>
from synapse.rest import admin
File "/Users/eric/Documents/github/element/synapse/synapse/rest/admin/__init__.py", line 27, in <module>
from synapse.rest.admin._base import admin_patterns, assert_requester_is_admin
File "/Users/eric/Documents/github/element/synapse/synapse/rest/admin/_base.py", line 19, in <module>
from synapse.api.auth import Auth
File "/Users/eric/Documents/github/element/synapse/synapse/api/auth.py", line 22, in <module>
from synapse import event_auth
File "/Users/eric/Documents/github/element/synapse/synapse/event_auth.py", line 45, in <module>
from synapse.storage.databases.main.events_worker import EventRedactBehaviour
File "/Users/eric/Documents/github/element/synapse/synapse/storage/__init__.py", line 34, in <module>
from synapse.storage.databases import Databases
File "/Users/eric/Documents/github/element/synapse/synapse/storage/databases/__init__.py", line 20, in <module>
from synapse.storage.databases.main.events import PersistEventsStore
File "/Users/eric/Documents/github/element/synapse/synapse/storage/databases/main/__init__.py", line 33, in <module>
from .account_data import AccountDataStore
File "/Users/eric/Documents/github/element/synapse/synapse/storage/databases/main/account_data.py", line 39, in <module>
from synapse.storage.databases.main.push_rule import PushRulesWorkerStore
File "/Users/eric/Documents/github/element/synapse/synapse/storage/databases/main/push_rule.py", line 41, in <module>
from synapse.storage.databases.main.appservice import ApplicationServiceWorkerStore
File "/Users/eric/Documents/github/element/synapse/synapse/storage/databases/main/appservice.py", line 35, in <module>
from synapse.storage.databases.main.roommember import RoomMemberWorkerStore
File "/Users/eric/Documents/github/element/synapse/synapse/storage/databases/main/roommember.py", line 82, in <module>
class RoomMemberWorkerStore(EventsWorkerStore):
File "/Users/eric/Documents/github/element/synapse/synapse/storage/databases/main/roommember.py", line 401, in RoomMemberWorkerStore
async def get_number_joined_users_in_room(self, room_id: str) -> int:
File "/Users/eric/Documents/github/element/synapse/synapse/logging/opentracing.py", line 954, in trace
return trace_with_opname(func.__name__)(func)
AttributeError: 'DeferredCacheDescriptor' object has no attribute '__name__'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can make DeferredCacheDescriptor
have a __name__
field to make it compatible.
That said, is there any point in tracing the cached function from the outside? If it's a cache hit then it should take negligible time and it's probably not that interesting in your trace view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can make
DeferredCacheDescriptor
have a__name__
field to make it compatible.
I was exploring that option a few days ago but there is more to change to make it work.
That said, is there any point in tracing the cached function from the outside? If it's a cache hit then it should take negligible time and it's probably not that interesting in your trace view.
It's nice to have to so you know whether it's running. It's weird to just not see a call in the trace when you see it in the code.
@@ -396,6 +397,7 @@ def _get_room_summary_txn( | |||
) | |||
|
|||
@cached() | |||
@trace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this @trace
call is messing up some of the tests (passes without it). The method still gives results so it's unclear to me why it would change things 🤔
[FAIL]
Traceback (most recent call last):
File "/home/runner/work/synapse/synapse/tests/rest/client/test_rooms.py", line 726, in test_post_room_initial_state
self.assertEqual(50, channel.resource_usage.db_txn_count)
File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/trial/_synctest.py", line 422, in assertEqual
super().assertEqual(first, second, msg)
File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/unittest/case.py", line 837, in assertEqual
assertion_func(first, second, msg=msg)
File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/unittest/case.py", line 830, in _baseAssertEqual
raise self.failureException(msg)
twisted.trial.unittest.FailTest: 50 != 49
tests.rest.client.test_rooms.RoomsCreateTestCase.test_post_room_initial_state
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder if it's because the function being returned by trace
always has the same __name__
and so just a guess, but these two get the same cache entries. Seems like a massive trap. I'll see if I can confirm.
@cached()
@trace
def fun1(): ...
@cached()
@trace
def fun2(): ...
edit: nope, the __name__
passes through @trace
like you'd hope, so it's not that. It would be good to know why this is causing a test to fail :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll re-introduce this in another PR if I get an inkling to have it again ⏩
Thanks for the review and help working out the details @reivilibre 🐫 Also thanks to @DMRobertson and @richvdh for the info to get background coroutines fully going 🦦 |
Synapse 1.66.0rc1 (2022-08-23) ============================== This release removes the ability for homeservers to delegate email ownership verification and password reset confirmation to identity servers. This removal was originally planned for Synapse 1.64, but was later deferred until now. See the [upgrade notes](https://matrix-org.github.io/synapse/v1.66/upgrade.html#upgrading-to-v1660) for more details. Features -------- - Improve validation of request bodies for the following client-server API endpoints: [`/account/password`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3accountpassword), [`/account/password/email/requestToken`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3accountpasswordemailrequesttoken), [`/account/deactivate`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3accountdeactivate) and [`/account/3pid/email/requestToken`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3account3pidemailrequesttoken). ([\#13188](#13188), [\#13563](#13563)) - Add forgotten status to [Room Details Admin API](https://matrix-org.github.io/synapse/latest/admin_api/rooms.html#room-details-api). ([\#13503](#13503)) - Add an experimental implementation for [MSC3852 (Expose user agents on `Device`)](matrix-org/matrix-spec-proposals#3852). ([\#13549](#13549)) - Add `org.matrix.msc2716v4` experimental room version with updated content fields. Part of [MSC2716 (Importing history)](matrix-org/matrix-spec-proposals#2716). ([\#13551](#13551)) - Add support for compression to federation responses. ([\#13537](#13537)) - Improve performance of sending messages in rooms with thousands of local users. ([\#13522](#13522), [\#13547](#13547)) Bugfixes -------- - Faster room joins: make `/joined_members` block whilst the room is partial stated. ([\#13514](#13514)) - Fix a bug introduced in Synapse 1.21.0 where the [`/event_reports` Admin API](https://matrix-org.github.io/synapse/develop/admin_api/event_reports.html) could return a total count which was larger than the number of results you can actually query for. ([\#13525](#13525)) - Fix a bug introduced in Synapse 1.52.0 where sending server notices fails if `max_avatar_size` or `allowed_avatar_mimetypes` is set and not `system_mxid_avatar_url`. ([\#13566](#13566)) - Fix a bug where the `opentracing.force_tracing_for_users` config option would not apply to [`/sendToDevice`](https://spec.matrix.org/v1.3/client-server-api/#put_matrixclientv3sendtodeviceeventtypetxnid) and [`/keys/upload`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3keysupload) requests. ([\#13574](#13574)) Improved Documentation ---------------------- - Add `openssl` example for generating registration HMAC digest. ([\#13472](#13472)) - Tidy up Synapse's README. ([\#13491](#13491)) - Document that event purging related to the `redaction_retention_period` config option is executed only every 5 minutes. ([\#13492](#13492)) - Add a warning to retention documentation regarding the possibility of database corruption. ([\#13497](#13497)) - Document that the `DOCKER_BUILDKIT=1` flag is needed to build the docker image. ([\#13515](#13515)) - Add missing links in `user_consent` section of configuration manual. ([\#13536](#13536)) - Fix the doc and some warnings that were referring to the nonexistent `custom_templates_directory` setting (instead of `custom_template_directory`). ([\#13538](#13538)) Deprecations and Removals ------------------------- - Remove the ability for homeservers to delegate email ownership verification and password reset confirmation to identity servers. See [upgrade notes](https://matrix-org.github.io/synapse/v1.66/upgrade.html#upgrading-to-v1660) for more details. Internal Changes ---------------- - Update the rejected state of events during de-partial-stating. ([\#13459](#13459)) - Avoid blocking lazy-loading `/sync`s during partial joins due to remote memberships. Pull remote memberships from auth events instead of the room state. ([\#13477](#13477)) - Refuse to start when faster joins is enabled on a deployment with workers, since worker configurations are not currently supported. ([\#13531](#13531)) - Allow use of both `@trace` and `@tag_args` stacked on the same function. ([\#13453](#13453)) - Instrument the federation/backfill part of `/messages` for understandable traces in Jaeger. ([\#13489](#13489)) - Instrument `FederationStateIdsServlet` (`/state_ids`) for understandable traces in Jaeger. ([\#13499](#13499), [\#13554](#13554)) - Track HTTP response times over 10 seconds from `/messages` (`synapse_room_message_list_rest_servlet_response_time_seconds`). ([\#13533](#13533)) - Add metrics to track how the rate limiter is affecting requests (sleep/reject). ([\#13534](#13534), [\#13541](#13541)) - Add metrics to time how long it takes us to do backfill processing (`synapse_federation_backfill_processing_before_time_seconds`, `synapse_federation_backfill_processing_after_time_seconds`). ([\#13535](#13535), [\#13584](#13584)) - Add metrics to track rate limiter queue timing (`synapse_rate_limit_queue_wait_time_seconds`). ([\#13544](#13544)) - Update metrics to track `/messages` response time by room size. ([\#13545](#13545)) - Refactor methods in `synapse.api.auth.Auth` to use `Requester` objects everywhere instead of user IDs. ([\#13024](#13024)) - Clean-up tests for notifications. ([\#13471](#13471)) - Add some miscellaneous comments to document sync, especially around `compute_state_delta`. ([\#13474](#13474)) - Use literals in place of `HTTPStatus` constants in tests. ([\#13479](#13479), [\#13488](#13488)) - Add comments about how event push actions are rotated. ([\#13485](#13485)) - Modify HTML template content to better support mobile devices' screen sizes. ([\#13493](#13493)) - Add a linter script which will reject non-strict types in Pydantic models. ([\#13502](#13502)) - Reduce the number of tests using legacy TCP replication. ([\#13543](#13543)) - Allow specifying additional request fields when using the `HomeServerTestCase.login` helper method. ([\#13549](#13549)) - Make `HomeServerTestCase` load any configured homeserver modules automatically. ([\#13558](#13558))
Synapse 1.66.0 (2022-08-31) =========================== No significant changes since 1.66.0rc2. This release removes the ability for homeservers to delegate email ownership verification and password reset confirmation to identity servers. This removal was originally planned for Synapse 1.64, but was later deferred until now. See the [upgrade notes](https://matrix-org.github.io/synapse/v1.66/upgrade.html#upgrading-to-v1660) for more details. Deployments with multiple workers should note that the direct TCP replication configuration was deprecated in Synapse v1.18.0 and will be removed in Synapse v1.67.0. In particular, the TCP `replication` [listener](https://matrix-org.github.io/synapse/v1.66/usage/configuration/config_documentation.html#listeners) type (not to be confused with the `replication` resource on the `http` listener type) and the `worker_replication_port` config option will be removed . To migrate to Redis, add the [`redis` config](https://matrix-org.github.io/synapse/v1.66/workers.html#shared-configuration), then remove the TCP `replication` listener from config of the master and `worker_replication_port` from worker config. Note that a HTTP listener with a `replication` resource is still required. See the [worker documentation](https://matrix-org.github.io/synapse/v1.66/workers.html) for more details. Synapse 1.66.0rc2 (2022-08-30) ============================== Bugfixes -------- - Fix a bug introduced in Synapse 1.66.0rc1 where the new rate limit metrics were misreported (`synapse_rate_limit_sleep_affected_hosts`, `synapse_rate_limit_reject_affected_hosts`). ([\matrix-org#13649](matrix-org#13649)) Synapse 1.66.0rc1 (2022-08-23) ============================== Features -------- - Improve validation of request bodies for the following client-server API endpoints: [`/account/password`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3accountpassword), [`/account/password/email/requestToken`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3accountpasswordemailrequesttoken), [`/account/deactivate`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3accountdeactivate) and [`/account/3pid/email/requestToken`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3account3pidemailrequesttoken). ([\matrix-org#13188](matrix-org#13188), [\matrix-org#13563](matrix-org#13563)) - Add forgotten status to [Room Details Admin API](https://matrix-org.github.io/synapse/latest/admin_api/rooms.html#room-details-api). ([\matrix-org#13503](matrix-org#13503)) - Add an experimental implementation for [MSC3852 (Expose user agents on `Device`)](matrix-org/matrix-spec-proposals#3852). ([\matrix-org#13549](matrix-org#13549)) - Add `org.matrix.msc2716v4` experimental room version with updated content fields. Part of [MSC2716 (Importing history)](matrix-org/matrix-spec-proposals#2716). ([\matrix-org#13551](matrix-org#13551)) - Add support for compression to federation responses. ([\matrix-org#13537](matrix-org#13537)) - Improve performance of sending messages in rooms with thousands of local users. ([\matrix-org#13522](matrix-org#13522), [\matrix-org#13547](matrix-org#13547)) Bugfixes -------- - Faster room joins: make `/joined_members` block whilst the room is partial stated. ([\matrix-org#13514](matrix-org#13514)) - Fix a bug introduced in Synapse 1.21.0 where the [`/event_reports` Admin API](https://matrix-org.github.io/synapse/develop/admin_api/event_reports.html) could return a total count which was larger than the number of results you can actually query for. ([\matrix-org#13525](matrix-org#13525)) - Fix a bug introduced in Synapse 1.52.0 where sending server notices fails if `max_avatar_size` or `allowed_avatar_mimetypes` is set and not `system_mxid_avatar_url`. ([\matrix-org#13566](matrix-org#13566)) - Fix a bug where the `opentracing.force_tracing_for_users` config option would not apply to [`/sendToDevice`](https://spec.matrix.org/v1.3/client-server-api/#put_matrixclientv3sendtodeviceeventtypetxnid) and [`/keys/upload`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3keysupload) requests. ([\matrix-org#13574](matrix-org#13574)) Improved Documentation ---------------------- - Add `openssl` example for generating registration HMAC digest. ([\matrix-org#13472](matrix-org#13472)) - Tidy up Synapse's README. ([\matrix-org#13491](matrix-org#13491)) - Document that event purging related to the `redaction_retention_period` config option is executed only every 5 minutes. ([\matrix-org#13492](matrix-org#13492)) - Add a warning to retention documentation regarding the possibility of database corruption. ([\matrix-org#13497](matrix-org#13497)) - Document that the `DOCKER_BUILDKIT=1` flag is needed to build the docker image. ([\matrix-org#13515](matrix-org#13515)) - Add missing links in `user_consent` section of configuration manual. ([\matrix-org#13536](matrix-org#13536)) - Fix the doc and some warnings that were referring to the nonexistent `custom_templates_directory` setting (instead of `custom_template_directory`). ([\matrix-org#13538](matrix-org#13538)) Deprecations and Removals ------------------------- - Remove the ability for homeservers to delegate email ownership verification and password reset confirmation to identity servers. See [upgrade notes](https://matrix-org.github.io/synapse/v1.66/upgrade.html#upgrading-to-v1660) for more details. Internal Changes ---------------- - Update the rejected state of events during de-partial-stating. ([\matrix-org#13459](matrix-org#13459)) - Avoid blocking lazy-loading `/sync`s during partial joins due to remote memberships. Pull remote memberships from auth events instead of the room state. ([\matrix-org#13477](matrix-org#13477)) - Refuse to start when faster joins is enabled on a deployment with workers, since worker configurations are not currently supported. ([\matrix-org#13531](matrix-org#13531)) - Allow use of both `@trace` and `@tag_args` stacked on the same function. ([\matrix-org#13453](matrix-org#13453)) - Instrument the federation/backfill part of `/messages` for understandable traces in Jaeger. ([\matrix-org#13489](matrix-org#13489)) - Instrument `FederationStateIdsServlet` (`/state_ids`) for understandable traces in Jaeger. ([\matrix-org#13499](matrix-org#13499), [\matrix-org#13554](matrix-org#13554)) - Track HTTP response times over 10 seconds from `/messages` (`synapse_room_message_list_rest_servlet_response_time_seconds`). ([\matrix-org#13533](matrix-org#13533)) - Add metrics to track how the rate limiter is affecting requests (sleep/reject). ([\matrix-org#13534](matrix-org#13534), [\matrix-org#13541](matrix-org#13541)) - Add metrics to time how long it takes us to do backfill processing (`synapse_federation_backfill_processing_before_time_seconds`, `synapse_federation_backfill_processing_after_time_seconds`). ([\matrix-org#13535](matrix-org#13535), [\matrix-org#13584](matrix-org#13584)) - Add metrics to track rate limiter queue timing (`synapse_rate_limit_queue_wait_time_seconds`). ([\matrix-org#13544](matrix-org#13544)) - Update metrics to track `/messages` response time by room size. ([\matrix-org#13545](matrix-org#13545)) - Refactor methods in `synapse.api.auth.Auth` to use `Requester` objects everywhere instead of user IDs. ([\matrix-org#13024](matrix-org#13024)) - Clean-up tests for notifications. ([\matrix-org#13471](matrix-org#13471)) - Add some miscellaneous comments to document sync, especially around `compute_state_delta`. ([\matrix-org#13474](matrix-org#13474)) - Use literals in place of `HTTPStatus` constants in tests. ([\matrix-org#13479](matrix-org#13479), [\matrix-org#13488](matrix-org#13488)) - Add comments about how event push actions are rotated. ([\matrix-org#13485](matrix-org#13485)) - Modify HTML template content to better support mobile devices' screen sizes. ([\matrix-org#13493](matrix-org#13493)) - Add a linter script which will reject non-strict types in Pydantic models. ([\matrix-org#13502](matrix-org#13502)) - Reduce the number of tests using legacy TCP replication. ([\matrix-org#13543](matrix-org#13543)) - Allow specifying additional request fields when using the `HomeServerTestCase.login` helper method. ([\matrix-org#13549](matrix-org#13549)) - Make `HomeServerTestCase` load any configured homeserver modules automatically. ([\matrix-org#13558](matrix-org#13558)) # -----BEGIN PGP SIGNATURE----- # # iQGzBAABCgAdFiEEWMTnW8Z8khaaf90R+84KzgcyGG8FAmMPT8QACgkQ+84Kzgcy # GG9CUAv+Pv/iDpE2jKlV7zQ/cagaKCGsFK5jy0+K9Wr215nP89tuhU37bJXsgvVu # GP3A8k1c/ENPhXwYHLCnnxV3jick1FuVE0W6h0j2PMYeIGNCQhDswytnsQO4JExg # fGLL4ygCzpe8bFX9+mhIM4z8xkZjZX3lIa8CN2LtRLIo0m7qoT1ZWqdt7kAjj5yL # XMk+3Y1yq/Y4SHHqgKurBNdwNcwnv7ynchWxTYa12WVTINt26dLV0Syk3p8u2SLl # 5YNzcDs2TAM7+VxAu7E0AQl426+Ufi122Oj1ZBUG2FxTPLH8Xr18cN2M/at6WxoX # 8pOkGiuahKKvahw1iCoHAGIC66gFIPxBE9xW4R2SKrQtG4sDuKJI0kvunRV8+cy5 # TuJ9cmdDmJR2vj3P3OULqLXGkWsGNJqfZZF8OWkHEI8LUIXZLrAZocFtlonkr9rV # Y8r8LxL8Id1rbHAnCXcJnYdaJ6ol0RIObDFpitY/D8BDUONVw/byeOyAEkq/XPrZ # Ke/9K8sy # =eg1L # -----END PGP SIGNATURE----- # gpg: Signature made Wed Aug 31 13:10:44 2022 BST # gpg: using RSA key 58C4E75BC67C92169A7FDD11FBCE0ACE0732186F # gpg: Can't check signature: No public key # Conflicts: # synapse/api/auth.py # synapse/push/baserules.py # synapse/push/bulk_push_rule_evaluator.py # synapse/push/push_rule_evaluator.py # synapse/storage/databases/main/event_push_actions.py # tests/server_notices/test_resource_limits_server_notices.py
Add metrics to track
/messages
response time by room sizeFollow-up to #13533
Part of #13356
Mentioned in internal doc
Dev notes
Discussion from
#synapse-dev
for reference:Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.Pull request includes a sign off(run the linters)