This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Individual workers will fetch remote server keys without sharing a cache #12767

Closed
anoadragon453 opened this issue May 17, 2022 · 2 comments · Fixed by #14804
Labels
- A-Workers: Problems related to running Synapse in Worker Mode (or replication)
- T-Other: Questions, user support, anything else.
- Z-Time-Tracked: Element employees should track their time spent on this issue/PR.

Comments

@anoadragon453
Member

We don't seem to try fetching server keys from the database when validating federation requests on a worker. This can lead to multiple workers each making a server key request to the same remote homeserver.

```python
async def process_request(self, verify_request: VerifyJsonRequest) -> None:
    """Processes the `VerifyJsonRequest`. Raises if the object is not signed
    by the server, the signatures don't match or we failed to fetch the
    necessary keys.
    """
    if not verify_request.key_ids:
        raise SynapseError(
            400,
            f"Not signed by {verify_request.server_name}",
            Codes.UNAUTHORIZED,
        )

    found_keys: Dict[str, FetchKeyResult] = {}

    # If we are the originating server, short-circuit the key-fetch for any keys
    # we already have
    if verify_request.server_name == self._hostname:
        for key_id in verify_request.key_ids:
            if key_id in self._local_verify_keys:
                found_keys[key_id] = self._local_verify_keys[key_id]

    key_ids_to_find = set(verify_request.key_ids) - found_keys.keys()
    if key_ids_to_find:
        # Add the keys we need to verify to the queue for retrieval. We queue
        # up requests for the same server so we don't end up with many in flight
        # requests for the same keys.
        key_request = _FetchKeyRequest(
            server_name=verify_request.server_name,
            minimum_valid_until_ts=verify_request.minimum_valid_until_ts,
            key_ids=list(key_ids_to_find),
        )

        found_keys_by_server = await self._server_queue.add_to_queue(
            key_request, key=verify_request.server_name
        )

        # Since we batch up requests the returned set of keys may contain keys
        # from other servers, so we pull out only the ones we care about.
        found_keys.update(found_keys_by_server.get(verify_request.server_name, {}))
```
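For readers unfamiliar with the per-server queueing above: the idea is that concurrent verification requests for the same remote server are coalesced so only one fetch is in flight at a time. A minimal toy sketch of that pattern (this is not Synapse's actual `BatchingQueue` API, just an illustration):

```python
import asyncio
from typing import Dict


class CoalescingQueue:
    """Toy stand-in for a per-server request queue: concurrent callers using
    the same key share a single in-flight fetch instead of each making one."""

    def __init__(self, process):
        self._process = process  # async callable that does the real fetch
        self._in_flight: Dict[str, asyncio.Future] = {}

    async def add_to_queue(self, request, key: str):
        if key in self._in_flight:
            # Another caller is already fetching for this server: wait for it.
            return await self._in_flight[key]
        future = asyncio.get_running_loop().create_future()
        self._in_flight[key] = future
        try:
            result = await self._process(request)
        except Exception as exc:
            future.set_exception(exc)
            raise
        else:
            future.set_result(result)
            return result
        finally:
            del self._in_flight[key]


fetch_count = 0


async def fetch_keys(request):
    global fetch_count
    fetch_count += 1
    await asyncio.sleep(0.01)  # simulate the network round-trip
    return {"ed25519:abc": "verify-key"}


async def main():
    queue = CoalescingQueue(fetch_keys)
    # Three concurrent verifications against the same server share one fetch.
    return await asyncio.gather(
        *(
            queue.add_to_queue({"server_name": "remote.example"}, key="remote.example")
            for _ in range(3)
        )
    )


results = asyncio.run(main())
```

The catch this issue describes is that such a queue is in-memory, so it only deduplicates requests within a single worker process, not across workers.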

We have a database table, `server_keys_json`, in which we cache server key responses. I can't think of any reason not to use it here.
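To make the suggestion concrete, here is a sketch of a cache-first lookup against such a table. The schema and helper below are simplified and illustrative, not Synapse's actual `server_keys_json` definition or storage API:

```python
import json
import sqlite3

# Illustrative, cut-down schema: the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE server_keys_json ("
    " server_name TEXT, key_id TEXT, ts_valid_until_ms INTEGER, key_json TEXT)"
)
conn.execute(
    "INSERT INTO server_keys_json VALUES (?, ?, ?, ?)",
    ("remote.example", "ed25519:abc", 2_000_000_000_000, json.dumps({"k": "v"})),
)


def get_cached_key(server_name, key_id, minimum_valid_until_ms):
    """Return a cached key response, or None if we must fetch remotely."""
    row = conn.execute(
        "SELECT key_json FROM server_keys_json"
        " WHERE server_name = ? AND key_id = ? AND ts_valid_until_ms >= ?",
        (server_name, key_id, minimum_valid_until_ms),
    ).fetchone()
    return json.loads(row[0]) if row else None


hit = get_cached_key("remote.example", "ed25519:abc", 1_600_000_000_000)
miss = get_cached_key("remote.example", "ed25519:unknown", 0)
```

Because the table lives in the shared database, a hit here would avoid a network request on every worker, not just the one that originally fetched the key.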

anoadragon453 added the T-Other and Z-Time-Tracked labels on May 17, 2022
MadLittleMods added the A-Workers label on Jun 3, 2022

anoadragon453 commented Jan 9, 2023

It turns out that we do consult the local database before making requests to remote servers. A `Keyring` object has a `_key_fetchers` attribute which determines the `KeyFetcher` instances (or instances of classes that inherit from it) that the `Keyring` will use to fetch keys.

The first one it will try is a `StoreKeyFetcher`, which will - as the name implies - fetch keys from the local datastore:

```python
if key_fetchers is None:
    key_fetchers = (
        StoreKeyFetcher(hs),
        PerspectivesKeyFetcher(hs),
        ServerKeyFetcher(hs),
    )
self._key_fetchers = key_fetchers
```

```python
class StoreKeyFetcher(KeyFetcher):
    """KeyFetcher impl which fetches keys from our data store"""
```

This is not particularly obvious from the current code though, so I'll drop in a comment.
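The cascade described above - ask each fetcher in order, and only pass the still-missing key IDs on to the next, more expensive one - can be sketched like this. The class and function names here are illustrative stand-ins, not Synapse's real API:

```python
from typing import Dict, List


class StoreFetcher:
    """Stands in for StoreKeyFetcher: answers from a local cache dict."""

    def __init__(self, cache: Dict[str, str]):
        self._cache = cache

    def get_keys(self, key_ids: List[str]) -> Dict[str, str]:
        return {k: self._cache[k] for k in key_ids if k in self._cache}


class RemoteFetcher:
    """Stands in for ServerKeyFetcher: pretends every key is fetchable,
    and records which keys it was actually asked for."""

    def __init__(self):
        self.requested: List[str] = []

    def get_keys(self, key_ids: List[str]) -> Dict[str, str]:
        self.requested.extend(key_ids)
        return {k: f"remote-{k}" for k in key_ids}


def fetch_keys(fetchers, key_ids: List[str]) -> Dict[str, str]:
    """Walk the fetchers in order; stop once every requested key is found."""
    found: Dict[str, str] = {}
    missing = list(key_ids)
    for fetcher in fetchers:
        if not missing:
            break
        found.update(fetcher.get_keys(missing))
        missing = [k for k in key_ids if k not in found]
    return found


store = StoreFetcher({"ed25519:a": "cached"})
remote = RemoteFetcher()
keys = fetch_keys([store, remote], ["ed25519:a", "ed25519:b"])
```

With this ordering, a key already in the store never reaches the remote fetcher: only `ed25519:b` triggers a network request in the example above.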

@DMRobertson
Contributor

I think this issue was onto something: see #15171
