Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Federation automatic joins raise 403 on callee homeserver #15012

Open
csuriano23 opened this issue Feb 7, 2023 · 0 comments · Fixed by element-hq/synapse#18075
Open

Federation automatic joins raise 403 on callee homeserver #15012

csuriano23 opened this issue Feb 7, 2023 · 0 comments · Fixed by element-hq/synapse#18075
Labels
A-Federated-Join joins over federation generally suck A-Invite Inviting users to rooms and accepting invites A-Modules Module API: https://matrix-org.github.io/synapse/latest/modules/index.html O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Tolerable Minor significance, cosmetic issues, low or no impact to users. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. Z-Read-After-Write A lack of read-after-write consistency, usually due to cache invalidation races with workers

Comments

@csuriano23
Copy link

csuriano23 commented Feb 7, 2023

Description

I have two federated homeservers, call them alpha and beta.

On both servers it is installed the synapse-auto-accept-invite plugin that triggers a join as a third-party rule each time a user is invited to a room.

When I create a room on alpha and then I invite an user of beta the rule is triggered, but it fails, below the stacktrace on beta homeserver (see also this issue):

2023-02-07 11:49:38 2023-02-07 10:49:38,629 - synapse.events.third_party_rules - 406 - ERROR - PUT-625 - Failed to run module API callback <bound method InviteAutoAccepter.on_new_event of <synapse_auto_accept_invite.InviteAutoAccepter object at 0xffff99f762b0>>: 403: You are not invited to this room.
2023-02-07 11:49:38 Traceback (most recent call last):
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/federation/federation_client.py", line 852, in _try_destination_list
2023-02-07 11:49:38     return await callback(destination)
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/federation/federation_client.py", line 962, in send_request
2023-02-07 11:49:38     ret = await self.transport_layer.make_membership_event(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/federation/transport/client.py", line 324, in make_membership_event
2023-02-07 11:49:38     return await self.client.get_json(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/http/matrixfederationclient.py", line 1077, in get_json
2023-02-07 11:49:38     response = await self._send_request_with_optional_trailing_slash(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/http/matrixfederationclient.py", line 406, in _send_request_with_optional_trailing_slash
2023-02-07 11:49:38     response = await self._send_request(request, **send_request_args)
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/http/matrixfederationclient.py", line 668, in _send_request
2023-02-07 11:49:38     raise exc
2023-02-07 11:49:38 synapse.api.errors.HttpResponseException: 403: Forbidden
2023-02-07 11:49:38 
2023-02-07 11:49:38 The above exception was the direct cause of the following exception:
2023-02-07 11:49:38 
2023-02-07 11:49:38 Traceback (most recent call last):
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/events/third_party_rules.py", line 404, in on_new_event
2023-02-07 11:49:38     await callback(event, state_events)
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse_auto_accept_invite/__init__.py", line 99, in on_new_event
2023-02-07 11:49:38     await self._api.update_room_membership(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/module_api/__init__.py", line 1063, in update_room_membership
2023-02-07 11:49:38     event_id, _ = await self._hs.get_room_member_handler().update_membership(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/handlers/room_member.py", line 597, in update_membership
2023-02-07 11:49:38     result = await self.update_membership_locked(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/handlers/room_member.py", line 965, in update_membership_locked
2023-02-07 11:49:38     remote_join_response = await self._remote_join(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/handlers/room_member.py", line 1747, in _remote_join
2023-02-07 11:49:38     event_id, stream_id = await self.federation_handler.do_invite_join(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/handlers/federation.py", line 574, in do_invite_join
2023-02-07 11:49:38     origin, event, room_version_obj = await self._make_and_verify_event(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/handlers/federation.py", line 1082, in _make_and_verify_event
2023-02-07 11:49:38     ) = await self.federation_client.make_membership_event(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/federation/federation_client.py", line 1014, in make_membership_event
2023-02-07 11:49:38     return await self._try_destination_list(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/federation/federation_client.py", line 894, in _try_destination_list
2023-02-07 11:49:38     raise synapse_error from e
2023-02-07 11:49:38 synapse.api.errors.ProxiedRequestError: 403: You are not invited to this room.

On alpha homeserver here is the stacktrace (I've patched the original code on /synapse/http/server.py:_async_render_wrapper, line ~315 in order to print this):

2023-02-07 11:49:38 Traceback (most recent call last):
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/http/server.py", line 307, in _async_render_wrapper
2023-02-07 11:49:38     callback_return = await self._async_render(request)
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/http/server.py", line 514, in _async_render
2023-02-07 11:49:38     callback_return = await raw_callback_return
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/federation/transport/server/_base.py", line 349, in new_func
2023-02-07 11:49:38     response = await func(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/federation/transport/server/federation.py", line 291, in on_GET
2023-02-07 11:49:38     result = await self.handler.on_make_join_request(
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/federation/federation_server.py", line 647, in on_make_join_request
2023-02-07 11:49:38     pdu = await self.handler.on_make_join_request(origin, room_id, user_id)
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/handlers/federation.py", line 952, in on_make_join_request
2023-02-07 11:49:38     await self._event_auth_handler.check_auth_rules_from_context(event)
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/handlers/event_auth.py", line 69, in check_auth_rules_from_context
2023-02-07 11:49:38     check_state_dependent_auth_rules(event, auth_events_by_id.values())
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/event_auth.py", line 304, in check_state_dependent_auth_rules
2023-02-07 11:49:38     _is_membership_change_allowed(event.room_version, event, auth_dict)
2023-02-07 11:49:38   File "/usr/local/lib/python3.9/site-packages/synapse/event_auth.py", line 649, in _is_membership_change_allowed
2023-02-07 11:49:38     raise AuthError(403, "You are not invited to this room.")
2023-02-07 11:49:38 synapse.api.errors.AuthError: 403: You are not invited to this room.

This is due to the fact that events received by the function /synapse/event_auth.py:_is_membership_change_allowed (arg auth_events) are these (once again patched version with more logging):

[{('m.room.power_levels', ''): <FrozenEventV3 event_id=$QBnOW4DH8asI1Sjbb4cEp1kGF1DHO3kOxBeuHilgfjs, type=m.room.power_levels, state_key=, outlier=False>, ('m.room.join_rules', ''): <FrozenEventV3 event_id=$jonFOBLPEwbG3oTctHGv8s-LYw7SW0s7bW5eXiKq3z0, type=m.room.join_rules, state_key=, outlier=False>, ('m.room.create', ''): <FrozenEventV3 event_id=$s6joCXpyxW38sPpSE6dWS_2A62Op39z71awJ0lZAU6U, type=m.room.create, state_key=, outlier=False>}]

When I manually accept from the beta client everything goes ok and the auth events logged are different (call from beta server is the same, I've compared the url):

[{('m.room.power_levels', ''): <FrozenEventV3 event_id=$QBnOW4DH8asI1Sjbb4cEp1kGF1DHO3kOxBeuHilgfjs, type=m.room.power_levels, state_key=, outlier=False>, ('m.room.join_rules', ''): <FrozenEventV3 event_id=$jonFOBLPEwbG3oTctHGv8s-LYw7SW0s7bW5eXiKq3z0, type=m.room.join_rules, state_key=, outlier=False>, ('m.room.member', '@betauser:beta.dbridge.dev'): <FrozenEventV3 event_id=$ym49hJLGIrVg094rFk2C0DdAfF1s92FNGevTfosF6I4, type=m.room.member, state_key=@betauser:beta.dbridge.dev, outlier=False>, ('m.room.create', ''): <FrozenEventV3 event_id=$s6joCXpyxW38sPpSE6dWS_2A62Op39z71awJ0lZAU6U, type=m.room.create, state_key=, outlier=False>}]

Even hardcoding a time.sleep(30) on the third-party rule doesn't seem to solve the issue. I've also tried patching the third party rule to retry N times with backoff factor, but still the same issue.

Is this inconsistent behavior on event storage a known issue?
If yes, has a fix already been planned?
Are there workaround possible?

Steps to reproduce

See description

Homeserver

local homeserver

Synapse Version

1.75.0

Installation Method

Docker (matrixdotorg/synapse)

Database

PostgreSQL, no separate servers, no porting, no restoring

Workers

Single process

Platform

See docker image matrixdotorg/synapse:v1.75.0

Configuration

Third party plugin:

modules:
  - module: synapse_auto_accept_invite.InviteAutoAccepter
    config:
      accept_invites_only_for_direct_messages: false

Relevant log output

See description

Anything else that would be useful to know?

No response

@DMRobertson DMRobertson added S-Tolerable Minor significance, cosmetic issues, low or no impact to users. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. A-Modules Module API: https://matrix-org.github.io/synapse/latest/modules/index.html O-Uncommon Most users are unlikely to come across this or unexpected workflow Z-Read-After-Write A lack of read-after-write consistency, usually due to cache invalidation races with workers A-Federated-Join joins over federation generally suck A-Invite Inviting users to rooms and accepting invites labels Feb 8, 2023
MadLittleMods added a commit to element-hq/synapse that referenced this issue Jan 27, 2025
This also happens for rejecting an invite. Basically, any out-of-band membership transition where we first get the membership as an `outlier` and then rely on federation filling us in to de-outlier it.

This PR mainly addresses automated test flakiness, bots/scripts, and options within Synapse like [`auto_accept_invites`](https://element-hq.github.io/synapse/v1.122/usage/configuration/config_documentation.html#auto_accept_invites) that are able to react quickly (before federation is able to push us events), but also helps in generic scenarios where federation is lagging.

I initially thought this might be a Synapse consistency issue (see issues labeled with [`Z-Read-After-Write`](https://github.com/matrix-org/synapse/labels/Z-Read-After-Write)) but it seems to be an event auth logic problem. Workers probably do increase the number of possible race condition scenarios that make this visible though (replication and cache invalidation lag).

Fix #15012
(probably fixes matrix-org/synapse#15012 (#15012))
Related to matrix-org/matrix-spec#2062

Problems:

 1. We don't consider [out-of-band membership](https://github.com/element-hq/synapse/blob/develop/docs/development/room-dag-concepts.md#out-of-band-membership-events) (outliers) in our `event_auth` logic even though we expose them in `/sync`.
 1. (This PR doesn't address this point) Perhaps we should consider authing events in the persistence queue as events already in the queue could allow subsequent events to be allowed (events come through many channels: federation transaction, remote invite, remote join, local send). But this doesn't save us in the case where the event is more delayed over federation.


### What happened before?

I wrote some Complement test that stresses this exact scenario and reproduces the problem: matrix-org/complement#757

```
COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh -run TestSynapseConsistency
```


We have `hs1` and `hs2` running in monolith mode (no workers):

 1. `@charlie1:hs2` is invited and joins the room:
     1. `hs1` invites `@charlie1:hs2` to a room which we receive on `hs2` as `PUT /_matrix/federation/v1/invite/{roomId}/{eventId}` (`on_invite_request(...)`) and the invite membership is persisted as an outlier. The `room_memberships` and `local_current_membership` database tables are also updated which means they are visible down `/sync` at this point.
     1. `@charlie1:hs2` decides to join because it saw the invite down `/sync`. Because `hs2` is not yet in the room, this happens as a remote join `make_join`/`send_join` which comes back with all of the auth events needed to auth successfully and now `@charlie1:hs2` is successfully joined to the room.
 1. `@charlie2:hs2` is invited and and tries to join the room:
     1. `hs1` invites `@charlie2:hs2` to the room which we receive on `hs2` as `PUT /_matrix/federation/v1/invite/{roomId}/{eventId}` (`on_invite_request(...)`) and the invite membership is persisted as an outlier. The `room_memberships` and `local_current_membership` database tables are also updated which means they are visible down `/sync` at this point.
     1. Because `hs2` is already participating in the room, we also see the invite come over federation in a transaction and we start processing it (not done yet, see below)
     1. `@charlie2:hs2` decides to join because it saw the invite down `/sync`. Because `hs2`, is already in the room, this happens as a local join but we deny the event because our `event_auth` logic thinks that we have no membership in the room ❌ (expected to be able to join because we saw the invite down `/sync`)
     1. We finally finish processing the `@charlie2:hs2` invite event from and de-outlier it.
         - If this finished before we tried to join we would have been fine but this is the race condition that makes this situation visible.


Logs for `hs2`:

```
🗳️ on_invite_request: handling event <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=False>
🔦 _store_room_members_txn update room_memberships: <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=True>
🔦 _store_room_members_txn update local_current_membership: <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=True>
📨 Notifying about new event <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=True>
✅ on_invite_request: handled event <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=True>
🧲 do_invite_join for @user-2-charlie1:hs2 in !sfZVBdLUezpPWetrol:hs1
🔦 _store_room_members_txn update room_memberships: <FrozenEventV3 event_id=$bwv8LxFnqfpsw_rhR7OrTjtz09gaJ23MqstKOcs7ygA, type=m.room.member, state_key=@user-1-alice:hs1, membership=join, outlier=True>
🔦 _store_room_members_txn update room_memberships: <FrozenEventV3 event_id=$oju1ts3G3pz5O62IesrxX5is4LxAwU3WPr4xvid5ijI, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=join, outlier=False>
📨 Notifying about new event <FrozenEventV3 event_id=$oju1ts3G3pz5O62IesrxX5is4LxAwU3WPr4xvid5ijI, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=join, outlier=False>

...

🗳️ on_invite_request: handling event <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=False>
🔦 _store_room_members_txn update room_memberships: <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=True>
🔦 _store_room_members_txn update local_current_membership: <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=True>
📨 Notifying about new event <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=True>
✅ on_invite_request: handled event <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=True>
📬 handling received PDU in room !sfZVBdLUezpPWetrol:hs1: <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=False>
📮 handle_new_client_event: handling <FrozenEventV3 event_id=$WNVDTQrxy5tCdPQHMyHyIn7tE4NWqKsZ8Bn8R4WbBSA, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=join, outlier=False>
❌ Denying new event <FrozenEventV3 event_id=$WNVDTQrxy5tCdPQHMyHyIn7tE4NWqKsZ8Bn8R4WbBSA, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=join, outlier=False> because 403: You are not invited to this room.
synapse.http.server - 130 - INFO - POST-16 - <SynapseRequest at 0x7f460c91fbf0 method='POST' uri='/_matrix/client/v3/join/%21sfZVBdLUezpPWetrol:hs1?server_name=hs1' clientproto='HTTP/1.0' site='8080'> SynapseError: 403 - You are not invited to this room.
📨 Notifying about new event <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=False>
✅ handled received PDU in room !sfZVBdLUezpPWetrol:hs1: <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=False>
```
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Federated-Join joins over federation generally suck A-Invite Inviting users to rooms and accepting invites A-Modules Module API: https://matrix-org.github.io/synapse/latest/modules/index.html O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Tolerable Minor significance, cosmetic issues, low or no impact to users. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. Z-Read-After-Write A lack of read-after-write consistency, usually due to cache invalidation races with workers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants