Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better specify how to validate events #365

Open
richvdh opened this issue Aug 31, 2018 · 6 comments
Open

Better specify how to validate events #365

richvdh opened this issue Aug 31, 2018 · 6 comments
Labels
A-S2S Server-to-Server API (federation) clarification An area where the expected behaviour is understood, but the spec could do with being more explicit feature Suggestion for a significant extension which needs considerable consideration

Comments

@richvdh
Copy link
Member

richvdh commented Aug 31, 2018

We should clearly specify how to validate an event that has been received over federation, and how the receiving server should behave when validation fails.

The sorts of things we should consider checking include:

  • json compliance:
    • does it conform to the rules of canonicaljson (eg no floats, no bigints)?
    • Does it have invalid UTF-8 (invalid byte sequences, overlong encodings, things which decode to the UTF-16 surrogates or to codepoints above U+10FFFF)?
  • schema compliance:
  • whether the origin matches the sender sending the event, for pushed transactions.
  • whether the sender and event_id match the origin (except for join events)
  • whether the creator on a create event matches the sender and the room_id
  • whether it has a "sensible" number of prev_events
  • signatures: whether the event has been signed by the origin (and the sender's server, where different).
    • also, whether the signing key was valid at the time the signature was made.
  • hashes: whether the event's content matches the content hash
  • whether hashes in prev_events and auth_events correspond to the hashes on events we have (though this will probably be made much simpler by MSC: Replace event IDs with hashes matrix-spec-proposals#1640)

Parts of this list overlap with the rules already set out for Authorization of PDUs.

Synapse currently checks many, but not all, of the things on this list, so there may well be rooms in existence which do not conform. Starting to check them now is problematic in that it may prevent servers from participating in those existing rooms. We should therefore consider whether it is reasonable to apply any additional checks now, or which should wait for a future room version.

We should also define the behaviour when a validation check is not met: should we ignore such events, or reject incoming pushes? For pulls, should we try to get the events from elsewhere?

@ara4n ara4n changed the title Better specify how to validate events MSC: Better specify how to validate events Sep 2, 2018
@turt2live turt2live self-assigned this Sep 4, 2018
@turt2live
Copy link
Member

Related: matrix-org/matrix-spec-proposals#463

@turt2live
Copy link
Member

Somehow I managed to miss this when digging for event validation information :(

Given the validation rules are almost certainly going to need to be a rooms v2 thing, I think we should go a bit more strongly down the path of validation. In general, the things listed to be considered in the draft seem sensible. I only have a few specific concerns:

  • We should just ignore the creator on create events: 'creator' field of m.room.create events is redundant matrix-spec-proposals#1193
  • The origin should disappear: https://github.com/matrix-org/matrix-doc/issues/1664 (although we should still make sure that example.org is sending events for example.org)
  • No floats isn't great as it also affects the content of the event - can we get away with using the content hash instead of relying on the content field? My concern is that float data is useful in IoT and similar scenarios, and the devices parsing the floating points are likely to be aware of the troubles in doing so.
    • Arguably one could work around this limitation as an implementation detail. For instance, the sending party could use end-to-end encryption or hash the value itself, or otherwise encode the float into multiple parts that are compliant with the spec. This seems a bit silly for a workaround, however.
  • How does an arbitrary number of 10 sound for max prev_events?
  • Validating event schema might be difficult, but we should certainly do it for state events. There's an argument to be made that encrypted events can't be have their true content validated and therefore we shouldn't bother validating the remainder of room events because the server would only catch half the problems. State events cannot be encrypted (yet), and we rely heavily on those being accurate, so we should validate them to no end. Non-state events should still get the length check and general PDU schema check, but we may want to optimize the spec and leave it to clients to safely handle mangled content.

On the point of figuring out how to handle invalid events, considering they would have already gone through the auth rules I don't think it's a good idea to reject them outright at the validation step. If the event is permitted to be in the room, then we should allow it to be there. That being said, servers should consider a soft failure approach to prevent badly formatted events from making it down to clients.

As a more minor point, the spec should clarify how events are meant to flow through the various algorithms. It's not entirely clear if validation is done before or after the auth rules, for instance.

To answer the question of "what about v1 rooms?", I think we should just say there that validation should be relaxed and only applied where it strictly applies (ie: use if membership == "join" rather than if membership not in [kick, ban, leave, join]).

@turt2live turt2live removed their assignment Sep 14, 2018
@richvdh
Copy link
Member Author

richvdh commented May 30, 2019

see also matrix-org/synapse#1237

@richvdh
Copy link
Member Author

richvdh commented Jul 15, 2019

see also #504

@richvdh richvdh changed the title MSC: Better specify how to validate events Better specify how to validate events Sep 23, 2020
@richvdh richvdh transferred this issue from matrix-org/matrix-spec-proposals Mar 1, 2022
@DMRobertson
Copy link
Contributor

DMRobertson commented May 4, 2022

schema compliance:

Note that the expectations here are sensitive to the room version in play; see #1044 #1045.

@turt2live turt2live added clarification An area where the expected behaviour is understood, but the spec could do with being more explicit feature Suggestion for a significant extension which needs considerable consideration A-S2S Server-to-Server API (federation) labels May 31, 2022
@Kladki
Copy link
Contributor

Kladki commented Jul 7, 2024

whether it has all the required fields listed at https://matrix.org/docs/spec/server_server/unstable.html#persistent-data-unit-schema

Doesn't seem to be enforced (fully) by any server implementation, see #1904

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-S2S Server-to-Server API (federation) clarification An area where the expected behaviour is understood, but the spec could do with being more explicit feature Suggestion for a significant extension which needs considerable consideration
Projects
None yet
Development

No branches or pull requests

6 participants