MSC1849: Proposal for m.relates_to aggregations #1849
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,230 @@ | ||
# Proposal for aggregations via m.relates_to | ||
|
||
> WIP WIP WIP WIP WIP WIP WIP WIP WIP WIP | ||
|
||
A very rough WIP set of notes on how relations could work in Matrix. | ||
|
||
Today, replies look like: | ||
|
||
```json | ||
"type": "m.room.message", | ||
"contents": { | ||
|
||
"m.relates_to": { | ||
"m.in_reply_to": { | ||
"event_id": "$another:event.com" | ||
} | ||
} | ||
} | ||
``` | ||
|
||
`m.relates_to` is the signal to the server that the fields within describe aggregation operations. | ||
|
||
This is a bit clunky as for other types of relations, we end up duplicating the "m.in_reply_to" | ||
type between the event type and the relationship type. | ||
So instead, perhaps we should only specify a relationship type if strictly needed, e.g.: | ||
|
||
```json | ||
"type": "m.room.message", | ||
"contents": { | ||
"m.relates_to": { | ||
"event_id": "$another:event.com", | ||
"type": "m.reply" | ||
} | ||
} | ||
``` | ||
|
||
and | ||
|
||
```json | ||
"type": "m.reaction", | ||
"contents": { | ||
"m.text": "👍", | ||
"m.relates_to": { | ||
"event_id": "$another:event.com" | ||
} | ||
} | ||
``` | ||
|
||
or if building a relation DAG (for ordering for edits, and for pulling in missing relations after a federation outage): | ||
|
||
```json | ||
"type": "m.reaction", | ||
"contents": { | ||
"m.text": "👍", | ||
"m.relates_to": { | ||
"event_id": ["$another:event.com", "$another2:event.com"] | ||
} | ||
} | ||
``` | ||
|
||
And then the server just blitzes through `m.relates_to.event_id` and builds up all the relationships based on that field. | ||
This is kinda similar to pik's proposal below, but without the JSON schema. | ||
|
||
XXX: `event_id` should take either a string or a list of strings, to support relation DAGs (needed for ordering edits) | ||
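A minimal sketch of how a server might walk `m.relates_to.event_id` and build the relationship table, assuming events are plain dicts shaped like the examples above. The helper names `parent_ids` and `build_relation_index` are hypothetical, not part of the proposal; the sketch accepts either a string or a list of strings, as the XXX note suggests.

```python
from collections import defaultdict


def parent_ids(event):
    """Return the parent event IDs named in m.relates_to, always as a list."""
    relates_to = event.get("contents", {}).get("m.relates_to")
    if not relates_to:
        return []
    target = relates_to.get("event_id")
    if target is None:
        return []
    # The proposal allows either a single string or a list of strings.
    return [target] if isinstance(target, str) else list(target)


def build_relation_index(events):
    """Map each parent event ID to the IDs of the events that relate to it."""
    index = defaultdict(list)
    for event in events:
        for parent in parent_ids(event):
            index[parent].append(event["event_id"])
    return dict(index)


# Illustrative data only.
events = [
    {"event_id": "$reaction:alice.com", "type": "m.reaction",
     "contents": {"m.text": "👍",
                  "m.relates_to": {"event_id": "$another:event.com"}}},
    {"event_id": "$reaction2:bob.com", "type": "m.reaction",
     "contents": {"m.text": "👍",
                  "m.relates_to": {"event_id": ["$another:event.com",
                                                "$another2:event.com"]}}},
    {"event_id": "$plain:carol.com", "type": "m.room.message",
     "contents": {"m.text": "hi"}},
]
index = build_relation_index(events)
```

Normalising to a list at the edge keeps the rest of the code indifferent to whether a relation has one parent or several.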
|
||
## CS API considerations | ||
|
||
Then, whenever the event is served down the CS API, we inline the relations for a given event (modulo filters)? | ||
|
||
For edits, we'd want the most recent relation (by default) | ||
For reactions, we want all the reaction objects (or ideally their sum?) | ||
For replies, we don't want the original at all; the client can load it if needed via /context. | ||
|
||
We should send the aggregated event down during normal pagination, | ||
as well as the individual relations down incrementally during sync. | ||
|
||
After a limited sync, we should send a fresh aggregated event rather | ||
than try to calculate a delta. | ||
|
||
This is similar to how redactions are calculated today. We just build up a table of which events reference | ||
which other events, and expand them out when syncing. | ||
|
||
So: | ||
|
||
```json | ||
"type": "m.room.message", | ||
"event_id": "$another:event.com", | ||
"contents": { | ||
"m.text": "I have an excellent idea" | ||
}, | ||
"relations": [ | ||
{ | ||
"type": "m.reaction", | ||
"event_id": "$reaction:alice.com", | ||
"sender": "@alice:alice.com", | ||
"contents": { | ||
"m.text": "👍" | ||
} | ||
}, | ||
{ | ||
"type": "m.reaction", | ||
"event_id": "$reaction2:bob.com", | ||
"sender": "@bob:bob.com", | ||
"contents": { | ||
"m.text": "👎" | ||
} | ||
} | ||
] | ||
``` | ||
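A rough sketch of the inlining rules described in this section: all reactions are kept, only the most recent edit is kept, and replies are not inlined. The proposal does not name an event type for edits or a timestamp field on relations, so `edit_type="m.edit"` and the `origin_ts` key are assumptions made purely for illustration, as is the function name `bundle_relations`.

```python
def bundle_relations(event, relations, edit_type="m.edit"):
    """Return a copy of `event` with a `relations` list inlined.

    Assumptions (not from the proposal): edits use the hypothetical type
    `edit_type`, and every relation carries an `origin_ts` timestamp.
    """
    reactions = [r for r in relations if r["type"] == "m.reaction"]
    edits = [r for r in relations if r["type"] == edit_type]

    bundled = list(reactions)
    if edits:
        # Only the most recent edit is served by default.
        bundled.append(max(edits, key=lambda r: r["origin_ts"]))

    result = dict(event)
    result["relations"] = bundled
    return result


# Illustrative data only.
original = {"type": "m.room.message", "event_id": "$another:event.com",
            "contents": {"m.text": "I have an excellent idea"}}
relations = [
    {"type": "m.reaction", "event_id": "$reaction:alice.com", "origin_ts": 10,
     "contents": {"m.text": "👍"}},
    {"type": "m.edit", "event_id": "$edit1:alice.com", "origin_ts": 20,
     "contents": {"m.text": "first edit"}},
    {"type": "m.edit", "event_id": "$edit2:alice.com", "origin_ts": 30,
     "contents": {"m.text": "second edit"}},
    {"type": "m.room.message", "event_id": "$reply:bob.com", "origin_ts": 40,
     "contents": {"m.text": "a reply"}},
]
aggregated = bundle_relations(original, relations)
```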
|
||
## Sending relations | ||
|
||
PUT /_matrix/client/r0/rooms/{roomId}/send_relation/{parent_id}/{eventType}/{txnId} | ||
```json | ||
{ | ||
"m.text": "👍" | ||
} | ||
``` | ||
|
||
N.B. that the server then gets to populate out the m.relates_to field itself, | ||
adding the `parent_id` as one parent, but also adding any other dangling relations-dag extremities. | ||
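A minimal sketch of both halves of this exchange, assuming the endpoint shape shown above: the client builds the request path, and the server fills in `m.relates_to` from the `parent_id` plus any other dangling extremities. The function names are hypothetical, and how the server finds its dangling extremities is left out.

```python
from urllib.parse import quote


def send_relation_path(room_id, parent_id, event_type, txn_id):
    """Build the proposed send_relation path, percent-encoding each segment."""
    segments = [quote(s, safe="") for s in (room_id, parent_id, event_type, txn_id)]
    return "/_matrix/client/r0/rooms/{}/send_relation/{}/{}/{}".format(*segments)


def populate_relates_to(content, parent_id, dangling_extremities):
    """Server side: add m.relates_to, listing parent_id first.

    `dangling_extremities` is assumed to be the list of other relation-DAG
    extremities the server knows about for this event.
    """
    parents = [parent_id]
    for event_id in dangling_extremities:
        if event_id not in parents:
            parents.append(event_id)

    populated = dict(content)
    # A single parent stays a string; several parents become a list.
    populated["m.relates_to"] = {
        "event_id": parents if len(parents) > 1 else parent_id
    }
    return populated


path = send_relation_path("!room:example.org", "$another:event.com",
                          "m.reaction", "txn1")
body = populate_relates_to({"m.text": "👍"}, "$another:event.com",
                           ["$another2:event.com"])
```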
|
||
## Receiving relations | ||
|
||
TODO: | ||
|
||
* /sync | ||
* /messages | ||
* /context | ||
|
||
## Pagination considerations | ||
|
||
How do we handle 20K edits in a row? | ||
* we need to paginate 'vertically' somehow | ||
|
||
How do we handle a message with 20K different emojis? | ||
* we need to paginate 'horizontally' somehow - return the 10 most popular emojis? | ||
|
||
Do relations automatically give us threads somehow? | ||
* No; we will at the least need to define how to permalink to a relation and then paginate around it. | ||
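One possible reading of "horizontal" pagination, sketched under the assumption that reactions carry their emoji in `contents["m.text"]` as in the earlier examples: count reactions per emoji and return only the most popular ones. The name `top_reactions` and the default limit of 10 are illustrative, not part of the proposal.

```python
from collections import Counter


def top_reactions(relations, limit=10):
    """Return up to `limit` (emoji, count) pairs, most popular first."""
    counts = Counter(
        r["contents"]["m.text"]
        for r in relations
        if r["type"] == "m.reaction"
    )
    return counts.most_common(limit)


# Illustrative data only.
reactions = [
    {"type": "m.reaction", "contents": {"m.text": "👍"}},
    {"type": "m.reaction", "contents": {"m.text": "👎"}},
    {"type": "m.reaction", "contents": {"m.text": "👍"}},
]
first_page = top_reactions(reactions, limit=1)
```

This only covers the first page; a real API would still need a pagination token for the remaining emojis.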
|
||
## Edge cases | ||
how does all of this interact when you've ignored the user who reacted, edited, etc someone else's message? For example: User A (not ignored) says something that User B (ignored) edits - does the server send the edit down or does it let the conversation diverge?

It seems like given the aggregations happening in the proposal that the client can still receive the aggregation for ignored users (which is fine, I think) in terms of a count. The new APIs should probably filter them out though, and mention that they do this.

In an ideal world, i'd prefer that we filtered out all relations from ignored users - including aggregated reaction counts. It's awful if you've ignored someone who's harassing you, and then see anonymous +1 🍆 reactions or whatever all over your msgs. @erikjohnston how do we handle this atm in synapse?

I believe we filter out reactions from ignored user, except when doing aggregate counts. I'm a bit wary of committing to having the counts take account of ignored users as that explodes the complexity of the calculation, which happens anytime we send events down to the client.

agreed that it explodes the complexity... but it also does introduce quite a miserable harassment vector. :/

Yeah, its not ideal but I'm just not sure matrix.org would handle the extra load terribly well.

What about a partial solution like replacing the count with zero if all reactions are from ignored users?
|
||
|
||
XXX: What happens when you react to an edit? | ||
* You should be able to, but the reaction should be attributed to the edit (or its contents) rather than the message as a whole. | ||
* So how do you aggregate? | ||
|
||
How do you handle racing edits? | ||
* The edits could form a DAG of relations for robustness. | ||
* Tie-break between forward DAG extremities based on origin_ts | ||
* m.relates_to should be able to take an array of event_ids. | ||
* ...or do we just always tiebreak on origin_ts, and rely on a social problem for it not to be abused? | ||
* problem is that other relation types might well need a more robust way of ordering. | ||
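A small sketch of the `origin_ts` tie-break between forward extremities of the edit DAG. Falling back to the event ID when two timestamps are equal is an assumption added here so the result is deterministic; the proposal does not specify it, and `pick_winning_edit` is a hypothetical name.

```python
def pick_winning_edit(extremities):
    """Choose the edit to display from the forward extremities of the DAG.

    The latest origin_ts wins; equal timestamps fall back to comparing
    event IDs (an assumption, to keep the choice deterministic).
    """
    return max(extremities, key=lambda e: (e["origin_ts"], e["event_id"]))


# Two edits raced: both claim the same parent and neither saw the other.
racing = [
    {"event_id": "$edit_a:alice.com", "origin_ts": 1000},
    {"event_id": "$edit_b:alice.com", "origin_ts": 1005},
]
winner = pick_winning_edit(racing)
```

Because `origin_ts` is set by the sender, this ordering can be gamed, which is the "social problem" the bullet above refers to.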
|
||
Redactions | ||
* Redacting an edited event in the UI should redact the original; the client will need to redact the original event to make this happen. | ||
* Clients could also try to expand the relations and redact those too if they wanted to, but in practice the server shouldn't send down relations to redacted messages, so it's overkill. | ||
* You can also redact specific relations if needed (e.g. to remove a reaction from ever happening) | ||
* If you redact a relation, we keep the relation DAG (and solve that metadata leak alongside our others) | ||
|
||
What does it mean to call /context on a relation? | ||
* We should probably just return the root event for now, and then refine it in future for threading? | ||
|
||
## E2E considerations | ||
|
||
In E2E rooms: | ||
* The payload should be encrypted. This means that we can't sum emoji reactions serverside; | ||
they'll have to be passed around one by one. Given E2E rooms tend to be smaller, this is | ||
hopefully not a major problem. We could reduce bandwidth by reusing the same key to | ||
encrypt the relations as the original message. | ||
* This means that reputation data can't be calculated serverside for E2E rooms however. | ||
* It might be okay to calculate it clientside? Or we could special-case reputation data to not be E2E? | ||
* The m.relates_to field however should not be encrypted, so that the server can use it for | ||
performing aggregations where possible (e.g. returning only the most recent edit). | ||
|
||
## Federation considerations | ||
|
||
In general, no special considerations are needed for federation; relational events are just sent as needed over federation | ||
same as any other event type - aggregated onto the original event if needed. | ||
|
||
XXX: We have a problem with resynchronising relations after a gap in federation. | ||
We have no way of knowing that an edit happened in the gap to one of the events | ||
we already have. So, we'll show inconsistent data until we backfill the gap. | ||
* We could write this off as a limitation. | ||
* Or we could make *ALL* relations a DAG, so we can spot holes at the next relation, and | ||
go walk the DAG to pull in the missing relations? Then, the next relation for an event | ||
could pull in any of the missing relations. | ||
* Could we also ask the server, after a gap, to provide all the relations which happened during the gap to events whose root preceded the gap? | ||
* "Give me all relations which happened between this set of forward-extremities when I lost sync, and the point I've rejoined the DAG, for events which preceded the gap"? | ||
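A sketch of the "make all relations a DAG and walk it" option, assuming each relation lists its parents in `m.relates_to.event_id` and that the server has some way to fetch an event by ID from a remote server. `fetch_event` is a hypothetical callback standing in for that, and `fetch_missing_relations` is not a name from the proposal.

```python
def parent_ids(event):
    """Return the parents named in m.relates_to as a list (may be empty)."""
    relates_to = event.get("contents", {}).get("m.relates_to", {})
    target = relates_to.get("event_id", [])
    return [target] if isinstance(target, str) else list(target)


def fetch_missing_relations(new_relation, known_ids, fetch_event):
    """Walk back from `new_relation` and fetch every ancestor we lack.

    `fetch_event(event_id)` is a hypothetical callback that returns the
    event dict, or None if it cannot be retrieved.
    """
    seen = set(known_ids)
    pending = [p for p in parent_ids(new_relation) if p not in seen]
    fetched = []
    while pending:
        event_id = pending.pop()
        if event_id in seen:
            continue
        seen.add(event_id)
        event = fetch_event(event_id)
        if event is None:
            continue  # a hole we cannot fill right now
        fetched.append(event)
        pending.extend(p for p in parent_ids(event) if p not in seen)
    return fetched


# Illustrative data: two edits happened during the gap, then a third arrives.
remote = {
    "$edit1": {"event_id": "$edit1",
               "contents": {"m.relates_to": {"event_id": "$root"}}},
    "$edit2": {"event_id": "$edit2",
               "contents": {"m.relates_to": {"event_id": ["$edit1"]}}},
}
edit3 = {"event_id": "$edit3",
         "contents": {"m.relates_to": {"event_id": ["$edit2"]}}}
recovered = fetch_missing_relations(edit3, {"$root"}, remote.get)
```

The walk only starts once a later relation arrives, so an event that is edited during the gap and never touched again stays stale, as the XXX above notes.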
What happens when the set of relations becomes very large?

excellent question. god knows; this is being swept under the carpet for now.
|
||
|
||
In future, we might want to aggregate relational events (e.g. send a summary of the events, rather than the | ||
actual original events). This requires the payload to be non-E2E encrypted, and would also require some kind of | ||
challenge-response mechanism to prove that the summary is accurate to the recipients (a ZK mechanism of some kind). | ||
In some ways this is a subset of the more general problem of how we can efficiently send summaries of rooms and even | ||
room state over federation without having to send all the events up front. | ||
|
||
## Historical context | ||
|
||
pik's MSC441 has: | ||
|
||
Define the JSON schema for the aggregation event, so the server can work out which fields should be aggregated. | ||
|
||
```json | ||
"type": "m.room._aggregation.emoticon", | ||
"contents": { | ||
"emoticon": "::smile::", | ||
"msgtype": "?", | ||
"target_id": "$another:event.com" | ||
} | ||
``` | ||
|
||
These would then be aggregated, based on target_id, and returned as annotations on the source event in an | ||
`aggregation_data` field: | ||
|
||
```json | ||
"contents": { | ||
... | ||
"aggregation_data": { | ||
"m.room._aggregation.emoticon": { | ||
"aggregation_data": [ | ||
{ | ||
"emoticon": "::smile::", | ||
"event_id": "$14796538949JTYis:pik-test", | ||
"sender": "@pik:pik-test" | ||
} | ||
], | ||
"latest_event_id": "$14796538949JTYis:pik-test" | ||
} | ||
} | ||
} | ||
``` |
Please bear in mind that I'm posting this comment with my community member hat on rather than my NV employee one
While I haven't seen it stated explicitly in the proposal (I might have missed it though), from what I heard and read in various exchanges, including element-hq/element-web#9793, the way edits are going to be implemented specifically allows admins and mods in rooms to edit anyone's messages in rooms, which I see as a massive red flag and a huge ethical concern in a protocol that's currently mainly used for IM.
I've already discussed my PoV on this in the linked issue, but I thought I should also make my point clear in this timeline in order to keep the discussion around aggregations in one place.
I see this specific aspect of edits as a huge risk, as imho it's easy to abuse, and abuse would cause confusion in the best case scenario, impersonation with potentially harmful consequences for the victim in the worst case scenario. As a user, I feel like the risk it makes me take, in terms of potential disinformation and how I might act upon it without realising the fake, to even read messages as edited (rather than see the edits as individual messages) exceeds by a lot the positive aspect of the feature. And I find it very intrusive to even allow other users to edit my own messages without my consent.
I do see the point that it might be a useful tool for moderation, but it really sounds to me like it brings little to the table, nor can I see the absolute need for it given we already have a redaction feature. I'm waaaay more comfortable with mods deleting my messages rather than putting something I don't think or agree with under my name.
On a UX/UI point of view, I'd tend to think that this aspect adds a hard requirement for a perfectly explicit UX that allows users to figure out from a glance, and without additional actions, who authored which part of a message as easily and as straightforwardly as figuring out who authored a message, and this on every client that has at least one user. This doesn't sound a realistic goal to me.
A small and brief text at the end of the message like the "(edited)" we have now is not enough, imho, as I myself often miss it when reading messages from others, and I'm almost sure that non-technical users would either miss it as well most of the time, or not know what it means and decide not to care (my dad would definitely be in either of those 100% of the time). Plus, it turns informational metadata that you don't have to process if you don't care into information that'd be presented as non-important but is in fact as important as the message and its context if you want to understand both.
As to the argument "some platforms such as some forums and github do it", I'd answer that it doesn't make it a good thing to have (and some chose not to have it, such as Reddit afaik). I already find it pretty intrusive and badly implemented there, where the use of such a feature is more justified than IM (e.g. GitHub's target audience is used to anything being editable by a bunch of people and processing changes histories).
Also, as a user, I really feel like that, with this rolled out everywhere, Matrix, something that got my interest because it gives me control over my communications, takes a big chunk of this control back without me being able to do anything about it, and it makes me sad. And will probably prevent me from making announcements, or even speaking, in rooms in which I don't know each mod and admin to some extent.
I'm currently chatting with (h)activists and journalists (and some other people with interest in tech and/or privacy) to try to get them interested in Matrix, but this element being added to Matrix would probably greatly hamper these efforts, because I can't genuinely promote a platform which allows disinformation and impersonation by design.
In summary, to me this aspect of message editing is both risky and unnecessary, and will likely keep me away from message editing as long as it's a thing, and that makes me sad (especially since I was waiting on such a feature for a long time). Someone abusing it would be able to make me say "I like nazis" in a way that would look very convincing and legitimate, all without my consent (and possibly without even me noticing), and most non-technical people would just think I like nazis and possibly hate me for it. I'd like a message editing feature in which I'm the only one that can edit my messages and no one else, and if a mod has an issue with it, they can talk to me or just delete my message.
We are not allowing edits from other users right now.
That being said, it sounds like all you're asking for is that if this does get implemented then we ensure that the UX is very clear about the fact it has been edited by a third party? That seems entirely reasonable and easy enough to do, e.g. changing the avatar to an amalgamation of the sender and editor as well as the display name, etc.
It's also worth noting that you can always see edit history, which feels like a net win over redactions (which IMO are a horrible moderation tool) in terms of maintaining control over the conversation.
In general we have a balancing act between moderation tools and "control", after all moderation tools by definition are about taking power from individuals and giving them to moderators. I think it's clear that we do need more moderation tools in Matrix, otherwise it will not be usable by most communities. However, so long as we are clear in our UX I think we can support a happy compromise that allows Matrix to be useful across the board.
Even mods/admins? If so, can we please keep it like this?
My point is that I doubt most non-tech users will use it, or even notice there is such a thing.
That would be a good start tbh, though I don't think relying on Riot to do the right thing UX-wise to prevent abuses on all of Matrix is the right way to go, nor is fair to other clients.
To get more in depth on what I mean, I think if we are to rely on client UX to prevent abuses to go horribly wrong, it needs to:
Which is why I don't consider this as a reasonable expectation.
I would rather have Matrix prevent abuses that could go horribly wrong instead.
Really? Again, this feels like a solvable UX issue. If there is a big fat button that screams "This message has been EDITED click here to see the history" users will quickly realise that it's a thing.
I'm sympathetic to this view point, but again it feels like an argument that can be made against many features. What we see in Matrix (and across tech) is that UIs are greatly informed by existing implementations. If we implement an appropriate UX in Riot I would be comfortable assuming that the majority of clients will similarly highlight third party edits, and certainly if they didn't then users of those clients would quickly start complaining about it.
Can other clients implement confusing UX? Yes, they can, but that's true of many things today.
(I also fail to see how it's unfair to other clients? It's not like we're special casing Riot in any way?)
I think this is setting an impossibly high bar. Why does this not apply to end to end encryption? We heavily rely on clients implementing the correct behaviour, and the consequences of them getting it wrong are at least as damaging, if not more so, than third party edit UX mess ups.
We cannot create a protocol that is 100% fool proof against attackers and client implementation problems. What we can do is to try and highlight such areas of high risk and give clear and unambiguous guidelines on how to deal with those issues, setting appropriate examples in the flagship apps that are built.
Broadly, yes, one of the reasons we aren't implementing third party edits today is over concerns that it is more effort than it's worth, and that implementing the necessary UX is non-trivial to get right.
If there's a big fat button easy to spot, maybe. With the current UX in Riot ("(edited)" after the message), probably not. I'm convinced most non-tech people would just see it as "here's something I don't fully understand but doesn't prevent me from reading the timeline so let's just not care". And again, I myself miss it quite often, and am convinced that most non-tech people I know would either miss it or be in the "not understanding, thus not caring" scenario.
Only if the client provides the tools to spot it I think, which there's no guarantee of.
It does apply, but I wouldn't say it's comparable because there's nothing a server implem can do about a client implementing E2EE in a confusing way, which is not true with edits. My point is that I believe that if we can prevent clients from causing damage to Matrix users because they don't provide the right UX, we probably should imho.
Not sure if that's what you meant, but I understood that part of your message as "we'll have a nice UX in Riot so it's not a problem".
Which imho doesn't mean it's not worth trying.
Then this may need to be more explicit, because I never saw Riot as an example to follow UX-wise when building a client, nor have I seen any client dev talk about it that way.
And I also don't like the idea of seeing Riot as a reference implementation, because that's what a "flagship app that should be the example to follow" sounds like to me. Riot is developed by NV, not the Matrix.org Foundation, and should imho be as valid as an example as any other clients out there.
As a user, I feel these concerns are enough to make me start looking for a non-Matrix-based IM platform to move to if it gets implemented or allowed in the spec.
Let me be a bit more explicit about this: Riot is not special cased in any way at all. That doesn't mean that Riot, as currently the client with the most user share, won't have a direct impact on how other clients implement features. I'm not saying that other clients must, or even should, follow the choices Riot makes, and indeed it'd be pointless if clients did. But what will happen is that client implementers will take cues from how other clients have implemented features, and try and ensure that they implement similar safeguards.
As a spec we can't ignore how the existing ecosystem will implement things.
We would also, of course, spell all this out in the spec. We could also ensure that third party edits behave differently down the `/sync` api to ensure that clients do specifically handle them differently.
I'm saying if we implement third party events we should change that?
Hence my suggestion that we change the timeline to not keep the avatar and display name of the original sender. That way, if the user doesn't understand what's happened they won't believe it came from the original sender either.
There's no guarantee that clients won't display the wrong avatars next to messages. Yes, that's a bug that we've seen.
Of course we should try, my point throughout all of this is that perfect is an enemy of good. What matters is what happens in practice, if an attack is not actually possible in practice then it doesn't really matter if it's theoretically possible in other scenarios. We can create the most theoretically secure app in the world, and no one would use it if it doesn't have the features that people actually need.
From an end user security perspective the distinction doesn't really matter? Yes, from a practical perspective it means that we need to be careful to help clients ensure that they do the right thing to be secure, just like we do for E2E.