-
Notifications
You must be signed in to change notification settings - Fork 391
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSC3266: Room summary API #3266
base: old_master
Are you sure you want to change the base?
Conversation
Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
855306e
to
642f4e1
Compare
Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
…est of the path separate Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
That way the requesting server knows, if any user would have access to that room and it can forward the room to the user. Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
It suffers from the same knock_restricted issue.
I've given this a green tick in order not to block its progress, given the MSC as a whole is definitely useful and very proven at this point. @deepbluev7 can you please deal with the remaining open PR comments though (e.g. spelling out what 'public' means; etc)? |
Mautrix has also implemented it now: https://github.com/mautrix/go/releases/tag/v0.23.0 |
This has got stuck waiting for updates once more, so in the interests of pushing it over the line, the Spec Core Team are going to make the final few updates. |
Co-authored-by: Eric Eastwood <madlittlemods@gmail.com> Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
guest_can_join: false, | ||
name: "CHEESE", | ||
num_joined_members: 37, | ||
topic: "Tasty tasty cheese", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this is an entirely new API, could it make sense to use the rich topic format from #3765 here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about the value of having rich text in the room summary. The way this API is utilised implies brief representation. As a client author, I would prefer the topic reduced to plain text. Maybe it's just me though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#3765 argued along the same lines though that was partially motivated by not wanting to touch the existing endpoints. 🙈
I think it's probably fine to keep this as plain text. Just wanted to raise it because changing it now would still be easy due to this being a new API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think basically, we need to keep /room_summary
in sync with GET /_matrix/federation/v1/hierarchy/{roomId}
, so that the information available for remote rooms matches that for local rooms.
Given we haven't yet changed the spec for GET /_matrix/federation/v1/hierarchy/{roomId}
to use rich topics (beyond preferring the plain-text representation of m.topic
over topic
), I don't think we should do so for /room_summary
either. In future, we can update both together.
That said, this MSC probably does need clarifying that it should prefer m.topic
over topic
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see no reason why this MSC should add support for rich text topics, when they are not supported by the hierarchy API and public rooms API. It would make more sense to me to add that to all of those APIs in a targeted MSC, if there is a use case for it. I don't really see one right now.
I also don't see why this MSC needs to clarify if it is using m.topic over topic now. That stuff didn't exist when this MSC was written and this MSC already clarifies where it is taking the fields from (the hierarchy API and related to the public rooms API). At some point we need to stop clarifying every little detail, it causes needless churn on the MSC and most of those things are already obvious enough. This MSC doesn't care about those changes and the rich text topic MSC explicitly mentions that servers should switch to m.topic. The MSC was merged yesterday and is not part of the spec yet. This MSC references the spec. If I now update it to reference some MSC, I will get to update it in a few months again to reference the spec and so on. At some point people will just have to consider some context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't see why this MSC needs to clarify if it is using m.topic over topic now ... this MSC already clarifies where it is taking the fields from (the hierarchy API and related to the public rooms API)
Yup, this is fair, on closer inspection: this MSC just says "These are the same fields as those returned by /publicRooms
..." which seems clear enough.
It should be possible to call this API without authentication, but servers may | ||
rate limit how often they fetch information over federation more heavily, if the | ||
user is unauthenticated. Being able to call this API unauthenticated is | ||
beneficial to avoid third parties registering guest users for one-shot API | ||
calls. Restricting this API to guests only would provide no security benefit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should server admins be allowed to disable returning results on this API similar to how profile look-ups can be disabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be an implementation detail for servers, no? Nothing would change in the API whether a server has such an internal control or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wouldn't change the API (beyond error codes maybe). But we do explicitly call out other cases where admins can deny look-up such as: https://spec.matrix.org/unstable/client-server-api/#server-behaviour
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should call it out, and indeed I am very hesitant to recommend unauthenticated access by default. We probably do want to allow this to be unauthenticated for e.g. matrix.org as a service to the community, but in general I don't think homeservers should not be supporting third-party services out-of-the-box. Allowing unauthenticated access opens up the server to abuse, which not all server admins can or want to deal with.
I think there are two benefits from having this unauthenticated:
- Third-party services that want to piggy-back of a well known public home server
- A client wants to show a preview of a room before the user logs in.
Either way, these depend on using a pre-configured default server, which can then be specifically configured to allow unauthenticated access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- a sane invite experience not exclusive to publicly federating homeservers via a landing page like matrix.to - otherwise the experience will be very normie unfriendly with just a room ID visible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that all of these are solvable issues (and there is some disagreement about how much each actually matters), but I think the larger point stands that we should only consider adding unauthenticated APIs if there is a need to do so.
The need to do so is demonstrated by @t3chguy's point about wanting to offer a good experience in matrix.to. We've established that there are alternatives, but none of them are great. So we're not just adding unauthenticated APIs for a laugh: we have a usecase. The question is, do the downsides outweigh the upsides? It's not as black and white as this sentence implies.
- Information leaking: fundamentally you're giving information out to potential attackers.
Yeah, this is compelling. As things currently stand: suppose I run a homeserver where I have disabled federation. On that homeserver I have a few rooms where any of my users can join (join_rules: public
). I trust my users, but I don't want randoms on the internet to see my room names. This MSC makes that room metadata available to anyone on the internet, which I think would be a surprise to most admins in that situation.
For me, this is an absolute blocker to landing the MSC as it currently stands.
- It makes it easy to rack up costs.
I think the actual cost is debatable, and caching/ratelimiting could be improved iteratively later. This is not a compelling argument for me.
- This endpoint can trigger federation requests,
This feels like a specific case of point 2.
- ... create a public room with illicit content on e.g. matrix.org. Then any server with unauthenticated summary endpoint enabled would return (and potentially cache) that data.
I don't think MSC4108 is particularly relevant to this discussion, but the second half of your point is important. This feels a lot like one of the problems we tried to fix in authenticated media: I don't want randoms in the internet pulling evil room topics onto my server.
- From a server admin PoV, in most deployments you mostly trust your own users so you don't need to worry too much about abuse. If you see abuse its easy enough to lock out a particular user and move on with your life. Admining a server that can be used by unauthenticated third-party services can be more tedious if you have to deal with IP-locking out things, or abuse requests from your server provider, etc. We to an extent see these problems already with the federation APIs.
Given we already have these problems with federation APIs, maybe we don't need to worry about it too much? Or is your point that it will even affect admins who have disabled federation (and thus hope to be exempt from such abuse?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking on behalf of the SCT: we don't have significant time to spend on this MSC, either in terms of changing the Synapse implementation, or in terms of changing matrix.to
. Nevertheless, the current situation, of matrix.org
supporting an unstable endpoint is, IMHO, untenable: we need to find a way forward, and in a way that does not block this MSC for another 3.5 years.
I can see for and against allowing unauthenticated access, but I think there are some real blockers to it, and right now, I am minded to proceed as suggested earlier.
Instead, it is proposed to recommend that homeservers make the endpoint require authentication by default. Homeservers would remain free to allow anonymous access if they wish, and anonymous access would remain enabled on
matrix.org
.
I appreciate that this may not be the ideal outcome for some , especially matrix.to
, but I think it's acceptable, and critically, it is no worse, by any measure, than where we are today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fundamentally disagree, that this endpoint should be available only with authentication by default. There are several users of the API already, that are using it unauthenticated (shields.io, matrix.to). So there clearly is a use case. Those users also all allow you to overwrite which server you want to use for the preview, because even if those services are centralized, Matrix is not. Having every preview go through matrix.org violates the design principles behind Matrix and matrix.to. matrix.to goes through a lot of effort to not leak the room addresses or other data to the server hosting matrix.to. As such people should generally have the choice to use their own servers for previews.
The load generated by this API is a LOT lower than what we had before with guest users for previews. This API was already mentioned as a replacement for those use cases, where people were using guest users before. Requiring authentication for this API now would violate that announcement (although yes, matrix.org could make an exception).
If servers want to disable unauthenticated access for this in a private federation, that is fine. But in an open federation we generally consider public rooms public. You aren't providing any additional security by requiring authentication for this API. All of that data is already available to anyone, who can set up a homeserver. While that is a minuscule amount of work and as such harder than calling an unauthenticated API directly, it won't prevent anyone from getting access to this data.
And the same issue exists in the other direction. Yes, unauthenticated access might cache unwelcome data for a certain amount of time. But to access that data someone would need to call the same API. And that data is deleted automatically, since it is just a cache, not persisted forever like media used to be for most people. Additionally that data can be pushed to a server also by simply sending an invite.
So my opinion is, that if servers want to disable unauthenticated access because they have special federation needs, that is absolutely fine. But they will have to do more work to actually prevent any of the abuse mentioned so far (like only allowing certain domains to federate like in the TI-Messenger federation). Otherwise requiring authentication doesn't technically provide any additional security, doesn't make less data available, but does make Matrix more centralized and this API less useful for use cases already relying on this API. If you want to provide more security, you need to lock down the federation API more and provide restrictions for invites and similar APIs. Requiring authentication for this endpoint makes only a superficial difference.
The authentication requirement of this API should be decided upon by whoever hosts the server and I don't think the spec can make a recommendation either way. It depends too much on what group you are running the server for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If servers want to disable unauthenticated access for this in a private federation, that is fine. But in an open federation we generally consider public rooms public. You aren't providing any additional security by requiring authentication for this API. All of that data is already available to anyone, who can set up a homeserver. While that is a minuscule amount of work and as such harder than calling an unauthenticated API directly, it won't prevent anyone from getting access to this data.
This is true but I think allowing unauthenticated access via the client-server API lowers the bar substantially. Everyone with a browser can query this endpoint. Critically, in a private federation those are users you don't control and may not want to share data with. I don't have a strong opinion on whether we default to authenticated access with the option to allow unauthenticated access or the other way around. But I think as an admin I should be able to lock down this API on my own server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we can accommodate both points of view by simply making it clear that allowing unauthenticated access is left up to the server implementer and admin, as @Johennes stated at the start of this thread.
It should be possible to call this API without authentication, but servers may | ||
rate limit how often they fetch information over federation more heavily, if the | ||
user is unauthenticated. Being able to call this API unauthenticated is | ||
beneficial to avoid third parties registering guest users for one-shot API | ||
calls. Restricting this API to guests only would provide no security benefit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should call it out, and indeed I am very hesitant to recommend unauthenticated access by default. We probably do want to allow this to be unauthenticated for e.g. matrix.org as a service to the community, but in general I don't think homeservers should not be supporting third-party services out-of-the-box. Allowing unauthenticated access opens up the server to abuse, which not all server admins can or want to deal with.
I think there are two benefits from having this unauthenticated:
- Third-party services that want to piggy-back of a well known public home server
- A client wants to show a preview of a room before the user logs in.
Either way, these depend on using a pre-configured default server, which can then be specifically configured to allow unauthenticated access.
(1) The `membership` field will not be present when called unauthenticated, but | ||
is required when called authenticated. It should be `leave` if the server | ||
doesn't know about the room, since for all other membership states the server | ||
would know about the room already. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the server doesn't know about the room, don't we return 04 as per the previous paragraph?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. It looks like the 404 paragraph was added in ed79007 (and later clarified in 424189c), but this leave
requirement was forgotten.
Looking at the Synapse implementation, it looks like it indeed returns a 404 if the server doesn't know about the room. I propose to remove this bit about leave
.
@deepbluev7: any objections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is specifically about when the room info is fetched over federation. In those cases no membership information is available on the local server, but it can reasonably assume none of its local users are joined or invited. While that is obvious, I thought it deserved spelling this out explicitly, since servers shouldn't just omit the membership key in that case. It probably makes more sense to change doesn't know about the room
to is not participating in the room
or does not have any local users in the room
or something about having had to fetch the summary over federation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok well, let's change it to "It should be leave
if the server does not have any local users in the room" if you think it's important to have something here, though IMHO it's clearer with nothing.
These are the same fields as those returned by [`/publicRooms`](https://spec.matrix.org/v1.13/client-server-api/#get_matrixclientv3publicrooms) or | ||
[`/hierarchy`](https://spec.matrix.org/v1.13/client-server-api/#get_matrixclientv1roomsroomidhierarchy) | ||
, with a few additions: `membership`, `room_version`, | ||
`encryption` and `allowed_room_ids`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allowed_room_ids
isn't mentioned in the format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, which format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean in the example above? Yes, because the join_rule
is public. It is explained in the table below though, which also says that the field is optional. Does this need an example specifically with the room_ids
, if the /hierarchy API already uses that field?
Note that rooms the user has been invited to or knocked at might result in | ||
outdated or partial information, since the the homeserver may not have access | ||
to the current state of the room. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? Wouldn't the server just go & get the latest data the same as it would if they weren't invited?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question. @deepbluev7: can you shed any light here?
Looking at the synapse implementation, the logic goes:
- If the server has at least one user whose membership is
join
, generate the summary locally - Otherwise, call out to
GET /_matrix/federation/v1/hierarchy/{roomId}
(or the local cache).
I don't see any reason that rooms the user has been invited to or knocked at would be special here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no way to fetch a room summary over federation, that isn't publicly joinable and a user has been invited to. How would it fetch a more up to date state? That is not how invites currently work on Matrix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmhmm. So, in the current implementation: if you you have been invited to a remote room that isn't publicly joinable, /summary will return an 404 M_NOT_FOUND for that room, right?
Which I guess is kinda "outdated or partial information".
### The `sync` API | ||
|
||
For joined rooms, the `/sync` API can be used to get a summary for all joined | ||
rooms. Apart from not working for unjoined rooms, like knocks, invites and space | ||
children, `/sync` is very heavy for the server and the client needs to cobble | ||
together information from the `state`, `timeline` and | ||
[`summary`](https://github.com/matrix-org/matrix-doc/issues/688) sections to | ||
calculate the room name, topic and other fields provided in this MSC. | ||
|
||
Furthermore, the membership counts in the summary field are only included, if | ||
the client is using lazy loading. This MSC provides similar information as | ||
calling `/sync`, but to allow it to work for unjoined rooms it only uses information | ||
from the stripped state. Additionally, it excludes `m.heroes` as well as membership | ||
events, since those are not included in the stripped state of a room. (A client | ||
can call `/joined_members` to receive those if needed. It may still make sense | ||
to include heroes so that clients could construct a human-friendly room display | ||
name in case both the name and the canonical alias are absent; but solving the | ||
security implications with that may better be left to a separate MSC.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is really an alternative, or at least not one that needs 2 paragraphs of discussion. It's not a problem that it's here, but just as feedback for the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is mentioned as an alternative, because people brought up "just call sync" when I initially discussed this MSC with people. This section is an argument in that discussion. It obviously isn't a good alternative, but people needed that spelled out at the time and I didn't want to have this argument multiple times.
federation API and a client could then easily request a summary of all joined | ||
rooms. It could still request the summary of a single room by just including | ||
only a single room in the POST or a convenience GET could be provided by the | ||
server (that looks like this proposal). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like the conclusion is that we should do this instead? Why wouldn't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess just because there isn't a usecase for a batch API here, and frankly opening up the endpoint to use more resources makes it an even harder sell to land.
@deepbluev7 any other insights?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far nobody said they would benefit from a batch API apart from one use cases, that wanted an easy way to show a room list for the /joined_rooms response. The vast majority of use cases I have seen only query one room. Batching is simply more complicated since you also might not be able to pass vias as query parameters, etc. Unless someone provides a use case, I don't see why a batched API is a better design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fair not to specify a batch API on the premise of lacking use cases and increased complexity. The alternative, as written, doesn't include this reasoning, however. In my understanding the main reason for writing down alternatives is to prove that the proposed solution is the best solution by way of showing that all known alternatives are inferior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guessed this was probably the answer, I was more pointing out that your "alternative" sections should probably end with why you didn't choose this option (even if that reason is something like, "it doesn't matter at all but we had to pick one so we picked this one").
@mscbot concern should unauthenticated access be enabled by default? |
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Rendered.
Somewhat related to MSC2946 this provides an API to get a summary for a specific room either from the local server or over federation.
Useful for a few cases, where you need to show a summary for a room like matrix.to, traveler bots, showing spaces, lightweight clients, etc.
Open design questions looking for feedback: see this comment chain: #3266 (comment)resolved as of 2021/10/06Implementations:
allowed_room_ids
part is implemented here: deepbluev7/synapse@37f4253.Signed-off-by: Nicolas Werner nicolas.werner@hotmail.de
SCT stuff:
Shepherd: @richvdh
FCP tickyboxes
Checklist: #3266 (comment)