Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC4133: Extending User Profile API with Key:Value Pairs #4133

Open
wants to merge 29 commits into
base: main
Choose a base branch
from

Conversation

tcpipuk
Copy link

@tcpipuk tcpipuk commented Apr 19, 2024

@tcpipuk tcpipuk changed the title MSC0000: Extending User Profile API with Key:Value Pairs MSC4133: Extending User Profile API with Key:Value Pairs Apr 19, 2024
@turt2live turt2live added proposal A matrix spec change proposal client-server Client-Server API kind:feature MSC for not-core and not-maintenance stuff needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. labels Apr 19, 2024
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
@turt2live
Copy link
Member

@tcpipuk when you get a chance, please sign off on your changes to allow the MSC to eventually be eligible for acceptance.

@tcpipuk

This comment was marked as resolved.

@turt2live
Copy link
Member

Looks great, thanks!

@tcpipuk tcpipuk marked this pull request as ready for review April 19, 2024 19:46
@andrewzhurov
Copy link

Seems we discuss here a key-value CRDT, as keys would need to be updated over time. Personally, I'm all in favor.
With no regard be those events a part of a (messaging) Room or be in its own Profile Room (#1769) (which would be neat),
we need a mechanism to get the latest key-value pairs.

Matrix Event Graph is isomorphic to Merkle-CRDT that powers OrbitDB (key-value DB is one of the supported CRDT).
Taking an inspiration from how key-value CRDT works may be of help to spec it for those Servers wishing to opt-in.

proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
Copy link
Contributor

@MTRNord MTRNord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I cant add concerns I can only use the review feature 🤷

proposals/4133-extended-profiles.md Show resolved Hide resolved
Copy link
Member

@tulir tulir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Size limits look good now and I think only showing profile data for unredacted members should be sufficient for T&S. However, there should probably be a client implementation that supports user-defined fields rather than just the timestamp one.

@tulir
Copy link
Member

tulir commented Sep 30, 2024

@mscbot concern no client implementation for user-defined fields

Copy link
Contributor

@Half-Shot Half-Shot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got a few more concerns off the back of the latest round of changes. I'd like to give personal thanks to @tcpipuk at this stage for preserving on one of the more difficult MSCs. You're doing a great job of wrangling this.


On the MSC, I am slightly worried it is starting to include an absolute behemoth of implementation details. Some of which feel more like they are rephrasing other parts of the MSC in different language.

On the need for an implementation for user defined fields, I agree. Albeit, I am starting to feel inclined to suggest reserving the namespace in this MSC and creating a successor MSC that deals with the implications (that means leaving u*. effectively ignored until clients choose to support that specific MSC). That would leave clients able to use custom profiles, for defined fields they choose to support.

Anyway, that's my 2c. I suspect the SCT will have their own beliefs on what's right here. Again, good job!

proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
Comment on lines 300 to 305
- Servers are *allowed* to suppress some or all local/remote profile fields they do not wish to
share with their local users (e.g. if there are moderation concerns during a go-live phase);
however, server admins *must* disclose to users if they are publishing profile fields on behalf
of a user over federation that they cannot see, as this could be considered a breach of trust.
This feature could cause confusion if some users can see fields that other users cannot, so this
should be used sparingly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what situation would this occur? This seems extremely unusual. I'd almost say this seems like a bad practice thing to do, and contorts how profiles work.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe Jim raised this as an option for large servers to roll out profile fields, i.e. they'd update to a server that supports them but want to block fields being seen/updated until they were ready.

I agree it feels like a bit of a hack, I'm just not sure whether there's a better way to soft-enable/disable profile fields, or if we should even do that at all?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'd like to understand what @jimmackenzie had in mind here. This feels like it raises the complexity of profiles AND reduces trust in profiles being accurate to me.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the meantime, I've rewritten this to be a lot less "servers can suppress whatever they like, whenever they like" and more "this entire feature can be temporarily disabled, but only sparingly": tcpipuk@4afb8b8#diff-e408d6c936a5ae3f7dd4386ed643830fb2191b459b35cbec73b2987d73cae439R294-R296

proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
Copy link
Member

@turt2live turt2live left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall, I support the feature this MSC is aiming to bring forward. I've left some more detailed comments in my review, but the summary is:

  • There's a few places where the text leaves me a bit lost, or has a very blunt transition. Context from past discussions appears missing or overly included as well. A top to bottom read and edit is recommended to unify the feedback received.
  • User-specified namespacing should be moved out to a dedicated MSC, or at least deferred to a future MSC. It opens up way too many doors for consideration, and I don't think that blocking this MSC on those discussions is worthwhile.
  • Some words about traffic volume would be good.

Thanks for putting this proposal together, and for handling the ongoing feedback. Looking forward to this being merged, and profiles being fixed (finally)!

proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved

## Client-Server API Changes

- **Get a Profile Field**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking: it would be good to mention that this is modeled off the existing endpoints, and link to those endpoints.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done this up top: 9b4741e#diff-e408d6c936a5ae3f7dd4386ed643830fb2191b459b35cbec73b2987d73cae439R11

Or would you prefer a specific mention in the actual CS API section itself?

}
```

*Note**: Setting a `null` value does not delete the key; the key remains with a `null` value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking: words to explain this behaviour would be appreciated.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if permitted by the homeserver.

- **Get All Profile Fields**
- **Endpoint**: `GET /_matrix/client/v3/profile/{userId}`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this endpoint already exists? Linking to it would be best.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 118 to 120
There is no method to verify the history of global profile fields over federation, so this endpoint
MUST only accept requests for local users on the current homeserver, and homeservers MUST only
request a profile from the homeserver specified in that user's MXID.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest moving the verification bit to an explainer in Potential Issues, and dropping everything after "so this endpoint..." because it's just repeating the spec itself.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
proposals/4133-extended-profiles.md Outdated Show resolved Hide resolved
namespaces.

- Clients SHOULD only display profiles of users in the current room whose membership status is
`join`, `invite`, or `knock`, and whose `m.room.member` event has *not* been redacted. If a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(clients may not be able to determine if the membership event has been redacted - just excluding leaves, kicks, and bans is probably fine)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

completely new API specifically for extended information. However, these approaches could lead
to more significant changes to the existing API and higher complexity for client developers.

## Security Considerations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on the thread overall: moderation/safety tooling will likely have to adapt to scan-on-read behaviour rather than scan-constantly or scan-on-join. The server is likely best positioned to have this tooling, and can correlate API calls for server-side moderation purposes.

For example:

  1. Client requests profile of a user.
  2. Before responding, the server scans the profile and omits obvious violations (possibly auto-reporting internally too).
  3. Client receives amended profile, displays to user.
  4. User spots scam link, reports user using relevant API.
  5. Server sees report was within ~5 minutes of it serving the profile, attaches the profile as-served to the internal ticket for review.
  6. Server moderator sees both as-served profile and the related ticket for field omission, and can make decisions from there.

None of this exists today, but it would be an immensely valuable tool considering this MSC's feature design. This is not something the MSC needs to describe or spell out, but it should reference that moderation and safety tooling will need to consider the caching semantics of profiles in their design.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This MSC appears to have gone through a few rounds of comments, and organically iterated upon as a result. It's a bit difficult to read from the dozens of small edits, but I think I get the idea the proposal is trying to put forward.

It's not necessarily a blocker for FCP, but I would encourage a top to bottom read/edit to ensure the paragraphs and sections flow well together and create a cohesive idea. I've annotated a couple of places which need particular attention in my review for reference.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. I've adopted the specific edits mentioned, and also reorganised sections and corrected some unusual formatting/phrasing in the hope that it helps make the overall document a bit easier.

@turt2live
Copy link
Member

@mscbot resolve Size limits section needs iteration
@mscbot concern General clarity
@mscbot concern User-specified fields should be deferred out of this MSC

@turt2live
Copy link
Member

(please avoid force pushing as it makes review significantly harder)

Comment on lines +163 to +165
- **When capability is missing**: Clients SHOULD assume extended profiles are supported and that
they can be created or modified. If a server intends to deny some (or all) changes, it SHOULD use
the capability to advertise this, improving the client experience.
Copy link
Member

@clokep clokep Oct 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is backwards compatible -- if it is missing shouldn't it be assumed to be not supported?

Or maybe it should be assumed to be enabled, but the only allowed fields depend on m.set_displayname and m.set_avatar_url?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was primarily off the back of this thread: #4133 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this just needs to state that when the capability is missing, but the homeserver's supported spec versions supports custom profile fields?

Comment on lines +231 to +238
- **`M_TOO_MANY_KEYS`**: Exceeds key limits.

```json
{
"errcode": "M_TOO_MANY_KEYS",
"error": "The user has exceeded the maximum number of allowed keys in their profile."
}
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here -- what is the maximum number of keys? If we have the 64 KiB limit why do we have this also?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was an example, so that a homeserver may optionally limit the number of keys a user may create, so clients know to recognise this specific type of situation when it crops up. Should I add explanation for its existence, or just remove it? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the size of a key and the total size is probably enough limitations? 🤷

Comment on lines +93 to +94
- **Description**: Merge the provided JSON object into the user's profile, updating any
specified keys without altering others, if permitted by the homeserver.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify -- is this a full merge (e.g. RFC 7396) or replacing any values by the top-level key?

For example, if the current values is:

{
  "initial": "untouched",
  "foo": {"bar": "test"}
}

And a PATCH request is received:

{
  "key": true,
  "foo": {"baz": 1}
}

What's the resulting JSON?

Full merge

{
  "initial": "untouched",
  "key": true,
  "foo": {
    "bar": "test",
    "baz": 1
  }
}

Top level merge

{
  "initial": "untouched",
  "key": true,
  "foo": {"baz": 1}
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'd assumed it would overwrite the top-level key, but I guess that also invites the same race conditions we've been trying to avoid with the single-key operations.

One worth asking the SCT, perhaps? Is merging trees behaviour we want to protect in Matrix, or should we treat each key as an insular piece of data that should always be overwritten entirely?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get much traction from the SCT about which way to go.

@tulir Do you know what hungryserv does? The Synapse PR does just a top-level merge.

I think ways forward are to:

  1. Just choose one and see what people think.
  2. Move PATCH into a separate MSC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like hungry does a top-level merge too

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I see the benefit of PATCH here, whatever merging method we use is going to cause confusion. I think the easiest thing to do is to remove support for this and just have clients either re-upload or set individual keys.

}
```

- **`M_PROFILE_TOO_LARGE`**: Exceeds total profile size limits.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just M_TOO_LARGE which already exists?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was because of this thread: #4133 (comment)

Now we've lowered the amount of possible errors, I could switch to using the errors you've proposed, it's just at the time Dave suggested I use clear errors to differentiate which thing was "too large".

}
```

- **`M_KEY_TOO_LARGE`**: Exceeds individual key length limits.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not M_INVALID_PARAM?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error should probably already be used if someone tries to do something like use an invalid type for the key name:

M_INVALID_PARAM A parameter that was specified has the wrong value. For example, the server expected an integer and instead received a string.

We could just use this to reject all unacceptable keys though, what do you think? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking we would reject all unacceptable keys with M_INVALID_PARAM.

To minimise the impact of abuse, clients SHOULD offer suitable defaults for displaying user-entered
content. A user MAY choose to display fields from all users globally, but *by default* profiles
SHOULD only be shown when the users share the current room and the other user is in the `join`,
`invite`, or `knock` membership states.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about knock here, given for knockable rooms this can be done by any random Matrix user and so there is no extra "trust" gleaned from the fact you "share" a room?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was that you'd want to see the fields to help decide if you should allow them to enter.

- Servers SHOULD cache remote profiles for at least 5 minutes after retrieval. Until a method for
servers to notify each other of updates is established, it is recommended that profiles be cached
no longer than an hour to avoid displaying stale data. Servers SHOULD NOT cache profiles for
longer than 24 hours to avoid unwanted data persistence.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is really needed if this is triggered by a user-driven API requests? Generally we're OK with doing a federation request in that case, as federation requests aren't that expensive. The issue is when servers end up doing automated tasks that end up doings lots of requests.

The downside with recommending caching is that it makes the UX materially worse over federation

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's good to know. I do think there's some room for optimisation (e.g. if someone joins a room with 10,000 users in it, we don't really want their server asked for timezone information 10,000 times) but we also don't want stale information.

My preference would be to use an EDU to signal when a profile's been updated, so servers/clients can hold onto profile information rather than needing to repeatedly call it. I've avoided including that in this MSC, but this is the rationale behind including caching because it'll likely be encouraged in a future one.

Perhaps I should switch it to a "may" so it's up to servers whether they cache now (and admins can decide how much to compromise UX for performance) but then it's a simple wording change later?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client-server Client-Server API disposition-merge kind:feature MSC for not-core and not-maintenance stuff proposal A matrix spec change proposal proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. unresolved-concerns This proposal has at least one outstanding concern
Projects
Status: Ready for FCP ticks
Development

Successfully merging this pull request may close these issues.