MSC1763: Proposal for specifying configurable message retention periods #1763

Open: wants to merge 37 commits into base: old_master

@ara4n added the proposal (A matrix spec change proposal) and proposal-in-review labels Dec 30, 2018
@ara4n
Member Author

ara4n commented Dec 30, 2018

This is an attempt to unblock matrix-org/matrix-spec#32 and matrix-org/matrix-spec#35 and to solve the problem of how to specify configurable per-room and per-user message retention. This is particularly driven by Disroot & Hackint disabling their Matrix bridges over concerns about unlimited message retention; meanwhile, there are also corporate Matrix users on the horizon who require configurable history retention. Hence trying to slay the beast once and for all...

@richvdh richvdh self-requested a review January 2, 2019 07:18
@ara4n
Member Author

ara4n commented Feb 18, 2019

One thought: I wonder if we should do something to expire megolm key backups for sessions whose events have all been deleted from the server...

@richvdh richvdh removed their request for review February 27, 2019 11:10
Comment on lines 38 to 39
* to provide "ephemeral messaging" semantics where messages are best-effort
deleted after being read.

I question the feasibility of this, in what I essentially see as a Matrix-specced version of Synapse's History Purge functionality. What exactly would qualify as "after being read"? Shouldn't this be removed and left for MSC2228 to specify or address?


In the instance of `min_lifetime` or `max_lifetime` being overridden, the
invariant that `max_lifetime >= min_lifetime` must be maintained by clamping
max_lifetime to be equal to `min_lifetime`.

Could this also be added as a fallback when the `max_lifetime >= min_lifetime` invariant is broken?
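
To make the clamping rule concrete, here is a minimal sketch in Python. It assumes lifetimes are integer milliseconds and that `None` means the field is unset; `clamp_retention` is a hypothetical helper name, not something defined by the MSC or by any server. The same function would cover both the override case and the broken-invariant fallback asked about above.

```python
from typing import Optional, Tuple


def clamp_retention(
    min_lifetime: Optional[int],
    max_lifetime: Optional[int],
) -> Tuple[Optional[int], Optional[int]]:
    """Return (min_lifetime, max_lifetime) with max_lifetime >= min_lifetime.

    Hypothetical helper: if both values are set and the invariant is
    broken, max_lifetime is raised to equal min_lifetime.
    """
    if (
        min_lifetime is not None
        and max_lifetime is not None
        and max_lifetime < min_lifetime
    ):
        max_lifetime = min_lifetime
    return min_lifetime, max_lifetime
```

Unset values are passed through untouched, since the invariant only has meaning when both lifetimes are present.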

Comment on lines 128 to 129
The UI for this could be a warning banner in the room to remind the user that
that room's retention setting doesn't match their preferred default.

I read somewhere else that the spec no longer mandates how clients expose UI elements to users. Maybe a more abstract description should be used of when the client is warned, such as:

  • when the client joins a room
  • when the retention settings are changed
  • when the client is viewing retention settings (e.g. "warning, there are 4 rooms which override this behaviour")
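
As a rough illustration of the last case, a client could derive the "rooms which override this behaviour" list along these lines. This is only a sketch under assumptions: it compares a single `max_lifetime` value per room (in milliseconds, `None` meaning no policy) against the user's preferred default, and `rooms_overriding_default` is a hypothetical name.

```python
from typing import Dict, List, Optional


def rooms_overriding_default(
    room_max_lifetimes: Dict[str, Optional[int]],
    preferred_max_lifetime: Optional[int],
) -> List[str]:
    """List room IDs whose max_lifetime differs from the user's preference.

    Hypothetical client-side check; None means the room (or the user)
    has no retention limit configured.
    """
    return sorted(
        room_id
        for room_id, max_lifetime in room_max_lifetimes.items()
        if max_lifetime != preferred_max_lifetime
    )
```

The length of the returned list would give the count for a warning shown in the retention settings view.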

@@ -0,0 +1,315 @@
# Proposal for specifying configurable per-room message retention periods.
Contributor

@ShadowJonathan ShadowJonathan Apr 2, 2022

I’m sensing an innate conflict within this MSC’s interests, one where it wants to reduce server history in rooms, but where it also simultaneously expects to be able to fetch that history from thin air at any convenient time. I have a feeling it’s written with the underlying idea that large servers will carry all the events in the federation, with some servers being able to fetch from those at any time.

…however, this is mentioned nowhere in the MSC, where it skirts around these problems by putting these assumptions between the lines, while not thinking critically about what this means for the larger federation; more dependency on large servers.

With this, it does not bring a lucid solution to the problem of dealing with history retention, one where any server eventually has to face that it cannot fetch events it knows exist(ed), but is now expected to return them in response to a client’s query.

The semantic equivalent of HTTP Error 410 (“gone”) has to exist somewhere here, to be able to tell clients it’s unable to fetch a historical event due to history retention, and all sad and happy paths that spring from that. The current stance against this is “you’re SOL, have a 404 with no context”.
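
A minimal sketch of what that distinction could look like on the server side, in Python. Everything here is an assumption for illustration: the error code `ORG.EXAMPLE.EVENT_PURGED` is hypothetical and defined by neither the MSC nor the Matrix spec, and the function presumes the server keeps a record of which event IDs it purged. Only `M_NOT_FOUND` and the `errcode`/`error` body shape come from the existing spec.

```python
from http import HTTPStatus
from typing import Dict, Set, Tuple

# Hypothetical error code, for illustration only; not defined anywhere.
PURGED_ERRCODE = "ORG.EXAMPLE.EVENT_PURGED"


def fetch_event(
    event_id: str,
    stored: Dict[str, dict],
    purged: Set[str],
) -> Tuple[int, dict]:
    """Return (HTTP status, JSON body) for an event lookup."""
    if event_id in stored:
        return int(HTTPStatus.OK), stored[event_id]
    if event_id in purged:
        # The server knows the event existed but removed it under retention.
        return int(HTTPStatus.GONE), {
            "errcode": PURGED_ERRCODE,
            "error": "Event removed by retention policy",
        }
    # The server has never seen this event.
    return int(HTTPStatus.NOT_FOUND), {
        "errcode": "M_NOT_FOUND",
        "error": "Event not found",
    }
```

With this shape a client can tell "deleted by retention" apart from "never existed", at the cost of the server remembering purged event IDs.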


I don’t see this MSC deal with the reality that it is deleting events, and I don’t see a coherent solution that allows some servers to “archive” history and makes that explicit (also in the rooms, for privacy reasons, for people who want to know which servers are ignoring retention rules and archiving anyway).

Servers ignoring retention rules does have a basis, namely one of actually archiving historic conversations, in a similar philosophy as The Internet Archive. If this MSC were to go through as-is, then we’d have a similar situation as the general internet, namely one where all history is lost to time due to individual retention strategies.

While reliance on large servers isn’t what a federation would want, an explicit mechanism that at least makes people aware of which servers are keeping backups and which aren’t would help this MSC greatly in the long run.

@babolivier
Contributor

babolivier commented Apr 4, 2022

As a status update: I've started rewriting bits of this MSC to better match reality and fix some issues but haven't finished that yet, so it shouldn't be considered for review until this is done. I'll remove the "in review" label. Thanks all for feedback.

@natecovington

This might not be the right way to request such a thing, but if I could wave my magic wand, I'd build this into Synapse Admin, so I can delete an individual user's media:
Screenshot from 2022-08-19 10-44-52

And something like this into the "room settings" so I can control the room message / media history settings right from Element:
Screenshot from 2022-08-19 10-51-18

Fizzadar added a commit to beeper/synapse-legacy-fork that referenced this pull request Sep 15, 2022
Synapse 1.67.0 (2022-09-13)
===========================

This release removes using the deprecated direct TCP replication configuration
for workers. Server admins should use Redis instead. See the [upgrade
notes](https://matrix-org.github.io/synapse/v1.67/upgrade.html#upgrading-to-v1670).

The minimum version of `poetry` supported for managing source checkouts is now
1.2.0.

**Notice:** from the next major release (1.68.0) installing Synapse from a source
checkout will require a recent Rust compiler. Those using packages or
`pip install matrix-synapse` will not be affected. See the [upgrade
notes](https://matrix-org.github.io/synapse/v1.67/upgrade.html#upgrading-to-v1670).

**Notice:** from the next major release (1.68.0), running Synapse with a SQLite
database will require SQLite version 3.27.0 or higher. (The [current minimum
 version is SQLite 3.22.0](https://github.com/matrix-org/synapse/blob/release-v1.67/synapse/storage/engines/sqlite.py#L69-L78).)
See [matrix-org#12983](matrix-org#12983) and the [upgrade notes](https://matrix-org.github.io/synapse/v1.67/upgrade.html#upgrading-to-v1670) for more details.

No significant changes since 1.67.0rc1.

Synapse 1.67.0rc1 (2022-09-06)
==============================

Features
--------

- Support setting the registration shared secret in a file, via a new `registration_shared_secret_path` configuration option. ([\matrix-org#13614](matrix-org#13614))
- Change the default startup behaviour so that any missing "additional" configuration files (signing key, etc) are generated automatically. ([\matrix-org#13615](matrix-org#13615))
- Improve performance of sending messages in rooms with thousands of local users. ([\matrix-org#13634](matrix-org#13634))

Bugfixes
--------

- Fix a bug introduced in Synapse 1.13 where the [List Rooms admin API](https://matrix-org.github.io/synapse/develop/admin_api/rooms.html#list-room-api) would return integers instead of booleans for the `federatable` and `public` fields when using a Sqlite database. ([\matrix-org#13509](matrix-org#13509))
- Fix bug that user cannot `/forget` rooms after the last member has left the room. ([\matrix-org#13546](matrix-org#13546))
- Faster Room Joins: fix `/make_knock` blocking indefinitely when the room in question is a partial-stated room. ([\matrix-org#13583](matrix-org#13583))
- Fix loading the current stream position behind the actual position. ([\matrix-org#13585](matrix-org#13585))
- Fix a longstanding bug in `register_new_matrix_user` which meant it was always necessary to explicitly give a server URL. ([\matrix-org#13616](matrix-org#13616))
- Fix the running of [MSC1763](matrix-org/matrix-spec-proposals#1763) retention purge_jobs in deployments with background jobs running on a worker by forcing them back onto the main worker. Contributed by Brad @ Beeper. ([\matrix-org#13632](matrix-org#13632))
- Fix a long-standing bug that downloaded media for URL previews was not deleted while database background updates were running. ([\matrix-org#13657](matrix-org#13657))
- Fix [MSC3030](matrix-org/matrix-spec-proposals#3030) `/timestamp_to_event` endpoint to return the correct next event when the events have the same timestamp. ([\matrix-org#13658](matrix-org#13658))
- Fix bug where we wedge media plugins if clients disconnect early. Introduced in v1.22.0. ([\matrix-org#13660](matrix-org#13660))
- Fix a long-standing bug which meant that keys for unwhitelisted servers were not returned by `/_matrix/key/v2/query`. ([\matrix-org#13683](matrix-org#13683))
- Fix a bug introduced in Synapse v1.20.0 that would cause the unstable unread counts from [MSC2654](matrix-org/matrix-spec-proposals#2654) to be calculated even if the feature is disabled. ([\matrix-org#13694](matrix-org#13694))

Updates to the Docker image
---------------------------

- Update docker image to use a stable version of poetry. ([\matrix-org#13688](matrix-org#13688))

Improved Documentation
----------------------

- Improve the description of the ["chain cover index"](https://matrix-org.github.io/synapse/latest/auth_chain_difference_algorithm.html) used internally by Synapse. ([\matrix-org#13602](matrix-org#13602))
- Document how ["monthly active users"](https://matrix-org.github.io/synapse/latest/usage/administration/monthly_active_users.html) is calculated and used. ([\matrix-org#13617](matrix-org#13617))
- Improve documentation around user registration. ([\matrix-org#13640](matrix-org#13640))
- Remove documentation of legacy `frontend_proxy` worker app. ([\matrix-org#13645](matrix-org#13645))
- Clarify documentation that HTTP replication traffic can be protected with a shared secret. ([\matrix-org#13656](matrix-org#13656))
- Remove unintentional colons from [config manual](https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html) headers. ([\matrix-org#13665](matrix-org#13665))
- Update docs to make enabling metrics more clear. ([\matrix-org#13678](matrix-org#13678))
- Clarify `(room_id, event_id)` global uniqueness and how we should scope our database schemas. ([\matrix-org#13701](matrix-org#13701))

Deprecations and Removals
-------------------------

- Drop support for calling `/_matrix/client/v3/rooms/{roomId}/invite` without an `id_access_token`, which was not permitted by the spec. Contributed by @Vetchu. ([\matrix-org#13241](matrix-org#13241))
- Remove redundant `_get_joined_users_from_context` cache. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13569](matrix-org#13569))
- Remove the ability to use direct TCP replication with workers. Direct TCP replication was deprecated in Synapse v1.18.0. Workers now require using Redis. ([\matrix-org#13647](matrix-org#13647))
- Remove support for unstable [private read receipts](matrix-org/matrix-spec-proposals#2285). ([\matrix-org#13653](matrix-org#13653), [\matrix-org#13692](matrix-org#13692))

Internal Changes
----------------

- Extend the release script to wait for GitHub Actions to finish and to be usable as a guide for the whole process. ([\matrix-org#13483](matrix-org#13483))
- Add experimental configuration option to allow disabling legacy Prometheus metric names. ([\matrix-org#13540](matrix-org#13540))
- Cache user IDs instead of profiles to reduce cache memory usage. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13573](matrix-org#13573), [\matrix-org#13600](matrix-org#13600))
- Optimize how Synapse calculates domains to fetch from during backfill. ([\matrix-org#13575](matrix-org#13575))
- Comment about a better future where we can get the state diff between two events. ([\matrix-org#13586](matrix-org#13586))
- Instrument `_check_sigs_and_hash_and_fetch` to trace time spent in child concurrent calls for understandable traces in Jaeger. ([\matrix-org#13588](matrix-org#13588))
- Improve performance of `@cachedList`. ([\matrix-org#13591](matrix-org#13591))
- Minor speed up of fetching large numbers of push rules. ([\matrix-org#13592](matrix-org#13592))
- Optimise push action fetching queries. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13597](matrix-org#13597))
- Rename `event_map` to `unpersisted_events` when computing the auth differences. ([\matrix-org#13603](matrix-org#13603))
- Refactor `get_users_in_room(room_id)` mis-use with dedicated `get_current_hosts_in_room(room_id)` function. ([\matrix-org#13605](matrix-org#13605))
- Use dedicated `get_local_users_in_room(room_id)` function to find local users when calculating `join_authorised_via_users_server` of a `/make_join` request. ([\matrix-org#13606](matrix-org#13606))
- Refactor `get_users_in_room(room_id)` mis-use to lookup single local user with dedicated `check_local_user_in_room(...)` function. ([\matrix-org#13608](matrix-org#13608))
- Drop unused column `application_services_state.last_txn`. ([\matrix-org#13627](matrix-org#13627))
- Improve readability of Complement CI logs by printing failure results last. ([\matrix-org#13639](matrix-org#13639))
- Generalise the `@cancellable` annotation so it can be used on functions other than just servlet methods. ([\matrix-org#13662](matrix-org#13662))
- Introduce a `CommonUsageMetrics` class to share some usage metrics between the Prometheus exporter and the phone home stats. ([\matrix-org#13671](matrix-org#13671))
- Add some logging to help track down matrix-org#13444. ([\matrix-org#13679](matrix-org#13679))
- Update poetry lock file for v1.2.0. ([\matrix-org#13689](matrix-org#13689))
- Add cache to `is_partial_state_room`. ([\matrix-org#13693](matrix-org#13693))
- Update the Grafana dashboard that is included with Synapse in the `contrib` directory. ([\matrix-org#13697](matrix-org#13697))
- Only run trial CI on all python versions on non-PRs. ([\matrix-org#13698](matrix-org#13698))
- Fix typechecking with latest types-jsonschema. ([\matrix-org#13712](matrix-org#13712))
- Reduce number of CI checks we run for PRs. ([\matrix-org#13713](matrix-org#13713))

In order to better match the existing implementation, and to specify missing components that are needed for a working end-to-end implementation.
@jaonoctus

Any updates on this?

@babolivier
Contributor

> Any updates on this?

I've pushed a significant rewrite of the MSC last month, the next step is fixing the Synapse implementation to

(or implement it in a different server implementation)

It also needs implementing in a client. Once both the server and client implementations fully exist, the MSC can move forward.

@flatsponge

any updates? it has been almost half a year :(

@x48115

x48115 commented Jul 28, 2023

I'm also interested in per-room message retention policies

@natecovington

My solution was to move my Matrix server from a "cloud VPS" to a selfhosted mini-PC that I'm running out of my home office. I use a $5/month cloud VPS to use the public IP address and securely tunnel to my mini-PC at home. That way I don't have to pay for all my storage in the cloud every month.
https://www.covingtoncreations.com/blog/decentralized-web-app-self-hosting

My other thought was that if a room gets "too big" you can create a fresh room, boot all the users, and I'm pretty sure Synapse will automatically purge and remove the old room and related media after some time. I haven't tried this personally, because I solved the problem by switching the server location. A small form factor PC or mini-NUC with a 512SSD is cheap compared to what a decent amount of disk space costs per month on a cloud VPS.

@sallyFoster

Damn, this has been petitioned for 5 years and we still don't have it...

@natecovington

I have a scenario for you to illustrate how this gets more complicated than what you see on the surface. Suppose we have a matrix room with a lot of history, media, etc... and we're dealing with storage space issues.

Now, suppose we have two different matrix servers federating that same room. One homeserver is on a tight budget and only has enough room for X. The second homeserver has unlimited budget.

How do we make it so that a change you make in Element on one homeserver determines what happens to the room history and media on another homeserver? In terms of record keeping, room history, etc... this gets complicated quickly. We don't want to tell one server to remove files just because another server is running out of space, etc.

@otech47

otech47 commented Nov 27, 2024

ACK, would be a great feature to have this. Does anyone know to what degree this is already implemented in Synapse? Max retention only? Found these

@anoadragon453
Member

@otech47 Synapse's message retention documentation contains info about its current status.

Labels
kind:core (MSC which is critical to the protocol's success)
needs-implementation (This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP.)
proposal (A matrix spec change proposal)