synapse_port_db does not correctly create indexes #4877
The indexes in question are created by background updates, and we don't run the background updates on a migration. The best fix for now is probably to create a new schema snapshot; however, it would be good to think about how we could avoid this becoming a problem again in the future. We'll also need to think about how to put everyone's schema back together again now that we have messed it up. |
I get the following error as well (after applying the top post's corrections), same as in #2210. The error says: … Does that mean, for this, … ? Where can I look up how it should look, to sort it out once and for all? Thanks |
With Synapse 0.99.3 (schema version 53), this is how to fix the database (I diffed the schemas from a normal PostgreSQL installation and from one migrated from SQLite):
CREATE INDEX access_tokens_device_id ON public.access_tokens USING btree (user_id, device_id);
CREATE INDEX current_state_events_member_index ON public.current_state_events USING btree (state_key) WHERE (type = 'm.room.member'::text);
DROP INDEX device_inbox_stream_id;
CREATE INDEX device_inbox_stream_id_user_id ON public.device_inbox USING btree (stream_id, user_id);
CREATE UNIQUE INDEX device_lists_remote_cache_unique_id ON public.device_lists_remote_cache USING btree (user_id, device_id);
CREATE UNIQUE INDEX device_lists_remote_extremeties_unique_idx ON public.device_lists_remote_extremeties USING btree (user_id);
CREATE INDEX device_lists_stream_user_id ON public.device_lists_stream USING btree (user_id, device_id);
CREATE INDEX event_contains_url_index ON public.events USING btree (room_id, topological_ordering, stream_ordering) WHERE ((contains_url = true) AND (outlier = false));
CREATE INDEX event_push_actions_highlights_index ON public.event_push_actions USING btree (user_id, room_id, topological_ordering, stream_ordering) WHERE (highlight = 1);
CREATE INDEX event_push_actions_u_highlight ON public.event_push_actions USING btree (user_id, stream_ordering);
CREATE UNIQUE INDEX event_search_event_id_idx ON public.event_search USING btree (event_id);
CREATE INDEX event_to_state_groups_sg_index ON public.event_to_state_groups USING btree (state_group);
CREATE INDEX local_media_repository_url_idx ON public.local_media_repository USING btree (created_ts) WHERE (url_cache IS NOT NULL);
DROP INDEX state_groups_state_id;
CREATE INDEX state_groups_state_type_idx ON public.state_groups_state USING btree (state_group, type, state_key);
CREATE INDEX user_ips_device_id ON public.user_ips USING btree (user_id, device_id, last_seen);
CREATE INDEX user_ips_last_seen ON public.user_ips USING btree (user_id, last_seen);
CREATE INDEX user_ips_last_seen_only ON public.user_ips USING btree (last_seen);
DROP INDEX user_ips_user_ip;
CREATE UNIQUE INDEX user_ips_user_token_ip_unique_index ON public.user_ips USING btree (user_id, access_token, ip);
CREATE INDEX users_creation_ts ON public.users USING btree (creation_ts);
After running this, the two schemas are exactly the same (except for the presence of the …). |
Thank you, @huguesdk! Can you also post how you compared the schemas, and the expected pristine schema of 0.99.3, so I can check mine? (I have run your fixes, but I don't know which Synapse version I was running when I migrated from SQLite, so the fixes I need might be different from yours.) |
Glad to help! I’m running PostgreSQL in a Docker container, so to dump the schema I use:
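Something along these lines should do it (the exact command wasn't captured; this assumes the container is called postgres and both the database and its user are named synapse, so adjust to taste):
# dump only the schema (no data) from PostgreSQL running inside Docker
docker exec postgres pg_dump --schema-only --no-owner -U synapse synapse > schema_migrated.sql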
If you are not using Docker, just use the last part:
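That is, presumably just the pg_dump invocation on its own (same assumptions as above):
pg_dump --schema-only --no-owner -U synapse synapse > schema_migrated.sql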
I simply compared the two schemas with diff:
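For example (the file names here are placeholders):
diff -u schema_clean.sql schema_migrated.sql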
The expected schema is here. (It’s gzipped, because GitHub doesn’t support attaching .sql files.) |
Thanks @huguesdk, had the same issue after migrating from Sqlite to Postgres, this solved my problem also. |
@huguesdk I get this error when I run the first query: … |
@jayavanth: Was your database migrated from SQLite? To be sure what's missing, dump the schema and diff it as explained above. What issue do you actually have? |
@huguesdk Here's the diff https://pastebin.com/hK3Nv5Dm. We can ignore the db username in the diff right? (Posted this in Synapse Admins but it got lost in the threads)
I get 0 rows even when I do … So that's why I think something is wrong with my port. |
@jayavanth: Looking at your diff, it seems that the only difference (besides the database username) is … I have no experience in purging older messages, and I know little about the database structure. Did you actually do any operation on the database to purge older messages? If you did and still see the older messages in Riot, maybe it's because they are still in Riot's cache? Did you try clearing the cache? In Riot web, you can do this by going to Settings > Help & About > Clear Cache and Reload. |
@huguesdk No, I didn't do any operation to purge it. Just cleared the cache but I can still see the messages 🙁. What's a good place to ask this question? A new issue maybe? |
@jayavanth: Are you trying to purge messages manually by making changes directly in the database? I would advise against this (unless you know the structure very well). Did you take a look at the Purge History API? Currently, the best place to ask questions about this is in the Synapse Admins room. Unfortunately, questions quickly get lost in the conversation, since there is no proper threading yet. I think that a forum (like Discourse, for example) would be more appropriate for this, but currently there is none. |
@huguesdk I'm using this script https://github.com/matrix-org/synapse/blob/master/contrib/purge_api/purge_history.sh which calls the Purge History API. The script does a similar query to what I described above. Totally agree with having different channels to avoid messages getting lost in the pile |
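For reference, the underlying admin API call looks roughly like this on current Synapse versions (the room ID, timestamp and token below are placeholders; older releases exposed the same API under /_matrix/client/r0/admin/ instead of /_synapse/admin/v1/):
# purge all events in the room before the given timestamp (ms since epoch)
curl -X POST \
  -H "Authorization: Bearer <admin_access_token>" \
  -d '{"delete_local_events": false, "purge_up_to_ts": 1500000000000}' \
  'https://homeserver.example/_synapse/admin/v1/purge_history/!roomid:example.org'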
I have 2 missing indices which cannot be created because of duplicate values (the unique indexes on device_lists_remote_cache and device_lists_remote_extremeties):
Is it safe to shut down synapse and then wipe the 2 tables as they seem to be cache tables only? |
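To see how bad the duplication is before deleting anything, a query along these lines should list the offending rows (a sketch for device_lists_remote_cache; the column pair matches the unique index above, and the database/user names are assumptions):
psql -U synapse -d synapse -c "SELECT user_id, device_id, count(*) FROM device_lists_remote_cache GROUP BY user_id, device_id HAVING count(*) > 1;"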
Further to my earlier message, I can confirm that the following made synapse properly create the indices automatically, with no adverse effects that I have seen - but please see the next message for a warning on federated e2e encrypted chats:
DELETE FROM public.device_lists_remote_cache;
DELETE FROM public.device_lists_remote_extremeties;
DELETE FROM public.user_ips;
I hope I didn't break anything but so far it looks good. |
Just as a warning to others tempted to try the same thing: this has the potential to break e2e messaging. If you're not using e2e over federation, you'll be fine. Otherwise, you get to keep both parts when it breaks ;) |
Do you by chance have a safe set of sql statements that can be run to clean things up? I am thinking of cleaning up duplicates and the like, so that the indices can be created properly? |
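Not an official recipe, but the standard PostgreSQL pattern for this keeps one row per duplicate group by comparing the physical row id (ctid); here is an untested sketch for user_ips (database and user names are assumptions):
# delete all but one row from each (user_id, access_token, ip) group
psql -U synapse -d synapse -c "DELETE FROM user_ips a USING user_ips b WHERE a.ctid < b.ctid AND a.user_id = b.user_id AND a.access_token = b.access_token AND a.ip = b.ip;"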
@ara4n @richvdh @erikjohnston
Finally, two sql dumps (if anybody is interested): |
What's the recommended action to take after such a failed migration? Continue using the sqlite db or is it safe to use the "broken" postgres db? |
I just applied @huguesdk's schema changes after running a few days with the wrong schema, after migrating from SQLite, so I'm very much about to find out. |
I did some investigation and fiddling around with this, but won't be able to go much further due to a lack of time to dedicate to it. Here's what we've decided needs to be done to fix this issue: block the port if the SQLite database still has pending background updates, and have synapse_port_db run the background updates itself on the ported database.
To do the latter, we need to create a new class in Synapse that holds the code for all background updates but doesn't require access to the … |
I have a similar problem. I cannot upload/attach files anymore due to missing indices. This, however, only manifested after upgrading to synapse v1.2.1. I would very much appreciate a quick fix which re-establishes the mandatory db structure. |
@jptsroh did you try the SQL script in huguesdk's comment? |
After I chose the right db I could execute those sql statements. Now I can upload files again. Thanks!
ERROR: duplicate key value violates unique constraint "device_uniqueness" |
Good news, a fix to this is on its way: #6102! :) |
Make `synapse_port_db` correctly create indexes in the PostgreSQL database, by having it run the background updates on the database before migrating the data. To ensure we're migrating the right data, also block the port if the SQLite3 database still has pending or ongoing background updates. Fixes #4877
Awesome! Which version will this be included in? 1.4.2, 1.5.0? |
1.5.0, which should be out in one or two weeks (don't take my word for it though) |
I'm going to reopen this I'm afraid, because I feel like an important part of it (helping out all those people who have migrated to postgres at some point in the last 4 years) hasn't been done. |
Since it might be related and might help people out here: I also had wrong behaviour. Namely, read markers didn't work correctly and (way more dangerous for the network, if my instance wasn't the only one doing it) the timeout for destinations didn't get saved properly. It turns out some constraints were missing there as well. I apologize to anyone whose instance mine might have spammed while theirs was already struggling. @richvdh Do you think that there is a way to find other instances with that behaviour (if there are any) and notify their administrators? |
A good way to do that would be to configure a future version of synapse to detect inconsistencies on first startup and correct them, or at least produce a suitably informative fatal error. |
the initial plan here is to produce a document to help admins fix their databases. Hopefully we'll get to that in the next few days. Automated detection of the missing indexes might happen later, but there are no current plans to implement that. |
I think a document to fix the indexes is appropriate, but I do think the bug I found with the missing constraints leads to way more serious behaviour, since it floods the network with requests to hosts that are already struggling to answer in time. I think it would be more appropriate to not start an instance at all (at least without a switch) than to just let it continue to do harm to the network. |
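In the meantime, a quick way for admins to check by hand whether their database is affected (a sketch, assuming psql access and a database/user named synapse; an affected database returns zero rows here):
psql -U synapse -d synapse -c "SELECT indexname FROM pg_indexes WHERE tablename = 'user_ips' AND indexname = 'user_ips_user_token_ip_unique_index';"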
I'll try and make some progress here. Here are some schema dumps for various clean databases: … |
Diffs between the ported database and the clean db are at https://gist.github.com/richvdh/f1d84edf7c3da1fce2347675dd3d55e5. So, to fix a broken database, you should be able to run the following sql in psql:
CREATE INDEX IF NOT EXISTS access_tokens_device_id ON access_tokens USING btree (user_id, device_id);
CREATE INDEX IF NOT EXISTS current_state_events_member_index ON current_state_events USING btree (state_key) WHERE (type = 'm.room.member'::text);
CREATE INDEX IF NOT EXISTS device_inbox_stream_id_user_id ON device_inbox USING btree (stream_id, user_id);
DROP INDEX IF EXISTS device_inbox_stream_id;
CREATE UNIQUE INDEX IF NOT EXISTS device_lists_remote_cache_unique_id ON device_lists_remote_cache USING btree (user_id, device_id);
CREATE UNIQUE INDEX IF NOT EXISTS device_lists_remote_extremeties_unique_idx ON device_lists_remote_extremeties USING btree (user_id);
CREATE INDEX IF NOT EXISTS device_lists_stream_user_id ON device_lists_stream USING btree (user_id, device_id);
CREATE INDEX IF NOT EXISTS event_contains_url_index ON events USING btree (room_id, topological_ordering, stream_ordering) WHERE ((contains_url = true) AND (outlier = false));
CREATE INDEX IF NOT EXISTS event_push_actions_highlights_index ON event_push_actions USING btree (user_id, room_id, topological_ordering, stream_ordering) WHERE (highlight = 1);
CREATE INDEX IF NOT EXISTS event_push_actions_u_highlight ON event_push_actions USING btree (user_id, stream_ordering);
CREATE UNIQUE INDEX IF NOT EXISTS event_search_event_id_idx ON event_search USING btree (event_id);
CREATE INDEX IF NOT EXISTS event_to_state_groups_sg_index ON event_to_state_groups USING btree (state_group);
CREATE INDEX IF NOT EXISTS local_media_repository_url_idx ON local_media_repository USING btree (created_ts) WHERE (url_cache IS NOT NULL);
CREATE INDEX IF NOT EXISTS state_groups_state_type_idx ON state_groups_state USING btree (state_group, type, state_key);
DROP INDEX IF EXISTS state_groups_state_id;
CREATE UNIQUE INDEX IF NOT EXISTS user_ips_user_token_ip_unique_index ON user_ips USING btree (user_id, access_token, ip);
CREATE INDEX IF NOT EXISTS user_ips_device_id ON user_ips USING btree (user_id, device_id, last_seen);
CREATE INDEX IF NOT EXISTS user_ips_last_seen ON user_ips USING btree (user_id, last_seen);
CREATE INDEX IF NOT EXISTS user_ips_last_seen_only ON user_ips USING btree (last_seen);
DROP INDEX IF EXISTS user_ips_user_ip;
CREATE INDEX IF NOT EXISTS users_creation_ts ON users USING btree (creation_ts);
CREATE INDEX IF NOT EXISTS room_memberships_user_room_forgotten ON room_memberships USING btree (user_id, room_id) WHERE (forgotten = 1);
Note that some of these tables (especially state_groups_state and user_ips) can be very large, so creating the indexes may take a long time. Note also that some of the CREATE UNIQUE INDEX statements will fail with duplicate-key errors if duplicate rows crept in while the unique index was missing (see the earlier comments about device_lists_remote_cache, device_lists_remote_extremeties and user_ips). |
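To apply all of the above in one go, one option (a sketch; it assumes the statements were saved to fix_indexes.sql and that the database and user are named synapse) is:
# plain CREATE INDEX blocks writes to the table, so consider stopping Synapse first
psql -U synapse -d synapse -f fix_indexes.sql
Since some of the index builds can take a long time on big tables, running this inside screen or tmux is probably wise.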
I'd welcome any feedback from administrators who have success or otherwise running the above. |
so we think the above looks about right. It would be nice to figure out a way to either automate the repair process, or at least point people to the commands when necessary, but I'm going to consider that as a potential issue for the future and close this one. |
There are a few reports of people having migrated their database from sqlite to postgres and ending up with missing or completely wrong indexes.
Tables known to be affected include:
- user_ips (missing unique index so upserts fail)
- device_lists_remote_cache (missing unique index so upserts fail)
- state_groups_state (index is on (state_group) instead of (state_group, type, state_key))