synapse_port_db does not correctly create indexes #4877
The indexes in question are created by background updates, and we don't run the background updates on a migration. The best fix for now is probably to create a new schema snapshot; however, it would be good to think about how we could avoid this becoming a problem again in the future. We'll also need to think about how to put everyone's schema back together again now that we have messed it up. |
I get the following error as well (after applying the top post's corrections), same as in #2210. The error says: … Does that mean, for this, … ? Where can I look up how it should look, to sort it out once and for all? Thanks |
With Synapse 0.99.3 (schema version 53), this is how to fix the database (I diffed the schemas from a normal PostgreSQL installation and from one migrated from SQLite):
CREATE INDEX access_tokens_device_id ON public.access_tokens USING btree (user_id, device_id);
CREATE INDEX current_state_events_member_index ON public.current_state_events USING btree (state_key) WHERE (type = 'm.room.member'::text);
DROP INDEX device_inbox_stream_id;
CREATE INDEX device_inbox_stream_id_user_id ON public.device_inbox USING btree (stream_id, user_id);
CREATE UNIQUE INDEX device_lists_remote_cache_unique_id ON public.device_lists_remote_cache USING btree (user_id, device_id);
CREATE UNIQUE INDEX device_lists_remote_extremeties_unique_idx ON public.device_lists_remote_extremeties USING btree (user_id);
CREATE INDEX device_lists_stream_user_id ON public.device_lists_stream USING btree (user_id, device_id);
CREATE INDEX event_contains_url_index ON public.events USING btree (room_id, topological_ordering, stream_ordering) WHERE ((contains_url = true) AND (outlier = false));
CREATE INDEX event_push_actions_highlights_index ON public.event_push_actions USING btree (user_id, room_id, topological_ordering, stream_ordering) WHERE (highlight = 1);
CREATE INDEX event_push_actions_u_highlight ON public.event_push_actions USING btree (user_id, stream_ordering);
CREATE UNIQUE INDEX event_search_event_id_idx ON public.event_search USING btree (event_id);
CREATE INDEX event_to_state_groups_sg_index ON public.event_to_state_groups USING btree (state_group);
CREATE INDEX local_media_repository_url_idx ON public.local_media_repository USING btree (created_ts) WHERE (url_cache IS NOT NULL);
DROP INDEX state_groups_state_id;
CREATE INDEX state_groups_state_type_idx ON public.state_groups_state USING btree (state_group, type, state_key);
CREATE INDEX user_ips_device_id ON public.user_ips USING btree (user_id, device_id, last_seen);
CREATE INDEX user_ips_last_seen ON public.user_ips USING btree (user_id, last_seen);
CREATE INDEX user_ips_last_seen_only ON public.user_ips USING btree (last_seen);
DROP INDEX user_ips_user_ip;
CREATE UNIQUE INDEX user_ips_user_token_ip_unique_index ON public.user_ips USING btree (user_id, access_token, ip);
CREATE INDEX users_creation_ts ON public.users USING btree (creation_ts);
After running this, the two schemas are exactly the same (except for the presence of the …). |
Thank you, @huguesdk! Can you also post how you compared the schemas, and the expected pristine schema of 0.99.3, so I can check mine? (I have run your fixes, but I don't know which Synapse version I was running when I migrated from SQLite, so the fixes I need might be different from yours.) |
Glad to help! I’m running PostgreSQL in a Docker container, so to dump the schema I use:
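Something along these lines should do it (the exact command wasn't captured; this assumes the container is called postgres and both the database and its user are named synapse, so adjust to taste):
# dump only the schema (no data) from PostgreSQL running inside Docker
docker exec postgres pg_dump --schema-only --no-owner -U synapse synapse > schema_migrated.sql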
If you are not using Docker, just use the last part:
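That is, presumably just the pg_dump invocation on its own (same assumptions as above):
pg_dump --schema-only --no-owner -U synapse synapse > schema_migrated.sql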
I simply compared the two schemas with diff:
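For example (the file names here are placeholders):
diff -u schema_clean.sql schema_migrated.sql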
The expected schema is here. (It’s gzipped, because GitHub doesn’t support attaching .sql files.) |
Thanks @huguesdk, had the same issue after migrating from Sqlite to Postgres, this solved my problem also. |
@huguesdk I get this error when I run the first query: … |
@jayavanth: Was your database migrated from SQLite? To be sure what's missing, dump the schema and diff it as explained above. What issue do you actually have? |
@huguesdk Here's the diff https://pastebin.com/hK3Nv5Dm. We can ignore the db username in the diff right? (Posted this in Synapse Admins but it got lost in the threads)
I get 0 rows even when I do … So that's why I think something is wrong with my port. |
@jayavanth: Looking at your diff, it seems that the only difference (besides the database username) is … I have no experience in purging older messages, and I know little about the database structure. Did you actually do any operation on the database to purge older messages? If you did and still see the older messages in Riot, maybe it's because they are still in Riot's cache? Did you try clearing the cache? In Riot web, you can do this by going to Settings > Help & About > Clear Cache and Reload. |
@huguesdk No, I didn't do any operation to purge it. Just cleared the cache but I can still see the messages 🙁. What's a good place to ask this question? A new issue maybe? |
@jayavanth: Are you trying to purge messages manually by making changes directly in the database? I would advise against this (unless you know the structure very well). Did you take a look at the Purge History API? Currently, the best place to ask questions about this is in the Synapse Admins room. Unfortunately, questions quickly get lost in the conversation, since there is no proper threading yet. I think that a forum (like Discourse, for example) would be more appropriate for this, but currently there is none. |
@huguesdk I'm using this script https://github.com/matrix-org/synapse/blob/master/contrib/purge_api/purge_history.sh which calls the Purge History API. The script does a similar query to what I described above. Totally agree with having different channels to avoid messages getting lost in the pile |
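For reference, the underlying admin API call looks roughly like this on current Synapse versions (the room ID, timestamp and token below are placeholders; older releases exposed the same API under /_matrix/client/r0/admin/ instead of /_synapse/admin/v1/):
# purge all events in the room before the given timestamp (ms since epoch)
curl -X POST \
  -H "Authorization: Bearer <admin_access_token>" \
  -d '{"delete_local_events": false, "purge_up_to_ts": 1500000000000}' \
  'https://homeserver.example/_synapse/admin/v1/purge_history/!roomid:example.org'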
I have 2 missing indices which cannot be created because of duplicate values (the unique indexes on device_lists_remote_cache and device_lists_remote_extremeties):
Is it safe to shut down synapse and then wipe the 2 tables as they seem to be cache tables only? |
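To see how bad the duplication is before deleting anything, a query along these lines should list the offending rows (a sketch for device_lists_remote_cache; the column pair matches the unique index above, and the database/user names are assumptions):
psql -U synapse -d synapse -c "SELECT user_id, device_id, count(*) FROM device_lists_remote_cache GROUP BY user_id, device_id HAVING count(*) > 1;"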
Further to my earlier message, I can confirm that the following made synapse properly create the indices automatically, with no adverse effects that I have seen - but please see the next message for a warning on federated e2e encrypted chats:
DELETE FROM public.device_lists_remote_cache;
DELETE FROM public.device_lists_remote_extremeties;
DELETE FROM public.user_ips;
I hope I didn't break anything but so far it looks good. |
Just as a warning to others tempted to try the same thing: this has the potential to break e2e messaging. If you're not using e2e over federation, you'll be fine. Otherwise, you get to keep both parts when it breaks ;) |
Do you by chance have a safe set of sql statements that can be run to clean things up? I am thinking of cleaning up duplicates and the like, so that the indices can be created properly? |
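Not an official recipe, but the standard PostgreSQL pattern for this keeps one row per duplicate group by comparing the physical row id (ctid); here is an untested sketch for user_ips (database and user names are assumptions):
# delete all but one row from each (user_id, access_token, ip) group
psql -U synapse -d synapse -c "DELETE FROM user_ips a USING user_ips b WHERE a.ctid < b.ctid AND a.user_id = b.user_id AND a.access_token = b.access_token AND a.ip = b.ip;"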
@ara4n @richvdh @erikjohnston
Finally, two sql dumps (if anybody is interested): |
What's the recommended action to take after such a failed migration? Continue using the sqlite db or is it safe to use the "broken" postgres db? |
I just applied @huguesdk's schema changes after running a few days with the wrong schema, after migrating from SQLite, so I'm very much about to find out. |
I did some investigation and fiddling around with this, but won't be able to go much further due to a lack of time to dedicate to it. Here's what we've decided needs to be done to fix this issue: block the port if the SQLite database still has pending background updates, and have synapse_port_db run the background updates itself on the ported database.
To do the latter, we need to create a new class in Synapse that holds the code for all background updates but doesn't require access to the … |
I have a similar problem. I cannot upload/attach files anymore due to missing indices. This, however, only manifested after upgrading to synapse v1.2.1. I would very much appreciate a quick fix which re-establishes the mandatory db structure. |
@jptsroh did you try the SQL script in huguesdk's comment? |
After I chose the right db I could execute those sql statements. Now I can upload files again. Thanks!
ERROR: duplicate key value violates unique constraint "device_uniqueness" |
Good news, a fix to this is on its way: #6102! :) |
Make `synapse_port_db` correctly create indexes in the PostgreSQL database, by having it run the background updates on the database before migrating the data. To ensure we're migrating the right data, also block the port if the SQLite3 database still has pending or ongoing background updates. Fixes #4877
Awesome! Which version will this be included in? 1.4.2, 1.5.0? |
1.5.0, which should be out in one or two weeks (don't take my word for it though) |
I'm going to reopen this I'm afraid, because I feel like an important part of it (helping out all those people who have migrated to postgres at some point in the last 4 years) hasn't been done. |
Since it might be related and might help people out here: I also had wrong behaviour. Namely, read markers didn't work correctly and (way more dangerous for the network, if my instance wasn't the only one doing it) the timeout for destinations didn't get saved properly. It turns out some constraints were missing there as well. I apologize to anyone whose instance mine might have spammed while theirs was already struggling. @richvdh Do you think that there is a way to find other instances with that behaviour (if there are any) and notify their administrators? |
A good way to do that would be to configure a future version of synapse to detect inconsistencies on first startup and correct them, or at least produce a suitably informative fatal error. |
the initial plan here is to produce a document to help admins fix their databases. Hopefully we'll get to that in the next few days. Automated detection of the missing indexes might happen later, but there are no current plans to implement that. |
I think a document to fix the indexes is appropriate, but I do think the bug I found with the missing constraints leads to way more serious behaviour, since it floods the network with requests to hosts that are already struggling to answer in time. I think it would be more appropriate to not start an instance at all (at least without a switch) than to just let it continue to do harm to the network. |
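In the meantime, a quick way for admins to check by hand whether their database is affected (a sketch, assuming psql access and a database/user named synapse; an affected database returns zero rows here):
psql -U synapse -d synapse -c "SELECT indexname FROM pg_indexes WHERE tablename = 'user_ips' AND indexname = 'user_ips_user_token_ip_unique_index';"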
I'll try and make some progress here. Here are some schema dumps for various clean databases: … |
Diffs between the ported database and the clean db are at https://gist.github.com/richvdh/f1d84edf7c3da1fce2347675dd3d55e5. So, to fix a broken database, you should be able to run the following sql in psql:
CREATE INDEX IF NOT EXISTS access_tokens_device_id ON access_tokens USING btree (user_id, device_id);
CREATE INDEX IF NOT EXISTS current_state_events_member_index ON current_state_events USING btree (state_key) WHERE (type = 'm.room.member'::text);
CREATE INDEX IF NOT EXISTS device_inbox_stream_id_user_id ON device_inbox USING btree (stream_id, user_id);
DROP INDEX IF EXISTS device_inbox_stream_id;
CREATE UNIQUE INDEX IF NOT EXISTS device_lists_remote_cache_unique_id ON device_lists_remote_cache USING btree (user_id, device_id);
CREATE UNIQUE INDEX IF NOT EXISTS device_lists_remote_extremeties_unique_idx ON device_lists_remote_extremeties USING btree (user_id);
CREATE INDEX IF NOT EXISTS device_lists_stream_user_id ON device_lists_stream USING btree (user_id, device_id);
CREATE INDEX IF NOT EXISTS event_contains_url_index ON events USING btree (room_id, topological_ordering, stream_ordering) WHERE ((contains_url = true) AND (outlier = false));
CREATE INDEX IF NOT EXISTS event_push_actions_highlights_index ON event_push_actions USING btree (user_id, room_id, topological_ordering, stream_ordering) WHERE (highlight = 1);
CREATE INDEX IF NOT EXISTS event_push_actions_u_highlight ON event_push_actions USING btree (user_id, stream_ordering);
CREATE UNIQUE INDEX IF NOT EXISTS event_search_event_id_idx ON event_search USING btree (event_id);
CREATE INDEX IF NOT EXISTS event_to_state_groups_sg_index ON event_to_state_groups USING btree (state_group);
CREATE INDEX IF NOT EXISTS local_media_repository_url_idx ON local_media_repository USING btree (created_ts) WHERE (url_cache IS NOT NULL);
CREATE INDEX IF NOT EXISTS state_groups_state_type_idx ON state_groups_state USING btree (state_group, type, state_key);
DROP INDEX IF EXISTS state_groups_state_id;
CREATE UNIQUE INDEX IF NOT EXISTS user_ips_user_token_ip_unique_index ON user_ips USING btree (user_id, access_token, ip);
CREATE INDEX IF NOT EXISTS user_ips_device_id ON user_ips USING btree (user_id, device_id, last_seen);
CREATE INDEX IF NOT EXISTS user_ips_last_seen ON user_ips USING btree (user_id, last_seen);
CREATE INDEX IF NOT EXISTS user_ips_last_seen_only ON user_ips USING btree (last_seen);
DROP INDEX IF EXISTS user_ips_user_ip;
CREATE INDEX IF NOT EXISTS users_creation_ts ON users USING btree (creation_ts);
CREATE INDEX IF NOT EXISTS room_memberships_user_room_forgotten ON room_memberships USING btree (user_id, room_id) WHERE (forgotten = 1);
Note that some of these tables (especially state_groups_state and user_ips) can be very large, so creating the indexes may take a long time. Note also that some of the CREATE UNIQUE INDEX statements will fail with duplicate-key errors if duplicate rows crept in while the unique index was missing (see the earlier comments about device_lists_remote_cache, device_lists_remote_extremeties and user_ips). |
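To apply all of the above in one go, one option (a sketch; it assumes the statements were saved to fix_indexes.sql and that the database and user are named synapse) is:
# plain CREATE INDEX blocks writes to the table, so consider stopping Synapse first
psql -U synapse -d synapse -f fix_indexes.sql
Since some of the index builds can take a long time on big tables, running this inside screen or tmux is probably wise.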
I'd welcome any feedback from administrators who have success or otherwise running the above. |
so we think the above looks about right. It would be nice to figure out a way to either automate the repair process, or at least point people to the commands when necessary, but I'm going to consider that as a potential issue for the future and close this one. |
There are a few reports of people having migrated their database from sqlite to postgres and ending up with missing or completely wrong indexes.
Tables known to be affected include:
- user_ips (missing unique index so upserts fail)
- device_lists_remote_cache (missing unique index so upserts fail)
- state_groups_state (index is on (state_group) instead of (state_group, type, state_key))