Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

synapse_port_db fails when ui_auth_sessions table is non-empty #7654

Closed
gjsmo opened this issue Jun 8, 2020 · 7 comments · Fixed by #7711
Closed

synapse_port_db fails when ui_auth_sessions table is non-empty #7654

gjsmo opened this issue Jun 8, 2020 · 7 comments · Fixed by #7711
Assignees
Labels
z-bug (Deprecated Label) z-p2 (Deprecated Label)

Comments

@gjsmo
Copy link

gjsmo commented Jun 8, 2020

Description

Running synapse_port_db with the --curses option causes an error. This does not occur if the option is removed.

Steps to reproduce

  • Set up synapse using default sqlite3 database
  • Stop synapse
  • Run synapse_port_db script with --curses option.

This should port the SQlite database into PostgreSQL. However, it fails with the error below (mildly redacted).

(env) root@[HOST] ~/s/scripts (develop)# synapse_port_db --sqlite-database homeserver.db.bak --postgres-config homeserver-postgres.yaml --curses
Traceback (most recent call last):
  File "/root/synapse/env/bin/synapse_port_db", line 609, in run
    consumeErrors=True,
twisted.internet.defer.FirstError: FirstError[#24, [Failure instance: Traceback: <class 'psycopg2.errors.ForeignKeyViolation'>: insert or update on           table "ui_auth_sessions_credentials" violates foreign key constraint "ui_auth_sessions_credentials_session_id_fkey"
DETAIL:  Key (session_id)=([ID]) is not present in table "ui_auth_sessions".

/root/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py:501:errback
/root/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py:568:_startRunCallbacks
/root/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py:654:_runCallbacks
/root/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py:1475:gotResult
--- <exception caught here> ---
/root/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py:1416:_inlineCallbacks
/root/synapse/env/lib/python3.7/site-packages/twisted/python/failure.py:512:throwExceptionIntoGenerator
/root/synapse/env/bin/synapse_port_db:375:handle_table
/root/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py:1416:_inlineCallbacks
/root/synapse/env/lib/python3.7/site-packages/twisted/python/failure.py:512:throwExceptionIntoGenerator
/root/synapse/env/lib/python3.7/site-packages/synapse/storage/database.py:527:runInteraction
/root/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py:1416:_inlineCallbacks
/root/synapse/env/lib/python3.7/site-packages/twisted/python/failure.py:512:throwExceptionIntoGenerator
/root/synapse/env/lib/python3.7/site-packages/synapse/storage/database.py:575:runWithConnection
/root/synapse/env/lib/python3.7/site-packages/twisted/python/threadpool.py:250:inContext
/root/synapse/env/lib/python3.7/site-packages/twisted/python/threadpool.py:266:<lambda>
/root/synapse/env/lib/python3.7/site-packages/twisted/python/context.py:122:callWithContext
/root/synapse/env/lib/python3.7/site-packages/twisted/python/context.py:85:callWithContext
/root/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py:306:_runWithConnection
/root/synapse/env/lib/python3.7/site-packages/twisted/python/compat.py:464:reraise
/root/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py:297:_runWithConnection
/root/synapse/env/lib/python3.7/site-packages/synapse/storage/database.py:572:inner_func
/root/synapse/env/lib/python3.7/site-packages/synapse/storage/database.py:418:new_transaction
/root/synapse/env/bin/synapse_port_db:363:insert
/root/synapse/env/bin/synapse_port_db:175:insert_many_txn
/root/synapse/env/lib/python3.7/site-packages/synapse/storage/database.py:213:executemany
/root/synapse/env/lib/python3.7/site-packages/synapse/storage/database.py:236:_do_execute
]]
Traceback (most recent call last):
  File "/root/synapse/env/bin/synapse_port_db", line 1056, in <module>
    sys.stderr.write(end_error)
TypeError: write() argument must be str, not FirstError
(env) root@[HOST] ~/s/scripts (develop) [0|1]#

Version information

  • Homeserver: matrix.zettabomb.com
  • Version: 1.14.0
  • Install method: Docker build from source
  • Platform: VMware Photon OS 3.0
@erikjohnston erikjohnston added z-bug (Deprecated Label) z-p2 (Deprecated Label) Z-Help-Wanted We know exactly how to fix this issue, and would be grateful for any contribution and removed Z-Help-Wanted We know exactly how to fix this issue, and would be grateful for any contribution z-p2 (Deprecated Label) labels Jun 8, 2020
@erikjohnston
Copy link
Member

erikjohnston commented Jun 8, 2020

It should be easy to fix the error writing to sys.stderr.write. Though it looks like that is only masking another error. I split out that failure mode to #7655

@gjsmo Does it work without the --curses option? The first error looks like it would happen regardless to me. I wonder if the port script is broken for anyone with non-empty ui_auth_sessions

@erikjohnston erikjohnston added p1 z-p2 (Deprecated Label) and removed p1 labels Jun 8, 2020
@erikjohnston erikjohnston changed the title synapse_port_db fails when using --curses option synapse_port_db fails when ui_auth_sessions table is non-empty Jun 8, 2020
@clokep
Copy link
Member

clokep commented Jun 8, 2020

Can't say I know much about synapse_port_db, but I'm unsure if moving over the UI Auth session is even valuable? I assume that switching from sqlite to postgresql implies some downtime, so the sessions have a decent chance of becoming invalid anyway. The result for users would be that they would need to restart whatever operation they were doing (e.g. deleting a device).

@gjsmo
Copy link
Author

gjsmo commented Jun 8, 2020

It should be easy to fix the error writing to sys.stderr.write. Though it looks like that is only masking another error. I split out that failure mode to #7655

@gjsmo Does it work without the --curses option? The first error looks like it would happen regardless to me. I wonder if the port script is broken for anyone with non-empty ui_auth_sessions

It worked fine without the --curses option for me, although admittedly I only tried it on one DB. The presence/lack of the option was the only change, same SQlite DB file, same PostgreSQL instance etc.

Can't say I know much about synapse_port_db, but I'm unsure if moving over the UI Auth session is even valuable? I assume that switching from sqlite to postgresql implies some downtime, so the sessions have a decent chance of becoming invalid anyway. The result for users would be that they would need to restart whatever operation they were doing (e.g. deleting a device).

It does incur some downtime but not too much - for me it was about 20 minutes (out of which maybe 10 was spent on this particular bug). I think the officially instructions suggest to actually do it in a way that creates maybe 1 minute of downtime, by running the script once while Synapse remains up, then again with Synapse down to update whatever changes occurred during the initial migration (but this takes much less time). In general I see it's recommended to do this earlier than later so the SQlite DB is going to be fairly small too.

Do you mean that the login sessions would be invalidated or just the ongoing actions? That seems like a big-ish difference but I'm not sure what that table represents.

@erikjohnston
Copy link
Member

Can't say I know much about synapse_port_db, but I'm unsure if moving over the UI Auth session is even valuable? I assume that switching from sqlite to postgresql implies some downtime, so the sessions have a decent chance of becoming invalid anyway. The result for users would be that they would need to restart whatever operation they were doing (e.g. deleting a device).

I think ignoring the table is probably fine, the other option is to ensure that the port script copies tables over in the correct order to make the foreign constraints happy.

It worked fine without the --curses option for me, although admittedly I only tried it on one DB. The presence/lack of the option was the only change, same SQlite DB file, same PostgreSQL instance etc.

Weird. I can't see how it'd change things, though maybe there is a race or something that means it happens randomly or only on the first run or something

Do you mean that the login sessions would be invalidated or just the ongoing actions? That seems like a big-ish difference but I'm not sure what that table represents.

Just ongoing actions, i.e. if the user was in the middle of authenticating to delete a device (as that requires typing in your password again, for example) they'd have to retry. Login sessions wouldn't get deleted.

@clokep
Copy link
Member

clokep commented Jun 16, 2020

@richvdh I believe you said you believe the fix for this is straightforward, I think there's two options here:

  1. Ignore the two tables when porting between databases.
  2. Ensure that the ui_auth_sessions table comes before ui_auth_sessions_credentials table.

Did you have one of those options in mind or something else?

@richvdh
Copy link
Member

richvdh commented Jun 16, 2020

I had option 1 in mind.

@f1egmatik
Copy link

I think that we need add
"ui_auth_sessions_ips",
to IGNORED_TABLES.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
z-bug (Deprecated Label) z-p2 (Deprecated Label)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants