57/rooms_version_column_3.sql taking ages on upgrade #7144
Comments
Right, that update didn't take very long to run on matrix.org, so I assumed it would be fine, but apparently I was wrong, sorry about that :/ fwiw we usually do that kind of update in the background so it's not blocking, but this one fell into the small category of "we need that new info as soon as Synapse starts up so we need to run it synchronously". |
@babolivier fwiw it ran for 2 hours before I cancelled it and rolled back. If it only took a few minutes on matrix.org, then there's probably a missing index somewhere that Synapse needs to create :( |
As a general principle, yes I'd agree that we should call out db updates which we expect to be slow. However, that's not really actionable as an issue, so let's instead focus on why this particular update is slow for @turt2live . |
fwiw, I did run |
The (main) reason it wasn't slow on matrix.org might be that we had previously populated this column via a background update. Some history here, largely for my own reference:
|
The query plan should look like this:
so the problem is that it's sequence-scanning |
... which isn't surprising, given |
Hmm... even after creating the index it looks like postgres wants to do something else instead:
Does matrix.org also have specific query tuning to alter the plan? |
I don't think so. The query plan above looks relatively sane now. Does it still take ages? |
Nope, finished in seconds (~5). Guess I can update now :D |
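To illustrate the effect discussed above (a per-room lookup going from a full scan to an index lookup once a suitable index exists), here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for Postgres, so the plan text differs from what Postgres prints. The simplified `state_events` schema, the index name `state_events_room_type`, and the helper `plan_for_lookup` are assumptions made for illustration, not Synapse's real schema or the index that was created in this thread.

```python
import sqlite3


def plan_for_lookup(with_index: bool) -> str:
    """Return SQLite's query plan for a per-room create-event lookup."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, heavily simplified table; Synapse's real schema differs.
    conn.execute(
        "CREATE TABLE state_events "
        "(event_id TEXT, room_id TEXT, type TEXT, state_key TEXT)"
    )
    if with_index:
        # Hypothetical index covering the columns used in the WHERE clause.
        conn.execute(
            "CREATE INDEX state_events_room_type "
            "ON state_events (room_id, type, state_key)"
        )
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT event_id FROM state_events "
        "WHERE room_id = ? AND type = 'm.room.create' AND state_key = '' "
        "LIMIT 1",
        ("!room:example.org",),
    ).fetchall()
    conn.close()
    # Each plan row is (id, parent, notused, detail); keep the detail text.
    return " | ".join(row[3] for row in rows)


if __name__ == "__main__":
    print("without index:", plan_for_lookup(False))
    print("with index:   ", plan_for_lookup(True))
```

Without the index the plan is a scan of the whole table for every room; with it, the plan becomes a search using the index. On Postgres the equivalent check would be running `EXPLAIN` on the update's subquery before and after creating the index.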
I think the right fix here is for the update to use the
UPDATE rooms SET room_version=(
    SELECT COALESCE(json::json->'content'->>'room_version','1')
    FROM events e
    INNER JOIN state_events se USING (event_id)
    INNER JOIN event_json ej USING (event_id)
    WHERE e.room_id=rooms.room_id AND e.type='m.room.create' AND se.state_key=''
    LIMIT 1
)
WHERE rooms.room_version IS NULL;
... but that could still be quite slow, particularly for rooms where there have been a large number of events. |
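For readers less familiar with the Postgres JSON operators, the value the subquery above computes for each room can be sketched in Python. This only mirrors the `COALESCE(json::json->'content'->>'room_version','1')` expression; the function name `room_version_from_create_event` is hypothetical, and the sketch assumes `room_version` is a string whenever it is present.

```python
import json


def room_version_from_create_event(event_json: str) -> str:
    """Mimic COALESCE(json::json->'content'->>'room_version', '1')."""
    event = json.loads(event_json)
    content = event.get("content")
    if not isinstance(content, dict):
        # No usable content object: fall back to the default, as COALESCE does.
        return "1"
    version = content.get("room_version")
    # A missing or null room_version falls back to version "1".
    return "1" if version is None else str(version)


if __name__ == "__main__":
    print(room_version_from_create_event('{"content": {"room_version": "5"}}'))
    print(room_version_from_create_event('{"content": {}}'))
```

The expensive part of the update is not this extraction but locating each room's `m.room.create` event, which is why the join and its indexes dominate the runtime.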
conclusion on this: we will add a warning about this to UPGRADE.rst. |
Currently my server is stuck on
Applying engine-specific schema 57/rooms_version_column_3.sql.postgres
and has been for about 30 minutes now. It's an unavoidable update, but would have been nice to know that there's a significant update to the database so I can expect that it'll be a while before the server starts up on its own.

It looks like this particular update is just a really expensive nested loop:
The other reason for mentioning large database updates would be to let admins run them ahead of time where possible, knowing that they're completely on their own and could ruin everything. This shouldn't be an advertised feature of bolding some text in the changelog though.