This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

57/rooms_version_column_3.sql taking ages on upgrade #7144

Closed
turt2live opened this issue Mar 26, 2020 · 12 comments · Fixed by #7275

@turt2live
Member

Currently my server is stuck on Applying engine-specific schema 57/rooms_version_column_3.sql.postgres, and has been for about 30 minutes now. It's an unavoidable update, but it would have been nice to know that there's a significant update to the database, so I could expect the server to take a while before it starts up on its own.

It looks like this particular update is just a really expensive nested loop:

synapse=# select count(*) from rooms;
 count
--------
 528770
(1 row)

synapse=# select count(*) from rooms where room_version is null;
 count
-------
  4071
(1 row)

synapse=# explain UPDATE rooms SET room_version=(
    SELECT COALESCE(json::json->'content'->>'room_version','1')
    FROM state_events se INNER JOIN event_json ej USING (event_id)
    WHERE se.room_id=rooms.room_id AND se.type='m.room.create' AND se.state_key=''
    LIMIT 1
) WHERE rooms.room_version IS NULL;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Update on rooms  (cost=0.00..7581986912.20 rows=4248 width=55)
   ->  Seq Scan on rooms  (cost=0.00..7581986912.20 rows=4248 width=55)
         Filter: (room_version IS NULL)
         SubPlan 1
           ->  Limit  (cost=0.70..1784833.81 rows=1 width=1107)
                 ->  Nested Loop  (cost=0.70..1784833.81 rows=1 width=1107)
                       ->  Seq Scan on state_events se  (cost=0.00..1784825.08 rows=1 width=35)
                             Filter: ((room_id = rooms.room_id) AND (type = 'm.room.create'::text) AND (state_key = ''::text))
                       ->  Index Scan using event_json_event_id_key on event_json ej  (cost=0.70..8.71 rows=1 width=1140)
                             Index Cond: (event_id = se.event_id)
(10 rows)

The other reason for mentioning large database updates would be to let admins run them ahead of time where possible, with the understanding that they're completely on their own and could ruin everything. This shouldn't be an advertised feature though; bolding some text in the changelog would be enough.
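For anyone stuck in the same situation, one way to confirm what the upgrade is actually doing (rather than guessing from the log line) is to inspect the running statement from a second psql session. A sketch, assuming the database is named `synapse`; `pg_stat_activity` is a stock Postgres view:

```sql
-- From a second session: show active statements and how long each has run.
SELECT pid,
       now() - query_start AS runtime,
       state,
       query
FROM pg_stat_activity
WHERE datname = 'synapse'   -- assumption: your Synapse DB name
  AND state = 'active'
ORDER BY runtime DESC;
```

If the migration's UPDATE shows up with a multi-hour runtime, that confirms the schema step itself is the bottleneck rather than, say, lock contention.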

@babolivier
Contributor

Right, that update didn't take very long to run on matrix.org, so I assumed it would be fine, but apparently I was wrong, sorry about that :/

fwiw we usually run that kind of update in the background so it isn't blocking, but this one fell into the small category of "we need that new info as soon as Synapse starts up, so we need to run it synchronously".

@turt2live
Member Author

@babolivier fwiw it ran for 2 hours before I cancelled it and rolled back. If it only took a few minutes on matrix.org, then there's probably a missing index somewhere that Synapse needs to create :(

@richvdh
Member

richvdh commented Mar 26, 2020

As a general principle, yes, I'd agree that we should call out db updates which we expect to be slow. However, that's not really actionable as an issue, so let's instead focus on why this particular update is slow for @turt2live.

@richvdh richvdh changed the title Mention in the changelog when expensive database updates are included 57/rooms_version_column_3.sql taking ages on upgrade Mar 26, 2020
@turt2live
Member Author

fwiw, I did run create table t2b_state_events as (select event_id, room_id from state_events where type = 'm.room.create' and state_key = ''); (with create index t2b_se_rid on t2b_state_events(room_id)), and then tried to modify the update to use the new t2b_state_events table. That did remove the inner loop, but it was still awfully slow (~30 minutes before I gave up).

@richvdh
Member

richvdh commented Mar 26, 2020

The (main) reason it wasn't slow on matrix.org might be that we had previously populated this column via a background update.

Some history here, largely for my own reference:

@richvdh
Member

richvdh commented Mar 26, 2020

The query plan should look like this:

 Update on rooms
   ->  Seq Scan on rooms
         Filter: (room_version IS NULL)
         SubPlan 1
           ->  Limit  (cost=1.27..1844.99 rows=1 width=32)
                 ->  Nested Loop  (cost=1.27..33188.20 rows=18 width=32)
                       ->  Index Scan using state_events_room_id_idx on state_events se  (cost=0.57..33103.06 rows=18 width=34)
                             Index Cond: (room_id = rooms.room_id)
                             Filter: ((type = 'm.room.create'::text) AND (state_key = ''::text))
                       ->  Index Scan using evjson_uniq_idx on event_json ej  (cost=0.70..4.72 rows=1 width=1134)
                             Index Cond: (event_id = se.event_id)

so the problem is that it's doing a sequential scan on state_events rather than using an index.

@richvdh
Member

richvdh commented Mar 26, 2020

... which isn't surprising, given that state_events doesn't have an index on room_id (except on matrix.org).
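Based on the "should look like this" plan richvdh posted above, the index the planner expects is `state_events_room_id_idx`. A sketch of creating it by hand, assuming a live server where you'd rather not block writes (hence `CONCURRENTLY`); this is an admin workaround, not an official Synapse step:

```sql
-- Build the index the planner wants. CONCURRENTLY avoids holding a write
-- lock on state_events, at the cost of a slower build.
CREATE INDEX CONCURRENTLY state_events_room_id_idx
    ON state_events (room_id);

-- Refresh planner statistics so the new index is actually considered.
ANALYZE state_events;
```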

@turt2live
Member Author

Hmm... even after creating the index it looks like postgres wants to do something else instead:

synapse=# explain UPDATE rooms SET room_version=(
    SELECT COALESCE(json::json->'content'->>'room_version','1')
    FROM state_events se INNER JOIN event_json ej USING (event_id)
    WHERE se.room_id=rooms.room_id AND se.type='m.room.create' AND se.state_key=''
    LIMIT 1
) WHERE rooms.room_version IS NULL;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Update on rooms  (cost=0.00..23544492.02 rows=4671 width=55)
   ->  Seq Scan on rooms  (cost=0.00..23544492.02 rows=4671 width=55)
         Filter: (room_version IS NULL)
         SubPlan 1
           ->  Limit  (cost=251.01..5037.81 rows=1 width=1107)
                 ->  Nested Loop  (cost=251.01..28971.81 rows=6 width=1107)
                       ->  Bitmap Heap Scan on state_events se  (cost=250.32..28919.41 rows=6 width=35)
                             Recheck Cond: (room_id = rooms.room_id)
                             Filter: ((type = 'm.room.create'::text) AND (state_key = ''::text))
                             ->  Bitmap Index Scan on state_events_room_id_idx  (cost=0.00..250.31 rows=7700 width=0)
                                   Index Cond: (room_id = rooms.room_id)
                       ->  Index Scan using event_json_event_id_key on event_json ej  (cost=0.70..8.71 rows=1 width=1140)
                             Index Cond: (event_id = se.event_id)
(13 rows)

Does matrix.org also have specific query tuning to alter the plan?

@richvdh
Member

richvdh commented Mar 26, 2020

Does matrix.org also have specific query tuning to alter the plan?

I don't think so. The query plan above looks relatively sane now. Does it still take ages?

@turt2live
Member Author

Nope, finished in seconds (~5). Guess I can update now :D

@richvdh
Member

richvdh commented Mar 26, 2020

I think the right fix here is for the update to use the events table instead:

UPDATE rooms SET room_version=(
    SELECT COALESCE(json::json->'content'->>'room_version','1')    
    FROM events e
        INNER JOIN state_events se USING (event_id)
        INNER JOIN event_json ej USING (event_id)
    WHERE e.room_id=rooms.room_id AND e.type='m.room.create' AND se.state_key=''
    LIMIT 1
)
WHERE rooms.room_version IS NULL;

... but that could still be quite slow, particularly for rooms where there have been a large number of events.
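Tying back to turt2live's earlier point about running expensive updates ahead of time: an admin could time this query inside a transaction and roll it back, so nothing is committed if it turns out to still be too slow. A sketch, not an official procedure; note that EXPLAIN ANALYZE really executes the UPDATE, and only the ROLLBACK discards its effects:

```sql
-- Dry-run the backfill: measure the real execution time, then discard it.
BEGIN;
EXPLAIN ANALYZE
UPDATE rooms SET room_version=(
    SELECT COALESCE(json::json->'content'->>'room_version','1')
    FROM events e
        INNER JOIN state_events se USING (event_id)
        INNER JOIN event_json ej USING (event_id)
    WHERE e.room_id=rooms.room_id AND e.type='m.room.create' AND se.state_key=''
    LIMIT 1
)
WHERE rooms.room_version IS NULL;
ROLLBACK;
```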

@richvdh
Member

richvdh commented Apr 2, 2020

Conclusion on this: we will add a warning about it to UPGRADE.rst.
