Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

admin: very slow recover view #748

Closed
blueyed opened this issue Sep 4, 2018 · 10 comments
Closed

admin: very slow recover view #748

blueyed opened this issue Sep 4, 2018 · 10 comments

Comments

@blueyed
Copy link
Contributor

blueyed commented Sep 4, 2018

The recover view in the admin is very slow for me (more than 200s).

The slow query there is:

SELECT "reversion_version"."id", "reversion_version"."revision_id", "reversion_version"."object_id", "reversion_version"."content_type_id", "reversion_version"."db", "reversion_version"."format", "reversion_version"."serialized_data", "reversion_version"."object_repr"
FROM "reversion_version"
WHERE "reversion_version"."id" IN (
  SELECT MAX(V."id") FROM "reversion_version" V
  LEFT JOIN "app_model" ON V."object_id" = CAST("app_model"."id" as varchar(191))
  WHERE V."db" = 'default' AND V."content_type_id" = 44 AND "app_model"."id" IS NULL
  GROUP BY V."object_id"
) ORDER BY "reversion_version"."id" ASC
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                                               |
|----------------------------------------------------------------------------------------------------------------------------------------------------------|
| Merge Join  (cost=19428.31..1555681.33 rows=5074139 width=622)                                                                                           |
|   Merge Cond: (reversion_version.id = "ANY_subquery".max)                                                                                                |
|   ->  Index Scan using reversion_version_pkey on reversion_version  (cost=0.43..1510879.77 rows=10148278 width=622)                                      |
|   ->  Sort  (cost=19427.87..19428.37 rows=200 width=4)                                                                                                   |
|         Sort Key: "ANY_subquery".max                                                                                                                     |
|         ->  HashAggregate  (cost=19418.23..19420.23 rows=200 width=4)                                                                                    |
|               Group Key: "ANY_subquery".max                                                                                                              |
|               ->  Subquery Scan on "ANY_subquery"  (cost=4.89..19401.15 rows=6832 width=4)                                                               |
|                     ->  GroupAggregate  (cost=4.89..19332.83 rows=6832 width=9)                                                                          |
|                           Group Key: v.object_id                                                                                                         |
|                           ->  Merge Anti Join  (cost=4.89..19212.94 rows=10313 width=9)                                                                  |
|                                 Merge Cond: ((v.object_id)::text = (((app_model.id)::character varying(191))::text))                        |
|                                 ->  Index Scan using reversion_version_db_b2c54f65_uniq on reversion_version v  (cost=0.56..19104.35 rows=10346 width=9) |
|                                       Index Cond: (((db)::text = 'default'::text) AND (content_type_id = 44))                                            |
|                                 ->  Sort  (cost=4.33..4.43 rows=37 width=4)                                                                              |
|                                       Sort Key: (((app_model.id)::character varying(191))::text)                                            |
|                                       ->  Seq Scan on app_model  (cost=0.00..3.37 rows=37 width=4)                                          |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+

According to django-debug-toolbar it gets triggered in reversion/admin.py in recoverlist_view(278) for the {% if deleted %} in the reversion/templates/reversion/recover_list.html.

Generated in

deleted = self._reversion_order_version_queryset(Version.objects.get_deleted(self.model))
.

Using django-reversion==3.0.0.

reversion_version has 10687260 entries (reversion_revision 7096222).

I have not investigated further yet.

@etianen
Copy link
Owner

etianen commented Sep 5, 2018 via email

@blueyed
Copy link
Contributor Author

blueyed commented Sep 5, 2018

PostgreSQL.
~10 million versions.

I assume we used it (or are still using it) on some models that are changing a lot, where it should be disabled partly.

select content_type_id, count(*) as count from reversion_version  group by content_type_id order by count;
+-------------------+---------+
| content_type_id   | count   |                                                                                                                                                              
|-------------------+---------|                                                                                                                                                              
…
| 62                | 11117   |
| 27                | 11204   |
| 7                 | 106673  |
| 34                | 161814  |
| 14                | 199364  |
| 13                | 567681  |
| 41                | 1526685 |
| 38                | 3066905 |
| 10                | 5042956 |
+-------------------+---------+

@etianen
Copy link
Owner

etianen commented Sep 5, 2018 via email

@blueyed
Copy link
Contributor Author

blueyed commented Sep 9, 2018

The slow part here is the ORDER BY at the end.

Without it the query plan looks like this:

+----------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                                   |
|----------------------------------------------------------------------------------------------------------------------------------------------|
| Nested Loop  (cost=19555.76..21249.83 rows=5109492 width=622)                                                                                |
|   ->  HashAggregate  (cost=19555.33..19557.33 rows=200 width=4)                                                                              |
|         Group Key: max(v.id)                                                                                                                 |
|         ->  GroupAggregate  (cost=7.11..19469.57 rows=6861 width=9)                                                                          |
|               Group Key: v.object_id                                                                                                         |
|               ->  Merge Anti Join  (cost=7.11..19349.06 rows=10380 width=9)                                                                  |
|                     Merge Cond: ((v.object_id)::text = (((app_model.id)::character varying(191))::text))                           |
|                     ->  Index Scan using reversion_version_db_b2c54f65_uniq on reversion_version v  (cost=0.56..19237.42 rows=10418 width=9) |
|                           Index Cond: (((db)::text = 'default'::text) AND (content_type_id = 43))                                            |
|                     ->  Sort  (cost=6.55..6.66 rows=42 width=4)                                                                              |
|                           Sort Key: (((app_model.id)::character varying(191))::text)                                               |
|                           ->  Seq Scan on app_model  (cost=0.00..5.42 rows=42 width=4)                                             |
|   ->  Index Scan using reversion_version_pkey on reversion_version  (cost=0.43..8.45 rows=1 width=622)                                       |
|         Index Cond: (id = (max(v.id)))                                                                                                       |
+----------------------------------------------------------------------------------------------------------------------------------------------+

It appears like the ORDER BY is not applied only to the result, but influences the query altogether.

Workaround:

SELECT "reversion_version"."id", "reversion_version"."revision_id", "reversion_version"."object_id", "reversion_version"."content_type_id", "reversion_version"."db", "reversion_version"."format", "reversion_version"."serialized_data", "reversion_version"."object_repr"
FROM "reversion_version" WHERE "reversion_version"."id" = ANY(
  ARRAY(
    SELECT MAX(V."id") FROM "reversion_version" V
    LEFT JOIN "app_model" ON V."object_id" = CAST("app_model"."id" as varchar(191))
    WHERE V."db" = 'default' AND V."content_type_id" = 43 AND "app_model"."id" IS NULL GROUP BY V."object_id"
  ))
ORDER BY "reversion_version"."id" ASC

(via https://stackoverflow.com/a/15007154/15690)

Since this is likely not supported for Django and all DBs, it might be worth ordering in Python instead?

With the workaround (or no ordering in the end) the query only takes 0.073s, instead of 205.484s.

PostgreSQL 9.6.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit

@blueyed
Copy link
Contributor Author

blueyed commented Sep 9, 2018

For completeness the query plan for the ARRAY workaround:

+------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                               |
|------------------------------------------------------------------------------------------------------------------------------------------|
| Index Scan using reversion_version_pkey on reversion_version  (cost=19470.00..19522.38 rows=10 width=622)                                |
|   Index Cond: (id = ANY ($0))                                                                                                            |
|   InitPlan 1 (returns $0)                                                                                                                |
|     ->  GroupAggregate  (cost=7.11..19469.57 rows=6861 width=9)                                                                          |
|           Group Key: v.object_id                                                                                                         |
|           ->  Merge Anti Join  (cost=7.11..19349.06 rows=10380 width=9)                                                                  |
|                 Merge Cond: ((v.object_id)::text = (((app_model.id)::character varying(191))::text))                           |
|                 ->  Index Scan using reversion_version_db_b2c54f65_uniq on reversion_version v  (cost=0.56..19237.42 rows=10418 width=9) |
|                       Index Cond: (((db)::text = 'default'::text) AND (content_type_id = 43))                                            |
|                 ->  Sort  (cost=6.55..6.66 rows=42 width=4)                                                                              |
|                       Sort Key: (((app_model.id)::character varying(191))::text)                                               |
|                       ->  Seq Scan on app_model  (cost=0.00..5.42 rows=42 width=4)                                             |
+------------------------------------------------------------------------------------------------------------------------------------------+

@etianen
Copy link
Owner

etianen commented Sep 11, 2018 via email

@blueyed
Copy link
Contributor Author

blueyed commented Sep 11, 2018

Yeah, I am no expert their either.
I think the only feasible option might be to not order in the DB, but in Python later - would you take that?
OTOH: we could also simulate the static array by doing the subquery explicitly before, and then using its result instead of the subquery. That would result in two queries though instead of just one + sorting.

It would also be good to know if this is a problem with other DBs or later PostgreSQL versions.

@etianen
Copy link
Owner

etianen commented Sep 12, 2018 via email

@blueyed
Copy link
Contributor Author

blueyed commented Sep 13, 2018

Yes, that makes sense.
I likely won't come back to this soon, but it is good to have a plan for this already.

@etianen
Copy link
Owner

etianen commented Jan 22, 2021

The deleted objects query has been optimized yet again, and is available in 3.0.9

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants