Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow introspection query in asyncpg #113292

Closed
dikshant opened this issue Oct 30, 2023 · 4 comments · Fixed by #117499
Closed

Slow introspection query in asyncpg #113292

dikshant opened this issue Oct 30, 2023 · 4 comments · Fixed by #117499
Assignees
Labels
A-sql-vtables Virtual tables - pg_catalog, information_schema etc A-tools-asyncpg C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-investigation Further steps needed to qualify. C-label will change. T-sql-queries SQL Queries Team

Comments

@dikshant
Copy link

dikshant commented Oct 30, 2023

(
        SELECT
            t.oid                           AS oid,
            ns.nspname                      AS ns,
            t.typname                       AS name,
            t.typtype                       AS kind,
            (CASE WHEN t.typtype = 'd' THEN
                (WITH RECURSIVE typebases(oid, depth) AS (
                    SELECT
                        t2.typbasetype      AS oid,
                        0                   AS depth
                    FROM
                        pg_type t2
                    WHERE
                        t2.oid = t.oid

                    UNION ALL

                    SELECT
                        t2.typbasetype      AS oid,
                        tb.depth + 1        AS depth
                    FROM
                        pg_type t2,
                        typebases tb
                    WHERE
                       tb.oid = t2.oid
                       AND t2.typbasetype != 0
               ) SELECT oid FROM typebases ORDER BY depth DESC LIMIT 1)

               ELSE NULL
            END)                            AS basetype,
            t.typelem                       AS elemtype,
            elem_t.typdelim                 AS elemdelim,
            range_t.rngsubtype              AS range_subtype,
            (CASE WHEN t.typtype = 'c' THEN
                (SELECT
                    array_agg(ia.atttypid ORDER BY ia.attnum)
                FROM
                    pg_attribute ia
                    INNER JOIN pg_class c
                        ON (ia.attrelid = c.oid)
                WHERE
                    ia.attnum > 0 AND NOT ia.attisdropped
                    AND c.reltype = t.oid)

                ELSE NULL
            END)                            AS attrtypoids,
            (CASE WHEN t.typtype = 'c' THEN
                (SELECT
                    array_agg(ia.attname::text ORDER BY ia.attnum)
                FROM
                    pg_attribute ia
                    INNER JOIN pg_class c
                        ON (ia.attrelid = c.oid)
                WHERE
                    ia.attnum > 0 AND NOT ia.attisdropped
                    AND c.reltype = t.oid)

                ELSE NULL
            END)                            AS attrnames
        FROM
            pg_catalog.pg_type AS t
            INNER JOIN pg_catalog.pg_namespace ns ON (
                ns.oid = t.typnamespace)
            LEFT JOIN pg_type elem_t ON (
                t.typlen = -1 AND
                t.typelem != 0 AND
                t.typelem = elem_t.oid
            )
            LEFT JOIN pg_range range_t ON (
                t.oid = range_t.rngtypid
            )
    )

Explain Analyze Verbose:

                                                                                                           info
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  planning time: 3ms
  execution time: 2.2s
  distribution: local
  vectorized: true
  maximum memory usage: 18 MiB
  network usage: 0 B (0 messages)
  regions: local
  sql cpu time: 566ms
  isolation level: serializable
  priority: normal
  quality of service: regular

  • render
  │ columns: (oid, ns, name, kind, basetype, elemtype, elemdelim, range_subtype, attrtypoids, attrnames)
  │ render basetype: CASE WHEN any_not_null = 'd' THEN subquery(any_not_null) ELSE CAST(NULL AS OID) END
  │ render attrtypoids: CASE WHEN any_not_null = 'c' THEN any_not_null ELSE CAST(NULL AS OID[]) END
  │ render attrnames: CASE WHEN any_not_null = 'c' THEN any_not_null ELSE CAST(NULL AS STRING[]) END
  │ render oid: any_not_null
  │ render typname: any_not_null
  │ render typtype: any_not_null
  │ render typelem: any_not_null
  │ render nspname: any_not_null
  │ render typdelim: any_not_null
  │ render rngsubtype: any_not_null
  │
  └── • group (hash)
      │ columns: (rownum, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null)
      │ nodes: n1
      │ regions: local
      │ actual row count: 402
      │ vectorized batch count: 9
      │ estimated max memory allocated: 10 MiB
      │ estimated max sql temp disk usage: 0 B
      │ sql cpu time: 3ms
      │ estimated row count: 98,010 (missing stats)
      │ aggregate 0: any_not_null(array_agg)
      │ aggregate 1: any_not_null(any_not_null)
      │ aggregate 2: any_not_null(any_not_null)
      │ aggregate 3: any_not_null(any_not_null)
      │ aggregate 4: any_not_null(any_not_null)
      │ aggregate 5: any_not_null(any_not_null)
      │ aggregate 6: any_not_null(any_not_null)
      │ aggregate 7: any_not_null(any_not_null)
      │ aggregate 8: any_not_null(any_not_null)
      │ group by: rownum
      │
      └── • project
          │ columns: (any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, array_agg, rownum)
          │
          └── • apply join (left outer)
              │ columns: (rownum, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, attnum, column276, array_agg)
              │ nodes: n1
              │ regions: local
              │ actual row count: 3,174
              │ vectorized batch count: 13
              │ sql cpu time: 280ms
              │ estimated row count: 1,600,993 (missing stats)
              │
              └── • group (hash)
                  │ columns: (rownum, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null, any_not_null)
                  │ nodes: n1
                  │ regions: local
                  │ actual row count: 402
                  │ vectorized batch count: 1
                  │ estimated max memory allocated: 6.4 MiB
                  │ estimated max sql temp disk usage: 0 B
                  │ sql cpu time: 1ms
                  │ estimated row count: 98,010 (missing stats)
                  │ aggregate 0: any_not_null(array_agg)
                  │ aggregate 1: any_not_null(oid)
                  │ aggregate 2: any_not_null(typname)
                  │ aggregate 3: any_not_null(typtype)
                  │ aggregate 4: any_not_null(typelem)
                  │ aggregate 5: any_not_null(nspname)
                  │ aggregate 6: any_not_null(typdelim)
                  │ aggregate 7: any_not_null(rngsubtype)
                  │ group by: rownum
                  │
                  └── • project
                      │ columns: (oid, typname, typtype, typelem, nspname, typdelim, rngsubtype, array_agg, rownum)
                      │
                      └── • apply join (left outer)
                          │ columns: (oid, nspname, oid, typname, typnamespace, typlen, typtype, typelem, oid, typdelim, rngtypid, rngsubtype, rownum, attrelid, atttypid, attnum, attisdropped, oid, reltype, array_agg)
                          │ nodes: n1
                          │ regions: local
                          │ actual row count: 3,174
                          │ vectorized batch count: 13
                          │ sql cpu time: 280ms
                          │ estimated row count: 1,600,993 (missing stats)
                          │
                          └── • ordinality
                              │ columns: (oid, nspname, oid, typname, typnamespace, typlen, typtype, typelem, oid, typdelim, rngtypid, rngsubtype, rownum)
                              │ nodes: n1
                              │ regions: local
                              │ actual row count: 402
                              │ vectorized batch count: 6
                              │ sql cpu time: 12µs
                              │ estimated row count: 98,010 (missing stats)
                              │
                              └── • hash join (left outer)
                                  │ columns: (oid, nspname, oid, typname, typnamespace, typlen, typtype, typelem, oid, typdelim, rngtypid, rngsubtype)
                                  │ nodes: n1
                                  │ regions: local
                                  │ actual row count: 402
                                  │ vectorized batch count: 6
                                  │ estimated max memory allocated: 120 KiB
                                  │ estimated max sql temp disk usage: 0 B
                                  │ sql cpu time: 140µs
                                  │ estimated row count: 98,010 (missing stats)
                                  │ equality: (oid) = (rngtypid)
                                  │
                                  ├── • hash join (inner)
                                  │   │ columns: (oid, nspname, oid, typname, typnamespace, typlen, typtype, typelem, oid, typdelim)
                                  │   │ nodes: n1
                                  │   │ regions: local
                                  │   │ actual row count: 402
                                  │   │ vectorized batch count: 6
                                  │   │ estimated max memory allocated: 192 KiB
                                  │   │ estimated max sql temp disk usage: 0 B
                                  │   │ sql cpu time: 291µs
                                  │   │ estimated row count: 9,801 (missing stats)
                                  │   │ equality: (oid) = (typnamespace)
                                  │   │
                                  │   ├── • virtual table
                                  │   │     columns: (oid, nspname)
                                  │   │     nodes: n1
                                  │   │     regions: local
                                  │   │     actual row count: 5
                                  │   │     vectorized batch count: 3
                                  │   │     sql cpu time: 36µs
                                  │   │     estimated row count: 1,000 (missing stats)
                                  │   │     table: pg_namespace@primary
                                  │   │
                                  │   └── • hash join (left outer)
                                  │       │ columns: (oid, typname, typnamespace, typlen, typtype, typelem, oid, typdelim)
                                  │       │ nodes: n1
                                  │       │ regions: local
                                  │       │ actual row count: 402
                                  │       │ vectorized batch count: 9
                                  │       │ estimated max memory allocated: 40 KiB
                                  │       │ estimated max sql temp disk usage: 0 B
                                  │       │ sql cpu time: 637µs
                                  │       │ estimated row count: 1,000 (missing stats)
                                  │       │ equality: (typelem) = (oid)
                                  │       │ pred: typlen = -1
                                  │       │
                                  │       ├── • virtual table
                                  │       │     columns: (oid, typname, typnamespace, typlen, typtype, typelem)
                                  │       │     nodes: n1
                                  │       │     regions: local
                                  │       │     actual row count: 402
                                  │       │     vectorized batch count: 9
                                  │       │     sql cpu time: 570µs
                                  │       │     estimated row count: 1,000 (missing stats)
                                  │       │     table: pg_type@primary
                                  │       │
                                  │       └── • virtual table
                                  │             columns: (oid, typdelim)
                                  │             nodes: n1
                                  │             regions: local
                                  │             actual row count: 402
                                  │             vectorized batch count: 9
                                  │             sql cpu time: 495µs
                                  │             estimated row count: 333 (missing stats)
                                  │             table: pg_type@pg_type_oid_idx
                                  │             spans: [/1 - ]
                                  │
                                  └── • virtual table
                                        columns: (rngtypid, rngsubtype)
                                        nodes: n1
                                        regions: local
                                        actual row count: 0
                                        vectorized batch count: 0
                                        sql cpu time: 16µs
                                        estimated row count: 1,000 (missing stats)
                                        table: pg_range@primary
(176 rows)

Time: 2.169s total (execution 2.169s / network 0.001s)

This query takes 1.9 seconds on a 3 node local roachprod cluster. This can cause weird slow latencies when using asyncpg when these checks run.

root@localhost:26257/system/defaultdb> SELECT VERSION();
                                           version
----------------------------------------------------------------------------------------------
  CockroachDB CCL v23.2.0-alpha.4-dev (darwin arm64, built , go1.20.10 X:nocoverageredesign)
(1 row)

Time: 1ms total (execution 1ms / network 0ms)

We should check if rest of the queries here are also slow:
https://github.com/MagicStack/asyncpg/blob/master/asyncpg/introspection.py (edite

Jira issue: CRDB-32853

@dikshant dikshant added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-investigation Further steps needed to qualify. C-label will change. A-sql-vtables Virtual tables - pg_catalog, information_schema etc T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) T-sql-queries SQL Queries Team labels Oct 30, 2023
@github-project-automation github-project-automation bot moved this to Triage in SQL Queries Oct 30, 2023
@dikshant
Copy link
Author

dikshant commented Oct 31, 2023

There was a similar issue with a django introspection query once before but that was due to rewriting the built in as an internal udf:
#96218 (comment)

@rafiss
Copy link
Collaborator

rafiss commented Oct 31, 2023

I wrote a rttanalysis test that shows that there's only one descriptor lookup happening, but this query still takes ~2-4 seconds. I will merge this test, then I think the next step is to figure out if the query plan can be improved.

@rafiss
Copy link
Collaborator

rafiss commented Nov 2, 2023

Benchmark is in #113538. The results from running on my laptop show that it takes 7 seconds.

❯ ./dev bench pkg/bench/rttanalysis -f=BenchmarkORMQueries/asyncpg --verbose

BenchmarkORMQueries/asyncpg_types-10         	       1	7772714751 ns/op	         1.000 roundtrips

@rafiss rafiss removed the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label Nov 2, 2023
@rafiss rafiss removed this from SQL Queries Nov 2, 2023
@rafiss rafiss removed their assignment Nov 2, 2023
craig bot pushed a commit that referenced this issue Nov 2, 2023
113538: rttanalysis: add a benchmark for asyncpg query to load types r=rafiss a=rafiss

This benchmark shows that a query made by asyncpg only performs one KV roundtrip, but still takes multiple seconds to complete because of a suboptimal query plan. It's likely that WITH RECURSIVE is part of the problem.

informs  #113292

Release note: None

113575: roachtest: remove unused config from allocbench r=andrewbaptist a=kvoli

The node attributes were not used by any test configuration, remove
them.

Part of: #110115
Release note: None

---

The replication factor config was never used, remove it.

Part of #110115
Release note: None

Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Co-authored-by: Austen McClernon <austen@cockroachlabs.com>
@github-project-automation github-project-automation bot moved this to Triage in SQL Queries Nov 10, 2023
@yuzefovich
Copy link
Member

On 22.2 we couldn't execute this query because we couldn't decorrelate the subquery. On 23.1 I got 2s local execution time, on current master I got 2.8s.

@yuzefovich yuzefovich moved this from Triage to Active in SQL Queries Nov 28, 2023
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Jan 8, 2024
This commit adds a new norm rule, `FoldGroupByAndWindow`, which can
merge a Window operator with a parent GroupBy operator when the grouping
columns are the same as the partition columns. See the rule comment for
the complete list of conditions. In addition to removing a potentially
expensive Window operator, this transformation makes way for other rules
to match.

Fixes cockroachdb#113292

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Jan 12, 2024
This commit adds a new norm rule, `FoldGroupByAndWindow`, which can
merge a Window operator with a parent GroupBy operator when the grouping
columns are the same as the partition columns. See the rule comment for
the complete list of conditions. In addition to removing a potentially
expensive Window operator, this transformation makes way for other rules
to match.

Fixes cockroachdb#113292

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Oct 3, 2024
This commit adds a new norm rule, `FoldGroupByAndWindow`, which can
merge a Window operator with a parent GroupBy operator when the grouping
columns are the same as the partition columns. See the rule comment for
the complete list of conditions. In addition to removing a potentially
expensive Window operator, this transformation makes way for other rules
to match.

Fixes cockroachdb#113292

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Dec 5, 2024
This commit adds a new norm rule, `FoldGroupByAndWindow`, which can
merge a Window operator with a parent GroupBy operator when the grouping
columns are the same as the partition columns. See the rule comment for
the complete list of conditions. In addition to removing a potentially
expensive Window operator, this transformation makes way for other rules
to match.

Fixes cockroachdb#113292

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Dec 5, 2024
This commit adds a new norm rule, `FoldGroupByAndWindow`, which can
merge a Window operator with a parent GroupBy operator when the grouping
columns are the same as the partition columns. See the rule comment for
the complete list of conditions. In addition to removing a potentially
expensive Window operator, this transformation makes way for other rules
to match.

Fixes cockroachdb#113292

Release note: None
craig bot pushed a commit that referenced this issue Dec 12, 2024
117499: opt: add rule to merge GroupBy and Window operators r=DrewKimball a=DrewKimball

#### opt: add method to add strict dependency to FuncDepSet

This commit adds a new method, `AddStrictDependency`, to `FuncDepSet`.
This will be used in the following commit to add a dependency between
a Window operator's partition columns and its functions. Existing
methods don't work for this because the dependency is not a key.

Epic: None

Release note: None

#### opt: infer functional dependencies for window functions

This commit adds logic to infer strict functional dependencies from
a Window operator's partition column(s) to some or all of its window
functions when the following conditions are satisfied:
1. The window function must be an aggregate, or first_value or last_value.
2. The window frame must be unbounded.

The above conditions ensure that the window function always produces the
same result given the same window frame, as well as that every row in a
partition has the same window frame. This means that the window function
produces the same output for every row in the partition, and therefore,
the partition columns functionally determine the output of the window
function.

Epic: None

Release note: None

#### opt: add rule to merge GroupBy and Window

This commit adds a new norm rule, `FoldGroupByAndWindow`, which can
merge a Window operator with a parent GroupBy operator when the grouping
columns are the same as the partition columns. See the rule comment for
the complete list of conditions. In addition to removing a potentially
expensive Window operator, this transformation makes way for other rules
to match.

Fixes #113292

Release note: None

137069: opt: add implicit SELECT FOR UPDATE to initial scan of DELETE r=yuzefovich a=yuzefovich

This commit makes it so that we apply implicit SELECT FOR UPDATE locking behavior to the initial scan of the DELETE operation in some cases. Namely, we do so when the input to the DELETE is either a scan or an index join on top of a scan (with any number of renders on top) - these conditions mean that all filters were pushed down into the scan, so we won't lock any unnecessary rows. I think it's only possible to have at most one render expression on top of the scan, but I chose to be defensive and allowed nested renders too. In such form the conditions are exactly the same as we use for adding SFU to UPDATEs, so the same function is reused. Existing `enable_implicit_select_for_update` session variable is consulted.

Fixes: #50181.

Release note (sql change): DELETE statements now acquire locks using the FOR UPDATE locking mode during their initial row scan in some comes, which improves performance for contended workloads. This behavior is configurable using the `enable_implicit_select_for_update` session variable.

Co-authored-by: Drew Kimball <drewk@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@craig craig bot closed this as completed in 147054d Dec 13, 2024
@github-project-automation github-project-automation bot moved this from Active to Done in SQL Queries Dec 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-vtables Virtual tables - pg_catalog, information_schema etc A-tools-asyncpg C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-investigation Further steps needed to qualify. C-label will change. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants