Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sorting by Aggregates is extremely slow #5554

Labels
support/needs-more-info Needs more details/info/repro instructions

Comments

@adventurini
Copy link

When querying by an aggregate sum on an array relationship, the performance is abysmal. Any ideas?

@tirumaraiselvan
Copy link
Contributor

We will need more info to diagnose performance issues.

  1. What is the full query?
  2. What is the size of the table (num of rows?)
  3. What is the Analyze output? https://hasura.io/docs/1.0/graphql/manual/queries/performance.html#analysing-query-performance

@tirumaraiselvan tirumaraiselvan added the support/needs-more-info Needs more details/info/repro instructions label Aug 10, 2020
@nbourdin
Copy link

nbourdin commented Apr 6, 2023

Hello i have the same issue, my request perform in 11seconds (100ms without the order_by)
Here is some informations below

We will need more info to diagnose performance issues.

  1. What is the full query?
query event {
  event(order_by: {races_aggregate: {min: {start_date: asc}}}, limit: 25, offset: 0, where: {type: {_eq: classic}, sport: {_eq: "trail"}}) {
    title
  }
}
  1. What is the size of the table (num of rows?)
The event table contains 5k rows, and the race table contains 10k rows
  1. What is the Analyze output? https://hasura.io/docs/1.0/graphql/manual/queries/performance.html#analysing-query-performance

Here is the generated SQL from analyze :

SELECT
  coalesce(
    json_agg(
      "root"
      ORDER BY
        "root.ar.root.races.order_by.min.start_date" ASC NULLS LAST
    ),
    '[]'
  ) AS "root"
FROM
  (
    SELECT
      row_to_json(
        (
          SELECT
            "_e"
          FROM
            (
              SELECT
                "_root.base"."title" AS "title"
            ) AS "_e"
        )
      ) AS "root",
      "_root.ar.root.races.order_by"."min.start_date" AS "root.ar.root.races.order_by.min.start_date"
    FROM
      (
        SELECT
          *
        FROM
          "public"."event"
        WHERE
          (
            (("public"."event"."type") = (('classic') :: text))
            AND (("public"."event"."sport") = (('trail') :: text))
          )
      ) AS "_root.base"
      LEFT OUTER JOIN LATERAL (
        SELECT
          min("start_date") AS "min.start_date"
        FROM
          (
            SELECT
              "_root.ar.root.races.order_by.base"."start_date" AS "start_date"
            FROM
              (
                SELECT
                  *
                FROM
                  "public"."race"
                WHERE
                  (("_root.base"."id") = ("event__id"))
              ) AS "_root.ar.root.races.order_by.base"
          ) AS "_root.ar.root.races.order_by"
      ) AS "_root.ar.root.races.order_by" ON ('true')
    ORDER BY
      "root.ar.root.races.order_by.min.start_date" ASC NULLS LAST
    LIMIT
      25 OFFSET 0
  ) AS "_root"

and the execution plan :

Aggregate  (cost=692886.02..692886.03 rows=1 width=32)
  ->  Limit  (cost=692885.64..692885.71 rows=25 width=40)
        ->  Sort  (cost=692885.64..692891.42 rows=2312 width=40)
              Sort Key: (min(race.start_date))
              ->  Nested Loop Left Join  (cost=299.40..692820.40 rows=2312 width=40)
                    ->  Seq Scan on event  (cost=0.00..497.78 rows=2312 width=39)
                          Filter: ((type = 'classic'::text) AND (sport = 'trail'::text))
                    ->  Aggregate  (cost=299.40..299.41 rows=1 width=8)
                          ->  Seq Scan on race  (cost=0.00..299.40 rows=2 width=8)
                                Filter: (event.id = event__id)
                    SubPlan 1
                      ->  Result  (cost=0.00..0.01 rows=1 width=32)
JIT:
  Functions: 18
  Options: Inlining true, Optimization true, Expressions true, Deforming true

@nbourdin
Copy link

Ok thanks to a response on stackoverflow,
i added a hash index on race.event__id and now its really fast

@seanparkross
Copy link
Contributor

Yes. @nbourdin Thanks for letting us know.

For any further users who have slow queries like this. Using indexes on specific columns will usually greatly speed up your queries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment