fix: performance issue when database contains a lot of schemas #480

magrinj · 2024-01-02T15:08:45Z

What kind of change does this PR introduce?

Performance issue

What is the current behavior?

Currently, the query time increases linearly with the number of schemas in our system.

What is the new behavior?

With the proposed changes, the query time remains constant, regardless of the number of schemas.

Additional context

At Twenty, we've identified a performance bottleneck in pg_graphql related to our multi-tenancy approach, which involves creating a separate schema for each workspace. In our production environment, which includes hundreds of schemas, we've noticed a significant slowdown in query performance.

Upon investigation, we found that the types and composites in load_sql_context.sql are being loaded for all schemas, rather than just the relevant schema. This seems to be the root cause of the performance issue.
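To make the scaling argument concrete, here is a minimal sketch in Python, not the actual SQL in load_sql_context.sql. It models the type catalog as an in-memory list and compares loading everything with loading only the relevant schema. The `PgType` class, the `workspace_N` schema names, the helper functions, and the counts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PgType:
    oid: int
    schema: str
    name: str


def build_catalog(n_schemas: int, types_per_schema: int = 3) -> list[PgType]:
    """Hypothetical stand-in for the type catalog: a few types per tenant schema."""
    catalog = []
    oid = 1
    for s in range(n_schemas):
        for t in range(types_per_schema):
            catalog.append(PgType(oid, f"workspace_{s}", f"type_{t}"))
            oid += 1
    return catalog


def load_all_types(catalog: list[PgType]) -> list[PgType]:
    # Behavior described above: types from every schema are loaded.
    return list(catalog)


def load_types_in_schemas(catalog: list[PgType], schemas: list[str]) -> list[PgType]:
    # Behavior proposed in this PR: only types in the relevant schemas are loaded.
    wanted = set(schemas)
    return [t for t in catalog if t.schema in wanted]


catalog = build_catalog(400)  # arbitrary schema count for the example
everything = load_all_types(catalog)
scoped = load_types_in_schemas(catalog, ["workspace_7"])
print(len(everything), len(scoped))  # prints "1200 3"
```

The first result grows with the number of schemas, while the second depends only on what lives in the requested schema.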

We have identified a potential fix but are seeking additional insights and suggestions to refine our approach.

olirice · 2024-01-03T16:28:06Z

The failing test case shows why we chose the current behavior.

Consider the following case:

- The search path is public
- A type is defined in schema xyz
- A table in schema public references xyz.some_type

If the user has permission to access xyz.some_type, they should be allowed to reference it.

After the change made in this PR, we would fail to recognize the referenced type, and it would be marked as Opaque.

That's admittedly an edge case, but it's one we've decided to support in the past, so we'll need to think through options before merging this.
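The edge case can be sketched as follows. This is a toy model in Python rather than pg_graphql's real resolution logic: `resolve_column_type`, the `account_status` type, and the set-based catalog are hypothetical, and permission checks are left out entirely.

```python
def resolve_column_type(type_ref, loaded_types):
    """Return the type reference if it was loaded, otherwise the string "Opaque"."""
    return type_ref if type_ref in loaded_types else "Opaque"


# Hypothetical catalog of (schema, type name) pairs.
all_types = {("public", "account_status"), ("xyz", "some_type")}
search_path = {"public"}

# A table in schema public has a column of type xyz.some_type.
column_type = ("xyz", "some_type")

# Current behavior: types from all schemas are loaded.
loaded_everything = all_types
# Behavior after this PR: only types in search-path schemas are loaded.
loaded_search_path_only = {t for t in all_types if t[0] in search_path}

current_result = resolve_column_type(column_type, loaded_everything)
proposed_result = resolve_column_type(column_type, loaded_search_path_only)
print(current_result)   # prints "('xyz', 'some_type')"
print(proposed_result)  # prints "Opaque"
```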

The schema is cached on a per-connection basis across GraphQL requests, and the cache is busted only when a DDL event occurs. I could see it having a big impact on your p99, but I'd be surprised if it were a consistent issue.
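A rough sketch of the caching behavior described here, assuming one cache object per connection. The `ContextCache` class and its method names are hypothetical and only mirror the description; they are not pg_graphql's implementation.

```python
class ContextCache:
    """Per-connection cache: the loader runs on a miss, and DDL clears the entry."""

    def __init__(self, loader):
        self._loader = loader
        self._context = None
        self.loads = 0  # how many times the expensive load actually ran

    def get(self):
        if self._context is None:
            self._context = self._loader()
            self.loads += 1
        return self._context

    def on_ddl_event(self):
        # Any DDL event invalidates the cached context for this connection.
        self._context = None


cache = ContextCache(lambda: {"types": ["public.account_status"]})
for _ in range(3):
    cache.get()  # three requests on the same connection
loads_before_ddl = cache.loads

cache.on_ddl_event()  # e.g. a new tenant schema is created
cache.get()
loads_after_ddl = cache.loads
print(loads_before_ddl, loads_after_ddl)  # prints "1 2"
```

Under this model the slow load is paid once per connection and again after each DDL event, which is why it would be expected to show up mostly in tail latency unless connections are short-lived or DDL is frequent.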

What kind of performance characteristics are you seeing with ~200 schemas?

magrinj · 2024-01-08T07:53:30Z

Hey @olirice,

Thanks a lot for your answer. In our test we added around 400 schemas, and the load_context query went from around 50ms to 250-300ms.
After the change, the query goes back to 50ms.
It's a big limitation for us, because production became really slow.
I understand the edge case you want to support. On our end we don't need it for now, but that may change in the future.
Maybe we can query the foreign references and only join them instead of all the composites and all the types?

olirice · 2024-01-08T20:23:51Z

Makes sense, thanks for the additional info.

> Maybe we can query the foreign references and only join them instead of all the composites and all the types?

Yes, that would be safer. It's possible that doing the lookups on references made by functions and tables could actually be slower, but it's definitely worth a test. Are you up for updating this PR with that approach?
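One way to picture the reference-driven approach is the Python sketch below. It is only an illustration under stated assumptions: the `Table` class, the OIDs, and both helper functions are hypothetical, functions are omitted, and a real version would also have to follow nested references such as composite attributes and array element types.

```python
from dataclasses import dataclass


@dataclass
class Table:
    schema: str
    name: str
    column_type_oids: list[int]


def referenced_type_oids(tables: list[Table], search_path: set[str]) -> set[int]:
    """Collect type OIDs used by columns of tables in the search-path schemas."""
    oids: set[int] = set()
    for table in tables:
        if table.schema in search_path:
            oids.update(table.column_type_oids)
    return oids


def load_referenced_types(types_by_oid: dict, tables: list[Table], search_path: set[str]) -> dict:
    """Load only referenced types, whichever schema they are defined in."""
    wanted = referenced_type_oids(tables, search_path)
    return {oid: types_by_oid[oid] for oid in wanted if oid in types_by_oid}


types_by_oid = {
    1: ("public", "account_status"),
    2: ("xyz", "some_type"),         # defined outside the search path
    3: ("workspace_1", "unused_type"),
}
tables = [
    Table("public", "account", [1, 2]),  # references xyz.some_type
    Table("workspace_1", "thing", [3]),
]

loaded = load_referenced_types(types_by_oid, tables, {"public"})
print(sorted(loaded))  # prints "[1, 2]"
```

In this model xyz.some_type is still loaded because a table in public references it, while types used only by other tenants' schemas are skipped.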

magrinj · 2024-01-09T09:23:45Z

Hey @olirice,

Thanks for the swift feedback and the suggestion – much appreciated! We'll definitely take a stab at updating the PR with the approach you mentioned. We'll try to tackle that in the January month.

On a slightly different note, our team is also keen on nested query filters. We're practically dreaming of them at this point 😅. Do you have any plans to include this feature in pg_graphql? It would be a game-changer for us. We're crossing our fingers that it's on the roadmap.

olirice · 2024-01-09T21:23:43Z

> We'll try to tackle that in the January month

Nice, looking forward to it.

> nested query filters

Could you give me a concrete example of this?
Is it #88 or something else?

magrinj · 2024-01-11T07:19:02Z

Yes it's exactly this ticket 🙂

etherealhedgehog · 2024-02-08T17:04:05Z

Hey @olirice and @magrinj, we are experiencing this same issue. We have 3000+ schemas in a multi-tenancy environment, and after version 1.2.3 our queries take over 1 minute to resolve, even when there is no data in the schema. We are currently locked at v1.2.3 because of this issue. We would love to see this fix get committed and could offer additional performance testing for it. For additional context, each of our schemas has approximately 10-15 tables in it.

olirice · 2024-02-12T17:17:01Z

@etherealhedgehog @magrinj proposal available in #493

olirice · 2024-02-12T17:21:51Z

@etherealhedgehog 3000 schemas is a pretty serious project! I'd love to hear more about your application and how you're using pg_graphql, e.g. as a public-facing API, an admin API, etc.

olirice · 2024-04-10T15:21:46Z

resolved by #493

fix: performance issue when database contains a lot of schemas

570910b

olirice mentioned this pull request Feb 12, 2024

Context loading optimization for schema based multi-tenant workloads #493

Merged

olirice closed this Apr 10, 2024
