feat(sync-service): index for array contains #2359

Merged
merged 11 commits into from
Feb 25, 2025

@robacourt (Contributor) commented Feb 19, 2025

This PR seeks to address trigger.dev's WAL lag issues by adding two optimisations to where clause filtering:

  • Allow multiple conditions in a where clause to be optimised (not just one)
  • Optimise where clauses that have a condition in the form 'array_field @> array_const'

Allow multiple conditions in a where clause to be optimised (not just one)

This feature alone should halve the processing time for trigger.dev. Many of their where clauses have the form WHERE runtimeEnvironmentId = ? AND batchId = ?. Previously, once a change matched the runtimeEnvironmentId, the filter still had to iterate through every batchId registered for that environment. Now it doesn't, because the batchId condition is indexed as well.
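To illustrate the idea, here is a minimal Python sketch of indexing several equality conditions at once. It is not the sync-service implementation (which is written in Elixir); the class name `EqualityIndex`, its methods, and the sample shape IDs are all hypothetical and exist only for this example. Each indexed field becomes one level of a nested hash lookup, so a change is matched with one lookup per condition instead of a scan over every batchId in the environment.

```python
class EqualityIndex:
    """Hypothetical nested hash index over an ordered list of equality fields."""

    def __init__(self, fields):
        self.fields = list(fields)
        self.root = {}

    def add_shape(self, shape_id, values):
        # Walk (and create) one dict level per field; the last level holds shape IDs.
        node = self.root
        for field in self.fields[:-1]:
            node = node.setdefault(values[field], {})
        node.setdefault(values[self.fields[-1]], set()).add(shape_id)

    def affected_shapes(self, record):
        # One hash lookup per indexed condition, no iteration over shapes.
        node = self.root
        for field in self.fields:
            node = node.get(record.get(field))
            if node is None:
                return set()
        return set(node)


def affected_shapes_scan(shapes, record):
    """Reference behaviour: test every shape's conditions against the record."""
    return {
        shape_id
        for shape_id, values in shapes.items()
        if all(record.get(field) == value for field, value in values.items())
    }


# Made-up shapes in the form WHERE runtimeEnvironmentId = ? AND batchId = ?
shapes = {
    "s1": {"runtimeEnvironmentId": "env1", "batchId": "b1"},
    "s2": {"runtimeEnvironmentId": "env1", "batchId": "b2"},
    "s3": {"runtimeEnvironmentId": "env2", "batchId": "b1"},
}
index = EqualityIndex(["runtimeEnvironmentId", "batchId"])
for shape_id, values in shapes.items():
    index.add_shape(shape_id, values)

change = {"runtimeEnvironmentId": "env1", "batchId": "b2", "status": "done"}
matched = index.affected_shapes(change)
```

Here `matched` contains only `"s2"`, the same answer as `affected_shapes_scan(shapes, change)`, but the lookup cost no longer depends on how many shapes share `env1`.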

Optimise where clauses that have a condition in the form 'array_field @> array_const'

Trigger.dev's other problematic where clauses are of the form WHERE runtimeEnvironmentId = ? AND runTags @> ?, and this feature optimises the @> operation. The algorithm for this optimisation can be seen in the InclusionIndex module.
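For readers who want a feel for how an index over 'array_field @> array_const' can avoid checking every shape, the sketch below shows one possible approach in Python: a trie keyed by each shape's sorted, de-duplicated array values. This is an assumption made for illustration and should not be read as the algorithm in the InclusionIndex module; the name `InclusionTrie`, its methods, and the sample tags are hypothetical. NULL elements are not handled.

```python
class InclusionTrie:
    """Hypothetical index answering: which shape arrays are contained in a row's array?"""

    def __init__(self):
        self.root = {"children": {}, "shapes": set()}

    def add_shape(self, shape_id, required_values):
        # Sorting and de-duplicating gives every shape exactly one path in the trie.
        node = self.root
        for value in sorted(set(required_values)):
            node = node["children"].setdefault(
                value, {"children": {}, "shapes": set()}
            )
        node["shapes"].add(shape_id)

    def affected_shapes(self, row_values):
        # Follow only the branches whose key appears in the row's array.
        values = sorted(set(row_values))
        found = set()
        stack = [(self.root, 0)]
        while stack:
            node, start = stack.pop()
            found |= node["shapes"]
            for i in range(start, len(values)):
                child = node["children"].get(values[i])
                if child is not None:
                    # Later keys on this path must be greater, so resume at i + 1.
                    stack.append((child, i + 1))
        return found


def affected_shapes_scan(shapes, row_values):
    """Reference behaviour: row_values @> shape_values checked shape by shape."""
    row = set(row_values)
    return {shape_id for shape_id, wanted in shapes.items() if set(wanted) <= row}


# Made-up shapes in the form WHERE runTags @> ?
shapes = {
    "a": ["urgent", "billing"],
    "b": ["urgent"],
    "c": ["billing", "eu"],
    "d": [],  # an empty array is contained by every array
}
trie = InclusionTrie()
for shape_id, wanted in shapes.items():
    trie.add_shape(shape_id, wanted)

row_tags = ["billing", "urgent", "us"]
matched = trie.affected_shapes(row_tags)
```

With these inputs `matched` is `{"a", "b", "d"}`, which agrees with `affected_shapes_scan(shapes, row_tags)`. In this sketch the work per change depends on the row's array values and the trie branches they reach, rather than on the total number of registered shapes.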

Comparing the performance of this index against the current method shows quite a dramatic difference:
Screenshot 2025-02-19 at 19 42 59

Here you can see that it performs well with 100k shapes (for trigger.dev, that would be 100k shapes with the same runtimeEnvironmentId, in other words 100k shapes per user):
Screenshot 2025-02-19 at 19 43 47

Both of the charts above are based on a shape array size of 3 and a change array size of 10, which is typical for trigger.dev.

The algorithm seems to scale roughly linearly with change array size:
Screenshot 2025-02-24 at 16 12 34

The algorithm seems to scale well with shape array size for a fixed change array size (10):
Screenshot 2025-02-24 at 16 14 35

@robacourt force-pushed the rob/index-array-contains branch 8 times, most recently from 707acda to 11fa653 (February 24, 2025 15:45)
@robacourt changed the title from "WIP: index for array contains" to "feat(sync-service): index for array contains" (Feb 24, 2025)
@robacourt force-pushed the rob/index-array-contains branch from 11fa653 to 643c266 (February 24, 2025 15:52)
@robacourt marked this pull request as ready for review (February 24, 2025 16:03)
@robacourt force-pushed the rob/index-array-contains branch from 643c266 to 10c5854 (February 24, 2025 16:33)

@alco (Member) left a comment

Great job!

@icehaunter (Contributor) left a comment

Very good job! 🔥 My only nitpick would be the nested update formatting, but I've relayed that to you in DMs; feel free to address it or not.

@robacourt force-pushed the rob/index-array-contains branch from 765c282 to bef3973 (February 25, 2025 14:49)
@robacourt merged commit c444072 into main (Feb 25, 2025)
34 checks passed
@robacourt deleted the rob/index-array-contains branch (February 25, 2025 14:54)
3 participants