Fix performance issue with many duplicate ids #40

rkistner · 2024-10-15T15:03:39Z

If you had a sync rules query like `select '' as id from mytable` (all rows with the same id), the JOIN in the sync_local query would perform O(n^2) operations. For 10k rows, that is 100 million operations, which takes a very long time.

While this should never be the case in a production app (you'd typically have a max of 2 or 3 duplicates), it could easily lead to a "hanging" app during development if you had a mistake in the query, making it difficult to debug.

The query change here takes it back to O(n). It does use an additional "TEMP B-TREE" for the DISTINCT/UNION step, but this does not seem to negatively affect sync performance in my testing.
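
To make the blow-up concrete, here is a minimal sketch of the effect. It does not reproduce the actual sync_local query: the tables `pending_ops` and `stored_rows` and the function `count_join_rows` are hypothetical names chosen for illustration. It only shows how a join on an id shared by every row produces n * n rows, and how deduplicating the ids first brings that back to n.

```python
import sqlite3


def count_join_rows(n: int, dedupe: bool) -> int:
    """Count the rows a join produces when all n rows share the same id.

    Hypothetical schema for illustration only, not the real sync_local tables.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pending_ops (row_id TEXT)")
    conn.execute("CREATE TABLE stored_rows (row_id TEXT, payload TEXT)")
    conn.execute("CREATE INDEX stored_rows_id ON stored_rows (row_id)")

    # Every row gets the same empty-string id, as in: select '' as id from mytable
    conn.executemany("INSERT INTO pending_ops VALUES (?)", [("",)] * n)
    conn.executemany(
        "INSERT INTO stored_rows VALUES (?, ?)",
        [("", str(i)) for i in range(n)],
    )

    # Either join the raw list of ids, or collapse duplicates first.
    source = "(SELECT DISTINCT row_id FROM pending_ops)" if dedupe else "pending_ops"
    (count,) = conn.execute(
        f"SELECT count(*) FROM {source} AS p "
        "JOIN stored_rows AS s ON s.row_id = p.row_id"
    ).fetchone()
    conn.close()
    return count


if __name__ == "__main__":
    print(count_join_rows(200, dedupe=False))  # 200 * 200 = 40000 joined rows
    print(count_join_rows(200, dedupe=True))   # 1 distinct id * 200 = 200 joined rows
```

The deduplication step is where SQLite may need a temporary b-tree, which is the trade-off noted above.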

Fix performance issue with many duplicate ids.

0f56b71

rkistner requested a review from DominicGBauer October 15, 2024 15:03

DominicGBauer approved these changes Oct 15, 2024

rkistner merged commit a59beae into main Oct 15, 2024
12 checks passed

rkistner deleted the fix-duplicate-id-performance branch October 15, 2024 15:09

rkistner mentioned this pull request Feb 3, 2025

[WIP] Rewrite the "sync_local" query #56

Closed

2 tasks
