Optimize records and rankings queries #9975

Draft
wants to merge 21 commits into
base: main

gregorbg
Member

We cannot pre-compute the whole tables; the number of possible parameter combinations is simply too big for that.
However, we can work with more efficient indexes and build a few auxiliary tables that unfold Results once instead of unfolding them on every request.

Together with the Redis caches, this should be enough to re-enable some of the filters.

Here's a more complete list of what this PR is doing:

  • Replace implicit cross joins in the WHERE clause with explicit JOIN statements for readability
  • Pre-compute the gender value directly in the Concise*Results tables, to avoid at least one JOIN
  • Introduce a new ComputeRankingsRecords (CRR) job
    • Creates a table that "unfolds" value1 through value5 into separate rows, referencing their respective "parent" result via foreign key
    • Creates a table that pre-filters all result rows marked as regionalRecord in any capacity
  • Add more indexes to the existing Concise*Results tables, mostly to accelerate GROUP BY statements
  • (Bonus: change the ID columns of the Concise*Results tables so that they match the Results and the Ranks* tables)
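
To illustrate what the two CRR tables in the list above would contain, here is a minimal plain-Ruby sketch, not the code from this PR. The `Result` struct, the snake_case field names, the `attempt_number` key, and both helper names are hypothetical stand-ins. Treating a value of 0 as "no attempt" and an empty string as "no record" are assumptions as well.

```ruby
# Hypothetical in-memory stand-in for one Results row. Only value1..value5
# and the regional record markers are taken from the description above.
Result = Struct.new(:id, :value1, :value2, :value3, :value4, :value5,
                    :regional_single_record, :regional_average_record,
                    keyword_init: true)

# Unfold one result into one row per attempt. Each row keeps the id of its
# "parent" result, which is what the foreign key would reference.
def unfold_attempts(result)
  (1..5).filter_map do |n|
    value = result["value#{n}"]
    next if value.nil? || value.zero? # assumption: 0 means the attempt was skipped

    { result_id: result.id, attempt_number: n, value: value }
  end
end

# True if the result carries a record marker for its single or its average.
# Assumption: a missing marker is stored as nil or as an empty string.
def marked_as_record?(result)
  [result.regional_single_record, result.regional_average_record].any? do |marker|
    !marker.nil? && !marker.empty?
  end
end

# Pre-filter: keep only the rows marked as a regional record in any capacity.
def record_rows(results)
  results.select { |result| marked_as_record?(result) }
end
```

Because the unfolding happens once inside the job, a ranking request can read the attempt rows directly instead of expanding five columns per result on every request.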

@gregorbg gregorbg force-pushed the feature/augmented-cad-tables branch 3 times, most recently from 68edcc0 to 083798a Compare September 20, 2024 13:06

class ComputeRankingsRecords < WcaCronjob
  def self.reason_not_to_run
    unless ComputeAuxiliaryData.last_run_successful?
Member

Should we not also check whether we have already computed the table since CAD last ran? Otherwise, it looks to me like this will just run every 30 minutes, regardless of whether there is new PR information or not.
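
One way the extra check could look, as a rough plain-Ruby sketch rather than the actual cronjob API: the standalone method, its keyword arguments, and the reason strings are all hypothetical. It assumes the finish times of the last CAD run and the last CRR run are available as `Time` values (or `nil` if a job has never run).

```ruby
# Hypothetical standalone version of the guard. Returns a String explaining
# why the job should be skipped, or nil if it should run.
def reason_not_to_run(cad_successful:, cad_finished_at:, crr_finished_at:)
  # Existing condition from the diff: CAD must have finished successfully.
  return "ComputeAuxiliaryData did not finish successfully" unless cad_successful

  # Suggested extra condition: skip if CRR already ran after the last CAD run,
  # because its input tables have not changed since then.
  if cad_finished_at && crr_finished_at && crr_finished_at >= cad_finished_at
    return "tables were already computed since the last ComputeAuxiliaryData run"
  end

  nil
end
```

With a guard like this, the 30-minute schedule would only do real work once per CAD run.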

@gregorbg gregorbg marked this pull request as draft October 8, 2024 11:31