-
Notifications
You must be signed in to change notification settings - Fork 640
Improve performance of the reverse dependencies endpoint #3655
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Note that this is practically reversing what we did in #592, but this is now starting to actually impact our performance. |
Is it possible to use an index instead of a column? That would avoid issues with the column getting out of sync. |
I'm pretty sure that query is too complex to go on an index. A materialized view might work depending on how long it takes for it to update. |
Yeah, played around with it and indexes don't seem to help much. This query though takes around 3.2s (explain) on the primary db: SELECT DISTINCT ON (crate_id) crate_id, id
FROM versions
WHERE NOT yanked
ORDER BY crate_id, to_semver_no_prerelease(num) DESC NULLS LAST; We could turn it into a materialized view we refresh every time a new crate is published/yanked/unyanked/deleted, but I'm worried its speed will continue to decrease as time goes by. Another option to investigate would be to store the parsed semver in the database, without calling |
I'm happy to report that we've made significant progress on this issue. After applying #8737 , the average query execution time for serde has been reduced from 2.5s to 500ms in production. |
Me and Justin spent some time a couple days ago trying to figure out why the reverse dependencies endpoint is so slow for crates with a large number of reverse dependencies. This is the
EXPLAIN ANALYZE
of that query.The goal of the query is to return all crates whose last version depends on the current crate. As the latest version is not stored anywhere the query is doing a subquery that sorts all the versions of each crate by their semver, which is really slow.
We should denormalize the latest version ID (semver-wise) in the crates table, and change the query to remove the subquery. After removing the subquery the query should be small enough to be converted to Diesel. Ideally denormalizing the latest version ID and updating the query should be done in separate PRs.
The text was updated successfully, but these errors were encountered: