Auto merge of #1668 - sgrif:sg-query-optimization, r=jtgeibel
Optimize our most time consuming query

Our database spends most of its time processing /api/v1/crates with no
parameters other than pagination. This query is the main one hit by
crawlers, and it is taking over 100ms to run, so it's at the top of our
list (for posterity's sake, #2 is copying `crate_downloads` during
backups, #3 and #4 are the updates run from bin/update-downloads, and #5
is the query run from the download endpoint).

The query has to perform the full join between crates and
recent_crate_downloads, and then count the results of that. Since we have no
search parameters of any kind, this count is equivalent to just counting
the crates table, which we can do much more quickly. We still need to
count over the full result if there's any where clause, but we can
optimize the case where there's no search.
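
As a rough illustration of the two query shapes, here is a sketch in Rust that only builds SQL text. The helper `listing_sql` is hypothetical, the SQL is an approximation based on the plans in this message rather than what Diesel actually generates, and ordering is omitted. It also assumes at most one recent_crate_downloads row per crate.

```rust
/// Hypothetical helper (not part of the codebase): returns approximate SQL
/// for one page of the crate listing. `filter` is an optional WHERE condition.
fn listing_sql(filter: Option<&str>, limit: u32, offset: u32) -> String {
    // Columns and join shared by both shapes.
    let join = "SELECT crates.*, recent_crate_downloads.downloads \
                FROM crates LEFT JOIN recent_crate_downloads \
                ON recent_crate_downloads.crate_id = crates.id";
    match filter {
        // With a filter, the total depends on the WHERE clause, so it is
        // still counted over the whole filtered join with a window function.
        Some(cond) => format!(
            "SELECT *, COUNT(*) OVER () FROM ({} WHERE {}) t LIMIT {} OFFSET {}",
            join, cond, limit, offset
        ),
        // Without a filter, the left join neither adds nor removes rows
        // (assumption: one downloads row per crate), so the total is just
        // the size of `crates`.
        None => format!(
            "SELECT *, COALESCE((SELECT COUNT(*) FROM crates), 0) FROM ({}) t LIMIT {} OFFSET {}",
            join, limit, offset
        ),
    }
}

fn main() {
    println!("{}", listing_sql(None, 100, 1000));
    println!("{}", listing_sql(Some("crates.name LIKE 'a%'"), 100, 1000));
}
```

In the unfiltered shape the count is an uncorrelated subquery, so it can be evaluated once instead of as a window function over every joined row.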

This implicitly relies on the fact that we're only changing the select
clause in branches where we're also setting a where clause. Diesel 2
will probably have a feature that lets us avoid this. We could also
refactor the "exact match" check to be client side instead of the DB and
get rid of all the cases where we modify the select clause.
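
A minimal sketch of what a client-side "exact match" check could look like. This is not the code in this commit. The names are hypothetical, and the normalization (lowercase, `-` treated as `_`) is an assumption about what the SQL function `canon_crate_name` does, since its definition is not shown here.

```rust
/// Assumed canonical form: lowercase, with `-` replaced by `_`.
/// (Assumption: mirrors the SQL `canon_crate_name` function.)
fn canon_crate_name(name: &str) -> String {
    name.to_lowercase().replace('-', "_")
}

/// Hypothetical client-side replacement for the exact-match column that the
/// query currently selects from the database.
fn is_exact_match(crate_name: &str, query: &str) -> bool {
    canon_crate_name(crate_name) == canon_crate_name(query)
}

fn main() {
    let names = ["serde-json", "serde", "Serde_JSON"];
    let flags: Vec<bool> = names
        .iter()
        .map(|name| is_exact_match(name, "serde_json"))
        .collect();
    println!("{:?}", flags); // prints [true, false, true]
}
```

With the flag computed after loading, no branch would need to replace the select clause. Any ordering that puts exact matches first would still have to be handled separately.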

Before:

```
 Limit  (cost=427.87..470.65 rows=100 width=877) (actual time=109.698..109.739 rows=100 loops=1)
   ->  WindowAgg  (cost=0.14..10119.91 rows=23659 width=877) (actual time=109.277..109.697 rows=1100 loops=1)
         ->  Nested Loop Left Join  (cost=0.14..9966.13 rows=23659 width=869) (actual time=0.051..85.429 rows=23659 loops=1)
               ->  Index Scan using index_crates_name_ordering on crates  (cost=0.08..7604.30 rows=23659 width=860) (actual time=0.037..34.975 rows=23659 loops=1)
               ->  Index Scan using recent_crate_downloads_crate_id on recent_crate_downloads  (cost=0.06..0.10 rows=1 width=12) (actual time=0.002..0.002 rows=1 loops=23659)
                     Index Cond: (crate_id = crates.id)
 Planning time: 1.307 ms
 Execution time: 111.840 ms
```

After:

```
 Limit  (cost=1052.34..1094.76 rows=100 width=877) (actual time=11.536..12.026 rows=100 loops=1)
   InitPlan 1 (returns $0)
     ->  Aggregate  (cost=627.96..627.96 rows=1 width=8) (actual time=4.966..4.966 rows=1 loops=1)
           ->  Index Only Scan using packages_pkey on crates crates_1  (cost=0.06..616.13 rows=23659 width=0) (actual time=0.015..3.513 rows=23659 loops=1)
                 Heap Fetches: 811
   ->  Subquery Scan on t  (cost=0.14..10037.11 rows=23659 width=877) (actual time=5.019..11.968 rows=1100 loops=1)
         ->  Nested Loop Left Join  (cost=0.14..9966.13 rows=23659 width=869) (actual time=0.051..6.831 rows=1100 loops=1)
               ->  Index Scan using index_crates_name_ordering on crates  (cost=0.08..7604.30 rows=23659 width=860) (actual time=0.038..3.331 rows=1100 loops=1)
               ->  Index Scan using recent_crate_downloads_crate_id on recent_crate_downloads  (cost=0.06..0.10 rows=1 width=12) (actual time=0.003..0.003 rows=1 loops=1100)
                     Index Cond: (crate_id = crates.id)
 Planning time: 1.377 ms
 Execution time: 12.106 ms
```
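
The `loops=` figures on the inner index scan (23659 before, 1100 after) can be reproduced with a small in-memory model. This is only a sketch: the functions and the `lookup_downloads` stand-in are hypothetical, the offset of 1000 is inferred from `rows=1100` with a limit of 100, and it models row counts, not how PostgreSQL actually executes either plan.

```rust
use std::cell::Cell;

type Row = (u32, Option<i64>);

/// Stand-in for one index lookup into `recent_crate_downloads`.
fn lookup_downloads(crate_id: u32, lookups: &Cell<u32>) -> Option<i64> {
    lookups.set(lookups.get() + 1);
    Some(i64::from(crate_id)) // made-up download count
}

/// Old shape: the window count needs every joined row before LIMIT applies.
fn page_with_window_count(
    ids: &[u32],
    limit: usize,
    offset: usize,
    lookups: &Cell<u32>,
) -> (Vec<Row>, i64) {
    let joined: Vec<Row> = ids
        .iter()
        .map(|&id| (id, lookup_downloads(id, lookups)))
        .collect();
    let total = joined.len() as i64;
    (joined.into_iter().skip(offset).take(limit).collect(), total)
}

/// New shape: count the table on its own, then join only the rows that are
/// scanned to satisfy OFFSET + LIMIT.
fn page_with_separate_count(
    ids: &[u32],
    limit: usize,
    offset: usize,
    lookups: &Cell<u32>,
) -> (Vec<Row>, i64) {
    let total = ids.len() as i64;
    let scanned: Vec<Row> = ids
        .iter()
        .take(offset + limit)
        .map(|&id| (id, lookup_downloads(id, lookups)))
        .collect();
    (scanned.into_iter().skip(offset).collect(), total)
}

fn main() {
    let ids: Vec<u32> = (1..=23659).collect();
    let (before, after) = (Cell::new(0), Cell::new(0));
    let old = page_with_window_count(&ids, 100, 1000, &before);
    let new = page_with_separate_count(&ids, 100, 1000, &after);
    assert_eq!(old, new); // same page, same total
    println!("lookups before: {}, after: {}", before.get(), after.get());
}
```

Both functions return the same page and total. Only the number of join lookups differs.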
bors committed Mar 20, 2019
2 parents 8658e49 + c9f4394 commit 907a2d4
Showing 1 changed file with 32 additions and 8 deletions.
40 changes: 32 additions & 8 deletions src/controllers/krate/search.rs
@@ -1,5 +1,6 @@
 //! Endpoint for searching and discovery functionality
+use diesel::sql_types::{NotNull, Nullable};
 use diesel_full_text_search::*;

 use crate::controllers::helpers::Paginate;
@@ -41,17 +42,20 @@ pub fn search(req: &mut dyn Request) -> CargoResult<Response> {
         .get("sort")
         .map(|s| &**s)
         .unwrap_or("recent-downloads");
+    let mut has_filter = false;

+    let selection = (
+        ALL_COLUMNS,
+        false.into_sql::<Bool>(),
+        recent_crate_downloads::downloads.nullable(),
+    );
     let mut query = crates::table
         .left_join(recent_crate_downloads::table)
-        .select((
-            ALL_COLUMNS,
-            false.into_sql::<Bool>(),
-            recent_crate_downloads::downloads.nullable(),
-        ))
+        .select(selection)
         .into_boxed();

     if let Some(q_string) = params.get("q") {
+        has_filter = true;
         if !q_string.is_empty() {
             let sort = params.get("sort").map(|s| &**s).unwrap_or("relevance");
             let q = plainto_tsquery(q_string);
@@ -75,6 +79,7 @@ pub fn search(req: &mut dyn Request) -> CargoResult<Response> {
     }

     if let Some(cat) = params.get("category") {
+        has_filter = true;
         query = query.filter(
             crates::id.eq_any(
                 crates_categories::table
@@ -90,6 +95,7 @@ pub fn search(req: &mut dyn Request) -> CargoResult<Response> {
     }

     if let Some(kw) = params.get("keyword") {
+        has_filter = true;
         query = query.filter(
             crates::id.eq_any(
                 crates_keywords::table
@@ -99,6 +105,7 @@ pub fn search(req: &mut dyn Request) -> CargoResult<Response> {
             ),
         );
     } else if let Some(letter) = params.get("letter") {
+        has_filter = true;
         let pattern = format!(
             "{}%",
             letter
@@ -110,6 +117,7 @@ pub fn search(req: &mut dyn Request) -> CargoResult<Response> {
         );
         query = query.filter(canon_crate_name(crates::name).like(pattern));
     } else if let Some(user_id) = params.get("user_id").and_then(|s| s.parse::<i32>().ok()) {
+        has_filter = true;
         query = query.filter(
             crates::id.eq_any(
                 crate_owners::table
@@ -120,6 +128,7 @@ pub fn search(req: &mut dyn Request) -> CargoResult<Response> {
             ),
         );
     } else if let Some(team_id) = params.get("team_id").and_then(|s| s.parse::<i32>().ok()) {
+        has_filter = true;
         query = query.filter(
             crates::id.eq_any(
                 crate_owners::table
@@ -130,6 +139,7 @@ pub fn search(req: &mut dyn Request) -> CargoResult<Response> {
             ),
         );
     } else if params.get("following").is_some() {
+        has_filter = true;
         query = query.filter(
             crates::id.eq_any(
                 follows::table
@@ -151,9 +161,23 @@ pub fn search(req: &mut dyn Request) -> CargoResult<Response> {

     // The database query returns a tuple within a tuple, with the root
     // tuple containing 3 items.
-    let data = query
-        .paginate(limit, offset)
-        .load::<((Crate, bool, Option<i64>), i64)>(&*conn)?;
+    let data = if has_filter {
+        query
+            .paginate(limit, offset)
+            .load::<((Crate, bool, Option<i64>), i64)>(&*conn)?
+    } else {
+        sql_function!(fn coalesce<T: NotNull>(value: Nullable<T>, default: T) -> T);
+        query
+            .select((
+                // FIXME: Use `query.selection()` if that feature ends up in
+                // Diesel 2.0
+                selection,
+                coalesce(crates::table.count().single_value(), 0),
+            ))
+            .limit(limit)
+            .offset(offset)
+            .load(&*conn)?
+    };
     let total = data.first().map(|&(_, t)| t).unwrap_or(0);
     let perfect_matches = data.iter().map(|&((_, b, _), _)| b).collect::<Vec<_>>();
     let recent_downloads = data