Commit 795a8d9
💥 Improve database performance
This is a **BREAKING CHANGE** that aims to improve database performance
by changing the queries and indexes used to perform operations. There
will be no changes visible to consumers of the JavaScript API.
The break will:
- require consumers to run a migration query before upgrading
- add new indexes
- allow dropping of an old index
More details on migration are at the bottom of this commit message.
Motivation
----------
This library seems to have been written with the assumption that its
collections are small. However, we have hundreds of thousands of jobs in
various queues on Production, which causes some slow queries because of
the design choices made in the schema of this library.
In particular, we aim to address two issues:
1. `{deleted: {$exists: boolean}}` calls are inefficient, but used by
practically every query in this library
2. counting inflight jobs has awful performance when there are many
available jobs
With both issues addressed, every query the library issues can be
resolved from an index, with no further filtering of fetched documents.
`$exists`
---------
MongoDB's `$exists` operator has [tricky index performance][1]. In the
best cases, `$exists: true` can use the index, but only if it's sparse,
and `$exists: false` can **never** just use the index: it always needs
to fetch documents.
In order to avoid the constant use of `$exists` in this library, we rely
on a logical paradigm shift: we `$unset` `visible` when acking the job,
so that `visible` and `deleted` are mutually exclusive fields. Therefore
we can:
- add a sparse index for both of these fields
- query on the field we care about, and **know** that it implies the
absence of the other field, allowing the removal of `$exists`
assertions
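As a rough sketch of what this shift means for the queries, the snippet below shows hypothetical before/after filter shapes and the update applied on ack. The function names, the exact filters, and the two single-field sparse index definitions are assumptions for illustration; only the field names and the `$unset` of `visible` come from this commit message.

```js
// Sketch only: field names come from the commit message; the filter,
// update, and index shapes are assumptions, not copied from the library.

// Before: every query also had to assert that `deleted` is absent.
function pickupFilterBefore(now) {
  return {deleted: {$exists: false}, visible: {$lte: now}};
}

// After: acked jobs no longer have `visible`, so matching on `visible`
// alone already excludes them.
function pickupFilterAfter(now) {
  return {visible: {$lte: now}};
}

// Acking sets `deleted` and removes `visible`, which keeps the two
// fields mutually exclusive.
function ackUpdate(now) {
  return {$set: {deleted: now}, $unset: {visible: 1}};
}

// Sparse indexes skip documents that lack the indexed field
// (hypothetical definitions, one per field).
const sparseIndexes = [
  {keys: {visible: 1}, options: {sparse: true}},
  {keys: {deleted: 1}, options: {sparse: true}},
];

console.log(JSON.stringify(pickupFilterAfter('2024-01-01T00:00:00.000Z')));
```

With the Node.js driver, each entry could be passed as `collection.createIndex(keys, options)`.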
Note that in local testing I observed [strange behaviour][2] with the
partial index added for `inFlight()` (described below): MongoDB only
uses that index when the query says `ack: {$gt: ''}` rather than
`ack: {$exists: true}`.
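The two predicates are only interchangeable under an assumption: that `ack` is either absent or a non-empty string, because MongoDB comparison operators only match values of the same type as the operand. The in-memory sketch below (plain JavaScript stand-ins, not MongoDB itself, with made-up sample documents) illustrates that both predicates select the same jobs under that assumption.

```js
// Assumption: `ack` is either absent or a non-empty string token.
const jobs = [
  {_id: 1, visible: '2024-01-01T00:00:00.000Z'},              // never retrieved
  {_id: 2, visible: '2024-01-01T00:05:00.000Z', ack: 'a1b2'}, // retrieved
  {_id: 3, visible: '2024-01-01T00:07:00.000Z', ack: 'c3d4'}, // retrieved
];

// In-memory stand-in for {ack: {$exists: true}}
const hasAck = (job) => 'ack' in job;

// In-memory stand-in for {ack: {$gt: ''}}: string values sorting after ''
const ackGtEmpty = (job) => typeof job.ack === 'string' && job.ack > '';

const byExists = jobs.filter(hasAck).map((job) => job._id);
const byGtEmpty = jobs.filter(ackGtEmpty).map((job) => job._id);

console.log(byExists, byGtEmpty); // both select jobs 2 and 3
```

If `ack` could ever hold an empty string or a non-string value, the two predicates would diverge and the substitution would no longer be safe.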
`inFlight()`
------------
The existing `inFlight()` query performs particularly badly when there
are many (hundreds of thousands of) jobs available to pick up.
This is because the existing query uses the `deleted_1_visible_1` index,
but even after filtering by `deleted` and `visible` with the index, the
database will need to fetch every single job that could be picked up,
and check for `ack`, which is very slow.
We improve the performance here by:
- the removal of the `$exists` query (see above)
- the addition of a partial index that only contains unacked jobs that
have been retrieved at some point by `get()`. We can then filter
these by the current time to find in-flight jobs
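A minimal sketch of the two points above follows. The index keys, the `partialFilterExpression`, and the filter shape are assumptions (the actual definitions live in the diff, which is not reproduced here), and `countInFlight` is a hypothetical in-memory stand-in for running the filter against MongoDB. Timestamps are assumed to be ISO 8601 strings, which compare correctly as plain strings.

```js
// Hypothetical partial index: only documents that have an `ack` token
// (i.e. were retrieved at some point) are indexed.
const inFlightIndex = {
  keys: {ack: 1, visible: 1},
  options: {partialFilterExpression: {ack: {$exists: true}}},
};

// Hypothetical in-flight filter: retrieved, and not yet visible again.
// Acked jobs have no `visible`, so they cannot match.
function inFlightFilter(now) {
  return {ack: {$gt: ''}, visible: {$gt: now}};
}

// In-memory stand-in for counting documents that match the filter above.
function countInFlight(jobs, now) {
  return jobs.filter(
    (job) =>
      typeof job.ack === 'string' && job.ack > '' &&
      typeof job.visible === 'string' && job.visible > now,
  ).length;
}

const now = '2024-01-01T00:00:00.000Z';
const jobs = [
  {visible: '2023-12-31T23:00:00.000Z'},              // available, never retrieved
  {visible: '2024-01-01T00:05:00.000Z', ack: 'a1b2'}, // retrieved, in flight
  {visible: '2023-12-31T23:59:00.000Z', ack: 'c3d4'}, // retrieved, timed out
  {deleted: '2023-12-31T23:30:00.000Z', ack: 'e5f6'}, // acked: no `visible`
];

console.log(countInFlight(jobs, now)); // 1
```

The large pool of available-but-never-retrieved jobs has no `ack`, so under these assumptions it never enters the partial index at all.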
Migration path
--------------
This performance improvement is built upon a shift in the assumptions
made about underlying job structure: namely, that `deleted` and
`visible` are now mutually exclusive properties (which was not true
before).
1. Bump patch version to [`7.1.1`][3]: this will start removing the
`visible` property from acked jobs in a non-breaking way
2. Deploy the patch to Production
3. Update any existing documents to match this new schema:
```js
db.collection.updateMany(
{deleted: {$exists: true}},
{$unset: {visible: 1}},
)
```
4. Bump major version to `8.0.0` and deploy
5. Drop old index `deleted_1_visible_1`, which is no longer used
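Before step 4 it may be worth confirming that no document still carries both fields. The sketch below is an assumption about how one might check this, not part of the commit: `conflictFilter` is a filter that could be passed to `db.collection.countDocuments()`, and the two functions are in-memory stand-ins that mimic the step 3 update and the check on plain objects.

```js
// Documents matching this filter still violate the new schema
// (acked jobs that kept their `visible` field).
const conflictFilter = {deleted: {$exists: true}, visible: {$exists: true}};

// In-memory stand-in for a query using `conflictFilter`.
function findConflicts(jobs) {
  return jobs.filter((job) => 'deleted' in job && 'visible' in job);
}

// In-memory stand-in for the step 3 `updateMany`: drop `visible` from
// every document that has `deleted`.
function unsetVisibleOnDeleted(jobs) {
  return jobs.map((job) => {
    if (!('deleted' in job)) return job;
    const {visible, ...rest} = job;
    return rest;
  });
}

const before = [
  {_id: 1, visible: '2024-01-01T00:00:00.000Z'},
  {_id: 2, visible: '2024-01-01T00:00:00.000Z', deleted: '2024-01-01T00:01:00.000Z'},
];
const after = unsetVisibleOnDeleted(before);

console.log(findConflicts(before).length, findConflicts(after).length); // 1 0
```

A count of zero for `conflictFilter` on the real collection would suggest the data matches the mutually exclusive schema that `8.0.0` relies on.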
[1]: https://www.mongodb.com/docs/manual/reference/operator/query/exists/#use-a-sparse-index-to-improve--exists-performance
[2]: https://www.mongodb.com/community/forums/t/partial-index-is-not-used-during-search/290507/2
[3]: #161
parent d4da1aa, commit 795a8d9
2 files changed, +14 -8 lines changed