Commit 795a8d9

💥 Improve database performance
This is a **BREAKING CHANGE** that aims to improve database performance by changing the queries and indexes used to perform operations. There will be no changes visible to consumers of the JavaScript API.

The break will:

- require consumers to run a migration query before upgrading
- add new indexes
- allow dropping of an old index

More details on migration are at the bottom of this commit message.

Motivation
----------

This library seems to have been written with the assumption that its collections are small. However, we have hundreds of thousands of jobs in various queues on Production, which causes some slow queries because of the design choices made in the schema of this library.

In particular, we aim to address two issues:

1. `{deleted: {$exists: boolean}}` calls are inefficient, but used by practically every query in this library
2. counting in-flight jobs has awful performance when there are many available jobs

The result is that no further filtering is needed beyond the index on any of the issued queries.

`$exists`
---------

MongoDB's `$exists` operator has [tricky index performance][1]. In the best cases, `$exists: true` can use the index, but only if it's sparse, and `$exists: false` can **never** just use the index: it always needs to fetch documents.

In order to avoid the constant use of `$exists` in this library, we rely on a logical paradigm shift: we `$unset` `visible` when acking the job, so that `visible` and `deleted` are mutually exclusive fields.

Therefore we can:

- add a sparse index for each of these fields
- query on the field we care about, and **know** that it implies the absence of the other field, allowing the removal of `$exists` assertions

Note that in local testing, I observed [strange behaviour][2] when trying to use the partial index added for `inFlight()`: we have to use `ack: {$gt: ''}` instead of `ack: {$exists: true}` to get MongoDB to use the index, for reasons that are unclear.
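The mutual exclusion of `visible` and `deleted` described above can be sketched in memory. This is only an illustration under stated assumptions: `JobDoc`, `buildAckUpdate`, `applyAck` and `availableFilter` are hypothetical names, timestamps are assumed to be ISO strings, and acking is assumed to mark a job by setting `deleted`. It is not the library's actual code.

```typescript
// Hypothetical document shape; field names follow the commit message.
interface JobDoc {
  payload: unknown;
  visible?: string; // ISO timestamp, present only while the job is not acked
  ack?: string; // ack token, set once the job has been retrieved
  deleted?: string; // ISO timestamp, present only once the job is acked
}

// Assumed shape of the update applied when acking: mark the job as deleted
// and remove `visible` so the two fields never coexist.
function buildAckUpdate(nowIso: string) {
  return {
    $set: {deleted: nowIso},
    $unset: {visible: 1 as const},
  };
}

// In-memory stand-in for what MongoDB would do with the update above.
function applyAck(doc: JobDoc, nowIso: string): JobDoc {
  const acked: JobDoc = {...doc, ...buildAckUpdate(nowIso).$set};
  delete acked.visible; // mirrors $unset: {visible: 1}
  return acked;
}

// With the invariant in place, "available" needs no clause on `deleted`:
// a document that has `visible` cannot also have `deleted`.
function availableFilter(nowIso: string) {
  return {visible: {$lte: nowIso}};
}
```

Because an acked document no longer has `visible`, it drops out of a sparse index on `visible`, which is what lets the queries omit `deleted: {$exists: false}`.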
`inFlight()`
------------

The existing `inFlight()` query has particularly bad performance in cases where there are many (hundreds of thousands of) jobs available to pick up.

This is because the existing query uses the `deleted_1_visible_1` index, but even after filtering by `deleted` and `visible` with the index, the database will need to fetch every single job that could be picked up, and check for `ack`, which is very slow.

We improve the performance here by:

- the removal of the `$exists` query (see above)
- the addition of a partial index that only contains unacked jobs that have been retrieved at some point by `get()`. We can then filter these by the current time to find in-flight jobs

Migration path
--------------

This performance improvement is built upon a shift in the assumptions made about underlying job structure: namely, that `deleted` and `visible` are now mutually exclusive properties (which was not true before).

1. Bump patch version to [`7.1.1`][3]: this will start removing the `visible` property from acked jobs in a non-breaking way
2. Deploy the patch to Production
3. Update any existing documents to match this new schema:

   ```js
   db.collection.updateMany(
     {deleted: {$exists: true}},
     {$unset: {visible: 1}},
   )
   ```

4. Bump major version to `8.0.0` and deploy
5. Drop old index `deleted_1_visible_1`, which is no longer used

[1]: https://www.mongodb.com/docs/manual/reference/operator/query/exists/#use-a-sparse-index-to-improve--exists-performance
[2]: https://www.mongodb.com/community/forums/t/partial-index-is-not-used-during-search/290507/2
[3]: #16
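Before step 4 of the migration path, it may be worth confirming that step 3 left no acked jobs that still carry `visible`. The following is a minimal sketch, not part of this library: `CountableCollection`, `UNMIGRATED_FILTER`, `countUnmigrated` and `assertMigrated` are hypothetical names, and the interface is a structural stand-in that a MongoDB Node.js driver collection is assumed to satisfy through its `countDocuments()` method.

```typescript
// Minimal structural type so the sketch does not depend on the driver.
// Assumption: the real driver's Collection is compatible with this shape.
interface CountableCollection {
  countDocuments(filter: Record<string, unknown>): Promise<number>;
}

// Acked jobs that still carry `visible` break the new assumption that
// `deleted` and `visible` are mutually exclusive.
const UNMIGRATED_FILTER = {
  deleted: {$exists: true},
  visible: {$exists: true},
};

// Hypothetical helper: how many documents the step 3 update has not reached.
async function countUnmigrated(col: CountableCollection): Promise<number> {
  return col.countDocuments(UNMIGRATED_FILTER);
}

// Hypothetical guard to run before deploying the major version.
async function assertMigrated(col: CountableCollection): Promise<void> {
  const remaining = await countUnmigrated(col);
  if (remaining > 0) {
    throw new Error(`${remaining} acked job(s) still have "visible" set; rerun the step 3 update`);
  }
}
```

With the real driver this might be called as `await assertMigrated(db.collection('my-queue'))`, where the collection name is a placeholder. The check uses `$exists`, which is slow for the reasons given above, so it is meant as a one-off verification rather than something to run regularly.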
1 parent d4da1aa commit 795a8d9

2 files changed

+14
-8
lines changed

‎mongodb-queue.ts‎

Lines changed: 13 additions & 7 deletions
@@ -85,9 +85,17 @@ export class MongoDBQueue<T = any> {
 
   public async createIndexes(): Promise<void> {
     await Promise.all([
-      this.col.createIndex({deleted: 1, visible: 1}),
+      this.col.createIndex({visible: 1}, {sparse: true}),
       this.col.createIndex({ack: 1}, {unique: true, sparse: true}),
       this.col.createIndex({deleted: 1}, {sparse: true}),
+
+      // Index for efficient counts on in-flight
+      this.col.createIndex({visible: 1, ack: 1}, {
+        partialFilterExpression: {
+          visible: {$exists: true},
+          ack: {$exists: true},
+        },
+      }),
     ]);
   }
 
@@ -121,7 +129,6 @@ export class MongoDBQueue<T = any> {
   public async get(opts: GetOptions = {}): Promise<ExternalMessage<T> | null> {
     const visibility = opts.visibility || this.visibility;
     const query: Filter<Partial<Message<T>>> = {
-      deleted: {$exists: false},
       visible: {$lte: now()},
     };
     const sort: Sort = {
@@ -172,7 +179,6 @@ export class MongoDBQueue<T = any> {
     const query: Filter<Partial<Message<T>>> = {
       ack: ack,
       visible: {$gt: now()},
-      deleted: {$exists: false},
     };
     const update: UpdateFilter<Message<T>> = {
       $set: {
@@ -202,7 +208,6 @@ export class MongoDBQueue<T = any> {
     const query: Filter<Partial<Message<T>>> = {
       ack: ack,
       visible: {$gt: now()},
-      deleted: {$exists: false},
     };
     const update: UpdateFilter<Message<T>> = {
       $set: {
@@ -237,16 +242,17 @@ export class MongoDBQueue<T = any> {
 
   public async size(): Promise<number> {
     return this.col.countDocuments({
-      deleted: {$exists: false},
       visible: {$lte: now()},
     });
   }
 
   public async inFlight(): Promise<number> {
     return this.col.countDocuments({
-      ack: {$exists: true},
+      // For some unknown reason, MongoDB refuses to use the partial index with
+      // {$exists: true}, but *will* use it if we use {$gt: ''}
+      // https://www.mongodb.com/community/forums/t/partial-index-is-not-used-during-search/290507/2
+      ack: {$gt: ''},
       visible: {$gt: now()},
-      deleted: {$exists: false},
     });
   }
 
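The relationship between the new partial index and the `inFlight()` filter in the diff above can be modelled in memory. This is a sketch under assumptions, not the library's code: `JobState`, `inPartialIndex`, `isInFlight` and `countInFlight` are hypothetical names, `ack` is assumed to be a string token, and timestamps are assumed to be ISO strings, which compare correctly as plain strings.

```typescript
// Hypothetical subset of a job document, limited to the fields the index uses.
interface JobState {
  visible?: string; // ISO timestamp, absent once the job is acked
  ack?: string; // ack token, present once get() has retrieved the job
  deleted?: string; // ISO timestamp, present once the job is acked
}

// Mirrors partialFilterExpression:
// {visible: {$exists: true}, ack: {$exists: true}}
function inPartialIndex(job: JobState): boolean {
  return job.visible !== undefined && job.ack !== undefined;
}

// Mirrors the inFlight() filter: {ack: {$gt: ''}, visible: {$gt: now}}
function isInFlight(job: JobState, nowIso: string): boolean {
  return (
    job.ack !== undefined &&
    job.ack > '' &&
    job.visible !== undefined &&
    job.visible > nowIso
  );
}

// Counting only needs to look at the (small) set of indexed jobs, then
// narrow by the current time; never-retrieved and acked jobs are skipped.
function countInFlight(jobs: JobState[], nowIso: string): number {
  return jobs
    .filter(inPartialIndex)
    .filter((job) => isInFlight(job, nowIso)).length;
}
```

In this model, a job that was never retrieved has no `ack`, and an acked job has no `visible`, so neither appears in the partial index. That is why a large backlog of available jobs no longer affects the count.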

‎package.json‎

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "@reedsy/mongodb-queue",
-  "version": "7.1.1",
+  "version": "8.0.0",
   "description": "Message queues which uses MongoDB.",
   "main": "mongodb-queue.js",
   "scripts": {
