Description
Checked Existing
- I have checked the repository for duplicate issues.
What enhancement would you like to see?
When a collection has no index that supports a query, Mongo will perform a full COLLSCAN on the collection, scanning every document to answer the query. This can very easily eat all available resources when working with:
- Large databases
- Low performance queries
- Complex aggregations
When there are many documents, and many scans of those documents, Mongo can easily max out CPU usage and remain there. When working with very large documents (such as those which contain files), this can also easily max out memory usage.
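To confirm whether a given query falls back to a collection scan, its explain output can be inspected. The following is a minimal sketch, assuming the classic explain shape where `queryPlanner.winningPlan` is a tree of stages linked by `inputStage` / `inputStages`. Newer server versions and aggregation pipelines can nest the plan differently. The helper names and the two sample documents are hypothetical and hand-written for illustration, not captured from our server.

```python
def plan_stages(plan):
    """Yield the stage name of a plan node and of every nested input stage."""
    yield plan.get("stage")
    if "inputStage" in plan:
        yield from plan_stages(plan["inputStage"])
    for child in plan.get("inputStages", []):
        yield from plan_stages(child)


def uses_collscan(explain_output):
    """Return True if the winning plan contains a COLLSCAN stage."""
    winning = explain_output["queryPlanner"]["winningPlan"]
    return "COLLSCAN" in plan_stages(winning)


# Hand-written samples shaped like explain output (illustrative only).
unindexed = {
    "queryPlanner": {
        "winningPlan": {"stage": "COLLSCAN", "direction": "forward"}
    }
}
indexed = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "FETCH",
            "inputStage": {
                "stage": "IXSCAN",
                "indexName": "task_id_1_boss_app_id_1",
            },
        }
    }
}

print(uses_collscan(unindexed))
print(uses_collscan(indexed))
```

With pymongo, the dictionary passed to `uses_collscan` could come from `collection.find(query).explain()`.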
Any other details to share? (OPTIONAL)
Currently our BOSS server is doing a COLLSCAN over a 7 GB+ collection, which has 880,975 documents, over a million times per day. This is locking up all system resources.
I propose we add the following indexes:
Task
- Compound index on task_id and boss_app_id, as that is the combination most often queried by
File
- Compound index on task_id and boss_app_id, as that is the combination most often queried by
- Single field index on name, as we sometimes query by task_id, boss_app_id, and name. This may use index intersection
- Single field index on data_id, as we also often query by just this field
CECData
- Compound index on creator_pid and game_id, as that is the combination most often queried by
- Single field index on latest_data_id, as we also often query by just this field
CECSlot
- Compound index on creator_pid and game_id, as that is the combination most often queried by
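The proposed indexes above could be written down as data and applied in one pass. This is only a sketch under assumptions: the collection names (`tasks`, `files`, `cecdatas`, `cecslots`) are guesses at how the four models map to collections, ascending order is assumed for every field, and `apply_indexes` is a hypothetical helper. The key lists use the `(field, direction)` form that pymongo's `Collection.create_index` accepts.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING

# Collection names are assumptions; adjust them to the real collections.
PROPOSED_INDEXES = {
    "tasks": [
        [("task_id", ASCENDING), ("boss_app_id", ASCENDING)],
    ],
    "files": [
        [("task_id", ASCENDING), ("boss_app_id", ASCENDING)],
        [("name", ASCENDING)],
        [("data_id", ASCENDING)],
    ],
    "cecdatas": [
        [("creator_pid", ASCENDING), ("game_id", ASCENDING)],
        [("latest_data_id", ASCENDING)],
    ],
    "cecslots": [
        [("creator_pid", ASCENDING), ("game_id", ASCENDING)],
    ],
}


def apply_indexes(db, specs=PROPOSED_INDEXES):
    """Call create_index for every proposed key list.

    `db` is anything that returns a collection via db[name], such as a
    pymongo Database. Returns the values the driver reports, in order.
    """
    created = []
    for collection_name, index_list in specs.items():
        for keys in index_list:
            created.append(db[collection_name].create_index(keys))
    return created
```

With pymongo, something like `apply_indexes(MongoClient(uri)["boss"])` would run it, where the connection URI and the database name `boss` are placeholders. Creating an index that already exists with the same definition is a no-op, so the helper should be safe to run repeatedly.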
It should be noted that indexes do not come for free, nor does index intersection. Indexes are stored on disk by Mongo and will increase our storage usage. Index intersection also has some overhead compared to regular indexed queries, but it should be better than a full COLLSCAN. We CAN make multiple compound indexes using the same fields, but this creates duplicate indexes on disk, which again increases storage costs.
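To keep an eye on the storage cost, the per-index sizes reported by the `collStats` command could be compared against the collection's data size. A rough sketch, assuming the `size` and `indexSizes` fields of the `collStats` output (both in bytes). `index_overhead` is a hypothetical helper, and the sample numbers are made up for illustration, not measurements from the BOSS database.

```python
def index_overhead(coll_stats):
    """Return (index_name, size_bytes, share_of_data_size) rows, largest first."""
    data_size = coll_stats["size"]
    rows = []
    for name, size in coll_stats["indexSizes"].items():
        share = size / data_size if data_size else 0.0
        rows.append((name, size, share))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up sample shaped like a collStats result (illustrative only).
sample_stats = {
    "size": 1000,
    "indexSizes": {"_id_": 100, "task_id_1_boss_app_id_1": 250},
}

for name, size, share in index_overhead(sample_stats):
    print(f"{name}: {size} bytes ({share:.0%} of data size)")
```

With pymongo, the input could come from `db.command("collstats", collection_name)`.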