[Enhancement]: Use indexes for Mongo collections #8

Open
@jonbarrow

Checked Existing

  • I have checked the repository for duplicate issues.

What enhancement would you like to see?

When a query cannot use an index, Mongo performs a full collection scan (COLLSCAN), examining every document in the collection to answer the query. This can very easily eat all available resources when working with:

  • Large databases
  • Poorly performing queries
  • Complex aggregations

When there are many documents, and many scans of those documents, Mongo can easily max out CPU usage and remain there. When working with very large documents (such as those which contain files), this can also easily max out memory usage.
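One way to confirm whether a given query falls back to a COLLSCAN is to inspect its explain output. The following is a minimal sketch, assuming the explain document has the usual `queryPlanner.winningPlan` shape with nested `stage` fields; the helper names and the sample plans are hypothetical and hand-written for illustration, not taken from our server.

```python
def plan_stages(node):
    """Collect every "stage" name found anywhere inside an explain plan."""
    stages = []
    if isinstance(node, dict):
        if "stage" in node:
            stages.append(node["stage"])
        for value in node.values():
            stages.extend(plan_stages(value))
    elif isinstance(node, list):
        for item in node:
            stages.extend(plan_stages(item))
    return stages


def uses_collscan(explain_output):
    """Return True if the winning plan contains a COLLSCAN stage."""
    winning = explain_output.get("queryPlanner", {}).get("winningPlan", {})
    return "COLLSCAN" in plan_stages(winning)


# Hand-written sample plans (illustrative only, not real server output).
unindexed_plan = {"queryPlanner": {"winningPlan": {"stage": "COLLSCAN"}}}
indexed_plan = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "FETCH",
            "inputStage": {"stage": "IXSCAN", "indexName": "data_id_1"},
        }
    }
}
```

With pymongo, the explain document could be obtained from something like `db.files.find({"data_id": 1}).explain()` (the collection name here is an assumption) and passed to `uses_collscan`.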

Any other details to share? (OPTIONAL)

Currently our BOSS server is doing a COLLSCAN over a 7 GB+ collection, which has 880,975 documents, over a million times per day. This is locking up all system resources.

[Image: Screenshot from 2024-05-30 23-24-06]

I propose we add the following indexes:

Task

  • Compound index on task_id and boss_app_id, as that is the combination we query by most often

File

  • Compound index on task_id and boss_app_id, as that is the combination we query by most often
  • Single field index on name, as we sometimes query by task_id, boss_app_id, and name. This relies on index intersection, which the query planner may choose but is not guaranteed to use
  • Single field index on data_id, as we also often query by just this field

CECData

  • Compound index on creator_pid and game_id, as that is the combination we query by most often
  • Single field index on latest_data_id, as we also often query by just this field

CECSlot

  • Compound index on creator_pid and game_id, as that is the combination we query by most often
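The proposed indexes could be written down as plain data and applied in one pass. This is only a sketch: the collection names (`tasks`, `files`, `cecdatas`, `cecslots`) and the `ensure_indexes` helper are assumptions for illustration, and all indexes are assumed to be ascending. It expects a database object whose collections expose a `create_index(keys)` method, as pymongo's do.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING

# Collection names are assumed; field names come from the proposal above.
PROPOSED_INDEXES = {
    "tasks": [
        [("task_id", ASCENDING), ("boss_app_id", ASCENDING)],
    ],
    "files": [
        [("task_id", ASCENDING), ("boss_app_id", ASCENDING)],
        [("name", ASCENDING)],
        [("data_id", ASCENDING)],
    ],
    "cecdatas": [
        [("creator_pid", ASCENDING), ("game_id", ASCENDING)],
        [("latest_data_id", ASCENDING)],
    ],
    "cecslots": [
        [("creator_pid", ASCENDING), ("game_id", ASCENDING)],
    ],
}


def ensure_indexes(db, specs=PROPOSED_INDEXES):
    """Create every proposed index and return whatever create_index returns.

    `db` is anything that supports db[collection_name].create_index(keys),
    for example a pymongo Database.
    """
    created = []
    for collection_name, index_list in specs.items():
        for keys in index_list:
            created.append(db[collection_name].create_index(keys))
    return created
```

Keeping the definitions in one structure makes it easy to review the full set of indexes, and their combined storage cost, in a single place.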

It should be noted that indexes do not come for free, nor does index intersection. Indexes are stored on disk by Mongo and will increase our storage usage. Index intersection also has some overhead compared to a query served by a single index, but it should still be better than a full COLLSCAN. We CAN make multiple compound indexes using the same fields, but this stores the same fields in several indexes on disk, which again increases storage costs.
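To keep an eye on the storage cost, the index sizes reported by collection statistics can be compared against the data size. A rough sketch, assuming a stats document with `size` and `indexSizes` fields as returned by the `collStats` command; the helper name and the sample numbers are made up for illustration.

```python
def index_storage_report(coll_stats):
    """Summarize how much space indexes take relative to the data itself."""
    data_bytes = coll_stats["size"]
    index_sizes = coll_stats.get("indexSizes", {})
    index_bytes = sum(index_sizes.values())
    return {
        "data_bytes": data_bytes,
        "index_bytes": index_bytes,
        "index_to_data_ratio": index_bytes / data_bytes if data_bytes else 0.0,
        # Largest index first, to spot the most expensive ones quickly.
        "largest_first": sorted(
            index_sizes.items(), key=lambda item: item[1], reverse=True
        ),
    }


# Made-up numbers, only to show the shape of the input and output.
sample_stats = {
    "size": 1000,
    "indexSizes": {"_id_": 50, "task_id_1_boss_app_id_1": 150},
}
report = index_storage_report(sample_stats)
```

With pymongo, the stats document could come from something like `db.command("collstats", "files")`, where the collection name is again an assumption.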

Metadata

    Labels

    approved (The topic is approved by a developer), enhancement (An update to an existing part of the codebase)