Large scale date sorting / filtering #1875

Closed
maciej-zarzeczny opened this issue May 24, 2021 · 4 comments
Labels
Data UI Bug is related to Data frontend functionality HTD Large P1: Launch blocker Needs fixing before we launch, schedule some time to investigate & fix

Comments

@maciej-zarzeczny
Contributor

Try to improve the performance of sorting and filtering on large data queries.

@maciej-zarzeczny maciej-zarzeczny added Data UI Bug is related to Data frontend functionality P1: Launch blocker Needs fixing before we launch, schedule some time to investigate & fix labels May 24, 2021
@maciej-zarzeczny maciej-zarzeczny added this to the Strongcat milestone May 24, 2021
@abhidg
Contributor

abhidg commented May 24, 2021

Likely an indexing issue, since at the moment the confirmed date is located within the events array. It should be fixed by adding a top-level confirmedDate field that holds just the date, and indexing that field.
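As a rough illustration of the confirmedDate idea, here is a minimal Python sketch that copies the confirmed date out of the events array into a top-level field. The event shape (`name` equal to `"confirmed"`, date stored under `dateRange.start`) and the helper names are assumptions for illustration; the thread does not spell out the schema.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def confirmed_date(case: Dict[str, Any]) -> Optional[datetime]:
    """Return the start date of the first 'confirmed' event, or None.

    Assumes each event looks like {"name": ..., "dateRange": {"start": ...}}.
    """
    for event in case.get("events", []):
        if event.get("name") == "confirmed":
            return event.get("dateRange", {}).get("start")
    return None


def with_confirmed_date(case: Dict[str, Any]) -> Dict[str, Any]:
    """Return a shallow copy of the case with a top-level confirmedDate.

    The field is only added when a confirmed event with a date exists.
    """
    updated = dict(case)
    date = confirmed_date(case)
    if date is not None:
        updated["confirmedDate"] = date
    return updated


# Single-field index on the new field, as a pymongo-style key list
# (hypothetical; sort direction would depend on the most common query).
CONFIRMED_DATE_INDEX = [("confirmedDate", -1)]
```

With a plain top-level date, a range filter or sort can use an ordinary single-field index instead of reaching into array elements.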

@maciej-zarzeczny maciej-zarzeczny self-assigned this Jun 1, 2021
@abhidg abhidg self-assigned this Jun 9, 2021
@abhidg abhidg modified the milestones: Strongcat, Proxy Jul 1, 2021
@joe-brilliant joe-brilliant modified the milestones: Proxy, Strongcat Jul 7, 2021
@abhidg abhidg modified the milestones: Strongcat, Proxy Jul 10, 2021
@iamleeg
Contributor

iamleeg commented Sep 13, 2021

I don't know whether there are specific expectations of what to speed up here, but I see a small number of queries in prod that are still slow:

  • A couple of times someone searched for verification status EXCLUDED, sourceId {{Taiwan source}}, and confirmation date between 1 Dec 2019 and now. That seems oddly specific, but it may be an important case.
  • Someone was searching for cases by one curator (maybe themselves?) and their query came out as a text search for curatoremail:{{the email address}}, so this field wasn't parsed by the back end. Maybe it should be parsed, and indexed? They were also searching for country:Deutschland, which is not going to match, but that will be fixed by having the country drop-down in the search form.
  • The aggregations that generate the map data do a whole collection scan. They search for list:true, then (for example) one of them projects the country fields and then buckets by country. Maybe the bucketing could be done before the projection to speed it up.
  • All of the mongoexport commands that rely on list:true are slow. We probably need to find another way to deal with that parser limitation.
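To make the "bucket before projecting" suggestion for the map aggregations concrete, here is a hedged sketch of such a pipeline written as plain Python data. The field name `location.country`, the output field `casecount`, and the function name are assumptions; only `list: true` comes from the queries described above. Whether this ordering is actually faster would need to be checked with explain on the real collection.

```python
from typing import Any, Dict, List


def country_counts_pipeline(
    country_field: str = "location.country",
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that groups first, then projects.

    The country field name is a guess at the schema, not taken from the thread.
    """
    return [
        # Same filter the existing aggregations use.
        {"$match": {"list": True}},
        # Bucket by country straight away, so later stages see one
        # document per country rather than one per case.
        {"$group": {"_id": f"${country_field}", "casecount": {"$sum": 1}}},
        # Reshape the (now small) grouped output.
        {"$project": {"_id": 0, "country": "$_id", "casecount": 1}},
    ]
```

With pymongo this would be passed as `collection.aggregate(country_counts_pipeline())`. The initial `$match` on list:true still scans the collection unless an index covers it.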

@iamleeg
Contributor

iamleeg commented Sep 21, 2021

> a couple times someone searched for verification status is EXCLUDED and sourceId is {{Taiwan source}} and confirmation date between 1 dec 2019 and now, seems a bit weirdly specific but may be an important case

This just came up again as the slowest query on the cluster in the graph window, so someone is doing it reasonably frequently.

@iamleeg
Contributor

iamleeg commented Sep 21, 2021

A couple of indexes recommended by the Atlas automatic tool:

caseReference.sourceId: 1
caseReference.verificationStatus: 1

and

list: 1
revisionMetadata.creationMetadata.date: -1
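For reference, a minimal sketch of how these two recommendations could be written down and applied from Python, reading each pair of fields as one compound index (that reading is an assumption). The function name is hypothetical, and `collection` is assumed to behave like a pymongo `Collection`, whose `create_index` accepts a list of `(field, direction)` pairs.

```python
from typing import Any, List, Tuple

ASCENDING, DESCENDING = 1, -1

# Each inner list is one compound index: (field, direction) pairs in order.
RECOMMENDED_INDEXES: List[List[Tuple[str, int]]] = [
    [
        ("caseReference.sourceId", ASCENDING),
        ("caseReference.verificationStatus", ASCENDING),
    ],
    [
        ("list", ASCENDING),
        ("revisionMetadata.creationMetadata.date", DESCENDING),
    ],
]


def create_recommended_indexes(collection: Any) -> List[str]:
    """Create each recommended index and return whatever create_index returns.

    `collection` only needs a create_index(keys) method, as pymongo provides.
    """
    return [collection.create_index(keys) for keys in RECOMMENDED_INDEXES]
```

Keeping the specs as data makes it easy to mirror them in a migration with a matching down step that drops the same indexes.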

iamleeg added a commit that referenced this issue Sep 22, 2021
Fixed a breaking bug in the down migration #1875
iamleeg added a commit that referenced this issue Sep 22, 2021
Fixed a breaking bug in the down migration #1875
iamleeg added a commit that referenced this issue Oct 1, 2021
Fixed a breaking bug in the down migration #1875
iamleeg added a commit that referenced this issue Nov 12, 2021
iamleeg added a commit that referenced this issue Nov 12, 2021
@joe-brilliant joe-brilliant added P2: Nice to have This would be nice to have, if we have time we will fix it, if not good to launch and fix later when and removed P1: Launch blocker Needs fixing before we launch, schedule some time to investigate & fix labels Dec 7, 2021
@joe-brilliant joe-brilliant modified the milestones: Proxy, Lewis Dec 7, 2021
@joe-brilliant joe-brilliant modified the milestones: Lewis, Holding Bin Feb 1, 2022
@joe-brilliant joe-brilliant modified the milestones: Holding Bin, Leonidas Feb 15, 2022
@joe-brilliant joe-brilliant removed this from the Holding Bin milestone Feb 24, 2022
@joe-brilliant joe-brilliant added this to the Koa milestone Feb 24, 2022
@joe-brilliant joe-brilliant added P1: Launch blocker Needs fixing before we launch, schedule some time to investigate & fix and removed P2: Nice to have This would be nice to have, if we have time we will fix it, if not good to launch and fix later when labels Feb 24, 2022

5 participants