Bulk CSV: Upload 10k cases in under 1 minute #702

Closed
axmb opened this issue Jul 31, 2020 · 3 comments
Labels: Data UI, Data, P2: Nice to have

Comments


axmb commented Jul 31, 2020

Describe the solution you'd like
We want to establish a reasonable upper bound on the recommended data size for bulk upload, and we've set it at 10k cases. Since we advertise that as an acceptable limit, an upload of that size should also complete in a reasonable amount of time. Our target is under 1 minute.
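
For reference, the target translates into a throughput budget. The snippet below is only a back-of-the-envelope sketch: the function names are made up for illustration, and the batch size of 500 is an assumed value, not something the upload API is known to use.

```python
import math


def required_rate(total_cases: int, deadline_seconds: float) -> float:
    """Minimum sustained throughput (cases/second) needed to meet the deadline."""
    if deadline_seconds <= 0:
        raise ValueError("deadline_seconds must be positive")
    return total_cases / deadline_seconds


def per_batch_budget(total_cases: int, batch_size: int, deadline_seconds: float) -> float:
    """Time budget per batch (seconds) if batches are sent one after another."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = math.ceil(total_cases / batch_size)
    return deadline_seconds / batches


# 10k cases in 60 seconds works out to roughly 167 cases per second.
rate = required_rate(10_000, 60)

# With an assumed batch size of 500, that is 20 sequential batches,
# each of which has to finish in about 3 seconds.
budget = per_batch_budget(10_000, 500, 60)
```

Sending batches concurrently would relax the per-batch budget, at the cost of more load on the backend.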

Describe alternatives you've considered

  • Keep bulk upload slower but provide progress (%) feedback (we'd rather have the significantly better speed).
  • Set a lower upper bound (infeasible; need to support at least ~10k).
  • Target adding more cases in the same time; e.g., an upper bound of 50k (will instead suggest this type of upload be handled via automated ingestion).

Additional context
This is also relevant to automated ingestion, which will use the same APIs as bulk upload. Automated ingestion, however, has an easier time sending large amounts of data. Since it's sent from a Docker image running with O(GB) of memory, we don't have to dance around sharding data in and out of the browser.
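
To illustrate the kind of sharding involved, here is a minimal sketch that splits CSV rows into fixed-size batches and hands each one to a sender. Everything in it is an assumption for illustration: `batched_rows`, `upload_in_batches`, and the `send_batch` callback are hypothetical names, and the real bulk upload API may accept data in a different shape.

```python
import csv
import io
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]


def batched_rows(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield consecutive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def upload_in_batches(
    csv_text: str,
    batch_size: int,
    send_batch: Callable[[List[Row]], None],
) -> int:
    """Parse CSV text and pass it to send_batch one batch at a time.

    send_batch is a hypothetical stand-in for whatever call actually
    posts cases to the backend. Returns the number of rows sent.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    total = 0
    for batch in batched_rows(reader, batch_size):
        send_batch(batch)
        total += len(batch)
    return total
```

Because the rows are consumed lazily from the reader, only one batch of parsed rows is held at a time, which is the property that matters on the memory-constrained side.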

axmb added the Data, Data UI, P1: Launch blocker, and Eng ready labels Jul 31, 2020
axmb self-assigned this Jul 31, 2020

axmb commented Jul 31, 2020

Related PRs:

#584
#594
#605
#642
#680
#691
#701

sratcliffe118 added the P2: Nice to have label and removed the P1: Launch blocker label Aug 26, 2020

axmb commented Aug 26, 2020

I'll retest this shortly. We were close to 10k/minute, but we introduced some slowdowns when we started keeping case revisions (which were unfortunately necessary). The slowdowns do impact automated ingestion (ADI), but since ADI functions have 15 minutes to complete and don't require a human's attention, I'll leave this at P2 unless that 15-minute deadline is in jeopardy.


axmb commented Sep 17, 2020

Circling back -- there were some issues around this, but we've mostly resolved them.

There's a separate issue tracking a related FR in ADI (#1124).

axmb closed this as completed Sep 17, 2020