-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VS-816 Keeping ingestion under quota #8193
Conversation
…t with different numbers (if quotas are increased for a customer) without having to modify our code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome, thanks for taking this on! 🎉 👍
else if num_samples < 1000 then 1 | ||
else num_samples / 1000 | ||
else if beta_customer_rate_limit then beta_effective_load_data_batch_size | ||
else if num_samples < 1000 then 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the Broad limits can be structured similarly
# Broad users enjoy higher quotas and can scatter more widely than beta users
Int broad_max_scatter = 1000
Int broad_effective_load_data_batch_size = if num_samples < broad_max_scatter then 1
else num_samples / broad_max_scatter
.
.
.
Int effective_load_data_batch_size = if (defined(load_data_batch_size)) then select_first([load_data_batch_size])
else if beta_customer_rate_limit then beta_effective_load_data_batch_size
else broad_effective_load_data_batch_size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a really good suggestion. To be honest, I was frustrated with my inability to have an if (beta_customer_rate_limit) {}
block that selectively overwrote the computed value of effective_load_data_batch_size, so this is the cleanest way I could think to do it with my limited WDL experience. I was half expecting one of you to show me an easier way to handle the beta case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've restructured the code in that portion to unify the two cases a lot more, and made the variable name for beta behavior a little more self-explanatory. I'm running it again (just in case some silly typo or something caused an issue), so I'll post the successful run tomorrow when I have it.
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## ah_var_store #8193 +/- ##
================================================
Coverage ? 86.219%
Complexity ? 35519
================================================
Files ? 2191
Lines ? 166339
Branches ? 17900
================================================
Hits ? 143416
Misses ? 16543
Partials ? 6380 |
… unify the beta and broad use cases better.
Limiting scattering size in ingest to keep beta customers under quota.
Successful run on NHGRI AnVIL dataset: https://app.terra.bio/#workspaces/gvs-dev/NHGRI_AnVIL_3K%20hatcher/job_history/a9c2a81b-d81c-4f7a-a433-4096ccc7b579
Quota behavior during successful run:
Next successful run after minor refactoring: https://app.terra.bio/#workspaces/gvs-dev/GVS%20Tiny%20Quickstart%20hatcher/job_history/8a34477f-af3b-4e19-8184-862ed1c2cba3