-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A/B test jobs #314
A/B test jobs #314
Conversation
AB_environments/AB_sample.dask.yaml
Outdated
# Leave empty if you don't want to override anything. | ||
distributed: | ||
scheduler: | ||
worker-saturation: 1.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notably, if #235 will go through we'll lose the ability to pass environment variables.
The alternative to forcing the developer to create an empty file every time was to ignore the file if it doesn't exist, but the risk of misspellings producing unexpected results was too high. Explicit is better than implicit.
042b39d
to
f32e100
Compare
ff71818
to
b008604
Compare
Ready for final review. |
b008604
to
88b3a4e
Compare
Question:In the context of A/B performance tests, do we care about the |
I can see a solid argument for skipping the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
full disclosure: I haven't reviewed ab_tests in much detail but rather the broad strokes.
My comments are mostly nitpicks. If there are issues or missing features we can iterate.
Looking forward to see this in action!
- dask==2022.8.1 | ||
- distributed=2022.8.1 | ||
``` | ||
- `AB_environments/AB_baseline.dask.yaml`: (empty file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does there need to be an empty file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The alternative to forcing the developer to create an empty file every time was to ignore the file if it doesn't exist, but the risk of misspellings quietly resulting in unexpected results was too high. Explicit is better than implicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not include empty file with appropriate name in repo? That would mean you'd get no custom config as the default, but also make it easier to not accidentally make a new file w/ wrong name (e.g., you'd either edit existing file or you'd use cp your-file AB_[tab]
and autocomplete to correct filename).
import yaml | ||
|
||
|
||
def main(fname: str) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we're now able to ship dask config using the dask.config ctx manager. That might be a better interface than this. is there a reason why we use env vars instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we're now able to ship dask config using the dask.config ctx manager.
Just to be clear, change to ship dask config to cluster is not yet deployed, presumably will go out next week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
? This is news to me.
Are you saying that when you call coiled.create_software_environment
it will pick up the local config?
Also, I needed something that could be passed to coiled env create
. Ideally one would want to pass it a dask config file directly, but I believe there's no such feature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I'm confused. The feature is that when you create cluster it ships local config. I don't see any connection between dask config and the software environment... am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear, change to ship dask config to cluster is not yet deployed, presumably will go out next week.
This is exactly what we need, please ping me when it's generally available
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I'm confused. The feature is that when you create cluster it ships local config. I don't see any connection between dask config and the software environment... am I missing something?
The script is currently calling coiled env create -e DASK_DISTRIBUTED_...=VALUE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The script is currently calling
coiled env create -e DASK_DISTRIBUTED_...=VALUE
I could be wrong but I think that just applies when creating the software environment and not when you make a cluster using that software environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works fine for me?
import coiled
import dask.config
import distributed
!coiled env create --name crusaderky/test_vars --conda AB_environments/AB_baseline.conda.yaml -e DASK_TESTVAR=123
cluster = coiled.Cluster(name="test_vars", n_workers=0, software="crusaderky/test_vars")
client = distributed.Client(cluster)
client.run_on_scheduler(lambda: dask.config.get("testvar"))
123
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, thanks, good to know!
This PR adds the ability to run the whole test suite on arbitrary additional coiled software environments.