Uniform and better MCMC params for the tests #1107
So I think that some tests were "ill-parametrized", in the sense that they had very few warmup samples (down to 1), very many chains (up to 20), and high thinning (10-20). I am currently trying to find one hyper-parameter setting that works for all tests. There was a file-based solution for this in place, which was partially overwritten for tests. I kept the overrides; some of them could surely be deleted.

For now, I will not change the MCMC parameterization in the examples, since they might be WIP.
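As a purely illustrative sketch of the shared-parameter idea (the fixture name and values here are hypothetical, not necessarily what this PR ends up with): a single pytest fixture can hold the uniform MCMC settings, and individual tests can still override single entries where they need to.

```python
import pytest

@pytest.fixture
def mcmc_params():
    # Hypothetical uniform defaults for all MCMC-based tests; the
    # concrete numbers used in the PR may differ.
    return dict(num_chains=20, thin=2, warmup_steps=50)

def test_posterior_sampling(mcmc_params):
    # A test that needs a more accurate run overrides single entries
    # instead of redefining the whole parameter set.
    params = {**mcmc_params, "warmup_steps": 200}
    assert params["num_chains"] == 20
```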
```diff
@@ -118,7 +118,8 @@ testpaths = [
 ]
 markers = [
     "slow: marks tests as slow (deselect with '-m \"not slow\"')",
-    "gpu: marks tests that require a gpu (deselect with '-m \"not gpu\"')"
+    "gpu: marks tests that require a gpu (deselect with '-m \"not gpu\"')",
+    "mcmc: marks tests that require MCMC sampling (deselect with '-m \"not mcmc\"')"
```
Great. How long do the MCMC tests run currently?

Edit: I see these are the current tests but with an additional flag, so it's fine.
Without `-n auto`, it takes about 850 s on my machine (this includes `slow`). With `-n auto`, it's 4500 s. I will need to investigate this further, but I can imagine that with multiple chains we might cause more harm than good with test parallelization. This could interest you, too, @Baschdl.
In this case, we could think about running the MCMC tests sequentially (e.g., in CI), but a more sustainable solution might be to add a `pytest.mark` for tests that should only be executed sequentially; a sketch of that idea follows below. At this point, a legitimate question is whether the parallelization of the tests brings enough benefit to outweigh the maintenance cost.
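As a hedged sketch of one way to get that behavior (not what this PR implements): pytest-xdist ships an `xdist_group` marker that pins all tests in a group to the same worker, which effectively serializes them while the rest of the suite still runs in parallel. It requires pytest-xdist >= 2.5 and the `--dist loadgroup` mode; the group and test names below are made up.

```python
import pytest

# All tests in the "mcmc_sequential" group run on the same xdist worker,
# so multi-chain MCMC runs do not compete with other workers for cores.
# Invoke with: pytest -n auto --dist loadgroup
@pytest.mark.mcmc
@pytest.mark.xdist_group(name="mcmc_sequential")
def test_heavy_mcmc_sampling():
    ...
```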
This is great, thanks @famura. Just one comment about the `num_chains` default.
```python
@pytest.mark.parametrize(
    "method", (SNPE, pytest.param(SNLE, marks=[pytest.mark.slow, pytest.mark.mcmc]))
)
def mdn_inference_with_different_methods(method):
```
👍
Looks good, all tests also pass locally on my machine using `pytest tests -m "mcmc and not gpu" -v`.

This will currently just add the capability to rapidly test new MCMC parameters, but does not actually change them, right? So for that we can already merge it. I will leave it to @janfb to approve.
I want to add that I pulled some params out of thin air, and some tests might run longer or shorter now. Actually, I never ran the slow tests locally before, so I have no reference. I will scan for the …
```
# Conflicts:
#	tests/conftest.py
```
From my point of view, this PR is ready.
Thanks for taking the extra time. Looks all good; I will double-check the `num_chains` things and take it from here into `main`.
I added separate fixtures for …
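A minimal sketch of what separate fixtures could look like, assuming a fast/accurate split (the fixture names and values are illustrative; the truncated comment above does not show the actual ones):

```python
import pytest

@pytest.fixture
def mcmc_params_fast():
    # Cheap settings for API/smoke tests where accuracy does not matter.
    return dict(num_chains=1, thin=1, warmup_steps=1)

@pytest.fixture
def mcmc_params_accurate():
    # Costlier settings for tests that actually check posterior accuracy.
    return dict(num_chains=20, thin=2, warmup_steps=50)
```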
I think that's a good change; I was just lacking the insight into which tests should be solved accurately. One final thing before merging this PR: be aware that …
Thanks for fixing the remaining override! Setting …
I see. Once the ruff check runs (dunno what that error is, tbh), we can merge it :)
`ruff format` and `ruff check` show nothing (that is related to this PR).
This is what the workflow is complaining about, which also seems unrelated to the changes here: `sbi/neural_nets/density_estimators/mixed_density_estimator.py:153:89: E501 Line too long (90 > 88)`
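For context, E501 is the line-too-long rule: the `(90 > 88)` means the offending line exceeds the configured limit of 88 characters (Ruff's default `line-length`), so it can be fixed independently of this PR.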
Yeah, I think these are a problem from `main`. The files are not changed here, so merging should not propagate that error into `main`.
Well, `main` is currently also failing. I can locally reproduce these four errors.
Nice, thanks guys. So this PR is done.
What does this implement/fix? Explain your changes
A test errored during MCMC sampling for some runs.
Does this close any currently open issues?
Fixes #1090 (somebody else should test it on their machine, too, since it is/was a stochastic bug).
Any relevant code examples, logs, error output, etc?
To test whether the issue is fixed, run:
`pytest tests/linearGaussian_snre_test.py -k test_api_snre_multiple_trials_and_rounds_map`
Any other comments?
Related to issue #910 and indirectly PR #1053
To run the newly flagged tests locally, use `pytest tests -m mcmc`.
Checklist
Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.
- [ ] I have followed the contribution guidelines.
- [ ] Slow tests are marked with `pytest.mark.slow`.
- [ ] I have followed the coding guidelines.
- [ ] I rebased on `main` (or there are no conflicts with `main`).