
Uniform and better MCMC params for the tests #1107

Merged · 14 commits · Mar 26, 2024

Conversation

famura
Contributor

@famura famura commented Mar 22, 2024

What does this implement/fix? Explain your changes

A test errored in the MCMC sampling for some runs.

Does this close any currently open issues?

Fixes #1090 (somebody else should test it on their machine, too, since it is/was a stochastic bug)

Any relevant code examples, logs, error output, etc?

To test if the issue is fixed, run pytest tests/linearGaussian_snre_test.py -k test_api_snre_multiple_trials_and_rounds_map

Any other comments?

Related to issue #910 and indirectly PR #1053

To run the newly flagged tests locally use

pytest tests -m "mcmc and not gpu" -v

Checklist

Put an x in the boxes that apply. You can also fill these out after creating
the PR. If you're unsure about any of them, don't hesitate to ask. We're here to
help! This is simply a reminder of what we are going to look for before merging
your code.

  • I have read and understood the contribution
    guidelines
  • I agree with re-licensing my contribution from AGPLv3 to Apache-2.0.
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have reported how long the new tests run and potentially marked them
    with pytest.mark.slow.
  • New and existing unit tests pass locally with my changes
  • I performed linting and formatting as described in the contribution
    guidelines
  • I rebased on main (or there are no conflicts with main)

@famura famura added bug Something isn't working hackathon labels Mar 22, 2024
@famura famura requested a review from bkmi March 22, 2024 17:15
@famura famura self-assigned this Mar 22, 2024
@famura
Contributor Author

famura commented Mar 22, 2024

So I think that some tests were "ill-parametrized" in the sense that they had very few warmup samples (down to 1), very many chains (up to 20), and high thinning (10-20). I am currently trying to find one hyper-param setting that works for all tests. There was a file-based solution for this in place, which was partially overwritten for tests. I kept the overwrites. Some of them could for sure be deleted.
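The idea of one shared hyper-parameter setting with per-test overrides could be sketched like this (all names and values below are illustrative assumptions, not the ones actually committed in this PR):

```python
# Hypothetical shared MCMC settings for all tests; names and values are
# illustrative assumptions, not the ones committed in this PR.
MCMC_TEST_PARAMS = dict(
    num_chains=3,     # instead of up to 20
    warmup_steps=50,  # instead of as few as 1
    thin=1,           # instead of thinning by 10-20
)

def get_mcmc_params(**overrides):
    """Return the shared defaults, with optional per-test overrides."""
    params = dict(MCMC_TEST_PARAMS)
    params.update(overrides)
    return params
```

A test that genuinely needs a different setting would call, e.g., `get_mcmc_params(num_chains=10)` instead of keeping its own hard-coded values.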

@famura
Contributor Author

famura commented Mar 22, 2024

For now, I will not change the MCMC parameterization in the examples, since they might be WIP.

@@ -118,7 +118,8 @@ testpaths = [
 ]
 markers = [
     "slow: marks tests as slow (deselect with '-m \"not slow\"')",
-    "gpu: marks tests that require a gpu (deselect with '-m \"not gpu\"')"
+    "gpu: marks tests that require a gpu (deselect with '-m \"not gpu\"')",
+    "mcmc: marks tests that require MCMC sampling (deselect with '-m \"not mcmc\"')"
Contributor

@manuelgloeckler manuelgloeckler Mar 25, 2024


Great. How long do the MCMC tests run, currently?

Edit: I see these are the current tests but with an additional flag. So it's fine.

Contributor Author

Without -n auto, it's about 850 s on my machine (this includes slow); with -n auto it's 4500 s. I will need to investigate this further, but I can imagine that with multiple chains, test parallelization might cause more harm than good. This could interest you, too, @Baschdl.

Contributor

In this case we could think about running the mcmc tests sequentially (e.g. in CI) but a more sustainable solution might be to add a pytest.mark for tests that should only be executed sequentially. At this point, a legitimate question would be whether the parallelization of the tests brings enough benefit to outweigh the maintenance cost.
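The suggested pytest.mark for sequential-only tests could look like the following sketch, assuming pytest-xdist's loadgroup distribution mode (the group name and test name are hypothetical):

```python
import pytest

# Hypothetical sketch: when pytest-xdist is run as
#   pytest -n auto --dist loadgroup
# all tests sharing an xdist_group name are sent to the same worker and
# therefore run sequentially relative to each other.
@pytest.mark.mcmc
@pytest.mark.xdist_group(name="mcmc_sequential")
def test_mcmc_example():
    assert True
```

This keeps parallelization for the rest of the suite while serializing only the tests that contend for cores.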

Review thread on tests/conftest.py (outdated, resolved)
Contributor

@janfb janfb left a comment

This is great, thanks @famura
Just one comment about the num_chains default.

Review thread on tests/conftest.py (outdated, resolved)
Review thread on tests/conftest.py (outdated, resolved)

@pytest.mark.parametrize(
    "method", (SNPE, pytest.param(SNLE, marks=[pytest.mark.slow, pytest.mark.mcmc]))
)
def mdn_inference_with_different_methods(method):
Contributor

👍

Contributor

@manuelgloeckler manuelgloeckler left a comment

Looks good, all tests pass locally on my machine using pytest tests -m "mcmc and not gpu" -v.

This will currently just add the capability to rapidly test new MCMC parameters, but does not actually change them, right? So for that, we can already merge it. I will leave it to @janfb to approve.

@famura
Contributor Author

famura commented Mar 25, 2024

> Looks good, all tests pass locally on my machine using pytest tests -m "mcmc and not gpu" -v.
>
> This will currently just add the capability to rapidly test new MCMC parameters, but does not actually change them, right? So for that, we can already merge it. I will leave it to @janfb to approve.

I want to add that I pulled some params out of thin air and some tests might run longer or shorter now. Actually, I never ran the slow tests locally before, so I have no reference.

I will scan for the slice_np_vec sampler and use more chains. Oh, and I've seen that there is already a merge conflict. Soon I will have to go back to my main work, and then I won't find the time to continuously merge origin/main back in. I can understand if you don't want to rush things, though.

@famura
Contributor Author

famura commented Mar 25, 2024

Regarding num_chains for slice_np_vectorized, I did not see any relevant speed-up between 3 and 10 for this test on my local machine:

pytest tests/inference_on_device_test.py::test_training_and_mcmc_on_device -k "slice_np_vectorized"

@famura famura changed the title Better MCMC hparams Uniform and better MCMC params for the tests Mar 25, 2024
@famura
Contributor Author

famura commented Mar 25, 2024

From my point of view this PR is ready

Contributor

@janfb janfb left a comment

Thanks for taking the extra time. Looks all good. I will double-check the num_chains setting and take it from here into main.

@janfb
Contributor

janfb commented Mar 25, 2024

I added separate fixtures for mcmc_params_fast and mcmc_params_accurate. The speed up is marginal on my machine too, but I think it makes sense to separate these two cases.
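The fast/accurate split could look roughly like this; the fixture names mcmc_params_fast and mcmc_params_accurate come from the comment above, but the parameter values are illustrative assumptions, and in tests/conftest.py each function would be exposed via @pytest.fixture:

```python
# Parameter sets behind the two fixtures mentioned above; the values are
# illustrative assumptions, not the committed ones. In tests/conftest.py
# each function would be wrapped with @pytest.fixture.
def mcmc_params_fast():
    """Cheap settings for API/smoke tests."""
    return dict(num_chains=1, warmup_steps=1, thin=1)

def mcmc_params_accurate():
    """More expensive settings for tests that check posterior accuracy."""
    return dict(num_chains=20, warmup_steps=500, thin=1)
```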

@famura
Contributor Author

famura commented Mar 26, 2024

> I added separate fixtures for mcmc_params_fast and mcmc_params_accurate. The speed up is marginal on my machine too, but I think it makes sense to separate these two cases.

I think that's a good change. I was just lacking the insight into which tests should be solved accurately. One final thing before merging this PR: be aware that num_chains=20 can backfire, i.e., be much slower, in the case of samplers that use multiple threads/processes on machines with few cores. I think 10 or so would do the same trick.

@janfb
Contributor

janfb commented Mar 26, 2024

Thanks for fixing the remaining override!
I see your point about the chains.
But at the moment, we are using slice_np_vectorized for most "accurate" tests, which requires just a single core. If not, we are setting num_chains=1.

Setting num_chains=10 would work as well, but takes the same time for me.

@famura
Contributor Author

famura commented Mar 26, 2024

I see. Once the ruff check passes (I don't know what that error is, tbh), we can merge it :)

@famura
Contributor Author

famura commented Mar 26, 2024

ruff format and ruff check show nothing (that is related to this PR)

@manuelgloeckler
Contributor

This is what the workflow is complaining about. Which also seems not to be related to the changes here.
Not sure how this happened.

sbi/neural_nets/density_estimators/mixed_density_estimator.py:153:89: E501 Line too long (90 > 88)
sbi/neural_nets/density_estimators/mixed_density_estimator.py:178:89: E501 Line too long (91 > 88)
sbi/utils/user_input_checks.py:613:89: E501 Line too long (103 > 88)
sbi/utils/user_input_checks.py:636:89: E501 Line too long (103 > 88)

@famura
Contributor Author

famura commented Mar 26, 2024

Yeah, I think these come from main. The files are not changed here, so merging should not propagate that error into main.

@manuelgloeckler
Contributor

Well, main is currently also failing. I can locally reproduce these four errors.
I will quickly fix them here before merging.

@famura
Contributor Author

famura commented Mar 26, 2024

Nice, thanks guys. So this PR is done

@janfb janfb merged commit 3b63c9f into main Mar 26, 2024
3 checks passed
@famura famura deleted the fix/snre_test_mcmc_inf branch March 27, 2024 08:24
@famura famura mentioned this pull request Apr 3, 2024
@janfb janfb mentioned this pull request Apr 9, 2024
Labels
bug Something isn't working hackathon
Development

Successfully merging this pull request may close these issues.

linearGaussian_snre_test fails
4 participants