issue #568: small embedding net as default #624

Merged: 10 commits merged into sbi-dev:main on Feb 7, 2022
Conversation

JuliaLinhart (Contributor)

There @michaeldeistler, it should work now!

@michaeldeistler (Contributor)

Great, thanks!

@michaeldeistler (Contributor)

@janfb is this ok to merge for you? The PR adds a one-layer neural network as the default embedding net.

@michaeldeistler linked an issue on Jan 28, 2022 that may be closed by this pull request
@michaeldeistler self-requested a review on January 28, 2022, 11:45
@michaeldeistler (Contributor) left a comment

Ok, unfortunately the tests are failing. It seems that the net does not infer the shape of x correctly.

@janfb (Contributor) left a comment

Looks good, thanks for tackling this!
One remark: what is the overhead of this? What is the difference in training speed of the default settings, e.g., NPE with maf, or NRE with resnet (since it uses two embeddings), with and without the new default embedding?
Could we run a small benchmark, just to get an impression?

@JuliaLinhart (Contributor, Author)

What about now?

@michaeldeistler (Contributor)

Just started running them. You can also run them locally with pytest -m "not slow and not gpu" tests/

@michaeldeistler (Contributor)

Still failing, unfortunately.

@JuliaLinhart (Contributor, Author) commented on Jan 28, 2022

I fixed it, but it's not so pretty... You might want to have a look! The remaining test failures are linked to performance issues in check_c2st...

@JuliaLinhart (Contributor, Author)

I don't know why, but checking the type with type(embedding_net) did not work. The check always seemed to be True: the embedding net turned out to be DefaultEmbeddingNet even when I tried to use another embedding net (e.g., nn.Identity() or a CNN).
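(A minimal sketch of the distinction at play when checking an embedding net's type; DefaultEmbeddingNet here is a hypothetical stand-in, not the actual sbi class, and the dimensions are made up.)

```python
import torch.nn as nn

# Hypothetical stand-in for a built-in default embedding net.
class DefaultEmbeddingNet(nn.Module):
    def __init__(self, dim: int = 10):
        super().__init__()
        self.net = nn.Linear(dim, dim)

    def forward(self, x):
        return self.net(x)

embedding_net = nn.Identity()  # user-supplied embedding net

# Comparing the class object directly matches only that exact class ...
print(type(embedding_net) is DefaultEmbeddingNet)  # False
# ... while isinstance also matches subclasses of the given class.
print(isinstance(embedding_net, nn.Identity))      # True
```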

@michaeldeistler (Contributor) left a comment

I think the current solution is ok. I left some comments that should make it a bit more readable.

Review comments on sbi/utils/sbiutils.py and sbi/neural_nets/classifier.py (resolved)
@michaeldeistler (Contributor)

Looks good now, thanks! Have you run a small experiment to compare the runtimes with and without DefaultEmbedding?

@JuliaLinhart (Contributor, Author) commented on Jan 31, 2022

No, not yet! Any suggestions on which benchmarking example I should use for that and where I should implement it? Also, how do you measure the runtimes, just so we're on the same page?

@michaeldeistler (Contributor)

You can just use a linear Gaussian as done here.

Only the training runtime matters; you can measure it, e.g., with time.time() here.
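(For reference, a self-contained sketch of such a timing run on a toy linear Gaussian task; the dimensions and number of simulations are illustrative and not taken from the sbi benchmark or test suite.)

```python
import time

import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform

# Toy linear Gaussian task: theta in R^3, x = theta + Gaussian noise.
prior = BoxUniform(low=-2 * torch.ones(3), high=2 * torch.ones(3))
theta = prior.sample((2000,))
x = theta + 0.1 * torch.randn_like(theta)

# Only the training runtime is of interest here.
inference = SNPE(prior=prior)
start = time.time()
inference.append_simulations(theta, x).train()
print(f"Training took {time.time() - start:.2f} s")
```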

@michaeldeistler (Contributor)

My bigger concern is that some tests are failing their c2st check right now. I'm a bit worried that the default embedding net impedes performance for small models such as the ones we run in our tests. If that is the case, we should reconsider using a DefaultEmbeddingNet. Could you run the slow tests with pytest -m 'not gpu' tests and check how many fail?

@JuliaLinhart (Contributor, Author) commented on Feb 1, 2022

It's been running all night and seems to be stuck on a test in linearGaussian_snpe_test.py. Is it supposed to take that long?

[Screenshot: pytest output of the stuck test run]

@michaeldeistler (Contributor)

Hm, no, the whole thing should take about 2-3 hours. Not sure what's going wrong here. You can also run specific tests with
pytest -m 'not gpu' tests/linear_gaussian_snre_test.py

@JuliaLinhart (Contributor, Author) commented on Feb 1, 2022

Still stuck on linearGaussian_snpe_test.py, but I manually ran the others:

[Screenshots: pytest results for the other test files]

@janfb (Contributor) commented on Feb 1, 2022

I am able to reproduce the failing tests for pytest -m "not gpu" tests/linearGaussian_snpe_test.py. Changing num_simulations = 2500 to 2600 makes them pass again.
Just speculating here, but it could be that the increased number of parameters induced by the embedding net just needs more training data than before?
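(To put rough numbers on that speculation: a single linear layer adds weights and biases where an identity embedding has none. The 10-dimensional x below is a made-up example, not the dimensionality used in the tests.)

```python
import torch.nn as nn

identity = nn.Identity()
one_layer = nn.Linear(10, 10)  # hypothetical one-layer embedding for 10-dim x

print(sum(p.numel() for p in identity.parameters()))   # 0
print(sum(p.numel() for p in one_layer.parameters()))  # 110 = 10*10 weights + 10 biases
```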

@JuliaLinhart (Contributor, Author)

Nice catch! I did the same for the others and the tests pass now!

@janfb (Contributor) commented on Feb 2, 2022

Ok, cool!
How about the slow tests? For me,
pytest -m "not gpu" tests/linearGaussian_snpe_test.py::test_sample_conditional
is stuck in tuning the bracket widths. On main it passes in 3 min. This seems to be due to the embedding net.

Plus, there is a merge conflict in sbi/neural_nets/mdn.py, due to the GPU PR. So this needs another local rebase.

@JuliaLinhart (Contributor, Author) commented on Feb 2, 2022

Yeah, it's stuck for me too! And the merge conflict is because someone added checks in the mdn.py file. I didn't do anything because I didn't want to mess things up, haha. But basically we just have to add a line for the imports.

@janfb (Contributor) commented on Feb 2, 2022

Update @JuliaLinhart: setting max_num_epochs=60 in line 432 should do the trick.
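(As a sketch of what that cap amounts to in the training call, reusing the toy linear Gaussian setup from the timing example above; the actual test code at line 432 differs.)

```python
import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform

# Same toy setup as in the timing sketch above.
prior = BoxUniform(low=-2 * torch.ones(3), high=2 * torch.ones(3))
theta = prior.sample((2000,))
x = theta + 0.1 * torch.randn_like(theta)

# Cap training at 60 epochs so that slow tests finish in reasonable time.
SNPE(prior=prior).append_simulations(theta, x).train(max_num_epochs=60)
```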

@JuliaLinhart (Contributor, Author)

I'm running the slow tests again now

@JuliaLinhart (Contributor, Author)

On my side, all tests pass now.

@codecov-commenter

Codecov Report

Merging #624 (edc4183) into main (d373ff6) will increase coverage by 1.97%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##             main     #624      +/-   ##
==========================================
+ Coverage   68.72%   70.69%   +1.97%     
==========================================
  Files          67       68       +1     
  Lines        4476     4713     +237     
==========================================
+ Hits         3076     3332     +256     
+ Misses       1400     1381      -19     
Flag Coverage Δ
unittests 70.69% <100.00%> (+1.97%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
sbi/neural_nets/classifier.py 100.00% <100.00%> (ø)
sbi/neural_nets/flow.py 88.17% <100.00%> (+0.96%) ⬆️
sbi/neural_nets/mdn.py 100.00% <100.00%> (ø)
sbi/utils/get_nn_models.py 85.71% <100.00%> (+0.29%) ⬆️
sbi/utils/sbiutils.py 79.03% <100.00%> (+0.88%) ⬆️
sbi/utils/__init__.py 100.00% <0.00%> (ø)
sbi/analysis/__init__.py 100.00% <0.00%> (ø)
sbi/utils/posterior_ensemble.py 85.29% <0.00%> (ø)
sbi/inference/posteriors/base_posterior.py 81.81% <0.00%> (+1.51%) ⬆️
sbi/inference/posteriors/mcmc_posterior.py 77.46% <0.00%> (+4.22%) ⬆️
... and 5 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@janfb (Contributor) left a comment

Thanks a lot, it looks good to me now and the tests are passing!
One final question: for some test cases you increased the number of simulations quite a lot, is this necessary? E.g., are these numbers lower bounds for making the tests pass?

Review comments on tests/linearGaussian_snle_test.py and tests/sbiutils_test.py (resolved)
@janfb (Contributor) commented on Feb 7, 2022

Alright, we can merge this into main now. The final thing to do, @JuliaLinhart, would be a rebase on main to resolve the conflicts.
It seems you have your changes on the main of your fork, not on a feature branch, so this requires a bit of wiggling. But maybe you could try to just rebase on the upstream main and resolve the conflicts
(assuming your fork is in origin and the sbi repo is in upstream):

  • git pull upstream main --rebase
  • then resolve the conflict, I think it is just one additional import in neural_nets.mdn
  • add the fixed file with git add
  • git rebase --continue
  • force push to your fork's main to include the rebase: git push origin main -f

@JuliaLinhart (Contributor, Author) commented on Feb 7, 2022

Yes, I forgot to create a branch... 🤭
But it seems that there are no conflicts; I solved them a few days ago! Or am I looking in the wrong place?

@janfb (Contributor) commented on Feb 7, 2022

No problem :) But there is one conflict, see below: "This branch cannot be rebased due to conflicts".

You probably have to rebase again on the most recent main.

@JuliaLinhart (Contributor, Author) commented on Feb 7, 2022

Hmm, I really cannot see that...

(base) julialinhart@nat-eduroam-138-gw-01-sif sbi % git pull upstream main --rebase
fatal: 'upstream' does not appear to be a git repository
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

(base) julialinhart@nat-eduroam-138-gw-01-sif sbi % git branch -vv
* main 68c744a [origin/main] lower bound for num_simulations

(base) julialinhart@nat-eduroam-138-gw-01-sif sbi % git pull --rebase              
Already up to date.
(base) julialinhart@nat-eduroam-138-gw-01-sif sbi % git rebase --continue
fatal: No rebase in progress?
(base) julialinhart@nat-eduroam-138-gw-01-sif sbi % git rebase          
Current branch main is up to date.

@janfb (Contributor) commented on Feb 7, 2022

It seems you haven't set upstream correctly. Maybe

git remote add upstream https://github.com/mackelab/sbi.git

helps?

@janfb merged commit 51d364f into sbi-dev:main on Feb 7, 2022
Successfully merging this pull request may close these issues.

Small embedding_net as default?