
add size invariant iid embedding nets, tests. #808

Merged — 3 commits into main, Mar 1, 2023

Conversation

@janfb (Contributor) commented Feb 22, 2023

Goal: have an embedding net for NPE that can handle a varying number of iid trials, i.e., learn how the posterior changes as we change the number of trials.

Problem: our training procedure assumes that x is a tensor with fixed dimensions, e.g., the trial dimension must be the same for all training data points.

Solution: given a training data set with a varying number of trials, where the maximum number of trials is max_num_trials, pad all xs with fewer trials with NaNs such that x.shape = (num_thetas, max_num_trials, dim_x). Adapt the PermutationInvariantEmbeddingNet such that it detects the NaNs and applies the TrialEmbedding only to the valid entries (using a loop over the batch).
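The padding scheme described above could be sketched as follows. This is a minimal illustration, not the actual sbi implementation; `pad_trials` and all names are hypothetical.

```python
import torch

def pad_trials(xs, max_num_trials):
    """Pad each x of shape (num_trials_i, dim_x) with NaNs up to max_num_trials."""
    dim_x = xs[0].shape[1]
    padded = torch.full((len(xs), max_num_trials, dim_x), float("nan"))
    for i, x in enumerate(xs):
        padded[i, : x.shape[0], :] = x
    return padded

# Two data points with 3 and 5 trials, each trial 2-dimensional.
xs = [torch.ones(3, 2), torch.ones(5, 2)]
x = pad_trials(xs, max_num_trials=5)  # shape (2, 5, 2); x[0, 3:] is NaN
```

An embedding net can then recover the valid entries per data point via `~torch.isnan(x).any(-1)`.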

Functional tests: this seems to work fine for a Gaussian example.

Questions:

  • Any ideas on how to make our training procedure more flexible, e.g., allow x to be a list? Treat x as a torch.DataSet from the very start in append_simulations?

@michaeldeistler (Contributor) left a comment

Thanks! A few points below. I'd like to think a bit about this, so let's maybe not merge it today yet?

Three resolved review threads on sbi/neural_nets/embedding_nets.py (outdated).
@coschroeder (Contributor) left a comment

Let's discuss again whether we can avoid the loop for varying trial numbers.

One resolved review thread on sbi/neural_nets/embedding_nets.py (outdated).
@janfb force-pushed the size-invariant-iid-embedding branch from 0326c2c to 502a70e on February 23, 2023 at 15:40.
@codecov-commenter commented Feb 23, 2023

Codecov Report

Merging #808 (0ef585a) into main (2a114f4) will increase coverage by 0.05%.
The diff coverage is 87.50%.


@@            Coverage Diff             @@
##             main     #808      +/-   ##
==========================================
+ Coverage   74.76%   74.81%   +0.05%     
==========================================
  Files          80       80              
  Lines        6190     6191       +1     
==========================================
+ Hits         4628     4632       +4     
+ Misses       1562     1559       -3     
Flag Coverage Δ
unittests 74.81% <87.50%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
sbi/analysis/plot.py 56.43% <ø> (ø)
sbi/analysis/tensorboard_output.py 86.25% <ø> (ø)
sbi/inference/abc/smcabc.py 12.26% <0.00%> (ø)
sbi/inference/posteriors/base_posterior.py 80.82% <ø> (ø)
sbi/inference/posteriors/direct_posterior.py 98.21% <ø> (ø)
...inference/potentials/likelihood_based_potential.py 100.00% <ø> (ø)
sbi/inference/snle/snle_base.py 91.58% <ø> (ø)
sbi/inference/snpe/snpe_base.py 89.17% <ø> (ø)
sbi/inference/snre/bnre.py 44.00% <ø> (ø)
sbi/inference/snre/snre_base.py 94.82% <ø> (ø)
... and 10 more


@janfb force-pushed the size-invariant-iid-embedding branch from 502a70e to 0ef585a on February 24, 2023 at 15:39.
@janfb (Contributor, Author) commented Feb 24, 2023

This is working accurately for a Gaussian iid example with up to 100 trials, trained with a varying number of trials.

Thanks to @manuelgloeckler's input, the forward pass is now performed batched as well: the NaNs are masked before the forward pass, and the corresponding entries are set to zero afterwards.
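The batched masking trick described above might look roughly like this. It is an illustrative sketch under assumed names (`masked_embed`, `trial_net`), not the actual PermutationInvariantEmbeddingNet code.

```python
import torch

def masked_embed(x, trial_net, aggregate=torch.sum):
    """Embed NaN-padded iid trials in a single batched forward pass.

    x: (batch, max_num_trials, dim_x), where padded trials are all-NaN.
    """
    batch, num_trials, dim_x = x.shape
    is_valid = ~torch.isnan(x).any(dim=-1)       # (batch, num_trials) mask
    x_filled = torch.nan_to_num(x, nan=0.0)      # make the input safe for the net
    e = trial_net(x_filled.reshape(-1, dim_x))   # one batched pass, no loop
    e = e.reshape(batch, num_trials, -1)
    e = e * is_valid.unsqueeze(-1)               # zero out padded-trial embeddings
    return aggregate(e, dim=1)                   # permutation-invariant pooling

trial_net = torch.nn.Linear(2, 4)                # stand-in for the trial embedding
x = torch.randn(3, 5, 2)
x[0, 3:] = float("nan")                          # first data point has only 3 trials
out = masked_embed(x, trial_net)                 # shape (3, 4)
```

Zeroing after the forward pass means the padded slots contribute nothing to a sum aggregation, so no per-batch-element loop is needed.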

@manuelgloeckler (Contributor) left a comment

Hey,

Looks good. One minor comment: I like the option to pass a custom aggregation function; currently, it works perfectly for torch.sum. Yet for others like torch.max, torch.min, or torch.median, it would compute a slightly different value than expected, as we substitute the invalid outputs with zero. We should add this to the docstring, so that the user can adjust for it explicitly.
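The caveat above can be seen in a toy example (illustrative values only): zero-substitution is exact for torch.sum, but biases order statistics such as torch.max whenever the substituted zero beats all valid entries.

```python
import torch

e = torch.tensor([[-2.0, -1.0, float("nan")]])  # trial embeddings; last slot is padding
mask = torch.tensor([[True, True, False]])      # valid-trial mask
e_zeroed = torch.where(mask, e, torch.zeros_like(e))  # invalid outputs set to zero

sum_agg = e_zeroed.sum(dim=1)         # -3.0: identical to summing only valid entries
max_agg = e_zeroed.max(dim=1).values  # 0.0, although the valid maximum is -1.0
```

For order statistics one would instead need to aggregate over the valid entries only, e.g., by filling padded slots with -inf before a max.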

@michaeldeistler (Contributor) left a comment

This is getting really awesome, I like the way it is implemented now!

As far as I understand, this does not work if x has a NaN value that is not just due to missing trials. We should spell this out clearly in the docstring of this embedding net. We could even add a check that tests whether there are x for which only some summary stats are NaN and raise an error in this case (I'm also happy to leave this as a TODO in the code for now).

Regarding my comments: I'd be happy to have a quick call if anything is unclear.

Four resolved review threads on sbi/neural_nets/embedding_nets.py (three outdated).
@janfb force-pushed the size-invariant-iid-embedding branch 3 times, most recently from fe29549 to 2833d48, on March 1, 2023 at 10:52.
@janfb (Contributor, Author) left a comment

I added asserts to catch NaNs in the standardizing nets. This will inform users who want to use NaNs to encode a varying number of trials that they have to turn off z-scoring and set exclude_invalid_x=False.

One resolved review thread on sbi/neural_nets/embedding_nets.py.
@janfb force-pushed the size-invariant-iid-embedding branch 2 times, most recently from fbbaf8f to e5b3cb3, on March 1, 2023 at 12:16.
@janfb force-pushed the size-invariant-iid-embedding branch 3 times, most recently from b2b9e55 to 5b8e232, on March 1, 2023 at 13:59.
@janfb force-pushed the size-invariant-iid-embedding branch from 5b8e232 to 5805e96 on March 1, 2023 at 14:10.
@janfb merged commit cd10570 into main on Mar 1, 2023.
@janfb deleted the size-invariant-iid-embedding branch on February 13, 2024.
5 participants