Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shared vars dirty values in Deterministic RVs #3643

Closed
ghost opened this issue Oct 7, 2019 · 1 comment · Fixed by #3645
Closed

Shared vars dirty values in Deterministic RVs #3643

ghost opened this issue Oct 7, 2019 · 1 comment · Fixed by #3645
Assignees
Labels

Comments

@ghost
Copy link

ghost commented Oct 7, 2019

sample_posterior_predictive is susceptible to "bad" ordering of var_names.
The problem arises only when using theano.shared vars for observed RVs.
It seems that when sample_posterior_predictive is given var_names in an order in which a Deterministic RV is placed before the RV(s) it is based on, the samples will contain wrong values for that deterministic variable.
Namely - the values for the deterministic rv in the ppc trace will be values based on the "original" theano.shared values.

One possible fix is reordering var_names inside sample_posterior_predictive to be consistent with

free_RVs + observed_RVs + deterministics

As far as I tested this ordering works, but this might be just a workaround hiding a deeper problem. Not sure.

If the reordering solution is the right way to go, better to topologically order the rvs, though I didn't find a way in the API to retrieve their dependencies.

See reproduction below.

np.random.seed(123)
        obs_a = theano.shared(np.array([10., 20., 30.]))

        with pm.Model() as m:
            pm.Normal('mu', 3, 5)
            a = pm.Normal('a', 20, 10, observed=obs_a)
            pm.Deterministic('b', a * 2)

            trace = pm.sample(10)

        # var_names = ['a', 'b'] # GOOD ORDERING
        var_names = ['b', 'a'] # BAD ORDERING
        ppc = pm.sample_posterior_predictive(trace, model=m, var_names=var_names)
        sa = ppc['a'][0, 0]
        sb = ppc['b'][0, 0]
        if sa*2 != sb:
            print(f'a = {sa}')
            print(f'b = {sb}')
            raise ValueError('wrong')
        
        print('OK')
  • PyMC3 Version: 3.7
  • Theano Version: 1.0.4
  • Python Version: 3.7.3
  • Operating system: macOS High Sierra
  • How did you install PyMC3: pip
@lucianopaz
Copy link
Contributor

Thanks for reporting @uri-datorama. This highlights a problem in draw_values. In particular, in how the output from pymc3.model.get_named_nodes_and_relations is handled. I'll work on a fix later today.

@lucianopaz lucianopaz added the bug label Oct 7, 2019
@lucianopaz lucianopaz self-assigned this Oct 7, 2019
lucianopaz added a commit to lucianopaz/pymc that referenced this issue Oct 8, 2019
lucianopaz added a commit to lucianopaz/pymc that referenced this issue Oct 14, 2019
junpenglao pushed a commit that referenced this issue Feb 3, 2020
* Fixed leaf_nodes computation

* Refactored get_named_nodes_and_relations to make it consistent with theano naming

* Added test for #3643

* Added release notes

* Fixed float32 error

* Applied changes suggested by @rpgoldman

* Fixed ignored nodes bug

* Modify docstrings for numpy style.

Noticed @junpenglao's comment and modified for numpy formatting.

* Changed leaves to leaf_dict as per @junpeglao's suggestion

Co-authored-by: rpgoldman <rpgoldman@goldman-tribe.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant