BUG: Datatype of Discrete RVs is changed to float64 when observed data has missing values #6424

Closed
jessegrabowski opened this issue Jan 2, 2023 · 0 comments · Fixed by #6425

Describe the issue:

Issue first reported here. When a categorical likelihood has missing values in its observed data vector, the resulting variable cannot be used as an index variable, because the combined missing+observed data vector created in model.make_obs_var does not inherit the dtype of the underlying RV (it ends up as float64 instead of an integer dtype).

This will cause unexpected behavior if the user wants to index with the variable elsewhere in the model.
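The dtype problem can be illustrated without PyMC. The following is a NumPy-only analogy, not the code in model.make_obs_var: it assumes the observed and imputed entries are merged into a float64 buffer (the names `table`, `combined`, and `restored` are made up for illustration), and shows that such a vector is rejected as an index until it is cast back to the integer dtype of the data.

```python
import numpy as np

# Integer observations with -1 marking the missing entries.
data = np.ma.masked_equal([1, 1, 0, 0, 2, -1, -1], -1)
table = np.random.normal(size=(10, 3))

# Assumption for illustration: observed and imputed values are merged
# into a float64 buffer, so the integer dtype of the data is lost.
combined = np.empty(data.shape, dtype="float64")
combined[~data.mask] = data.compressed()  # observed entries
combined[data.mask] = 0.0                 # placeholder for imputed entries

try:
    table[:, combined]
    float_index_rejected = False
except IndexError:
    # NumPy, like PyTensor, only accepts integer or boolean indices.
    float_index_rejected = True

# Casting back to the dtype of the original data makes indexing work again.
restored = combined.astype(data.dtype)
selected = table[:, restored]
```

In this sketch `selected` has one column per entry of `data`, which is the behavior a user would expect from indexing with a discrete RV.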

Reproducible code example:

import pymc as pm
import numpy as np
import pytensor.tensor as pt

data = np.ma.masked_equal([1, 1, 0, 0, 2, -1, -1], -1)
something_to_index = pt.as_tensor_variable(np.random.normal(size=(10, 3)))

with pm.Model():
    idx = pm.Categorical("idx", p=[0.1, 0.2, 0.7], observed=data)
    stuff = something_to_index[:, idx]

Error message:

<details>
Traceback (most recent call last):
  File "/Users/jessegrabowski/Documents/Python/pymc/test.py", line 10, in <module>
    stuff = something_to_index[:, idx]
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pytensor/tensor/var.py", line 551, in __getitem__
    return at.subtensor.advanced_subtensor(self, *args)
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pytensor/graph/op.py", line 296, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pytensor/tensor/subtensor.py", line 2556, in make_node
    index = tuple(map(as_index_variable, index))
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pytensor/tensor/subtensor.py", line 2518, in as_index_variable
    raise TypeError("index must be integers or a boolean mask")
TypeError: index must be integers or a boolean mask
</details>

PyMC version information:

pymc: 0+untagged.9319.g78a3582.dirty
pytensor: 2.8.11

Context for the issue:

No response
