Commit

Support imputations with ndarray data
closes #4437
michaelosthege committed Jan 25, 2021
1 parent 823906a commit 9c2203d
Showing 2 changed files with 24 additions and 9 deletions.
3 changes: 2 additions & 1 deletion RELEASE-NOTES.md
@@ -1,10 +1,11 @@
 # Release Notes

-## PyMC3 vNext (on deck)
+## PyMC3 vNext (3.11.1)

 ### Breaking Changes

 ### New Features
++ Automatic imputations now also work with `ndarray` data, not just `pd.Series` or `pd.DataFrame` (see[#4439](https://github.com/pymc-devs/pymc3/pull/4439)).

 ### Maintenance
 - `math.log1mexp_numpy` no longer raises RuntimeWarning when given very small inputs. These were commonly observed during NUTS sampling (see [#4428](https://github.com/pymc-devs/pymc3/pull/4428)).
30 changes: 22 additions & 8 deletions pymc3/model.py
@@ -21,6 +21,7 @@
 from typing import TYPE_CHECKING, Any, List, Optional, Type, TypeVar, Union, cast

 import numpy as np
+import pandas as pd
 import scipy.sparse as sps
 import theano
 import theano.graph.basic
@@ -1695,16 +1696,29 @@ def pandas_to_array(data):
     XXX: When `data` is a generator, this will return a Theano tensor!
     """
-    if hasattr(data, "values"): # pandas
-        if data.isnull().any().any(): # missing values
-            ret = np.ma.MaskedArray(data.values, data.isnull().values)
+    if isinstance(data, (pd.Series, pd.DataFrame)):
+        mask = data.isnull()
+        if mask.any().any():
+            # there are missing values
+            ret = np.ma.MaskedArray(data.values, mask.values)
         else:
             ret = data.values
-    elif hasattr(data, "mask"):
-        if data.mask.any():
-            ret = data
-        else: # empty mask
-            ret = data.filled()
+    elif isinstance(data, np.ndarray):
+        if isinstance(data, np.ma.MaskedArray):
+            if not data.mask.any():
+                # empty mask
+                ret = data.filled()
+            else:
+                # already masked and rightly so
+                ret = data
+        else:
+            # already a ndarray, but not masked
+            mask = np.isnan(data)
+            if np.any(mask):
+                ret = np.ma.MaskedArray(data, mask)
+            else:
+                # no masking required
+                ret = data
     elif isinstance(data, theano.graph.basic.Variable):
         ret = data
     elif sps.issparse(data):
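
The new `np.ndarray` branch can be tried in isolation with NumPy alone. The following is a minimal sketch, not the PyMC3 implementation: `mask_missing` is a hypothetical helper name, and it reproduces only the `ndarray` branch of `pandas_to_array` as shown in this diff, assuming a numeric dtype that `np.isnan` accepts.

```python
import numpy as np


def mask_missing(data: np.ndarray) -> np.ndarray:
    """Hypothetical helper mirroring only the ndarray branch of the diff."""
    if isinstance(data, np.ma.MaskedArray):
        if not data.mask.any():
            # Nothing is actually masked, so hand back a plain ndarray.
            return data.filled()
        # At least one entry is masked; keep the masked array unchanged.
        return data
    # Plain ndarray: treat NaN entries as missing values.
    mask = np.isnan(data)
    if np.any(mask):
        return np.ma.MaskedArray(data, mask)
    # No NaN present, so no masking is required.
    return data


observed = np.array([1.0, np.nan, 3.0])
masked = mask_missing(observed)  # NaN entry becomes a masked entry
complete = mask_missing(np.array([1.0, 2.0, 3.0]))  # returned as a plain ndarray
unmasked = mask_missing(np.ma.MaskedArray([1.0, 2.0], mask=[False, False]))
```

In this sketch a NaN in a plain array yields a `MaskedArray`, which is the form the release note describes as triggering automatic imputation, while arrays without missing entries come back as plain `ndarray` objects.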