
Use RandomVariables in PyMC models #4440

Closed

Conversation

@brandonwillard (Contributor) commented Jan 25, 2021

This PR prototypes large-scale changes that introduce tighter integration with Theano-PyMC/Aesara and—more specifically—the use of theano.tensor.random.op.RandomVariables.

The original PyMC3 tests will likely not pass any time soon, due to the nature and extent of the changes involved.

This PR is currently for discussion and demonstration purposes only. It is not a place to discuss changes unrelated to the use of RandomVariables and Theano-PyMC/Aesara.

  • Replace a few Distributions with RandomVariables and logp dispatch functions
  • Remove/refactor use of Model.size (was previously Model.ndim)
  • Remove use of ArrayOrdering
  • Refactor use of DictToArrayBijection
  • Get NUTS and posterior predictive sampling to work under these changes

@brandonwillard (author) commented Jan 25, 2021

The general goal behind these initial changes was to start tracking RandomVariable-generated TensorVariables within Model.

The entry point for RandomVariables is the Distribution class.

Changes to Distribution

For demonstration purposes, only two basic Distributions were changed: Uniform and Normal. They now act as simple pass-throughs for RandomVariable.__call__. In other words, they apply RandomVariable Ops to inputs and return the output TensorVariables. Under these changes, Distribution objects are never really instantiated, so their current form—as classes—is not exactly fitting. Expect more changes here.

The log-likelihood methods provided by Distribution are now generic functions via single dispatch. This allows us to obtain log-likelihoods for random variable terms without manually attaching and maintaining Distribution objects on TensorVariables.
This change also removes the awkward coupling introduced by Distributions between a random variable, its inputs, and these effectively external (to Theano-PyMC/Aesara) Distribution objects. Since RandomVariables already connect a random variable's input TensorVariables with its output TensorVariable (i.e. via Apply nodes), the original design/use of Distribution is completely obviated.
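
As an illustration only (this is not the PR's actual code), a single-dispatch log-likelihood registry might look roughly like the sketch below. The NormalRV/UniformRV imports and the (rng, size, dtype, *dist_params) node-input order are assumptions based on Theano-PyMC's RandomVariable interface.

from functools import singledispatch

import numpy as np
import theano.tensor as tt
from theano.tensor.random.basic import NormalRV, UniformRV


@singledispatch
def logpt(op, value, *inputs):
    """Return a log-likelihood graph for `value` under the distribution `op`."""
    raise NotImplementedError(f"No log-likelihood registered for {op}")


@logpt.register(NormalRV)
def logpt_normal(op, value, rng, size, dtype, mu, sigma):
    # Standard normal log-density, written with Theano-PyMC operators
    return -0.5 * ((value - mu) / sigma) ** 2 - tt.log(sigma) - 0.5 * np.log(2 * np.pi)


@logpt.register(UniformRV)
def logpt_uniform(op, value, rng, size, dtype, lower, upper):
    # Constant density within the bounds, -inf outside of them
    return tt.switch(
        tt.and_(tt.ge(value, lower), tt.le(value, upper)),
        -tt.log(upper - lower),
        -np.inf,
    )

A caller would then build a term's log-likelihood graph with something like logpt(rv_var.owner.op, value_var, *rv_var.owner.inputs), dispatching on the Op type rather than on a Distribution object.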

Aside from their role as RandomVariable constructors that simply preserve the old PyMC3 API, Distributions are currently being used to convey transformation information. This information could be conveyed in numerous other ways that do not require Distribution classes, so this is yet another reason to refactor Distributions into simple constructor functions.
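
For concreteness, here is a hedged sketch of what such a constructor function could look like; the transform tag and the model-registration hook mentioned in the comments are hypothetical names, not this branch's API.

from theano.tensor.random.basic import uniform


def Uniform(name, lower, upper, size=None, transform=None):
    """Hypothetical constructor-function replacement for the Uniform class."""
    rv_var = uniform(lower, upper, size=size, name=name)
    # Transformation info rides along on the variable itself rather than on a
    # separate Distribution object.
    rv_var.tag.transform = transform
    # A model-context hook (e.g. something like Model.register_rv) would attach
    # the variable to the current model here.
    return rv_var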

Changes to Model

The changes to Model were fairly straightforward and mostly focused on the old Model.Var method. Now that Distributions do the job of generating TensorVariables (via RandomVariable Ops), there's no need for the TensorVariables that were manually constructed by Model.Var. PyMC3Variable, FreeRV, ObservedRV, and TransformedRV—and all the logic surrounding them—are no longer necessary. All of these classes were used as a means of attaching information and functions to TensorVariables, but we can more easily attach the same information using TensorVariable.tag and access the same functions through the log-likelihood dispatch functions.

In order to represent the association between a random variable and its observations, the code now uses the Observed Op instead of a distinct ObservedRV tensor and a View Op. This approach is much more direct and works better within the Theano-PyMC/Aesara framework.

The other main change to Model was the introduction of "value" variables. These are TensorVariables that are paired with each random variable's TensorVariable (i.e. the ones generated by RandomVariable Ops)—via a value_var tag—and are used to represent the random variables within log-likelihood graphs. In essence, these "value" variables fill in for the old FreeRVs and TransformedRVs.

These distinct TensorVariables are necessary because we can't use the RandomVariable-produced TensorVariables as inputs to the log-likelihood functions; otherwise, all of the log-likelihood functions we would compile from these graphs would only ever compute the log-likelihoods of randomly generated numbers!
In other words, now that we're in the business of generating two types of graphs (i.e. sample- and measure/log-likelihood-space graphs), we need two TensorVariables to represent the same random variable in both graphs.
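
Here is a minimal sketch of that pairing, assuming Theano-PyMC's RandomVariable call interface; the value_var tag name follows this PR's convention, while everything else is purely illustrative.

import theano.tensor as tt
from theano.tensor.random.basic import normal

mu = tt.scalar("mu")
X_rv = normal(mu, 1.0, name="X")     # sample-space variable (a RandomVariable output)

X_value = X_rv.type(name="X_value")  # measure-space "value" variable of the same type
X_rv.tag.value_var = X_value         # pair the two via a tag

# Log-likelihood graphs are then written in terms of X_value rather than X_rv,
# so compiling them never draws random numbers for X.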

Demonstration

Under the current changes, we can create a simple model:

import numpy as np

import theano
import theano.tensor as tt
import pymc3 as pm

from theano.printing import debugprint as tt_dprint


y_sample = np.random.normal(size=(2, 2))

with pm.Model() as model:
    a_tt = tt.vector("a")
    a_tt.tag.test_value = np.r_[1, 2]
    B_rv = pm.Uniform("B", a_tt, 3, size=(2, 2))
    C_tt = B_rv * 2 + a_tt
    Y_rv = pm.Normal("Y", B_rv, C_tt, observed=y_sample)

Now, the TensorVariables produced by the Distributions pm.Uniform and pm.Normal are fully functional sample-space graphs:

>>> tt_dprint(B_rv)
uniform_rv.1 [id A] 'B'   
 |RandomStateSharedVariable(<RandomState(MT19937) at 0x7F5BBB793440>) [id B]
 |TensorConstant{(2,) of 2} [id C]
 |TensorConstant{11} [id D]
 |a [id E]
 |TensorConstant{3.0} [id F]
>>> tt_dprint(Y_rv)
<theano.tensor.random.op.Observed object at 0x7f5bcdf40490> [id A] 'Y'   
 |normal_rv.1 [id B] 'Y'   
 | |RandomStateSharedVariable(<RandomState(MT19937) at 0x7F5BBB793540>) [id C]
 | |TensorConstant{[]} [id D]
 | |TensorConstant{11} [id E]
 | |uniform_rv.1 [id F] 'B'   
 | | |RandomStateSharedVariable(<RandomState(MT19937) at 0x7F5BBB793440>) [id G]
 | | |TensorConstant{(2,) of 2} [id H]
 | | |TensorConstant{11} [id I]
 | | |a [id J]
 | | |TensorConstant{3.0} [id K]
 | |Elemwise{mul,no_inplace} [id L] ''   
 |   |InplaceDimShuffle{x,x} [id M] ''   
 |   | |TensorConstant{1.0} [id N]
 |   |Elemwise{add,no_inplace} [id O] ''   
 |     |Elemwise{mul,no_inplace} [id P] ''   
 |     | |uniform_rv.1 [id F] 'B'   
 |     | |InplaceDimShuffle{x,x} [id Q] ''   
 |     |   |TensorConstant{2} [id R]
 |     |InplaceDimShuffle{x,0} [id S] ''   
 |       |a [id J]
 |TensorConstant{[[-0.30786..58805874]]} [id T]

Although Distribution.random is no longer present, we can achieve the same thing by simply compiling a Theano-PyMC/Aesara graph:

>>> B_rv.eval({a_tt: np.r_[1, 2]})
array([[1.81963075, 2.1584207 ],
       [1.27165215, 2.5665736 ]])

We can generate a measure-space graph for the total log-likelihood:

>>> model_logpt = model.logpt
>>> tt_dprint(model_logpt)
Sum{acc_dtype=float64} [id A] '__logp'   
 |MakeVector{dtype='float64'} [id B] ''   
   |Sum{acc_dtype=float64} [id C] ''   
   | |Elemwise{mul,no_inplace} [id D] '__logp_B'   
   |   |Elemwise{switch,no_inplace} [id E] ''   
   |   | |Elemwise{mul,no_inplace} [id F] ''   
   |   | | |Elemwise{mul,no_inplace} [id G] ''   
   |   | | | |InplaceDimShuffle{x,x} [id H] ''   
   |   | | | | |TensorConstant{1} [id I]
   |   | | | |Elemwise{mul,no_inplace} [id J] ''   
   |   | | |   |InplaceDimShuffle{x,x} [id K] ''   
   |   | | |   | |TensorConstant{1} [id L]
   |   | | |   |Elemwise{ge,no_inplace} [id M] ''   
   |   | | |     |B [id N]
   |   | | |     |InplaceDimShuffle{x,0} [id O] ''   
   |   | | |       |a [id P]
...

The variable named B in that graph is not the same variable as B_rv; it's B_rv's "value" variable (i.e. B_rv.tag.value_var).

Sampling prior/posterior predictive values is now much simpler (in code) and more performant. Only the former has been implemented here, though:

with model:
    prior_samples = pm.sample_prior_predictive(
        samples=3, indep_var_values={a_tt: np.r_[2.0, 3.3]}
    )
>>> prior_samples
{'Y': array([[[-0.08782206, 12.54011863],
         [-2.17211159,  7.47383723]],
 
        [[-0.08782206, 12.54011863],
         [-2.17211159,  7.47383723]],
 
        [[-0.08782206, 12.54011863],
         [-2.17211159,  7.47383723]]]),
 'B': array([[[2.40981538, 3.25247379],
         [2.13582608, 3.13002792]],
 
        [[2.40981538, 3.25247379],
         [2.13582608, 3.13002792]],
 
        [[2.40981538, 3.25247379],
         [2.13582608, 3.13002792]]])}

In these examples, the model contained an "independent" variable, a_tt. These are exposed through the newly added model.independent_vars property.
This kind of variable is not possible in the current PyMC3 framework without resorting to shared variables. I'm not sure if we want/need to allow these variable types, but it seemed like an unnecessary shortcoming, so I added it.
All these changes work with shared variables, but model.logp—and anything else that compiles graphs containing independent variables—won't work unless we provide a way to specify their values. This is almost entirely a matter of class and function design/refactoring, though.
This feature has been removed.

@Sayam753 (Member) commented:

Hi @brandonwillard,
Would it be a good idea to get involved in this PR by working on the test cases? That would help me better understand these changes.

@brandonwillard (author) replied:

> Would it be a good idea to get involved in this PR by working on the test cases? That would help me better understand these changes.

That's a tough question, because the real issue is the underlying design changes. Basically, I don't want to go through a lot of refactoring only to have it scrapped.

At some point soon, we should be able to say whether or not we want to stick with some of these changes (e.g. the logp dispatch functions, turning Distributions into simple constructor functions, etc.). Once we've made decisions like those, we'll be able to determine the kinds of large-scale changes that are worth making.

Also, I should probably put this branch in the pymc3 repo itself; then it will be more suitable for accepting PRs. Otherwise, one would have to submit PRs to my personal repo, and that's harder for everyone else to track.

@ricardoV94 (Member) commented Jan 28, 2021

I don't know if this helps at all in deciding how to implement things in the end, but is there a straightforward way to adapt other PyMC3 "meta distribution" constructors (for lack of a better word) under this framework? Stuff like Truncated and Censored distributions, Distributions for missing observations, Mixtures, RandomWalk, DensityDist, Potentials... Are any of these particularly challenging to adapt? Could these influence the final design for the RandomVariables?

Just trying to help get the discussion going.

@brandonwillard (author) replied:

> Is there a straightforward way to adapt other PyMC3 "meta distribution" constructors (for lack of a better word) under this framework? Stuff like Truncated and Censored distributions, Distributions for missing observations, Mixtures, RandomWalk, DensityDist, Potentials... Are any of these particularly challenging to adapt? Could these influence the final design for the RandomVariables?

Those are exactly the kinds of things we need to consider. Right now, I don't see any reason why the current approach wouldn't work in those cases; however, like many other Distributions that do not have RandomVariable implementations, we will need to turn the existing Distribution code (e.g. the contents of Distribution.random) into new RandomVariable classes.

This work is in line with the general theme of these PyMC3 changes: move more PyMC3 logic into Ops!

@brandonwillard (author) commented Jan 29, 2021

The most recent commit starts to address one of the biggest issues in PyMC3: its current dependence on concrete (i.e. non-symbolic) shapes.

More specifically, the ArrayOrdering and associated VarMap classes rely on the concrete shape information provided by FreeRV.size and FreeRV.dshape to map multiple model variables into a single "raveled" vector via DictToArrayBijection; however, the variables in question are purely symbolic and do not actually provide concrete shape values with which such a mapping can be made. Due to this, PyMC3 requires user-provided concrete shapes, which—in turn—requires extra logic that works entirely outside of Aesara and ultimately limits interactions between the two systems.

After some review, it seems as though most of the logic that relies on ArrayOrdering and DictToArrayBijection does not actually require this shape information upfront. Instead, use of these two classes is largely limited to key points within the step methods where concrete sample values are always present. In those situations, one doesn't need to obtain the shape information from the symbolic variables; it can be taken straight from the concrete values of those variables.

The last commit effectively removes the need for ArrayOrdering by slightly changing the interface to DictToArrayBijection so that one simply has to specify the variable order, each variable's concrete shape, and their dtypes. Now, instead of creating an ArrayOrdering object well before sampling and then passing that as an argument to DictToArrayBijection's constructor down the line, one creates a DictToArrayBijection and provides all the information it needs using the available samples.

The basics of DictToArrayBijection appear to work exactly as they did before under these changes:

import numpy as np
from pymc3.blocking import DictToArrayBijection

dpoint = {"v1": np.random.normal(size=(4, 3)).astype("float32"),
          "v2": np.random.normal(size=(2, 4)).astype("float64"),
          "v3": np.random.normal(size=(2, 4)).astype("float64")}

dtab = DictToArrayBijection([k for k in dpoint.keys()],
                            [v.shape for v in dpoint.values()],
                            [v.dtype for v in dpoint.values()])

# This simply ravels/flattens all the values in the dictionary and concatenates them
apoint = dtab.map(dpoint)
assert np.array_equal(np.concatenate([v.ravel() for v in dpoint.values()]), apoint)

# This reverts the concatenated values back to a dictionary 
# (the result should be equal to `dpoint`)
dpoint_rmap = dtab.rmap(apoint)
assert dpoint_rmap.keys() == dpoint.keys()
assert all(np.array_equal(a, b) for a, b in zip(dpoint.values(), dpoint_rmap.values()))

I've refactored most uses of ArrayOrdering, VarMap, and DictToArrayBijection to work like the above example, except for pymc3.variational.opvi.Group.__init_group__. The logic in that situation is not very clear, and the use of the now obsolete ArrayOrdering and VarMap is spread all across the class design by way of properties like Group.bdim, Group.ndim, etc. Basically, all the logic that requires Group.ordering and Group.bij—and the properties and methods that use those values—needs to be refactored/removed and brought closer to wherever the concrete variable values are used in that process.

If there are any reasons why that shape information is needed outside of a sampling context, please tell me. In the meantime, I'll finish making the changes necessary to sample models. After that, I'll look into variable dimension sampling (e.g. Dirichlet processes). This is a situation that's likely to conflict with other parts of PyMC3 that assume fixed dimensions (e.g. the containers holding sample results, specific step methods, etc.), but, after these changes, it's something that's easy to do in terms of sampling and Aesara mechanics.

@twiecki (Member) commented Feb 1, 2021

We should make sure to benchmark this properly so as not to introduce any perf regressions.


-    kinetic = pot.velocity_energy(p_new, v_new)
+    kinetic = pot.velocity_energy(p_new.data, v_new)
     energy = kinetic - logp

     return State(q_new, p_new, v_new, q_new_grad, energy, logp)
A reviewer (Member) asked:

q_new and p_new might need to be packed back into RaveledData here, right?

@brandonwillard (author) replied:

They should be RaveledData instances on that line (i.e. 117).

Note: I originally used an ndarray subclass to track point_map_info instead of this RaveledData type. It was more convenient to use, because these lines didn't need to change, but it would also hold onto the mapping information for longer than desired, and that didn't seem good. I was able to make it return normal ndarrays whenever an operation changed its shape or dtype, but that wouldn't cover everything, so I scrapped it.
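
For readers following along, here is a rough sketch of what a RaveledData-style container could look like; the field names are guesses for illustration, not the branch's actual definition.

from collections import namedtuple

import numpy as np

# The raveled vector plus the metadata needed to map it back to per-variable arrays
RaveledData = namedtuple("RaveledData", ["data", "point_map_info"])

point = {"v1": np.zeros((4, 3)), "v2": np.ones((2,))}
raveled = RaveledData(
    data=np.concatenate([v.ravel() for v in point.values()]),
    point_map_info=[(name, v.shape, v.dtype) for name, v in point.items()],
)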

@brandonwillard (author) commented Feb 2, 2021

With the most recent update, it should now be possible to sample a simple model with NUTS:

import numpy as np

import theano
import theano.tensor as tt
import pymc3 as pm

from theano.printing import debugprint as tt_dprint

theano.config.compute_test_value = "warn"
theano.config.gcc = ""

np.random.seed(2344)
y_sample = np.random.normal(20.2, 0.5, size=(3,))

with pm.Model() as model:
    B_rv = pm.Normal("B", 0.0, 10.0)
    B_rv.tag.test_value = np.array(0.0)
    Y_rv = pm.Normal("Y", B_rv, 1.0, observed=y_sample)

with model:
    trace = pm.sample(
        draws=100,
        chains=1,
        cores=1,
        tune=10,
        compute_convergence_checks=False,
        return_inferencedata=False,
    )
Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [B]
Sampling 1 chain for 10 tune and 100 draw iterations (10 + 100 draws total) took 0 seconds.
The acceptance probability does not match the target. It is 0.92719043799477, but should be close to 0.8. Try to increase the number of tuning steps.
>>> trace.get_values("B").mean()
20.153435618022563

Currently, a size parameter is needed in order to tell the RandomVariable Op to conform to the shape of the observed values. The reason: the Observed Op that holds the RandomVariable + data pair is pretty strict about its two inputs matching (i.e. their types must be exactly the same). There are numerous ways to address this, but, for now, if anyone is going to test this branch, just keep that in mind.
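
For instance, using the model above, that workaround would look something like the following (an illustrative guess based on the size argument shown earlier in this PR):

# Pass `size` explicitly so the RandomVariable's output type matches the
# observed array's shape (y_sample has shape (3,) here).
Y_rv = pm.Normal("Y", B_rv, 1.0, size=(3,), observed=y_sample)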

Also, there's a reason for those compute_convergence_checks and return_inferencedata values: ArviZ performs some log-likelihood operations using the old API (i.e. without the changes in this branch), so we need to disable those steps.

Next, I'll implement posterior predictive sampling.

@fonnesbeck (Member) commented:

@brandonwillard so is this going to be merged to master as the PR currently indicates, or to a v4 branch?

I can devote some time today to helping with your checklist.

            return self.function(*params)
        else:
            return np.array([self.function(*params) for _ in range(size[0])])
        # size = to_tuple(size)
@fonnesbeck (Member) commented Feb 4, 2021:

Does this need refactoring then, or does random just go away?

@brandonwillard (author) replied Feb 4, 2021:

Distribution.random is gone forever; however, some of the code in our Distribution.random implementations can be repurposed in a RandomVariable.perform, especially for Distributions that do not already have corresponding RandomVariable subclasses in Aesara.
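
As a rough sketch of that repurposing, the class below follows Aesara's RandomVariable interface (the rng_fn override point and the class attributes are assumptions about this branch), and the TriangularRV example itself is made up purely for illustration.

import numpy as np
from theano.tensor.random.op import RandomVariable


class TriangularRV(RandomVariable):
    name = "triangular"
    ndim_supp = 0              # scalar support
    ndims_params = [0, 0, 0]   # lower, mode, and upper are all scalars
    dtype = "floatX"
    _print_name = ("Triangular", "\\operatorname{Triangular}")

    @classmethod
    def rng_fn(cls, rng, lower, mode, upper, size):
        # The sampling code from the old Distribution.random would move here
        return rng.triangular(lower, mode, upper, size=size)


triangular = TriangularRV()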

@brandonwillard (author) replied:

> @brandonwillard many of the points you listed are related to having non-constant shape during sampling.

They're actually about the use of symbolic variables as a means of obtaining concrete shapes (e.g. through test values), and the way those values permeate our logic. Those are the fundamental problems that need to be addressed in order to move forward.

In other words, we need to get concrete shape values from concrete samples of the variables and only those. We obtain those values from our Python-based sampling processes; unfortunately, those sampling processes need initial values with which to start, and those have been provided by—or stored in—TensorVariable.tag.test_value. I'm guessing that this symbolic variable (i.e. TensorVariable) -> test value -> initial value connection is what led to our current predicament. We simply need to untangle these components in our logic.

I've phrased most things in terms of "dynamic" shapes because that's one of the easiest ways to describe how this entanglement is bad. Put another way, properties like var.dshape -> Tuple[int, ...] and var.dsize -> int make no sense when var is an arbitrary TensorVariable that could realistically take any shape. Something like var.tag.test_value.[shape|size] makes marginally more sense, but it's still bad. Instead, we should treat initial values as something entirely independent from the symbolic variables and never use test values in any of our logic. Why? Because they are entirely independent.

In summary, we should essentially outlaw the use of test values for anything except debugging.

> I get that there might be an opportunity to allow for this flexibility, but I'm not sure if that should be a priority already.

Dynamic shape handling isn't just about sampling fancy models—although that is extremely important in itself—it's also about model (object) reuse and the use of out-of-sample data. Our current implicit reliance on test values and fixed-shape logic has led to inflexible code that can barely handle changes to the data in a model. For instance, we currently need to recreate models when the dimensions of their input data change. This is very costly when the models are non-trivial and require extensive C compilation. The worst part is that these limitations are not due to Theano/Aesara; they're due to PyMC3's inflexibility.

No, we don't need to make all of our samplers capable of handling dynamic shapes, especially in cases where the underlying maths/methods need to be carefully reconsidered; however, we do need to relocate that responsibility to the samplers themselves and update the internal API so that it works in a "shape agnostic" and "shape isolated" fashion.

In general, our API (e.g. the Model object, abstract methods, etc.) shouldn't ever allow one to assume that the shapes are fixed, or even provide values of any sort for those shapes. The developer of a sampler would—for instance—need to work that out using an initial value and/or the samples they generate in their sampler.

> We don't know if/how the backends (including ArviZ) can even handle non-constant shapes.

That is a real limitation, but it shouldn't be a limitation of PyMC3. If someone writes a sampler that can handle infinite dimensional distributions or anything like that, PyMC3 should be able to handle it. There's absolutely no reason that the core PyMC3 interface and functions can't handle sample results in the form of ragged arrays/lists. Sure, some summary and diagnostics functions might not like that, but they can always be made conditional on the dimensionality of their input (i.e. don't run them when the samples change dimension).

Stochastic processes with these kinds of dimensional properties are too important to ignore unnecessarily.

> To me it sounds more like a GSoC-like follow-up project?

Some parts of it, sure, but the core refactoring work that we're talking about in this PR needs to be done here and now.

@Spaak (Member) commented Feb 4, 2021

> never use test values in any of our logic

I must admit that I don't fully understand all the intricacies of the transition to RandomVariables, but this is a great point. I've felt for a while already that the dependence of logic on test values should be avoided.

@fonnesbeck (Member) commented:

Since test values are essentially an attribute of a particular model-fitting run, perhaps they should be passed as a dict to sample or fit.

@brandonwillard (author) commented Feb 4, 2021

> @brandonwillard so is this going to be merged to master as the PR currently indicates, or to a v4 branch?

@fonnesbeck, good point; I did make the PR target master, but I didn't actually intend to merge it into master (at least not any time soon). I've already pushed these changes into v4, so this PR is only relevant for the comments.

I can put the above bullet points into an issue and we can start putting in PRs to the now updated v4 branch. Sound good?

@brandonwillard (author) replied:

> Since test values are essentially an attribute of a particular model-fitting run, perhaps they should be passed as a dict to sample or fit.

Exactly, and, in both cases, I believe that would serve as an initial values parameter.
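
For what it's worth, pm.sample already accepts a start dict of per-variable values, so the interface could plausibly end up looking like this (illustrative only, not a decision made in this PR):

import numpy as np
import pymc3 as pm

with model:
    trace = pm.sample(draws=500, start={"B": np.array(0.0)})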

@ricardoV94 (Member) commented:

@brandonwillard I have a late question. You mentioned we need to convert the missing random methods to RandomVariable Ops. Will this be done in the Aesara library or here in PyMC?

@brandonwillard (author) replied:

> @brandonwillard I have a late question. You mentioned we need to convert the missing random methods to RandomVariable Ops. Will this be done in the Aesara library or here in PyMC?

In PyMC.

I don't see us adding any more RandomVariable Ops to Aesara. In fact, I imagine that PyMC will be the package that provides RandomVariable Ops to Aesara users, and Aesara itself will only provide the base RandomVariable Op implementation and associated Types.

…d basic dists

These changes can be summarized as follows:
- `Model` objects now track fully functional Theano graphs that represent all
relationships between random and "deterministic" variables.  These graphs are
called "sample-space" graphs.  `Model.unobserved_RVs`, `Model.basic_RVs`,
`Model.free_RVs`, and `Model.observed_RVs` contain these
graphs (i.e. `TensorVariable`s), which are generated by `RandomVariable` `Op`s.
- For each random variable, there is now a corresponding "measure-space"
variable (i.e. a `TensorVariable` that corresponds to said variable in a
log-likelihood graph).  These variables are available as `rv_var.tag.value_var`,
for each random variable `rv_var`, or via `Model.vars`.
- Log-likelihood (i.e. measure-space) graphs are now created for individual
random variables by way of the generic functions `logpt`, `logcdf`,
`logp_nojac`, and `logpt_sum` in `pymc3.distributions`.
- Numerous uses of concrete shape information stemming from `Model`
objects (e.g. `Model.size`) have been removed/refactored.
- Use of `FreeRV`, `ObservedRV`, `MultiObservedRV`, and `TransformedRV` has been
deprecated.  The information previously stored in these classes is now tracked
using `TensorVariable.tag`, and log-likelihoods are generated using the
aforementioned `log*` generic functions.

This commit changes `DictToArrayBijection` so that it returns a `RaveledVars`
datatype that contains the original raveled and concatenated vector along with
the information needed to revert it back to dictionary/variables form.

Simply put, the variables-to-single-vector mapping steps have been pushed away
from the model object and its symbolic terms and closer to the (sampling)
processes that produce and work with `ndarray` values for said terms.  In doing
so, we can operate under fewer unnecessarily strong assumptions (e.g. that the
shapes of each term are static and equal to the initial test points), and let
the sampling processes that require vector-only steps deal with any changes in
the mappings.

The approach currently being used is rather inefficient.  Instead, we should
change the `size` parameters for `RandomVariable` terms in the sample-space
graph(s) so that they match arrays of the inputs in the trace and the desired
number of output samples.  This would allow the compiled graph to vectorize
operations (when it can) and sample variables more efficiently in large batches.

Classes and functions removed:
- PyMC3Variable
- ObservedRV
- FreeRV
- MultiObservedRV
- TransformedRV
- ArrayOrdering
- VarMap
- DataMap
- _DrawValuesContext
- _DrawValuesContextBlocker
- is_fast_drawable
- _compile_theano_function
- vectorize_theano_function
- get_vectorize_signature
- _draw_value
- draw_values
- generate_samples
- fast_sample_posterior_predictive

Modules removed:
- pymc3.distributions.posterior_predictive
- pymc3.tests.test_random
@@ -1285,7 +1294,7 @@ def __init__(self, groups, model=None):
         self._scale_cost_to_minibatch = theano.shared(np.int8(1))
         model = modelcontext(model)
         if not model.free_RVs:
-            raise TypeError("Model does not have FreeRVs")
+            raise TypeError("Model does not have an free RVs")

A reviewer (Member) suggested:

-            raise TypeError("Model does not have an free RVs")
+            raise TypeError("Model does not have any free RVs")

"""Convenience attribute to return tag.test_value"""
return self.tag.test_value


def pandas_to_array(data):

A reviewer (Member) commented:

Not related to this PR, but this function could probably benefit from a new name.

@brandonwillard (author) commented:

All right, I'm closing this. Work will continue on the v4 branch in this repository.

@twiecki (Member) commented Feb 14, 2021

@brandonwillard Do you want to open a PR for that too? That way it's easier to track progress.
