
Array-valued discrete distribution #1146

Merged · 20 commits · Jun 1, 2022

Conversation

@Mv77 (Contributor) commented May 16, 2022

The current implementation of DiscreteDistribution can represent multivariate distributions, in the sense that it supports random variables with n-dimensional (vector-valued) domains.

Something that it does not allow is representing random variables whose realizations are arrays of arbitrary dimensions. Imagine, for instance, a random variable X that could be np.array([[1,2],[3,4]]) with probability 0.5 or np.array([[0,0],[0,0]]) with probability 0.5. The realizations of X have shape (2,2). We cannot represent X directly: we can in principle represent the distribution of X.flatten() and do the required reshapes later, but this is cumbersome.
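A minimal NumPy sketch of both the current flatten/reshape workaround and the array-valued alternative this PR is after (the nature-last stacking convention here is an assumption for illustration):

```python
import numpy as np

# Two possible (2, 2) realizations, each with probability 0.5
vals = [np.array([[1, 2], [3, 4]]), np.array([[0, 0], [0, 0]])]
pmf = np.array([0.5, 0.5])

# Current workaround: flatten each realization, treat it as a vector-valued
# distribution, and reshape by hand after every draw or expectation.
flat = np.stack([v.flatten() for v in vals])   # shape (2, 4): atom x flat-dim
mean = (pmf @ flat).reshape(2, 2)              # expectation, reshaped back

# What array-valued support allows instead: keep realizations array-shaped,
# with the last axis indexing the random realization ("nature").
X = np.stack(vals, axis=-1)                    # shape (2, 2, 2)
mean_direct = X @ pmf                          # contracts the 'nature' axis

assert np.allclose(mean, mean_direct)
```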

This PR works towards allowing random variables whose realizations are arrays of arbitrary dimensions. I will explain why I think that is useful in a comment below.

Please ensure your pull request adheres to the following guidelines:

  • Tests for new functionality/models or Tests to reproduce the bug-fix in code.
  • Updated documentation of features that add new functionality.
  • Update CHANGELOG.md with major/minor changes.

@codecov-commenter commented May 16, 2022

Codecov Report

Merging #1146 (73f2f78) into master (2155175) will increase coverage by 0.01%.
The diff coverage is 95.58%.

@@            Coverage Diff             @@
##           master    #1146      +/-   ##
==========================================
+ Coverage   74.07%   74.09%   +0.01%     
==========================================
  Files          70       70              
  Lines       10809    10823      +14     
==========================================
+ Hits         8007     8019      +12     
- Misses       2802     2804       +2     
Impacted Files Coverage Δ
...RK/ConsumptionSaving/tests/test_ConsMarkovModel.py 86.56% <ø> (ø)
...K/ConsumptionSaving/tests/test_modelcomparisons.py 98.46% <ø> (ø)
HARK/distribution.py 84.57% <90.00%> (-0.82%) ⬇️
HARK/ConsumptionSaving/ConsIndShockModel.py 85.86% <100.00%> (ø)
HARK/ConsumptionSaving/ConsLaborModel.py 82.90% <100.00%> (ø)
HARK/ConsumptionSaving/ConsMedModel.py 74.04% <100.00%> (ø)
HARK/ConsumptionSaving/ConsPrefShockModel.py 79.68% <100.00%> (ø)
HARK/ConsumptionSaving/ConsRiskyContribModel.py 79.42% <100.00%> (ø)
...ConsumptionSaving/tests/test_ConsPrefShockModel.py 100.00% <100.00%> (ø)
HARK/core.py 91.47% <100.00%> (ø)
... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2155175...73f2f78.

@Mv77 (Contributor, Author) commented May 18, 2022

Here is an example of why I would find this useful.

Consider the portfolio model with sticky "shares". In that model, the end-of-period variables needed to determine next period's state vector are aNrm and Share. Both are, or can be, choices, so the solver builds large tiled arrays with all combinations of the grids, aNrm_tiled and ShareNext.

# Make tiled arrays to calculate future realizations of bNrm and Share
# when integrating over RiskyDstn
self.aNrm_tiled, self.ShareNext = np.meshgrid(
    self.aNrmGrid, self.ShareGrid, indexing="ij"
)
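As a toy illustration of what this tiling produces (the grid sizes are made up, not the model's actual grids):

```python
import numpy as np

aNrmGrid = np.linspace(0.0, 10.0, 5)   # 5 end-of-period asset points (toy grid)
ShareGrid = np.linspace(0.0, 1.0, 3)   # 3 risky-share points (toy grid)

# indexing="ij" keeps the first axis for aNrm and the second for Share
aNrm_tiled, ShareNext = np.meshgrid(aNrmGrid, ShareGrid, indexing="ij")

assert aNrm_tiled.shape == (5, 3)
assert ShareNext.shape == (5, 3)
# aNrm varies down rows, Share across columns
assert np.allclose(aNrm_tiled[:, 0], aNrmGrid)
assert np.allclose(ShareNext[0, :], ShareGrid)
```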

What you then want to do is compute the expectation of various functions (v, dvdm, dvds) of next period's state (m, ShareNext), conditional on every combination of this period's (aNrm, Share). One way to do this, which would move us toward using the same transition equations/functions in both the solution and the simulation of the model, would be to compute the distribution of (m, ShareNext) conditional on every (a, Share) using something like

state_next_dstn = dist_of_function(dstn = ShockDstn, func = lambda shocks: transition(shocks, a, Share))

and then use calc_expectation on state_next_dstn to find the expectations of v, dvdm, and dvds.

The issue with this approach would be that we would need to have one DiscreteDistribution object representing the distribution of (m,ShareNext) for every combination of (a,Share) and would need to take the expectations one by one.

An alternative would be a single DiscreteDistribution object whose realizations X would be matrices of the same size as the "tiled" arrays. Then, X[i,j] would be interpreted as "next period's state conditional on the fact that you chose the i-th a and j-th share and whatever shock was realized." This would be a much more practical object to handle and take expectations over. The issue is that the current discrete distributions do not allow matrix-valued random variables, and that is what this PR wants to change.
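A minimal NumPy sketch of this idea, with made-up shapes, a uniform pmf, and a random array standing in for the transition output (not HARK's actual API):

```python
import numpy as np

# Toy matrix-valued discrete RV: each atom has the shape of the tiled grids,
# e.g. (n_a, n_share), and the last axis indexes the shock realization.
n_a, n_share, n_shocks = 5, 3, 7
rng = np.random.default_rng(0)
X = rng.random((n_a, n_share, n_shocks))  # X[i, j, k]: next state at (a_i, s_j) under shock k
pmf = np.full(n_shocks, 1.0 / n_shocks)

# One vectorized expectation over ALL (a, Share) nodes at once,
# instead of one DiscreteDistribution object per node.
EX = X @ pmf                              # contracts the shock axis; shape (n_a, n_share)

# The same expectation computed node by node, for comparison
EX_loop = np.array([[X[i, j, :] @ pmf for j in range(n_share)] for i in range(n_a)])
assert np.allclose(EX, EX_loop)
```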

@mnwhite (Contributor) commented May 18, 2022 via email

@sbenthall (Contributor):

This looks good!

@Mv77 (Contributor, Author) commented May 18, 2022

Thanks @mnwhite. I think @sbenthall has already merged a calc_expectation that can be given a function as an input. If no function is given, it finds the expectation of the distribution itself.

What I'm going for looks something like this:

state_next_dstn = dist_of_function(shock_dstn, lambda x: transition(x, post_state_now))
end_of_period_Ev = calc_expectation(state_next_dstn, vFunc_next)
end_of_period_Edvdm = calc_expectation(state_next_dstn, dvdmFunc_next)
end_of_period_Edvds = calc_expectation(state_next_dstn, dvdsFunc_next)

@mnwhite (Contributor) commented May 20, 2022 via email

@Mv77 (Contributor, Author) commented May 22, 2022

@mnwhite you are completely right that to take the expectation one would need to work with the permanent shock too. The way I handled that in a proof of concept for the portfolio model was to make the permanent shock part of the "state" whose distribution we are finding at this stage. It is only a "state" for the purpose of taking the expectation; I'm not saying I made it an actual state. Still, it does make things slightly slower.

In terms of speed, the specialized code that we have for each particular model will always win. But I still think the tool is valuable for things like expressing a model with a transition (or a belief about a transition) that can be switched for another with the least pain possible. Having the option won't hurt.

@Mv77 (Contributor, Author) commented May 22, 2022

That is not the sole purpose of the PR, however. Even if my transition setup is not what we want to achieve, enforcing that discrete distributions' values be an np.ndarray is valuable.

  • Knowing the type means the class's methods do not have to check whether the values are a list, an array, or something else, which makes the code more elegant. See, for instance, the .draw() and combine_indep_dstns methods in this PR.
  • It brings us closer to being able to use xarray, a tool that Seb, Alan, and Chris want to use. The big advantage (I hear) is being able to refer to dimensions by name, which would make for more readable and orderly code.

@Mv77 (Contributor, Author) commented May 22, 2022

Base tests are passing. I still need to

  • Fix the examples that are currently failing.
  • Update the class description and documentation.

@Mv77 Mv77 changed the title [WIP] Array-valued discrete distribution Array-valued discrete distribution May 24, 2022
@Mv77 (Contributor, Author) commented May 24, 2022

This is ready for review (or further thoughts).

@sbenthall sbenthall self-assigned this May 25, 2022
@sbenthall (Contributor):

I'll review this.

# numpy is weird about 1-D arrays.
dstn_array = dstn_array.T

f_query = np.apply_along_axis(func, 0, dstn_array, *args)
@Mv77 (Contributor, Author):

@sbenthall keep the behavior of *args in mind when reviewing. It has changed!
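For reference while reviewing, a minimal illustration of how np.apply_along_axis forwards extra positional arguments to the applied function (a toy lambda, not the PR's actual func):

```python
import numpy as np

# Extra positional arguments after the array are passed through to func:
# func receives each 1-D slice along axis 0 plus the scale argument.
arr = np.array([[1.0, 2.0], [3.0, 4.0]])
out = np.apply_along_axis(lambda v, scale: v * scale, 0, arr, 10.0)
assert np.allclose(out, arr * 10.0)
```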

@Mv77 (Contributor, Author):

A quick search does suggest that map with lambda functions is quite slow.

Do you think there is a better, faster way to do it? The issue I was trying to get around is that np.apply_along_axis requires the function's inputs to be 1-D.

@Mv77 (Contributor, Author):

Seems like list comprehensions would work better:
https://stackoverflow.com/questions/1247486/list-comprehension-vs-map

Let me run some tests before merging.
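A rough sketch of the two patterns being compared, applying a hypothetical stand-in for a transition function over the last ("nature") axis (toy data, not HARK code):

```python
import numpy as np

# Hypothetical stand-in for a shock-to-state transition; not HARK's actual func.
def transition(draw):
    return draw ** 2

X = np.random.default_rng(0).random((2, 6, 4))  # 4 realizations of shape (2, 6)

# map + lambda over the 'nature' (last) axis
out_map = np.stack(
    list(map(lambda k: transition(X[..., k]), range(X.shape[-1]))), axis=-1
)

# equivalent list comprehension, usually a bit faster and more readable
out_lc = np.stack([transition(X[..., k]) for k in range(X.shape[-1])], axis=-1)

assert np.allclose(out_map, out_lc)
```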

@sbenthall (Contributor):

Looks like np.apply_over_axes is numpy's function to apply over more than one dimension

https://numpy.org/doc/stable/reference/generated/numpy.apply_over_axes.html#numpy.apply_over_axes

Would that fit?

@Mv77 (Contributor, Author):

Thanks!

That certainly looks like something I could use. The only issue I see is that it does not accept extra arguments, unlike np.apply_along_axis. I could adapt it with a lambda function; I'm not sure whether that would cost too much performance.

I'll run some tests and report on Wed. Thanks!
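A quick sketch of np.apply_over_axes behavior and the lambda-wrapper workaround for extra arguments (toy data, not HARK code):

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# func is called as func(a, axis) for each axis in turn; the reduced
# axes come back with length 1, so dimensions are preserved.
out = np.apply_over_axes(np.sum, a, [0, 2])
assert out.shape == (1, 3, 1)

# Extra arguments are not forwarded, but a lambda wrapper works:
med = np.apply_over_axes(
    lambda arr, ax: np.quantile(arr, 0.5, axis=ax, keepdims=True), a, [2]
)
assert med.shape == (2, 3, 1)
```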

@@ -1206,7 +1206,7 @@ def post_return_derivs(inc_shocks, b_aux, g_aux, s):

     # Define grids
     b_aux_grid = np.concatenate([np.array([0.0]), Rfree * aXtraGrid])
-    g_aux_grid = np.concatenate([np.array([0.0]), max(RiskyDstn.X) * nNrmGrid])
+    g_aux_grid = np.concatenate([np.array([0.0]), max(RiskyDstn.X.flatten()) * nNrmGrid])
@sbenthall (Contributor):

It seems like a lot of extra code is now needed to flip vertical vectors/arrays into horizontal ones.
Would it be possible to have the arrays oriented horizontal in the DiscreteDistribution object?

@Mv77 (Contributor, Author):

Possible yes, but I still thought this would be the path of least resistance.

I think other parts of HARK that deal with multivariate distributions use things like dstn.X[n] to find all the possible draws of the nth dimension. That seems like an intuitive and compact pattern that I would like to keep. It was easier for me to think of nature as always the last dimension.

@sbenthall (Contributor):

"X" is truly a terrible name for an object property that is doing so much work. See #1051

I'm not suggesting you replace "X" with something else in this PR, but I wonder what mathematical object X is now.

It's possible that we could have multiple ways of accessing this data within a distribution, so it requires less manipulation before entering and after pulling it from the object.


 # Construct and return discrete distribution
 return DiscreteDistribution(
-    pmf, X, seed=self.RNG.randint(0, 2 ** 31 - 1, dtype="int32")
+    pmf, X.T, seed=self.RNG.randint(0, 2 ** 31 - 1, dtype="int32")
@sbenthall (Contributor):

This also seems like something the user of DiscreteDistribution would rather not have to keep track of.
Is the issue that a 1-dimensional array is now ambiguously a 1d distribution with N buckets, or an N-D distribution with 1 bucket?

@Mv77 (Contributor, Author):

Yes, exactly.

May be multivariate (list of arrays).
For multivariate distributions, the last dimension of X must index
"nature" or the random realization. For instance, if X.shape == (2,6,4),
the random variable has 4 possible realizations and each of them has shape (2,6).
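A toy illustration of this nature-last convention from the docstring (pure NumPy; the pmf values are made up):

```python
import numpy as np

# If X.shape == (2, 6, 4), there are 4 realizations, each of shape (2, 6).
rng = np.random.default_rng(1)
X = rng.random((2, 6, 4))
pmf = np.array([0.1, 0.2, 0.3, 0.4])

# The k-th realization under the nature-last convention
x_k = X[..., 2]
assert x_k.shape == (2, 6)

# The expectation contracts the last ('nature') axis
EX = X @ pmf
assert EX.shape == (2, 6)
assert np.allclose(EX, sum(pmf[k] * X[..., k] for k in range(4)))
```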
@sbenthall (Contributor):

I believe it would be more consistent with the other downstream code if the 'nature' dimension was the first dimension, not the last.

@sbenthall (Contributor):

I've checked it over. See inline comments. Summary:

  • np.apply_over_axes might be what you're looking for to map a function over multiple dimensions of an array
  • Wouldn't it require less alteration to the code if the 'nature' dimension was the first, not last, dimension?

@sbenthall sbenthall merged commit 704bf8e into econ-ark:master Jun 1, 2022
@sbenthall sbenthall added this to the 0.13.0 milestone Jan 4, 2023