Simulation improvement for lifecycle models #95

Closed
mnwhite opened this issue Jan 9, 2017 · 11 comments

Comments

@mnwhite
Contributor

mnwhite commented Jan 9, 2017

The "simulation overhaul" PR made simulation better (indeed, actually functional) for macro models, but at the cost of making it cripplingly slow for micro-oriented lifecycle models. We should fix this.

The old simulation system would simulate N agents for T periods, with death captured through weighting the various periods of simulation. With proper demographic and mortality weighting, the entire history of NT simulated observations could be interpreted as a cross section of the population. This was good for micro models where agents just do their own thing, but bad-to-useless for macro models where agents' collective behavior feeds back into the aggregate state.

The new simulation system tracks the "current population", with agents actually dying and being replaced on the fly. Each simulated agent has a t_age and a t_cycle, so that we know how old they are in absolute terms and where in the cycle they are. This enables macro models with general equilibrium features to work properly, and it is just fine for perpetual youth models where t_cycle=0 for all agents, or more generally whenever T_cycle is small.

However, when T_cycle is large (say, 384 in the case of lifecycle cstwMPC), simulation is cripplingly slow in the new system. Each period, the simulator has to split out the agents by t_cycle and use the appropriate period-specific policy functions for each one. As is, that requires AgentCount*T_cycle equality comparisons every simulated period, and T_cycle calls to the policy functions. The latter can't be avoided if we want a unified simulation system, but there might be a fix for the former.
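To make the cost concrete, here is a minimal sketch of the per-period dispatch described above. It is not HARK's actual code: the function name `simulate_one_period`, the variable `m_now`, and the toy linear policy rules are assumptions for illustration only.

```python
import numpy as np

def simulate_one_period(t_cycle, m_now, policy_funcs):
    """Apply the period-specific policy function to each group of agents."""
    c_now = np.empty_like(m_now)
    T_cycle = len(policy_funcs)
    for j in range(T_cycle):
        # One pass over all agents per cycle period:
        # AgentCount * T_cycle equality comparisons in total.
        these = t_cycle == j
        # One policy-function call per cycle period (T_cycle calls in total).
        c_now[these] = policy_funcs[j](m_now[these])
    return c_now

# Hypothetical setup: three cycle periods with toy linear consumption rules.
policy_funcs = [lambda m, k=k: 0.1 * (k + 1) * m for k in range(3)]
t_cycle = np.array([0, 2, 1, 2, 0])
m_now = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
c_now = simulate_one_period(t_cycle, m_now, policy_funcs)
```

With T_cycle=384, the comparison `t_cycle == j` is evaluated 384 times per simulated period, even though each agent matches exactly one value of j.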

I propose using the attribute t_cycle_array as a Boolean array indicating each simulated agent's t_cycle in each simulated period: t_cycle_array[t,j,i]=True if and only if agent i has t_cycle=j when t_sim=t. This would only yield a speedup if the simulation is run at least twice; on a single run there is no reduction in computation.
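A rough sketch of how such an array could be built, assuming the whole history of t_cycle values is already known (for example from pre-drawn mortality). The helper name `make_t_cycle_array` and the input `t_cycle_hist` are hypothetical; only the indexing convention t_cycle_array[t,j,i] comes from the proposal.

```python
import numpy as np

def make_t_cycle_array(t_cycle_hist, T_cycle):
    """
    Build a Boolean array B with B[t, j, i] == True iff agent i has
    t_cycle == j in simulated period t.

    t_cycle_hist : int array of shape (T_sim, AgentCount)  (assumed input)
    """
    cycle_index = np.arange(T_cycle)
    # Broadcast (T_sim, 1, AgentCount) against (1, T_cycle, 1).
    return t_cycle_hist[:, None, :] == cycle_index[None, :, None]

# Toy history: 4 simulated periods, 3 agents, T_cycle = 3.
t_cycle_hist = np.array([[0, 1, 2],
                         [1, 2, 0],
                         [2, 0, 1],
                         [0, 1, 2]])
t_cycle_array = make_t_cycle_array(t_cycle_hist, T_cycle=3)

# On later runs the mask is a lookup rather than a fresh comparison.
these = t_cycle_array[1, 2]  # agents with t_cycle == 2 when t_sim == 1
```

The comparisons are done once up front, so the saving only appears on repeated runs. The array holds T_sim * T_cycle * AgentCount Booleans, which can become large.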

This might be useful in conjunction with makeShockHistory, which pre-specifies shocks, and is an impetus to implement the makeMortalityHistory idea previously discussed.

@mnwhite
Contributor Author

mnwhite commented Jun 14, 2018

After thinking about this on the back burner for a while, I think the solution is to rework the simulation framework so that models can handle three types of simulation, chosen with a single parameter:

  1. "Enumerated population" style, what we have now. There is an explicitly enumerated population of agents; agents "age", die, and are replaced as time proceeds. This works for all models, but is extremely slow with lifecycle models with a large T_cycle.

  2. "Cohort history" style, the version I originally programmed. One explicitly enumerated "birth cohort" is simulated from the first period of the model to the last, with no death and replacement. Cumulative survival weights can be calculated and the TxN history arrays can be reshaped and interpreted as a cross section of the population (i.e. many cohorts, each at a different period of time). However, this is only valid if there are no aggregate shocks or other effects that make absolute time relevant. When applicable, this style is faster to run because there are fewer "age checks" and fewer calls to interpolated functions.

  3. "Vectorized population" style, like in (most?) HANK models. This is like (1), but rather than tracking a population of explicitly enumerated agents, we instead (finely) discretize the idiosyncratic state space and track the distribution of agents as a stochastic vector (population weights that sum to 1).
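The survival weighting in style (2) can be sketched as follows. This is an illustration under simplifying assumptions, not the original implementation: the function name `cohort_weights` is hypothetical, LivPrb[t] is taken to be the probability of surviving from period t to t+1, and population growth (the "demographic" part of the weighting) is ignored.

```python
import numpy as np

def cohort_weights(LivPrb, N):
    """
    Cross-sectional weights for a T x N cohort history.

    LivPrb : length-T sequence; LivPrb[t] is the survival probability
             from period t to t+1 (assumed convention).
    N      : number of simulated agents in the birth cohort.
    """
    LivPrb = np.asarray(LivPrb, dtype=float)
    # Probability of being alive at the start of period t; starts at 1.0.
    CumLivPrb = np.concatenate(([1.0], np.cumprod(LivPrb[:-1])))
    weights = np.repeat(CumLivPrb[:, None], N, axis=1)  # shape (T, N)
    return weights / weights.sum()  # normalize to sum to 1

LivPrb = [0.9, 0.8, 0.5]
N = 2
w = cohort_weights(LivPrb, N)

# A toy T x N history array (e.g. wealth), flattened into a cross section.
history = np.arange(6, dtype=float).reshape(3, 2)
weighted_mean = np.sum(w.reshape(-1) * history.reshape(-1))
```

Each of the T*N simulated observations is then treated as one member of the population, weighted by the probability that a member of the cohort survives to that age.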

I need to think about a framework that would allow these styles to be interchangeable.
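For reference, the core of style (3) is a distribution update over a discretized state grid. The sketch below uses a made-up 3-point grid and transition matrix; in a real model the matrix would have to be constructed from the policy functions and shock distributions, which is not shown here. The name `step_distribution` is hypothetical.

```python
import numpy as np

def step_distribution(dist, trans):
    """
    Advance the population distribution one period.

    dist  : shape (K,), population weights over K grid points (sums to 1)
    trans : shape (K, K), trans[a, b] = probability of moving from
            grid point a to grid point b (each row sums to 1)
    """
    return dist @ trans

# Toy transition matrix over a 3-point grid (illustrative values only).
trans = np.array([[0.9, 0.1, 0.0],
                  [0.2, 0.7, 0.1],
                  [0.0, 0.3, 0.7]])
dist = np.array([1.0, 0.0, 0.0])  # everyone starts at the first grid point
for _ in range(50):
    dist = step_distribution(dist, trans)
```

No individual agents are tracked; the whole population is the vector `dist`, which stays nonnegative and keeps summing to 1 as long as each row of `trans` does.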

@llorracc
Collaborator

We definitely need 1. and 3. as tools. Version 2 is an interesting idea, but the flaws you identify are probably fatal: almost every application of these models will want some kind of aggregate events, and I don't see a way around the fact that once you need to keep track of each cohort's idiosyncratic history of aggregate shocks, things probably get about as unwieldy as in 1.

@mnwhite
Contributor Author

mnwhite commented Jun 15, 2018

The point of (2) is that it's faster than (1), though only applicable in some cases. We would be offering people the option to use (2) when their application allows it.

Just for perspective, the following applications in HARK would be better with (2):

  • SolvingMicroDSOPs
  • cstwMPC lifecycle, which has no aggregate shocks
  • my JMP
  • my medical insurance model

@llorracc
Collaborator

OK, maybe there's enough of a use case after all.

@llorracc
Collaborator

Having thought about this a bit more, my guess is that if a full and complete treatment of this issue were performed, the bottom line would be that the interactions between history and the life cycle would have only second or third order effects. Medium/high frequency dynamics will mostly be dominated by short-horizon people, whose behavior will be the same regardless of distant historical events. Proving that, however, would be a lot of work ...

@mnwhite
Contributor Author

mnwhite commented Apr 26, 2019 via email

@sbenthall sbenthall added this to the 1.0.0 milestone Mar 15, 2020
@mnwhite
Contributor Author

mnwhite commented Aug 24, 2020

The first version of this would not be too difficult to implement once model structure is formalized with the work Sebastian is doing. There would be a switch that indicates whether agents in the model should actually be killed according to their LivPrb or instead have their CumLivPrb downweighted by LivPrb (starting at 1.0).
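A minimal sketch of what such a switch might look like. The function `mortality_step` and the flag name `kill_agents` are hypothetical, chosen only to illustrate the two branches; LivPrb and CumLivPrb are the names used above.

```python
import numpy as np

def mortality_step(LivPrb_now, CumLivPrb, rng, kill_agents):
    """
    One period of mortality under either regime.

    LivPrb_now : shape (AgentCount,), each agent's survival probability
    CumLivPrb  : shape (AgentCount,), cumulative survival weights
    Returns (died, CumLivPrb_new).
    """
    if kill_agents:
        # Enumerated population: draw deaths; weights are left untouched
        # and the caller is responsible for replacing dead agents.
        died = rng.random(LivPrb_now.shape) >= LivPrb_now
        return died, CumLivPrb
    # Cohort history: nobody dies; survival is folded into the weight.
    died = np.zeros(LivPrb_now.shape, dtype=bool)
    return died, CumLivPrb * LivPrb_now

rng = np.random.default_rng(0)
LivPrb_now = np.array([0.9, 0.5, 1.0])
CumLivPrb = np.ones(3)  # weights start at 1.0

died_w, weights_w = mortality_step(LivPrb_now, CumLivPrb, rng, kill_agents=False)
died_k, weights_k = mortality_step(LivPrb_now, CumLivPrb, rng, kill_agents=True)
```

In the downweighting branch, repeated calls multiply CumLivPrb by each period's LivPrb, so the weight at any age is the probability of having survived that long.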

@sbenthall
Contributor

Related to #847.
I love the idea of pluggable population models.
But with Matt out for a few months, this will need to be bumped to 1.x.

@sbenthall sbenthall modified the milestones: 1.0.0, 1.x.y Dec 15, 2020
@sbenthall sbenthall modified the milestones: 1.x.y, 1.1.0 Jan 23, 2021
@sbenthall
Contributor

@alanlujan91 might be interested in this

@sbenthall
Contributor

Related to #890

@mnwhite
Contributor Author

mnwhite commented Jul 3, 2024

This is a major part of the HARK 1 roadmap, so closing this as an active issue.

@mnwhite mnwhite closed this as completed Jul 3, 2024
@mnwhite mnwhite added Tag: 1.0 About the v1 release of HARK. and removed Tag: Discussion labels Jul 3, 2024