Replies: 14 comments

---

The fundamental point is that the only thing that is mathematically necessary for a problem to be represented as a Bellman problem is that there be a value function connecting the successive stages of the problem. That's it.

Suppose in the day I have three choices to make. In the evening, I decide what to watch on TV, which gives me pleasure according to some function f(watch TV or go to a movie | cash in my wallet, the day's news), where the news might make me want to watch TV even if I've got lots of money. That is, the value I get will depend on the day's news, which is stochastic (and not known before evening), and on a state variable, money. In the afternoon, I decide which restaurant to have lunch in, depending on what I see on online menus at lunchtime: g_{noon}(lunch | hunger_{lunchtime}, menus_{lunch}). In the morning, I decide whether to exercise by running around the lake outside if it's a nice day, or on a treadmill inside -- which will determine how hungry I am at lunch, which may determine which restaurant I eat at, which may determine how much cash I have when I decide what to do in the evening.

Viewed from the perspective of the night before, when I don't know what the weather will be next morning, there is a series of shocks arrayed in time. The problems are completely different at each stage, with different utility functions, control variables, state variables, transition equations, etc. But the problem is completely coherent and is susceptible to solution by backward induction on Bellman equations. The economist solving it would need to construct three unique solve_one_period solvers, with value functions:

- v_{eve} = maximize utility subject to money and news
- v_{noon} = g_{noon}(lunch | hunger_{lunchtime}, menus_{lunch}) + E[v_{eve}(money, news)], where E[] signifies that news is stochastic as of lunchtime
- v_{morn}(weather, money) = f_{morn}(exercise choice depending on weather) + E[v_{noon}]
- v_{previous day}(money) = E[v_{morn}(money)]

Maybe with this framework you can pose your question more concretely. I'm thinking of a "day" as being basically a stage of the problem, and of a requirement that the shocks all be realized at the beginning as meaning that, at the end of the previous day, the weather shock of the morning, the menu shock of noon, and the news shock of the evening should all be drawn.
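
For concreteness, here is a minimal sketch (plain Python, not HARK's actual API) of backward induction through those three heterogeneous stages. The payoff functions and the two-point shock distributions are made-up assumptions purely for illustration.

```python
import numpy as np

# Hypothetical two-point shock distributions: (values, probabilities)
NEWS = ([0.0, 1.0], [0.5, 0.5])      # evening news shock
MENUS = ([0.0, 1.0], [0.5, 0.5])     # lunchtime menu shock
WEATHER = ([0.0, 1.0], [0.3, 0.7])   # morning weather shock

def E(dist, f):
    """Expectation of f(shock) over a discrete shock distribution."""
    vals, probs = dist
    return sum(p * f(v) for v, p in zip(vals, probs))

def v_eve(money, news):
    """Evening stage: watch TV (free) or see a movie (costs 2), given cash and news."""
    tv = np.log(1 + money) + news                   # interesting news makes TV better
    movie = np.log(1 + max(money - 2.0, 0.0)) + 1.0
    return max(tv, movie)

def v_noon(money, hunger, menus):
    """Noon stage: cheap or fancy lunch; the news shock is still in the future."""
    choices = []
    for cost, tastiness in [(1.0, 1.0), (3.0, 2.0 + menus)]:
        if cost <= money:
            choices.append(hunger * tastiness + E(NEWS, lambda n: v_eve(money - cost, n)))
    if not choices:                                 # cannot afford lunch at all
        return E(NEWS, lambda n: v_eve(money, n))
    return max(choices)

def v_morn(money, weather):
    """Morning stage: run outside (free, nicer when sunny) or pay for the treadmill."""
    run = weather + E(MENUS, lambda m: v_noon(money, hunger=2.0, menus=m))
    gym = 0.5 + E(MENUS, lambda m: v_noon(money - 0.5, hunger=1.5, menus=m))
    return max(run, gym)

def v_prev_day(money):
    """Night before: weather, menus, and news are all still unrealized."""
    return E(WEATHER, lambda w: v_morn(money, w))

print(v_prev_day(5.0))
```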

---

I thought I was following along just fine until I read this:
In this example, I would say the most natural way to model it, given our terminology, is:
...because of the repetition of the problem each day. Do you see it differently?

---

I'm thinking of a day as a stage, because it might be something that is part of a week, or a month, or a year. But even in your interpretation, the key point I was making holds: the problems in the night-before, the morning, the noon, and the evening are basically completely distinct problems, with different states, controls, shocks, transition equations, etc.

I cannot see any technical reason to assume that, say, the "lunch-menu" shocks must be realized at the same time as the "morning weather" shocks. All of them, you have been saying, have to be realized, I guess, right after the night-before expectations are taken. That, in my conception, is what we are disagreeing about. If you can articulate why all the shocks need to be realized at the beginning of the stage (after the night-before), I'm open to understanding why. Otherwise, it seems more natural to me to allow shocks to be realized at any point during the successive [steps? moves? evolutions?] of the stage. I suspect that this is another case where we have been misunderstanding each other, rather than disagreeing.

---

Yes, I agree there is a misunderstanding here.
Yes! I agree with this. I have never disagreed with it. Though Matt has raised a good point about why, in practice, you might want to have them sampled together at the beginning of the period. What I originally said, which seemed to spark this confusion, was:
I can clarify. What I meant was: each exogenous shock variable occurs only once in a period. But as a corollary: because they each happen only once per period, and because they are exogenous, it doesn't matter much, when forward-simulating, "when" they are sampled (see the sketch at the end of this comment). I believe Dolo samples them all at once, prior to the rest of the simulation, for performance reasons. I see all this as quite well-established fact at this point. Where I still see confusion is around these two points:
Earlier we defined some terminology. I don't see how your usage accords with that terminology. In the model you've described, the three stages repeat daily. I believe (though I'm having a hard time understanding some of the description) there is a control variable in each stage.
There is a sense in which that's true. The derived value functions isolate the information from each stage. But I'm not sure what you're getting at. I think what you are saying is that you would prefer it if there was an architecture that modeled each stage as a separate problem. Is that it?
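
As a small illustration of the pre-sampling point above (because the shocks are exogenous, it makes little difference to the simulated distribution whether they are drawn all up front or step by step inside the loop), here is a sketch. The AR(1)-style law of motion and the seeds are illustrative assumptions, and this is not Dolo's actual code.

```python
import numpy as np

T, N = 4, 100_000
rng_a, rng_b = np.random.default_rng(0), np.random.default_rng(1)

# Strategy A: draw every period's shock up front, then run the simulation
shocks = rng_a.normal(size=(T, N))
x = np.zeros(N)
for t in range(T):
    x = 0.9 * x + shocks[t]

# Strategy B: draw each period's shock just before it is used
y = np.zeros(N)
for t in range(T):
    y = 0.9 * y + rng_b.normal(size=N)

# The two simulated cross-sections have the same distribution; only the
# bookkeeping (and, potentially, the performance) differs.
print("A:", round(x.mean(), 3), round(x.std(), 3))
print("B:", round(y.mean(), 3), round(y.std(), 3))
```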

---

Yes, that's exactly the point. And that is exactly what we say in the link you provided.

Let me try another example, to see if we can converge. Let's modify the example we have now of a person whose life is different in different seasons of the year. We agree that the different seasons are different "periods." But there's no reason that the problem within each of those periods needs to be the same. In winter, for example, the person might have the option of being a ski bum OR a snowmobile tourguide, and might have the opportunity to "invest" in the value of their mountain cabin by building an extra room. In the summer, they are on the river, but they can work in the restaurant or as a river guide, or they can take a scuba vacation. In other words, their Bellman problems are arbitrarily different between periods.

Or maybe a different example will help. We go back to the original seasonal problem, but add to it only one twist: now the person is not just making a consumption decision but also a portfolio choice decision (between risky and safe assets) in every quarter. There is no inherent timing structure to the consumption and portfolio choice decisions; they are, in principle, simultaneous. But it turns out that for efficient solution of that problem, it is useful to break it into two "stages":

1. Given assets-after-consumption a_{t}, what proportion do I optimally invest in risky vs safe assets?

   PortShareOptimal_{t}(a_{t}) = the share that maximizes expected value from next quarter, \beta E_{t}[v_{beginning-of-period, t+1}(m_{t+1})]

   The share is the sole control variable here; there's no "joint" problem. Solution to this yields a "partway-through-the-period" value function, v_{t, stage: after consumption decision}(a_{t}).

2. From this they can solve the single-control-variable problem

   v_{t} = maximize u(c) + v_{t, stage: after consumption decision}(a_{t} = m_{t} - c_{t})

I think you are on board with all of this. The period is a quarter. Within each quarter there are stages. The stages may have a natural ordering, but we don't want to think of them as being separated in calendar/chronological/real time for the purposes of the economics we are studying.

All of this seems to me consistent with the terminology above, but maybe we need to tweak or expand upon that terminology to explicitly address whatever it is that seems to have made the two of us think we had different understandings of the terminology.
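
A minimal sketch of that two-stage decomposition, assuming CRRA utility, a two-point risky return, coarse grids, and a stand-in next-period value function (hypothetical names, not HARK's actual portfolio solver):

```python
import numpy as np

rho, beta = 2.0, 0.96                       # CRRA coefficient and discount factor
Rfree = 1.02                                # safe gross return
risky_draws = np.array([0.95, 1.15])        # equiprobable risky gross returns
a_grid = np.linspace(0.1, 10.0, 50)         # end-of-period ("after consumption") assets
share_grid = np.linspace(0.0, 1.0, 21)      # candidate risky shares

def u(c):
    return c ** (1 - rho) / (1 - rho)

def v_next(m):
    # Stand-in for next quarter's beginning-of-period value function
    return u(m) / (1 - beta)

# Second "stage"/"obstacle": given a_t, choose the portfolio share
def solve_portfolio_stage(a):
    best = -np.inf
    for s in share_grid:
        R_port = Rfree + s * (risky_draws - Rfree)   # portfolio return in each state
        Ev = np.mean(v_next(a * R_port + 1.0))       # +1.0 is an assumed labor income
        best = max(best, beta * Ev)
    return best

v_after_consumption = np.array([solve_portfolio_stage(a) for a in a_grid])

# First "stage"/"obstacle": given m_t, choose consumption, knowing the share
# will be chosen optimally afterwards
def solve_consumption_stage(m):
    c_grid = np.linspace(1e-3, m - a_grid[0], 40)
    vals = u(c_grid) + np.interp(m - c_grid, a_grid, v_after_consumption)
    return c_grid[np.argmax(vals)], np.max(vals)

c_star, v_of_m = solve_consumption_stage(5.0)
print(f"optimal consumption at m = 5: {c_star:.3f}")
```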

---

I'm not sure that in this case, it would be right to consider each season to be a different period.

You are making a point that it is possible to imagine chains of problems for which the problems bear little resemblance to each other. I'm trying to point out that very often, there is a repetitive structure to the problems, and that this repetitive structure is what's being captured by the primitive t in the model. So, whereas in the original 'cyclic' seasonal problem the only thing that varies each season is the income shock, and so it makes sense to model that as a cyclic exogenous process that changes from period to period, in this new example I would be more inclined to model the problem with a yearly period and four seasonal stages.

This may be a subtle point, but what I'm sensitive to is the complexity of configuring a model. The more repetition in the model is exploited, the more elegantly the model can be expressed. Suppose we were to model the original seasonal model, where the variations are only to income shock configurations, as four separate Bellman problems, each fully specified with their equations etc. There would be a lot of redundancy within that representation.

The other question I have about what you're getting at -- modeling a problem as a sequence of fully specified Bellman problems -- is how you imagine building in the necessary links joining the problems. I.e., when the state in one problem depends on the state in another.

---

OK, I think it is now clear what the heart of our confusion was.
Earlier, I had proposed that we stop using the word "period" because it
seemed to be confusing matters.
The reason is that one can think of "period" as "the period in which Ronald
Reagan was elected President" or "the period of the year in which flowers
bloom."
You had proposed to associate "t" with "period." By which you meant "when
flowers bloom." I accepted it because I interpreted it as "when Ronald
Reagan was elected."
In both the economics and the mathematics literature, t is associated with
Ronald Reagan, not flowers. Hence our confusion.
You are right that it is extremely useful to consider models in which
patterns are repeated over and over again.
But the convention is that the way to represent such patterns is to remove the
t entirely.
So for example the limiting solution to the infinite horizon buffer stock
consumption problem is

v(m) = max_{c} u(c) + \beta E[v(m')]

m' = (m - c)R + y'

where the "primes" represent the fact that the period in question is the
'other' period, and the expectation E[] is taken over next period's income
y'. There's no need for the pluripotent variable t because there's only one
interesting alternative to today, which is "the future period."
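
For concreteness, a minimal sketch of value function iteration on that stationary problem, assuming log utility, a two-point income shock, and illustrative (uncalibrated) parameters:

```python
import numpy as np

beta, R = 0.96, 1.02
y_vals, y_probs = np.array([0.7, 1.3]), np.array([0.5, 0.5])  # income shock y'
m_grid = np.linspace(0.1, 20.0, 200)
v = np.zeros_like(m_grid)                        # initial guess for v(m)

def u(c):
    return np.log(c)

for it in range(500):
    v_new = np.empty_like(v)
    for i, m in enumerate(m_grid):
        c_grid = np.linspace(1e-3, m, 100)           # consume out of resources m
        m_next = (m - c_grid[:, None]) * R + y_vals  # next m for each (c, y') pair
        Ev = np.interp(m_next, m_grid, v) @ y_probs  # E[v(m')] by interpolation
        v_new[i] = np.max(u(c_grid) + beta * Ev)
    if np.max(np.abs(v_new - v)) < 1e-6:
        break
    v = v_new

print(f"converged after {it} iterations")
```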
So, let me repeat my proposal. At least for now, let's ban the word
period. Let's also ban the word "cycles" because that is so confusingly
used in the current code.
I propose three definitions.
An "age": Dickens says "it was an age of incredulity" referring to the
late 18th century. This is an interval of time (which might have
subdivisions, like years, which are all identical to each other, at least
up to minor things like mortality rates).
A "progression" has "steps" that are arranged in time. So, a year is a
"progression" from winter to spring to summer to fall. The "steps" are
allowed to differ from each other (the ski bum in winter has different
options than the raft guide in summer).
A "course" has "obstacles" that are (computationally) overcome
sequentially, but which are not necessarily arranged in time. So, the
(conceptually simultaneous) joint consumption/portfolio-choice problem can
be dissected into a "course" with two "obstacles." The second "obstacle"
is, for a given amount of end-of-period assets a_{t}, to determine the
share to invest in safe vs risky assets. The first "obstacle" is, knowing
that you will make an optimal choice wrt the second obstacle, how much to
consume out of your market resources m_{t}.
But, the key point is that there is no simplification possible to any of
these beyond representing them as a Bellman problem. The elemental
requirement is that we need to be able to allow arbitrary variation in
states, controls, transition equations, optimality conditions, etc. between
"ages", between "steps" of progressions, and between "obstacles" in
courses. Period. No compromise. Because any compromise would rule out
hugely important classes of problems.
Being able to handle all of this is, basically, the central, core,
foundational, fundamental mission of HARK.
And it is something that can be accomplished in the existing HARK toolkit.
No revision of the toolkit can abandon that capacity.
But it is now accomplished in a highly unsatisfactory way: Basically, the
user must handcraft the appropriate sequence of "solve_one_period" solvers
all the way back to forever (or, until the beginning of the lifetime or
until the problem "converges" in some sense).
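
A minimal sketch of what that handcrafting amounts to, with hypothetical names rather than HARK's actual classes: a list of per-period one-period solvers, applied by backward induction from a terminal value function.

```python
import numpy as np

def solve_terminal():
    """Terminal period: consume everything, so v_T(m) = u(m)."""
    return lambda m: np.log(np.maximum(m, 1e-9))

def make_solver(beta, R, y_vals, y_probs):
    """Build a one-period solver for (possibly period-specific) parameters."""
    def solve_one_period(v_next):
        m_grid = np.linspace(0.1, 20.0, 200)
        v_now = np.empty_like(m_grid)
        for i, m in enumerate(m_grid):
            c = np.linspace(1e-3, m, 100)
            Ev = np.array([y_probs @ v_next((m - ci) * R + y_vals) for ci in c])
            v_now[i] = np.max(np.log(c) + beta * Ev)
        return lambda m: np.interp(m, m_grid, v_now)   # interpolated v_t(m)
    return solve_one_period

# A hypothetical three-period lifecycle; each period could, in principle, have
# entirely different states, controls, shocks, and solution methods.
solvers = [
    make_solver(0.96, 1.02, np.array([0.7, 1.3]), np.array([0.5, 0.5])),  # young
    make_solver(0.96, 1.02, np.array([0.9, 1.1]), np.array([0.5, 0.5])),  # middle-aged
    make_solver(0.98, 1.00, np.array([1.0]), np.array([1.0])),            # retired
]

v = solve_terminal()
for solver in reversed(solvers):   # backward induction, period by period
    v = solver(v)

print(v(5.0))   # value of entering the first period with m = 5
```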
Honestly, I've long felt that addressing all of this satisfactorily is more
of a HARK 2.0 project than a plausible thing to be accomplished as part of
HARK 1.0. So, my sense is that it is something that should be
incorporated in the scope of any proposals we write for a grant for future
funding.
- Chris Carroll

---

Hmmm. OK. Here are my takeaways from this discussion:

- This is primarily a HARK 2.0 discussion.
- If the main goal of HARK is to be an environment for writing one-period solvers, we should focus more on tooling that makes this easier by, e.g., providing more mathematical capabilities. (For example, functionalizing taking expectations, so it does not have to be recreated by hand for every solution method, is useful.)
- I'm wondering how you feel about Dolo's reduction of solution concepts to the minimal information necessary (e.g., a reward function, or an arbitrage function), and how it sits within this scheme.
- I understand that it is an absolute necessity for HARK 2.0 to have the ability to represent any arbitrary variation in Bellman problems across ages and progressions.
- What improvements do you see potentially to the "unsatisfactory" way HARK 1.0 accomplishes this?

I would caution against the use of "age" in this new terminology, since that has the connotations of both an objective interval of time (the 18th century) and the age of an individual person. I'd much prefer to reserve "age" for the latter. Since HARK is also designed to support heterogeneous agent modeling (HA), and agents may be heterogeneous by age, that seems like an important point not to miss.

I wonder how you see this schema for arbitrary variation in the single-agent problem connecting to problems with market equilibria.

For reasons grounded in algorithmic information theory, it won't be possible to represent an infinite horizon problem with arbitrary variation in each time interval in a finite amount of software code. So I have to assume that you are restricting this discussion to finite horizon problems? In that case, what you are describing to me sounds much more like a general framework for decision theory than a framework for what Stachurski calls stochastic dynamic programming. Does that sound right?

---

On Thu, Apr 8, 2021 at 7:56 AM Sebastian Benthall wrote:
> Hmmm. OK. Here are my takeaways from this discussion:
>
> - This is primarily a HARK 2.0 discussion
Yes.
But a discussion worth having had at this stage:
1. Because it will help write a better proposal for HARK 2.0 funding
2. Because it may have some implications for how to accomplish the
remaining HARK 1.0 work
> - If the main goal of HARK is to be an environment for writing one-period solvers, we should focus more on tooling that makes this easier by, e.g., providing more mathematical capabilities. (For example, functionalizing taking expectations, so it does not have to be recreated by hand for every solution method, is useful.)
Absolutely. The pervasive use of matrix multiplication instead of
expectation-taking even when expectation-taking is an option is an easily
fixable thing I’ve long wanted to fix. This is something we can even
accomplish as we move toward HARK 1.0.
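
A minimal sketch of what "functionalizing" expectation-taking could look like: one generic helper that integrates any function over a discretized shock distribution, instead of a hand-rolled matrix multiplication in every solver. This is an illustration, not HARK's actual implementation, and the marginal value function in the usage line is made up.

```python
import numpy as np

def expected(func, shock_vals, shock_probs, *args):
    """E[func(shock, *args)] over a discrete approximation to the shock."""
    return sum(p * func(z, *args) for z, p in zip(shock_vals, shock_probs))

# Usage: expected marginal value of end-of-period assets a under an income shock y'
R, a = 1.02, 3.0
v_prime = lambda m: m ** (-2.0)   # made-up marginal value function
Ev = expected(lambda y, a: v_prime(R * a + y), [0.7, 1.3], [0.5, 0.5], a)
print(Ev)
```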
> - I'm wondering how you feel about Dolo's reduction of solution concepts to the minimal information necessary (e.g., a reward function, or an arbitrage function), and how it sits within this scheme
It’s perfect, as the way to specify an individual Bellman problem.
The key way in which it does not fit our needs is precisely its (original)
inability to string together sequences/rounds/epochs. That’s why our main
discussions with Pablo have been on exactly that topic: How to add such a
capability to Dolo.
> - I understand that it is an absolute necessity for HARK 2.0 to have the ability to represent any arbitrary variation in Bellman problems across ages and progressions.
> - What improvements do you see potentially to the "unsatisfactory" way HARK 1.0 accomplishes this?
One bit of re-engineering that seems like it might be plausibly doable is
to add one layer of recursion to the whole setup.
Right now, basically, the HARK user has to choose:
1. You can do what is now called a “T_cycle” model (finite horizon
model), which is the case where in principle you can have a string of
arbitrarily different solve_one_period solvers going back as many
periods as you like. This is the basis for life cycle modeling
2. You can do what is now called a cycles model;
- This is required to be an infinite horizon model
- Within each “cycle” there are “steps” like the seasons of the year,
in which the problems can differ
- But each “cycle” as a whole is identical to all the other cycles.
Again to avoid confusion, I will substitute my word “progression” for the
“cycles” and restrict it to mean what I defined it to mean in the earlier
post: a series of steps ordered in time (like seasons).
You write that “age” *has the connotations of both an objective interval of
time (the 18th century) and the age of an individual person*.
That’s right, but as things stand that’s a feature not a bug. In HARK at
present, and in fact in almost all modeling of this kind, there is no
attempt to connect the time concept to specific moments of calendar time in
the real world. There was a fairly extensive discussion of whether at some
point we might like to allow for such a thing in HARK, about a year ago I
think, but the discussion was inconclusive.
We CAN allow the AGGREGATE economy to go through a “life cycle” of steps in
which its structure progresses from the “age of stone” to the “age of iron”
to the “machine age” to the “space age” to the “computer age” or whatever.
But only by using the time term I’m calling age and that we normally use
for age of an individual. Everyone in the economy would be living in the
same “age” of the aggregate economy, but they couldn’t have their own
individual specific ages because the age variable was already being used.
Returning to the main thrust (is there a reasonably low-cost way to improve
HARK 1.0 to make a step in our desired direction?), I think there may be: We
should see how hard it would be to allow the embedding of a “progression”
or a “course” within a single “age.” That would allow us, for example, to
have life cycle consumers whose lives are divided into quarters that have a
seasonal “progression”. Or it would allow our life cycle consumers to
annually solve a portfolio choice problem that is instantiated as a
“course”.
Sometimes the easiest way to accomplish this kind of thing programmatically
is to allow for an arbitrary degree of recursion. That is, if we are able
to enable embedding in a generic way, we could allow embedding the
portfolio choice “course” inside of each season of the seasonal
“progression.”
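
A minimal sketch of that embedding/recursion idea: hypothetical container types in which an "age" can hold a "progression" (time-ordered steps) or a "course" (computationally ordered obstacles), which can in turn hold further progressions or courses, all solved by one recursive backward-induction pass. This illustrates the structure only; it is not a proposal for HARK's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union

# A leaf is a solver: it takes a continuation value function and returns the
# value function at the start of that step/obstacle.
Stage = Union["Progression", "Course", Callable]

@dataclass
class Progression:          # steps ordered in time (e.g., seasons of a year)
    steps: List[Stage] = field(default_factory=list)

@dataclass
class Course:               # obstacles overcome sequentially, not necessarily in time
    obstacles: List[Stage] = field(default_factory=list)

def solve(stage, v_next):
    """Backward induction through an arbitrarily nested structure."""
    if isinstance(stage, (Progression, Course)):
        parts = stage.steps if isinstance(stage, Progression) else stage.obstacles
        for part in reversed(parts):     # the last step/obstacle is solved first
            v_next = solve(part, v_next)
        return v_next
    return stage(v_next)                 # leaf solver

# E.g., a year in which every season embeds a consumption/portfolio "course";
# the leaf solvers here are trivial placeholders.
solve_consumption = lambda v: (lambda m: v(m))
solve_portfolio = lambda v: (lambda a: v(a))
season = Course(obstacles=[solve_consumption, solve_portfolio])
year = Progression(steps=[season, season, season, season])

v_year = solve(year, lambda m: 0.0)      # terminal value of zero
print(v_year(1.0))
```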
- Chris Carroll

---

I'm having trouble following this because of the formatting errors.

---

OK, let's discuss in the upcoming meeting.

---

@llorracc I've been giving this some more thought. I'm trying to reconcile two conflicting ideas:
My understanding is that a Bellman equation is, classically, (a) a value function over states in a dynamic programming problem, defined in terms of a maximization over (b) a current payoff for taking an action, summed with (c) the discounted, (d) recursively referenced value function over (e) the state resulting from the transition of the current state and the chosen action. Classically, the (c) discounting and (d) recursive definition of the value function depend on the periodicity (repeating-ness) of the problem. Bellman equations are most useful in the infinite horizon context because they have some nice convergence properties.

Based on what you've written here and said elsewhere, it sounds to me like what you want HARK to be doing does not look like a classic Bellman equation solution in a number of significant ways. First, infinite horizon problems seem to be of secondary importance: most of the problems you're interested in seem to be ones with finite lifespans and well-defined terminal solutions. Second, you seem to want to break the condition that the value functions are recursively defined. In the general case of a "course", the expectation is that the value function for one obstacle will be defined in terms of a different value function (the value function corresponding to the next obstacle). Third, while there may be cases where these value functions are mutually recursive, there's no requirement that they will be.

Which leads me to wonder why you continue to call the value functions "Bellman equations". For example, it's not clear to me what the semantics of the discount factor are in such a flexible modeling framework. Classically, the discount factor is easy to explain: it's a multiplicative discount on the utility that applies once per period. But without periods, and with only 'progressions' and 'courses', the discount rate seems rather unprincipled. Should I apply it only every time I consume, or also when I choose my portfolio allocation? Do I discount every year, or every season? Etc.

I would find it less confusing if we adopted a term that was more technically appropriate to the task at hand. I'd be perfectly happy to provisionally call what you are implying "Carroll Equations", and then try to define those formally. I'd be much more comfortable with that than stretching the definition of "Bellman equation" beyond recognition.

At the opposite extreme from the Bellman equation, we could imagine, for any finite problem, an "extensive form decision problem", analogous to an extensive form game but with only one player. It would have a series of steps/obstacles/decision variables and a number of "chance" nodes (shocks), would have payoffs, and would be tractable with backwards induction. It sounds like this is really what you are after, since that would be able to represent any finite problem at all. Of course, it would be tedious to spell out the details of every step if there was a lot of redundancy between them. But that would motivate a library for concisely defining extensive form decision problems, which would probably look quite unlike what HARK is now (which takes as its default a Bellman-form problem with periodicity and a perhaps infinite horizon).
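
To make the "extensive form decision problem" idea concrete, here is a minimal sketch: a one-player tree of decision nodes, chance nodes (shocks), and terminal payoffs, solved by backward induction. The node classes and the toy tree are illustrative assumptions, not an existing HARK or game-theory API.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

Node = Union["Decision", "Chance", "Terminal"]

@dataclass
class Terminal:
    payoff: float

@dataclass
class Decision:
    children: Dict[str, Node]            # action label -> successor node

@dataclass
class Chance:
    branches: List[Tuple[float, Node]]   # (probability, successor node)

def solve(node):
    """Backward induction: value of a node in the one-player tree."""
    if isinstance(node, Terminal):
        return node.payoff
    if isinstance(node, Chance):
        return sum(p * solve(child) for p, child in node.branches)
    return max(solve(child) for child in node.children.values())   # best action

# Toy tree: exercise choice -> weather shock -> lunch choice, with made-up payoffs
tree = Decision({
    "run outside": Chance([
        (0.7, Decision({"cheap lunch": Terminal(3.0), "fancy lunch": Terminal(4.0)})),
        (0.3, Terminal(1.0)),            # it rained
    ]),
    "treadmill": Decision({"cheap lunch": Terminal(2.5), "fancy lunch": Terminal(3.5)}),
})

print(solve(tree))   # value of the whole problem by backward induction
```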

---

ping @alanlujan91 since he mentioned in a meeting an interest in something like this

---

Moving this to Discussion; this is largely resolved with the new discourse.

---

Continuing from here
The example that's come up for this is:
@llorracc it's clear that you think this is an important case to architect around.
But I'm afraid that after our brief discussion of it earlier I'm still not following the logic of this.
A few questions about it: