Solution Methods for Microeconomic Dynamic Stochastic Optimization Problems

SolvingMicroDSOPs, April 7, 2022

Solution Methods for Microeconomic Dynamic Stochastic Optimization Problems

April 7, 2022

Christopher D. Carroll¹

Note: The code associated with this document should work (though the Matlab code may be out of date), but has been superceded by the set of tools available in the Econ-ARK toolkit, more speciﬁcally the HARK Framework. The SMM estination code at the end has speciﬁcally been superceded by the SolvingMicroDSOPs REMARK

_____________________________________________________________________________________

Abstract
These notes describe tools for solving microeconomic dynamic stochastic optimization problems, and show how to use those tools for eﬃciently estimating a standard life cycle consumption/saving model using microeconomic data. No attempt is made at a systematic overview of the many possible technical choices; instead, I present a speciﬁc set of methods that have proven useful in my own work (and explain why other popular methods, such as value function iteration, are a bad idea). Paired with these notes is Mathematica, Matlab, and Python software that solves the problems described in the text.

Keywords: Dynamic Stochastic Optimization, Method of Simulated Moments, Structural Estimation
JEL codes: E21, F41

PDF:

https://github.com/llorracc/SolvingMicroDSOPs/blob/master/SolvingMicroDSOPs.pdf

Slides:

https://github.com/llorracc/SolvingMicroDSOPs/blob/master/SolvingMicroDSOPs-Slides.pdf

Web:

https://llorracc.github.io/SolvingMicroDSOPs

Code:

https://github.com/llorracc/SolvingMicroDSOPs/tree/master/Code

Archive:

https://github.com/llorracc/SolvingMicroDSOPs

(Contains LaTeX code for this document and software producing ﬁgures and results)

¹Carroll: Department of Economics, Johns Hopkins University, Baltimore, MD, http://www.econ2.jhu.edu/people/ccarroll/, ccarroll@jhu.edu, Phone: (410) 516-7602 The notes were originally written for my Advanced Topics in Macroeconomic Theory class at Johns Hopkins University; instructors elsewhere are welcome to use them for teaching purposes. Relative to earlier drafts, this version incorporates several improvements related to new results in the paper “Theoretical Foundations of Buﬀer Stock Saving” (especially tools for approximating the consumption and value functions). Like the last major draft, it also builds on material in “The Method of Endogenous Gridpoints for Solving Dynamic Stochastic Optimization Problems” published in Economics Letters, available at http://www.econ2.jhu.edu/people/ccarroll/EndogenousArchive.zip, and by including sample code for a method of simulated moments estimation of the life cycle model a la Gourinchas and Parker (2002) and Cagetti (2003). Background derivations, notation, and related subjects are treated in my class notes for ﬁrst year macro, available at http://www.econ2.jhu.edu/people/ccarroll/public/lecturenotes/consumption. I am grateful to several generations of graduate students in helping me to reﬁne these notes, to Marc Chan for help in updating the text and software to be consistent with Carroll (2006), to Kiichi Tokuoka for drafting the section on structural estimation, to Damiano Sandri for exceptionally insightful help in revising and updating the method of simulated moments estimation section, and to Weifeng Wu and Metin Uyanik for revising to be consistent with the ‘method of moderation’ and other improvements. All errors are my own.

1 Introduction

Calculating the mathematically optimal amount to save is a remarkably diﬃcult problem. Under well-founded assumptions about the nature of risk (and attitudes toward risk), the problem cannot be solved analytically; computational solutions are the only option. To avoid having to solve this hard problem, past generations of economists showed impressive ingenuity in reformulating the question. Budding graduate students are still taught a host of tricks whose purpose is partly to avoid the resort to numerical solutions: Quadratic or Constant Absolute Risk Aversion utility, perfect markets, perfect insurance, perfect foresight, the “timeless” perspective, the restriction of uncertainty to very special kinds,¹ and more.

The motivation is mainly to exchange an intractable general problem for a tractable speciﬁc alternative. Unfortunately, the burgeoning literature on numerical solutions has shown that the features that yield tractability also profoundly change the solution. These tricks are excuses to solve a problem that has deﬁned away the central diﬃculty: Understanding the proper role of uncertainty (and other complexities like constraints) in optimal intertemporal choice.

The temptation to use such tricks (and the tolerance for them in leading academic journals) is palpably lessening, thanks to advances in mathematical analysis, increasing computing power, and the growing capabilities of numerical computation software. Together, such tools permit today’s laptop computers to solve problems that required supercomputers a decade ago (and, before that, could not be solved at all).

These points are not unique to the consumption/saving problem; the same propositions apply to almost any question that involves both intertemporal choice and uncertainty, including many aspects of the behavior of ﬁrms and governments.

Given the ubiquity of such problems, one might expect that the use of numerical methods for solving dynamic optimization problems would by now be nearly as common as the use of econometric methods in empirical work.

Of course, we remain far from that equilibrium. The most plausible explanation for the gap is that barriers to the use of numerical methods have remained forbiddingly high.

These lecture notes provide a gentle introduction to a particular set of solution tools and show how they can be used to solve some canonical problems in consumption choice and portfolio allocation. Speciﬁcally, the notes describe and solve optimization problems for a consumer facing uninsurable idiosyncratic risk to nonﬁnancial income (e.g., labor or transfer income),² with detailed intuitive discussion of the various mathematical and computational techniques that, together, speed the solution by many orders of magnitude compared to “brute force” methods. The problem is solved with and without liquidity constraints, and the inﬁnite horizon solution is obtained as the limit of the ﬁnite horizon solution. After the basic consumption/saving problem with a deterministic interest rate is described and solved, an extension with portfolio choice between a riskless and a risky asset is also solved. Finally, a simple example is presented of how to use these methods (via the statistical ‘method of simulated moments’ or MSM; sometimes called ‘simulated method of moments’ or SMM) to estimate structural parameters like the coeﬃcient of relative risk aversion (a la Gourinchas and Parker (2002) and Cagetti (2003)).

2 The Problem

We are interested in the behavior a consumer whose goal in period $t$ is to maximize expected discounted utility from consumption over the remainder of a lifetime that ends in period $T$ :

[ ] T∑− t max 𝔼t βn𝜃𝜃𝜃u(ct+n) , n𝜃𝜃𝜃=0

(1)

and whose circumstances evolve according to the transition equations³

at = mt − ct bt+1 = atRt+1 y = p 𝜃𝜃𝜃 t+1 t+1 t+1 mt+1 = bt+1 + yt+1

(2)

where the variables are

β − pure time discount factor at− assets after all actions have been accomplished in period t bt+1− ‘bank balances’ (nonhuman wealth ) at the beginning of t + 1 ct− consumption in period t mt − ‘market resources’ available for consumption (‘cash-on-hand ’) pt+1− ‘permanent labor income ’ in period t + 1 Rt+1− interest factor (1 + rt+1) from period t to t + 1 y − noncapital income in period t + 1. t+1

For now, we will assume that the exogenous variables evolve as follows:

Rt =R ∀ t - constant interest factor = 1 + r pt+1 = ΦΦΦt+1pt - permanent labor income dynamics 2 2 log 𝜃𝜃𝜃t+n ∼ 𝒩 (− σ𝜃𝜃𝜃∕2,σ𝜃𝜃𝜃) - lognormal transitory shocks ∀ n > 0.

Using the fact about lognormally distributed variables ELogNorm⁴ that if $log φφφ ∼ 𝒩 (φ, σ2) φ$ then $log 𝔼 [φφφ ] = φ + σ2 ∕2 φ$ , assumption the assumption about the distribution of shocks guarantees that $log 𝔼[𝜃𝜃𝜃] = 0$ which means that $𝔼 [𝜃𝜃𝜃]$ =1 (the mean value of the transitory shock is 1).

Equation (3) indicates that we are allowing for a predictable average proﬁle of income growth over the lifetime ${ΦΦΦ }T 0$ (allowing, for example, for typical career wage paths).⁵

Finally, the utility function is of the Constant Relative Risk Aversion (CRRA), form, $u(∙ ) = ∙1− ρ∕(1 − ρ )$ .

As is well known, this problem can be rewritten in recursive (Bellman equation) form

vvvt(mt, pt) = max u (ct) + 𝔼t[βvvvt+1(mt+1, pt+1)] ct

(3)

subject to the Dynamic Budget Constraint (DBC) (2) given above, where $vvvt$ measures total expected discounted utility from behaving optimally now and henceforth.

3 Normalization

The single most powerful method for speeding the solution of such models is to redeﬁne the problem in a way that reduces the number of state variables (if possible). In the consumption problem here, the obvious idea is to see whether the problem can be rewritten in terms of the ratio of various variables to permanent noncapital (‘labor’) income $pt$ (henceforth for brevity referred to simply as ‘permanent income.’)

In the last period of life, there is no future, $vvvT +1 = 0$ , so the optimal plan is to consume everything, implying that

m1 −ρ vvvT (mT ,pT ) = --T--. 1 − ρ

(4)

Now deﬁne nonbold variables as the bold variable divided by the level of permanent income in the same period, so that, for example, $mT = mT ∕pT$ ; and deﬁne $vT (mT ) = u(mT )$ .⁶ For our CRRA utility function, $u(xy) = x1− ρu (y)$ , so equation (4) can be rewritten as

1−ρ 1−ρ vvv (m ,p ) = p1−ρm-T---= p1−ρΦΦΦ1 −ρm-T---= p1−ρΦΦΦ1 −ρv (m ). T T T T 1 − ρ T− 1 T 1 − ρ T−1 T T T

Now deﬁne a new optimization problem:

vt(mt ) = max u(ct) + 𝔼t[βΦΦΦ1 −ρvt+1(mt+1)] ct t+1 s.t. at = mt − ct mt+1 = (R ∕ΦΦΦt+1) at + 𝜃𝜃𝜃t+1 ◟--◝◜--◞ ≡ ℛt+1

The accumulation equation is the normalized version of the transition equation for $m t+1$ .⁷ Then it is easy to see that for $t = T − 1$ ,

vvv (m ,p ) = p1−ρ v (m ) T −1 T−1 T−1 T− 1 T−1 T−1

and so on back to all earlier periods. Hence, if we solve the problem (5) which has only a single state variable $mt$ , we can obtain the levels of the value function, consumption, and all other variables of interest simply by multiplying the results by the appropriate function of $p t$ , e.g. $ccct(mt, pt) = ptct(mt ∕pt)$ or $1−ρ vvvt(mt, pt) = pt vt(mt)$ . We have thus reduced the problem from two continuous state variables to one (and thereby enormously simpliﬁed its solution).

For some problems it will not be obvious that there is an appropriate ‘normalizing’ variable, but many problems can be normalized if suﬃcient thought is given. For example, Valencia (2006) shows how a bank’s optimization problem can be normalized by the level of the bank’s productivity.

4 The Usual Theory, and A Bit More Notation

The ﬁrst order condition for (5) with respect to $ct$ is

u′(c ) = 𝔼 [β ℛ ΦΦΦ1 −ρv′ (m )] t t t+1 t+1 t+1 t+1 = 𝔼t [βR ΦΦΦt−+ρ1v ′t+1(mt+1 )]

and because the Envelope theorem tells us that

v ′t(mt ) = 𝔼t[βRΦΦΦ −tρ+1v′t+1(mt+1)]

(5)

we can substitute the LHS of (5) for the RHS of (5) to get

u ′(ct) = v′t(mt )

(6)

and rolling this equation forward one period yields

u′(ct+1) = vt′+1(atℛt+1 + 𝜃𝜃𝜃t+1)

(7)

while substituting the LHS in equation (5) gives us the Euler equation for consumption

′ −ρ ′ u (ct) = 𝔼t [βR ΦΦΦ t+1u (ct+1)].

(8)

Now note that in equation (7) neither $mt$ nor $ct$ has any direct eﬀect on $vt+1$ - only the diﬀerence between them (i.e. unconsumed market resources or ‘assets’ $a t$ ) matters. It is therefore possible (and will turn out to be convenient) to deﬁne a function⁸

𝔳t(at) = 𝔼t[βΦΦΦ1−t+1ρvt+1(ℛt+1at + 𝜃𝜃𝜃t+1)]

(9)

that returns the expected $t + 1$ value associated with ending period $t$ with any given amount of assets. This deﬁnition implies that

𝔳′(a ) = 𝔼 [βR ΦΦΦ− ρv′ (ℛ a + 𝜃𝜃𝜃 )] t t t t+1 t+1 t+1 t t+1

(10)

or, substituting from equation (7),

′ [ −ρ ′ ] 𝔳t(at) = 𝔼t βRΦΦΦt+1u (ct+1(ℛt+1at + 𝜃𝜃𝜃t+1)) .

(11)

Finally, note for future use that the ﬁrst order condition (5) can now be rewritten as

u′(c) = 𝔳′(m − c ). t t t t

(12)

5 Solving the Next-to-Last Period

The problem in the second-to-last period of life is:

v (m ) = max u (c ) + β 𝔼 [ΦΦΦ1 −ρv ((m − c )ℛ + 𝜃𝜃𝜃 )], T−1 T−1 cT−1 T −1 T−1 T T T−1 T− 1 T T

and using (1) the fact that $vT = u(c)$ ; (2) the deﬁnition of $u(c)$ ; (3) the deﬁnition of the expectations operator, and (4) the fact that $ΦΦΦT$ is nonstochastic, this becomes

∫ c1T−−ρ1 1− ρ ∞ ((mT − 1 − cT− 1)ℛT + 𝜃𝜃𝜃 )1− ρ vT−1(mT −1) = mcax 1-−-ρ-+ βΦΦΦ T ----------1-−-ρ-----------dℱ (𝜃𝜃𝜃) T−1 0

where $ℱ$ is the cumulative distribution function for $𝜃𝜃𝜃$ .

In principle, the maximization implicitly deﬁnes a function $cT−1(mT −1)$ that yields optimal consumption in period $T − 1$ for any given level of resources $mT − 1$ . Unfortunately, however, there is no general analytical solution to this maximization problem, and so for any given $mT −1$ we must use numerical computational tools to ﬁnd the $cT−1$ that maximizes the expression. This is excruciatingly slow because for every potential $cT−1$ to be considered, the integral must be calculated numerically, and numerical integration is very slow.

5.1 Discretizing the Distribution

Our ﬁrst time-saving step is therefore to construct a discrete approximation to the lognormal distribution that can be used in place of numerical integration. We calculate an $n$ -point approximation as follows.

Deﬁne a set of points from $♯0$ to $♯n 𝜃𝜃𝜃$ on the $[0,1]$ interval as the elements of the set $♯ = {0,1 ∕n,2∕n, ...,1}$ .⁹ Call the inverse of the $𝜃𝜃𝜃$ distribution $−1 ℱ$ , and deﬁne the points $−1 −1 ♯i = ℱ (♯i)$ . Then the conditional mean of $𝜃𝜃𝜃$ in each of the intervals numbered 1 to $n$ is:

∫ ♯−1 𝜃𝜃𝜃 ≡ 𝔼 [𝜃𝜃𝜃|♯−1 ≤ 𝜃𝜃𝜃 < ♯− 1] = i 𝜗 dℱ (𝜗 ). i i− 1 i ♯−1 i−1

(13)

The method is illustrated in Figure 1. The solid continuous curve represents the “true” CDF $ℱ (𝜃𝜃𝜃)$ for a lognormal distribution such that $𝔼[𝜃𝜃𝜃] = 1$ , $σ𝜃𝜃𝜃 = 0.1$ . The short vertical line segments represent the $n𝜃𝜃𝜃$ equiprobable values of $𝜃 𝜃𝜃i$ which are used to approximate this distribution.¹⁰

pict

Figure 1:Discrete Approximation to Lognormal Distribution $ℱ$

Recalling our deﬁnition of $𝔳t(at)$ , for $t = T − 1$

( ) n𝜃𝜃𝜃 1−ρ 1− ρ 1-- ∑ (ℛT-aT-−1 +-𝜃𝜃𝜃i)-- 𝔳T−1(aT− 1) = βΦΦΦ T n𝜃𝜃𝜃 1 − ρ i=1

(14)

so we can rewrite the maximization problem as

{ } -c1T−−ρ1- vT −1(mT −1) = mcaTx−1 1 − ρ + 𝔳T −1(mT −1 − cT−1) .

(15)

5.2 The Approximate Consumption and Value Functions

Given a particular value of $mT −1$ , a numerical maximization routine can now ﬁnd the $cT−1$ that maximizes (15) in a reasonable amount of time. The Mathematica program that solves exactly this problem is called 2period.m. (The archive also contains parallel Matlab programs, but these notes will dwell on the speciﬁcs of the Mathematica implementation, which is superior in many respects.)

The ﬁrst thing 2period.m does is to read in the ﬁle functions.m which contains deﬁnitions of the consumption and value functions; functions.m also deﬁnes the function SolveAnotherPeriod which, given the existence in memory of a solution for period $t + 1$ , solves for period $t$ .

The next step is to run the programs setup_params.m, setup_grids.m, setup_shocks.m, respectively. setup_params.m sets values for the parameter values like the coeﬃcient of relative risk aversion. setup_shocks.m calculates the values for the $𝜃𝜃𝜃i$ deﬁned above (and puts those values, and the (identical) probability associated with each of them, in the vector variables $𝜃 𝜃𝜃$ Vals and $𝜃 𝜃𝜃$ Prob). Finally, setup_grids.m constructs a list of potential values of cash-on-hand and saving, then puts them in the vector variables mVec = aVec = ${0, 1,2,3,4}$ respectively. Then 2period.m runs the program setup_lastperiod.m which deﬁnes the elements necessary to determine behavior in the last period, in which $cT(m ) = m$ and $vT(m ) = u(m )$ .

After all the setup, the only remaining step in 2period.m is to invoke SolveAnotherPeriod, which constructs the solution for period $T − 1$ given the presence of the solution for period $T$ (constructed by setup_lastperiod.m).

Because we will always be comparing our solution to the perfect foresight solution, we also construct the variables required to characterize the perfect foresight consumption function in periods prior to $T$ . In particular, we construct the list yExpPDV (which contains the PDV of expected income – ‘expected human wealth’), and yMinPDV which contains the minimum possible discounted value of future income at the beginning of period $T − 1$ (‘minimum human wealth’).¹¹

The perfect foresight consumption function is also constructed (setup_PerfectForesightSolution.m). This program uses the fact that, in Mathematica, functions can be saved as objects using the commands # and &. The # denotes the argument of the function, while the &, placed at the end of the function, tells Mathematica that the function should be saved as an object. In the program, the last period perfect foresight consumption function is saved as an element in the list $cг$ = {(# - 1 + Last[yExpPDV]) Last[ $κ$ Min] &}, where Last[yExpPDV] gives the just-constructed PDV of human wealth at the beginning of $T$ (equal to 1, since current income is included in $hT$ ), and Last[ $κ$ Min] gives the perfect foresight marginal propensity to consume (equal to 1, since it is optimal to spend all resources in the last period). Since # in the code stands in for what was called $m$ in the model, the discounted total wealth is decomposed into discounted non-human wealth # - 1 and discounted human wealth Last[yExpPDV]. The resulting formula then corresponds to $¯cT = (mT − 1 + hT)κT$ , which translates to $¯cT = mT$ for $hT = κT = 1$ .

The inﬁnite horizon perfect foresight marginal propensity to save

1∕ρ λ = (1∕R )(Rβ )

(16)

is also deﬁned because it will be useful in a number of derivations.¹²

The program then constructs behavior for one iteration back from the last period of life by using the function AddNewPeriodToParamLifeDates. Using the Mathematica command AppendTo, various existing lists (which characterized the solution for period $T$ ) are redeﬁned to include an additional element representing the relevant formulas in the second to last period of life. For example, $κ$ Min now has two elements. The second element, given by 1/(1 + Last[ $λ$ ]/Last[ $κ$ Min]), is the perfect foresight marginal propensity to consume in $t = T − 1$ .¹³

Next, the program deﬁnes a function $𝔳$ [at_] (in functions_stable.m) which is the exact implementation of (9): It returns the expectation of the value of behaving optimally in period $T$ given any speciﬁc amount of assets at the end of $T − 1$ , $aT −1$ .

The heart of the program is the next expression (in functions.m). This expression loops over the values of the variable mVec, solving the maximization problem (given in equation (15)):

max u[c ] + 𝔳 [mVec [[i]]- c] c

(17)

for each of the $i$ values of mVec (henceforth let’s call these points $mT −1,i$ ). The maximization routine returns two values: the maximized value, and the value of $c$ which yields that maximized value. When the loop (the Table command) is ﬁnished, the variable vAndcList contains two lists, one with the values $vT− 1,i$ and the other with the consumption levels $cT− 1,i$ associated with the $mT − 1,i$ .

5.3 An Interpolated Consumption Function

Now we use the ﬁrst of the really convenient built-in features of Mathematica. Given a set of points on a function (in this case, the consumption function $c (m ) T−1$ ), Mathematica can create an object called an InterpolatingFunction which when applied to an input $m$ will yield the value of $c$ that corresponds to a linear interpolation of the value of $c$ from the points in the InterpolatingFunction object. We can therefore deﬁne an approximation to the consumption function $`cT−1(mT − 1)$ which, when called with an $mT −1$ that is equal to one of the points in mVec[[i]] returns the associated value of $c T−1,i$ , and when called with a value of $mT −1$ that is not exactly equal to one of the mVec[[i]], returns the value of $c$ that reﬂects a linear interpolation between the $cT−1,i$ associated with the two mVec[[i]] points nearest to $mT −1$ . Thus if the function is called with $mT −1 = 1.75$ and the nearest gridpoints are $mj,T−1 = 1$ and $mk,T− 1 = 2$ then the value of $c T− 1$ returned by the function would be $(0.25c + 0.75c ) j,T− 1 k,T−1$ . We can deﬁne a numerical approximation to the value function $`vT−1(mT −1)$ in an exactly analogous way.

Figures 2 and 3 show plots of the $`cT−1$ and $`vT−1$ InterpolatingFunctions that are generated by the program 2PeriodInt.m. While the $`cT−1$ function looks very smooth, the fact that the $`vT −1$ function is a set of line segments is very evident. This ﬁgure provides the beginning of the intuition for why trying to approximate the value function directly is a bad idea (in this context).¹⁴

pict

Figure 2: $cT −1(mT −1)$ (solid) versus $`cT−1(mT −1)$ (dashed)

pict

Figure 3: $vT −1$ (solid) versus $`vT− 1(mT − 1)$ (dashed)

5.4 Interpolating Expectations

2period.m works well in the sense that it generates a good approximation to the true optimal consumption function. However, there is a clear ineﬃciency in the program: Since it uses equation (15), for every value of $mT − 1$ the program must calculate the utility consequences of various possible choices of $cT −1$ as it searches for the best choice. But for any given value of $aT−1$ , there is a good chance that the program may end up calculating the corresponding $𝔳$ many times while maximizing utility from diﬀerent $m T−1$ ’s. For example, it is possible that the program will calculate the value of ending the period with $aT −1 = 0$ dozens of times. It would be much more eﬃcient if the program could make that calculation once and then merely recall the value when it is needed again.

This can be achieved using the same interpolation technique used above to construct a direct numerical approximation to the value function: Deﬁne a grid of possible values for saving at time $T − 1$ , $⃗aT −1$ (aVec in setup_grids.m), designating the speciﬁc points $aT −1,i$ ; for each of these values of $aT −1,i$ , calculate the vector $⃗𝔳T−1$ as the collection of points $𝔳T− 1,i = 𝔳T −1(aT−1,i)$ using equation (9); then construct an InterpolatingFunction object $`𝔳 (a ) T− 1 T−1$ from the list of points on the function captured in the $⃗a T −1$ and $⃗𝔳 T−1$ vectors.

Thus, we are now interpolating for the function that reveals the expected value of ending the period with a given amount of assets.¹⁵ The program 2periodIntExp.m solves this problem. Figure 4 compares the true value function to the InterpolatingFunction approximation; the functions are of course identical at the gridpoints chosen for $aT− 1$ and they appear reasonably close except in the region below $mT −1 = 1$ .

pict

Figure 4:End-Of-Period Value $𝔳T−1(aT− 1)$ (solid) versus $`𝔳T −1(aT−1)$ (dashed)

pict

Figure 5: $cT −1(mT −1)$ (solid) versus $`cT−1(mT −1)$ (dashed)

Nevertheless, the resulting consumption rule obtained when $`𝔳T−1(aT− 1)$ is used instead of $𝔳T −1(aT−1)$ is surprisingly bad, as shown in ﬁgure 5. For example, when $mT −1$ goes from 2 to 3, $`c T−1$ goes from about 1 to about 2, yet when $m T −1$ goes from 3 to 4, $`cT−1$ goes from about 2 to about 2.05. The function fails even to be strictly concave, which is distressing because Carroll and Kimball (1996) prove that the correct consumption function is strictly concave in a wide class of problems that includes this problem.

5.5 Value Function versus First Order Condition

Loosely speaking, our diﬃculty reﬂects the fact that the consumption choice is governed by the marginal value function, not by the level of the value function (which is the object that we approximated). To understand this point, recall that a quadratic utility function exhibits risk aversion because with a stochastic $c$ ,

𝔼 [− (c − /c)2] < − (𝔼[c] − /c)2

(18)

where $/c$ is the ‘bliss point’. However, unlike the CRRA utility function, with quadratic utility the consumption/saving behavior of consumers is unaﬀected by risk since behavior is determined by the ﬁrst order condition, which depends on marginal utility, and when utility is quadratic, marginal utility is unaﬀected by risk:

𝔼[− 2(c − /c)] = − 2(𝔼[c] − /c).

(19)

Intuitively, if one’s goal is to accurately capture choices that are governed by marginal value, numerical techniques that approximate the marginal value function will yield a more accurate approximation to optimal behavior than techniques that approximate the level of the value function.

The ﬁrst order condition of the maximization problem in period $T − 1$ is:

′ −ρ ′ u (cT −1) = β 𝔼T −1[ΦΦΦ T Ru (cT)] ( 1 ) ∑n𝜃𝜃𝜃 c−T ρ−1 = Rβ --- ΦΦΦ −Tρ(R (mT −1 − cT −1) + 𝜃𝜃𝜃i)−ρ . n𝜃𝜃𝜃 i=1

(20)

pict

Figure 6: $u ′(c)$ versus $𝔳′T− 1(3 − c),𝔳′T− 1(4 − c),`𝔳′T −1(3 − c),`𝔳′T −1(4 − c)$

The downward-sloping curve in Figure 6 shows the value of $c−Tρ−1$ for our baseline parameter values for $0 ≤ cT− 1 ≤ 4$ (the horizontal axis). The solid upward-sloping curve shows the value of the RHS of (20) as a function of $c T −1$ under the assumption that $mT − 1 = 3$ . Constructing this ﬁgure is rather time-consuming, because for every value of $cT− 1$ plotted we must calculate the RHS of (20). The value of $cT− 1$ for which the RHS and LHS of (20) are equal is the optimal level of consumption given that $mT −1 = 3$ , so the intersection of the downward-sloping and the upward-sloping curves gives the optimal value of $cT− 1$ . As we can see, the two curves intersect just below $cT −1 = 2$ . Similarly, the upward-sloping dashed curve shows the expected value of the RHS of (20) under the assumption that $mT −1 = 4$ , and the intersection of this curve with $u ′(cT−1)$ yields the optimal level of consumption if $mT −1 = 4$ . These two curves intersect slightly below $cT− 1 = 2.5$ . Thus, increasing $mT −1$ from 3 to 4 increases optimal consumption by about 0.5.

Now consider the derivative of our function $`𝔳T−1(aT− 1)$ . Because we have constructed $`𝔳T −1$ as a linear interpolation, the slope of $`𝔳T−1(aT− 1)$ between any two adjacent points ${aT −1,i,ai+1,T− 1}$ is constant. The level of the slope immediately below any particular gridpoint is diﬀerent, of course, from the slope above that gridpoint, a fact which implies that the derivative of $`𝔳T− 1(aT −1)$ follows a step function.

The solid-line step function in Figure 6 depicts the actual value of $′ `𝔳T−1(3 − cT−1)$ . When we attempt to ﬁnd optimal values of $cT−1$ given $mT −1$ using $`𝔳T −1(aT−1)$ , the numerical optimization routine will return the $cT−1$ for which $u′(cT−1) = `𝔳′T−1(mT − 1 − cT− 1)$ . Thus, for $m = 3 T−1$ the program will return the value of $c T− 1$ for which the downward-sloping $′ u (cT−1)$ curve intersects with the $′ `𝔳T− 1(3 − cT −1)$ ; as the diagram shows, this value is exactly equal to 2. Similarly, if we ask the routine to ﬁnd the optimal $cT−1$ for $mT −1 = 4$ , it ﬁnds the point of intersection of $u′(cT−1)$ with $`𝔳 ′T−1(4 − cT−1)$ ; and as the diagram shows, this intersection is only slightly above 2. Hence, this ﬁgure illustrates why the numerical consumption function plotted earlier returned values very close to $c = 2 T−1$ for both $mT − 1 = 3$ and $mT − 1 = 4$ .

We would obviously obtain much better estimates of the point of intersection between $u′(cT−1)$ and $𝔳 ′T−1(mT −1 − cT−1)$ if our estimate of $`𝔳′T− 1$ were not a step function. In fact, we already know how to construct linear interpolations to functions, so the obvious next step is to construct a linear interpolating approximation to the expected marginal value of end-of-period assets function $′ 𝔳$ . That is, we calculate

( 1 ) ∑n𝜃𝜃𝜃 𝔳′T− 1(aT −1) = βR ΦΦΦ−T ρ --- (ℛT aT− 1 + 𝜃𝜃𝜃i)−ρ n𝜃𝜃𝜃 i=1

(21)

at the points in aVec yielding $′ ′ {{aT −1,1,𝔳T −1,1},{aT −1,2,𝔳T− 1,2}...}$ and construct $`𝔳′T −1(aT−1)$ as the linear interpolating function that ﬁts this set of points.

PlotOPRawVSFOC

pict

Figure 7: $𝔳 ′T−1(aT−1)$ versus $`𝔳′T −1(aT−1)$

The program ﬁle functionsIntExpFOC.m therefore uses the function $𝔳$ a[at_] deﬁned in functions_stable.m as the embodiment of equation (21), and constructs the InterpolatingFunction as described above. The results are shown in Figure 7. The linear interpolating approximation looks roughly as good (or bad) for the marginal value function as it was for the level of the value function. However, Figure 8 shows that the new consumption function (long dashes) is a considerably better approximation of the true consumption function (solid) than was the consumption function obtained by approximating the level of the value function (short dashes).

pict

Figure 8: $cT −1(mT −1)$ (solid) Versus Two Methods for Constructing $`cT−1(mT − 1)$

5.6 Transformation

Even the new-and-improved consumption function diverges notably from the true solution, especially at lower values of $m$ . That is because the linear interpolation does an increasingly poor job of capturing the nonlinearity of $′ 𝔳T− 1(aT −1)$ at lower and lower levels of $a$ .

This is where we unveil our next trick. To understand the logic, start by considering the case where $ℛ = β = ΦΦΦ = 1 T T$ and there is no uncertainty (that is, we know for sure that income next period will be $𝜃𝜃𝜃T = 1$ ). The ﬁnal Euler equation is then:

c−ρ = c− ρ. T−1 T

(22)

In the case we are now considering with no uncertainty and no liquidity constraints, the optimizing consumer does not care whether a unit of income is scheduled to be received in the future period $T$ or the current period $T − 1$ ; there is perfect certainty that the income will be received, so the consumer treats it as equivalent to a unit of current wealth. Total resources therefore are comprised of two types: current market resources $mT −1$ and ‘human wealth’ (the PDV of future income) of $𝔥T−1 = 1$ (where we use the Gothic font to signify that this is the expectation, as of the END of the period, of the income that will be received in future periods; it does not include current income, which has already been incorporated into $mT − 1$ ).

The optimal solution is to spend half of total lifetime resources in period $T − 1$ and the remainder in period $T$ . Since total resources are known with certainty to be $mT − 1 + 𝔥T −1 = mT −1 + 1$ , and since $v′ (mT −1) = u ′(cT−1) T−1$ this implies that

( m + 1) −ρ v′T −1(mT −1) = --T−1----- . 2

(23)

Of course, this is a highly nonlinear function. However, if we raise both sides of (23) to the power $(− 1 ∕ρ)$ the result is a linear function:

′ − 1∕ρ mT-−-1 +-1 vT− 1(mT − 1)t = 2 .

(24)

This is a speciﬁc example of a general phenomenon: A theoretical literature cited in Carroll and Kimball (1996) establishes that under perfect certainty, if the period-by-period marginal utility function is of the form $−ρ ct$ , the marginal value function will be of the form $−ρ (γmt + ζ)$ for some constants ${γ, ζ}$ . This means that if we were solving the perfect foresight problem numerically, we could always calculate a numerically exact (because linear) interpolation. To put this in intuitive terms, the problem we are facing is that the marginal value function is highly nonlinear. But we have a compelling solution to that problem, because the nonlinearity springs largely from the fact that we are raising something to the power $− ρ$ . In eﬀect, we can ‘unwind’ all of the nonlinearity owing to that operation and the remaining nonlinearity will not be nearly so great. Speciﬁcally, applying the foregoing insights to the end-of-period value function $𝔳T −1$ , we can deﬁne

′ − 1∕ρ 𝔠T−1(aT− 1) ≡ [𝔳T− 1(aT −1)]

(25)

which would be linear in the perfect foresight case. Thus, our procedure is to calculate the values of $𝔠T− 1,i$ at each of the $aT−1,i$ gridpoints, with the idea that we will construct $`𝔠T− 1$ as the interpolating function connecting these points.

5.7 The Self-Imposed ‘Natural’ Borrowing Constraint and the $aT−1$ Lower Bound

This is the appropriate moment to ask an awkward question that we have so far neglected: How should a function like $`𝔠T− 1$ be evaluated outside the range of points spanned by ${a ,...,a } T−1,1 T− 1,n$ for which we have calculated the corresponding $𝔠 T −1,i$ gridpoints used to produce our linearly interpolating approximation $`𝔠T− 1$ (as described in section 5.3)?

The natural answer would seem to be linear extrapolation; for example, we could use

`𝔠T− 1(aT −1) = `𝔠T−1(aT− 1,1) + `𝔠aT− 1(aT −1,1)(aT− 1 − aT −1,1)

(26)

for values of $aT− 1 < aT−1,1$ , where $`a 𝔠T−1(aT−1,1)$ is the derivative of the $` 𝔠T−1$ function at the bottommost gridpoint (see below). Unfortunately, this approach will lead us into diﬃculties. To see why, consider what happens to the true (not approximated) $𝔳T −1(aT−1)$ as $aT− 1$ approaches the value $aT−1 = − 𝜃𝜃𝜃ℛ −T1$ . From (21) we have

( ) n∑𝜃𝜃𝜃 lim 𝔳′T−1(aT− 1) = lim βR ΦΦΦ −ρ -1- (aT− 1ℛT + 𝜃𝜃𝜃i)−ρ . aT−1↓aT−1 aT−1↓aT−1 T n𝜃𝜃𝜃 i=1

(27)

But since $𝜃𝜃𝜃-= 𝜃𝜃𝜃1$ , exactly at $aT−1 = aT −1$ the ﬁrst term in the summation would be $(− 𝜃𝜃𝜃-+ 𝜃𝜃𝜃1)−ρ = 1∕0ρ$ which is inﬁnity. The reason is simple: $− aT −1$ is the PDV, as of $T − 1$ , of the minimum possible realization of income in period $T$ ( $ℛ a = − 𝜃𝜃𝜃 T -T−1 1$ ). Thus, if the consumer borrows an amount greater than or equal to $−1 𝜃𝜃𝜃ℛ T$ (that is, if the consumer ends $T − 1$ with $−1 aT −1 ≤ − 𝜃𝜃𝜃ℛ T$ ) and then draws the worst possible income shock in period $T$ , he will have to consume zero in period $T$ (or a negative amount), which yields $− ∞$ utility and $∞$ marginal utility (or undeﬁned utility and marginal utility).

These reﬂections lead us to the conclusion that the consumer faces a ‘self-imposed’ liquidity constraint (which results from the precautionary motive): He will never borrow an amount greater than or equal to $𝜃𝜃𝜃ℛ −T1$ (that is, assets will never reach the lower bound of $a- T −1$ ).¹⁶ The constraint is ‘self-imposed’ in the sense that if the utility function were diﬀerent (say, Constant Absolute Risk Aversion), the consumer would be willing to borrow more than $−1 𝜃𝜃𝜃ℛ T$ because a choice of zero or negative consumption in period $T$ would yield some ﬁnite amount of utility.¹⁷

This self-imposed constraint cannot be captured well when the $𝔳′ T− 1$ function is approximated by a piecewise linear function like $`𝔳′ T− 1$ , because a linear approximation can never reach the correct gridpoint for $′ 𝔳T −1(aT−1) = ∞.$ To see what will happen instead, note ﬁrst that if we are approximating $𝔳′T−1$ the smallest value in aVec must be greater than $aT −1$ (because the expectation for any gridpoint $≤ aT −1$ is undeﬁned). Then when the approximating $𝔳′ T −1$ function is evaluated at some value less than the ﬁrst element in aVec[1], the approximating function will linearly extrapolate the slope that characterized the lowest segment of the piecewise linear approximation (between aVec[1] and aVec[2]), a procedure that will return a positive ﬁnite number, even if the requested $aT−1$ point is below $aT −1$ . This means that the precautionary saving motive is understated, and by an arbitrarily large amount as the level of assets approaches its true theoretical minimum $a- T −1$ .

The foregoing logic demonstrates that the marginal value of saving approaches inﬁnity as $−1 aT −1 ↓ aT −1 = − 𝜃𝜃𝜃-ℛ T$ . But this implies that $′ −1∕ρ limaT −1↓aT− 1 𝔠T−1(aT− 1) = (𝔳T −1(aT−1)) = 0$ ; that is, as $a$ approaches its minimum possible value, the corresponding amount of $c$ must approach its minimum possible value: zero.

The upshot of this discussion is a realization that all we need to do is to augment each of the $⃗aT− 1$ and $⃗cT− 1$ vectors with an extra point so that the ﬁrst element in the list used to produce our InterpolatingFunction is ${aT −1,0,cT− 1,0} = {aT −1,0.}$ .

pict

Figure 9: $𝔠T −1(aT−1)$ versus $`𝔠T−1(aT− 1)$

Figure 9 plots the results (generated by the program 2periodIntExpFOCInv.m). The solid line calculates the exact numerical value of $𝔠T−1(aT− 1)$ while the dashed line is the linear interpolating approximation $`𝔠 (a ). T −1 T−1$ This ﬁgure well illustrates the value of the transformation: The true function is close to linear, and so the linear approximation is almost indistinguishable from the true function except at the very lowest values of $aT −1$ .

Figure 10 similarly shows that when we calculate $``𝔳′ (a ) T−1 T− 1$ as $[`𝔠 (a )]−ρ T−1 T− 1$ (dashed line) we obtain a much closer approximation to the true function $′ 𝔳T− 1(aT −1)$ (solid line) than we did in the previous program which did not do the transformation (Figure 7).

pict

Figure 10: $𝔳′T−1(aT− 1)$ vs. $``𝔳′T− 1(aT −1)$ Constructed Using $`𝔠T−1(aT− 1)$

5.8 The Method of Endogenous Gridpoints

Our solution procedure for $c T−1$ still requires us, for each point in $⃗m T−1$ (mVect in the code), to use a numerical rootﬁnding algorithm to search for the value of $cT−1$ that solves $′ ′ u(cT− 1) = 𝔳 T−1(mT −1 − cT−1)$ . Unfortunately, rootﬁnding is a notoriously computation-intensive (that is, slow!) operation.

Our next trick lets us completely skip the rootﬁnding step. The method can be understood by noting that any arbitrary value of $a T− 1,i$ (greater than its lower bound value $a -T−1$ ) will be associated with some marginal valuation as of the end of period $T − 1$ , and the further observation that it is trivial to ﬁnd the value of $c$ that yields the same marginal valuation, using the ﬁrst order condition,

′ ′ u (cT−1,i) = 𝔳T−1(aT −1,i) cT −1,i = u′− 1(𝔳′T−1(aT−1,i)) ′ − 1∕ρ = (𝔳T−1(aT− 1,i)) ≡ 𝔠T−1(aT− 1,i) ≡ 𝔠 . T−1,i

(28)

But with mutually consistent values of $c T −1,i$ and $a T−1,i$ (consistent, in the sense that they are the unique optimal values that correspond to the solution to the problem in a single state), we can obtain the $mT −1,i$ that corresponds to both of them from

mT −1,i = cT− 1,i + aT −1,i.

(29)

These $m T−1$ gridpoints are “endogenous” in contrast to the usual solution method of specifying some ex-ante grid of values of $mT − 1$ and then using a rootﬁnding routine to locate the corresponding optimal $cT−1$ .

Thus, we can generate a set of $mT −1,i$ and $cT −1,i$ pairs that can be interpolated between in order to yield $`c(mT −1)$ at virtually zero computational cost once we have the $⃗𝔠T− 1$ values in hand!¹⁸ One might worry about whether the ${m, c}$ points obtained in this way will provide a good representation of the consumption function as a whole, but in practice there are good reasons why they work well (basically, this procedure generates a set of gridpoints that is naturally dense right around the parts of the function with the greatest nonlinearity).

pict

Figure 11: $cT−1(mT −1)$ (solid) versus $`cT−1(mT −1)$ (dashed)

Figure 11 plots the actual consumption function $cT−1$ and the approximated consumption function $`cT−1$ derived by the method of endogenous grid points. Compared to the approximate consumption functions illustrated in Figure 8 $`c T −1$ is quite close to the actual consumption function.

5.9 Improving the $a$ Grid

Thus far, we have arbitrarily used $a$ gridpoints of ${0.,1.,2.,3.,4.}$ (augmented in the last subsection by $a -T −1$ ). But it has been obvious from the ﬁgures that the approximated $`𝔠T− 1$ function tends to be farthest from its true value $𝔠T−1$ at low values of $a$ . Combining this with our insight that $aT−1$ is a lower bound, we are now in position to deﬁne a more deliberate method for constructing gridpoints for $aT−1$ – a method that yields values that are more densely spaced than the uniform grid at low values of $a$ . A pragmatic choice that works well is to ﬁnd the values such that (1) the last value exceeds the lower bound by the same amount $¯aT −1$ as our original maximum gridpoint (in our case, 4.); (2) we have the same number of gridpoints as before; and (3) the multi-exponential growth rate (that is, $... eee$ for some number of exponentiations $n 𝜃𝜃𝜃$ ) from each point to the next point is constant (instead of, as previously, imposing constancy of the absolute gap between points).

pict

Figure 12: $𝔠T −1(aT−1)$ versus $`𝔠T−1(aT− 1)$ , Multi-Exponential aVec

pict

Figure 13: $𝔳′T−1(aT− 1)$ vs. $``𝔳′T− 1(aT −1)$ , Multi-Exponential aVec

The results (generated by the program 2periodIntExpFOCInvEEE.m) are depicted in Figures 12 and 13, which are notably closer to their respective truths than the corresponding ﬁgures that used the original grid.

5.10 The Method of Moderation

Unfortunately, this endogenous gridpoints solution is not very well-behaved outside the original range of gridpoints targeted by the solution method. (Though other common solution methods are no better outside their own predeﬁned ranges). Figure 14 demonstrates the point by plotting the amount of precautionary saving implied by a linear extrapolation of our approximated consumption rule (the consumption of the perfect foresight consumer $¯cT− 1$ minus our approximation to optimal consumption under uncertainty, $`cT− 1$ ). Although theory proves that precautionary saving is always positive, the linearly extrapolated numerical approximation eventually predicts negative precautionary saving (at the point in the ﬁgure where the extrapolated locus crosses the horizontal axis).

pict

Figure 14:For Large Enough $mT − 1$ , Predicted Precautionary Saving is Negative (Oops!)

This error cannot be ﬁxed by extending the upper gridpoint; in the presence of serious uncertainty, the consumption rule will need to be evaluated outside of any prespeciﬁed grid (because starting from the top gridpoint, a large enough realization of the uncertain variable will push next period’s realization of assets above that top; a similar argument applies below the bottom gridpoint). While a judicious extrapolation technique can prevent this problem from being fatal (for example by carefully excluding negative precautionary saving), the problem is often dealt with using inelegant methods whose implications for the accuracy of the solution are diﬃcult to gauge.

As a preliminary to our solution, deﬁne $𝔥t$ as end-of-period human wealth (the present discounted value of future labor income) for a perfect foresight version of the problem of a ‘risk optimist:’ a consumer who believes with perfect conﬁdence that the shocks will always take the value 1, $𝜃𝜃𝜃t+n = 𝔼 [𝜃𝜃𝜃] = 1 ∀ n > 0$ . The solution to a perfect foresight problem of this kind takes the form¹⁹

¯ct(mt ) = (mt + 𝔥t)κt

(30)

for a constant minimal marginal propensity to consume $κ -t$ given below.

We similarly deﬁne $𝔥t$ as ‘minimal human wealth,’ the present discounted value of labor income if the shocks were to take on their worst possible value in every future period $𝜃𝜃𝜃t+n = 𝜃𝜃𝜃-∀ n > 0$ (which we deﬁne as corresponding to the beliefs of a ‘pessimist’).

We will call a ‘realist’ the consumer who correctly perceives the true probabilities of the future risks and optimizes accordingly.

A ﬁrst useful point is that, for the realist, a lower bound for the level of market resources is $mt = − 𝔥- t$ , because if $mt$ equalled this value then there would be a positive ﬁnite chance (however small) of receiving $𝜃𝜃𝜃t+n = 𝜃𝜃𝜃-$ in every future period, which would require the consumer to set $ct$ to zero in order to guarantee that the intertemporal budget constraint holds (this is the multiperiod generalization of the discussion in section 5.7 about $aT− 1$ ). Since consumption of zero yields negative inﬁnite utility, the solution to realist consumer’s problem is not well deﬁned for values of $m < m t --t$ , and the limiting value of the realist’s $c t$ is zero as $mt ↓ mt$ .

Given this result, it will be convenient to deﬁne ‘excess’ market resources as the amount by which actual resources exceed the lower bound, and ‘excess’ human wealth as the amount by which mean expected human wealth exceeds guaranteed minimum human wealth:

=◜−◞m◟t◝ ▴mt = mt + 𝔥 -t ▴𝔥t = 𝔥t − 𝔥t.

We can now transparently deﬁne the optimal consumption rules for the two perfect foresight problems, those of the ‘optimist’ and the ‘pessimist.’ The ‘pessimist’ perceives human wealth to be equal to its minimum feasible value $𝔥t$ with certainty, so consumption is given by the perfect foresight solution

ct(mt ) = (mt + 𝔥t)κt = ▴mt κt.

The ‘optimist,’ on the other hand, pretends that there is no uncertainty about future income, and therefore consumes

¯ct(mt ) = (mt + 𝔥t − 𝔥t + 𝔥t)κt = (▴mt + ▴ 𝔥t)κt = ct(mt ) + ▴𝔥tκt.

It seems obvious that the spending of the realist will be strictly greater than that of the pessimist and strictly less than that of the optimist. Figure 15 illustrates the proposition for the consumption rule in period $T − 1$ .

pict

Figure 15:Moderation Illustrated: $cT− 1 < `cT −1 < ¯cT−1$

Proof is more diﬃcult than might be imagined, but the necessary work is done in Carroll (2022) so we will take the proposition as a fact and proceed by manipulating the inequality:

▴m κ < c(m + ▴m ) < (▴m + ▴ 𝔥 )κ
t-t t--t t t t-t
− ▴mt κt > − ct(mt + ▴mt ) > − (▴mt + ▴ 𝔥t)κt
▴𝔥tκt > ( ¯ct(mt + ▴mt ) − ct(mt + ▴mt )) > 0
¯ct(mt-+--▴mt-) −-ct(mt-+-▴mt--)
1 > ▴ 𝔥 κ > 0
◟--------------◝t◜-t------------◞
≡ˆϙt

where the fraction in the middle of the last inequality is the ratio of actual precautionary saving (the numerator is the diﬀerence between perfect-foresight consumption and optimal consumption in the presence of uncertainty) to the maximum conceivable amount of precautionary saving (the amount that would be undertaken by the pessimist who consumes nothing out of any future income beyond the perfectly certain component).

Deﬁning $μ = log ▴m t t$ (which can range from $− ∞$ to $∞$ ), the object in the middle of the last inequality is

( ) ¯ct(mt-+-eμt)-−-ct(mt--+-eμt) ˆϙt(μt) ≡ ▴𝔥 κ , tt

(31)

and we now deﬁne

( 1 − ˆϙ (μt) ) ˆχχχt(μt) = log -----t---- ˆϙt(μt) = log (1∕ˆϙt(μt) − 1 )

(32)

which has the virtue that it is linear in the limit as $μt$ approaches $+ ∞$ .

Given $ˆχχχ$ , the consumption function can be recovered from

◜------=◞ˆϙ◟t----◝ ( 1 ) ˆct = ¯ct − ------------ ▴𝔥tκt. 1 + exp (χχˆχt)

(33)

Thus, the procedure is to calculate $ˆχχχt$ at the points $⃗μt$ corresponding to the log of the $▴ ⃗mt$ points deﬁned above, and then using these to construct an interpolating approximation $`ˆχχχt$ from which we indirectly obtain our approximated consumption rule $`ˆct$ by substituting $`ˆχχχ t$ for $ˆχχχ$ in equation (33).

Because this method relies upon the fact that the problem is easy to solve if the decision maker has unreasonable views (either in the optimistic or the pessimistic direction), and because the correct solution is always between these immoderate extremes, we call our solution procedure the ‘method of moderation.’

Results are shown in Figure 16; a reader with very good eyesight might be able to detect the barest hint of a discrepancy between the Truth and the Approximation at the far righthand edge of the ﬁgure – a stark contrast with the calamitous divergence evident in Figure 14.

pict

Figure 16:Extrapolated $`ˆcT− 1$ Constructed Using the Method of Moderation

5.11 Approximating the Slope Too

Until now, we have calculated the level of consumption at various diﬀerent gridpoints and used linear interpolation (either directly for $cT −1$ or indirectly for, say, $ˆχχχT− 1$ ). But the resulting piecewise linear approximations have the unattractive feature that they are not diﬀerentiable at the ‘kink points’ that correspond to the gridpoints where the slope of the function changes discretely.

Carroll (2022) shows that the true consumption function for this problem is ‘smooth:’ It exhibits a well-deﬁned unique marginal propensity to consume at every positive value of $m$ . This suggests that we should calculate, not just the level of consumption, but also the marginal propensity to consume (henceforth $κ$ ) at each gridpoint, and then ﬁnd an interpolating approximation that smoothly matches both the level and the slope at those points.

This requires us to diﬀerentiate (31) and (32), yielding

( --≡κκκt(mt)---) μ −1 μt( ◜m ◞◟ μt◝) ˆϙt (μt) = (▴ 𝔥tκt) e κt − ct (mt + e ) ( ) μ − ˆϙ μt(μt)∕ ˆϙ2 ˆχχχt (μt) = -----------t 1∕ϙˆt(μt) − 1

(34)

and (dropping arguments) with some algebra these can be combined to yield

( ) μ κt▴mt ▴ 𝔥t(κt − κt) χχχˆt = ------------------------- . (¯ct − ct)(¯ct − ct − κt▴ 𝔥t)

(35)

To compute the vector of values of (34) corresponding to the points in $⃗μt$ , we need the marginal propensities to consume (designated $κ$ ) at each of the gridpoints, $m ct$ (the vector of such values is $⃗κt$ ). These can be obtained by diﬀerentiating the Euler equation (12) (where we deﬁne $𝔪t(a) ≡ 𝔠t(a) + a$ ):

′ a u (𝔠t) = ˆ𝔳t(𝔪t − 𝔠t)

(36)

with respect to $a$ , yielding a marginal propensity to have consumed $𝔠a$ at each gridpoint:

u′′(𝔠t)𝔠at = ˆ𝔳at(𝔪t − 𝔠t) a ˆa ′′ 𝔠t = 𝔳t(𝔪t − 𝔠t)∕u (𝔠t)

(37)

and the marginal propensity to consume at the beginning of the period is obtained from the marginal propensity to have consumed by noting that

𝔠 = 𝔪 − a a a 𝔠 + 1 = 𝔪

which, together with the chain rule $𝔠a = cm 𝔪a$ , yields the MPC from

=𝔪a ◜-◞◟-◝ cm(𝔠a + 1) = 𝔠a cm = 𝔠a∕(1 + 𝔠a).

(38)

Designating $`ˆcT−1$ as the approximated consumption rule obtained using an interpolating polynomial approximation to $χχχˆ$ that matches both the level and the ﬁrst derivative at the gridpoints, Figure 17 plots the diﬀerence between this latest approximation and the true consumption rule for period $T − 1$ up to the same large value (far beyond the largest gridpoint) used in prior ﬁgures. Of course, at the gridpoints the approximation will match the true function; but this ﬁgure illustrates that the approximation is quite accurate far beyond the last gridpoint (which is the last point at which the diﬀerence touches the horizontal axis). (We plot here the diﬀerence between the two functions rather than the level plotted in previous ﬁgures, because in levels the approximation error would not be detectable even to the most eagle-eyed reader.)

pict

Figure 17:Diﬀerence Between True $cT− 1$ and $`ˆcT− 1$ Is Minuscule

5.12 Value

Often it is useful to know the value function as well as the consumption rule. Fortunately, many of the tricks used when solving for the consumption rule have a direct analogue in approximation of the value function.

Consider the perfect foresight (or “optimist’s”) problem in period $T − 1$ :

¯v (m ) ≡ u(c ) + βu (c ) T −1 T−1 T−1 ( T ) = u(cT−1) 1 + β ((βTR )1∕ρ)1−ρ ( 1∕ρ−1) = u(cT−1)( 1 + β (βTR ) ) = u(cT−1) 1 + (βT R)1∕ρ∕R T = u(cT−1)P◟DV--t-(c◝)◜∕cT-−1◞ ≡ℂT t

where $T T ℂt = PDV t (c)$ is the present discounted value of consumption. A similar function can be constructed recursively for earlier periods, yielding the general expression

¯vt(mt ) = u(¯ct)ℂT −t1 = u(¯ct)κt = u((▴mt + ▴𝔥t)κt)κ−t1 1−ρ −1 = u(▴mt + ▴𝔥t)κt κ-t = u(▴m + ▴𝔥 )κ− ρ t t-t

(39)

where the second line uses the fact demonstrated in Carroll (2022) that $ℂt = κ−t1$ .

This can be transformed as

1∕(1−ρ) ¯Λt ≡ ((1 − ρ)¯vt) T 1∕(1−ρ) = ct(ℂ t ) = (▴mt + ▴ 𝔥t)κ−tρ∕(1−ρ)

with derivative

¯Λm = (ℂT )1∕(1−ρ)κ-, t t t = κ−tρ∕(1− ρ)

and since $ℂT t$ is a constant while the consumption function is linear, $¯Λ t$ will also be linear.

We apply the same transformation to the value function for the problem with uncertainty (the “realist’s” problem) and diﬀerentiate

1∕(1− ρ) ¯Λt = ((1 − ρ)¯vt(mt )) ¯Λm = ((1 − ρ)¯vt(mt ))− 1+1 ∕(1−ρ)¯vm (mt) t t

and an excellent approximation to the value function can be obtained by calculating the values of $¯Λ$ at the same gridpoints used by the consumption function approximation, and interpolating among those points.

However, as with the consumption approximation, we can do even better if we realize that the $¯Λ$ function for the optimist’s problem is an upper bound for the $Λ$ function in the presence of uncertainty, and the value function for the pessimist is a lower bound. Analogously to (31), deﬁne an upper-case

( ¯Λ (m- + eμt) − Λ (m + eμt)) Ϙˆt (μt) = -t---t-------T-t1∕(1−tρ)----- ▴ 𝔥tκt(ℂ t )

(40)

with derivative (dropping arguments)

ˆμ T 1∕(1−ρ) −1 μt m m Ϙt = (▴ 𝔥tκt(ℂ t ) ) e (¯Λt − Λt )

(41)

and an upper-case version of the $χχχ$ equation in (32):

( ) ˆX 1-−-ˆϘt(μt) XXt(μt) = log Ϙˆt(μt) ( ) = log 1∕ˆϘt(μt) − 1

(42)

with corresponding derivative

( μ ) ˆμ -−Ϙˆt∕ˆϘ2t- XXXt = 1∕ ˆϘ − 1 t

(43)

and if we approximate these objects then invert them (as above with the $ˆϙ$ and $χχˆχ$ functions) we obtain a very high-quality approximation to our inverted value function at the same points for which we have our approximated value function:

=ˆϘt ◜(------◞◟-----)◝ 1 T 1∕(1− ρ) ˆΛt = ¯Λt − ---------ˆ-- ▴ 𝔥tκt(ℂt ) 1 + exp (XXXt)

(44)

from which we obtain our approximation to the value function and its derivatives as

ˆvt = u(ˆΛt) ˆvm = u′(ˆΛ )ˆΛm t t ˆvmtm = u′′(ˆΛt)(ˆΛm )2 + u′(ˆΛt)ˆΛmm.

Although a linear interpolation that matches the level of $Λ$ at the gridpoints is simple, a Hermite interpolation that matches both the level and the derivative of the $¯Λt$ function at the gridpoints has the considerable virtue that the $¯vt$ derived from it numerically satisﬁes the envelope theorem at each of the gridpoints for which the problem has been solved.

If we use the double-derivative calculated above to produce a higher-order Hermite polynomial, our approximation will also match marginal propensity to consume at the gridpoints; this would guarantee that the consumption function generated from the value function would match both the level of consumption and the marginal propensity to consume at the gridpoints; the numerical diﬀerences between the newly constructed consumption function and the highly accurate one constructed earlier would be negligible within the grid.

5.13 Reﬁnement: A Tighter Upper Bound

Carroll (2022) derives an upper limit $κ¯t$ for the MPC as $mt$ approaches its lower bound. Using this fact plus the strict concavity of the consumption function yields the proposition that

ct(mt + ▴mt ) < ¯κt▴mt.

(45)

The solution method described above does not guarantee that approximated consumption will respect this constraint between gridpoints, and a failure to respect the constraint can occasionally cause computational problems in solving or simulating the model. Here, we describe a method for constructing an approximation that always satisﬁes the constraint.

Deﬁning $m# t$ as the ‘cusp’ point where the two upper bounds intersect:

( ) # # ▴m t + ▴𝔥t κt = ¯κt▴m t # κt▴ 𝔥t ▴m t = ---------- (1 − κt)¯κt # -κt𝔥t −-𝔥t mt = (1 − κ )¯κ , -t t

we want to construct a consumption function for $# mt ∈ (mt, m t ]$ that respects the tighter upper bound:

▴mt κt < ct(mt + ▴mt ) < ¯κt▴mt
▴mt (¯κt − κt) > κ¯t▴(mt − ct(mt + ▴mt) ) > 0
¯κt▴mt−ct(mt+▴mt)-
1 > ▴mt(¯κt− κt) > 0.

Again deﬁning $μ = log ▴m t t$ , the object in the middle of the inequality is

μ −μ ¯κt-−-ct(mt-+-e-t)e--t ˇϙt(μt) ≡ ¯κt − κt μt −μt m μt ˇϙμ(μt) = ct(mt-+-e--)e---−-κκκt (mt-+-e-)-. t ¯κt − κt

As $m t$ approaches $− m- t$ , $ˇϙ (μ ) t t$ converges to zero, while as $m t$ approaches $+ ∞$ , $ˇϙt(μt)$ approaches $1$ .

As before, we can derive an approximated consumption function; call it $`ˇct$ . This function will clearly do a better job approximating the consumption function for low values of $mt$ while the previous approximation will perform better for high values of $mt$ .

For middling values of $m$ it is not clear which of these functions will perform better. However, an alternative is available which performs well. Deﬁne the highest gridpoint below $# m t$ as $¯ # ˇm t$ and the lowest gridpoint above $# m t$ as $# ˆm-t$ . Then there will be a unique interpolating polynomial that matches the level and slope of the consumption function at these two points. Call this function $˜ct(m )$ .

Using indicator functions that are zero everywhere except for speciﬁed intervals,

111Lo(m ) = 1 if m ≤ m¯ˇ# t# # 111Mid(m ) = 1 if ¯ˇm t < m < ˆm-t # 111Hi(m ) = 1 if ˆm-t ≤ m

we can deﬁne a well-behaved approximating consumption function

` ` `ct = 111Lo`ˇct + 111Mid˜ct + 111Hiˆct.

(46)

This just says that, for each interval, we use the approximation that is most appropriate. The function is continuous and once-diﬀerentiable everywhere, and is therefore well behaved for computational purposes.

We now construct an upper-bound value function implied for a consumer whose spending behavior is consistent with the reﬁned upper-bound consumption rule.

For $m ≥ m# t t$ , this consumption rule is the same as before, so the constructed upper-bound value function is also the same. However, for values $# mt < m t$ matters are slightly more complicated.

Start with the fact that at the cusp point,

# # T ¯vt(mt ) = u(¯ct(m t ))ℂt = u(▴m# ¯κt)ℂT . t t

But for all $mt$ ,

¯vt(m ) = u(¯ct(m )) + ¯𝔳t(m − ¯ct(m )),

and we assume that for the consumer below the cusp point consumption is given by $¯κ▴mt$ so for $mt < m#t$

¯vt(m ) = u(¯κt▴m ) + ¯𝔳t((1 − ¯κt)▴m ),

which is easy to compute because $¯𝔳t(at) = β ¯vt+1 (atℛ + 1)$ where $¯vt$ is as deﬁned above because a consumer who ends the current period with assets exceeding the lower bound will not expect to be constrained next period. (Recall again that we are merely constructing an object that is guaranteed to be an upper bound for the value that the ‘realist’ consumer will experience.) At the gridpoints deﬁned by the solution of the consumption problem can then construct

¯Λt(m ) = ((1 − ρ)¯vt(m ))1∕(1−ρ)

and its derivatives which yields the appropriate vector for constructing $ˇXXX$ and $ˇϘ$ . The rest of the procedure is analogous to that performed for the consumption rule and is thus omitted for brevity.

5.14 Extension: A Stochastic Interest Factor

Thus far we have assumed that the interest factor is constant at $R$ . Extending the previous derivations to allow for a perfectly forecastable time-varying interest factor $R t$ would be trivial. Allowing for a stochastic interest factor is less trivial.

The easiest case is where the interest factor is i.i.d.,

2 2 log Rt+n ∼ 𝒩 (r + φ − σr∕2,σ r) ∀ n > 0

(47)

where $φ$ is the risk premium and the $σ2 ∕2 r$ adjustment to the mean log return guarantees that an increase in $2 σr$ constitutes a mean-preserving spread in the level of the return.

This case is reasonably straightforward because Merton (1969) and Samuelson (1969) showed that for a consumer without labor income (or with perfectly forecastable labor income) the consumption function is linear, with an inﬁnite-horizon MPC²⁰

( 1−ρ)1∕ρ κ = 1 − β 𝔼t[R t+1 ]

(48)

and in this case the previous analysis applies once we substitute this MPC for the one that characterizes the perfect foresight problem without rate-of-return risk.

The more realistic case where the interest factor has some serial correlation is more complex. We consider the simplest case that captures the main features of empirical interest rate dynamics: An AR(1) process. Thus the speciﬁcation is

rt+1 − r = (rt − r)γ + 𝜖t+1

(49)

where $r$ is the long-run mean log interest factor, $0 < γ < 1$ is the AR(1) serial correlation coeﬃcient, and $𝜖t+1$ is the stochastic shock.

The consumer’s problem in this case now has two state variables, $m t$ and $r t$ , and is described by

vt(mt, rt) = max u(ct) + 𝔼t [βt+1ΦΦΦ1t+−1ρvt+1(mt+1,rt+1)] ct s.t. at = mt − ct rt+1 − r = (rt − r)γ + 𝜖t+1 Rt+1 = exp(rt+1) mt+1 = (Rt+1-∕ΦΦΦt+1-)at + 𝜃𝜃𝜃t+1. ◟ ≡ℛ◝◜ ◞ t+1

We approximate the AR(1) process by a Markov transition matrix using standard techniques. The stochastic interest factor is allowed to take on 11 values centered around the steady-state value $r$ and chosen [how?]. Given this Markov transition matrix, conditional on the Markov AR(1) state the consumption functions for the ‘optimist’ and the ‘pessimist’ will still be linear, with identical MPC’s that are computed numerically. Given these MPC’s, the (conditional) realist’s consumption function can be computed for each Markov state, and the converged consumption rules constitute the solution contingent on the dynamics of the stochastic interest rate process.

In principle, this reﬁnement should be combined with the previous one; further exposition of this combination is omitted here because no new insights spring from the combination of the two techniques.

5.15 Imposing ‘Artiﬁcial’ Borrowing Constraints

Optimization problems often come with additional constraints that must be satisﬁed. Particularly common is an ‘artiﬁcial’ liquidity constraint that prevents the consumer’s net worth from falling below some value, often zero.²¹ The problem then becomes

1−ρ vT−1(mT −1) = mcaTx−1 u(cT− 1) + 𝔼T −1[βΦΦΦ T vT(mT )] s.t. aT−1 = mT − 1 − cT− 1 mT = ℛT aT −1 + 𝜃𝜃𝜃T aT−1 ≥ 0.

By deﬁnition, the constraint will bind if the unconstrained consumer would choose a level of spending that would violate the constraint. Here, that means that the constraint binds if the $cT− 1$ that satisﬁes the unconstrained FOC

−ρ ′ cT−1 = 𝔳T− 1(mT − 1 − cT −1)

(50)

is greater than $mT − 1$ . Call $`c∗ T−1$ the approximated function returning the level of $cT− 1$ that satisﬁes (50). Then the approximated constrained optimal consumption function will be

∗ `cT−1(mT −1) = min [mT −1,`cT−1(mT −1)].

(51)

The introduction of the constraint also introduces a sharp nonlinearity in all of the functions at the point where the constraint begins to bind. As a result, to get solutions that are anywhere close to numerically accurate it is useful to augment the grid of values of the state variable to include the exact value at which the constraint ceases to bind. Fortunately, this is easy to calculate. We know that when the constraint is binding the consumer is saving nothing, which yields marginal value of $𝔳′ (0) T−1$ . Further, when the constraint is binding, $cT− 1 = mT −1$ . Thus, the largest value of consumption for which the constraint is binding will be the point for which the marginal utility of consumption is exactly equal to the (expected, discounted) marginal value of saving 0. We know this because the marginal utility of consumption is a downward-sloping function and so if the consumer were to consume $𝜖$ more, the marginal utility of that extra consumption would be below the (discounted, expected) marginal utility of saving, and thus the consumer would engage in positive saving and the constraint would no longer be binding. Thus the level of $mT − 1$ at which the lconstraint stops binding is:²²

′ ′ u(mT − 1) = 𝔳 T−1(0) mT −1 = (𝔳′ (0))(−1∕ρ) T −1 = 𝔠T− 1(0 ).

pict

Figure 18:Constrained (solid) and Unconstrained (dashed) Consumption

The constrained problem is solved by 2periodIntExpFOCInvPesReaOptCon.m; the resulting consumption rule is shown in Figure 18. For comparison purposes, the approximate consumption rule from Figure 18 is reproduced here as the solid line. The presence of the liquidity constraint requires three changes to the procedures outlined above:

We redeﬁne $𝔥- t$ , which now is the PDV of receiving $𝜃𝜃𝜃t+1 = 𝜃𝜃𝜃-$ next period and $𝜃𝜃𝜃t+n = 0 ∀ n > 1$ – that is, the pessimist believes he will receive nothing beyond period $t + 1$
We augment the end-of-period aVec with zero and with a point with a small positive value so that the generated mVec will the binding point $m#$ and a point just above it (so that we can better capture the curvature around that point)
We redeﬁne the optimal consumption rule as in equation (51). This ensures that the liquidity-constrained ‘realist’ will consume more than the redeﬁned ‘pessimist,’ so that we will have $ϙ$ still between $0$ and $1$ and the ‘method of moderation’ will proceed smoothly.

As expected, the liquidity constraint only causes a divergence between the two functions at the point where the optimal unconstrained consumption rule runs into the 45 degree line.

6 Recursion

6.1 Theory

Before we solve for periods earlier than $T − 1$ , we assume for convenience that in each such period a liquidity constraint exists of the kind discussed above, preventing $c$ from exceeding $m$ . This simpliﬁes things a bit because now we can always consider an aVec that starts with zero as its smallest element.

Recall now equations (11) and (12):

𝔳′(a) = 𝔼 [βR ΦΦΦ −ρu ′(c (ℛ a + 𝜃𝜃𝜃 ))] t′ t t′ t+1 t+1 t+1 t t+1 u (ct) = 𝔳 t(mt − ct).

Assuming that the problem has been solved up to period $t + 1$ (and thus assuming that we have an approximated $`c (m ) t+1 t+1$ ), our solution method essentially involves using these two equations in succession to work back progressively from period $T − 1$ to the beginning of life. Stated generally, the method is as follows. (Here, we use the original, rather than the “reﬁned,” method for constructing consumption functions; the generalization of the algorithm below to use the reﬁned method presents no diﬃculties.)

For the grid of values $at,i$ in aVec $t$ , numerically calculate the values of $𝔠t(at,i)$ and $𝔠at(at,i)$ ,
$′ −1∕ρ 𝔠t,i = (𝔳t(at,i)) , ( [ −ρ −ρ])− 1∕ρ = β 𝔼t RΦΦΦ t+1(`ct+1(ℛt+1at,i + 𝜃𝜃𝜃t+1)) , 𝔠a = − (1∕ ρ)(𝔳′(a ))−1−1∕ρ𝔳 ′′(a ), t,i t t,i t t,i$ (52)

generating vectors of values $⃗𝔠t$ and $⃗𝔠at$ .
Construct a corresponding list of values of $ct,i$ and $mt,i$ from $ct,i = 𝔠t,i$ and $mt,i = ct,i + at,i$ ; similarly construct a corresponding list of $κt,i$ using equation (38).
Construct a corresponding list of $μt,i$ , the levels and ﬁrst derivatives of $ϙt,i$ , and the levels and ﬁrst derivatives of $χt,i$ .
Construct an interpolating approximation $χ`t$ that smoothly matches both the level and the slope at those points.
If we are to approximate the value function, construct a corresponding list of values of $vt,i$ , the levels and ﬁrst derivatives of $Ϙt,i$ , and the levels and ﬁrst derivatives of $ˆXXXt,i$ ; and construct an interpolating approximation $ˆXXXt$ that matches those points.

With $χ`t$ in hand, our approximate consumption function is computed directly from the appropriate substitutions in (33) and related equations. With this consumption rule in hand, we can continue the backwards recursion to period $t − 1$ and so on back to the beginning of life.

Note that this loop does not contain steps for constructing $′ ˆvt(mt )$ . This is because with $`ˆct(mt)$ in hand, we simply deﬁne $ˆv ′t(mt ) = u′(`ˆct(mt ))$ so there is no need to construct interpolating approximations - the function arises ‘free’ (or nearly so) from our constructed $`ˆct(mt)$ .

The program multiperiodCon.m²³ presents a fairly general and ﬂexible approach to solving problems of this kind. The essential structure of the program is a loop that simply works its way back from an assumed last period of life, using the command AppendTo to record the interpolated $`χt$ functions in the earlier time periods back from the end. For a realistic life cycle problem, it would also be necessary at a minimum to calibrate a nonconstant path of expected income growth over the lifetime that matches the empirical proﬁle; allowing for such a calibration is the reason we have included the ${ΦΦΦ}Tt$ vector in our computational speciﬁcation of the problem.

6.2 Mathematica Background

Mathematica has several features that are useful in solving the multiperiod problem.

It can treat a user-created function as an object just like a number or a character.
Mathematica uses the ‘list’ as its basic data structure. A Mathematica ‘list’ is a very powerful and ﬂexible data construct. A list of length N in Mathematica can hold essentially anything in each of its $N um$ positions - a function, a number, another list, a symbolic expression, or any other object that Mathematica can recognize. The items at position $i$ in a list named ExampleList are retrieved or addressed using the syntax ExampleList[[i]].
The function Apply[FuncName_, DataListName_] takes the function whose name is FuncName (for example, Vt) and the data in DataListName (for example, ${1,19}$ ) and returns the result that would have been returned by calling the function Vt[1,19].
The function Map[FuncToApply_,DataToApplyItTo_] takes a list of possible arguments to the function FuncToApply and applies that function to each of the elements of that list sequentially. For example, Map[Sin,{1,2,3}] would return a list {Sin[1],Sin[2],Sin[3]}.

6.3 Program Structure

After the usual initializations, the heart of the program works like this.

6.3.1 Iteration

After setting up a variable PeriodsToSolve which deﬁnes the total number of periods that the program will solve, the program sets up a “Do[SolveAnotherPeriod,{PeriodsToSolve}]” loop that runs the function SolveAnotherPeriod the number of times corresponding to PeriodsToSolve. Every time SolveAnotherPeriod is run, the interpolated consumption function for one period of life earlier is calculated. The structure of the SolveAnotherPeriod function is as follows:

Add various period-t parameters to their respective lifecycle lists, which is accomplished by calling the AddNewPeriodToParamLifeDates function.
For each $at,i$ in aVec, construct $𝔠$ as follows:
$( [ −ρ − ρ] )−1∕ρ 𝔠t(at,i) = β 𝔼t RΦΦΦt+1(`ˆct+1(ℛt+1at,i + 𝜃𝜃𝜃t+1 )) ( )− 1∕ρ 1 n∑𝜃𝜃𝜃 ( −ρ −ρ) = β --- R ΦΦΦ t+1(`ˆct+1(ℛt+1at,i + 𝜃𝜃𝜃i)) . n𝜃𝜃𝜃 i=1$ (53)

and similarly construct the corresponding $𝔠at(at,i)$ We also construct the corresponding mVec, $κ$ Vec, etc. by calling the AddNewPeriodToSolvedLifeDates function.
For each $m$ in mVec, we can deﬁne $▴m$ Vec, ﬁnd the corresponding optimal consumption vector for a pessimist and an optimist, construct the $ϙ$ and $χ$ vectors, and ﬁnally an interpolation function $χ`t$ . Similarly we can construct an interpolation function $`XXXt$ that approximates the value function. The whole process is done by calling the AddNewPeriodToSolvedLifeDatesPesReaOpt function.
Various period- $t$ functions are derived from $`χt$ and $` XXXt$ (in functions_ConsNVal.m). Note that the liquidity constraint is dealt with by comparing the unconstrained solution $c$ From $χ$ with the 45 degree line.

6.4 Results

As written, the program creates $`χ (μ ) t t$ functions from which the relevant $`c (m ) t t$ functions are recovered in any period for any value of $m$ .

pict

Figure 19:Converging $`cT −n(m )$ Functions as $n$ Increases

As an illustration, Figure 19 shows $`cT −n(m )$ for $n = {20,15,10, 5,1}$ . At least one feature of this ﬁgure is encouraging: the consumption functions converge as the horizon extends, something that Carroll (2022) shows must be true under certain parametric conditions that are satisﬁed by the baseline parameter values being used here.

7 Multiple Control Variables

We now consider how to solve problems with multiple control variables. (To reduce notational complexity, in this section we set $ΦΦΦt = 1 ∀ t$ .)

7.1 Theory

The new control variable that the consumer can now choose is the portion of the portfolio to invest in risky assets. Designating the gross return on the risky asset as $Rt+1$ , and using $ςt$ to represent the proportion of the portfolio invested in this asset between $t$ and $t + 1$ (restricted here, as often in the literature, to values between 0 and 1, corresponding to an assumption that the consumer cannot be ‘net short’ and cannot issue net equity), the overall return on the consumer’s portfolio between $t$ and $t + 1$ will be:

Rt+1 = R(1 − ςt) + Rt+1 ςt = R + (R − R)ς t+1 t

(54)

and the maximization problem is

v (m ) = max u(c ) + β 𝔼 [v (m )] t t {ct,ςt} t t t+1 t+1 s.t. Rt+1 = R + (Rt+1 − R)ςt m = (m − c)R + 𝜃𝜃𝜃 t+1 t t t+1 t+1 0 ≤ςt ≤ 1,

or, more compactly,

vt(mt) = m{acxt,ςt} u(ct) + 𝔼t[βvt+1((mt − ct)Rt+1 + 𝜃𝜃𝜃t+1)] s.t. 0 ≤ ςt ≤ 1.

The ﬁrst order condition with respect to $ct$ is almost identical to that in the single-control problem, equation (5), with the only diﬀerence being that the nonstochastic interest factor $R$ is now replaced by $Rt+1$ ,

′ ′ u (ct) = β 𝔼t[Rt+1v t+1(mt+1 )],

(55)

and the Envelope theorem derivation remains the same, yielding the Euler equation for consumption

u′(ct) = 𝔼t[βRt+1u ′(ct+1)].

(56)

The ﬁrst order condition with respect to the risky portfolio share is

′ 0 = 𝔼t[v t+1(mt+1 )(Rt+1 − R )at] = at𝔼t[u′(ct+1(mt+1))(Rt+1 − R)].

As before, it will be useful to deﬁne $𝔳t$ as a function that yields the expected $t + 1$ value of ending period $t$ in a given state. However, now that there are two control variables, the expectation must be deﬁned as a function of the chosen values of both of those variables, because expected end-of-period value will depend not just on how much the agent saves, but also on how the saved assets are allocated between the risky and riskless assets. Thus we deﬁne

𝔳t(at,ςt) = 𝔼t [βvt+1 (mt+1 )]

which has derivatives

𝔳a = 𝔼 [βR vm (m )] = 𝔼 [βR u ′ (c (m ))] tς t t+1 t+1 t+m1 t t+1 t+1 t+1 t+1 ′ 𝔳t = 𝔼t[β(Rt+1 − R)vt+1(mt+1)]at = 𝔼t[β(Rt+1 − R )ut+1(ct+1(mt+1 ))]at

implying that the ﬁrst order conditions (56) and (57) can be rewritten

′ a u (ct) = 𝔳t(mt − ct,ςt) 0 = 𝔳ςt(at,ςt).

(57)

7.2 Application

Our ﬁrst step is to specify the stochastic process for $Rt+1$ . We follow the common practice of assuming that returns are lognormally distributed, $logR ∼ 𝒩 (φ + r − σ2φ∕2,σ2φ)$ where $φ$ is the equity premium over the returns $r$ available on the riskless asset.²⁴

As with labor income uncertainty, it is necessary to discretize the rate-of-return risk in order to have a problem that is soluble in a reasonable amount of time. We follow the same procedure as for labor income uncertainty, generating a set of $nr$ equiprobable shocks to the rate of return; in a slight abuse of notation, we will designate the portfolio-weighted return (contingent on the chosen portfolio share in equity, and potentially contingent on any other aspect of the consumer’s problem) simply as $Ri,j$ (where dependence on $i$ is allowed to permit the possibility of nonzero correlation between the return on the risky asset and the shock to labor income (for example, in recessions the stock market falls and labor income also declines).

The direct expressions for the derivatives of $𝔳t$ are

( ) n∑𝜃𝜃𝜃 ∑nr 𝔳a(a ,ς) = β --1-- R (c (R a + 𝜃𝜃𝜃 ))−ρ t t t nrn𝜃𝜃𝜃 i,j t+1 i,j t i ( ) i=1 j=1 ς 1 n∑𝜃𝜃𝜃 ∑nr −ρ 𝔳t(at,ςt) = β ----- (Ri,j − R )(ct+1(Ri,jat + 𝜃𝜃𝜃i)) . nrn𝜃𝜃𝜃 i=1 j=1

(58)

Writing these equations out explicitly makes a problem very apparent: For every diﬀerent combination of ${at,ςt}$ that the routine wishes to consider, it must perform two double-summations of $nr × n$ terms. Once again, there is an ineﬃciency if it must perform these same calculations many times for the same or nearby values of ${a ,ς } t t$ , and again the solution is to construct an approximation to the derivatives of the $𝔳$ function.

Details of the construction of the interpolating approximation are given below; assume for the moment that we have the approximations $ˆ𝔳a t$ and $ˆ𝔳ς t$ in hand and we want to proceed. As noted above, nonlinear equation solvers (including those built into Mathematica) can ﬁnd the solution to a set of simultaneous equations. Thus we could ask Mathematica to solve

−ρ a ct = ˆ𝔳t(mt − ct,ςt) 0 = ˆ𝔳ςt(mt − ct,ςt)

(59)

simultaneously for $c$ and $ς$ at the set of potential $mt$ values deﬁned in mVec. However, multidimensional constrained maximization problems are diﬃcult and sometimes quite slow to solve. There is a better way. Deﬁne the problem

˜𝔳t(at) = maxςt 𝔳t(at,ςt) s.t. 0 ≤ ςt ≤ 1

where the typographical diﬀerence between $˜𝔳$ and $𝔳$ indicates that this is the $𝔳$ that has been optimized with respect to all of the arguments other than the one still present ( $at$ ). We solve this problem for the set of gridpoints in aVec and use the results to construct the interpolating function $`a ˜𝔳t(at)$ .²⁵ With this function in hand, we can use the ﬁrst order condition from the single-control problem

−ρ ct = `˜𝔳at(mt − ct)

to solve for the optimal level of consumption as a function of $mt$ . Thus we have transformed the multidimensional optimization problem into a sequence of two simple optimization problems for which solutions are much easier and more reliable.

Note the parallel between this trick and the fundamental insight of dynamic programming: Dynamic programming techniques transform a multi-period (or inﬁnite-period) optimization problem into a sequence of two-period optimization problems which are individually much easier to solve; we have done the same thing here, but with multiple dimensions of controls rather than multiple periods.

7.3 Implementation

The program which solves the constrained problem with multiple control variables is multicontrolCon.m.

Some of the functions deﬁned in multicontrolCon.m correspond to the derivatives of $𝔳t(at,ςt)$ .

The ﬁrst function deﬁnition that does not resemble anything in multiperiod.m is $ς$ Raw[at_]. This function, for its input value of $at$ , calculates the value of the portfolio share $ςt$ which satisﬁes the ﬁrst order condition (59), tests whether the optimal portfolio share would violate the constraints, and if so resets the portfolio share to the constrained optimum. The function returns the optimal value of the portfolio share itself, $ς∗t$ , from which the functions $¯𝔳at(at)$ and $ˆςt(at)$ will be constructed.

As $ˆςt(at)$ can be constructed by $ς$ Raw[at_], $a ¯𝔳 t(at)$ is constructed by another newly deﬁned function $𝔳$ aOpt[at_], where the naming convention is obviously that ‘Opt’ stands for ‘Optimized.’ With $¯𝔳a(at) t$ in hand (as well as the appropriately redeﬁned $¯𝔳t(at)$ and $¯𝔳aa(a ) t t$ ) the analysis is essentially identical to that for the standard multiperiod problem with a single control variable.

The structure of the program in detail is as follows. First, perform the usual initializations. Then initialize $ς$ Vec and the other variables speciﬁc to the multiple control problem.²⁶ In particular, there are now three kinds of functions: those with both $at$ and $ςt$ as arguments, those with just $at$ , and those with $mt$ .

Once the setup is complete, the heart of the program is the following.

Construct $𝔳ς(a ,ς ) t t t$ using the usual calculation over the tensor deﬁned by the combinations of the elements of aVec and $ς$ Vec.
For any level of saving at, the function $ς$ Raw[at_] performs a rootﬁnding operation²⁷
$0 = 𝔳ς(a,ς ) t t t s.t. 0 ≤ ςt ≤ 1$

and generates the corresponding optimal portfolio share $ς∗t$ .
Construct the function $˜𝔳$ a[at_]
$a a ∗ ˜𝔳t(at) ≡ 𝔳 t(at,ςt (at))$ (60)

where $∗ ςt (at)$ is computed by $ς$ Raw[at_].
Using $˜𝔳at(at) ≡$ $˜𝔳$ a[at_] and the redeﬁned $˜𝔳t(at)$ and $˜𝔳ata(at)$ (in place of $𝔳at(at) ≡$ $𝔳$ a[at_] in multiperiod.m), follow the same procedures as in multiperiod.m to generate $`c(m ) t$ .

7.4 Results

Figure 20 plots the ﬁrst-period consumption function generated by the program; qualitatively it does not look much diﬀerent from the consumption functions generated by the program without portfolio choice. Figure 21 plots the optimal portfolio share as a function of the level of assets. This ﬁgure exhibits several interesting features. First, even with a coeﬃcient of relative risk aversion of 6, an equity premium of only 4 percent, and an annual standard deviation in equity returns of 15 percent, the optimal choice is for the agent to invest a proportion 1 (100 percent) of the portfolio in stocks (instead of the safe bank account with riskless return $R$ ) is at values of $at$ less than about 2. Second, the proportion of the portfolio kept in stocks is declining in the level of wealth - i.e., the poor should hold all of their meager assets in stocks, while the rich should be cautious, holding more of their wealth in safe bank deposits and less in stocks. This seemingly bizarre (and highly counterfactual) prediction reﬂects the nature of the risks the consumer faces. Those consumers who are poor in measured ﬁnancial wealth are likely to derive a high proportion of future consumption from their labor income. Since by assumption labor income risk is uncorrelated with rate-of-return risk, the covariance between their future consumption and future stock returns is relatively low. By contrast, persons with relatively large wealth will be paying for a large proportion of future consumption out of that wealth, and hence if they invest too much of it in stocks their consumption will have a high covariance with stock returns. Consequently, they reduce that correlation by holding some of their wealth in the riskless form.

pict

Figure 20: $c(m1 )$ With Portfolio Choice

pict

Figure 21:Portfolio Share in Risky Assets in First Period $ς(a)$

8 The-Inﬁnite-Horizon

All of the solution methods presented so far have involved period-by-period iteration from an assumed last period of life, as is appropriate for life cycle problems. However, if the parameter values for the problem satisfy certain conditions (detailed in Carroll (2022)), the consumption rules (and the rest of the problem) will converge to a ﬁxed rule as the horizon (remaining lifetime) gets large, as illustrated in Figure 19. Furthermore, Deaton (1991), Carroll (1992; 1997) and others have argued that the ‘buﬀer-stock’ saving behavior that emerges under some further restrictions on parameter values is a good approximation of the behavior of typical consumers over much of the lifetime. Methods for ﬁnding the converged functions are therefore of interest, and are dealt with in this section.

Of course, the simplest such method is to solve the problem as speciﬁed above for a large number of periods. This is feasible, but there are much faster methods.

8.1 Convergence

In solving an inﬁnite-horizon problem, it is necessary to have some metric that determines when to stop because a solution that is ‘good enough’ has been found.

A natural metric is deﬁned by the unique ‘target’ level of wealth that Carroll (2022) proves will exist in problems of this kind: The $mˆ$ such that

𝔼t [mt+1 ∕mt ] = 1 if mt = ˆm

(61)

where the $∨$ accent is meant to signify that this is the value that other $m$ ’s ‘point to.’

Given a consumption rule $c(m )$ it is straightforward to ﬁnd the corresponding $ˆm$ . So for our problem, a solution is declared to have converged if the following criterion is met: $|mˆt+1 − ˆmt| < 𝜖$ , where $𝜖$ is a very small number and measures our degree of convergence tolerance.

Similar criteria can obviously be speciﬁed for other problems. However, it is always wise to plot successive function diﬀerences and to experiment a bit with convergence criteria to verify that the function has converged for all practical purposes.

9 Structural Estimation

This section describes how to use the methods developed above to structurally estimate a life-cycle consumption model, following closely the work of Cagetti (2003).²⁸ The key idea of structural estimation is to look for the parameter values (for the time preference rate, relative risk aversion, or other parameters) which lead to the best possible match between simulated and empirical moments. (The code for the structural estimation is in the self-contained subfolder StructuralEstimation in the Matlab and Mathematica directories.)

9.1 Life Cycle Model

The decision problem for the household at age $t$ is:

{ [ T ( ) ]} ∑ s−t s ˆ max u (ct) + 𝔼t ℶ Π i=t+1βiℒi u(cs) s=t+1

(62)

subject to the constraints

as = ms − cs ms+1 = Ras + Ys+1 Y = p 𝜃𝜃𝜃 s+1 s+1 s+1 ps+1 = ΦΦΦs+1ps Ψs+1

where

ℒs : probability alive (not dead ) until age s given alive at age s − 1 ˆ βs : time-varying discount factor between age s − 1 and s Ψs : mean -one shock to permanent income ℶ : time-invariant discount factor

and all the other variables are deﬁned as in section 2.

Households start life at age $s = 25$ and live with probability 1 until retirement ( $s = 65$ ). Thereafter the survival probability shrinks every year and agents are dead by $s = 91$ as assumed by Cagetti. Note that in addition to a typical time-invariant discount factor $ℶ$ , there is a time-varying discount factor $ˆ βs$ in (62) which captures the eﬀect of time-varying demographic variables (e.g. changes in family size).

Transitory and permanent shocks are distributed as follows:

{ Ξs = 0 with probability ℘ > 0 𝜃𝜃𝜃s∕℘ with probability (1 − ℘), where log𝜃𝜃𝜃s ∽ 𝒩 (− σ2𝜃𝜃𝜃∕2, σ2𝜃𝜃𝜃) 2 2 log Ψs ∽ 𝒩 (− σΨ ∕2,σΨ )

(63)

where $℘$ is the probability of unemployment (and unemployment shocks are turned oﬀ after retirement).

The parameter values for the shocks are taken from Carroll (1992), $℘ = 0.5∕100$ , $σ 𝜃𝜃𝜃 = 0.1$ , and $σ Ψ = 0.1$ .²⁹ The income growth proﬁle $ΦΦΦ s$ is from Carroll (1997) and the values of $ℒs$ and $ˆ βs$ are obtained from Cagetti (2003) (Figure 22).³⁰ The interest rate is assumed to equal $1.03$ . The model parameters are included in Table 1.

pict

Figure 22:Time Varying Parameters

------------------------------------
σ𝜃𝜃𝜃 0.1 Carroll (1992)
σ Ψ 0.1 Carroll (1992)

℘ 0.005 Carroll (1992)
ΦΦΦs ﬁgure 22 Carroll (1997)
ˆβs,ℒs ﬁgure 22 Cagetti (2003 )
R 1.03 Cagetti (2003 )
------------------------------------ — Table 1:Parameter Values

The parameters $ℶ$ and $ρ$ are structurally estimated following the procedure described below.

9.2 Estimation

When economists say that they are performing “structural estimation” of a model like this, they mean that they have devised a formal procedure for searching for values for the parameters $ℶ$ and $ρ$ at which some measure of the model’s outcome (like “median wealth by age”) is as close as possible to an empirical measure of the same thing. Here, we choose to match the median of the wealth to permanent income ratio across 7 age groups, from age $26 − 30$ up to $56 − 60$ .³¹ The choice of matching the medians rather the means is motivated by the fact that the wealth distribution is much more concentrated at the top than the model is capable of explaining using a single set of parameter values. This means that in practice one must pick some portion of the population who one wants to match well; since the model has little hope of capturing the behavior of Bill Gates, but might conceivably match the behavior of Homer Simpson, we choose to match medians rather than means.

As explained in section 3, it is convenient to work with the normalized version the model which can be written as:

{ } ˆ 1−ρ vt(mt ) = macxt u(ct) + ℶ ℒt+1βt+1𝔼t [(Ψt+1 ΦΦΦt+1) vt+1(mt+1 )] s.t. at = mt − ct ( R ) mt+1 = at ---------- + 𝜃𝜃𝜃t+1 ◟-Ψt+1◝◜ΦΦΦt+1-◞ ≡ℛt+1

with the ﬁrst order condition:

u′(c ) = ℶ ℒ βˆ R 𝔼 [u ′(Ψ ΦΦΦ c (a ℛ + 𝜃𝜃𝜃 ))]. t t+1 t+1 t t+1 t+1 t+1 t t+1 t+1

(64)

The ﬁrst step is to solve for the consumption functions at each age using the routines included in the setup_ConsFn.m ﬁle. We need to discretize the shock distribution and solve for the policy functions by backward induction using equation (64) following the procedure in sections 5 and 6 (ConstructcFuncLife). The latter routine is slightly complicated by the fact that we are considering a life-cycle model and therefore the growth rate of permanent income, the probability of death, the time-varying discount factor and the distribution of shocks will be diﬀerent across the years. We thus must ensure that at each backward iteration the right parameter values are used.

Once we have the age varying consumption functions, we can proceed to generate the simulated data and compute the simulated medians using the routines deﬁned in the setup_Sim.m ﬁle. We ﬁrst have to draw the shocks for each agent and period. This involves discretizing the shock distribution for as many points as the number of agents we want to simulate (ConstructShockDistribution). We then randomly permute this shock vector as many times as we need to simulate the model for, thus obtaining a time varying shock for each agent (ConstructSimShocks). This is much more time eﬃcient than drawing at each time from the shock distribution a shock for each agent, and also ensures a stable distribution of shocks across the simulation periods even for a small number of agents. (Similarly, in order to speed up the process, at each backward iteration we compute the consumption function and other variables as a vector at once.) Then, following Cagetti (2003), we initialize the wealth-to-income ratio of agents at age $25$ by randomly assigning the equal probability values to $0.17$ , $0.50$ and $0.83$ and run the simulation (Simulate). In particular we consider a population of agents at age 25 and follow their consumption and wealth accumulation dynamics as they reach the age of $60$ , using the appropriate age-speciﬁc consumption functions and the age-varying parameters. The simulated medians are obtained by taking the medians of the wealth to income ratio of the $7$ age groups.

Given these simulated medians, we can estimate the model by calculating empirical medians and measure the model’s success by calculating the diﬀerence between the empirical median and the actual median. Speciﬁcally, deﬁning $ξ$ as the set of parameters to be estimated (in the current case $ξ = {ρ,ℶ }$ ), we could search for the parameter values which solve

∑ 7 min |ςτ − sτ(ξ)| ξ τ=1

(65)

where $ςτ$ and $sτ$ are respectively the empirical and simulated medians of the wealth to permanent income ratio for age group $τ$ .

A drawback of proceeding in this way is that it treats the empirically estimated medians as though they reﬂected perfect measurements of the truth. Imagine, however, that one of the age groups happened to have (in the consumer survey) four times as many data observations as another age group; then we would expect the median to be more precisely estimated for the age group with more observations; yet (65) assigns equal importance to a deviation between the model and the data for all age groups.

We can get around this problem (and a variety of others) by instead minimizing a slightly more complex object:

∑N min ω |ςτ− sτ(ξ)| ξ i i i

(66)

where $ωi$ is the weight of household $i$ in the entire population,³² and $ςτ i$ is the empirical wealth-to-permanent-income ratio of household $i$ whose head belongs to age group $τ$ . $ω i$ is needed because unequal weight is assigned to each observation in the Survey of Consumer Finances (SCF). The absolute value is used since the formula is based on the fact that the median is the value that minimizes the sum of the absolute deviations from itself.

The actual data are taken from several waves of the SCF and the medians and means for each age category are plotted in ﬁgure 23. More details on the SCF data are included in appendix A.

pict

Figure 23:Wealth to Permanent Income Ratios from SCF (means (dashed) and medians (solid))

The key function to perform structural estimation is deﬁned in the setup_Estimation.m ﬁle as follows:

GapEmpiricalSimulatedMedians [ρ,ℶ ]:= [ ConstructcFuncLife [ρ,ℶ ]; Simulate; ∑N ω |ςτ − sτ(ξ)| i i i ];

For a given pair of the parameters to be estimated, the GapEmpiricalSimulatedMedians routine therefore:

solves for the consumption functions by calling ConstructcFuncLife
simulates the data and computes the simulated medians by calling Simulate
returns the value of equation (66)

We delegate the task of ﬁnding the coeﬃcients that minimize the GapEmpiricalSimulatedMedians function to the Mathematica built-in numerical minimizer FindMinimum. This task can be quite time demanding and rather problematic if the GapEmpiricalSimulatedMedians function has very ﬂat regions or sharp features. It is thus wise to verify the accuracy of the solution, for example by experimenting with a variety of alternative starting values for the parameter search.

Finally the standard errors are computed by bootstrap using the routines in the setup_Bootstrap.m ﬁle.³³ This involves:

drawing new shocks for the simulation
drawing a random sample (with replacement) of actual data from the SCF
obtaining new estimates for $ρ$ and $ℶ$

We repeat the above procedure several times (Bootstrap) and take the standard deviation for each of the estimated parameters across the various bootstrap iterations.

The ﬁle StructEstimation.m produces our $ρ$ and $ℶ$ estimates with standard errors using 10,000 simulated agents.³⁴ Results are reported in Table 2.³⁵ Figure 24 shows the contour plot of the GapEmpiricalSimulatedMedians function and the parameter estimates. The contour plot shows equally spaced isoquants of the GapEmpiricalSimulatedMedians function, i.e. the pairs of $ρ$ and $ℶ$ which lead to the same deviations between simulated and empirical medians (equivalent values of equation (66)). We can thus interestingly see that there is a large rather ﬂat region, or more formally speaking there exists a broad set of parameter pairs which leads to similar simulated wealth to income ratios. Intuitively, the ﬂatter and larger is this region, the harder it is for the structural estimation procedure to precisely identify the parameters.

-----------------
---ρ-------ℶ-----
4.68 1.00
(0.13) (0.00)
----------------- — Table 2:Estimation Results

pict

Figure 24:Contour Plot (larger values are shown lighter) with ${ρ,ℶ }$ Estimates (red dot).

10 Conclusion

There are many alternative choices that can be made for solving microeconomic dynamic stochastic optimization problems. The set of techniques, and associated programs, described in these notes represents an approach that I have found to be powerful, ﬂexible, and eﬃcient, but other problems may require other techniques. For a much broader treatment of many of the issues considered here, see Judd (1998).

Appendices

A Further Details on SCF Data

Data used in the estimation is constructed using the SCF 1992, 1995, 1998, 2001 and 2004 waves. The deﬁnition of wealth is net worth including housing wealth, but excluding pensions and social securities. The data set contains only households whose heads are aged 26-60 and excludes singles, following Cagetti (2003).³⁶ Furthermore, the data set contains only households whose heads are college graduates. The total sample size is 4,774.

In the waves between 1995 and 2004 of the SCF, levels of normal income are reported. The question in the questionnaire is "About what would your income have been if it had been a normal year?" We consider the level of normal income as corresponding to the model’s theoretical object $P$ , permanent noncapital income. Levels of normal income are not reported in the 1992 wave. Instead, in this wave there is a variable which reports whether the level of income is normal or not. Regarding the 1992 wave, only observations which report that the level of income is normal are used, and the levels of income of remaining observations in the 1992 wave are interpreted as the levels of permanent income.

Normal income levels in the SCF are before-tax ﬁgures. These before-tax permanent income ﬁgures must be rescaled so that the median of the rescaled permanent income of each age group matches the median of each age group’s income which is assumed in the simulation. This rescaled permanent income is interpreted as after-tax permanent income. This rescaling is crucial since in the estimation empirical proﬁles are matched with simulated ones which are generated using after-tax permanent income (remember the income process assumed in the main text). Wealth / permanent income ratio is computed by dividing the level of wealth by the level of (after-tax) permanent income, and this ratio is used for the estimation.³⁷

References

Attanasio, O.P., J. Banks, C. Meghir, and G. Weber (1999): “Humps and Bumps in Lifetime Consumption,” Journal of Business and Economic Statistics, 17(1), 22–35.

Cagetti, Marco (2003): “Wealth Accumulation Over the Life Cycle and Precautionary Savings,” Journal of Business and Economic Statistics, 21(3), 339–353.

Carroll, Christopher D. (1992): “The Buﬀer-Stock Theory of Saving: Some Macroeconomic Evidence,” Brookings Papers on Economic Activity, 1992(2), 61–156, https://www.econ2.jhu.edu/people/ccarroll/BufferStockBPEA.pdf.

__________ (1997): “Buﬀer Stock Saving and the Life Cycle/Permanent Income Hypothesis,” Quarterly Journal of Economics, CXII(1), 1–56.

__________ (2006): “The Method of Endogenous Gridpoints for Solving Dynamic Stochastic Optimization Problems,” Economics Letters, 91(3), 312–320, https://www.econ2.jhu.edu/people/ccarroll/EndogenousGridpoints.pdf.

__________ (2022): “Theoretical Foundations of Buﬀer Stock Saving,” Submitted.

__________ (Current): “Math Facts Useful for Graduate Macroeconomics,” Online Lecture Notes.

Carroll, Christopher D., and Miles S. Kimball (1996): “On the Concavity of the Consumption Function,” Econometrica, 64(4), 981–992, https://www.econ2.jhu.edu/people/ccarroll/concavity.pdf.

Carroll, Christopher D., and Andrew A. Samwick (1997): “The Nature of Precautionary Wealth,” Journal of Monetary Economics, 40(1), 41–71.

Deaton, Angus S. (1991): “Saving and Liquidity Constraints,” Econometrica, 59, 1221–1248, https://www.jstor.org/stable/2938366.

den Haan, Wouter J, and Albert Marcet (1990): “Solving the Stochastic Growth Model by Parameterizing Expectations,” Journal of Business and Economic Statistics, 8(1), 31–34, Available at http://ideas.repec.org/a/bes/jnlbes/v8y1990i1p31-34.html.

Gourinchas, Pierre-Olivier, and Jonathan Parker (2002): “Consumption Over the Life Cycle,” Econometrica, 70(1), 47–89.

Horowitz, Joel L. (2001): “The Bootstrap,” in Handbook of Econometrics, ed. by James J. Heckman, and Edward Leamer, vol. 5. Elsevier/North Holland.

Judd, Kenneth L. (1998): Numerical Methods in Economics. The MIT Press, Cambridge, Massachusetts.

Kopecky, Karen A., and Richard M.H. Suen (2010): “Finite State Markov-Chain Approximations To Highly Persistent Processes,” Review of Economic Dynamics, 13(3), 701–714, http://www.karenkopecky.net/RouwenhorstPaper.pdf.

Merton, Robert C. (1969): “Lifetime Portfolio Selection under Uncertainty: The Continuous Time Case,” Review of Economics and Statistics, 51, 247–257.

Palumbo, Michael G (1999): “Uncertain Medical Expenses and Precautionary Saving Near the End of the Life Cycle,” Review of Economic Studies, 66(2), 395–421, Available at http://ideas.repec.org/a/bla/restud/v66y1999i2p395-421.html.

Samuelson, Paul A. (1969): “Lifetime Portfolio Selection by Dynamic Stochastic Programming,” Review of Economics and Statistics, 51, 239–46.

Valencia, Fabian (2006): “Banks’ Financial Structure and Business Cycles,” Ph.D. thesis, Johns Hopkins University.

A Wealth In Utility Model

This appendix considers how to solve a model with a utility function that allows a role for ﬁnancial balances distinct from the implications those balances might have for consumption expenditures: $u(c,a)$ .

For purposes of articulating the exact structure of the model it is useful to deﬁne a sequence for events that are in principle simultaneous. These correspond essentially to a set of steps that will be executed in order by the code solving (or simulating) the model. Notationally the sequence of steps can be indexed by time gaps of inﬁnitesimal duration $𝜖$ . We conceive of the sequence of events in the period as follows, where for example the notation $2𝜖$ means that the event is conceived of has happening two instants after the beginning of the period and because the period is of total duration 1, $1 − 1𝜖$ means that the event is conceived of as happening an instant before the end of the period:

The period begins with the consumer knowing the ratio of end-of-last-period assets to permanent income. Call this :
$k = e t+0𝜖 t− 1𝜖$ (67)
- If this is the initial period, there is no $et−1𝜖$ because there was no previous period
- In that case, some assumption must be made about the starting value of $k$
The consumer learns the transitory () and permanent () shocks
- In simulation, the consumer’s permanent income is updated according to
  $pt = ΦΦΦtpt− 1Ψt$ (68)
- $ΦΦΦ$ is the predictable (say, life cycle) growth factor
- Because the problem is normalized by permanent income, we do not need to keep track of $p$ during the solution stage
- Normalized income is therefore $𝜃 yt+1𝜖 = 𝜃𝜃t+1𝜖$
The consumer calculates normalized bank balances
$bt+2𝜖 = kt∕(ΦΦΦtΨt )$ (69)
Market resources are determined
$mt+3 𝜖 = bt+2𝜖 + yt+1𝜖$ (70)
The consumer decides how much to consume for the year
- We imagine that this amount is immediately deducted from market resources; concretely, imagine that the amount $c t+4𝜖$ is put into an untouchable escrow account
- The escrow account funds a constant ﬂow rate of consumption that will be maintained throughout the interval from $t + 4𝜖$ to $t + 1 − 𝜖$
The consumption decision determines the amount of investable assets:
$at+5𝜖 = mt+3𝜖 − ct+4𝜖$ (71)
The consumer makes a decision about the proportion of assets $at+5𝜖𝜖$ to invest in the risky asset, $ς˜ t+6𝜖$
The rate of return on the risky asset $˜ R$ for the year is determined; combining this with the riskless (but potentially time-varying) return $˜Rt$ , this yields the portfolio return
End-of-period (December 31 11:59:59 $¯ 9$ ) ﬁnancial balances are

The model in the main text considers the problem at the point we have designated $4𝜖$ above: After the realization of all of the stochastic variables that determine $m$ .

Using this notation we can now unambiguously deﬁne period- $t$ post-all-decisions but pre-realization-of-returns expected value as being calculable immediately after the portfolio share has been chosen (and as in the main text we use a Gothic font for this $v$ because the Goths ﬂourished after the Romans):

Having now established this conceptual sequence, we can dispense with the $𝜖$ timing conventions for all variables, and simply use a $t$ subscript to denote the value of any variable determined at any point within the period, leaving the reader to remember the logic of the implicit timing above. For example, beginning-of-period- $(t + 1)$ (January 01 12:00:00) ﬁnancial capital is the same as end-of-period- $t$ assets because an inﬁnitesimal amount of time separates them:

since the value of $et$ is in principle determined at the last instant of period $t$ ; that is, by $et$ we expect the reader to understand us to mean what we wrote more elaborately as $et+1−1𝜖$ above. Likewise, in the simpler notation, we can rewrite (75) more compactly as

Now we can imagine inserting another step (in principle, between steps $5𝜖$ and $6𝜖$ ; but now that our timing is clear, we will use the simpler notation) to calculate the optimal risky share as the share that maximizes expected value:

which lets us construct a function $∗ 𝔳t(at)$ that calculates expected-value-given-optimal-portfolio-choice (with the asterisk accent indicating this is the maximum):

whose derivative is calculable as

Collecting all of this, in our new notation the Roman-step problem is

with FOC

and the Envelope theorem says

To make further progress, we now must specify the structure of the utility function. We consider two utility speciﬁcations, respectively called CobbDouglas and CDC. The CDC function is designed to capture the following:

Remain homothetic so that the problem scales
Allow diﬀerent relative risk aversion for ﬂuctuations in wealth $ϱ$ versus in consumption $ρ$
Allow utility weights for consumption and wealth that are independently determined from their risk aversions

and the relationship between $ua$ and $uc$ allows us to write

In the CobbDouglas value function, relative risk aversion with respect to (proportional) ﬂuctuations in $a$ (for a ﬁxed $c$ ) is given by $δρ$ , while relative risk aversion with respect to (proportional) ﬂuctuations in $c$ (for a ﬁxed $a$ ) is given by $(1 − δ)ρ$ . Suppose we calibrate $δ$ to 1/3, so that in the last period of life a consumer who faced no risk would choose to set $a = (1∕2 )c$ . In such a case, when we consider introducing rate of return risk, the consumer’s relative aversion to consumption risk will be twice as large as their relative aversion to ﬂuctuations in ﬁnancial balances.

A.1 Solution

Now, as in the main text, designate a matrix of values of end-of-period assets in $[a]i$ for $i ∈ {0,...,I − 1 }$ and for each element in $i$ compute the corresponding matrix of values of Gothic maximized value (for the particular problem we have speciﬁed, these matrices will have only one dimension – they will be vectors):

That is, $∗a ∗a [𝔳t]i = 𝔳t([a ]i) ∀ i$ .

A.2 CobbDouglas Model

Now suppose for convenience we deﬁne $`ρ = ρ (1 − δ) + δ$ so that

and we deﬁne a pseudo-inverse function

Then

Now we use the fact that for an optimizing consumer

For any ﬁxed $a > 0$ this is a nonlinear equation that can be solved for the unique $c$ that satisﬁes it. (See below for discussion of options for solving the equation).

Deﬁne the ‘consumed’ function obtained in that manner as $𝔠 (a) t$ .

Now for convenience deﬁne a matrix of values of $−1 μ$ calculated at the values in $[a]$ :

So for any given $[a]i$ we must have

We can now construct a consumption function $ct+3𝜖(mt+3𝜖) ≡ ct(mt)$ that corresponds to the Roman period (by interpolating among using ${[m ],[c]}$ ).

If income is nonstochastic, say at $yt = 1$ , we can (if we like) deﬁne $[b] = [m ] − 1$ and construct the Greek version of the solution from

and so on.

A.3 CDC Utility Speciﬁcation

If we deﬁne a pseudo-inverse function

and a corresponding

as in the main text, we can construct a list of gridpoints and an interpolating consumption function as in the basic model in the main text.