Michael Kitces wrote an intriguing article in 2008, which notably quantified the (empirical) relationship between the Cyclically Adjusted PE ratio (aka CAPE) and safe withdrawal rates (SWR) of subsequent retirement cycles. This blog article extends this study, adding ten more years of data (i.e. up to 2017), and then ponders the practical applicability of such findings.
CAPE and Realized Returns
Professor Shiller’s popular valuation metric, CAPE (also known as PE10), is the ratio of price per share to earnings per share, where (inflation-adjusted) earnings are averaged over the prior 10 years. It is typically used to assess whether a given market appears to be strongly overvalued or undervalued. Past crises in the US like the Great Depression (1929) or the Internet bubble (2000) were indeed characterized by exceptionally high CAPE numbers right before the crash occurred, as illustrated by the following chart, which shows the S&P 500 CAPE trajectory over the years.
Note: This graph only uses annual values (beginning of the year). Similar graphs from Multpl.com provide more granularity, showing intra-year peaks. The computations in this entire blog are based on annual values, for simplicity. The historical average is the average since the first known year (1881).
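As a minimal sketch of the CAPE definition given above (price divided by the average of the prior 10 years of earnings), the following Python function may help. The function name and the sample figures are hypothetical and for illustration only; they are not actual S&P 500 data.

```python
def cape(real_price, real_earnings_history):
    """Price divided by the average of the last 10 annual earnings figures.

    Both inputs are assumed to be expressed in the same inflation-adjusted
    units; real_earnings_history is ordered from oldest to most recent.
    """
    if len(real_earnings_history) < 10:
        raise ValueError("need at least 10 years of earnings")
    e10 = sum(real_earnings_history[-10:]) / 10
    return real_price / e10


# Hypothetical figures, for illustration only (not real index data).
earnings = [80, 85, 60, 70, 90, 95, 100, 105, 98, 117]
print(cape(2700, earnings))  # average earnings = 90, so 2700 / 90 = 30.0
```

The inverse of this quantity (E10 divided by price) is the earnings yield discussed later in this article.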
Many people have wondered how closely a given CAPE value relates to future (realized) returns. The following graphs are based on computing, for each starting year since 1881, the subsequent annualized return of the S&P 500 index over N years (N ranging from 1 to 25). The hypothesis is that the CAPE value, at the beginning of the starting year, is somewhat correlated with future returns. The correlation metric displayed on the graph is ‘R2’ (which is the square of the regular correlation coefficient).
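The computation described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the author's actual spreadsheet or code: the two input dictionaries (CAPE at the start of each year, and the index return for each year) and all function names are hypothetical.

```python
import math


def annualized_return(yearly_returns):
    """Geometric average of a sequence of yearly returns (0.05 means 5%)."""
    growth = math.prod(1 + r for r in yearly_returns)
    return growth ** (1 / len(yearly_returns)) - 1


def r_squared(xs, ys):
    """Square of the usual (Pearson) correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R2 is undefined when one series is constant")
    return sxy * sxy / (sxx * syy)


def cape_vs_forward_returns(cape_by_year, return_by_year, horizon):
    """R2 between start-of-year CAPE and the annualized return of the
    following `horizon` years, using only starting years with a full window."""
    xs, ys = [], []
    for year in sorted(cape_by_year):
        window = [return_by_year.get(year + k) for k in range(horizon)]
        if None in window:
            continue  # not enough subsequent data for this starting year
        xs.append(cape_by_year[year])
        ys.append(annualized_return(window))
    return r_squared(xs, ys)
```

Calling `cape_vs_forward_returns` for each horizon from 1 to 25 would give one R2 value per point on the horizontal axis of such a chart.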
Multiple tests were performed, first one including all years starting from 1881, second one restricted to 1927+, then more modern times only (1950+ and 1970+). The quality of old financial data is known to be poor, notably before the Cowles commission started to consistently archive financial data. Unsurprisingly, the results are less convincing when the older data (1881-1926) is included. Otherwise, the results are fairly consistent. The research in this article will consequently focus on the 1927+ time period.
Looking at the results along the horizontal axis, the correlation (R2) is very poor to start with (people trying to time the market based on valuation metrics have little hope of succeeding!), then becomes significant (higher than 0.4) when at least 10 years of returns are included. When more than 25 years are included, then the correlation starts dropping. The last point on the horizontal axis is the correlation (R2) of CAPE with the average of the 10-yrs, 11-yrs … 25-years series (hence mitigating side-effects of market valuations for the last year of a time period; note the significant improvement of this specific measure for the 1950+/1970+ chart).
Note: As to the fast degradation of the results at the end of the trajectory for 1970+, the author suspects it is mostly due to lack of data: there aren’t many 30-year cycles between 1970 and 2017 (fewer than 10!) to draw proper statistics from.
Such empirical findings appear to indicate that valuations did matter, and that subsequent returns were significantly dependent on the valuation of the starting years, at least in the mid/long-term (e.g. at least a decade). Still, a correlation (R2) of 0.4 to 0.5 isn’t that strong an indicator, and we’ll see later that there is quite some variability in the actual numbers compared to expectations.
CAPE and Safe Withdrawal Rates
A Safe Withdrawal Rate (SWR) is the maximum spending rate (adjusted for inflation every year) that keeps the portfolio in the black until the end of a full retirement period (e.g. 30 years). In other words, this is the maximum amount (inflation-adjusted) one could have spent (withdrawn) at the beginning of a given retirement period, and then in each following year, without running out of money.
Note: Technically, this definition is actually known as ‘Sustainable Spending Rate’ (SSR). The term ‘SWR’ is intended to be either (1) synonymous with SSR, or (2) the minimum value of a collection of SSR (across a range of starting years), depending on the authors. This article assumes the former meaning, a quantity related to a SINGLE retirement cycle.
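For a single retirement cycle, the definition above can be turned into a short computation. The sketch below is one possible reading, assuming annual inflation-adjusted returns, a withdrawal taken at the start of each year, and a two-fund portfolio rebalanced every year. These assumptions and the function names are illustrative, and the author's exact model may differ.

```python
def sustainable_spending_rate(real_returns):
    """Constant real withdrawal, as a fraction of the starting portfolio,
    that leaves exactly zero at the end of the last year.

    real_returns: the portfolio's inflation-adjusted yearly returns.
    The withdrawal is assumed to happen at the start of each year.
    """
    growth = 1.0      # cumulative growth factor before the current year
    discounted = 0.0  # sum of 1 / growth at each withdrawal date
    for r in real_returns:
        discounted += 1.0 / growth
        growth *= 1.0 + r
    return 1.0 / discounted


def portfolio_real_returns(stock_returns, bond_returns, stock_weight):
    """Yearly returns of a two-fund portfolio rebalanced every year."""
    return [stock_weight * s + (1 - stock_weight) * b
            for s, b in zip(stock_returns, bond_returns)]
```

With zero real returns over 30 years, the function returns 1/30 (about 3.3%), which is simply the starting portfolio spread evenly over the period.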
An intriguing finding reported in Michael Kitces’ article is that the correlation between CAPE and the SWR for the coming 30 years was remarkably strong. Let’s run those numbers again (starting in 1927, up to and including the 1988 cycle, which ends in 2017), using multiple two-fund portfolios (S&P 500, Intermediate-Term Treasuries), and varying the asset allocation.
The correlation (R2) is indeed impressively high for portfolios with a fairly high equity exposure (e.g. at least 60% of the portfolio). Unsurprisingly, when the asset allocation is heavier in bonds, the correlation drops, CAPE being an indicator of stock valuation, while bonds have a life of their own.
Michael Kitces pointed out that there is a significant relationship between actual returns (CAGR, Compound Annual Growth Rate) over the first 15 years of the retirement period and the SWR. And quite obviously, there is also a (looser) linkage with the returns over the entire retirement period. This is an indirect way to observe that the SWR metric is highly dependent on the sequence of returns (while the CAGR metric is not). Let’s expand the previous chart to capture the CAPE correlation with both SWR and CAGR numbers, both expressed in inflation-adjusted terms. Those results are really quite remarkable: the CAGR correlation numbers are rather weak, while the SWR correlation numbers are much more significant, at least when equity exposure is high enough.
Michael Kitces also broke down results per CAPE quintile, in Figure 7 of his article (*). Here is a similar table, using a 60/40 asset allocation. It clearly shows that SWR numbers were strongly dependent on the starting CAPE value, and that starting years with elevated valuations led to the most difficult outcomes.
(*) Note that Kitces computed his version of the table using a different time period (e.g. 1881-2007), and taking into account monthly values (while research for this blog was based on a simpler model using annual values).
Here is the same table for an 80/20 asset allocation. As a side note, if one uses SWR as a measure of risk (as opposed to the academic definition equating risk with volatility), i.e. focusing more on income protection than portfolio value, then this table shows that this equity-heavy asset allocation turned out to be LESS risky than the previous one.
Those quintile results should be taken with a grain of salt though, as the breakdown in quintiles changed a bit over time, depending on the time period being studied.
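A quintile breakdown of this kind could be produced along the following lines. This is a rough sketch with hypothetical names. It assumes one CAPE value and one SWR per starting year and splits the starting years into five groups of near-equal size, which is not necessarily how Kitces or the author defined the quintile boundaries.

```python
def quintile_table(cape_by_year, swr_by_year):
    """Group starting years into CAPE quintiles and summarize SWR per group."""
    years = sorted((y for y in cape_by_year if y in swr_by_year),
                   key=lambda y: cape_by_year[y])
    n = len(years)
    rows = []
    for q in range(5):
        bucket = years[q * n // 5:(q + 1) * n // 5]
        if not bucket:
            continue  # fewer than 5 starting years: some groups stay empty
        capes = [cape_by_year[y] for y in bucket]
        swrs = [swr_by_year[y] for y in bucket]
        rows.append({
            "quintile": q + 1,
            "cape_min": min(capes),
            "cape_max": max(capes),
            "swr_min": min(swrs),
            "swr_avg": sum(swrs) / len(swrs),
            "swr_max": max(swrs),
        })
    return rows
```

Because the groups are rebuilt from whatever starting years are supplied, running it on a different time period shifts the CAPE boundaries of each quintile, which is the caveat raised above.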
1/CAPE and Expected Returns
CAPE is an interesting metric, but its inverse (1/CAPE) is probably more directly useful. It turns out to be a very simple, and yet somewhat effective way to compute expected returns in real (inflation-adjusted) terms, in foresight.
From a correlation standpoint, the 1/CAPE results (R2) against actual subsequent returns are much the same as previously described with CAPE, i.e. fairly decent (0.4 to 0.6), but not that great (strictly speaking, taking the inverse is not a linear transformation, so the two sets of R2 values need not match exactly). Correlation only captures directionality though; it doesn’t say anything about the amplitude of the results, nor does it provide a practical model for estimating expected returns in foresight.
It turns out that 1/CAPE has displayed this interesting property of matching the amplitude of actual returns, at least when computing returns over 10 to 25 years. The average difference between the expected returns (1/CAPE) and the actual subsequent returns ranged from 0.2% to 0.3% (1/CAPE overshooting a bit) in the 1927-2017 time interval, which is pretty remarkable. Unfortunately, the root-mean-square deviation, aka RMSD, (which essentially captures the variability around the benchmark) was between 2% and 3%, which isn’t that good.
Please remember that expected returns are probabilistic in nature: they are intended to provide a rough epicenter across a wide range of possible outcomes, but certainly NOT a specific prediction. This variability strongly blunts the usefulness of 1/CAPE, as illustrated by the following chart (each vertical line compares 1/CAPE for a given starting year with the actual annualized returns of the following 20 years, as a case in point).
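The two accuracy measures quoted above (average difference and RMSD) can be computed as in the sketch below. It assumes two equally long lists, one with 1/CAPE for each starting year and one with the actual annualized real return over the following years. The function name is hypothetical.

```python
import math


def bias_and_rmsd(expected, actual):
    """Average difference (expected minus actual) and root-mean-square
    deviation between two series of returns."""
    diffs = [e - a for e, a in zip(expected, actual)]
    bias = sum(diffs) / len(diffs)
    rmsd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmsd
```

A small average difference combined with a large RMSD means the model is right on average but can be far off for any individual starting year, which is the situation described above.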
1/CAPE and Safe Withdrawal Rates
Combining the various observations we made so far, one could wonder if there is a special relationship between 1/CAPE and SWR, which not only captures correlation, but also amplitude, and which could help with regular portfolios (including bonds).
- 1/CAPE is an expected return model for stocks, hence a probabilistic ‘guesstimate’ of the coming CAGR for the stocks part of the portfolio over a couple of decades, for a given starting year. This is a model established in foresight, quantifying the expected gains of the portfolio in a probabilistic manner.
- One can easily extend the expected return model for a regular portfolio, by assuming 2% (real) for Intermediate-Term (IT) bonds, which is the historical average. The model then becomes (1/CAPE) * %-stocks + (2%) * %-bonds.
- SWR is a tool for analyzing the past, in hindsight. It aims at the full depletion of the portfolio in a period of time (say 30 years), hence a higher rate than simply spending the gains. SWR is quite susceptible to a tough sequence of returns.
- Both are real (inflation-adjusted) quantities.
Note: It would seem tempting to use the bond’s current SEC yield instead of the 2% constant, as it is a fairly good expected returns model for nominal IT bonds returns in the following decade. Unfortunately, adjusting it for expected inflation and using such numbers for a longer horizon (e.g. 30 years) proved problematic, as the author experienced when trying. The historical ups and downs of bond’s real returns in the US (and other countries) did follow a rather unpredictable trajectory.
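The expected return model from the list above, (1/CAPE) * %-stocks + (2%) * %-bonds, is simple enough to write as a one-line function. The sketch below uses a hypothetical function name, and the CAPE value in the example is for illustration only.

```python
def expected_real_return(cape, stock_weight, bond_real_return=0.02):
    """Expected real return of a stock/bond portfolio:
    (1/CAPE) * stock weight + bond real return * bond weight.

    The 2% default is the historical real return for Intermediate-Term
    bonds quoted in this article.
    """
    if cape <= 0:
        raise ValueError("CAPE must be positive")
    if not 0.0 <= stock_weight <= 1.0:
        raise ValueError("stock_weight must be between 0 and 1")
    return stock_weight / cape + (1.0 - stock_weight) * bond_real_return


# Illustrative CAPE of 20 with a 60/40 portfolio:
# 0.6 * 5% + 0.4 * 2% = 3.8%
print(round(expected_real_return(cape=20, stock_weight=0.6), 4))  # 0.038
```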
Let’s take a 60/40 portfolio, and empirically compare both metrics (SWR, expected returns) for each starting year since 1927. Then let’s do the same thing with a 40/60 portfolio, then an 80/20 portfolio. The results are striking. Look closely: very few blue dots (expected returns) on those charts are meaningfully higher than the corresponding red dot (SWR). And for most exceptions, the blue dots are higher than 7% (the historical average stock returns in the US), and common sense would then dictate to curb one’s enthusiasm…
In other words, the expected returns model (capped by common sense) provided a solid lower bound for the SWR to come, without undercutting everything by the single worst case in history (aka ‘4% rule’). Considering the probabilistic nature of expected returns models and the dispersion of actual returns in real life, this seems a remarkable result.
Sharp-eyed readers might notice a few troublesome data points though, i.e. the early 70s for the 80/20 chart. This is the only known example in US history where a deep stock crisis resulted from an external event (oil crisis), while valuations were fairly subdued at the time. But well, one should never say never with stock markets… This observation should incite retirees to use common sense, and reassess where they are every few years. There is a lot to be said for being adaptive, and correcting course along the way.
The author tried more extreme asset allocations (100/0, 20/80, 0/100) and results get more haphazard. There is a sweet spot here, with bonds and stocks balancing each other, roughly in line with Benjamin Graham’s recommendation of sticking within the bounds of 25% stocks and 75% stocks.
Doubts and Reinforcement
Critical readers are probably rolling their eyes at the very empirical nature of those findings, perceiving that those graphs aren’t that convincing and might simply be a bad case of curve-fitting. Also, drawing conclusions from 90 years of data (1927-2017), hence only 3 non-overlapping periods of 30 years, would not satisfy any statistician. There is definitely some truth to such views.
Furthermore, quite some ink has been spent by various writers criticizing the CAPE metric itself, with various considerations about changing regulations about earnings reporting, changing dynamics between dividends and earnings, changing dynamics in the ease and popularity of investing that may affect valuations, etc. Some well-known industry players added to the confusion by introducing dubious ways to compute P/E ratios (e.g. ignoring negative earnings). Fortunately, S&P, MSCI and Prof. Shiller continue to do the PE (and CAPE) math by simply combining earnings, whatever they are, viewing an index as one giant merger.
It would be extremely valuable to run out-of-sample tests to refine or invalidate the findings (e.g. using historical data from another country than the US, or focus on a different market segment than large-cap blend – as embodied by S&P 500 index). Unfortunately, although historical price and return data can be found, there is very little publicly available historical data about earnings (which are required to compute P/E and CAPE metrics). This line of thinking is an impasse in other words, frustratingly so.
Another approach is to ponder time-insensitive common sense explanations for the empirical observations. And it turns out that there is a fairly solid case that 1/PE conveys true semantics.
- First, if the ‘P’ (Price) factor gets artificially inflated solely due to the occasional outburst of speculative foolishness that human beings regularly displayed in the past, then of course, future returns should mechanically become much more muted (good news is that your portfolio inflated as well… or at least the stocks part of it).
- Furthermore, when prices do get inflated or deflated in such speculative manner, in the past, it usually took a few years (a decade at most) before the stock market re-adjusted itself (often violently!). We can see the sequence of returns at play here, hence the linkage with SWR underlying dynamics.
- Next, doing a modicum of algebra on E/P (the inverse of PE), you can observe that it is equal to D/P + (E-D)/P. The first term is the dividend yield, the first part of total returns. The second term is about re-invested earnings, which should be the fundamental driver of future earnings growth, hence of (non-speculative) price increase (the second part of total returns). As the great Benjamin Graham once stated:
- “The second advantage of common stocks lay in their higher average return to investors over the years. This was produced both by an average dividend income exceeding the yield on good bonds and by an underlying tendency for market value to increase over the years in consequence of the reinvestment of undistributed profits.”
- In other words, using 1/PE (or 1/CAPE) as a model for expected returns goes much further than empirical observations, there is a common sense explanation for the fact that it should work – statistically speaking, on average, in the midst of speculative outbursts.
- Finally, the trajectory of earnings year over year did display a lot of variability. Cyclically smoothing the P/E ratio like Prof. Shiller suggested in the CAPE (otherwise known as P/E10) computation is obviously sensible.
- One could also note that the “reinvestment of undistributed profits” is a cumulative process, and the ‘E10’ quantity might be better perceived as a sum divided by a constant instead of a ‘smoothing’ average trick.
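The decomposition used in the list above can be written out explicitly, with E for earnings, D for dividends and P for price (the labels under each term are descriptive only):

```latex
\[
\frac{1}{PE} \;=\; \frac{E}{P}
\;=\; \underbrace{\frac{D}{P}}_{\text{dividend yield}}
\;+\; \underbrace{\frac{E - D}{P}}_{\text{reinvested earnings per unit of price}}
\]
```

As a purely hypothetical illustration, a stock priced at 100 with earnings of 5 and dividends of 2 has an earnings yield of 5%, made of a 2% dividend yield plus 3% of reinvested earnings.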
Given this state of affairs, the author thinks that there is probably a good kernel of truth to the whole CAPE and 1/CAPE theme, and that there is merit in staying cautious AND open-minded, while acknowledging the probabilistic nature of such metrics.
Practical application – fixed withdrawals
One of the most fundamental decisions in life (when to retire?) is often summarized as “did I reach my number” or “how much can I spend in retirement”. This often translates into blindly using the “4% SWR rule” for assessing future fixed (inflation-adjusted) withdrawals. Unfortunately, many people appear to just follow such simple rule of thumb. This leads to obvious contradictions, as explained with great clarity by Michael Kitces at the beginning of his article.
Kitces made a case that it would be wise to factor in current valuations (e.g. CAPE) to get a better assessment of SWR expectations. It should make would-be retirees much more cautious when markets appear to be highly valued (as is the case at the time of writing!). It should also help would-be retirees avoid ultra-conservatism at a time when valuations are on the low side (a higher SWR would then be perfectly reasonable; we have only one life, don’t miss the opportunity to enjoy it more). It is difficult to quantify though, without taking the risk of unduly underestimating the probabilistic nature of valuations. The quintile table suggested by Michael Kitces (and updated in this blog) is probably a good compromise for (conservative) decision-making. Such tables do not take into account the exact asset allocation though, notably the bond exposure. They are also US-centric, and unfortunately cannot be generalized (for lack of historical data).
Another way to proceed would be to use the findings about Expected Returns vs. SWR that were previously described (the red dots vs. blue dots charts). Let’s make the leap of faith that the 1/CAPE model would apply to International returns (based on the ‘common sense’ explanation, and also some research from Star Capital). The very useful Star Capital ‘stock market valuation’ Web page provides CAPE values for numerous countries and areas of the world (underlying data source is MSCI).
Let’s take various asset allocations (rows 175 and 176 in the screenshot below), use our simple equation for expected returns and illustrate expectations for an investor concentrating equities on the US market (row 177), or an investor diversifying equities worldwide (row 178), as of Jan 31st, 2018.
The resulting expected returns should provide a conservative lower bound for SWR expectations. This is NOT a pretty picture, and even less so for US-only investors or for investors with heavy bond exposure. This is hardly a surprise, considering the current CAPE valuation of the US market (check again the very first chart of this article). Sure enough, this is a conservative assessment, and reality will hopefully turn out a bit rosier than that, but the historical record (blue/red dots charts) seems to indicate that it wasn’t too far off in dire situations. This is sobering, to say the least.
As previously mentioned, it would seem reasonable to redo such an assessment every few years, to correct course as need be. This seems even more important when starting from a rather unusual environment (either displaying abnormally high or low valuations, or following a generational event like the oil crisis in the 70s). The probabilistic nature of expected returns is such that one really should enforce some sort of self-correction mechanism (while keeping a good dose of common sense).
Practical application – variable withdrawals
The author is a firm believer in having a precise plan in retirement to figure out one’s annual spending budget, and of the virtues of using a variable withdrawal method to mechanically tighten the belt when times are tough, and relax when times are rosy. Robust decision rules (e.g. Guyton-Klinger) or actuarial methods (e.g. VPW and derivatives) appear to work pretty well in this respect.
A vexing issue though is to try to get a rough idea, in foresight, of where this might lead a would-be retiree, in terms of average spending over one’s retirement, based on a given starting portfolio (and fixed income flows, e.g. SSA, Pension, etc). This is actually very similar to the previous questions: the decision of when to retire, or what to expect when retiring.
A fascinating article from David Blanchett about “Simple Formulas to Implement Complex Withdrawal Strategies” introduced a metric called “Withdrawal Efficiency Rate” (WER), which essentially quantifies the performance of a variable withdrawal method (simulating the rules operating in foresight) by comparing it to the SWR (established in hindsight) for a given retirement cycle. Such concepts can be used to backtest and analyze the performance of robust variable methods. Long story short, when sensible parameters are used, the WER reaches the upper 90% range, which is excellent.
Now let’s connect the dots. We have seen that we can get, in foresight, a conservative estimate of the SWR to come based on a very simple expected return equation (itself leveraging on the 1/CAPE computation). We know that robust variable withdrawal methods can deliver, on average, a retirement income close to the actual SWR. The vexing issue suddenly received an interesting answer. Simply apply the expected returns formula (derived from 1/CAPE) to the starting portfolio, and one should get a conservative estimate on the subsequent average withdrawals (inflation-adjusted) during the retirement period.
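Applying the formula to a starting portfolio could look like the sketch below. It is only an illustration: the function name is hypothetical, and the 7% ceiling is one possible reading of the “capped by common sense” idea mentioned earlier (7% being the historical average stock return quoted in this article). It also ignores other income flows such as pensions.

```python
def conservative_annual_budget(portfolio_value, cape, stock_weight,
                               bond_real_return=0.02, cap=0.07):
    """Conservative estimate of average yearly (inflation-adjusted) spending.

    Uses (1/CAPE) * stock weight + bond real return * bond weight,
    limited to `cap` (an assumption of this sketch, not a rule from the
    article), then multiplied by the starting portfolio value.
    """
    rate = stock_weight / cape + (1.0 - stock_weight) * bond_real_return
    return portfolio_value * min(rate, cap)


# Hypothetical example: 1,000,000 portfolio, CAPE of 25, 60/40 allocation.
# 0.6 * 4% + 0.4 * 2% = 3.2% of the portfolio per year.
print(round(conservative_annual_budget(1_000_000, cape=25, stock_weight=0.6)))  # 32000
```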
This is a robust line of thought, in line with the probabilistic nature of expected returns: even if this assessment turns out to be somewhat off the mark, a proper variable withdrawal method will keep you disciplined. Maybe the future will be less (or more) rosy than expected, and so be it. At least, you made an informed decision instead of jumping in completely blind, or totally underestimating outcomes. And such an assessment can be refreshed every now and then, to keep your eyes open.
It cannot be emphasized enough that empirical findings have no hard guarantee of reproducing themselves in the future. This is especially true when there is no practical way to test findings out of sample, and when the metrics being looked at are very probabilistic in nature. Still, an empirical finding that comes with a time-insensitive common sense rationale, and that provided consistent results over nearly a century of data, should probably not be ignored.
If one can use such findings in a context where error tolerance is fairly high, with some sort of self-correction process, then it should become quite reasonable. The author believes that the findings described in this article present such opportunity, when taken with appropriate caution. To summarize:
- Safe Withdrawal Rates (SWR) for retirement cycles of 30 years have been surprisingly highly correlated with CAPE starting valuations, at least for portfolios with a significant equity exposure (60% or more).
- Comparing current valuations to past occurrences (say a quintile breakdown) could give a rough idea of reasonable SWRs for a given starting situation, without gating one’s retirement spending budget by the single worst case in history.
- A more elaborate approach consists of using a simple expected returns model (e.g. based on 1/CAPE and historical bond returns), applying some common sense to sanitize the outcome, and using the result as a lower bound for SWR expectations.
- Such logic has applicability for fixed withdrawal methods as well as (sensible) variable withdrawal methods. Just don’t forget to re-assess from time to time, and use common sense.