For example, standard risk-based portfolio allocation methods (minimum variance, equal risk contributions, hierarchical risk parity…) critically depend on the ability to build accurate volatility forecasts^{1}.
Multiple methods for estimating volatility have been proposed over the past several decades, and in this blog post I will focus on range-based volatility estimators.
These estimators, the first of which was introduced by Parkinson^{2} as a way to compute the true variance of the rate of return of a common stock, rely on the highest and lowest prices of an asset over a given time period to estimate its volatility, hence their name^{3}.
After describing the four best-known range-based volatility estimators, I will reproduce the analysis of Artur Sepp in his presentation Volatility Modelling and Trading^{4}, given at the Global Derivatives Conference 2016, and test the predictive power of the naive volatility forecasts produced by these estimators for various ETFs.
Notes:
A very accessible series of papers about range-based volatility estimators has recently^{5} been released by people at Lombard Odier, c.f. here, here and here.
One of the main^{6} assumptions made when working with range-based volatility estimators^{7} is that the price movements $S_t$ of the asset under consideration follow a geometric Brownian motion with unknown volatility coefficient^{8} $\sigma$ and unknown drift coefficient $\mu$, that is
\[d S_t = \mu S_t dt + \sigma S_t dW_t\], where $W_t$ is a standard Brownian motion.
Under this working assumption, $\sigma$ represents the volatility of the asset.
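For intuition, here is a minimal Python sketch (all parameter values are made-up choices of mine) that simulates one year of daily prices under the geometric Brownian motion assumption, using the exact solution $S_t = S_0 e^{(\mu - \sigma^2/2) t + \sigma W_t}$:

```python
import numpy as np

# made-up parameters: annualized drift and volatility, initial price, trading days
rng = np.random.default_rng(0)
mu, sigma, S0, T = 0.05, 0.2, 100.0, 252
dt = 1.0 / 252.0

# exact GBM simulation on a daily grid: increments of W_t are N(0, dt)
dW = rng.normal(0.0, np.sqrt(dt), T)
log_S = np.log(S0) + np.cumsum((mu - 0.5 * sigma**2) * dt + sigma * dW)
S = np.exp(log_S)
```

The daily log returns of such a path have standard deviation $\sigma \sqrt{dt}$, which is exactly the quantity that the estimators below try to recover from market prices.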
Although anyone can empirically observe the impact of “volatility” on the prices of a given asset, the volatility coefficient $\sigma$ of this asset is not directly observable^{9} and must be estimated using stock market information.
A statistical estimator of $\sigma$ is then called a volatility estimator, and a statistical estimator of $\sigma^2$ is called a variance estimator.
In order to determine the quality of a volatility estimator, two measures are commonly used:
The bias of a volatility estimator measures whether this estimator produces, on average, too high or too low volatility estimates.
More formally, a volatility estimator $\sigma_A$ is said to be unbiased when $\mathbb{E}[\sigma_A] = \sigma$ and biased otherwise.
The efficiency of a volatility estimator measures the uncertainty of the volatility estimates produced by this estimator: the greater the efficiency, the more accurate the volatility estimates.
More formally, the relative efficiency $Eff \left( \sigma_A \right)$ of a volatility estimator $\sigma_A$ compared to a reference volatility estimator $\sigma_B$ is defined as the ratio of the variance of the associated variance estimator $\sigma_B^2$ to that of the variance estimator $\sigma_A^2$, that is,
\[Eff \left( \sigma_A \right) = \frac{Var \left( \sigma_B^2 \right)}{Var \left( \sigma_A^2 \right)}\]Note that bias and efficiency are sometimes conflicting objectives, which is more generally known in statistics as the bias-variance tradeoff.
Let $C_1,…,C_T$ be the closing prices of an asset for $T$ time periods $t=1..T$^{10}.
Then,
\[\sigma_{cc,0} \left( T \right) = \sqrt{ \frac{1}{T-1} \sum_{i=2}^T \ln{\frac{C_i}{C_{i-1}}}^2 }\]is a biased^{11} estimator of the asset volatility $\sigma$ over the $T$ time periods, assuming zero drift (i.e., $\mu = 0$), c.f. Parkinson^{2}.
In addition,
\[\sigma_{cc} \left( T \right) = \sqrt{ \frac{1}{T-2} \sum_{i=2}^T \left( \ln \frac{C_i}{C_{i-1}} - \mu_{cc} \right)^2 }\], with $\mu_{cc} = \frac{1}{T-1} \sum_{i=2}^T \ln \frac{C_i}{C_{i-1}} $, is a biased^{11} estimator of the asset volatility $\sigma$ over the $T$ time periods, assuming non-zero drift (i.e., $\mu \ne 0$), c.f. Yang and Zhang^{12}.
These two estimators are known as close-to-close volatility estimators.
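As an illustration, the two close-to-close estimators can be sketched in Python as follows (the function name and interface are mine, not Portfolio Optimizer's):

```python
import numpy as np

def close_to_close_vol(closes, zero_drift=False):
    """Close-to-close volatility estimate over T time periods (per-period, not annualized)."""
    closes = np.asarray(closes, dtype=float)
    T = len(closes)
    r = np.diff(np.log(closes))  # the T-1 log returns ln(C_i / C_{i-1})
    if zero_drift:
        # sigma_cc,0: zero-drift version, no mean subtraction
        return np.sqrt(np.sum(r**2) / (T - 1))
    # sigma_cc: subtract the mean log return mu_cc
    return np.sqrt(np.sum((r - r.mean())**2) / (T - 2))
```

For daily prices, the resulting per-period estimate is usually annualized by multiplying it by $\sqrt{252}$.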
In what follows, let $O_t$, $H_t$, $L_t$ and $C_t$, $t=1..T$, denote the opening, highest, lowest and closing prices of the asset over the $T$ time periods.
As mentioned in the introduction, a volatility estimator fully or partially relying on the highest prices $H_t, t=1..T$ and on the lowest prices $L_t, t=1..T$ is called a range-based volatility estimator.
The underlying idea behind such estimators is that the information contained in the asset high-low price ranges $H_t - L_t, t=1..T$ should make it possible to build volatility estimators that are more efficient than the close-to-close volatility estimators, which use only one price inside this range^{13}.
This quest for efficiency is important because, contrary to one of the working assumptions^{6}, the volatility of an asset is known to be time-varying^{14}, so that the fewer the time periods required to estimate its volatility, the more likely it is that its volatility is constant(ish) over the time periods under consideration.
As Rogers et al.^{15} put it:
[…] volatility may change over long periods of time; a highly efficient procedure will allow researchers to estimate volatility with a small number of observations.
Parkinson^{2} introduces an estimator for the diffusion coefficient of a Brownian motion without drift that relies on the highest and lowest observed values of this Brownian motion over a given time period.
When applied to the estimation of an asset volatility, this gives the Parkinson volatility estimator $\sigma_{P} \left( T \right)$ defined over $T$ time periods by
\[\sigma_{P} \left( T \right) = \sqrt{\frac{1}{T}} \sqrt{\frac{1}{4 \ln 2} \sum_{i=1}^T \left( \ln \frac{H_i}{L_i} \right) ^2}\]Intuitively, the Parkinson estimator should be “better” than the close-to-close estimators because large price movements impacting the high-low price range $H_t - L_t$ but leaving the closing price $C_t$ unchanged might occur within any time period $t$.
This is confirmed by the efficiency of this estimator, up to 5.2 times higher than the efficiency of the close-to-close estimators^{16}.
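To get a feel for this efficiency figure, here is a small Monte Carlo sketch (the simulation parameters are arbitrary choices of mine) comparing the variance of the Parkinson variance estimates with that of the zero-drift close-to-close variance estimates on simulated driftless paths:

```python
import numpy as np

rng = np.random.default_rng(0)
# per-day volatility, days per sample, intraday ticks per day, Monte Carlo trials
sigma, T, n_steps, n_trials = 0.01, 21, 500, 2000

cc_vars, p_vars = [], []
for _ in range(n_trials):
    # driftless intraday log-price paths, one row per day, starting at the day's open
    steps = rng.normal(0.0, sigma / np.sqrt(n_steps), (T, n_steps))
    intraday = np.concatenate([np.zeros((T, 1)), np.cumsum(steps, axis=1)], axis=1)
    daily_returns = intraday[:, -1]                       # close-to-close log returns (no jumps)
    ranges = intraday.max(axis=1) - intraday.min(axis=1)  # ln(H_i / L_i)
    cc_vars.append(np.sum(daily_returns**2) / T)          # zero-drift close-to-close variance
    p_vars.append(np.sum(ranges**2) / (4 * np.log(2) * T))  # Parkinson variance

eff = np.var(cc_vars) / np.var(p_vars)
print(f"Monte Carlo relative efficiency of Parkinson vs close-to-close: {eff:.1f}")
```

Because the high and the low are monitored at a finite number of ticks, the simulated efficiency typically lands somewhat below the theoretical continuous-time figure.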
Garman and Klass^{17} propose to improve the Parkinson estimator by taking into account the opening prices $O_t, t=1..T$ and the closing prices $C_t, t=1..T$.
This leads to the Garman-Klass volatility estimator $\sigma_{GK} \left( T \right)$, defined over $T$ time periods by
\[\sigma_{GK} \left( T \right) = \sqrt{\frac{1}{T}} \sqrt{ \sum_{i=1}^T \frac{1}{2} \left( \ln\frac{H_i}{L_i} \right) ^2 - \left( 2 \ln2 - 1 \right) \left( \ln\frac{C_i}{O_i} \right )^2 }\]As a historical note, Garman and Klass^{17} establish in their paper that $\sigma_{GK}$ is the “best reasonable”^{18} volatility estimator that depends only on the high-open price range $H_t - O_t$, the low-open price range $L_t - O_t$ and the close-open price range $C_t - O_t$, $t=1..T$.
The Garman-Klass estimator is up to 7.4 times more efficient than the close-to-close estimators^{16}.
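A minimal Python sketch of the Garman-Klass estimator (function name mine):

```python
import numpy as np

def garman_klass_vol(opens, highs, lows, closes):
    """Garman-Klass volatility estimate over T time periods (per-period, zero-drift assumption)."""
    o, h, l, c = (np.log(np.asarray(x, dtype=float)) for x in (opens, highs, lows, closes))
    T = len(o)
    terms = 0.5 * (h - l)**2 - (2.0 * np.log(2.0) - 1.0) * (c - o)**2
    return np.sqrt(np.sum(terms) / T)
```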
The Parkinson and the Garman-Klass estimators have both been derived under a zero drift assumption.
When this assumption does not hold for an asset, for example because of a strong upward or downward trend in the asset prices or because of the use of long time periods (monthly, yearly…), these estimators should in theory not be used, because the quality of their volatility estimates is negatively impacted by the presence of a non-zero drift^{19}^{15}.
In order to solve this problem, Rogers and Satchell^{19} devise the Rogers-Satchell volatility estimator $\sigma_{RS} \left( T \right)$, defined over $T$ time periods by
\[\sigma_{RS} \left( T \right) = \sqrt{\frac{1}{T}} \sqrt{ \sum_{i=1}^T \ln\frac{H_i}{C_i} \ln\frac{H_i}{O_i} + \ln\frac{L_i}{C_i} \ln\frac{L_i}{O_i} }\]The Rogers-Satchell estimator is up to 6 times more efficient than the close-to-close estimators^{19}, which is less than the Garman-Klass estimator^{20}.
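The Rogers-Satchell estimator can be sketched in Python as follows (function name mine; note that the two log-price products are summed, per Rogers and Satchell^{19}):

```python
import numpy as np

def rogers_satchell_vol(opens, highs, lows, closes):
    """Rogers-Satchell volatility estimate over T time periods; robust to a non-zero drift."""
    o, h, l, c = (np.log(np.asarray(x, dtype=float)) for x in (opens, highs, lows, closes))
    T = len(o)
    # each term ln(H/C) ln(H/O) + ln(L/C) ln(L/O) is non-negative
    terms = (h - c) * (h - o) + (l - c) * (l - o)
    return np.sqrt(np.sum(terms) / T)
```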
The range-based volatility estimators discussed so far do not take into account opening jumps in an asset's prices^{21}, that is, the potential difference between an asset opening price $O_t$ and its closing price $C_{t-1}$ for a time period $t$^{22}.
This limitation causes a systematic underestimation of the true volatility^{12}.
When trying to integrate opening jumps into the Parkinson, the Garman-Klass and the Rogers-Satchell estimators, Yang and Zhang^{12} discover that it is unfortunately not possible for any “reasonable” single-period^{23} volatility estimator to properly handle both a non-zero drift and opening jumps.
This leads them to introduce the multi-period^{23} Yang-Zhang volatility estimator $\sigma_{YZ} \left( T \right)$, defined over $T$ time periods by
\[\sigma_{YZ} \left( T \right) = \sqrt{ \sigma_{co}^2 + k \sigma_{oc}^2 + \left( 1-k \right) \sigma_{RS}^2 }\], where:
$\sigma_{co} \left( T \right)$ is the close-to-open volatility, defined as
\[\sigma_{co} = \sqrt{\frac{1}{T-2} \sum_{i=2}^T \left( \ln \frac{O_i}{C_{i-1}} - \mu_{co} \right)^2}\], with $\mu_{co} = \frac{1}{T-1} \sum_{i=2}^T \ln \frac{O_i}{C_{i-1}}$
$\sigma_{oc} $ is the open-to-close volatility, defined as
\[\sigma_{oc} \left( T \right) = \sqrt{\frac{1}{T-2} \sum_{i=2}^T \left( \ln \frac{C_i}{O_{i}} - \mu_{oc} \right)^2}\], with $\mu_{oc} = \frac{1}{T-1} \sum_{i=2}^T \ln \frac{C_i}{O_{i}}$
$\sigma_{RS}$ is the Rogers-Satchell volatility estimator over the time periods $t=2..T$
$k = \frac{0.34}{1.34 + \frac{T}{T-2}}$
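Putting these ingredients together, a Python sketch of the Yang-Zhang estimator (function name mine; indexing as in the text, with returns computed over the time periods $t=2..T$):

```python
import numpy as np

def yang_zhang_vol(opens, highs, lows, closes):
    """Yang-Zhang volatility estimate over T time periods (per-period, not annualized)."""
    o, h, l, c = (np.log(np.asarray(x, dtype=float)) for x in (opens, highs, lows, closes))
    T = len(o)
    co = o[1:] - c[:-1]  # close-to-open (opening jump) log returns
    oc = c[1:] - o[1:]   # open-to-close log returns
    var_co = np.sum((co - co.mean())**2) / (T - 2)
    var_oc = np.sum((oc - oc.mean())**2) / (T - 2)
    # Rogers-Satchell variance over the time periods t=2..T
    rs = (h[1:] - c[1:]) * (h[1:] - o[1:]) + (l[1:] - c[1:]) * (l[1:] - o[1:])
    var_rs = np.sum(rs) / (T - 1)
    k = 0.34 / (1.34 + T / (T - 2))
    return np.sqrt(var_co + k * var_oc + (1 - k) * var_rs)
```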
In addition to the new estimator $\sigma_{YZ}$, Yang and Zhang^{12} also provide multi-period versions of the Parkinson, the Garman-Klass and the Rogers-Satchell estimators that support opening jumps^{24}.
The Yang-Zhang estimator is up to 14 times more efficient than the close-to-close estimators^{12}, a result that Yang and Zhang^{12} comment as follows
The improvement of accuracy over the classical close-to-close estimator is dramatic for real-life time series
The family of range-based volatility estimators has many other members:
Still, the Parkinson, the Garman-Klass, the Rogers-Satchell and the Yang-Zhang volatility estimators are representative of this family, so that I will not detail any other range-based volatility estimator in this blog post.
Range-based volatility estimators rely on the assumption of independent samples and of independent observations within each sample^{4}, so that the corresponding volatility forecasts are simply naive forecasts under a random walk model.
In other words, with such volatility estimators, the “natural” forecast of an asset volatility over the next $T$ time periods is the (past) estimate of the asset volatility over the last $T$ time periods.
That being said, it is perfectly possible to use range-based volatility estimates together with any volatility forecasting model such as:
The theoretical and practical performance of range-based volatility estimators is studied in several papers, for example Shu and Zhang^{32}, Jacob and Vipul^{28} and Brandt and Kinlay^{33}.
Most of these studies agree that range-based volatility estimators are biased^{11}, but other conclusions differ depending on the exact methodology used.
In particular, as highlighted by Brandt and Kinlay^{33}, the results from empirical research differ significantly from those seen in simulation studies in a number of respects^{33}.
A perfect example of these differences is that Shu and Zhang^{32}, using a Monte Carlo simulation, conclude that
If the drift term is large, the Parkinson estimator and the [Garman-Klass] estimator will significantly overestimate the true variance […]
, while Jacob and Vipul^{28}, using real stock market data, conclude that
Overall, the [Garman-Klass] estimator, which indirectly adjusts for the drift, performs better for the high-drift stocks.
Motivated by such inconsistencies, Lyocsa et al.^{34}, building on Patton and Sheppard^{35}, introduced what I will call the Lyocsa-Plihal-Vyrost volatility estimator $\sigma_{LPV}$, defined as the arithmetic average of the Parkinson, the Garman-Klass and the Rogers-Satchell volatility estimators^{36}
\[\sigma_{LPV} = \frac{\sigma_{P} + \sigma_{GK} + \sigma_{RS}}{3}\]As Lyocsa et al.^{34} explain, the motivation behind using the (naive) equally weighted average is based on the assumption that we have no prior information on which estimator might be more accurate^{34}.
I personally like the idea of an averaged estimator, but at this point, I think it is safe to highlight that there is no “best” range-based volatility estimator…
Portfolio Optimizer implements all the volatility estimators discussed in this blog post:
/assets/volatility/estimation/close-to-close
/assets/volatility/estimation/parkinson
/assets/volatility/estimation/garman-klass
/assets/volatility/estimation/garman-klass/original
/assets/volatility/estimation/rogers-satchell
/assets/volatility/estimation/yang-zhang
, as well as their jump-adjusted variations, whenever applicable.
To illustrate possible uses of range-based volatility estimators, I propose to reproduce a couple of results from Sepp^{4}:
Such examples will make it possible to compare the empirical behavior of the different volatility estimators and maybe reach a conclusion as to their relative performance in this specific setting.
I will estimate the SPY ETF monthly volatility using all the daily open/high/low/close prices^{37} observed during that month^{38}.
Figure 1, limited to 5 volatility estimators for readability purposes, illustrates the results obtained over the period 31 January 2005 - 29 February 2016^{39}.
Figure 1 is mostly identical to the figure on slide 22 from Sepp^{4}, on which it appears in particular that the close-to-close and the Yang-Zhang volatility estimators provide higher estimates of volatility when the overall level of volatility is high^{4}.
Overall, though, the behavior of the different volatility estimators is essentially the same on this specific example, which is confirmed by their correlations displayed in Figure 2.
Using the same methodology as in Sepp^{4}, I will now evaluate the quality of the naive forecasts produced by all the range-based volatility estimators implemented in Portfolio Optimizer against the next month’s close-to-close observed volatility^{40}, for 10 ETFs representative of various asset classes:
These ETFs are used in the Adaptive Asset Allocation strategy from ReSolve Asset Management, described in the paper Adaptive Asset Allocation: A Primer^{41}.
For each ETF, Sepp’s methodology is as follows:
At each month’s end, compute the volatility estimates $\sigma_{cc, t}$, $\sigma_{P, t}$, … using all the ETF daily open/high/low/close prices^{37} observed during that month^{38}
Under a random walk volatility model, each of these estimates represents the next month’s volatility forecast $\hat{\sigma}_{t+1}$
At each month’s end, also compute the next month’s close-to-close volatility estimate $\sigma_{cc, t+1}$ using all the ETF daily close prices^{37} observed during that next month^{38}
This estimate is the volatility benchmark, which represents how the ETF “volatility” is perceived by an investor monitoring her portfolio daily.
Once all months have been processed that way, regress the volatility forecasts on the volatility benchmarks by applying the Mincer-Zarnowitz^{42} regression model:
\[\hat{\sigma}_{t+1} = \alpha + \beta \sigma_{cc, t+1} + \epsilon_{t+1}\], where $\epsilon_{t+1}$ is an error term.
Then, the estimator producing [the best] volatility forecast is indicated by [a] high explanatory power R^2, [a] small intercept $\alpha$ and [a] $\beta$ coefficient close to one^{4}.
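The Mincer-Zarnowitz step can be sketched with a plain least-squares fit (function name mine; note that, as in the text, the forecasts are regressed on the benchmarks):

```python
import numpy as np

def mincer_zarnowitz(forecasts, benchmarks):
    """OLS fit of forecast = alpha + beta * benchmark; returns (alpha, beta, r2)."""
    y = np.asarray(forecasts, dtype=float)
    x = np.asarray(benchmarks, dtype=float)
    X = np.column_stack([np.ones_like(x), x])            # design matrix [1, benchmark]
    (alpha, beta), *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - (alpha + beta * x)
    r2 = 1.0 - resid.var() / y.var()
    return alpha, beta, r2
```

A good estimator then shows up as a small $\alpha$, a $\beta$ close to one and a high $R^2$.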
In the case of the SPY ETF, Figure 3 illustrates Sepp’s methodology for the Lyocsa-Plihal-Vyrost volatility estimator $\sigma_{LPV}$ over the period 31 January 2005 - 29 February 2016.
Detailed results for all regression models over the period 31 January 2005 - 29 February 2016:
Volatility estimator | $\alpha$ | $\beta$ | $R^2$ |
---|---|---|---|
Close-to-close | 4.1% | 0.75 | 57% |
Close-to-close (zero drift) | 3.9% | 0.77 | 57% |
Parkinson | 3.5% | 0.95 | 58% |
Parkinson (jump-adjusted) | 3.4% | 0.79 | 58% |
Garman-Klass | 3.7% | 0.92 | 57% |
Garman-Klass (jump-adjusted) | 3.6% | 0.77 | 58% |
Garman-Klass (original) | 3.7% | 0.92 | 57% |
Garman-Klass (original, jump-adjusted) | 3.6% | 0.77 | 58% |
Rogers-Satchell | 4.0% | 0.88 | 56% |
Rogers-Satchell (jump-adjusted) | 3.9% | 0.74 | 57% |
Yang-Zhang | 3.8% | 0.75 | 58% |
Lyocsa-Plihal-Vyrost | 3.7% | 0.92 | 57% |
While these figures are far^{43} from those on slide 42 from Sepp^{4}, with for example nearly no variation in terms of $R^2$ among the different volatility estimators, two observations are similar:
Going beyond the SPY ETF, averaged results for all ETFs/regression models over each ETF price history^{44} are the following:
Volatility estimator | $\bar{\alpha}$ | $\bar{\beta}$ | $\bar{R^2}$ |
---|---|---|---|
Close-to-close | 5.8% | 0.66 | 44% |
Close-to-close (zero drift) | 5.6% | 0.67 | 45% |
Parkinson | 5.6% | 0.94 | 44% |
Parkinson (jump-adjusted) | 4.9% | 0.70 | 45% |
Garman-Klass | 5.7% | 0.93 | 43% |
Garman-Klass (jump-adjusted) | 5.0% | 0.70 | 44% |
Garman-Klass (original) | 5.7% | 0.93 | 43% |
Garman-Klass (original, jump-adjusted) | 5.0% | 0.70 | 44% |
Rogers-Satchell | 6.1% | 0.88 | 42% |
Rogers-Satchell (jump-adjusted) | 5.2% | 0.68 | 43% |
Yang-Zhang | 5.1% | 0.69 | 44% |
Lyocsa-Plihal-Vyrost | 5.7% | 0.92 | 43% |
A couple of remarks:
As an empirical conclusion, it is disappointing that the naive monthly volatility forecasts produced by range-based volatility estimators have about the same predictive power as the forecasts produced by the close-to-close volatility estimator. Nevertheless, because these forecasts are much less biased than their close-to-close counterparts, they still represent an improvement for the many investors who currently rely on close prices only^{45}.
Also worth noting, similar to one of the conclusions of Lyocsa et al.^{34}, is that the Lyocsa-Plihal-Vyrost volatility estimator should probably be preferred to the Parkinson, the Garman-Klass or the Rogers-Satchell volatility estimators, because using only one range-based estimator has occasionally led to very inaccurate forecasts, which could successfully be avoided by using the average of the three range-based estimators^{34}.
One aspect of range-based volatility estimators not discussed in this blog post is their capability to capture important stylized facts about asset returns^{46}.
This, together with possible ways to incorporate them in more predictive volatility models than the random walk model, will be the subject of future blog posts.
Meanwhile, for more volatile discussions, feel free to connect with me on LinkedIn or to follow me on Twitter.
–
As well as correlation forecasts. ↩
See Parkinson, Michael H., The Extreme Value Method for Estimating the Variance of the Rate of Return, The Journal of Business 53 (1980), 61-65, which is the final version of the working paper The random walk problem: extreme value method for estimating the variance of the displacement (diffusion constant) started 4 years before. ↩ ↩^{2} ↩^{3} ↩^{4}
Because the range of prices of an asset over a given time period is contained, by definition, within its highest and its lowest price. ↩
See Sepp, Artur, Volatility Modelling and Trading. Global Derivatives Workshop Global Derivatives Trading & Risk Management, Budapest, 2016. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8}
At the date of publication of this post. ↩
Other working assumptions are also commonly made, like assuming that the asset does not pay dividends, assuming that the volatility coefficient $\sigma$ remains constant, assuming that the geometric Brownian motion model also applies during time periods with no trading activity (e.g., stock market closure), etc. ↩ ↩^{2}
In detail, the geometric Brownian motion assumption differs slightly between authors; for example, Garman and Klass^{17} assume that asset prices follow a more generic diffusion process, which includes the geometric Brownian motion as a specific case. ↩
$\sigma$ is also called the diffusion coefficient of the geometric Brownian motion, but in the context of this blog post, I think it is clearer to explicitly call it the volatility coefficient. ↩
See Andersen, T., Bollerslev, T., Diebold, F., & Labys, P. (2003). Modeling and forecasting realized volatility. Econometrica, 71, 579–625. ↩
In practice, a time period $t$ usually corresponds to a trading day, a week or a month, so that the closing prices $C_t, t=1..T$ are simply the daily, weekly or monthly closing prices of the asset. ↩ ↩^{2}
These estimators are biased, due to Jensen’s inequality; c.f. also Molnar^{46}. ↩ ↩^{2} ↩^{3}
See Yang, D., and Q. Zhang, 2000, Drift-Independent Volatility Estimation Based on High, Low, Open, and Close Prices, Journal of Business 73:477–491. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6}
The asset closing price $C_t, t=1..T$. ↩
See French, K. R., Schwert, G. W., & Stambaugh, R. F. (1987). Expected stock returns and volatility. Journal of Financial Economics, 19, 3–29. ↩
See L. C. G. Rogers, S. E. Satchell & Y. Yoon (1994) Estimating the volatility of stock prices: a comparison of methods that use high and low prices, Applied Financial Economics, 4:3, 241-247. ↩ ↩^{2}
See Colin Bennett, Trading Volatility, Correlation, Term Structure and Skew. ↩ ↩^{2}
See Garman, M. B., and M. J. Klass, 1980, On the Estimation of Security Price Volatilities from Historical Data, Journal of Business 53:67–78. ↩ ↩^{2} ↩^{3}
More precisely, Garman and Klass^{17} establish that a variation of $\sigma_{GK}$ is the “best” reasonable estimator but note that $\sigma_{GK}$ is 1) more practical and 2) as efficient as this variation, which I will call the original Garman-Klass volatility estimator $\sigma_{GKo}$. ↩ ↩^{2}
See L. C. G. Rogers and S. E. Satchell, Estimating Variance From High, Low and Closing Prices, The Annals of Applied Probability, Vol. 1, No. 4 (Nov., 1991), pp. 504-512. ↩ ↩^{2} ↩^{3}
Such a decrease in efficiency cannot be avoided because the Rogers-Satchell estimator belongs to the class of estimators studied in Garman and Klass^{17}, so that its efficiency is necessarily smaller than that of the Garman-Klass estimator (maximal by definition). ↩
Garman and Klass^{17} provide a volatility estimator that takes into account opening jumps, but this estimator has a dependency on an unknown $f$ parameter which makes it unusable in practice; Yang and Zhang^{12} show that this dependency is actually spurious and provide a usable form of this estimator. ↩
When the time periods $t$ are measured in trading days, opening jumps are called overnight jumps. ↩
A single-period volatility estimator is a volatility estimator that can be used to estimate the volatility of an asset over a single time period $t$ using price data for this time period only; for example, the Parkinson, the Garman-Klass and the Rogers-Satchell estimators are single-period estimators while the close-to-close estimators are multi-period estimators. ↩ ↩^{2}
C.f. also Molnar^{46} on this subject. ↩
See Kunitomo, N. (1992). Improving the Parkinson method of estimating security price volatilities. Journal of Business, 65, 295–302. ↩
See Alizadeh, S., Brandt, M. W., and Diebold, F. X., 2002. Range-based estimation of stochastic volatility models. Journal of Finance 57: 1047-1091. ↩
See Meilijson, I. (2011). The Garman–Klass Volatility Estimator Revisited. REVSTAT-Statistical Journal, 9(3), 199–212. ↩
See Jacob, J. and Vipul, (2008), Estimation and forecasting of stock volatility with range-based estimators. J. Fut. Mark., 28: 561-581. ↩ ↩^{2} ↩^{3}
See Mapa, Dennis S., 2003. A Range-Based GARCH Model for Forecasting Volatility, MPRA Paper 21323, University Library of Munich, Germany. ↩
See Chou, R.Y. (2005). Forecasting Financial Volatilities with Extreme Values: The Conditional Autoregressive Range (CARR) Model. Journal of Money Credit and Banking, 37(3): 561-582. ↩
See Harris, R. D. F., & Yilmaz, F. (2010). Estimation of the conditional variance–covariance matrix of returns using the intraday range. International Journal of Forecasting, 26, 180–194. ↩
See Shu, J. and Zhang, J.E. (2006), Testing range estimators of historical volatility. J. Fut. Mark., 26: 297-313. ↩ ↩^{2}
See Brandt, Michael W. and Kinlay, J, Estimating Historical Volatility (March 10, 2005). ↩ ↩^{2} ↩^{3}
See Lyocsa S, Plihal T, Vyrost T. FX market volatility modelling: Can we use low-frequency data? Financ Res Lett. 2021 May;40:101776. doi: 10.1016/j.frl.2020.101776. Epub 2020 Sep 30. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5}
See Patton A.J., Sheppard K. Optimal combinations of realised volatility estimators. Int. J. Forecast. 2009;25(2):218–238. ↩
The Yang-Zhang volatility estimator is excluded to avoid mixing jump-adjusted volatility estimators with non-jump-adjusted ones. ↩
(Adjusted) prices have been retrieved using Tiingo. ↩ ↩^{2} ↩^{3}
The jump-adjusted Yang-Zhang volatility estimator, as well as the close-to-close volatility estimators, require the closing price of the last day of the previous month as an additional price. ↩ ↩^{2} ↩^{3}
This period more or less matches with the period used in Sepp^{4}. ↩
The next month’s close-to-close volatility is then taken as a proxy for the next month’s realized volatility; this choice is important, because different proxies might result in different conclusions as to the out-of-sample forecast performances. ↩
See Butler, Adam and Philbrick, Mike and Gordillo, Rodrigo and Varadi, David, Adaptive Asset Allocation: A Primer. ↩
See Mincer, J. and V. Zarnowitz (1969). The evaluation of economic forecasts. In J. Mincer (Ed.), Economic Forecasts and Expectations. ↩
This is due to slight differences in methodology, with mainly 1) the definition of “monthly volatility” in Sepp^{4} taken to be the volatility from the 3rd Friday of a month to the 3rd Friday of the next month and 2) the usage in Sepp^{4} of a linear regression model robust to outliers. ↩
The common ending price history of all the ETFs is 31 August 2023, but there is no common starting price history, as all ETFs started trading on different dates. ↩
For example, for all investors running some kind of monthly tactical asset allocation strategy. ↩
See Peter Molnar, Properties of range-based volatility estimators, International Review of Financial Analysis, Volume 23, 2012, Pages 20-29. ↩ ↩^{2} ↩^{3}
This methodology makes it easy to model known unknowns when designing stress testing scenarios, but falls short with unknown unknowns, that is, completely unanticipated correlation breakdowns. Indeed, by definition, these cannot be represented by an a-priori correlation matrix toward which a baseline correlation matrix could be shrunk^{1}…
In this blog post, I will describe another approach that can be used instead in this case, based on random perturbations of a baseline correlation matrix.
As an example of application, I will show how to identify extreme correlation stress scenarios through direct and reverse correlation stress testing.
Notes:
The main reference for this post is a presentation from Opdyke^{2} at the QuantMinds International 2020 event.
As a general reminder, a square matrix $C \in \mathcal{M} \left( \mathbb{R}^{n \times n} \right)$ is a (valid) correlation matrix if and only if
A correlation matrix is a real symmetric matrix.
Thus, from standard linear algebra, any correlation matrix $C$ is diagonalizable by an orthogonal matrix and can be decomposed as a product
\[C = P \Lambda P^{-1}\], where:
This decomposition is called the eigendecomposition of the correlation matrix $C$.
Rapisarda et al.^{3} establish that any correlation matrix $C \in \mathcal{M}(\mathbb{R}^{n \times n})$ can be decomposed as a product
\[C = B B {}^t\], where $B \in \mathcal{M}(\mathbb{R}^{n \times n})$ is a lower triangular matrix defined by
\[b_{i,j} = \begin{cases} \cos \theta_{i,1}, \textrm{for } j = 1 \newline \cos \theta_{i,j} \prod_{k=1}^{j-1} \sin \theta_{i,k}, \textrm{for } 2 \leq j \leq i-1 \newline \prod_{k=1}^{i-1} \sin \theta_{i,k}, \textrm{for } j = i \newline 0, \textrm{for } i+1 \leq j \leq n \end{cases}\]with:
This decomposition is called the hypersphere decomposition, or the triangular angles parametrization^{4}, of the correlation matrix $C$ and is detailed in the previous post of this series.
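Because a correlation matrix has a unit diagonal, the rows of its Cholesky factor have unit Euclidean norm, so a matrix $B$ satisfying the decomposition can be obtained in practice from a standard Cholesky factorization. A quick numerical check, with a made-up 3x3 correlation matrix:

```python
import numpy as np

# made-up 3x3 correlation matrix for illustration
C = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])

B = np.linalg.cholesky(C)  # lower triangular, with C = B B^t

# each row of B lies on the unit sphere, as required by the
# hypersphere (triangular angles) parametrization
row_norms = np.sqrt((B**2).sum(axis=1))

# e.g. the first angle of row 2: cos(theta_{2,1}) = b_{2,1}
theta_2_1 = np.arccos(B[1, 0])
```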
A random perturbation of a baseline correlation matrix $C$ can be loosely defined as a correlation matrix $\widetilde{C}$ generated “at random” whose correlation coefficients are more or less “close” to those of $C$.
From Opdyke’s^{2} extensive literature review, there are three main^{5} families of methods to generate random perturbations of a correlation matrix:
The first family of methods to randomly perturb a correlation matrix is based on random perturbations of its coefficients.
The most natural method to randomly perturb the coefficients of a correlation matrix simply consists in … randomly perturbing these coefficients!
Unfortunately, this method does not work in general, because the resulting randomly perturbed correlation matrix is almost never a valid correlation matrix due to the lack of positive semi-definiteness.
To illustrate this problem, let’s take a Harry Browne permanent portfolio à la ReSolve, equally invested in:
The correlations of these assets over the period 18 November 2004 - 11 August 2023 are displayed in Figure 1, adapted from Portfolio Visualizer.
Before thinking about perturbing all these correlations, let’s assume that we would merely like to perturb the U.S. stock-bond correlation so as to bring it to a level representative of the pre-2000 period, like 0.5 or above, c.f. Figure 2 reproduced from Brixton et al.^{6}.
It turns out that this single perturbation already results in an invalid correlation matrix^{7}!
As a consequence, trying to perturb the coefficients of a correlation matrix both simultaneously and at random has little chance to produce a valid correlation matrix in general, especially as the number of assets increases.
One solution to this issue is to replace the randomly perturbed correlation matrix by its nearest valid correlation matrix^{8}, c.f. the post When a Correlation Matrix is not a Correlation Matrix: the Nearest Correlation Matrix Problem.
This leads to the following naive method to generate random perturbations of a correlation matrix $C \in \mathcal{M} \left( \mathbb{R}^{n \times n} \right)$:
While straightforward to implement^{9}, this method has several limitations:
It requires the computation of the nearest correlation matrix to every randomly perturbed correlation matrix generated
Such systematic computation is expensive.
It usually^{10} generates randomly perturbed correlation matrices that are singular
This is because standard algorithms to compute the nearest correlation matrix, like Higham’s alternating projections algorithm^{8}, output a singular correlation matrix^{11}.
It provides no guarantee on the magnitude or on the distribution of the perturbations
Due to the nearest correlation matrix step #3, it seems actually rather difficult to control either the magnitude or the probability distribution of the perturbations $ \left | C_{i,j} - \widehat{C}_{i,j} \right | $, $i=1..n$, $j=i+1..n$.
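The naive method can be sketched in Python as follows; note that, for brevity, the nearest correlation matrix step #3 is replaced here by a crude eigenvalue clipping followed by a diagonal rescaling, which is NOT Higham's alternating projections algorithm, and that the noise scale and the function name are illustrative choices of mine:

```python
import numpy as np

def naive_random_perturbation(C, scale, seed=None):
    """Naively perturb a correlation matrix at random, then crudely repair it."""
    rng = np.random.default_rng(seed)
    n = C.shape[0]
    # steps #1-#2: symmetric random noise on the off-diagonal coefficients
    noise = rng.normal(0.0, scale, (n, n))
    noise = (noise + noise.T) / 2.0
    np.fill_diagonal(noise, 0.0)
    P = np.clip(C + noise, -1.0, 1.0)
    # step #3 (simplified stand-in): clip negative eigenvalues to restore
    # positive semi-definiteness, then rescale to a unit diagonal
    w, V = np.linalg.eigh(P)
    P = V @ np.diag(np.clip(w, 0.0, None)) @ V.T
    d = np.sqrt(np.diag(P))
    P = P / np.outer(d, d)
    np.fill_diagonal(P, 1.0)
    return P
```

Even this simplified repair step illustrates the last limitation above: the perturbations actually applied to the coefficients are no longer those that were drawn at random.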
Hardin et al.^{12} introduce another method to randomly perturb the coefficients of a correlation matrix, relying on the dot product of normalized [independent gaussian random vectors]^{12} as random perturbations.
One of the many advantages of this method compared to the naive method previously described is that the resulting randomly perturbed correlation matrix is a valid correlation matrix by construction, which makes it possible to bypass the nearest correlation matrix step #3.
In detail, given a baseline correlation matrix $C \in \mathcal{M} \left( \mathbb{R}^{n \times n} \right)$, Hardin et al.’s method to generate random perturbations of $C$ works as follows:
Select a maximum noise level $\epsilon_{max}$ such that $0 < \epsilon_{max} < \lambda_{n}$, where $\lambda_{n}$ is the smallest eigenvalue of $C$
$\epsilon_{max}$ controls the magnitude of the generated perturbations.
Select the dimension $m \geq 1$ of what is called the noise space in Hardin et al.^{12}
$m$ influences the distributional characteristics of the random perturbations, as depicted in Figure 3 adapted from Hardin et al.^{12} on which it is visible that:
Compute the randomly perturbed correlation matrix $\widetilde{C}$ as $\widetilde{C} = C + \epsilon_{max} \left( U{}^t U - I_n \right)$, where $I_n \in \mathcal{M} \left( \mathbb{R}^{n \times n} \right)$ is the identity matrix of order $n$
The definition of $\widetilde{C}$ ensures that the perturbations are bounded by the maximum noise level $\epsilon_{max}$, i.e., $\left | C_{i,j} - \widetilde{C}_{i,j} \right | \leq \epsilon_{max} $, $i=1..n$, $j=i+1..n$.
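The steps above can be sketched in a few lines of Python; the matrix $U$ below is the $m \times n$ matrix whose columns are independent Gaussian vectors normalized to unit length, so that $U {}^t U$ has a unit diagonal (the baseline matrix is a hypothetical, mildly correlated example):

```python
import numpy as np

def hardin_perturbation(C, eps_max, m=3, rng=None):
    # C_tilde = C + eps_max * (U^t U - I_n), where the n columns of U are
    # independent m-dimensional Gaussian vectors normalized to unit length,
    # so that U^t U has a unit diagonal and C_tilde remains a valid
    # correlation matrix as long as eps_max < lambda_n.
    rng = np.random.default_rng(rng)
    n = C.shape[0]
    lambda_n = np.linalg.eigvalsh(C).min()
    assert 0 < eps_max < lambda_n, "eps_max must lie strictly below the smallest eigenvalue of C"
    U = rng.standard_normal((m, n))
    U /= np.linalg.norm(U, axis=0)           # unit-norm columns
    return C + eps_max * (U.T @ U - np.eye(n))

# Hypothetical baseline (eps_max must be below the smallest eigenvalue of C)
C = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.4, 1.0]])
C_tilde = hardin_perturbation(C, eps_max=0.2, m=3, rng=0)
```

Validity follows from writing $\widetilde{C} = \left( C - \epsilon_{max} I_n \right) + \epsilon_{max} U {}^t U$, a sum of two positive semi-definite matrices with a unit diagonal overall.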
Whenever possible, Hardin et al.’s method should be used (it is computationally cheap and offers control over both the magnitude and the distribution of the perturbations…), although it suffers from two major limitations:
It is not applicable to correlation matrices singular or close to singular
This is due to the condition on the maximum noise level $\epsilon_{max}$ in step #1 and is regrettably a problem for applications in finance^{13}, because as highlighted in Opdyke^{2}:
correlation matrices estimated on large portfolios often (perhaps usually) are not positive definite for a wide range of reasons, and once positive definiteness is enforced using reliable, proven methods […], the smallest eigenvalue of the resulting matrix is almost always virtually zero.
It might not be applicable to a specific correlation matrix, even one not remotely close to singular
This is again due to the condition on the maximum noise level $\epsilon_{max}$ in step #1.
For instance, in the case of Harry Browne’s permanent portfolio introduced in the previous sub-section, Hardin et al.’s method cannot be used to perturb the coefficients of the asset correlation matrix represented in Figure 1 by more than +/- 0.25^{14}, and in particular, cannot be used to generate perturbed U.S. stock-bond correlations higher than -0.08^{15}!
The second family of methods to randomly perturb a correlation matrix is based on random perturbations of its eigenvalues.
A representative member of this family is the following method, with $C \in \mathcal{M} \left( \mathbb{R}^{n \times n} \right)$ a baseline correlation matrix to be perturbed:
Generate $n$ randomly perturbed eigenvalues $\widetilde{\lambda}_i \geq 0$ satisfying $\sum_{i=1}^n \widetilde{\lambda}_i = n$ around the baseline eigenvalues $\lambda_i$, $i=1..n$
Galeeva et al.^{4} describe several algorithms and associated probability distributions that can be used in this step.
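As an illustration, here is one simple, deliberately naive Python variant of such a method; the uniform noise distribution, the clipping at zero and the final diagonal renormalization are my own choices, not the algorithms of Galeeva et al.^{4}:

```python
import numpy as np

def eigenvalue_perturbation(C, eps=0.1, rng=None):
    # Perturb the eigenvalues, clip at zero, rescale so they sum to n,
    # rebuild the matrix with the original eigenvectors, and renormalize
    # the diagonal (this last step is a shortcut to recover a unit
    # diagonal and slightly distorts the perturbed spectrum).
    rng = np.random.default_rng(rng)
    n = C.shape[0]
    w, V = np.linalg.eigh(C)
    w_tilde = np.maximum(w + rng.uniform(-eps, eps, size=n), 0.0)
    w_tilde *= n / w_tilde.sum()             # keep the trace equal to n
    C_tilde = (V * w_tilde) @ V.T
    d = np.sqrt(np.diag(C_tilde))
    return C_tilde / np.outer(d, d)          # enforce a unit diagonal

# Hypothetical baseline correlation matrix
C = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.4, 1.0]])
C_tilde = eigenvalue_perturbation(C, eps=0.1, rng=0)
```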
Any method from this family guarantees, in theory, the validity of the resulting randomly perturbed correlation matrix.
Nevertheless, in practice, Opdyke^{2} notes that:
perturbing eigenvalues fails under challenging empirical conditions, e.g. when the positive definiteness of the matrix has to be enforced algorithmically […] and eigenvalues are virtually zero (or at least unreliably estimated)
In addition, controlling the $\frac{n (n-1)}{2}$ perturbations $\left | C_{i,j} - \widetilde{C}_{i,j} \right |$, $i=1..n$, $j=i+1..n$, which is ultimately what matters, through the $n$ perturbations $\left| \lambda_i - \widetilde{\lambda}_i \right|$, $i=1..n$, seems rather difficult.
For these reasons, this family of methods might not be the first choice to generate random perturbations of a correlation matrix.
The third and last family of methods to randomly perturb a correlation matrix is based on random perturbations of its correlative angles.
Here, a representative member of this family is the following method, with $C \in \mathcal{M} \left( \mathbb{R}^{n \times n} \right)$ a baseline correlation matrix to be perturbed:
Any method from this family again guarantees, in theory, the validity of the resulting randomly perturbed correlation matrix.
This time, though, theory seems to be confirmed in practice:
One important remark at this stage is that the exact algorithms and associated probability distributions used in step #2 greatly influence the behavior of this family of methods.
For reference, Opdyke^{2} proposes an algorithm called Cosecant, Cotangent, Cotangent (C3) able to generate a distribution of correlative angles median-centered on the baseline correlative angles and satisfying many other desirable properties^{17}.
This algorithm generates a randomly perturbed correlative angle $\widetilde{\theta}_{i,j}$ around a baseline correlative angle $\theta_{i,j}$, $i=1..n$, $j=1..i-1$, as follows:
This family of methods is particularly well-suited to what is called generalized (correlation) stress testing in Opdyke^{2}.
Still, like the family of methods based on random perturbations of the eigenvalues of a correlation matrix, one limitation of this family of methods is that controlling the $\frac{n (n-1)}{2}$ perturbations $\left | C_{i,j} - \widetilde{C}_{i,j} \right |$, $i=1..n$, $j=i+1..n$ once again seems rather difficult.
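Before moving on, here is an illustrative Python sketch of the general recipe behind this family of methods, based on the hypersphere decomposition $C = B B {}^t$ of Rapisarda et al.; the plain uniform noise on the angles is a placeholder and is emphatically not Opdyke's C3 algorithm^{2}:

```python
import numpy as np

def to_angles(C):
    # Hypersphere decomposition C = B B^t, with B = chol(C) and each row
    # of B parameterized by correlative angles theta_{i,j}, j < i.
    B = np.linalg.cholesky(C)
    n = C.shape[0]
    theta = np.zeros((n, n))
    for i in range(1, n):
        prod_sin = 1.0
        for j in range(i):
            theta[i, j] = np.arccos(np.clip(B[i, j] / prod_sin, -1.0, 1.0))
            prod_sin *= np.sin(theta[i, j])
    return theta

def from_angles(theta):
    # Rebuild B from the angles; B B^t is PSD with a unit diagonal by
    # construction, hence a valid correlation matrix.
    n = theta.shape[0]
    B = np.zeros((n, n))
    B[0, 0] = 1.0
    for i in range(1, n):
        prod_sin = 1.0
        for j in range(i):
            B[i, j] = np.cos(theta[i, j]) * prod_sin
            prod_sin *= np.sin(theta[i, j])
        B[i, i] = prod_sin
    return B @ B.T

def angle_perturbation(C, eps=0.05, rng=None):
    # Placeholder sampler (NOT Opdyke's C3): uniform noise on the angles,
    # clipped to (0, pi) so the rebuilt matrix stays valid.
    rng = np.random.default_rng(rng)
    theta = to_angles(C)
    noise = rng.uniform(-eps, eps, size=theta.shape)
    theta_tilde = np.where(theta > 0.0,
                           np.clip(theta + noise, 1e-6, np.pi - 1e-6),
                           0.0)
    return from_angles(theta_tilde)

# Hypothetical baseline correlation matrix
C = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.4, 1.0]])
C_tilde = angle_perturbation(C, eps=0.05, rng=0)
```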
Portfolio Optimizer makes it possible to generate random perturbations of a baseline correlation matrix with:
The naive method of randomly perturbing the coefficients of a correlation matrix
Once a (potentially invalid) randomly perturbed correlation matrix is generated on the client side, the endpoint /assets/correlation/matrix/nearest can be used to compute the nearest correlation matrix to this matrix.
The method of randomly perturbing the correlative angles of a correlation matrix
Together with Opdyke’s C3 algorithm^{2}, through the endpoint /assets/correlation/matrix/perturbed
Together with a proprietary algorithm able to control the magnitude of the perturbations of the correlation coefficients, again through the endpoint /assets/correlation/matrix/perturbed
In this case, the distribution of the randomly perturbed correlation matrices is asymptotically uniform over the space of positive definite correlation matrices whose max-norm distance to the baseline correlation matrix is at most (resp. exactly) equal to a given maximum (resp. exact) noise level, similar in spirit to the method of Hardin et al.^{12}
Suppose that we are managing Harry Browne’s permanent portfolio introduced earlier.
Suppose also that on 18 February 2020, we feel something is off and would like to assess the impact of a potential correlation breakdown on this portfolio.
Because this potential correlation breakdown could manifest in many ways (increased correlations between certain ETFs, decreased correlation between other ETFs…), it would be a mistake to impose any prior on how correlations should behave or should not behave^{19}.
So, what could we do?
Following the previous sections, one possibility is to generate random perturbations around the current correlation matrix of the ETFs in the portfolio, allowing us to simulate many potential correlation breakdowns in a prior-free way.
Once this is done, it will then be possible to evaluate the portfolio sensitivity to these random shocks.
Such a direct (correlation) stress testing procedure makes it possible to catch difficult-to-anticipate and/or difficult-to-quantify second and third order effects of a large, multivariate, impactful scenario (e.g. pandemic + economic upheaval)^{2}.
In order to apply this procedure to the portfolio at hand, three prerequisites are necessary:
Estimating the current correlation matrix $C_{PP}$ of the ETFs in the portfolio
I will estimate $C_{PP}$ as the correlation matrix of the four ETFs in the portfolio^{20} over the 24-day period 14 January 2020 - 18 February 2020^{21}, which gives
\[C_{PP} \approx \begin{pmatrix} 1 & -0.81 & -0.82 & -0.65 \\ -0.81 & 1 & 0.84 & 0.70 \\ -0.82 & 0.84 & 1 & 0.75 \\ -0.65 & 0.70 & 0.75 & 1 \end{pmatrix}\]
Selecting a method to randomly perturb the current correlation matrix $C_{PP}$
I will generate random perturbations of $C_{PP}$ thanks to Opdyke’s C3 algorithm^{2} as implemented through the Portfolio Optimizer endpoint /assets/correlation/matrix/perturbed.
Determining how to evaluate the portfolio sensitivity to the random perturbations of the current correlation matrix $C_{PP}$
To keep things simple, I will evaluate the portfolio effective number of bets^{22} (ENB), using the Portfolio Optimizer endpoint /portfolio/analysis/effective-number-of-bets.
With these prerequisites met, it is possible to generate random perturbations around the current correlation matrix $C_{PP}$ and compute the corresponding ENB distribution.
An example of ENB distribution is provided in Figure 5, in the case of 10000 randomly perturbed correlation matrices.
Some associated summary statistics:
Statistic | Value |
---|---|
Mean | 1.89 |
Standard deviation | 0.42 |
Minimum | 1.01 |
5% percentile | 1.33 |
25% percentile | 1.60 |
Median | 1.81 |
75% percentile | 2.12 |
95% percentile | 2.72 |
Maximum | 3.98 |
And, for reference, the value of the current ENB of the portfolio, computed with the current correlation matrix $C_{PP}$: 1.87.
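For the curious, here is a compact Python sketch of an ENB computation in the spirit of footnote^{22} (PCA factors, and the correlation matrix used in place of the covariance matrix); the exact conventions used by Portfolio Optimizer may differ:

```python
import numpy as np

def effective_number_of_bets(w, Sigma):
    # Decompose the portfolio variance over the principal portfolios (PCA
    # factors) and exponentiate the entropy of the variance shares.
    lam, E = np.linalg.eigh(Sigma)
    lam = np.clip(lam, 0.0, None)            # guard against rounding noise
    v = (E.T @ w) ** 2 * lam                 # variance contribution of each principal portfolio
    p = v / v.sum()
    p = p[p > 0.0]
    return float(np.exp(-np.sum(p * np.log(p))))

C_pp = np.array([[ 1.00, -0.81, -0.82, -0.65],
                 [-0.81,  1.00,  0.84,  0.70],
                 [-0.82,  0.84,  1.00,  0.75],
                 [-0.65,  0.70,  0.75,  1.00]])
w = np.full(4, 0.25)                         # equally-weighted permanent portfolio
enb = effective_number_of_bets(w, C_pp)
```

By construction, the ENB lies between 1 (a single effective bet) and the number of assets (fully independent bets).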
A couple of comments:
More than half of the ENB are located very close^{23} to the current ENB (1.87)
These ENB are not representative of any real correlation breakdown.
The 95% percentile of the ENB distribution (2.72) is much further apart from the current ENB (1.87) than the 5% percentile of the ENB distribution (1.33)
This means that a correlation breakdown with the biggest impact on the ENB would correspond, maybe counter-intuitively, to a scenario of de-correlation^{24} of the four ETFs in the portfolio.
As a side note, and again maybe counter-intuitively, the impact of such a correlation breakdown would then be rather harmless, because an increase in ENB is usually desirable from a portfolio diversification perspective.
The minimum (1.01) and maximum (3.98) ENB both correspond to the theoretical minimum (1) and maximum (4) ENB
This shows that all possible (correlation) unknown unknowns have been covered by the stress testing procedure.
In the previous sub-section, we have (empirically) established that the most impactful correlation breakdown scenario for the ENB of the portfolio corresponds to a de-correlation of the ETFs.
The next logical step is now to compute a correlation matrix that would somehow best illustrate this de-correlation scenario, a procedure known as reverse (correlation) stress testing.
For this, inspired by the concept of market states from Stepanov et al^{25}, I propose to apply a k-means clustering algorithm^{26}, with $k = 2$, to the randomly perturbed correlation matrices generated during the direct stress testing procedure.
One output of this algorithm is two “representative” correlation matrices^{27}, which are, in the case of the 10000 randomly perturbed correlation matrices of the previous sub-section:
A correlation matrix $\widetilde{C}_{PP,1}$ “representative” of all the randomly perturbed correlation matrices that are “maximally similar” to the current correlation matrix $C_{PP}$
\[\widetilde{C}_{PP,1} \approx \begin{pmatrix} 1 & -0.76 & -0.78 & -0.67 \\ -0.76 & 1 & 0.75 & 0.66 \\ -0.78 & 0.75 & 1 & 0.67 \\ -0.67 & 0.66 & 0.67 & 1 \end{pmatrix}\]
A correlation matrix $\widetilde{C}_{PP,2}$ “representative” of all the randomly perturbed correlation matrices that are “maximally dissimilar” from $\widetilde{C}_{PP,1}$
\[\widetilde{C}_{PP,2} \approx \begin{pmatrix} 1 & -0.64 & -0.64 & -0.12 \\ -0.64 & 1 & 0.53 & 0.07 \\ -0.64 & 0.53 & 1 & 0.06 \\ -0.12 & 0.07 & 0.06 & 1 \end{pmatrix}\]
Here, “representative”, “maximally similar” and “maximally dissimilar” are loosely defined but usually correspond to intuition^{28}.
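A minimal Python sketch of this clustering step; since the 10000 perturbed matrices cannot be reproduced here, the input is a stand-in sample of correlation matrices of random Gaussian data:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n = 4
# Stand-in sample: sample correlation matrices of random Gaussian data
# (in the post, the input is the 10000 randomly perturbed matrices)
mats = [np.corrcoef(rng.standard_normal((n, 60))) for _ in range(200)]

iu = np.triu_indices(n, k=1)
X = np.array([M[iu] for M in mats])          # one row per matrix: upper-triangular coefficients

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

def centroid_to_matrix(c):
    # Cluster centroids are not guaranteed to be valid correlation matrices
    # (c.f. the k-medoids / nearest correlation matrix remark in the notes)
    M = np.eye(n)
    M[iu] = c
    return M + M.T - np.eye(n)

C_rep_1, C_rep_2 = (centroid_to_matrix(c) for c in km.cluster_centers_)
```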
In terms of market states^{25}:
The correlation matrix $\widetilde{C}_{PP,1}$ embodies the current market state
Indeed, $\widetilde{C}_{PP,1}$ is very close to $C_{PP}$, as confirmed by the small Frobenius distance between these two matrices (0.21).
In the current market state, the ENB is concentrated^{29} around the current ENB of the portfolio (1.87).
The correlation matrix $\widetilde{C}_{PP,2}$ embodies a market state maximally distinct from the current market state, which I will call the de-correlation market state
The rationale for this name is that a comparison between $C_{PP}$ and $\widetilde{C}_{PP,2}$ shows that this second market state corresponds to a de-correlation of the four ETFs in the portfolio^{30}.
In the de-correlation market state, the ENB is much higher than in the current market state, with for example the ENB computed with the correlation matrix $\widetilde{C}_{PP,2}$ equal to 2.97, well above the 95% percentile of the ENB distribution (2.72).
Based on these observations, it is possible to conclude that $\widetilde{C}_{PP,2}$ is the correlation matrix that best illustrates the most impactful correlation breakdown scenario for the ENB of the portfolio.
I will conclude this example on generalized stress testing with a reality check on the results obtained in the previous sub-sections.
The correlation matrix $C_{PP, COVID}$ below is the correlation matrix of the four ETFs in the portfolio^{20} over the subsequent 24-day “full crisis” period 19 February 2020 - 23 March 2020^{21}.
\[C_{PP, COVID} \approx \begin{pmatrix} 1 & -0.50 & -0.40 & 0.00 \\ -0.50 & 1 & 0.71 & 0.25 \\ -0.40 & 0.71 & 1 & 0.19 \\ 0.00 & 0.25 & 0.19 & 1 \end{pmatrix}\]
Of particular interest:
In other words:
The possibility to generate random perturbations of a correlation matrix has many other applications in risk management and even beyond.
As an example, in mean-variance optimization, the resampled efficient frontier is partially based on random perturbations of a baseline correlation matrix.
Also, as a last remark on Opdyke’s C3 algorithm^{2}, a fully nonparametric version of it is described on Opdyke’s website.
This extended version, called Nonparametric Angles-based Correlation (NAbC), covers not only correlation matrices based on any underlying data distributions^{31} but also correlation matrices beyond the standard Pearson’s correlation matrix, like Spearman’s Rho correlation matrix or Kendall’s Tau correlation matrix.
For more random quantitative discussions, feel free to connect with me on LinkedIn or to follow me on Twitter.
–
Otherwise, these unknown unknowns would become known unknowns! ↩
See Opdyke, JD, Full Probabilistic Control for Direct and Robust, Generalized and Targeted Stressing of the Correlation Matrix (Even When Eigenvalues are Empirically Challenging) (May 30, 2020). QuantMinds/RiskMinds September 22-23, 2020. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9} ↩^{10} ↩^{11} ↩^{12}
See Rapisarda, F., Brigo, D. and Mercurio, F. (2007) Parameterizing correlations: a geometric interpretation, IMA Journal of Management Mathematics, 18(1), pp. 55–73. ↩
See Risk Management in Commodity Markets: From Shipping to Agriculturals and Energy, Chapter 6, Roza Galeeva, Jiri Hoogland, and Alexander Eydeland, Measuring Correlation Risk for Energy Derivatives. ↩ ↩^{2} ↩^{3} ↩^{4}
Of course, many other methods exist; for example, if the data generating process is known, it is possible to use a Monte-Carlo method to generate random samples from this process and compute their associated (sample) correlation matrix, which is then a perturbed version of the original correlation matrix. ↩
See A Changing Stock–Bond Correlation: Drivers and Implications, Alfie Brixton, Jordan Brooks, Pete Hecht, Antti Ilmanen, Thomas Maloney, Nicholas McQuinn, The Journal of Portfolio Management, Multi-Asset Special Issue 2023, 49 (4) 64 - 80. ↩
I’ll skip the math, but the interested reader can for example compute the eigenvalues of the asset correlation matrix represented in Figure 1 with the U.S. stock-bond correlation altered from -0.33 to 0.5. ↩
Nicholas J. Higham, Computing the Nearest Correlation Matrix—A Problem from Finance, IMA J. Numer. Anal. 22, 329–343, 2002. ↩ ↩^{2}
Assuming that an algorithm to compute the nearest correlation matrix is available; otherwise, this method becomes immediately less straightforward to implement… ↩
Except if the initial randomly perturbed correlation matrices are actually valid, non-singular, correlation matrices. ↩
It is sometimes possible, though, to integrate an additional constraint on the minimum eigenvalue of the computed nearest valid correlation matrix into these algorithms. ↩
See Hardin, Johanna; Garcia, Stephan Ramon; Golan, David. A method for generating realistic correlation matrices. Ann. Appl. Stat. 7 (2013), no. 3, 1733–1762. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5}
This is maybe less of a problem in other applications, like in biology. ↩
Because the smallest eigenvalue of the asset correlation matrix represented in Figure 1 is 0.25. ↩
Similarly, Hardin et al.’s method cannot be used to generate random perturbations of the U.S. stock-bond correlation that would bring this correlation to a level lower than -0.58. ↩
Strictly speaking, when the correlation matrix $C$ is positive semi-definite, its hypersphere decomposition is not unique. ↩
C.f. Opdyke^{2} for the complete list of goals of his proposed approach. ↩
See Enes Makalic & Daniel F. Schmidt (2022) An efficient algorithm for sampling from sink(x) for generating random correlation matrices, Communications in Statistics - Simulation and Computation, 51:5, 2731-2735. ↩
For example, assuming that all correlations would go to one in case of a correlation breakdown is a prior. ↩
More specifically, of the daily arithmetic total returns of the four ETFs in the portfolio, whose prices have been retrieved using Tiingo. ↩ ↩^{2}
I used a 24-day period because the period 19 February 2020 - 23 March 2020, which corresponds to the peak of the COVID financial crisis - c.f. Wikipedia - is also a 24-day period. ↩ ↩^{2}
Due to personal preferences, I will use the effective number of bets based on principal components analysis as the factors extraction method; in addition, I will use the asset correlation matrix as if it were the asset covariance matrix to not introduce any additional variables (volatilities). ↩
More precisely, within a +/- 0.30 interval around 1.87. ↩
Intuitively, the higher the ENB of an equally-weighted portfolio, the more uncorrelated its constituents. ↩
See Stepanov, Y., Rinn, P., Guhr, T., Peinke, J., & Schafer, R. (2015). Stability and hierarchy of quasi-stationary states: financial markets as an example. Journal of Statistical Mechanics: Theory and Experiment, 2015(8), P08011. ↩ ↩^{2}
I used the standard Scikit-Learn $k$-means algorithm. ↩
The k-means algorithm does not guarantee that the cluster centroids are valid correlation matrices; if this is not the case, it is possible to use either the $k$-medoids instead, or to compute the nearest correlation matrices to the cluster centroid. ↩
And more rigorously defined as per the $k$-means algorithm. ↩
To be noted that the ENB computed with the correlation matrix $\widetilde{C}_{PP,1}$ is nearly identical to the ENB computed with the current correlation matrix $C_{PP}$ (1.87). ↩
For example, U.S. stocks and Gold move from anti-correlated (-0.65) to nearly uncorrelated (-0.12). ↩
That is, data distributions characterized by any degree of serial correlation, asymmetry, non-stationarity, and/or heavy-tailedness. ↩
Problem is, the presence of assets whose return histories differ in length makes it nearly impossible to use standard portfolio analysis and optimization methods…
Estimating the historical covariance matrix of a multi-asset portfolio, for example, is not possible when assets have unequal return histories, so that a typical workaround used in practice is to consider only the common returns history. Unfortunately, this workaround has the side effect of discarding information contained in the longer return histories, which might greatly impact the quality of the estimated covariance matrix^{2}.
Sebastien Page proposes a solution to this problem in his paper How to Combine Long and Short Return Histories Efficiently^{3}. It consists in simulating missing asset returns based on the relationships observed between all assets over their common returns history while accounting for the associated estimation error.
In this blog post, I will describe in detail Page’s method and analyze how it behaves empirically with a two-asset class portfolio made of U.S. and E.M. stocks.
Notes:
Let be two groups of assets $X$ and $Y$ such that:
In such a situation, illustrated in Figure 1 adapted from Page^{3}, returns for the group of assets $Y$ are missing for the whole (beginning) returns history $t = 1..L - S$.
Building on the maximum likelihood procedure^{4} described in Stambaugh^{5}, Page^{3} introduces a 3-step method in order to combine [these] long and short return histories efficiently^{3} and backfill the $ m \times \left( L - S \right)$ missing asset returns.
The vector $\hat{\mu}_{Y,L} \in \mathbb{R}^{m}$ of the mean returns of the assets belonging to the group $Y$ is estimated over the long returns history by:
\[\hat{\mu}_{Y,L} = \mu_{Y,S} + \beta \left( \mu_{X,L} - \mu_{X,S} \right)\], with:
The covariance matrix $\hat{\Sigma}_{YY,L} \in \mathcal{M}(\mathbb{R}^{m \times m})$ of the assets belonging to the group $Y$ is estimated over the long returns history by:
\[\hat{\Sigma}_{YY,L} = \Sigma_{YY,S} + \beta \left( \Sigma_{XX,L} - \Sigma_{XX,S} \right) \beta {}^t\], with:
Similarly, the covariance matrix $\hat{\Sigma}_{XY,L} \in \mathcal{M}(\mathbb{R}^{m \times n})$ between the assets belonging to the group $X$ and the assets belonging to the group $Y$ is estimated over the long returns history by:
\[\hat{\Sigma}_{XY,L} = \Sigma_{XY,S} + \beta \left( \Sigma_{XX,L} - \Sigma_{XX,S} \right)\]
Once the long mean vector and covariance matrices have been estimated in steps 1 and 2, it is possible to simulate the missing (multivariate) asset returns $Y_t = \left( Y_{1,t},…,Y_{m,t} \right) {}^t \in \mathbb{R}^{m}$ for $t = 1..L - S$.
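Steps 1 and 2 translate almost directly into Python; the sketch below estimates all quantities with standard sample moments, with $\beta = \Sigma_{XY,S} \Sigma_{XX,S}^{-1}$ the matrix of short-history regression coefficients of $Y$ on $X$ (my reading of Stambaugh^{5}; the example data are hypothetical):

```python
import numpy as np

def stambaugh_estimates(X_long, X_short, Y_short):
    # X_long: (L, n) long-history returns; X_short, Y_short: (S, n), (S, m)
    # returns over the common short history
    m = Y_short.shape[1]
    mu_X_L, mu_X_S = X_long.mean(axis=0), X_short.mean(axis=0)
    mu_Y_S = Y_short.mean(axis=0)
    Sigma_XX_L = np.cov(X_long, rowvar=False)
    Sigma_XX_S = np.cov(X_short, rowvar=False)
    Sigma_YY_S = np.cov(Y_short, rowvar=False)
    S_full = np.cov(np.hstack([Y_short, X_short]), rowvar=False)
    Sigma_XY_S = S_full[:m, m:]                       # Cov(Y, X), of shape (m, n)
    beta = Sigma_XY_S @ np.linalg.inv(Sigma_XX_S)     # short-history regression betas
    mu_Y_L = mu_Y_S + beta @ (mu_X_L - mu_X_S)
    Sigma_YY_L = Sigma_YY_S + beta @ (Sigma_XX_L - Sigma_XX_S) @ beta.T
    Sigma_XY_L = Sigma_XY_S + beta @ (Sigma_XX_L - Sigma_XX_S)
    return mu_Y_L, Sigma_YY_L, Sigma_XY_L, beta

# Hypothetical data: 120 periods of X, of which the last 60 overlap with Y
rng = np.random.default_rng(0)
X_long = 0.04 * rng.standard_normal((120, 2))
X_short = X_long[-60:]
Y_short = 0.05 * rng.standard_normal((60, 2))
mu_Y_L, Sigma_YY_L, Sigma_XY_L, beta = stambaugh_estimates(X_long, X_short, Y_short)
```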
Page^{3} mentions 3 backfilling procedures for doing so, all based on a transformation of the long (multivariate) asset returns $X_t = \left( X_{1,t},…,X_{n,t} \right) {}^t \in \mathbb{R}^{n}$:
Beta adjustment
The beta adjustment backfilling procedure is based on the deterministic transformation:
\[Y_t = \mu_{b_t}\], with $\mu_{b_t} = \hat{\mu}_{Y,L} + \hat{\Sigma}_{XY,L} \Sigma_{XX,L}^{-1} \left( X_t - \mu_{X,L} \right) \in \mathbb{R}^{m}$
The main problem with this procedure is that it gives a false sense of uniqueness for the backfilled asset returns.
Indeed, as Page puts it^{3}:
[…] the solution will not be unique: Many sets of simulated missing returns correspond to a given covariance matrix. This feature of the backfilling process is intuitive because, after all, the missing returns are unknown and so the model must recognize the uncertainty around the estimates.
So, this backfilling procedure is probably best used only for benchmarking purposes.
Conditional sampling
In order to incorporate estimation error into the backfilled asset returns, the conditional sampling backfilling procedure models the missing asset returns with a (multivariate) Gaussian distribution:
\[Y_t \sim \mathcal{N} \left(\mu_{b_t}, \Sigma_b \right)\], with:
Here, Page^{3} notes that at the null noise limit (i.e., $\Sigma_b = 0$), this backfilling procedure becomes equivalent to the beta adjustment backfilling procedure.
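A Python sketch of conditional sampling, assuming $\Sigma_b$ is the residual covariance of $Y$ given $X$, i.e. $\Sigma_b = \hat{\Sigma}_{YY,L} - \hat{\Sigma}_{XY,L} \Sigma_{XX,L}^{-1} \hat{\Sigma}_{XY,L} {}^t$ (my reading of Page^{3}; the numerical inputs are hypothetical):

```python
import numpy as np

def conditional_sampling(X_miss, mu_X_L, Sigma_XX_L, mu_Y_L, Sigma_YY_L, Sigma_XY_L, rng=None):
    # Draw Y_t ~ N(mu_{b_t}, Sigma_b), with mu_{b_t} the conditional mean
    # of Y given X_t and Sigma_b the (time-independent) residual covariance
    rng = np.random.default_rng(rng)
    P = Sigma_XY_L @ np.linalg.inv(Sigma_XX_L)            # (m, n)
    Sigma_b = Sigma_YY_L - P @ Sigma_XY_L.T               # residual covariance
    mu_b = mu_Y_L + (X_miss - mu_X_L) @ P.T               # (L - S, m) conditional means
    A = np.linalg.cholesky(Sigma_b + 1e-12 * np.eye(len(Sigma_b)))
    return mu_b + rng.standard_normal(mu_b.shape) @ A.T

# Hypothetical inputs (n = m = 2); in practice these come from steps 1 and 2
mu_X_L = np.array([0.005, 0.004]); mu_Y_L = np.array([0.006, 0.007])
Sigma_XX_L = np.array([[0.040, 0.010], [0.010, 0.090]])
Sigma_YY_L = np.array([[0.050, 0.010], [0.010, 0.060]])
Sigma_XY_L = np.array([[0.020, 0.010], [0.005, 0.030]])
X_miss = 0.05 * np.random.default_rng(0).standard_normal((10, 2))
Y_backfilled = conditional_sampling(X_miss, mu_X_L, Sigma_XX_L, mu_Y_L, Sigma_YY_L, Sigma_XY_L, rng=1)
```

Setting the random term to zero recovers the beta adjustment procedure, consistently with Page's null noise remark.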
Residuals recycling
Modeling the missing asset returns by a Gaussian distribution might be appropriate in some cases, depending on the assets and on the returns measurement frequency^{7}, but generally speaking, financial assets exhibit skewed and fat-tailed return distributions.
So, it would make sense if backfilled asset returns were to take into account these characteristics.
This is the aim of the residuals recycling backfilling procedure, which works as follows:
, with $t’ \in [L - S + 1..T]$ chosen uniformly at random.
Page^{3} highlights that this backfilling procedure represents a hybrid between [maximum likelihood estimation] and bootstrapping^{3} and that it provides a simple, relatively assumption-free approach to account for fat tails and other features of the distribution beyond means and covariances in the backfilling process.^{3}.
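Under the same assumptions, residuals recycling can be sketched as follows: residuals are computed over the observed short history and resampled uniformly at random (the example data are hypothetical):

```python
import numpy as np

def residuals_recycling(X_miss, X_obs, Y_obs, mu_X_L, Sigma_XX_L, mu_Y_L, Sigma_XY_L, rng=None):
    # Backfill Y_t as the conditional mean mu_{b_t} plus an *observed*
    # residual e_{t'} resampled uniformly at random from the short history,
    # so that skewness and fat tails carry over to the backfilled returns
    rng = np.random.default_rng(rng)
    P = Sigma_XY_L @ np.linalg.inv(Sigma_XX_L)
    resid = Y_obs - (mu_Y_L + (X_obs - mu_X_L) @ P.T)   # short-history residuals
    mu_b = mu_Y_L + (X_miss - mu_X_L) @ P.T
    idx = rng.integers(0, len(resid), size=len(mu_b))   # t' uniform at random
    return mu_b + resid[idx]

# Hypothetical data: Y linearly exposed to X plus noise
rng = np.random.default_rng(0)
X_obs = 0.04 * rng.standard_normal((60, 2))
Y_obs = X_obs @ np.array([[0.5, 0.1], [0.2, 0.4]]).T + 0.02 * rng.standard_normal((60, 2))
X_miss = 0.04 * rng.standard_normal((40, 2))
mu_X_L, mu_Y_L = X_obs.mean(axis=0), Y_obs.mean(axis=0)
Sigma_XX_L = np.cov(X_obs, rowvar=False)
Sigma_XY_L = np.cov(np.hstack([Y_obs, X_obs]), rowvar=False)[:2, 2:]
Y_backfilled = residuals_recycling(X_miss, X_obs, Y_obs, mu_X_L, Sigma_XX_L, mu_Y_L, Sigma_XY_L, rng=2)
```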
Page’s method as described in the previous paragraph assumes that all the assets belonging to the short group of assets $Y$ share a common returns history, and in particular a common returns history starting date.
In practice, though, most assets do not usually share a common returns history starting date, as illustrated in Figure 2 adapted from Gramacy et al.^{8}.
In such a situation, a possible way to extend Page’s method is to apply it iteratively as proposed in Jiang and Martin^{9}.
For this, let be $G_1,…,G_J, J \geq 1$ groups of assets whose length of returns history $L = L_1 > L_2 > … > L_J \geq 1$ differ, but which share a common returns history ending date, as illustrated in Figure 2.
Then, Page’s method can be extended as follows:
Some numerical subtleties need to be taken into account when implementing Page’s method, among which:
Page^{3} highlights that his method does not magically transform missing data into additional information^{3} and lists several of its limitations.
I think one of the most important of these is that^{3}
The model assumes that betas between the existing [asset returns] and the missing [asset returns] do not change, which is not necessarily a realistic assumption.
It should also be noted that, even with this method, backfilling missing returns for a completely new asset class might unfortunately remain elusive.
For example, in their piece Risk Analysis of Crypto Assets, people at Two Sigma conclude that Bitcoin is not easily explained by the Two Sigma Factor Lens, nor is it substantially correlated to other currencies or any of the major commodities, so that no long returns history of any asset class seems to contain sufficient information to accurately backfill Bitcoin returns…
Portfolio Optimizer implements the extension of Page’s method for multiple starting dates described in the previous section, together with specific care for the numerical subtleties also described in the previous section, through the endpoint /assets/returns/backfilled.
Page’s residuals recycling backfilling procedure has been designed to better account for non-normal distributions^{3}.
To what extent is this goal reached in practice?
Let’s check.
Page^{3} uses a simulation framework to compare backfilled vs. expected returns for a bivariate $t$-distribution and obtains the results displayed in Figure 3, taken from Page^{3}.
The results from Figure 3 lead to the following conclusions:
In other words, the residuals recycling backfilling procedure seems to reach its advertised goal, at least when applied to a known theoretical distribution.
More empirically, Page^{3} uses monthly returns on:
in order to compare backfilled vs. actual returns for E.M. stocks.
In more detail:
Returns on U.S. stocks from January 1988 to May 2011 (long returns history) and returns on E.M. stocks from February 1998 to May 2011 (short returns history) are used to backfill returns on E.M. stocks from January 1988 to January 1998^{10}
This process is repeated 10000 times to obtain 10000 different backfilled paths for emerging-market stocks^{3}.
Moments are computed on each backfilled path, and the grand average of these moments is computed over all backfilled paths
The moments of interest are the mean, the variance, the skewness and the kurtosis of backfilled returns.
Moments are computed on E.M. stocks using actual returns data from January 1988 to January 1998^{11}
Using Portfolio Optimizer, this test can easily be reproduced^{12}, c.f. the Jupyter notebook corresponding to this post, which gives for example the figures below:
Backfilling procedure | Mean | Variance | Skewness | Kurtosis |
---|---|---|---|---|
None (actual returns) | 1.5% | 0.0039 | -0.25 | 3.83 |
Conditional sampling | 2.2% | 0.0036 | -0.07 | 3.11 |
Recycled residuals | 2.2% | 0.0036 | -0.20 | 3.24 |
These figures clearly show that the recycled residuals backfilling procedure generates asset returns that are closer, in terms of higher moments, to actual returns than the conditional sampling backfilling procedure.
From this perspective, and even though the mean of backfilled returns is quite far off the mean of actual returns, the recycled residuals backfilling procedure can definitely be considered to properly recover fat tails in the missing [returns] data^{3}.
Page’s method provides a formal, plug-and-play solution^{3} to the problem of unequal return histories in portfolio analysis and optimization.
While other methods certainly exist, like methods based on risk factors, these other methods usually tend to be more complex, so that Page’s method is a very good choice for anyone requiring a simple way to manage missing asset returns.
For more quantitative methods with just the right level of complexity, feel free to connect with me on LinkedIn or to follow me on Twitter.
–
For instance, historical returns of Emerging Markets (E.M.) stocks are available from the late 1980s^{13} while historical returns of U.S. stocks are available from the late 1920s^{14} or even earlier. ↩
See Steven P. Peterson, John T. Grier, Covariance Misspecification in Asset Allocation, Financial Analysts Journal, Vol. 62, No. 4 (Jul. - Aug., 2006), pp. 76-85. ↩
See Sebastien Page (2013) How to Combine Long and Short Return Histories Efficiently, Financial Analysts Journal, 69:1, 45-52. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9} ↩^{10} ↩^{11} ↩^{12} ↩^{13} ↩^{14} ↩^{15} ↩^{16} ↩^{17} ↩^{18} ↩^{19} ↩^{20}
Page^{3} notes that in the context of his paper, asset returns series are not assumed to be multivariate Gaussian, so that the maximum likelihood procedure of Stambaugh actually becomes a quasi-maximum likelihood procedure. ↩
See Stambaugh, Robert F. 1997. Analyzing Investments Whose Histories Differ in Length. Journal of Financial Economics, vol. 45, no. 3 (September):285–331. ↩
To be noted that contrary to $\mu_{b_t}$, $\Sigma_b$ is time-independent. ↩
Asset returns have a tendency to follow a distribution closer and closer to a Gaussian distribution the more the time period over which they are computed increases; this empirical property is called aggregational Gaussianity, c.f. Cont^{15}. ↩
See Robert B. Gramacy, Joo Hee Lee, Ricardo Silva, On estimating covariances between many assets with histories of highly variable length, arXiv. ↩ ↩^{2}
See Jiang, Yindeng and Martin, R. Douglas, Turning Long and Short Return Histories into Equal Histories: A Better Way to Backfill Returns (August 31, 2016). ↩
To be noted that there is a typo in the heading of Table 3 in Page^{3}, because known data is taken over January 1988 - January 1999, which is not 10 years but 20 years! ↩
Returns on E.M. stocks are indeed available from the full period January 1988 to May 2011. ↩
Results are not strictly identical to those of Page^{3}, due to the random nature of the test; in addition, the skewness of actual E.M. returns is -0.26 in Page^{3} vs. -0.25 here, probably due to a slight difference in returns data. ↩
C.f. the MSCI website. ↩
See R. Cont (2001) Empirical properties of asset returns: stylized facts and statistical issues, Quantitative Finance, 1:2, 223-236. ↩
Wedderburn unfortunately never had the opportunity to publish his report^{3} and his work was forgotten until Li^{4} rediscovered it nearly 20 years later.
In this short blog post, I will first describe the standard algorithm used to simulate i.i.d. samples from a multivariate normal distribution and I will then detail Wedderburn’s original algorithm as well as some of the modifications proposed by Li.
A textbook result related to the multivariate normal distribution is that any linear combination of normally distributed random variables is also normally distributed.
More formally:
Property 1: Let $X$ be a n-dimensional random variable following a multivariate normal distribution $\mathcal{N} \left( \mu, \Sigma \right)$ of mean vector $\mu \in \mathbb{R}^{n}$ and of covariance matrix $\Sigma \in \mathcal{M}(\mathbb{R}^{n \times n})$. Then, any affine transformation $Z = AX + b$ with $A \in \mathcal{M}(\mathbb{R}^{n \times m})$ and $b \in \mathbb{R}^{m}$, $m \ge 1$, follows a m-dimensional multivariate normal distribution $\mathcal{N} \left( A \mu + b, A \Sigma A {}^t \right)$.
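Property 1 is the basis of the standard simulation algorithm: draw $W \sim \mathcal{N} \left( 0, \mathbb{I_n} \right)$ and return $\mu + A W$, where $A$ is any matrix satisfying $A \Sigma A {}^t = \Sigma$ with $A A {}^t = \Sigma$, for example the Cholesky factor of $\Sigma$. A short Python sketch:

```python
import numpy as np

def sample_multivariate_normal(mu, Sigma, size, rng=None):
    # Property 1 with A the Cholesky factor of Sigma and b = mu:
    # if W ~ N(0, I_n), then mu + A W ~ N(mu, A A^t) = N(mu, Sigma)
    rng = np.random.default_rng(rng)
    A = np.linalg.cholesky(Sigma)
    W = rng.standard_normal((size, len(mu)))
    return np.asarray(mu) + W @ A.T

mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
X = sample_multivariate_normal(mu, Sigma, size=100_000, rng=0)
```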
An orthogonal matrix of order $n$ is a matrix $Q \in \mathcal{M}(\mathbb{R}^{n \times n})$ such that $Q {}^t Q = Q Q {}^t = \mathbb{I_n}$, with $\mathbb{I_n}$ the identity matrix of order $n$.
By extension, a rectangular orthogonal matrix is a matrix $Q \in \mathcal{M}(\mathbb{R}^{m \times n}), m \geq n$ such that $Q {}^t Q = \mathbb{I_n}$.
A random orthogonal matrix of order $n$ is a random matrix $Q \in \mathcal{M}(\mathbb{R}^{n \times n})$ distributed according to the Haar measure over the group of orthogonal matrices, c.f. Anderson et al.^{5}.
By extension, a random rectangular orthogonal matrix is a matrix $Q \in \mathcal{M}(\mathbb{R}^{m \times n}) , m \geq n$, whose columns are, for example, the first $n$ columns of a random orthogonal matrix of order $m$, c.f. Li^{4}.
A Helmert matrix of order $n$ is a square orthogonal matrix $H \in \mathcal{M}(\mathbb{R}^{n \times n})$ having a prescribed first row and a triangle of zeroes above the diagonal^{6}.
For example, the matrix $H_n$ defined by
\[H_n = \begin{pmatrix} \frac{1}{\sqrt n} &\frac{1}{\sqrt n} & \frac{1}{\sqrt n} & \dots & \frac{1}{\sqrt n} \\ \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} & 0 & \dots & 0 \\ \frac{1}{\sqrt 6} & \frac{1}{\sqrt 6} & -\frac{2}{\sqrt 6} & \dots & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots\\ \frac{1}{\sqrt { n(n-1) }} & \frac{1}{\sqrt { n(n-1) }} & \frac{1}{\sqrt { n(n-1) }} &\dots & -\frac{n-1}{\sqrt { n(n-1) }} \end{pmatrix}\]is a Helmert matrix.
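The matrix $H_n$ above is straightforward to construct and check numerically; a minimal NumPy sketch (the function name is illustrative):

```python
import numpy as np

def helmert_matrix(n: int) -> np.ndarray:
    """Build the n x n Helmert matrix H_n displayed above."""
    H = np.zeros((n, n))
    H[0, :] = 1.0 / np.sqrt(n)  # constant first row, 1/sqrt(n)
    for i in range(1, n):
        # row i + 1: i entries 1/sqrt(i(i+1)), then -i/sqrt(i(i+1)), then zeroes
        H[i, :i] = 1.0 / np.sqrt(i * (i + 1))
        H[i, i] = -i / np.sqrt(i * (i + 1))
    return H

# H_n is orthogonal: H_n H_n^t = I_n
H = helmert_matrix(4)
assert np.allclose(H @ H.T, np.eye(4))
```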
A generalized Helmert matrix of order $n$ is a square orthogonal matrix $G \in \mathcal{M}(\mathbb{R}^{n \times n})$ that can be transformed by permutations of its rows and columns and by transposition and by change of sign of rows, to a form of a [standard] Helmert matrix^{6}.
For example, the matrix $G_n$ defined by
\[G_n = \begin{pmatrix} \frac{1}{\sqrt n} &\frac{1}{\sqrt n} & \frac{1}{\sqrt n} & \dots & \frac{1}{\sqrt n} \\ -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} & 0 & \dots & 0 \\ -\frac{1}{\sqrt 6} & -\frac{1}{\sqrt 6} & \frac{2}{\sqrt 6} & \dots & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots\\ -\frac{n-1}{\sqrt { n(n-1) }} & -\frac{1}{\sqrt { n(n-1) }} & -\frac{1}{\sqrt { n(n-1) }} &\dots & \frac{n-1}{\sqrt { n(n-1) }} \end{pmatrix}\]is a generalized Helmert matrix, obtained from the matrix $H_n$ by change of sign of rows $i=2..n$.
Let be:
One of the most well-known algorithms for generating $m \geq 1$ i.i.d. samples $X_1, …, X_m$ from the $n$-dimensional multivariate normal distribution $\mathcal{N}(\mu, \Sigma)$ relies on the Cholesky decomposition of the covariance matrix $\Sigma$.
In detail, this algorithm is as follows:
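Concretely, here is a minimal NumPy sketch of this procedure (the function name and the vectorized generation of all $m$ samples at once are implementation choices):

```python
import numpy as np

def simulate_multivariate_normal(mu, sigma, m, seed=None):
    """Generate m i.i.d. samples from N(mu, sigma), sigma positive definite.

    (1) factor sigma = L L^t via Cholesky, (2) draw m x n i.i.d. standard
    normal variates Z, (3) map them through X = mu + Z L^t, which follows
    N(mu, sigma) by Property 1.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    L = np.linalg.cholesky(np.asarray(sigma, dtype=float))
    Z = rng.standard_normal((m, mu.size))
    return mu + Z @ L.T

X = simulate_multivariate_normal([0.0, 0.0], [[3.0, 1.0], [1.0, 2.0]], m=250, seed=42)
```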
When the previous algorithm is used to generate $m$ i.i.d. samples $X_1, …, X_m$ from the $n$-dimensional multivariate normal distribution $\mathcal{N}(\mu, \Sigma)$, the sample mean vector
\[\bar{X} = \frac{1}{m} \sum_{i = 1}^m X_i\]and the (unbiased) sample covariance matrix
\[Cov(X) = \frac{1}{m-1} \sum_{i = 1}^m \left(X_i - \bar{X} \right) \left(X_i - \bar{X} \right) {}^t\]will be different from their theoretical counterparts, as illustrated in Figure 1 with $\mu = \left( 0, 0 \right){}^t$, $\Sigma = \begin{bmatrix} 3 & 1 \newline 1 & 2 \end{bmatrix}$ and $m = 250$.
While convergence of the first two sample moments toward the first two theoretical moments is guaranteed when $m \to +\infty$, their mismatch for finite $m$ is usually^{9} an issue in practical applications.
Indeed, a large number of samples is then usually required in order to reach a reasonable level of accuracy for whatever statistical estimator is being computed, and generating such a large number of samples is costly in computation time.
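This mismatch is easy to reproduce; with the same $\mu$ and $\Sigma$ as in Figure 1 and $m = 250$ samples drawn through NumPy's built-in generator:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.0, 0.0])
sigma = np.array([[3.0, 1.0], [1.0, 2.0]])

X = rng.multivariate_normal(mu, sigma, size=250)

x_bar = X.mean(axis=0)               # sample mean vector
S = np.cov(X, rowvar=False, ddof=1)  # unbiased sample covariance matrix

# For finite m, the sample moments differ from their theoretical counterparts
print("max |x_bar - mu| :", np.abs(x_bar - mu).max())
print("max |S - sigma|  :", np.abs(S - sigma).max())
```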
Let be:
Wedderburn’s algorithm^{1} is a conditional Monte Carlo algorithm to generate multivariate normal samples conditional on a given mean and dispersion matrix^{1}.
In other words, given a desired sample mean vector $\bar{\mu}$ and a desired (unbiased) sample covariance matrix $\bar{\Sigma}$, Wedderburn’s algorithm allows one to generate $m \geq n + 1$ i.i.d. samples $X_1, …, X_m$ from an $n$-dimensional multivariate normal distribution satisfying the two relationships
\[\bar{X} = \frac{1}{m} \sum_{i = 1}^m X_i = \bar{\mu}\]and
\[Cov(X) = \frac{1}{m-1} \sum_{i = 1}^m \left(X_i - \bar{X} \right) \left(X_i - \bar{X} \right) {}^t = \bar{\Sigma}\]By enforcing an exact match for finite $m$ between the first two sample moments and the first two theoretical moments of a multivariate normal distribution, Wedderburn’s algorithm allows one to reduce the number of samples required to reach a reasonable level of accuracy for whatever statistical estimator is being computed, hence the total computation time^{10}.
From this perspective, Wedderburn’s algorithm can be considered as a Monte Carlo variance reduction technique.
In detail, Wedderburn’s algorithm is as follows^{1}^{4}:
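Wedderburn's original construction is based on random orthogonal rotations; as a sketch of the exact-moment property it delivers, here is a closely related Cholesky-based transformation (in the spirit of Meucci^{12}, not Wedderburn's own rotation scheme) that rescales ordinary i.i.d. samples so that their sample mean and unbiased sample covariance match the targets exactly:

```python
import numpy as np

def simulate_exact_moments(mu_bar, sigma_bar, m, seed=None):
    """Generate m >= n + 1 multivariate normal samples whose sample mean and
    UNBIASED sample covariance exactly equal mu_bar and sigma_bar."""
    rng = np.random.default_rng(seed)
    mu_bar = np.asarray(mu_bar, dtype=float)
    sigma_bar = np.asarray(sigma_bar, dtype=float)
    n = mu_bar.size
    if m < n + 1:
        raise ValueError("m >= n + 1 needed for a full-rank sample covariance")

    X = rng.standard_normal((m, n))
    Xc = X - X.mean(axis=0)            # exactly zero sample mean
    S = (Xc.T @ Xc) / (m - 1)          # unbiased sample covariance of Xc
    L_s = np.linalg.cholesky(S)        # S = L_s L_s^t
    L_t = np.linalg.cholesky(sigma_bar)
    # Map S onto sigma_bar: the sample covariance of Xc L_s^{-t} L_t^t is
    # L_t L_s^{-1} S L_s^{-t} L_t^t = L_t L_t^t = sigma_bar
    return mu_bar + Xc @ np.linalg.inv(L_s).T @ L_t.T

Y = simulate_exact_moments([0.0, 0.0], [[3.0, 1.0], [1.0, 2.0]], m=250, seed=7)
```

Note that the resulting samples are no longer independent, since they are jointly constrained; this is precisely the price paid for the exact moment match.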
Li^{4} proposes several modifications to Wedderburn’s original algorithm and shows in particular how to handle a positive semi-definite covariance matrix $\bar{\Sigma}$.
In detail, Wedderburn-Li’s algorithm is as follows^{4}:
A couple of remarks:
In contrast to the algorithm described in the previous section, it appears at first sight that no samples from the univariate standard normal distribution $\mathcal{N}(0, 1)$ need to be generated when using Wedderburn’s algorithm.
This is actually not the case, because generating a random orthogonal matrix implicitly relies on the generation of such samples!
Wedderburn^{1} uses the eigenvalue decomposition of the covariance matrix $\bar{\Sigma}$ instead of its Cholesky decomposition, but Li^{4} demonstrates that it is actually possible to use any decomposition of $\bar{\Sigma}$ such that $\bar{\Sigma} = A A {}^t$ with $A \in \mathcal{M} \left( \mathbb{R}^{n \times n} \right)$ and advocates for the usage of the Cholesky decomposition.
As highlighted in Wedderburn^{1}, the theoretical mean vector and covariance matrix of the multivariate normal distribution are irrelevant.
Portfolio Optimizer implements both the standard algorithm and Wedderburn-Li’s algorithm to simulate from a multivariate normal distribution through the endpoint /assets/returns/simulation/monte-carlo/gaussian/multivariate.
Note, though, that for internal consistency reasons, the input covariance matrix when using Wedderburn-Li’s algorithm is assumed to be the desired biased sample covariance matrix and not the desired unbiased sample covariance matrix.
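The two conventions only differ by a factor of $\frac{m-1}{m}$, so converting between them is trivial:

```python
import numpy as np

m = 250
X = np.random.default_rng(1).standard_normal((m, 2))

unbiased = np.cov(X, rowvar=False, ddof=1)  # divides by m - 1
biased = np.cov(X, rowvar=False, ddof=0)    # divides by m

# biased = (m - 1) / m * unbiased
assert np.allclose(biased, (m - 1) / m * unbiased)
```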
To conclude this post, a word about applications of Wedderburn’s algorithm.
These are of course numerous in finance, c.f. for example applications of similar Monte Carlo variance reduction techniques in asset pricing in Wang^{11} or in risk management in Meucci^{12}.
More generally, Wedderburn’s algorithm is applicable in any context requiring simulation from a multivariate normal distribution, which makes it a very interesting generic algorithm to have in one’s toolbox!
For more analysis of forgotten research reports and algorithms, feel free to connect with me on LinkedIn or to follow me on Twitter.
–
See Wedderburn, R.W.M. (1975), Random rotations and multivariate normal simulation. Research Report, Rothamsted Experimental Station. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6}
That is, the samples are simulated so that they have the desired sample mean and sample covariance matrix. ↩
Because he died suddenly in 1975… ↩
See K.-H. Li, Generation of random matrices with orthonormal columns and multivariate normal variates with given sample mean and covariance, J. Statist. Comput. Simulation 43 (1992) 11–18. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6}
See T. W. Anderson, I. Olkin, L. G. Underhill, Generation of Random Orthogonal Matrices, SIAM Journal on Scientific and Statistical Computing, Vol. 8, Iss. 4 (1987). ↩
See H. O. Lancaster, The Helmert Matrices, The American Mathematical Monthly, 72(1965), no. 1, 4-12. ↩ ↩^{2}
On this blog such variables are typically assets, but they can also be genes or species in a biology context, etc. ↩ ↩^{2}
In case a matrix is positive definite, its Cholesky decomposition exists and is unique; in case a matrix is only positive semi-definite, its Cholesky decomposition exists but is not unique in general. ↩ ↩^{2} ↩^{3}
Not always, as variability in the sample mean and in the sample covariance matrix might be desired. ↩
Under the assumption that the total time taken to generate this reduced number of samples + to compute the associated estimator is (much) lower than the total time taken to generate the initial larger number of samples + to compute the associated estimator. ↩
See Jr-Yan Wang, Variance Reduction for Multivariate Monte Carlo Simulation, The Journal of Derivatives Fall 2008, 16 (1) 7-28. ↩
See Meucci, Attilio, Simulations with Exact Means and Covariances (June 7, 2009). ↩
This model relies on what Bogle describes as the single most important factor in forecasting future total returns [of a government bond], which is the initial yield to maturity.
In this post, I will describe Bogle’s methodology and analyze its forecasting performances when applied to constant maturity U.S. government bonds, which is a category of U.S. government bonds representative of most U.S. government bond ETFs as detailed in a previous post.
The Bogle Sources of Return Model for Bonds^{2} (BSRM/B) is a simple empirical model which states that there is but a single dominant source of decade-long returns on [government] bonds: the interest coupon^{2}.
This model was initially introduced by Bogle in the case of the 20-year U.S. Treasury bond^{1} and was later shown to also be applicable in the case of the 10-year U.S. Treasury bond^{2}.
Let’s dig into the details.
Bogle^{1} examines the relationship between the initial yield to maturity of a 20-year U.S. Treasury bond and its subsequent 10-year annualized total^{3} return, over the period 1930 - 1980.
He notices that^{1}:
[…] in bonds, [the initial interest rate] is the single most important factor in forecasting future total returns. The other two factors are the reinvestment rate (the rate at which the interest coupons compound), and the terminal (or end-of-period) yield.
As a matter of fact, he later shows in his paper The 1990s at the Halfway Mark^{4} that these three factors taken together have a correlation of 0.99 with the actual returns on bonds in each of the decades^{4}.
Now, because the reinvestment rate and the terminal yield are by definition unknown quantities, they cannot be used to forecast future bond returns, which leaves the initial interest rate as the critical variable, [which] has a correlation of 0.709 with the returns subsequently earned by bonds.^{1}.
In other words, the initial yield to maturity of a 20-year U.S. Treasury bond is actually sufficient to explain substantially [the bond] […] return […] over the subsequent decade^{4}.
Bogle and Nolan^{2} examine the relationship between the initial yield to maturity of a 10-year U.S. Treasury bond and its subsequent 10-year annualized return over the period 1915–2014, and find that the initial interest rate explains ~90% of the variability of the subsequent 10-year annualized bond returns.
This finding is illustrated in Figure 1, directly reproduced from Bogle and Nolan^{2}.
I propose to analyze the forecasting performances of the BSRM/B model when applied to constant maturity U.S. government bonds over the out-of-sample period 31st October 1993 - 31st May 2013.
Due to the close relationship between this category of U.S. government bonds and most U.S. government bond ETFs, this will help to understand if Bogle’s model could be of any practical use to today’s investors for setting long-term capital assumptions for U.S. government bonds.
Using the monthly 20-Year Treasury Constant Maturity Rates series from the Federal Reserve website, Figure 2 shows that the initial yield to maturity at the end of any given month over the period 31st October 1993 - 31st May 2013 explains ~72.1% of the variability of the subsequent 10-year annualized bond returns.
The associated monthly correlation coefficient is ~0.849, which is consistent with the yearly correlation coefficient of ~0.709 determined by Bogle^{1}.
Using the monthly 10-Year Treasury Constant Maturity Rates series from the Federal Reserve website, Figure 3 shows that the initial yield to maturity at the end of any given month over the period 31st October 1993 - 31st May 2013 explains ~85.2% of the variability of the subsequent 10-year annualized bond returns!
Such a value for the monthly $r^2$ coefficient is again consistent with the yearly $r^2$ coefficient of ~90% obtained by Bogle and Nolan^{2} and displayed in Figure 1.
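The $r^2$ figures above come from a simple univariate OLS regression; here is a sketch of how such a figure can be computed, assuming the caller has already aligned each initial yield with the annualized return realized over the following 10 years (e.g. from the FRED constant maturity series; the function name is illustrative):

```python
import numpy as np

def yield_vs_future_return_fit(initial_yields, future_10y_returns):
    """OLS of subsequent 10-year annualized returns on the initial yield to
    maturity; returns (slope, intercept, r_squared)."""
    x = np.asarray(initial_yields, dtype=float)
    y = np.asarray(future_10y_returns, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 least squares fit
    r = np.corrcoef(x, y)[0, 1]             # Pearson correlation coefficient
    return slope, intercept, r * r
```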
The empirical conclusions of this section are that:
But what about the forecasting performances of the BSRM/B model for other maturities? For example, for the 3-year constant maturity U.S. government bond, represented in Figure 4, or for the 30-year constant maturity U.S. government bond, represented in Figure 5?
From Figure 2 to Figure 5, another empirical conclusion is that the BSRM/B model is best suited to a 10-year constant maturity U.S. government bond, because the further the maturity deviates from 10 years, the more the forecasting performances degrade^{6}.
Surprisingly, all empirical conclusions of the previous section, and especially the last one, are backed up by theoretical results.
Indeed, Leibowitz et al.^{7} analyze the behaviour of constant duration bond funds and establish that multi-year […] returns […] converge in both mean and volatility around the starting yield^{7}, with a convergence horizon of about $2D - 1$ years for a bond fund whose duration is $D$ years, regardless of interim changes in yields^{8} which only impact this convergence by widening the distribution of returns around the mean return^{7}.
These results help explain the behaviour of the BSRM/B model when applied to the 10-year constant maturity U.S. government bond^{9}:
These results also help explain the behaviour of the BSRM/B model when applied to the 3-year, 20-year and 30-year constant maturity U.S. government bonds, as the initial yield to maturity of these bonds should then NOT be predictive of their annualized returns over the subsequent 10 years, but should rather be predictive of their annualized returns over the subsequent $2 D_3 - 1$, $2 D_{20} - 1$ and $2 D_{30} - 1$ years, with $D_3$, $D_{20}$ and $D_{30}$ their respective durations^{12}.
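As a quick back-of-the-envelope computation of these convergence horizons, with purely illustrative duration values (actual durations vary with yield levels and are NOT taken from the references above):

```python
# Hypothetical, illustrative durations (in years) for constant maturity
# U.S. government bonds; actual durations depend on the prevailing yields
durations = {"3-year": 2.8, "10-year": 8.0, "20-year": 13.0, "30-year": 17.0}

# Leibowitz et al.: returns converge around the starting yield after about 2D - 1 years
horizons = {maturity: 2.0 * d - 1.0 for maturity, d in durations.items()}
print(horizons)
```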
This theoretical behaviour is (somewhat) confirmed in practice, with for example Figure 7 illustrating that the initial yield to maturity of the 3-year constant maturity U.S. government bond is highly predictive of the annualized returns of this bond over the subsequent 3 years over the period 28th February 1962 - 31st May 2020.
A proprietary variation of Bogle’s BSRM/B model is implemented through the Portfolio Optimizer endpoint /markets/indicators/bsrmb/us to compute
Typical examples of usage for the BSRM/B model are similar to the ones described in a previous blog post about a predictor of long-term stock market returns called the AIAE.
For example:
As an illustration, Figure 8 displays the “path” of expected long-term returns for the 10-year constant maturity U.S. government bond over the period 28th February 1962 - 31st May 2023.
From Figure 8, and at the date of publication of this post^{13}, a buy-and-hold investment in the 10-year constant maturity U.S. government bond is expected to yield an annualized return of ~4% over the next 10 years.
Here, Figure 9 displays the price scenario corresponding to Figure 8 in the case of the IEF ETF, with a 95% confidence interval added.
A less typical example of usage would be to combine^{14} the forecasts produced by the BSRM/B model with estimations of the equity risk premium in order to predict future stock market returns.
A good starting place to find such estimations of the equity risk premium is the website of Aswath Damodaran, who maintains estimates of the historical implied equity risk premiums for the U.S. with plenty of details in his yearly-updated paper Equity Risk Premiums (ERP): Determinants, Estimation and Implications^{15}.
Bogle’s bond model has proved very effective at predicting long-term U.S. government bond returns over more than thirty years after its initial publication, which confirms what Bogle^{1} wrote in his original paper:
when we know the current coupon, we know most of what we need to know to forecast [government] bond returns in the coming decade
I find this model to be another interesting addition to one’s forecasting toolbox, on top of the AIAE indicator!
For more forecasts, feel free to connect with me on LinkedIn or to follow me on Twitter.
–
See Bogle, J., Investing in the 1990s, The Journal of Portfolio Management, Vol. 17, No. 3 (1991a), pp. 5-14. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7}
See Bogle J., Nolan M., Occam’s Razor Redux: Establishing Reasonable Expectations for Financial Market Returns, Journal of Portfolio Management, Vol. 42, No. 1, Fall 2015. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6}
All bond returns considered in this blog post are total returns, so that I will omit “total”. ↩
See Bogle, J., The 1990s at the Halfway Mark, The Journal of Portfolio Management, Vol. 21, No. 4 (1995), pp. 21-31. ↩ ↩^{2} ↩^{3}
More data is needed to support this claim here; for example, the in-sample monthly $r^2$ coefficient for the 10-year constant maturity U.S. Treasury bond is equal to ~83.9%, so that there is no apparent degradation of the BSRM/B model. ↩
This is especially visible in Figure 5, with an $r^2$ coefficient of only ~59.7%. ↩
See Martin L. Leibowitz, Anthony Bova & Stanley Kogelman (2014) Long-Term Bond Returns under Duration Targeting, Financial Analysts Journal, 70:1, 31-51. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5}
That is, whatever the pace of yield changes or the magnitude of yield changes. As a matter of fact, Leibowitz et al.^{7} show that what is important is the standard deviation of the yield change distribution. ↩
Viewing a constant maturity bond as a constant duration bond fund might seem like a leap of faith, but the analysis of constant duration bond funds done in Leibowitz et al.^{7} shows that they should not differ much in practice. ↩
The duration of the 10-year constant maturity U.S. Treasury bond/IEF ETF is not constant over time, so that this is a kind of first-order approximation. ↩
In Leibowitz et al.^{7}, the effective convergence horizon of a 5-year constant duration bond fund is shown to be 6 years instead of 9 years. ↩
Again, assumed to be constant in Leibowitz et al.^{7}, which is not the case in practice. ↩
More precisely, assuming the investment starts on 31st May 2023. ↩
Of course, in a non-circular way. ↩
See Damodaran, Aswath, Equity Risk Premiums (ERP): Determinants, Estimation and Implications - The 2023 Edition. ↩
This indicator, called the Aggregate Investor Allocation to Equities (AIAE), has been further analyzed by Raymond Micaletti in his paper Towards a Better Fed Model^{3}, with the conclusion that it indeed has superior equity-return forecasting ability compared to other well-known indicators (such as the CAPE ratio, Tobin’s Q, Market Cap-to-GDP, etc.)^{3}.
In this post, I will describe the AIAE indicator in detail, look back on nearly ten years of out-of-sample performances and show how to use the forecast procedure proposed by Micaletti^{3} in order to set long-term capital assumptions for the U.S. stock market.
Livermore^{4} defines the AIAE indicator as the total amount of stocks that investors are holding in aggregate divided by the total amount of stocks plus bonds plus cash that these same investors are holding in aggregate, that is
\[AIAE = \frac{TMV_s}{TMV_s + TMV_b + C}\], where:
By definition, this indicator represents the average investor allocation to stocks^{5}, hence its name.
Through some approximations, Livermore^{4} shows that it is possible to compute the AIAE indicator for the U.S. thanks to economic data published in the quarterly Federal Reserve release Financial Accounts of the United States - Z.1.
In detail:
The U.S. AIAE indicator computed using the above Fred data series is available here.
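The indicator itself is just a ratio; a trivial sketch, with purely illustrative input magnitudes (NOT actual Z.1 figures):

```python
def aiae(tmv_stocks: float, tmv_bonds: float, cash: float) -> float:
    """Aggregate Investor Allocation to Equities: total market value of stocks
    held in aggregate, over stocks + bonds + cash held in aggregate."""
    return tmv_stocks / (tmv_stocks + tmv_bonds + cash)

# Illustrative magnitudes only
print(aiae(40_000.0, 30_000.0, 20_000.0))  # -> 0.444...
```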
Figure 1, adapted from Livermore^{4}, compares the value of the U.S. AIAE indicator at the end of any given quarter over the period 31st December 1951 - 30th September 2003^{6} with the subsequent 10-year annualized S&P 500 total^{7} returns.
It appears that this indicator is doing an impressive job at predicting future U.S. stock market returns!
This can be confirmed more formally through an ordinary least squares regression.
Figure 2, directly reproduced from Livermore^{4}, shows that the value of the U.S. AIAE indicator at the end of any given quarter over the period 31st December 1951 - 30th September 2003^{6} explains ~91.3% of the variability of the subsequent 10-year annualized S&P 500 returns.
These compelling forecasting performances need nevertheless to be taken with a grain of salt, because they are not exactly achievable in real life due to two problems:
This problem is discussed in more detail in, for example, Asness et al.^{8} in the case of the CAPE ratio, but it suffices to say that many equity valuation indicators usually present both encouraging in-sample long-horizon [performance]^{8} and directionally right but weak and disappointing out-of-sample performance^{8}.
One solution to this problem is to evaluate the forecasting performances of equity valuation indicators in a kind of walk-forward fashion.
For the U.S. AIAE indicator, this is done in Micaletti^{3}, who concludes that
the Aggregate Investor Allocation to Equities (AIAE) has superior equity-return forecasting ability compared to other well-known indicators (such as the CAPE ratio, Tobin’s Q, Market Cap-to-GDP, etc.)
More on this in the next section.
After the initial release of economic data (unemployment, GDP, etc.), it is usual to see these data being revised a couple of weeks, months, or quarters later.
So, because the AIAE indicator is based on economic data, its value on a given past date as computed today vs. as computed just after the initial release of the associated economic data might be different.
Fortunately, there are some hints in Micaletti^{3} that the impact of this problem might be negligible in practice.
Livermore^{4} argues that, under reasonable assumptions, long-term stock market returns must be driven by dynamics in equities supply vs. bonds plus cash supply, dynamics that are precisely captured by the AIAE indicator.
I will not repeat his whole reasoning here^{9}, but it shares some similarities with the reasoning of Sharpe in his paper The Arithmetic of Active Management^{10} in that it uses arithmetic arguments to model the behaviour of an imaginary “aggregate investor”.
In this section, I will study the forecasting performances of the U.S. AIAE indicator since its publication.
Because Livermore published the associated blog post on 20th December 2013^{4}, he had access to
, which allowed him to analyze the 10-year forecasting performances of the U.S. AIAE indicator over the period 31st December 1951 - 30th September 2003.
On my side, at the date of publication of this post, I have access to
, which allows me to analyze the 10-year forecasting performances of the U.S. AIAE over the additional out-of-sample period 31st December 2003 - 31st March 2013.
I will use two methodologies:
The data sources for this study are the following:
Note that it is possible to access Alfred economic data through a well-documented Web API.
Figure 3 is my reproduction of Figure 2 from Livermore^{4}.
Although there is a slight difference in the $r^2$ coefficients (~91.3% in Figure 2 vs. ~88.5% in Figure 3), probably related to differences in both Fred data^{12} and U.S. stock market return data^{13}, these two figures look very much alike.
This validates my reproduction of Livermore’s methodology.
Figure 4 is the same as Figure 3, with the data points corresponding to the out-of-sample period 31st December 2003 - 31st March 2013 added.
Unfortunately, the $r^2$ coefficient has decreased (from ~88.5% in Figure 3 to ~85.2% in Figure 4), which implies that forecasting performances have degraded over the most recent period.
This is confirmed by Figure 5, which displays only the data points corresponding to the out-of-sample period.
On this figure, it is clearly visible that the relationship between the U.S. AIAE indicator and the subsequent 10-year annualized U.S. stock market returns is linear-ish, but with a high variability.
Figure 6 empirically demonstrates that the forecasts of the 10-year annualized U.S. stock market returns obtained using Micaletti’s methodology^{3} match extremely well with their actual counterparts over the period 31st December 1951 - 30th September 2003.
As a side note, the $r^2$ coefficient obtained with the “real-time” methodology of Micaletti (~89.1% in Figure 6) is a little bit higher than the $r^2$ coefficient obtained using the “hindsight-biased” methodology of Livermore (~88.5% in Figure 3).
This might be linked to the usage of an expanding window in Micaletti’s methodology, which allows one to account for dynamics in the evolution of the linear regression coefficients, or this might more probably just be noise.
Figure 7 is the same as Figure 6, with the data points corresponding to the out-of-sample period 31st December 2003 - 31st March 2013 added.
The situation here is the same as with Livermore’s methodology, that is, the $r^2$ coefficient has decreased over the most recent period (from ~89.1% in Figure 6 to ~84.1% in Figure 7).
The associated degradation in forecasting performances is confirmed in Figure 8, which displays only the data points corresponding to the out-of-sample period.
The bottom line of what precedes is that the forecasting performances of the U.S. AIAE indicator have indubitably decreased since its publication by Livermore^{4}, which is materialized by a lower $r^2$ coefficient.
While the COVID crisis and recovery certainly played a role, it is actually not the first time in history that such a decrease in forecasting performances occurs.
Figure 9 displays the rolling $r^2$ coefficient, over a prior period of 10 years, of the 10-year annualized U.S. stock market returns forecasts vs. actual values.
On this figure, there are many 10-year periods with an $r^2$ coefficient much lower than ~77.8%, with even 10-year periods whose $r^2$ coefficient is much lower than 20%!!
So, as glimmers of hope, 1) there have been much worse underperforming periods in history than the current one (relative hope) and 2) the current $r^2$ coefficient is still very high for an equity valuation indicator^{4} (absolute hope).
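A rolling $r^2$ such as the one in Figure 9 can be sketched as follows (a window of 120 observations corresponds to a prior period of 10 years of monthly data; the function name is illustrative):

```python
import numpy as np

def rolling_r2(forecasts, actuals, window=120):
    """r^2 between forecasts and subsequently realized returns, computed over
    each trailing window of observations (NaN until the window is full)."""
    f = np.asarray(forecasts, dtype=float)
    a = np.asarray(actuals, dtype=float)
    out = np.full(f.size, np.nan)
    for t in range(window - 1, f.size):
        # Pearson correlation over the trailing window ending at t
        r = np.corrcoef(f[t - window + 1 : t + 1], a[t - window + 1 : t + 1])[0, 1]
        out[t] = r * r
    return out
```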
A proprietary variation of Micaletti’s methodology^{3} is implemented through the Portfolio Optimizer endpoint /markets/indicators/aiae/us to compute:
Every year, major financial institutions publish their long-term capital market assumptions based on their internal valuation models (BlackRock, J.P.Morgan…).
The U.S. AIAE indicator enables individual investors to have access to such a valuation model in the case of the U.S. stock market.
Even better, the U.S. AIAE indicator also enables individual investors to have access to the “path” of expected long-term U.S. stock market returns, as for example regularly published by Micaletti on his twitter account through images like Figure 10.
For comparison (shameless plug):
Thanks to these forecasts, it becomes possible to contextualize the long-term capital market assumptions of financial institutions.
For example:
This forecast can be compared to the U.S. AIAE forecast of ~2.1% in Figure 12 and of not a much higher value in Figure 11.
Lower valuations and higher yields mean that markets today offer the best potential long-term returns since 2010
This comment can be put into perspective by looking at the U.S. AIAE forecasts around 2010 in Figure 12 and Figure 11.
The U.S. AIAE forecasts of long-term U.S. stock market returns can be converted into future price scenarios for various traded instruments.
In the case of the SPY ETF, the price scenarios corresponding to Figure 12 are represented in Figure 13.
Such price scenarios are sometimes easier to grasp, or to describe to customers, than forecasts of pure returns.
Looking at Figure 13, it is for example clear that the price of the SPY ETF is currently not where it “should” be, and that, assuming the U.S. AIAE indicator continues to be reliable, we might be in for at best ~2 years of flat returns and at worst for a moderate to severe U.S. market correction anytime soon…
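As an aside, the central price scenario of such a figure is simply the starting price compounded at the forecast annualized return (the confidence bands additionally require a volatility assumption, not sketched here; the function name and the ~2.1% input are illustrative):

```python
def central_price_scenario(p0: float, annualized_return: float, years: int) -> list[float]:
    """Price path implied by compounding the forecast annualized return."""
    return [p0 * (1.0 + annualized_return) ** t for t in range(years + 1)]

# E.g. a hypothetical ~2.1% annualized forecast compounded over 10 years
path = central_price_scenario(100.0, 0.021, 10)
```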
The forecasts generated by the U.S. AIAE indicator can also be used within a tactical asset allocation framework, as detailed for example in Micaletti^{3}, who notes that:
[…] over the last 43 years and across various subperiods, the AIAE-based TAA strategy delivered the most consistently high-level performance relative to its competitors.
Alternatively, they could also be integrated in any other tactical asset allocation framework that already uses an equity valuation indicator, like the framework described in Asness et al.^{8}.
I will not go into more details here, though.
Nearly one decade of out-of-sample forecasting performances validates the AIAE indicator introduced by Livermore^{4}, which has been very impressive in the U.S.
Now, whether the current period of (relative) underperformance will continue or stop in the future is of course open to debate, but in any case, I hope that this blog post shed some light on this little-known equity valuation indicator.
For more uncommon quantitative nuggets, feel free to connect with me on LinkedIn or to follow me on Twitter.
–
This is a pseudonym. ↩
See Campbell, John Y., Robert J. Shiller (1988a). Stock prices, earnings, and expected dividends. Journal of Finance 43, 661-676. ↩
See Micaletti, Raymond, Towards a Better Fed Model. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9}
See The Single Greatest Predictor of Future Stock Market Returns. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9} ↩^{10} ↩^{11}
Livermore^{4} makes the working assumption that an investor can only invest in stocks, bonds or cash as far as financial assets are concerned; whether the development of alternative financial asset classes like cryptocurrencies will at some point impact this assumption remains to be seen. ↩
Although there is no reference in Livermore^{4} to the exact period used for his graphs, my best guess is 31st December 1951 - 30th September 2003. ↩ ↩^{2}
All stock market returns considered in this blog post are total returns, so that I will omit “total”. ↩
See Cliff Asness, Antti Ilmanen and Thomas Maloney, Market Timing: Sin a Little Resolving the Valuation Timing Puzzle, Journal Of Investment Management, Volume 15, Number 3, 2017. ↩ ↩^{2} ↩^{3} ↩^{4}
Micaletti^{3} includes a summary of Livermore^{4}. ↩
See William F. Sharpe, The Arithmetic of Active Management, Financial Analysts Journal, Vol. 47, No. 1 (Jan. - Feb., 1991), pp. 7-9. ↩
The Alfred website is a point-in-time version of the Fred website, which allows one to access initial releases or, more generally, specific point-in-time versions of economic data. ↩
There is no reference in Livermore^{4} to whether the Fred data are initial releases, but my best guess is that they are not. ↩
I used the stock market returns provided on the Kenneth French’s website, while Livermore^{4} used the S&P 500 returns. ↩
See BlackRock website. ↩
In this post, after providing the necessary definitions, I will reproduce the empirical study of Gerber et al.^{1}, which highlights the superiority of the Gerber correlation matrix relative to the sample correlation matrix, and I will discuss some practical aspects associated with the usage of the Gerber statistic (how to choose the Gerber threshold…).
Notes:
- A Google sheet corresponding to this post is available here
Let be two assets $i$ and $j$ observed over $T$ time periods, with:
Consider then the scatter plot of the standardised joint returns $\left( \frac{r_{i,t}}{\sigma_i}, \frac{r_{j,t}}{\sigma_j} \right), t=1..T$ of these two assets, as depicted for example in Figure 1, slightly adapted from Flint and Polakow^{2}.
By introducing the Gerber threshold $c \in [0,1]$, this scatter plot can be partitioned into nine different subsets:
The Gerber statistic $g_{i,j}$ is then defined as^{1}
\[g_{i,j} = \frac{n_{i,j}^{UU} + n_{i,j}^{DD} - n_{i,j}^{UD} -n_{i,j}^{DU}}{T - n_{i,j}^{NN}}\], where:
The definition of the Gerber statistic makes it a measure of co-movement more robust to outliers and to noise than the Pearson correlation coefficient.
Indeed, as highlighted in Gerber et al.^{1}:
From this perspective, the Gerber statistic is especially well suited for financial time series, which often exhibit extreme movements and a great amount of noise^{1}.
Flint and Polakow^{2} reference three existing variations of the Gerber statistic and note that these alternative definitions materially change the resultant measure, to the point that the three GS variants in press should arguably be viewed as entirely separate dependence measures^{2}.
For the sake of clarity, this blog post will only discuss the variant termed the Gerber statistic in Gerber et al.^{1}.
Gerber et al.^{1} illustrate the computation of the Gerber statistic using the $T = 24$ monthly returns^{4} of the asset pair S&P 500 (SPX) - Gold (XAU) over the period January 2019 - December 2020.
I propose to re-use the same example and to validate the computation thanks to the SPX - XAU returns available in the Google sheet associated to this post.
Figure 2 represents the scatter plot of the standardised joint returns of these two assets overlaid with the nine subsets corresponding to a Gerber threshold of 0.5.
Figure 2 is exactly the same as the figure in panel A of exhibit A2 of Gerber et al.^{1}, so that the computation of the Gerber statistic should result in a value of ~0.286.
Let’s double check this.
From Figure 2, we have:
So that:
\[g_{SPX,XAU} = \frac{7 + 1 - 0 - 2}{24 - 3} = \frac{6}{21} = \frac{2}{7} \approx 0.286\]

All good!
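To make the definition concrete, the computation above can be sketched in Python. This is a minimal illustration: the function name `gerber_statistic` is mine, NumPy is assumed to be available, and the piercing conditions are implemented as non-strict inequalities with population volatilities, which may differ slightly from the exact conventions used in Gerber et al.^{1}.

```python
import numpy as np

def gerber_statistic(r_i, r_j, c=0.5):
    """Gerber statistic between two return series, for a Gerber threshold c in [0, 1]."""
    r_i, r_j = np.asarray(r_i, float), np.asarray(r_j, float)
    s_i, s_j = r_i.std(), r_j.std()                  # asset volatilities
    up_i, dn_i = r_i >= c * s_i, r_i <= -c * s_i     # upward / downward pierces
    up_j, dn_j = r_j >= c * s_j, r_j <= -c * s_j
    n_uu = np.sum(up_i & up_j)                       # concordant pairs, both up
    n_dd = np.sum(dn_i & dn_j)                       # concordant pairs, both down
    n_ud = np.sum(up_i & dn_j)                       # discordant pairs
    n_du = np.sum(dn_i & up_j)
    n_nn = np.sum(~(up_i | dn_i) & ~(up_j | dn_j))   # neither series pierces
    return (n_uu + n_dd - n_ud - n_du) / (len(r_i) - n_nn)
```

On the SPX - XAU monthly returns from the Google sheet, this should recover the value of ~0.286 computed above, up to the exact volatility and inequality conventions used.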
Consider:
The asset Gerber correlation matrix $G \in \mathcal{M}(\mathbb{R}^{n \times n})$, also called the Gerber matrix, is then defined by:
\[G_{i,j} = g_{i,j}, i=1..n, j=1..n\], where $g_{i,j}$ is the Gerber statistic between asset $i$ and asset $j$.
Consider:
The asset Gerber covariance matrix $\Sigma_G \in \mathcal{M}(\mathbb{R}^{n \times n})$ is then defined by:
\[\left( \Sigma_{G} \right)_{i,j} = g_{i,j} \, \sigma_i \, \sigma_j, i=1..n, j=1..n\], where $g_{i,j}$ is the Gerber statistic between asset $i$ and asset $j$.
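The definition above can be vectorized over a whole universe of assets; here is an illustrative sketch (the function name is mine, NumPy is assumed, and the same convention caveats as for the pairwise Gerber statistic apply, so this is not Portfolio Optimizer's implementation):

```python
import numpy as np

def gerber_covariance(returns, c=0.5):
    """Gerber covariance matrix of a (T x n) array of asset returns:
    (Sigma_G)_{i,j} = g_{i,j} * sigma_i * sigma_j."""
    R = np.asarray(returns, float)
    T = R.shape[0]
    sig = R.std(axis=0)                      # per-asset volatilities
    U = (R >= c * sig).astype(float)         # upward pierces
    D = (R <= -c * sig).astype(float)        # downward pierces
    N = 1.0 - np.maximum(U, D)               # neither threshold pierced
    num = U.T @ U + D.T @ D - U.T @ D - D.T @ U
    G = num / (T - N.T @ N)                  # Gerber correlation matrix
    return G * np.outer(sig, sig)
```

The intermediate quantity `G` is the Gerber correlation matrix itself, with a unit diagonal by construction.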
Portfolio Optimizer implements two endpoints related to the Gerber statistic:
- /assets/correlation/matrix/gerber, to compute the Gerber correlation matrix
- /assets/covariance/matrix/gerber, to compute the Gerber covariance matrix

Gerber et al.^{1} analyze the empirical performance of the Gerber covariance matrix within Markowitz’s mean-variance framework.
For this, they consider a universe of nine asset classes, inside which they backtest the following portfolio investment strategy over the period January 1988 - December 2020:
Figure 3, reproduced from Gerber et al.^{1}, illustrates the resulting three ex post mean-variance efficient frontiers in the case of a Gerber threshold equal to 0.5.
On this figure, it is pretty clear that the efficient frontier corresponding to the Gerber covariance matrix dominates the two other efficient frontiers^{7}.
Based on these empirical findings, using the Gerber covariance matrix as an alternative to both [the sample covariance matrix] and to the shrinkage estimator of Ledoit and Wolf^{1} seems really compelling.
Nevertheless, some practicalities must be discussed first.
In order to determine whether the empirical performances reported in Gerber et al.^{1} are robust to slight changes in implementation details^{8}, I propose to reproduce the backtest of the portfolio investment strategy detailed in the previous section using Portfolio Optimizer^{9}.
Note that because Portfolio Optimizer does not support Ledoit-Wolf-type shrinkage at the date of publication of this post^{10}, I am only able to compare the Gerber covariance matrix with the historical covariance matrix.
The two reproduced ex post mean-variance efficient frontiers are displayed in Figure 4 in the case of a Gerber threshold equal to 0.5.
Figure 3 and Figure 4 are pretty close^{11}, except for the mean-variance efficient portfolio with an annualized volatility target of 11%, which confirms that the empirical study of Gerber et al.^{1} is robust in terms of reproducibility.
The Gerber statistic, the Gerber correlation matrix and the Gerber covariance matrix all depend on the Gerber threshold, so that it is important to understand the impact of varying the Gerber threshold on these quantities.
In the case of two assets, this impact is extensively studied in Flint and Polakow^{2} thanks to numerical simulations of joint normal and joint non-normal return distributions. Their conclusion is that the dependency of the Gerber statistic on the Gerber threshold is highly non-trivial…
On my (less ambitious) side, I will study the impact of varying the Gerber threshold from 0 to 1 in increments of 0.1 on the backtest of the portfolio investment strategy detailed in the previous section.
Figure 5 (resp. Figure 6) illustrates the evolution of the resulting portfolio investment strategy equity curves for an ex ante annualized volatility target of 5% (resp. 10%).
Figure 6 shows that the influence of the Gerber threshold on performances can be negligible, which is very good news.
Unfortunately, Figure 5 shows that, on the contrary, this influence can be far from negligible, which is bad news.
This leads to the question of how to “best” choose the Gerber threshold in practice.
From the definition of the Gerber statistic, the higher the Gerber threshold:
As a consequence, it would make sense to choose the Gerber threshold dynamically, as a function of the “signal-to-noise ratio” of the considered universe of assets.
In the context of the portfolio investment strategy detailed in the previous section, I experimented with a simple approach based on past risk-adjusted performances:
Figure 7 illustrates the evolution of the resulting portfolio investment strategy equity curve for an ex ante annualized volatility target of 5%.
From Figure 7, it appears that this simple data-driven method to choose the Gerber threshold is able to match the performances of the best Gerber threshold chosen in hindsight^{13} ($c = 0.5$).
Some (annualized) statistics to support this observation:
|  | Gerber portfolio (fixed c = 0.50) | Gerber portfolio (adaptive c) |
|---|---|---|
| Average return | 7.6% | 7.9% |
| Volatility | 6.2% | 6.2% |
| Sharpe ratio | 1.26 | 1.23 |
Of course, no generic conclusion can be drawn from this example, but I do think that the adaptive computation of the Gerber threshold would be an interesting research topic.
Gerber et al.^{1} mention that
In the empirical studies performed, and for all cases of Gerber thresholds $c$ considered, we always observe the […] Gerber matrix G to be positive semidefinite
, but give no formal proof that the Gerber correlation matrix is positive semidefinite in general.
It is thus natural to wonder whether this is the case, all the more because positive semidefiniteness is usually lacking in other correlation matrices built from robust pairwise scatter estimates^{14}^{15}.
Fortunately, the Gerber correlation matrix is indeed positive semidefinite, as established in the paper Proofs that the Gerber Statistic is Positive Semidefinite by Gerber et al.^{16}.
Note:
- The initial version of this post stated that there was no proof that the Gerber correlation matrix is a positive semidefinite matrix; this was incorrect, as a proof was available on Mr. Ernst’s website.
In their paper, Gerber et al.^{1} confine [their] analysis to the mean–variance optimization (MVO) framework of Markowitz.
What about other portfolio allocation frameworks, though?
Would the Gerber statistic be somewhat tailored to the mean-variance framework, for example because of a hidden relationship with quadratic utility?
To answer this question, I propose to adapt the portfolio investment strategy detailed in the previous section to the risk parity framework, and more precisely to the equal risk contributions framework^{17}, as follows:
Figure 8 illustrates the resulting portfolio investment strategy equity curves in the case of a Gerber threshold equal to 0.5.
Figure 8 empirically confirms that the Gerber covariance matrix also behaves properly in a non mean–variance framework.
Figure 3 and Figure 4 might give the wrong impression that the Gerber covariance matrix always dominates the sample covariance matrix in terms of ex post risk-return within Markowitz’s mean-variance framework^{18}.
In order to illustrate that this is not the case, I will backtest the same portfolio investment strategy as detailed in one of the previous sections, but this time with the ten-asset universe of the Adaptive Asset Allocation strategy^{19}^{20} from ReSolve Asset Management:
This universe is very similar to the nine-asset universe used in Gerber et al.^{1}, because it is also well-diversified in terms of asset classes.
Unfortunately, with this universe, the ex post efficient frontier corresponding to the Gerber covariance matrix no longer always dominates the ex post efficient frontier corresponding to the sample covariance matrix, as can be seen in Figure 9, Figure 10 and Figure 11.
Because of its definition, the Gerber statistic must intuitively be more sensitive than the sample covariance matrix to measurement error^{21}.
Still, in my own testing, I did not notice any excessive sensitivity^{22}.
For example, Figure 12 is the equivalent of Figure 9 when daily asset returns are used instead of monthly returns.
Comparing these two figures, it is hard to conclude that the Gerber covariance matrix is dramatically more sensitive to the number of observations than the sample covariance matrix^{23}.
That being said, Flint and Polakow^{2} investigate the sensitivity of the Gerber statistic to estimation error more rigorously, and do find that there is considerable variation in the GS when estimated with limited observations^{2}.
So, better to err on the side of caution here.
I hope that thanks to this post you now have a good overview of the Gerber statistic, along with some of the practical concerns associated with its usage.
As Flint and Polakow^{2} put it:
Overall, the GS is an interesting conditional dependence metric, but not without its flaws or caveats.
If you have any questions, or if you would like to discuss further, feel free to connect with me on LinkedIn or to follow me on Twitter.
–
See Gerber, S., B. Javid, H. Markowitz, P. Sargen, and D. Starer (2022). The Gerber Statistic: A Robust Co-Movement Measure for Portfolio Optimization. The Journal of Portfolio Management 48(2), 87–102. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9} ↩^{10} ↩^{11} ↩^{12} ↩^{13} ↩^{14} ↩^{15} ↩^{16} ↩^{17} ↩^{18} ↩^{19}
See Flint, Emlyn and Polakow, Daniel A., Deconstructing the Gerber Statistic. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7}
More precisely, Gerber et al.^{1} define a concordant pair of returns as a pair which both components pierce their thresholds while moving in the same direction and a discordant pair of returns as a pair whose components pierce their thresholds while moving in opposite directions. ↩ ↩^{2}
All asset returns considered in this blog post are total returns. ↩
See Ledoit, O., and M. Wolf. 2004. “Honey, I Shrunk the Sample Covariance Matrix.” The Journal of Portfolio Management 30 (4): 110–119. ↩
The exact implementation details used by Gerber et al.^{1} can be found in the Python code associated to their paper; one important detail to note is that when there is no mean-variance efficient portfolio with the desired volatility, the minimum variance portfolio or the maximum return portfolio is used instead. ↩
The same conclusion applies for the two other values of the Gerber threshold, c.f. Gerber et al.^{1}. ↩
In particular, when there is no mean-variance efficient portfolio with a desired volatility because the desired volatility is too low, it might be more in line with the mean-variance framework to use a partially invested portfolio vs. the minimum variance portfolio as in Gerber et al.^{1}. ↩
I would like to thank Mr William Smyth^{24} for providing me returns data for the nine-asset universe. ↩
It’s definitely on the to do list, though. ↩
In addition to the difference in managing the portfolio volatility constraint^{8}, there are other subtle differences in my reproduction of the backtest of Gerber et al.^{1}; for example, I do not consider any transaction cost, I use the arithmetic average return of indexes and not their geometric average return, etc. ↩
The performances of the method seem to be robust w.r.t. the lookback period; note that a lookback period of 12 months results in the best performances, but I chose 24 months to be consistent with the lookback period used to compute the mean-variance input estimates. ↩
In terms of the Sharpe ratio of the resulting portfolio investment strategy, the best Gerber threshold among the thresholds displayed in Figure 5 is equal to 0.5. ↩
Zhao, Tuo, et al. Positive Semidefinite Rank-Based Correlation Matrix Estimation With Application to Semiparametric Graph Estimation. Journal of Computational and Graphical Statistics, vol. 23, no. 4, 2014, pp. 895–922. JSTOR ↩
See F.A. Alqallaf, K.P. Konis, R.D. Martin, and R.H. Zamar. Scalable robust covariance and correlation estimates for data mining. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 14-23. ACM, 2002. ↩
See S. Gerber, H. Markowitz, P. Ernst, Y. Miao, B. Javid, P. Sargen, Proofs that the Gerber Statistic is Positive Semidefinite, arXiv. ↩
See Richard, Jean-Charles and Roncalli, Thierry, Constrained Risk Budgeting Portfolios: Theory, Algorithms, Applications & Puzzles. ↩
Which, to be clear, is not at all the conclusion of Gerber et al.^{1}. ↩
See Butler, Adam and Philbrick, Mike and Gordillo, Rodrigo and Varadi, David, Adaptive Asset Allocation: A Primer. ↩
The associated returns data have been retrieved using Tiingo. ↩
In this context, the measurement error is due to the short length of the time series of asset returns that are typically used for covariance matrix estimation. ↩
Note that I only used the Gerber statistic with well-diversified universes of assets and not with, let’s say, a universe of stocks (S&P 500…). ↩
Or at the very least, if it really is, this does not translate into dramatically different risk-return performances, which is what ultimately matters from a portfolio management perspective. ↩
See William Smyth, Daniel Broby, An enhanced Gerber statistic for portfolio optimization, Finance Research Letters, Volume 49, 2022, 103229. ↩
Since its publication, mVaR has been widely adopted by academic researchers, financial regulators^{3} and practitioners, who typically highlight its straightforward numerical implementation and its ease of interpretation thanks to its explicit form^{4}.
Nevertheless, it has been observed in practice that mVaR only works well for non-normal distributions that are close to the Gaussian distribution and for tail probabilities which are not too small^{5}.
In this post, I will explain why, in light of the results of Maillard^{6} and Lamb et al.^{7}, who show that mVaR accuracy is related to the mathematics of the Cornish-Fisher expansion.
I will also empirically demonstrate, using Bitcoin and the SPY ETF, that the method proposed by Maillard^{6} to improve mVaR accuracy makes it usable for moderately to highly non-normal distributions as well as for small tail probabilities^{8}.
The (percentage) Value-at-Risk (VaR) of a portfolio of financial assets corresponds to the percentage of portfolio wealth that can be lost over a certain time horizon and with a certain probability^{9}.
More formally, the Value-at-Risk $VaR_{\alpha}$ of a portfolio over a time horizon $T$ (1 day, 10 days…) and at a confidence level $\alpha \in ]0,1[$ (95%, 97.5%, 99%…) can be defined^{5} as the opposite of the lower $1 - \alpha$ quantile of the portfolio return^{10} distribution over the time horizon $T$
\[\text{VaR}_{\alpha} (X) = - \inf_{x} \left\{x \in \mathbb{R}, P(X \leq x) \geq 1 - \alpha \right\}\], where $X$ is a random variable representing the portfolio return over the time horizon $T$.
This formula is also equivalent^{11} to
\[\text{VaR}_{\alpha} (X) = - F_X^{-1}(1 - \alpha)\], where $F_X^{-1}$ is the inverse cumulative distribution function, also called the quantile function, of the random variable $X$.
The previous definition of VaR is not directly usable, because it requires specifying the portfolio return distribution.
One possible approach is to approximate the portfolio return distribution by its empirical distribution, in which case the associated VaR is called historical Value-at-Risk (HVaR).
Another possible approach is to approximate the portfolio return distribution by a given probability distribution, in which case the associated VaR is called parametric Value-at-Risk.
When this distribution is chosen to be the Gaussian distribution $\mathcal{N}_{\mu, \sigma^2}$, that is, when $X \sim \mathcal{N} \left( \mu, \sigma^2 \right)$ with $\mu$ the location parameter and $\sigma$ the scale parameter, the associated VaR is called Gaussian Value-at-Risk (GVaR) and is computed through the formula^{12}
\[\text{GVaR}_{\alpha} (X) = - \mu - \sigma z_{1 - \alpha}\], where:
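As an illustration of the two definitions above, here is a hedged sketch of the historical and Gaussian VaR computations (the function names are mine; NumPy and SciPy are assumed to be available):

```python
import numpy as np
from scipy.stats import norm

def hvar(returns, alpha=0.95):
    """Historical VaR: opposite of the empirical (1 - alpha) quantile."""
    return -np.quantile(returns, 1 - alpha)

def gvar(returns, alpha=0.95):
    """Gaussian VaR: -mu - sigma * z_{1 - alpha}."""
    mu, sigma = np.mean(returns), np.std(returns)
    return -mu - sigma * norm.ppf(1 - alpha)
```

For example, `gvar(returns, 0.99)` computes the Gaussian VaR at the 99% confidence level from a series of portfolio returns.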
Approximating a portfolio return distribution by a Gaussian distribution might be appropriate in some cases, depending on the assets present in the portfolio and on the time horizon^{13}, but generally speaking, financial assets exhibit skewed and fat-tailed return distributions^{2}, so that it makes more sense to also consider higher moments than just the first two.
For this reason, Zangari^{1} proposed to approximate the $1 - \alpha$ quantile of the portfolio return distribution by a fourth order Cornish–Fisher expansion of the $1 - \alpha$ quantile of the standard normal distribution, which allows taking into account the skewness and kurtosis present in the portfolio return distribution.
The resulting VaR, called modified Value-at-Risk or sometimes Cornish-Fisher Value-at-Risk (CFVaR), is computed through the formula^{12}
\[\text{mVaR}_{\alpha} (X) = - \mu - \sigma \left[ z_{1-\alpha} + (z_{1-\alpha}^2 - 1) \frac{\kappa}{6} + (z_{1-\alpha}^3-3z_{1-\alpha}) \frac{\gamma}{24} -(2z_{1-\alpha}^3-5z_{1-\alpha})\frac{\kappa^2 }{36} \right]\], where the location parameter $\mu$, the scale parameter $\sigma$, the skewness parameter $\kappa$ and the excess kurtosis parameter $\gamma$ are usually^{2} estimated by their sample counterparts computed from past portfolio returns
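The mVaR formula above translates directly into code. A minimal sketch (the function name is mine; SciPy's default population skewness and excess kurtosis are assumed to be the intended sample counterparts of the parameters $\kappa$ and $\gamma$):

```python
import numpy as np
from scipy.stats import norm, skew, kurtosis

def mvar(returns, alpha=0.95):
    """Modified (Cornish-Fisher) VaR, with sample moments used as plug-in
    estimates of the parameters (mu, sigma, kappa, gamma)."""
    x = np.asarray(returns, float)
    mu, sigma = x.mean(), x.std()
    kap = skew(x)                  # skewness parameter kappa
    gam = kurtosis(x)              # excess kurtosis parameter gamma
    z = norm.ppf(1 - alpha)
    z_cf = (z + (z**2 - 1) * kap / 6 + (z**3 - 3 * z) * gam / 24
            - (2 * z**3 - 5 * z) * kap**2 / 36)
    return -mu - sigma * z_cf
```

Note that when the sample skewness is zero and the sample excess kurtosis is zero, this reduces to Gaussian VaR, as expected from the formula.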
Note that using this formula to compute VaR is equivalent to making the assumption that the portfolio return distribution follows what could be called a Cornish-Fisher distribution^{7} $\mathcal{CF}_{\mu, \sigma, \kappa, \gamma}$, whose inverse cumulative distribution function is given by
\[F_X^{-1}(u) = \mu + \sigma \left[ z_u + (z_u^2 - 1) \frac{\kappa}{6} + (z_u^3-3z_u) \frac{\gamma}{24} -(2z_u^3-5z_u)\frac{\kappa^2}{36} \right]\], where:
, which is also equivalent^{7} to making the assumption that
\[X \sim \mu + \sigma \left[ Z + (Z^2 - 1) \frac{\kappa}{6} + (Z^3-3Z) \frac{\gamma}{24} -(2Z^3-5Z)\frac{\kappa^2}{36} \right]\], where:
Figure 1 compares, over the period 01 February 1993 - 04 April 2023, the empirical distribution of the SPY ETF daily returns^{14} to the Cornish-Fisher distribution $\mathcal{CF}_{\mu_s, \sigma_s, \kappa_s, \gamma_s}$ with parameters:
On this figure, it is visible that the Cornish-Fisher distribution does not accurately approximate the empirical distribution of the SPY ETF returns.
The same also applies to the left tail of the empirical distribution of the SPY ETF returns, as can be seen in Figure 2.
On top of this poor approximation accuracy, and maybe even worse, taking a closer look at Figure 1 also reveals that the Cornish-Fisher distribution does not seem to be monotonic. For example, quantiles between 20% and 40% are positive while quantiles between 60% and 80% are negative! This means that the Cornish-Fisher distribution is not a proper probability distribution^{15}.
What could explain these observations, given that the Cornish-Fisher expansion is supposed, by construction, to be able to approximate the quantiles of any distribution?
Let’s dig in Maillard^{6}!
Maillard^{6} notes that in order for the Cornish-Fisher expansion to result in a well-defined quantile function, the skewness parameter $\kappa$ and the excess kurtosis parameter $\gamma$ must satisfy the constraints
\[| \kappa | \leq 6 \left( \sqrt{2} - 1 \right)\]

\[27 \gamma^2 - (216 + 66 \kappa^2) \gamma + 40 \kappa^4 + 336 \kappa^2 \leq 0\]

These two constraints define the domain of validity of the Cornish-Fisher expansion, represented in Figure 3.
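These two constraints translate directly into a small validity check (a minimal sketch; the function name is mine):

```python
import numpy as np

def in_cf_domain(kappa, gamma):
    """True if (kappa, gamma) lies in the domain of validity of the
    Cornish-Fisher expansion, per Maillard's two constraints."""
    c1 = abs(kappa) <= 6 * (np.sqrt(2) - 1)
    c2 = (27 * gamma**2 - (216 + 66 * kappa**2) * gamma
          + 40 * kappa**4 + 336 * kappa**2) <= 0
    return bool(c1 and c2)
```

For instance, the SPY parameters $(\kappa, \gamma) \approx (-0.28740, 10.898897)$ fail this check, while $(0, 0)$, the Gaussian case, passes it.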
When used outside of its domain of validity, the Cornish-Fisher expansion is known to have several issues impacting its accuracy^{16}, among which non-monotonic quantiles.
And as can be seen in Figure 4, this is exactly what happens in the case of the SPY ETF, with the parameters $\left( \kappa, \gamma \right) \approx (-0.28740, 10.898897) $ clearly outside of the domain of validity of the Cornish-Fisher expansion.
Fortunately, there is a way to circumvent the relative narrowness of the domain of validity of the Cornish-Fisher expansion, thanks to a regularization procedure called increasing rearrangement^{17} and described in detail in Chernozhukov et al.^{18}.
The impact of this procedure is illustrated in Figure 5, which compares the same two distributions as in Figure 1, except that the Cornish-Fisher distribution has been rearranged.
The rearranged Cornish-Fisher distribution is now monotonic, as it should be, but unfortunately, it approximates the empirical distribution of the SPY ETF returns only marginally better.
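The increasing rearrangement itself boils down to a sorting operation on a grid of quantiles; here is a minimal sketch (the function names are mine, and the default parameter values of `cf_quantile` are illustrative values outside the domain of validity, not the SPY estimates):

```python
import numpy as np
from scipy.stats import norm

def cf_quantile(u, mu=0.0, sigma=1.0, kappa=-0.3, gamma=11.0):
    """Cornish-Fisher quantile function for parameters (mu, sigma, kappa, gamma)."""
    z = norm.ppf(u)
    return mu + sigma * (z + (z**2 - 1) * kappa / 6
                         + (z**3 - 3 * z) * gamma / 24
                         - (2 * z**3 - 5 * z) * kappa**2 / 36)

def rearranged_quantile(quantile_fn, grid_size=10_000):
    """Increasing rearrangement: evaluate a possibly non-monotonic quantile
    function on a grid of probabilities and sort the resulting values."""
    u = (np.arange(grid_size) + 0.5) / grid_size   # grid strictly inside (0, 1)
    return u, np.sort(quantile_fn(u))              # sorting is the rearrangement

u, q = rearranged_quantile(cf_quantile)
```

The raw quantiles `cf_quantile(u)` are non-monotonic for these parameters, while the rearranged quantiles `q` are non-decreasing by construction.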
So, either all hope is lost w.r.t. using mVaR with moderately non-normal return distributions or there is another problem hidden somewhere waiting to be found…
Let’s dig a little bit further in Maillard^{6}!
Maillard^{6} also notes that the scale, skewness and excess kurtosis parameters $\sigma$, $\kappa$ and $\gamma$ do not match the actual standard deviation $\sigma_{CF}$, skewness $ \kappa_{CF}$ and excess kurtosis $\gamma_{CF}$ of the Cornish-Fisher distribution $\mathcal{CF}_{\mu, \sigma, \kappa, \gamma}$.
More precisely, he establishes the following relationships
\[\begin{align} \mu_{CF} &= \mu \\ \sigma_{CF} &= \sigma \sqrt{ 1 + \frac{1}{96} \gamma^2 + \frac{25}{1296} \kappa^4 - \frac{1}{36} \gamma \kappa^2 } \\ \kappa_{CF} &= f_1(\kappa, \gamma) \\ \gamma_{CF} &= f_2(\kappa, \gamma) \\ \end{align}\], where:
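The relationship for $\sigma_{CF}$ above is straightforward to implement (a minimal sketch; the function name is mine):

```python
import numpy as np

def cf_actual_std(sigma, kappa, gamma):
    """Actual standard deviation of the Cornish-Fisher distribution
    CF(mu, sigma, kappa, gamma), per the relationship above."""
    return sigma * np.sqrt(1 + gamma**2 / 96 + 25 * kappa**4 / 1296
                           - gamma * kappa**2 / 36)
```

In particular, for $\kappa = \gamma = 0$ the scale parameter and the actual standard deviation coincide, consistent with the Gaussian case.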
As a consequence, when the sample moments of a return distribution are used as plug-in estimators for the Cornish-Fisher parameters, the actual moments of the resulting Cornish-Fisher distribution differ from these sample moments!
Do they differ enough to create a real problem, though?
Re-using the SPY ETF example:
So, yes, they do differ a lot, especially the excess kurtosis!
This subtlety is the hidden problem explaining^{19} the observed lack of accuracy of modified Value-at-Risk when return distributions are not close to normal^{5}. Indeed, a “wrong” Cornish-Fisher distribution cannot be expected to accurately approximate anything useful.
The solution to this problem consists in inverting the relationships (1)-(4) between the actual moments and the parameters of the Cornish-Fisher distribution $\mathcal{CF}_{\mu, \sigma, \kappa, \gamma}$.
In other words, we need to determine the value of the parameters $\mu$, $\sigma$, $\kappa$ and $\gamma$ of the Cornish-Fisher distribution $\mathcal{CF}_{\mu, \sigma, \kappa, \gamma}$ so that its actual moments $\mu_{CF}$, $\sigma_{CF}$, $\kappa_{CF}$ and $\gamma_{CF}$ are equal to the sample moments $\mu_{s}$, $\sigma_{s}$, $\kappa_{s}$ and $\gamma_{s}$ of the empirical return distribution, c.f. Lamb et al.^{7}.
More on how to do this numerically later.
The resulting Cornish-Fisher distribution is called the corrected Cornish-Fisher distribution $\mathcal{cCF}_{\mu_s, \sigma_s, \kappa_s, \gamma_s}$ and the underlying Cornish-Fisher expansion the corrected Cornish-Fisher expansion^{4}.
Re-using one last time the SPY ETF example, we have:
, and Figure 6 compares the resulting corrected Cornish-Fisher distribution to the two distributions of Figure 5.
The approximation of the empirical return distribution by the corrected Cornish-Fisher distribution is so accurate that these two distributions are nearly indistinguishable in this figure.
Figure 7, Figure 8 and Figure 9 compare the left tail of the three distributions from Figure 6.
A nearly perfect fit again between the empirical return distribution and the corrected Cornish-Fisher distribution.
This example empirically demonstrates that modified Value-at-Risk, when corrected using Maillard’s^{6} results, works well for moderately non-normal distributions and for very small tail probabilities.
As mentioned in the previous section, computing the corrected Cornish-Fisher distribution requires inverting the relationships (1)-(4) between the actual moments and the parameters of the Cornish-Fisher distribution $\mathcal{CF}_{\mu, \sigma, \kappa, \gamma}$.
Because the location parameter $\mu$ is left unchanged by (1), and because the scale parameter $\sigma$ is easily computed thanks to (2) once the skewness parameter $\kappa$ and the excess kurtosis parameter $\gamma$ have been computed, the main mathematical challenge is to invert the system of non-linear equations (3)-(4).
Before thinking about how to invert these equations numerically, we first need to make sure that they are invertible theoretically.
Lamb et al.^{7} prove that this is the case when the actual skewness $\kappa_{CF}$ and the actual excess kurtosis $\gamma_{CF}$ belong^{20} to what could be called the domain of validity of the corrected Cornish-Fisher expansion^{21}, represented in Figure 10.
Lamb et al.^{7} also establish that the resulting skewness parameter $\kappa$ and excess kurtosis parameter $\gamma$ belong to the domain of validity of the Cornish-Fisher expansion, which ensures that the resulting corrected Cornish-Fisher distribution is a proper distribution.
Note that the domain of validity of the corrected Cornish-Fisher expansion (Figure 10) is much wider than the domain of validity of the Cornish-Fisher expansion (Figure 3).
This is extremely important in applications, because the actual skewness $\kappa_{CF}$ and the actual excess kurtosis $\gamma_{CF}$ of the corrected Cornish-Fisher distribution typically correspond to the sample skewness $\kappa_s$ and to the sample excess kurtosis $\gamma_s$ of a given distribution^{22}, so that the corrected Cornish-Fisher distribution is valid in practice for a much wider range of skewness and excess kurtosis than the non-corrected Cornish-Fisher distribution.
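To make the inversion concrete, here is a hedged sketch that estimates $f_1$ and $f_2$ by Monte Carlo on a fixed normal sample (their closed forms in Maillard^{6} are lengthy) and then inverts the system (3)-(4) with a generic root finder. This is an illustration only, not the algorithm of Lamb et al.^{7} nor the one implemented in Portfolio Optimizer; all names are mine and SciPy is assumed to be available.

```python
import numpy as np
from scipy.optimize import fsolve
from scipy.stats import skew, kurtosis

# Fixed normal sample: common random numbers make the Monte Carlo moment map
# smooth and deterministic, which the root finder requires.
rng = np.random.default_rng(0)
Z = rng.standard_normal(1_000_000)
Z2, Z3 = Z**2, Z**3

def cf_moments(kappa, gamma):
    """Monte Carlo estimates of the actual skewness f1(kappa, gamma) and
    actual excess kurtosis f2(kappa, gamma) of CF(0, 1, kappa, gamma)."""
    x = (Z + (Z2 - 1) * kappa / 6 + (Z3 - 3 * Z) * gamma / 24
         - (2 * Z3 - 5 * Z) * kappa**2 / 36)
    return skew(x), kurtosis(x)

def invert_cf(kappa_cf, gamma_cf):
    """Find parameters (kappa, gamma) whose actual moments match the
    targets (kappa_cf, gamma_cf)."""
    def equations(p):
        k, g = cf_moments(*p)
        return [k - kappa_cf, g - gamma_cf]
    # Start at the targets: for small values, parameters ~ actual moments
    return fsolve(equations, x0=[kappa_cf, gamma_cf])
```

A production implementation would rather use the closed-form expressions of $f_1$ and $f_2$, or the response surface methodology of Amedee-Manesme et al.^{4}.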
At least two algorithms have been analyzed in the literature to compute the corrected Cornish-Fisher parameters from the actual moments:
Portfolio Optimizer implements a proprietary algorithm to compute the parameters of the corrected Cornish-Fisher distribution, whose general description is:
These are either directly provided as input to the endpoint (e.g. /assets/returns/simulation/monte-carlo/cornish-fisher/corrected) or computed from an empirical distribution of returns (e.g. /portfolio/analysis/value-at-risk/cornish-fisher/corrected).
Once these parameters are known, the relationships (1)-(4) determine the resulting corrected Cornish-Fisher distribution $\mathcal{cCF}_{\mu_s, \sigma_s, \kappa_s, \gamma_s}$.
Bitcoin is an example of an asset exhibiting strong non-normal characteristics^{24}, for which the standard measures of Value-at-Risk like Gaussian Value-at-Risk or modified Value-at-Risk would be inaccurate.
But what about modified Value-at-Risk based on the corrected Cornish-Fisher expansion?
In order to investigate the accuracy of this measure, that I will call corrected Cornish-Fisher Value-at-Risk (cCFVaR), Figure 11 compares, over the period 20 August 2011 - 06 April 2023, the empirical distribution of Bitcoin daily returns^{14} to the corrected Cornish-Fisher distribution $\mathcal{cCF}_{\mu_s, \sigma_s, \kappa_s, \gamma_s}$ with actual moments:
It seems that the corrected Cornish-Fisher distribution does a pretty good job of approximating the empirical return distribution of Bitcoin, except in the right tail.
Figure 12 and Figure 13 compare the left tail of these two distributions.
These figures confirm that the corrected Cornish-Fisher distribution accurately approximates the empirical return distribution of Bitcoin down to a confidence level of $\approx 95\%$, but no lower.
This can also be confirmed numerically, with a comparison between historical Value-at-Risk and corrected Cornish-Fisher Value-at-Risk at different confidence levels:
| Confidence level $\alpha$ | $\text{HVaR}_{\alpha}$ | $\text{cCFVaR}_{\alpha}$ |
|---|---|---|
| 95% | 6.90% | 6.86% |
| 97.5% | 9.53% | 10.63% |
| 99% | 13.36% | 16.51% |
| 99.5% | 15.92% | 21.56% |
| 99.9% | 27.04% | 35.08% |
All in all, this example empirically demonstrates that modified Value-at-Risk, when corrected following Maillard’s^{6} results, works well for highly non-normal distributions with not too small tail probabilities.
The goal of this post was to highlight that the accuracy issues reported by practitioners with modified Value-at-Risk have been understood for more than ten years, but that, as Amedee-Manesme et al.^{4} put it:
this point […] does not seem to have received sufficient attention
If you are such a practitioner, I hope that this post will encourage you to double-check how modified Value-at-Risk is computed by your internal risk management software.
Waiting for an answer from your (puzzled) IT teams, feel free to connect with me on LinkedIn or follow me on Twitter.
–
See Zangari, P. (1996). A VaR methodology for portfolios that include options. RiskMetrics Monitor First Quarter, 4–12. ↩ ↩^{2}
See Martin, R. Douglas and Arora, Rohit, Inefficiency of Modified VaR and ES. ↩ ↩^{2} ↩^{3} ↩^{4}
For example, European financial regulators require to use mVaR in order to compute the Summary Risk Indicator (SRI), i.e. the risk score, of Packaged Retail Investment and Insurance Products (PRIIPs) starting 1st January 2023, c.f. regulatory Technical Standards on the content and presentation of the KIDs for PRIIPs. ↩
See Amedee-Manesme, CO., Barthelemy, F. & Maillard, D. Computation of the corrected Cornish–Fisher expansion using the response surface methodology: application to VaR and CVaR. Ann Oper Res 281, 423–453 (2019). ↩ ↩^{2} ↩^{3} ↩^{4}
See Stoyan V. Stoyanov, Svetlozar T. Rachev, Frank J. Fabozzi, Sensitivity of portfolio VaR and CVaR to portfolio return characteristics, Working paper. ↩ ↩^{2} ↩^{3}
See Maillard, Didier, A User’s Guide to the Cornish Fisher Expansion. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9}
See Lamb, John D., Maura E. Monville, and Kai-Hong Tee. Making Cornish–Fisher Fit for Risk Measurement, Journal of Risk, Volume 21, Number 5, Pages 53-81. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7}
Like 1% quantile or even less. ↩
See Jorion, P. (2007). Value at risk: The new benchmark for managing financial risk. New York, NY: McGraw-Hill. ↩
In this post, returns are assumed to be logarithmic returns. ↩
This is the case when the portfolio return cumulative distribution function is strictly increasing and continuous; otherwise, a similar formula is still valid, with $F_X^{-1}$ the generalized inverse distribution function of $X$, but these subtleties - important in mathematical proofs and in numerical implementations - are out of scope of this post. ↩
See Boudt, Kris and Peterson, Brian G. and Croux, Christophe, Estimation and Decomposition of Downside Risk for Portfolios with Non-Normal Returns (October 31, 2007). Journal of Risk, Vol. 11, No. 2, pp. 79-103, 2008. ↩ ↩^{2}
Asset returns tend to follow a distribution closer and closer to a Gaussian distribution as the time period over which they are computed increases; this empirical property is called aggregational Gaussianity, c.f. Cont^{25}. ↩
The associated adjusted prices have been retrieved using Tiingo. ↩ ↩^{2}
This also means that it is possible to have $\text{mVaR}_{95\%} > \text{mVaR}_{99\%} $, which requires some funny arguments to be explained… ↩
See Barton, D.E., & Dennis, K.E. (1952). The conditions under which Gram-Charlier and Edgeworth curves are positive definite and unimodal. Biometrika, 39(3-4), 425–427. ↩
I will not enter into the mathematical details in this post, but it suffices to say that this procedure makes it possible to correct the behavior of the Cornish-Fisher expansion when used outside of its domain of validity thanks to a sorting operator. ↩
See Chernozhukov, V., Fernandez-Val, I. & Galichon, A. Rearranging Edgeworth–Cornish–Fisher expansions. Econ Theory 42, 419–435 (2010). ↩ ↩^{2}
In addition, Maillard^{6} mentions that when the skewness and excess kurtosis parameters are small enough, in a loose sense, they coincide with the actual skewness and excess kurtosis of the Cornish-Fisher distribution, which perfectly explains the behavior of the modified Value-at-Risk observed in practice with return distributions close to normal^{5}. ↩
Actually, the result of Lamb et al.^{7} is slightly more general: they establish that the system of non-linear equations is invertible on a region which includes the domain of validity of the Cornish-Fisher expansion. ↩
The domain of validity of the corrected Cornish-Fisher expansion is the mathematical image, by the functions $f_1$ and $f_2$, of the domain of validity of the Cornish-Fisher expansion. ↩
In the context of this blog post, the given distribution is a return distribution (asset, portfolio, strategy…). ↩
This tentative computation is theoretically justified by the results from Lamb et al.^{7}. ↩
See Joerg Osterrieder, The Statistics of Bitcoin and Cryptocurrencies, Proceedings of the 2017 International Conference on Economics, Finance and Statistics (ICEFS 2017). ↩
See R. Cont (2001) Empirical properties of asset returns: stylized facts and statistical issues, Quantitative Finance, 1:2, 223-236. ↩
One issue with such instruments, though, is that their price history dates back to at best 2002^{1}, which is problematic in some applications like trading strategy backtesting or portfolio historical stress-testing.
In this post, which builds on the paper Treasury Bond Return Data Starting in 1962 from Laurens Swinkels^{2}, I will show that the returns of specific bond ETFs - those seeking a constant maturity exposure to government-issued bonds - can be simulated using standard textbook formulas^{2} together with appropriate yields to maturity.
In particular, this makes it possible to extend the price history of these ETFs by several decades thanks to publicly available yield to maturity series published by governments, government-affiliated agencies, researchers…
Notes:
- A Google sheet corresponding to this post is available here
In what follows, I will make heavy use of the formula expressing the price of a bond as a function of its yield to maturity.
This formula can be found in the appendix A3.1 Yield to maturity for settlement dates other than coupon payment dates of Tuckman and Serrat^{3}, and is reproduced below for convenience.
Consider a bond^{4} at a date $t$, with a remaining maturity equal to $T$, a yield to maturity equal to $y_t$ and a coupon rate equal to $c_t$.
Then, its price $P_t(c_t,y_t,T)$ per 100 face amount is equal to
\[\left( 1 + \frac{y_t}{2} \right)^{1 - \tau_{t}} \left[ \frac{100 c_{t}}{y_t} \left( 1 - \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2T}} \right) + \frac{100}{\left( 1 + \frac{y_t}{2} \right)^{2T}} \right]\], where $\tau_{t}$ is the fraction of a semiannual period until the next coupon payment.
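This formula translates directly into code; below is a minimal Python sketch (the function name and signature are mine, not from Tuckman and Serrat or Portfolio Optimizer):

```python
def bond_price(c, y, T, tau=1.0):
    """Price per 100 face amount of a bond paying semiannual coupons, with
    annual coupon rate c, semiannually compounded yield to maturity y,
    remaining maturity T in years, and fraction tau of a semiannual period
    until the next coupon payment (tau = 1 on a coupon payment date)."""
    g = 1.0 + y / 2.0                           # gross semiannual yield
    discount = 1.0 / g ** (2.0 * T)             # discount factor over 2T semiannual periods
    annuity = 100.0 * c / y * (1.0 - discount)  # present value of the coupons
    principal = 100.0 * discount                # present value of the principal
    return g ** (1.0 - tau) * (annuity + principal)

# Sanity checks: a par bond (c == y) prices at exactly 100 on a coupon date,
# and at 100 * (1 + y/2)^(1 - tau) between coupon dates
assert abs(bond_price(0.04, 0.04, 10.0) - 100.0) < 1e-9
assert abs(bond_price(0.04, 0.04, 10.0, tau=0.5) - 100.0 * 1.02 ** 0.5) < 1e-9
```

The par bond sanity check is exactly the identity used in the demonstration of the next section.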
Using the bond yield formula, it is possible to approximate the total return $TR$ of a par bond over a specific period using only its remaining maturity at the beginning of the period, its yield to maturity at the beginning of the period and its yield to maturity at the end of the period.
In the case of a monthly period, consider a bond such that:
- at the end of the month $t-1$, it trades at par, that is, its coupon rate is equal to its yield to maturity $y_{t-1}$
- at the end of the month $t-1$, its remaining maturity is equal to $T$
- at the end of the month $t$, its yield to maturity is equal to $y_t$
Then, under the simplifying assumptions detailed below, the total return $TR_t$ of this bond from the end of the month $t-1$ to the end of the month $t$ can be approximated by
\[TR_t \approx \frac{y_{t-1}}{12} + \frac{y_{t-1}}{y_t} \left( 1 - \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right) + \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} - 1\]

A possible demonstration for the previous formula goes as follows.
At the end of the month $t-1$, the bond has the following characteristics:
- a coupon rate $c_{t-1}$ equal to its yield to maturity $y_{t-1}$, since it trades at par
- a remaining maturity equal to $T$
Its price $P_{t-1}(c_{t-1},y_{t-1},T)$ is then equal, through the bond yield formula, to
\[100 \left( 1 + \frac{y_{t-1}}{2} \right)^{1 - \tau_{t-1}}\], with $\tau_{t-1}$ the fraction of a semiannual period until the next coupon payment at the end of month $t-1$.
At the end of the month $t$, the bond has the following characteristics:
- a coupon rate still equal to $c_t = c_{t-1} = y_{t-1}$
- a yield to maturity equal to $y_t$
- a remaining maturity equal to $T - \frac{1}{12}$
Its price $P_t(c_{t},y_{t},T - \frac{1}{12})$ is then equal, through the bond yield formula, to
\[\left( 1 + \frac{y_t}{2} \right)^{1 - \tau_{t}} \left[ \frac{100 y_{t-1}}{y_t} \left( 1 - \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right) + \frac{100}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right]\], with $\tau_{t}$ the fraction of a semiannual period until the next coupon payment at the end of month $t$.
The total return $TR_t$ of this bond from the end of the month $t-1$ to the end of the month $t$ is then by definition equal to
\[\frac{P_t(c_{t},y_{t},T - \frac{1}{12})}{P_{t-1}(c_{t-1},y_{t-1},T)} - 1\], that is
\[\frac{\left( 1 + \frac{y_t}{2} \right)^{1 - \tau_{t}}}{\left( 1 + \frac{y_{t-1}}{2} \right)^{1 - \tau_{t-1}}} \left[ \frac{y_{t-1}}{y_t} \left( 1 - \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right) + \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right] - 1\]

The first term of this expression corresponds to the re-investment of the accrued interest.
Under the practical assumptions that
- the re-investment of the accrued interest occurs at the rate $y_{t-1}$ rather than at the rate $y_t$^{5}
- the cross term between the accrued interest and the bond price change is negligible^{6}

and noticing that $ \tau_{t} = \tau_{t-1} - \frac{1}{6}$^{7}, this expression becomes
\[TR_t \approx \left[ \left( 1 + \frac{y_{t-1}}{2} \right)^{\frac{1}{6}} - 1 \right] + \left[ \frac{y_{t-1}}{y_t} \left( 1 - \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right) + \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right] - 1\]

Finally, by linearizing the accrued interest through the first-order Taylor approximation $ \left( 1 + \frac{y_{t-1}}{2} \right)^{\frac{1}{6}} - 1 \approx \frac{y_{t-1}}{12} $, this expression becomes
\[TR_t \approx \frac{y_{t-1}}{12} + \left[ \frac{y_{t-1}}{y_t} \left( 1 - \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right) + \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right] - 1\]

Remark:
- The formula above is based on a suggestion by Dr Winfried Hallerbach to improve the accuracy of the initial formula used in Swinkels^{2}, which was based on a second-order Taylor approximation of the bond yield formula, c.f. Swinkels^{8}.
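The approximate total return formula above also fits in a few lines of Python (a minimal sketch; the function name is mine):

```python
def par_bond_monthly_total_return(y_prev, y_curr, T):
    """Approximate one-month total return of a par bond with remaining
    maturity T years and yield to maturity y_prev (resp. y_curr) at the
    beginning (resp. end) of the month."""
    n = 2.0 * (T - 1.0 / 12.0)                  # semiannual periods remaining at month end
    discount = 1.0 / (1.0 + y_curr / 2.0) ** n
    accrued = y_prev / 12.0                     # linearized accrued interest
    capital = y_prev / y_curr * (1.0 - discount) + discount - 1.0
    return accrued + capital

# 10-year maturity, yield moving from 3.880% to 3.520% over the month
tr = par_bond_monthly_total_return(0.0388, 0.0352, 10.0)  # ~3.31%
```

Running it on the yields of the worked example further below reproduces the total returns quoted there.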
Thanks to a variation^{9} of the par bond total return formula established in the previous section, Swinkels^{2} describes how to construct long (total) return series for government bonds using publicly available constant maturity government rates^{10}.
These rates correspond to the yields to maturity of (fictitious) government bonds whose maturity is kept constant and are typically estimated by governments or government-affiliated agencies, which explains why they are publicly available; this is for example the case of the U.S. Treasury constant maturity rates, available through the FRED website.
As a side note, long return series for government bonds are usually commercially licensed (Global Financial Data, Bloomberg…), so that the methodology of Swinkels^{2} helps provide a high-quality public alternative to commercially available data^{2} for research purposes.
As an illustration of the methodology of Swinkels^{2}, below are yields to maturity for 3 consecutive months taken from the FRED 10-Year Treasury Constant Maturity Rates series:
| Date | Yield to maturity |
| --- | --- |
| 31 Dec 2022 | 3.880% |
| 31 Jan 2023 | 3.520% |
| 28 Feb 2023 | 3.920% |
The total return series $ \left( TR_1, TR_2 \right) $ of the fictitious 10-year constant maturity government bond associated to these yields to maturity is then constructed by:
Computing the total return $TR_1$ from 31 Dec 2022 to 31 Jan 2023 thanks to the par bond total return formula, with $T = 10$, $y_{t-1} = 3.880\%$ and $y_t=3.520\%$.
This gives
\[TR_1 \approx \frac{0.0388}{12} + \frac{0.0388}{0.0352} \left( 1 - \frac{1}{\left( 1 + \frac{0.0352}{2} \right)^{2(10-\frac{1}{12})}} \right) + \frac{1}{\left( 1 + \frac{0.0352}{2} \right)^{2(10-\frac{1}{12})}} - 1\]

That is

\[TR_1 \approx 3.31\%\]

Computing the total return $TR_2$ from 31 Jan 2023 to 28 Feb 2023 thanks again to the par bond total return formula, but this time with $T = 10$^{11}, $y_{t-1} = 3.520\%$ and $y_t=3.920\%$.
This gives
\[TR_2 \approx \frac{0.0352}{12} + \frac{0.0352}{0.0392} \left( 1 - \frac{1}{\left( 1 + \frac{0.0392}{2} \right)^{2(10-\frac{1}{12})}} \right) + \frac{1}{\left( 1 + \frac{0.0392}{2} \right)^{2(10-\frac{1}{12})}} - 1\]

That is

\[TR_2 \approx -2.97\%\]

The Portfolio Optimizer endpoint /bonds/returns/par/constant-maturity implements the methodology of Swinkels^{2} using the par bond total return formula established in the previous section.
Many government bond ETFs target a specific maturity, a specific average maturity or a specific maturity range for their underlying portfolio of government bonds.
For example, the iShares 7-10 Year Treasury Bond ETF
seeks to track the investment results of an index composed of U.S. Treasury bonds with remaining maturities between seven and ten years^{12}.
Intuitively, such ETFs should more or less behave like a constant maturity government bond, so that it should be possible to simulate their (total) returns using the methodology of Swinkels^{2} detailed in the previous section.
Nevertheless, and especially because these ETFs need to frequently rebalance their holdings^{13}, such simulated returns might not be accurate enough to be of any practical use…
Let’s dig in.
In order to illustrate the quality of the simulated returns discussed above, Figure 1 through Figure 5 compare the actual returns of the members of the iShares family of U.S. Treasury bond ETFs to the theoretical returns simulated using the methodology of Swinkels^{2}.
The theoretical returns of this ETF are simulated with the FRED 3-Year Treasury Constant Maturity Rates.
The theoretical returns of this ETF are simulated with the FRED 7-Year Treasury Constant Maturity Rates.
The theoretical returns of this ETF are simulated with the FRED 10-Year Treasury Constant Maturity Rates.
The theoretical returns of this ETF are simulated with the FRED 20-Year Treasury Constant Maturity Rates.
The theoretical returns of this ETF are simulated with the FRED 30-Year Treasury Constant Maturity Rates.
On all these figures, it is clear that simulated returns closely match actual returns.
The IEF ETF is an exception, though, because several simulated returns were significantly different from their actual counterparts over the period 2012 - 2014.
Nevertheless, for all the five ETFs, correlations between actual and simulated returns are greater than ~97%, which confirms that it is possible to accurately simulate the returns of constant maturity government bond ETFs^{14} using the methodology of Swinkels^{2}.
Each of the five ETFs analyzed in the previous section invests over a given segment of the U.S. Treasury yield curve (1-3 years, 3-7 years, 10-20 years…).
This segment is sometimes wide, as in the case of the TLH ETF, but it nevertheless allows these ETFs to be considered as constant maturity government bonds.
Now, what about non-constant maturity government bond ETFs?
To answer this question empirically, Figure 6 compares the actual returns of the iShares U.S. Treasury Bond ETF (GOVT ETF) to the
theoretical returns simulated using the methodology of Swinkels^{2} with a weighted average of 3-year, 7-year, 10-year, 20-year and 30-year Treasury constant maturity rates^{15}.
Once again, it appears that simulated returns closely match actual returns^{16}.
This example shows that, at least in some cases, it should be possible to accurately simulate the (total) returns of non-constant maturity government bond ETFs using the methodology of Swinkels^{2}, provided that these ETFs are considered as a weighted average of constant maturity government bonds instead of a single constant maturity government bond.
The previous sections demonstrated that it is possible to simulate quite accurately the returns of constant maturity government bond ETFs.
This opens the door to extending their price history.
I will use the TLT ETF as an example.
Figure 5 showed that the actual returns of the TLT ETF are devilishly close to the theoretical returns simulated using the methodology of Swinkels^{2} with the FRED 30-Year Treasury Constant Maturity Rates series.
As a consequence, because the FRED provides the historical values of these rates back to February 1977, the price history of the TLT ETF can be extended by ~25 years.
This extended history is depicted in Figure 7.
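Extending a price history in this way simply amounts to chaining the approximate monthly total returns computed from a long constant maturity yield series; a minimal sketch (function name mine, fed here with the toy yields of the earlier worked example rather than the full FRED series):

```python
def constant_maturity_tr_index(yields, T, base=100.0):
    """Chain approximate monthly par-bond total returns, computed from a
    series of month-end constant maturity yields, into a total return
    index. The remaining maturity T stays fixed because the (fictitious)
    bond is rolled every month to keep its maturity constant."""
    def tr(y_prev, y_curr):
        n = 2.0 * (T - 1.0 / 12.0)
        d = 1.0 / (1.0 + y_curr / 2.0) ** n
        return y_prev / 12.0 + y_prev / y_curr * (1.0 - d) + d - 1.0
    index = [base]
    for y_prev, y_curr in zip(yields, yields[1:]):
        index.append(index[-1] * (1.0 + tr(y_prev, y_curr)))
    return index

# Toy example: the three 10-year constant maturity yields used earlier
index = constant_maturity_tr_index([0.0388, 0.0352, 0.0392], T=10.0)
```

Feeding such a function with, e.g., the FRED 30-Year Treasury Constant Maturity Rates back to February 1977 produces the kind of extended TLT history shown in Figure 7.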
This blog post described how to use the methodology of Swinkels^{2} to simulate present and past returns of constant maturity government bond ETFs.
One possible next step is to also use this methodology to simulate future returns of such ETFs, from views on future yields to maturity.
Maybe the subject of another post.
Meanwhile, feel free to connect with me on LinkedIn or follow me on Twitter to discuss about Portfolio Optimizer or about how to best approximate bond ETFs returns :-) !
–
See Swinkels, L., 2019, Treasury Bond Return Data Starting in 1962, Data 4(3), 91. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9} ↩^{10} ↩^{11} ↩^{12} ↩^{13} ↩^{14} ↩^{15}
See Tuckman, B., and Serrat A., 2022, Fixed Income Securities Tools for Today’s Markets, 4th edition, John Wiley And Sons Ltd. ↩
In this post, I use the same conventions as in Tuckman and Serrat^{3}: bonds are assumed to be paying semiannual coupons, their coupon rate is assumed to be annual, their yield to maturity is assumed to be provided as semiannually compounded and their maturity is assumed to be expressed in years. ↩
Another sensible choice would be to use a rate equal to $\frac{y_{t-1} + y_{t}}{2}$. ↩
Since bonds with semi-annual coupons pay coupons every six months, these coupons are in any case not collected and re-invested every month in practice, so that this is a sensible simplifying assumption. ↩
C.f. Tuckman and Serrat^{3} for explanations about the term $\frac{1}{6}$. ↩
See Swinkels, L., 2023, Historical Data: International monthly government bond returns, Erasmus University Rotterdam (EUR). ↩
C.f. the remark at the end of the previous section. ↩
Constant maturity rates are frequently estimated from government bonds that trade close to par, even though interest rates have changed since their original issuance, which justifies the usage of this formula, c.f. Swinkels^{2}. ↩
The maturity $T$ did not change because the bond is supposed to have a constant maturity. ↩
C.f. the iShares 7-10 Year Treasury Bond ETF website. ↩
For example, in order to target a specific maturity range, a government bond ETF must replace its holdings whose remaining maturity has become too short. As a side note, this behaviour explains the crazy annual portfolio turnover rate of these ETFs, with for example a turnover rate of 114% for the iShares 7-10 Year Treasury Bond ETF in 2022^{17}. ↩
Or at the very least, to accurately simulate the (total) returns of some constant maturity government bond ETFs. ↩
The weights correspond to the percentage breakdown of the GOVT ETF portfolio per maturity, retrieved from the iShares U.S. Treasury Bond ETF website on 19 March 2023. ↩
Numerically, correlation between actual and simulated returns is ~98%. ↩
C.f. the iShares 7-10 Year Treasury Bond ETF annual or semi-annual report. ↩
In this post, based on the paper Optimal Portfolios in Good Times and Bad by Chow et al.^{2}, I will describe how the turbulence index can be used to partition a set of asset returns into different subsets, each of them corresponding to a specific market risk regime.
I will also provide two examples of usage, one in portfolio optimization and one in the modeling of asset returns.
Let:
- $n$ be the number of assets in the universe under consideration
- $y_t \in \mathbb{R}^{n}$ be the vector of the asset returns for a period $t=1..T$
- $\mu \in \mathbb{R}^{n}$ be the mean vector of the asset returns
- $\Sigma$ be the covariance matrix of the asset returns

The (raw) turbulence index $d(y_t)$ for the universe of assets and for a given period $t=1..T$ is defined as the squared Mahalanobis distance^{2}:

\[d(y_t) = (y_t - \mu)^t \Sigma^{-1} (y_t - \mu)\]

The main mathematical property of the turbulence index relevant for this post^{3} is the following^{2}:
Property 1: If the asset returns $y_t$ follow a multivariate Gaussian distribution, that is, $y_t \sim \mathcal{N} \left( \mu, \Sigma \right)$, the turbulence index $d(y_t)$ follows a chi-square distribution with $n$ degrees of freedom, that is, $d(y_t) \sim \chi^2(n)$.
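Computed on a sample, the turbulence index is simply the squared Mahalanobis distance of each return vector from the sample mean; a minimal NumPy sketch (function name mine):

```python
import numpy as np

def turbulence_index(returns):
    """Squared Mahalanobis distance of each return vector from the sample
    mean, using the (unbiased) sample covariance matrix; returns has one
    row per period and one column per asset."""
    r = np.asarray(returns, dtype=float)
    centered = r - r.mean(axis=0)
    sigma_inv = np.linalg.inv(np.cov(r, rowvar=False))
    return np.einsum('ti,ij,tj->t', centered, sigma_inv, centered)

rng = np.random.default_rng(42)
d = turbulence_index(rng.normal(size=(250, 2)))
# Classical sanity check: with the sample mean and unbiased sample
# covariance, the d(y_t) always sum to (T - 1) * n, here (250 - 1) * 2
assert abs(d.sum() - 249 * 2) < 1e-8
```

The trace identity used in the sanity check holds for any sample, Gaussian or not; Property 1 is what additionally pins down the distribution of $d(y_t)$ under Gaussianity.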
Chow et al.^{2} describe how to use the turbulence index to identify multivariate outliers from a series of asset returns. These outliers, which are characterized by the unusual performance of an individual asset or by the unusual interaction of a combination of assets, none of which are necessarily unusual in isolation^{2}, are representative of a [turbulent^{4} market] risk regime, while the inliers are representative of a quiet market risk regime^{2}.
In more detail, the method of Chow et al.^{2} partitions a set of multivariate asset returns $y_t \in \mathbb{R}^{n}, t=1..T$ into two subsets corresponding to these two regimes as follows:
- Compute the mean vector $\mu$ of the asset returns
- Compute the covariance matrix $\Sigma$ of the asset returns
- Choose a turbulence threshold $tt$%, which represents the percentage of asset returns desired to be classified as quiet, with typical values 70%, 80%, 95%^{5}…
- Convert the turbulence threshold $tt$% into a turbulence score $ts$
- For each asset return vector $y_t, t=1..T$:
  - Compute the turbulence index $d(y_t)$
  - Classify $y_t$ as quiet if $d(y_t) \leq ts$ and as turbulent otherwise
The turbulence threshold $tt$% is not directly comparable to the turbulence index values $d(y_t), t=1..T$ because they are not expressed in the same units^{6}.
As a consequence, $tt$% needs to be converted into a turbulence score $ts$.
Under the assumption that the asset returns $y_t$ follow a multivariate Gaussian distribution, and based on Property 1, this conversion can be done by computing the $tt$-th percentile of the chi-square distribution with $n$ degrees of freedom, that is,
\[ts = \left( \chi^2(n) \right)^{-1} (tt)\]

Nevertheless, because asset returns do not follow a multivariate Gaussian distribution in practice^{7}, this conversion will result in a proportion of asset returns classified as quiet that is different from the desired proportion.
This problem is highlighted in Chow et al.^{2}, in which a turbulence threshold of 75% is used to separate outliers from inliers while the actual proportion of asset returns classified as quiet v.s. turbulent is equal to 79.1%.
A possible solution is to convert the turbulence threshold $tt$% into a turbulence score thanks to the computation of the $tt$-th empirical percentile of the turbulence index distribution^{8}^{9}, with the caveat that this solution requires a long enough series of asset returns.
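The whole partitioning procedure, with both conversion options, can be sketched as follows (function name mine; for $n = 2$ assets, the chi-square quantile has the closed form $ts = -2 \ln(1 - tt)$, which reproduces the chi2inv value quoted in the notes):

```python
import math
import numpy as np

def quiet_turbulent_partition(returns, tt=0.80, empirical=True):
    """Split return vectors into quiet and turbulent subsets. The
    turbulence score ts is either the empirical tt-quantile of the
    turbulence index, or (closed form, valid for n = 2 assets only)
    the tt-quantile of the chi-square distribution with 2 degrees of
    freedom."""
    r = np.asarray(returns, dtype=float)
    centered = r - r.mean(axis=0)
    sigma_inv = np.linalg.inv(np.cov(r, rowvar=False))
    d = np.einsum('ti,ij,tj->t', centered, sigma_inv, centered)
    if empirical:
        ts = np.percentile(d, 100.0 * tt)
    else:
        assert r.shape[1] == 2, "closed form only valid for n = 2"
        ts = -2.0 * math.log(1.0 - tt)  # chi-square quantile, 2 dof
    quiet = d <= ts
    return r[quiet], r[~quiet], ts

# -2 ln(0.2) reproduces chi2inv(0.80, 2) = 3.218875824868201
assert abs(-2.0 * math.log(0.2) - 3.218875824868201) < 1e-12
```

With the empirical quantile, the proportion of returns classified as quiet matches $tt$% by construction, which is exactly the robustness argument made above.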
I will illustrate the method of Chow et al.^{2} with a simple two-asset universe made of:
- the SPY ETF, tracking large-cap U.S. equities
- the TLT ETF, tracking long-term U.S. Treasury bonds
The turbulence index for this universe of assets, computed using monthly returns over the period August 2002 - January 2023^{10}, is represented in Figure 1.
Then, using for example a turbulence threshold $tt$% of 80%, converted into a turbulence score $ts$ of ~3.22^{11}, each monthly return vector is classified as quiet or turbulent depending on whether its turbulence index is below or above this score.
This partitioning of the SPY-TLT returns seems to make some sense, as several periods of market stress are identifiable within the turbulent regime: the Global Financial Crisis, the COVID-19 pandemic, the Russian invasion of Ukraine…
Portfolio Optimizer implements the method from Chow et al.^{2} through the endpoint /assets/returns/turbulence-partitioned, with two extensions.
Being able to partition asset returns into different market risk regimes has several applications.
For example, it makes it possible to analyze the potential behavior of a given portfolio during periods of market stress, which is of utmost importance for long-term investing.
As Chow et al.^{2} puts it:
[a] portfolio may not survive to generate long-term performance if [it] cannot withstand exceptional periods of market turbulence.
I will not insist on this specific application, though; instead, I will provide one example related to portfolio optimization and one example related to the modeling of asset returns.
Under Markowitz’s mean-variance framework, building an optimal portfolio within a universe of assets requires an estimation of the asset covariance matrix.
Because [the] typical risk-estimation procedure […] is to weight a sample [of asset returns]’ observations equally in order to estimate risk parameters^{2}, the expected volatility of such a portfolio during periods of market stress, when asset returns typically become more volatile and more correlated, will be underestimated.
This situation is illustrated in Figure 3, taken from Chow et al.^{2}, in the case of a universe of assets made of eight distinct asset classes^{13}.
In this figure, it can be seen that an optimal portfolio whose asset covariance matrix is estimated from a non-partitioned set of asset returns (Full-Sample Optimal Mix) sees its volatility skyrocket during periods of market stress (Stressful environment) when compared to periods of market stability (Normal environment).
One solution to this issue is to estimate the asset covariance matrix from only the subset of asset returns corresponding to periods of market stress, but in this case, the expected portfolio return might be negatively impacted.
Indeed, as can also be seen in Figure 3, an optimal portfolio whose asset covariance matrix is estimated from only the subset of asset returns corresponding to periods of market stress (Outlier-Sample Optimal Mix) sees its expected return diminished by $\approx 1.24$% compared to the previous optimal portfolio (Full-Sample Optimal Mix).
Another solution, suggested by Chow et al.^{2} and Kritzman et al.^{9}, is to estimate a blended asset covariance matrix from both^{14} the subset of asset returns corresponding to the quiet regime and the subset of asset returns corresponding to the turbulent regime.

More specifically, these papers suggest estimating an asset covariance matrix $\Sigma^*$ equal to

\[\Sigma^* = \lambda_{i}^* p_{i} \Sigma_{i} + \lambda_{o}^* \left( 1 - p_{i} \right) \Sigma_{o}\]

, where:
- $\lambda_{i}^*$ (resp. $\lambda_{o}^*$) is the investor's relative risk aversion to the quiet (resp. turbulent) regime^{15}
- $p_{i}$ (resp. $p_{o} = 1 - p_{i}$) is the probability of occurrence of the quiet (resp. turbulent) regime^{16}
- $\Sigma_{i}$ (resp. $\Sigma_{o}$) is the covariance matrix estimated from the subset of asset returns corresponding to the quiet (resp. turbulent) regime
Such a blended covariance matrix enables investors to express their views about the likelihood of each risk regime and to differentiate their aversion to the regimes^{2}.
Now, in practice, the computation of $\Sigma^*$ requires estimating the probability $p_{i}$.
This can be done in an ad-hoc fashion, or using one’s preferred forecast technique, like the hidden Markov model used in Kritzman et al.^{9}.
A couple of remarks to finish:
Under Markowitz’s mean-variance framework, building an optimal portfolio within a universe of assets also requires an estimation of the expected asset returns.
They are assumed to be regime-independent in Chow et al.^{2}, but they could perfectly be made conditional on the regime like in Bruder et al.^{17}.
In the specific case where investors are equally averse to both the quiet and the turbulent regime, the formula for $\Sigma^*$ simplifies to
\[\Sigma^* = p_{i} \Sigma_{i} + \left( 1 - p_{i} \right) \Sigma_{o}\]
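The blended estimator itself is a one-liner; a minimal sketch (function and argument names mine), with the aversion parameters rescaled so that they sum to 2 as in Chow et al.:

```python
import numpy as np

def blended_covariance(sigma_quiet, sigma_turbulent, p_quiet,
                       aversion_quiet=1.0, aversion_turbulent=1.0):
    """Blend regime-conditional covariance matrices, weighting each regime
    by its probability and by the investor's relative aversion to it, with
    the aversions rescaled so that they sum to 2."""
    total = aversion_quiet + aversion_turbulent
    l_q = 2.0 * aversion_quiet / total
    l_t = 2.0 * aversion_turbulent / total
    return (l_q * p_quiet * np.asarray(sigma_quiet)
            + l_t * (1.0 - p_quiet) * np.asarray(sigma_turbulent))

# Equal aversions: the blend reduces to the probability-weighted average
assert np.allclose(blended_covariance(np.eye(2), 4 * np.eye(2), 0.5),
                   2.5 * np.eye(2))
```

Raising `aversion_turbulent` above `aversion_quiet` tilts the estimate towards the turbulent-regime covariance matrix, which is how an investor expresses a stronger aversion to that regime.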
It has been known since the early 1960s that the (marginal) statistical distribution of asset returns is neither normal nor lognormal^{7}, but more than sixty years later it is still an open question in financial mathematics to determine its exact nature.
Empirically, though, it has been demonstrated that several distributions are able to capture most of the stylized facts^{18} of asset returns.
One such distribution is the Gaussian mixture distribution, which is a convex combination of Gaussian distributions with different means and variances in the univariate case, and a convex combination of multivariate Gaussian distributions with different mean vectors and covariance matrices in the multivariate case.
The main advantage of this distribution over other alternatives like the multivariate t distribution is that it is a non-elliptical distribution^{19} that is both numerically tractable^{20} and extremely flexible.
For example, a univariate Gaussian mixture distribution can be unimodal, symmetric, skewed, multimodal, leptokurtic…^{21}.
As another example, a multivariate Gaussian mixture distribution with two components makes it possible to approximate a multivariate jump-diffusion model driven by a standard Lévy process^{17}.
In this context, the method of Chow et al.^{2} can be applied to fit the parameters of a two-component^{22} multivariate Gaussian mixture distribution as detailed below^{23}:
- Partition the asset returns into a quiet subset and a turbulent subset, using a turbulence threshold $tt$%
- Estimate the mean vector and the covariance matrix of the first (resp. second) mixture component from the quiet (resp. turbulent) subset of asset returns
- Set the weight of the first mixture component to the turbulence threshold $tt$%^{24} and the weight of the second mixture component to $1 - tt$%
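Under the empirical-percentile variant of the thresholding, this fitting procedure can be sketched as follows (a toy implementation of my own, not Portfolio Optimizer's code, in which the component weights equal the actual subset proportions):

```python
import numpy as np

def fit_two_component_mixture(returns, tt=0.80):
    """Fit a two-component Gaussian mixture: partition the returns into
    quiet/turbulent subsets with the turbulence index and an empirical
    threshold, estimate each component's mean vector and covariance
    matrix on its own subset, and set the mixture weights to the subset
    proportions."""
    r = np.asarray(returns, dtype=float)
    centered = r - r.mean(axis=0)
    sigma_inv = np.linalg.inv(np.cov(r, rowvar=False))
    d = np.einsum('ti,ij,tj->t', centered, sigma_inv, centered)
    quiet = d <= np.percentile(d, 100.0 * tt)
    components = []
    for mask in (quiet, ~quiet):  # component 1 = quiet, component 2 = turbulent
        sub = r[mask]
        components.append({'weight': len(sub) / len(r),
                           'mean': sub.mean(axis=0),
                           'cov': np.cov(sub, rowvar=False)})
    return components

rng = np.random.default_rng(7)
comps = fit_two_component_mixture(rng.normal(size=(300, 2)))
assert abs(comps[0]['weight'] + comps[1]['weight'] - 1.0) < 1e-12
```

Unlike the expectation-maximization algorithm, this procedure involves no iterative optimization, hence no local optima or convergence issues.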
In order to illustrate the validity of this approach on the two-asset SPY-TLT universe introduced in the previous section, Figure 4 through Figure 6 compare the empirical distribution of the monthly SPY (log) returns with the first marginal of:
- a multivariate Gaussian distribution fitted to the joint SPY-TLT returns
- a multivariate Gaussian mixture distribution fitted with the method of Chow et al.^{2}, using a turbulence threshold of 80%^{25}
- a multivariate Gaussian mixture distribution fitted with the expectation–maximization algorithm^{26}
It is clearly visible on these figures that both marginal Gaussian mixture distributions are much more appropriate than the marginal Gaussian distribution to model the SPY returns, with a slightly better fit obtained with the expectation–maximization algorithm.
This is confirmed numerically by the Kolmogorov-Smirnov goodness of fit test^{27}.
Going beyond univariate marginals, a 2D Kolmogorov-Smirnov goodness of fit test^{28} also confirms that both multivariate Gaussian mixture distributions are much more appropriate than the multivariate Gaussian distribution to model the joint SPY-TLT returns^{29}, with a slightly better fit again obtained with the expectation–maximization algorithm.
This example shows that it is possible to fit the parameters of a multivariate Gaussian mixture distribution modeling joint asset returns through an easily interpretable procedure, with no local optima and no convergence issue to worry about^{30}.
This concludes this second post on the turbulence index.
As usual, feel free to connect with me on LinkedIn or follow me on Twitter to discuss about Portfolio Optimizer or quantitative finance in general.
–
See M. Kritzman, Y. Li, Skulls, Financial Turbulence, and Risk Management, Financial Analysts Journal, Volume 66, Number 5, Pages 30-41, Year 2010. ↩
See George Chow, Jacquier, E., Kritzman, M., & Kenneth Lowry. (1999). Optimal Portfolios in Good Times and Bad. Financial Analysts Journal, 55(3), 65–73. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9} ↩^{10} ↩^{11} ↩^{12} ↩^{13} ↩^{14} ↩^{15} ↩^{16} ↩^{17}
For other properties, c.f. the first blog post of this series. ↩
The turbulent regime is called stressful in Chow et al.^{2}. ↩
To be noted that $1 - tt$% is used in Chow et al.^{2} as the turbulence threshold, and not directly $tt$%. ↩
The turbulence threshold is expressed as a percentage, while the turbulence index is expressed as a squared Mahalanobis distance. ↩
See Mandelbrot B, The variation of certain speculative prices, The Journal of Business, 1963, vol. 36, 394. ↩ ↩^{2}
See M. Kritzman, Y. Li, Skulls, Financial Turbulence, and Risk Management, Financial Analysts Journal, Volume 66, Number 5, Pages 30-41, Year 2010. ↩ ↩^{2}
See Mark Kritzman, Kenneth Lowry and Anne-Sophie Van Royen, Risk, Regimes, and Overconfidence, The Journal of Derivatives Spring 2001, 8 (3) 32-42. ↩ ↩^{2} ↩^{3} ↩^{4}
I retrieved the monthly adjusted ETF prices over the period July 2002 - January 2023 using Tiingo. ↩
Using for example the Matlab function chi2inv, chi2inv(0.80,2) = 3.218875824868201. ↩
For example, Kinlaw et al.^{31} use, although in a slightly different context, three subsets corresponding to the three market risk regimes calm, moderate and turbulent. ↩
The eight asset classes are Domestic equities, Foreign equities, Emerging market, Domestic bonds, Foreign bonds, High-yield bonds, Commodities and Cash. ↩
Or from all the subsets of asset returns corresponding to all the market regimes in case more than two turbulence thresholds are used. ↩
In Chow et al.^{2}, the relative risk aversion parameters are actually rescaled so that they sum to 2, that is, they must verify $\lambda_{i}^* + \lambda_{o}^* = 2$. ↩
Because Chow et al.^{2} consider only two regimes, $p_{o} = 1 - p_{i}$. ↩
See Bruder, Benjamin and Kostyuchyk, Nazar and Roncalli, Thierry, Risk Parity Portfolios with Skewness Risk: An Application to Factor Investing and Alternative Risk Premia (September 22, 2016). ↩ ↩^{2}
See R. Cont (2001) Empirical properties of asset returns: stylized facts and statistical issues, Quantitative Finance, 1:2, 223-236. ↩
See Ian Buckley, David Saunders, Luis Seco, Portfolio optimization when asset returns have the Gaussian mixture distribution, European Journal of Operational Research, Volume 185, Issue 3, 2008, Pages 1434-1461. ↩
Because it is “just” an extension of the Gaussian model, calculations with this distribution are usually similar to those using the Gaussian distribution. ↩
See Gaussian mixtures and financial returns, C. Cuevas-Covarrubias, J. Inigo-Martinez, R. Jimenez-Padilla, Discussiones Mathematicae, Probability and Statistics 37 (2017) 101–122. ↩
Or of a multivariate Gaussian mixture distribution with more than two components in case more than two turbulence thresholds are used. ↩
This approach is similar in spirit to the thresholding method of Bruder et al.^{17}. ↩
Or, for more robustness in case the chi-square distribution is used to convert the turbulence threshold into a turbulence score, to the actual proportion of asset returns qualified as quiet v.s. turbulent. ↩
This turbulence threshold results in an actual proportion of asset returns that qualify as quiet v.s. turbulent equal to ~83%. ↩
Thanks to the Python Scikit-Learn package. ↩
The Kolmogorov-Smirnov statistics (resp. p-values) for the three marginal distributions are, in order, ~0.0892, ~0.0544, ~0.0527 (resp. ~0.0373, ~0.4437, ~0.4852). ↩
Using the Python library https://github.com/syrte/ndtest. ↩
The 2-sample 2D Kolmogorov-Smirnov p-values^{28} are usually ~< 0.01 for the multivariate Gaussian distribution and much greater than ~0.20 for both multivariate Gaussian mixture distributions, indicating that the joint SPY-TLT returns distribution is significantly different from the former and not significantly different from any of the latter. ↩
See Hichem Snoussi, Ali Mohammad-Djafari, Penalized maximum likelihood for multivariate Gaussian mixture, arXiv. ↩
See William Kinlaw, Mark P. Kritzman, David Turkington, Harry M. Markowitz, A Practitioner’s Guide to Asset Allocation, Wiley. ↩