This model relies on what Bogle describes as the single most important factor in forecasting future total returns [of a government bond], which is the initial yield to maturity.
In this post, I will describe Bogle’s methodology and analyze its forecasting performance when applied to constant maturity U.S. government bonds, a category of U.S. government bonds representative of most U.S. government bond ETFs, as detailed in a previous post.
The Bogle Sources of Return Model for Bonds^{2} (BSRM/B) is a simple empirical model which states that there is but a single dominant source of decade-long returns on [government] bonds: the interest coupon^{2}.
This model was initially introduced by Bogle in the case of the 20-year U.S. Treasury bond^{1} and was later shown to also be applicable to the 10-year U.S. Treasury bond^{2}.
Let’s dig into the details.
Bogle^{1} examines the relationship between the initial yield to maturity of a 20-year U.S. Treasury bond and its subsequent 10-year annualized total^{3} return, over the period 1930 - 1980.
He notices that^{1}:
[…] in bonds, [the initial interest rate] is the single most important factor in forecasting future total returns. The other two factors are the reinvestment rate (the rate at which the interest coupons compound), and the terminal (or end-of-period) yield.
As a matter of fact, he later shows in his paper The 1990s at the Halfway Mark^{4} that these three factors taken together have a correlation of 0.99 with the actual returns on bonds in each of the decades^{4}.
Now, because the reinvestment rate and the terminal yield are by definition unknown quantities, they cannot be used to forecast future bond returns, which leaves the initial interest rate as the critical variable, [which] has a correlation of 0.709 with the returns subsequently earned by bonds^{1}.
In other words, the initial yield to maturity of a 20-year U.S. Treasury bond is actually sufficient to explain substantially [the bond] […] return […] over the subsequent decade^{4}.
Bogle and Nolan^{2} examine the relationship between the initial yield to maturity of a 10-year U.S. Treasury bond and its subsequent 10-year annualized return over the period 1915–2014, and find that the initial interest rate explains ~90% of the variability of the subsequent 10-year annualized bond returns.
This finding is illustrated in Figure 1, directly reproduced from Bogle and Nolan^{2}.
I propose to analyze the forecasting performance of the BSRM/B model when applied to constant maturity U.S. government bonds over the out-of-sample period 31st October 1993 - 31st May 2013.
Due to the close relationship between this category of U.S. government bonds and most U.S. government bond ETFs, this will help determine whether Bogle’s model could be of any practical use to today’s investors for setting long-term capital assumptions for U.S. government bonds.
Using the monthly 20-Year Treasury Constant Maturity Rate series from the Federal Reserve website, Figure 2 shows that the initial yield to maturity at the end of any given month over the period 31st October 1993 - 31st May 2013 explains ~72.1% of the variability of the subsequent 10-year annualized bond returns.
The associated monthly correlation coefficient is ~0.849, which is consistent with the yearly correlation coefficient of ~0.709 determined by Bogle^{1}.
Using the monthly 10-Year Treasury Constant Maturity Rate series from the Federal Reserve website, Figure 3 shows that the initial yield to maturity at the end of any given month over the period 31st October 1993 - 31st May 2013 explains ~85.2% of the variability of the subsequent 10-year annualized bond returns!
Such a value for the monthly $r^2$ coefficient is again consistent with the yearly $r^2$ coefficient of ~90% obtained by Bogle and Nolan^{2} and displayed in Figure 1.
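As an illustration of the computation involved, the $r^2$ figures above come from a plain univariate regression of subsequent annualized returns on initial yields, and $r^2$ is just the squared correlation coefficient. Here is a minimal sketch in Python, using synthetic data in place of the actual Federal Reserve series:

```python
import numpy as np

# Synthetic stand-ins for the two series: monthly initial yields to maturity
# and the subsequent 10-year annualized total returns (illustrative only)
rng = np.random.default_rng(42)
initial_yield = rng.uniform(0.01, 0.08, 240)
subsequent_return = initial_yield + rng.normal(0.0, 0.005, 240)

# For a univariate regression, r^2 is the squared correlation coefficient
r = np.corrcoef(initial_yield, subsequent_return)[0, 1]
r2 = r ** 2
print(f"correlation ~{r:.3f}, r^2 ~{r2:.1%}")
```

With real data, the two arrays would simply be replaced by the constant maturity rate series and the realized subsequent 10-year annualized returns.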
The empirical conclusions of this section are that:
But what about the forecasting performance of the BSRM/B model for other maturities? For example, for the 3-year constant maturity U.S. government bond, represented in Figure 4, or for the 30-year constant maturity U.S. government bond, represented in Figure 5?
From Figure 2 to Figure 5, another empirical conclusion is that the BSRM/B model is best suited to a 10-year constant maturity U.S. government bond: the further the maturity deviates from 10 years, the more the forecasting performance degrades^{6}.
Surprisingly, all the empirical conclusions of the previous section, and especially the last one, are backed up by theoretical results.
Indeed, Leibowitz et al.^{7} analyze the behaviour of constant duration bond funds and establish that multi-year […] returns […] converge in both mean and volatility around the starting yield^{7}, with a convergence horizon of about $2D - 1$ years for a bond fund whose duration is $D$ years, regardless of interim changes in yields^{8} which only impact this convergence by widening the distribution of returns around the mean return^{7}.
These results help explain the behaviour of the BSRM/B model when applied to the 10-year constant maturity U.S. government bond^{9}:
These results also help explain the behaviour of the BSRM/B model when applied to the 3-year, 20-year and 30-year constant maturity U.S. government bonds, as the initial yield to maturity of these bonds should then NOT be predictive of their annualized returns over the subsequent 10 years, but should rather be predictive of their annualized returns over the subsequent $2 D_3 - 1$, $2 D_{20} - 1$ and $2 D_{30} - 1$ years, with $D_3$, $D_{20}$ and $D_{30}$ their respective durations^{12}.
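As a rough sanity check on these horizons, the duration of a par bond with annual coupons has a simple closed form, from which the $2D - 1$ convergence horizon follows. This is only a back-of-the-envelope sketch, since the durations of actual constant maturity bonds vary with yields, as noted in the footnotes:

```python
def par_bond_duration(maturity_years, annual_yield):
    """Macaulay duration of a par bond with annual coupons:
    D = (1 + y) / y * (1 - (1 + y)^-n)."""
    y = annual_yield
    return (1 + y) / y * (1 - (1 + y) ** -maturity_years)

def convergence_horizon(duration_years):
    """Approximate Leibowitz et al. convergence horizon of 2D - 1 years."""
    return 2 * duration_years - 1

# e.g. a 10-year par bond at a 4% yield
d10 = par_bond_duration(10, 0.04)
print(f"duration ~{d10:.1f} years, horizon ~{convergence_horizon(d10):.1f} years")
```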
This theoretical behaviour is (somewhat) confirmed in practice, with for example Figure 7 illustrating that the initial yield to maturity of the 3-year constant maturity U.S. government bond is highly predictive of this bond’s annualized returns over the subsequent 3 years over the period 28th February 1962 - 31st May 2020.
A proprietary variation of Bogle’s BSRM/B model is implemented through the Portfolio Optimizer endpoint /markets/indicators/bsrmb/us, to compute:
Typical examples of usage of the BSRM/B model are similar to those described in a previous blog post about a predictor of long-term stock market returns called the AIAE.
For example:
As an illustration, Figure 8 displays the “path” of expected long-term returns for the 10-year constant maturity U.S. government bond over the period 28th February 1962 - 31st May 2023.
From Figure 8, and at the date of publication of this post^{13}, a buy-and-hold investment in the 10-year constant maturity U.S. government bond is expected to yield an annualized return of ~4% over the next 10 years.
Here, Figure 9 displays the price scenario corresponding to Figure 8 in the case of the IEF ETF, with a 95% confidence interval added.
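One simple way to build this kind of price scenario is to compound the expected annualized return and, assuming i.i.d. annual returns, widen the confidence band like $1/\sqrt{t}$. All numbers below are illustrative placeholders, not the actual IEF figures:

```python
import numpy as np

p0 = 100.0     # hypothetical current price
mu = 0.04      # expected annualized return (e.g. the ~4% BSRM/B forecast)
sigma = 0.07   # assumed annualized volatility of returns (placeholder)
z = 1.96       # 95% confidence level

years = np.arange(1, 11)
expected_price = p0 * (1 + mu) ** years
# the CI on the annualized return shrinks like 1/sqrt(t), then compounds
lower = p0 * (1 + mu - z * sigma / np.sqrt(years)) ** years
upper = p0 * (1 + mu + z * sigma / np.sqrt(years)) ** years
```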
A less typical example of usage would be to combine^{14} the forecasts produced by the BSRM/B model with estimations of the equity risk premium in order to predict future stock market returns.
A good starting place to find such estimations of the equity risk premium is the website of Aswath Damodaran, who maintains estimates of the historical implied equity risk premiums for the U.S. with plenty of details in his yearly-updated paper Equity Risk Premiums (ERP): Determinants, Estimation and Implications^{15}.
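Since Damodaran’s implied equity risk premium is quoted over the 10-year Treasury yield, and since the BSRM/B forecast is essentially that yield, the combination boils down to an addition (the numbers below are purely illustrative):

```python
bond_forecast = 0.04   # BSRM/B expected 10-year annualized bond return (~ yield)
implied_erp = 0.05     # hypothetical implied equity risk premium estimate
expected_stock_return = bond_forecast + implied_erp
print(f"expected long-term stock return ~{expected_stock_return:.1%}")
```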
Bogle’s bond model has proved very effective at predicting long-term U.S. government bond returns over more than thirty years after its initial publication, which confirms what Bogle^{1} wrote in his original paper:
when we know the current coupon, we know most of what we need to know to forecast [government] bond returns in the coming decade
I find this model to be another interesting addition to one’s forecasting toolbox, on top of the AIAE indicator!
For more forecasts, feel free to connect with me on LinkedIn or to follow me on Twitter.
–
See Bogle, J., Investing in the 1990s, The Journal of Portfolio Management, Vol. 17, No. 3 (1991a), pp. 5-14. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7}
See Bogle J., Nolan M., Occam’s Razor Redux: Establishing Reasonable Expectations for Financial Market Returns, The Journal of Portfolio Management, Vol. 42, No. 1 (Fall 2015). ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6}
All bond returns considered in this blog post are total returns, so that I will omit “total”. ↩
See Bogle, J., The 1990s at the Halfway Mark, The Journal of Portfolio Management, Vol. 21, No. 4 (1995), pp. 21-31. ↩ ↩^{2} ↩^{3}
More data is needed to support this claim here; for example, the in-sample monthly $r^2$ coefficient for the 10-year constant maturity U.S. Treasury bond is equal to ~83.9%, so that there is no apparent degradation of the BSRM/B model. ↩
This is especially visible in Figure 5, with an $r^2$ coefficient of only ~59.7%. ↩
See Martin L. Leibowitz, Anthony Bova & Stanley Kogelman (2014) Long-Term Bond Returns under Duration Targeting, Financial Analysts Journal, 70:1, 31-51. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5}
That is, whatever the pace of yield changes or the magnitude of yield changes. As a matter of fact, Leibowitz et al.^{7} show that what is important is the standard deviation of the yield change distribution. ↩
Viewing a constant maturity bond as a constant duration bond fund might seem like a leap of faith, but the analysis of constant duration bond funds done in Leibowitz et al.^{7} shows that they should not differ much in practice. ↩
The duration of the 10-year constant maturity U.S. Treasury bond/IEF ETF is not constant over time, so that this is a kind of first-order approximation. ↩
In Leibowitz et al.^{7}, the effective convergence horizon of a 5-year constant duration bond fund is shown to be ~6 years instead of 9 years. ↩
Again, assumed to be constant in Leibowitz et al.^{7}, which is not the case in practice. ↩
More precisely, assuming the investment starts on 31st May 2023. ↩
Of course, in a non-circular way. ↩
See Damodaran, Aswath, Equity Risk Premiums (ERP): Determinants, Estimation and Implications - The 2023 Edition. ↩
This indicator, called the Aggregate Investor Allocation to Equities (AIAE), has been further analyzed by Raymond Micaletti in his paper Towards a Better Fed Model^{3}, with the conclusion that it indeed has superior equity-return forecasting ability compared to other well-known indicators (such as the CAPE ratio, Tobin’s Q, Market Cap-to-GDP, etc.)^{3}.
In this post, I will describe the AIAE indicator in detail, review nearly ten years of out-of-sample performance and show how to use the forecast procedure proposed by Micaletti^{3} to set long-term capital assumptions for the U.S. stock market.
Livermore^{4} defines the AIAE indicator as the total amount of stocks that investors are holding in aggregate divided by the total amount of stocks plus bonds plus cash that these same investors are holding in aggregate, that is
\[AIAE = \frac{TMV_s}{TMV_s + TMV_b + C}\], where:
By definition, this indicator represents the average investor allocation to stocks^{5}, hence its name.
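For concreteness, the formula translates directly into code; the input magnitudes below are made up for illustration, not actual Z.1 figures:

```python
def aiae(tmv_stocks, tmv_bonds, cash):
    """Aggregate Investor Allocation to Equities (Livermore's definition):
    total market value of stocks / (stocks + bonds + cash)."""
    return tmv_stocks / (tmv_stocks + tmv_bonds + cash)

# hypothetical aggregate holdings, e.g. in trillions of dollars
print(aiae(45.0, 35.0, 20.0))  # an aggregate equity allocation of 45%
```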
Through some approximations, Livermore^{4} shows that it is possible to compute the AIAE indicator for the U.S. thanks to economic data published in the quarterly Federal Reserve release Financial Accounts of the United States - Z.1.
In detail:
The U.S. AIAE indicator computed using the above Fred data series is available here.
Figure 1, adapted from Livermore^{4}, compares the value of the U.S. AIAE indicator at the end of any given quarter over the period 31st December 1951 - 30th September 2003^{6} with the subsequent 10-year annualized S&P 500 total^{7} returns.
It appears that this indicator is doing an impressive job at predicting future U.S. stock market returns!
This can be confirmed more formally through an ordinary least squares regression.
Figure 2, directly reproduced from Livermore^{4}, shows that the value of the U.S. AIAE indicator at the end of any given quarter over the period 31st December 1951 - 30th September 2003^{6} explains ~91.3% of the variability of the subsequent 10-year annualized S&P 500 returns.
These compelling forecasting performances need nevertheless to be taken with a grain of salt, because they are not exactly achievable in real life due to two problems:
This problem is discussed in more detail, for example, in Asness et al.^{8} in the case of the CAPE ratio; suffice it to say that many equity valuation indicators usually present both encouraging in-sample long-horizon [performance]^{8} and directionally right but weak and disappointing out-of-sample performance^{8}.
One solution to this problem is to evaluate the forecasting performance of equity valuation indicators in a walk-forward fashion.
For the U.S. AIAE indicator, this is done in Micaletti^{3}, who concludes that
the Aggregate Investor Allocation to Equities (AIAE) has superior equity-return forecasting ability compared to other well-known indicators (such as the CAPE ratio, Tobin’s Q, Market Cap-to-GDP, etc.)
More on this in the next section.
After the initial release of economic data (unemployment, GDP, etc.), it is usual to see these data being revised a couple of weeks, months, or quarters later.
So, because the AIAE indicator is based on economic data, its value on a given past date as computed today vs. as computed just after the initial release of the associated economic data might differ.
Fortunately, there are some hints in Micaletti^{3} that the impact of this problem might be negligible in practice.
Livermore^{4} argues that, under reasonable assumptions, long-term stock market returns must be driven by dynamics in equities supply vs. bonds plus cash supply, dynamics that are precisely captured by the AIAE indicator.
I will not repeat his whole reasoning here^{9}, but it shares some similarities with the reasoning of Sharpe in his paper The Arithmetic of Active Management^{10} in that it uses arithmetic arguments to model the behaviour of an imaginary “aggregate investor”.
In this section, I will study the forecasting performance of the U.S. AIAE indicator since its publication.
Because Livermore published the associated blog post on 20th December 2013^{4}, he had access to
, which allowed him to analyze the 10-year forecasting performance of the U.S. AIAE indicator over the period 31st December 1951 - 30th September 2003.
On my side, at the date of publication of this post, I have access to
, which allows me to analyze the 10-year forecasting performance of the U.S. AIAE indicator over the additional out-of-sample period 31st December 2003 - 31st March 2013.
I will use two methodologies:
The data sources for this study are the following:
Note that it is possible to access Alfred economic data through a well-documented Web API.
Figure 3 is my reproduction of Figure 2 from Livermore^{4}.
Although there is a slight difference in the $r^2$ coefficients (~91.3% in Figure 2 vs. ~88.5% in Figure 3), probably related to differences in both Fred data^{12} and U.S. stock market return data^{13}, these two figures look very much alike.
This validates my reproduction of Livermore’s methodology.
Figure 4 is the same as Figure 3, with the data points corresponding to the out-of-sample period 31st December 2003 - 31st March 2013 added.
Unfortunately, the $r^2$ coefficient has decreased (from ~88.5% in Figure 3 to ~85.2% in Figure 4), which implies that forecasting performance has degraded over the most recent period.
This is confirmed by Figure 5, which displays only the data points corresponding to the out-of-sample period.
On this figure, it is clearly visible that the relationship between the U.S. AIAE indicator and the subsequent 10-year annualized U.S. stock market returns is linear-ish, but with a high variability.
Figure 6 empirically demonstrates that the forecasts of the 10-year annualized U.S. stock market returns obtained using Micaletti’s methodology^{3} match extremely well with their actual counterparts over the period 31st December 1951 - 30th September 2003.
As a side note, the $r^2$ coefficient obtained with the “real-time” methodology of Micaletti (~89.1% in Figure 6) is a little bit higher than the $r^2$ coefficient obtained using the “hindsight-biased” methodology of Livermore (~88.5% in Figure 3).
This might be linked to the usage of an expanding window in Micaletti’s methodology, which makes it possible to account for dynamics in the evolution of the linear regression coefficients; or this might, more probably, just be noise.
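For readers wishing to experiment, the general idea of such an expanding-window “real-time” procedure can be sketched as below; Micaletti’s exact procedure certainly differs in its details:

```python
import numpy as np

def walk_forward_forecasts(indicator, fwd_returns, horizon=40, min_obs=40):
    """At each quarter t, fit an OLS line on the (indicator, subsequent return)
    pairs whose outcome has already realized -- i.e. those at least `horizon`
    quarters old -- then forecast from the indicator's current value.
    fwd_returns[i] is the annualized return over the `horizon` quarters
    following quarter i (known only `horizon` quarters later)."""
    forecasts = []
    for t in range(min_obs + horizon, len(indicator)):
        x = indicator[:t - horizon]   # observations fully realized by time t
        y = fwd_returns[:t - horizon]
        slope, intercept = np.polyfit(x, y, 1)
        forecasts.append(intercept + slope * indicator[t])
    return np.array(forecasts)
```

On a perfectly linear relationship, the procedure recovers the line exactly; on real data, the fitted coefficients drift as the window expands.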
Figure 7 is the same as Figure 6, with the data points corresponding to the out-of-sample period 31st December 2003 - 31st March 2013 added.
The situation here is the same as with Livermore’s methodology, that is, the $r^2$ coefficient has decreased over the most recent period (from ~89.1% in Figure 6 to ~84.1% in Figure 7).
The associated degradation in forecasting performances is confirmed in Figure 8, which displays only the data points corresponding to the out-of-sample period.
The bottom line of what precedes is that the forecasting performance of the U.S. AIAE indicator has indubitably decreased since its publication by Livermore^{4}, as materialized by a lower $r^2$ coefficient.
While the COVID crisis and recovery certainly played a role, it is actually not the first time in history that such a decrease in forecasting performances occurs.
Figure 9 displays the rolling $r^2$ coefficient, over a prior period of 10 years, of the 10-year annualized U.S. stock market returns forecasts v.s. actual values.
On this figure, there are many 10-year periods with an $r^2$ coefficient much lower than ~77.8%, and even 10-year periods whose $r^2$ coefficient is much lower than 20%!
So, as glimmers of hope, 1) there have been much worse underperforming periods in history than the current one (relative hope) and 2) the current $r^2$ coefficient is still very high for an equity valuation indicator^{4} (absolute hope).
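For reference, a rolling $r^2$ of forecasts vs. actual values, like the one displayed in Figure 9, can be computed as follows (a generic sketch, not the exact code behind the figure):

```python
import numpy as np

def rolling_r2(forecasts, actuals, window=40):
    """Squared correlation between forecasts and realized values over a
    trailing window (e.g. window=40 quarters ~ 10 years of quarterly data)."""
    out = []
    for t in range(window, len(forecasts) + 1):
        r = np.corrcoef(forecasts[t - window:t], actuals[t - window:t])[0, 1]
        out.append(r ** 2)
    return np.array(out)
```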
A proprietary variation of Micaletti’s methodology^{3} is implemented through the Portfolio Optimizer endpoint /markets/indicators/aiae/us, to compute:
Every year, major financial institutions publish their long-term capital market assumptions based on their internal valuation models (BlackRock, J.P.Morgan…).
The U.S. AIAE indicator enables individual investors to have access to such a valuation model in the case of the U.S. stock market.
Even better, the U.S. AIAE indicator also enables individual investors to have access to the “path” of expected long-term U.S. stock market returns, as for example regularly published by Micaletti on his Twitter account through images like Figure 10.
For comparison (shameless plug):
Thanks to these forecasts, it becomes possible to contextualize the long-term capital market assumptions of financial institutions.
For example:
This forecast can be compared to the U.S. AIAE forecast of ~2.1% in Figure 12, and to a not much higher value in Figure 11.
Lower valuations and higher yields mean that markets today offer the best potential long-term returns since 2010
This comment can be put into perspective by looking at the U.S. AIAE forecasts around 2010 in Figure 12 and Figure 11.
The U.S. AIAE forecasts of long-term U.S. stock market returns can be converted into future price scenarios for various traded instruments.
In the case of the SPY ETF, the price scenarios corresponding to Figure 12 are represented in Figure 13.
Such price scenarios are sometimes easier to grasp, or to describe to customers, than pure return forecasts.
Looking at Figure 13, it is for example clear that the price of the SPY ETF is currently not where it “should” be, and that, assuming the U.S. AIAE indicator continues to be reliable, we might be in for at best ~2 years of flat returns and at worst for a moderate to severe U.S. market correction anytime soon…
The forecasts generated by the U.S. AIAE indicator can also be used within a tactical asset allocation framework, as detailed for example in Micaletti^{3}, who notes that:
[…] over the last 43 years and across various subperiods, the AIAE-based TAA strategy delivered the most consistently high-level performance relative to its competitors.
Alternatively, they could also be integrated in any other tactical asset allocation framework that already uses an equity valuation indicator, like the framework described in Asness et al.^{8}.
I will not go into more details here, though.
Nearly one decade of out-of-sample forecasting performance confirms that the AIAE indicator introduced by Livermore^{4} has been very impressive in the U.S.
Now, whether the current period of (relative) underperformance will continue or stop in the future is of course open to debate, but in any case, I hope that this blog post has shed some light on this little-known equity valuation indicator.
For more uncommon quantitative nuggets, feel free to connect with me on LinkedIn or to follow me on Twitter.
–
This is a pseudonym. ↩
See Campbell, John Y., Robert J. Shiller (1988a). Stock prices, earnings, and expected dividends. Journal of Finance 43, 661-676. ↩
See Micaletti, Raymond, Towards a Better Fed Model. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9}
See The Single Greatest Predictor of Future Stock Market Returns. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9} ↩^{10} ↩^{11}
Livermore^{4} makes the working assumption that an investor can only invest in stocks, bonds or cash as far as financial assets are concerned; whether the development of alternative financial asset classes like cryptocurrencies will at some point impact this assumption remains to be seen. ↩
Although there is no reference in Livermore^{4} to the exact period used for his graphs, my best guess is 31st December 1951 - 30th September 2003. ↩ ↩^{2}
All stock market returns considered in this blog post are total returns, so that I will omit “total”. ↩
See Cliff Asness, Antti Ilmanen and Thomas Maloney, Market Timing: Sin a Little (Resolving the Valuation Timing Puzzle), Journal Of Investment Management, Volume 15, Number 3, 2017. ↩ ↩^{2} ↩^{3} ↩^{4}
Micaletti^{3} includes a summary of Livermore^{4}. ↩
See William F. Sharpe, The Arithmetic of Active Management, Financial Analysts Journal, Vol. 47, No. 1 (Jan. - Feb., 1991), pp. 7-9. ↩
The Alfred website is a point-in-time version of the Fred website, which gives access to initial releases or, more generally, to specific point-in-time versions of economic data. ↩
There is no reference in Livermore^{4} to whether the Fred data are initial releases, but my best guess is that they are not. ↩
I used the stock market returns provided on Kenneth French’s website, while Livermore^{4} used the S&P 500 returns. ↩
See BlackRock website. ↩
In this post, after providing the necessary definitions, I will reproduce the empirical study of Gerber et al.^{1} which highlights the superiority of the Gerber correlation matrix relative to the sample correlation matrix, and I will discuss some practical aspects associated with the usage of the Gerber statistic (how to choose the Gerber threshold…).
Notes:
- A Google sheet corresponding to this post is available here
Let two assets $i$ and $j$ be observed over $T$ time periods, with:
Consider then the scatter plot of the standardised joint returns $\left( \frac{r_{i,t}}{\sigma_i}, \frac{r_{j,t}}{\sigma_j} \right), t=1..T$ of these two assets, as depicted for example in Figure 1, slightly adapted from Flint and Polakow^{2}.
By introducing the Gerber threshold $c \in [0,1]$, this scatter plot can be partitioned into nine different subsets:
The Gerber statistic $g_{i,j}$ is then defined as^{1}
\[g_{i,j} = \frac{n_{i,j}^{UU} + n_{i,j}^{DD} - n_{i,j}^{UD} -n_{i,j}^{DU}}{T - n_{i,j}^{NN}}\], where:
The definition of the Gerber statistic makes it a measure of co-movement more robust to outliers and to noise than the Pearson correlation coefficient.
Indeed, as highlighted in Gerber et al.^{1}:
From this perspective, the Gerber statistic is especially well suited for financial time series, which often exhibit extreme movements and a great amount of noise^{1}.
Flint and Polakow^{2} reference three existing variations of the Gerber statistic and note that these alternative definitions materially change[] the resultant measure, to the point that the three GS variants in press should arguably be viewed as entirely separate dependence measures^{2}.
For the sake of clarity, this blog post will only discuss the variant termed the Gerber statistic in Gerber et al.^{1}.
Gerber et al.^{1} illustrate the computation of the Gerber statistic using the $T = 24$ monthly returns^{4} of the asset pair S&P 500 (SPX) - Gold (XAU) over the period January 2019 - December 2020.
I propose to re-use the same example and to validate the computation using the SPX - XAU returns available in the Google sheet associated with this post.
Figure 2 represents the scatter plot of the standardised joint returns of these two assets overlaid with the nine subsets corresponding to a Gerber threshold of 0.5.
Figure 2 is exactly the same as the figure in panel A of exhibit A2 of Gerber et al.^{1}, so that the computation of the Gerber statistic should result in a value of ~0.286.
Let’s double check this.
From Figure 2, we have:
So that:
\[g_{SPX,XAU} = \frac{7 + 1 - 0 - 2}{24 - 3} = \frac{6}{21} = \frac{2}{7} \approx 0.286\]

All good!
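The worked example above generalizes into a short function. Note that the exact inequality conventions at the threshold vary across the published variants, so the code below should be read as a sketch of the Gerber et al. variant rather than as a reference implementation:

```python
import numpy as np

def gerber_statistic(r_i, r_j, c=0.5):
    """Gerber statistic between two return series for threshold c:
    g = (n_UU + n_DD - n_UD - n_DU) / (T - n_NN)."""
    r_i, r_j = np.asarray(r_i, float), np.asarray(r_j, float)
    T = len(r_i)
    # an observation is "up"/"down" when it exceeds +/- c standard deviations
    up_i, dn_i = r_i >= c * r_i.std(), r_i <= -c * r_i.std()
    up_j, dn_j = r_j >= c * r_j.std(), r_j <= -c * r_j.std()
    concordant = np.sum(up_i & up_j) + np.sum(dn_i & dn_j)   # n_UU + n_DD
    discordant = np.sum(up_i & dn_j) + np.sum(dn_i & up_j)   # n_UD + n_DU
    neither = np.sum(~(up_i | dn_i) & ~(up_j | dn_j))        # n_NN
    return (concordant - discordant) / (T - neither)
```

Running this on the SPX - XAU returns from the Google sheet should reproduce the ~0.286 value above, up to the chosen threshold conventions.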
Let:
The asset Gerber correlation matrix $G \in \mathcal{M}(\mathbb{R}^{n \times n})$, also called the Gerber matrix, is then defined by:
\[G_{i,j} = g_{i,j}, i=1..n, j=1..n\], where $g_{i,j}$ is the Gerber statistic between asset $i$ and asset $j$.
Let:
The asset Gerber covariance matrix $\Sigma_G \in \mathcal{M}(\mathbb{R}^{n \times n})$ is then defined by:
\[\left( \Sigma_{G} \right)_{i,j} = g_{i,j} \, \sigma_i \, \sigma_j, i=1..n, j=1..n\], where $g_{i,j}$ is the Gerber statistic between asset $i$ and asset $j$.
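Putting the two definitions together, the Gerber correlation and covariance matrices can be assembled pairwise. Again a self-contained sketch, with one common reading of the threshold conventions:

```python
import numpy as np

def _gerber(r_i, r_j, c):
    # pairwise Gerber statistic: (n_UU + n_DD - n_UD - n_DU) / (T - n_NN)
    up_i, dn_i = r_i >= c * r_i.std(), r_i <= -c * r_i.std()
    up_j, dn_j = r_j >= c * r_j.std(), r_j <= -c * r_j.std()
    conc = np.sum(up_i & up_j) + np.sum(dn_i & dn_j)
    disc = np.sum(up_i & dn_j) + np.sum(dn_i & up_j)
    nn = np.sum(~(up_i | dn_i) & ~(up_j | dn_j))
    return (conc - disc) / (len(r_i) - nn)

def gerber_matrices(R, c=0.5):
    """Gerber correlation matrix G and covariance matrix Sigma_G
    for a T x n matrix of asset returns R."""
    R = np.asarray(R, float)
    n = R.shape[1]
    G = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            G[i, j] = G[j, i] = _gerber(R[:, i], R[:, j], c)
    sigmas = R.std(axis=0)
    return G, G * np.outer(sigmas, sigmas)
```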
Portfolio Optimizer implements two endpoints related to the Gerber statistic:
- /assets/correlation/matrix/gerber, to compute the Gerber correlation matrix
- /assets/covariance/matrix/gerber, to compute the Gerber covariance matrix

Gerber et al.^{1} analyze the empirical performance of the Gerber covariance matrix within Markowitz’s mean-variance framework.
For this, they consider a universe of nine asset classes:
, inside which they backtest the following portfolio investment strategy over the period January 1988 - December 2020:
Figure 3, reproduced from Gerber et al.^{1}, illustrates the resulting three ex post mean-variance efficient frontiers in the case of a Gerber threshold equal to 0.5.
On this figure, it is pretty clear that the efficient frontier corresponding to the Gerber covariance matrix dominates the two other efficient frontiers^{7}.
Based on these empirical findings, using the Gerber covariance matrix as an alternative to both [the sample covariance matrix] and to the shrinkage estimator of Ledoit and Wolf^{1} seems really compelling.
Nevertheless, some practicalities must be discussed first.
In order to determine whether the empirical performance reported in Gerber et al.^{1} is robust to slight changes in implementation details^{8}, I propose to reproduce the backtest of the portfolio investment strategy detailed in the previous section using Portfolio Optimizer^{9}.
Note that because Portfolio Optimizer does not support Ledoit-Wolf type shrinkage at the date of publication of this post^{10}, I am only able to compare the Gerber covariance matrix with the historical covariance matrix.
The two reproduced ex post mean-variance efficient frontiers are displayed in Figure 4 in the case of a Gerber threshold equal to 0.5.
Figure 3 and Figure 4 are pretty close^{11}, except for the mean-variance efficient portfolio with an annualized volatility target of 11%, which confirms the reproducibility of the empirical study of Gerber et al.^{1}.
The Gerber statistic, the Gerber correlation matrix and the Gerber covariance matrix all depend on the Gerber threshold, so that it is important to understand the impact of varying the Gerber threshold on these quantities.
In the case of two assets, this impact is extensively studied in Flint and Polakow^{2} through numerical simulations of joint normal and joint non-normal return distributions. Their conclusion is that the dependency of the Gerber statistic on the Gerber threshold is highly non-trivial…
On my (less ambitious) side, I will study the impact of varying the Gerber threshold from 0 to 1 in increments of 0.1 on the backtest of the portfolio investment strategy detailed in the previous section.
Figure 5 (resp. Figure 6) illustrates the evolution of the resulting portfolio investment strategy equity curves for an ex ante annualized volatility target of 5% (resp. 10%).
Figure 6 shows that the influence of the Gerber threshold on performance can be negligible, which is very good news.
Unfortunately, Figure 5 shows on the contrary that the influence of the Gerber threshold on performance can be far from negligible, which is bad news.
This leads to the question of how to “best” choose the Gerber threshold in practice.
From the definition of the Gerber statistic, the higher the Gerber threshold:
As a consequence, it would make sense to choose the Gerber threshold dynamically, as a function of the “signal-to-noise ratio” of the considered universe of assets.
In the context of the portfolio investment strategy detailed in the previous section, I experimented with a simple approach based on past risk-adjusted performance:
Figure 7 illustrates the evolution of the resulting portfolio investment strategy equity curve for an ex ante annualized volatility target of 5%.
From Figure 7, it appears that this simple data-driven method to choose the Gerber threshold is able to match the performance of the best Gerber threshold chosen in hindsight^{13} ($c = 0.5$).
Some (annualized) statistics to support this observation:
| | Gerber portfolio (fixed c = 0.50) | Gerber portfolio (adaptive c) |
| --- | --- | --- |
| Average return | 7.6% | 7.9% |
| Volatility | 6.2% | 6.2% |
| Sharpe ratio | 1.26 | 1.23 |
Of course, no generic conclusion can be drawn from this example, but I do think that the adaptive computation of the Gerber threshold would be an interesting research topic.
Gerber et al.^{1} mention that
In the empirical studies performed, and for all cases of Gerber thresholds $c$ considered, we always observe the […] Gerber matrix G to be positive semidefinite
, but give no formal proof that the Gerber correlation matrix is positive semidefinite in general.
It is thus natural to wonder whether this is the case, all the more so because positive semidefiniteness is usually lacking in other correlation matrices built from robust pairwise scatter estimates^{14}^{15}.
Fortunately, the Gerber correlation matrix is indeed positive semidefinite, as established in the paper Proofs that the Gerber Statistic is Positive Semidefinite by Gerber et al.^{16}
Note:
- The initial version of this post stated that there was no proof that the Gerber correlation matrix is positive semidefinite; this was incorrect, as a proof was available on Mr. Ernst’s website.
In their paper, Gerber et al.^{1} confine [their] analysis to the mean–variance optimization (MVO) framework of Markowitz.
What about other portfolio allocation frameworks, though?
Could the Gerber statistic be somehow tailored to the mean-variance framework, for example through a hidden relationship with quadratic utility?
To answer this question, I propose to adapt the portfolio investment strategy detailed in the previous section to the risk parity framework, and more precisely to the equal risk contributions framework^{17}, as follows:
Figure 8 illustrates the resulting portfolio investment strategy equity curves in the case of a Gerber threshold equal to 0.5.
Figure 8 empirically confirms that the Gerber covariance matrix also behaves properly in a non mean–variance framework.
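For completeness, here is one generic way to compute equal-risk-contribution weights, via Spinu-style cyclical coordinate descent (not necessarily the solver used for Figure 8), into which the Gerber covariance matrix can be plugged in place of the sample covariance matrix:

```python
import numpy as np

def erc_weights(cov, n_iter=500):
    """Long-only equal-risk-contribution weights for covariance matrix cov.
    Seeks w such that w_i * (cov @ w)_i is the same for all i, by cyclically
    setting each w_i to the positive root of a*w_i^2 + b*w_i - 1/n = 0."""
    cov = np.asarray(cov, float)
    n = cov.shape[0]
    w = np.ones(n) / n
    for _ in range(n_iter):
        for i in range(n):
            a = cov[i, i]
            b = cov[i] @ w - a * w[i]   # off-diagonal contribution to (cov @ w)_i
            w[i] = (-b + np.sqrt(b * b + 4.0 * a / n)) / (2.0 * a)
    return w / w.sum()
```

As a quick sanity check, for a diagonal covariance matrix, the resulting weights are inversely proportional to the asset volatilities, as expected for equal risk contributions.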
Figure 3 and Figure 4 might give the wrong impression that the Gerber covariance matrix always dominates the sample covariance matrix in terms of ex post risk-return within the Markowitz’s mean-variance framework^{18}.
In order to illustrate that this is not the case, I will backtest the same portfolio investment strategy as detailed in one of the previous sections, but this time with the ten-asset universe of the Adaptive Asset Allocation strategy^{19}^{20} from ReSolve Asset Management:
This universe is very similar to the nine-asset universe used in Gerber et al.^{1}, because it is also well-diversified in terms of asset classes.
Unfortunately, with this universe, the ex post efficient frontier corresponding to the Gerber covariance matrix no longer consistently dominates the ex post efficient frontier corresponding to the sample covariance matrix, as can be seen in Figure 9, Figure 10 and Figure 11.
Because of its definition, the Gerber statistic must intuitively be more sensitive than the sample covariance matrix to measurement error^{21}.
Still, in my own testing, I did not notice any excessive sensitivity^{22}.
For example, Figure 12 is the equivalent of Figure 9 when daily asset returns are used instead of monthly returns.
Comparing these two figures, it is hard to conclude that the Gerber covariance matrix is dramatically more sensitive to the number of observations than the sample covariance matrix^{23}.
That being said, Flint and Polakow^{2} investigate the sensitivity of the Gerber statistic to estimation error more rigorously, and do find that there is considerable variation in the GS when estimated with limited observations^{2}.
So, better to err on the side of caution here.
I hope that thanks to this post you now have a good overview of the Gerber statistic, along with some of the practical concerns associated with its usage.
As Flint and Polakow^{2} put it:
Overall, the GS is an interesting conditional dependence metric, but not without its flaws or caveats.
If you have any questions, or if you would like to discuss further, feel free to connect with me on LinkedIn or to follow me on Twitter.
–
See Gerber, S., B. Javid, H. Markowitz, P. Sargen, and D. Starer (2022). The gerber statistic: A robust co-movement measure for portfolio optimization. The Journal of Portfolio Management 48(2), 87–102. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9} ↩^{10} ↩^{11} ↩^{12} ↩^{13} ↩^{14} ↩^{15} ↩^{16} ↩^{17} ↩^{18} ↩^{19}
See Flint, Emlyn and Polakow, Daniel A., Deconstructing the Gerber Statistic. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7}
More precisely, Gerber et al.^{1} define a concordant pair of returns as a pair whose components both pierce their thresholds while moving in the same direction and a discordant pair of returns as a pair whose components pierce their thresholds while moving in opposite directions. ↩ ↩^{2}
All asset returns considered in this blog post are total returns. ↩
See Ledoit, O., and M. Wolf. 2004. “Honey, I Shrunk the Sample Covariance Matrix.” The Journal of Portfolio Management 30 (4): 110–119. ↩
The exact implementation details used by Gerber et al.^{1} can be found in the Python code associated with their paper; one important detail to note is that when there is no mean-variance efficient portfolio with the desired volatility, the minimum variance portfolio or the maximum return portfolio is used instead. ↩
The same conclusion applies for the two other values of the Gerber threshold, c.f. Gerber et al.^{1}. ↩
In particular, when there is no mean-variance efficient portfolio with a desired volatility because the desired volatility is too low, it might be more in line with the mean-variance framework to use a partially invested portfolio vs. the minimum variance portfolio as in Gerber et al.^{1}. ↩
I would like to thank Mr William Smyth^{24} for providing me with returns data for the nine-asset universe. ↩
It’s definitely on the to do list, though. ↩
In addition to the difference in managing the portfolio volatility constraint^{8}, there are other subtle differences in my reproduction of the backtest of Gerber et al.^{1}; for example, I do not consider any transaction cost, I use the arithmetic average return of indexes and not their geometric average return, etc. ↩
The performance of the method seems to be robust w.r.t. the lookback period; a lookback period of 12 months results in the best performance, but I chose 24 months to be consistent with the lookback period used to compute the mean-variance input estimates. ↩
In terms of the Sharpe ratio of the resulting portfolio investment strategy, the best Gerber threshold among the thresholds displayed in Figure 5 is equal to 0.5. ↩
Zhao, Tuo, et al. Positive Semidefinite Rank-Based Correlation Matrix Estimation With Application to Semiparametric Graph Estimation. Journal of Computational and Graphical Statistics, vol. 23, no. 4, 2014, pp. 895–922. JSTOR ↩
See F.A. Alqallaf, K.P. Konis, R.D. Martin, and R.H. Zamar. Scalable robust covariance and correlation estimates for data mining. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 14-23. ACM, 2002. ↩
See S. Gerber, H. Markowitz, P. Ernst, Y. Miao, B. Javid, P. Sargen, Proofs that the Gerber Statistic is Positive Semidefinite, arXiv. ↩
See Richard, Jean-Charles and Roncalli, Thierry, Constrained Risk Budgeting Portfolios: Theory, Algorithms, Applications & Puzzles. ↩
Which, to be clear, is not at all the conclusion of Gerber et al.^{1}. ↩
See Butler, Adam and Philbrick, Mike and Gordillo, Rodrigo and Varadi, David, Adaptive Asset Allocation: A Primer. ↩
The associated returns data have been retrieved using Tiingo. ↩
In this context, the measurement error is due to the short length of the time series of asset returns that are typically used for covariance matrix estimation. ↩
Note that I only used the Gerber statistic with well-diversified universes of assets and not with, say, a universe of stocks (S&P 500…). ↩
Or at the very least, if it really is, this does not translate into dramatically different risk-return performances, which is what ultimately matters from a portfolio management perspective. ↩
See William Smyth, Daniel Broby, An enhanced Gerber statistic for portfolio optimization, Finance Research Letters, Volume 49, 2022, 103229. ↩
Since its publication, mVaR has been widely adopted by academic researchers, financial regulators^{3} and practitioners, who typically highlight its straightforward numerical implementation and its ease of interpretation thanks to its explicit form^{4}.
Nevertheless, it has been observed in practice that mVaR only works well for non-normal distributions that are close to the Gaussian distribution and for tail probabilities which are not too small^{5}.
In this post, I will explain why, in light of the results of Maillard^{6} and Lamb et al.^{7}, who show that mVaR accuracy is related to the mathematics of the Cornish-Fisher expansion.
I will also empirically demonstrate, using Bitcoin and the SPY ETF, that the method proposed by Maillard^{6} to improve mVaR accuracy makes it usable for moderately to highly non-normal distributions as well as for small tail probabilities^{8}.
The (percentage) Value-at-Risk (VaR) of a portfolio of financial assets corresponds to the percentage of portfolio wealth that can be lost over a certain time horizon and with a certain probability^{9}.
More formally, the Value-at-Risk $VaR_{\alpha}$ of a portfolio over a time horizon $T$ (1 day, 10 days…) and at a confidence level $\alpha \in ]0,1[$ (95%, 97.5%, 99%…) can be defined^{5} as the opposite of the lower $1 - \alpha$ quantile of the portfolio return^{10} distribution over the time horizon $T$
\[\text{VaR}_{\alpha} (X) = - \inf_{x} \left\{x \in \mathbb{R}, P(X \leq x) \geq 1 - \alpha \right\}\], where $X$ is a random variable representing the portfolio return over the time horizon $T$.
This formula is also equivalent^{11} to
\[\text{VaR}_{\alpha} (X) = - F_X^{-1}(1 - \alpha)\], where $F_X^{-1}$ is the inverse cumulative distribution function, also called the quantile function, of the random variable $X$.
The previous definition of VaR is not directly usable, because it requires specifying the portfolio return distribution.
One possible approach is to approximate the portfolio return distribution by its empirical distribution, in which case the associated VaR is called historical Value-at-Risk (HVaR).
Another possible approach is to approximate the portfolio return distribution by a given probability distribution, in which case the associated VaR is called parametric Value-at-Risk.
When this distribution is chosen to be the Gaussian distribution $\mathcal{N}_{\mu, \sigma^2}$, that is, when $X \sim \mathcal{N} \left( \mu, \sigma^2 \right)$ with $\mu$ the location parameter and $\sigma$ the scale parameter, the associated VaR is called Gaussian Value-at-Risk (GVaR) and is computed through the formula^{12}
\[\text{GVaR}_{\alpha} (X) = - \mu - \sigma z_{1 - \alpha}\], where:
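As an illustrative sketch of the historical and Gaussian Value-at-Risk definitions above (the function names and the synthetic return data are mine):

```python
import numpy as np
from scipy.stats import norm

def hvar(returns, alpha):
    """Historical VaR: opposite of the empirical 1 - alpha quantile."""
    return -np.quantile(returns, 1.0 - alpha)

def gvar(returns, alpha):
    """Gaussian VaR: -mu - sigma * z_{1-alpha}."""
    mu, sigma = returns.mean(), returns.std(ddof=1)
    return -mu - sigma * norm.ppf(1.0 - alpha)

# synthetic daily returns; for Gaussian data both measures should be close
rng = np.random.default_rng(0)
r = rng.normal(0.0005, 0.01, 10_000)
```

For truly Gaussian returns, both measures converge to $-\mu - \sigma z_{1-\alpha}$ as the sample grows.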
Approximating a portfolio return distribution by a Gaussian distribution might be appropriate in some cases, depending on the assets present in the portfolio and on the time horizon^{13}. Generally speaking, though, financial assets exhibit skewed and fat-tailed return distributions^{2}, so it makes more sense to also consider higher moments than just the first two.
For this reason, Zangari^{1} proposed to approximate the $1 - \alpha$ quantile of the portfolio return distribution by a fourth order Cornish–Fisher expansion of the $1 - \alpha$ quantile of the standard normal distribution, which makes it possible to take into account the skewness and kurtosis present in the portfolio return distribution.
The resulting VaR, called modified Value-at-Risk or sometimes Cornish-Fisher Value-at-Risk (CFVaR), is computed through the formula^{12}
\[\text{mVaR}_{\alpha} (X) = - \mu - \sigma \left[ z_{1-\alpha} + (z_{1-\alpha}^2 - 1) \frac{\kappa}{6} + (z_{1-\alpha}^3-3z_{1-\alpha}) \frac{\gamma}{24} -(2z_{1-\alpha}^3-5z_{1-\alpha})\frac{\kappa^2 }{36} \right]\], where the location parameter $\mu$, the scale parameter $\sigma$, the skewness parameter $\kappa$ and the excess kurtosis parameter $\gamma$ are usually^{2} estimated by their sample counterparts computed from past portfolio returns
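The mVaR formula above can be sketched as follows, with sample moments plugged in as the Cornish-Fisher parameters; following the notations of this post, $\kappa$ is the skewness parameter and $\gamma$ the excess kurtosis parameter (the function name and synthetic data are mine):

```python
import numpy as np
from scipy.stats import norm, skew, kurtosis

def mvar(returns, alpha):
    """Modified (Cornish-Fisher) VaR, with sample moments used as plug-in
    estimators for the Cornish-Fisher parameters: kappa is the skewness
    parameter, gamma the excess kurtosis parameter."""
    mu, sigma = returns.mean(), returns.std(ddof=1)
    kappa, gamma = skew(returns), kurtosis(returns)  # kurtosis() is excess kurtosis
    z = norm.ppf(1.0 - alpha)
    z_cf = (z + (z**2 - 1) * kappa / 6
              + (z**3 - 3*z) * gamma / 24
              - (2*z**3 - 5*z) * kappa**2 / 36)
    return -mu - sigma * z_cf

# on near-Gaussian data, mVaR should be close to Gaussian VaR
rng = np.random.default_rng(0)
r = rng.normal(0.0005, 0.01, 10_000)
```

Note that `scipy.stats.kurtosis` returns excess kurtosis by default, matching the $\gamma$ parameter of the formula.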
Note that using this formula to compute VaR is equivalent to making the assumption that the portfolio return distribution follows what could be called a Cornish-Fisher distribution^{7} $\mathcal{CF}_{\mu, \sigma, \kappa, \gamma}$, whose inverse cumulative distribution function is given by
\[F_X^{-1}(u) = \mu + \sigma \left[ z_u + (z_u^2 - 1) \frac{\kappa}{6} + (z_u^3-3z_u) \frac{\gamma}{24} -(2z_u^3-5z_u)\frac{\kappa^2}{36} \right]\], where:
, which is also equivalent^{7} to making the assumption that
\[X \sim \mu + \sigma \left[ Z + (Z^2 - 1) \frac{\kappa}{6} + (Z^3-3Z) \frac{\gamma}{24} -(2Z^3-5Z)\frac{\kappa^2}{36} \right]\], where:
Figure 1 compares, over the period 01 February 1993 - 04 April 2023, the empirical distribution of the SPY ETF daily returns^{14} to the Cornish-Fisher distribution $\mathcal{CF}_{\mu_s, \sigma_s, \kappa_s, \gamma_s}$ with parameters:
On this figure, it is visible that the Cornish-Fisher distribution does not accurately approximate the empirical distribution of the SPY ETF returns.
The same also applies to the left tail of the empirical distribution of the SPY ETF returns, as can be seen in Figure 2.
On top of this poor approximation accuracy, and maybe even worse, taking a closer look at Figure 1 also reveals that the Cornish-Fisher distribution does not seem to be monotonic. For example, quantiles between 20% and 40% are positive while quantiles between 60% and 80% are negative! This means that the Cornish-Fisher distribution is not a proper probability distribution^{15}.
What could explain these observations, given that the Cornish-Fisher expansion is supposed, by construction, to be able to approximate the quantiles of any distribution?
Let’s dig into Maillard^{6}!
Maillard^{6} notes that in order for the Cornish-Fisher expansion to result in a well-defined quantile function, the skewness parameter $\kappa$ and the excess kurtosis parameter $\gamma$ must satisfy the constraints
\[| \kappa | \leq 6 \left( \sqrt{2} - 1 \right)\] \[27 \gamma^2 - (216 + 66 \kappa^2) \gamma + 40 \kappa^4 + 336 \kappa^2 \leq 0\]These two constraints define the domain of validity of the Cornish-Fisher expansion, represented in Figure 3.
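These two constraints are easy to check programmatically; a minimal sketch (the function name is mine):

```python
import numpy as np

def in_cf_domain(kappa, gamma):
    """Check whether the skewness parameter kappa and the excess kurtosis
    parameter gamma lie inside the domain of validity of the Cornish-Fisher
    expansion, per Maillard's two conditions."""
    c1 = abs(kappa) <= 6 * (np.sqrt(2) - 1)
    c2 = (27 * gamma**2 - (216 + 66 * kappa**2) * gamma
          + 40 * kappa**4 + 336 * kappa**2) <= 0
    return c1 and c2
```

For example, for $\kappa = 0$ the second condition reduces to $0 \leq \gamma \leq 8$, and the SPY ETF parameters quoted later in this post fall outside the domain.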
When used outside of its domain of validity, the Cornish-Fisher expansion is known to have several issues impacting its accuracy^{16}, among which non-monotonic quantiles.
And as can be seen in Figure 4, this is exactly what happens in the case of the SPY ETF, with the parameters $\left( \kappa, \gamma \right) \approx (-0.28740, 10.898897) $ clearly outside of the domain of validity of the Cornish-Fisher expansion.
Fortunately, there is a way to circumvent the relative narrowness of the domain of validity of the Cornish-Fisher expansion, thanks to a regularization procedure called increasing rearrangement^{17} and described in detail in Chernozhukov et al.^{18}
The impact of this procedure is illustrated in Figure 5, which compares the same two distributions as in Figure 1, except that the Cornish-Fisher distribution has been rearranged.
The rearranged Cornish-Fisher distribution is now monotonic, as it should be, but unfortunately, it only marginally improves the approximation of the empirical distribution of the SPY ETF returns.
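The rearrangement procedure itself is simple to sketch: evaluate the Cornish-Fisher quantile function over a grid of probabilities, then sort the values, which restores monotonicity (function names and grid are mine):

```python
import numpy as np
from scipy.stats import norm

def cf_quantile(u, mu, sigma, kappa, gamma):
    """Cornish-Fisher quantile function, as given in the post."""
    z = norm.ppf(u)
    return mu + sigma * (z + (z**2 - 1) * kappa / 6
                           + (z**3 - 3*z) * gamma / 24
                           - (2*z**3 - 5*z) * kappa**2 / 36)

def rearranged_cf_quantile(us, mu, sigma, kappa, gamma):
    """Increasing rearrangement of the Cornish-Fisher quantile function over
    a grid us of probabilities: evaluate, then sort (c.f. Chernozhukov et al.)."""
    return np.sort(cf_quantile(us, mu, sigma, kappa, gamma))

# illustrative parameters outside the Cornish-Fisher domain of validity:
# the raw quantile function is non-monotonic, the rearranged one is not
us = np.linspace(0.001, 0.999, 999)
q = rearranged_cf_quantile(us, 0.0, 0.01, -0.2874, 10.8989)
```

On a finite grid, sorting is exactly the increasing rearrangement of the discretized quantile function.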
So, either all hope is lost w.r.t. using mVaR with moderately non-normal return distributions or there is another problem hidden somewhere waiting to be found…
Let’s dig a little bit further into Maillard^{6}!
Maillard^{6} also notes that the scale, skewness and excess kurtosis parameters $\sigma$, $\kappa$ and $\gamma$ do not match the actual standard deviation $\sigma_{CF}$, skewness $ \kappa_{CF}$ and excess kurtosis $\gamma_{CF}$ of the Cornish-Fisher distribution $\mathcal{CF}_{\mu, \sigma, \kappa, \gamma}$.
More precisely, he establishes the following relationships
\[\begin{align} \mu_{CF} &= \mu \\ \sigma_{CF} &= \sigma \sqrt{ 1 + \frac{1}{96} \gamma^2 + \frac{25}{1296} \kappa^4 - \frac{1}{36} \gamma \kappa^2 } \\ \kappa_{CF} &= f_1(\kappa, \gamma) \\ \gamma_{CF} &= f_2(\kappa, \gamma) \\ \end{align}\], where:
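Relationship (2) can be evaluated directly; for instance, the following sketch (the function name is mine) computes the actual standard deviation of a Cornish-Fisher distribution from its parameters:

```python
import numpy as np

def cf_actual_std(sigma, kappa, gamma):
    """Actual standard deviation of the Cornish-Fisher distribution with scale
    parameter sigma, skewness parameter kappa and excess kurtosis parameter
    gamma, per relationship (2) of Maillard."""
    return sigma * np.sqrt(1 + gamma**2 / 96 + 25 * kappa**4 / 1296
                             - gamma * kappa**2 / 36)
```

With the SPY ETF parameters quoted earlier ($\kappa \approx -0.2874$, $\gamma \approx 10.8989$), the actual standard deviation is roughly 1.49 times the scale parameter, a large discrepancy.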
As a consequence, when the sample moments of a return distribution are used as plug-in estimators for the Cornish-Fisher parameters, the actual moments of the resulting Cornish-Fisher distribution differ from these sample moments!
Do they differ enough to create a real problem, though?
Re-using the SPY ETF example:
So, yes, they do differ a lot, especially the excess kurtosis!
This subtlety is the hidden problem explaining^{19} the observed lack of accuracy of modified Value-at-Risk when return distributions are not close to normal^{5}. Indeed, a “wrong” Cornish-Fisher distribution cannot be expected to accurately approximate anything useful.
The solution to this problem consists in inverting the relationships (1)-(4) between the actual moments and the parameters of the Cornish-Fisher distribution $\mathcal{CF}_{\mu, \sigma, \kappa, \gamma}$.
In other words, we need to determine the value of the parameters $\mu$, $\sigma$, $\kappa$ and $\gamma$ of the Cornish-Fisher distribution $\mathcal{CF}_{\mu, \sigma, \kappa, \gamma}$ so that its actual moments $\mu_{CF}$, $\sigma_{CF}$, $\kappa_{CF}$ and $\gamma_{CF}$ are equal to the sample moments $\mu_{s}$, $\sigma_{s}$, $\kappa_{s}$ and $\gamma_{s}$ of the empirical return distribution, c.f. Lamb et al.^{7}.
More on how to do this numerically later.
The resulting Cornish-Fisher distribution is called the corrected Cornish-Fisher distribution $\mathcal{cCF}_{\mu_s, \sigma_s, \kappa_s, \gamma_s}$ and the underlying Cornish-Fisher expansion the corrected Cornish-Fisher expansion^{4}.
Re-using one last time the SPY ETF example, we have:
, and Figure 6 compares the resulting corrected Cornish-Fisher distribution to the two distributions of Figure 5.
The approximation of the empirical return distribution by the corrected Cornish-Fisher distribution is so accurate that these two distributions are nearly indistinguishable in this figure.
Figure 7, Figure 8 and Figure 9 compare the left tail of the three distributions from Figure 6.
A nearly perfect fit again between the empirical return distribution and the corrected Cornish-Fisher distribution.
This example empirically demonstrates that modified Value-at-Risk, when corrected using the results of Maillard^{6}, works well for moderately non-normal distributions and for very small tail probabilities.
As mentioned in the previous section, computing the corrected Cornish-Fisher distribution requires inverting the relationships (1)-(4) between the actual moments and the parameters of the Cornish-Fisher distribution $\mathcal{CF}_{\mu, \sigma, \kappa, \gamma}$.
Because the location parameter $\mu$ is directly determined by (1), and because the scale parameter $\sigma$ is easily computed thanks to (2) once the skewness parameter $\kappa$ and the excess kurtosis parameter $\gamma$ have been computed, the main mathematical challenge is to invert the system of non-linear equations (3)-(4).
Before thinking about how to invert these equations numerically, we first need to make sure that they are invertible theoretically.
Lamb et al.^{7} prove that this is the case when the actual skewness $\kappa_{CF}$ and the actual excess kurtosis $\gamma_{CF}$ belong^{20} to what could be called the domain of validity of the corrected Cornish-Fisher expansion^{21}, represented in Figure 10.
Lamb et al.^{7} also establish that the resulting skewness parameter $\kappa$ and excess kurtosis parameter $\gamma$ belong to the domain of validity of the Cornish-Fisher expansion, which ensures that the resulting corrected Cornish-Fisher distribution is a proper distribution.
Note that the domain of validity of the corrected Cornish-Fisher expansion (Figure 10) is much wider than the domain of validity of the Cornish-Fisher expansion (Figure 3).
This is extremely important in applications, because the actual skewness $\kappa_{CF}$ and the actual excess kurtosis $\gamma_{CF}$ of the corrected Cornish-Fisher distribution typically correspond to the sample skewness $\kappa_s$ and to the sample excess kurtosis $\gamma_s$ of a given distribution^{22}, so that the corrected Cornish-Fisher distribution is valid in practice for a much wider range of skewness and excess kurtosis than the non-corrected Cornish-Fisher distribution.
At least two algorithms have been analyzed in the literature to compute the corrected Cornish-Fisher parameters from the actual moments:
Portfolio Optimizer implements a proprietary algorithm to compute the parameters of the corrected Cornish-Fisher distribution, whose general description is:
These are either directly provided as input of the endpoint (e.g. /assets/returns/simulation/monte-carlo/cornish-fisher/corrected
) or
computed from an empirical distribution of returns (e.g. /portfolio/analysis/value-at-risk/cornish-fisher/corrected
).
Once these parameters are known, the relationships (1)-(4) allow one to determine the resulting corrected Cornish-Fisher distribution $\mathcal{cCF}_{\mu_s, \sigma_s, \kappa_s, \gamma_s}$.
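While Portfolio Optimizer's algorithm is proprietary, the general moment-matching idea can be sketched as follows; this sketch is entirely my own: it computes the actual skewness and excess kurtosis of the Cornish-Fisher transform exactly by Gauss-Hermite quadrature, then inverts (3)-(4) by numerical root finding:

```python
import numpy as np
from scipy.optimize import root

# probabilists' Gauss-Hermite nodes/weights, for expectations under N(0, 1);
# exact here since the integrands are low-degree polynomials in z
_x, _w = np.polynomial.hermite_e.hermegauss(60)
_w = _w / np.sqrt(2 * np.pi)

def cf_actual_skew_kurt(kappa, gamma):
    """Actual skewness and excess kurtosis of the Cornish-Fisher transform
    g(Z) = Z + (Z^2-1)kappa/6 + (Z^3-3Z)gamma/24 - (2Z^3-5Z)kappa^2/36."""
    g = (_x + (_x**2 - 1) * kappa / 6 + (_x**3 - 3*_x) * gamma / 24
            - (2*_x**3 - 5*_x) * kappa**2 / 36)
    m = _w @ g
    var = _w @ (g - m)**2
    s = (_w @ (g - m)**3) / var**1.5
    k = (_w @ (g - m)**4) / var**2 - 3
    return s, k

def corrected_cf_params(skew_target, exkurt_target):
    """Solve for the (kappa, gamma) parameters whose Cornish-Fisher
    distribution has the given actual skewness and excess kurtosis."""
    def residual(p):
        s, k = cf_actual_skew_kurt(*p)
        return [s - skew_target, k - exkurt_target]
    return root(residual, x0=[skew_target, exkurt_target]).x
```

The starting point and solver choice are illustrative; a production implementation would need to handle the boundaries of the domain of validity with more care.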
Bitcoin is an example of an asset exhibiting strong non-normal characteristics^{24}, for which standard measures of Value-at-Risk like Gaussian Value-at-Risk or modified Value-at-Risk would be inaccurate.
But what about modified Value-at-Risk based on the corrected Cornish-Fisher expansion?
In order to investigate the accuracy of this measure, that I will call corrected Cornish-Fisher Value-at-Risk (cCFVaR), Figure 11 compares, over the period 20 August 2011 - 06 April 2023, the empirical distribution of Bitcoin daily returns^{14} to the corrected Cornish-Fisher distribution $\mathcal{cCF}_{\mu_s, \sigma_s, \kappa_s, \gamma_s}$ with actual moments:
It seems that the corrected Cornish-Fisher distribution does a pretty good job of approximating the empirical return distribution of Bitcoin, except in the right tail.
Figure 12 and Figure 13 compare the left tail of these two distributions.
These figures confirm that the corrected Cornish-Fisher distribution accurately approximates the empirical return distribution of Bitcoin down to a confidence level of $\approx 95\%$, but no lower.
This can also be confirmed numerically, with a comparison between historical Value-at-Risk and corrected Cornish-Fisher Value-at-Risk at different confidence levels:
Confidence level $\alpha$ | $\text{HVaR}_{\alpha}$ | $\text{cCFVaR}_{\alpha}$ |
---|---|---|
95% | 6.90% | 6.86% |
97.5% | 9.53% | 10.63% |
99% | 13.36% | 16.51% |
99.5% | 15.92% | 21.56% |
99.9% | 27.04% | 35.08% |
All in all, this example empirically demonstrates that modified Value-at-Risk, when corrected following the results of Maillard^{6}, works well for highly non-normal distributions with not too small tail probabilities.
The goal of this post was to highlight that the accuracy issues reported by practitioners with modified Value-at-Risk have been understood for more than ten years, but that, as Amedee-Manesme et al.^{4} put it:
this point […] does not seem to have received sufficient attention
If you are such a practitioner, I hope that this post will encourage you to double-check how modified Value-at-Risk is computed by your internal risk management software.
While waiting for an answer from your (puzzled) IT teams, feel free to connect with me on LinkedIn or follow me on Twitter.
–
See Zangari, P. (1996). A VaR methodology for portfolios that include options. RiskMetrics Monitor First Quarter, 4–12. ↩ ↩^{2}
See Martin, R. Douglas and Arora, Rohit, Inefficiency of Modified VaR and ES. ↩ ↩^{2} ↩^{3} ↩^{4}
For example, European financial regulators require to use mVaR in order to compute the Summary Risk Indicator (SRI), i.e. the risk score, of Packaged Retail Investment and Insurance Products (PRIIPs) starting 1st January 2023, c.f. regulatory Technical Standards on the content and presentation of the KIDs for PRIIPs. ↩
See Amedee-Manesme, CO., Barthelemy, F. & Maillard, D. Computation of the corrected Cornish–Fisher expansion using the response surface methodology: application to VaR and CVaR. Ann Oper Res 281, 423–453 (2019). ↩ ↩^{2} ↩^{3} ↩^{4}
See Stoyan V. Stoyanov, Svetlozar T. Rachev, Frank J. Fabozzi, Sensitivity of portfolio VaR and CVaR to portfolio return characteristics, Working paper. ↩ ↩^{2} ↩^{3}
See Maillard, Didier, A User’s Guide to the Cornish Fisher Expansion. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9}
See Lamb, John D., Maura E. Monville, and Kai-Hong Tee. Making Cornish–fisher Fit for Risk Measurement, Journal of Risk, Volume 21, Number 5, Pages 53-81. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7}
Like 1% quantile or even less. ↩
See Jorion, P. (2007). Value at risk: The new benchmark for managing financial risk. New York, NY: McGraw-Hill. ↩
In this post, returns are assumed to be logarithmic returns. ↩
This is the case when the portfolio return cumulative distribution function is strictly increasing and continuous; otherwise, a similar formula is still valid, with $F_X^{-1}$ the generalized inverse distribution function of $X$, but these subtleties - important in mathematical proofs and in numerical implementations - are out of scope of this post. ↩
See Boudt, Kris and Peterson, Brian G. and Croux, Christophe, Estimation and Decomposition of Downside Risk for Portfolios with Non-Normal Returns (October 31, 2007). Journal of Risk, Vol. 11, No. 2, pp. 79-103, 2008. ↩ ↩^{2}
Asset returns have a tendency to follow a distribution closer and closer to a Gaussian distribution the more the time period over which they are computed increases; this empirical property is called aggregational Gaussianity, c.f. Cont^{25}. ↩
The associated adjusted prices have been retrieved using Tiingo. ↩ ↩^{2}
This also means that it is possible to have $\text{mVaR}_{95\%} > \text{mVaR}_{99\%} $, which requires some funny arguments to be explained… ↩
See Barton, D.E., & Dennis, K.E. (1952). The conditions under which Gram-Charlier and Edgeworth curves are positive definite and unimodal. Biometrika, 39(3-4), 425–427. ↩
I will not enter into the mathematical details in this post, but it suffices to say that this procedure allows to correct the behavior of the Cornish-Fisher expansion when used outside of its domain of validity thanks to a sorting operator. ↩
See Chernozhukov, V., Fernandez-Val, I. & Galichon, A. Rearranging Edgeworth–Cornish–Fisher expansions. Econ Theory 42, 419–435 (2010). ↩ ↩^{2}
In addition, Maillard^{6} mentions that when the skewness and excess kurtosis parameters are small enough, in a loose sense, they coincide with the actual skewness and excess kurtosis of the Cornish-Fisher distribution, which perfectly explains the behavior of the modified Value-at-Risk observed in practice with return distributions close to normal^{5}. ↩
Actually, the result of Lamb et al.^{7} is a little bit more generic: they establish that the system of non-linear equations is invertible on a region which includes the domain of validity of the Cornish-Fisher expansion. ↩
The domain of validity of the corrected Cornish-Fisher expansion is the mathematical image, by the functions $f_1$ and $f_2$, of the domain of validity of the Cornish-Fisher expansion. ↩
In the context of this blog post, the given distribution is a return distribution (asset, portfolio, strategy…). ↩
This tentative computation is theoretically justified by the results from Lamb et al.^{7}. ↩
See Joerg Osterrieder, The Statistics of Bitcoin and Cryptocurrencies, Proceedings of the 2017 International Conference on Economics, Finance and Statistics (ICEFS 2017). ↩
See R. Cont (2001) Empirical properties of asset returns: stylized facts and statistical issues, Quantitative Finance, 1:2, 223-236. ↩
One issue with such instruments, though, is that their price history dates back to 2002 at best^{1}, which is problematic in some applications like trading strategy backtesting or portfolio historical stress-testing.
In this post, which builds on the paper Treasury Bond Return Data Starting in 1962 from Laurens Swinkels^{2}, I will show that the returns of specific bond ETFs - those seeking a constant maturity exposure to government-issued bonds - can be simulated using standard textbook formulas^{2} together with appropriate yields to maturity.
This makes it possible, in particular, to extend the price history of these ETFs by several decades, thanks to publicly available yield to maturity series published by governments, government-affiliated agencies, researchers…
Notes:
- A Google sheet corresponding to this post is available here
In what comes next, I will make heavy use of the formula expressing the price of a bond as a function of its yield to maturity.
This formula can be found in the appendix A3.1 Yield to maturity for settlement dates other than coupon payment dates of Tuckman and Serrat^{3}, and is reproduced below for convenience.
Consider a bond^{4} at a date $t$, with a remaining maturity equal to $T$, a yield to maturity equal to $y_t$ and a coupon rate equal to $c_t$.
Then, its price $P_t(c_t,y_t,T)$ per 100 face amount is equal to
\[\left( 1 + \frac{y_t}{2} \right)^{1 - \tau_{t}} \left[ \frac{100 c_{t}}{y_t} \left( 1 - \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2T}} \right) + \frac{100}{\left( 1 + \frac{y_t}{2} \right)^{2T}} \right]\], where $\tau_{t}$ is the fraction of a semiannual period until the next coupon payment.
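A direct implementation of this pricing formula might look as follows (the function name is mine):

```python
def bond_price(c, y, T, tau):
    """Price per 100 face amount of a semiannual-coupon bond with coupon rate c,
    yield to maturity y, remaining maturity T years and fraction tau of a
    semiannual period until the next coupon payment."""
    v = 1.0 / (1.0 + y / 2.0) ** (2 * T)  # discount factor over 2T semiannual periods
    return (1.0 + y / 2.0) ** (1 - tau) * (100.0 * c / y * (1.0 - v) + 100.0 * v)
```

As a sanity check, a par bond ($c = y$) at a coupon payment date ($\tau = 1$) prices exactly at 100.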
Using the bond yield formula, it is possible to approximate the total return $TR$ of a par bond over a specific period using only its remaining maturity at the beginning of the period, its yield to maturity at the beginning of the period and its yield to maturity at the end of the period.
In the case of a monthly period, consider a bond such that:
Then, assuming that
, the total return $TR_t$ of this bond from the end of the month $t-1$ to the end of the month $t$ can be approximated by
\[\frac{y_{t-1}}{12} + \frac{y_{t-1}}{y_t} \left( 1 - \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right) + \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} - 1\]A possible demonstration for the previous formula goes as follows.
At the end of the month $t-1$, the bond has the following characteristics:
Its price $P_{t-1}(c_{t-1},y_{t-1},T)$ is then equal, through the bond yield formula, to
\[100 \left( 1 + \frac{y_{t-1}}{2} \right)^{1 - \tau_{t-1}}\], with $\tau_{t-1}$ the fraction of a semiannual period until the next coupon payment at the end of month $t-1$.
At the end of the month $t$, the bond has the following characteristics:
Its price $P_t(c_{t},y_{t},T - \frac{1}{12})$ is then equal, through the bond yield formula, to
\[\left( 1 + \frac{y_t}{2} \right)^{1 - \tau_{t}} \left[ \frac{100 y_{t-1}}{y_t} \left( 1 - \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right) + \frac{100}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right]\], with $\tau_{t}$ the fraction of a semiannual period until the next coupon payment at the end of month $t$.
The total return $TR_t$ of this bond from the end of the month $t-1$ to the end of the month $t$ is then by definition equal to
\[\frac{P_t(c_{t},y_{t},T - \frac{1}{12})}{P_{t-1}(c_{t-1},y_{t-1},T)} - 1\], that is
\[\frac{\left( 1 + \frac{y_t}{2} \right)^{1 - \tau_{t}}}{\left( 1 + \frac{y_{t-1}}{2} \right)^{1 - \tau_{t-1}}} \left[ \frac{y_{t-1}}{y_t} \left( 1 - \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right) + \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right] - 1\]The first term of this expression corresponds to the re-investment of the accrued interest.
Under the practical assumptions that
and noticing that $ \tau_{t} = \tau_{t-1} - \frac{1}{6}$^{7}, this expression becomes
\[TR_t \approx \left[ \left( 1 + \frac{y_{t-1}}{2} \right)^{\frac{1}{6}} - 1 \right] + \left[ \frac{y_{t-1}}{y_t} \left( 1 - \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right) + \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right] - 1\]Finally, by linearizing the accrued interest through the first-order Taylor approximation $ \left( 1 + \frac{y_{t-1}}{2} \right)^{\frac{1}{6}} - 1 \approx \frac{y_{t-1}}{12} $, this expression becomes
\[TR_t \approx \frac{y_{t-1}}{12} + \left[ \frac{y_{t-1}}{y_t} \left( 1 - \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right) + \frac{1}{\left( 1 + \frac{y_t}{2} \right)^{2(T-\frac{1}{12})}} \right] - 1\]Remark:
- The formula above is based on a suggestion by Dr Winfried Hallerbach to improve the accuracy of the initial formula used in Swinkels^{2} which is based on a second-order Taylor approximation of the bond yield formula, c.f. Swinkels^{8}.
Thanks to a variation^{9} of the par bond total return formula established in the previous section, Swinkels^{2} describes how to construct long (total) return series for government bonds using publicly available constant maturity government rates^{10}.
These rates correspond to the yields to maturity of (fictitious) government bonds whose maturity is kept constant and are typically estimated by governments or government-affiliated agencies, which explains why they are publicly available. For example:
As a side note, long return series for government bonds are usually commercially licensed (Global Financial Data, Bloomberg…), so that the methodology of Swinkels^{2} contributes to providing a high-quality public alternative to commercially available data^{2} for research purposes.
As an illustration of the methodology of Swinkels^{2}, below are yields to maturity for 3 consecutive months taken from the FRED 10-Year Treasury Constant Maturity Rates series:
Date | Yield to maturity |
---|---|
31 Dec 2022 | 3.880% |
31 Jan 2023 | 3.520% |
28 Feb 2023 | 3.920% |
The total return series $ \left( TR_1, TR_2 \right) $ of the fictitious 10-year constant maturity government bond associated with these yields to maturity is then constructed by:
Computing the total return $TR_1$ from 31 Dec 2022 to 31 Jan 2023 thanks to the par bond total return formula, with $T = 10$, $y_{t-1} = 3.880\%$ and $y_t=3.520\%$.
This gives
\[TR_1 \approx \frac{0.0388}{12} + \frac{0.0388}{0.0352} \left( 1 - \frac{1}{\left( 1 + \frac{0.0352}{2} \right)^{2(10-\frac{1}{12})}} \right) + \frac{1}{\left( 1 + \frac{0.0352}{2} \right)^{2(10-\frac{1}{12})}} - 1\]That is
\[TR_1 \approx 3.31\%\]Computing the total return $TR_2$ from 31 Jan 2023 to 28 Feb 2023 thanks again to the par bond total return formula, but this time with $T = 10$^{11}, $y_{t-1} = 3.520\%$ and $y_t=3.920\%$.
This gives
\[TR_2 \approx \frac{0.0352}{12} + \frac{0.0352}{0.0392} \left( 1 - \frac{1}{\left( 1 + \frac{0.0392}{2} \right)^{2(10-\frac{1}{12})}} \right) + \frac{1}{\left( 1 + \frac{0.0392}{2} \right)^{2(10-\frac{1}{12})}} - 1\]That is
\[TR_2 \approx -2.97\%\]The Portfolio Optimizer endpoint /bonds/returns/par/constant-maturity
implements the methodology of Swinkels^{2}
using the par bond total return formula established in the previous section.
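The worked example above can be reproduced with a few lines of Python. This is a minimal sketch of the par bond total return formula established earlier; the function name is mine, not Portfolio Optimizer's:

```python
def par_bond_total_return(y_prev, y_curr, T):
    """One-month total return of a par bond with constant maturity T (years),
    semiannual compounding: linearized accrued interest plus price return."""
    accrued = y_prev / 12  # first-order Taylor approximation of the accrued interest
    n = 2 * (T - 1 / 12)   # number of remaining semiannual periods
    df = 1 / (1 + y_curr / 2) ** n  # discount factor at the new yield
    price = (y_prev / y_curr) * (1 - df) + df  # price of the (seasoned) par bond
    return accrued + price - 1

tr_1 = par_bond_total_return(0.0388, 0.0352, 10)  # 31 Dec 2022 -> 31 Jan 2023
tr_2 = par_bond_total_return(0.0352, 0.0392, 10)  # 31 Jan 2023 -> 28 Feb 2023
print(f"TR_1 ~ {tr_1:.2%}, TR_2 ~ {tr_2:.2%}")  # ~3.31% and ~-2.97%
```

Running it recovers the two total returns computed by hand above.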
Many government bond ETFs target a specific maturity, a specific average maturity or a specific maturity range for their underlying portfolio of government bonds.
For example, the iShares 7-10 Year Treasury Bond ETF
seeks to track the investment results of an index composed of U.S. Treasury bonds with remaining maturities between seven and ten years^{12}.
Intuitively, such ETFs should more or less behave like a constant maturity government bond, so that it should be possible to simulate their (total) returns using the methodology of Swinkels^{2} detailed in the previous section.
Nevertheless, and especially because these ETFs need to frequently rebalance their holdings^{13}, such simulated returns might not be accurate enough to be of any practical use…
Let’s dig in.
In order to illustrate the quality of the simulated returns discussed above, Figure 1 through Figure 5 compare the actual returns of the members of the iShares family of U.S. Treasury bond ETFs to the theoretical returns simulated using the methodology of Swinkels^{2}.
The theoretical returns of this ETF are simulated with the FRED 3-Year Treasury Constant Maturity Rates.
The theoretical returns of this ETF are simulated with the FRED 7-Year Treasury Constant Maturity Rates.
The theoretical returns of this ETF are simulated with the FRED 10-Year Treasury Constant Maturity Rates.
The theoretical returns of this ETF are simulated with the FRED 20-Year Treasury Constant Maturity Rates.
The theoretical returns of this ETF are simulated with the FRED 30-Year Treasury Constant Maturity Rates.
On all these figures, it is clear that simulated returns closely match actual returns.
The IEF ETF is a partial exception, though, as several of its simulated returns differed significantly from their actual counterparts over the period 2012 - 2014.
Nevertheless, for all the five ETFs, correlations between actual and simulated returns are greater than ~97%, which confirms that it is possible to accurately simulate the returns of constant maturity government bond ETFs^{14} using the methodology of Swinkels^{2}.
Each of the five ETFs analyzed in the previous section invests over a given segment of the U.S. Treasury yield curve (1-3 years, 3-7 years, 10-20 years…).
This segment is sometimes wide, as in the case of the TLH ETF, but this characteristic still allows these ETFs to be considered as constant maturity government bonds.
Now, what about non-constant maturity government bond ETFs?
To answer this question empirically, Figure 6 compares the actual returns of the iShares U.S. Treasury Bond ETF (GOVT ETF) to the
theoretical returns simulated using the methodology of Swinkels^{2} with a weighted average of 3-year, 7-year, 10-year, 20-year and 30-year Treasury constant maturity rates^{15}.
Once again, it appears that simulated returns are closely matching actual returns^{16}.
This example shows that, at least in some cases, it should be possible to accurately simulate the (total) returns of non-constant maturity government bond ETFs using the methodology of Swinkels^{2}, provided that these ETFs are considered as a weighted average of constant maturity government bonds instead of a single constant maturity government bond.
The previous sections demonstrated that the returns of constant maturity government bond ETFs can be simulated quite accurately.
This opens the door to extending their price history.
I will use the TLT ETF as an example.
Figure 5 showed that the actual returns of the TLT ETF are devilishly close to the theoretical returns simulated using the methodology of Swinkels^{2} with the FRED 30-Year Treasury Constant Maturity Rates series.
As a consequence, because the FRED provides the historical values of these rates back to February 1977, the price history of the TLT ETF can be extended by ~25 years.
This extended history is depicted in Figure 7.
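The backward extension itself is mechanical: starting from the earliest actual price, each simulated monthly return is unwound one month at a time through $P_{t-1} = P_t / (1 + TR_t)$. A minimal sketch, with made-up returns and an arbitrary starting price of 100:

```python
def backfill_prices(first_actual_price, simulated_returns):
    """Extend a price series backward in time: simulated_returns is ordered from
    oldest to newest, the newest return ending at the month of the first actual
    price. Each earlier price is obtained as P_{t-1} = P_t / (1 + TR_t)."""
    prices = [first_actual_price]
    for tr in reversed(simulated_returns):
        prices.append(prices[-1] / (1 + tr))
    prices.reverse()
    return prices

# Made-up simulated monthly returns preceding a first observed price of 100
extended = backfill_prices(100.0, [0.01, -0.02, 0.005])
```

Compounding the simulated returns forward from the back-filled starting price recovers the first actual price exactly.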
This blog post described how to use the methodology of Swinkels^{2} to simulate present and past returns of constant maturity government bond ETFs.
One possible next step is to also use this methodology to simulate future returns of such ETFs, from views on future yields to maturity.
Maybe the subject of another post.
Meanwhile, feel free to connect with me on LinkedIn or follow me on Twitter to discuss Portfolio Optimizer or how to best approximate bond ETF returns :-) !
–
See Swinkels, L., 2019, Treasury Bond Return Data Starting in 1962, Data 4(3), 91. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9} ↩^{10} ↩^{11} ↩^{12} ↩^{13} ↩^{14} ↩^{15}
See Tuckman, B., and Serrat A., 2022, Fixed Income Securities: Tools for Today’s Markets, 4th edition, John Wiley And Sons Ltd. ↩
In this post, I use the same conventions as in Tuckman and Serrat^{3}: bonds are assumed to be paying semiannual coupons, their coupon rate is assumed to be annual, their yield to maturity is assumed to be provided as semiannually compounded and their maturity is assumed to be expressed in years. ↩
Another sensible choice would be to use a rate equal to $\frac{y_{t-1} + y_{t}}{2}$. ↩
Since bonds with semi-annual coupons are paying coupons every six months, these coupons are anyway hardly collected and re-invested every month in practice, so that this is a sensible simplifying assumption. ↩
C.f. Tuckman and Serrat^{3} for explanations about the term $\frac{1}{6}$. ↩
See Swinkels, L., 2023, Historical Data: International monthly government bond returns, Erasmus University Rotterdam (EUR). ↩
C.f. the remark at the end of the previous section. ↩
Constant yield to maturity rates are frequently estimated from government bonds that trade close to par, even though interest rates have changed since their original issuance, which justifies the usage of this formula, c.f. Swinkels^{2}. ↩
The maturity $T$ did not change because the bond is supposed to have a constant maturity. ↩
C.f. the iShares 7-10 Year Treasury Bond ETF website. ↩
For example, in order to target a specific maturity range, a government bond ETF must replace its holdings whose remaining maturity has become too short. As a side note, this behaviour explains the crazy annual portfolio turnover rate of these ETFs, with for example a turnover rate of 114% for the iShares 7-10 Year Treasury Bond ETF in 2022^{17}. ↩
Or at the very least, to accurately simulate the (total) returns of some constant maturity government bond ETFs. ↩
The weights correspond to the percentage breakdown of the GOVT ETF portfolio per maturity, retrieved from the iShares U.S. Treasury Bond ETF website on 19 March 2023. ↩
Numerically, correlation between actual and simulated returns is ~98%. ↩
C.f. the iShares 7-10 Year Treasury Bond ETF annual or semi-annual report. ↩
In this post, based on the paper Optimal Portfolios in Good Times and Bad by Chow et al.^{2}, I will describe how the turbulence index can be used to partition a set of asset returns into different subsets, each of them corresponding to a specific market risk regime.
I will also provide two examples of usage, one in portfolio optimization and one in the modeling of asset returns.
Let be:
The (raw) turbulence index $d(y_t)$ for the universe of assets and for a given period $t=1..T$ is defined as the squared Mahalanobis distance^{2}:
\[d(y_t) = (y_t - \mu) {}^t \Sigma^{-1} (y_t - \mu)\]The main mathematical property of the turbulence index relevant for this post^{3} is the following^{2}:
Property 1: If the asset returns $y_t$ follow a multivariate Gaussian distribution, that is, $y_t \sim \mathcal{N} \left( \mu, \Sigma \right)$, the turbulence index $d(y_t)$ follows a chi-square distribution with $n$ degrees of freedom, that is, $d(y_t) \sim \mathcal{X}^2(n)$.
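Property 1 is easy to check by simulation. The sketch below (all parameters are made up) draws Gaussian returns for $n = 2$ assets, computes the turbulence index and compares its empirical mean and 80th percentile to those of the chi-square distribution with 2 degrees of freedom (mean $2$, 80th percentile $-2\ln 0.2 \approx 3.22$):

```python
import numpy as np

# Simulate T Gaussian returns for n = 2 assets (made-up parameters)
rng = np.random.default_rng(42)
n, T = 2, 100_000
mu_true = np.array([0.005, 0.003])
Sigma_true = np.array([[0.0019, -0.0005],
                       [-0.0005, 0.0011]])
y = rng.multivariate_normal(mu_true, Sigma_true, size=T)

# Turbulence index: squared Mahalanobis distance of each return vector
mu = y.mean(axis=0)
Sigma_inv = np.linalg.inv(np.cov(y, rowvar=False))
dev = y - mu
d = np.einsum("ti,ij,tj->t", dev, Sigma_inv, dev)

# Under Property 1, d ~ chi-square(2): mean ~ 2, 80th percentile ~ 3.22
print(d.mean(), np.quantile(d, 0.80))
```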
Chow et al.^{2} describe how to use the turbulence index to identify multivariate outliers from a series of asset returns. These outliers, which are characterized by the unusual performance of an individual asset or from the unusual interaction of a combination of assets, none of which are necessarily unusual in isolation^{2}, are representative of a [turbulent^{4} market] risk regime while the inliers are representative of a quiet market risk regime^{2}.
In more detail, the method of Chow et al.^{2} partitions a set of multivariate asset returns $y_t \in \mathbb{R}^{n}, t=1..T$ into two subsets corresponding to these two regimes as follows:
Compute the mean vector $\mu$ of the asset returns
Compute the covariance matrix $\Sigma$ of the asset returns
Choose a turbulence threshold $tt$%, which represents the percentage of asset returns desired to be classified as quiet, with typical values of 70%, 80%, 95%^{5}…
Convert the turbulence threshold $tt$% into a turbulence score $ts$
For each asset return vector $y_t, t=1..T$:
The turbulence threshold $tt$% is not directly comparable to the turbulence index values $d(y_t), t=1..T$ because they are not expressed in the same units^{6}.
As a consequence, $tt$% needs to be converted into a turbulence score $ts$.
Under the assumption that the asset returns $y_t$ follow a multivariate Gaussian distribution, and based on Property 1, this conversion can be done by computing the $tt$-th percentile of the chi-square distribution with $n$ degrees of freedom, that is,
\[ts = \left( \mathcal{X}^2(n) \right)^{-1} (tt)\]Nevertheless, because asset returns do not follow a multivariate Gaussian distribution in practice^{7}, this conversion will result in a proportion of asset returns classified as quiet that is different from the desired proportion.
This problem is highlighted in Chow et al.^{2}, in which a turbulence threshold of 75% is used to separate outliers from inliers while the actual proportion of asset returns classified as quiet vs. turbulent is equal to 79.1%.
A possible solution is to convert the turbulence threshold $tt$% into a turbulence score thanks to the computation of the $tt$-th empirical percentile of the turbulence index distribution^{8}^{9}, with the caveat that this solution requires a long enough series of asset returns.
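The partitioning procedure, using the empirical percentile conversion just described, can be sketched as follows (heavy-tailed toy returns stand in for real asset returns; all names are mine):

```python
import numpy as np

def partition_returns(y, tt=0.80):
    """Partition asset returns y (T x n) into quiet/turbulent subsets, using
    the tt-th empirical percentile of the turbulence index as turbulence score."""
    mu = y.mean(axis=0)
    Sigma_inv = np.linalg.inv(np.cov(y, rowvar=False))
    dev = y - mu
    d = np.einsum("ti,ij,tj->t", dev, Sigma_inv, dev)  # turbulence index
    ts = np.quantile(d, tt)  # empirical turbulence score
    quiet = d <= ts
    return y[quiet], y[~quiet], d, ts

# Heavy-tailed toy returns (Student-t), to mimic non-Gaussian asset returns
rng = np.random.default_rng(0)
y = 0.04 * rng.standard_t(df=4, size=(500, 2))
y_quiet, y_turbulent, d, ts = partition_returns(y, tt=0.80)
```

By construction, the proportion of returns classified as quiet matches the desired threshold, which is precisely the advantage of the empirical conversion over the chi-square one.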
I will illustrate the method of Chow et al.^{2} with a simple two-asset universe made of:
The turbulence index for this universe of assets, computed using monthly returns over the period August 2002 - January 2023^{10}, is represented in Figure 1.
Then, using for example a turbulence threshold $tt$% of 80%, converted into a turbulence score $ts$ of ~3.22^{11}:
This partitioning of the SPY-TLT returns seems to make some sense, as several periods of market stress are identifiable within the turbulent regime: the Global Financial Crisis, the COVID-19 pandemic, the Russian invasion of Ukraine…
Portfolio Optimizer implements the method from Chow et al.^{2} through the endpoint /assets/returns/turbulence-partitioned
, with two extensions:
Being able to partition asset returns into different market risk regimes has several applications.
For example, it makes it possible to analyze the potential behavior of a given portfolio during periods of market stress, which is of utmost importance for long-term investing.
As Chow et al.^{2} puts it:
[a] portfolio may not survive to generate long-term performance if [it] cannot withstand exceptional periods of market turbulence.
I will not dwell on this specific example, though; instead, I will provide one example related to portfolio optimization and one example related to the modeling of asset returns.
Under Markowitz’s mean-variance framework, building an optimal portfolio within a universe of assets requires an estimation of the asset covariance matrix.
Because [the] typical risk-estimation procedure […] is to weight a sample [of asset returns]’ observations equally in order to estimate risk parameters^{2}, the expected volatility of such a portfolio during periods of market stress, when asset returns typically become more volatile and more correlated, will be underestimated.
This situation is illustrated in Figure 3, taken from Chow et al.^{2}, in the case of a universe of assets made of eight distinct asset classes^{13}.
In this figure, it can be seen that an optimal portfolio whose asset covariance matrix is estimated from a non-partitioned set of asset returns (Full-Sample Optimal Mix) sees its volatility skyrocket during periods of market stress (Stressful environment) when compared to periods of market stability (Normal environment).
One solution to this issue is to estimate the asset covariance matrix from only the subset of asset returns corresponding to periods of market stress, but in this case, the expected portfolio return might be negatively impacted.
Indeed, as can also be seen in Figure 3, an optimal portfolio whose asset covariance matrix is estimated from only the subset of asset returns corresponding to periods of market stress (Outlier-Sample Optimal Mix) sees its expected return diminished by $\approx 1.24$% compared to the previous optimal portfolio (Full-Sample Optimal Mix).
Another solution, suggested by Chow et al.^{2} and Kritzman et al.^{9}, is to estimate a blended asset covariance matrix from both^{14}:
- the subset of asset returns corresponding to the quiet regime
- the subset of asset returns corresponding to the turbulent regime
More specifically, these papers suggest estimating an asset covariance matrix $\Sigma^*$ equal to
\[\Sigma^* = \lambda_{i}^* p_{i} \Sigma_{i} + \lambda_{o}^* \left( 1 - p_{i} \right) \Sigma_{o}\], where:
- $\Sigma_{i}$ (resp. $\Sigma_{o}$) is the asset covariance matrix estimated from the subset of asset returns corresponding to the quiet (resp. turbulent) regime
- $p_{i}$ is the probability of the quiet regime
- $\lambda_{i}^*$ (resp. $\lambda_{o}^*$) is the investor’s relative aversion to the quiet (resp. turbulent) regime
Such a blended covariance matrix enables investors to express their views about the likelihood of each risk regime and to differentiate their aversion to the regimes^{2}.
Now, in practice, the computation of $\Sigma^*$ requires estimating the probability $p_{i}$.
This can be done in an ad-hoc fashion, or using one’s preferred forecast technique, like the hidden Markov model used in Kritzman et al.^{9}.
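As a quick illustration of the blended covariance matrix, here is a minimal sketch. The regime covariance matrices are illustrative; the function name is mine, and the rescaling of the aversion parameters so that they sum to 2 follows the convention mentioned in the footnotes:

```python
import numpy as np

def blended_covariance(Sigma_i, Sigma_o, p_i, lambda_i=1.0, lambda_o=1.0):
    """Blend the quiet (inlier) and turbulent (outlier) covariance estimates,
    with the relative risk aversion parameters rescaled to sum to 2."""
    s = 2.0 / (lambda_i + lambda_o)
    return s * lambda_i * p_i * Sigma_i + s * lambda_o * (1.0 - p_i) * Sigma_o

# Illustrative (made-up) regime covariance matrices for two assets
Sigma_i = np.array([[0.0010, 0.0001],
                    [0.0001, 0.0008]])   # quiet regime
Sigma_o = np.array([[0.0040, -0.0015],
                    [-0.0015, 0.0030]])  # turbulent regime

# Equal aversion to both regimes, 80% probability of the quiet regime
Sigma_star = blended_covariance(Sigma_i, Sigma_o, p_i=0.80)
```

With equal aversions, the blend reduces to the probability-weighted average of the two regime covariance matrices; raising `lambda_o` tilts $\Sigma^*$ toward the turbulent estimate.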
A couple of remarks to finish:
Under Markowitz’s mean-variance framework, building an optimal portfolio within a universe of assets also requires an estimation of the expected asset returns.
They are assumed to be regime-independent in Chow et al.^{2}, but they could equally well be made conditional on the regime as in Bruder et al.^{17}.
In the specific case where investors are equally averse to both the quiet and the turbulent regime, that is, $\lambda_{i}^* = \lambda_{o}^* = 1$, the formula for $\Sigma^*$ simplifies to
\[\Sigma^* = p_{i} \Sigma_{i} + \left( 1 - p_{i} \right) \Sigma_{o}\]
It has been known since the early 1960s that the (marginal) statistical distribution of asset returns is neither normal nor lognormal^{7}, but more than sixty years later it is still an open question in financial mathematics to determine its exact nature.
Empirically, though, it has been demonstrated that several distributions are able to capture most of the stylized facts^{18} of asset returns.
One such distribution is the Gaussian mixture distribution, which is a convex combination of Gaussian distributions with different means and variances in the univariate case and a convex combination of multivariate Gaussian distributions with different mean vectors and covariance matrices in the multivariate case.
The main advantage of this distribution over other alternatives like the multivariate t distribution is that it is a non-elliptical distribution^{19} that is both numerically tractable^{20} and extremely flexible.
For example, a univariate Gaussian mixture distribution can be unimodal, symmetric, skewed, multimodal, leptokurtic…^{21}.
As another example, a multivariate Gaussian mixture distribution with two components makes it possible to approximate a multivariate jump-diffusion model driven by a standard Lévy process^{17}.
In this context, the method of Chow et al.^{2} can be applied to fit the parameters of a two-component^{22} multivariate Gaussian mixture distribution as detailed below^{23}:
In order to illustrate the validity of this approach on the two-asset SPY-TLT universe introduced in the previous section, Figure 4 through Figure 6 compare the empirical distribution of the monthly SPY (log) returns with the first marginal of:
It is clearly visible in these figures that both marginal Gaussian mixture distributions are much more appropriate than the marginal Gaussian distribution for modeling the SPY returns, with a slightly better fit obtained with the expectation–maximization algorithm.
This is confirmed numerically by the Kolmogorov-Smirnov goodness of fit test^{27}.
Going beyond univariate marginals, a 2D Kolmogorov-Smirnov goodness of fit test^{28} also confirms that both multivariate Gaussian mixture distributions are much more appropriate than the multivariate Gaussian distribution to model the joint SPY-TLT returns^{29}, with a slightly better fit again obtained with the expectation–maximization algorithm.
This example shows that it is possible to fit the parameters of a multivariate Gaussian mixture distribution modeling joint asset returns through an easily interpretable procedure, with no local optima and no convergence issue to worry about^{30}.
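The turbulence-based fitting procedure can be sketched as follows: partition the returns with the turbulence index, then read each mixture component's weight, mean vector and covariance matrix off the corresponding subset. This is a toy, NumPy-only version with simulated heavy-tailed returns rather than the actual SPY-TLT data, and all names are mine:

```python
import numpy as np

def fit_mixture_from_partition(y, tt=0.80):
    """Fit a two-component multivariate Gaussian mixture: partition the returns
    y (T x n) with the turbulence index, then take each component's weight,
    mean vector and covariance matrix from the corresponding subset."""
    mu = y.mean(axis=0)
    Sigma_inv = np.linalg.inv(np.cov(y, rowvar=False))
    dev = y - mu
    d = np.einsum("ti,ij,tj->t", dev, Sigma_inv, dev)
    quiet = d <= np.quantile(d, tt)
    return [
        {"weight": mask.mean(),
         "mean": y[mask].mean(axis=0),
         "cov": np.cov(y[mask], rowvar=False)}
        for mask in (quiet, ~quiet)
    ]

# Heavy-tailed toy returns standing in for the SPY-TLT data
rng = np.random.default_rng(1)
y = 0.04 * rng.standard_t(df=4, size=(1_000, 2))
quiet_comp, turbulent_comp = fit_mixture_from_partition(y)
```

The component weights sum to one by construction, and the turbulent component ends up with the larger covariance, as expected.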
This concludes this second post on the turbulence index.
As usual, feel free to connect with me on LinkedIn or follow me on Twitter to discuss about Portfolio Optimizer or quantitative finance in general.
–
See M. Kritzman, Y. Li, Skulls, Financial Turbulence, and Risk Management, Financial Analysts Journal, Volume 66, Number 5, Pages 30-41, Year 2010. ↩
See George Chow, Jacquier, E., Kritzman, M., & Kenneth Lowry. (1999). Optimal Portfolios in Good Times and Bad. Financial Analysts Journal, 55(3), 65–73. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9} ↩^{10} ↩^{11} ↩^{12} ↩^{13} ↩^{14} ↩^{15} ↩^{16} ↩^{17}
For other properties, c.f. the first blog post of this series. ↩
The turbulent regime is called stressful in Chow et al.^{2}. ↩
To be noted that $1 - tt$% is used in Chow et al.^{2} as the turbulence threshold, and not directly $tt$%. ↩
The turbulence threshold is expressed as a percentage, while the turbulence index is expressed as a squared Mahalanobis distance. ↩
See Mandelbrot B, The variation of certain speculative prices, The Journal of Business, 1963, vol. 36, 394. ↩ ↩^{2}
See M. Kritzman, Y. Li, Skulls, Financial Turbulence, and Risk Management, Financial Analysts Journal, Volume 66, Number 5, Pages 30-41, Year 2010. ↩ ↩^{2}
See Mark Kritzman, Kenneth Lowry and Anne-Sophie Van Royen, Risk, Regimes, and Overconfidence, The Journal of Derivatives Spring 2001, 8 (3) 32-42. ↩ ↩^{2} ↩^{3} ↩^{4}
I retrieved the monthly adjusted ETF prices over the period July 2002 - January 2023 using Tiingo. ↩
Using for example the Matlab function chi2inv, chi2inv(0.80,2) = 3.218875824868201. ↩
For example, Kinlaw et al.^{31} use, although in a slightly different context, three subsets corresponding to the three market risk regimes calm, moderate and turbulent. ↩
The eight asset classes are Domestic equities, Foreign equities, Emerging market, Domestic bonds, Foreign bonds, High-yield bonds, Commodities and Cash. ↩
Or from all the subsets of asset returns corresponding to all the market regimes in case more than two turbulence thresholds are used. ↩
In Chow et al.^{2}, the relative risk aversion parameters are actually rescaled so that they sum to 2, that is, they must verify $\lambda_{i}^* + \lambda_{o}^* = 2$. ↩
Because Chow et al^{2} consider only two regimes, $p_{o} = 1 - p_{i}$. ↩
See Bruder, Benjamin and Kostyuchyk, Nazar and Roncalli, Thierry, Risk Parity Portfolios with Skewness Risk: An Application to Factor Investing and Alternative Risk Premia (September 22, 2016). ↩ ↩^{2}
See R. Cont (2001) Empirical properties of asset returns: stylized facts and statistical issues, Quantitative Finance, 1:2, 223-236. ↩
See Ian Buckley, David Saunders, Luis Seco, Portfolio optimization when asset returns have the Gaussian mixture distribution, European Journal of Operational Research, Volume 185, Issue 3, 2008, Pages 1434-1461. ↩
Because it is “just” an extension of the Gaussian model, calculations with this distribution are usually similar to those using the Gaussian distribution. ↩
See Gaussian mixtures and financial returns, C. Cuevas-Covarrubias, J. Inigo-Martinez, R. Jimenez-Padilla, Discussiones Mathematicae, Probability and Statistics 37 (2017) 101–122. ↩
Or of a multivariate Gaussian mixture distribution with more than two components in case more than two turbulence thresholds are used. ↩
This approach is similar in spirit to the thresholding method of Bruder et al.^{17}. ↩
Or, for more robustness in case the chi-square distribution is used to convert the turbulence threshold into a turbulence score, to the actual proportion of asset returns qualified as quiet vs. turbulent. ↩
This turbulence threshold results in an actual proportion of asset returns that qualify as quiet vs. turbulent equal to ~83%. ↩
Thanks to the Python Scikit-Learn package. ↩
The Kolmogorov-Smirnov statistics (resp. p-values) for the three marginal distributions are, in order, ~0.0892, ~0.0544, ~0.0527 (resp. ~0.0373, ~0.4437, ~0.4852). ↩
Using the Python library https://github.com/syrte/ndtest. ↩
The 2-sample 2D Kolmogorov-Smirnov p-values^{28} are usually below ~0.01 for the multivariate Gaussian distribution and much greater than ~0.20 for both multivariate Gaussian mixture distributions, indicating that the joint SPY-TLT returns distribution is significantly different from the former and not significantly different from either of the latter. ↩
See Hichem Snoussi, Ali Mohammad-Djafari, Penalized maximum likelihood for multivariate Gaussian mixture, arXiv. ↩
See William Kinlaw, Mark P. Kritzman, David Turkington, Harry M. Markowitz, A Practitioner’s Guide to Asset Allocation, Wiley. ↩
I first detail the mathematics associated with the diversification ratio and then present a few of its possible uses in both portfolio analysis and portfolio optimization.
Notes:
- A Google sheet corresponding to this post is available here
Let be:
The diversification ratio $DR(w)$ of a portfolio with asset weights $w$ is the ratio of the weighted average of the asset volatilities to the portfolio volatility^{1}, that is
\[DR(w) = \frac{ \sigma{}^t w}{\sqrt{ w {}^t \Sigma w }}\]Choueifaty et al.^{2} show that the diversification ratio can be decomposed into an asset correlation component and an asset weight component, and that the diversification ratio increases (resp. decreases) when:
From this perspective, the diversification ratio can be interpreted as a correlation-based measure of portfolio diversification consistent with the intuition that portfolios with concentrated weights and/or highly correlated holdings would be poorly diversified^{2}, whatever the exact measure of diversification used^{4}.
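As a quick illustration of the definition, the diversification ratio of an equal-weighted portfolio of two assets with identical volatility and correlation 0.2 can be computed as follows (a minimal sketch; names and numbers are mine):

```python
import numpy as np

def diversification_ratio(w, Sigma):
    """Ratio of the weighted average asset volatility to the portfolio volatility."""
    sigma = np.sqrt(np.diag(Sigma))  # individual asset volatilities
    return (w @ sigma) / np.sqrt(w @ Sigma @ w)

# Two assets with 20% volatility each and a correlation of 0.2 (made-up numbers)
Sigma = np.array([[0.04, 0.008],
                  [0.008, 0.04]])
w = np.array([0.5, 0.5])
dr = diversification_ratio(w, Sigma)  # equals sqrt(2 / (1 + 0.2)) ~ 1.29
```

A single-asset portfolio has a diversification ratio of exactly 1, the lowest possible value.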
The constrained most diversified portfolio is the portfolio whose asset weights $w_{CMDP}$ satisfy
\[w_{CMDP} = \operatorname{argmax} \frac{w {}^t \sigma }{\sqrt{ w {}^t \Sigma w}} \; \textrm{s.t. } w \in S\], where $S$ is a subset of $\mathbb{R}^n$ representing asset weight constraints like full-investment constraint, long-only constraint, maximum and minimum weight constraints, etc.
Choueifaty et al.^{1}^{2} demonstrate that the most diversified portfolio and the long-short most diversified portfolio always exist and are both unique if the covariance matrix $\Sigma$ is invertible^{5}.
Similarly, under reasonable assumptions^{6}, it can be demonstrated that the constrained most diversified portfolio always exists and is unique if the covariance matrix $\Sigma$ is invertible.
Risk-based portfolio construction methods^{7}, to which the most diversified portfolio belongs, grew in popularity after the Global Financial Crisis of 2007-2008, during which many portfolio allocation strategies that were supposed to be well-diversified^{4} failed to deliver the promised diversification benefits.
The most diversified portfolio aims to deliver investors the full benefit of the equity premium^{2} by avoiding in particular the concentration risk associated with market capitalization-weighted portfolios.
The name of this portfolio might be misleading, though, because it is by definition the portfolio which maximizes the diversification ratio, but nothing can be said in general about its maximization properties related to other measures of diversification.
As Lee^{8} puts it:
In other words, it cannot be ruled out that the MDP that maximizes […] the diversification ratio can itself be a relatively concentrated portfolio that is not as diversified when judged by other definitions of diversification.
Choueifaty et al.^{1}^{2} establish the following core properties^{9}:
Property 1: The long-short most diversified portfolio has the same (strictly) positive correlation $\frac{1}{DR \left( w_{LSMDP} \right) }$ with all the assets.
Property 2: The most diversified portfolio has the same (strictly) positive correlation $\frac{1}{DR \left( w_{MDP} \right) }$ with all the assets that have a non-null weight in that portfolio. Additionally, any null-weighted asset is more correlated to the most diversified portfolio than any non null-weighted asset.
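Properties 1 and 2 can be verified numerically. As a side note not stated explicitly above, because the diversification ratio is scale-invariant, its first-order condition yields a long-short maximizer proportional to $\Sigma^{-1} \sigma$; the sketch below (made-up covariance matrix, function name mine) uses this closed form and checks that the resulting portfolio has the same correlation $\frac{1}{DR}$ with all the assets:

```python
import numpy as np

def long_short_mdp(Sigma):
    """Long-short most diversified portfolio: the diversification ratio is
    scale-invariant, and its first-order condition gives a maximizer
    proportional to Sigma^{-1} sigma (normalized here to sum to one)."""
    sigma = np.sqrt(np.diag(Sigma))
    w = np.linalg.solve(Sigma, sigma)
    return w / w.sum()

# Made-up covariance matrix for a three-asset universe
Sigma = np.array([[0.04,  0.008, 0.002],
                  [0.008, 0.09,  0.01],
                  [0.002, 0.01,  0.0225]])
w_mdp = long_short_mdp(Sigma)

# Property 1: the portfolio has the same correlation 1/DR with all the assets
sigma = np.sqrt(np.diag(Sigma))
port_vol = np.sqrt(w_mdp @ Sigma @ w_mdp)
correlations = (Sigma @ w_mdp) / (port_vol * sigma)
dr = (w_mdp @ sigma) / port_vol
print(correlations, 1.0 / dr)
```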
Froidure et al.^{3} establish several other properties, among which:
Property 3: The most diversified portfolio is the portfolio, among all long-short portfolios, that maximizes its minimal correlation with all the assets, with all the long-only portfolios and with all the long-only factors^{10}.
Property 4: The most diversified portfolio is the portfolio, among all long-only portfolios, that is the most correlated with the long-short most diversified portfolio.
Property 5: The most diversified portfolio is stable^{11}, in that a small^{12} perturbation in the asset covariance matrix results in a small perturbation in the most diversified portfolio asset weights.
And other authors have also contributed to the analysis of the most diversified portfolio:
The correlation spectrum of a portfolio is the $n$-dimensional vector made of the correlation of that portfolio with all the $n$ assets^{3}.
More formally, the correlation spectrum of a portfolio with asset weights $w$ is the vector $\rho(w) \in [-1,1]^n$ satisfying
\[\rho(w)_i = \frac{ e_i {}^t \Sigma w }{\sqrt{ w {}^t \Sigma w } \sqrt{ e_i {}^t \Sigma e_i }} , i=1..n\], where $e_i \in \mathbb{R}^n$ is the i-th vector of the canonical basis of $\mathbb{R}^n$, representing the single-asset portfolio fully invested in the asset i.
Froidure et al.^{3} note that the weight of an asset in a portfolio does not in general reflect the extent to which the portfolio is exposed to that particular asset.
As an example, Figure 1 compares the evolution of two portfolios invested in the universe of the 11 S&P Sectors represented by the 11 Sector SPDR ETFs, over the year 2022:
From this figure, as well as from the correlation between these two portfolios^{18}, it appears that the exposure of the portfolio $P_{SPDRs}$ to the Real Estate sector is actually far from null despite the portfolio not being invested at all in this sector!
The correlation spectrum aims to solve this shortcoming of asset weights and to unveil the effective exposure of a portfolio, in terms of correlations, to all the assets in a universe of assets.
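The correlation spectrum itself is straightforward to compute. The sketch below reproduces the zero-weight-but-nonzero-exposure effect on a toy two-asset universe (made-up covariance matrix; the function name is mine):

```python
import numpy as np

def correlation_spectrum(w, Sigma):
    """Correlation of the portfolio w with each single-asset portfolio."""
    asset_vols = np.sqrt(np.diag(Sigma))
    port_vol = np.sqrt(w @ Sigma @ w)
    return (Sigma @ w) / (port_vol * asset_vols)

# Two assets, 20% volatility each, correlation 0.3 (made-up numbers); the
# portfolio is fully invested in asset 1 and holds nothing of asset 2...
Sigma = np.array([[0.04, 0.012],
                  [0.012, 0.04]])
w = np.array([1.0, 0.0])
rho = correlation_spectrum(w, Sigma)
# ...yet its exposure to asset 2, in terms of correlation, is 0.3, not 0
print(rho)
```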
The main property of the correlation spectrum is^{3}:
Property 6: The correlation spectrum is a correlation-based representation of a long-short portfolio that is fully equivalent, up to leverage, to the representation of a portfolio in terms of asset weights.
This property allows one to work interchangeably with the representation of a portfolio in terms of asset correlations or in terms of asset weights, depending on whichever representation is better suited to a given context.
A portfolio with asset weights $w$ is said^{3} to be rho-representative^{19} if its correlation spectrum $\rho(w)$ is such that $\rho(w)_i > 0$, $i=1..n$.
In other words, a rho-representative portfolio is (strictly) positively exposed, in terms of correlations, to all the assets in a universe of assets.
While investigating various properties related to rho-representativity, Froidure et al.^{3} derive the following major relationship between any long-only portfolio and the diversification ratio of the most diversified portfolio:
Property 7: A long-only portfolio is (strictly) positively correlated to at least one asset, a lower bound for this correlation being $\frac{1}{DR \left( w_{MDP} \right)^2}$.
They also examine the rho-representativity of several well known portfolios, and conclude that:
Portfolio Optimizer implements several endpoints related to the diversification ratio:
/portfolio/analysis/diversification-ratio
/portfolio/analysis/correlation-spectrum
/portfolio/optimization/most-diversified
The most natural way to use the diversification ratio is to assess the level of diversification of a specific portfolio.
For instance, re-using the universe of the 11 S&P Sectors introduced in the previous section:
The diversified near-minimum variance portfolio is visibly less concentrated in terms of asset weights than the minimum variance portfolio and should intuitively be more diversified.
Computing the diversification ratio confirms this intuition^{21}:
Portfolio | Diversification ratio |
---|---|
Minimum variance | ~1.18 |
Diversified near-minimum variance | ~1.24 |
Choueifaty et al.^{2} observe that:
Based on these observations, they propose to interpret the squared diversification ratio of a portfolio $DR(w)^2$ as the effective number of independent risk factors, or degrees of freedom, represented in the portfolio^{2}, which makes it very similar to another measure of portfolio diversification called the effective number of bets and introduced by Meucci^{22}.
In order to empirically validate this interpretation, I propose to estimate the number of independent risk factors represented in the S&P 500 using both the squared diversification ratio and the effective number of bets of the portfolio $P_{SPY}$ reproducing the S&P 500 within the universe of the 11 S&P Sectors^{25}.
Results are displayed in Figure 4, in which it indeed appears that the general behavior of these two measures is comparable, with a severe decrease over the whole year 2022.
However, one issue with the interpretation of $DR(w)^2$ as the number of independent risk factors represented in a portfolio is that $DR(w)^2$ is unbounded^{26} due to the diversification ratio itself being unbounded.
This is easily established in the case of two assets, where the diversification ratio of an equal-weighted portfolio of two assets with a pairwise correlation $\rho$ and with an identical volatility equals $\sqrt { \frac{2}{1 + \rho} }$, which goes to infinity as $\rho$ goes to -1.
More on this a little bit later.
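The two-asset case can be checked numerically, using the closed form $\sqrt{\frac{2}{1+\rho}}$ established above (a minimal sketch; the function name is mine):

```python
import numpy as np

def two_asset_dr(rho):
    """DR of an equal-weighted portfolio of two assets with identical
    volatility and pairwise correlation rho."""
    Sigma = np.array([[1.0, rho],
                      [rho, 1.0]])  # unit volatilities, correlation rho
    w = np.array([0.5, 0.5])
    sigma = np.sqrt(np.diag(Sigma))
    return (w @ sigma) / np.sqrt(w @ Sigma @ w)

# The closed form sqrt(2 / (1 + rho)) diverges as rho goes to -1
for rho in (0.5, 0.0, -0.9, -0.999):
    print(rho, two_asset_dr(rho), (2.0 / (1.0 + rho)) ** 0.5)
```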
A consequence of the previous paragraph is that comparing the squared diversification ratio of a portfolio $DR(w)^2$ to the squared diversification ratio of the most diversified portfolio $DR(w_{MDP})^2$ makes it possible to analyze the amount of untapped diversification^{27} of that portfolio in terms of exposures to independent risk factors within a universe of assets.
This kind of analysis is used in practice by people at TOBAM^{28}.
When applied to the universe of the 11 S&P Sectors, it gives:
Portfolio | Portfolio squared diversification ratio | MDP squared diversification ratio | Available diversification used by the portfolio | Available diversification not used by the portfolio |
---|---|---|---|---|
$P_{SPY}$ | ~1.33 | ~1.81 | ~74% | ~26% |
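The percentages in this table can be recomputed directly from the squared diversification ratios; here is a back-of-the-envelope sketch using the rounded values reported above, so the output may differ from the table by a percentage point:

```python
# Available diversification used = DR(w)^2 / DR(w_MDP)^2, its complement being
# the untapped diversification; inputs are the rounded values from the table above
dr2_portfolio, dr2_mdp = 1.33, 1.81
used = dr2_portfolio / dr2_mdp
unused = 1.0 - used
print(f"used: {used:.0%}, not used: {unused:.0%}")
```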
From Property 7, the minimal correlation between a long-only portfolio invested in a universe of assets and at least one asset in that universe is equal to the inverse of the squared diversification ratio of the most diversified portfolio, $\frac{1}{DR \left( w_{MDP} \right)^2}$.
This property makes it possible to interpret $\frac{1}{DR \left( w_{MDP} \right)^2}$ as a measure of the intrinsic diversification of a universe of assets, in terms of the lowest achievable correlation.
In addition, because $\frac{1}{DR \left( w_{MDP} \right)^2}$ is bounded in the interval $[0,1]$, this measure can easily be compared against reference values:
$\frac{1}{DR \left( w_{MDP} \right)^2}$ goes to 0 when the diversification ratio of the most diversified portfolio goes to infinity.
It means that, in the limit, there exists a long-only, zero-risk portfolio bearing a positive risk premium^{2}, which implies significant hedging possibilities^{29} in the considered universe of assets.
$\frac{1}{DR \left( w_{MDP} \right)^2}$ goes to $\frac{1}{n}$ when the diversification ratio of the most diversified portfolio goes to $\sqrt n$.
It means that this portfolio behaves similarly to an equal-weighted portfolio of $n$ uncorrelated assets with identical volatility^{30}, which implies major diversification potential in the considered universe of assets.
$\frac{1}{DR \left( w_{MDP} \right)^2}$ goes to 1 when the diversification ratio of the most diversified portfolio goes to 1.
It means that this portfolio behaves similarly to a single-asset portfolio^{30}, which implies a total absence of diversification potential in the considered universe of assets.
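To make these reference values concrete, here is a minimal sketch computing $\frac{1}{DR \left( w_{MDP} \right)^2}$ for a made-up three-asset universe, with the long-only most diversified portfolio obtained by direct numerical maximization of the diversification ratio (an illustrative approach, not TOBAM's methodology):

```python
import numpy as np
from scipy.optimize import minimize

def most_diversified_portfolio(vols, corr):
    """Long-only most diversified portfolio via direct maximization of the
    diversification ratio (illustrative sketch)."""
    n = len(vols)
    cov = np.outer(vols, vols) * corr
    neg_dr = lambda w: -(w @ vols) / np.sqrt(w @ cov @ w)
    res = minimize(neg_dr, np.full(n, 1.0 / n), method="SLSQP",
                   bounds=[(0.0, 1.0)] * n,
                   constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
    return res.x, -res.fun

# Made-up three-asset universe (illustrative volatilities and correlations)
vols = np.array([0.20, 0.25, 0.30])
corr = np.array([[1.0, 0.2, 0.1],
                 [0.2, 1.0, 0.3],
                 [0.1, 0.3, 1.0]])
w_mdp, dr_mdp = most_diversified_portfolio(vols, corr)
intrinsic = 1.0 / dr_mdp**2  # between 1/n (maximal potential) and 1 (none)
print(intrinsic)
```

With mildly positive correlations, the result lies between the two extreme reference values $\frac{1}{n}$ and 1, as expected.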
Using this new measure, I propose to compare the diversification potential of the universe of the 11 S&P Sectors in 2021 vs. in 2022, which was an annus horribilis for diversification, as recently quantified by Jurrien Timmer, director of Global Macro at Fidelity, in the tweet reproduced in Figure 5.
In addition to $\frac{1}{DR \left( w_{MDP} \right)^2}$, I also include two other measures of the diversification of a universe of assets:
Measure | 2021 | 2022 | Comment |
---|---|---|---|
$\frac{1}{DR \left( w_{MDP} \right)^2}$ | ~0.42 | ~0.55 | Increase is bad |
Effective rank | ~4.63 | ~3.27 | Decrease is bad |
Absorption ratio | ~0.82 | ~0.88 | Increase is bad |
All these measures are unanimous^{31} in confirming that in 2022, diversification evaporated^{32}!
Ref. Property 2, the sorted correlation spectrum of the most diversified portfolio is in theory separated into two distinct regions:
To illustrate how this core property is verified in practice, Figure 6 displays the asset weights of the most diversified portfolio invested in the universe of the 11 S&P Sectors, and Figure 7 displays its sorted correlation spectrum.
Both theoretical regions are clearly visible in Figure 7 and behave as expected:
Backtested performances of the most diversified portfolio have been analyzed within several different universes of assets:
most diversified portfolios have higher Sharpe ratios than the market cap–weighted indices and have had both lower volatilities and higher returns in the long run, which can be interpreted as capturing a bigger part of the risk premium
the [most diversified portfolio] delivers the highest alpha^{35} of the five strategies tested […] which is consistent with the [most diversified portfolio]’s goal of delivering maximum diversification, and thus a balanced exposure to the effective risk factors available in the universe
Although high average returns are not the explicit goal of risk-based portfolio construction, all three risk-based portfolios outperform the excess market return
Although previous work has shown that [most diversified] portfolios exhibit greater diversification and a higher Sharpe ratio than other investment strategies, this was not found using developed market index data
Nevertheless, this last paper also states, reassuringly, that
During the early stages following the financial crisis, the [most diversified] portfolio had the highest, albeit negative, Sharpe ratio […]. This gives us an indication that during this disastrous financial period the [most diversified] portfolio could negate many of the exogenous shocks which impacted the market at that time.
And anyway, it remains to be seen what the conclusion of this paper would have been under a fair comparison of all the tested portfolio allocation methods, because the authors explain that the [most diversified] portfolio was calculated differently to the other three asset allocation portfolios included in the analysis…
So, all in all, backtested performances generally highlight that the most diversified portfolio outperforms a market capitalization-weighted portfolio.
Now, what about real-life performances?
Thanks to TOBAM, it is possible to go beyond backtested performances of the most diversified portfolio and analyze its real-life performances!
Indeed, the concept of maximum diversification as a portfolio allocation strategy has been patented by TOBAM^{37}, and has been implemented for several years through their suite of Anti-Benchmark funds.
Without further ado, Figure 8, reproduced from Quantalys, shows the live performances over the past three years of the TOBAM Anti-Benchmark US fund^{38}, which is an implementation of the most diversified portfolio within the universe of large and mid-cap U.S. equities.
Over that period, the performances of the TOBAM Anti-Benchmark US fund have been comparable to those of the S&P 500, but:
Unfortunately, since inception (not shown in Figure 8), the performances of the TOBAM Anti-Benchmark US fund have been far worse than those of the S&P 500, which would tend to moderate the theoretical claims of over-performance of the most diversified portfolio vs. a market capitalization-weighted portfolio^{39}.
Through the use of minimum and maximum asset weight constraints^{40}, the constrained most diversified portfolio can be used to improve the diversification of a benchmark portfolio, for example a market capitalization-weighted portfolio.
For example, Figure 9 displays together:
While there is little change in asset weights between these two portfolios, this change is sufficient to boost the diversification ratio of the portfolio representing the S&P 500 by ~6% and the available diversification used by that portfolio by ~9 percentage points^{41}.
Portfolio | Diversification ratio |
---|---|
$P_{SPY}$ | ~1.15 |
$P_{CMDPSPY}$ | ~1.22 |
Portfolio | Portfolio squared diversification ratio | MDP squared diversification ratio | Available diversification used by the portfolio | Available diversification not used by the portfolio |
---|---|---|---|---|
$P_{SPY}$ | ~1.33 | ~1.81 | ~74% | ~26% |
$P_{CMDPSPY}$ | ~1.50 | ~1.81 | ~83% | ~17% |
Another possibility for diversifying a benchmark portfolio would be to integrate the diversification ratio as an additional constraint in the benchmark portfolio allocation procedure, but I will not explore this path here.
I hope this first post of 2023 sparked your interest in the diversification ratio and in what people at TOBAM are doing re. quantitative investing!
I also hope this post might serve as a motivation to study other uses of the diversification ratio (ex., correlation-based benchmark performance analysis^{28}) as well as of the most diversified portfolio (ex., potential substitute for the diversified risk parity portfolio of Lohre et al.^{42}, which maximizes the effective number of bets within a universe of assets) that I could not detail.
A last word, taken from Clarke et al.^{13}, which I think summarizes particularly well the philosophy behind the diversification ratio:
Some investors traditionally view more securities […] as equivalent to better diversification and lower risk. However, the relatively low number of securities in the maximum diversification […] portfolio[s] illustrate that risk reduction is best achieved by selecting fewer, less correlated and less risky securities, rather than just adding more securities.
As usual, feel free to connect with me on LinkedIn or follow me on Twitter to discuss about Portfolio Optimizer or quantitative finance in general.
–
See Choueifaty, Y., Coignard, Y. (2008). Toward maximum diversification. The Journal of Portfolio Management, 35(1), 40–51. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7}
See Choueifaty, Y., Froidure, T., Reynier, J. Properties of the Most Diversified Portfolio. Journal of Investment Strategies, Vol. 2(2), Spring 2013, pp. 49-70. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9} ↩^{10}
See Froidure, T., Jalalzai, K., Choueifaty, Y., Portfolio Rho-Representativity, International Journal of Theoretical and Applied Finance, Vol. 22, No. 07, 1950034 (2019). ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8}
As a reminder, there is no definite formula for diversification, c.f. Meucci^{43}. ↩ ↩^{2}
If the covariance matrix is positive semi-definite and not positive definite, these two portfolios might not be unique anymore, but all (long-short) most diversified portfolios maximize the diversification ratio. An interesting question is then how the different (long-short) most diversified portfolios compare, for example in terms of performance. This question is studied empirically, in the case of the most diversified portfolio, in Choueifaty et al.^{1}. ↩
For example, if $S$ is a non-empty compact convex set. ↩
A risk-based portfolio construction method is a portfolio construction method that does not rely on expected asset returns and requires only expected asset volatilities and correlations. ↩
See Wai Lee, Risk-Based Asset Allocation: A New Answer to an Old Question?, The Journal of Portfolio Management Summer 2011, 37 (4) 11-28. ↩ ↩^{2}
The constrained most diversified portfolio also satisfies a core property^{3} in case the constraint set $S$ consists of only the full investment constraint $\sum_{i=1}^{n} w_i = 1$, the non-negativity constraint $w_i \geq 0, i=1..n$ and a common maximum weight constraint $w_i \leq \frac{1}{r}, r \ge 1, i=1..n$. ↩
A long-only factor is a factor replicable by long-only portfolios of assets belonging to the universe of assets, c.f. Froidure et al.^{3}. ↩
The stability of the most diversified portfolio has also been empirically demonstrated to be better than that of both the equal-risk contributions portfolio and the minimum-variance portfolio in du Plessis and van Rensburg^{44}. ↩
See Froidure et al.^{3} for the exact definitions. ↩
See Roger Clarke, Harindra de Silva and Steven Thorley. Risk Parity, Maximum Diversification, and Minimum Variance: An Analytic Perspective. The Journal of Portfolio Management Spring 2013, 39 (3) 39-53. ↩ ↩^{2} ↩^{3}
Under a single-factor risk model, an asset is included in the most diversified portfolio if and only if its correlation with the common factor is lower than a threshold correlation, whose construction tends to eliminate most of the assets in a universe of assets, c.f. Clarke et al.^{13} ↩
See Carmichael, Benoit, Koumou, Nettey Boevi Gilles and Moran, Kevin, (2015), A New Formulation of Maximum Diversification Indexation Using Rao’s Quadratic Entropy, Cahiers de recherche, CIRPEE. ↩
Rao’s quadratic entropy is a measure of diversity initially introduced in biology, c.f. Rao^{45}. ↩
Choueifaty and Coignard^{1} and others (Clarke et al.^{13}…) note that the most diversified portfolio is the portfolio with the highest Sharpe ratio if risk is homogeneously proportional to return across the universe of assets, so that the investment problem that the most diversified portfolio solves is compatible with the mean-variance framework. ↩
The correlation of the arithmetic returns of the two portfolios is ~0.86. ↩
A kind of similar-but-different notion, maximal rho-representativity, is defined and studied in Froidure et al.^{3}. I will not discuss it in this blog post, but several properties of the most diversified portfolio are simply a consequence of it being maximally rho-representative. ↩
C.f. Froidure et al.^{3} for the general methodology. Portfolio Optimizer implements a proprietary variation of that methodology which makes it possible to compute the diversification ratio from time series in most pathological situations. ↩
Although counterintuitive, the diversification ratio of the diversified near-minimum variance portfolio might have been lower than the diversification ratio of the minimum variance portfolio, had the additional assets increased the average correlation between assets more than they decreased the concentration in asset weights. ↩
See Meucci, Attilio, Managing Diversification (April 1, 2010). Risk, pp. 74-79, May 2009, Bloomberg Education & Quantitative Research and Education Paper. ↩
Using the convention that one year equals 252 daily observations. ↩
C.f. the associated blog post for the definitions. ↩
The asset weights of the portfolio $P_{SPY}$ in the 11 Sector SPDR ETFs have been set to the reported SPY ETF 11 S&P Sector weights as of beg. January 2023, and using these weights, it is possible to reproduce the evolution of the SPY ETF almost perfectly, which validates this back-of-the-envelope approach. ↩
The effective number of bets does not suffer from this issue, because it is bounded by the number of assets. ↩
Or, as written in misc. TOBAM Diversification Dashboard reports, how much diversification is left on the table. ↩
This typically happens when highly negatively correlated assets are present in a universe of assets, but can also happen due to the asset covariance matrix being (numerically) positive semi-definite and not positive definite. ↩
Ref. the observations from Choueifaty et al.^{2} mentioned previously in this blog post. ↩ ↩^{2}
Although again counterintuitive^{21}, $\frac{1}{DR \left( w_{MDP} \right)^2}$ might have been higher in 2022 than in 2021 because the overall increase in correlations could have been counterbalanced by the most diversified portfolio. ↩
Nevertheless, c.f. TOBAM’s report Stability of the Pairwise Correlations Hierarchy for explanations on why being well-diversified during market crises remains critical. ↩
Or non-increasing, depending on the order of the sort, which is implicitly assumed to be a sort by increasing correlation. ↩
Correlations are not exactly equal to $\frac{1}{DR \left( w_{MDP} \right) }$, but are numerically very close, within a +/- 0.3% band. ↩
Both Choueifaty and Coignard^{1} and Choueifaty et al.^{2} find that the most diversified portfolio provides significant alpha under a three-factor Fama–French model. For why alpha under such a model is important, c.f. for example the podcast snippet Flirting with Models - Vivek Viswanathan, Quant Equity in China - Why Is the Fama-French Three Factor Alpha So Important?. ↩
See Theron, Ludan; Van Vuuren, Gary (2018) : The maximum diversification investment strategy: A portfolio performance comparison, Cogent Economics & Finance, ISSN 2332-2039, Taylor & Francis, Abingdon, Vol. 6, Iss. 1, pp. 1-16. ↩
See Choueifaty, Y. (2006). Methods and systems for providing an anti-benchmark portfolio. Patent number: USPTO 60/816, 276. ↩
Whose ISIN is LU1067856606. ↩
All hope is not lost, though, because the TOBAM Anti-Benchmark US fund is not a pure implementation of the most diversified portfolio. Indeed, as can be read from its prospectus, it is subject to many constraints, for example the concentration constraints inherited from its UCITS status. ↩
Or other constraints, like a tracking error constraint, which is implemented at TOBAM in their suite of Diversified Benchmark indices. ↩
Or ~13%. ↩
See Lohre, Harald and Opfer, Heiko and Orszag, Gabor, Diversifying Risk Parity (November 7, 2013). Journal of Risk, Vol. 16, No. 5, 2014, pp. 53-79. ↩
See Meucci, Attilio, Managing Diversification (April 1, 2010). Risk, pp. 74-79, May 2009, Bloomberg Education & Quantitative Research and Education Paper. ↩
See Hannes du Plessis & Paul van Rensburg (2020): Risk-based portfolio sensitivity to covariance estimation, Investment Analysts Journal. ↩
See C.Radhakrishna Rao, Diversity and dissimilarity coefficients: A unified approach, Theoretical Population Biology, Volume 21, Issue 1, 1982, Pages 24-43. ↩
After quickly going through the associated mathematics, I will present two examples of usage of this measure - one as a potential indicator of systemic risk and the other as a means to choose between portfolio allocation methods.
Note: A Google sheet corresponding to this post is available here.
Let us define:
Then, the raw^{3} informativeness of the correlation matrix $C$ is defined as its distance to the nearest equicorrelation matrix, that is
\[\textrm{informativeness}(\textup{C}) = \min_{\rho \in [-\frac{1}{n-1}, 1]} d(\textup{C},\textup{C}_\rho)\], where $d$ is any distance metric defined over the set of correlation matrices satisfying specific properties, like the Frobenius norm already used in the computation of the nearest correlation matrix.
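Under the Frobenius norm, this definition translates almost literally into code; here is a minimal sketch, illustrative only, and not Portfolio Optimizer's implementation:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def informativeness(C):
    """Raw informativeness of a correlation matrix C: its Frobenius distance
    to the nearest equicorrelation matrix (illustrative sketch)."""
    n = C.shape[0]
    def dist(rho):
        C_rho = np.full((n, n), rho)
        np.fill_diagonal(C_rho, 1.0)
        return np.linalg.norm(C - C_rho, "fro")
    # rho is constrained to [-1/(n-1), 1] so that C_rho stays a valid correlation matrix
    return minimize_scalar(dist, bounds=(-1.0 / (n - 1), 1.0), method="bounded").fun

# An equicorrelation matrix is non-informative: its informativeness is ~0
C_eq = np.full((3, 3), 0.4)
np.fill_diagonal(C_eq, 1.0)
print(informativeness(C_eq))
```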
More on the admissible distance metrics later.
When all assets^{4} within a universe of assets have the same correlations, no pair of assets is more similar or dissimilar than any other pair^{5}.
Running a hierarchical clustering algorithm in this universe will then not reveal any particular information about its structure, because no hierarchical grouping of the assets is appropriate.
Based on this observation, Brockmeier et al.^{1} define a non-informative correlation matrix to be an equicorrelation matrix, and define the informativeness of a correlation matrix to be the distance between that correlation matrix and the set of non-informative correlation matrices^{6}.
With this definition of the informativeness:
Information theory also provides a measure of the homogeneity (or of the lack of homogeneity) within an ensemble, known as the entropy.
For example, the matrix effective rank is based on Shannon's entropy.
Brockmeier et al.^{1} emphasize that informativeness is different from entropy, and the differences between these two measures are best summarized by Figure 1, adapted from their paper.
As noted by Brockmeier et al.^{1}, Figure 1 shows that
informativeness is higher for nontrivial clusterings [of the objects], whereas entropy is maximized when [all objects are distinct]
Not all distance metrics defined over the set of correlation matrices can be used to compute the informativeness.
An admissible distance metric $d$ must in particular be permutation-invariant, that is, satisfy $d(A, B) = d(\Pi A \Pi^\top, \Pi B \Pi^\top)$, where $A$ and $B$ are correlation matrices and $\Pi$ is any permutation matrix, so that informativeness is invariant to the ordering of the assets^{4}.
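The Frobenius norm satisfies this permutation-invariance requirement, which can be checked numerically; a quick sketch, where `random_corr` is a hypothetical helper generating random correlation matrices:

```python
import numpy as np

def random_corr(n, rng):
    """A random correlation matrix (hypothetical helper), obtained by
    normalizing a random covariance matrix."""
    X = rng.standard_normal((n, 2 * n))
    S = X @ X.T
    d = np.sqrt(np.diag(S))
    return S / np.outer(d, d)

rng = np.random.default_rng(42)
A, B = random_corr(4, rng), random_corr(4, rng)
P = np.eye(4)[rng.permutation(4)]  # a random permutation matrix

# Relabeling the assets simultaneously in A and B leaves the distance unchanged
d1 = np.linalg.norm(A - B, "fro")
d2 = np.linalg.norm(P @ A @ P.T - P @ B @ P.T, "fro")
print(d1, d2)
```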
Brockmeier et al.^{1} compare the properties of several such admissible distances, and conclude that they can be split into two groups exhibiting very different behavior:
In particular, Brockmeier et al.^{1} establish empirically that
the three quantum-based distances (Chernoff bound, Quantum Hellinger and Bures) perform consistently well in identifying correlation matrices that are structured and more suitable for clustering
It is possible to compute the informativeness of a correlation matrix thanks to the Portfolio Optimizer API endpoint /assets/correlation/matrix/informativeness, for which three different distance metrics are supported:
These distances have been selected based on the results of Brockmeier et al.^{1} mentioned in the previous section.
I will first illustrate the evolution of the informativeness of the U.S. stock market, thanks to the 49 industry portfolios of the Fama and French data library.
These 49 industries cover all stocks from the NYSE, the AMEX, and the NASDAQ, so that the associated informativeness should be representative of the U.S. stock market as a whole^{8}.
I will use a rolling window approach, that is, at the end of each month I will:
Results for the period June 1970 - October 2022 are displayed in Figure 2, on which are also highlighted a handful of stock market crashes.
It appears that:
Depending on the nature of the crash, the informativeness appears either to drop violently (e.g. October 1987 stock market crash, Internet bubble bust, Global Financial Crisis, COVID-19) or to drop over a time frame comparable with the stock market crash itself (e.g. 1973-1974 stock market crash, 2022 market decline).
These observations suggest that the informativeness might be usable as an indicator of systemic risk^{11}.
I will now study whether informativeness can help to predict the out-performance of the hierarchical risk parity portfolio (HRP) over a naive risk parity portfolio (NRP).
The motivation behind this study is the empirical link established by Brockmeier et al.^{1} between the informativeness and the performance of clustering algorithms.
For this, I propose to adapt the methodology used by Gautier Marti in his blog post Which portfolio allocation method to choose? Look at the correlation matrix! in order to evaluate how features extracted from a (in-sample) correlation matrix influence the (out-of-sample) volatility of portfolios constructed from this correlation matrix.
In detail:
This universe of assets is again representative of the U.S. stock market as a whole, similarly to the Fama-French 49 industry portfolios of the previous section, but with the difference that this time, the universe of assets is investable.
At the end of each month, I will:
The resulting relationship between the in-sample informativeness and the difference in the out-of-sample volatility of the NRP and HRP portfolios over the period December 2000 - December 2022 is represented in Figure 3.
It seems that the higher the informativeness, the higher the volatility of the NRP portfolio in comparison with the HRP portfolio^{15}.
Nevertheless, there are big flashing red lights:
Such low values indicate that the universe of the Sector SPDR ETFs is actually too homogeneous to apply a hierarchical clustering algorithm on it.
So, whatever the relationship established, it is most probably spurious for this specific universe of assets.
The same study, using a more informative universe of assets, should help make progress on this point.
While it could be a consequence of the previous point, it could also mean that informativeness, as a standalone correlation matrix feature, is not capable of predicting any economically significant out-performance^{16} of one portfolio allocation method vs. the other.
But what if informativeness was used as an additional feature to the ones already discussed in Gautier Marti’s post above? Would it help?
Another possible usage of informativeness, similar in spirit to the unsupervised machine learning applications described in Brockmeier et al.^{1}, would be to construct maximally informative universes of assets and study their properties.
For example, maybe these universes would be more stable, and so more predictable, than less informative universes?
Anyway, I hope that you enjoyed this last post of 2022 and that informativeness can find a place in your (quantitative) toolbox.
Happy end-of-year celebrations, and see you in 2023 for more research!
Meanwhile, you can connect with me on LinkedIn or follow me on Twitter to discuss about Portfolio Optimizer or quantitative finance in general.
–
See Austin J. Brockmeier and Tingting Mu and Sophia Ananiadou and John Y. Goulermas, Quantifying the Informativeness of Similarity Measurements, Journal of Machine Learning Research, 2017. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9}
For another example of usage of equicorrelation matrices, c.f. the post Correlation Matrix Stress Testing: Shrinkage Toward an Equicorrelation Matrix. ↩
Brockmeier et al.^{1} normalize the informativeness so that it belongs to the interval $[0,1]$. ↩
Although it might seem counterintuitive, even when all assets are totally uncorrelated there is no pair of assets that “stands out”, because in this case, all pairs of assets are similarly dissimilar. ↩
Another reasonable definition of the informativeness of a correlation matrix would be the distance between that correlation matrix and a particular equicorrelation matrix like an equicorrelation matrix made of 1s, but Brockmeier et al.^{1} explain why this definition would be wrong. ↩
Also known as the Bures–Wasserstein distance. ↩
To be noted, though, that working with industries v.s. with individual stocks imposes a structure on the U.S. stock market, which might artificially impact the value of the informativeness… ↩
The computed covariance matrix is thus not invertible, but this is not an issue for the computation of the informativeness. ↩
Using the Frobenius distance. ↩
My tests with other universes of assets representative of the U.S. stock market confirm that the evolution of the informativeness is a useful measure to monitor. What I couldn’t (yet) wrap my head around is whether the informativeness could be used as a stock market bottom indicator. I did not allocate as much time as I would have liked to this one and I am certainly lacking the proper mathematical tools, but I feel like it has potential… ↩
I will only use the 9 original sector ETFs, whose associated tickers are XLY, XLP, XLE, XLF, XLV, XLI, XLB, XLK and XLU. ↩
Adjusted for splits and dividends and retrieved thanks to Tiingo. ↩
Using the Bures distance, found in Brockmeier et al.^{1} to be able to identify correlation matrices suitable for clustering. Results using the two other distances implemented in Portfolio Optimizer are similar. ↩
And so, the higher the out-performance of the HRP portfolio, in terms of realized out-of-sample portfolio volatility. ↩
In terms of realized out-of-sample portfolio volatility. ↩
Monitoring their behavior once they are deployed in production is then very important to be able to detect as early as possible any inconsistency between their live returns and their expected returns.
For this, I will describe in this blog post the methodology proposed by Rej et al.^{2} in their paper You are in a drawdown. When should you start worrying?, which consists in comparing the current drawdown of a trading strategy with the theoretical drawdown that would be expected under the assumption that the profit and loss (PnL) of the strategy is modeled as a geometric Brownian motion.
Using this methodology, I will analyze the drawdowns of the U.S. stock market over the past ~150 years and I will show how the geometric Brownian motion model could be used in a risk management strategy to scale the market exposure of a portfolio of U.S. equities.
Note: A Google sheet corresponding to this post is available here.
Let us define:
Then, assuming that:
Rej et al.^{2} build on the results from Shepp^{3} to establish that the joint probability density for the PnL of the trading strategy to reach a maximum $b$ at time $T - l$ and subsequently suffer a drawdown of depth $d$ is given by
\[\tilde{F}(d,b,l) = \frac{bd}{\pi \left( (T-l) l \right )^{\frac{3}{2}}} e^{ \left ( \mathrm{SR} (b-d) - \mathrm{SR}^2 \frac{T}{2} - \frac{d^2}{2l} - \frac{b^2}{2(T-l)} \right ) }\]

where $d$ is defined by

\[d = \ln \left( \max_{0 \leq t \leq T} \mathrm{PnL}(t) \right) - \ln \left( \mathrm{PnL}(T) \right)\]

The quantities $b$, $l$, and $d$ are illustrated in Figure 1, adapted from Rej et al.^{2}.
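The density $\tilde{F}$ translates directly into code; a minimal sketch, with the volatility normalized to $\sigma = 1$ and made-up illustrative parameter values:

```python
import numpy as np

def f_tilde(d, b, l, T, sr):
    """Rej et al.'s joint density F~(d, b, l) of the drawdown depth d, the peak
    level b and the time l elapsed since the peak, with volatility normalized to 1."""
    return (b * d / (np.pi * ((T - l) * l) ** 1.5)
            * np.exp(sr * (b - d) - sr**2 * T / 2
                     - d**2 / (2 * l) - b**2 / (2 * (T - l))))

# Density at a 1-sigma-deep drawdown, 2 years after a peak at b = 1,
# over a 10-year track record with a Sharpe ratio of 0.5
print(f_tilde(d=1.0, b=1.0, l=2.0, T=10.0, sr=0.5))  # ≈ 0.00104
```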
The formula for $\tilde{F}$ is consistent with the empirical findings of Van Hemert et al.^{4}, who, in a more realistic setting, remark that:
Key drivers of the maximum drawdown […] are the evaluation horizon (time to dig a hole), Sharpe ratio (ability to climb out of a hole), and persistence in risk (chance of having a losing streak).
It follows from this formula that the joint probability density of the length $l$ and of the depth $d$ of the current drawdown of the trading strategy is given by
\[G(l, d) = \int_{0}^{+\infty} \tilde{F}(d,b,l) \mathrm{d}b\]

Note that in case $\sigma$ is not normalized to $\sigma = 1$^{5}:
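The integral defining $G$ has no simple closed form, but it can be evaluated numerically; a minimal sketch, repeating the density above and using illustrative values of $T$ and $\mathrm{SR}$:

```python
import numpy as np
from scipy.integrate import quad

T, SR = 152.0, 0.65  # illustrative values, close to the U.S. stock market example below

def f_tilde(d, b, l):
    """Rej et al.'s joint density, repeated here with volatility normalized to 1."""
    return (b * d / (np.pi * ((T - l) * l) ** 1.5)
            * np.exp(SR * (b - d) - SR**2 * T / 2
                     - d**2 / (2 * l) - b**2 / (2 * (T - l))))

def G(l, d):
    """Joint density of the current drawdown's length l and depth d, with the
    (unobservable) peak level b integrated out numerically."""
    return quad(lambda b: f_tilde(d, b, l), 0.0, np.inf)[0]

val = G(5.0, 1.5)  # density for a 5-year-long, 1.5-sigma-deep current drawdown
print(val)
```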
Using the formula for $G$, Rej et al.^{2} explicitly derive:
In addition, it is also possible to derive the conditional density of:
These probability densities make it possible to answer questions like:
The assumption made in Rej et al.^{2} that the PnL of the trading strategy follows a geometric Brownian motion might appear unrealistic in practice since financial returns are far from being Gaussian^{6}.
Nevertheless, two remarks about this model:
Aggregational Gaussianity^{7} is as much a stylized fact of financial returns as non-Gaussianity^{6} is.
So, depending on the observation frequency of the trading strategy (monthly, quarterly, yearly…), a geometric Brownian motion model could be good enough for coarse monitoring purposes.
Non-normality, heteroskedasticity, etc. of financial returns all increase the severity of the drawdowns of a trading strategy^{2}^{4}.
From this perspective, a geometric Brownian motion model represents an over-optimistic reference model, which can be used to set minimal (or maximal) expectations.
As Rej et al.^{2} put it:
we argue that from a practitioner’s point of view, it is always better to err on the side of caution, so that the Brownian model sets a very useful benchmark.
Portfolio Optimizer does not provide any API endpoint to compute the probability distributions mentioned in the previous section.
If this is something you would need, feel free to reach out.
I will present two examples of application of the methodology of Rej et al.^{2}, slightly out of the context of trading strategy monitoring.
In this first example of application, I will analyze the drawdowns of the U.S. stock market over the past ~150 years and determine which insights (if any) can be gained from the geometric Brownian motion model.
I will use the U.S. stock market monthly returns^{8} over the period January 1871 - September 2022 available on Robert J. Shiller’s website, whose PnL is depicted in Figure 2.
Over the whole period January 1871 - September 2022, the annualized Sharpe ratio of the U.S. stock market is ~0.65-0.69, depending on the type of monthly returns used (simple v.s. log).
To be conservative, I will select $\mathrm{SR_{US}} = 0.65$.
Computing the length of historical drawdowns is straightforward.
On the other hand, computing the normalized depth of historical drawdowns requires an estimate of the volatility $\sigma_{US}$ of the geometric Brownian motion.
I will simply take $\sigma_{US} = 14$%, which corresponds to the annualized volatility of the U.S. stock market over the whole period January 1871 - September 2022 that I already used in the estimation of the Sharpe ratio $\mathrm{SR_{US}}$.
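For reference, the length and the $\sigma$-normalized depth of the current drawdown can be computed from a PnL series as follows; a minimal sketch on a toy monthly series, where `current_drawdown` is an illustrative helper:

```python
import numpy as np

def current_drawdown(pnl, sigma, dt):
    """Length (in years) and sigma-normalized depth of the current drawdown of
    a PnL series sampled every dt years (illustrative helper)."""
    log_pnl = np.log(np.asarray(pnl, dtype=float))
    i_peak = int(np.argmax(log_pnl))        # time at which the running maximum is reached
    depth = log_pnl[i_peak] - log_pnl[-1]   # d = ln(max PnL) - ln(PnL(T))
    length = (len(pnl) - 1 - i_peak) * dt   # l = time elapsed since the maximum
    return length, depth / sigma

# Toy monthly PnL series peaking at the third observation
length, d_star = current_drawdown([1.00, 1.10, 1.20, 1.00, 0.95], sigma=0.14, dt=1/12)
print(length, d_star)
```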
The probability densities $\rho$ and $\psi$ allow to compute the probability distribution functions of the length and of the normalized depth of the current drawdown for the U.S. stock market, with $T = \frac{1821}{12} \approx 152$ years.
They are displayed in Figure 3 and 4.
It follows that when the U.S. stock market is in a drawdown, then, with a confidence of ~95%:
I will rephrase this.
Under an over-optimistic model, a current drawdown of up to ~27% or lasting up to ~5 years needs to be considered business as usual^{9} for the U.S. stock market.
I was honestly surprised by this result, which matches with the comment from Rej et al.^{2}:
[…] investors tend to underestimate the length and depth of perfectly acceptable, “normal” drawdowns.
By the way, it is possible to confirm that the geometric Brownian motion model is indeed over-optimistic, because it underestimates the characteristics of the major historical drawdowns.
Two examples:
Borrowing the idea of Rej et al.^{2}, the conditional probability density $\eta$ allows to compute a confidence interval $[l^-, l^+]$ for the length $l$ of the current drawdown conditional on its normalized depth $\mathrm{d^*}$, such that
\[\mathbb{P} \left( l^- \leq l \leq l^+ | \mathrm{d^*} \right) = 1 - \alpha\]

with $1 - \alpha \in [0,1]$ a given confidence level (90%, 95%…).
Such a conditional confidence interval for the U.S. stock market, at the 95% level, is displayed in Figure 5.
Now, if we compare the length of historical drawdowns to the conditional confidence interval of Figure 5, it appears that these lengths are usually much closer to the lower boundary of the conditional confidence interval than to its upper boundary.
Two examples again:
This over-pessimistic conditional confidence interval is in stark contrast to the over-optimistic probabilities of the previous section.
Thus, there is something really interesting happening here, which might have practical implications for investors.
For example, using history as a guide, an investor could think that because the length of the most severe drawdown for the U.S. stock market has been ~15 years^{10}, she needs to prepare, worst case, for a future drawdown of about ~15 years.
Problem is, the 95% conditional confidence interval for the length of a drawdown as severe as the said drawdown^{11} is $[9.00, 34.67]$ years, implying that this investor actually needs to prepare for a future drawdown of up to ~35 years^{12}!
In this second example of application, I will investigate how to use the geometric Brownian motion model to scale the market exposure of a portfolio of U.S. equities in case the U.S. stock market is in a drawdown.
Specifically, I propose to analyze the following risk management strategy.
At the end of each week:
Figure 6 illustrates this strategy with the SPY ETF (U.S. equities) and the SHY ETF (cash) over the period January 1993 - November 2022^{14}, and compares it to a buy and hold strategy over the same period.
A quick visual inspection of Figure 6 shows that this strategy seems to be able to limit the severity of drawdowns while matching the raw performance of the U.S. stock market.
Some statistics:
| Portfolio Management Strategy | Average Exposure | Annualized Sharpe Ratio | Maximum (Weekly) Drawdown |
|---|---|---|---|
| Buy and hold | 100% | ~0.61 | ~55% |
| Drawdown probability-adjusted | ~83% | ~0.75 | ~25% |
What I find interesting with this strategy is that it is a kind of momentum strategy, but not based on the usual momentum indicators (moving averages, rate of change…).
So, it might be a useful addition to one’s arsenal.
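To make the general idea concrete, here is a hypothetical sketch of a drawdown probability-adjusted exposure rule; this is not the exact strategy above, only one plausible implementation of its spirit, with a Monte Carlo estimate of the drawdown depth distribution standing in for the closed-form probabilities.

```python
import numpy as np

rng = np.random.default_rng(7)

def current_drawdown(prices):
    # Depth and length (in periods) of the current drawdown of a price series
    peak = int(np.argmax(prices))
    depth = 1.0 - prices[-1] / prices[peak]
    return depth, len(prices) - 1 - peak

# Distribution of the current drawdown depth under the geometric Brownian
# motion, estimated once by Monte Carlo (hypothetical parameters mirroring
# the geek notes below: SR = 0.65, sigma = 14%, ~122 years of weekly steps)
SR, sigma, dt = 0.65, 0.14, 1 / 52
steps = int(122 / dt)
increments = (SR * sigma - 0.5 * sigma**2) * dt \
    + sigma * np.sqrt(dt) * rng.standard_normal((500, steps))
log_p = np.cumsum(increments, axis=1)
model_depths = 1.0 - np.exp(log_p[:, -1] - log_p.max(axis=1))

def exposure(prices):
    # The rarer the observed drawdown depth under the model, the lower the
    # exposure: fraction of simulated paths at least as deep in drawdown
    depth, _ = current_drawdown(prices)
    return float(np.mean(model_depths >= depth))

# Toy usage on a synthetic weekly price series
prices = np.exp(np.cumsum(0.001 + 0.02 * rng.standard_normal(520)))
print(f"weekly exposure: {exposure(prices):.0%}")
```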
Geek notes:
- I made $T$ dynamic in order to mimic a real-time execution of this strategy; the initial value of $T$ is ~122 years, and with each passing week its value is incremented by $\frac{1}{52}$
- I re-used the Sharpe ratio $\mathrm{SR_{US}}$ of 0.65 and the volatility $\sigma_{US}$ of 14%; this does not introduce any lookahead bias, because these values computed over the whole period January 1871 - September 2022 are nearly identical to their counterparts computed over the period January 1871 - December 1992
- If the Sharpe ratio and/or the volatility were made dynamic, for example estimated from a rolling window of past returns, much better performance could be obtained; this goes a little beyond the static model of Rej et al.^{2} though
There are of course many other possibilities to use the methodology of Rej et al.^{2}.
One such possibility, again in risk management, could for example be to compute the probability that the depth of the current drawdown worsens having observed its current length, and to de-risk a portfolio accordingly.
That’s all for now!
As usual, you can connect with me on LinkedIn or follow me on Twitter to discuss Portfolio Optimizer or quantitative finance in general.
–
See David H. Bailey, Jonathan M. Borwein, Marcos Lopez de Prado and Qiji Jim Zhu, The probability of backtest overfitting, Journal of Computational Finance, VOLUME 20, NUMBER 4 (APRIL 2017) PAGES: 39-69 or the SSRN version. ↩
See A. Rej, P. Seager and J.-P. Bouchaud, You are in a drawdown. When should you start worrying?, Wilmott, Volume 2018, Issue 93, January 2018, Pages 56-59 or the arXiv version. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8} ↩^{9} ↩^{10} ↩^{11} ↩^{12} ↩^{13}
See Shepp, L. 1979. The joint density of the maximum and its location for a Wiener process with drift. Journal of Applied Probability 16(2), 423–427. ↩ ↩^{2}
See Drawdowns, Otto Van Hemert, Mark Ganz, Campbell R. Harvey, Sandy Rattray, Eva Sanchez Martin and Darrel Yawitch, The Journal of Portfolio Management September 2020. ↩ ↩^{2}
Which is probably the most usual situation in practice… ↩
See R. Cont (2001) Empirical properties of asset returns: stylized facts and statistical issues, Quantitative Finance, 1:2, 223-236. ↩ ↩^{2}
Aggregational Gaussianity refers to the tendency of asset returns to follow a distribution closer and closer to a Gaussian distribution the more the time period over which they are computed increases. ↩
The monthly returns are total nominal returns (i.e., the consumer price index was forced to 1 in Shiller’s Excel sheet to explicitly discard inflation). ↩
In particular, such a drawdown does not require any exogenous explanation (pandemic, recession, war…). ↩
January 1929 - December 1944. ↩
Whose normalized depth is 12.15407251. ↩
In this context, markets can remain irrational longer than you can remain solvent is a perfectly appropriate quote. ↩
As a definition of sufficient, I choose a drawdown whose normalized depth is such that its probability of occurrence under the geometric Brownian motion model is greater than or equal to 66% (i.e., the two third quantile of the probability distribution of the normalized depth of the current drawdown). ↩
SPY and SHY adjusted close prices were retrieved from Tiingo, and SHY history was extended with index data. ↩
Worse, large empirical correlation matrices have been shown to be so noisy that, except for their largest eigenvalues and their associated eigenvectors, they can essentially be considered as random.
For example, Laloux et al.^{2} report that around 94% of the spectrum of an empirical correlation matrix estimated from the returns of the S&P 500 constituents is indistinguishable from the spectrum of a random correlation matrix!
So, before using an empirical correlation matrix, it is usually advised to denoise it.
In this post, I will present the denoising method proposed by Laloux et al.^{2}, based on results from random matrix theory (RMT), and I will illustrate its behaviour in two universes of assets.
Notes:
- An excellent introduction to random matrix theory and some of its applications in finance is the book A First Course in Random Matrix Theory^{1} from Marc Potters and Jean-Philippe Bouchaud.
In the context of this post, the most important result from random matrix theory is the Marchenko-Pastur theorem, which describes the eigenvalue distribution of large random covariance matrices.
Below is a simple version of the Marchenko-Pastur theorem, in the case of i.i.d. Gaussian observations^{1}.
Let:
Then, given $n \to +\infty$, $T \to +\infty$ and $0 < q = \frac{n}{T} \leq 1$, the density of eigenvalues of the matrix $\Sigma$ converges to the Marchenko-Pastur density defined by
\[\rho_{MP}(\lambda) = \begin{cases} \frac{\sqrt{\left(\lambda_{+} - \lambda\right)\left(\lambda - \lambda_{-}\right)}}{2 \pi q \sigma^2 \lambda}, \textrm{if } \lambda \in [\lambda_{-}, \lambda_{+}] \newline 0, \textrm{if } \lambda \notin [\lambda_{-}, \lambda_{+}] \end{cases}\]

with $\lambda_{-}$ the lower edge of the spectrum defined by

\[\lambda_{-} = \sigma^2 \left(1 - \sqrt q \right)^2\]

and $\lambda_{+}$ the upper edge of the spectrum defined by

\[\lambda_{+} = \sigma^2 \left(1 + \sqrt q \right)^2\]

Note that, under proper technical assumptions, the Marchenko-Pastur theorem remains valid for observations drawn from more general distributions, like fat-tailed distributions (general i.i.d. observations^{4}, general i.i.d. columns and general dependence structure within the columns^{5}…).
In other words, the Marchenko-Pastur theorem establishes that the distribution of the eigenvalues of a large random covariance matrix is actually universal, in that it follows a distribution independent^{6} of the underlying observation matrix.
Potters and Bouchaud^{1} explain this surprising result as follows:
For large random matrices, many scalar quantities […] do not fluctuate from sample to sample, or more precisely such fluctuations go to zero in the large N limit. Physicists speak of this phenomenon as self-averaging and mathematicians speak of concentration of measure.
To help visualize the Marchenko-Pastur theorem, Figure 1 displays together
A couple of remarks to finish:
Because empirical covariance matrices are of finite dimensions in practice, a natural question to ask is whether the Marchenko-Pastur theorem remains applicable in a non-asymptotic regime^{7}.
Figure 1 already demonstrates a pretty good agreement between theory and practice with values of $n = 1000$ and $T = 5000$ that are far from infinity.
What about smaller values?
Like $n = 100$ and $T = 500$, displayed in Figure 2.
Or even like $n = 10$ and $T = 50$, displayed in Figure 3.
Based on Figures 1 to 3, it appears that the Marchenko-Pastur theorem remains applicable with small values of $n$ and $T$ down to $\approx 100$, but that caution is warranted for very small values of $n$ and $T$ of order $\approx 10$.
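This agreement between theory and practice can be reproduced numerically; below is a minimal sketch comparing the empirical spectrum of a sample covariance matrix of i.i.d. standard Gaussian observations to the theoretical edges $\lambda_{\pm}$, using the same values $n = 1000$ and $T = 5000$ as in Figure 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample covariance matrix of i.i.d. standard Gaussian observations,
# with n = 1000 variables, T = 5000 observations (q = 0.2, sigma^2 = 1)
n, T, sigma2 = 1000, 5000, 1.0
q = n / T

X = rng.standard_normal((T, n))
Sigma = X.T @ X / T
eigenvalues = np.linalg.eigvalsh(Sigma)

# Theoretical edges of the Marchenko-Pastur spectrum
lambda_minus = sigma2 * (1 - np.sqrt(q)) ** 2
lambda_plus = sigma2 * (1 + np.sqrt(q)) ** 2

print(f"theoretical spectrum: [{lambda_minus:.3f}, {lambda_plus:.3f}]")
print(f"empirical spectrum:   [{eigenvalues.min():.3f}, {eigenvalues.max():.3f}]")
```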
Laloux et al.^{2} propose a method to denoise empirical correlation matrices based on random matrix theory, called the eigenvalues clipping method^{8}.
The rationale behind this method is that by comparing the spectrum of an empirical correlation matrix to the spectrum of a random correlation matrix it is possible to identify the random part of the empirical correlation matrix.
More formally, let:
Then, the upper edge $\lambda_{+}$ of the Marchenko-Pastur density can serve as a threshold to identify the noisy part of $C$:
This leads to the following method to denoise $C$:
The reason why the eigenvalues associated with noise should be replaced by a constant value is that^{2}
Since the eigenstates corresponding to the “noise band” are not expected to contain real information, one should not distinguish the different eigenvalues […] in this sector.
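Below is a minimal sketch of the eigenvalues clipping method, assuming "by the book" parameters $q = \frac{n}{T}$ and $\sigma^2 = 1$; the eigenvalues in the noise band are replaced by their average, which preserves the trace, and the diagonal is rescaled so that the output is again a valid correlation matrix (one possible convention among several).

```python
import numpy as np

def clip_eigenvalues(C, q, sigma2=1.0):
    """Denoise a correlation matrix C via the eigenvalues clipping method."""
    eigval, eigvec = np.linalg.eigh(C)
    # Eigenvalues below the Marchenko-Pastur upper edge are deemed noise...
    lambda_plus = sigma2 * (1 + np.sqrt(q)) ** 2
    noise = eigval < lambda_plus
    # ...and replaced by their average, which preserves the trace of C
    eigval = eigval.copy()
    if noise.any():
        eigval[noise] = eigval[noise].mean()
    C_denoised = eigvec @ np.diag(eigval) @ eigvec.T
    # Rescale so that the result is again a valid correlation matrix
    d = np.sqrt(np.diag(C_denoised))
    return C_denoised / np.outer(d, d)

# Toy usage: denoise the sample correlation matrix of pure-noise returns
rng = np.random.default_rng(0)
T, n = 500, 50
C = np.corrcoef(rng.standard_normal((T, n)), rowvar=False)
C_star = clip_eigenvalues(C, q=n / T)
```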
Two important practical details are missing from the previous description of the eigenvalues clipping method.
Comparing an empirical correlation matrix of aspect ratio $q$ to a random correlation matrix of aspect ratio $q$ is usually incorrect because of the presence of both temporal correlations (auto-correlations) and spatial correlations (cross-sectional correlations) in the observations used to estimate the empirical correlation matrix^{1}.
This is especially true for time series of asset returns^{2}.
As a consequence, the aspect ratio $q$ must be considered as an adjustable parameter and not as a constant value equal to $\frac{n}{T}$, as explained by Potters and Bouchaud^{1}:
Intuitively, correlated samples are somehow redundant and the sample [correlation] matrix should behave as if we had observed not $T$ samples but an effective number $T^* < T$.
Comparing an empirical correlation matrix to a purely random correlation matrix is also usually incorrect, because empirical eigenvalues strictly above $\lambda_{+}$ reduce the variance $\sigma^2$ of the random part of the empirical correlation matrix^{2}.
This variance must then be considered as an adjustable parameter and not as a constant value equal to one.
Because of what precedes, the eigenvalues clipping method must in practice include a preliminary step to find the "best" values of the parameters $q$ and $\sigma^2$. These could for example be defined as the values that bring the eigenvalue density of the random correlation matrix as close as possible^{12} to the eigenvalue density of the empirical correlation matrix, as in de Prado^{13}.
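One naive way to implement such a preliminary fitting step is a grid search minimizing the $l^2$ distance between the Marchenko-Pastur density and a histogram of the empirical eigenvalues; the grids and the number of bins below are illustrative choices of mine, not those of de Prado^{13} or of Portfolio Optimizer.

```python
import numpy as np

def mp_density(lam, q, sigma2):
    # Marchenko-Pastur density evaluated at the points lam
    lp = sigma2 * (1 + np.sqrt(q)) ** 2
    lm = sigma2 * (1 - np.sqrt(q)) ** 2
    rho = np.zeros_like(lam)
    inside = (lam > lm) & (lam < lp)
    rho[inside] = np.sqrt((lp - lam[inside]) * (lam[inside] - lm)) \
        / (2 * np.pi * q * sigma2 * lam[inside])
    return rho

def fit_mp(eigenvalues, qs, sigma2s, bins=30):
    # Grid search for the (q, sigma^2) whose density best fits the histogram
    hist, edges = np.histogram(eigenvalues, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    best, best_err = None, np.inf
    for q in qs:
        for s2 in sigma2s:
            err = np.sum((mp_density(centers, q, s2) - hist) ** 2)
            if err < best_err:
                best, best_err = (q, s2), err
    return best

# Sanity check on pure noise with known parameters q = 0.2, sigma^2 = 1
rng = np.random.default_rng(1)
n, T = 400, 2000
X = rng.standard_normal((T, n))
eigenvalues = np.linalg.eigvalsh(X.T @ X / T)
q_hat, sigma2_hat = fit_mp(eigenvalues,
                           qs=np.linspace(0.05, 0.6, 12),
                           sigma2s=np.linspace(0.6, 1.4, 9))
```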
Figures 4 and 5, taken from Gatheral^{14}, illustrate the impact of such a preliminary step on the empirical correlation matrix of $n = 431$ stocks belonging to the S&P 500 index, computed using $T = 2,155$ daily returns for each stock^{15}.
The Marchenko-Pastur density with optimal parameters $q = 0.34$ and $\sigma^2 = 0.53$ displayed on Figure 5 definitely better fits the random part of the eigenvalue density of the empirical correlation matrix than the Marchenko-Pastur density with “by the book” parameters $q = 0.2$ and $\sigma^2 = 1$ displayed on Figure 4.
Portfolio Optimizer implements the eigenvalues clipping method through the endpoint `/assets/correlation/matrix/denoised`, using a proprietary algorithm to find the best values of the adjustable parameters $q$ and $\sigma^2$.
There is one well-known limitation to the eigenvalues clipping method.
Results from random matrix theory establish that the spectrum of an empirical correlation matrix is usually a broadened version of the spectrum of its true unobservable counterpart^{8}. That is, small empirical eigenvalues are usually too small and large empirical eigenvalues are usually too large.
The eigenvalues clipping method does increase the small empirical eigenvalues, but does not alter the large empirical eigenvalues, so that they remain overestimated in the denoised empirical correlation matrix.
Markowitz’s mean-variance analysis is one of the most well-known frameworks to construct a portfolio with an optimal level of risk^{16} and return from a universe of risky assets.
One of its inputs is a correlation matrix, representing future asset correlations, and while the most natural choice [for it] is to use the sample [correlation] matrix determined using a series of past returns, […] [this choice] […] can lead to disastrous results^{1}.
Indeed, because the sample correlation matrix is affected by noise, its usage leads to a dramatic underestimation of the real risk, by overinvesting in artificially low-risk eigenvectors^{17}.
A possible solution to this issue is to denoise it thanks to the eigenvalues clipping method, whose behaviour I will illustrate on two very different universes of assets.
Laloux et al.^{17} analyze the impact of using the sample correlation matrix vs. the denoised sample correlation matrix in a large universe of $n = 406$ stocks belonging to the S&P 500 index.
In details, they:
The three resulting efficient frontiers are displayed in Figure 6, taken from Laloux et al.^{17}, where:
It is clearly visible that:
The catch with the previous example^{19} is that the predicted mean-variance efficient frontiers are usually computed in the literature without taking into account the real-life constraints faced by individuals or mutual funds, like no short sales constraints or maximum investment constraints.
Problem is, Jagannathan and Ma^{20} established that such constraints have a shrinkage-like effect on the sample asset correlation matrix, very similar to the effect of a denoising method.
As a consequence, the eigenvalues clipping method might actually have no additional value compared to simply imposing asset weight constraints.
In order to determine whether this is the case, I used Portfolio Optimizer to reproduce the methodology of Laloux et al.^{17} in a universe of $n = 470$ stocks belonging to the S&P 500 index^{21}, and I imposed non-negativity constraints on the computed efficient portfolios' weights.
The three resulting efficient frontiers are displayed in Figure 7, where the denoised sample correlation matrix is again visibly better at predicting the realized risk than the sample correlation matrix.
Nevertheless, the improvement is not as dramatic as in Laloux et al.^{17}, which is in agreement, for example, with the findings of Golden and Flint^{22} for the South African equity market.
I will now analyze the impact of using the sample correlation matrix vs. the denoised sample correlation matrix in the small universe of $n = 10$ assets of the Adaptive Asset Allocation strategy from ReSolve Asset Management, described in the paper Adaptive Asset Allocation: A Primer^{23}:
For this, I propose to adapt the methodology of Laloux et al.^{17} to the rules of the Adaptive Asset Allocation strategy^{24}, which leads to the computation of the following quantities at the end of the last trading day of each month:
When applied to the period June 2020 - September 2022^{25}, the results are that:
To summarize, in this example, using the denoised sample correlation matrix instead of the sample correlation matrix does no harm or improves the realized risk estimate ~90% of the time^{26}, which is quite remarkable for such small values of $n$ and $T$!
As usual, feel free to connect with me on LinkedIn or to follow me on Twitter if you would like to discuss Portfolio Optimizer (new feature request, support request…) or random finance stuff.
–
See Marc Potters, Jean-Philippe Bouchaud, A First Course in Random Matrix Theory, Cambridge University Press. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7} ↩^{8}
See Laurent Laloux, Pierre Cizeau, Jean-Philippe Bouchaud, and Marc Potters, Noise Dressing of Financial Correlation Matrices, Phys. Rev. Lett. 83, 1467. ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6} ↩^{7}
When $\sigma^2 = 1$, the matrix $\Sigma$ is the empirical correlation matrix associated to $X$. ↩
See V. A. Marchenko, L. A. Pastur - Distribution of eigenvalues for some sets of random matrices - Mat. Sb. (N.S.), 72(114):4 (1967), 507–536. ↩
See Pavel Yaskov, A short proof of the Marchenko–Pastur theorem, Comptes Rendus Mathematique Volume 354, Issue 3, March 2016. ↩
Under the proper technical assumptions mentioned above. ↩
Which could not be the case, for example because of an extremely slow rate of convergence. ↩
See Joel Bun, Jean-Philippe Bouchaud, Marc Potters, Cleaning correlation matrices, Risk.net. ↩ ↩^{2}
This is not a theoretical consequence of the Marchenko-Pastur theorem, but rather a practical consequence of the attempt to limit small eigenvalues. In addition, small eigenvalues are usually not as clearly separated from the bulk of the spectrum as the large eigenvalues. ↩
See Vasiliki Plerou, Parameswaran Gopikrishnan, Bernd Rosenow, Luís A. Nunes Amaral, Thomas Guhr, and H. Eugene Stanley, Random matrix approach to cross correlations in financial data, Phys. Rev. E 65, 066126, 27 June 2002. ↩
Choosing the constant equal to zero requires an additional manipulation of the denoised correlation matrix to make it a valid correlation matrix, c.f. Plerou et al.^{10}. ↩
For example, in terms of $l^2$ norm. ↩
See Lopez de Prado, Marcos, A Robust Estimator of the Efficient Frontier. ↩
See Jim Gatheral, Random Matrix Theory and Covariance Estimation, NYU Courant Institute Algorithmic Trading Conference (October 2008). ↩
The aspect ratio of this empirical correlation matrix is thus $q = \frac{431}{2155} = 0.2$. ↩
In mean-variance analysis, the risk of a portfolio is defined in terms of the variance of its returns. ↩
See Laurent Laloux, Pierre Cizeau, Jean-Philippe Bouchaud, and Marc Potters, Random matrix theory and financial correlations, International Journal of Theoretical and Applied Finance, Vol. 03, No. 03, pp. 391-397 (2000). ↩ ↩^{2} ↩^{3} ↩^{4} ↩^{5} ↩^{6}
In order to isolate the impact of the asset correlations from the asset returns and the asset volatilities, future asset returns and future asset volatilities are used in the computation of the efficient frontiers. ↩
For example, the same remark applies to Plerou et al.^{10}. ↩
See Ravi Jagannathan & Tongshu Ma, Risk Reduction in Large Portfolios: Why Imposing the Wrong Constraints Helps, Journal of Finance, American Finance Association, vol. 58(4), pages 1651-1684, August 2003. ↩
The associated dataset, which covers the period 11th February 2013 - 07th February 2018, is available on Kaggle. ↩
See Golden, Daron and Flint, Emlyn, Improving Portfolio Allocation Through Covariance Matrix Filtering. ↩
See Butler, Adam and Philbrick, Mike and Gordillo, Rodrigo and Varadi, David, Adaptive Asset Allocation: A Primer. ↩
Taken from Allocate Smartly’s blog. ↩
I retrieved the adjusted ETF prices over the period January 2020 - September 2022 using Tiingo. ↩
A possible next step would be to determine if this improves the backtested performance of the Adaptive Asset Allocation strategy. ↩