Bootstrap Simulations with Exact Sample Mean Vector and Sample Covariance Matrix

Bootstrapping is a statistical method that consists of sampling with replacement from an original data set in order to estimate the distribution of a desired statistic, with many possible variations depending on the exact context (independent data, dependent data…).

Because bootstrap methods are not, in general, based on any particular assumption on the distribution of the data, they are well suited for the analysis of [financial] returns1, in which case the original data set typically consists of historical return time series2.

Unfortunately, bootstrap simulations have the disadvantage that only previously recorded values can be simulated3, so that they cannot explicitly reflect the investors’ expectations3.

In this blog post, inspired by Rob Carver’s piece Portfolio optimisation, uncertainty, bootstrapping, and some pretty plots4, I will describe a way to incorporate exact views on expected returns, standard deviations and correlations into bootstrapping so that the resulting simulations both preserve the salient characteristics of asset returns3 and also match forward-looking risk5 and return assumptions.

As an example of application, I will revisit the methodology described in the research note Considering the Past and the Future in Asset Simulation3 from T. Rowe Price and incorporate the full range of capital market assumptions into a systematic modelling process to simulate potential asset returns for portfolio construction3.

Mathematical preliminaries

Let:

  • $n$, the number of assets in a universe of assets
  • $T$, the number of time periods
  • $R \in \mathcal{M}(\mathbb{R}^{T \times n})$, with $R_{t,i}$ the return6 of the asset $i=1..n$ for the time period $t=1..T$, representing a scenario (historical, simulated…) for the temporal evolution of the asset returns
  • $\mu \in \mathbb{R}^{n}$, the vector of the arithmetic averages of the asset returns $R$
  • $\Sigma \in \mathcal{M}(\mathbb{R}^{n \times n})$, the covariance matrix of the asset returns $R$, supposed to be invertible to avoid numerical subtleties

Moment-matching

Let:

  • $\bar{\mu} \in \mathbb{R}^{n}$ a vector
  • $\bar{\Sigma} \in \mathcal{M} \left( \mathbb{R}^{n \times n} \right)$ a positive-definite matrix

Suppose that we would like the first two empirical moments of the asset returns $R$ - that is, $\mu$ and $\Sigma$ - to take the values $\bar{\mu}$ and $\bar{\Sigma}$.

How to proceed?

This problem - called moment-matching by twisting scenarios in Attilio Meucci’s Advanced Risk and Portfolio Management website - consists in transforming the original asset returns $R$ into modified asset returns $\tilde{R} \in \mathcal{M}(\mathbb{R}^{T \times n})$, so that:

  • $ \frac{1}{T} \sum_{t=1}^{T} \tilde{R}_{t,i} = \bar{\mu}_i $, $i=1..n$
  • $ \frac{1}{T} \sum_{k=1}^T \left( \tilde{R}_{k,i} - \bar{\mu}_i \right) \left( \tilde{R}_{k,j} - \bar{\mu}_j \right) = \bar{\Sigma}_{i,j} $, $i=1..n$, $j=1..n$
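
For readers who prefer matrix notation, the two conditions above can be restated compactly as follows. The symbol $\mathbb{1}_T \in \mathbb{R}^{T}$ for the vector of ones is a shorthand introduced here for illustration only, not a notation from the referenced papers:

\[\frac{1}{T} \tilde{R}^t \mathbb{1}_T = \bar{\mu}, \qquad \frac{1}{T} \left( \tilde{R} - \mathbb{1}_T \bar{\mu}^t \right)^t \left( \tilde{R} - \mathbb{1}_T \bar{\mu}^t \right) = \bar{\Sigma}\]

Note that, as in the component-wise version, the covariance is normalized by $T$ and not by $T-1$.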

Moment-matching with $n=1$, shift and rescaling procedure

When $n=1$, a procedure to compute the modified asset returns $\tilde{R}$ from the original asset returns $R$ is discussed in Boyle et al.7.

It involves the following intuitive shift/rescaling8 of the original asset returns:

\[\tilde{R} = \begin{pmatrix} 1 \\ 1 \\ ... \\ 1 \end{pmatrix} \bar{\mu} + \frac{\bar{\sigma}}{\sigma} \left( R - \begin{pmatrix} 1 \\ 1 \\ ... \\ 1 \end{pmatrix} \mu \right)\]

, with:

  • $\sigma = \sqrt{\Sigma_{1,1}}$, the (sample) standard deviation of the original asset returns
  • $\bar{\sigma} = \sqrt{\bar{\Sigma}_{1,1}}$, the desired (sample) standard deviation of the modified asset returns
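
As an illustration, the one-dimensional shift/rescaling above might be sketched in Python as follows. This is a minimal sketch using only the standard library; the function name `shift_rescale` and the return figures are hypothetical, and the standard deviation is assumed to be normalized by $T$, in line with the definition of the second moment used in this post.

```python
from statistics import fmean, pstdev

def shift_rescale(returns, target_mean, target_std):
    """Shift and rescale a return series so that its sample mean and its
    (1/T-normalised) sample standard deviation equal the targets."""
    mu = fmean(returns)
    sigma = pstdev(returns)  # population std, i.e. 1/T normalisation
    if sigma == 0.0:
        raise ValueError("a constant series cannot be rescaled")
    scale = target_std / sigma
    return [target_mean + scale * (r - mu) for r in returns]

# Hypothetical monthly returns, for illustration only
original = [0.02, -0.01, 0.03, 0.00, -0.02, 0.04]
modified = shift_rescale(original, target_mean=0.005, target_std=0.03)

print(fmean(modified), pstdev(modified))
```

Up to floating-point rounding, the printed mean and standard deviation should coincide with the two targets.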

Moment-matching with $n \geq 2$

Multivariate shift and rescaling procedure

When $n \geq 2$, Kaut and Lium9 hint at a multivariate generalization of the procedure described in the previous sub-section, which becomes an affine transformation of the original asset returns $R$:

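\[\tilde{R} = \begin{pmatrix} 1 \\ 1 \\ ... \\ 1 \end{pmatrix} \bar{\mu}^t + \left( R - \begin{pmatrix} 1 \\ 1 \\ ... \\ 1 \end{pmatrix} \mu^t \right) \Sigma^{-\frac{1}{2}} \bar{\Sigma}^{\frac{1}{2}}\]

, with:

  • $\Sigma^{-\frac{1}{2}}$, the inverse of the symmetric positive-definite square root of $\Sigma$
  • $\bar{\Sigma}^{\frac{1}{2}}$, the symmetric positive-definite square root of $\bar{\Sigma}$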

Note that if only the average asset returns and/or the asset standard deviations need to be altered - that is, if the asset correlations should be left unchanged -, the generic formula above can be replaced by the simpler:

\[\tilde{R} = \begin{pmatrix} 1 \\ 1 \\ ... \\ 1 \end{pmatrix} \bar{\mu}^t + \left( R - \begin{pmatrix} 1 \\ 1 \\ ... \\ 1 \end{pmatrix} \mu^t \right) Diag \left( \frac{\bar{\sigma}}{\sigma} \right)\]

, with

\[Diag \left( \frac{\bar{\sigma}}{\sigma} \right) = \begin{pmatrix} \frac{\bar{\sigma}_1}{\sigma_1} & 0 & ... & 0 \\ 0 & \frac{\bar{\sigma}_2}{\sigma_2} & ... & 0 \\ ... & ... & ... & ... \\ 0 & 0 & ... & \frac{\bar{\sigma}_n}{\sigma_n} \end{pmatrix}\]

, which is exactly the one-dimensional procedure described in the previous sub-section, applied independently to each of the $n$ assets.
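
The multivariate shift and rescaling procedure might be sketched with NumPy as below. This is only a sketch under stated assumptions: the symmetric square roots are computed through an eigendecomposition, the centered returns are multiplied on the right by the $n \times n$ rescaling matrix so that dimensions agree with $R$ being $T \times n$, and the helper names (`sym_sqrt`, `shift_rescale_multivariate`) as well as the synthetic data are hypothetical.

```python
import numpy as np

def sym_sqrt(matrix):
    """Symmetric positive semi-definite square root via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(matrix)
    return (eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))) @ eigvecs.T

def shift_rescale_multivariate(R, target_mean, target_cov):
    """Affine map giving R an exact sample mean and (1/T) sample covariance."""
    mu = R.mean(axis=0)
    X = R - mu                              # centered returns, T x n
    cov = X.T @ X / R.shape[0]              # 1/T-normalised covariance
    B = np.linalg.inv(sym_sqrt(cov)) @ sym_sqrt(target_cov)
    return target_mean + X @ B

# Synthetic returns and hypothetical targets, for illustration only
rng = np.random.default_rng(0)
R = 0.01 * rng.normal(size=(250, 3))
target_mean = np.array([0.0004, 0.0002, 0.0001])
target_vol = np.array([0.012, 0.006, 0.009])
target_corr = np.array([[1.0, 0.5, 0.2],
                        [0.5, 1.0, 0.3],
                        [0.2, 0.3, 1.0]])
target_cov = np.outer(target_vol, target_vol) * target_corr

R_tilde = shift_rescale_multivariate(R, target_mean, target_cov)
print(R_tilde.mean(axis=0))
print(np.cov(R_tilde, rowvar=False, bias=True))
```

The call to `np.cov` uses `bias=True` so that the covariance is normalized by $T$, which is the convention under which the match is exact.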

Minimum-correction second-moment-matching procedure

One problem with the multivariate shift and rescaling procedure described in the previous sub-section is that it offers no guarantees regarding the “distance” between the original asset returns and the modified asset returns.

In other words, it is theoretically possible that the modified asset returns obtained through this procedure bear no resemblance at all to the original asset returns, which would render them useless for the purpose of maintain[ing] important information from [the original asset returns]3.

To solve this problem, a natural approach is to aim for a minimal correction $\tilde{R} - R$ to avoid introducing numerical artifacts10.

Such an approach to minimally distorting the original asset returns is described in Lin and Lermusiaux10 under the name minimum-correction second-moment-matching10, with the following main result11:

\[\operatorname*{argmin}_{\tilde{R} \in \mathcal{M}(\mathbb{R}^{T \times n}), \frac{1}{T}\tilde{R}^t \tilde{R} = \bar{\Sigma}} \left\Vert R - \tilde{R} \right\Vert^2_F = R A_*\]

, with $A_*$ defined by

\[A_* = { \bar{\Sigma}^{\frac{1}{2}} }^t \left( { \bar{\Sigma}^{\frac{1}{2}} } \Sigma { \bar{\Sigma}^{\frac{1}{2}} }^t \right)^{-\frac{1}{2}} { \bar{\Sigma}^{\frac{1}{2}} }\]

This leads to the mapping:

\[\tilde{R} = \begin{pmatrix} 1 \\ 1 \\ ... \\ 1 \end{pmatrix} \bar{\mu}^t + \left( R - \begin{pmatrix} 1 \\ 1 \\ ... \\ 1 \end{pmatrix} \mu^t \right) A_*\]
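
A possible NumPy sketch of this mapping is given below. It assumes symmetric square roots computed by eigendecomposition and a sample covariance matrix normalized by $T$; the function names and the synthetic data are hypothetical. As a sanity check, it also computes the Frobenius distance obtained with the shift and rescaling map of the previous sub-section, which by construction cannot be smaller than the minimum-correction one.

```python
import numpy as np

def sym_sqrt(matrix):
    """Symmetric positive semi-definite square root via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(matrix)
    return (eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))) @ eigvecs.T

def min_correction(R, target_mean, target_cov):
    """Minimum-correction map to an exact sample mean and (1/T) covariance."""
    mu = R.mean(axis=0)
    X = R - mu
    cov = X.T @ X / R.shape[0]
    S = sym_sqrt(target_cov)                # S.T @ S equals target_cov
    A_star = S.T @ np.linalg.inv(sym_sqrt(S @ cov @ S.T)) @ S
    return target_mean + X @ A_star

# Synthetic correlated returns, for illustration only
rng = np.random.default_rng(1)
mixing = np.array([[1.0, 0.0, 0.0],
                   [-0.4, 1.0, 0.0],
                   [0.3, 0.2, 1.0]])
R = 0.01 * rng.normal(size=(250, 3)) @ mixing

# Keep means and standard deviations, alter only the correlations
mu = R.mean(axis=0)
sigma = R.std(axis=0)                       # 1/T normalisation (ddof=0)
target_corr = np.array([[1.0, 0.5, 0.0],
                        [0.5, 1.0, 0.0],
                        [0.0, 0.0, 1.0]])
target_cov = np.outer(sigma, sigma) * target_corr

R_min = min_correction(R, mu, target_cov)

# Shift and rescaling map, for comparison
X = R - mu
cov = X.T @ X / R.shape[0]
R_shift = mu + X @ np.linalg.inv(sym_sqrt(cov)) @ sym_sqrt(target_cov)

dist_min = np.linalg.norm(R - R_min)        # Frobenius norm for 2-D arrays
dist_shift = np.linalg.norm(R - R_shift)
print(dist_min, dist_shift)
```

Both modified return matrices have the same sample mean vector and covariance matrix; only their distance to the original returns differs.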

Comparison of the two procedures

Figure 1 and Figure 2 highlight, over the period 4th January 2010 - 31st December 202012, the typical differences between the two procedures described in the previous sub-sections when applied to a universe of 10 ETFs representative13 of miscellaneous asset classes.

Figure 1. Comparison of moment matching procedures in a 10 ETF universe, SPY ETF, 4th January 2010 - 31st December 2020.

Figure 2. Comparison of moment matching procedures in a 10 ETF universe, TLT ETF, 4th January 2010 - 31st December 2020.

On Figure 1 and Figure 2:

  • The curve in blue corresponds to the original asset returns
  • The two curves in green and orange correspond to modified asset returns so that the average returns and the standard deviations of the asset returns are left unchanged but their correlations are altered, using
    • The multivariate shift and rescaling procedure (green curve)
    • The minimum correction procedure (orange curve)

From these figures, it is clear that the curve corresponding to the minimum correction procedure is “closer” to the original curve than the curve corresponding to the multivariate shift and rescaling procedure.

This is confirmed numerically, with the “minimally corrected” asset returns being about 10% closer to the original asset returns than the “shifted and scaled” asset returns:

  • $\left\Vert R - \tilde{R}_{\text{shift-rescaling}} \right\Vert_F \approx 1.30$
  • $\left\Vert R - \tilde{R}_{\text{min-correction}} \right\Vert_F \approx 1.17$

Misc. remarks

A couple of remarks on what precedes:

  • The underlying distribution of the original asset returns - whatever its nature - is usually not preserved by the different moment-matching procedures.

    This is discussed in Boyle et al.7 for the one-dimensional case and in Kaut and Lium9 for the general multi-dimensional case.

  • The multivariate formulas for computing $\tilde{R}$ provided in the previous sub-sections are not the most generic ones.

    In particular, Kaut and Lium9 and Lin and Lermusiaux10 emphasize that the matrix square roots appearing in those formulas need not be the unique positive semidefinite square roots.

Bootstrap simulations with exact sample mean vector and sample covariance matrix

Bootstrap simulations…

Carver4 gives a great summary of why bootstrap simulations are so useful in finance:

Bootstrapping is particularly potent in the field of financial data because we only have one set of data: history. We can’t run experiments to get more data. Bootstrapping allows us to create ‘alternative histories’ that have the same basic character as our actual history, but aren’t quite the same. Apart from generating completely random data […], there isn’t really much else we can do.

When applied to historical asset returns, bootstrapping creates simulated asset returns that differ from their historical counterparts but that nevertheless maintain important information gleaned from history3.

Now, for various reasons, it might be useful to be able to tweak the first two empirical moments of those simulated asset returns:

  • To incorporate forward-looking estimates of mean asset returns based on current market valuation indicators like the U.S. AIAE indicator for U.S. stocks or the BSRM/B model for U.S. bonds.

  • To incorporate forward-looking estimates of asset correlations, like an estimate of the future stock-bond correlation based on the framework described in Czasonis et al.14 or, more recently, in Molenaar et al.15.

  • To simulate correlation scenarios incompatible with the historical asset returns, like unprecedented correlation breakdowns16.

… with exact sample mean vector and sample covariance matrix

For this, one possibility is to use the moment-matching procedures described in the previous section, and in particular the minimum-correction second-moment-matching procedure.

Indeed:

  • Although these procedures do not preserve the distribution of the simulated asset returns, it is not a problem in the context of bootstrapping because - contrary to a Monte Carlo simulation - there is (usually) no assumption made about that distribution.

  • Additionally, the “minimum alteration” property of the minimum-correction second-moment-matching procedure gives some assurance that, even if the distribution of the simulated asset returns is altered by this procedure, it should be altered as little as possible, at least in the sense of the Frobenius distance between the original and the modified asset returns.

In practice, a moment-matching procedure can be used in two different ways with bootstrap simulations:

  • For tweaking the historical asset returns
  • For tweaking the simulated asset returns

Tweaking the historical asset returns

This is the method described for example in Carver4 or in Wu and Walsh3.

It consists in:

  • Transforming the historical asset returns into modified historical asset returns whose sample mean vector (resp. sample covariance matrix) is equal to a desired target mean vector (resp. target covariance matrix).

  • Bootstrapping these modified historical asset returns in order to simulate asset returns.

Figure 3 illustrates this method applied to a universe of U.S. stocks (SPY ETF) and bonds (TLT ETF) over the period 4th January 2010 - 31st December 202012 when the historical correlation of about $-0.5$ between stocks and bonds is altered to about $0.5$, which is more representative of the pre-2000 period, cf. Brixton et al.17.

Figure 3. Simulated SPY/TLT ETF returns obtained from bootstrapping moment-matched historical SPY/TLT ETF returns, 4th January 2010 - 31st December 2020.

One of the advantages of this method is that the computationally costly part - the moment-matching procedure - needs to be performed only once.

Nevertheless, with this method, the simulated asset returns will exhibit a varying sample mean vector and sample covariance matrix, which might be undesirable.

Tweaking the simulated asset returns

This method consists in:

  • Bootstrapping the historical asset returns in order to simulate asset returns.

  • Transforming the simulated asset returns into modified simulated asset returns whose sample mean vector (resp. sample covariance matrix) is equal to a desired target mean vector (resp. target covariance matrix).

Figure 4 illustrates this method in the same context as in Figure 3.

Figure 4. Moment-matched simulated SPY/TLT ETF returns obtained from bootstrapping historical SPY/TLT ETF returns, 4th January 2010 - 31st December 2020.

With this method, the simulated asset returns will exhibit a sample mean vector and a sample covariance matrix that exactly match the desired target mean vector and target covariance matrix.

Unfortunately, one of the drawbacks of this method is that it is computationally heavy, because every time a bootstrap simulation is generated, the moment-matching procedure must be applied to the associated simulated asset returns.
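
The difference between the two methods can be made concrete with a small NumPy sketch. It relies on several assumptions: a plain i.i.d. bootstrap of the time periods (a block bootstrap could be substituted), the minimum-correction map with symmetric square roots and a covariance normalized by $T$, and synthetic data standing in for historical returns. All function names are hypothetical.

```python
import numpy as np

def sym_sqrt(matrix):
    """Symmetric positive semi-definite square root via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(matrix)
    return (eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))) @ eigvecs.T

def moment_match(R, target_mean, target_cov):
    """Minimum-correction map to an exact sample mean and (1/T) covariance."""
    X = R - R.mean(axis=0)
    cov = X.T @ X / R.shape[0]
    S = sym_sqrt(target_cov)
    A = S @ np.linalg.inv(sym_sqrt(S @ cov @ S)) @ S
    return target_mean + X @ A

def bootstrap(R, rng):
    """Plain i.i.d. bootstrap: resample time periods (rows) with replacement."""
    return R[rng.integers(0, R.shape[0], size=R.shape[0])]

def sample_cov(R):
    """Sample covariance matrix normalised by T."""
    return np.cov(R, rowvar=False, bias=True)

rng = np.random.default_rng(0)
history = rng.normal(scale=0.01, size=(500, 2))   # synthetic "historical" returns
target_mean = np.array([0.0003, 0.0001])
target_cov = np.array([[1.0e-4, 2.5e-5],
                       [2.5e-5, 2.5e-5]])         # vols 1% and 0.5%, correlation 0.5

# Method 1: moment-match the history once, then bootstrap it
matched_history = moment_match(history, target_mean, target_cov)
path_1 = bootstrap(matched_history, rng)

# Method 2: bootstrap the history, then moment-match every simulated path
path_2 = moment_match(bootstrap(history, rng), target_mean, target_cov)

print("method 1:", path_1.mean(axis=0), sample_cov(path_1)[0, 1])
print("method 2:", path_2.mean(axis=0), sample_cov(path_2)[0, 1])
```

With the first method, the moments of each simulated path fluctuate around the targets; with the second method, they match the targets up to floating-point rounding, at the price of one moment-matching computation per path.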

Comparison with entropy pooling

Readers familiar with a procedure called entropy pooling18 - which also makes it possible to express views on features of the market, such as expectation or correlations, regardless of the market distribution19 - might wonder about the relationship between that procedure and the moment-matched bootstrapping procedure described in this section.

Two preliminary remarks:

  • Bootstrapping, especially its variations designed for time-series, introduces time-dependency in the sampled asset returns.

    This makes moment-matched bootstrapping closer in spirit to time-dependent entropy pooling - as described in van der Schans20, which applies the entropy pooling computational approach in a time-dependent setting with sample paths, called scenarios, instead of sample points20 - than to vanilla entropy pooling.

  • Time-dependent entropy pooling incorporates the [views] by assigning weights to the scenarios20 within a scenario-probability setting, while moment-matched bootstrapping operates on one scenario at a time, without any reference to a probability distribution.

With that in mind, the relationship between entropy pooling and moment-matched bootstrapping would be the following:

  • Time-dependent entropy pooling works by tweaking the relative probabilities of the scenarios without affecting the scenarios themselves18
  • Moment-matched bootstrapping works by tweaking the scenarios without affecting the relative probabilities themselves, since in that case, the user associates full probability to one single scenario18

That being said, entropy pooling is a much more generic21 framework for imposing views than moment-matched bootstrapping, so that the comparison ends here.

Implementation in Portfolio Optimizer

Portfolio Optimizer implements the minimum-correction second-moment-matching procedure described in the previous sections through the endpoint /assets/returns/moment-matched.

Together with the endpoint /assets/returns/simulation/bootstrap, this makes it possible to use either of the two methods described in the previous section.

Example of application - extending a historically informed, forward-looking simulation framework

T. Rowe Price’s simulation framework

Wu and Walsh3 describe a simulation-based portfolio construction framework that embed[s] both long-term historical asset behaviours and investor expectations of future performance when simulating asset returns3.

In detail, this historically informed3 framework is a three-step process that:

  • Backfills missing historical asset returns for assets whose return histories differ in length, using the relationships observed between all assets over their common returns history while accounting for the associated estimation error.

    As a side note, this step uses the procedure described in a previous blog post.

  • Tweaks the resulting historical asset returns, taking into account user-defined return expectations3.

    This step tweaks the historical asset returns thanks to the univariate shift procedure22 described in the previous sections, using T. Rowe Price’s own expected returns as input.

    Figure 5, nearly identical23 to Figure 8 from Wu and Walsh3, depicts the result of that procedure when applied to the SPY ETF over the period December 2010 - December 2021 with a target expected return of 4.9% per year24.

    Figure 5. 12-month rolling SPY ETF returns before and after sample mean moment-matching, December 2010 - December 2021.

  • Simulates asset returns from the modified historical asset returns, making sure to retain the actual pattern of […] asset movements in each simulated scenario3.

    This step relies on what seems to be a rolling block-bootstrap in order to recognise extreme tail risk occurrences in actual historical context3.
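
Since Wu and Walsh do not spell out their exact resampling scheme, the sketch below only illustrates one common variant, the moving block bootstrap, in which contiguous blocks of observations are resampled so that the pattern of returns inside each block is preserved. The function name, the block length and the other parameters are hypothetical choices made for illustration.

```python
import random

def moving_block_bootstrap_indices(n_obs, block_len, n_out, seed=None):
    """Row indices of a moving block bootstrap sample: contiguous blocks of
    block_len observations, with block starts drawn uniformly at random."""
    if not 1 <= block_len <= n_obs:
        raise ValueError("block_len must be between 1 and n_obs")
    rng = random.Random(seed)
    indices = []
    while len(indices) < n_out:
        start = rng.randrange(n_obs - block_len + 1)
        indices.extend(range(start, start + block_len))
    return indices[:n_out]  # truncate the last block if needed

# Example: 60 simulated months built from 12-month blocks of a 120-month history
idx = moving_block_bootstrap_indices(n_obs=120, block_len=12, n_out=60, seed=42)
print(idx[:12])
```

The returned indices would then be used to select rows of the (possibly moment-matched) historical return matrix.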

Ultimately, T. Rowe Price’s framework can be used to demonstrate how [a multi-asset portfolio] would have performed in different market conditions reflective of historical experience, while incorporating the investor’s expectations for the future3.

Incorporating all capital market assumptions into T. Rowe Price’s simulation framework

I propose to extend T. Rowe Price’s framework to be able to take into account:

  • User-defined return expectations
  • User-defined standard deviation expectations
  • User-defined correlation expectations

Because these three quantities usually represent the full range of capital market assumptions published by financial institutions25, the resulting simulation framework will become even more powerful while retaining its original simplicity.

In line with this blog post, this is simply done by integrating the minimum-correction second-moment-matching procedure described in the previous sections into the second step of T. Rowe Price’s framework.
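
As a sketch of how the inputs of that second step could be assembled, the target mean vector and target covariance matrix can be built from expected returns, standard deviations and correlations as shown below. Apart from the 4.9% expected return quoted earlier for U.S. Large Cap, all figures are hypothetical, and the simple (non-compounded) conversion from annual to monthly figures is an assumption of this sketch.

```python
import numpy as np

# Annualised capital market assumptions for three asset classes
# (hypothetical values, except the 4.9% expected return quoted in this post)
expected_returns = np.array([0.049, 0.030, 0.040])
volatilities = np.array([0.16, 0.07, 0.12])
correlations = np.array([[1.0, 0.2, 0.6],
                         [0.2, 1.0, 0.1],
                         [0.6, 0.1, 1.0]])

periods_per_year = 12  # assumption: monthly historical returns

# Simple de-annualisation: linear for means, square-root-of-time for volatilities
target_mean = expected_returns / periods_per_year
target_vol = volatilities / np.sqrt(periods_per_year)

# Covariance matrix = Diag(vol) x Correlation x Diag(vol)
target_cov = np.diag(target_vol) @ correlations @ np.diag(target_vol)

# Raises numpy.linalg.LinAlgError if the matrix is not positive definite
np.linalg.cholesky(target_cov)

print(target_mean)
print(target_cov)
```

The resulting `target_mean` and `target_cov` play the roles of $\bar{\mu}$ and $\bar{\Sigma}$ in the moment-matching procedures described earlier.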

Conclusion

A perfect summary of the technique described in this blog post is given by Carver4:

[…] it basically allows us to use forward looking estimates for the first two moments (and first co-moment - correlation) of the distribution, whilst using actual data for the higher moments (skew, kurtosis and so on) and co-moments (co-skew, co-kurtosis etc). In a sense it’s sort of a blend of a parameterised monte-carlo and a non parameterised bootstrap.

For more fun with bootstrap, feel free to connect with me on LinkedIn or to follow me on Twitter.

  1. See Esther Ruiz and Lorenzo Pascual. Bootstrapping Financial Time Series. Journal of Economic Surveys, 2002, vol. 16, issue 3, 271-300

  2. Like historical asset returns or historical factor returns. 

  3. See Wu, Walsh, Considering the Past and the Future in Asset Simulation, T. Rowe Price, Investment Insights, November 2022

  4. See Rob Carver, This Blog is Systematic, Portfolio optimisation, uncertainty, bootstrapping, and some pretty plots. Ho, ho, ho

  5. With risk defined in terms of volatility. 

  6. Arithmetic, logarithmic… 

  7. See Boyle P, M Broadie and P Glasserman, 1995, Recent advances in simulation for security pricing, Proceedings of the 1995 Winter Simulation Conference, pages 212–219

  8. See Meucci, Attilio, Simulations with Exact Means and Covariances (June 7, 2009)

  9. See Kaut, Michal, and Lium, Arnt-Gunnar. “Scenario generation with distribution functions and correlations.” Kybernetika 50.6 (2014): 1049-1064

  10. See Lin, J., Lermusiaux, P.F.J. Minimum-correction second-moment-matching: theory, algorithms and applications. Numer. Math. 147, 611–650 (2021)

  11. Note that Theorem 2.1 from Lin and Lermusiaux10 requires a couple of numerical assumptions - in the notation of this post, that $T \geq n$ and that $R$ has full column rank. 

  12. (Adjusted) prices have been retrieved using Tiingo

  13. These ETFs are used in the Adaptive Asset Allocation strategy from ReSolve Asset Management, described in the paper Adaptive Asset Allocation: A Primer26

  14. See Megan Czasonis, Mark Kritzman, David Turkington, The Stock-Bond Correlation, The Journal of Portfolio Management, February 2021, 47 (3) 67-76

  15. See Molenaar, Roderick and Senechal, Edouard and Swinkels, Laurens and Wang, Zhenping, Empirical evidence on the stock-bond correlation (February 9, 2023)

  16. On this, see also the blog posts here, here and here

  17. See A Changing Stock–Bond Correlation: Drivers and Implications, Alfie Brixton, Jordan Brooks, Pete Hecht, Antti Ilmanen, Thomas Maloney, Nicholas McQuinn, The Journal of Portfolio Management, Multi-Asset Special Issue 2023, 49 (4) 64 - 80

  18. See Meucci, Attilio, Fully Flexible Views: Theory and Practice, Risk, Vol. 21, No. 10, pp. 97-102, October 2008

  19. See Meucci, Attilio and Nicolosi, Marco, Dynamic Portfolio Management with Views at Multiple Horizons (April 16, 2015). Applied Mathematics and Computation, Volume 274, 1 February 2016, Pages 495-518

  20. See van der Schans, M. Entropy Pooling with Discrete Weights in a Time-Dependent Setting. Comput Econ 53, 1633–1647 (2019)

  21. Interestingly though, with (time-dependent) entropy pooling, when scenarios of two variables always move along with each other, we cannot impose a negative correlation20, while this situation is not an issue for moment-matched bootstrapping! 

  22. There is no alteration of asset standard deviations or of correlations described in Wu and Walsh3, so that the multivariate shift and rescaling procedure actually becomes a simple univariate shift procedure… 

  23. Returns before shifting in Figure 5 seem slightly higher than their counterparts in Figure 8 from Wu and Walsh3; this might be due to different data sources used. As a side note, Figure 8 from Wu and Walsh3 has inverted legends for the two thick lines. 

  24. Which corresponds to the 2022 T. Rowe Price five-year capital market assumptions / expected returns for U.S. Large Cap25

  25. See T. Rowe Price, Capital Market Assumptions Five-Year Perspective

  26. See Butler, Adam and Philbrick, Mike and Gordillo, Rodrigo and Varadi, David, Adaptive Asset Allocation: A Primer