Covariance Matrix Forecasting: Iterated Exponentially Weighted Moving Average Model
In the previous post of this series on covariance matrix forecasting, I reviewed both the simple and the exponentially weighted moving average covariance matrix forecasting models, which are straightforward extensions of their respective univariate volatility forecasting models to a multivariate setting.
With these reference models established, we can now delve into more sophisticated approaches for forecasting covariance matrices^{1}.
In this blog post, I will describe the iterated exponentially weighted moving average (IEWMA) model that has recently^{2} been introduced in Johansson et al.^{3} and I will illustrate its empirical performance in the context of monthly covariance matrix forecasting for a multi-asset class ETF portfolio.
Mathematical preliminaries
Covariance matrix modelling and covariance proxies (reminders)
This subsection contains reminders from a previous blog post.
Let $n$ be the number of assets in a universe of assets and $r_t \in \mathbb{R}^n$ be the vector of the (logarithmic) returns of these assets over a time period $t$ (a day, a week, a month...), over which the (conditional) mean return vector $\mu_t \in \mathbb{R}^n$ of these assets is assumed to be zero.
Then:

$r_t$ can be expressed as^{4} $r_t = \epsilon_t$, with $\epsilon_t \in \mathbb{R}^n$ an unpredictable error term, often referred to as a vector of “shocks” or as a vector of “random disturbances”^{4}, over the time period $t$

The asset (conditional) covariance matrix $\Sigma_t \in \mathcal{M}(\mathbb{R}^{n \times n})$ is defined as $\Sigma_t = \mathbb{E} \left[ r_t r_t {}^t \right]$
From this definition, the outer product of the asset returns $ r_t r_t {}^t $ is a covariance estimate - or covariance proxy^{5} - for the asset returns covariance matrix over the considered time period $t$.

The asset (conditional) correlation matrix $C_t \in \mathcal{M}(\mathbb{R}^{n \times n})$ is defined as $ C_t = V_t^{-1} \Sigma_t V_t^{-1} $, where $V_t \in \mathcal{M}(\mathbb{R}^{n \times n})$ is the diagonal matrix of the asset (conditional) standard deviations.
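As a small illustration of the relationship $C_t = V_t^{-1} \Sigma_t V_t^{-1}$, the sketch below converts a covariance matrix into its correlation matrix and standard deviations. The function name and the numerical values are made up for the example.

```python
import numpy as np


def covariance_to_correlation(cov):
    """Split a covariance matrix into (correlation matrix, standard deviations)."""
    vols = np.sqrt(np.diag(cov))       # diagonal of V
    corr = cov / np.outer(vols, vols)  # same as V^{-1} cov V^{-1} for diagonal V
    return corr, vols


# Hypothetical 2-asset covariance matrix: volatilities 20% and 10%, correlation 0.3.
sigma = np.array([[0.04, 0.006],
                  [0.006, 0.01]])
corr, vols = covariance_to_correlation(sigma)
```

Dividing element-wise by the outer product of the volatilities avoids forming $V_t^{-1}$ explicitly, which is equivalent because $V_t$ is diagonal.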
Correlation matrix modelling
In order to clarify the relation between conditional correlations and conditional variances^{6}, Engle^{6} proposes to write the [asset] returns as the conditional standard deviation times the standardized disturbance^{6}, that is
\[r_{i,t} = \epsilon_{i,t} = \sigma_{i,t} \varepsilon_{i,t}, i=1..n\], with:
 $\sigma_{i,t} = \sqrt{ \mathbb{E} \left[ r_{i,t}^2 \right] } $
 $\varepsilon_{i,t}$ a standardized disturbance that has mean zero and variance one^{6}
This way, the conditional correlation [between asset $i$ and asset $j$ becomes equal to] the conditional covariance between the standardized disturbances [$\varepsilon_{i,t}$ and $\varepsilon_{j,t}$]^{6}
\[\rho_{ij,t} = \frac{\mathbb{E} \left[ r_{i,t} r_{j,t} \right]}{\sqrt{ \mathbb{E} \left[ r_{i,t}^2 \right] \mathbb{E} \left[ r_{j,t}^2 \right] }} = \mathbb{E} \left[ \varepsilon_{i,t} \varepsilon_{j,t} \right]\]
The iterated exponentially weighted moving average covariance matrix forecasting model
The IEWMA covariance matrix forecasting model^{3} is a two-step model that:
 Uses an EWMA volatility forecasting model with squared asset returns as variance proxies in order to forecast asset volatilities
 Uses an EWMA correlation matrix forecasting model with outer products of (EWMA-)volatility-standardized asset returns as covariance matrix proxies in order to forecast asset correlations
Johansson et al.^{3} highlight that the IEWMA model was originally^{7} proposed in Engle^{6} as an efficient alternative to the DCC-GARCH^{6} predictor, although he did not refer to it as IEWMA^{6}.
In other words, the IEWMA covariance matrix forecasting model bridges the gap between the simple EWMA model and the more complex DCC-GARCH model.
Forecasting formulas
Let:
 $n$ be the number of assets in a universe of assets
 $r_t r_t {}^t$, $t=1..T$ the outer products of the asset returns over each of $T$ past periods
 A decay factor $\lambda_{vol} \in [0, 1]$
 A decay factor $\lambda_{cor} \in [0, 1]$
Next period’s asset returns covariance/correlation matrix
The IEWMA covariance matrix forecasting model estimates the next period’s asset returns covariance matrix $\hat{\Sigma}_{T+1}$ and correlation matrix $\hat{C}_{T+1}$ as follows^{3}:
 For each asset $i=1..n$ in the universe of assets
 Forecast the asset one-period-ahead variances $\hat{\sigma}^2_{i,2}$, …, $\hat{\sigma}^2_{i,T+1}$ using an EWMA volatility forecasting model with decay factor $\lambda_{vol}$ and squared asset returns $r_{i,t}^2$, $t=1..T$, as variance proxies

Compute the volatility-standardized asset returns $\tilde{r}_{i,2},…,\tilde{r}_{i,T}$ defined as the asset returns standardized by their one-period-ahead EWMA volatility forecasts
\[\tilde{r}_{i,t} = \frac{r_{i,t}}{\hat{\sigma}_{i,t}}, t = 2..T\]

Compute the diagonal matrix of the assets’ next period’s forecasted volatilities $V_{T+1}$
\[V_{T+1} = \begin{pmatrix} \hat{\sigma}_{1,T+1} & 0 & ... & 0 \\ 0 & \hat{\sigma}_{2,T+1} & ... & 0 \\ ... & ... & ... & ... \\ 0 & 0 & ... & \hat{\sigma}_{n,T+1} \end{pmatrix}\]

Forecast the next period’s asset returns correlation matrix $\hat{C}_{T+1}$ using an EWMA covariance matrix forecasting model with decay factor $\lambda_{cor}$ and outer products of volatility-standardized asset returns $\tilde{r}_t \tilde{r}_t {}^t$, $t=2..T$, as covariance matrix proxies

Compute the next period’s forecasted asset returns covariance matrix $\hat{\Sigma}_{T+1}$
\[\hat{\Sigma}_{T+1} = V_{T+1} \hat{C}_{T+1} V_{T+1}\]
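The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the reference implementation of Johansson et al.^{3} or of Portfolio Optimizer. It assumes the recursive form of the EWMA model, initialized with the first available proxy, and it rescales the EWMA matrix of standardized returns to a unit diagonal so that $\hat{C}_{T+1}$ is a proper correlation matrix. All function names are hypothetical, and the returns are synthetic.

```python
import numpy as np


def ewma_variance_forecasts(returns, lam):
    """One-period-ahead EWMA variance forecasts for periods 2..T+1.

    returns has shape (T, n), with row t-1 holding r_t; output row k is the
    forecast for period k+2.
    """
    forecasts = np.empty_like(returns, dtype=float)
    forecasts[0] = returns[0] ** 2  # assumption: initialize with r_1^2
    for t in range(1, len(returns)):
        forecasts[t] = lam * forecasts[t - 1] + (1.0 - lam) * returns[t] ** 2
    return forecasts


def ewma_matrix_forecast(vectors, lam):
    """Next-period EWMA forecast built from outer products of the rows."""
    forecast = np.outer(vectors[0], vectors[0])  # assumption: first proxy
    for v in vectors[1:]:
        forecast = lam * forecast + (1.0 - lam) * np.outer(v, v)
    return forecast


def iewma_forecast(returns, lam_vol, lam_cor):
    """Return the (covariance, correlation) forecasts for period T+1."""
    variances = ewma_variance_forecasts(returns, lam_vol)
    # Standardized returns for t = 2..T, using the forecast made at t-1.
    standardized = returns[1:] / np.sqrt(variances[:-1])
    proxy_cov = ewma_matrix_forecast(standardized, lam_cor)
    # Assumption: rescale to a unit diagonal to obtain a correlation matrix.
    scale = np.sqrt(np.diag(proxy_cov))
    corr = proxy_cov / np.outer(scale, scale)
    vols = np.sqrt(variances[-1])      # diagonal of V_{T+1}
    cov = corr * np.outer(vols, vols)  # V_{T+1} C_{T+1} V_{T+1}
    return cov, corr


rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(250, 3))  # synthetic returns, 3 assets
cov, corr = iewma_forecast(returns, lam_vol=0.94, lam_cor=0.97)
```

With this initialization, the first few standardized returns are noisy because their volatility forecasts rest on very few observations. A burn-in period, or a return equal to zero in the first row, would need extra handling in a real implementation.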
Next $h$-period-ahead asset returns covariance/correlation matrix
The IEWMA covariance matrix forecasting model estimates the $h$-period-ahead asset returns covariance matrix $\hat{\Sigma}_{T+h}$ and correlation matrix $\hat{C}_{T+h}$, $h \geq 2$, by the next period’s asset returns covariance matrix $\hat{\Sigma}_{T+1}$ and correlation matrix $\hat{C}_{T+1}$.
Indeed, due to the properties of the EWMA volatility and covariance matrix forecasting models^{8}, we have:
 $V_{T+h} = V_{T+1}$, $h \geq 2$
 $\hat{C}_{T+h} = \hat{C}_{T+1}$, $h \geq 2$
So that $\hat{\Sigma}_{T+h} = \hat{\Sigma}_{T+1}$, $h \geq 2$.
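As a sketch of why EWMA forecasts are flat, assume the recursive form of the EWMA covariance model with a generic decay factor $\lambda$, $\hat{\Sigma}_{t+1} = \lambda \hat{\Sigma}_{t} + (1-\lambda) r_t r_t {}^t$. At time $T$, the future proxy $r_{T+1} r_{T+1} {}^t$ is unknown and is replaced by its conditional expectation, that is, by its own forecast $\hat{\Sigma}_{T+1}$:

\[\hat{\Sigma}_{T+2} = \lambda \hat{\Sigma}_{T+1} + (1-\lambda) \mathbb{E}_T \left[ r_{T+1} r_{T+1} {}^t \right] = \lambda \hat{\Sigma}_{T+1} + (1-\lambda) \hat{\Sigma}_{T+1} = \hat{\Sigma}_{T+1}\]

Iterating the same argument gives a constant forecast for every $h \geq 2$. It applies identically to the univariate EWMA variance forecasts and to the EWMA forecast built from volatility-standardized returns.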
Averaged asset returns covariance/correlation matrix over the next $h$ periods
The IEWMA covariance matrix forecasting model estimates the averaged^{9} asset returns covariance matrix $\hat{\Sigma}_{T+1:T+h}$ and correlation matrix $\hat{C}_{T+1:T+h}$ over the next $h$ periods, $h \geq 2$, by the next period’s asset returns covariance matrix $\hat{\Sigma}_{T+1}$ and correlation matrix $\hat{C}_{T+1}$.
Indeed, from the previous subsection, we have:
 $ \hat{\Sigma}_{T+1:T+h} = \frac{1}{h} \sum_{i=1}^{h} \hat{\Sigma}_{T+i} = \hat{\Sigma}_{T+1} $, $h \geq 2$
 $ \hat{C}_{T+1:T+h} = \frac{1}{h} \sum_{i=1}^{h} \hat{C}_{T+i} = \hat{C}_{T+1} $, $h \geq 2$
Rationale
The rationale behind the IEWMA covariance matrix forecasting model is twofold:

Separate volatility forecasting from correlation matrix forecasting while using the same baseline forecasting model for internal consistency.
This first idea is well known among practitioners, cf. for example Menchero and Li^{10}, who describe the usage of different EWMA half-lives for estimating volatilities and correlations^{11}.

Use volatility-standardized asset returns instead of raw asset returns for correlation matrix forecasting.
This second idea, detailed for example in Engle^{6}, originates from the fact that conditional correlations between asset returns are equal to conditional covariances between standardized disturbances, cf. the previous section.
So, given an estimator for the (unobservable) standardized disturbances $\varepsilon_{i,t} = \frac{r_{i,t}}{\sigma_{i,t}}$, $i=1..n$, $t=1..T$, it is possible to estimate the conditional correlations between asset returns.
An example of such an estimator is the volatility-standardized asset returns $\tilde{r}_{i,t} = \frac{r_{i,t}}{\hat{\sigma}_{i,t}}$, with the drawback that the quality of the correlation forecasts is then influenced by the quality of the volatility forecasts.
And unfortunately, because it is well known that […] asset returns […] fat tails are typically reduced but not eliminated when returns are standardized by volatilities estimated from popular [volatility forecasting] models^{12} and because the use of correlation as a measure of dependence can be misleading in the case of (conditionally) non-Gaussian returns^{13}, we should not expect any magic here…
How to choose the decay factors?
Due to its relationship with the vanilla EWMA volatility and covariance matrix forecasting models, there are two main procedures to choose the decay factors $\lambda_{vol}$ and $\lambda_{cor}$ of an IEWMA covariance matrix forecasting model:

Using recommended values from the EWMA models literature (0.94, 0.97…).
On this, Johansson et al.^{3} note that empirical studies on real return data confirm that choosing a faster volatility half-life than correlation half-life yields better estimates^{6} and use the following pairs of decay factors $\left(\lambda_{vol}, \lambda_{cor}\right)$ in their experimental setup:
 Short term - $\left(0.870,0.933\right)$, $\left(0.933,0.967\right)$
 Medium term - $\left(0.967,0.989\right)$, $\left(0.989,0.994\right)$, $\left(0.994,0.997\right)$
 Long term - $\left(0.997,0.998\right)$, $\left(0.998,0.999\right)$

Determining the optimal values w.r.t. the forecast horizon $h$, for example through the minimization of the root mean square error (RMSE) between the forecasted covariance matrix over the desired horizon and the observed covariance matrix over that horizon^{14}.
In practice, because there are two decay factors to choose, this can be done in two ways:

Either consider the two decay factors as two independent univariate parameters $\lambda_{vol} \in [0,1]$ and $\lambda_{cor} \in [0,1]$.
This choice is justified by the original desire to separate volatility forecasting from correlation matrix forecasting.

Or consider the two decay factors as a single multivariate parameter $\left(\lambda_{vol}, \lambda_{cor}\right) \in [0,1]^2$.
This choice is justified by the observed dependency of the correlation forecasts on the volatility-standardized asset returns.
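To make the second procedure concrete, here is a hedged sketch of a joint grid search over $\left(\lambda_{vol}, \lambda_{cor}\right)$. It is not the proprietary procedure of Portfolio Optimizer. It assumes a forecast horizon of one period, uses the outer product of the realized returns as the observed covariance matrix, and measures the error with the Frobenius norm. The compact IEWMA forecaster, the function names, the grids and the burn-in length are all illustrative choices.

```python
import itertools

import numpy as np


def iewma_cov(returns, lam_vol, lam_cor):
    """Compact IEWMA next-period covariance forecast (recursive EWMA form)."""
    var = returns[0] ** 2  # assumption: initialize with the first proxy
    vols = []
    for r in returns[1:]:
        vols.append(np.sqrt(var))  # volatility forecast for the period of r
        var = lam_vol * var + (1.0 - lam_vol) * r ** 2
    std = returns[1:] / np.array(vols)  # volatility-standardized returns
    c = np.outer(std[0], std[0])
    for s in std[1:]:
        c = lam_cor * c + (1.0 - lam_cor) * np.outer(s, s)
    d = np.sqrt(np.diag(c))
    v = np.sqrt(var)  # next period's volatility forecasts
    return (c / np.outer(d, d)) * np.outer(v, v)


def forecast_rmse(returns, lam_vol, lam_cor, burn_in):
    """Walk-forward RMSE between forecasts and outer-product proxies."""
    errors = []
    for t in range(burn_in, len(returns)):
        forecast = iewma_cov(returns[:t], lam_vol, lam_cor)  # data up to t-1
        proxy = np.outer(returns[t], returns[t])
        errors.append(np.sum((forecast - proxy) ** 2))
    return np.sqrt(np.mean(errors))


def select_decay_factors(returns, grid_vol, grid_cor, burn_in=60):
    """Return (rmse, lam_vol, lam_cor) minimizing the walk-forward RMSE."""
    candidates = itertools.product(grid_vol, grid_cor)
    return min((forecast_rmse(returns, lv, lc, burn_in), lv, lc)
               for lv, lc in candidates)


rng = np.random.default_rng(1)
returns = rng.normal(scale=0.01, size=(120, 3))  # synthetic returns
grid_vol, grid_cor = (0.90, 0.94, 0.97), (0.94, 0.97, 0.99)
best_rmse, best_vol, best_cor = select_decay_factors(returns, grid_vol, grid_cor)
```

Treating the two decay factors as independent parameters would amount to two one-dimensional searches, one per EWMA model, instead of the Cartesian product used here.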

Extensions of the iterated exponentially weighted moving average covariance matrix forecasting model
Asset-specific volatility decay factors
The IEWMA covariance matrix forecasting model uses univariate EWMA models in order to forecast asset volatilities, all these models sharing the same decay factor $\lambda_{vol}$.
Having an identical decay factor for all assets is parsimonious, but is somewhat at odds with the DCC-GARCH model of Engle^{6} which uses asset-specific univariate GARCH models - that is, each with its own asset-specific parameters - in order to forecast asset volatilities.
So, one natural extension of the IEWMA model is to allow asset-specific univariate EWMA models - each with its own asset-specific decay factor $\lambda_{i,vol}$, $i=1..n$ - in order to forecast asset volatilities.
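In code, this extension can be as small as letting the decay factor be a vector. The sketch below assumes the recursive EWMA form initialized with the first squared return; the function name and the decay factor values are hypothetical.

```python
import numpy as np


def ewma_variance_forecast(returns, lam_vol):
    """Next-period EWMA variance forecasts with a scalar or per-asset decay.

    returns has shape (T, n); lam_vol is a scalar or an array of length n.
    """
    lam = np.broadcast_to(np.asarray(lam_vol, dtype=float), returns.shape[1:])
    var = returns[0] ** 2  # assumption: initialize with the first proxy
    for r in returns[1:]:
        var = lam * var + (1.0 - lam) * r ** 2  # element-wise, one lambda per asset
    return var


rng = np.random.default_rng(2)
returns = rng.normal(scale=0.01, size=(100, 3))  # synthetic returns
shared = ewma_variance_forecast(returns, 0.94)
specific = ewma_variance_forecast(returns, [0.90, 0.94, 0.97])
```

The rest of the IEWMA procedure is unchanged: the asset-specific volatility forecasts are simply used to standardize the returns and to fill the diagonal of $V_{T+1}$.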
Linear combination of IEWMA covariance matrix forecasting models
Another interesting covariance matrix forecasting model is introduced in Johansson et al.^{3} as the combined multiple iterated exponentially weighted moving average (CM-IEWMA) model, which consists of a time-varying linear combination of individual IEWMA models, each with its own pair of fixed decay factors.
As explained in Johansson et al.^{3}:
The CM-IEWMA predictor is constructed from a modest number of IEWMA predictors, with different pairs of half-lives, which are combined using dynamically varying weights that are based on recent performance.
The rationale behind the CM-IEWMA model is that different pairs of half-lives may work better for different market conditions^{3}, with short half-lives [typically performing] better in volatile markets [and] long half-lives [performing] better for calm markets where conditions are changing slowly^{3}.
This behaviour is illustrated in Figure 1, taken from Johansson et al.^{3}, which shows the evolution of the weights of a 5-IEWMA CM-IEWMA covariance matrix forecasting model applied to a universe of U.S. stocks.
From Figure 1, it is visible that although substantial weight is put on the slower (longer half-life) IEWMAs most years^{3}, the CM-IEWMA model still adapts the weights depending on market conditions^{3}.
The interested reader is referred to Johansson et al.^{3} for all the technicalities of the CM-IEWMA model, and in particular for the details about the computation of the dynamically varying weights of the individual IEWMA models through the resolution of a convex optimization problem^{15}.
A last important remark to conclude this subsection: the CM-IEWMA model is actually a special case of [a more general] dynamically weighted prediction [model]^{3}, so that the same weighting logic can be applied to any combination of covariance matrix forecasting models.
Implementations
Implementation in Portfolio Optimizer
Portfolio Optimizer implements the IEWMA covariance and correlation matrix forecasting model through the endpoints /assets/covariance/matrix/forecast/iewma and /assets/correlation/matrix/forecast/iewma.
These endpoints support the 2 covariance proxies below:
 Squared (close-to-close) returns
 Demeaned squared (close-to-close) returns
These endpoints also allow:
 To use asset-specific univariate EWMA models in order to forecast asset volatilities
 To automatically determine the optimal value of their parameters (the decay factors $\lambda_{vol}$ and $\lambda_{cor}$) using a proprietary procedure.
Note that Portfolio Optimizer does not provide any implementation of the CM-IEWMA model^{16}, but cf. the next subsection.
Implementation elsewhere
Johansson et al.^{3} kindly provides an open source Python implementation of:
 The IEWMA covariance matrix forecasting model
 The CM-IEWMA covariance matrix forecasting model
 The general “covariance matrix forecasting models” combination model
at https://github.com/cvxgrp/cov_pred_finance.
I definitely encourage anyone interested in the CM-IEWMA model to play with this code!
Example of usage - Covariance matrix forecasting at monthly level for a portfolio of various ETFs
As an example of usage, I propose to evaluate the empirical performance of the IEWMA covariance matrix forecasting model within the framework of the previous blog post, whose aim is to forecast monthly covariance and correlation matrices for a portfolio of 10 ETFs representative^{17} of miscellaneous asset classes:
 U.S. stocks (SPY ETF)
 European stocks (EZU ETF)
 Japanese stocks (EWJ ETF)
 Emerging markets stocks (EEM ETF)
 U.S. REITs (VNQ ETF)
 International REITs (RWX ETF)
 U.S. 7-10 year Treasuries (IEF ETF)
 U.S. 20+ year Treasuries (TLT ETF)
 Commodities (DBC ETF)
 Gold (GLD ETF)
Results - Covariance matrix forecasting
Results over the period 31st January 2008 - 31st July 2023^{18} for covariance matrices are the following^{19}:
| Covariance matrix model | Covariance matrix MSE |
|---|---|
| SMA, window size of all the previous months (historical average model) | 9.59 $10^{-6}$ |
| SMA, window size of the previous year | 9.08 $10^{-6}$ |
| EWMA, optimal^{20} $\lambda$ | 6.52 $10^{-6}$ |
| EWMA, $\lambda = 0.97$ | 6.37 $10^{-6}$ |
| IEWMA, $\left(\lambda_{vol},\lambda_{cor}\right) = \left(0.97,0.99\right)$ | 6.35 $10^{-6}$ |
| IEWMA, optimal^{20} $\lambda_{i,vol}$, $i=1..n$ and $\lambda_{cor}$ | 6.33 $10^{-6}$ |
| IEWMA, optimal^{20} $\left(\lambda_{vol},\lambda_{cor}\right)$ | 6.16 $10^{-6}$ |
| SMA, window size of the previous month (random walk model) | 6.06 $10^{-6}$ |
| EWMA, $\lambda = 0.94$ | 5.78 $10^{-6}$ |
| IEWMA, $\left(\lambda_{vol},\lambda_{cor}\right) = \left(0.94,0.97\right)$ | 5.78 $10^{-6}$ |
Within this specific evaluation framework, the IEWMA covariance matrix forecasting model unfortunately does not seem to improve upon the previous models despite the added complexity.
It is also noteworthy that the IEWMA model with asset-specific univariate EWMA models for volatility (line #6) does not perform better than the vanilla IEWMA model (line #7) when using automatically determined parameters^{21}.
Results - Correlation matrix forecasting
Results over the period 31st January 2008 - 31st July 2023^{18} for the correlation matrices associated to the covariance matrices of the previous subsection are the following^{19}:
| Covariance matrix model | Correlation matrix MSE |
|---|---|
| SMA, window size of the previous month (random walk model) | 8.19 |
| SMA, window size of all the previous months (historical average model) | 8.10 |
| EWMA, $\lambda = 0.94$ | 7.67 |
| SMA, window size of the previous year | 6.50 |
| EWMA, $\lambda = 0.97$ | 6.36 |
| EWMA, optimal^{20} $\lambda$ | 5.87 |
| IEWMA, $\left(\lambda_{vol},\lambda_{cor}\right) = \left(0.94,0.97\right)$ | 5.85 |
| IEWMA, $\left(\lambda_{vol},\lambda_{cor}\right) = \left(0.97,0.99\right)$ | 5.85 |
| IEWMA, optimal^{20} $\lambda_{i,vol}$, $i=1..n$ and $\lambda_{cor}$ | 5.72 |
| IEWMA, optimal^{20} $\left(\lambda_{vol},\lambda_{cor}\right)$ | 5.70 |
This time, the IEWMA model - and especially the two IEWMA models using automatically determined parameters (lines #9-#10) - does exhibit slightly better performance than all the previous models.
Nevertheless, the improvement over the previously best-performing model (line #6 - the EWMA model with automatically determined parameter) is not that impressive…
So, the idea of removing the impact of volatility on asset returns in order to better estimate asset correlations seems to have some merit, but the EWMA volatility forecasting model might be insufficient to fully exploit this idea.
Also, here again, the performance of the IEWMA model with asset-specific univariate EWMA models for volatility is strictly worse than that of the vanilla IEWMA model^{21}.
Conclusion
One of the main characteristics of the IEWMA covariance matrix forecasting model of Johansson et al.^{3} is to forecast asset correlations using (EWMA-)volatility-standardized asset returns instead of raw asset returns, so that the impact of asset volatilities on their correlations is (tentatively) minimized.
Unfortunately, the empirical performance of that model in terms of correlation matrix forecasting is not that different from that of the EWMA model, which raises the question of whether improving the volatility-standardization might lead to better correlation forecasts.
This will be the subject of a future blog post in this series.
As usual, feel free to connect with me on LinkedIn or to follow me on Twitter.
–

ChatGPT-generated, as can be seen by the signature word “delve” :) !

At the date of the initial publication of this blog post.

See Kasper Johansson, Mehmet G. Ogut, Markus Pelger, Thomas Schmelzer and Stephen Boyd (2023), A Simple Method for Predicting Covariance Matrices of Financial Returns, Foundations and Trends in Econometrics: Vol. 12: No. 4, pp 324-407.

See Valeriy Zakamulin, A Test of Covariance-Matrix Forecasting Methods, The Journal of Portfolio Management Spring 2015, 41 (3) 97-108.

See Patton, A.J., Sheppard, K. (2009). Evaluating Volatility and Correlation Forecasts. In: Mikosch, T., Kreiß, J.-P., Davis, R., Andersen, T. (eds) Handbook of Financial Time Series. Springer, Berlin, Heidelberg.

See Engle, R. (2002). Dynamic Conditional Correlation. Journal of Business & Economic Statistics. 20(3): 339–350.

That being said, I find that what Engle^{6} describes is closer to an iterated $n$-univariate GARCH volatility forecasting models/EWMA correlation matrix forecasting model than to the iterated EWMA forecasting model of Johansson et al.^{3}…

See Gianluca De Nard, Robert F. Engle, Olivier Ledoit, Michael Wolf, Large dynamic covariance matrices: Enhancements based on intraday data, Journal of Banking & Finance, Volume 138, 2022, 106426.

See Menchero, Jose and Peng Li. Correlation Shrinkage: Implications for Risk Forecasting, Journal of Investment Management (2020).

Digging a little deeper, the old MAC1 and MAC2 Bloomberg multi-asset risk models were using a half-life of 26 weeks for estimating volatilities and a half-life of 52 weeks for estimating correlations.

See Andersen, Torben G., Tim Bollerslev, Francis X. Diebold, and Paul Labys, 2000, Exchange Rate Returns Standardized by Realized Volatility are (Nearly) Gaussian, Multinational Finance Journal 4, 159-179.

See Bahram Pesaran, M. Hashem Pesaran, Conditional volatility and correlations of weekly returns and the VaR analysis of 2008 stock market crash, Economic Modelling, Volume 27, Issue 6, 2010, Pages 1398-1416.

See RiskMetrics. Technical Document, J.P.Morgan/Reuters, New York, 1996. Fourth Edition.

The maximization of the average log-likelihood of the combined [covariance matrix] prediction over [a trailing number of periods]^{3}.

This is because my own tests did not highlight any strong improvement in terms of forecasting ability vs. the IEWMA model when used with an optimal pair of decay factors.

These ETFs are used in the Adaptive Asset Allocation strategy from ReSolve Asset Management, described in the paper Adaptive Asset Allocation: A Primer^{22}.

(Adjusted) daily prices have been retrieved using Tiingo.

Using the outer product of asset returns - assuming a mean return of 0 - as covariance proxy, and using an expanding historical window of asset returns.

The optimal decay factors ($\lambda$, $\lambda_{vol}$, $\lambda_{cor}$) are computed at the end of every month using all the available asset returns history up to that point in time, as implemented in Portfolio Optimizer; thus, there is no look-ahead bias.

The difference between the two models is probably just noise, which, at worst, implies that the added complexity of the IEWMA model with asset-specific univariate EWMA models is useless in practice.

See Butler, Adam and Philbrick, Mike and Gordillo, Rodrigo and Varadi, David, Adaptive Asset Allocation: A Primer.