Journal of Financial Econometrics Vol. 2, No. 1, pp. 130-168
© 2004 Oxford University Press; all rights reserved.
On the Out-of-Sample Importance of Skewness and Asymmetric Dependence for Asset Allocation
London School of Economics
Address correspondence to Andrew J. Patton, Financial Markets Group, London School of Economics, Houghton Street, London WC2A 2AE, UK, or e-mail: a.patton{at}lse.ac.uk.
ABSTRACT
Recent studies in the empirical finance literature have reported evidence of two types of asymmetries in the joint distribution of stock returns. The first is skewness in the distribution of individual stock returns. The second is an asymmetry in the dependence between stocks: stock returns appear to be more highly correlated during market downturns than during market upturns. In this article we examine the economic and statistical significance of these asymmetries for asset allocation decisions in an out-of-sample setting. We consider the problem of a constant relative risk aversion (CRRA) investor allocating wealth between the risk-free asset, a small-cap portfolio, and a large-cap portfolio. We use models that can capture time-varying moments up to the fourth order, and we use copula theory to construct models of the time-varying dependence structure that allow for different dependence during bear markets than bull markets. The importance of these two asymmetries for asset allocation is assessed by comparing the performance of a portfolio based on a normal distribution model with a portfolio based on a more flexible distribution model. For investors with no short-sales constraints, we find that knowledge of higher moments and asymmetric dependence leads to gains that are economically significant and statistically significant in some cases. For short sales-constrained investors the gains are limited.
KEYWORDS: asymmetry, copulas, density forecasting, forecasting, normality, stock returns
Recent studies in the empirical finance literature have reported evidence of two types of asymmetries in the joint distribution of stock returns. The first is skewness or asymmetry in the distribution of individual stock returns, which has been reported by numerous authors over the last three decades.1 Evidence that stock returns exhibit some form of asymmetric dependence has been reported by several authors in recent years [see Erb, Harvey, and Viskanta (1994), Longin and Solnik (2001), Ang and Bekaert (2002), Ang and Chen (2002), Campbell, Koedijk, and Kofman (2002), and Bae, Karolyi, and Stulz (2003)]. The presence of either of these asymmetries violates the assumption of elliptically distributed asset returns, which underlies traditional mean-variance analysis [see Ingersoll (1987)]. In this article we examine the economic and statistical significance of these two asymmetries for asset allocation decisions in an out-of-sample setting. This article can thus be viewed as an attempt to address the suggestions of Harvey and Siddique (1999) and Longin and Solnik (2001), who propose investigating the impact of conditional skewness (Harvey and Siddique) and asymmetric dependence (Longin and Solnik) on portfolio choices.
Theoretical justification for the importance of distributional asymmetries may be found in Arrow (1971), who suggests that a desirable property of a utility function is that it exhibits nonincreasing absolute risk aversion.2 Under non-increasing absolute risk aversion investors can be shown to have a preference for positively skewed portfolios. The skewness of a portfolio of two assets is a function of the skewness of the individual assets, and two "coskewness" terms. Asymmetry in the dependence structure can be shown [see Patton (2002)] to lead to nonzero coskewness and thus impact the skewness of the portfolio return. This suggests that risk-averse investors will have preferences over alternative dependence structures. Ang, Chen, and Xing (2002) report empirical evidence in support of this.
We examine the problem of an investor with constant relative risk aversion (CRRA) allocating wealth between the risk-free asset and the Center for Research in Security Prices (CRSP) small cap and large cap indices, which comprise the 1st and 10th deciles of U.S. stocks sorted by market capitalization. We use monthly data from January 1954 to December 1989 to develop the models, and data from January 1990 to December 1999 for out-of-sample forecast evaluation. This problem is representative of that of choosing between a high-risk, high-return asset and a lower-risk, lower-return asset, as the annualized mean and standard deviation on these indices over the sample were 9.95% and 21.29% for the small caps, and 7.97% and 14.29% for the large caps. Our motivation for studying a problem involving two stocks rather than a stock and a bond, as in numerous previous studies, is that evidence of asymmetric dependence has so far been reported only for equity returns. The presence or absence of asymmetric dependence between equity and bond returns is yet to be established.
We use models of the asset returns that can capture the empirically observed time-varying means and variances of stock returns, and also the presence of (possibly time-varying) skewness and kurtosis, as in Hansen (1994) and Jondeau and Rockinger (2003). Further, we employ models of the dependence structure (or copula) that allow for, but do not impose, different dependence during bear markets than during bull markets, and that allow for changes in this dependence structure through time. A thorough introduction to copula theory is presented in Schweizer and Sklar (1983), Joe (1997), and Nelsen (1999).
The importance of skewness and asymmetric dependence for asset allocation is measured by comparing the performance of a portfolio based on a bivariate normal distribution model with a portfolio based on a model developed using copula theory. We compute the amount that an investor could be charged to make him/her indifferent between two competing portfolios, as in West, Edison, and Cho (1993), Ang and Bekaert (2002), and others. The significance of the differences in portfolio performance is tested using bootstrap methods. We find evidence in most cases that nonnormalities in the marginal distributions and copula do have important economic implications for asset allocation; however, the statistical significance of the improvement is only moderate. Gains are generally only present for investors that are not short-sales constrained, such as hedge funds.
This article is essentially trying to test three hypotheses simultaneously: (1) Are these asymmetries present in this dataset? (2) Are these asymmetries predictable out-of-sample? (3) Can we make better portfolio decisions by using forecasts of these asymmetries than we can by ignoring them? If the answer to any of these questions is "no," then we would conclude that the out-of-sample importance of these asymmetries for asset allocation is zero. The distinction between in-sample and out-of-sample significance is an important one. Finding that a more flexible distribution model fits the data better in-sample does not imply that it will lead to better out-of-sample portfolio decisions than those based on a simpler model. In fact, a common finding in the point forecasting literature is that more complicated models often provide poorer forecasts than simple misspecified models [see Weigend and Gershenfeld (1994), Swanson and White (1995, 1997), and Stock and Watson (1999)].
In this article we consider both unconstrained and short sales-constrained estimates of the optimal portfolio weight. The first reason for doing so is economically motivated: many market participants face the constraint that they are unable to short sell stocks or to borrow and invest the proceeds in stocks, while others, such as hedge funds, actively take both long and short positions. The second reason is statistically motivated: the optimal portfolio weight given a density forecast is itself only an estimate of the true optimal portfolio weight. By ensuring that our estimate always lies in the interval [0, 1], we employ a type of "insanity filter" that prevents the investor from taking an extreme position in the market. Such constraints have been found to improve the out-of-sample performance of optimal portfolios based on parameter estimates [see Frost and Savarino (1988) and Jagannathan and Ma (2002)]. One could also consider an intermediate filter that allows for some limited amount of short selling, but we do not explore such a possibility here.
Much of the existing work on asset allocation has focused on special cases where the combination of utility function and distribution model is such that an analytical solution for the optimal portfolio decision exists [see Kandel and Stambaugh (1996) or Campbell and Viceira (1999), among others]. Brandt (1999) and Aït-Sahalia and Brandt (2001) overcome the problem of the appropriate distributional assumption to combine with a given utility function by using the method of moments and the first-order conditions of the investor's optimization problem to obtain an optimal portfolio decision. Detemple, Garcia, and Rindisbacher (2003) present a sophisticated new method for finding optimal portfolio weights from empirically relevant models. In this article we combine density models that are shown to adequately describe the statistical properties of the asset returns with the CRRA utility function.
One of the costs of using flexible parametric models for the joint distribution of stock returns is that we are forced by computational constraints to be relatively unsophisticated in other aspects of the project. First, we ignore the effects of parameter estimation uncertainty on the investor's decision problem, though this was found to be important by Kandel and Stambaugh (1996). Also, we only consider the investor's problem for the one-period-ahead investment horizon, thus ignoring the hedging component of the optimal portfolio weight [see Merton (1971)]. Empirical evidence on the importance of the hedging component is mixed: Brandt (1999), Campbell and Viceira (1999), and Detemple, Garcia, and Rindisbacher (2003) find it to be important, whereas Aït-Sahalia and Brandt (2001) and Ang and Bekaert (2002) find only weak evidence.
The remainder of the article is structured as follows. In Section 1 we provide a brief introduction to copula theory and its use in the density forecasting of stock returns. In Section 2 we present the investor's decision problem in detail. Section 3 presents the empirical results on the asset allocation problem for a portfolio of a small-cap index and a large-cap index: the models employed, comparisons of portfolio weights, and tests for improvements in portfolio performance. We conclude in Section 4. In Appendix A we present some details of the optimization procedure and in Appendix B we provide the functional forms of the copula models considered.
1 FLEXIBLE MULTIVARIATE DISTRIBUTION MODELS USING COPULAS
In this article we use copula theory to develop flexible parametric models of the joint distribution of returns. Suppose we have two (scalar) random variables of interest, $X_t$ and $Y_t$, and some exogenous variables $W_t$. The variables' joint conditional distribution is $(X_t, Y_t) \mid \mathcal{F}_{t-1} \sim H_t = C_t(F_t, G_t)$, where $H_t$ is some conditional bivariate distribution function, with conditional univariate distributions of $X_t$ and $Y_t$ being $F_t$ and $G_t$, the conditional copula being $C_t$, and $\mathcal{F}_{t-1}$ is the information set defined as $\mathcal{F}_t \equiv \sigma(Z_t)$, for $Z_t \equiv [X_t, Y_t, W_t', X_{t-1}, Y_{t-1}, W_{t-1}', \ldots, X_{t-j}, Y_{t-j}, W_{t-j}']'$. We will denote the distribution (cdf) of a random variable using an uppercase letter and the corresponding density (pdf) using a lowercase letter. A copula is any multivariate distribution function that has Uniform (0,1) marginal distributions. It links together two (or more) marginal distributions to form a joint distribution. The marginal distributions that it couples can be of any type: a normal and an exponential, or a Student's t and a Uniform, for example. The theory of copulas dates back to Sklar (1959) and since then numerous applications have appeared in the statistics literature and more recently also in the analysis of economic data.3 The main theorem in copula theory is that of Sklar (1959), presented below for the conditional case. For an introduction to copula theory see Joe (1997) and Nelsen (1999).
Theorem 1 (Sklar's theorem for continuous conditional distributions). Let F be the conditional distribution of X|Z, G be the conditional distribution of Y|Z, and H be the joint conditional distribution of (X, Y)|Z. Assume that F and G are continuous in x and y, and let $\mathcal{Z}$ be the support of Z. Then there exists a unique conditional copula C such that
$$H(x, y \mid z) = C\big(F(x \mid z),\, G(y \mid z) \mid z\big), \quad \forall (x, y) \in \bar{\mathbb{R}} \times \bar{\mathbb{R}} \text{ and each } z \in \mathcal{Z}. \tag{1}$$
Sklar's theorem allows us to decompose a bivariate distribution, Ht, into three components: the two marginal distributions, Ft and Gt, and the copula, Ct. The density function equivalent of Equation (1) is obtained quite easily, provided that Ft and Gt are differentiable, and Ht and Ct are twice differentiable:
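$$h_t(x, y \mid z) = f_t(x \mid z) \cdot g_t(y \mid z) \cdot c_t(u, v \mid z), \tag{2}$$
where $u \equiv F_t(x \mid z)$, and $v \equiv G_t(y \mid z)$. Taking logs of both sides we obtain
$$\log h_t(x, y \mid z) = \log f_t(x \mid z) + \log g_t(y \mid z) + \log c_t(u, v \mid z). \tag{3}$$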
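The log decomposition in Equation (3) can be checked numerically in a simple special case. The following is a minimal sketch, assuming standard normal margins and a normal (Gaussian) copula with a fixed correlation, so that the joint density is the standard bivariate normal; the function names and the evaluation point are illustrative and are not taken from the article.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed margin for both variables in this sketch


def gaussian_copula_logpdf(u, v, rho):
    """Log-density of the normal copula with correlation rho at (u, v)."""
    a = STD_NORMAL.inv_cdf(u)
    b = STD_NORMAL.inv_cdf(v)
    one_minus_r2 = 1.0 - rho * rho
    quad = rho * rho * (a * a + b * b) - 2.0 * rho * a * b
    return -0.5 * math.log(one_minus_r2) - quad / (2.0 * one_minus_r2)


def joint_logpdf_via_copula(x, y, rho):
    """Equation (3): two marginal log-densities plus the copula log-density."""
    log_f = math.log(STD_NORMAL.pdf(x))
    log_g = math.log(STD_NORMAL.pdf(y))
    u, v = STD_NORMAL.cdf(x), STD_NORMAL.cdf(y)
    return log_f + log_g + gaussian_copula_logpdf(u, v, rho)


def bivariate_normal_logpdf(x, y, rho):
    """Direct log-density of the standard bivariate normal, for comparison."""
    one_minus_r2 = 1.0 - rho * rho
    quad = (x * x - 2.0 * rho * x * y + y * y) / one_minus_r2
    return -math.log(2.0 * math.pi) - 0.5 * math.log(one_minus_r2) - 0.5 * quad


x, y, rho = 0.3, -1.2, 0.5  # hypothetical evaluation point
lhs = joint_logpdf_via_copula(x, y, rho)
rhs = bivariate_normal_logpdf(x, y, rho)
print(abs(lhs - rhs))  # difference should be at the level of rounding error
```

Replacing either margin or the copula changes only the corresponding term of the sum, which is what makes the decomposition convenient for model building.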
2 THE INVESTOR'S OPTIMIZATION PROBLEM
The utility functions we assume for our hypothetical investors are from the class of CRRA utility functions:
$$U(\gamma) = \begin{cases} (1-\gamma)^{-1}\big(P_0\,(1 + \omega_x X_t + \omega_y Y_t)\big)^{1-\gamma} & \text{if } \gamma \neq 1 \\ \log\big(P_0\,(1 + \omega_x X_t + \omega_y Y_t)\big) & \text{if } \gamma = 1, \end{cases} \tag{4}$$
where $P_0$ is the initial wealth and $\omega_i$ is the proportion of wealth in asset i. The degree of relative risk aversion (RRA) is denoted by $\gamma$. For this utility function, the initial wealth does not affect the choice of optimal weight and so we set $P_0 = 1$. We consider five different levels of relative risk aversion: $\gamma$ = 1, 3, 7, 10, and 20. A similar range of risk aversion levels was also considered in Campbell and Viceira (1999) and Aït-Sahalia and Brandt (2001). While there exist other utility functions that place higher weight on tail events or asymmetries in the distribution of payoffs, we focus on the CRRA utility because of its prominence in the finance literature. If gains are found using the CRRA utility function then they may be thought of as a conservative estimate of the possible gains using other, more sensitive, utility functions.
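As an illustration of the CRRA class, the sketch below evaluates the utility of end-of-period wealth for the five risk aversion levels considered. The function names, portfolio weights, and excess returns are hypothetical values chosen only for the example.

```python
import math


def crra_utility(wealth, gamma):
    """CRRA utility of wealth; the log case applies when gamma equals 1."""
    if wealth <= 0.0:
        raise ValueError("CRRA utility is undefined for non-positive wealth")
    if gamma == 1.0:
        return math.log(wealth)
    return wealth ** (1.0 - gamma) / (1.0 - gamma)


def end_of_period_wealth(w_x, w_y, x, y, p0=1.0):
    """Wealth after one period, given weights and excess returns on two assets."""
    return p0 * (1.0 + w_x * x + w_y * y)


# Hypothetical weights and excess returns, for illustration only.
wealth = end_of_period_wealth(w_x=0.5, w_y=0.5, x=0.02, y=-0.01)
for gamma in (1.0, 3.0, 7.0, 10.0, 20.0):
    print(gamma, crra_utility(wealth, gamma))
```

The guard against non-positive wealth reflects the fact that CRRA utility is not defined at or below zero wealth.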
The setup of the investor's problem is as follows. Let the excess returns on the two risky assets under consideration be denoted $X_t$ and $Y_t$, with some joint distribution, $H_t$, with associated marginal distributions, $F_t$ and $G_t$, and copula, $C_t$. We will develop density forecasts of this joint distribution, built from forecasts of the marginal distributions, $\hat{F}_{t+1}$ and $\hat{G}_{t+1}$, and the conditional copula, $\hat{C}_{t+1}$, and use them to compute the optimal weights, $\omega^*_{t+1}$, for the portfolio. The optimal weights are found by maximizing the expected utility of the end-of-period wealth under the estimated probability density:
$$\omega^*_{t+1} \equiv \arg\max_{\omega \in \mathcal{W}} E_{\hat{H}_{t+1}}\big[U(1 + \omega_x X_{t+1} + \omega_y Y_{t+1})\big] = \arg\max_{\omega \in \mathcal{W}} \iint U(1 + \omega_x x + \omega_y y)\, \hat{f}_{t+1}(x)\, \hat{g}_{t+1}(y)\, \hat{c}_{t+1}\big(\hat{F}_{t+1}(x), \hat{G}_{t+1}(y)\big)\, dx\, dy, \tag{5}$$
where $\mathcal{W}$ is some compact subset of $\mathbb{R}^2$ for the unconstrained investor and $\mathcal{W} = \{(\omega_x, \omega_y) \in [0, 1]^2 : \omega_x + \omega_y \le 1\}$ for the short sales-constrained investor. The investor is assumed to estimate the model of the conditional distribution of excess returns using maximum likelihood and then optimize the portfolio's weight using the predicted conditional distribution of returns. Work from the forecasting and estimation literature suggests that the parameter estimation stage and the forecast evaluation stage should both use the same objective function [see Granger (1969), Weiss (1996), and Skouras (2001)]. We use maximum-likelihood estimation for computational tractability.
The double integral defining the expected utility of wealth does not have a closed-form solution for our case. We use 10,000 Monte Carlo replications to estimate the value of this integral, which must be done for each point in the out-of-sample period. The objective function was found to be well behaved (smooth and having a unique global optimum) for our choices of utility functions and density models and so we employed the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm to locate the optimum, $\omega^*_{t+1}$, at each point in time. Further details on this procedure may be found in Appendix A.
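The Monte Carlo step can be sketched as follows. This is a simplified illustration, not the procedure of Appendix A: it assumes bivariate normal draws in place of the copula-based density forecast, uses a coarse grid search over the short sales-constrained set in place of BFGS, and all parameter values (means, standard deviations, correlation, seed) are hypothetical.

```python
import math
import random


def crra_utility(wealth, gamma):
    """CRRA utility of wealth; the log case applies when gamma equals 1."""
    if gamma == 1.0:
        return math.log(wealth)
    return wealth ** (1.0 - gamma) / (1.0 - gamma)


def simulate_excess_returns(n, mu_x, mu_y, sd_x, sd_y, rho, seed=0):
    """Draw n pairs from a bivariate normal (a stand-in for the density forecast)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mu_x + sd_x * z1
        y = mu_y + sd_y * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        draws.append((x, y))
    return draws


def expected_utility(w_x, w_y, draws, gamma):
    """Monte Carlo estimate of expected utility, with initial wealth set to 1."""
    total = 0.0
    for x, y in draws:
        total += crra_utility(1.0 + w_x * x + w_y * y, gamma)
    return total / len(draws)


def best_constrained_weights(draws, gamma, step=0.1):
    """Grid search over weights in [0, 1]^2 with w_x + w_y <= 1."""
    n_steps = int(round(1.0 / step))
    best = None
    for i in range(n_steps + 1):
        for j in range(n_steps + 1 - i):
            w_x, w_y = i * step, j * step
            value = expected_utility(w_x, w_y, draws, gamma)
            if best is None or value > best[0]:
                best = (value, w_x, w_y)
    return best[1], best[2]


# Hypothetical monthly parameters; 10,000 replications as in the text.
draws = simulate_excess_returns(10_000, 0.008, 0.006, 0.06, 0.04, 0.7, seed=42)
w_x, w_y = best_constrained_weights(draws, gamma=7.0)
print(w_x, w_y)
```

Reusing the same set of draws for every candidate weight keeps the estimated objective smooth in the weights, which is what a gradient-based optimizer such as BFGS would require.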
One concern that may arise in this design is the existence of $\omega^*_{t+1}$ for certain density models. Given the CRRA utility, any density model that assigns positive probability to the case of bankruptcy would preclude the existence of $\omega^*_{t+1}$. All of the above specifications will assign some (extremely small) positive probability to bankruptcy. We deal with this by modifying the left tail of the distribution: we apply a logistic transformation to the lower tail of the portfolio return distribution so that all probability mass assigned to the region $(-\infty, \varepsilon)$ is relocated to the region $(0, \varepsilon)$, where $\varepsilon$ is some extremely small positive number.
3 A PORTFOLIO OF SMALL CAP AND LARGE CAP STOCKS
In this section we consider an investor with constant relative risk aversion facing the problem of allocating wealth between two assets: a portfolio of low market capitalization stocks ("small caps") and a portfolio of high market capitalization stocks ("large caps"). These two assets were chosen as being representative of the general problem of balancing a portfolio composed of a high-risk, high-return asset and a lower-risk, lower-return asset.
3.1 Description of the Data
We use monthly data from the CRSP on the top 10% and bottom 10% of stocks sorted by market capitalization to form the "large cap" and "small cap" indices, from January 1954 to December 1999, yielding 552 observations. These data were also analyzed in a different context by Perez-Quiros and Timmermann (2001). We reserve the last 120 observations, from January 1990 to December 1999, for the out-of-sample evaluation of the models. Descriptive statistics on the two portfolios are presented in Table 1.
The small cap index generally exhibited slightly positive skewness, while the large cap index exhibited negative skewness. Both indices exhibited excess kurtosis. The Jarque-Bera statistic indicates that neither series is unconditionally normal, and the unconditional correlation coefficient indicates a high degree of linear dependence, as expected. Table 1 also reveals that the small cap index had a higher mean and higher volatility than the large cap index over the total sample and the in-sample period, but not over the out-of-sample period. During the 1990s, as is well known, large cap stocks performed better than their historical average. The change in ex post average returns and standard deviations for the small and large cap indices between the in-sample and out-of-sample periods suggests that allowing for structural breaks in the returns-generating process may improve portfolio decisions. One promising method of doing so is the reversed ordered Cusum (ROC) procedure of Pesaran and Timmermann (2002). Owing to computational constraints, we are forced to ignore the possibility of structural breaks.
We use three further variables as explanatory variables in our analysis. The first is the one-month Treasury bill rate, denoted Rft, which is taken as the risk-free rate. This variable has been used by Fama (1981) and others as a proxy for shocks to expected growth in the real economy. The second variable is the difference between the yield on corporate bonds with Moody's rating Baa versus those with an Aaa rating, denoted SPRt, which is called the "default spread." This variable tracks the cyclical variation in the risk premium on stocks, see Perez-Quiros and Timmermann (2001). Finally, we look at the dividend yield, denoted DIVt, which is measured as the total dividends paid over the previous 12 months divided by the stock price at the end of the month. This variable acts as a proxy for time-varying expected returns. For a comprehensive review of the variables that have been used in previous studies as predictive variables for stock returns see Aït-Sahalia and Brandt (2001).
To examine the presence of asymmetric dependence between these two assets we use measures presented in Longin and Solnik (2001) and Ang and Chen (2002) called "exceedence correlations," $\tilde{\rho}(q)$. We will not use this measure in the asset allocation problem, but we have found it to be a useful, intuitive way of taking a preliminary look at our data.
$$\tilde{\rho}(q) \equiv \begin{cases} \mathrm{Corr}\big[X, Y \mid X \le Q_X(q) \text{ and } Y \le Q_Y(q)\big] & \text{for } q \le 0.5 \\ \mathrm{Corr}\big[X, Y \mid X > Q_X(q) \text{ and } Y > Q_Y(q)\big] & \text{for } q > 0.5, \end{cases}$$
where $Q_X(q)$ and $Q_Y(q)$ are the $q$th quantiles of $X$ and $Y$, respectively. In general, $\tilde{\rho}(q)$ is nonlinear in q. In Figure 1 we plot the empirical exceedence correlations based on the (raw) excess returns on the two indices, along with what would be obtained if they had the bivariate normal distribution. In Figure 2 we plot the empirical exceedence correlations based on the transformed standardized residuals of the models for the two indices, along with what would be obtained if these assets had the normal copula and the "rotated Gumbel" copula, which is described below. Figure 1 shows the degree of asymmetry in the unconditional distribution of the returns on these two assets; Figure 2 shows the degree of asymmetry in the unconditional copula of these two assets, having removed all marginal distribution asymmetry. Clearly both the unconditional joint distribution and the unconditional copula exhibit substantial asymmetry. This suggests that the assumption of normality, which implies a symmetric dependence structure, is inappropriate for these assets. Whether capturing this asymmetric dependence leads to substantially better portfolio decisions is the focus of Section 3.3.
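An empirical exceedence correlation can be computed along the following lines. This is a minimal sketch: the quantile convention (the smallest order statistic with at least a fraction q of the sample at or below it) and the function names are assumptions, and the data are synthetic.

```python
import math


def pearson_corr(xs, ys):
    """Sample Pearson correlation; returns NaN when it is not defined."""
    n = len(xs)
    if n < 2:
        return float("nan")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    if sxx == 0.0 or syy == 0.0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def empirical_quantile(values, q):
    """Order statistic with at least a fraction q of the sample at or below it."""
    ordered = sorted(values)
    k = max(1, math.ceil(q * len(ordered) - 1e-12))
    return ordered[min(k, len(ordered)) - 1]


def exceedence_correlation(xs, ys, q):
    """Correlation of the pairs jointly below (q <= 0.5) or above (q > 0.5) the q-quantiles."""
    qx, qy = empirical_quantile(xs, q), empirical_quantile(ys, q)
    if q <= 0.5:
        pairs = [(a, b) for a, b in zip(xs, ys) if a <= qx and b <= qy]
    else:
        pairs = [(a, b) for a, b in zip(xs, ys) if a > qx and b > qy]
    return pearson_corr([p[0] for p in pairs], [p[1] for p in pairs])


# Synthetic, perfectly linear data: every exceedence correlation equals one.
xs = [float(i) for i in range(100)]
ys = [2.0 * a + 1.0 for a in xs]
lower = exceedence_correlation(xs, ys, 0.25)
upper = exceedence_correlation(xs, ys, 0.75)
print(lower, upper)
```

Plotting the function over a range of q on real return data, and comparing the two halves of the plot, gives the kind of preliminary asymmetry check described above.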
3.2 Analysis of the Different Models
We consider a number of different investment strategies. In this section we describe the models used to obtain the density forecasts on which some of the strategies are based.
The first three strategies we consider are simply buy-and-hold strategies (all small caps, all large caps, or an even mix of both). The fourth strategy is one based solely on the unconditional distribution of returns. For this portfolio we assume that the investor optimizes his/her portfolio weights for the period t + 1 using the empirical unconditional distribution of returns observed up until time t:
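$$\omega^*_{t+1} = \arg\max_{\omega \in \mathcal{W}} \; t^{-1} \sum_{j=1}^{t} U(1 + \omega_x X_j + \omega_y Y_j). \tag{6}$$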
3.2.1 Marginal distribution models
The benchmark model for our study is the bivariate normal distribution, which is compared with a model constructed using copula theory. Both models have the same forms for the conditional means, $\mu_t^x$ and $\mu_t^y$, and variances, $h_t^x$ and $h_t^y$. In their article on the value of volatility timing for asset allocation decisions, Fleming, Kirby, and Ostdiek (2001) assumed a constant conditional mean rather than using some model for expected returns. In their framework this was shown to lead to a conservative estimate of the value of volatility timing. In our framework, however, a misspecified conditional mean will lead to a misspecified skewness model and a misspecified dependence model, and unlike Fleming, Kirby, and Ostdiek, we cannot be sure what impact this will have on the results: whether it will exaggerate or dampen the differences between the models under analysis. For this reason, we cannot escape the building of a model for expected returns.
The conditional means were set to be linear functions of up to 12 lags of the two asset returns, the risk-free rate, the default spread, and the dividend yield. For the conditional variance we employed a TARCH(1,1) specification and allowed the three lagged exogenous regressors to enter into the conditional variance specification in levels and squares. We used likelihood ratio tests to determine the best-fitting model over the in-sample period. The selected models for the mean and variance equations are given below. The full sequence of parameter estimates for each point in the out-of-sample period is available from the author upon request.
![]() | (7) |
![]() | (8) |
Although the models are recursively reestimated throughout the out-of-sample period, they are "nonadaptive," in that the model specifications are determined using the in-sample data and not updated in the out-of-sample period.
To determine the importance of skewness and asymmetric dependence for asset allocation we specify distribution models that can capture these features. We found Hansen's (1994) skewed Student's t distribution to provide a good fit for the marginal distributions of both assets. Jondeau and Rockinger (2003) present some further results on this distribution. In addition to time-varying conditional means and variances, the skewed t distribution can capture time-varying conditional skewness and kurtosis. Both skewness and kurtosis parameters, denoted $\lambda_t$ and $\nu_t$, were allowed to depend on lags of the exogenous variables and the forecast conditional means and variances. As suggested by Hansen (1994), we use transformations to ensure that the skewness and degrees-of-freedom parameters remained within $(-1, 1)$ and $(2, \infty]$, respectively, at all times by setting $\lambda_t = \Lambda(Z_{t-1}'\beta)$ and $\nu_t = 2.1 + (Z_{t-1}'\beta)^2$, where $Z_{t-1}'\beta$ is the linear function of the regressors and parameters for that variable and $\Lambda(x) = (1 - e^{-x})/(1 + e^{-x})$ is the modified logistic transformation, designed to keep $\lambda_t$ in $(-1, 1)$ at all times.
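The two parameter transformations can be sketched as below. The function names are illustrative, and the sketch assumes the modified logistic takes the form (1 - exp(-x))/(1 + exp(-x)), which is algebraically equal to tanh(x/2); the hyperbolic tangent is used in the code only because it avoids overflow for large negative arguments.

```python
import math


def modified_logistic(x):
    """Maps the real line into (-1, 1); equals (1 - exp(-x)) / (1 + exp(-x))."""
    return math.tanh(x / 2.0)


def skewness_parameter(linear_index):
    """Skewness parameter of the skewed t, kept inside (-1, 1)."""
    return modified_logistic(linear_index)


def dof_parameter(linear_index):
    """Degrees-of-freedom parameter of the skewed t, kept at or above 2.1."""
    return 2.1 + linear_index ** 2


# Hypothetical values of the linear index (regressors times coefficients).
for index in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(index, skewness_parameter(index), dof_parameter(index))
```

Because both maps are smooth, the likelihood can be maximized over the unrestricted coefficients without imposing explicit bounds on the skewness or degrees-of-freedom parameters.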
For both assets we found significant in-sample time variation in these moments, though some variables were dropped as their coefficients were not significant. The total additional number of parameters in the skewed t distribution over those in the normal distribution for the small caps (large caps) was 5 (4). Using likelihood ratio tests, we could reject (with p-values of less than 0.01), for both assets, the assumptions of skewness being constant at zero and kurtosis being constant at three, both jointly and separately for the in-sample period.4 The improved in-sample goodness-of-fit of the skewed t distribution is traded off against possible increased parameter estimation error in an out-of-sample setting.
In addition to testing the significance of each of the possible variables to include in the model, we conducted goodness-of-fit tests (not reported) for the final marginal density models. Such tests are critical when constructing multivariate densities using copulas, as a misspecified marginal density implies that any copula model will be misspecified. We used Kolmogorov-Smirnov (KS) tests for the proposed density and Lagrange multiplier (LM) tests for serial dependence in the probability integral transforms of the variables, as suggested in Diebold, Gunther, and Tay (1998). We also employed the multinomial hit test described in Patton (2001b). We found no evidence against the skewed t models and some evidence against the normal models. To show the outputs of these models, in Figure 3 we present the conditional mean, conditional variance, skewness parameter, and kurtosis parameter for the out-of-sample period, estimated at each point in time using only data available as of the previous period.
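The idea behind the KS check on the probability integral transforms (PITs) can be illustrated with a small sketch. It assumes normal density forecasts and simulated data rather than the skewed t models of the article, and it computes only the KS distance from the Uniform(0,1) distribution, not a p-value; all names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def ks_statistic_uniform(pits):
    """Largest gap between the empirical cdf of the PITs and the Uniform(0,1) cdf."""
    ordered = sorted(pits)
    n = len(ordered)
    distance = 0.0
    for i, u in enumerate(ordered, start=1):
        distance = max(distance, i / n - u, u - (i - 1) / n)
    return distance


def probability_integral_transforms(data, forecast):
    """Evaluate the forecast cdf at each realization."""
    return [forecast.cdf(value) for value in data]


rng = random.Random(123)
data = [rng.gauss(0.0, 1.0) for _ in range(500)]  # simulated realizations

# A correctly specified forecast and one with too large a standard deviation.
ks_correct = ks_statistic_uniform(probability_integral_transforms(data, NormalDist(0.0, 1.0)))
ks_wrong = ks_statistic_uniform(probability_integral_transforms(data, NormalDist(0.0, 2.0)))
print(ks_correct, ks_wrong)
```

If the density forecast is correct, the PITs should be close to uniform and the KS distance small; an overdispersed forecast concentrates the PITs near 0.5 and enlarges the distance.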
3.2.2 Copula models
For the bivariate normal model, all that remains to be specified is a model for the conditional correlation. The conditional correlation was set as a function of the lagged risk-free rate, default spread, dividend yield, and the forecasts of the conditional means of the two variables. All of these variables were found to be important in-sample. The bivariate normal model is
$$\left( \frac{X_t - \mu_t^x}{\sqrt{h_t^x}},\; \frac{Y_t - \mu_t^y}{\sqrt{h_t^y}} \right) \Big|\; \mathcal{F}_{t-1} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & \rho_t \\ \rho_t & 1 \end{bmatrix} \right), \tag{9}$$
$$\rho_t = \Lambda\big(\beta_0 + \beta_1 Rf_{t-1} + \beta_2 SPR_{t-1} + \beta_3 DIV_{t-1} + \beta_4 \mu_t^x + \beta_5 \mu_t^y\big), \tag{10}$$
where $\Lambda(x)$ is the modified logistic transformation. For the flexible distribution model, all that remains is to specify the form of the copula used to link the two skewed t marginal distributions. A total of nine different copulas were estimated on the transformed residuals from the skewed t models in the search for the best-fitting copula. The copulas considered were the normal, Student's t, Clayton, rotated Clayton,5 Joe-Clayton, Plackett, Frank, Gumbel, and rotated Gumbel copulas; contour plots of a few of these copulas are provided in Figure 4 and the functional forms of these copulas are contained in Appendix B. This list includes almost all of the copulas considered in the various applications of copulas in statistics and economics,6 and is significantly more than we found in any single previous applied study.
The plots in Figure 4 show the isoprobability contours of bivariate densities with Normal(0,1) margins and linear correlation coefficient of 0.5. We fixed the marginals and the correlation coefficient in this figure so that the differences in the densities could be more clearly identified and attributed to the different copulas. The copula in the upper left panel is the normal copula, making the joint density a bivariate standard normal and giving us the familiar elliptical contours. Immediately below the normal density is the joint density formed using Clayton's copula. We can see that this density's contours are more tightly clustered around the diagonal in the third ("negative-negative") quadrant than in the first quadrant, indicating stronger dependence between negative observations than between positive observations. This is qualitatively the type of dependence suggested by the exceedence correlation plots in Figures 1 and 2. Of these six, the normal, Student's t, and Plackett all generate symmetric dependence, whereas the Gumbel, Clayton, and Joe-Clayton all generate asymmetric dependence. For further details on the properties of these copulas see Joe (1997), Nelsen (1999), and Patton (2001b).
As in the bivariate normal distribution, we estimated these copulas with conditional dependence modeled as a function of the lagged risk-free rate, default spread, and dividend yield, and the forecasts of the conditional means of the two variables [see Equation (12) below]:
$$\big(F_t(X_t),\, G_t(Y_t)\big) \mid \mathcal{F}_{t-1} \sim C(\kappa_t), \tag{11}$$
$$\kappa_t = \Gamma\big(\beta_0 + \beta_1 Rf_{t-1} + \beta_2 SPR_{t-1} + \beta_3 DIV_{t-1} + \beta_4 \mu_t^x + \beta_5 \mu_t^y\big), \tag{12}$$
where $\Gamma(x)$ is a function designed to keep $\kappa_t$ in the feasible region for the copula C at all times, and C is one of the nine copulas discussed above. The maximum log-likelihood and information criteria values for each of the copulas considered are presented in Table 2, and we can see that the rotated Gumbel copula attained the greatest log-likelihood value and the lowest value of both information criteria. We will thus use the rotated Gumbel copula in the flexible distribution specification, which we will call the "Gumbel" model for simplicity.
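To see why a rotated Gumbel copula suits returns that move together more strongly in downturns, the sketch below evaluates its distribution function. It assumes the standard one-parameter Gumbel form and the usual survival ("rotated") construction; the parameter name `kappa` and the function names are illustrative, and the forms used in Appendix B may be parameterized differently.

```python
import math


def gumbel_copula_cdf(u, v, kappa):
    """Standard Gumbel copula cdf with parameter kappa >= 1 (kappa = 1 is independence)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    s = (-math.log(u)) ** kappa + (-math.log(v)) ** kappa
    return math.exp(-(s ** (1.0 / kappa)))


def rotated_gumbel_cdf(u, v, kappa):
    """Survival (rotated) Gumbel copula: the Gumbel dependence moved to the lower tail."""
    return u + v - 1.0 + gumbel_copula_cdf(1.0 - u, 1.0 - v, kappa)


def lower_tail_ratio(q, kappa):
    """P(U <= q | V <= q), which approaches 2 - 2**(1/kappa) as q goes to zero."""
    return rotated_gumbel_cdf(q, q, kappa) / q


kappa = 2.0  # hypothetical dependence parameter
print(lower_tail_ratio(1e-4, kappa), 2.0 - 2.0 ** (1.0 / kappa))
```

The positive limit of the lower tail ratio means that joint extreme negative outcomes remain dependent, whereas the corresponding limit for the normal copula with correlation below one is zero.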
We specify one final alternative model, called the "NormCop" model, which uses the skewed t marginal distributions along with a normal copula. This specification is included to determine where the benefits, if any, from flexible density modeling lie: in the marginal distribution specifications or in the copula specification. The values of the log-likelihoods at the optimum for the three joint distributions (normal, NormCop, and Gumbel) are -2391.04, -2355.38, and -2342.28, so in terms of in-sample goodness-of-fit we can see that the Gumbel model provides the best fit, and that about 73% of the gains come from the flexible marginal distribution models, though in an out-of-sample setting this ranking and decomposition of gains need not hold.
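The 73% figure follows from the differences between the three reported log-likelihood values, as the short calculation below shows. The values are entered as negative numbers, on the assumption that a better fit corresponds to a larger (less negative) log-likelihood; the variable names are illustrative.

```python
# Reported in-sample log-likelihoods (signs assumed negative, see the lead-in).
loglik = {"normal": -2391.04, "NormCop": -2355.38, "Gumbel": -2342.28}

# Total gain from moving from the bivariate normal to the Gumbel model.
total_gain = loglik["Gumbel"] - loglik["normal"]

# Part of the gain already achieved by switching only the margins to skewed t.
marginal_gain = loglik["NormCop"] - loglik["normal"]

share_from_margins = marginal_gain / total_gain
print(f"{share_from_margins:.1%}")
```

The remaining share of the gain is attributable to replacing the normal copula with the rotated Gumbel copula.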
We again use likelihood ratio tests to determine if any of the five regressors for the conditional copula parameter can be dropped. For the bivariate normal distribution and the NormCop models, all five were significant at the 10% level, while for the rotated Gumbel copula the risk-free rate and the spread were not significant and so were removed from the model, reducing the number of parameters for this copula from six to four.
We conducted some specification tests (not reported) on the normal and rotated Gumbel copulas over the in-sample period to determine their goodness-of-fit, employing the multinomial hit test described in Patton (2001b). We found that the normal copula estimated using residuals from the normal marginal distribution models could be rejected, which is unsurprising since the marginals of that model were also rejected. Neither the normal copula nor the rotated Gumbel copula estimated on residuals from the skewed t marginal distribution models could be rejected at the 5% level.
In Figure 5 we present the conditional correlation parameter from the bivariate normal model and the implied conditional correlation from the skewed t-rotated Gumbel copula model. We use correlation as the measure of dependence here for comparability across models. While all three conditional correlation estimates generally moved in the same direction, their levels are quite different at times.
3.3 Performance of the Different Strategies
We now analyze the performance of the different asset allocation decisions made using the various models. We consider five levels of relative risk aversion (RRA = 1, 3, 7, 10, and 20) and 11 strategies. The 11 strategies are (1) always hold the small cap index; (2) always hold the large cap index; (3) always hold a 50:50 mix of the two indices; (4) optimize the portfolio weight using the unconditional empirical distribution of returns; (5) find the optimal portfolio weight for each period using the bivariate normal model; (6) find the optimal portfolio weight for each period using the NormCop model; (7) find the optimal portfolio weight for each period using the Gumbel model. Strategies 8–11 are the same as strategies 4–7, subject to a short-sales constraint.
3.3.1 Summary statistics
In Table 3 we present six summary statistics on the optimal portfolio return series based on the different models. In addition to the usual summary statistics we also present two alternative measures of risk, the 5% value-at-risk (VaR) and the 5% expected shortfall (ES). The 5% VaR is defined as the negative of the fifth empirical percentile of the realized returns, that is, $\mathrm{VaR}_{0.05}(X) = -\hat{F}_n^{-1}(0.05)$, where $\hat{F}_n$ is the empirical distribution of returns on portfolio X using the n out-of-sample observations. While VaR has some advantages over traditional measures of risk, it has received criticism for not being a "coherent" measure of risk [see Artzner et al. (1999)]. An alternative to VaR that has gained some attention recently is the "expected shortfall" of a portfolio. The 5% expected shortfall is defined as the negative of the average return on a portfolio given that the return has exceeded its 5% VaR, that is, $\mathrm{ES}_{0.05}(X) = -\hat{E}_n[X \mid X \le -\mathrm{VaR}_{0.05}(X)]$, where $\hat{E}_n$ is the sample average.
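As a minimal sketch of the two risk measures just described, the following Python function computes an empirical VaR and ES from a list of realized returns. The function name `var_es`, the quantile convention (the smallest order statistic whose empirical cdf reaches the level), and the sample returns are illustrative assumptions and are not taken from the article.

```python
import math

def var_es(returns, level=0.05):
    """Empirical VaR and expected shortfall of a return series.

    VaR is the negative of the `level` empirical quantile; ES is the negative
    of the average return over observations at or below that quantile.
    """
    ordered = sorted(returns)
    n = len(ordered)
    # Assumed convention: smallest order statistic with empirical cdf >= level.
    k = max(math.ceil(level * n), 1)
    quantile = ordered[k - 1]
    tail = [r for r in ordered if r <= quantile]
    return -quantile, -sum(tail) / len(tail)

# Hypothetical monthly returns, for illustration only.
sample = [0.021, -0.034, 0.012, 0.005, -0.081, 0.043, -0.012, 0.030,
          0.017, -0.056, 0.009, 0.026, -0.003, 0.014, -0.027, 0.038,
          0.001, -0.019, 0.022, 0.011]
var5, es5 = var_es(sample, level=0.05)
var10, es10 = var_es(sample, level=0.10)
```

By construction the ES is never smaller than the VaR at the same level, since it averages only over returns at or below the VaR threshold.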
A striking feature of the summary statistics is the much greater mean and standard deviation of the portfolio returns based on the distribution models (Normal, NormCop, and Gumbel) than the portfolios with constant weights for all but the most risk-averse investor. We ignore parameter estimation uncertainty, and so the query may be raised as to whether the investors would so aggressively invest if they knew that they were using parameter estimates rather than the true parameters. Kandel and Stambaugh (1996) and Brandt (1999) both find that even when parameter estimation uncertainty is accounted for, a CRRA investor aggressively seeks the best portfolio. The results for the short sales-constrained investors reveal a much smaller difference in mean and risk between the distribution portfolios and the constant weight portfolios.
Note also the skewness coefficients: the Normal and NormCop portfolios generally exhibited negative skewness, while the unconstrained Gumbel portfolio actually displayed positive skewness, suggesting that modeling both skewness and asymmetric dependence enables the investor to better avoid negatively skewed portfolio returns.
3.3.2 Performance statistics
The performance measure we consider is the amount in basis points per year that the investor would pay to switch from the "50:50 mix" portfolio to another portfolio. One interpretation of this amount is as the "management fee" that could be deducted from the monthly return on portfolio i over the out-of-sample period and leave the investor indifferent between the 50:50 portfolio and portfolio i. For example, an investor with risk aversion 1 would be willing to pay up to 25.176 basis points per year to switch from the 50:50 portfolio to the constrained Gumbel portfolio, while he would require compensation of 2.0114 basis points per year to switch from the 50:50 portfolio to the "unconditional" portfolio. See Table 4 for the complete set of results.
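One way to make this performance measure concrete is the sketch below, which finds the constant monthly fee that equates the average realized CRRA utility of a candidate portfolio with that of a benchmark, then annualizes it in basis points. It assumes power utility over gross portfolio returns and a simple bisection search; the function names and the annualization by a factor of 12 are assumptions for illustration and may differ from the computation used for Table 4.

```python
import math

def crra_utility(wealth, gamma):
    """CRRA utility of gross wealth; log utility when gamma == 1."""
    if gamma == 1:
        return math.log(wealth)
    return wealth ** (1.0 - gamma) / (1.0 - gamma)

def mean_utility(returns, gamma, fee=0.0):
    """Average realized utility after deducting a constant per-period fee."""
    return sum(crra_utility(1.0 + r - fee, gamma) for r in returns) / len(returns)

def management_fee_bp(candidate, benchmark, gamma, periods_per_year=12):
    """Annualized fee (basis points) leaving the investor indifferent."""
    target = mean_utility(benchmark, gamma)
    lo = -1.0                           # a negative fee is a subsidy
    hi = 1.0 + min(candidate) - 1e-9    # keep wealth strictly positive
    for _ in range(200):                # bisection: utility falls as the fee rises
        mid = 0.5 * (lo + hi)
        if mean_utility(candidate, gamma, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) * periods_per_year * 1e4

# Hypothetical returns: the candidate beats the benchmark by 0.1% every month.
benchmark = [0.01, -0.02, 0.03, 0.00]
candidate = [r + 0.001 for r in benchmark]
fee = management_fee_bp(candidate, benchmark, gamma=3)
```

In this toy example the candidate dominates the benchmark by a constant 0.1% per month, so the indifference fee is that same 0.1% per month, or about 120 basis points per year.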
It should be pointed out that the investors with risk aversion of one and three using the normal model density forecast would have gone bankrupt in the month of January 1992. On this date these two investors took the positions $\omega_x = -8.9$, $\omega_y = 21.3$ and $\omega_x = -5.1$, $\omega_y = 11.5$, respectively, and the month finished with returns of 14.0% on the small caps (the largest return on this asset over the out-of-sample period) and −2.6% on the large caps, leading to negative gross returns for these investors.7 For this month the realized utility for these investors is $-\infty$, making the required management fee to switch to this portfolio $-\infty$. The performance statistics indicate that substantial gains may be obtained by employing weights obtained from a model of the conditional distribution of stock returns, particularly when coupled with a short-sales constraint. The unconstrained portfolios generally do not perform as well as simply holding an equally weighted portfolio of the two indices, as the large caps performed particularly well over the period 1990 to 1999. This result can be interpreted as further evidence that placing short sales constraints on the optimal portfolio weights obtained from forecasts improves out-of-sample portfolio performance [see Frost and Savarino (1988) and Jagannathan and Ma (2002)]. If the short-sales constraint is interpreted as a type of "insanity filter," preventing the investor from taking an extreme position in the market, then this finding reinforces results previously reported in the forecasting literature [see, e.g., Stock and Watson (1999)], that constrained forecasts often outperform unconstrained forecasts from nonlinear models.
The management fees that one could charge an investor currently holding the 50:50 portfolio to switch to the constrained Gumbel portfolio range between 5 and 27 basis points per year. Management fees of less than 10 or 20 basis points per year are of questionable economic significance. The largest gains (23 and 27) for the Gumbel portfolio occur for the most risk-averse investor, though it should be noted that the gains are not monotonic in the risk-aversion parameter.
Looking now to the gains from modeling higher moments and asymmetric dependence, we compare the Normal portfolio with the Gumbel portfolio. In 9 of 10 comparisons the Gumbel portfolio outperformed the Normal portfolio, and for the remaining comparison the difference was 0.1 basis points. Ignoring the two comparisons where the Normal portfolio went bankrupt (and so the effective management fee difference would be $+\infty$) the average outperformance was 16.2 basis points; 41.5 basis points for the unconstrained investor and only 0.9 basis points for the short sales-constrained investor. We will see in the following section the reason for the extremely small difference for the short-sales constrained investor.
The Gumbel model also outperformed the NormCop model in 9 of 10 comparisons. The average outperformance of the Gumbel portfolio over the NormCop portfolio was 21.3 (1.5) basis points for the unconstrained (short sales-constrained) investor. The corresponding figures for the Normal versus the NormCop portfolio were −18.3 and 0.5, indicating that the Normal portfolio performed worse for the unconstrained investor, but marginally better for the short sales-constrained investor. That the Gumbel portfolio outperformed both the Normal and NormCop portfolios suggests that the copula specification is important for asset allocation.
3.3.3 Analysis of the optimal portfolio weights
In this section we look at the time series of portfolio weights resulting from the portfolio decisions made using different models and different levels of risk aversion. To consider the impact of risk aversion, we present Table 5 and Figure 6. Table 5 contains quantiles of the distribution of portfolio weights obtained using the Gumbel model for risk aversion ranging between 1 and 20, and Figure 6 shows the time series of portfolio weights for investors with a relative risk aversion of 7 and 20. Both show that increasing the level of relative risk aversion shrinks the portfolio weights toward zero, as expected. In the limit of infinite risk aversion the investor would put no wealth in the risky assets and all wealth in the risk-free asset.
In Figure 7 we show the impact of imposing a short-sales constraint. The plot makes it clear that even for moderately high risk aversion, seven in this case, the short-sales constraint is binding for at least one asset almost every period. For the Gumbel model the proportions of times that the short-sales constraint is binding for risk aversion levels of 1, 3, 7, 10, and 20 are 1, 1, 0.98, 0.95, and 0.94, respectively. Similar figures are obtained for the Normal and NormCop portfolios. This shows that much of the information content of these models is lost if the investor is short-sales constrained.
One example of this reduced information content relates to comparing the short sales-constrained portfolios. The proportions of times that the short sales-constrained Gumbel portfolio took the same portfolio weights as the short sales-constrained Normal portfolio for risk aversion levels of 1, 3, 7, 10, and 20 are 0.95, 0.80, 0.59, 0.52, and 0.52, respectively. The corresponding figures comparing the Gumbel with the NormCop portfolio are similar. Thus, of the 120 observations in the out-of-sample period, the number of observations that enable us to distinguish one short sales-constrained portfolio from another ranges from 58 (when there is "only" a 52% overlap, for the two most risk-averse investors) to just 5 (for the least risk-averse investor). This suggests, and is confirmed in the next section, that if there are gains to capturing and forecasting skewness and asymmetric dependence they are more likely to be present for unconstrained investors than for short sales-constrained investors.
To compare the portfolio weights obtained using the different models we present Table 6 and Figures 8 and 9. Table 6 presents quantiles of the distribution of portfolio weights obtained using the different models for RRA = 7, and Figures 8 and 9 compare the time series of portfolio weights over the out-of-sample period. The results show that the weights from the unconstrained normal model are more aggressive than those from the NormCop or Gumbel models. For example, the latter two models yield median positions of being short 1 unit of the small cap index and long 1 to 1.3 units of the large cap index, while for the normal model the median position is being short almost 2 units of the small cap index and long 3 units of the large cap index. Figure 8 confirms that the Normal portfolio weights are almost always more extreme than the Gumbel portfolio weights. This is possibly due to the fact that the Gumbel model takes into account the fat tails, skewness, and asymmetric dependence of these assets. Negative skewness, fat tails, and dependence of the rotated Gumbel type will make a risk-averse investor less aggressive in his/her portfolio decisions, as they all lead, other things being equal, to a higher probability of large negative moves for the portfolio. Figure 9 reveals that the NormCop and Gumbel portfolios have similar portfolio weights. Much of the difference in portfolio weights between the Gumbel and the Normal portfolios, then, was driven by the different marginal distribution assumptions.
Note that we do not test for the significance of the differences in portfolio weights directly in this article. Differences in portfolio weights are only economically interesting if they lead to differences in portfolio performance and so it seems more appropriate to test for differences using the metric of portfolio performance. We proceed to such tests in the next section.
To try to determine in more detail the causes of the differences in portfolio weights between the Normal and Gumbel portfolios we considered the following simple regression of the difference between the Gumbel and Normal portfolio weights on a constant and the nine parameters of the more flexible model (two expected returns, two volatilities, two skewness parameters, two kurtosis parameters, and one copula parameter):
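$$\omega^{G}_{j,t} - \omega^{N}_{j,t} = \beta_{0} + \beta_{1}\mu_{x,t} + \beta_{2}\mu_{y,t} + \beta_{3}\sigma_{x,t} + \beta_{4}\sigma_{y,t} + \beta_{5}\lambda_{x,t} + \beta_{6}\lambda_{y,t} + \beta_{7}\nu_{x,t} + \beta_{8}\nu_{y,t} + \beta_{9}\kappa_{t} + e_{j,t}, \qquad j \in \{x, y\}, \tag{13}$$
where $\omega^{G}_{j,t}$ and $\omega^{N}_{j,t}$ are the optimal Gumbel and Normal portfolio weights for RRA = 7, and the regressors are the forecast expected returns ($\mu$), volatilities ($\sigma$), skewness parameters ($\lambda$), degree-of-freedom parameters ($\nu$), and copula parameter ($\kappa$). The optimal portfolio weights are a complicated, nonlinear function of the parameters of the joint density, and so the above regression is almost certainly misspecified.8 Further, for simplicity we ignore the fact that the variables on the right-hand side of Equation (13) are estimated, and so this regression will suffer errors-in-variables bias. Nevertheless, it may help highlight some of the causes of the differences in portfolio weights. To aid interpretation, Table 7 also presents the results of regressions of the individual portfolio weights on the regressors in Equation (13), though we will focus our discussion below solely on the regressions involving the difference in portfolio weights, presented in the last two columns.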
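A regression of this kind, a weight difference on a constant and nine regressors, can be sketched with ordinary least squares as follows. The data here are simulated purely for illustration: the regressor names, the coefficient values, and the use of conventional (homoskedastic) standard errors are all assumptions rather than details from Table 7.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 120  # number of out-of-sample months, as in the text

# Hypothetical stand-ins for the nine forecast parameters.
names = ["mu_x", "mu_y", "sigma_x", "sigma_y",
         "skew_x", "skew_y", "dof_x", "dof_y", "copula"]
Z = rng.normal(size=(T, len(names)))
X = np.column_stack([np.ones(T), Z])   # constant plus nine regressors

# Simulated weight difference (Gumbel minus Normal), purely illustrative.
true_beta = np.array([0.5, -2.0, 1.0, 0.0, 0.8, 0.0, 0.0, -0.3, 0.3, 0.4])
diff = X @ true_beta + 0.1 * rng.normal(size=T)

# OLS estimates with conventional standard errors and t-statistics.
beta_hat, *_ = np.linalg.lstsq(X, diff, rcond=None)
resid = diff - X @ beta_hat
sigma2 = resid @ resid / (T - X.shape[1])
std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
t_stats = beta_hat / std_err
```

Because the regressors in the article are themselves estimates, standard errors of this simple form would understate the true uncertainty, which is the errors-in-variables caveat raised in the text.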
The signs on the coefficients on expected returns indicate that the Normal portfolio weights reacted more strongly to changes in forecasted returns than the Gumbel portfolio weights. For example, an increase in small cap expected returns led to a larger increase in the Normal portfolio small cap weight than in the Gumbel portfolio small cap weight, making the coefficient in the regression of the difference in portfolio weights negative. The significant positive (negative) coefficient on large cap volatility for the large (small) cap regression also reflects the fact that the Normal portfolio weights reacted more strongly than the Gumbel portfolio weights.
The negative (positive) sign on the degree-of-freedom parameter coefficient in the small (large) cap regression suggests that as the degree-of-freedom parameter decreases, reflecting an increase in the fatness of the tails, the weight in the small (large) caps in the Gumbel portfolio goes toward zero. The Normal portfolio weights were essentially uncorrelated with this parameter, as expected. That is, an increase in tail fatness led to a less aggressive Gumbel portfolio.
The dependence parameter was included in this regression to reflect the fact that at higher dependence levels the rotated Gumbel copula diverges more from the Normal copula (at independence these two copulas are identical), becoming more asymmetric. As the assets get more highly correlated the Gumbel portfolio placed more weight in the small caps and less in the large caps, which, over this sample period, brought both portfolio weights closer to zero. Thus greater dependence led to more conservative portfolio weights, reflecting the fact that stronger dependence in the rotated Gumbel copula leads to more negatively skewed portfolio returns. The Normal portfolio responded in precisely the opposite way to an increase in dependence: an increase in dependence led to a decrease in the weight in small caps and an increase in the weight in large caps, representing a more aggressive strategy.
3.3.4 Tests for superior portfolio performance
In this section we attempt to determine whether the differences in portfolio performances documented in previous sections are statistically significant. We present the results of two tests for superior performance: a bootstrap test of pairwise comparisons, and the reality check of White (2000), as modified by Hansen (2001). In all cases we employ the stationary bootstrap of Politis and Romano (1994).9
We conduct pairwise comparisons by looking at the bootstrap confidence interval on the difference in the performance measures of two portfolios.10 Let the performance measure of portfolio i be µi. If the lower bound of the bootstrap 90% confidence interval of µi − µj is greater than zero, then we take model i to be significantly better than model j. If the upper bound of the interval is less than zero, then we take model j to be significantly better than model i. If the confidence interval includes zero, then the test is inconclusive and we cannot statistically distinguish models i and j according to that performance measure. The results of these tests are presented in Table 8. In this table we include only the 50:50 portfolio of the three naïve portfolios to save space. The results from the pairwise comparisons involving this portfolio are representative of the results from comparisons involving the other two naïve portfolios.
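The decision rule above can be sketched in a few lines. The example below resamples with a stationary bootstrap (blocks of geometrically distributed length, wrapping around the sample) and uses a simple percentile interval for the difference in mean per-period performance. The function names, the average block length of 10, the number of replications, and the use of a sample mean as the performance measure are assumptions for illustration; the article's measure is the management fee described earlier.

```python
import random

def stationary_bootstrap_indices(n, avg_block, rng):
    """One resample of indices 0..n-1 using geometric block lengths."""
    p_new = 1.0 / avg_block
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < p_new:
            idx.append(rng.randrange(n))      # start a new block
        else:
            idx.append((idx[-1] + 1) % n)     # continue the block, wrapping around
    return idx

def bootstrap_ci_mean_diff(perf_i, perf_j, avg_block=10, reps=2000,
                           level=0.90, seed=0):
    """Percentile interval for mean(perf_i) - mean(perf_j)."""
    rng = random.Random(seed)
    n = len(perf_i)
    diffs = [a - b for a, b in zip(perf_i, perf_j)]
    stats = []
    for _ in range(reps):
        idx = stationary_bootstrap_indices(n, avg_block, rng)
        stats.append(sum(diffs[k] for k in idx) / n)
    stats.sort()
    lower = stats[int((1 - level) / 2 * reps)]
    upper = stats[int((1 + level) / 2 * reps) - 1]
    return lower, upper

def compare(perf_i, perf_j, **kwargs):
    """Apply the rule: interval above zero, below zero, or straddling zero."""
    lower, upper = bootstrap_ci_mean_diff(perf_i, perf_j, **kwargs)
    if lower > 0:
        return "i better"
    if upper < 0:
        return "j better"
    return "inconclusive"
```

Resampling the paired differences, rather than each series separately, preserves the contemporaneous dependence between the two portfolios' performance.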
Table 8 shows that the unconstrained Gumbel portfolio significantly outperformed both the Normal and the NormCop portfolios for all levels of risk aversion. Comparisons of the Gumbel with the 50:50 portfolio and the Uncond portfolio yielded no conclusive results. The Normal portfolio was beaten in every comparison by the 50:50 portfolio and the Uncond portfolio, and the NormCop was beaten by these portfolios in all but two comparisons. This is strong evidence against the Normal and NormCop portfolios for unconstrained investors.
For the short sales-constrained portfolios we find fewer significant comparisons. For all but one comparison we find that the Gumbel portfolio either significantly outperforms or has performance that is indistinguishable from the competing portfolios. In one comparison (for RRA = 1), the Normal portfolio significantly outperformed the Gumbel portfolio.
From the fact that the Gumbel portfolio generally significantly outperformed both the Normal and the NormCop portfolios, whereas the Normal and NormCop portfolios were generally indistinguishable, we can conclude that it was the capturing of asymmetric dependence rather than skewness that yielded the greatest gains for asset allocation. Note, however, that accurate modeling of the dependence structure relies on accurate modeling of the marginal distributions, and so even though capturing skewness alone does not appear helpful for this dataset, it is required for the gains from copula modeling to be realized.
Although the above results are useful for comparing the results of just two particular models, a more appropriate test would compare all models jointly. With this in mind we now present the results of the reality check test of White (2000). This is a test that a given benchmark portfolio performs as well as the best competing alternative model, where we have possibly many competing alternatives. We present the three estimates of the reality check p-values discussed in Hansen (2001) and focus our attention on the "consistent" p-value estimates. We reject the null hypothesis that the benchmark model performs as well as the best competing alternative model whenever the p-value is less than 0.10. In these tests we separate the two sets of models into unconstrained and constrained, and include the three naïve portfolios in both sets. Table 9 presents the results when the 50:50, Normal, and NormCop portfolios are taken as the benchmarks.
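For readers unfamiliar with the reality check, the sketch below computes a bootstrap p-value in the spirit of White (2000): the test statistic is the largest scaled average outperformance over the benchmark, and its null distribution is approximated by recentred stationary-bootstrap replicates. This is only the basic recentring, not the "consistent" variant of Hansen (2001) on which the text focuses, and the function names, block length, and mean-based performance measure are illustrative assumptions.

```python
import math
import random

def stationary_indices(n, avg_block, rng):
    """One stationary-bootstrap resample of indices 0..n-1."""
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < 1.0 / avg_block:
            idx.append(rng.randrange(n))      # start a new block
        else:
            idx.append((idx[-1] + 1) % n)     # continue the block, wrapping around
    return idx

def reality_check_pvalue(benchmark, alternatives, avg_block=10,
                         reps=1000, seed=0):
    """P-value for H0: no alternative beats the benchmark on average."""
    rng = random.Random(seed)
    n = len(benchmark)
    # Per-period performance of each alternative relative to the benchmark.
    rel = [[a - b for a, b in zip(alt, benchmark)] for alt in alternatives]
    means = [sum(f) / n for f in rel]
    observed = max(math.sqrt(n) * m for m in means)
    exceed = 0
    for _ in range(reps):
        idx = stationary_indices(n, avg_block, rng)
        # Recentre each bootstrap mean at its sample mean, then take the max.
        boot = max(math.sqrt(n) * (sum(f[k] for k in idx) / n - m)
                   for f, m in zip(rel, means))
        if boot > observed:
            exceed += 1
    return exceed / reps
```

A small p-value (below 0.10 under the convention in the text) would lead to rejecting the hypothesis that the benchmark performs as well as the best alternative.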
When comparing the 50:50 portfolio with the unconstrained model-based portfolios we are not able to reject that it performs as well as the best alternative for any level of risk aversion. Comparing the 50:50 portfolio with the short sales-constrained portfolios leads to a single rejection for the most risk-averse investor. From the second panel of Table 9 we see that we are able to reject the unconstrained Normal portfolio using four out of five levels of risk aversion. Table 9 similarly shows that we are able to reject the unconstrained NormCop portfolio for three out of five levels of risk aversion. However, the constrained Normal is only rejected once, and we are unable to reject the constrained NormCop portfolio for any level of risk aversion.
Overall these results support the results of the pairwise comparisons of unconstrained portfolios, that the Normal and NormCop portfolios yielded inferior returns over this period. However, when the investor is short-sales constrained these results suggest that Normal and NormCop portfolios are just as good as the best alternative, a conclusion supported by the economically small differences in the management fee reported in Section 3.3.2, and by the substantial overlap in selected portfolio weights reported in Section 3.3.3. Thus the benefits from modeling skewness and asymmetric dependence are only present for investors that are not short-sales constrained. For such investors, the gains range up to 27 basis points per year, and are generally statistically significant.
4 CONCLUSIONS AND FUTURE WORK
In this article we considered the impact that skewness and asymmetric dependence have on the out-of-sample portfolio decisions of a CRRA investor, with a range of levels of risk aversion. Skewness in the distribution of individual stock returns has been reported by numerous authors in the last three decades. "Asymmetric dependence," of a form where equity returns exhibit greater dependence during market downturns than during market upturns, has been reported by Erb, Harvey, and Viskanta (1994), Longin and Solnik (2001), and Ang and Chen (2002), inter alia, and can be shown to induce negative skewness in the distribution of portfolio returns. It is known that any investor that exhibits nonincreasing absolute risk aversion, a very weak requirement, has a preference for positively skewed assets, ceteris paribus. Thus both of these asymmetries may, in theory, impact portfolio decisions.
We considered the problem of allocating wealth between the risk-free asset and the CRSP small cap and large cap indices, using monthly data from January 1954 to December 1999. We used the data up to December 1989 to develop the models and reserved the last 120 months for an out-of-sample evaluation of the competing methods. We used conditional distribution models that are able to capture time-varying conditional moments of up to order four, and we employed models of the dependence structure of these asset returns that allowed for greater dependence during market downturns than market upturns.
We found some economic evidence that the model capturing skewness and asymmetric dependence yielded better portfolio decisions than the bivariate normal model. The amount that one could charge a risk-averse investor for use of the most flexible density model rather than the bivariate normal model ranged between approximately 0 and 27 basis points per year. The most economically and statistically significant differences were for the unconstrained portfolios; the short sales-constrained portfolios were generally not substantially different. Our results suggest that both marginal distribution modeling and copula modeling have important implications for out-of-sample portfolio performance.
This article leaves unanswered a number of questions. We considered only two specific indices, and it would be interesting to compare the results obtained for other assets. Further, it would be of interest to extend the problem to that of multiple assets: do the benefits of flexibly modeling the joint distribution of returns increase with the dimension of the distribution, or does parameter estimation error dominate? In this article we ignored the impact of parameter estimation uncertainty on the investor's optimization problem, and it would be interesting to determine how the results would change when this is taken into account. Finally, it would be of great interest to compare the results of the methods presented in this article with other parametric approaches, such as Ang and Bekaert (2002), and with nonparametric approaches, such as those of Brandt (1999) or Aït-Sahalia and Brandt (2001). All of these questions are left for future work.
APPENDIX A: DETAILS OF THE OPTIMIZATION PROCEDURE
In this short appendix we provide further information on the computation of the optimal portfolio weights. We used the in-sample period, January 1954 to December 1989, to determine the best-fitting density specification, and then followed the steps below for each month t in the out-of-sample period, January 1990 to December 1999. See Judd (1998) for some of the issues surrounding the use of Monte Carlo simulations to approximate objective functions containing integrals.
- Estimate the parameters of the density model using data available up until date t − 1 and store the parameter estimate as $\hat{\theta}_t$. We used $\hat{\theta}_{t-1}$ as a starting value for the estimation procedure.
- Generate n = 10,000 independent draws from the forecast density, $H(\hat{\theta}_t)$, for month t:
  - Generate n independent draws from the forecast copula for month t, denote these as $(U_{t,i}, V_{t,i})$, for i = 1, 2, ..., n.
  - Let $X_{t,i} = F^{-1}(U_{t,i}; \hat{\theta}_t)$ and $Y_{t,i} = G^{-1}(V_{t,i}; \hat{\theta}_t)$, for i = 1, 2, ..., n, to obtain n draws from the forecast joint density, where $F^{-1}$ and $G^{-1}$ are the inverse cdfs of the models for $X_t$ and $Y_t$.
- Define the estimated optimal portfolio weights for month t, utility function $\mathcal{U}$, and density forecast $H(\hat{\theta}_t)$ as
where
The cut off $2.2204 \times 10^{-16}$ was chosen as this is "machine epsilon" for Matlab. The function
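The simulation steps of this appendix can be illustrated with a compact, self-contained sketch. Several substitutions are made for brevity and should be read as assumptions: a Clayton copula (which, like the rotated Gumbel, has lower tail dependence) replaces the rotated Gumbel copula, normal marginals replace the skewed t marginals, portfolio wealth is taken to be $1 + \omega_x X + \omega_y Y$, wealth at or below the cut-off is assigned utility $-\infty$, and a coarse grid search replaces a numerical optimizer. All function names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist

EPS = 2.2204e-16  # cut-off quoted in the text

def clip(p, tol=1e-12):
    """Keep a probability strictly inside (0, 1) so inverse cdfs are defined."""
    return min(max(p, tol), 1.0 - tol)

def clayton_draws(n, theta, rng):
    """n draws (u, v) from a Clayton copula by conditional inversion."""
    draws = []
    for _ in range(n):
        u, w = clip(rng.random()), clip(rng.random())
        v = (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        draws.append((u, clip(v)))
    return draws

def crra(wealth, gamma):
    """CRRA utility; wealth at or below the cut-off is treated as ruinous."""
    if wealth <= EPS:
        return -math.inf  # assumption: handling below the cut-off is not given
    if gamma == 1:
        return math.log(wealth)
    return wealth ** (1.0 - gamma) / (1.0 - gamma)

def optimal_weights(x, y, gamma, grid):
    """Grid search for the weights maximizing simulated average utility."""
    best_value, best_weights = -math.inf, (0.0, 0.0)
    n = len(x)
    for wx in grid:
        for wy in grid:
            value = sum(crra(1.0 + wx * xi + wy * yi, gamma)
                        for xi, yi in zip(x, y)) / n
            if value > best_value:
                best_value, best_weights = value, (wx, wy)
    return best_weights

rng = random.Random(0)
normal = NormalDist()
draws = clayton_draws(2000, theta=2.0, rng=rng)            # copula draws
x = [0.010 + 0.06 * normal.inv_cdf(u) for u, _ in draws]   # hypothetical small-cap forecast
y = [0.008 + 0.04 * normal.inv_cdf(v) for _, v in draws]   # hypothetical large-cap forecast
grid = [0.5 * k for k in range(-6, 7)]                     # weights from -3 to 3
weights = optimal_weights(x, y, gamma=7, grid=grid)
```

The sketch uses 2,000 draws to keep the pure-Python grid search fast; the procedure in the text uses n = 10,000 draws per month.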























