Disentangling Crashes from Tail Events

The study of tail events has become a central preoccupation for academics, investors and policy makers since the recent financial turmoil. However, the question of what differentiates a crash from a tail event remains unresolved. This article elaborates a new definition of stock market crash from a risk management perspective, based on an augmented extreme value theory methodology. An empirical test on the French stock market (1968-2008) indicates that only two of the 12 crashes identified over the whole period occurred in 2007-2008.


Introduction
Stock market crashes are among the most fascinating subjects in finance. However, there is no unique definition of a crash, and the financial literature on the topic creates some confusion among rare events usually called extremes, crashes or crises. Even if it is not straightforward to define a crash a priori, investors may be able to identify one a posteriori through its panic effect; it may have been induced by systemic risk, liquidity risk, regulatory risk or even trading algorithm risk. For example, the biggest intraday drop in the history of the Dow Jones index, on May 6, 2010, was interpreted by market participants as a crash. Indeed, computer-automated trades caused a total drop of 9.16% from the previous day's close; however, the market rebounded to close down by 348 points. Extreme Value Theory (EVT) has widely documented ways in which extreme events can be quantified. A general discussion of the application of EVT to finance is proposed by Embrechts, Klüppelberg and Mikosch (1997), McNeil (1999), Coles (2001), and Beirlant, Goegebeur, Segers and Teugels (2004). However, Longin (1993) remains the only one in this literature to address explicitly the question of the existence of crashes. Longin (2001) addresses the same question by applying EVT to two sub-samples: one sub-sample of so-called crashes and another sub-sample of other minima. He concludes that crashes and non-crashes exhibit no statistical differences because they are drawn from the same unconditional distribution of extremes. This conclusion may have closed the debate earlier than expected. Indeed, even if the conclusion is not questionable within this literature, the application of EVT to raw returns makes it problematic to distinguish real crashes from merely large negative returns.
Investors are not symmetrically affected by a negative return that occurs during a high volatility period and an equal negative return that occurs during a low volatility period. Indeed, investors are more cautious during high volatility periods and may panic much more if an extreme event occurs during a low volatility period, due to the surprise effect. For this reason, this article considers the role of volatility in the crash definition. The first question to address is: why is it useful to identify a crash? Because it may help investors, regulators and policymakers to differentiate warning signals according to their scale; for these same reasons, the NBER has defined expansion and recession cycles in the US since 1854. For instance, the recognition of crash events may justify intervention policies by economic agents with the right timing. For example, would the identification of a crash in 2007, after the collapse of the U.S. housing bubble, have forced policy makers to save Lehman Brothers and avoid the huge international volatility spillover of 2008? Recently, Brockmeijer et al. (2011) recommended that "a risk measure breaching a given threshold would prompt policy makers to provide a policy response". However, not all crashes lead to a macro-economic downturn; for example, the 1987 stock market crash did not generate an economic contraction. Therefore, while all crashes affect the risk aversion of investors, they do not equally affect economic business cycles. The second question is: how can we define a crash versus a tail event? Defining a tail event is straightforward. It corresponds to any return located in the tails of the distribution; an adverse tail event represents a negative extreme return for a long position and a positive extreme return for a short position. In addition, while a crash (anti-crash) corresponds to a negative (positive) extreme return, the reverse is not true.
Indeed, the largest negative return during a bullish period will surely not be a crash; for example, the minimal return of the S&P 500 stock index during year 1999 is -2.85%. This article introduces a new definition of stock market crash that is risk-management oriented; by hypothesis, a stock market crash is defined as being sudden, significant and brief. Sudden event. This corresponds to a price variation that is independent of the current volatility regime. It refers to a high-return shock during a period of low volatility and not to a small-return shock during a period of high volatility. Given the asymmetric nature of volatility, returns and volatility are negatively related in equity markets; this relation is more pronounced for large negative returns. When volatility is high, financial markets over-react to bad news (see e.g. Black (1976), Campbell and Hentschel (1992), Bekaert and Wu (1992) for the so-called "leverage effect" and "feedback effect" hypotheses); this over-reaction is characterized by large volumes of sell orders during stress periods, which exacerbate downside volatility (see more recently Gabaix, Gopikrishnan, Plerou and Stanley (2003) on the relationship between large fluctuations in prices, trading volume and the number of trades). In contrast, when volatility is low, bad news can drive markets into unexpected collapse; this was the case after the heart attack of U.S. President Eisenhower on September 26, 1955, with a one-day drop of 6.85% in the S&P 500 stock index. Therefore, contrary to common sense, crashes may happen when volatility is at its lowest.
Significant decline. This corresponds to a price variation whose magnitude is high (domestic crash) and which induces systemic risk throughout world financial markets, increasing stock market index correlation levels (international crash). Equity markets react not only to their domestic political and economic factors but also to trading pressures around the world. Such was the case on October 19, 1987, with a decline of 22.89%; this one-day drop is comparable to the percentage drops that occurred over October 28 and 29, 1929, with respective declines of 12.82% and 11.73%.
One-day horizon. Shiller (1988) notes that "The concentration of attention on 1987 as a unique year in stock market history is to some extent an artifact of the one-day interval chosen." But if the shock is sudden and significant, it is almost impossible to hedge a portfolio within a one-day horizon. In that sense, the one-day interval choice is no longer artificial. Choosing a longer period for identifying a crash remains possible from a macro-economic perspective, but it is no longer relevant from a risk management perspective because the crash is supposed to be sudden. Choosing a shorter period for identifying a crash is limited by the existence of circuit-breakers; for example, trigger levels for a market-wide trading halt are set at 10%, 20% and 30% of the Dow Jones index; as another example, in France, when the price movement of a share exceeds 10% from the quoted price at the close of the previous market day, quotation is suspended for 15 minutes. These trading curbs physically limit the possibility of intra-day crashes. In addition, trading pressures may induce several intra-day trading halts whose global effect will appear in the daily closing price. Therefore, we argue that it is almost impossible to hedge a portfolio against a crash within a time period of one day or less. This article develops a new definition of stock market crash. An augmented extreme value theory methodology is employed. In addition, an application to the French stock market index is provided, using the longest daily time series ever used for this country. This choice is motivated by a long history of crashes illustrated by the recent literature. The remainder of this article is organized as follows. Section 2 synthesizes the theoretical background of extreme value theory used in the empirical section. Section 3 presents an analysis of the data, the filtering process and its economic implications. Section 4 analyses the tail distributions to differentiate crashes from extremes.
Section 5 gives the conclusions.

Tail distribution
A theorem from Balkema and de Haan (1974) and Pickands (1975) shows that when the threshold u is sufficiently high, the distribution function F_u of the excesses beyond this threshold can be approximated by the Generalized Pareto Distribution (GPD). This limit distribution has the general form

G_{ξ,β}(x) = 1 − (1 + ξx/β)^(−1/ξ) if ξ ≠ 0, and G_{0,β}(x) = 1 − exp(−x/β) if ξ = 0,

where β ≥ 0, with x ≥ 0 when ξ ≥ 0 and 0 ≤ x ≤ −β/ξ when ξ < 0. β is a scaling parameter, ξ is the tail index and x is a random variable corresponding to stock index returns. The tail index is an indication of tail heaviness: the larger ξ, the heavier the tail.
Let us consider x_m, the return level that is exceeded on average once every m observations. The return level measure (Gumbel (1941)) highlights the effect of extrapolation, which is useful for forecasting, even if data scarcity produces large variance estimates. Let ζ_u be the probability of exceeding the threshold u. Return levels are expressed on an annual scale, so that the N-year return level is the level expected to be exceeded once every N years. With n_250 the average number of trading days per year and m = N × n_250, the N-year return level is

x_N = u + (β/ξ) [ (N n_250 ζ_u)^ξ − 1 ].
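As an illustration, the N-year return level can be computed directly from fitted GPD parameters. The sketch below uses a helper of our own (`return_level`) with placeholder parameter values, not the paper's estimates.

```python
import numpy as np

def return_level(u, beta, xi, zeta_u, N, n_per_year=250):
    """N-year return level implied by a GPD fit above threshold u.

    u       : threshold
    beta    : GPD scale parameter
    xi      : GPD tail index
    zeta_u  : empirical probability of exceeding u (n_u / n)
    N       : horizon in years
    """
    m = N * n_per_year                       # observations in N years
    if abs(xi) > 1e-12:
        return u + (beta / xi) * ((m * zeta_u) ** xi - 1.0)
    # xi -> 0 limit (exponential tail)
    return u + beta * np.log(m * zeta_u)

# Illustrative values only (not the paper's estimates):
x100 = return_level(u=1.38, beta=0.5, xi=0.14, zeta_u=0.0754, N=100)
```

The return level grows with the horizon N, and the growth is faster the larger the tail index ξ.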

Threshold selection
Threshold selection is subject to a trade-off between a high threshold, where the tail estimate has low bias but high variance, and a low threshold, where the tail estimate has high bias but low variance. The standard practice to balance bias and variance is to adopt the lowest possible threshold. There are two families of approaches for threshold selection: visual inspection and automatic selection. Visual selection denotes a plausible threshold choice based on the results of two or more plot methods. Adaptive selection denotes the application of an automated method for optimal threshold selection. We follow Gumbel (1958), Embrechts, Klüppelberg and Mikosch (1997) and Coles (2001), who suggest visual inspection methods for the observation of the tail region. Two approaches exist. The first corresponds to exploratory techniques carried out before model estimation. The second corresponds to assessing the stability of parameter estimates when fitting models across a range of thresholds. The mean residual life plot is the first visual method. The mean excess function is estimated by
e(u) = (1/n_u) Σ_{x_i > u} (x_i − u),

with n_u representing the number of exceedances over the threshold u. The mean residual life plot allows one to distinguish thin-tailed from heavy-tailed distributions, the latter being associated with a positive slope.
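The mean excess function underlying the mean residual life plot can be estimated in a few lines; the following sketch (the function name and threshold grid are our own) simply averages exceedances over a grid of candidate thresholds.

```python
import numpy as np

def mean_residual_life(x, thresholds):
    """Mean excess e(u) = average of (x - u) over observations exceeding u.

    Plotted against u, an approximately linear positive slope above some
    threshold suggests a heavy (GPD-type) tail.
    """
    x = np.asarray(x, dtype=float)
    out = []
    for u in thresholds:
        exc = x[x > u] - u
        out.append(exc.mean() if exc.size else np.nan)
    return np.array(out)
```

For an exponential (thin-tailed) sample the mean excess is flat in u, which is the benchmark against which a positive slope is judged.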
The threshold plot is the second visual inspection method that consists in fitting the GPD over a range of thresholds. This method allows observation of both the parameter estimate stability and variance.
The second approach denotes the application of an automated method that aims at minimizing the Asymptotic Mean Squared Error (AMSE; see Beirlant et al. (2004), Section 4.7 ii). For the optimal threshold selection method, consider an ordered sample of size n_u, X_{n_u} ≤ ... ≤ X_1, with X_{n_u} the n_u-th upper order statistic. The Hill (1975) estimator is defined by

ξ̂_H = (1/n_u) Σ_{i=1}^{n_u} (ln X_i − ln X_{n_u}).

It has been popular for the optimal threshold to be estimated such that the bias and variance of the estimated Hill tail index vanish at the same rate, where the mean squared error is asymptotically minimized. Usually, the AMSE is obtained through a sub-sample bootstrap procedure. This paper follows the approach of Beirlant et al. (2004), who develop a criterion for which the AMSE of the Hill estimator is minimal for the optimal number of observations in the tail.
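A minimal sketch of the Hill estimator in this descending-order notation (the helper name and the choice of k upper order statistics are illustrative):

```python
import numpy as np

def hill_estimator(x, k):
    """Hill (1975) estimate of the tail index from the k largest observations.

    x : positive sample (e.g. loss magnitudes)
    k : number of upper order statistics used
    Returns the estimated tail index xi (= 1/alpha for a Pareto(alpha) tail).
    """
    xs = np.sort(np.asarray(x, dtype=float))[::-1]   # descending order
    logs = np.log(xs[:k]) - np.log(xs[k])            # log-spacings above X_{k+1}
    return logs.mean()
```

In practice the estimate is computed over a range of k values (the Hill plot), and the AMSE-minimizing k selects the number of tail observations.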

Data description
The database consists of 10,017 daily stock prices spanning the period from September 30, 1968 to December 31, 2008. The French stock index known as the "Indice Général CAC" was recomputed after the 1987 stock market crash and renamed the "CAC 40". The closing price was set to 1000 on December 31, 1987. The longest data set available from NYSE-Euronext begins on January 5, 1962; however, from this date to September 13, 1968, the frequency of the stock index is weekly. For this reason, we choose to begin the study period on September 30, 1968 and end it with year 2008. We note the presence of stock index return autocorrelation at any given lag; in addition, the Q-statistics indicate the presence of strong heteroskedasticity. This study therefore makes use of filtered daily data in order to apply EVT techniques to independent and identically distributed observations, given that the fat-tailedness of returns stems from the fat-tailedness of the innovations.

Data filtering process
We examine all possible specifications within five lags: 25 specifications of ARMA(p,q) models with p = 1, ..., 5 and q = 1, ..., 5, in addition to 25 specifications of ARMA(p,q) + GARCH(1,1). We select the most parsimonious model. Four criteria are used for comparison: the log-likelihood value, the Akaike criterion, the autocorrelograms of residuals and squared residuals, and the ARCH effect test, with attention to the trade-off between parsimony and maximizing the criteria. We find that the ARMA(2,4) + GARCH(1,1) model produces the best fit. We then test an alternative model that allows for leverage effects by considering the contribution of the negative residuals to the ARCH effect. The ARMA(2,4) + TGARCH(1,1) model improves the fit on all the criteria considered. Define the market log-returns as {R_t}_{t=1,...,T} with T = 10,017 daily observations. The ARMA(2,4) + TGARCH(1,1) specification is then given as follows:

R_t = μ + φ_1 R_{t−1} + φ_2 R_{t−2} + ε_t + θ_1 ε_{t−1} + θ_2 ε_{t−2} + θ_3 ε_{t−3} + θ_4 ε_{t−4},

with the innovations ε_t being functions of Z_t and σ_t:

ε_t = σ_t Z_t,

where the standardized returns Z_t are independent and identically distributed, Z_t ~ iid F_Z, with F_Z an unknown distribution, and where the conditional variance follows

σ_t² = ω + α ε_{t−1}² + γ ε_{t−1}² I(ε_{t−1} < 0) + β σ_{t−1}²,

with I(.) the indicator function. The purpose of the time-varying σ_t is to capture as much of the conditional variance in the residual ε_t as possible, in order to leave Z_t approximately independent and identically distributed. As the MA(1) term is not statistically significant, we remove it from the model and set θ_1 = 0. The results of the maximum likelihood estimation of this model are displayed in Table 1. This model provides a very good fit according to the selected criteria; all the model's parameters are highly statistically significant. We therefore extract the standardized returns {Z_t}_{t=1,...,T} using a time-varying volatility model. Figure 1 shows the evolution of the CAC 40 stock index (1) prices, (2) volatility, (3) raw returns and (4) standardized returns. The discussion is hereafter restricted to the standardized return maxima +Z and minima −Z.
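The variance recursion of a TGARCH (GJR-type) filter and the extraction of standardized returns can be sketched as follows; the parameter values in the docstring are placeholders, not the estimates reported in Table 1, and the ARMA residuals are taken as given.

```python
import numpy as np

def tgarch_filter(eps, omega, alpha, gamma, beta):
    """GJR/TGARCH(1,1) variance recursion and standardized returns.

    sigma2_t = omega + alpha*eps_{t-1}^2
               + gamma*eps_{t-1}^2 * I(eps_{t-1} < 0)
               + beta*sigma2_{t-1}

    eps are the ARMA residuals; (omega, alpha, gamma, beta) would normally
    come from maximum likelihood estimation.
    """
    eps = np.asarray(eps, dtype=float)
    sigma2 = np.empty_like(eps)
    sigma2[0] = eps.var()                        # initialize at sample variance
    for t in range(1, len(eps)):
        lev = gamma * eps[t-1]**2 * (eps[t-1] < 0)   # leverage (asymmetry) term
        sigma2[t] = omega + alpha * eps[t-1]**2 + lev + beta * sigma2[t-1]
    z = eps / np.sqrt(sigma2)                    # standardized (devolatized) returns
    return z, np.sqrt(sigma2)
```

If the filter is well specified, the extracted Z_t series should be approximately iid with unit variance, which is the property the EVT analysis relies on.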

Crash event detection
The filtering process contributes to answering the question of how to identify a crash event from the past. From an economic perspective, it corresponds to a transformation of daily raw returns into daily standardized (devolatized) returns. This transformation helps to identify tail events independently of the associated volatility regime; it thus allows the disentanglement of a crash from another tail event whose magnitude may be amplified by a high level of volatility. More generally, according to our hypothesis, a stock market crash is required to be: Sudden. This means a price variation independent of the current volatility regime. It refers to a high-return shock during a period of low volatility and not to a small-return shock during a period of high volatility. As a consequence, crashes are supposed to happen when volatility is at its lowest. This property follows from the standardized returns Z_t = ε_t/σ_t being independent and identically distributed. Significant. This means a price variation whose magnitude is high. This magnitude effect can be captured by a jump in the volatility process. Indeed, asymmetric volatility is a striking phenomenon in equity markets; more precisely, the leverage effect characterizes a negative relation between past returns and conditional volatility. Therefore, a decline in realized returns will be followed by an asymmetric increase in conditional volatility. In addition, the volatility of stock price changes is directly related to the rate of flow of information. This jump volatility effect is captured by the jump in the conditional volatility process σ_t following the event. International crash: it induces a contagion effect throughout international financial markets, increasing stock market index correlation levels. A leading U.S. stock index such as the S&P 500 can be considered as a benchmark for the international correlation measure.
In multivariate GARCH models, the conditional correlations are derived indirectly from the ratio of the covariance to the product of the square roots of the conditional variances. However, the various multivariate GARCH specifications remain cumbersome. Engle (2002) proposes a Dynamic Conditional Correlation (DCC) model with a two-step procedure: in the first step, the GARCH variances are estimated univariately, and their parameter estimates remain constant in the next step; the second step parameterizes the conditional correlations directly and maximizes the log-likelihood function. Engle (2002) finds that the DCC model is often the most accurate of the multivariate GARCH model family. The contagion effect is measured by ρ_t, the time-varying conditional correlation between the conditional volatility changes of the domestic and U.S. markets. Domestic crash: one-day time period, ∀t ∈ [1, ..., T], δ_t = 1/252.
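The second (correlation) step of the DCC model can be sketched with the standard recursion Q_t = (1 − a − b) Q̄ + a z_{t−1} z'_{t−1} + b Q_{t−1}, ρ_t = q_{12,t} / √(q_{11,t} q_{22,t}); the a, b values below are placeholders rather than maximum likelihood estimates.

```python
import numpy as np

def dcc_correlation(z1, z2, a=0.05, b=0.93):
    """Time-varying correlation from the standardized residuals of two
    univariate GARCH models, via the DCC recursion of Engle (2002).

    Q_t   = (1 - a - b)*Q_bar + a*z_{t-1} z_{t-1}' + b*Q_{t-1}
    rho_t = q12_t / sqrt(q11_t * q22_t)
    """
    Z = np.column_stack([z1, z2])
    Qbar = np.cov(Z.T)                      # unconditional target matrix
    Q = Qbar.copy()
    rho = np.empty(len(Z))
    rho[0] = Qbar[0, 1] / np.sqrt(Qbar[0, 0] * Qbar[1, 1])
    for t in range(1, len(Z)):
        zz = np.outer(Z[t-1], Z[t-1])
        Q = (1 - a - b) * Qbar + a * zz + b * Q
        rho[t] = Q[0, 1] / np.sqrt(Q[0, 0] * Q[1, 1])
    return rho
```

Because each Q_t is a convex combination of positive semi-definite matrices, the implied ρ_t is automatically bounded in [-1, 1].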

Quantile regression
The reverse transformation, computing raw returns from standardized returns, is also very useful for understanding the economic meaning of the statistical inference drawn from EVT results. Indeed, many articles applying EVT to standardized returns leave the reader with few economic interpretations of the results extracted from the filtered series. Therefore, we need to transform the standardized returns into "equivalent" raw returns. As the recent literature, to our knowledge, does not propose any solution, this article proposes a linear transformation based on a semi-parametric technique that extends the ordinary least squares regression model to conditional quantiles. While the great majority of regression models are concerned with analyzing the conditional mean of a dependent variable (standard ordinary least squares), quantile regression (Koenker and Bassett (1978)) permits a more complete description of the conditional distribution. It can be used to measure the effect of covariates not only at the center of a distribution, but also in the upper and lower tails. Therefore, quantile regression is the ideal tool for estimating conditional quantiles of a response (raw returns) given a vector of covariates (standardized returns). As a consequence, one quantile regression is implemented for the left tail and another for the right tail. The choice of the percentile levels corresponds to the two thresholds selected in section 4.1. This complementary methodology is what we refer to as the augmented extreme value theory approach. Table 1 presents the descriptive statistics of the standardized log-returns. The Jarque-Bera statistics yield a strong rejection of the normality hypothesis. The comparison of empirical percentiles from 1% to 99% with those implied by the normal distribution shows a clear departure, which means that such realizations would have virtually no probability of occurring in a Gaussian framework.
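A minimal sketch of the quantile regression step, minimizing the Koenker-Bassett check function directly (in practice a dedicated linear-programming solver, as in standard statistical packages, is preferable):

```python
import numpy as np
from scipy.optimize import minimize

def quantile_regression(z, r, q):
    """Linear quantile regression of raw returns r on standardized returns z.

    Minimizes the check (pinball) function rho_q(v) = v * (q - 1{v<0})
    over intercept and slope.  A sketch; not a production solver.
    """
    z = np.asarray(z, dtype=float)
    r = np.asarray(r, dtype=float)

    def pinball(params):
        a, b = params
        v = r - (a + b * z)
        return np.sum(v * (q - (v < 0)))

    # OLS coefficients as starting values
    b0 = np.cov(z, r)[0, 1] / np.var(z)
    a0 = r.mean() - b0 * z.mean()
    res = minimize(pinball, x0=[a0, b0], method="Nelder-Mead")
    return res.x          # (intercept, slope) at quantile q
```

Fitting one such regression at the left-tail percentile and one at the right-tail percentile yields the linear maps used to convert standardized returns into "equivalent" raw returns.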
Note: the linear conditional quantile function can be estimated by solving β̂_{n_u}(q) = argmin_{β(q)} Σ_{i=1}^{n} ς_q(R_i − Z_i β(q)), where the check function, which weights positive and negative values asymmetrically for any quantile 0 < q < 1, is ς_q(v) = v (q − I(v < 0)), with I(.) denoting the indicator function. For the left tail (respectively the right tail), the intercept is -0.0047 (respectively 0.0033) and the slope coefficient is 0.0105 (respectively 0.0099). The parameters are all statistically significant at the 1% level. The adjusted R-squared value is 0.5715 (respectively 0.6082). The sum of squared errors between raw returns and "equivalent" raw returns is 9.15% (respectively 10.86%). Details are available upon request.

Descriptive statistics of filtered data
For 5, 10 and 20 lags, the Q-statistic is distributed as a χ²(5), χ²(10) and χ²(20) with 95% critical values of 11.07, 18.31 and 31.41. The correlogram for the filtered series shows no remaining dependence, as the Q-statistics for the series are lower than the critical values. Short-term serial dependence remains well within the 95% confidence interval. Engle's Lagrange multiplier (LM) test statistic shows no evidence of remaining ARCH effects at any lag.

Threshold selection
This article proposes a complete approach to threshold detection, mixing visual inspection and automatic selection. The mean residual life plot reveals a critical threshold around +1.0 for +Z and -1.50 for −Z. The threshold plot exposes a stability region of the tail index parameter between +1.0 and +2.0 for +Z, and between -0.50 and -1.50 for −Z. Optimal threshold selection yields estimates very close to the mean residual life plot. The optimal threshold is around +0.95 (the 84.80th percentile) for +Z and -1.38 (the 7.54th percentile, i.e., 1 − 0.9246) for −Z. This corresponds, respectively, to 1,522 and 755 upper order statistics out of 10,014 observations. The threshold values computed from the optimal algorithm satisfy the criteria of stability and sufficient exceedances with minimum variance. Table 2 summarizes the results of the threshold selection. Given the convergence of the three approaches, we retain the optimal threshold values in this study.

Tail event identification
The critical thresholds of +0.95 for +Z and -1.38 for −Z correspond to the entry points of the right and left tails of the standardized distribution. Applying the quantile regression technique allows us to find the "equivalent" thresholds in terms of raw returns: the threshold for the right tail becomes +1.28% (1,013 upper order statistics) and that for the left tail becomes -1.92% (444 upper order statistics). This means, approximately, that beyond +1.5% and below -2.0% the French CAC 40 stock index becomes extreme. This threshold selection is required for the GPD estimation. Table 3 displays the results for the GPD when considering the optimal threshold values. The maximum likelihood estimators of the GPD are the values of the two parameters (ξ, β) that maximize the log-likelihood. The tail index value of +0.1439 for −Z is statistically significant, in contrast to that of +Z. The positive sign confirms the presence of fat-tailedness in the lower tail; indeed, the larger the tail index, the more fat-tailed the distribution. This tail index value indicates that the CAC 40 standardized returns stem from a distribution with finite variance, skewness and kurtosis. The upper tail has a tail index close to zero, indicating moderate tail behavior belonging to the Gumbel-type domain of attraction. These results are fully consistent with the Q-Q plot. The scale parameters of +Z and −Z are statistically significant and of similar dispersion. Table 4 displays the return levels for both positive and negative standardized residuals with confidence intervals. We note, unsurprisingly, that the return levels for negative returns are higher than those for positive returns, which confirms the asymmetric nature of the distribution. The 100-year return level for the left tail corresponds to a value of -8.21% with an asymmetric confidence interval; the 95% confidence interval obtained from the profile log-likelihood is [-9.98%, -6.44%].
This corresponds to an "equivalent" raw return of -9.13% with a 95% confidence interval of [-10.99%, -7.26%]. Therefore, the maximum crash expected to be exceeded once every century is -10.99%. Table 6 reveals that the 1981-05-13 event is the biggest crash of the sample, with a standardized return of -11.66%; this is consistent with the results reported in Table 5. The next day, the volatility reached its highest level ever at 95%. The contagion effect remains relatively high despite its domestic origin. The 1991-08-19 event is the second biggest crash of the sample. Both can be visually identified in Figure 1 (lower right corner); the second crash does not appear in the raw returns graph (lower left corner), which means that relying on raw returns to identify crashes turns out to be misleading. These two first-order crashes correspond to a natural cut-off; in addition, they both have a political connotation. The 1989-10-16 event is the third biggest crash of the sample. These two events do not appear in Table 5. Table 6 gives an example of the over-representation of year 2008 in terms of lowest raw returns and highest volatilities: precisely, 21 (respectively 50) of the 100 worst raw returns (respectively highest volatilities) of the sample belong to year 2008. The 2001-09-11 event is the eleventh biggest crash; the U.S. market was closed on that date. The 1987-05-15 event is the twelfth biggest crash; this date corresponds to a precursor of the 19 October 1987 crash. The remaining negative standardized returns have a very low contagion effect (rank > 100); as a consequence, we can limit the number of crashes to 12 among 755 negative tail events.
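The maximum likelihood GPD fit to tail exceedances can be reproduced with standard tools; the sketch below uses scipy's genpareto with the location fixed at zero, applied to exceedances over a chosen threshold (for the left tail, losses −Z and a positive threshold would be passed instead).

```python
import numpy as np
from scipy.stats import genpareto

def fit_gpd_tail(z, u):
    """ML fit of the GPD to exceedances of z over threshold u.

    scipy's shape parameter `c` plays the role of the tail index xi;
    floc=0 fixes the location so only (xi, beta) are estimated.
    Returns (xi, beta, number of exceedances).
    """
    z = np.asarray(z, dtype=float)
    exc = z[z > u] - u                       # exceedances over the threshold
    xi, loc, beta = genpareto.fit(exc, floc=0)
    return xi, beta, len(exc)
```

A positive fitted ξ indicates a heavy (Fréchet-type) tail, while ξ close to zero points to the Gumbel-type domain of attraction discussed above.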

Result summary
1) The tail area begins from +1.5% for the right tail and -2.0% for the left tail.
2) The CAC 40 stock returns distribution has an asymmetric nature: left tail distribution has a General Pareto form and right tail distribution is exponential.
3) The maximum crash, expected to be exceeded once every century, is -10.99%.
4) Over the 40-year period, 12 crashes are identified (2 in 2007-2008), compared with 755 negative tail events; in addition, 2007-02-27 marks the beginning of the subprime crisis.
5) The magnitude of the recent banking crisis is very important in terms of raw returns because the French market experienced its highest level of volatility in 2008.

Conclusion
This article elaborates a new definition of stock market crash. The methodology is based on a combination of extreme value theory and quantile regression. An application to the French stock market is provided, using the longest daily time series ever used for this country. The general contribution is to set up a definition of stock market crash that is risk-management oriented. The empirical contributions are three-fold. First, among 50 possible candidates, the ARMA(2,4) + TGARCH(1,1) structure appears to offer the best fit for explaining the return-generating process of the French stock index. Second, both visual inspection techniques and recent automated threshold selection procedures are applied to identify the tail region of the standardized returns; this represents one of the most complete approaches to threshold selection. Third, a quantile regression technique is applied to convert standardized returns into "equivalent" raw returns in order to provide an economic interpretation of the empirical results from extreme value theory. Finally, the main policy-oriented conclusion is that crashes happen when volatility is at its lowest.

On 1979-10-06, the Federal Reserve announces an increase in the discount interest rate in order to slow the double-digit inflation rate. Banks raise their prime loan rate on October 9. The consequence is a small investors' panic that leads the Dow Jones Industrial Average to fall by 26.48 points that day. On 1979-10-10, the New York Stock Exchange records a volume of 81.6 million shares; this affects the Paris Stock Exchange in the same way. The peak of early January 1980 marks the onset of a recession in the US that spreads to Europe.
On 2007-02-27, the Federal Home Loan Mortgage Corporation (Freddie Mac) announces that it will no longer buy the most risky subprime mortgages and related mortgage securities. This government agency was not able to profit from diversification effects because of the legal constraints imposed on its investment choices, which explains its huge risk exposure during the credit market meltdown. This date marks the beginning of the subprime crisis. The date of 1986-05-26 has no correspondence in the French regulator's archive; it falls exactly one month after the explosion of the Chernobyl plant; note also that the Paris Bourse started to implement an electronic trading system in May 1986. U.S. stock markets were closed on this Memorial Day. The date of 2008-01-21 was characterized by a sharp decline in all non-U.S. equity markets due to successive pieces of negative information. U.S. stock markets were closed on this Martin Luther King Day. By Monday 21 January, the Société Générale bank had already discovered the total exposure of 50 billion euros built up by one of its traders, Jérôme Kerviel. After having informed the Banque de France and the French financial market regulator, the decision to liquidate the position was taken; the CAC 40 stock index plunged by 7.08% the same day. The next day, with no information about this event, the U.S. Fed decided to prevent a similar crash in its domestic market and cut interest rates sharply. Note, finally, the weak daily impact of Black Monday (1987-10-19) on the French market: it represents only the 36th lowest raw return (-4.76%) over the 40-year period. However, its impact is significant in terms of cumulative returns. In addition, during year 1987, the jump volatility ratio was highest on 1987-10-16, which can be interpreted as an alert signal. The U.S. stock markets were closed on 2001-09-11, the day of the World Trade Center attacks; they stayed closed for the rest of the week.
In conclusion, by excluding the tail events with low contagion effect (rank > 100), we identify 12 crashes among 755 negative tail events.