Testing Cointegration for Czech Stock Market

Based on a cointegration analysis of daily data on the most liquid Czech stocks from September 1, 1997 to February 28, 2007, a long-run equilibrium relationship is found between the stock prices of Komerční banka (KB), České energetické závody (CEZ) and Unipetrol (UNPE). The price series of these stocks have a unit root and are cointegrated. There is a unique combination of these stocks which is mean reverting and can be used for statistical arbitrage. However, exploiting this possibility raises a number of challenges. Investors should take into account the speed of mean reversion, the size of the variation, and the out-of-sample stability of this combination of stocks.


Introduction
Unit root and cointegration analysis is an important achievement in econometric theory that has become one of the central interests of economists and financial researchers in the last decade. On the financial market, while the price of an individual stock may fluctuate randomly, some combinations of several stocks may share a long-term equilibrium. Such behaviour can be used to realize a statistical arbitrage profit. Although this technique is applied around the world for various purposes, we have not seen it used for the Czech stock market so far. To fill this gap, the main objective of this paper is to determine the linkages between the prices of the most liquid stocks on the Czech stock market. Such linkages would enable a trading strategy that could lead to the mentioned arbitrage profit on the Czech stock market.
This paper is divided into three sections. In section one, a summary of cointegration analysis is given. In section two, a cointegration technique is applied to test the linkage between the prices of the most traded stocks on the Czech stock market, namely Telefónica O2 (previously named Český Telecom, abbreviated SPTT), Komerční banka (KB), České energetické závody (CEZ) and Unipetrol (UNPE). The results of the ADF unit root test of the chosen stock price series and of the cointegration test of the successful combination are shown in this section. In the last section, brief comments on the results of our analysis and some conclusions are presented.

Cointegration Analysis Summary
In the simple case of two time series $x_t$ and $y_t$ that are both integrated of order one (denoted $I(1)$, which means that the process contains a unit root), the definition is the following: $x_t$ and $y_t$ are said to be cointegrated if there exists a parameter $\alpha$ such that $u_t = y_t - \alpha x_t$ is a stationary process.
This turned out to be a breakthrough in the way of looking at time series, because many financial time series behave that way. The first thing to notice is that financial series behave like $I(1)$ processes, i.e. they seem to drift all over the place; the second, more important thing to notice is that they seem to drift in such a way that they do not drift away from each other. The reason unit roots and cointegration are so important is the following. Let us consider the regression
$$y_t = \alpha_0 + \alpha_1 x_t + u_t. \qquad (1)$$
Firstly, if $x_t$ is a random walk and $y_t$ is an independent random walk (so that $x_t$ is independent of $y_s$ for all $s$), then the true value of $\alpha_1$ is 0, but the OLS estimate $\hat\alpha_1$ does not converge to it; this is known as a spurious regression. If unit roots are found, one can instead estimate the relation in first differences,
$$\Delta y_t = \alpha_1 \Delta x_t + e_t, \qquad (2)$$
and since it is now a regression of a stationary variable on a stationary variable, the difficulties mentioned before are overcome and classical statistical theory can be applied.
Secondly, let us assume that $x_t$ is a random walk and $y_t$ is another random walk, such that (1) holds for a non-zero $\alpha_1$, but with the error term $u_t$ following a unit root process. In this case we still get inconsistent estimates and we need to estimate the relation (2). This may also be called a spurious regression, even though there actually is a relationship between $x_t$ and $y_t$.
Finally, let us consider the case where (1) holds with a stationary error term. This is exactly the case where $x$ and $y$ are cointegrated. In this case, $\hat\alpha_1$ is not only consistent but converges to the true value at rate $T$; the OLS estimator is said to be superconsistent. In the case where $x_t$ is a simple random walk and $u_t$ is serially uncorrelated we find that $T(\hat\alpha_1 - \alpha_1)$ is asymptotically distributed as
$$\frac{\int_0^1 B_2\,dB_1}{\int_0^1 B_2^2\,dt},$$
where $B_1$ and $B_2$ are independent Brownian motions.
This limiting distribution has a mean of zero, but more importantly the standard t-test is asymptotically normally distributed. In the situation where there is serial correlation, or where $x_t$ may be correlated with $u_s$ for some $s$, we may not get a symmetric asymptotic distribution of $T(\hat\alpha_1 - \alpha_1)$.
From now on we will only treat the CI(1,1) case, which will be referred to simply as cointegration. Most applications of cointegration methods treat that case, and limiting the development to it allows a much simpler presentation.
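To make the superconsistency of OLS in the cointegrated case more concrete, the following minimal sketch simulates regression (1) with a stationary error term and reports the average absolute estimation error of the slope for growing sample sizes. It is an illustration only: the true coefficient, the seed, the sample sizes and the helper names (`ols_slope`, `random_walk`, `slope_error`) are assumptions made for this example and are not taken from the paper.

```python
import random


def ols_slope(x, y):
    """Slope of the OLS regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def random_walk(n, rng):
    """Cumulative sum of standard normal shocks, started at zero."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def slope_error(n, alpha1, rng):
    """|alpha1_hat - alpha1| when y_t = alpha1 * x_t + u_t with white-noise u_t."""
    x = random_walk(n, rng)
    y = [alpha1 * xt + rng.gauss(0.0, 1.0) for xt in x]
    return abs(ols_slope(x, y) - alpha1)


rng = random.Random(0)   # arbitrary fixed seed (assumption)
ALPHA1 = 2.0             # hypothetical true coefficient
for n in (100, 1000, 10000):
    errors = [slope_error(n, ALPHA1, rng) for _ in range(20)]
    # Under superconsistency the error should shrink roughly like 1/n,
    # i.e. faster than the usual 1/sqrt(n) rate.
    print(n, sum(errors) / len(errors))
```

With a stationary regressor the error would be expected to fall by a factor of about three for each tenfold increase in the sample; here it should fall by roughly a factor of ten.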
For many purposes the above definition of cointegration is too limited. It is true that financial and economic series tend to move together, but in order to obtain a linear combination of the series that is stationary, one may have to include more variables.
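The general definition of cointegration for the $I(1)$ case is the following: a vector of $I(1)$ variables $y_t$ is said to be cointegrated if there exists a vector $\beta_i$ such that $\beta_i' y_t$ is stationary. If there are $r$ such linearly independent vectors, they are collected in the cointegrating matrix $\beta = (\beta_1, \dots, \beta_r)$. Each vector is identified only up to scale, since if $\beta_i' y_t$ is stationary then so is $c\beta_i' y_t$ for any non-zero constant $c$. This also implies that one can normalize one of the coefficients to one, but only in the case where one is willing to impose the a priori restriction that this coefficient is not zero. As far as the identification of the matrix $\beta$ is concerned, it is also clear that if $\beta$ is a cointegrating matrix, then $\beta F'$ is also a cointegrating matrix for any non-singular matrix $F$.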

Cointegration in the Autoregressive Representation
The general VAR(k) model can be written as:
$$\Delta y_t = \Pi y_{t-1} + \sum_{j=1}^{k-1} \Gamma_j \Delta y_{t-j} + e_t.$$
If $\Pi$ is equal to zero, there is no cointegration. This is the model that is implicit in the Box-Jenkins method. The variables may be $I(1)$, but this is easily cured by taking differences (in order to achieve the usual asymptotic distribution theory).
If $\Pi$ has full rank, then all $y_t$ must be stationary, since the left-hand side and the other right-hand side variables are stationary (we limit ourselves to variables that are either $I(0)$ or $I(1)$).
The most interesting case is when $\Pi$ has less than full rank but is not equal to zero. This is the case of cointegration. In this case $\Pi$ can be written as
$$\Pi = \alpha\beta',$$
where $\alpha$ and $\beta$ are $n \times r$ matrices. $\alpha$ and $\beta$ are only identified up to non-singular transformations, since $\Pi = \alpha\beta' = \alpha F^{-1}(\beta F')'$ for any non-singular $F$. This lack of identification can sometimes render results from multivariate cointegration analysis impossible to interpret, and finding a proper way of normalizing $\beta$ (and thereby $\alpha$) is often the hardest part of the work. $\alpha$ can be interpreted as a speed of adjustment towards equilibrium.
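As a worked illustration of the reduced-rank structure and of the identification problem, consider a hypothetical bivariate system with one cointegrating vector ($n = 2$, $r = 1$), normalized on the first variable. The symbols $\theta$, $\alpha_1$, $\alpha_2$ and $c$ are introduced only for this example.

```latex
\beta = \begin{pmatrix} 1 \\ -\theta \end{pmatrix}, \qquad
\alpha = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}, \qquad
\Pi = \alpha\beta' =
\begin{pmatrix} \alpha_1 & -\alpha_1\theta \\ \alpha_2 & -\alpha_2\theta \end{pmatrix},
\qquad \det \Pi = 0,
\\[6pt]
\Pi y_{t-1} = \alpha \,\bigl(y_{1,t-1} - \theta\, y_{2,t-1}\bigr), \qquad
\Pi = (\alpha / c)\,(c\,\beta)' \quad \text{for any scalar } c \neq 0 .
```

The matrix $\Pi$ has rank one, both equations react to the same lagged deviation $y_{1,t-1} - \theta y_{2,t-1}$ with speeds $\alpha_1$ and $\alpha_2$, and rescaling $\beta$ by $c$ while dividing $\alpha$ by $c$ leaves $\Pi$ unchanged. Fixing the first element of $\beta$ at one removes this indeterminacy, which is only legitimate if that coefficient is known to be non-zero.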

Cointegration in the Moving Average Representation
The multivariate Wold representation states that the stationary series $\Delta y_t$ can be written as
$$(1 - L)\,y_t = \Psi(L)\,e_t,$$
which, according to the Beveridge-Nelson decomposition, can be written as
$$y_t = \Psi(1)\,S_t + \Psi^*(L)\,e_t,$$
where $S_t = \sum_{s=1}^{t} e_s$ is the $n$-dimensional random walk and $\Psi^*(L) = (1 - L)^{-1}(\Psi(L) - \Psi(1))$. Now $\beta' y_t$ is stationary in the case of cointegration, which implies that $\beta'\Psi(1)$ is equal to 0. This gives another characterization of cointegration that is useful for testing.
One can show that, in the case of cointegration, this representation can be reformulated as
$$y_t = \tilde\Psi\,S_t^* + \Psi^*(L)\,e_t,$$
where $S_t^*$ is an $(n-r)$-dimensional random walk. This is called a common trend representation in Stock and Watson (1988), and it can also be used as the basis for cointegration tests.

The Engle-Granger Test
The best-known test, suggested by Engle and Granger (1987), is to run a static regression (after first having verified that $x_t$ and $y_t$ are both $I(1)$), $y_t = \theta' x_t + e_t$, where $x_t$ is one- or higher-dimensional. The asymptotic distribution of $\hat\theta$ is not standard, but the test suggested by Engle and Granger was to estimate $\theta$ by OLS and then test for a unit root in $\hat e_t = y_t - \hat\theta' x_t$. Since unit root tests test the null hypothesis of a unit root, most cointegration tests test the null of no cointegration. Unfortunately, the t-test, for example, does not have the limiting distribution tabulated by Dickey and Fuller (1979, 1981). The limiting distribution does, however, resemble the Dickey-Fuller distribution, even though a separate table is needed for each dimension of the regressor. Typically, we allow for dynamics in the residual and perform the equivalent of the ADF test (using the slightly different critical values that apply in this case). Such a procedure is usually called a cointegration ADF test, or CADF test for short. Engle and Granger compared different tests, recommended the CADF test, and supplied critical values based on Monte Carlo simulations for the case of just one regressor. These tables were later extended to the case of more than one regressor, and the most complete tables can be found in MacKinnon (1991).

If the $x_t$ series contains, or may contain, a trend, then one should be careful to include a trend in the cointegrating regression; otherwise the asymptotic critical values will be different. In the case of a one-dimensional $x_t$ that includes a deterministic trend, a regression of $y_t$ on $x_t$ that does not include the trend gives an asymptotically normal coefficient (this is not too surprising, since a deterministic trend always dominates a unit root trend).
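The two steps of the residual-based test can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in this paper: the data are simulated, the static regression includes an intercept, and the unit root regression on the residuals contains no lagged differences (a plain Dickey-Fuller rather than an augmented regression). The function names are hypothetical. The resulting statistic must be compared with cointegration critical values such as those tabulated by MacKinnon, not with the standard Dickey-Fuller tables.

```python
import math
import random


def ols(x, y):
    """Intercept and slope of the OLS regression of y on a single regressor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def df_tstat(e):
    """t-statistic of rho in  e_t - e_{t-1} = rho * e_{t-1} + error  (no lags)."""
    lagged = e[:-1]
    diff = [b - a for a, b in zip(e[:-1], e[1:])]
    sxx = sum(v * v for v in lagged)
    rho = sum(v * d for v, d in zip(lagged, diff)) / sxx
    resid = [d - rho * v for v, d in zip(lagged, diff)]
    s2 = sum(r * r for r in resid) / (len(diff) - 1)
    return rho / math.sqrt(s2 / sxx)


def engle_granger_stat(x, y):
    """Step 1: static regression of y on x. Step 2: unit root statistic of its residuals."""
    intercept, slope = ols(x, y)
    resid = [yt - intercept - slope * xt for xt, yt in zip(x, y)]
    return df_tstat(resid)


rng = random.Random(42)  # arbitrary seed (assumption)
x, level = [], 0.0
for _ in range(1000):
    level += rng.gauss(0.0, 1.0)
    x.append(level)

# Simulated cointegrated partner: y_t = 1 + 0.5 * x_t + u_t with stationary AR(1) u_t.
y, u = [], 0.0
for xt in x:
    u = 0.5 * u + rng.gauss(0.0, 1.0)
    y.append(1.0 + 0.5 * xt + u)

print("Engle-Granger statistic:", round(engle_granger_stat(x, y), 2))
```

A strongly negative statistic speaks against the null hypothesis of no cointegration. In applied work a tested library routine is preferable to this hand-written version; `statsmodels.tsa.stattools.coint`, for example, implements an augmented Engle-Granger test.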

Estimation of the Parameters in Case of Cointegration
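The efficient estimation of parameters in cointegrating relationships is quite a different issue from testing for cointegration. Engle and Granger suggested the following simple two-step estimator. First, we estimate the static cointegrating regression $y_t = \theta' x_t + e_t$ by OLS. Second, the lagged residual $\hat e_{t-1} = y_{t-1} - \hat\theta' x_{t-1}$ is used as the error correction term in a dynamic model for the differenced series, whose lag polynomials are then estimated by OLS. The fact that $\hat\theta$ is superconsistent implies that the parameters of the lag polynomials have the same (asymptotically normal) distribution as they would have if $\theta$ had been known.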

The Johansen ML Estimator
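The best way of testing for unit roots in a multivariate system is the system maximum likelihood estimator of Johansen (1988, 1990), which provides a test of cointegration restrictions in a VAR representation. This estimator also gives asymptotically efficient estimates of the cointegrating vectors (the $\beta$s) and of the adjustment parameters (the $\alpha$s). Johansen's method is the maximum likelihood estimator of the so-called reduced rank model. We start with the AR(k) model
$$y_t = \mu + A_1 y_{t-1} + \dots + A_k y_{t-k} + e_t,$$
which under the assumption of cointegration of rank $r$ can be written as
$$\Delta y_t = \mu + \Gamma_1 \Delta y_{t-1} + \dots + \Gamma_{k-1} \Delta y_{t-k+1} + \alpha\beta' y_{t-k} + e_t,$$
where $y_t$ is $p$-dimensional and $\alpha$ and $\beta$ are $p \times r$ matrices. Let $Z_{0t} = \Delta y_t$, let $Z_{1t}$ collect $\Delta y_{t-1}, \dots, \Delta y_{t-k+1}$ and the constant, and let $Z_{kt} = y_{t-k}$. Denote by $R_{0t}$ and $R_{kt}$ the residuals from regressing $Z_{0t}$ and $Z_{kt}$, respectively, on $Z_{1t}$, and define the product moment matrices $S_{ij} = T^{-1}\sum_{t=1}^{T} R_{it}R_{jt}'$ for $i, j \in \{0, k\}$. The maximum likelihood estimator of $\alpha$ and $\beta$ is a function of these residuals.
Johansen shows that $\hat\beta$ can be found by choosing the eigenvectors $(\hat v_1, \dots, \hat v_r)$ corresponding to the $r$ largest eigenvalues of
$$|\lambda S_{kk} - S_{k0}S_{00}^{-1}S_{0k}| = 0.$$
In order to find those eigenvalues, left- and right-multiply the equation above by $S_{kk}^{-1/2}$ to get the equivalent problem
$$|\lambda I - S_{kk}^{-1/2}S_{k0}S_{00}^{-1}S_{0k}S_{kk}^{-1/2}| = 0.$$
The eigenvalues will be the ones that we are looking for. The eigenvectors (say $u_i$) are normalized such that $u_i'u_i = 1$, so we get $\hat v_i = S_{kk}^{-1/2}u_i$.
To give some interpretation of this equation, remember that the least squares estimate of $\Pi$ can be obtained by regressing $R_{0t}$ on $R_{kt}$, so that $\hat\Pi = S_{0k}S_{kk}^{-1}$. Now we note that
$$S_{kk}^{-1/2}S_{k0}S_{00}^{-1}S_{0k}S_{kk}^{-1/2} = (S_{00}^{-1/2}\hat\Pi S_{kk}^{1/2})'(S_{00}^{-1/2}\hat\Pi S_{kk}^{1/2}).$$
The intuitively natural approach would be to consider the eigenvalues of $\hat\Pi'\hat\Pi$, and we can see that this is actually what the maximum likelihood algorithm does, apart from the fact that $\hat\Pi$ has been normalized by left-multiplying by $S_{00}^{-1/2}$ and right-multiplying by $S_{kk}^{1/2}$. Since the estimated eigenvalues are continuous random variables, they are different from zero and from each other with probability 1. It is intuitively clear that we want the eigenvectors corresponding to the non-zero eigenvalues to be the estimators of the cointegrating vectors.
The maximized likelihood function under the hypothesis of $r$ cointegrating vectors satisfies $L_{\max}^{-2/T} \propto |S_{00}|\prod_{i=1}^{r}(1 - \hat\lambda_i)$. We can see that this is a function of the estimated eigenvalues where all the eigenvalues except the $r$ largest are set equal to zero. So, for example, the test for one cointegrating vector against no cointegrating vector consists of testing whether the largest eigenvalue is significantly different from 0. Johansen further finds that the likelihood ratio test statistic for the hypothesis of at most $r$ cointegrating vectors against the unrestricted model where $\Pi$ has full rank $p$ is
$$-2\ln Q = -T\sum_{i=r+1}^{p}\ln(1 - \hat\lambda_i).$$
Note that the null hypothesis here is that there are $(p - r)$ unit roots. This corresponds to the simple residual-based test described previously: there we have $p = 2$ (if the $x$ variable is one-dimensional) and test for one cointegrating relation, so the null is that there are 2 unit roots. This test statistic is often referred to as the trace statistic. Note that this statistic is expected to be close to 0 if there are at most $r$ (linearly independent) cointegrating vectors. Another test that is often used is the $\lambda$-max test, which looks at $-T\ln(1 - \hat\lambda_{r+1})$, the idea being that if the $(r+1)$th eigenvalue can be accepted to be zero, then all the smaller eigenvalues can be as well. This test is a test of $r+1$ cointegrating vectors against $r$ cointegrating vectors.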
The asymptotic distribution of the likelihood ratio test is a functional of multivariate Brownian motion and is tabulated for different values of $p$. Of course, sometimes we do not really want to test whether there are, say, 3 cointegrating vectors against no cointegrating vectors; rather, we want to decide what the number of cointegrating vectors is. In the situation where we directly want to test $r+1$ cointegrating vectors against $r$ cointegrating vectors, we could use the $\lambda$-max test, but this test does not give us a consistent way of deciding the cointegration rank. A consistent way to do this, using the trace test, is to start by testing for zero cointegrating vectors. If we reject zero cointegrating vectors, we then test for at most 1 cointegrating vector. If this is not rejected, we stop and decide that $r = 1$; if we reject it, we move on until we can no longer reject, and stop there.
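The sequential trace procedure described above can be sketched in a few lines. The eigenvalues, the sample size and the critical values below are made-up placeholders chosen for illustration; they are not results from this paper, and real critical values depend on the deterministic terms in the model and must be taken from published tables. The function names are hypothetical.

```python
import math


def trace_stats(eigenvalues, n_obs):
    """Trace statistic -T * sum_{i > r} ln(1 - lambda_i) for r = 0, ..., p - 1."""
    lam = sorted(eigenvalues, reverse=True)
    return [-n_obs * sum(math.log(1.0 - l) for l in lam[r:])
            for r in range(len(lam))]


def max_eig_stats(eigenvalues, n_obs):
    """Maximum eigenvalue statistic -T * ln(1 - lambda_{r+1}) for r = 0, ..., p - 1."""
    lam = sorted(eigenvalues, reverse=True)
    return [-n_obs * math.log(1.0 - l) for l in lam]


def select_rank(stats, critical_values):
    """Sequential rule: return the first r whose null hypothesis is not rejected."""
    for r, (stat, crit) in enumerate(zip(stats, critical_values)):
        if stat <= crit:
            return r
    return len(stats)  # every null rejected: full rank


eigs = [0.30, 0.10, 0.01]   # hypothetical estimated eigenvalues, p = 3
crit = [30.0, 15.0, 4.0]    # placeholder critical values for r = 0, 1, 2
traces = trace_stats(eigs, n_obs=100)

print([round(t, 2) for t in traces])
print("selected rank:", select_rank(traces, crit))
```

In this made-up example the hypothesis of no cointegrating vector is rejected, the hypothesis of at most one is not, and the procedure stops at $r = 1$.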
Even though there is a constant in the error correction representation, this may not translate into a deterministic trend in $y_t$. Johansen (1991) derived the likelihood ratio test for reduced rank in the case where there is a constant in the error correction model but no trend in $y_t$. Johansen also discusses how to obtain a consistent test for the number of stochastic trends and for a trend in $y_t$ at the same time: in this case, all we have to do is move the vector of ones into $Z_{kt}$ and delete it from $Z_{1t}$.

Application of Cointegration Technique on Czech Stocks
In this section, cointegration is tested on the most liquid Czech stocks: Telefónica O2 Česká republika (SPTT), České energetické závody (CEZ), Komerční banka (KB) and Unipetrol (UNPE). The daily data set of SPTT, CEZ, KB and UNPE from September 1, 1997 to February 28, 2007 was obtained from Reuters Česká republika. Besides liquidity, another reason for choosing these stocks is the existence of the long time series needed for the analysis. The whole data set includes 2384 daily observations of closing prices adjusted for dividends and splits. The development of the stock prices over time is shown in Fig. 1 and their basic descriptive statistics are in Tab. 1.
Although the ADF test and the cointegration test can be carried out directly on the price series, they were actually performed on the log price series in the statistical package EViews 5, as recommended by Oomen (2006), in order to stabilize the behaviour of the series. Taking logarithms only reduces the scale of the original series; it does not change their character. The null hypothesis of the ADF test is always that the time series has a unit root. The results of the ADF test are shown in Tab. 2 to Tab. 5. They show that the null hypothesis of a unit root cannot be rejected at the 5% significance level for any of the series (the p-values are always higher). It is also worth noticing that, according to the p-values in these tables, the first differences of the series do not have a unit root. All the stock price series of interest are therefore $I(1)$ processes and might be cointegrated.
Cointegration was tested by Johansen's procedure for all combinations of 2 to 4 of these stocks (11 combinations can be formed from the 4 stocks), and only one combination, consisting of CEZ, KB and UNPE, was found to be cointegrated. The results of the cointegration test for this combination are shown in Tab. 6 to Tab. 10. Both the trace test and the maximum eigenvalue test of Johansen's procedure detect cointegration in the price series of CEZ, KB and UNPE at the 5% significance level, so the null hypothesis of no cointegration between them can be rejected. A combination consisting of -3.722893 units of CEZ stock, 1.780315 units of KB stock and 4.155830 units of UNPE stock can be formed; its behaviour over time will be mean reverting and can be used for arbitrage purposes.
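The count of 11 candidate combinations can be checked by enumerating all subsets of two, three and four of the four tickers. The snippet below is only a small illustration of that enumeration; in the actual analysis each of these groups would be passed to the cointegration test.

```python
from itertools import combinations

tickers = ["SPTT", "CEZ", "KB", "UNPE"]

# All groups of 2, 3 and 4 stocks: C(4,2) + C(4,3) + C(4,4) = 6 + 4 + 1.
groups = [group
          for size in range(2, len(tickers) + 1)
          for group in combinations(tickers, size)]

print(len(groups))
for group in groups:
    print(", ".join(group))
```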

Conclusion
On the financial market, the price of an individual stock may fluctuate randomly, as the efficient market hypothesis postulates. A certain combination of stocks, however, may share a long-term equilibrium when the stocks are cointegrated. Such a relationship can be used to realize a statistical arbitrage profit. Cointegration analysis is a powerful tool for determining the linkages between the prices of the most liquid stocks on the Czech stock market, and it was used here to detect a long-term equilibrium among them. For this purpose, the most liquid Czech stocks with long time series were chosen: Telefónica O2 (previously named Český Telecom, abbreviated SPTT), Komerční banka (KB), České energetické závody (CEZ) and Unipetrol (UNPE).

To find the cointegration relationship among them, all the stock price series were first tested for unit roots by the ADF test to determine whether they are $I(1)$ processes. The null hypothesis of a unit root could not be rejected for any of the chosen series. After that, all combinations of 2 to 4 of the chosen stocks were formed (11 in all) and Johansen's cointegration test was carried out on them. Cointegration among the stock price series of CEZ, KB and UNPE was detected by both the trace test and the maximum eigenvalue test at the 5% significance level. This result enables us to state that even though the stock price series of CEZ, KB and UNPE are $I(1)$ processes and may fluctuate randomly, their combination is, according to the test, an $I(0)$ process which is mean reverting and can be used for an arbitrage trading strategy. This combination is the following: MIX = -3.722893 CEZ + 1.780315 KB + 4.155830 UNPE, where CEZ, KB and UNPE are the prices of these stocks as stated before. To make the long-term equilibrium equation easier to read, the term -3.722893 CEZ can be moved from the right-hand side to the left-hand side, which gives: MIX + 3.722893 CEZ = 1.780315 KB + 4.155830 UNPE.
The long-term equilibrium value of the combination MIX should be 0. If the value of 1.780315 KB + 4.155830 UNPE is higher than the value of 3.722893 CEZ, then the value of MIX is positive and is expected to fall back to zero in the future. Therefore, we buy 3.722893 units of CEZ stock and sell 1.780315 units of KB and 4.155830 units of UNPE, and an arbitrage profit can be made when MIX reverts. If the value of MIX is negative now, it is expected to increase to zero later, and we should buy 1.780315 KB + 4.155830 UNPE and sell 3.722893 CEZ, provided the original pattern of the individual stocks' behaviour remains unchanged over time. The behaviour of the series MIX over time is shown in Figure 2. A long period from the 500th to the 1850th trading day, during which the value of the portfolio fluctuated around zero, can be seen in the figure.
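The trading rule just described can be written down as a short sketch. The coefficients are the ones reported in the text; everything else is an assumption made for illustration: the input levels are invented numbers, the function names are hypothetical, and the optional no-trade band (a simple way to allow for transaction costs) is not part of the paper. The inputs should be on the same scale as the series used in the cointegration test.

```python
# Cointegrating coefficients as reported in the text.
COEFFS = {"CEZ": -3.722893, "KB": 1.780315, "UNPE": 4.155830}


def mix_value(levels):
    """Value of the combination MIX for a dict of series levels keyed by ticker."""
    return sum(COEFFS[name] * levels[name] for name in COEFFS)


def signal(mix, band=0.0):
    """Position implied by the sign of MIX; `band` is a hypothetical no-trade zone."""
    if mix > band:
        # MIX is expected to fall towards zero: short the combination.
        return "buy CEZ, sell KB and UNPE"
    if mix < -band:
        # MIX is expected to rise towards zero: go long the combination.
        return "sell CEZ, buy KB and UNPE"
    return "no trade"


levels = {"CEZ": 10.0, "KB": 12.0, "UNPE": 4.0}  # invented example values
mix = mix_value(levels)
print(round(mix, 5), "->", signal(mix))
print(signal(mix, band=1.0))
```

With the invented levels above MIX is positive but smaller than one, so the rule trades without a band and stays out of the market once a band of 1.0 is required.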
To exploit the mean reverting behaviour of the combination MIX effectively, some challenges have to be dealt with. First of all, it is more useful for investors if the series reverts to its mean as quickly as possible, because the trading strategy can then be realized sooner. Secondly, the variation of the combined series should be large enough for the returns on the trades to cover all transaction and information costs. The last thing to be mentioned is the stability of this pattern in the future. We have only investigated the in-sample behaviour of the stock prices of interest. They may not follow this pattern in the future, in which case it would be of no use to investors.
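One common way to quantify the first of these challenges, the speed of mean reversion, is the half-life of a deviation implied by a first-order autoregression of the combined series. This diagnostic is not computed in the paper; the sketch below applies it to a simulated series with a hypothetical autoregressive coefficient, and the function names are assumptions made for the example.

```python
import math
import random


def ar1_coefficient(z):
    """OLS slope of z_t on z_{t-1} after removing the sample mean."""
    mean = sum(z) / len(z)
    d = [v - mean for v in z]
    return sum(a * b for a, b in zip(d[:-1], d[1:])) / sum(a * a for a in d[:-1])


def half_life(phi):
    """Number of periods after which half of a deviation is expected to remain."""
    if not 0.0 < phi < 1.0:
        raise ValueError("half-life is only defined here for 0 < phi < 1")
    return math.log(0.5) / math.log(phi)


rng = random.Random(7)   # arbitrary seed (assumption)
PHI = 0.95               # hypothetical persistence of the simulated spread
z, level = [], 0.0
for _ in range(5000):
    level = PHI * level + rng.gauss(0.0, 1.0)
    z.append(level)

phi_hat = ar1_coefficient(z)
print("estimated phi:", round(phi_hat, 3))
print("half-life in periods:", round(half_life(phi_hat), 1))
```

A coefficient close to one means a long half-life, so positions may have to be held for many trading days before the deviation closes.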

The support from the Czech Science Foundation under the Grant 402/03/H057 is gratefully acknowledged.
* Ing. Tran Van Quang, research fellow; Center for Research in Economic Dynamics and Econometrics, Faculty of Finance and Accounting, University of Economics, Prague, nám. W. Churchilla 4, 130 67 Prague 3, Czech Republic; <tran@vse.cz>.

Figure 1: The course of stock prices over time

Figure 2: Behaviour of the combination of CEZ, KB and UNPE over the course of time

Table 3: Summary of the ADF unit root test for KB
*MacKinnon (1996) one-sided p-values.