A Note on the Power of Panel Cointegration Tests: An Application to Health Care Expenditure and GDP

The main reason for this change of interest is the extended use of panel data since the 1990s [2,3,13]. The advantages of and concerns about panel data are discussed in various papers [15,22]. One of the main issues raised by the use of panel data is the cointegration of non-stationary variables. Various papers have tried to address this issue, but both country-by-country and panel tests have produced inconclusive evidence [6,12,14,23].

The main goal of this paper is to show that the choice of the most powerful test depends on the empirical values of the statistics. To do so, we proceed as follows. We first verify the existence of a cointegrating vector between non-stationary health care expenditure and GDP, and then compare the power of three panel cointegration tests to assess which is the most powerful.
Country-by-country and panel stationarity tests and country-by-country and panel cointegration tests are performed on a panel of 20 OECD countries over the period 1971-2004. Residual-based tests and a cointegration rank test in the system of health care expenditure and GDP are used to test cointegration. The asymptotic normal distribution of these tests allows a straightforward comparison. For some values of the sample statistics, the residual-based tests and the cointegration rank test are not directly comparable because the power of the residual-based tests oscillates; for other values of the sample statistics, they are directly comparable and the rank test is more powerful than the residual-based tests.

Data Description
Health care expenditure (HE) and GDP are measured in national currencies and expressed in 2000 prices (deflated by the GDP deflator). Both variables are taken from the OECD Health Dataset [24]. The starting dataset contains a list of 30 OECD countries with annual data covering the period 1960-2005. Due to missing data, Czech Republic, France, Greece, Hungary, Italy, Korea, Mexico, Poland, Slovak Republic and Turkey have been excluded, and the remaining unbalanced panel dataset has a total of N=20 countries. The total number of observations is T=34 years. Both variables are expressed in logarithms and per capita. As graphical analysis for HE and GDP shows that both series contain a linear trend, this characteristic is incorporated both into the model specification and into the tests. Descriptions of the model specification and tests are provided in various papers [15,22,25-31]. Therefore, only results for unit root, stationarity and cointegration tests are presented.
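The transformation described above (deflating to constant prices, dividing by population, taking logarithms) can be sketched as follows. This is only an illustration: the function name `log_real_per_capita`, its parameters, and the convention that the deflator index equals 100 in the base year are assumptions, not details taken from the OECD dataset.

```python
import math


def log_real_per_capita(nominal, deflator, population, base=100.0):
    """Return the log of a real per-capita series value.

    nominal    -- value in national currency at current prices
    deflator   -- GDP deflator index (assumed to equal `base` in the base year)
    population -- number of persons
    """
    if nominal <= 0 or deflator <= 0 or population <= 0:
        raise ValueError("nominal, deflator and population must be positive")
    # Convert current prices to base-year prices, then to per-capita terms.
    real = nominal * base / deflator
    return math.log(real / population)
```

The same function would be applied to both HE and GDP for each country and year before any unit root or cointegration test is run.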

Results on Stationarity and Cointegration Tests
Results for country-by-country unit root test (labelled ADF test) and panel unit root test [30] are presented in Table 1.
For HE the unit root hypothesis is only rejected for Germany and Portugal at the 1%, 5% and 10% level, for Australia and Switzerland at the 5% and 10% level, and for Belgium at the 10% level. For GDP the unit root hypothesis can be rejected for three countries (Austria, Belgium and USA) at the 5% and 10% level, and for three countries (Denmark, Finland and Switzerland) at the 10% level. The panel results fail to reject the I(1) hypothesis for both HE and GDP. In order to check for possible multicollinearity, a country-by-country stationarity test [32] and a panel stationarity test [27] are also performed and reported in Table 2.
For HE the stationarity hypothesis is rejected for six countries (Australia, Austria, Luxembourg, Portugal, Spain, and Switzerland) at the 1%, 5% and 10% level, for five countries (Canada, Denmark, the Netherlands, Sweden and UK) at the 1% and 5% level, and for the rest at the 1% level. For GDP the stationarity hypothesis cannot be rejected for Ireland, while it is rejected for nine countries (Belgium, Canada, Denmark, Finland, Portugal, Sweden, Switzerland, UK and USA) at the 1%, 5% and 10% level, for four countries (Luxembourg, New Zealand, Norway, and Spain) at the 1% and 5% level, and for the rest at the 1% level. The panel results reject the hypothesis of stationary series for both HE and GDP.
Results for country-by-country tests of no cointegration and panel tests of no cointegration (labelled Engle-Granger procedure and IPS test for homogeneous panels and LLC test by [29] for heterogeneous panels, respectively) and for country-by-country tests of cointegration and panel tests of cointegration [28,33-35] are presented in Tables 3 and 4, respectively.
From Table 3, the hypothesis of no cointegration is rejected for ten countries at the 1%, 5% and 10% level, for USA at the 1% and 5% level, and for nine countries at the 1% level. The panel results fail to reject the no cointegration hypothesis.
From Table 4, the hypothesis of cointegration cannot be rejected for any country except Denmark, Japan, New Zealand and Sweden. The cointegrating rank is determined by the sequential likelihood ratio trace test procedure. Using tests at the 5% level, a rank r=1 is found for 16 countries, indicating that HE and GDP are cointegrated. For the remaining four countries the selected rank is r=0, which indicates that HE and GDP are not cointegrated for these countries. For the panel rank test the hypothesis that the largest rank in the panel is r=0 is rejected, but the hypothesis of a largest rank r=1 cannot be rejected.
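The sequential trace test procedure can be illustrated with a minimal sketch. It assumes the trace statistics and their critical values have already been computed elsewhere; the function name `select_rank` and the numbers in the usage example are hypothetical and are not taken from Table 4.

```python
def select_rank(trace_stats, critical_values):
    """Select the cointegrating rank by sequential trace testing.

    trace_stats[r] is the trace statistic for H0: rank <= r, and
    critical_values[r] is the matching critical value at the chosen level.
    The hypotheses are tested in the order r = 0, 1, ... and the procedure
    stops at the first r for which H0 is not rejected.
    """
    if len(trace_stats) != len(critical_values):
        raise ValueError("need one critical value per trace statistic")
    for r, (stat, crit) in enumerate(zip(trace_stats, critical_values)):
        if stat <= crit:
            return r  # H0: rank <= r is not rejected
    # Every hypothesis was rejected, so the system has full rank.
    return len(trace_stats)


# Hypothetical numbers for a bivariate (HE, GDP) system:
# r=0 is rejected (30.0 > 25.0), r<=1 is not rejected (5.0 <= 12.0).
print(select_rank([30.0, 5.0], [25.0, 12.0]))  # prints 1
```

In a two-variable system a selected rank of 1 corresponds to one cointegrating relation, a rank of 0 to no cointegration, and a rank of 2 to a system in which both series are stationary.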
According to the results in Tables 3 and 4, HE and GDP are cointegrated around linear trends for the sample of OECD countries.

Power of the Panel Tests of Cointegration
We assume that the estimated values of the three panel tests of cointegration reported at the bottom of Tables 3 and 4 (IPS, LLC and LLL tests) are the true values of the statistics associated with the data generating process (DGP). The fact that all tests are normally distributed allows comparisons.
We use the sampsi Stata command to draw the power function of the three tests. This command estimates the power of a test by comparing the characteristics of the DGP and the sample. Therefore, for each panel test of cointegration, the sampsi command tests whether the value of the sample statistic is equal to the value of the statistic associated with the DGP, given the level of significance of the test (α=0.05), the size of the population and sample, and the standard deviation.
When the value of the sample statistic is equal to the value of the statistic associated with the DGP, the test has the minimum power of 0.05. For values of the sample statistic different from the value associated with the DGP, the power of the test increases up to a maximum of 1.
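The shape of such a power function can be sketched with the textbook power formula for a two-sided z-test. This is not a reproduction of the sampsi computation: the function name `z_test_power`, the default values sd=1 and n=1, and the illustrative statistic values are all assumptions.

```python
from statistics import NormalDist


def z_test_power(sample_value, dgp_value, sd=1.0, n=1, alpha=0.05):
    """Power of a two-sided z-test of H0: value == dgp_value.

    The power is Phi(-z + d) + Phi(-z - d), where z is the two-sided
    critical value and d is the standardized distance between the
    sample value and the value associated with the DGP.
    """
    if sd <= 0 or n <= 0:
        raise ValueError("sd and n must be positive")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (sample_value - dgp_value) * n ** 0.5 / sd
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


# Illustrative values only: power is lowest when the two values coincide
# and rises towards 1 as they move apart.
for value in (-3.0, -4.0, -5.0, -8.0):
    print(value, round(z_test_power(value, -3.0), 3))
```

When the two values coincide the standardized distance is zero and the expression reduces to α, which matches the minimum power of 0.05 stated above.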
The power of the three tests is shown in Figure 1. The sample statistic takes values between -8.1 and 2.5. Panel tests are directly comparable only for certain values of the statistic. For values between -7.3 and -5.1 the residual-based tests and the rank test are not directly comparable, as the power of the residual-based tests oscillates. For values between -7.3 and -6.1, the IPS test is more powerful than the LLC test, and vice versa for values between -6.1 and -5.1. Over the interval between -5.0 and -3.75, power is not defined, while between -3.75 and -3 only the residual-based IPS test and the rank LLL test are directly comparable: the latter is more powerful than the former.
Note: Serial dependence in the disturbances is taken into account in the [27] test using a Newey-West estimator of the long-run variance.

Conclusion
This paper offers an alternative way to compare the power of panel tests of cointegration, based on the comparison between values of the sample statistics and statistics associated with the DGP. The choice of the most powerful test depends on the values of the sample statistics. Both the residual-based tests and the cointegration rank test are asymptotically normally distributed, which allows a straightforward comparison.
For some values of the statistics, the residual-based LLC test is more powerful than the IPS test, and vice versa for other values. For those values of the statistics at which the residual-based tests and the rank test are comparable, the LLL test is more powerful than the residual-based LLC test. Therefore, the choice of the most powerful test is not only an empirical matter but also an open issue without a clear-cut choice.