On Small Sample Properties of Permutation Tests: A Significance Test for Regression Models∗
Hisashi Tanizaki
Graduate School of Economics, Kobe University ([email protected])
ABSTRACT In this paper, we consider a nonparametric permutation test on the correlation coefficient and apply it to a significance test on regression coefficients. Because the permutation test is very computer-intensive, there are few studies of its small-sample properties, although there are numerous studies of its asymptotic properties. We compare the permutation test with the t test through Monte Carlo experiments, taking up an independence test between two samples and a significance test for regression models. For both the independence and significance tests, the Monte Carlo results show that the nonparametric test performs better than the t test when the underlying population is not Gaussian and that the nonparametric test is as good as the t test even under a Gaussian population.
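As a rough illustration of the kind of permutation test on the correlation coefficient considered in this paper, the following Python sketch keeps one sample fixed, enumerates every ordering of the other, and reports the share of orderings whose correlation coefficient is at least as large as the observed one. The function names `correlation` and `permutation_pvalue`, the one-sided alternative of positive correlation, and the toy data are assumptions made for illustration only; they are not taken from the paper. Full enumeration of all n! orderings is feasible only for small n.

```python
import itertools
import math


def correlation(x, y):
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_pvalue(x, y):
    """Exact one-sided permutation p-value (assumed alternative: positive correlation).

    x is kept fixed, all n! orderings of y are enumerated, and the function
    returns the fraction of orderings whose correlation coefficient is at
    least as large as the one computed from the original pairing.
    """
    observed = correlation(x, y)
    count = 0
    total = 0
    for perm in itertools.permutations(y):
        total += 1
        # Small tolerance so that orderings tied with the observed value are counted.
        if correlation(x, perm) >= observed - 1e-12:
            count += 1
    return count / total


# Hypothetical toy data with n = 5, so 5! = 120 orderings are enumerated.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.5, 3.1, 5.4]
print(permutation_pvalue(x, y))
```

With these toy data only two of the 120 orderings give a correlation coefficient at least as large as the observed one, so the p-value is 2/120. For larger n, a random subset of permutations would normally replace full enumeration.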
∗ This research was partially supported by Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research (C) #18530158, 2006–2009.
1 Introduction
In regression models, we assume that the disturbance terms are mutually independently and identically distributed. In addition, when we perform a significance test on the regression coefficients, we assume that the error terms are normally distributed. Under these assumptions, it is known that the ordinary least squares (OLS) estimator of each regression coefficient, standardized by its estimated standard error, follows the t distribution with n − k degrees of freedom, where n and k denote the sample size and the number of regression coefficients. As the sample size n increases, the t distribution approaches the standard normal distribution N(0, 1). From the central limit theorem, it is known that the OLS estimator of the regression coefficient is approximately normally distributed for a sufficiently large sample size if the variance of the OLS estimator is finite. However, in the case where the error term is non-Gaussian and the sample size is small, the standardized OLS estimator does not follow the t distribution and therefore we cannot apply the t test.
To address this problem, in this paper we consider a significance test of the regression coefficient that covers the case where the error term is non-Gaussian and the sample size is small. A nonparametric test (or a distribution-free test) is discussed. Generally we can regard the OLS estimator of the regression coefficient as the correlation between two samples. The nonparametric tests based on Spearman’s rank correlation coefficient and Kendall’s rank correlation coefficient are well known. See, for example, Hollander and Wolfe (1973), Randles and Wolfe (1979), Conover (1980), Sprent (1989), Gibbons and Chakraborti (1992) and Hogg and Craig (1995) for the rank correlation tests. In this paper, the permutation test proposed by Fisher (1966) is utilized: we compute the correlation coefficient for every possible combination of the two samples, and all of these correlation coefficients are compared with the correlation coefficient based on the original data. This permutation test can be directly applied to the regression problem.
The outline of this paper is as follows. In Section 2, we introduce a nonparametric test based on the permutation test, where we consider testing whether X is correlated with Y for the sample size n. Moreover, we show that we can directly apply the correlation test to the regression problem without any modification. In Section 3, we compare the powers of the nonparametric tests and the conventional t test when the underlying data are non-Gaussian. In the case where k = 2, 3 is taken for the number of regr