Chapter 1
Regression Models

1.1 Introduction

Regression models form the core of the discipline of econometrics. Although econometricians routinely estimate a wide variety of statistical models, using many different types of data, the vast majority of these are either regression models or close relatives of them. In this chapter, we introduce the concept of a regression model, discuss several varieties of them, and introduce the estimation method that is most commonly used with regression models, namely, least squares. This estimation method is derived by using the method of moments, which is a very general principle of estimation that has many applications in econometrics.

The most elementary type of regression model is the simple linear regression model, which can be expressed by the following equation:

$$y_t = \beta_1 + \beta_2 X_t + u_t. \tag{1.01}$$
The subscript $t$ is used to index the observations of a sample. The total number of observations, also called the sample size, will be denoted by $n$. Thus, for a sample of size $n$, the subscript $t$ runs from 1 to $n$. Each observation comprises an observation on a dependent variable, written as $y_t$ for observation $t$, and an observation on a single explanatory variable, or independent variable, written as $X_t$. The relation (1.01) links the observations on the dependent and the explanatory variables for each observation in terms of two unknown parameters, $\beta_1$ and $\beta_2$, and an unobserved error term, $u_t$. Thus, of the five quantities that appear in (1.01), two, $y_t$ and $X_t$, are observed, and three, $\beta_1$, $\beta_2$, and $u_t$, are not. Three of them, $y_t$, $X_t$, and $u_t$, are specific to observation $t$, while the other two, the parameters, are common to all $n$ observations.

Here is a simple example of how a regression model like (1.01) could arise in economics. Suppose that the index $t$ is a time index, as the notation suggests. Each value of $t$ could represent a year, for instance. Then $y_t$ could be household consumption as measured in year $t$, and $X_t$ could be measured disposable income of households in the same year. In that case, (1.01) would represent what in elementary macroeconomics is called a consumption function.

Copyright © 1999, Russell Davidson and James G. MacKinnon
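To make the roles of the five quantities in (1.01) concrete, the following is a minimal sketch that generates artificial data from a consumption function of this form. Everything numerical in it is an assumption made purely for illustration and does not come from the text: the function name `simulate_consumption`, the parameter values, the uniform range used for income, and the choice of normally distributed errors with mean zero.

```python
import random


def simulate_consumption(n, beta1, beta2, sigma, seed=0):
    """Generate n observations from y_t = beta1 + beta2 * X_t + u_t.

    All distributional choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # X_t: disposable income in year t (hypothetical units and range).
    X = [rng.uniform(20.0, 60.0) for _ in range(n)]
    # u_t: error terms, assumed here to be normal with mean zero.
    u = [rng.gauss(0.0, sigma) for _ in range(n)]
    # y_t: consumption, built exactly as in equation (1.01).
    y = [beta1 + beta2 * X_t + u_t for X_t, u_t in zip(X, u)]
    return X, y, u


# Hypothetical parameter values: autonomous consumption 5.0 and a
# coefficient on income of 0.8.
X, y, u = simulate_consumption(n=10, beta1=5.0, beta2=0.8, sigma=1.0)

# Only X_t and y_t would be visible to an investigator using real data.
for t, (X_t, y_t) in enumerate(zip(X, y), start=1):
    print(f"t={t:2d}  X_t={X_t:6.2f}  y_t={y_t:6.2f}")
```

In the simulation we know the parameters and the errors because we chose them. With real data, only the pairs $(X_t, y_t)$ are available, which is why the three remaining quantities in (1.01) are described as unobserved.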
If for the moment we ignore the presence of the error terms, $\beta_2$ is the marginal propensity to consume out of disposable income, and $\beta_1$ is what is sometimes called autonomous consumption. As is true of a great many econometric models, the parameters in this example can be seen to have a direct interpretation in terms of economic theory. The variables, income and consumption, do indeed vary in value from year to year, as the term “variables” suggests. In contrast, the parameters reflect aspects of the economy that do not vary, but take on the same values each year.

The purpose of formulating the model (1.01) is to try to explain the observed values of the dependent variable in terms of those of the explanatory variable. According to (1.01), for each $t$, the value of $y_t$ is given by a linear function of $X_t$, plus what we have called the error term, $u_t$. The linear (strictly speaking, affine¹) function, which in this case is $\beta_1 + \beta_2 X_t$, is called the regression function.

At this stage we should note that, as long as we say nothing about the unobserved quantity $u_t$, (1.01) does not tell us anything. In fact, we can allow the parameters $\beta_1$ and $\beta_2$ to be quite arbitrary, since, for any given $\beta_1$ and $\beta_2$, (1.01) can always be made to be true by defining $u_t$ suitably.

If we wish to make sense of the regression model (1.01), then, we must make some assumptions about the properties of the error term $u_t$. Precisely what those assumptions are will vary from case to case. In all cases, though, it is assumed that $u_t$ is a random variable. Most commonly, it is assumed that, whatever the value of $X_t$, the expectation