Nonstationarity and Differencing of Macroeconomic Data


Many macroeconomic time series increase over time, on average. Examples include gross national product, consumption, the money supply, and the S&P 500 index. Other macroeconomic series have the property that the "best guess" of tomorrow's value is today's value, or today's value plus some fixed constant. Variables that fit loosely within this class include prices, interest rates, and exchange rates. According to many statistical tests, gross national product, consumption, the money supply, and the S&P 500 index also fit within this class. Models of this kind are called "random walks" (or, when the fixed constant is present, random walks with drift).

Both cases cover a huge number of economic variables, including not only those listed above but also unemployment, inventory, sales, capital, and wages, to name but a few. In both cases, standard statistical procedures cannot be validly applied to the analysis of relationships between the variables when the variables are treated in levels form (i.e. when they are used as reported, without first being differenced or otherwise transformed). The reason is that the means and variances of such variables are not constant over time. Hence standard hypothesis tests, confidence intervals, and coefficients of determination (R squared values) are all invalidated. For example, if two variables can be characterized as random walks, then the R squared value describing the linear goodness of fit between them will frequently be large, and the usual t-tests will frequently indicate a significant relationship, EVEN if the variables actually have nothing to do with each other. This phenomenon is called SPURIOUS REGRESSION. It was pointed out in the 1970s by a famous economist (Clive W.J. Granger), and it probably accounts for the fact that prior to 1970 economists almost always found that any variables which they regressed on each other were almost perfectly linearly correlated. (A famous example is the very high correlation between alcohol consumption and the wages of university professors.)

This problem, and related problems, arise because many standard statistical procedures rely on an assumption of constant means and variances. Fortunately, the means and variances of the series being examined can usually be made constant by simply taking the first difference of each variable, that is, the change in the variable from one period to the next. Alternatively, the first difference of the logs of the variables can be used. This amounts to using (approximately) the growth rates of the variables instead of their levels, and is thus naturally interpretable as an economic quantity of interest.
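
The spurious regression problem and the differencing remedy can be illustrated with a small simulation. The following is only a minimal sketch, under stated assumptions: the two series are artificial random walks with independent standard normal shocks, and the function names (random_walk, r_squared, first_difference, log_difference, average_r_squared), the sample size, and the number of trials are all hypothetical choices made for illustration, not anything taken from a particular data set or library. It regresses one independent random walk on another, first in levels and then in first differences, and averages the R squared values over many trials.

```python
import math
import random


def random_walk(n, rng, drift=0.0):
    """Return n observations of y_t = y_{t-1} + drift + e_t, with e_t ~ N(0, 1)."""
    level = 0.0
    path = []
    for _ in range(n):
        level += drift + rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def r_squared(x, y):
    """R squared from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # With one regressor, R squared equals the squared sample correlation.
    return (sxy * sxy) / (sxx * syy)


def first_difference(series):
    """Change from one period to the next: y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


def log_difference(series):
    """Approximate growth rate: log(y_t) - log(y_{t-1}); needs positive values."""
    return [math.log(b) - math.log(a) for a, b in zip(series, series[1:])]


def average_r_squared(n_obs=200, n_trials=200, seed=0):
    """Average R squared in levels and in first differences.

    The two series in every trial are generated independently, so any
    apparent fit between them is spurious by construction.
    """
    rng = random.Random(seed)
    levels_total = 0.0
    diffs_total = 0.0
    for _ in range(n_trials):
        x = random_walk(n_obs, rng)
        y = random_walk(n_obs, rng)
        levels_total += r_squared(x, y)
        diffs_total += r_squared(first_difference(x), first_difference(y))
    return levels_total / n_trials, diffs_total / n_trials


if __name__ == "__main__":
    levels, diffs = average_r_squared()
    print(f"average R squared, levels:            {levels:.3f}")
    print(f"average R squared, first differences: {diffs:.3f}")
```

In runs of this kind, the average R squared in levels should come out far larger than the one in first differences, which should be close to zero, as one would expect for variables that have nothing to do with each other. The log_difference helper is included only to show the growth-rate transformation; it is not applied to the simulated walks because they can take negative values.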