Nonstationarity and Differencing of Macroeconomic Data
Many macroeconomic time series have the property that they are increasing over time, on average. Examples
include gross national product, consumption, money supply, the S&P500 index, and so on.
Other macroeconomic series have the property that a "best guess" as to tomorrow's
value is often today's value, or that tomorrow's value is today's value plus some fixed constant.
Examples of variables that fit loosely within this class include prices, interest rates, exchange rates, and so on.
It also turns out that according to many statistical tests,
gross national product, consumption, money supply, and the S&P500 index also fit within this class.
This class of models is called "random walks" (or "random walks with drift" when the fixed constant is not zero).
In both of the above cases (which cover a huge number of economic variables, including not only those
listed above, but also unemployment, inventory, sales, capital, and wages, to name but a few)
standard statistical procedures cannot be validly applied to the analysis
of the relationships between the economic variables when the variables are treated in
levels form (i.e. when the variables are used as reported, without first differencing, etc.).
This situation arises because the means and variances of these types of variables are not constant over time.
Hence standard hypothesis testing, confidence interval construction, and analysis of coefficients
of determination (R squared values) are all invalidated.
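
To see what "not constant over time" means here, the following is a minimal sketch in Python (standard library only).
It simulates many independent random walks with drift and compares the mean and variance across the simulated paths
at two dates. The drift of 0.1, the unit shock variance, the seed, and all function and variable names are
illustrative assumptions, not values taken from any actual data set.

```python
import random
import statistics

def simulate_random_walk(n_steps, drift, sigma, rng):
    """Return [y_1, ..., y_n] where y_t = y_{t-1} + drift + e_t and y_0 = 0."""
    path, level = [], 0.0
    for _ in range(n_steps):
        level += drift + rng.gauss(0.0, sigma)  # e_t ~ N(0, sigma^2)
        path.append(level)
    return path

rng = random.Random(0)           # fixed seed so the run is repeatable
n_paths, n_steps = 2000, 200     # illustrative sizes
paths = [simulate_random_walk(n_steps, drift=0.1, sigma=1.0, rng=rng)
         for _ in range(n_paths)]

def cross_section(t):
    """Mean and variance across all simulated paths at date t (1-indexed)."""
    values = [p[t - 1] for p in paths]
    return statistics.mean(values), statistics.variance(values)

mean_50, var_50 = cross_section(50)
mean_200, var_200 = cross_section(200)

# In theory the mean at date t is drift * t and the variance is sigma^2 * t,
# so both should be roughly four times larger at t = 200 than at t = 50.
print(f"t = 50:  mean = {mean_50:.2f}, variance = {var_50:.2f}")
print(f"t = 200: mean = {mean_200:.2f}, variance = {var_200:.2f}")
```

Both the mean and the variance keep growing with t, which is exactly the property that standard procedures
assume away.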
For example, it turns out that if two variables can be characterized as random walks, then the R squared value describing the
linear goodness of fit between the two variables can be very high, and the usual t-test will tend to report a significant relationship, EVEN if the variables actually
have nothing to do with each other. This phenomenon is called SPURIOUS REGRESSION, and
it was pointed out in the 1970s by a famous economist (Clive W.J. Granger) that this
probably accounts for the fact that prior to 1970, economists almost always found that
any variables which they regressed on each other were almost perfectly linearly correlated
(e.g. the famous case of the very high correlation between alcohol consumption and
the wages of university professors).
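
The spurious regression problem can be illustrated with a small simulation. The sketch below (Python, standard
library only) repeatedly generates two random walks that are independent by construction, regresses one on the
other, and records how often the usual t-test on the slope rejects "no relationship" at the nominal 5% level.
It then does the same after first differencing both series. The sample size, number of replications, seed, and
helper names are all illustrative assumptions.

```python
import math
import random

def random_walk(n, rng):
    """Driftless random walk: y_t = y_{t-1} + e_t, e_t ~ N(0, 1), y_0 = 0."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def first_difference(series):
    """Return [y_2 - y_1, ..., y_n - y_{n-1}]."""
    return [b - a for a, b in zip(series, series[1:])]

def slope_t_stat(x, y):
    """t-statistic on the slope from an OLS regression of y on a constant and x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    ssr = syy - slope * sxy                   # sum of squared residuals
    std_err = math.sqrt(ssr / (n - 2) / sxx)  # conventional OLS standard error
    return slope / std_err

rng = random.Random(12345)   # fixed seed so the run is repeatable
n_obs, n_reps = 100, 500     # illustrative sizes
levels_hits = diffs_hits = 0
for _ in range(n_reps):
    x, y = random_walk(n_obs, rng), random_walk(n_obs, rng)  # unrelated by construction
    if abs(slope_t_stat(x, y)) > 1.96:
        levels_hits += 1
    if abs(slope_t_stat(first_difference(x), first_difference(y))) > 1.96:
        diffs_hits += 1

reject_levels = levels_hits / n_reps
reject_diffs = diffs_hits / n_reps
print(f"share of 'significant' slopes, levels:      {reject_levels:.2f}")
print(f"share of 'significant' slopes, differences: {reject_diffs:.2f}")
```

If the standard theory applied, both shares would be close to 0.05. In simulations of this kind the levels
regression typically rejects far more often than that, while the regression in first differences behaves
roughly as the theory says it should.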
This problem, and related problems, arise because
standard statistical procedures rely, in many cases, on an assumption of constant means and variances.
Fortunately, the means and variances of the series being examined can usually be made
constant by simply taking the first difference of each variable. Alternatively, the
first difference of the logs of the variables can be used. This amounts (approximately) to using the growth rates
of the variables (instead of using the levels of the variables), and thus is
naturally interpretable as an economic quantity of interest.
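
As a small worked example of the two transformations, the sketch below (Python, standard library only) applies
the first difference and the first difference of logs to a made-up series that grows by exactly 2% per period.
The series and the function names are hypothetical and chosen only for illustration.

```python
import math

def first_difference(series):
    """Return [y_2 - y_1, ..., y_n - y_{n-1}]; the result is one observation shorter."""
    return [b - a for a, b in zip(series, series[1:])]

def log_difference(series):
    """First difference of the natural logs: log(y_t) - log(y_{t-1})."""
    return first_difference([math.log(v) for v in series])

# Hypothetical series: starts at 100 and grows by exactly 2% each period.
levels = [100.0 * 1.02 ** t for t in range(6)]

diffs = first_difference(levels)
log_diffs = log_difference(levels)
growth_rates = [(b - a) / a for a, b in zip(levels, levels[1:])]

for d, ld, g in zip(diffs, log_diffs, growth_rates):
    print(f"difference = {d:.4f}   log difference = {ld:.4f}   growth rate = {g:.4f}")
```

The plain differences keep getting larger as the level rises, whereas the log differences are constant at
log(1.02), which is close to the 2% growth rate. The approximation is good when growth rates are small and
becomes less accurate as they get larger.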