Thursday, January 26, 2012

A brief parable of over-differencing

The Grumpy Economist has sat through one too many seminars with triple-differenced data, 5 fixed effects, and 30 willy-nilly controls. I wrote up a little note (7 pages, but too long for a blog post), relating the experience (from a Bob Lucas paper) that made me skeptical of highly processed empirical work.

The graph here shows velocity and interest rates.  You can see the nice sensible relationship.

(The graph has an important lesson for policy debates. There is a lot of puzzling over why people and companies are sitting on so much cash. Well, at zero interest rates, the opportunity cost of holding cash is zero, so it's a wonder they don't hold more. This measure of velocity is tracking interest rates with exactly the historical pattern.)

But when you run the regression, the econometrics books tell you to use first differences, and then the whole relationship falls apart. The estimated coefficient falls by a factor of 10, and a scatterplot shows no reliable relationship. See the note for details, but you can see in the second graph how differencing throws out the important variation in the data.
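For readers who want to see the effect on a toy example, here is a minimal simulated sketch. It is not the velocity and interest rate data, and it is not necessarily the mechanism at work in the note. It assumes one simple story: the true regressor is a slow-moving random walk, but we observe it with transitory noise. All names and parameter values (`simulate`, `ols_slope`, the noise standard deviations, the seed) are made up for illustration. In levels, the big low-frequency swings dominate the noise, so the slope is close to the true value. In first differences, the noise dominates and the slope shrinks toward zero.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a univariate OLS regression of y on x with an intercept."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / (xc @ xc))


def simulate(n=2000, beta=1.0, seed=0):
    """Return (levels slope, first-difference slope) for simulated data.

    Assumed data-generating process (illustrative only):
      x_true : random walk with unit-variance innovations
      x_obs  : x_true plus transitory noise with standard deviation 3
      y      : beta * x_true plus noise with standard deviation 1
    """
    rng = np.random.default_rng(seed)
    x_true = np.cumsum(rng.normal(0.0, 1.0, n))
    x_obs = x_true + rng.normal(0.0, 3.0, n)
    y = beta * x_true + rng.normal(0.0, 1.0, n)

    b_levels = ols_slope(x_obs, y)
    b_diff = ols_slope(np.diff(x_obs), np.diff(y))
    return b_levels, b_diff


b_levels, b_diff = simulate()
print(f"levels slope:           {b_levels:.3f}")
print(f"first-difference slope: {b_diff:.3f}")
```

Under these assumptions the population slope in differences is beta times 1/(1 + 2 x 9) = beta/19, since differencing leaves the signal variance at 1 while doubling the noise variance to 18. The levels regression keeps the low-frequency variation that identifies the relationship, and differencing throws it away.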

The perils of over-differencing, too many fixed effects, and too many controls, along with the tendency of GLS or maximum likelihood to jump on silly implications of necessarily simplified theories, are well known in principle. But a few clear parables might make people more wary in practice. Needed: a similarly clear panel-data example.