Tuesday, October 29, 2013

Statistical Significance vs. Oomph

My co-blogger and I have recently been blogging about an old fallacy in economics.  It is incredibly common for researchers to confound the size or importance of their result with the precision with which they measure it.  For many years now, Stephen Ziliak and Deirdre McCloskey have been standard-bearers on this issue, and their work has culminated in a book, The Cult of Statistical Significance.  They have gone so far as to advocate against hypothesis testing altogether: why not just report the confidence interval?  After all, all of the information is contained in the confidence interval anyway.  Rejecting a null hypothesis is just saying that some number isn't in the confidence interval.  In social science and medicine, the problem is especially bad, because regardless of theory or outside evidence, we almost always choose zero as our null hypothesis.  Even if eight studies have come before showing some statistically significant effect, we set the null naively at zero.  If the ninth study shows no significant result, it is often interpreted as countervailing evidence to the studies that came before, when in fact the estimates aren't statistically significantly different from the earlier ones.  They are just measured with more error, so that zero ends up in the confidence interval.
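To see why the confidence interval already tells you everything the usual test does, here is a quick sketch with made-up numbers (the estimates, standard errors, and the normal approximation are all illustrative assumptions, not figures from any actual study):

```python
import math

def ci_95(estimate, std_err):
    """Approximate 95% confidence interval using the normal critical value."""
    half_width = 1.96 * std_err
    return (estimate - half_width, estimate + half_width)

def rejects_null(null_value, estimate, std_err):
    """A 5% two-sided test rejects exactly when the null lies outside the CI."""
    lo, hi = ci_95(estimate, std_err)
    return not (lo <= null_value <= hi)

# Hypothetical earlier study: effect of 0.5, precisely measured.
earlier_est, earlier_se = 0.5, 0.1
# Hypothetical ninth study: similar effect, much noisier.
ninth_est, ninth_se = 0.4, 0.3

print(ci_95(earlier_est, earlier_se), rejects_null(0.0, earlier_est, earlier_se))
# -> (0.304, 0.696) True   : zero excluded, "statistically significant"
print(ci_95(ninth_est, ninth_se), rejects_null(0.0, ninth_est, ninth_se))
# -> (-0.188, 0.988) False : zero included, "not significant"

# But the ninth estimate is NOT significantly different from the earlier one:
diff_se = math.hypot(earlier_se, ninth_se)  # SE of the difference (independent studies)
print(rejects_null(0.0, earlier_est - ninth_est, diff_se))
# -> False : no evidence the new study contradicts the old ones
```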

Imagine you read a study on the effects of totally revamping the tax code, perhaps a massive simplification paired with no taxes on capital accumulation.  The study finds that such a move will, on average, raise the growth rate by one percentage point a year.  A one-percentage-point change in the growth rate is huge!  Compounded over a hundred years, the country ends up roughly 2.7 times richer than it otherwise would have been.  That's the difference between the U.S. and a country like Mexico or Croatia.  Yet you read further and see that the effect is measured with considerable error.  The effect is somewhere between -1% and 3%.  The effect is "not statistically significant."  The headline on such a study is often, "Large Changes to the Tax Code Show No Effect On Growth."
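If it helps to see the compounding, here is a back-of-the-envelope calculation (the 2% baseline growth rate is purely an assumption for illustration):

```python
# Back-of-the-envelope compounding: an extra percentage point of growth,
# sustained for a century, multiplies income by roughly (1.01)^100.
baseline_growth = 0.02   # assumed baseline growth rate, purely illustrative
boost = 0.01             # the hypothetical study's point estimate
years = 100

with_reform = (1 + baseline_growth + boost) ** years
without_reform = (1 + baseline_growth) ** years
print(round(with_reform / without_reform, 2))  # ~2.65: income ends up ~2.7x higher
print(round((1 + boost) ** years, 2))          # ~2.7: the quick approximation
```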

Then another study crosses your desk, finding that for other dramatic changes to the tax code, growth will rise by 0.01% each year.  But the effect is very precisely measured: the 95% confidence interval runs from 0.005% to 0.015%.  This study is titled, "Large Changes to the Tax Code Significantly Increase Economic Growth."  Yet "significance" has taken on a strange meaning here.  If there is any cost associated with these massive changes, it will quickly eat into any policy relevance.  The changes in study one are where the action is.  The potential is huge.  That first study should make us much more interested in those changes, not much less.
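To put the two hypothetical studies side by side, here is a small sketch (the numbers are the made-up ones from above, and the compounding over a century is the same back-of-the-envelope exercise):

```python
# The two hypothetical studies side by side: "significance" and policy
# relevance (oomph) answer different questions.
studies = {
    "Study 1": {"estimate": 1.00, "ci": (-1.0, 3.0)},      # huge effect, noisy
    "Study 2": {"estimate": 0.01, "ci": (0.005, 0.015)},   # tiny effect, precise
}

for name, s in studies.items():
    lo, hi = s["ci"]
    significant = not (lo <= 0.0 <= hi)
    # Compound the point estimate (in percentage points of growth) over a century.
    income_ratio = (1 + s["estimate"] / 100) ** 100
    print(f"{name}: significant={significant}, "
          f"income after 100 years ~ {income_ratio:.2f}x baseline")

# Study 1: significant=False, income after 100 years ~ 2.70x baseline
# Study 2: significant=True,  income after 100 years ~ 1.01x baseline
```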

Ziliak and McCloskey go through many examples in published articles pointing out this phenomenon.  Often example one is a drug that is "shown not to work," and example two is a costly procedure or a drug whose statistically significant benefits are quickly eroded by economically significant side effects.  "We found that this drug lowered the weight of subjects by five pounds, a number statistically significantly different from zero.  We also found that the rate of heart attacks tripled, though this was not statistically significant."

We have a term "economic significance" that tries to capture this idea.  It is quite an awkward term in this context.  I just used it to describe a costly side effect.  The authors introduce the term OOmpf, which is itself a bit clumsy.  The mere fact that we don't have a term of art across all Sciences to describe the importance of a point estimate a part from the statistical significance is quite telling.

I must say I have a renewed appreciation for Ziliak and McCloskey's mission in recent days.  Thomas Schelling sums up the feeling of exasperation well on the book's cover: “McCloskey and Ziliak have been pushing this very elementary, very correct, very important argument through several articles over several years and for reasons I cannot fathom it is still resisted. If it takes a book to get it across, I hope this book will do it. It ought to.”
