April 01, 2016

Publication Exclusive: Breaking down the sham of statistical significance

What does P < .05 actually mean? I am asking you, the reader, right now. This is not a rhetorical question. Take just a moment, and really try to formulate an answer.

If you said: “P < .05 means that there is less than a 5% chance that a study’s results are due to randomness alone,” then you are wrong. And not just wrong in a semantic, pedantic or quibbling sort of way. You are spectacularly, blindingly and egregiously wrong; wrong in a way that reflects a complete misunderstanding of the statistical concept.

You are also wrong if you said: “There is a 95% chance that the study’s findings are true/the null hypothesis is false/the observed results will be replicated.” In fact, given how the average experiment is powered, fewer than half of all studies with P < .05 will have their results replicated if run a second time.
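To see why, consider a minimal simulation sketch with frankly hypothetical numbers: a real effect of d = 0.5 with 32 patients per arm gives a two-arm trial roughly 50% power at the .05 level, in the neighborhood of typical powering in the literature.

```python
import numpy as np
from scipy import stats

# Hypothetical setup: a real effect of d = 0.5 with n = 32 per arm gives a
# two-sample t-test roughly 50% power at alpha = .05 (about average).
rng = np.random.default_rng(0)
d, n, alpha, trials = 0.5, 32, 0.05, 10_000

def run_study():
    control = rng.normal(0.0, 1.0, n)   # control arm
    treated = rng.normal(d, 1.0, n)     # treatment arm; the effect is real
    return stats.ttest_ind(treated, control).pvalue

originals = np.array([run_study() for _ in range(trials)])
reruns = np.array([run_study() for _ in range(trials)])

hit = originals < alpha                 # originals that reached P < .05
print(f"originals with P < .05:               {hit.mean():.2f}")
print(f"their reruns also reaching P < .05:   {(reruns[hit] < alpha).mean():.2f}")
```

Both printed numbers hover near 0.5: even when the intervention genuinely works, a significant original study replicates only about half the time under this powering.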

All of this should concern you: despite years of accumulated encounters with this mathematical index and its ubiquitous presence in nearly every major ophthalmology journal, it remains unclear, to basically everyone, what this metric actually means.

Somehow, the story gets even worse. Not only are the concepts of P values and statistical significance confusing and obscure, but — even properly applied and understood — they contain almost no practically useful information. Specifically, they do not distinguish between “real” and “random” results, are conspicuously absent from the journals of the “hardcore” sciences such as physics and chemistry, and have been the ongoing public target of professional statisticians for decades. So what is going on? What is a P value? And if it is so bad, why is it everywhere in medicine?

What is a P value?

The first thing to know is that the starting presumption of all tests for statistical significance is that the null hypothesis is true: namely, that a proposed intervention has no effect. P values answer this question: Assuming that the null hypothesis is true, how often would results as extreme as the study’s appear? What P values provide is the probability of the data given the hypothesis of pure randomness, not the probability of the hypothesis (that the intervention has no effect) given the data.

In other words, P values tell you the opposite of what you want to know. P < .05 simply indicates that the observed results would be rare if the null hypothesis were true, not that they were not randomly generated. Therefore, isolated P values are unable to distinguish the effects of an intervention from random chance. This failure alone is enough to recommend abolishing their use, especially considering that this tool, while mostly unhelpful, is nevertheless frequently mistaken as the single most important indicator of a study’s scientific validity.
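To make that failure concrete, here is a small simulation sketch built on loudly hypothetical assumptions: only 10% of tested interventions truly work, and studies are powered at roughly 50%. Even then, a P < .05 result is nearly as likely to come from a worthless intervention as from a real one.

```python
import numpy as np
from scipy import stats

# Hypothetical assumptions: a 10% base rate of truly effective interventions,
# d = 0.5 with n = 32 per arm (about 50% power), significance at P < .05.
rng = np.random.default_rng(1)
studies, n, d, alpha = 20_000, 32, 0.5, 0.05

truly_works = rng.random(studies) < 0.10      # 10% of interventions are real
effects = np.where(truly_works, d, 0.0)       # the rest have zero effect

pvals = np.array([
    stats.ttest_ind(rng.normal(e, 1.0, n), rng.normal(0.0, 1.0, n)).pvalue
    for e in effects
])
significant = pvals < alpha

# Among the "significant" findings, what share came from interventions
# with NO effect at all?
print(f"P < .05 results that are pure chance: "
      f"{np.mean(~truly_works[significant]):.2f}")
```

Under these assumptions, the printed share comes out close to half, not 5%: the P value alone cannot tell you whether a given significant result reflects a real effect or random chance.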

Click here to read the full publication exclusive, The Young Ophthalmologist, published in Ocular Surgery News Europe Edition, March 2015.