There has recently been something of a crisis in social science research. When a group of researchers attempted to replicate many studies published over the years, including some rather famous ones, they found that many could not be replicated. The findings of those studies were likely due to chance.
When a study is conducted, the researchers run statistical tests to determine whether they found any true effects. Did the treatment work better than placebo? Did group A behave differently from group B?
Studies that produce the expected effects are far more likely to be published than studies that do not. Part of the reason is that researchers with non-significant results often do not even seek publication. They may assume they made a mistake, be hesitant to publish results that are inconsistent with the current literature, or assume others will be uninterested in null results (Easterbrook et al., 1991). Indeed, Publication Bias is often referred to as the “File Drawer Problem”, because researchers who fail to find significant effects file away their results. Another part of the reason is that publishers tend to prefer studies that are novel and that produce significant results. Studies that fail to confirm their hypotheses are deemed less desirable than those that find significant results. In fact, given studies of equal quality, those with statistically significant results are three times more likely to be published than papers with null results (Dickersin, Chan, and Chalmers, 1987).
Coined by the statistician Theodore Sterling, the term Publication Bias refers to the tendency for a study’s chances of publication to depend on the significance and direction of the effects it finds. Publication Bias has profound implications for science.
To understand why, it is important to realize that in any study there is a chance that a statistically significant effect will show up when in truth there is none. This is known as a Type I Error: the probability of falsely concluding that an effect exists when in fact it does not. A false alarm is a Type I error. A study showing that a useless drug works is a Type I error. By convention we accept a 5% risk of making this error, which means that if we ran 100 similar experiments on a treatment with no real effect, we would expect about 5 of them, on average, to be false alarms.
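To make this concrete, here is a minimal simulation sketch in Python (using numpy and scipy, with purely illustrative group sizes and study counts): it runs many two-group experiments in which there is no true effect and counts how often a t-test nonetheless comes out “significant” at the 5% level.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments = 10_000  # simulated studies (illustrative)
n_per_group = 30        # participants per group (illustrative)
alpha = 0.05            # conventional significance threshold

false_alarms = 0
for _ in range(n_experiments):
    # Both groups come from the same distribution, so there is no true effect.
    group_a = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
    group_b = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
    if stats.ttest_ind(group_a, group_b).pvalue < alpha:
        false_alarms += 1  # a Type I error: "significant" despite no real effect

print(f"False alarm rate: {false_alarms / n_experiments:.3f}")  # roughly 0.05
```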
Now, say that you want to know whether hypnosis can help reduce pain. Say further that the only studies investigating this question that get published are the ones producing significant results. You only see the published studies. You have no idea how many studies were conducted that produced null effects. Knowing only the successes but none of the failures makes it impossible to know whether a success was a fluke of chance or not. This is likely a major reason for the replication crisis I described earlier. Who knows how many times a study was conducted, failed to produce significant results, and was filed away, never to see the light of day?
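Extending the sketch above (again with illustrative numbers), the following shows what a reader of the published literature would see if hypnosis in fact had no effect on pain and only significant results were published: every published study is a false alarm, and the null results sit unseen in the file drawer.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_studies = 200     # studies actually conducted (illustrative)
n_per_group = 30    # participants per group (illustrative)
alpha = 0.05

published, file_drawer = [], []
for _ in range(n_studies):
    # Hypnosis has no true effect here: both groups share the same pain distribution.
    hypnosis = rng.normal(loc=5.0, scale=2.0, size=n_per_group)
    control = rng.normal(loc=5.0, scale=2.0, size=n_per_group)
    p = stats.ttest_ind(hypnosis, control).pvalue
    # Only "significant" findings reach publication; the rest go in the file drawer.
    (published if p < alpha else file_drawer).append(p)

print(f"Studies conducted: {n_studies}")
print(f"Studies published: {len(published)} (every one a false alarm)")
print(f"File drawer:       {len(file_drawer)} null results no one ever sees")
```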
As a researcher, you may take the absence of studies investigating a particular question to mean that the research has not yet been conducted, and so off you go to do it yourself, unaware that it may already have been tried a dozen times, failing each time.
References:
Dickersin, K.; Chan, S.; Chalmers, T. C.; et al. (1987). “Publication bias and clinical trials”. Controlled Clinical Trials. 8 (4): 343–353.
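Easterbrook, P. J.; Berlin, J. A.; Gopalan, R.; Matthews, D. R. (1991). “Publication bias in clinical research”. The Lancet. 337 (8746): 867–872.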
