Deemo(Yizhou) Chen's Observatory

# Use Bayes' Theorem to Show Most Published Research Findings Are False


In 2005, John P. A. Ioannidis published a somewhat controversial paper titled “Why Most Published Research Findings Are False”, but the logic and reasoning behind it are subtle and thoughtful. In the paper, Ioannidis points out that one of the problems with current research methodology is the heavy dependence on a “convenient, yet ill-founded strategy of claiming conclusive research findings solely on the basis of a single study assessed by formal statistical significance”, that is, the “gold standard” of p-value <= 0.05. The effect of “bias”, a “combination of various design, data, analysis, and presentation factors that tend to produce research findings when they should not be produced,” pushes this situation even further. Such a claim can be supported by the following model:

Let’s define some events:

Positive: the researcher gets a positive result from the experiment.

True: the relationship the experiment is trying to show is indeed true.

False: the relationship the experiment is trying to show is false.

P(False)=1-P(True)

What we care about is the probability that the hypothesis is true given that the researcher gets a positive result: P(True|Positive).

\beta is the probability of a Type II error, so P(Positive|True) = 1 - \beta; similarly, the probability of a Type I error is \alpha = P(Positive|False).

So we can derive the above-mentioned probability using Bayes' Theorem, that is P(True|Positive)=\frac{P(Positive|True)P(True)}{P(Positive|True)P(True)+P(Positive|False)P(False)} = \frac{(1-\beta)P(True)}{(1-\beta)P(True)+\alpha(1-P(True))} — Eq1
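As a quick numerical sketch of Eq1 (the parameter values below are illustrative assumptions, not taken from the paper): with \alpha = 0.05, 20% power (\beta = 0.8), and a prior P(True) = 0.1, a single positive result only raises the probability of the hypothesis being true to about 31%.

```python
def ppv(alpha: float, beta: float, p_true: float) -> float:
    """P(True|Positive) from Eq1: Bayes' theorem with Type I/II error rates."""
    numerator = (1 - beta) * p_true
    return numerator / (numerator + alpha * (1 - p_true))

# Illustrative values (assumed): alpha = 0.05, power 1 - beta = 0.20,
# prior P(True) = 0.10
print(round(ppv(alpha=0.05, beta=0.80, p_true=0.10), 3))  # 0.308
```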

Let’s say V = the number of investigated relationships that are true, X = the number that are not, and let R = V/X denote the pre-study odds. Then the prior probability of a relationship being true is P(True) = V/(V+X) = R/(R+1), so Eq1 can be rewritten as \frac{(1-\beta)R}{(1-\beta)R+\alpha}
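The two forms of Eq1 are equivalent, as a small Python check confirms (the value R = 0.2 is just an assumed example of the pre-study odds):

```python
def ppv_prior(alpha: float, beta: float, p_true: float) -> float:
    """Eq1 in terms of the prior P(True)."""
    return (1 - beta) * p_true / ((1 - beta) * p_true + alpha * (1 - p_true))

def ppv_ratio(alpha: float, beta: float, R: float) -> float:
    """Eq1 rewritten in terms of R = V/X, using P(True) = R/(R + 1)."""
    return (1 - beta) * R / ((1 - beta) * R + alpha)

R = 0.2  # assumed pre-study odds, for illustration only
# Both forms agree once P(True) is expressed as R/(R + 1):
assert abs(ppv_prior(0.05, 0.8, R / (R + 1)) - ppv_ratio(0.05, 0.8, R)) < 1e-12
```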

Let u be the proportion of research findings published because of bias (findings that otherwise would not have been published). One can derive P(True|Positive)=\frac{(1-\beta)R+u\beta R}{R+\alpha-\beta R + u-u\alpha +u\beta R}. Note that as u increases, P(True|Positive) decreases.

With 1-\beta=0.20, R=1/5, and u=0.20, as in Table 4 of Ioannidis's paper, P(True|Positive) is merely 0.23. Across the other examples provided, the majority cannot reach a P(True|Positive) above 50%, which exposes possible problems with many research experiment designs.
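The 0.23 figure can be reproduced directly from the bias-adjusted formula above; a minimal sketch (the function name is mine):

```python
def ppv_with_bias(alpha: float, beta: float, R: float, u: float) -> float:
    """P(True|Positive) when a fraction u of findings is published due to bias."""
    numerator = (1 - beta) * R + u * beta * R
    denominator = R + alpha - beta * R + u - u * alpha + u * beta * R
    return numerator / denominator

# Table 4 scenario: power 1 - beta = 0.20 (so beta = 0.80), R = 1/5, u = 0.20
print(round(ppv_with_bias(alpha=0.05, beta=0.80, R=0.20, u=0.20), 2))  # 0.23
```

Plugging in the numbers by hand gives the same result: the numerator is 0.04 + 0.032 = 0.072 and the denominator is 0.312, so the ratio is about 0.23.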

As a result, before conducting an experiment, it is important to analyze the statistical parameters of the situation: the study's power, the pre-study odds R, and the expected level of bias u.

Reference:

Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124. doi:10.1371/journal.pmed.0020124