In a previous post, I extolled the virtues of Bayesian statistics. Bayesian statistics is often touted as the superior alternative to p-values, and its popularity in psychology is growing rapidly. In that post I discussed how one advantage of Bayesian analysis is that it allows us to assess the evidence in support of a hypothesis, given the data (rather than what the p-value gives you, which is something quite different, as I discuss in this previous post).

However, that post neglected one critical aspect of Bayesian analysis: priors. In Bayesian statistics, you take your **prior** belief about something, then you view some data, and Bayes’ rule formally states how your beliefs should be updated in light of the new data; this updated belief is called the **posterior**. The prior not only states what your belief is, it also states how strongly you hold it. For example, when deciding whether a coin is fair, most people would have a strong prior that it is (i.e., that p(heads) = 0.5), because most coins they encounter are fair. In contrast, some people might believe in aliens, but hold that prior only weakly. Strong priors require a lot of strong opposing evidence to overcome, whereas weak priors can be overturned by weaker evidence.

At this point, frequentist psychologists (i.e., those strongly committed to “traditional” statistics with p-values and null-hypothesis testing) will be shaking their heads in fervent disgust. They will say, “You shouldn’t let your beliefs enter into scientific analysis; I want **objective** results, not **subjective** results!” Do they have a point? I would argue NO!

## Why we’re all Bayesian already

We are all—yes, even *you*—Bayesian. We let our beliefs (priors) enter our assessment of “data” all the time. Let’s use an example. If I told you I had roast chicken for dinner this weekend, your prior would likely be to believe me; you would therefore require little (or no) evidence to be convinced that I had roast chicken. However, if I told you I had roast chicken for dinner, *and aliens joined me for dessert*, you would immediately demand photo evidence, video recordings, and a whole host of scientific evidence to convince you it was true.

Priors *should* enter our decision-making process and be combined with the data to reach rational conclusions. As another example, if I flipped a coin 10 times and it came up heads 8 times, you would likely not shift much from your prior belief that it is a fair coin. However, if you knew the coin came from a magic shop, your prior would be that the coin is biased, and the same 8 heads would simply strengthen that belief. In both cases, the prior rightly shapes the conclusion you draw from identical data.
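To make this concrete, here is a minimal sketch of Bayes’ rule for the coin example. The specific numbers are illustrative assumptions, not anything from the post: I assume a “biased” coin lands heads 80% of the time, a strong prior of 0.99 that the coin is fair, and a weak prior of 0.50.

```python
def posterior_fair(prior_fair, heads, flips):
    """Posterior probability the coin is fair, given `heads` in `flips` tosses."""
    p_fair, p_biased = 0.5, 0.8  # assumed heads rates under each hypothesis
    # Likelihood of the data under each hypothesis (binomial kernel; the
    # binomial coefficient cancels in the ratio, so it is omitted).
    like_fair = p_fair**heads * (1 - p_fair)**(flips - heads)
    like_biased = p_biased**heads * (1 - p_biased)**(flips - heads)
    evidence = prior_fair * like_fair + (1 - prior_fair) * like_biased
    return prior_fair * like_fair / evidence

# A strong prior that the coin is fair survives 8 heads in 10 flips (~0.94)...
strong = posterior_fair(0.99, 8, 10)
# ...while a weak prior is swayed far more by the same data (~0.13).
weak = posterior_fair(0.50, 8, 10)
print(round(strong, 3), round(weak, 3))
```

The same data, run through the same rule, lead to different (and entirely rational) conclusions depending on the prior.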

## An Example from Psychology

Below is a screen-grab from a paper I recently read. It shows a meta-analysis of a certain psychological phenomenon of interest. For those unfamiliar, a meta-analysis is a statistical aggregation of all known published results about a particular effect, which allows for a much more precise estimate of its size. Meta-analyses are important because every individual study’s reported results carry error (sampling, experimental, and measurement error).
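For intuition, the simplest (fixed-effect) meta-analysis pools study estimates with inverse-variance weights, so more precise studies count for more. A minimal sketch with made-up effect sizes and standard errors (none of these values come from any real paper):

```python
def fixed_effect(effects, ses):
    """Pooled effect size: each study weighted by the inverse of its variance."""
    weights = [1 / se**2 for se in ses]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5  # pooling shrinks the standard error
    return pooled, pooled_se

effects = [0.10, 0.25, 0.05, 0.18]  # hypothetical study effect sizes
ses = [0.08, 0.12, 0.05, 0.10]      # hypothetical standard errors
d, se = fixed_effect(effects, ses)
# 95% confidence interval: pooled estimate +/- 1.96 pooled standard errors
print(round(d, 3), (round(d - 1.96 * se, 3), round(d + 1.96 * se, 3)))
```

Note how the pooled standard error is smaller than any single study’s, which is why the aggregated diamond in a forest plot has the narrowest interval.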

The figure below shows a recent meta-analysis. Each diamond reflects the estimated effect size for the phenomenon under investigation in each study, and the lines represent 95% confidence intervals around that effect size. Confidence intervals that overlap zero do not provide convincing evidence that the reported effect size differs from zero.

The diamond at the bottom (circled in red) is the meta-analytic effect size; that is, the estimate of the effect size when the data are aggregated across all studies (this is a very simplified, and ultimately incorrect, summary of what the meta-analytic effect size represents, but it serves our purposes). You will note that its 95% confidence interval doesn’t come close to zero, suggesting that the effect under investigation is “real”.

Now, to prove that you are all Bayesian, answer the following two questions:

**1) How strongly do you believe in this final positive effect size if you knew the studies were investigating whether working memory capacity is related to intelligence?**

**2) How strongly do you believe in this final positive effect size if you knew the studies were investigating whether people are able to perceive future events? This is known as “psi”.**

Now, I believe that most—if not all—of you would take the first question at face value and would have faith in the final effect size. You either have no real prior about the relationship between working memory and intelligence, or you have a prior that there is indeed a relationship. Either way, you aren’t really surprised by (or perhaps even interested in) the result of the meta-analysis.

However, I would be shocked if anyone had faith in the final effect size knowing that it provides “positive evidence” that people can see into the future. That is, you all have—and if you don’t, you really should have—a strong prior **against** the belief that people can reliably perceive future events.

**The point is, in both cases you are viewing the same data. In the former case, you would accept the data without much fuss. In the latter case, you would still think the results don’t provide convincing evidence.** That is, you are Bayesian. Accept it. Embrace it. Move forward.

## Bem’s Psi Work

This meta-analysis is not made up; it’s from a paper by Cornell psychologist Daryl Bem. You can view the full (as yet unpublished) manuscript here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2423692

This paper has confirmed my prior belief (pun intended) that Bayesian analysis is the way forward in interpreting data. Priors entering the decision-making equation is not a weakness but a strength. Despite Bem’s best efforts with this paper and his prior work, I am still not convinced that people can “feel the future”; to me, psi doesn’t exist. A strict frequentist, by contrast, should be convinced by the above meta-analysis: after all, the p-value for psi is very low!

A strong Bayesian approach to statistical decision-making, with well-defined (and defendable) priors, makes you robust against wacky claims, even if the data is “strong”.
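That robustness can be expressed in one line of arithmetic: posterior odds equal the Bayes factor times the prior odds. A sketch with purely illustrative numbers (the Bayes factor and the sceptic’s prior odds below are assumptions, not values from Bem’s paper):

```python
def posterior_probability(prior_odds, bayes_factor):
    """Turn prior odds and a Bayes factor into a posterior probability."""
    post_odds = bayes_factor * prior_odds  # Bayes' rule in odds form
    return post_odds / (1 + post_odds)    # odds -> probability

# Suppose the data favour psi by a hefty Bayes factor of 100, but a
# sceptic's prior odds for psi are one in a million:
p = posterior_probability(1e-6, 100)
print(round(p, 6))  # ~0.0001: "strong" data barely dent a strong prior
```

With a well-defended prior, even data that shift the odds a hundredfold leave the sceptic essentially unmoved, which is exactly Sagan’s maxim in numerical form.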

**“Extraordinary claims require extraordinary evidence”**

—Carl Sagan (quite likely a Bayesian)

## Comments

Thanks for this post about that meta-analysis. I agree with pretty much every word you wrote here, but I think in the interest of full disclosure it must also be acknowledged that in this study Bem et al. used Bayesian hypothesis testing in addition to frequentist statistics. It all boils down, of course, to what your prior is and how much the evidence should sway us into believing that precognition is real.

For me, one of the fundamental points that gets ignored is how small the actual effect sizes are. In most of these experiments there are typically 16 or 18 trials in the critical conditions, and average performance amounts to only a fraction of a trial above chance. Of course, individual performances presumably vary above and below chance between people – but either way, I hardly find this convincing enough to update one’s beliefs, no matter how significant it is.

The argument is often made that any small effect size should suffice to prove precognition if only it is reliable. But I think the smaller the effect size, the more probable it is that the observed effect is produced by artifacts. 53% correct out of 18 trials per person simply isn’t sufficiently compelling evidence.
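A quick check of that arithmetic, assuming a single participant with 18 fair-chance trials and a 53% hit rate (the trial count and hit rate are taken from the comment above; everything else is standard binomial arithmetic):

```python
import math

n, chance, hit_rate = 18, 0.5, 0.53  # figures from the comment above

# Hits above chance per participant: a fraction of a single trial.
extra = hit_rate * n - chance * n
print(round(extra, 2))  # 0.54 of a trial above chance

# One-sided probability of at least 10 hits out of 18 fair-chance trials
# (10 being the smallest whole number of hits above chance):
p_value = sum(math.comb(n, k) for k in range(10, n + 1)) / 2**n
print(round(p_value, 3))  # ~0.407: unremarkable for any single person
```

At the level of one participant the result is indistinguishable from chance; only aggregation over many participants makes it “significant”, which is precisely why tiny, artifact-prone effects deserve scepticism.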

Hello, Sam – thanks very much for your comment! Yes, Bem did indeed report Bayesian analyses in this paper. I left that part out because the main point of the article was to show the benefits of a Bayesian approach; it was spurred by a recent discussion with a colleague in my department who has a very strong dislike of Bayesian analysis because of its subjectivity. While my colleague was talking, the Bem paper sprang instantly to mind as an excellent example of when priors really do make a difference.

But you are correct that Bem reported some Bayesian analysis, and even concluded that it strongly supported psi. I wasn’t overly convinced, however, by how dismissive the paper was of Wagenmakers and colleagues’ previous Bayesian analyses of Bem’s work and of how their priors should shift given this new evidence; personally, my prior hasn’t budged one bit!

You raise an even more important (methodological) point, which I fully agree with: I am also unconvinced by findings based on such low trial numbers. If the method is poor, no statistics are reliable! As Fisher said,

“To consult the statistician after an experiment is finished is often merely to ask him [sic] to conduct a post mortem examination. He can perhaps say what the experiment died of.”

Thanks again for your informative comment, and very best wishes!!

Jim.