The importance of statistical-savviness

Students often wonder why we in Psychology ask them to learn statistics. There are many good reasons, but today I just want to focus on one; surprisingly, this reason has nothing to do with Psychology. Having even just a smattering of statistical savviness can protect you from a lot of…well…bullshit. And trust me, there is a lot out there.

We live in a world where we are bombarded with statistics: 8/10 people prefer brand X; inflation has risen by x% above wage increases; exit polls suggest politician x has a y-point lead…the list is endless. What are you to make of these facts and figures? Should you believe advertisers when they say product x is the best? Should you be worrying whether eating bacon more than once a week really will increase your chance of an early death? Possibly. Possibly not. But with a good understanding of statistics you can approach these topics with more rigour than people who don’t have this foundation. And that’s a very positive thing.

Take for example a nice light-hearted report recently that researchers have discovered a “Law of Urination”; that is, it has been reported that mammals — regardless of their bladder size — take an average of 21 seconds to urinate. (I should declare here that I do not necessarily believe that this article falls into the bs category I mentioned above; indeed, despite its light-hearted coverage in the media, the topic has serious research implications for bladder disease. It’s just a nice modern example of how we should all have our stats-hats on.)

You might believe that this is very interesting indeed, and it seems many media outlets agree with you. “Mammals take 20 seconds to pee”, declared the National Geographic; “It takes 21 seconds to go”, parodied the Daily Mail.

But, what I wanted to know was: What’s the error associated with that mean time? I would be delighted if all my students were asking the same question. Well, it turns out that the estimate is associated with a standard deviation of 13 seconds! That’s huge!

Over-simplifying, the standard deviation is an estimate of spread in your sample; a small standard deviation (simplistically) means that your mean estimate is a close representation of a lot of scores in your sample. A larger standard deviation can mean that your estimate is perhaps not representative.
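To make that concrete, here is a minimal sketch in R using two made-up sets of times. The numbers and the names `tightTimes` and `spreadTimes` are invented for illustration and are not data from the study. Both sets have a mean of 21, but the mean is a far better summary of the first than of the second.

```r
# Two hypothetical samples (invented numbers, not data from the study)
tightTimes  <- c(19, 20, 21, 22, 23)
spreadTimes <- c(3, 10, 21, 32, 39)

# Both samples share the same mean...
mean(tightTimes)
mean(spreadTimes)

# ...but their standard deviations differ considerably
sd(tightTimes)
sd(spreadTimes)
```

In the first set every score sits within a couple of seconds of the mean; in the second, most scores are more than ten seconds away from it.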

But, the media have not made much fuss about the 13 second standard deviation, preferring to tout the mean estimate as the “law of urination”.

To convince you that this mean estimate is perhaps not so mind-blowing, below I ran a simulation in the statistical language R, where I drew 10,000 random samples from a “known” normal distribution with a mean of 21 and standard deviation of 13; then, I plotted a histogram of these samples to see what the spread in the data looks like. (I appreciate that this method is likely not the best to model the data under discussion, but everyone is familiar with the normal distribution, so it seemed appropriate to make my point.)

For those interested, the code I used for this was

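# Draw 10,000 samples from a normal distribution (mean 21, SD 13) and plot them
randomSamples <- rnorm(10000, mean = 21, sd = 13)
hist(randomSamples, xlab = "Sampled Urination Time (seconds)", ylab = "Frequency")
abline(v = 21, col = "red")  # mark the reported mean of 21 seconds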

[Figure: histogram of the sampled urination times, with a vertical red line at 21 seconds]

Now, I don’t know about you, but all of a sudden the 21 seconds (identified in the plot as the vertical red line) doesn’t seem so exciting. You might also note a problem with this plot: there are negative numbers, and you can’t have a negative pee-time! So, I re-ran the random sampling, this time replacing any value that fell below zero with another random sample from the normal(21,13) distribution. (This is where my choice of random distribution is biting me in the butt, but bear with me.)
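As an aside, the size of this problem can be checked directly. The following is a rough sketch under the same normal(21, 13) assumption: `pnorm` gives the theoretical proportion of values below zero, and a fresh simulated sample should land close to it. The names `propNegative` and `simulated`, and the seed, are arbitrary choices for this illustration.

```r
# Theoretical proportion of a normal(21, 13) distribution that falls below zero
propNegative <- pnorm(0, mean = 21, sd = 13)
propNegative

# Compare with the proportion of negatives in a fresh simulated sample
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
simulated <- rnorm(10000, mean = 21, sd = 13)
mean(simulated < 0)
```

A little over 5% of the draws are negative, so the problem is not a trivial one.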

Again, the (inelegant) code:

randomSamples <- rnorm(10000, mean = 21, sd = 13)
# Redraw any negative value until it is no longer negative
for (i in seq_along(randomSamples)) {
  while (randomSamples[i] < 0) {
    randomSamples[i] <- rnorm(1, mean = 21, sd = 13)
  }
}

[Figure: histogram of the re-sampled urination times, with no negative values]

Now the negatives have gone, and we are still left with quite a wide spread of urination times. I am left wondering whether, if the media had published this plot, they would have had the same interest from readers as they did by publishing just the mean.
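For readers who prefer numbers to plots, the spread can also be summarised directly. The sketch below is a vectorised variant of the resampling described above, not the code used for the plot. The helper name `sampleNonNegative` is hypothetical, the seed is arbitrary, and the normal(21, 13) distribution is the same assumption as before.

```r
# Hypothetical helper: draw n values from normal(mean, sd), redrawing negatives
sampleNonNegative <- function(n, mean = 21, sd = 13) {
  x <- rnorm(n, mean, sd)
  negative <- x < 0
  while (any(negative)) {
    # Replace only the negative entries, then check again
    x[negative] <- rnorm(sum(negative), mean, sd)
    negative <- x < 0
  }
  x
}

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
truncatedSamples <- sampleNonNegative(10000)

mean(truncatedSamples)                               # mean after removing negatives
quantile(truncatedSamples, probs = c(0.025, 0.975))  # middle 95% of the times
```

Note that throwing away the negative values pulls the simulated mean a little above 21, which is one more sign that the normal distribution is only a rough stand-in for these data.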

It’s hard to be critical of this research article (in fact, it appears to be more of an abstract for a forthcoming conference) because we know, as researchers, that ANY measurement has error attached to it. The researchers were upfront about this. What you can be critical of, though, is the coverage that such claims get.

If nothing else, an undergraduate statistics course in psychology will hopefully make you cognisant of such issues.
