Telephone telepathy data: Statisticians needed

Jorghnassen · Jul 24, 2007

Paul C. Anagnostopoulos said:
Thanks Beth, I'm significantly enlightened. Just to bother you with another question:

How do you look at the data to decide between these two distributions?

~~ Paul

That one is very basic. The normal distribution is your standard continuous, symmetric, bell-shaped density, and it works best with data that have these properties. But, because of the central limit theorem, the normal distribution pops up a lot in statistics, especially with large samples: statistics such as the sample mean tend to follow a normal distribution as the sample size goes to infinity (under a number of assumptions, most of them "mild"). For small samples, a t distribution is usually preferred because it behaves like a normal but with heavier tails (higher probability of more extreme outcomes), and it provides a better, more robust approximation to the true distribution of the quantity of interest.

What we have here, though, is a series of Bernoulli trials. For each call, the subject guesses who the caller will be, and a success is defined as guessing the caller correctly. The outcome of each trial is discrete: it's either a success (with probability .25 under the null hypothesis in this case) or a failure. When you count the number of successes over a fixed number of trials, you get a Binomial distribution with parameters n (the number of trials) and p (the probability of success for one trial). This is a discrete distribution that is not necessarily symmetric (depending on p). It is the exact distribution of the number of successes provided the trials are independent (that is not entirely the case here, but it shouldn't be much of a problem) and p is fixed (again, this may not hold if the die or the psychics are biased, but such bias is probably negligible in this case). Under those conditions one can find the exact probability of any number of successes in n trials, so there is no need to use a normal approximation (unless one happens to be without a computer or calculator, and without the time or enough paper and pencils to do the exact calculation by hand).
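To make the "exact versus approximation" point concrete, here is a minimal sketch in Python that tabulates the exact Binomial probabilities next to a continuity-corrected normal approximation. The number of trials (n = 20) is a made-up value for illustration, not the size of the actual experiment, and the helper names `binom_pmf` and `normal_cdf` are my own.

```python
from math import comb, erf, sqrt

def binom_pmf(k, n, p):
    """Exact probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal(mu, sigma)."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Assumption for illustration only: 20 calls, chance success rate of 1/4.
n, p = 20, 0.25
mu = n * p                      # mean of the Binomial
sigma = sqrt(n * p * (1 - p))   # standard deviation of the Binomial

print(" k   exact  normal approx")
for k in range(n + 1):
    exact = binom_pmf(k, n, p)
    # Continuity correction: normal probability mass between k - 0.5 and k + 0.5.
    approx = normal_cdf(k + 0.5, mu, sigma) - normal_cdf(k - 0.5, mu, sigma)
    print(f"{k:2d}  {exact:.4f}  {approx:.4f}")
```

With p = .25 the exact column is skewed to the right while the normal column is symmetric about the mean, which is why the approximation is weakest in the tails for a small n.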

Beth · Jul 24, 2007

Paul C. Anagnostopoulos said:
Thanks Beth, I'm significantly enlightened. Just to bother you with another question:

How do you look at the data to decide between these two distributions?

~~ Paul

I think Jorghnassen has answered this well. In some cases where the distribution is unknown, I might look at the data to try to find a distribution that's a good fit, but that's not what's happening here. We aren't trying to fit a distribution to the data; rather, we have a situation that will, under the null hypothesis, produce data that follow a particular binomial distribution.

In hypothesis testing, we assume the null is true and then compute the probability of getting results at least as extreme as the ones we got under that assumption (the p-value). If the p-value is low enough, we reject the null in favor of the alternative.
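As a rough sketch of that calculation for this kind of experiment, the p-value can be computed as the exact Binomial probability of scoring at least as many hits as observed when p = .25. The counts below (16 hits in 40 calls) are invented for illustration and are not the actual data, and `binom_upper_tail` is a hypothetical helper name.

```python
from math import comb

def binom_upper_tail(k_obs, n, p):
    """P(X >= k_obs) for X ~ Binomial(n, p): a one-sided exact p-value."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_obs, n + 1))

# Hypothetical numbers for illustration only, not the real experiment's results.
n_trials = 40
observed_hits = 16
p_null = 0.25   # chance of guessing 1 caller out of 4 under the null

p_value = binom_upper_tail(observed_hits, n_trials, p_null)
print(f"P(X >= {observed_hits} | n = {n_trials}, p = {p_null}) = {p_value:.4f}")
```

The test is one-sided here because only a hit rate above chance would count as evidence for telepathy; that choice is an assumption of this sketch.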
