Very nice, drk.
Would you care to say a few words about the logic behind the assumption of the null? I would (and will, if you don't), but dang, you got a way with words.
(...and such an explanation would be very nice to be able to point to when people say things like "well, why can't we just assume X until proven otherwise?")
Well I'll say some things, but they won't be what you're expecting.
The problem is that people always want to solve an impossible problem. So statisticians have come up with a problem they can solve which sounds enough like the impossible one that people are happy. And then statisticians get annoyed when people mistake the one for the other.
Allow me to expand.
The problem that people want to answer is, "What is the probability that this is true?" Well anyone who is familiar with Bayes' Theorem can tell you that this doesn't have a well-defined answer - it depends on how likely you thought it was before you did your experiment. But people don't like that answer much.
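To make that concrete, here's a tiny Python sketch (the likelihoods are made-up numbers, purely for illustration). The same evidence pushes two people with different priors to very different posteriors:

    def posterior(prior, p_data_given_h, p_data_given_not_h):
        """Bayes' Theorem for a simple yes/no hypothesis H."""
        numerator = p_data_given_h * prior
        return numerator / (numerator + p_data_given_not_h * (1 - prior))

    # Same experiment (same likelihoods), two different priors:
    print(posterior(0.50, 0.8, 0.2))  # ~0.800 if you thought it was 50/50
    print(posterior(0.01, 0.8, 0.2))  # ~0.039 if you started out skeptical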
So we say, "Well let's figure out the probability of getting a result this weird under the null hypothesis. If that is low enough, then we'll reject the null hypothesis." Now
this question is something we can answer, it is well-defined, and people are happy to use the answer.
The problem is that people insist on mistaking the second question for the first. And that is where statisticians get annoyed. Because they aren't the same thing at all. Or worse yet, people want a concrete answer of the form, "I see that this is better than that, with 95% probability, how much better is it?" Which again is the impossible question. No matter how much you explain it to them, people want the simple answer, and want to state it in the simple way.
So to keep the statisticians happy, all we need to do is just get it straight and keep it straight? Well, that depends on the statistician. You see, there is a debate among statisticians about whether or not the standard procedure makes much sense at all. Bayesians like to bring up cases like the following one.
Suppose we know that a couple planned to have children until they had both a son and a daughter. They have 7 sons in a row, then a daughter. At a 95% confidence level, should we reject the hypothesis that they are equally likely to have sons or daughters? (*) Well the null hypothesis is equal probabilities, under which a result this strange or stranger requires a string of 7 boys or 7 girls in the first 7 kids, which will happen 1 time in 2^6, or 1.5625% of the time. So at the 95% confidence level (even at a 98% confidence level) we'd reject the null hypothesis.
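If you don't trust my arithmetic, here's a quick Python sketch of that calculation, done both exactly and by simulating the stopping rule:

    import random

    # Exact: under the "stop once you have one of each" rule, a result at
    # least this strange means the first 7 children were all the same sex.
    p_exact = 2 * 0.5**7          # 1/64 = 1.5625%

    def family_size(p_boy=0.5):
        """Children born until the family has at least one of each sex."""
        sexes, n = set(), 0
        while len(sexes) < 2:
            sexes.add(random.random() < p_boy)
            n += 1
        return n

    trials = 1_000_000
    p_sim = sum(family_size() >= 8 for _ in range(trials)) / trials
    print(p_exact, p_sim)         # both around 0.0156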
Now let's change the problem. Suppose instead that they had simply planned on having 8 kids. What then? Well a result at least that odd means 7 boys and 1 girl (happens 8 ways), or 7 girls and 1 boy (happens 8 ways), or 8 boys (1 way), or 8 girls (1 way). So 18 of the 2^8 possible outcomes are at least this odd, for a probability of 7.03125%. So at the 95% confidence level we should not reject the hypothesis.
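Again checkable in Python, this time with binomial coefficients:

    from math import comb

    # Fixed design: exactly 8 children.  "At least this odd" is a 7-1
    # split in either direction or an 8-0 split in either direction.
    extreme_outcomes = 2 * comb(8, 7) + 2 * comb(8, 8)   # 16 + 2 = 18
    p_value = extreme_outcomes / 2**8                    # 18/256
    print(p_value)                                       # 0.0703125, above 0.05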
But according to Bayes' Theorem, no matter what prior probabilities you assign, your posterior probabilities will not depend on whether they were going for 8 kids or for both a boy and a girl. In any valid system of inference that piece of information is a red herring that should make no difference. Therefore standard statistical methods lead to nonsensical results.
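You can watch the red herring drop out. Write p for the probability of a son: the data (7 boys, then a girl) has likelihood p^7(1-p) under the stopping rule, and the fixed-8 design only multiplies that by a constant (the 8 orderings of the counts), which normalization cancels. A sketch with numpy, using a flat prior for concreteness, though any prior gives the same conclusion:

    import numpy as np

    p = np.linspace(0.001, 0.999, 999)   # grid over P(son)
    prior = np.ones_like(p)              # flat prior; any prior works alike

    # Likelihood of the data (7 boys, 1 girl) as a function of p:
    like_stopping = p**7 * (1 - p)       # "stop at one of each sex" design
    like_fixed = 8 * p**7 * (1 - p)      # "exactly 8 kids" design: 8 orderings

    def normalize(w):
        return w / w.sum()

    post_stopping = normalize(prior * like_stopping)
    post_fixed = normalize(prior * like_fixed)

    # The constant factor cancels: the posteriors are identical.
    print(np.allclose(post_stopping, post_fixed))   # True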
In the real world the Bayesians lose for two reasons. First, everyone is used to the standard solution. And second, Bayesian alternatives to the standard methods are far more complex to understand and explain.
Cheers,
Ben
* In fact this hypothesis is generally wrong. Population statistics demonstrate that there is a small but significant bias towards having sons rather than daughters.