Thanz
BillHoyt said:
And just what happens to the Poisson distribution with this alleged "overcounting"? Why didn't the "J" count move back to the mean? Thanz can't answer it. Tr'olldini can't answer it. You haven't been able to so far. Care to try?
I have explained it - and in a way, so have you. Here is what you posted earlier:
BillHoyt said:
That means, that if the percentage difference between observed and expected remains the same, the significance of that observed result increases.
So, the significance increases if the percentage difference between observed and expected remains the same. We have seen this in the difference between your count and my count. Your sample and your number of J counts approximately doubled mine, with J remaining at about 21% of the total. It is as if you counted two guesses for every single real guess.
You went on to say:
BillHoyt said:
If JE's repetitions of "I'm getting a J; like Joe or John" were truly random, we would expect repetitions of "I'm getting an X; like Xanadu or Xena," etc., on a random basis as well. We would expect those fluctuations to overwhelm small random perturbations in the "J"s that we see with smaller sample sizes. We would expect the percentage difference between observed and expected "J" frequencies to head to the mean; that is, to go down. That is the meaning of the fall off in the Poisson's pdf.
Here is the problem - you are not actually increasing the sample size. The underlying data in both counts is exactly the same. By overcounting the same sample, you have increased the significance of the result even though the proportions remained the same. Remember, you initially defended your overcounting on the basis that it didn't matter as everything would go up. We see that this is not true - it certainly does matter.
Your count is like taking 100 coin flips, multiplying the result by 10, and claiming that is the same as actually observing 1000 coin flips. As your explanations show, we would expect the results of 1000 real coin flips to move closer to the mean than the multiplied-out result does. Your counting method is like multiplying it out - it gives an artificially high count for everything, including the sample size, which makes the result appear more significant.
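To put rough numbers on the coin-flip comparison, here is a minimal sketch in Python. The figures (60 heads in 100 flips, a multiplier of 10) are made up for illustration and are not taken from either count in this thread, and the `z_score` helper is just the usual normal approximation to a binomial count, not anyone's actual method here.

```python
import math

def z_score(observed, n, p_expected):
    """Normal-approximation z score for a binomial count."""
    expected = n * p_expected
    std_dev = math.sqrt(n * p_expected * (1 - p_expected))
    return (observed - expected) / std_dev

# Hypothetical numbers, chosen only to illustrate the point.
heads, flips = 60, 100
k = 10  # pretend every flip was counted k times

z_real = z_score(heads, flips, 0.5)
z_inflated = z_score(heads * k, flips * k, 0.5)

print(f"{flips} real flips, {heads} heads: z = {z_real:.2f}")
print(f"same data multiplied by {k}: z = {z_inflated:.2f}")
print(f"ratio = {z_inflated / z_real:.2f}, sqrt(k) = {math.sqrt(k):.2f}")
```

The proportion of heads is 60% in both cases, but the multiplied-out count reports a z score larger by a factor of sqrt(k). No new flips were observed, so the extra "significance" comes purely from the counting method.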
