
Need Help with Randomizing for Experiment

OK - bpesta, 69Dodge and Jekyll - am I thinking about this wrong? To me, there will be 10 prayed-for pots. So if he can guess all 10 prayed for ones, he gets 10 out of 10. But if he only guesses 5 of the prayed for ones, he gets 5 out of 10. I'm leaving the other 10, the normal ones or the control group, out of his correct/not correct guesses. That's the way I had planned to record the results. (So I would see a trial with only two pots as he either gets it right or wrong - he either chooses the correct prayed for pot, or chooses the wrong pot.)

I'm striving to set this up so that he gets no feedback as he is guessing, not even unconscious signals from Mr. Amapola.
Your method is fine, you're recording enough data that anyone could recreate the full outcome if they needed to.

We're just arguing about what the numbers mean, and what we should count as a success.

We don't want to give unearned credit to god for making the flowers grow, or needlessly oppress Christians.
 
Actually, it's more restrictive than that.

Ok, so under the binomial distribution there are 2^20 = 1048576 possible guesses he can make about which pots are being prayed for. Only one of these guesses will be entirely right.

If we restrict ourselves to just the cases where there are 10 prayed-for plants, there are 20C10 = 184756 possible guesses.
Still exactly one of these guesses will be right.

So if the guesser knows that there are exactly 10 pots being prayed for he is almost 6 times more likely to be completely right than the binomial distribution leads us to believe.
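For anyone who wants to check the counts above, here is a minimal sketch. It assumes nothing beyond the 20 pots and the 10/10 split described in the thread; the variable names are my own.

```python
from math import comb

# Every pot guessed independently: two choices for each of 20 pots.
unrestricted = 2 ** 20

# Exactly ten of the twenty pots labelled "prayed for".
restricted = comb(20, 10)

# How many times more likely a perfect guess is under the 10/10 restriction.
ratio = unrestricted / restricted

print(unrestricted, restricted, round(ratio, 2))
```

The ratio comes out a little under 6, which matches the "almost 6 times" figure.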

J-- just because it's hard to read a post's tone, realize I am enjoying the debate here and not trying to be combative.

That said, of course, you're completely wrong.
:D

The probability of being completely right isn't what we're testing and is so close to zero that it wouldn't matter. Plus, with no feedback, even if one managed to get the first 19 correct, it'd still be a 50/50 guess as to what plant 20 was.

What matters-- my contention-- is of those restricted scenarios where it's 10 and 10, exactly how many of them include the prayed for plant at trial 20 (I'm guessing exactly half). Since p = .50 for each trial whether you know it's 10/10 or not, I'm asserting the independence assumption is not violated.
 
Oh and I am praying for this guy to do it to counteract claus praying against him.
 
I'm not going to work out the math, but I do agree that, as stated, it's a 10-trial rather than a 20-trial. He may correctly choose all 10 holy water plants, but for each one he misses, he will also necessarily miss one well water plant. I'm still comfortable with calling 80% (8 of ten correct) or better a win for prayer, and I suspect it will be a win in his mind if he correctly guesses 6 of ten. In that case, you might suggest (or, indeed, have prepared in advance) a second experiment along the lines Hokulele suggests, with only 1 plant in the group being "favored." After all, it's the water he's praying (for/at); the number of plants you subsequently pour it on is not material.
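For reference, the math for the "8 of ten or better" threshold can be sketched as follows. This assumes a guesser with no real information who labels exactly 10 of the 20 pots as prayed for, chosen uniformly at random; the names `POTS`, `PRAYED`, `PICKS` and `prob_exactly` are illustrative only.

```python
from fractions import Fraction
from math import comb

POTS = 20     # total pots in the experiment
PRAYED = 10   # pots that were actually prayed for
PICKS = 10    # pots the guesser labels "prayed for"

def prob_exactly(k):
    """Chance that exactly k of the PICKS land on prayed-for pots."""
    ways = comb(PRAYED, k) * comb(POTS - PRAYED, PICKS - k)
    return Fraction(ways, comb(POTS, PICKS))

# Probability of hitting 8, 9 or 10 of the prayed-for pots by pure chance.
p_at_least_8 = sum(prob_exactly(k) for k in range(8, PICKS + 1))

print(p_at_least_8, float(p_at_least_8))
```

Under those assumptions the chance of a blind guesser reaching 8 of 10 is a bit over 1%.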
 
I could be wrong and would concede if someone could explain how just knowing that it's 10 and 10 gives you any advantage whatsoever or clues you in on what to guess for pot 20?

Looking at pot 20 alone, I have no clue as to what I ought to guess about it.

But looking at all twenty pots together, I know that I must make ten "prayed-for" guesses and ten "normal" guesses.

That eliminates a whole bunch of wrong sets-of-twenty-guesses that I might have made, had I not known that the twenty pots were split ten-and-ten.

With no extra conditions, there are 2^20 = 1,048,576 ways to make twenty binary guesses. But if ten must be "yes" and ten must be "no", there are only [latex]$\binom{20}{10}$[/latex] = 184,756 ways.

still submitting that it's a 20 trial binomial test-- not a 10 trial one on just the prayed for beans.

I'm saying that it's not either of those. It's not binomial at all. My post #11 contains what I think is the correct formula.
 
OK - bpesta, 69Dodge and Jekyll - am I thinking about this wrong? To me, there will be 10 prayed-for pots. So if he can guess all 10 prayed for ones, he gets 10 out of 10. But if he only guesses 5 of the prayed for ones, he gets 5 out of 10. I'm leaving the other 10, the normal ones or the control group, out of his correct/not correct guesses. That's the way I had planned to record the results. (So I would see a trial with only two pots as he either gets it right or wrong - he either chooses the correct prayed for pot, or chooses the wrong pot.)

I'm striving to set this up so that he gets no feedback as he is guessing, not even unconscious signals from Mr. Amapola.

But am I wrong in only considering the 10 that he will have prayed for?...
Yes, you are tossing out data. You need to record how many are correctly and incorrectly identified in each category.
............Prayed For....Not Prayed For
Correct.....______........______
Incorrect...______........______
 
Controversy!

We should pray for guidance or a statistician. I'm going to ask one today, if he's in.

Of those restricted 184,756 ways, half have prayed-for beans at trial 20 and the other half do not. Indeed, for any trial, the same is true. It's a mini-me version of the more dispersed (non-restricted) binomial, but it's still a binomial. Can you plot the restricted version and see if it's bell shaped?

If it's not still a binomial (will defer to a statistician here) I assert that the test is equivalent to a binomial test.

I agree with JC (hell freezing) that technically it would be inappropriate to analyze only the 10 and that all 20 should be factored in (note for example that 8 of 10 right is a different passing rate than is 15 of 20).
 
J-- just because it's hard to read a post's tone, realize I am enjoying the debate here and not trying to be combative.

Likewise.

That said, of course, you're completely wrong.
:D

And . . . likewise.

:p

Since p = .50 for each trial whether you know it's 10/10 or not, I'm asserting the independence assumption is not violated.

But that's not what independence means! It doesn't mean that all the probabilities are the same. It is an extra condition, which might or might not be satisfied, even where all the probabilities are definitely the same.
 
best I can tell, the independence assumption means that the p of getting it right on any trial does not depend on the p of getting it right on any other trial. And, it doesn't, unless you get feedback after every trial.

Sure 10/10 reduces the number of possible distributions, but they're still bell shaped (plotting frequency of occurrence by 0-20 correct guesses), with p=.50 for any trial.
 
How would this be any different from a test for dowsing with 20 covered buckets, half filled with water? If the water witch was correct without feedback 15 times, wouldn't that be significant with p = .02?
 
How would this be any different from a test for dowsing with 20 covered buckets, half filled with water? If the water witch was correct without feedback 15 times, wouldn't that be significant with p = .02?

How can the dowser be correct 15 times?

If ten buckets actually have water, and ten are guessed to have water, the number of correct guesses, out of twenty, has to be even.

That alone should tell you that something is funny, that the situation is not the usual binomial one.
 
best I can tell, the independence assumption means that the p of getting it right on any trial does not depend on the p of getting it right on any other trial. And, it doesn't, unless you get feedback after every trial.

Sure 10/10 reduces the number of possible distributions, but they're still bell shaped (plotting frequency of occurrence by 0-20 correct guesses), with p=.50 for any trial.
That's not what the independence assumption means. Try Wikipedia, for instance:
Wikipedia said:
In probability theory, to say that two events are independent, intuitively means that the occurrence of one event makes it neither more nor less probable that the other occurs.
It says that two trials are dependent if knowing the result of one changes your probability for the other. It does not say that merely knowing the probability of a trial's result changes your probability for the next trial.

In general if I have N trials and my guess for each is independently good or bad with probability p=0.5, that's a binomial distribution on the total number of good guesses with mean Np=N/2 and variance Np(1-p)=N/4. As N grows large this will be approximated by a normal distribution N(N/2,N/4).

But if I have N trials and I know that exactly N/2 are of one type and N/2 are of another, then my guess for each is good or bad with probability p=0.5, but my guesses aren't independent. What does this do? The mean total number of good guesses is still Np. My intuition says the variance will be different but I'm not sure. So let's check. I will randomly guess N/2 of the N to be one type and the other N/2 to be the other type. Of the first N/2 I will get k correct with probability (N/2)Ck * (N/2)C(N/2-k) / NC(N/2). Of the second N/2 I will get another k correct because I will get N/2-k of them incorrect due to my first guesses.

Where have I seen that before? Well, it's certainly hypergeometric. Here m=N/2 and n=N/2. We see that the mean for that is nm/N = N^2/(4N) = N/4, which is good since we actually get twice the mean correct (k and then k again), for N/4 * 2 = N/2. How about the variance? It's n*(m/N)*(1-m/N)*(N-n)/(N-1). That's N/2*(1/2)*(1/2)*(N/2)/(N-1) or 1/16 * N^2/(N-1). But remember this is the variance of k and we want the variance of 2k, which is four times as large, so it's actually 1/4 * N^2/(N-1). As N grows large that's N(N/2, N^2/(4(N-1))) which is nearly exactly N(N/2,N/4).

Conclusion (tl;dr)
In the limit these two methods are very, very close to each other. But for smallish N, where approximating with a normal distribution isn't good enough anyway, they are very different distributions. N=2 is smallish; I claim N=20 is smallish.
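To put numbers on the N=20 case, here is a rough sketch comparing the two distributions of total correct guesses. It assumes a blind guesser: in the first case each of the 20 guesses is an independent coin flip, in the second exactly 10 "yes" guesses are spread at random over the pots. The function names are my own.

```python
from math import comb

N = 20
HALF = N // 2

def binomial_pmf(total):
    """P(total correct) when all N guesses are independent with p = 0.5."""
    return comb(N, total) / 2 ** N

def constrained_pmf(total):
    """P(total correct) when exactly HALF 'yes' guesses are made at random."""
    if total % 2:            # odd totals cannot happen under the 10/10 split
        return 0.0
    k = total // 2           # correct "yes" guesses; correct "no" guesses match
    return comb(HALF, k) * comb(HALF, HALF - k) / comb(N, HALF)

def mean_and_variance(pmf):
    mean = sum(t * pmf(t) for t in range(N + 1))
    var = sum((t - mean) ** 2 * pmf(t) for t in range(N + 1))
    return mean, var

b_mean, b_var = mean_and_variance(binomial_pmf)
c_mean, c_var = mean_and_variance(constrained_pmf)

print("binomial   :", b_mean, b_var)
print("constrained:", c_mean, c_var)
```

Both means come out at N/2. The binomial variance is N/4 and the constrained one is N^2/(4(N-1)), which agrees with the derivation above.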
 
How can the dowser be correct 15 times?

If ten buckets actually have water, and ten are guessed to have water, the number of correct guesses, out of twenty, has to be even.

That alone should tell you that something is funny, that the situation is not the usual binomial one.

He got 7 of 10 "yes" guesses right.

and 8 of 10 "no" guesses right.

15/20 and binomial.
 
Greedy, the independence assumption has nothing to do with whether one's GUESSES are independent; the focus is only on the outcome's probability, as I understand it.

The outcome's probability (whether for example pot 20 is prayed for) is independent of any other trial's p when the selector only knows that it's 10/10 but gets no feedback.

I think it's binomial with a smaller variance, as you point out (less sure about this point than the two above).
 
He got 7 of 10 "yes" guesses right.

and 8 of 10 "no" guesses right.

15/20 and binomial.

Um. Let's see. Suppose we line up the buckets afterwards, keeping his guesses tagged to the buckets, with buckets with water first and no water second:

Code:
WWWWWWWWWWNNNNNNNNNN

You can't tag those with 7/10 yes guesses right and 8/10 no guesses right.

Code:
WWWWWWWWWWNNNNNNNNNN
YYYYYYY   YYY

There are the 7 yes guesses right. Notice there are not enough buckets left to get 8 no guesses right. If there are 10 buckets with water out of 20 buckets and "yes" is guessed 10 times, the number of correct guesses must be even.
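The parity argument can also be checked by brute force. This sketch assumes buckets 0 to 9 hold water (an arbitrary labelling) and walks through every possible way of making exactly ten "yes" guesses, recording the total number of correct guesses each time.

```python
from itertools import combinations

BUCKETS = range(20)
water = set(range(10))      # assumed labelling: buckets 0-9 hold water

totals = set()
for yes_guesses in combinations(BUCKETS, 10):
    guessed = set(yes_guesses)
    # Correct "yes": a bucket guessed wet that really holds water.
    yes_right = len(guessed & water)
    # Correct "no": a bucket guessed dry that really is dry.
    no_right = sum(1 for b in BUCKETS if b not in guessed and b not in water)
    totals.add(yes_right + no_right)

print(sorted(totals))
```

Only even totals show up, so 15 of 20 is not reachable when the ten "yes" guesses match the ten wet buckets in number.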
 
I think it's binomial with a smaller variance, as you point out (less sure about this point than the two above).

No, it's not binomial with a smaller variance. As N grows large, the two are very closely approximated by normal distributions, and the second has slightly higher variance (dividing by N-1 instead of N makes it larger). The second is not binomial. Lots of things are approximated well by normal distributions; that doesn't mean they're the same thing. If it's known that 10 of the 20 are of one type instead of each chosen randomly with p=0.5, then the distribution on number of correct guesses is hypergeometric, not binomial. In fact the Wikipedia page gives conditions under which hypergeometric distributions are well approximated by binomial distributions: "If N and m are large compared to n and p is not close to 0 or 1". But uh oh, now I see that it also says
Wikipedia said:
If n is large, N and m are large compared to n and p is not close to 0 or 1, then [the hypergeometric distribution is well approximated by Φ(np,np(1-p)), where Φ is the standard normal distribution function]
But here m is not large compared to n. Maybe this isn't even well approximated by a normal distribution as N grows large.
 
The "covered bucket" example is a different experiment. The only way he could get 15/20 right is if he treats them all independently, and doesn't keep track well enough to have 10 "yes" guesses and 10 "no" guesses. This shouldn't be the case with the plants, because the plants are being compared to each other, and the prayer knows there are 10 of each.
Whatever criteria he uses -- biggest, greenest, bushiest, beaniest, or some combination of factors -- he should be guessing 10 prayer buckets. If he guesses 11 prayer buckets, then yes, the number of "wrong" guesses can be factored in as well, and the total taken from 20 possibilities. If (as should be the case) his "Yes" guesses equal his "No" guesses, every incorrect "Yes" will force an incorrect "No", and you can take either group, or both totaled, without changing the percentages.
 
Greedy, Dodge and Bok

I concede the point re it being impossible to get 15 correct in the scenario outlined above. I guess, unlike flipping coins, where one just might get 20 tails, here the most one can get is 10, as constrained by the experiment.

I still feel the independence assumption is not violated for reasons I posted above (whether the distribution be binomial or hypo allergenic). The guesses might be dependent but the probability on any trial is not (with no feedback provided).

Given that being wrong on the 15/20 thing increases the probability I am wrong here too, I'm still not ready to concede the point. Any help?

Changing your guessing strategy based on knowing that it's 10/10 is one thing, but I see no way where that can change the probability that any pot (even pot 20) was prayed for or not.

Also, why wouldn't this now be a binomial distribution with n=10 (if the above already explained that, my apologies for not noticing).

B
 
I still feel the independence assumption is not violated for reasons I posted above (whether the distribution be binomial or hypo allergenic). The guesses might be dependent but the probability on any trial is not (with no feedback provided).
I think you simply have a mistaken idea of what independence is. Independence is a statement about the joint distribution over trials, not about what each trial looks like independently. The definition of independence is that A and B are independent given background information X exactly when P(AB|X) = P(A|X)P(B|X).

So is one pot prayed independent of another pot prayed? We know that P(A|X)=0.5 and P(B|X)=0.5, so is P(AB|X)=0.25?

P(AB|X) = P(A|BX)*P(B|X) = 0.5*P(A|BX). As expected, the condition is that P(A|BX) must still be 0.5 to make them be independent. But it's not, since if B, then there are only 9 prayed pots left out of 19 and P(A|BX) = 9/19. Not independent, even though P(A|X)=0.5.
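The numbers in this argument can be sketched with exact fractions. The only assumption is that every layout with exactly 10 prayed-for pots out of 20 is equally likely; A and B stand for any two particular pots being prayed for.

```python
from fractions import Fraction
from math import comb

layouts = comb(20, 10)                    # equally likely 10/10 layouts

# Pot A prayed for: the other 9 prayed-for pots come from the remaining 19.
p_a = Fraction(comb(19, 9), layouts)
p_b = p_a                                 # same reasoning for pot B

# Both prayed for: the other 8 prayed-for pots come from the remaining 18.
p_ab = Fraction(comb(18, 8), layouts)

p_a_given_b = p_ab / p_b

print(p_a, p_ab, p_a * p_b, p_a_given_b)
```

Each pot on its own is 1/2, but the joint probability is 9/38 rather than 1/4, so the pots are not independent.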

Also, why wouldn't this now be a binomial distribution with n=10 (if the above already explained that, my apologies for not noticing).
Ah. The explanation is that the guesser is not choosing 1 pot from 10 prayed and 10 not-prayed pots, 10 times. Then it would be binomial with n=10, p=0.5. Instead the guesser is choosing 10 pots known to be distinct from 20 pots. In the first case he could choose the same pot more than once. In this case he cannot. That is where the difference from binomial comes in.
 