
Need Help with Randomizing for Experiment

How can the dowser be correct 15 times?

If ten buckets actually have water, and ten are guessed to have water, the number of correct guesses, out of twenty, has to be even.

That alone should tell you that something is funny, that the situation is not the usual binomial one.
Maybe I'm wrong, but if there are ten buckets with water and ten without and the water witch gets 7 right and 3 wrong when there's water and 8 right and 2 wrong when not, he has 15 right. An odd number. Right?
 
Maybe I'm wrong, but if there are ten buckets with water and ten without and the water witch gets 7 right and 3 wrong when there's water and 8 right and 2 wrong when not, he has 15 right. An odd number. Right?

If he gets 7 right and 3 wrong when there's water, that means he guessed "water" 7 times and "no water" 3 times. And if he gets 8 right and 2 wrong when there's not water, that means he guessed "water" 2 times and "no water" 8 times. That's a total of "water" 9 times and "no water" 11 times. The only way to get an odd number is to guess "water" a number of times other than 10.
 
I think you simply have a mistaken idea of what independence is. Independence is a statement about the joint distribution over trials, not about what each trial looks like independently. The definition of independence is that A and B are independent given background information X exactly when P(AB|X) = P(A|X)P(B|X).

So is one pot being prayed for independent of another pot being prayed for? We know that P(A|X)=0.5 and P(B|X)=0.5, so is P(AB|X)=0.25?

P(AB|X) = P(A|BX)*P(B|X) = 0.5*P(A|BX). As expected, the condition is that P(A|BX) must still be 0.5 to make them be independent. But it's not, since if B, then there are only 9 prayed pots left out of 19 and P(A|BX) = 9/19. Not independent, even though P(A|X)=0.5.
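The arithmetic above can be checked with exact fractions. This is only a minimal sketch, assuming 20 pots of which exactly 10 are prayed for; the variable names are illustrative and not from the thread.

```python
from fractions import Fraction
from math import comb

# Assumed setup from the thread: 20 pots, exactly 10 of them prayed for.
N, K = 20, 10

# P(A|X): a given pot is prayed for.
p_a = Fraction(K, N)

# P(A|BX): pot A is prayed for, given that pot B is. One prayed pot is
# already accounted for, leaving 9 prayed pots among the other 19.
p_a_given_b = Fraction(K - 1, N - 1)

# P(AB|X) by the product rule, and again by counting the 10-pot
# subsets that contain both A and B.
p_ab = p_a_given_b * p_a
p_ab_by_counting = Fraction(comb(N - 2, K - 2), comb(N, K))

print(p_a, p_a_given_b, p_ab, p_ab_by_counting)
print(p_ab == p_a * p_a)  # independence would require this to be True
```

Both routes give 9/38 for P(AB|X), which is a little under the 1/4 that independence would need.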


Ah. The explanation is that the guesser is not choosing 1 pot from 10 prayed and 10 not-prayed pots, 10 times. Then it would be binomial with n=10, p=0.5. Instead the guesser is choosing 10 pots known to be distinct from 20 pots. In the first case he could choose the same pot more than once. In this case he cannot. That is where the difference from binomial comes in.



See, I don't think the guesser has what you're calling "background info x". Without it, the trials are independent. The guesser knows that it's 10/10. That's it. But, your X appears to be feedback on each trial, which will not be given.

Just to summarize where I'm at, still admitting that I might be wrong:

1) I don't see how a guessing strategy can influence the probability of an outcome. The plants were either prayed for or not. You know that of the 20, 10 fall in each category. You don't know-- unless prayer works-- which 10 are which.

2) There are 20 place markers here. 10 can be filled with prayed for plants; 10 with non prayed for plants. The selector even knows this.

3) It follows that once 10 of either type have been placed, whatever trials remain must be of the other type. I understand and agree (I characterized it as a degrees of freedom deal).

This is the argument that the trials are not independent? Fine, but the guesser is not privy to when in the sequence 10 of one type have occurred (because he's not getting trial by trial feedback). So, he can't use the info to his advantage, which means he can't change the probability on later trials.

similarly:

4) Someone above calculated that there are x number of possible "samples" that meet the requirement in #3 above (the 10/10 deal).

For all of these samples, once 10 of either type have been placed, the remaining trials must be of the other type.

5) I suspect there are exactly as many samples where the trial 20 plant is prayed for as there are where the trial 20 plant is not prayed for (i.e., that the probability at trial 20 is .50, SUMMED across all possible samples that meet the 10/10 requirement).

Given that, I concluded that the trials are independent. Without getting feedback on each trial, there is no X in your posted formula for the guesser to use.

Since any sample of 10/10 is equally likely to be the sample you're asked to judge, and since those samples are normally distributed, it's p = .5 that you get any single trial correct, and that does not depend on how you guessed on prior trials. Nor can you use a sampling without replacement strategy as you don't know the true placement on any trial til after you've guessed on all trials.

Thoughts? Thanks!
 
best I can tell, the independence assumption means that the p of getting it right on any trial does not depend on the p of getting it right on any other trial.

Ok, but how can we decide whether one thing depends on another? Only by changing the first---if not in reality then at least in imagination---and seeing whether the second changes too.

Suppose I go to the post office to mail a half-ounce letter. I'm told that the postage is 41 cents. From this information alone, can I tell whether the postage depends on the weight of the letter?

No. I have to ask, Would the postage be any different if the letter weighed two ounces?

It doesn't weigh two ounces. It weighs half an ounce. That isn't going to change in reality. But, in order to decide whether the postage does or does not depend on the weight, I need to consider an imaginary scenario in which the weight differs from its actual value.

Similarly, to determine whether two events are independent, it is not enough to look at what their probabilities are. I need to look at what the probability of one would be, if the probability of the other were different. In reality, the guesser is not given feedback, but would feedback be useful to him if it were given?
 
If he gets 7 right and 3 wrong when there's water, that means he guessed "water" 7 times and "no water" 3 times. And if he gets 8 right and 2 wrong when there's not water, that means he guessed "water" 2 times and "no water" 8 times. That's a total of "water" 9 times and "no water" 11 times. The only way to get an odd number is to guess "water" a number of times other than 10.

How about he guessed 6/4 and 9/1. Oops that's still an odd number.
 
1) I don't see how a guessing strategy can influence the probability of an outcome. The plants were either prayed for or not. You know that of the 20, 10 fall in each category. You don't know-- unless prayer works-- which 10 are which.

The guesser is not being judged on the correctness of any single guess. He's being judged on the total number of correct guesses. He has no strategy that will improve his chances of guessing correctly about any single plant considered in isolation, but the overall strategy of making ten "prayed" guesses and ten "normal" guesses will tend to increase the total number of correct guesses, compared to the strategy of making twenty independent guesses, e.g. by flipping a coin for each guess.

ETA: I should rephrase this. The expected number of correct guesses is still 10. So, in that sense, it does not "tend to increase to total number of correct guesses". But it increases the probability that the number of correct guesses will be high. (Balancing this, it also increases the probability that the number of correct guesses will be low. The middle cases lose out, then. I think that's right. I should make another graph.)

ETA again: I'm having trouble making a helpful graph. But never mind whether I can give an intuitive explanation. The numbers are what they are. I didn't make them up. I calculated them. I stand by post #11.
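In place of a graph, the two strategies can be compared numerically. The following is a rough sketch under stated assumptions: 20 plants, 10 prayed for, every 10-plant subset equally likely, and a "balanced" guesser who makes exactly ten "prayed" guesses versus a guesser who flips a fair coin per plant. The function names are made up for illustration.

```python
from fractions import Fraction
from math import comb

N, K = 20, 10  # assumed: 20 plants, 10 prayed for, 10 "prayed" guesses

def balanced_pmf():
    """P(correct = c) when exactly ten "prayed" guesses are made.

    If k of the ten "prayed" guesses hit prayed plants, then k of the
    "normal" guesses are right too, so c = 2k (hypergeometric in k).
    """
    total = comb(N, K)
    return {2 * k: Fraction(comb(K, k) * comb(N - K, K - k), total)
            for k in range(K + 1)}

def coin_flip_pmf():
    """P(correct = c) for twenty independent fair-coin guesses."""
    return {c: Fraction(comb(N, c), 2 ** N) for c in range(N + 1)}

def mean(pmf):
    return sum(c * p for c, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum((c - m) ** 2 * p for c, p in pmf.items())

def tail(pmf, at_least):
    return sum(p for c, p in pmf.items() if c >= at_least)

balanced, coin = balanced_pmf(), coin_flip_pmf()
print("mean:    ", mean(balanced), mean(coin))
print("variance:", variance(balanced), variance(coin))
print("P(>=16): ", float(tail(balanced, 16)), float(tail(coin, 16)))
```

Under these assumptions both means are 10, while the variance is 100/19 for the balanced strategy against 5 for coin flips, which is consistent with more weight in the high and low tails.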

2) There are 20 place markers here. 10 can be filled with prayed for plants; 10 with non prayed for plants. The selector even knows this.

3) It follows that once 10 of either type have been placed, whatever trials remain must be of the other type. I understand and agree (I characterized it as a degrees of freedom deal).

This is the argument that the trials are not independent? Fine, but the guesser is not privy to when in the sequence 10 of one type have occurred (because he's not getting trial by trial feedback). So, he can't use the info to his advantage, which means he can't change the probability on later trials.

He can avoid making five "prayed" guesses and fifteen "normal" guesses, for example. That's an advantage. He's eliminated from consideration a sequence of guesses that's sure to be wrong.

similarly:

4) Someone above calculated that there are x number of possible "samples" that meet the requirement in #3 above (the 10/10 deal).

For all of these samples, once 10 of either type have been placed, the remaining trials must be of the other type.

5) I suspect there are exactly as many samples where the trial 20 plant is prayed for as there are where the trial 20 plant is not prayed for (i.e., that the probability at trial 20 is .50, SUMMED across all possible samples that meet the 10/10 requirement).

Yes, that's right.

Given that, I concluded that the trials are independent. Without getting feedback on each trial, there is no X in your posted formula for the guesser to use.

No. Still not independent.

Since any sample of 10/10 is equally likely to be the sample you're asked to judge,

Yes.

and since those samples are normally distributed,

No.

it's p = .5 that you get any single trial correct,

Yes.

and that does not depend on how you guessed on prior trials.

It depends on whether your previous guesses were in fact correct, even though, at the time, you don't know whether they were or not.

Nor can you use a sampling without replacement strategy as you don't know the true placement on any trial til after you've guessed on all trials.

In effect, you do use a sampling without replacement strategy, by making exactly ten "prayed" guesses and ten "normal" guesses rather than any other combination.
 
See, I don't think the guesser has what you're calling "background info x". Without it, the trials are independent. The guesser knows that it's 10/10. That's it. But, your X appears to be feedback on each trial, which will not be given.
Nope. X is just the background info everyone agrees the guesser has, namely that 10 of the 20 pots use prayed water. It's only there to remind us that the guesser does know something, that he knows and will try if he can to take into account the given information - exactly 10 of 20 are prayed for.

1) I don't see how a guessing strategy can influence the probability of an outcome. The plants were either prayed for or not. You know that of the 20, 10 fall in each category. You don't know-- unless prayer works-- which 10 are which.

2) There are 20 place markers here. 10 can be filled with prayed for plants; 10 with non prayed for plants. The selector even knows this.

3) It follows that once 10 of either type have been placed, whatever trials remain must be of the other type. I understand and agree (I characterized it as a degrees of freedom deal).
Correct on all counts. But also note that "once 10 of either type have been placed, whatever trials remain must be of the other type" is not everything that follows.

This is the argument that the trials are not independent? Fine, but the guesser is not privy to when in the sequence 10 of one type have occurred (because he's not getting trial by trial feedback). So, he can't use the info to his advantage, which means he can't change the probability on later trials.
On each trial, he has a 0.5 chance of being right. Everyone agrees about this. What we don't agree about, apparently, is whether he has a 25% chance of being right twice in a row for the first two trials. I claim he doesn't.
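The "right twice in a row" claim can be checked by brute force. A minimal sketch, assuming 20 pots with every 10-pot subset equally likely to be the prayed set; `p_first_two_right` is a hypothetical helper, not something from the thread.

```python
from fractions import Fraction
from itertools import combinations

N, K = 20, 10  # assumed: 20 pots, exactly 10 prayed for

def p_first_two_right(guess_first, guess_second):
    """Chance that the guesses for pots 0 and 1 are both right.

    Each guess is True for "prayed" and False for "normal". Every
    10-pot subset is taken to be equally likely to be the prayed set.
    """
    hits = total = 0
    for prayed in combinations(range(N), K):
        total += 1
        if (0 in prayed) == guess_first and (1 in prayed) == guess_second:
            hits += 1
    return Fraction(hits, total)

same = p_first_two_right(True, True)     # guesses "prayed", "prayed"
mixed = p_first_two_right(True, False)   # guesses "prayed", "normal"
print(same, float(same))
print(mixed, float(mixed))
```

Neither case comes out to exactly 25%: guessing the same label twice gives 9/38, and guessing different labels gives 5/19.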

4) Someone above calculated that there are x number of possible "samples" that meet the requirement in #3 above (the 10/10 deal).

For all of these samples, once 10 of either type have been placed, the remaining trials must be of the other type.

5) I suspect there are exactly as many samples where the trial 20 plant is prayed for as there are where the trial 20 plant is not prayed for (i.e., that the probability at trial 20 is .50, SUMMED across all possible samples that meet the 10/10 requirement).
You'd be right on both counts.

Given that, I concluded that the trials are independent. Without getting feedback on each trial, there is no X in your posted formula for the guesser to use.
This is where you're going wrong. (Side note: X is what we agreed his information was. I didn't introduce information you explicitly said he doesn't get, don't worry.) That's not enough to conclude the trials are independent. Just use the simple example of dependent variables we saw earlier this thread - two coins, you know one is heads and the other tails. There are exactly as many samples where the trial 2 coin is heads as there are where the trial 2 coin is tails (i.e. the probability at trial 2 is .50, SUMMED across all possible samples that meet the one-heads/one-tails requirement).

And yet the two trials are not independent. This shows that at least you cannot conclude "Given that" that the trials are independent.

Since any sample of 10/10 is equally likely to be the sample you're asked to judge, and since those samples are normally distributed, it's p = .5 that you get any single trial correct, and that does not depend on how you guessed on prior trials. Nor can you use a sampling without replacement strategy as you don't know the true placement on any trial til after you've guessed on all trials.
Yes. It's p=0.5 that you get any single trial correct. Everyone agrees. It doesn't depend on prior trial guessing. Clearly. There is no way possible to have a greater than p=0.5 chance to get any single trial correct. Just like in the two coin example. The point is that guesses are not independent. In the two coin example, you're either going to get 0 or 2 guesses correct. Each single trial has p=0.5 of being correct but you cannot possibly get one right and one wrong (given the background information X - not "whether you got trial 1 right", but rather "don't guess heads for both coins, silly").
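The two-coin example is small enough to enumerate completely. A sketch, assuming both arrangements are equally likely and the guesser picks either sensible guess with equal chance:

```python
from fractions import Fraction
from itertools import permutations

# Assumed setup: two coins, known to show one head and one tail.
arrangements = sorted(set(permutations("HT")))  # [('H', 'T'), ('T', 'H')]
guesses = arrangements  # a sensible guesser never says H,H or T,T

counts = {}      # number of correct guesses -> how many (guess, truth) pairs
first_right = 0  # pairs in which the guess for coin 1 is right
for guess in guesses:
    for truth in arrangements:
        correct = sum(g == t for g, t in zip(guess, truth))
        counts[correct] = counts.get(correct, 0) + 1
        first_right += guess[0] == truth[0]

pairs = len(guesses) * len(arrangements)
for correct, n in sorted(counts.items()):
    print(correct, "correct:", Fraction(n, pairs))
print("coin 1 right:", Fraction(first_right, pairs))
```

Only 0 and 2 ever appear as totals, even though the guess for coin 1 alone is right half the time.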

You have to separate your intuitions about the means, which appear to be correct, from the facts about the variances (and higher moments).

Can we make this easier so that everyone can show explicit calculations? Apparently the 2-coin example was too simple (though it's perfectly fine). Shall we continue the discussion as if there were 4 plants, 2 of which should get prayed-for water? Here's what happens.

Each plant gets prayed water with probability 0.5:
Probability of getting k guesses right:
k=0: 1/16
k=1: 4/16
k=2: 6/16
k=3: 4/16
k=4: 1/16

Exactly 2 plants at random are chosen to get prayed water:
Probability of getting k guesses right:
k=0: 1/6
k=1: 0/6
k=2: 4/6
k=3: 0/6
k=4: 1/6

Are these the two distributions in question or do you have a different one that you're thinking of?
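Both tables can be reproduced by enumeration. This is a sketch under the assumptions of the post: 4 plants, a guesser who marks exactly two as prayed, and either 16 equally likely assignments (independent watering) or 6 (exactly two prayed). The helper name `correct_distribution` is invented for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

PLANTS, PRAYED = 4, 2  # assumed small case from the post

def correct_distribution(truths, guess):
    """Distribution of correct guesses when each truth is equally likely."""
    tally = Counter(sum(g == t for g, t in zip(guess, truth))
                    for truth in truths)
    return {k: Fraction(tally[k], len(truths)) for k in range(PLANTS + 1)}

all_truths = list(product([True, False], repeat=PLANTS))   # 16 cases
constrained = [t for t in all_truths if sum(t) == PRAYED]  # 6 cases
guess = (True, True, False, False)  # any guess with two "prayed" will do

for name, truths in [("independent", all_truths), ("exactly 2", constrained)]:
    dist = correct_distribution(truths, guess)
    print(name, [str(dist[k]) for k in range(PLANTS + 1)])
```

The fractions print in lowest terms, so 4/16 shows as 1/4 and 4/6 as 2/3.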
 
ETA: [...] So, in that sense, it does not "tend to increase to total number of correct guesses".

Bah. I edited in a typo, instead of editing one out. "to total number" should be "the total number", of course.
 
See, I don't think the guesser has what you're calling "background info x". Without it, the trials are independent. The guesser knows that it's 10/10. That's it. But, your X appears to be feedback on each trial, which will not be given.

Just to summarize where I'm at, still admitting that I might be wrong:

1) I don't see how a guessing strategy can influence the probability of an outcome. The plants were either prayed for or not. You know that of the 20, 10 fall in each category. You don't know-- unless prayer works-- which 10 are which.

2) There are 20 place markers here. 10 can be filled with prayed for plants; 10 with non prayed for plants. The selector even knows this.

3) It follows that once 10 of either type have been placed, whatever trials remain must be of the other type. I understand and agree (I characterized it as a degrees of freedom deal).

This is the argument that the trials are not independent? Fine, but the guesser is not privy to when in the sequence 10 of one type have occurred (because he's not getting trial by trial feedback). So, he can't use the info to his advantage, which means he can't change the probability on later trials.
This is fun...

Ok, let's try a simpler analogous problem.

Assume we have 20 plants and we are going to make either the first 10 holy or the last 10 holy.

If you know this, and you want to make a guess about what's going on, you can either guess:
1) First 10 holy.
or
2) Second 10 holy.

So it's no harder to guess correctly than it is to guess a single coin toss. Given the knowledge of what one plant's status is, we would know exactly what every other plant's status is like. There is no independence here.

Despite this, the bit in bold below is still correct, but your conclusion is wrong.

5) I suspect there are exactly as many samples where the trial 20 plant is prayed for as there are where the trial 20 plant is not prayed for (i.e., that the probability at trial 20 is .50, SUMMED across all possible samples that meet the 10/10 requirement).

Given that, I concluded that the trials are independent. Without getting feedback on each trial, there is no X in your posted formula for the guesser to use.

The problem is, you're using the wrong definition of independence.
Normally people say that if,

[latex]$P(X=x)\neq P(X=x|Y=y)$[/latex] for any x and y then X and Y aren't independent, because then some knowledge of Y may change the probability that X=x .

You on the other hand are averaging across all possible fixings of Y, when you test for independence:

[latex]$P(X=x) = \frac{\sum_{y} P(X=x|Y=y)}{|Y|}$[/latex] Which is always going to be true, whether or not X is dependent on Y.
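The difference between the two tests can be seen on a small case. A sketch, assuming 4 plants with exactly 2 prayed for and all 6 layouts equally likely; here X is "plant 0 is prayed for" and Y is "plant 1 is prayed for", and the helper `prob` is made up for the illustration.

```python
from fractions import Fraction
from itertools import combinations

# Assumed small case: 4 plants, exactly 2 prayed for, 6 equally likely layouts.
layouts = [set(c) for c in combinations(range(4), 2)]

def prob(event, given=lambda layout: True):
    """P(event | given) by counting equally likely layouts."""
    pool = [layout for layout in layouts if given(layout)]
    return Fraction(sum(1 for layout in pool if event(layout)), len(pool))

x = lambda layout: 0 in layout          # X: plant 0 is prayed for
y = lambda layout: 1 in layout          # Y: plant 1 is prayed for
not_y = lambda layout: 1 not in layout  # Y is false

p_x = prob(x)
p_x_given_y = prob(x, given=y)
p_x_given_not_y = prob(x, given=not_y)
averaged = (p_x_given_y + p_x_given_not_y) / 2

print(p_x, p_x_given_y, p_x_given_not_y, averaged)
```

The conditional probabilities are 1/3 and 2/3, so X depends on Y, yet their average is still 1/2. The plain average matches P(X) here because both values of Y are equally likely.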
 
If he gets 7 right and 3 wrong when there's water, that means he guessed "water" 7 times and "no water" 3 times. And if he gets 8 right and 2 wrong when there's not water, that means he guessed "water" 2 times and "no water" 8 times. That's a total of "water" 9 times and "no water" 11 times. The only way to get an odd number is to guess "water" a number of times other than 10.
I don't think so. It seems to me that it is possible to get any number from 0 to 20 correct. Every guess is either right or wrong. Right?
 
I don't think so. It seems to me that it is possible to get any number from 0 to 20 correct. Every guess is either right or wrong. Right?
Here are the assumptions: There are 20 pots. Exactly 10 will be god-pots. You know both of those things but of course don't know which pots are which. Thus you pick 10 at random and guess that they are the god-pots, then guess that the other 10 are not god-pots.

Claim: The number of correct guesses is even.

Proof: Suppose the number of pots which you guessed are god-pots that are actually god-pots is k. Then you guessed (incorrectly) that exactly 10-k normal pots were god pots. Therefore you guessed that the rest of the normal pots were normal pots: 10-(10-k)=k. So you made k correct god-pot guesses and k correct normal pot guesses for a total of 2k correct guesses, which is even.
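The claim can also be confirmed by brute force over every possible guess. A sketch, assuming 20 pots, 10 god-pots and exactly 10 "god-pot" guesses; fixing the true god-pots as pots 0 to 9 is an arbitrary choice made for the example.

```python
from itertools import combinations

N, K = 20, 10  # assumed: 20 pots, 10 god-pots, 10 "god-pot" guesses

# By symmetry the truth can be fixed; here pots 0..9 are the god-pots.
god_pots = set(range(K))
normal_pots = set(range(N)) - god_pots

seen = set()  # every total of correct guesses that can occur
for guess in combinations(range(N), K):
    guessed_god = set(guess)
    correct_god = len(god_pots & guessed_god)        # god-pots called god-pots
    correct_normal = len(normal_pots - guessed_god)  # normal pots called normal
    seen.add(correct_god + correct_normal)

print(sorted(seen))
```

Across all 184,756 possible guesses, only even totals from 0 to 20 occur.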
 
I suspect you guys are right, I just can't get my head around it and need more time to think (and sulk!).

Very educational and jref-mission like.

I shall be back though with another bad argument, I suspect :)
 
Ok, before I go gently into the night, lemme see what this looks like format-wise then come back to it.

4 trials / plants. 2 are Gods 2 are not:


1,2,3,4
t,t,t,t
t,t,t,f
t,t,f,t
t,t,f,f **
t,f,t,t
t,f,t,f **
t,f,f,t **
t,f,f,f
f,t,t,t
f,t,t,f **
f,t,f,t **
f,t,f,f
f,f,t,t **
f,f,t,f
f,f,f,t
f,f,f,f

So, I agree, without the constraint that half be god plants, there's 16 possible placements. With the constraint there's only 6.

Roger that. More later..
 
ok, still thinking. Here's the 6 possible trials of 4 plants where 2 are prayed for and 2 are not:

1,2,3,4
t,t,f,f
t,f,t,f
t,f,f,t
f,t,t,f
f,t,f,t
f,f,t,t

Any series is equally likely to be the one picked for the guesser.

For any of the 4 trials, p = .5 as looking down the columns there are always three t's and three f's.
 
ok, still thinking. Here's the 6 possible trials of 4 plants where 2 are prayed for and 2 are not:

1,2,3,4
t,t,f,f
t,f,t,f
t,f,f,t
f,t,t,f
f,t,f,t
f,f,t,t

Any series is equally likely to be the one picked for the guesser.

For any of the 4 trials, p = .5 as looking down the columns there are always three t's and three f's.
Okay, let's take the last step. Those 6 possibilities are also the six ways you could guess. Here's a table of possibilities, numbered:
Code:
   1,2,3,4
1: t,t,f,f
2: t,f,t,f
3: t,f,f,t
4: f,t,t,f
5: f,t,f,t
6: f,f,t,t
Now let's see how many you get right if you, say, guess pattern 4 when the reality is pattern 6. In that case you'd correctly guess "f" for the first trial and correctly guess "t" for the third trial: 2 correct guesses.
Code:
   1 2 3 4 5 6
   -----------
1 |4 2 2 2 2 0
2 |2 4 2 2 0 2
3 |2 2 4 0 2 2
4 |2 2 0 4 2 2
5 |2 0 2 2 4 2
6 |0 2 2 2 2 4
The expected number of correct guesses is still 2, but as soon as you start computing some kind of aggregate statistic like "how likely is it you guess at least 2 correctly?" wonkiness appears. The answer to that question is clearly 5/6, but if it were binomial the answer would be 6/16+4/16+1/16 = 11/16 < 5/6. So if you do the math as if it were binomial then you'd conclude, if asking that question, that the guesser got luckier (i.e. prayer was more likely to have worked) than he really did.

ETA: Or if you say that half the total correct guesses is binomial in N/2, then you get 2/4+1/4 = 3/4 < 5/6 for the probability of guessing 2*1 or more.
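The table and the 5/6 figure can be regenerated mechanically. A sketch, assuming 4 trials with exactly 2 "t" and every (guess, reality) pair equally likely:

```python
from fractions import Fraction
from itertools import combinations

# Assumed small case: 4 trials, exactly 2 of them "t".
# combinations() yields the patterns in the same order as the table.
patterns = [tuple(i in chosen for i in range(4))
            for chosen in combinations(range(4), 2)]

# matrix[g][r]: correct guesses when pattern g is guessed and r is real.
matrix = [[sum(a == b for a, b in zip(guess, real)) for real in patterns]
          for guess in patterns]
for row in matrix:
    print(*row)

cells = [c for row in matrix for c in row]
expected = Fraction(sum(cells), len(cells))
p_at_least_2 = Fraction(sum(c >= 2 for c in cells), len(cells))
binomial_at_least_2 = Fraction(6 + 4 + 1, 16)

print(expected, p_at_least_2, binomial_at_least_2)
```

The last line gives the expected number correct, then the chance of at least 2 correct under the 2-of-4 constraint, then the binomial figure for comparison.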
 
I do have a procedural question and I am not sure if it was covered in all the math here. Amapola, were you planning to have him walk down the row and say "Prayed" or "Non-prayed" for each pot? Would you allow him to go back and change his mind? For example, if he uses up his 10 "Prayed" options before reaching the end of the row and one of the last plants looks healthier, can he go back and reassign a pot to "Non-Prayed"? Would this affect the results of the experiment at all? I could see the selection process giving him an excuse regardless of how it was carried out.

Has this already been covered?
 
Here are the assumptions: There are 20 pots. Exactly 10 will be god-pots. You know both of those things....
That's the problem here. The second assumption is not necessarily true. As I said in post #39, they would be independent, "especially if he did not know how many were in each group."
 
That's the problem here. The second assumption is not necessarily true. As I said in post #39, they would be independent, "especially if he did not know how many were in each group."

Okay, let's look at post 39 then. Here are the relevant parts: 69dodge's setup, and your response.

69dodge said:
There are twenty plants. Ten of them were given prayed-for water. Ten were given regular water. The guy knows this; he just doesn't know which are which. He will try to decide which are which.
Jeff Corey said:
I am assuming that he will make all his guesses in one sitting and not be given feedback until all guesses are recorded. This would make each guess independent of the previous ones, especially if he were not told how many were in each group.
Yes, he will make all his guesses in one sitting. Yes, he will not be given feedback until all guesses are recorded. No, this would not make each guess independent of the previous ones. Yes, if he were not told (i.e. if the experiment is set up differently than 69dodge stipulates) how many were in each group, then each guess will be independent of the previous ones.

Sure, fine. Change the experiment, change the results. I've got no problem with that. Let's summarize:

20 pots, each chosen to be a god-pot or not independently with probability 0.5, and the guesser does not know this: binomial.

20 pots, each chosen to be a god-pot or not independently with probability 0.5, and the guesser knows this: binomial.

20 pots, a random 10 of which are chosen to be a god-pot and the others not, and the guesser does not know this: binomial.

20 pots, a random 10 of which are chosen to be a god-pot and the others not, and the guesser knows this: not binomial.

ETA: if a random 10 are chosen and the guesser doesn't know, the results are binomial from the standpoint of the guesser. But if the guesser declares he will guess in a certain way (such as always guessing 10 to be god-pots) then the experimenters' probabilities are no longer binomial. Anyone who has both pieces of information can no longer model it with a binomial distribution.
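The role of the guessing strategy can be illustrated on a scaled-down case. This is only a sketch: it assumes 4 pots with 2 god-pots, models the guesser who doesn't know the split as flipping a fair coin per pot (an assumption, not something stated in the thread), and models the informed guesser as picking any 2 pots at random.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed small case: 4 pots, 2 god-pots; the truth is fixed by symmetry.
truth = (True, True, False, False)

def distribution(guess_patterns):
    """Correct-guess distribution when each guess pattern is equally likely."""
    tally = Counter(sum(g == t for g, t in zip(guess, truth))
                    for guess in guess_patterns)
    return {k: Fraction(tally[k], len(guess_patterns)) for k in range(5)}

every_pattern = list(product([True, False], repeat=4))
# Guesser unaware of the 2/2 split: modeled as a fair coin flip per pot.
unaware = distribution(every_pattern)
# Guesser who always calls exactly 2 god-pots.
balanced = distribution([g for g in every_pattern if sum(g) == 2])

print([str(unaware[k]) for k in range(5)])
print([str(balanced[k]) for k in range(5)])
```

With the same fixed set of god-pots, the coin-flipping guesser gets the binomial 1/16, 4/16, 6/16, 4/16, 1/16, and the guesser who always names exactly 2 gets 1/6, 0, 4/6, 0, 1/6.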
 
I do have a procedural question and I am not sure if it was covered in all the math here. Amapola, were you planning to have him walk down the row and say "Prayed" or "Non-prayed" for each pot? Would you allow him to go back and change his mind? For example, if he uses up his 10 "Prayed" options before reaching the end of the row and one of the last plants looks healthier, can he go back and reassign a pot to "Non-Prayed"? Would this affect the results of the experiment at all? I could see the selection process giving him an excuse regardless of how it was carried out.

Has this already been covered?

I've been quoted! *Amapola faints*

I don't see how his changing his mind would affect results but on the other hand some of this math is really over my head... I don't think it would matter, because he would finally have to make his choices (whatever they are) and the results will still (in my view) be pure guess work. (Maybe I shouldn't be so doubtful about the power of prayer! ;)) But the way I am seeing it is, the coin doesn't "know" you changed your mind mid-flip; it will still either be heads or tails no matter how unsure the guesser might be.
 
Okay, let's look at post 39 then. Here are the relevant parts: 69dodge's setup, and your response.



Yes, he will make all his guesses in one sitting. Yes, he will not be given feedback until all guesses are recorded. No, this would not make each guess independent of the previous ones. Yes, if he were not told (i.e. if the experiment is set up differently than 69dodge stipulates) how many were in each group, then each guess will be independent of the previous ones.

Sure, fine. Change the experiment, change the results. I've got no problem with that. Let's summarize:

20 pots, each chosen to be a god-pot or not independently with probability 0.5, and the guesser does not know this: binomial.

20 pots, each chosen to be a god-pot or not independently with probability 0.5, and the guesser knows this: binomial.

20 pots, a random 10 of which are chosen to be a god-pot and the others not, and the guesser does not know this: binomial.

20 pots, a random 10 of which are chosen to be a god-pot and the others not, and the guesser knows this: not binomial.

ETA: if a random 10 are chosen and the guesser doesn't know, the results are binomial from the standpoint of the guesser. But if the guesser declares he will guess in a certain way (such as always guessing 10 to be god-pots) then the experimenters' probabilities are no longer binomial. Anyone who has both pieces of information can no longer model it with a binomial distribution.

I was suggesting number 3 as part of the procedure; maybe I should have been more explicit.
I don't get the part about the guesser declaring he will guess in a certain way being any different from deciding he will guess in a certain way and not declaring it, or having a bias to guess in a certain way and not realizing it.
 
