Originally Posted by Kuko 4000:
Do you mean that when the volunteers have chosen the profile that they got the most hits in, I, or whoever works as the middle man, would then see how the points compare with the overall point distribution, and if the target persons have a significantly higher rate of "hits" than the average, the test is a success? Damn the language barrier, it makes certain kinds of thinking very difficult, especially when the field is pretty much unknown to me, apologies for that.
Yes, and then compare the results to the null hypothesis - which in this case is the hypothesis that the astrologer's profiles fit their target no better than a randomly selected one.
Here's a possible procedure:
1) find the average (over everyone) number of hits participants give to profiles that are NOT their own
2) find the average (over those that had one) number of hits that the participants gave to their own profile
3) based on 1) and 2), determine with what confidence you can reject the null hypothesis. Basically the question you're asking is this: given a set of randomly distributed numbers with mean and variance as in 1), what is the probability that the three(?) additional such numbers in 2) differ from 1) by as much as x (where x is the amount by which 2) differs from 1) in the real data)? If that probability is less than .05 or so, you can call the result significant.
Computing that probability is simple given some assumptions about the distribution in 1). You could also write a little computer code to simulate the experiment (assuming the null hypothesis) to check that.
Alternatively you might use some statistic other than the average in 2). I don't see what would be better, but perhaps there is something.
You should also decide in advance whether you would accept anomalously low scores in 2) as significant (i.e. if the astrologer's profiles fit their subjects much worse than a random profile does, do you consider that evidence for anything?). I'd say not, which affects the calculation of significance (you use a one-tailed test instead of a two-tailed one).
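For what it's worth, the simulation mentioned above could look something like the sketch below. It is a resampling (permutation-style) check rather than the parametric calculation described in step 3: under the null hypothesis the own-profile scores are just a random draw from all the scores, so we repeatedly draw that many scores at random and count how often their total is at least as high as the real one (one-tailed). The function name and the example numbers are made up for illustration; they are not data from the test.

```python
import random

def one_tailed_p_value(own_hits, other_hits, n_sim=20_000, seed=1):
    """Estimate P(a random draw scores at least as high as own_hits) under the null."""
    rng = random.Random(seed)  # fixed seed so reruns give the same estimate
    pooled = list(own_hits) + list(other_hits)
    k = len(own_hits)
    observed_total = sum(own_hits)
    at_least_as_high = 0
    for _ in range(n_sim):
        # Under the null, any k scores from the pool could have been the "own" ones.
        if sum(rng.sample(pooled, k)) >= observed_total:
            at_least_as_high += 1
    return at_least_as_high / n_sim

# Hypothetical numbers only: 3 own-profile scores and 21 scores given to other profiles.
own = [8, 7, 9]
others = [4, 5, 3, 4, 6, 2, 5, 4, 3, 5, 4, 6, 3, 4, 5, 2, 4, 5, 3, 4, 6]
p = one_tailed_p_value(own, others)
print(f"estimated one-tailed p-value: {p:.4f}")
```

An estimate below .05 would count as significant at the usual level. A two-tailed version would also have to count totals that are anomalously low.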
ETA: If he knows there's 5 of each, he could guess all yes or all no and be sure to get 5 of 10 right.
My problem with this protocol is twofold: 1) it provides a source of possible information leakage and 2) hit-counting can be an exercise in retrofitting.
Discussion on protocol suggestion #1
But if he guessed all yes or all no he would be sure to get 5 right.
Discussion on protocol suggestion #2
(My bold.)
I don't think that's accurate, let's recap:
I have a pool of 10 participants. 5 of them have a history of substance abuse, 5 of them have not = Y/N (50 / 50) situation for the astrologer in each guess. He needs to connect the Y/N answer to the birth details. He knows that there are 5 of each, but as far as my brain tells me, he could still get them all wrong.
I do like this method better. If nothing else, it makes the math easier to do. (We effectively have 10 participants that could be all Y, all N, or--most likely--some mixture of Ys and Ns.)
Anyways, I like this approach.
Suggestion:
I have a pool of 20 volunteers over 30 years of age.
10 of them have a history of substance abuse and 10 of them have not.
I will randomly choose (by flipping a coin) 10 participants out of these 20 volunteers.
The astrologer will try to connect the birth details of the 10 participants with the substance abuse.
But if he guessed all yes or all no he would be sure to get 5 right.
Observations:
The pool of participants is 10.
In option 1) the average can only be counted from 7 participants, because the test will be considered as a fail if any of the target participants chooses the wrong date.
Ok, question:
How about if the average number of hits in option 1) is 4?
In this case, what would the average number of hits have to be in option 2) to reach a statistically significant result, and what would it have to be if I wanted the odds to be around 1:500?
At the moment I'm concentrating my efforts on this protocol:
It is much simpler to work with, and the astrologer is happy as well.
I will update the thread as soon as new info emerges.
And of course it doesn't rely on the business of subjects trying to retrofit hits to some profile (and the Forer Effect experiments show that we can expect something like an 80% success rate even when we know astrology wasn't used).
OK, that is much simpler. Of course you're testing a different claim (that the astrologer can identify substance abusers rather than write generally accurate profiles), but it will still be interesting.
That agrees pretty much with what I calculated. Good.
As you can see, 8/10 is just barely not significant at the 5% level (there's about a 5.5% chance of getting 8/10 or better given my null hypothesis). 9/10 is 1.1%, and 10/10 is .1% (highly significant).
So he needs 9/10 or 10/10 for a significant result with this protocol.
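Those figures can be reproduced with a few lines of Python, under the same null hypothesis (10 independent guesses, each right with probability .5). `tail_prob` and `min_score` are just names picked for this sketch.

```python
from math import comb

def tail_prob(k, n=10, p=0.5):
    """P(at least k correct out of n) when each guess is right with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def min_score(alpha, n=10, p=0.5):
    """Smallest score whose 'that score or better' probability is below alpha."""
    for k in range(n + 1):
        if tail_prob(k, n, p) < alpha:
            return k
    return None  # no score is rare enough

for k in (8, 9, 10):
    print(f"{k}/10 or better: {tail_prob(k):.4f}")

print("needed at the 5% level:", min_score(0.05))
print("needed for about 1 in 500:", min_score(1 / 500))
```

With these assumptions the 5% threshold is 9/10, and odds of about 1 in 500 are only reached at 10/10.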
Yep. As I said earlier, if he knows it's 5 of 10 it would be trivial for him to get 5 correct 100% for sure simply by guessing all Ys or all Ns. Kuko came up with the idea of starting with a pool of 20 (10 Ys and 10 Ns) and randomly selecting 10 from that pool. Plus it makes the math a lot easier!
Make sure he knows that the number of substance abusers in the set isn't necessarily 5/10 (since you're picking each randomly from the set of 20). Otherwise you'd need to use a different null hypothesis, because his guesses will certainly not be independent (he'll make sure to choose 5/10).
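To see how the null hypothesis would change, here is a sketch of the case where the set is known to be exactly 5 Y / 5 N and he labels exactly 5 subjects Y, otherwise at random. That is an assumption about how a guesser would use the information, not part of Kuko's protocol. The score is then no longer binomial: it depends on how many of his Y labels land on real Ys.

```python
from math import comb

def score_distribution(n=10, n_yes=5):
    """Chance of each score when truth has n_yes Ys and exactly n_yes subjects are labelled Y at random."""
    total = comb(n, n_yes)  # number of equally likely ways to place the Y labels
    dist = {}
    for j in range(max(0, 2 * n_yes - n), n_yes + 1):
        # j = number of Y labels that land on real Ys
        ways = comb(n_yes, j) * comb(n - n_yes, n_yes - j)
        score = n - 2 * n_yes + 2 * j  # correct Ys plus correct Ns
        dist[score] = ways / total
    return dist

dist = score_distribution()
for score, prob in sorted(dist.items()):
    print(f"{score}/10 correct: {prob:.4f}")

p_8_or_better = sum(prob for score, prob in dist.items() if score >= 8)
print(f"8/10 or better: {p_8_or_better:.4f}")
```

In that setup 8/10 or better comes up about 10% of the time by chance (26 of 252 labellings), so the thresholds from the binomial calculation would not carry over.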
As for the statistics.... let's see. We can probably assume as a null hypothesis that the astrologer is 50% likely to identify any given subject as a substance abuser (after all, he knows 50% of his subjects will be on average). If so he has a 50% chance of being correct for each of them. The probability of getting n out of 10 correct can be calculated here (with my assumptions the first line is .5 and the second 10).
I don't think the bolded part is true, given how Kuko described the protocol.
Although 50% of the 20 subjects will be substance abusers, half of the entire pool will be selected for testing by a random method (coin flip). So half of the tested subjects could be substance abusers, all of them could be (although that's unlikely), or any percentage in between. Not knowing how many are substance abusers gives the astrologer slightly less of an edge in this test.
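The spread being described can be put into numbers, assuming the coin-flip selection amounts to drawing 10 of the 20 volunteers without replacement, with every subset equally likely (the thread doesn't spell out the exact mechanics, so that is an assumption). The function name is made up.

```python
from math import comb

def prob_abusers_in_sample(k, pool_yes=10, pool_no=10, sample=10):
    """P(exactly k substance abusers among the tested subjects), drawing without replacement."""
    return comb(pool_yes, k) * comb(pool_no, sample - k) / comb(pool_yes + pool_no, sample)

for k in range(11):
    print(f"{k} abusers among the 10 tested: {prob_abusers_in_sample(k):.6f}")
```

Under that assumption an exact 5/5 split turns up only about a third of the time, and an all-abuser set is possible but has a chance of 1 in 184,756.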
I do think that each test in isolation gives the astrologer a 50% chance of being correct (abuser/not abuser), but wouldn't that lead to roughly straight odds over 10 tests (i.e. similar to flipping a coin heads 8 out of 10 trials)?
Your fiancee brings up the same problem I am trying to work out.
My fiancee, who is alas a complete lover of all things woo, has pointed out a flaw in the protocol - people are often really lousy judges of their own personality. They'll agree with anything positive you say about them!
She suggests, and I agree, that a better protocol may be for the subjects to nominate a number of people who know them well, and *those* people judge how well the essays match.
I understand. (I was more worried about being off by an order of magnitude.)
I had missed your post before. Our numbers disagree slightly - it looks like you were calculating the odds that he gets, say, exactly 9/10. That's actually not the right number for this - you should calculate the odds he gets 9/10 OR 10/10. That gives the confidence with which you can reject the null hypothesis if he does in fact get 9/10. Do you see why? If not, I'll explain (it might help to think about a continuous distribution, or a case with 1000 trials instead of 10).
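The difference between "exactly" and "at least" can be checked numerically. This is a sketch under the same coin-flip null (each guess right with probability .5), and the function names are invented for it. The 1000-trial case shows why "exactly" is the wrong quantity: even the single most likely outcome, exactly 500 right, has only about a 2.5% chance, yet nobody would call that result significant.

```python
from math import comb

def prob_exactly(k, n):
    """P(exactly k correct out of n) with a 50% chance per guess."""
    return comb(n, k) / 2**n

def prob_at_least(k, n):
    """P(k or more correct out of n) with a 50% chance per guess."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

print(prob_exactly(9, 10))       # chance of exactly 9/10
print(prob_at_least(9, 10))      # chance of 9/10 OR 10/10
print(prob_exactly(500, 1000))   # most likely single outcome in 1000 trials
print(prob_at_least(500, 1000))  # 500 or better in 1000 trials
```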
Thanks IXP, but I don't think that's necessary here unless you think there's a danger of something other than deliberate info leakage or fraud that is somehow connected to me, and only me. I'm open to the possibility, but just can't think of anything that would compromise the test. We just need to make sure everything goes according to keikaku.
That is an interesting point though, but who is to say that the person I choose or trust to be chosen is not "in it" as well? Maybe you are my partner in crime IXP! Also, how can "the astrologer" be sure that no one is messing with the results afterwards? (EDIT: I guess if the volunteers would send their answers straight to "the astrologer", that's how.)
If anyone here would like to volunteer for the job I'm all for it though, feel free to PM, I'd appreciate it. I just need to cross check your woo-record first!