
Astrology test protocol in progress..

Good one Hokulele!

Now, if I may ask, let's get back to the original test protocol. If you want to discuss with Astro Teacher in more detail, please open a separate thread.
 
Discussion on protocol suggestion #1

Originally Posted by Kuko 4000 said:
Do you mean that when the volunteers have chosen the profile that they got most hits in, I, or whoever works as the middle man, would then see how the points compare with the overall point distribution, and if the target persons have significantly higher rate of "hits" than the average the test is a success? Damn the language barrier, it makes certain kind of thinking very difficult, especially when the field is pretty much unknown to me, apologies for that.


Yes, and then compare the results to the null hypothesis - which in this case is the hypothesis that the astrologer's profiles fit their target no better than a randomly selected one.

Here's a possible procedure:

1) find the average (over everyone) number of hits participants give to profiles that are NOT their own

2) find the average (over those that had one) number of hits that the participants gave to their own profile

3) based on 1) and 2), determine with what confidence you can reject the null hypothesis. Basically the question you're asking is this: given a set of randomly distributed numbers with mean and variance as in 1), what is the probability that the three(?) additional such numbers in 2) differ from 1) by as much as x (where x is the amount 2) differs from 1) in the real data)? If that probability is less than .05 or so, you can call the result significant.

Computing that probability is simple given some assumptions about the distribution in 1). You could also write a little computer code to simulate the experiment (assuming the null hypothesis) to check that.

Alternatively you might use some statistic other than the average in 2). I don't see what would be better, but perhaps there is something.

You should also decide in advance whether you would accept anomalously low scores in 2) as significant (i.e. if the astrologer's profiles fit their subjects much worse than a random profile does, do you consider that evidence for anything). I'd say not, which affects the calculation of significance (you use a one-tailed distribution instead of two-tailed).
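For anyone who wants to try the "little computer code" route, here is a minimal sketch. It assumes a permutation-style shortcut (shuffling the recorded scores) instead of fitting a distribution to 1), and it is one-tailed, so only anomalously high scores count. The function name and the hit counts are made up for illustration; they are not data from this test.

```python
import random


def simulated_p_value(other_scores, own_scores, n_sims=20_000, seed=0):
    """One-tailed p-value by simulation under the null hypothesis.

    Null hypothesis: a participant's score for their own profile is just
    another draw from the same pool as the scores for other profiles.
    """
    rng = random.Random(seed)
    pooled = list(other_scores) + list(own_scores)
    k = len(own_scores)
    observed_total = sum(own_scores)
    at_least_as_high = 0
    for _ in range(n_sims):
        # Pretend k randomly chosen scores were the "own profile" ones.
        draw = rng.sample(pooled, k)
        # Same k every time, so comparing totals is the same as comparing means.
        if sum(draw) >= observed_total:
            at_least_as_high += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_high + 1) / (n_sims + 1)


# Hypothetical hit counts, for illustration only.
others = [3, 4, 5, 4, 2, 6, 4, 3, 5, 4, 4, 5, 3, 4,
          6, 2, 4, 5, 3, 4, 5, 4, 3, 4, 5, 4, 4]
own = [7, 6, 8]
print(simulated_p_value(others, own))
```

If the printed value is below .05 (or whatever threshold was agreed in advance), the own-profile scores are higher than chance would easily explain.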


Thanks Sol, I think I'm starting to see this a bit clearer now. However, I would like to make this example a bit more concrete to understand it even better.

Observations:

The pool of participants is 10.

In option 1) the average can only be counted from 7 participants, because the test will be considered as a fail if any of the target participants chooses the wrong date.

In option 2) the average can only be counted from 3 participants, because there are only 3 real dates.

Ok, question:

How about if the average number of hits in option 1) is 4.

In this case, what would the average number of hits have to be in option 2) to reach a statistically significant result, and, what would it have to be if I wanted the odds to be around 1: 500.

Could you also open up the operation a bit, I'd like to see if I could grasp the maths myself. It doesn't seem too complicated at this point. I realize that the numbers are connected.

Note:

I have not yet decided about the number of claims there will be per single profile. 10, 15 or 20, I guess.
 
Discussion on protocol suggestion #2

ETA: If he knows there's 5 of each, he could guess all yes or all no and be sure to get 5 of 10 right.


(My bold.)

I don't think that's accurate, let's recap:

I have a pool of 10 participants. 5 of them have a history in substance abuse, 5 of them have not = Y/N (50 / 50) situation for the astrologer in each guess. He needs to connect the Y/N answer to the birth details. He knows that there is 5 of each, as far as my brain tells me, he could get them all wrong.

Anyways, I like this approach.

Suggestion:

I have a pool of 20 volunteers over 30 years of age.

10 of them have a history of substance abuse and 10 of them have not.

I will randomly choose (by flipping a coin) 10 participants out of these 20 volunteers.

The astrologer will try to connect the birth details of the 10 participants with the substance abuse.
 
Discussion on protocol suggestion #2




(My bold.)

I don't think that's accurate, let's recap:

I have a pool of 10 participants. 5 of them have a history in substance abuse, 5 of them have not = Y/N (50 / 50) situation for the astrologer in each guess. He needs to connect the Y/N answer to the birth details. He knows that there is 5 of each, as far as my brain tells me, he could get them all wrong.
But if he guessed all yes or all no he would be sure to get 5 right.

ETA: If I shuffled a regular deck of cards then told you that I would guess (or rather use my psychic power) whether each card in turn was red or black, and I guaranteed that I would be right at least 26 times, would you put money on it? (I could just guess "red" for every card. If I kept count, I might get lucky on the last few cards and be able to get better than 50% since I might know that they have to be black, but I would be 100% certain to be right for at least half of them.)

Anyways, I like this approach.

Suggestion:

I have a pool of 20 volunteers over 30 years of age.

10 of them have a history of substance abuse and 10 of them have not.

I will randomly choose (by flipping a coin) 10 participants out of these 20 volunteers.

The astrologer will try to connect the birth details of the 10 participants with the substance abuse.
I do like this method better. If nothing else, it makes the math easier to do. (We effectively have 10 participants that could be all Y all N or--most likely--some mixture of Ys and Ns.)

So this means he has a 1:2 chance of being right by chance alone for each one.

If I'm doing this right (and that's a big IF!), that means for him to get 10 in 10 correct, P=0.00098, or just under 0.1% (about 1:1,000). For him to get exactly 9 of 10 right comes out to P=0.0098, or just under 1% (about 1:100). For him to get exactly 8 of 10 right comes out to P=0.044, or 4.4% (about 1:23).

If that's right, he would need to get all 10 right to convince me that he can do what he claims. I would abide by 9 being a significant enough result to test again.
 
But if he guessed all yes or all no he would be sure to get 5 right.


You are of course correct. For some reason I didn't read and / or think about your post carefully enough the first time, and just responded automatically. I don't know what happened (EDIT: actually I do now, but can't be bothered to explain in detail, it's hard enough to write the necessary stuff!). If he knew that there were 5 of each, the rules would of course be that he would have to connect 5 birth details to "no history" and 5 birth details to "history", instead of placing all bets on either.
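A rough sketch of what the forced 5/5 rule would do to the odds, assuming the astrologer's split is no better than a random assignment of five "history" labels. The function name is hypothetical. Under that assumption the number of correct answers is always even and follows a hypergeometric distribution, not a coin-flip one.

```python
from math import comb


def p_correct_at_least(min_correct, n_yes=5, n_no=5):
    """P(at least `min_correct` right) when the guesses are a random
    split into exactly n_yes "history" and n_no "no history" labels."""
    total = comb(n_yes + n_no, n_yes)  # ways to hand out the "history" labels
    prob = 0.0
    for k in range(n_yes + 1):
        # k = subjects with a history who were labelled "history".
        # The other n_yes - k "history" labels landed on "no history" subjects.
        correct = k + (n_no - (n_yes - k))
        if correct >= min_correct:
            prob += comb(n_yes, k) * comb(n_no, n_yes - k) / total
    return prob


for score in (6, 8, 10):
    print(score, p_correct_at_least(score))
```

With these assumptions a perfect 10/10 has probability 1/252 by chance, and 8/10 or better has probability 26/252 (roughly 10%), so this variant would need a stricter pass mark than the coin-flip version.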
 
At the moment I'm concentrating my efforts on this protocol:


Suggestion:

I have a pool of 20 volunteers over 30 years of age.

10 of them have a history of substance abuse and 10 of them have not.

I will randomly choose (by flipping a coin) 10 participants out of these 20 volunteers.

The astrologer will try to connect the birth details of the 10 participants with the substance abuse.


It is much simpler to work with and the astrologer is happy as well :)

I will update the thread as soon as new info emerges.
 
Observations:

The pool of participants is 10.

In option 1) the average can only be counted from 7 participants, because the test will be considered as a fail if any of the target participants chooses the wrong date.

The wrong "date"? I'm confused. Did you mean to say "data" or "profile"? If so, are you saying that each participant will pick one profile only and count hits for that one? I had thought each participant would rate the number of hits in all 3 profiles.

And why would it be an automatic fail under any such single circumstance? The whole point of my suggestion is to get rid of arbitrary criteria like that, which have no place in a scientific protocol. You should simply evaluate the probability with which you can reject the null hypothesis given the data you obtain, not impose arbitrary cutoffs.

Ok, question:

How about if the average number of hits in option 1) is 4.

In this case, what would the average number of hits have to be in option 2) to reach a statistically significant result, and, what would it have to be if I wanted the odds to be around 1: 500.

I don't know, because evidently I've misunderstood something about the protocol since I don't understand your example. Also you haven't provided enough information - even if you assume the simplest possibility for the distribution, you still need at least one more number besides the mean to answer that (the standard deviation if it's a Gaussian). So let me try to answer a bit more generally.

The math problem should boil down to this: you have a set of numbers - the hit numbers recorded by participants for profiles that are not their own - that we can assume are random. Those numbers have a mean and a standard deviation, and it's probably OK to assume they are drawn from a Gaussian distribution with that same mean and variance (although if the number of possible hits is low, or the mean near zero or the max, it might be better to assume a binomial distribution or something similar). Picking which distribution to assume might be tricky... but if you have a reasonable amount of data (like 27 responses) you can make sure they fit reasonably well.

Then you have 3 numbers (hits of subjects on their own profile) that we want to look at and decide what's the probability that they would take the values they take if they were drawn from that same distribution. Actually more precisely, what's the probability that they would take those values or any larger value (because a larger value is even more significant). That's a mathematically well defined question given the distribution, and you can find out how to calculate it in any book on stats, or ask for help again here. It's not hard. If the result is sufficiently small (5% and 1% are typical numbers in psychology) you can reject the null hypothesis (namely, that those three numbers were drawn randomly from the same distribution as the others), and the astrologer has succeeded.
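To make the Gaussian version concrete, here is a minimal sketch. It assumes the not-own-profile scores are roughly Gaussian and that their mean and standard deviation can be treated as known, which is only an approximation with a couple of dozen responses. The mean of 4 is the figure from the question above; the standard deviation of 2 and the function names are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_p(own_scores, null_mean, null_sd):
    """P(the mean of n null draws is at least the observed own-profile mean)."""
    n = len(own_scores)
    own_mean = sum(own_scores) / n
    # Under the null, the mean of n draws has standard deviation sd / sqrt(n).
    null_of_mean = NormalDist(null_mean, null_sd / sqrt(n))
    return 1.0 - null_of_mean.cdf(own_mean)


def mean_needed(alpha, n, null_mean, null_sd):
    """Smallest own-profile mean whose one-tailed p-value is alpha."""
    return NormalDist(null_mean, null_sd / sqrt(n)).inv_cdf(1.0 - alpha)


# Mean 4 from the question; sd = 2 is an assumed value, not a measurement.
print(mean_needed(0.05, 3, 4, 2))     # threshold for the 5% level
print(mean_needed(1 / 500, 3, 4, 2))  # threshold for odds of about 1:500
print(one_tailed_p([7, 6, 8], 4, 2))  # p-value for three made-up scores
```

With those assumed numbers the three own-profile scores would need to average roughly 5.9 for the 5% level and roughly 7.3 for 1:500. A different standard deviation moves both thresholds, which is why the mean alone is not enough to answer the question.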
 
At the moment I'm concentrating my efforts on this protocol:





It is much simpler to work with and the astrologer is happy as well :)

I will update the thread as soon as new info emerges.

OK, that is much simpler. Of course you're testing a different claim (that the astrologer can identify substance abusers rather than write generally accurate profiles), but it will still be interesting.

As for the statistics.... let's see. We can probably assume as a null hypothesis that the astrologer is 50% likely to identify any given subject as a substance abuser (after all, he knows 50% of his subjects will be on average). If so he has a 50% chance of being correct for each of them. The probability of getting n out of 10 correct can be calculated here (with my assumptions the first line is .5 and the second 10).

As you can see, 8/10 is just barely not significant at the 5% level (there's a 5.5% chance of getting 8/10 or better given my null hypothesis). 9/10 is 1.1%, and 10/10 is .1% (highly significant).

So he needs 9/10 or 10/10 for a significant result with this protocol.
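For anyone who wants to check these figures without an online calculator, a short sketch under the same null hypothesis (10 independent guesses, each 50% likely to be right). The function names are just for illustration.

```python
from math import comb


def binom_pmf(k, n=10, p=0.5):
    """Probability of exactly k correct out of n independent guesses."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(k, n=10, p=0.5):
    """Probability of k or more correct out of n (one-tailed)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


for k in (8, 9, 10):
    print(k, "exactly:", binom_pmf(k), "or better:", binom_tail(k))
```

The "or better" column is the one to compare against the significance level: 56/1024 for 8, 11/1024 for 9 and 1/1024 for 10.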

Make sure he knows that the number of substance abusers in the set isn't necessarily 5/10 (since you're picking each randomly from the set of 20). Otherwise you'd need to use a different null hypothesis, because his guesses will certainly not be independent (he'll make sure to choose 5/10).
 
OK, that is much simpler. Of course you're testing a different claim (that the astrologer can identify substance abusers rather than write generally accurate profiles), but it will still be interesting.
And of course it doesn't rely on the business of subjects trying to retrofit hits to some profile (and the Forer Effect experiments show that we can expect something like 80% success rate even when we know astrology wasn't used).

Also it removes 100% the chance of information leakage coming through profiles. (That's really a subjective and error prone business, as we discovered on that other thread.)

As you can see, 8/10 is just barely not significant at the 5% level (there's a 5.5% chance of getting 8/10 or better given my null hypothesis). 9/10 is 1.1%, and 10/10 is .1% (highly significant).

So he needs 9/10 or 10/10 for a significant result with this protocol.
That agrees pretty much with what I calculated. Good.


Make sure he knows that the number of substance abusers in the set isn't necessarily 5/10 (since you're picking each randomly from the set of 20). Otherwise you'd need to use a different null hypothesis, because his guesses will certainly not be independent (he'll make sure to choose 5/10).
Yep. As I said earlier, if he knows it's 5 of 10 it would be trivial for him to get 5 correct 100% for sure simply by guessing all Ys or all Ns. Kuko came up with the idea of starting with a pool of 20 (10 Ys and 10 Ns) and randomly selecting 10 from that pool. Plus it makes the math a lot easier!
 
As for the statistics.... let's see. We can probably assume as a null hypothesis that the astrologer is 50% likely to identify any given subject as a substance abuser (after all, he knows 50% of his subjects will be on average). If so he has a 50% chance of being correct for each of them. The probability of getting n out of 10 correct can be calculated here (with my assumptions the first line is .5 and the second 10).


I don't think the bolded part is true, given how Kuko described the protocol. Although 50% of the 20 subjects will be substance abusers, half of the entire pool will be selected to be tested by a random method (coin flip). So half of the tested subjects could be substance abusers, all of them could be (although unlikely), or any percentage in between. Not knowing how many are substance abusers gives the astrologer slightly less of an edge in this test.

I do think that each test in isolation gives the astrologer a 50% chance of being correct (abuser/not abuser), but wouldn't that lead to roughly straight odds over 10 tests (i.e. similar to flipping a coin heads 8 out of 10 trials)?
 
I don't think the bolded part is true, given how Kuko described the protocol.

It's true, I promise.

Although 50% of the 20 subjects will be substance abusers, half of the entire pool will be selected to be tested by a random method (coin flip). So half of the tested subjects could be substance abusers, all of them could be (although unlikely), or any percentage in between. Not knowing how many are substance abusers gives the astrologer slightly less of an edge in this test.

All true. And as I said, on average 50% of them will be abusers, because the original pool was 50% abusers and they are selected randomly. So the astrologer will know that the expected (i.e. average) number of abusers is 5/10, but not the actual number.

Actually there is one thing here which makes the odds slightly more complicated than I was thinking before. Because the pool is finite (20), the distribution on number of abusers is not exactly binomial, it's a bit "squeezed" towards the mean (which is 5/10). So the astrologer has slightly more information than he would if the pool were infinite. In principle, if the astrologer came up with a very unbalanced result (like 9/10 are abusers) he could use that knowledge to tweak his guesses towards a more balanced result.
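To put a number on the "squeezing", here is a small sketch comparing the two distributions for the number of abusers among the 10 tested. It assumes the 10 are drawn without replacement from the pool of 20 (10 abusers, 10 not), which gives a hypergeometric distribution; the binomial is what an infinite 50/50 pool would give. The function names are illustrative only.

```python
from math import comb


def hypergeom_pmf(k, pool=20, abusers=10, sample=10):
    """P(exactly k abusers in the sample), drawing without replacement."""
    return comb(abusers, k) * comb(pool - abusers, sample - k) / comb(pool, sample)


def binom_pmf(k, n=10, p=0.5):
    """P(exactly k abusers) if each subject were an independent 50/50 draw."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def variance(pmf, n=10):
    """Variance of a distribution on 0..n given its probability function."""
    mean = sum(k * pmf(k) for k in range(n + 1))
    return sum((k - mean) ** 2 * pmf(k) for k in range(n + 1))


for k in range(11):
    print(k, round(hypergeom_pmf(k), 4), round(binom_pmf(k), 4))
print("variance, finite pool:", variance(hypergeom_pmf))
print("variance, infinite pool:", variance(binom_pmf))
```

Both have mean 5, but the finite-pool variance is 25/19 (about 1.3) against 2.5 for the binomial, so lopsided samples such as 9 or 10 abusers are much rarer than independent coin flips would suggest.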

I'm not sure I'm explaining that well - basically what I'm saying is that I could do slightly better than just guessing 50/50 abuser versus not if I were the astrologer. But only slightly, and I don't think that will affect the odds very much, and moreover it's not clear this means the null hypothesis should change, since I kind of doubt the astrologer is that sophisticated.

I do think that each test in isolation gives the astrologer a 50% chance of being correct (abuser/not abuser), but wouldn't that lead to roughly straight odds over 10 tests (i.e. similar to flipping a coin heads 8 out of 10 trials)?

I'm not sure what you mean by "straight odds". If you mean 9/10 is as likely as 5/10, then no, absolutely not. There are many more ways to get 5/10 than there are to get 9/10, and each of those ways is equally likely in a random process. If you meant it's the same odds as flipping a coin, then yes, you're right - that's what I said, and that's what the calculator I linked to calculates.
 
That agrees pretty much with what I calculated. Good.

I had missed your post before. Our numbers disagree slightly - it looks like you were calculating the odds that he gets, say, exactly 9/10. That's actually not the right number for this - you should calculate the odds he gets 9/10 OR 10/10. That gives the confidence with which you can reject the null hypothesis if he does in fact get 9/10. Do you see why? If not, I'll explain (it might help to think about a continuous distribution, or a case with 1000 trials instead of 10).

There's also the caveat (about the distribution on the actual number of abusers not being a pure binomial) in my post just above. Do you agree with me that it's not important enough to worry about?
 
My fiancee, who is alas a complete lover of all things woo, has pointed out a flaw in the protocol - people are often really lousy judges of their own personality. They'll agree with anything positive you say about them!

She suggests, and I agree, that a better protocol may be for the subjects to nominate a number of people who know them well, and *those* people judge how well the essays match.
Your fiancee brings up the same problem I am trying to work out.

A good suggestion, but a secondary problem arises: many individuals associate with like-minded individuals. These associates may also feel obligated to uplift their friend, choosing the most polite result and thus perpetuating the original problem.
 
I had missed your post before. Our numbers disagree slightly - it looks like you were calculating the odds that he gets, say, exactly 9/10. That's actually not the right number for this - you should calculate the odds he gets 9/10 OR 10/10. That gives the confidence with which you can reject the null hypothesis if he does in fact get 9/10. Do you see why? If not, I'll explain (it might help to think about a continuous distribution, or a case with 1000 trials instead of 10).
I understand. (I was more worried about being off by an order of magnitude.)
 
Great news! The astrologer is back in business, I thought he was on a retreat in US, but he never got there in the first place. Well, anyways, we're back, and now he has more time on his hands. He agreed to do this:


I will collect 10 birth details.

He will write 10 profiles according to the details.

The 10 volunteers will read all of the profiles and pick the one that they feel is closest to their persona.

His aim is to get 7 correct out of 10.

The volunteers cannot all be sceptics, because of the negative energies, but there can be a few sceptics in the mix.

Minimum age of the volunteers is 25.

Birth certificate is necessary.



I think we are going to lower the "pass" mark considerably. 7 seems incredible, but that's what he said to me. What would you guys consider to be a meaningful result here? We are currently discussing the best way to choose volunteers. I will keep you all posted.
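As a starting point for the pass mark, here is a rough sketch of the chance odds. It assumes each of the 10 volunteers independently picks one of the 10 profiles at random under the null hypothesis, so each has a 1 in 10 chance of picking their own and several may pick the same profile. If volunteers had to pick distinct profiles the numbers would differ somewhat. The function names are made up.

```python
from math import comb


def p_at_least(k, n=10, p=0.1):
    """P(k or more volunteers pick their own profile by chance)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def smallest_significant(alpha=0.05, n=10, p=0.1):
    """Smallest number of correct picks whose chance probability is below alpha."""
    for k in range(n + 1):
        if p_at_least(k, n, p) < alpha:
            return k
    return None


for k in range(1, 8):
    print(k, p_at_least(k))
print("pass mark at 5%:", smallest_significant(0.05))
print("pass mark at 1%:", smallest_significant(0.01))
```

Under these assumptions 3 correct is not quite significant (about 7%), 4 correct is (about 1.3%), and 7 correct would happen by chance less than once in 100,000 tries, so the pass mark could be lowered a long way and still mean something.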
 
Thanks IXP, but I don't think that's necessary here unless you think there's a danger for something other than deliberate info leakage or fraud that is somehow connected to me, and only me. I'm open to the possibility, but just can't think of anything that would compromise the test. We just need to make sure everything goes according to keikaku.

That is an interesting point though, but who is to say that the person I choose or trust to be chosen is not "in it" as well. Maybe you are my partner in crime IXP! Also, how can "the astrologer" be sure that no one is messing up the results afterwards? (EDIT: I guess if the volunteers would send their answers straight to "the astrologer", that's how.)

If anyone here would like to volunteer for the job I'm all for it though, feel free to PM, I'd appreciate it. I just need to cross check your woo-record first!



Well, being the first owner of a Randi fish tattoo, I think that I would probably be a good choice. If there is anything I can do to help, toss me a PM.
 
