> I don't want to give T.C. ammunition but;

Thank you.
I am happy now; I know that it is not my reasoning that is wrong.
Your error is that you are continuing to look at each individual test, rather than at the cumulative nature of combining them in the meta-analysis. You also continue to misunderstand the reasoning behind alpha, but that is OK. The two z tests you compare are precisely the difference between laboratory parapsychology work (where T.C.'s z reasoning determines alpha) and the challenge (where Merc's z reasoning determines alpha and the cutoff). My hypothetical coin-flip example was merely a demonstration that combined the multiple challenge tests into one test, for ease of understanding. Your analysis here confirms that it is a sound argument.
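To put the distinction in concrete terms, here is a minimal sketch -- entirely my own construction, with made-up parameters, not anyone's actual protocol -- of the two framings: each test judged alone against a per-test alpha, versus all the tests pooled into one combined statistic (Stouffer's method):

```python
import math
import random

random.seed(1)

N_TESTS = 1000        # hypothetical number of independent challenge-style tests
FLIPS_PER_TEST = 100  # hypothetical coin flips per test
Z_CUTOFF = 3.09       # one-sided z cutoff, roughly alpha = 0.001

def one_test_z():
    """z-score of the heads count in FLIPS_PER_TEST fair-coin flips."""
    heads = sum(random.random() < 0.5 for _ in range(FLIPS_PER_TEST))
    return (heads - FLIPS_PER_TEST * 0.5) / math.sqrt(FLIPS_PER_TEST * 0.25)

zs = [one_test_z() for _ in range(N_TESTS)]

# Challenge framing: any single test past its own cutoff counts as a "win".
wins = sum(z > Z_CUTOFF for z in zs)

# Meta-analytic framing: pool every test into one statistic (Stouffer's method).
combined_z = sum(zs) / math.sqrt(N_TESTS)

print(f"individual passes at alpha ~ 0.001: {wins} of {N_TESTS}")
print(f"combined z across all tests: {combined_z:.2f}")
# With no effect present, lone passes still turn up (about 1 in 1000 tests),
# while the combined z stays near zero: the per-test view and the cumulative
# view are answering different questions.
```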
My argument is with the test design, certainly; it is a design that is appropriate for the challenge, but inappropriate for a meta-analysis.
I do thank you for finally putting your reasoning on the table. My mind is at ease now.
> I don't want to give T.C. ammunition but;

Don't worry about that--first off, he was comparing to .5, so this is a different animal altogether. Secondly, if it turns out he's right, that is what is important, not winning some argument.
> Thoughts anyone?

At first glance, that looks really neat--I would certainly defer to someone who knows more about math than I do (drkitten?), but I do think that addresses the systematic bias that I was talking about.
Wouldn't the example you gave about the strings of heads be analysable as a geometric distribution?
We've lost a massive amount of information in the throwing away of data, so our certainty about the heads-to-tails ratio will be lower, but I think we can create an unbiased estimator of E(heads) as (occurrences of a string of heads of length L+1)/(occurrences of a string of length L).
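If I'm reading that right, a quick simulation bears it out. This is only a sketch -- the flip counts and variable names are mine -- of the count-ratio estimator recovering P(heads) from run lengths alone:

```python
import random
from collections import Counter

random.seed(2)

P_HEADS = 0.5        # true heads probability (what we try to recover)
N_FLIPS = 1_000_000  # illustrative sample size

# Keep only the lengths of maximal runs of heads, discarding everything else.
run_lengths = Counter()
current = 0
for _ in range(N_FLIPS):
    if random.random() < P_HEADS:
        current += 1
    elif current:
        run_lengths[current] += 1
        current = 0
if current:
    run_lengths[current] += 1

# count(runs of length L+1) / count(runs of length L) should estimate P(heads).
for L in range(1, 6):
    if run_lengths[L]:
        print(f"L={L}: ratio = {run_lengths[L + 1] / run_lengths[L]:.4f} "
              f"(true value {P_HEADS})")
```

The ratio lands close to the true P(heads) at every L, though the counts for longer runs thin out fast, which is where the lost certainty shows up.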
> Agreed. It would be one fun Monte Carlo simulation, though, no?

Yes, we could do this analysis. But we would need essentially to build our statistics, our estimator, and our tables of significance from scratch, and to perform meta-analysis on this kind of data under field conditions would be a nightmare.
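For a taste of what building the tables of significance from scratch might involve, here is a hedged Monte Carlo sketch (sample sizes and names are all mine) that reads a critical value for the run-ratio estimator off an empirical null distribution:

```python
import random

random.seed(3)

def run_ratio_estimate(n_flips, p=0.5, L=1):
    """Ratio of counts of maximal head-runs of length L+1 vs length L."""
    counts = {L: 0, L + 1: 0}
    current = 0
    for _ in range(n_flips):
        if random.random() < p:
            current += 1
        else:
            if current in counts:
                counts[current] += 1
            current = 0
    if current in counts:
        counts[current] += 1
    return counts[L + 1] / counts[L] if counts[L] else float("nan")

# Empirical null distribution of the estimator under a fair coin.
null = sorted(run_ratio_estimate(5_000) for _ in range(2_000))

# One-sided empirical critical value at alpha = 0.01.
alpha = 0.01
cutoff = null[int((1 - alpha) * len(null))]
print(f"reject 'fair coin' (one-sided) if the run ratio exceeds ~{cutoff:.3f}")
```

Note that an alpha as small as 0.001 would need far more simulated replicates than the 2,000 here -- part of why doing this under field conditions would be a nightmare.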
> Correct me if I am thinking fuzzy on this...it seems to me that the type of test and claimed level of accuracy are much less important in this analysis than on the sort that TC was asking for. Or maybe I am just looking at the bias I saw, and am ignoring some other source of bias.

In particular, building our estimator hinges crucially on the idea that there are lots of people, all running near-identical coin-flip experiments. Under field conditions, this is exactly what we don't get -- instead, we get one person who can "influence" coin flips, another person who can "predict" the fall of a pair of dice, a third who can clairvoy (is that a word?) the cards drawn from a conventional deck, all to different claimed thresholds of accuracy. And that's not counting the nutcases who believe that they can summon UFOs.
> Agreed wholeheartedly. Although it might (might, I say, I am speaking out of ignorance here) be a fun project for someone pursuing a math degree!

Again, the people who are in the greatest need of this kind of analysis are not the JREF, but the field researchers at the parapsychology department of Redbrick Uni; the JREF has neither the facilities, the interest, the mission, nor the capacity for doing this kind of meta-analysis.
> Correct me if I am thinking fuzzy on this...it seems to me that the type of test and claimed level of accuracy are much less important in this analysis than on the sort that TC was asking for. Or maybe I am just looking at the bias I saw, and am ignoring some other source of bias.

It's still not really applicable to the JREF stats, just because of the pick 'n' mix nature of their tests, something which is necessary for them to test every applicant.
> Agreed wholeheartedly. Although it might (might, I say, I am speaking out of ignorance here) be a fun project for someone pursuing a math degree!

*Sigh*.
> Agreed. It would be one fun Monte Carlo simulation, though, no?
> Correct me if I am thinking fuzzy on this...it seems to me that the type of test and claimed level of accuracy are much less important in this analysis than on the sort that TC was asking for.
The claimed level of performance is less relevant, but I think that would need to be assessed on a case-by-case basis.
Why would each person take the challenge? Consider the cost, the chance of winning, and the reward. The cost is a few hours of your time. The chance of winning is 1 in 1000 (if we have alpha = 0.001). The reward is being able to say you beat James Randi in his challenge. The woos worldwide would eat this up, and the winner of the preliminary challenge would make a lot of money from them as a result. S/he'd be a cult hero to them. You'd make, at a minimum, tens of thousands of dollars, if not millions, all for a few hours of your time at a 1-in-1000 shot.
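A back-of-envelope expected-value check on that incentive, using the post's own figures where it gives them (the payoff range and the hourly rate are my guesses):

```python
# All figures hypothetical: 1-in-1000 chance at alpha = 0.001, a payoff
# somewhere between "tens of thousands" and "millions", a few hours' cost.
p_win = 1 / 1000
cost = 3 * 20                     # three hours at a notional $20/hour
for payoff in (10_000, 1_000_000):
    ev = p_win * payoff - cost
    print(f"payoff ${payoff:>9,}: expected value ${ev:>8,.2f} per attempt")
```

The expected value flips sign depending on the payoff you assume, but the lottery-like upside is the draw either way.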
I don't know what the alpha level is on the JREF preliminary tests (the ones that are statistical in nature, that is), but if it's 0.001 then it's too large, IMO. To prevent fraud you have to set it such that a large number of people can't take the test in a short time, have one win by chance, and then go "A-ha!"
[snip]
If all the tests were the same and the alpha level were 0.001, then the chance that someone wouldn't win by chance on a single test would be 0.999, which means it would take only 693 people taking the test for there to be a greater than 50% chance that someone passes by luck alone.
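The 693 figure checks out; here is a two-line verification, solving 1 - (1 - alpha)^n > 0.5 for n:

```python
import math

alpha = 0.001
n = math.ceil(math.log(0.5) / math.log(1 - alpha))
print(n)                     # 693
print(1 - (1 - alpha) ** n)  # ~0.5001, just over even odds
```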
My understanding is that the .001 level is for preliminary testing only, with results expected to meet the .000001 level for the actual MDC.
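If the two stages are independent -- my assumption; the actual protocols may differ -- then the chance of fluking both is just the product of the two alphas:

```python
# Probability of passing both stages by luck, assuming independence.
prelim, final = 1e-3, 1e-6
print(f"{prelim * final:.0e}")  # 1e-09
```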
And to some extent he's right, because "against stupidity, the Gods themselves contend in vain," and there are demonstrably still people out there who believe all sorts of dumb things. But I also think that he's wrong, because the percentage of those people is slowly getting smaller and smaller. Even homeopaths will go in for surgery if they get appendicitis.
If I were a cheerleader for the paranormal I'd be very keen to see these statistics compiled, so that I could claim that paranormal activities are observed, just not at a level required to win the challenge.
> Especially those who are ignorant of statistics.

Or just any person who is curious about seeing the actual data from interesting tests.
> If there's nothing there, what does one have to be afraid of?

Misinterpretation of statistical artifacts as effects.
> But we're all ignorant of the statistics if we're not able to see any actual statistics.

Not ignorant of the statistics. Ignorant of statistics. Unable to understand a bias inherent in the accumulation of trial data, for one example.
> What % of the applicants have been female?

It is a self-selected sample. Suppose you found a particular percentage female; what possible interpretation could you give it? Is that the percentage in the population? Are there pressures that might lead a greater proportion, or a smaller proportion, of women to apply?

Interesting question. Seems unnecessarily difficult to get a numeric answer.
> Suppose you found a particular percentage female; what possible interpretation could you give it? Is that the percentage in the population? Are there pressures that might lead a greater proportion, or a smaller proportion, of women to apply?
They are meaningless data.
What sort of things do you think you could learn from the gender percentages of these data?