
JREF Challenge Statistics

At the end of the day, you get numbers out of it that would be interesting to see and analyze.
No. As explained above, you get numbers out of it that are necessarily biased, due to the nature of the challenge tests. The only interesting analysis would be a demonstration to a stats or methods class as to why this would be a flawed use of data.
 
In this scenario, in order to do the test one assumes that the person making the claim is truthful about their claimed abilities. Unfortunately, that is the very thing one is trying to ascertain by the test in the first place. Perhaps they really only perform at the 90% level, or at the 70% level, or some other level.
And in practice, the JREF representatives have allowed for some wiggle-room (x-ray girl was allowed mistakes, even though her claim could easily have meant that she would not have made mistakes). The cutoff performance is mutually agreed upon. JREF strongly suggests that claimants test their own abilities first. Again, the challenge is not the time to be ascertaining what their abilities are, it is the time to demonstrate their abilities as claimed.
It doesn't make much sense to say 'OK, since you're saying you perform at the K% level, we'll test you at that level, and if you don't perform at it, you're wrong.' It makes sense to say 'We know with regular coins we'd expect you to perform at the 50% level, and if you don't perform significantly away from this, you're wrong.'
It makes perfect sense if you are interested in testing that question, which is a completely different question. The question we are testing is whether they can perform as they claim to. Period. Because of this, their data are useless for the type of post-hoc combined analysis you were initially suggesting.
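To make the two questions concrete, here is a minimal sketch of how a chance-baseline pass cutoff might be chosen. The numbers (100 trials, a 50% chance baseline, an alpha of 0.001) are illustrative assumptions, not the actual challenge protocol:

    from math import comb

    # Illustrative sketch only: n, p, and alpha are assumed numbers, not the
    # actual JREF protocol. Find the smallest passing score such that a pure
    # guesser (per-trial success probability p) passes with probability <= alpha.
    def pass_cutoff(n, p, alpha):
        tail = 0.0
        for k in range(n, -1, -1):
            tail += comb(n, k) * p**k * (1 - p)**(n - k)  # tail is now P(X >= k)
            if tail > alpha:
                return k + 1  # smallest score whose tail probability is <= alpha
        return 0

    n, p, alpha = 100, 0.5, 0.001  # 100 coin calls, 50% chance, 1-in-1000 false pass
    print(pass_cutoff(n, p, alpha))  # -> 66: a guesser scores 66+ under 0.1% of the time

A test designed this way answers "did you beat the agreed chance baseline?", which is a different question from "what is your true hit rate?".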

I see that three people beat me to this...well, tough. I am posting it anyway.
 
Mercutio wrote:

The question we are testing is whether they can perform as they claim to. Period.

You aren't testing anything. Neither am I. Various organizations are.

That aside, their performance is measured by performance, not by what they believe about how they'll perform. The latter helps to set various cutoffs and to design the test, though.

Because of this, their data are useless for the type of post-hoc combined analysis you were initially suggesting.

From the reasoning I've seen, I am not convinced that meta-analysis does not apply to tests done by skeptical organizations.

And again, this issue of combining aside, what about seeing a list of the tests with their statistical results? Why is something so basic, so obviously interesting, so difficult to see?
 
And again, this issue of combining aside, what about seeing a list of the tests with their statistical results?

What about it? The list of the tests is easy enough to obtain. The statistical results may not be available for the reasons Mercutio has already discussed.
 
T'ai Chi,

How do you expect a psychic to perform?

Or, in fact, any kind of paranormal claimant to perform?
 
You aren't testing anything. Neither am I. Various organizations are.
"We" = anyone interested in this topic. You have suggested using their data in a secondary analysis; from that, I gathered that you are interested in finding the answers. My apologies if that was a mistake.
That aside, their performance is measured by performance, not by what they believe about how they'll perform. The latter helps to set various cutoffs and to design the test, though.
You did not understand my reasoning, nor my example, then. Their performance in these tests is subject to a systematic bias, not because of the test design, but because the tests are not intended to provide data which would be appropriate for meta-analysis.
From the reasoning I've seen, I am not convinced that meta-analysis does not apply to tests done by skeptical organizations.
Then one of us is wrong. Please explain your reasoning; I have already explained why such data are inappropriate for meta-analysis. What is it that I have missed? Or, what is it that you do not understand?
And again, this issue of combining aside, what about seeing a list of the tests with their statistical results? Why is something so basic, so obviously interesting, so difficult to see?
I have already given you my answer to this.
 
Their performance in these tests is subject to a systematic bias,

We disagree.

And, again, the issue of meta-analysis aside, wouldn't the summarized data from all of these individual tests be nice to see? :)
 
We disagree.

Which doesn't mean that your opinions are of equal validity.

What's wrong with Mercutio's coin-flipping experiment as an example of a valid test that even so is too biased to use for meta-analysis?
 
We disagree.
We have established this. I have explained why I believe you are wrong, and invited you to explain why you believe I am. "We disagree" leaves one of us wrong; if it is me, I want to know.
And, again, the issue of meta-analysis aside, wouldn't the summarized data from all of these individual tests be nice to see? :)
No. The summarized data would be misleading. The summarized data from parapsychologists' experiments, not subject to the same bias described earlier, would be very nice to see.
 
T'ai Chi,

How do you expect a psychic to perform?

Or, in fact, any kind of paranormal claimant to perform?
 
No. The summarized data would be misleading.

Not at all.

For example, if a dowser got 5 right out of 20 tries, a table showing 5 out of 20, along with the probability of a correct guess on each try (which is the same from try to try), is useful information.
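As a rough sketch of that, assuming (hypothetically, since the thread gives no actual figure) that each try is a 1-in-10 pick, the chance probability of doing at least that well is easy to compute:

    from math import comb

    # Hypothetical numbers: the thread does not state the dowser's per-try
    # chance, so assume each try is a 1-in-10 pick. How surprising is 5 of 20?
    n, k, p = 20, 5, 0.10
    p_at_least_k = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))
    print(f"P(at least {k} of {n} by chance) = {p_at_least_k:.3f}")  # about 0.043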

How about even more simple, like the number of preliminary challenges per year?

Or the gender distribution of applicants?

Or the % of different types of claims?

And many more.


The summarized data from parapsychologists' experiments, not subject to the same bias described earlier, would be very nice to see.

Sure.

But this is a thread on the statistics from the JREF challenge and similar stats from similar tests done by similar skeptical organizations.
 
T'ai Chi,

How do you expect a psychic to perform?

Or, in fact, any kind of paranormal claimant to perform?
 
Not at all.
Yes, at all. Quite, in fact. And subtly so--enough so that you have not, it seems, grasped it yet.
For example, if a dowser got 5 right out of 20 tries, a table showing 5 out of 20, along with the probability of a correct guess on each try (which is the same from try to try), is useful information.
Please go back and read my coin-flip example. It explains how an accumulation of fair tests can lead to a bias if taken in the collective. Your example here looks (and probably is) perfectly fair for a single test, but does nothing to alleviate the cumulative bias. Either you do not yet understand, or you are being dishonest, or I am missing something that you are unwilling or unable to show me.
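Since the coin-flip example itself is upthread and not quoted here, the following is a minimal sketch, under assumed numbers, of one standard way individually fair tests can mislead when taken in the collective: pooling hit counts across tests that have different chance baselines.

    import random

    # Hedged sketch: two individually fair no-ability claimants, tested under
    # different protocols, naively pooled into one hit count.
    random.seed(42)

    def chance_hits(n_trials, p):
        """A claimant with no ability: each trial succeeds at chance p."""
        return sum(random.random() < p for _ in range(n_trials))

    hits_coin = chance_hits(100, 0.5)  # coin calling: 50% chance baseline
    hits_cups = chance_hits(100, 0.1)  # pick 1 cup of 10: 10% chance baseline
    pooled = hits_coin + hits_cups

    print(f"coin test: {hits_coin}/100 (chance 50%)")
    print(f"cup test:  {hits_cups}/100 (chance 10%)")
    print(f"pooled:    {pooled}/200 = {pooled / 200:.0%}")
    # The pooled rate (about 30%) is meaningless against any single baseline,
    # even though neither individual test was biased.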
How about even more simple, like the number of preliminary challenges per year?

Or the gender distribution of applicants?

Or the % of different types of claims?

And many more.
My former answer still applies; I think this would be far more interesting and useful as a summation of the parapsychologists' data. The challenge data are self-selected, and there really is no statistically sound reason to combine them. They are as available as most experimental data already, and of interest to fewer people, I would think.
Sure.

But this is a thread on the statistics from the JREF challenge and similar stats from similar tests done by similar skeptical organizations.
So, all tests subject to that bias.

Ok, then it is a bad idea.
 
Either you do not yet understand, or you are being dishonest, or I am missing something that you are unwilling or unable to show me.

Or you are incorrect in your reasoning as to why you believe data from a skeptical organization are exempt from scrutiny.

...there really is no statistically sound reason to combine them.

That remains to be seen. In fact, the data remain to be seen. :D

Again, let's ignore the issue of combining for now (something it seems you are having a very hard time doing). We wouldn't know if combining is even applicable until delving into the specifics of the tests. Ignoring the combining, it would still be nice to see the data to answer basic questions like

-what % of those taking the preliminary test are male?
-what % were testing dowsing? card guessing?
-how many preliminary tests per year?
-what is the closest someone has got to passing?
-where geographically do the people being tested come from?
-what % of those being tested, get retested?
-how much does it cost, on average, to get tested?

and many others, this for each skeptical organization that does such tests.

Ok, then it is a bad idea.

You are entitled to your opinion, sure.


http://www.statisticool.com/jrefchallengestats.htm
 
T'ai Chi,

And you are entitled to answer the questions or not:

How do you expect a psychic to perform?

Or, in fact, any kind of paranormal claimant to perform?
 
Or you are incorrect in your reasoning as to why you believe data from a skeptical organization are exempt from scrutiny.
For the third time, then, I invite you to explain the error of my reasoning. I don't see it. I have explained it in sufficient detail that you should be able to point to where I have made my alleged mistake.
That remains to be seen. In fact, the data remain to be seen. :D
No, it does not remain to be seen; it has, if my reasoning is correct, been explained. I have asked you to explain why you think I am wrong. Feel free to run a simulation of my coin-flip experiment and empirically demonstrate to yourself the soundness of my argument.
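As a starting point for such a simulation, here is a minimal sketch. It assumes (since the original example is not quoted in this part of the thread) that the mechanism at issue is self-selection: applicants informally self-test, only the lucky ones come forward, and their formal results regress to chance. All parameters are illustrative:

    import random

    random.seed(1)

    def coin_calls(n, p=0.5):
        """Score n coin calls for a claimant with no real ability."""
        return sum(random.random() < p for _ in range(n))

    self_scores, formal_scores = [], []
    for _ in range(200_000):
        s = coin_calls(20)  # informal self-test: 20 coin calls
        if s >= 15:         # only the "impressive" self-testers apply
            self_scores.append(s)
            formal_scores.append(coin_calls(20))  # the formal challenge test

    n = len(self_scores)
    print(f"applicants: {n}")
    print(f"mean self-test score:   {sum(self_scores) / n:.2f}/20")    # well above 10
    print(f"mean formal-test score: {sum(formal_scores) / n:.2f}/20")  # back near 10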

The data remain to be seen because there is no good reason to knowingly present data which would be misleading in the aggregate.
Again, let's ignore the issue of combining for now (something it seems you are having a very hard time doing).
I can understand why you would want to ignore it, but it won't go away. Address the issue, and then we won't have to ignore it. We can put it to rest.
We wouldn't know if combining is even applicable until delving into the specifics of the tests. Ignoring the combining, it would still be nice to see the data to answer basic questions like

-what % of those taking the preliminary test are male?
-what % were testing dowsing? card guessing?
-how many preliminary tests per year?
-what is the closest someone has got to passing?
-where geographically do the people being tested come from?
-what % of those being tested, get retested?
-how much does it cost, on average, to get tested?

and many others, this for each skeptical organization that does such tests.
Why? What use do you see for these data?
You are entitled to your opinion, sure.
And I have explained my reasoning, and invited you to do the same. All opinions are not equal. Some are held for good reason, some are held out of ignorance. If mine is the latter, I want to know it.
 
Mercutio,

Better make a list.... ;)
I don't do lists.

I could put my request in limerick form...

Is the problem with you, or with me?
I've asked you, times one, two, and three--
Please show my mistake--
That's all it would take--
So put up or shut up, T'ai Chi.
 
I do.


T'ai Chi,

  • Can you please explain the error of Mercutio's reasoning, instead of merely declaring that he is in error?

  • Will you run a simulation of Mercutio's coin-flip experiment and empirically demonstrate to yourself the soundness of his argument?

  • Can you tell us what element all of those tests mentioned by Gr8wight in post # 146 had in common?

  • What use do you see for these data you listed in #175?

  • Why is it "reasonable" to set alpha to what you did?

  • Why is your own alpha value "reasonable", if JREF and others set it differently?
    Status: Refused to answer.
 
