• Quick note - the problem with YouTube videos not embedding on the forum appears to have been fixed, thanks to ZiprHead. If you do still see problems, let me know.

JREF Challenge Statistics

The only thing presented in that note was a poor argument based on optional stopping, something which does not actually happen in the JREF's well-designed tests, and based on the observed data being tested against what the claimant claims, something which also does not occur, since z-scores are of the form

z = (observed - expected by chance) / (standard deviation expected by chance)
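To make that concrete, here is a minimal worked sketch with made-up numbers (the trial count, per-trial chance probability, and observed hits are all hypothetical, chosen only for illustration):

[code]
# Hypothetical example: z-score for a dowsing-style test, assuming
# 20 trials with a 1-in-10 chance of a hit per trial (made-up numbers).
import math

n, p = 20, 0.10                  # trials, per-trial chance probability
observed_hits = 5                # hypothetical result

expected = n * p                 # expected by chance = 2.0
sd = math.sqrt(n * p * (1 - p))  # standard deviation under chance
z = (observed_hits - expected) / sd
print(f"z = {z:.2f}")            # about 2.24 here
[/code]

Note that the comparison is to chance expectation, not to the claimant's claimed hit rate.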

Your hang-up is thinking of doing inference based on the %. Viewed as a descriptive statistic, the % poses no problems whatsoever.
The example I gave was a simplification of the problem of combining tests with decision levels based on different claims. It was intended purely to put the very real problem into a more concrete form so that you could understand it. Now the only thing missing is your ability or willingness to extrapolate from that example to the challenge situation.
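For readers who want to see the optional-stopping effect itself, separate from the dispute over whether it applies to the JREF tests, here is a minimal simulation sketch; the trial count, cutoff, and run count are all made-up parameters:

[code]
# Minimal simulation sketch: under pure chance, stopping a run early
# whenever the running z-score looks significant inflates the
# false-positive rate well past the nominal level.  All parameters
# here (trial count, cutoff) are made up for illustration.
import random

def run_test(n_trials=100, p=0.5, z_cutoff=1.96, peek=True):
    """Return True if a pure-chance guesser 'passes' the test."""
    hits = 0
    for i in range(1, n_trials + 1):
        hits += random.random() < p
        z = (hits - i * p) / (i * p * (1 - p)) ** 0.5
        if peek and z > z_cutoff:   # stop early on a lucky streak
            return True
    return z > z_cutoff             # otherwise judge the full run

random.seed(0)
runs = 10_000
for peek in (False, True):
    passes = sum(run_test(peek=peek) for _ in range(runs))
    label = "optional stopping" if peek else "fixed-length run "
    print(f"{label}: {passes / runs:.1%} false positives")
[/code]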
Why is it meaningless if one wants to know the % of females that have applied for the test? You do not want to learn numerical results about the test?
What would a given percentage mean? What inference could you draw? I have shown you why I think it meaningless.
The characteristics of the applicants. That seems interesting.

As do the categories of claims tested.

For example.

As do the scores from the tests, for reasons already explained to you.
But you would be unable to make any inferences at all about the greater population from this sample. Why not examine the far more useful data gathered by parapsychologists?
 
It was intended purely to put the very real problem into a more concrete form so that you could understand it.

Like I said, it was not dealing with real data nor a real situation (optional stopping and testing against what the claimant expects do not occur in the JREF tests), so understandably it is not a persuasive way to argue.

What would a given percentage mean? What inference could you draw? I have shown you why I think it meaningless.

This is descriptive statistics, as already mentioned, not inferential. It tells you what % of something occurred in the sample.
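As a trivial illustration of that descriptive use (the applicant list here is entirely made up):

[code]
# Descriptive statistic only: the share of female applicants in a
# hypothetical sample; no inference beyond the sample is implied.
applicants = ["F", "M", "M", "F", "M", "M", "M", "F"]  # made-up data
pct_female = applicants.count("F") / len(applicants)
print(f"{pct_female:.0%} of this sample's applicants are female")
[/code]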

Why not examine the far more useful data gathered by parapsychologists?

No one is stopping anybody from doing that. It is a good idea too, but a different topic. If you'd like to talk about that different topic, since you keep coming back to it, why not start a new thread on it and leave this thread to the statistics from tests done by skeptical organizations?
 
Like I said, it was not dealing with real data nor a real situation (optional stopping and testing against what the claimant expects do not occur in the JREF tests), so understandably it is not a persuasive way to argue.
You were unable to see the bias. I tried this example to help you. Perhaps it succeeded better with other readers.
This is descriptive statistics, as already mentioned, not inferential. It tells you what % of something occurred in the sample.
The last paragraph of your web page you linked a couple of posts back says that you intend to use the information, in part, to "help better understand what people believe in, who believes in them, where they are from..." This implies inference from the sample to the population.
No one is stopping anybody from doing that. It is a good idea too, but a different topic. If you'd like to talk about that different topic, since you keep coming back to it, why not start a new thread on it and leave this thread to the statistics from tests done by skeptical organizations?
Again from your page, you advocate "learning about how skeptical organizations test, and ways to improve the testing." I am simply suggesting ways to improve your own investigation. If you wish to answer the questions you ask on the face of it, the better data set is the parapsychologists'. If you have some other motive for looking at a data set that is not at all ideal for answering your question, by all means continue.
 
You were unable to see the bias.

I saw what you tried to do; it is just irrelevant for the reasons stated, namely that there is no optional stopping in the JREF tests, and the observed data are not compared to what the claimant expects but to chance.

Let's break this down: just show me one example of an actual, not hypothetical, test by the JREF where optional stopping was agreed upon and, moreover, actually occurred. Just one. That would shut me up, and it would prove your argument has some worth.

The last paragraph of your web page you linked a couple of posts back says that you intend to use the information, in part, to "help better understand what people believe in, who believes in them, where they are from..." This implies inference from the sample to the population.

No, it does not necessarily imply that. It tells us the characteristics of that sample of people.

Again from your page, you advocate "learning about how skeptical organizations test, .."

Yes, that is correct. Without data we can't say much.

If you wish to answer the questions you ask on the face of it, the better data set is the parapsychologists'.

A parapsychologists' data set is not data from preliminary tests done by skeptical organizations. If you wish to look at parapsychologists' data sets, which I agree are fascinating but are off the topic of data from tests done by skeptical organizations, you are welcome to do so.

If you have some other motive for looking at a data set that is not at all ideal for answering your question, by all means continue.

Looking at data from a sample is ideal for telling us about that sample.

If one has some motive for suggesting that data from skeptical organizations cannot possibly be analyzed, they can, by all means, continue.
 
I saw what you tried to do; it is just irrelevant for the reasons stated, namely that there is no optional stopping in the JREF tests, and the observed data are not compared to what the claimant expects but to chance.

Let's break this down: just show me one example of an actual, not hypothetical, test by the JREF where optional stopping was agreed upon and, moreover, actually occurred. Just one. That would shut me up, and it would prove your argument has some worth.
Once again, and slowly: That was not the point. The point was that the challenge tests were smaller runs because the cutoffs were chosen based on the claim, not based on an assumption of chance performance. That is, if you understand it, sufficient to create the bias. My example was much more blatant, and distilled the problem into one easy-to-see example.

The fact that you still do not see it, though, tells me that the argument failed in its purpose. Oh, well.
No, it does not necessarily imply that. It tells us the characteristics of that sample of people.
Perhaps you had better correct your web page then. It speaks of "what people believe in", not "what that small, self-selected sample of people believe in".
Yes, that is correct. Without data we can't say much.
Does the "yes" go so far as to understand the point I was making? In the part of my post you chose not to quote, I suggest that your own questions are better answered using other methods.
A parapsychologists' data set is not data from preliminary tests done by skeptical organizations. If you wish to look at parapsychologists' data sets, which I agree are fascinating but are off the topic of data from tests done by skeptical organizations, you are welcome to do so.
Well, then...given that the data are unable to answer the questions you have about "what people believe in...", what exactly is your motivation for focusing on the poorer data set?
Looking at data from a sample is ideal for telling us about that sample.
Yes. We have already looked at that sample, in the context of answering the questions that sample was intended to answer. Any more is bad statistics, and bad methodology.
If one has some motive for suggesting that data from skeptical organizations cannot possibly be analyzed, they can, by all means, continue.
Cannot? No. Should not? Many. No ulterior motive, though, simply enough experience with statistics and methodology that their misapplication is irritating. Does one need more motivation to advocate not using flawed methods to try to draw conclusions? I thought we had a common goal of understanding the world, understanding these phenomena. If I pointed out that there was one microscope in the lab that had a cracked lens, and noticed that you advocated using that scope, do you need to suggest "some motive" for my actions?

Your suggestion is flawed. Drop it and walk away.
 
Mercutio said:
What sort of things do you think you could learn from the gender percentages of these data?

The characteristics of the applicants. It seems interesting to know what type of people took the test: gender, where they are from, age, and so on.


It is becoming clear that T'ai Chi's interest is merely one of general curiosity, which is nothing to be ashamed of. I, too, have been very curious to find out more information about past tests, and was elated when Kramer began posting information about past challenge applicants on the forum.

Why he feels it necessary to hide his personal curiosity behind clumsy protestations of desire for statistical analysis is beyond me. Is he just trying to feel all academic-like? Does he somehow feel embarrassed to be doing nothing more than simply sticking his nose in and sniffing about?

Don't worry, T'ai Chi. None of us will think less of you if you admit the truth. In fact, many of us will think very much more of you if you abandon this charade and just deal honestly with us.
 
I think it would be interesting to know what percentage of paranormal subjects in skeptic tests have a mole on their left hand. Such a pity that they do not recognize the value such data could have to humanity.
 
T'ai Chi,

Answers to these questions, made possible by the data being easily available, would be of general interest to the skeptical community, and could help better understand what people believe in, who believes in them, where they are from, and learning about how skeptical organizations test, and ways to improve the testing.
Source

Yet, you merely pick those you can find, namely those who take the JREF challenge.

This is exactly what you have accused me of doing in this article: that I include all astrologers (although I don't). And here you do the very same: you take a small sample of people and extrapolate it to the general population.

Better change your webpage, T'ai.
 
What % of the applicants have been female? Interesting question. Seems unnecessarily difficult to get a numeric answer.
Doesn't seem an interesting question to me at all. What would it tell you?
 
Doesn't seem an interesting question to me at all. What would it tell you?

You might not be interested in things like, for example, the number of beds in a hospital, the number of car accidents in a city, the % of crime by type of crime, the number of a certain product sold, and other descriptive statistics, but some people are.

Skeptical organizations doing tests should expect people to be interested in seeing the numbers.
 
The point was that the challenge tests were smaller runs because the cutoffs were chosen based on the claim, not based on an assumption of chance performance.

That necessarily doesn't mean there is bias.

If there is, you have an issue with the test design, not people asking to see the statistics.
 
That necessarily doesn't mean there is bias.
Necessarily doesn't? Or doesn't necessarily? The former I would disagree with strongly. As to the latter, I would ask why one would plan a test knowing there is a known possibility of bias, instead of looking for better data sets with which to answer the question. If you have the choice between test tubes you know are clean and those that might be dirty, it should be an easy choice.
If there is, you have an issue with the test design, not people asking to see the statistics.
There is not a bias in the tests, when they are used for what they are designed for; the designs are perfectly sound for their purpose. The bias emerges when the results are combined to look for deviations from chance. Jekyll's analysis might control for that bias, but your first idea would not.
 
The bias emerges when the results are combined to look for deviations from chance.

That remains to be shown. An argument that appeals to hypotheticals is not convincing. In fact, it is somewhat doubtful, since meta-analyses have been very useful in many other areas of study.

In any case, the idea of combining aside, it would be nice to see an organized presentation of the data from individual tests done by skeptical organizations. Is that objected to as well?
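For what it's worth, one standard way of combining independent test results is Stouffer's method; here is a minimal sketch, assuming the per-test z-scores are independent and free of the selection and stopping effects under dispute above (the scores themselves are made up):

[code]
# Stouffer's method: combine k independent z-scores into one.
# Assumes independence and no selection/stopping effects -- the very
# assumptions being argued about in this thread.
import math

z_scores = [0.8, -1.2, 2.1, 0.3]   # hypothetical per-test z-scores

combined_z = sum(z_scores) / math.sqrt(len(z_scores))
# One-sided p-value from the standard normal survival function.
p_value = 0.5 * math.erfc(combined_z / math.sqrt(2))
print(f"combined z = {combined_z:.2f}, one-sided p = {p_value:.3f}")
[/code]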
 
That remains to be shown. An argument that appeals to hypotheticals is not convincing. In fact, it is somewhat doubtful, since meta-analyses have been very useful in many other areas of study.
Meta-analysis is a wonderful tool, but it is only as good as the original studies. If they are not appropriate for the meta-analysis, no amount of massaging the data will make it worthwhile.
In any case, the idea of combining aside, it would be nice to see an organized presentation of the data from individual tests done by skeptical organizations. Is that objected to as well?
Objected to? Questioned. You seem to want, although you deny it, to infer from the small, self-selected sample to the greater population. If that is at all implied, then it is worth objecting to. If it is not what you (or anyone) are after, then what is being examined is the characteristics of a self-selected sample, for the sake of looking at the characteristics of a self-selected sample. It can't tell us anything about the population at large; it cannot generalize to anything about human nature. It seems a very trivial question. Not objected to... but about this close to useless.
 
If it is not what you (or anyone) are after, then what is being examined is the characteristics of a self-selected sample, for the sake of looking at the characteristics of a self-selected sample.

If one is interested in the % of the types of tests that skeptical organizations have done, for example, then seeing these numbers is not useless; it in fact answers the question of what % of the tests have been of each type.

I believe (but may be wrong) that it has been said that for the JREF, for example, most of the tests are dowsing tests. It would be nice to have an actual number instead of "most". Is most 51%? 90%? What? If you have these %s for the general types of claims, you can sort the list from largest to smallest. This type of stuff is called 'understanding a topic better'. :) One would think that people calling themselves skeptics, and others, would be interested in seeing the test data from skeptical organizations, for a variety of reasons.

If one protests because the claimants are "self-selected", one needs to ask oneself whether there is any way to test people from the population randomly for a paranormal claim for a million dollar challenge. Sometimes, in reality, one has to go with what data one has, even if they were not gathered under ideal conditions. If no inference is drawn from such data, there is not really any issue.

Let's start with something simple: how many preliminary tests were conducted in each year that preliminary tests have been run? I doubt the "self selection" makes the answer to this question "useless".
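Here is a minimal sketch of the kind of purely descriptive tally being asked for, using made-up records (real ones would have to come from the published test logs):

[code]
# Purely descriptive tallies over hypothetical test records:
# % of tests by claim type (sorted largest to smallest) and the
# number of preliminary tests per year.
from collections import Counter

tests = [                       # made-up records for illustration
    {"year": 2005, "claim": "dowsing"},
    {"year": 2005, "claim": "telepathy"},
    {"year": 2006, "claim": "dowsing"},
    {"year": 2006, "claim": "dowsing"},
    {"year": 2007, "claim": "remote viewing"},
]

for claim, count in Counter(t["claim"] for t in tests).most_common():
    print(f"{claim}: {count / len(tests):.0%}")

by_year = Counter(t["year"] for t in tests)
for year in sorted(by_year):
    print(f"{year}: {by_year[year]} preliminary tests")
[/code]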
 
If one protests because the claimants are "self-selected", one needs to ask oneself whether there is any way to test people from the population randomly for a paranormal claim for a million dollar challenge.
Thank you. Yes, this points out the problem nicely. The challenge has a very specific aim, and there are exceedingly few questions that can be asked of data collected in accordance with that aim. It is not a "protest" that the claimants are self-selected; it is a fact. A fact which limits the applicability of these data to any other use. The question is not whether there is another way to test the population for the million; the challenge does a very good job of that. The question is, why would anyone want to take those data, which have accomplished their task, and try to force them to answer questions they are not equipped to answer?

The short answer to the question you pose here is that it needlessly combines two problems. If you want to ask the questions you want to ask, you need to test people randomly selected from the population. If you want to test a paranormal claim for a million dollar challenge, you need to do what the challenge has been doing. The data set from the latter is not appropriate to address the former. If you are truly interested in the questions you pose, you should avoid the challenge data.
 
Take a look at how hard T'ai Chi evades the questions put to him.

First, he thinks that it would be "interesting" to find out what percentage of the applicants have been female - and then indicates that this is a problem that JREF should solve.

Attempts to make him explain why it would be "interesting" result only in T'ai Chi's assurance that the statistics are merely "descriptive", despite the fact that he clearly indicates on his webpage that the statistics will be interpreted to help people better understand.

True to form, T'ai Chi has evaded this question:

Well, then...given that the data are unable to answer the questions you have about "what people believe in...", what exactly is your motivation for focusing on the poorer data set?

Taking potshots at skeptical organizations. By pointing to what he feels is poor data, he wants to portray skeptical organizations (and JREF in particular) as sloppy and poorly equipped to test paranormal applicants.

It is interesting to note that, despite T'ai Chi's insistence that only laboratory data is valid, he completely refuses to look at such data from the parapsychologists.
 
You might not be interested in things like, for example, the number of beds in a hospital, the number of car accidents in a city, the % of crime by type of crime, the number of a certain product sold, and other descriptive statistics, but some people are.

As it seems so important to you to answer this question, why don't you just go to the challenge forum and count the number of male and female applicants YOURSELF?
 
As it seems so important to you to answer this question, why don't you just go to the challenge forum and count the number of male and female applicants YOURSELF?

Because T'ai Chi has a long history of demanding that others do his work for him. Here's how it goes:

T'ai Chi: "It could be interesting to look at X".

Others: "Well, go look at X and tell us what you find."

T'ai Chi: "If you want to see it, you should find it yourself."

Others: "But you are the one who expressed interest in X."

T'ai Chi: "See how lazy those skeptics are..."

Take his little AURA "study" of transcripts of cold readers (psychics and mentalists): he insisted that others collect the transcripts for him. He invited people to come up with suggestions and critiques, but refused to listen to those who were critical, even excluding them from seeing the study. People had to state either that they believed in spirits or that they believed spirits were an impossibility - he would not allow the skeptical POV, namely that spirits may be possible but no evidence has yet been found. He himself took that stance, of course, and saw no problem whatsoever.

For some unknown reason, T'ai Chi has removed the "study" from his website. Copies are, however, available if you send me an email.
 
A fact which limits the applicability of these data to any other use.

So every once in a while we see data from these tests, right?

Why not put it all in an easy-to-get, easy-to-read format for everyone who is interested in such results from each test?

Seems reasonable.

If you are truly interested in the questions you pose, you should avoid the challenge data.

Is "truly interested" the same as a "true Scotsman"? :D
 
