
Who runs the tests?

Who runs the tests? The JREF, no question!

The flummery on the page preceding the actual application states that the JREF does not involve itself with the testing procedure. However, this point is not in the terms of the actual application. The actual application does state that Randi may be present (but he won't interact with any materials!).

Also, "We consult competent statisticians when an evaluation of the results, or experiment design, is required" does not mean that the statisticians are completely unconnected to the JREF, or unsympathetic to the JREF.

There have also never been any failed claimants to the $1,000,000 challenge! Back to the flummery of the preceding page: "Upon success in the preliminary testing process, the "applicant" becomes a "claimant."" As the application states, "To date, no applicant has passed the preliminary test, and this has eliminated the need for formal testing in those cases" - therefore, there has never even been a claimant to the challenge!
 
You've raised an interesting point there, Scoman. (Welcome, by the way!)

"We consult competent statisticians when an evaluation of the results, or experiment design, is required"

Randi has always been very forthright in pointing out that the tests are designed so that no evaluation of results is ever required. Sure, consult a statistician while the test is being designed, so that an appropriate number of repetitions can be picked to rule out chance, but it should never be necessary to consult one afterwards. The results are supposed to be self-evident!

Can anyone explain this?
 
TheBoyPaj said:
"We consult competent statisticians when an evaluation of the results, or experiment design, is required"

Randi has always been very forthright in pointing out that the tests are designed so that no evaluation of results is ever required. Sure, consult a statistician while the test is being designed, so that an appropriate number of repetitions can be picked to rule out chance, but it should never be necessary to consult one afterwards. The results are supposed to be self-evident!

Can anyone explain this?
Yes. It's perfectly simple. For a good example, look at the "question about statistical significance" thread on the "Science, Mathematics..." forum.

In that one, I asked about a potential applicant who has a claim that he can identify a homoeopathic preparation from a sham. You give him a preparation and he answers. Well, he can be right or wrong. It's a simple yes/no situation, he has a 50% chance of getting it right just by a random guess. Randi isn't giving anyone a million bucks for getting a 50/50 guess right.

So, how many times does he have to get it right before you're as confident as you can be that he's not just guessing? We chewed it over and came up with 18 right out of 20, because that gives a p value of <0.001. Nine out of ten doesn't quite reach that level of significance, though ten out of ten does (but three out of three doesn't). You do need statistical advice to ensure that you've set the bar high enough so that only the genuinely superpowered are going to be able to clear it.
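Just to illustrate Rolfe's numbers, here's a quick Python sketch (the function name is mine, nothing to do with the JREF) that computes the exact binomial tail probability, i.e. the chance of scoring at least that well by pure guessing:

```python
from math import comb

def tail_prob(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more hits by guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# The figures quoted in the post above:
for n, k in [(20, 18), (10, 10), (10, 9), (3, 3)]:
    print(f"{k}/{n}: p = {tail_prob(n, k):.6f}")
```

It gives roughly 0.0002 for 18 out of 20 and 0.00098 for 10 out of 10 (both under 0.001), against about 0.011 for 9 out of 10 and 0.125 for 3 out of 3 - matching the post.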

The thing is, you have to agree this beforehand - 10 out of 10 or 18 out of 20 can be mutually agreed as the level of performance which constitutes a pass. Less than this is a fail. Once that has been agreed (and if the applicant tries to insist that 16 out of 20 should be a pass, you have to explain why you don't buy that), then the outcome is self-evident. He either achieves it or he doesn't.

Clear enough?

Rolfe.
 
"Clear enough?" - Yes!

As long as the use of statisticians, and their identity, is agreed beforehand, that does answer the question.
 
That covers the "experimental design" bit, but not the "evaluation of the results" which should never be needed.
 
TheBoyPaj said:
That covers the "experimental design" bit, but not the "evaluation of the results" which should never be needed.
It is at the most basic level. Somebody needs to have made a record of whether the applicant said "yes" or "no" at each trial, and that needs to be matched up to the record of what was given at each trial. Are the records accurate? Tamper-proof? Do they come up with a pass or fail according to the agreed criteria? I think that's all it is.
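To show just how basic that evaluation is, here's a minimal Python sketch of the bookkeeping (the trial data and the pass mark of 18 out of 20 are invented for illustration):

```python
def score(given, answered, pass_mark):
    """Match the applicant's calls against what was actually given,
    then apply the pre-agreed pass mark. Returns (hits, passed)."""
    if len(given) != len(answered):
        raise ValueError("records don't match up: differing trial counts")
    hits = sum(g == a for g, a in zip(given, answered))
    return hits, hits >= pass_mark

# Invented example run: 20 trials, pre-agreed pass mark of 18.
given    = ["real", "sham"] * 10
answered = ["real", "sham"] * 8 + ["sham", "real"] * 2  # last 4 calls wrong
hits, passed = score(given, answered, pass_mark=18)
print(hits, passed)  # 16 False
```

No judgment call anywhere: once the two records agree on the trial count, the outcome falls straight out of the pre-agreed criterion.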

Rolfe.
 
Rolfe said:
The thing is, you have to agree this beforehand - 10 out of 10 or 18 out of 20 can be mutually agreed as the level of performance which constitutes a pass. Less than this is a fail. Once that has been agreed (and if the applicant tries to insist that 16 out of 20 should be a pass, you have to explain why you don't buy that), then the outcome is self-evident. He either achieves it or he doesn't.

Clear enough?

I understand what you are saying, and I know that's how Randi runs his tests, but it seems to me that that is completely the wrong way to do it. It seems to me absurd that with a passmark of 80% and a score of anything up to 79% Randi would declare a failure.

There should be three possible results: definite pass, definite failure, or more tests required.

Example: http://www.skeptics.com.au/journal/divining.htm

In this dowsing test, dowsers had a 1 in 10 chance of guessing the correct pipe, so you would expect a 10% hit rate. In fact they scored 22% on the water test.

This is a lot lower than the passmark of 80%, but a lot higher than chance; the odds against it are about 107 to 1. Statistically speaking, we can say that there is a better than 99% chance that there is some real dowsing effect here.

When you get results of this type, you have to decide whether it's real or just a fluke. So you should run further tests to see if the results are replicated. If you run the test several times and the dowsers score above 20% on every test, that would be conclusive evidence of dowsing. The fact that they boast of a 100% success rate wouldn't matter.

In your hypothetical example, with a score of 16 out of 20, again that would be significantly higher than chance, I would want to see if he can repeat it with the same result.
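For what it's worth, both figures can be checked with the same sort of binomial tail calculation. The linked article excerpt doesn't give the trial counts, so the 11 hits in 50 trials below is an invented combination; it just happens to give a 22% hit rate and odds close to the quoted 107 to 1:

```python
from math import comb

def tail_prob(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 16 hits out of 20 at 50/50 chance per trial:
p16 = tail_prob(20, 16, 0.5)
print(f"16/20: p = {p16:.4f}")  # about 0.0059

# Dowsing: 1-in-10 chance per trial; 11 hits in 50 trials is a 22% hit
# rate (the trial count is an assumption, not taken from the article).
pd = tail_prob(50, 11, 0.1)
print(f"11/50: p = {pd:.4f}, odds about {round((1 - pd) / pd)} to 1")
```

So 16 out of 20 comes out around p = 0.006 - well short of the agreed 0.001 threshold for a pass, but far better than chance, which is exactly the awkward middle ground being argued about here.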

That's how the test should be run.

Randi, of course, won't do that.
 
You really do only need a binary result.

Party A says "90% is the pass mark".

Party B says, "No - 60% is better than chance alone. 90% is too high."

Party A says,"How about 75%, then ?"

Party B says,"Ok, 75% is easy, so OK."

Party A says, "Are you sure ? 75% is do-able for you ?"

Party B says, "yes!"

Party A says, "So if you achieve only 74.9% or less than that, then you agree that you've failed, and will say so publicly ?"

Party B says....

whatever, and so it goes until both sides are satisfied. No need for graduated scales or judges, etc.


[EDIT]typo
 
Well put, CERDIP. This reminds me of a time when someone told me to meet them somewhere "before 3pm". I arrived at 2:30pm, and they were angry because they wanted me there earlier. I followed their request, but it wasn't good enough because the same request meant something else to them. They should have given an exact deadline to remove all doubt. And I was also at fault for not asking them what they meant by "before 3pm".

The agreement BEFORE the test (which has been mentioned at least 5 times in this thread) can make it a binary test.
 
Peter Morris said:

It seems to me absurd that with a passmark of 80% and a score of anything up to 79% Randi would declare a failure.

There should be three possible results: definite pass, definite failure, or more tests required.

There is already a way to say "more tests are required", because the terms of the JREF allow people to reapply in the future. If a person fails but thinks they were close, they are welcome to make any adjustments they feel are necessary and retake the test.
 
