Merged Odds Standard for Preliminary Test

Thus, to this day, it remains unclear what odds standard must be met and whether time-consuming protocols are eligible for the Challenge.

I think the problem is that you look at the Challenge as a test for paranormal effects, so you make suggestions that are suitable for such a test. Instead, think of the Challenge as a publicity stunt. The purpose is not to discover paranormal abilities, but rather to put a public face on what it means to be skeptical and to examine claims.

If you look at it in that regard, it becomes obvious that tests which are tedious, uninteresting to watch, and which show effects so small that no one goes "Wow" when they see them (even if those effects are established to be remarkable at greater than 1000-to-1 odds) will not be suitable for the Challenge. It needs to be set up so that it is obvious to the casual observer when something remarkable or unremarkable has happened. For this reason, too, the odds cannot be set in advance: Randi must maintain flexibility in the design of the experiment.

Anita's recent test is a good example of something which isn't suitable for the Challenge. The test was set up in such a way that she had a good chance of getting at least one correct answer and of providing answers that would be perceived as hits by the casual observer. You can see that some people were unable to resist the temptation of treating her choice of the right person, and her purported sensation of certainty, as somehow indicative of an effect (even though neither was part of the formal test). On the other hand, the PEAR tests of influencing random outcomes showed results which were far more unlikely than Anita's, yet who would see that one excess hit for every 10,000 trials and argue that they saw an effect?

Think of it this way - it has to play well on Youtube.

Linda
 
Bear in mind, Mr. Rodney, when finagling with the rules, that it is not JREF resources which are unduly taxed by the application procedure outside of the negotiation stage - the applicant pays the costs of the procedure, while some other organization handles the actual test. I don't know what fees those organizations charge, if any, but from personal history with non-skeptical organizations of about the same size, these fees would be in the four digits and would certainly present a roadblock to any expedient test being conducted.

I think ultimately it is this difficulty that prohibits your emendations from being codified - the burden placed on other organizations (not to mention the applicant) would be too great for many of them to bear. Creating a list of organizations capable of bearing the costs of any extended procedure would both dramatically limit the scope of the Challenge and greatly restrict the available applicant pool - hardly ideal for a comparatively small organization focused upon education. Skeptics and their organizations seem to be judged upon knowledge accumulated rather than financial wherewithal.

Your point about more stringent codification of the odds is one I am in favor of, at least on the theoretical level, but it seems practically difficult to enforce adequately. It would certainly be more scientific, but generally the more stringent the controls, the more expensive the test - I don't think it reasonable for an applicant to bear the costs of a fully scientific procedure (which would be quite expensive; judging by some of the grants my university obtains for that degree of research, we're looking at the mid-to-upper six figures). Also, bear in mind that perfect scientific exclusion is not altogether practical in many of these experiments, owing to ethical concerns - I'd love to forcibly isolate a Ganzfeld participant for the duration of a p=0.001 test, for example, but I imagine that few people would volunteer for it once the requirements were explained to them.

fls said:
I think the problem is that you look at the Challenge as a test for paranormal effects, so you make suggestions that are suitable for such a test. Instead, think of the Challenge as a publicity stunt. The purpose is not to discover paranormal abilities, but rather to put a public face on what it means to be skeptical and to examine claims.

No. If it isn't a test for the paranormal, what purpose does it serve? A magician doesn't need to put a million dollars on the line to garner what is essentially cheap publicity available for the cost of a webcam - judging by the proliferation of amateur magicians and skeptics on YouTube, many have come up with this same idea (and have found parents willing to buy them the webcam as well). Given that many of the actual tests conducted for the Challenge were not widely publicized (whether on YouTube or otherwise) I am not sure this proclamation holds water. Ms. Hunter's case possessed some absurdist elements that made it a worthy spectacle, while Ms. Sonne's test was broadcast as part of a larger event - the decision to broadcast may depend upon noteworthy features, but the decision to test is not.

Science ultimately tries to prove things by exclusion, meaning that implicit or potential effects are removed by experimentation over the long term. If, for example, I set up a test to determine whether or not I can psychically detect someone's hair color when they were in an adjoining room, it's possible that I could simply guess right each time. If we do three trials of this and I get three hits, blind guessing is both a very normal solution and a very possible one. If we increase the number of trials, guesswork becomes increasingly unlikely and will eventually pass the point of statistical likelihood. This doesn't mean it isn't possible (over a thousand trials it's still possible that I could guess a significant percentage right), merely that it's so improbable that it is a functionally insignificant probability - but, because it is a probability, there's always that slim chance of success. Blind luck cannot be controlled for.
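A quick sketch of this, using a hypothetical hair-colour test with four equally likely colours (all numbers here are purely illustrative, not from any actual protocol):

```python
from math import comb

def p_at_least(k, n, p):
    """Exact binomial tail: chance of k or more hits in n trials
    when each independent trial succeeds with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical test: guess one of four equally likely hair colours per trial.
chance = 0.25
print(p_at_least(3, 3, chance))    # 3/3 by blind guessing: about 1.6% - quite plausible
print(p_at_least(15, 20, chance))  # 15/20 by blind guessing: a few in a million
```

Three straight hits are well within reach of luck; a comparable hit rate over twenty trials is not, which is the whole argument for more trials - though, as the paragraph above says, the tail probability never actually reaches zero.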

The ultimate goal of the Challenge is to control out the mundane explanations and trickery that could be causing a supposedly paranormal event. The decision to broadcast a Challenge test to the world at large is not made upon test design criteria; rather, it is made upon features of the claim (as in Ms. Hunter's case) or as a subsidiary of a larger broadcasting initiative (Ms. Sonne's test at TAM).

~ Matt
 
Reading MattC's post, would it be completely out of line to ask you, Mr. Rodney, if you could accept the limitations of the JREF Challenge and just set up a Ganzfeld test elsewhere?

Yes, we know it's not easy.

Perhaps you could draw courage from the fact that you would be actually doing something - creating reality, if you will - rather than meandering about the limitations you very well know and understand.
 
I think the problem is that you look at the Challenge as a test for paranormal effects, so you make suggestions that are suitable for such a test. Instead, think of the Challenge as a publicity stunt.
I commend you for your astuteness. I think MDC Rule Number 1 should be: "The first and most important rule is that the Challenge is only a publicity stunt."
 
I commend you for your astuteness. I think MDC Rule Number 1 should be: "The first and most important rule is that the Challenge is only a publicity stunt."

Well, it is a marketing device; I doubt that anyone thinks otherwise. But that being said, how are "odds" relevant to all claims?

Let's say my claim is "I can levitate." This is a paranormal claim, and the odds are irrelevant. I either can or I can't (in fact, a can/can't preliminary test actually took place last year).

There are no odds to calculate. So, in the rules for such a claim, specific odds become irrelevant. Because all claims are different, there can be no specific "beat the odds" thingy included. Some of the proposed challenges simply do not work that way.

And as has been said, perhaps your suggestions were considered, found wanting, and not included for that reason.

Norm
 
fromdownunder said:
Let's say my claim is "I can levitate." This is a paranormal claim, and the odds are irrelevant. I either can or I can't (in fact, a can/can't preliminary test actually took place last year).

There are no odds to calculate. So, in the rules for such a claim, specific odds become irrelevant. Because all claims are different, there can be no specific "beat the odds" thingy included. Some of the proposed challenges simply do not work that way.

It can become more complex (e.g. "I can't do it all the time"), but there are some cases where odds aren't required, nor can they be made as specific as Mr. Rodney seems to be pushing for. If this levitation process requires three hours of intensive meditation before you can manage it, doing more than one trial seems quite an imposition upon whomever we ask to observe the event. While undoubtedly someone could be found to shamelessly observe twenty repetitions of the same levitation, there comes a limit as to what most people are willing to volunteer for. Given that the neutrality of volunteers is important, it makes sense to reduce the demands of the protocol in favor of getting an actual test off the ground, so long as experimental efficacy is not compromised.

GzuzKryzt said:
Reading MattC's post, would it be completely out of line to ask you, Mr. Rodney, if you could accept the limitations of the JREF Challenge and just set up a Ganzfeld test elsewhere?

The only real stumbling block I can see to establishing a Ganzfeld test would be finding the space - you'd need at least two rooms and permission to soundproof them (if they weren't already). More importantly, you'd need them for a fair bit of time, which leads me to think that getting two motel rooms some distance apart (at least four units, I'd think) and setting up the equipment there wouldn't be so difficult. Paying for them would naturally be the responsibility of the applicant.

~ Matt
 
It can become more complex (e.g. "I can't do it all the time"), but there are some cases where odds aren't required, nor can they be made as specific as Mr. Rodney seems to be pushing for.
I agree, which is why I suggested the following language: "An applicant must pass a preliminary test, in which the general criterion for success will be that the applicant must perform at significantly above the chance level. In tests where the odds of success can be readily calculated (emphasis added) -- such as numbers guessing -- the applicant must perform at least at the P=.001 level; that is, the odds must be only one in one thousand that the applicant could have achieved that performance level by random chance. (However, if the applicant achieves a lesser, but above chance, performance level in a limited number of tests -- for example, if the applicant performs at the P=.05 level in 20 trials -- the preliminary test may be extended on a different day or days to include more trials.)"
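Under that language, the required score in a readily calculated test follows directly from the binomial tail. A sketch, using a hypothetical digit-guessing game (one digit 0-9 per trial, so chance is 1/10; the trial counts are arbitrary):

```python
from math import comb

def tail(k, n, p):
    """P(X >= k) for a binomial with n trials and per-trial hit chance p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def hits_needed(n, p, alpha=0.001):
    """Smallest score that blind guessing reaches with probability <= alpha."""
    for k in range(n + 1):
        if tail(k, n, p) <= alpha:
            return k
    return None  # even a perfect score would not clear alpha

# Guessing one digit (0-9) per trial, so chance = 1/10:
for n in (10, 20, 50):
    print(n, hits_needed(n, 0.1))
```

At 10 trials the pass mark works out to 6 hits - well above the chance expectation of 1 - and the mark grows more slowly than the trial count, which is why extending a borderline test with more trials, as the parenthetical proposes, can resolve it one way or the other.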
 
http://www.automeasure.com/chance.html

This web site suggests that performing at a 1/1000 level in a test with five trials requires 0-2 successes - mighty good odds. I'd be willing to apply for a number-guessing trial given that hit list; a pack of cards is cheap, and the potential rewards quite worth it.

~ Matt
 
In tests where the odds of success can be readily calculated (emphasis added) -- such as numbers guessing -- the applicant must perform at least at the P=.001 level; that is, the odds must be only one in one thousand that the applicant could have achieved that performance level by random chance.
The recorded history of the MDC seems to indicate that the JREF practically always offers better chances for success than this.

(However, if the applicant achieves a lesser, but above chance, performance level in a limited number of tests -- for example, if the applicant performs at the P=.05 level in 20 trials -- the preliminary test may be extended on a different day or days to include more trials.)"
A prime focus for the JREF has been to keep tests short and simple. It would be counterproductive to extend short tests until the P=.001 level had been reached. It must be up to the claimant to demand a sufficient number of trials if he thinks his level of certainty is close to random chance.
 
Rodney, why do you think the JREF should explain to you why the MDC rules weren't changed in accord with your suggestions?
 
Rodney, why do you think the JREF should explain to you why the MDC rules weren't changed in accord with your suggestions?
For one thing, you asserted in another context: "If you want a detailed and official answer, please contact challenge@randi.org." See http://www.internationalskeptics.com/forums/showthread.php?t=160637

Second, I doubt if the JREF receives more than a few e-mails each year suggesting general modifications to the rules. So why not a brief response to the few inquiries that they do receive?
 
http://www.automeasure.com/chance.html

With the best of fortune, this link should work. Without that, having enjoyed several conversations about this issue, pray let me expound on why I think it's theoretically good (having some sort of standard on hand is good) but practically unsound (it couldn't be applied all the time, which is contrary to the definition of "standard").

I think a large part of the difficulty here between both parties can be focused entirely upon the "random chance" phrase and its persistent inclusion. If we were to enforce a 1/1000 standard, breaking this would be quite simple within a few trials precisely because we wouldn't be randomly guessing - guessing according to a pattern or figures is certainly practical, but perhaps an example might aid my case.

I claim that I can tell someone's hair color at a better-than-chance rate if they are in an adjoining room. Over 10 trials of this, the web site I keep attempting to link suggests that any number of successes greater than two (table 3 has this information) would not be attributable to "random chance," but this operates under the presumption that I am actually guessing randomly. Were I located in Europe, where, according to Frost, black hair (or some shade thereof) is comparatively common, beating these odds by simply guessing "black" on every trial becomes a real possibility - and certainly not a paranormal one. Were the JREF unaware of this predominance, dire circumstances could result from adherence to a set-odds protocol. Further, were I actually inclined to apply, I would view the ability to set my own criteria for success as a sign of the JREF's intention to fairly investigate my claim.

The ultimate problem with mathematical determinations of "random chance" is that mathematical purity rarely translates very well to muddied reality. In the sciences, a significant result is commonly taken as a guide that something is causing an effect, but what that "something" is cannot be determined by simply "beating the odds" - luck can never be controlled for. If you truly desire to test something for the paranormal, set the odds for success much higher, to forcibly exclude most cases of percentage-based guessing (as I employed above) and a majority of luck-oriented factors.

Wisely, in my opinion, the JREF does so.
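The base-rate point can be checked numerically. All figures below are invented for illustration: four assumed hair colours, a 70% local share of dark hair, and a pass mark of 8 hits in 10 trials chosen so that uniform guessing clears it less than 0.1% of the time:

```python
from math import comb

def tail(k, n, p):
    """P(X >= k) for a binomial with n trials and per-trial hit chance p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, pass_mark = 10, 8
assumed_chance = 0.25  # protocol assumes four equally likely hair colours
base_rate = 0.70       # hypothetical share of dark hair in the local population

print(tail(pass_mark, n, assumed_chance))  # below 0.001: "impossible" by uniform guessing
print(tail(pass_mark, n, base_rate))       # about 0.38: always guessing "dark" often passes
```

The same score that looks one-in-a-thousand under the protocol's chance assumption comes up more than a third of the time for a guesser who merely knows the local population - which is exactly why a fixed odds standard cannot substitute for a protocol fitted to the claim.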

Rodney said:
Second, I doubt if the JREF receives more than a few e-mails each year suggesting general modifications to the rules. So why not a brief response to the few inquiries that they do receive?

This is a forum ultimately designed to facilitate debate, and the participants here are not hesitant to engage in such behavior. A "brief response" would, judging by a brief analysis of the forums, snowball into a much bigger response that would serve little purpose.

~ Matt
 
Thanks. The link works now.

With the best of fortune, this link should work. Without that, having enjoyed several conversations about this issue, pray let me expound on why I think it's theoretically good (having some sort of standard on hand is good) but practically unsound (it couldn't be applied all the time, which is contrary to the definition of "standard").

I think a large part of the difficulty here between both parties can be focused entirely upon the "random chance" phrase and its persistent inclusion. If we were to enforce a 1/1000 standard, breaking this would be quite simple within a few trials precisely because we wouldn't be randomly guessing - guessing according to a pattern or figures is certainly practical, but perhaps an example might aid my case.

I claim that I can tell someone's hair color at a better-than-chance rate if they are in an adjoining room. Over 10 trials of this, the web site I keep attempting to link suggests that any number of successes greater than two (table 3 has this information) would not be attributable to "random chance," but this operates under the presumption that I am actually guessing randomly. Were I located in Europe, where, according to Frost, black hair (or some shade thereof) is comparatively common, beating these odds by simply guessing "black" on every trial becomes a real possibility - and certainly not a paranormal one. Were the JREF unaware of this predominance, dire circumstances could result from adherence to a set-odds protocol. Further, were I actually inclined to apply, I would view the ability to set my own criteria for success as a sign of the JREF's intention to fairly investigate my claim.

The ultimate problem with mathematical determinations of "random chance" is that mathematical purity rarely translates very well to muddied reality. In the sciences, a significant result is commonly taken as a guide that something is causing an effect, but what that "something" is cannot be determined by simply "beating the odds" - luck can never be controlled for. If you truly desire to test something for the paranormal, set the odds for success much higher, to forcibly exclude most cases of percentage-based guessing (as I employed above) and a majority of luck-oriented factors.
Individually-designed protocols should be able to ensure that each applicant is meeting a .001 odds standard in tests where the odds of success can be readily calculated. The problem with things the way they are now is that there is no uniformity. In Pavel's case, after undergoing endless negotiations with the JREF, he was summarily informed that he must score 100% on his preliminary test to pass -- when Pavel himself had consistently stated that his paranormal ability is less than perfect.

Wisely, in my opinion, the JREF does so.

This is a forum ultimately designed to facilitate debate, and the participants here are not hesitant to engage in such behavior. A "brief response" would, judging by a brief analysis of the forums, snowball into a much bigger response that would serve little purpose.

~ Matt
If you're right, GzuzKryzt is wrong when he recommends: "If you want a detailed and official answer, please contact challenge@randi.org."
 
...
Second, I doubt if the JREF receives more than a few e-mails each year suggesting general modifications to the rules. So why not a brief response to the few inquiries that they do receive?

You said this was Jeff Wagg's response:

"Hello Rodney,

Thanks for the suggestions. So you know, the challenge rules are being reconsidered, and we'll take your suggestions into account.

If we do make changes, they'll be posted publicly.

Jeff"

A brief response to an inquiry. What am I missing?
 
Rodney said:
Individually-designed protocols should be able to ensure that each applicant is meeting a .001 odds standard in tests where the odds of success can be readily calculated. The problem with things the way they are now is that there is no uniformity. In Pavel's case, after undergoing endless negotiations with the JREF, he was summarily informed that he must score 100% on his preliminary test to pass -- when Pavel himself had consistently stated that his paranormal ability is less than perfect.

I do not know much about Mr. Pavel's case aside from the involvement of Mr. Startz, someone whom I respect.

0.001 standards are calculated against random - truly random - chance. If the guessing is not truly random, these odds are quite beatable, as I have persistently attempted to show. Employing a set standard like this serves no benefit.

~ Matt
 
No. If it isn't a test for the paranormal, what purpose does it serve? A magician doesn't need to put a million dollars on the line to garner what is essentially cheap publicity available for the cost of a webcam - judging by the proliferation of amateur magicians and skeptics on YouTube, many have come up with this same idea (and have found parents willing to buy them the webcam as well). Given that many of the actual tests conducted for the Challenge were not widely publicized (whether on YouTube or otherwise) I am not sure this proclamation holds water. Ms. Hunter's case possessed some absurdist elements that made it a worthy spectacle, while Ms. Sonne's test was broadcast as part of a larger event - the decision to broadcast may depend upon noteworthy features, but the decision to test is not.

What explanation do you offer for Randi's capricious and abrupt rejection of Pavel's claim?

Science ultimately tries to prove things by exclusion, meaning that implicit or potential effects are removed by experimentation over the long term. If, for example, I set up a test to determine whether or not I can psychically detect someone's hair color when they were in an adjoining room, it's possible that I could simply guess right each time. If we do three trials of this and I get three hits, blind guessing is both a very normal solution and a very possible one. If we increase the number of trials, guesswork becomes increasingly unlikely and will eventually pass the point of statistical likelihood. This doesn't mean it isn't possible (over a thousand trials it's still possible that I could guess a significant percentage right), merely that it's so improbable that it is a functionally insignificant probability - but, because it is a probability, there's always that slim chance of success. Blind luck cannot be controlled for.

It seems clear that what you have just described does not resemble the Challenge.

The ultimate goal of the Challenge is to control out the mundane explanations and trickery that could be causing a supposedly paranormal event. The decision to broadcast a Challenge test to the world at large is not made upon test design criteria; rather, it is made upon features of the claim (as in Ms. Hunter's case) or as a subsidiary of a larger broadcasting initiative (Ms. Sonne's test at TAM).

~ Matt

The decision to broadcast is irrelevant to what I said earlier.

It is very important to Randi's educational mission that he have a body of work, known collectively as The Challenge, to refer to. Each piece of this body of work should be somewhat representative, so that any piece can serve as an example of what The Challenge represents - unqualified failure. Whether any individual test is broadcast, or whether it subsequently shows up on YouTube, it should be viewable as an unqualified failure by any casual observer, in order to provide support for the rest of Randi's message.

Linda
 
0.001 standards are calculated against random - truly random - chance. If the guessing is not truly random, these odds are quite beatable, as I have persistently attempted to show.

You have shown that calculating odds that are not based upon the situation at hand would be foolish. However, I have not seen anyone, particularly Rodney, suggest this. In fact, it is a quite bizarre suggestion and I am puzzled as to why you even brought it up.

Employing a set standard like this serves no benefit.

~ Matt

One benefit that I could see would be to make Dean Radin and others look foolish when they tried to claim that it would take thousands of ganzfeld trials to pass the Challenge.

Linda
 
You said this was Jeff Wagg's response:

"Hello Rodney,

Thanks for the suggestions. So you know, the challenge rules are being reconsidered, and we'll take your suggestions into account.

If we do make changes, they'll be posted publicly.

Jeff"

A brief response to an inquiry. What am I missing?
The "detailed and official answer" that you suggested NWO Sentryman would receive if he were to inquire about the applicability of the JREF MDC to bomb detectors. That is what I hoped to receive in May 2008 when I first made my inquiry. Instead, I received no response until I followed up in February 2009, when I received the above response from Jeff. What I then expected to happen is that either: (a) the MDC rules would be modified in accord with my suggestions, or (b) I would receive an explanation as to why my suggestions were rejected.
 
I commend you for your astuteness. I think MDC Rule Number 1 should be: "The first and most important rule is that the Challenge is only a publicity stunt."

Just out of curiosity, since you have been so persistent on this point...is your real purpose to get the JREF to change the rules in line with your suggestions, or is it to force them to be more explicit in their real intentions by their refusal to change them? Or is it simply to show them up as inherently unfair and unscientific?

Linda
 