Merged Odds Standard for Preliminary Test

Just out of curiosity, since you have been so persistent on this point...is your real purpose to get the JREF to change the rules in line with your suggestions, or is it to force them to be more explicit in their real intentions by their refusal to change them? Or is it simply to show them up as inherently unfair and unscientific?

Linda
The former, but the failure of the JREF to even explicitly respond to my suggestions -- let alone implement them -- more than 18 months down the road from when I first made them suggests that the latter is true.
 
The former, but the failure of the JREF to even explicitly respond to my suggestions -- let alone implement them -- more than 18 months down the road from when I first made them suggests that the latter is true.


Case not proven.


M.
 
...
What I then expected to happen is that either: (a) the MDC rules would be modified in accord with my suggestions, or (b) I would receive an explanation as to why my suggestions were rejected.

I somehow admire you for your persistence. I kid you not.

If you seriously expected "the MDC rules would be modified in accord with [your] suggestions", consider you may have a warped view of the MDC, as per the definition of its rules. Plus, you might have huge cojones.

I can't see why the JREF should give you "an explanation as to why [your] suggestions were rejected.".
 
The former, but the failure of the JREF to even explicitly respond to my suggestions -- let alone implement them -- more than 18 months down the road from when I first made them suggests that the latter is true.

Well, I don't think either of them makes your case.

It's pretty clear that the JREF, and Randi in particular, don't give any priority to communication. This is a consistent theme on any subject - from responding to requests for references, to transparency in moderation, to timely follow-up on Challenge applications - whether or not you hear back from them is pretty much happenstance. And this has nothing to do with whether or not you are on their side. So there is nothing remarkable about the absence of a reply.

And there are several good reasons for failing to be specific when it comes to odds or time limits - reasons that you have ignored or dismissed in these discussions. And some of your suggestions (as has already been pointed out to you) have the effect of introducing a bias (the finding of a result when none should be found) or of making the test very unfair to the subject.

Linda
 
fls said:
What explanation do you offer for Randi's capricious and abrupt rejection of Pavel's claim?

I seem to recall mentioning that I had very little (if anything) to do with the case, paying it only the most cursory attention because I enjoy matching my responses against Startz's considerably more educated and sensible ones as a benchmark of personal success. I am further not certain how this question is relevant to what you quoted?

fls said:
It seems clear that what you have just described does not resemble the Challenge.

Certainly, which is why I declared it "science" and invoked terms like "long term" in an attempt to illustrate the difference. Mr. Randi is quite rightly not attempting to do pure science, he seems to attempt to be as scientific as possible with the resources available to both him and the applicant. Bear in mind when evaluating the Challenge that a magician is not an expert in science but rather in trickery - it is sufficient for a challenge offered by a magician to control out the forms of trickery that magicians would perceive, something that holds many similarities to science but is not perfectly equivalent.

fls said:
You have shown that calculating odds that are not based upon the situation at hand would be foolish. However, I have not seen anyone, particularly Rodney, suggest this. In fact, it is a quite bizarre suggestion and I am puzzled as to why you even brought it up.

Oh, certainly he has - his attempt to emend the rules to incorporate an odds standard based upon mathematical purity rather than physical reality is doing just that - calculating odds not based upon the situation at hand. It is possible (as I also attempted to show) to design a claim where the odds are easy to calculate, thereby qualifying Mr. Rodney's arbitrary standard, and further to have those odds quite beatable by nonrandom guessing.

fls said:
It's pretty clear that the JREF, and Randi in particular, don't give any priority to communication. This is a consistent theme on any subject - from responding to requests for references, to transparency in moderation, to timely follow-up on Challenge applications - whether or not you hear back from them is pretty much happenstance. And this has nothing to do with whether or not you are on their side. So there is nothing remarkable about the absence of a reply.

Small organizations seldom have the luxury of specialization - they cannot afford to staff call centers with trained "account representatives" who are paid to take and respond to calls. Frankly I'd regard most of the things you mentioned as being of very low priority for a response, with the "timely follow-up on Challenge applications" as a medium priority.

~ Matt
 
And there are several good reasons for failing to be specific when it comes to odds or time limits
Let's focus on the former for the time being. If there had been a P=.001 standard for the preliminary test, don't you think Pavel would have been tested by now? If not, why not?
 
I seem to recall mentioning that I had very little (if anything) to do with the case, paying it only the most cursory attention because I enjoy matching my responses against Startz's considerably more educated and sensible ones as a benchmark of personal success. I am further not certain how this question is relevant to what you quoted?

You asked, if not a test of the paranormal, what purpose does it serve? Well, Pavel's claim was a paranormal claim, he sent a proposal to the MDC which fulfilled all the explicit requirements, and he was rejected. If Randi is interested in testing paranormal claims and he is presented with an opportunity for doing so, why would he reject that opportunity?

Certainly, which is why I declared it "science" and invoked terms like "long term" in an attempt to illustrate the difference. Mr. Randi is quite rightly not attempting to do pure science, he seems to attempt to be as scientific as possible with the resources available to both him and the applicant. Bear in mind when evaluating the Challenge that a magician is not an expert in science but rather in trickery - it is sufficient for a challenge offered by a magician to control out the forms of trickery that magicians would perceive, something that holds many similarities to science but is not perfectly equivalent.

I just think that it should be distinguished from science, which is about the process of discovery and following up interesting leads. The Challenge is more about demonstrating a claim to be fraudulent or mistaken, rather than any attempt at discovery.

Oh, certainly he has - his attempt to emend the rules to incorporate an odds standard based upon mathematical purity

The idea of mathematical purity seems to have come from you, rather than anyone else. The only person who has suggested odds based on mathematical purity is you. Rodney referred to "tests where the odds of success can be readily calculated", not to some unrelated distribution for the sake of mathematical purity.

rather than physical reality is doing just that - calculating odds not based upon the situation at hand. It is possible (as I also attempted to show) to design a claim where the odds are easy to calculate, thereby qualifying Mr. Rodney's arbitrary standard, and further to have those odds quite beatable by nonrandom guessing.

Except that nobody would have calculated the odds in your test to be 1:1000, nor would they have set success at the particular standard you chose, if they knew anything at all about probability. If the guidelines that Rodney proposed were followed, then the odds would be no more beatable by random guessing than any other 1:1000 guess.

Small organizations seldom have the luxury of specialization - they cannot afford to staff call centers with trained "account representatives" who are paid to take and respond to calls. Frankly I'd regard most of the things you mentioned as being of very low priority for a response, with the "timely follow-up on Challenge applications" as a medium priority.

~ Matt

I understand that. Which is why I specifically stated that the lack of a response cannot be taken to mean anything, since the JREF seems to be busy not responding to anyone.

Linda
 
Let's focus on the former for the time being. If there had been a P=.001 standard for the preliminary test, don't you think Pavel would have been tested by now? If not, why not?

No, because Pavel's test is not the kind that serves Randi's covert purposes. It would still have been rejected as too long.

Linda
 
fls said:
You asked, if not a test of the paranormal, what purpose does it serve? Well, Pavel's claim was a paranormal claim, he sent a proposal to the MDC which fulfilled all the explicit requirements, and he was rejected. If Randi is interested in testing paranormal claims and he is presented with an opportunity for doing so, why would he reject that opportunity?

I will not theorize nor answer questions on something I know nothing about.

fls said:
The idea of mathematical purity seems to have come from you, rather than anyone else. The only person who has suggested odds based on mathematical purity is you. Rodney referred to "tests where the odds of success can be readily calculated", not to some unrelated distribution for the sake of mathematical purity.

Oh, but we can readily calculate the odds of success in the case I proposed, and by Rodney's proposed emendation we would have been forced to use 1:1000 odds - it's not a choice under his restrictions, the JREF would have to do it.

fls said:
Except that nobody would have calculated the odds in your test to be 1:1000, nor would they have set success at the particular standard you chose, if they knew anything at all about probability. If the guidelines that Rodney proposed were followed, then the odds would be no more beatable by random guessing than any other 1:1000 guess.

First, he is by no means proposing a guideline - he is suggesting a rule change that must be followed in all cases. According to him, if you can readily calculate the odds of success - which we can - you must use 1:1000 odds. Whatever you might want to set them at is no good under his proposed emendations, you must use 1:1000 - and that is why I have disagreed with it ever since he proposed it.

Second, in the plan I suggested, I would not be randomly guessing - far from it. Guessing certainly, but guessing very much according to the numbers. Accordingly, I must restate my previous supposition that using odds based upon mathematical purity (which is exactly what you are doing if you refer to totally random guessing and my odds of beating the test thereon) has no merit in cases of nonrandom guessing and, in fact, is entirely harmful given Rodney's proposed restrictions.

~ Matt
 
Oh, but we can readily calculate the odds of success in the case I proposed, and by Rodney's proposed emendation we would have been forced to use 1:1000 odds - it's not a choice under his restrictions, the JREF would have to do it.

Yes, but the JREF would do so correctly, or some of us would step in and do it for them.

First, he is by no means proposing a guideline - he is suggesting a rule change that must be followed in all cases. According to him, if you can readily calculate the odds of success - which we can - you must use 1:1000 odds. Whatever you might want to set them at is no good under his proposed emendations, you must use 1:1000 - and that is why I have disagreed with it ever since he proposed it.

That is also why I disagreed with it. It is your reason for disagreeing with it, "to have those odds quite beatable by nonrandom guessing", which is incorrect. It is not beatable by non-random guessing.

Second, in the plan I suggested, I would not be randomly guessing - far from it. Guessing certainly, but guessing very much according to the numbers.

The probabilities you quoted have nothing to do with whether or not the guessing follows a random distribution. In fact, guessing is rarely random. Rather, they reflect the distribution of the sample about which the guesses are made. So if you are guessing hair-colour and the underlying distribution has only one black-haired person for every 50 people (which is what you would need for 3 correct guesses out of 10 trials to reflect less than 1:1000 odds), it doesn't matter if you guess "black hair" every single time. You can only be right one time in 50 on each trial.

Accordingly, I must restate my previous supposition that using odds based upon mathematical purity (which is exactly what you are doing if you refer to totally random guessing and my odds of beating the test thereon) has no merit in cases of nonrandom guessing and, in fact, is entirely harmful given Rodney's proposed restrictions.

~ Matt

You are incorrect. The odds do not refer to guessing, but to the distribution of the answers. If you have a test with equal numbers of A's, B's, C's and D's as answers, guessing C every time will still only get you 25% on the test.

Linda
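
For anyone who wants to check the arithmetic in the two examples above, here is a minimal sketch. It assumes each trial is independent, that the chance of a hit on every trial equals the base rate in the pool (1 in 50 for the hair-colour case), and that the multiple-choice answer key has exactly equal numbers of A, B, C and D. The function name `binomial_tail` and the key length of 100 are my own choices for illustration, not anything taken from the MDC rules.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Chance of at least k hits in n independent trials, each with hit chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hair-colour example: one black-haired person per 50, 10 trials, 3 hits needed.
# Guessing "black hair" every time still only hits with chance 1/50 per trial.
tail = binomial_tail(10, 3, 1 / 50)

# Multiple-choice example: equal numbers of A, B, C and D in the answer key
# (100 questions is an assumed length), with the guesser answering C every time.
answer_key = list("ABCD") * 25
score_always_c = sum(1 for answer in answer_key if answer == "C") / len(answer_key)

print(f"P(at least 3 of 10 at p = 1/50) = {tail:.6f}")
print(f"Below 1 in 1000: {tail < 1 / 1000}")
print(f"Score for always guessing C: {score_always_c:.0%}")
```

Under those assumptions the tail probability comes out a little under 1 in 1000, and the constant guesser scores exactly a quarter of the questions, which is the point being made about the distribution of the answers rather than the pattern of the guesses.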
 
fls,

I think what Matt is suggesting and what Rodney is ignoring is that while we can certainly calculate The Odds, those odds are based on the premise that the protocol ensures things are random. In some cases, that's simply not the case.

Take the IIG test for VisionFromFeeling where she was supposed to detect who was missing a kidney. She could see the people well enough to determine things like gender and especially age. Based on my research, it seems like each year kidney donations are roughly the same in number across the age range of about 18 to 55. Thus, the group of people born in 1954 will have donated a buttload more kidneys than the group born in 1991. Couple this with reading nonverbal cues like fidgeting, and I think the case can be made that the theoretical odds don't match with the reality of what's practical to assemble for a protocol.

By not having any set odds in the rules, the JREF can require one odds requirement for a Connie Sonne type protocol (no conceivable way for her to take an educated guess) and a more stringent requirement for a VFF type protocol so that the JREF can have a comfort zone, so to speak.

Matt, if I'm not stating your position correctly, I apologize. However, it's also my position, so no time was wasted.
 
fls,

I think what Matt is suggesting and what Rodney is ignoring is that while we can certainly calculate The Odds, those odds are based on the premise that the protocol ensures things are random. In some cases, that's simply not the case.

Take the IIG test for VisionFromFeeling where she was supposed to detect who was missing a kidney. She could see the people well enough to determine things like gender and especially age. Based on my research, it seems like each year kidney donations are roughly the same in number across the age range of about 18 to 55. Thus, the group of people born in 1954 will have donated a buttload more kidneys than the group born in 1991. Couple this with reading nonverbal cues like fidgeting, and I think the case can be made that the theoretical odds don't match with the reality of what's practical to assemble for a protocol.

Why not simply recognize when you can or cannot readily calculate odds?

By not having any set odds in the rules, the JREF can require one odds requirement for a Connie Sonne type protocol (no conceivable way for her to take an educated guess) and a more stringent requirement for a VFF type protocol so that the JREF can have a comfort zone, so to speak.

I'm not disagreeing that there are good reasons to maintain flexibility in protocol design, but in this case it would have made more sense to me to subject VFF to a well-designed test, rather than trying to put a bandaid on a poorly-designed test by making the threshold more stringent.

Linda
 
I'm not disagreeing that there are good reasons to maintain flexibility in protocol design, but in this case it would have made more sense to me to subject VFF to a well-designed test, rather than trying to put a bandaid on a poorly-designed test by making the threshold more stringent.
I couldn't agree more, and yet many on the VFF discussion thread want to blame Anita for the test deficiencies, even though she was the one who paid $1,000 to fly across the country and stay overnight to take what she thought was a well-designed test of her claim. See, for example, http://www.internationalskeptics.com/forums/showpost.php?p=5358380&postcount=1766
 
While I don't feel such an addition would harm the purpose any, I expect if JREF were to add such a clause and for any later protocol suggest the odds are not readily calculable (tests where the applicant may get significant hints from mundane observations), people (perhaps some already involved in this discussion) would then request the addition of a definition of "readily calculable". Ad infinitum.
 
While I don't feel such an addition would harm the purpose any, I expect if JREF were to add such a clause and for any later protocol suggest the odds are not readily calculable (tests where the applicant may get significant hints from mundane observations), people (perhaps some already involved in this discussion) would then request the addition of a definition of "readily calculable". Ad infinitum.

Well, it's a bit less vague than "proper observing conditions" or "feasible", I suppose. :)

Linda
 
Why not simply recognize when you can or cannot readily calculate odds?
We don't "simply recognize" that because the world is not black and white. We can calculate the odds if we make the assumption that everything is random. We can also acknowledge that things are not random but estimate that the factors working in favor of a claimant, whom we assume to be without a sooper power, still give us enough of an edge. At the end of the day it's just a bet: money vs failure.

I'm not disagreeing that there are good reasons to maintain flexibility in protocol design, but in this case it would have made more sense to me to subject VFF to a well-designed test, rather than trying to put a bandaid on a poorly-designed test by making the threshold more stringent.

I am going to start a thread in GS&P about the VFF protocol, and I hope you participate. Until then, you have to remember that a "well designed test" is actually a negotiation between two parties and subject to practical limitations. It's a value judgment to proceed with a challenge that you know is not as good as you'd like but still sufficient to demonstrate the point.

Take the age factor in kidney donations. Finding people missing a kidney and who are willing to volunteer their time is a pain in the ass. Ideally, for each target I would try to assemble a group of controls of the same sex with the same general physical characteristics including age. That's a lot of work.

Sometimes the claimant refuses to budge on certain issues (VFF insisted on 4.5 minutes per person), so again, it's a value judgment whether to concede the point or hold fast. At the end of the day it's still just a challenge, not a scientific test. If you're confident that your money is safe and the claimant is confident she can demonstrate her abilities, then many would argue that there's no reason not to go through with it.
 
We don't "simply recognize" that because the world is not black and white. We can calculate the odds if we make the assumption that everything is random.

I don't think that that is the assumption which is made. Or maybe I don't know what you mean by "random".

Anyway, you seemed able to come up with two clear examples earlier.

Linda
 
UncaYimmy said:
I think what Matt is suggesting and what Rodney is ignoring is that while we can certainly calculate The Odds, those odds are based on the premise that the protocol ensures things are random. In some cases, that's simply not the case.

Yes. Mathematical randomness and the controlled randomness inherent in a protocol design seldom perfectly intersect. Your example of the VFF test is a very good one to demonstrate the point - if the applicant knows anything more about the target pool or can discover this information it is no longer a perfectly random test. Certainly some elements of randomness are preserved, but the goal of any protocol agreement should be to ensure that the randomness preserved is related to the applicant's ability.

UncaYimmy said:
By not having any set odds in the rules, the JREF can require one odds requirement for a Connie Sonne type protocol (no conceivable way for her to take an educated guess) and a more stringent requirement for a VFF type protocol so that the JREF can have a comfort zone, so to speak.

A better way to state it might be that all odds requirements are based upon normal knowledge available to the applicant throughout the procedure. If we're testing for paranormal knowledge it makes sense to identify and exclude the normal throughout the process of protocol design. I wonder if this sort of information-oriented explanation might be a better way to clarify protocol designs in the future, specifically the denial of normal sources of relevant information through controls. If this is impractical for some reason (as it was in the VFF case), the odds requirement will be higher than it would be in other cases - your ability to find examples is quite exemplary.

~ Matt
 
I don't think that that is the assumption which is made. Or maybe I don't know what you mean by "random".

Anyway, you seemed able to come up with two clear examples earlier.

Linda

I'll try to clarify.

If the self-evident results have a set number of choices and/or answers, then you can pretty much calculate The Odds. In a perfect world these worst case odds indicate how likely a person flipping a coin or rolling dice is to pass the test. Let's call this scenario Blind Luck.

Unfortunately, the world's not perfect. There are going to be occasions where due to time, space, money, personalities, or whatever, the protocol is not going to whittle it down to Success by Ability vs Success by Blind Luck. Our odds calculation remains the same, but our confidence level is reduced.

It sounds to me like you're saying that if we don't have complete confidence that we have made it Ability vs. Blind Luck that we have no right to be discussing odds. Is that your stance?

I say that this is a challenge and not scientific research. We're perfectly entitled to say, "We acknowledge that in this protocol an ordinary person without a special ability will outperform an ordinary person flipping a coin." We can do that because the more important statement we're making is, "We're so confident that only a person who has an ability could pass this test that we're gonna put up our money."

The blind luck odds are just one factor to consider when setting up the challenge.
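
To put rough numbers on the gap between Blind Luck and an educated guesser, here is a small sketch. Everything in it is assumed for illustration: 10 independent trials, a Blind Luck hit chance of 1 in 50 per trial, and an educated guesser who manages 1 in 10 by using mundane cues such as age. None of these figures come from the VFF protocol or any other real test.

```python
from math import comb


def pass_chance(n: int, k: int, p: float) -> float:
    """Chance of at least k hits in n independent trials, each with hit chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


TRIALS = 10          # assumed number of trials
PASS_MARK = 3        # assumed hits needed to pass
P_BLIND = 1 / 50     # assumed per-trial chance for a coin-flipper
P_INFORMED = 1 / 10  # assumed per-trial chance for an educated guesser

blind = pass_chance(TRIALS, PASS_MARK, P_BLIND)
informed = pass_chance(TRIALS, PASS_MARK, P_INFORMED)

# Smallest pass mark that pushes the educated guesser back under 1 in 1000.
stricter_mark = next(
    k for k in range(TRIALS + 1) if pass_chance(TRIALS, k, P_INFORMED) < 1 / 1000
)

print(f"Blind Luck passes with chance {blind:.6f}")
print(f"Educated guesser passes with chance {informed:.6f}")
print(f"Pass mark needed to hold the educated guesser under 1:1000: {stricter_mark}")
```

With these made-up inputs the same pass mark that keeps Blind Luck under 1 in 1000 lets the educated guesser through far more often, and the pass mark has to rise to restore the margin. That is one way to read the idea of asking for more stringent odds when the protocol cannot screen out normal sources of information.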
 
I think the most important point is one I've made before - there's no reason to have a consistent probability of winning by chance, because as far as the applicant is concerned it's completely irrelevant. Remember, the odds we're talking about here are the odds of someone winning if they can't actually do what they claim. Obviously applicants believe they can do whatever it is, so they have no reason to care what the odds actually are. The odds of winning by chance only matter to the JREF, because they represent the chance of them losing their money without actually demonstrating anything interesting.

The only way these odds affect an applicant in any way is in the length of time, via the number of separate rounds, required for a test. However, since pretty much every ability claimed takes a different length of time per trial, and usually has different restrictions on how long people claim to be able to take part in a test for, there's no way of setting any consistent standard based on that. For example, Pavel_do was happy to go on for as long as the JREF might have wanted, while VisionFromFeeling claims to be unable to do more than two rounds before being unable to continue.

Perhaps the most important question to ask here is - is consistency really necessary? Does it matter if one applicant has to beat 1:500, another 1:1000 and another 1:10000? There can hardly be accusations of favouritism, since the JREF obviously doesn't think anyone will ever win, and anyone who actually did have an ability would win regardless of what their odds of winning by chance were. So why all the fuss here? Why should the JREF bother coming up with a hard answer on what odds they'll accept? As long as they are happy that the odds in each case are high enough that someone without an ability isn't going to win, what difference does it make?
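
As a rough illustration of how the odds translate into test length, here is a sketch. It assumes a protocol in which every round must be a hit and each round is an independent 50/50 choice for someone without an ability. Both of those are assumptions made for the example, as is the function name `rounds_needed`; real protocols vary.

```python
def rounds_needed(per_round_chance: float, target_odds: float) -> int:
    """Smallest number of all-hit rounds whose chance probability is at or below target_odds."""
    rounds = 0
    chance = 1.0
    while chance > target_odds:
        chance *= per_round_chance
        rounds += 1
    return rounds


# The three odds figures mentioned above, with an assumed 1-in-2 chance per round.
for one_in in (500, 1000, 10000):
    n = rounds_needed(0.5, 1 / one_in)
    print(f"1:{one_in} needs {n} consecutive hits at 1-in-2 per round")
```

Under those assumptions the difference between 1:500 and 1:10000 is only a handful of extra rounds, so how much it matters to an applicant depends mostly on how long each round takes and how many rounds they say they can manage.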
 
