Merged Odds Standard for Preliminary Test

Like I said, I don't disagree that there are valid reasons to maintain flexibility in setting the odds. I just don't think "let's pretend it's a solution even though it doesn't actually solve the problem" should be one of them.

Suppose you design a microwave that has issues with even heating. Do you spend a bunch of time and money redesigning the guts or do you add a simple carousel to turn the food?
 
Suppose you design a microwave that has issues with even heating. Do you spend a bunch of time and money redesigning the guts or do you add a simple carousel to turn the food?

That example doesn't make any sense, since adding a simple carousel will actually address the problem. What I wouldn't do is spend a bunch of time and money painting the walls red and claim that it helps.

Linda
 
That example doesn't make any sense, since adding a simple carousel will actually address the problem. What I wouldn't do is spend a bunch of time and money painting the walls red and claim that it helps.

I don't recall anyone suggesting anything as useless as painting walls red to fix a defective microwave emitter. I have heard suggestions that, instead of fixing a defective emitter, steps be taken to reduce the chances of it heating unevenly. A carousel doesn't actually fix the problem; it just lessens the effect under most circumstances. What happens when you try to heat an item that is too big to rotate?

That's why I thought it an appropriate analogy.

So, based on our discussions, what do you consider a "paint the walls red" type of suggestion and why?
 
I don't recall anyone suggesting anything as useless as painting walls red to fix a defective microwave emitter. I have heard suggestions that, instead of fixing a defective emitter, steps be taken to reduce the chances of it heating unevenly. A carousel doesn't actually fix the problem; it just lessens the effect under most circumstances. What happens when you try to heat an item that is too big to rotate?

That's why I thought it an appropriate analogy.

Okay.

So, based on our discussions, what do you consider a "paint the walls red" type of suggestion and why?

It goes back to this post:

http://www.internationalskeptics.com/forums/showthread.php?postid=5372144#post5372144

Holes in the protocol mean that there is an effect contributing to the results in addition to chance and paranormal abilities. We do not know the size of that effect. To "increase the odds" means that you have either increased the number of trials or you have increased the proportion of hits necessary to count as a success. Increasing the number of trials increases the likelihood that the effect of the holes will show up in the results (you have increased your power to detect an effect). Alternatively, if you don't know your hole effect size, you don't know to what extent increasing the effect size you are testing for (increasing the hit rate) changes or reduces the likelihood that the test will be passed.
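
To put a number on the "power" point, here is a quick Python sketch. The 0.60 hit rate is just a made-up stand-in for chance plus a hole, and the 1-in-1000 threshold is illustrative; the point is that, for the same hole, the chance of clearing a fixed threshold climbs as the number of trials goes up.

```python
from math import comb

def binom_tail(n, k, p):
    """P(at least k hits in n trials at per-trial hit rate p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def hits_needed(n, alpha=0.001, chance=0.5):
    """Smallest hit count whose chance-only p-value is at or below alpha."""
    for k in range(n + 1):
        if binom_tail(n, k, chance) <= alpha:
            return k

# Hypothetical 'hole': the per-trial hit rate is 0.60 instead of the chance rate 0.50.
for n in (20, 100, 500):
    k = hits_needed(n)
    print(f"{n:3d} trials: {k} hits needed; "
          f"probability the hole alone clears that bar = {binom_tail(n, k, 0.60):.3f}")
```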

Linda
 
It goes back to this post:

http://www.internationalskeptics.com/forums/showthread.php?postid=5372144#post5372144

Holes in the protocol mean that there is an effect contributing to the results in addition to chance and paranormal abilities. We do not know the size of that effect. To "increase the odds" means that you have either increased the number of trials or you have increased the proportion of hits necessary to count as a success. Increasing the number of trials increases the likelihood that the effect of the holes will show up in the results (you have increased your power to detect an effect). Alternatively, if you don't know your hole effect size, you don't know to what extent increasing the effect size you are testing for (increasing the hit rate) changes or reduces the likelihood that the test will be passed.

You still did not give an example of painting the walls red.

You're in medical research, right? I'm sure your studies have holes in them. In a drug trial, for example, is there an eyewitness to ensure that each subject actually swallows the pills at the specified time intervals? If so, does someone verify that they don't immediately go to the bathroom and puke it up? Does somebody ensure that no other medicines are taken without proper documentation? Is there a financial penalty or any other method for ensuring that reports of subjective side-effects are truthful?

To me those are all holes. I trust the medical establishment takes these issues into account when deciding what effect sizes are deemed significant. I am not seeing the difference with these challenge protocols acknowledging their imperfection and wanting higher odds to give themselves more confidence.

One major difference from medical trials is that a researcher can reject any subject who doesn't agree to the rules. In a challenge, the organization and the subject have to negotiate something acceptable to both parties, and this typically entails compromises. Which is to say the organization trades possible gaps for more significant odds.
 
You still did not give an example of painting the walls red.

Well, I explained how the solution might not actually make the IIG money safer. Wouldn't that serve as an example?

You're in medical research, right? I'm sure your studies have holes in them. In a drug trial, for example, is there an eyewitness to ensure that each subject actually swallows the pills at the specified time intervals? If so, does someone verify that they don't immediately go to the bathroom and puke it up? Does somebody ensure that no other medicines are taken without proper documentation? Is there a financial penalty or any other method for ensuring that reports of subjective side-effects are truthful?

To me those are all holes. I trust the medical establishment takes these issues into account when deciding what effect sizes are deemed significant. I am not seeing the difference with these challenge protocols acknowledging their imperfection and wanting higher odds to give themselves more confidence.

The difference is that medical trials have control groups. And this is a big deal when it comes to addressing this issue. Because the things that you describe will of course influence the result. But they will not influence the result differently between the groups, so they are effectively cancelled out. Because you don't have a control group, you don't have a way to cancel out the effect of any holes.

Linda
 
Well, I explained how the solution might not actually make the IIG money safer. Wouldn't that serve as an example?
Honestly, I'm not clear at all what specific choice the IIG made that you think was painting a wall red to fix a microwave. Can you give a "they did this when they could have done that" example?


The difference is that medical trials have control groups. And this is a big deal when it comes to addressing this issue. Because the things that you describe will of course influence the result. But they will not influence the result differently between the groups, so they are effectively cancelled out. Because you don't have a control group, you don't have a way to cancel out the effect of any holes.

I disagree. Grapefruit juice is known to affect some antibiotics, right? If not, let's just pretend it does. We tell both groups not to drink grapefruit juice, but very clearly we have no way of knowing if they do or not. How does having a control group help that situation?

Or let's say you're testing a heart medication and tell people not to take aspirin because we know it can prevent heart attacks (humor me if I'm wrong). The medication seems to cause headaches, so the test group has a higher percentage of people with an incentive to break protocol and take aspirin.

We also know that it's virtually impossible to control for everything between the two groups. There's no guarantee that both groups will have the same percentage of people with arthritis or stressful jobs, so maybe one group is more likely because of that to break protocol and take aspirin.

To this layman that's one reason why you work with confidence levels and p-values. You use the statistics to give yourself a margin of error. And that's what groups like the IIG and JREF do, only they do it because for various reasons they can't construct "perfect" protocols.
 
I disagree. Grapefruit juice is known to affect some antibiotics, right? If not, let's just pretend it does. We tell both groups not to drink grapefruit juice, but very clearly we have no way of knowing if they do or not. How does having a control group help that situation?
Why would we want two groups not drinking grapefruit juice? The control group would be the one not drinking grapefruit juice, while the other group was drinking the juice. If the groups are big enough, there will be a difference between the two groups, even if some people cheat.
 
I disagree. Grapefruit juice is known to affect some antibiotics, right? If not, let's just pretend it does. We tell both groups not to drink grapefruit juice, but very clearly we have no way of knowing if they do or not. How does having a control group help that situation?

Because with random group assignment, those who drink grapefruit juice will be distributed into both groups. The 'effect' of drinking grapefruit juice will be present in both groups.

Or let's say you're testing a heart medication and tell people not to take aspirin because we know it can prevent heart attacks (humor me if I'm wrong). The medication seems to cause headaches, so the test group has a higher percentage of people with an incentive to break protocol and take aspirin.

We also know that it's virtually impossible to control for everything between the two groups. There's no guarantee that both groups will have the same percentage of people with arthritis or stressful jobs, so maybe one group is more likely because of that to break protocol and take aspirin.

But there's no particular reason to think that people with arthritis or stressful jobs or who take aspirin will be distributed differently between the two groups. That is, if any of those things have an effect on the outcome, they will affect the outcome in both groups. If there are differences in the extent to which randomly distributed characteristics can affect the outcome in those who are taking the active treatment, then this falls under the treatment effect, which is the effect of interest.

The really valuable part of this is that we know the distribution of any of these characteristics came about due to random assignment. Which means that our statistics which are based on random sampling actually apply to them. So things like p-values and confidence intervals do actually accurately describe our confidence, instead of the situation in the IIG test where a distribution based on random sampling was applied to what was known to be a non-random distribution.

To this layman that's one reason why you work with confidence levels and p-values. You use the statistics to give yourself a margin of error. And that's what groups like the IIG and JREF do, only they do it because for various reasons they can't construct "perfect" protocols.

My point is simply that you don't know the extent to which the confidence intervals and p-values from a known distribution can be transferred to a distribution which is known to be different. Simply ignoring that problem and pretending that alterations to the margin of error based on the known distribution will take care of it is merely wishful thinking.

Linda
 
Because with random group assignment, those who drink grapefruit juice will be distributed into both groups. The 'effect' of drinking grapefruit juice will be present in both groups.

Bolding mine. You say they "will be" distributed among both groups. You don't know that. You're banking on the distribution of grapefruit juice drinkers and protocol violators to be close enough not to skew the results.

But there's no particular reason to think that people with arthritis or stressful jobs or who take aspirin will be distributed differently between the two groups.
Yes, there is a reason to believe they will be distributed unevenly. It's basic statistics. I'm sure you could tell me with degrees of confidence the likelihood of 50 of those people in a group of 200 being distributed 25-25, 15-35, 5-45 and 0-50.

The only difference I see between these situations and challenge situations is that you can quantify the risk statistically. In the challenges people make judgments regarding weaknesses in design. They figure, "If the claimant only sees their backs while they sit still in a chair, there's a 'very small' chance there might be some information the claimant can use to better the odds beyond pure chance. Therefore, instead of our usual threshold of 1,000 to 1 we are going to use 1,728 to 1."

My point is simply that you don't know the extent to which the confidence intervals and p-values from a known distribution can be transferred to a distribution which is known to be different. Simply ignoring that problem and pretending that alterations to the margin of error based on the known distribution will take care of it is merely wishful thinking.

I'm saying you do it all the time in the medical field because your protocols are often far less strictly enforced than challenge protocols. There are potentially a lot more things people can do to violate protocol, and while you have reasonable estimates of how they are distributed among the groups, you do not know the actual rates. You compensate for this with confidence levels and statistical practices.

I don't have a problem with the practice in the medical field or in the challenges. What I can't understand is why you see one as superior to the other. If anything, the challenge protocols are superior because the tests are so much shorter and more tightly controlled.
 
Bolding mine. You say they "will be" distributed among both groups. You don't know that. You're banking on the distribution of grapefruit juice drinkers and protocol violators to be close enough not to skew the results.

Right, because we have very detailed knowledge about the distribution of random samples.

Yes, there is a reason to believe they will be distributed unevenly. It's basic statistics. I'm sure you could tell me with degrees of confidence the likelihood of 50 of those people in a group of 200 being distributed 25-25, 15-35, 5-45 and 0-50.

Exactly. We can describe exactly how confident we can be about a distribution based on random sampling. It's powerful information.
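
For instance (a quick sketch, assuming your 200 subjects are split at random into two groups of 100, with 50 of them having arthritis), the chance of the lopsided splits you listed can be written down exactly:

```python
from math import comb

def split_prob(k, trait=50, total=200, group=100):
    """Probability that exactly k of the 'trait' subjects land in group A
    when 'group' of the 'total' subjects are assigned to A at random."""
    return comb(trait, k) * comb(total - trait, group - k) / comb(total, group)

# The splits mentioned above: 25-25, 15-35, 5-45 and 0-50
for k in (25, 15, 5, 0):
    print(f"{k}-{50 - k}: probability {split_prob(k):.2e}")

# How tightly random assignment concentrates the split around an even 25-25:
print("P(between 20 and 30 in one group):",
      round(sum(split_prob(k) for k in range(20, 31)), 3))
```

The lopsided splits are astronomically unlikely, and that likelihood is exactly what the confidence statements are built on.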

The only difference I see between these situations and challenge situations is that you can quantify the risk statistically.

Yes. You can quantify the risk. So you can quantify whether or not you have adequately accounted for the risk.

In the challenges people make judgments regarding weaknesses in design. They figure, "If the claimant only sees their backs while they sit still in a chair, there's a 'very small' chance there might be some information the claimant can use to better the odds beyond pure chance. Therefore, instead of our usual threshold of 1,000 to 1 we are going to use 1,728 to 1."

How are you going to change the odds? If you do it by increasing the number of trials, you make it easier for the claimant to pass using their "some information". If you do it by increasing your threshold, what makes you confident that a 1.728-fold increase is adequate given that claimants typically describe abilities (under conditions where they don't think that they are using normal sensing) that would represent a 5 to 10-fold increase?

I'm saying you do it all the time in the medical field because your protocols are often far less strictly enforced than challenge protocols. There are potentially a lot more things people can do to violate protocol, and while you have reasonable estimates of how they are distributed among the groups, you do not know the actual rates. You compensate for this with confidence levels and statistical practices.

We actually do a bunch of other stuff like measure how they are distributed, measure the resultant effect size, randomize on variables which may have an effect, give everyone the same intervention, etc. None of that really matters, though, since (as you mentioned a gazillion times) challenges are performed with different goals than scientific tests.

I don't have a problem with the practice in the medical field or in the challenges. What I can't understand is why you see one as superior to the other. If anything, the challenge protocols are superior because the tests are so much shorter and more tightly controlled.

I didn't say that one is superior to the other. I pointed out that having a control group is an incredibly powerful tool when you are unable to eliminate bias - something which just happens to be of interest to both scientific study and challenges.

Linda
 
Yes. You can quantify the risk. So you can quantify whether or not you have adequately accounted for the risk.
Well, sorta. You can quantify how you *might* have adequately accounted for the risk. Look at poker. We can calculate absolute pot odds without a problem and determine that a heavy bet with a full house is a good idea. If the other guy has a straight flush, I did not "adequately" account for this situation. I made the "right" decision but I did the "wrong" thing because I lost the hand.

How are you going to change the odds? If you do it by increasing the number of trials, you make it easier for the claimant to pass using their "some information".
That's an assertion with no evidence and, quite frankly, it's rather counterintuitive unless you mean the likelihood of passing *one* trial. I can't believe you mean that since you *always* increase the likelihood of passing one trial when adding additional trials.

Suppose the "gap" we don't find practical to close is cheating by some form of signaling. With two trials, I stand virtually no chance of determining a pattern in the environment (finding a signal within the noise). With 2,000 trials you can bet I'm going to find that signal.

Suppose the gap is that there's some little thing a decoy might do to reveal that he's not the target. The step I would need to take to entirely prevent this is too expensive. I estimate that there's only a 1 in 100 chance that this might happen. So, the claimant has a 99 in 100 chance of having three trials with a 1 in 12 chance and a 1 in 100 chance of having three trials with a 1 in 10 chance (I'm on the 2 kidney thing).

If I add one more trial, then my worst-case scenario is four trials of a 1 in 10 chance. Those odds are more difficult than my original best-case scenario of three trials of 1 in 12 odds.
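
Spelling that arithmetic out (a quick sketch; it assumes that "passing" means hitting on every trial):

```python
# Hypothetical numbers from the kidney example above; "passing" is assumed
# to mean hitting on every trial.
best_three = (1 / 12) ** 3    # no slip-up: three trials at 1-in-12 each
slip_three = (1 / 10) ** 3    # slip-up scenario: three trials at 1-in-10 each
worst_four = (1 / 10) ** 4    # worst case after adding a trial: four at 1-in-10

# Overall pass chance with three trials, given a 1-in-100 chance of the slip-up
overall_three = 0.99 * best_three + 0.01 * slip_three

print(f"three trials, best case : {best_three:.6f}")     # ~0.000579
print(f"three trials, overall   : {overall_three:.6f}")  # ~0.000583
print(f"four trials, worst case : {worst_four:.6f}")     # 0.000100
```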

Clearly I have demonstrated that your assertion is not true in all cases. So, under what scenarios will adding trials increase the likelihood of passing? Please be specific.
 
Well, sorta. You can quantify how you *might* have adequately accounted for the risk. Look at poker. We can calculate absolute pot odds without a problem and determine that a heavy bet with a full house is a good idea. If the other guy has a straight flush, I did not "adequately" account for this situation. I made the "right" decision but I did the "wrong" thing because I lost the hand.

That's not really a good example, since a simple calculation of the odds of any particular poker hand doesn't take into account the conditions under which you encounter those hands.

That's an assertion with no evidence

It is basic statistics. I have simply described what "power" means in terms of your ability to demonstrate an effect.

and, quite frankly, it's rather counterintuitive unless you mean the likelihood of passing *one* trial.

It is somewhat counter-intuitive, which I suspect is why it usually gets little to no consideration in protocol discussions. It is not the likelihood of passing one trial. It is the likelihood of passing your threshold for success with a given effect size.

I discussed this in Pavel's thread with examples (http://www.internationalskeptics.com/forums/showthread.php?postid=5032589#post5032589).

Let's take an effect size of 0.80, which represents a 'large' effect size. For trials with p=0.50, this translates to the following numbers of hits for increasing trial numbers:

1/1, 9/10, 22/25, and 43/50.

The p-values for each of those results, if due to chance, are:

0.50, 0.01, 0.0001, and 0.0000001.

The number of hits necessary to exceed a standard of 0.001 would be:

N/A, 10/10, 21/25, and 37/50.

Which translates to success rates of:

N/A, 100%, 84%, and 74%.

Which reflects effect sizes of:

N/A, 1.571, 0.748, and 0.500.

While the person is able to accomplish the same thing in each set of trials, whether or not they will be able to exceed the threshold depends upon the total number of trials. In other words, the larger the total number of trials, the lower their success rate needs to be in order to pass, and smaller and smaller effect sizes (i.e. the effect of 'holes') will allow them to pass.
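
For anyone who wants to check those figures, here's a short Python sketch that reproduces them, using one-sided binomial tails against chance and Cohen's h for the required success rates (which matches the effect sizes listed), with the 0.001 standard mentioned above:

```python
from math import comb, asin, sqrt

def tail(n, k, p=0.5):
    """One-sided chance p-value: P(at least k hits in n trials at rate p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def cohens_h(p1, p2=0.5):
    """Cohen's effect size h for two proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))

hits_for_large_effect = {10: 9, 25: 22, 50: 43}   # the hit counts listed above
for n, hits in hits_for_large_effect.items():
    passing = next(k for k in range(n + 1) if tail(n, k) <= 0.001)
    print(f"n={n:2d}: p-value for {hits}/{n} = {tail(n, hits):.1e}; "
          f"passing 0.001 needs {passing}/{n} "
          f"({passing / n:.0%}, h = {cohens_h(passing / n):.3f})")
```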

I can't believe you mean that since you *always* increase the likelihood of passing one trial when adding additional trials.

Suppose the "gap" we don't find practical to close is cheating by some form of signaling. With two trials, I stand virtually no chance of determining a pattern in the environment (finding a signal within the noise). With 2,000 trials you can bet I'm going to find that signal.

Suppose the gap is that there's some little thing a decoy might do to reveal that he's not the target. The step I would need to take to entirely prevent this is too expensive. I estimate that there's only a 1 in 100 chance that this might happen.

But you realize that you are pulling this number out of your ass, right? What if it's one in ten or one in two?

So, the claimant has a 99 in 100 chance of having three trials with a 1 in 12 chance and a 1 in 100 chance of having three trials with a 1 in 10 chance (I'm on the 2 kidney thing).

If I add one more trial, then my worst case scenario is four trials of a 1 in 10 chance. Those odds are more difficult than my original best-case scenario of of three trials of 1 in 12 odds.

This works if the number you have pulled out of your ass is reasonable. How would you go about figuring out whether it is or not?

The amount of residual bias present in good-quality RCTs is estimated to be 0.10. In good-quality studies without control groups, it is estimated to be 0.20. Ray Hyman looked at the amount of bias which may be present in the ganzfeld studies (as in Anita's test, these involve making guesses whilst attempting to remove any possible sources of normal information) and found that it may be at least 0.30. What these numbers indicate is the proportion of studies that should come out negative but will nonetheless appear positive. Now, as you can see, a bias of 0.10 utterly dwarfs the effect of playing around with the odds. If you are worried about the one false-positive result due to chance in 1000 tests, this will be dwarfed by the 100 false-positive results due to bias - an effect that won't even be touched by the removal of that one false positive due to chance.

Now, under the conditions of Anita's test, some of those sources of bias will not be present - the effect of multiple testing, flexibility in specifying outcomes (at least for the purpose of passing the test), and publication bias will not be present. You've never taken other biases into consideration, like the bias introduced by the asymmetry in the location of the missing kidney and the asymmetry in her guesses, or the randomization (which isn't mentioned in the protocol). But mostly we worry about the effect of her picking up subconscious or conscious clues from the subjects and examiners. And we've tried to mitigate this through partial blinding. So how successful are we at reducing bias? Is it a thousand-fold less? Is it ten-fold less?

Other situations, where people claim to have eliminated bias (parapsychology studies, claimants performing informal tests), when compared to subsequent testing, can show effect sizes of 0.20 or 0.50 or more due to the sort of bias we are worried about with Anita. And as I illustrated in my example above, increasing numbers of trials allow for smaller and smaller effect sizes to lead to a result which passes the criteria for a successful test.

Linda

References:
Ioannidis JPA. Why Most Published Research Findings Are False. PLoS Medicine, 2005.
Cohen J. Statistical Power Analysis for the Behavioral Sciences.
Hyman R. Commentary on John P.A. Ioannidis' 'Why Most Published Research Findings Are False'. Skeptical Inquirer, Vol. 30, March-April 2006.
 
That's not really a good example, since a simple calculation of the odds of any particular poker hand doesn't take into account the conditions under which you encounter those hands.
We're getting sidetracked, but you obviously don't play poker. Do a Google search for pot odds for poker, and you'll find all sorts of handy charts.

In Texas hold 'em it's not terribly difficult to determine how many "outs" you have to draw to certain hands. From there it's not difficult to look at the size of the pot in terms of how many "bets" there are. If it's 5 to 1 to make your hand and 4 to 1 bets in the pot, you're far less inclined to make the bet than if it's 10 to 1 in the pot.

You also know how likely your hand is to take the pot. Sometimes you absolutely know you might draw to the best hand (or already have it) because you can see 3, then 4, then 5 of the possible 7 cards each opponent has, and you know which two cards they do not have.

My point in bringing that up is that making a bet on a 5 to 1 draw with 12 to 1 pot odds is the "correct" decision but you still might lose the hand. Thus "accounting" for probabilities doesn't mean controlling the outcome.

It is somewhat counter-intuitive, which I suspect is why it usually gets little to no consideration in protocol discussions. It is not the likelihood of passing one trial. It is the likelihood of passing your threshold for success with a given effect size.

Ah, I see. So you're concerned about effect sizes and the organization not taking that into account. This can be handled just fine in any individual case. I haven't seen any cases where this wasn't properly addressed by the organization. Have you?

But you realize that you are pulling this number out of your ass, right? What if it's one in ten or one in two?
The inability to determine something with precision is not the same as the inability to determine it with degrees of confidence. This is what I am trying to get you to understand about your studies with control groups. For example, you cannot say with any certainty the likelihood of people breaking protocol. You can say that in a random sample you expect with Y degree of confidence N people with arthritis. You can say with Y degree of confidence what the distribution of arthritis sufferers will be between the two groups. But you don't *know* for sure, nor do you know how likely they will be to break protocol and pop some aspirin.

The world of challenges is a lot more messy, especially when human targets are involved. We make observations of the world and discuss the possibilities. The numbers we assign are not pulled out of our asses. They are arrived at through discussion by intelligent people who are very cautious.

I don't need to crunch a bunch of numbers to say I'm more likely to get struck by a car at night while wearing dark clothes and crossing a busy street as opposed to crossing a quiet street at noon.

Where challenges have an advantage is that they can strictly enforce protocol. They have zero doubt that the protocol will be executed 100% correctly because they have people watching at the time, and they review the video later.

The amount of residual bias present in good-quality RCTs is estimated to be 0.10. In good-quality studies without control groups, it is estimated to be 0.20.
Estimated? You mean you pulled it out of your ass.

I'm done with this. I have repeatedly asked you for specific examples regarding gaps/holes, but I have received none. Next time there is a protocol discussion, I hope you'll come wrestle with the pigs. You could have joined the thread on the IIG protocol for VFF, but I think you'd rather argue in the abstract. I'm the opposite.
 
We're getting sidetracked, but you obviously don't play poker. Do a Google search for pot odds for poker, and you'll find all sorts of handy charts.

I don't think you understood what I meant, but I agree that it's a sidetrack.

Thus "accounting" for probabilities doesn't mean controlling the outcome.

I wasn't suggesting that the outcomes are controlled - that something has 1000 to one odds against happening due to chance doesn't mean that you can guarantee that it won't happen.

Ah, I see. So you're concerned about effect sizes and the organization not taking that into account. This can be handled just fine in any individual case. I haven't seen any cases where this wasn't properly addressed by the organization. Have you?

I have repeatedly pointed out that you have no idea what the effect size is for Anita's ability to pick up subconscious clues, which doesn't seem to be a "just fine" situation. :)

Pavel's protocol is another example where the JREF paid little or no attention to effect size in order to dismiss Pavel's application. Although, some might call that "just fine".

The inability to determine something with precision is not the same as the inability to determine it with degrees of confidence.

It's just that in this case you haven't done either. It's not just a matter of an imprecise measure, it's an inability to narrow it down to something smaller than 3 or 4 orders of magnitude.

This is what I am trying to get you to understand about your studies with control groups. For example, you cannot say with any certainty the likelihood of people breaking protocol. You can say that in a random sample you expect with Y degree of confidence N people with arthritis. You can say with Y degree of confidence what the distribution of arthritis sufferers will be between the two groups. But you don't *know* for sure, nor do you know how likely they will be to break protocol and pop some aspirin.

Well, we can know because we can collect that information from the people in the study. But for the purpose of illustration, let's pretend that we don't. Say that we are interested in whether or not a drug prevents heart attacks, so the proportion of arthritis sufferers who are taking aspirin becomes important, because this will also influence the heart attack rate. If we look at the intervention group (the group that took the drug), we see that they have a lower rate of heart attack than the general population, but we have no idea how much of that was due to the use of aspirin. All of the effect could be due to aspirin, half of the effect, or none of the effect - we have no clue as to the effect size of the intervention. But, if we have a placebo control group, we can accurately describe our confidence that whatever effect aspirin-taking arthritis sufferers have on one group, they have the same effect on the other group. So without having to know anything at all about the size of the aspirin effect, we can reasonably confidently state what effect size was due to the intervention.
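
A toy simulation of that point (all numbers invented purely for illustration: a baseline heart-attack risk, an aspirin effect, and a drug effect; the only thing that matters is that the hidden aspirin use is spread across the arms at random):

```python
import random

random.seed(1)

def simulate(n_per_arm=100_000, base_risk=0.10, aspirin_rr=0.7,
             drug_rr=0.8, aspirin_rate=0.25):
    """Return the observed heart-attack rates in the placebo and drug arms.
    Unreported aspirin use is assigned at random, independent of arm."""
    rates = []
    for on_drug in (False, True):
        events = 0
        for _ in range(n_per_arm):
            risk = base_risk
            if random.random() < aspirin_rate:   # hidden aspirin use
                risk *= aspirin_rr
            if on_drug:
                risk *= drug_rr
            events += random.random() < risk
        rates.append(events / n_per_arm)
    return rates

placebo, drug = simulate()
print(f"placebo arm: {placebo:.3f}   drug arm: {drug:.3f}   "
      f"ratio: {drug / placebo:.2f}")
# The aspirin 'contamination' lowers both rates equally, so the ratio
# still lands near the true drug effect of 0.8.
```

Neither arm on its own tells you anything about the size of the aspirin effect, but the comparison between the arms recovers the drug effect anyway.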

The world of challenges is a lot more messy, especially when human targets are involved. We make observations of the world and discuss the possibilities. The numbers we assign are not pulled out of our asses. They are arrived at through discussion by intelligent people who are very cautious.

So what data did you use to estimate the effect size?

I don't need to crunch a bunch of numbers to say I'm more likely to get struck by a car at night while wearing dark clothes and crossing a busy street as opposed to crossing a quiet street at noon.

Where challenges have an advantage is that they can strictly enforce protocol. They have zero doubt that the protocol will be executed 100% correctly because they have people watching at the time, and they review the video later.

Yes. As I mentioned, they are able to eliminate some of those factors which contribute to bias.

Estimated? You mean you pulled it out of your ass.

I gave you the references for the numbers. They were based on measured examples.

I'm done with this. I have repeatedly asked you for specific examples regarding gaps/holes, but I have received none.

I'm sorry. I think I misunderstood your request. I thought you were asking for specific examples of how effect size, trial number and threshold change the likelihood of a claimant passing the test when they don't have paranormal abilities.

You are asking about gaps/holes in the protocol for Anita's test? Okay, let's start with one I mentioned in my last post. There is no randomization described in the protocol. There is no description of how the subjects were collected or arranged.

Next time there is a protocol discussion, I hope you'll come wrestle with the pigs.

I have been involved in a number of protocol discussions, both for the Challenge and for more informal tests. I didn't refer to the people I was talking to as 'pigs' though. ;)

You could have joined the thread on the IIG protocol for VFF, but I think you'd rather argue in the abstract. I'm the opposite.

It depends. I am interested in the concrete when it is meaningful, such as when we are working towards a real test. But in this case, the discussions with this claimant have been particularly acrimonious, and they quickly went down a path that I didn't think was useful, so I chose to stay out of it. And judging by the response I've received from you, my input would have been undesirable anyway.

Linda
 
Humph :( I feel like I've been two-timed; this discussion bears a striking resemblance to one I was having in another forum :D

I'm curious what the "bias" and "effect size" are. I'll have to read up on some of these posts.
 
