
Statistical significance

Thank you all for your informative replies. I think I know where I am going now. What I am doing is writing about an article I found about "guided imagery" being used as a pain relief method. The article states:

I understand why that is bull****, but I was having trouble putting it into words in a way I thought others would understand. You guys have definitely helped me put my thoughts in order. Any further commentary would still be greatly appreciated.


One thing to keep in mind is the sample size. If the sample size is small, the effect may well be real even though it isn't statistically significant.
 
It would help if you could provide more details or a link to the article. Were the subjects randomly assigned to the treatment vs. no treatment groups? Was the pain rating on the typical 0 to 10 scale? How many subjects?
But, in any case, no competent scientist would ever state that statistically insignificant results, even at the less rigorous level of .05, were clinically significant.
Quite often, the opposite is true. With a large number of subjects, statistical significance can be obtained with trivial effects.

Hi Jeff,

I didn't link to the article because the quote I supplied is all they say about the actual study that was allegedly undertaken, other than that the study group was a measly 17 children.

Here is the link: http://www.livescience.com/healthday/535689.html
 
If I get into my car and drive to the store the chances of me having a wreck are 50% because either I have a wreck or I don't so it's 50/50.

This bit is wrong. According to this argument there is a 50/50 chance of throwing a 6 with a normal die, because either it will happen or it won't.

The chances of me having a wreck on the way home is also 50% either I have a wreck or I don't.

You multiply independent probabilities rather than adding them. So if the chance of not having a crash on the road between the shop and your house were 1/2, the chance of not having a crash on either leg of the trip would be 1/4.
So you'd have a 75% chance of crashing on the way there or back, if the 50% figure were correct.

Entertainingly, you misstated the conclusion, which probably should have been "there is a 100% chance that the car will crash on the trip." Your conclusion that afterwards you will "either have a wreck or not" is actually correct.

Edit: Abridged version: I agree with Beth and Jeff.
 
I don't want to start another thread for this question so I'll ask it here...

Can someone give me the mathematical reasons why this argument is flawed?


If I get into my car and drive to the store the chances of me having a wreck are 50% because either I have a wreck or I don't so it's 50/50. The chances of me having a wreck on the way home is also 50% either I have a wreck or I don't.

I know there are two main problems with this logic. Firstly, probability isn't calculated that way, and secondly, it isn't added up that way (on the way to and from the store). But I don't remember the exact mathematics behind how it's actually done or why this is fallacious.


Can anyone refresh my memory?
What some people reason is that, after you are back from the store (assuming you ever get back ;)), pre-trip probabilities are meaningless. Either you had a wreck or you did not. However, you could logically conclude that the pre-trip probability of a wreck on the way to and from the store is 50% only if an analysis that factors in all known variables (such as prior number of wrecks on the way to and from the store, traffic conditions, weather conditions, etc.) indicates that there is an equal chance of having a wreck or not having a wreck. If it does, you should consider walking to the store or staying home . . .
 
One thing to keep in mind is the sample size. If the sample size is small, the effect may well be real even though it isn't statistically significant.
Sure. But the experiment in question doesn't provide much evidence in favor of its reality. Nor, of course, against its reality. A small experiment just isn't too informative either way, and we're left more or less where we started.
 
I don't want to start another thread for this question so I'll ask it here...

Can someone give me the mathematical reasons why this argument is flawed?

If I get into my car and drive to the store the chances of me having a wreck are 50% because either I have a wreck or I don't so it's 50/50. The chances of me having a wreck on the way home is also 50% either I have a wreck or I don't.

I know there are two main problems with this logic. Firstly, probability isn't calculated that way, and secondly, it isn't added up that way (on the way to and from the store). But I don't remember the exact mathematics behind how it's actually done or why this is fallacious.

Can anyone refresh my memory?

You can use "either you will get into a wreck or you won't", but you actually have to figure out the frequency with which you get into wrecks and the frequency with which you won't.

If you want to combine the chance from the trips to and from the store, you are looking at "the chance of having a wreck on the way to the store and not on the way back" or "the chance of having a wreck on the way back from the store and not on the way there" or "both on the way there and on the way back". The "or" is a good indication that the probabilities are additive (the scenarios are mutually exclusive, so you can add them). So you need to figure out the probability of each scenario and add them up.

Let's look at the chance of having a wreck on the way to the store and not on the way back. The use of "and" is a good indication that the probabilities are multiplicative. So you have the chance of having a wreck (50% using your example) times the chance of not having a wreck (50%), which comes out to 25%. The chance for each of the other two scenarios is also 25%. Adding it all together, you have a 75% chance that you will have at least one wreck on the way to and from the store (or a 50% chance of only one wreck).

There is an easier way to solve this particular problem (1 - (the chance of not having a wreck on the way to the store and not having a wreck on the way back)), but I did it this way to illustrate the difference between when you add probabilities and when you multiply probabilities.
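This enumeration can be checked in a few lines of Python. The sketch below just takes the argument's (absurd) 50% figure at face value to reproduce the arithmetic:

```python
from itertools import product

p_wreck = 0.5  # the 50% figure from the argument, used only for illustration

# Enumerate all four combinations of (wreck on the way there, wreck on the way back).
at_least_one = 0.0
exactly_one = 0.0
for there, back in product([True, False], repeat=2):
    # Independent legs, so multiply the per-leg probabilities.
    p = (p_wreck if there else 1 - p_wreck) * (p_wreck if back else 1 - p_wreck)
    if there or back:
        at_least_one += p  # mutually exclusive scenarios, so add them
    if there != back:
        exactly_one += p

print(at_least_one)            # 0.75 -> 75% chance of at least one wreck
print(exactly_one)             # 0.5  -> 50% chance of exactly one wreck
print(1 - (1 - p_wreck) ** 2)  # 0.75 -> the easier shortcut gives the same answer
```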

Linda
 
Every experiment is really a comparison between two hypotheses -- an "experimental" hypothesis and a "null" hypothesis.

really? every experiment? how does that work if I am looking for a number, say if I am measuring the speed of light (before it was set equal to 1, of course)?

or if I am a Bayesian estimating how far my car will go on a tank of gas, and end up with a posterior distribution?

that is not to say that there aren't many cases where the dual-hypothesis structure is extremely useful! but the original post also asked:

...what expressions of error mean (the term escapes me right now, but I'm talking about what is expressed as a +/- range of accuracy in experimental results).
 
You can use "either you will get into a wreck or you won't", but you actually have to figure out the frequency with which you get into wrecks and the frequency with which you won't.

Linda
so how do I do that on my first trip to the store?
 
Assume it was just 3 taste trials-- you got 2 right and one wrong.

The probability of getting 2 right just by guessing (as would be the case if the null were indeed true) is 3/8 or about .24

How do you decide the figure 3/8 please?
 
Quote:
Assume it was just 3 taste trials-- you got 2 right and one wrong.

The probability of getting 2 right just by guessing (as would be the case if the null were indeed true) is 3/8 or about .24
How do you decide the figure 3/8 please?

There are 2 possible answers for each taste test. Three taste tests gives you 8 possible outcomes (2x2x2). Three of those outcomes involve two correct guesses and one wrong guess.

Linda
 
Thank you all for your informative replies. I think I know where I am going now. What I am doing is writing about an article I found about "guided imagery" being used as a pain relief method. The article states:

I understand why that is bull****, but I was having trouble putting it into words in a way I thought others would understand. You guys have definitely helped me put my thoughts in order. Any further commentary would still be greatly appreciated.


The silliness in the quote is that the researcher is implying that everyone will get a reduction of one or two points from the method, even though the lack of statistical significance means that the probability of an effect of that size occurring when chance alone is operating is unacceptably high. It's almost as though the researcher thinks that a statistically significant effect is just a bigger effect than the one they got, and that this doesn't matter as long as the effect is big enough to be useful.

Also, 'one or two points' is meaningless as a measure of effect size. I'm sure somebody will correct me, but I think that the difference between the means should be divided by the average standard deviation to get a meaningful measure of effect size.
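The standardized measure being described is conventionally called Cohen's d; the usual divisor is the pooled standard deviation (the post's "average standard deviation" is a rough version of the same idea). A minimal sketch, with made-up pain scores purely for illustration:

```python
import statistics

def cohens_d(group1, group2):
    """Effect size: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = statistics.stdev(group1), statistics.stdev(group2)
    pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd

# Hypothetical pain ratings on a 0-10 scale (invented numbers, not from the study):
treatment = [3, 4, 2, 5, 3]
control = [5, 6, 4, 7, 5]
print(cohens_d(treatment, control))  # ≈ -1.75: a two-point drop relative to the spread
```

The point is that "one or two points" only means something once you know the spread of the ratings; the same raw difference can be a huge effect or a trivial one.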

Neither 'clinical' nor statistical significance matters if the study wasn't properly conducted, and there isn't any information about what was done to the control group.
 
A small experiment just isn't too informative either way, and we're left more or less where we started.
agreed, but is this not, arguably, a circular definition of "small"?

a not-small experiment might consist of only one photographic plate, with a few stars in the "wrong" place, even in those rare cases where one might believe that "Every experiment is really a comparison between two hypotheses -- an "experimental" hypothesis and a "null" hypothesis"
 
The probability of getting 2 right just by guessing (as would be the case if the null were indeed true) is 3/8 or about .24
Sorry, kept expecting someone else to correct here, but I guess I will. 3/8 is actually 0.375, not about 0.24. This makes a bit of difference if you're eyeballing the numbers to make a decision.

Good explanation, though, bpesta.
 
Sorry, kept expecting someone else to correct here, but I guess I will. 3/8 is actually 0.375, not about 0.24. This makes a bit of difference if you're eyeballing the numbers to make a decision.

Good explanation, though, bpesta.

Thank you for answering my second question.

Also thank you fls for explaining the 3/8 to me.

What a lovely forum :)

Cheers
 
Thank you for answering my second question.

Also thank you fls for explaining the 3/8 to me.

What a lovely forum :)

Cheers

Whoops, my bad on the 3/8 = .24 thing. It was definitely a mistake, but I did leave a bit out of the explanation for teaching purposes. I think the probability you would use to evaluate the null here would actually be .50.

In other words, guessing 2 of 3 right would leave us with a probability of .50 to use in evaluating the null.

Here are all the possibilities for three taste trials (T = true, you got it right; F = false, you got it wrong):

TTT
TTF
TFT
TFF
FTT
FTF
FFT
FFF

Three of them have exactly 2 right and 1 wrong (which gives the 3/8 probability), but we need to actually calculate the probability of performing at "2 out of 3 correct OR better" to properly test the null.

So, for binomial tests, we also have to factor in not only the P of the subject's actual performance, but the sum of all Ps for performance even better than that.

Since there are 4 ways where the subject can perform at 2 out of 3 right or better, the observed probability would be .50 (4/8).

Since the .50 is greater than the .05 alpha level, we would not reject the null.

If you think of the bell curve, your performance needs to be at the tail end (to the right of whatever alpha level you set). To get where your performance is on the curve, you have to calculate not only the P of your actual performance, but the P values for all outcomes that are even rarer than this.

I left this out of the original because it doesn't help conceptually.
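The "2 out of 3 correct or better" tail can be reproduced by brute-force enumeration of the 8 outcomes (my own sketch, not code from the post):

```python
from itertools import product

def one_tailed_p(n_trials, n_correct):
    """P(guessing gets n_correct or more right out of n_trials),
    assuming each guess is right with probability 1/2 under the null."""
    outcomes = list(product([True, False], repeat=n_trials))  # 2**n_trials outcomes
    hits = sum(1 for o in outcomes if sum(o) >= n_correct)    # at least n_correct right
    return hits / len(outcomes)

print(one_tailed_p(3, 2))  # 4/8 = 0.5, as described above
print(one_tailed_p(3, 3))  # 1/8 = 0.125, even perfect performance
```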
 
Note also that with only 3 trials and alpha = .05, you would never be able to reject the null, as perfect performance here would have a probability of .125, which is > alpha.

Moral of the story: Add more trials.
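Following that moral, a quick check of how many trials are needed before even a perfect run can beat alpha (a sketch; assumes a one-tailed test where only the all-correct outcome is counted):

```python
# Smallest number of trials at which perfect guessing performance
# has probability below alpha = .05 under the chance-only null.
alpha = 0.05
n = 1
while 0.5 ** n >= alpha:  # (1/2)^n is the chance probability of a perfect run
    n += 1
print(n)  # 5: (1/2)^5 = 0.03125 < .05, while (1/2)^4 = 0.0625 is not
```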
 
I'm on a manic roll here, but to further complicate things, the above assumes a one-tailed test (we're testing only better-than-chance accuracy, not the possibility that the guy could be performing significantly worse than chance).

If we were doing a two-tailed test -- which makes little sense here, as it's testing whether the guy is either better or worse than chance at detecting Coke versus Pepsi -- we would have a really strange result with only 3 trials.

"2 out of 3 right or better" and the opposite (to cover both tails) would have a combined probability of 100%.

In other words, for the two tailed test and only 3 trials, one is guaranteed to get at least 2 right or better, or at least 2 wrong or worse!
 
If we were doing a two-tailed test -- which makes little sense here, as it's testing whether the guy is either better or worse than chance at detecting Coke versus Pepsi -- we would have a really strange result with only 3 trials.

Not really that strange, is it? Perhaps the guy can detect the difference, but can't label it properly. A two-tailed test captures and controls for that possibility. And any time you perform at exactly the midpoint -- or as close to the exact midpoint as the quantization of the data will permit -- you get a 100% result on a two-tailed test. If I flip a hundred coins, I'm guaranteed to get either at least fifty heads or at least fifty tails.
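The coin-flip guarantee is just arithmetic -- heads + tails = 100, so the larger of the two counts can never be below 50 -- and it can be verified exhaustively (a sketch):

```python
# Check every possible outcome of flipping 100 coins.
for heads in range(101):   # heads can be anything from 0 to 100
    tails = 100 - heads
    assert max(heads, tails) >= 50
print("max(heads, tails) >= 50 for every possible outcome")
```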
 
