Medical research relies heavily on statistical results to substantiate its findings. Yet these statistical results are almost always misunderstood and misinterpreted. It's really scandalous.
This research, like any science, seeks to make statements about general states of affairs, associations, effects, etc., given empirical data. Viz., it seeks to be inductive: to say something about general theories given particular observations.
Yet, the common statistical results (p-values, hypothesis test results & confidence intervals) are not inductive, but deductive; they assume this or that theory or hypothesis, & then make statements about data. They move from the general to the specific.
The most ubiquitous statistical result, for instance, is the p-value, p, which is calculated as
p = Pr(Y >= x given Ho), where
Ho = the tested hypothesis,
x = observed experimental data,
Y = data from a second, hypothetical experiment that is never conducted.
"Y>=x" means "Y constitutes as strong evidence against Ho as does x."
In words: "Having run an experiment and obtained x, p is the probability, in a repetition of the experiment, of obtaining evidence against Ho as strong as is x." So, p is a statement about data x and Y, assuming Ho. It's not a statement about Ho. It's not inductive.
BTW, hypothesis test results & confidence intervals are also statements about data given hypotheses. They are not inductive either. But they are almost universally interpreted as such.
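To see what the deductive reading of a confidence interval looks like, here is a second sketch in the same assumed normal-mean setting (again, the fixed true mean and sample size are my illustrative assumptions). The 95% describes the procedure's behavior over hypothetical repeated experiments given a fixed truth; it is not a probability statement about the parameter after seeing one dataset.

import numpy as np

# Assumed setting: true mean fixed at mu = 0.4, sd known = 1.
# Across repeated experiments, ~95% of the computed intervals cover mu.
# That is a statement about data given the hypothesis (the fixed truth),
# not an inductive statement about mu from any single interval.
rng = np.random.default_rng(1)
mu, n, reps = 0.4, 30, 100_000
half_width = 1.96 / np.sqrt(n)        # 95% interval for a known-sd mean

data = rng.normal(loc=mu, scale=1.0, size=(reps, n))
means = data.mean(axis=1)
covered = (means - half_width <= mu) & (mu <= means + half_width)
print(f"coverage over repetitions: {covered.mean():.3f}")  # ~0.95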
Yet one can hardly blame researchers for these misinterpretations; they populate even introductory statistics books, where p-values & the like are presented as "inferential" measures and where "inference" is defined as "extending sample results to general populations."
For decades, Michael Oakes and others have studied how people interpret common statistical results; they conclude that experts as well as students, applied scientists as well as statisticians, all misinterpret these results. Just put "+michael +oakes +p-value +medical" into a Google or Yahoo search & read the results! None of what I've said so far is news.
It is high time for us to move beyond recognizing and living with these misunderstandings. We need to start asking questions such as:
1. Why do people almost universally misunderstand p-values, hypothesis test results & confidence intervals?
2. Why are these tools still used, despite these shortcomings?
We need to explore the root causes of these problems. I have some ideas about these causes, and would be interested in considering any you may propose as well. I expect that would generate some excellent discussion.
Andrew