Maia
Graduate Poster
Joined: Jul 20, 2009
Messages: 1,259
(laces up running shoes to get a headstart)
Now, before everyone starts chasing me out of town waving virtual torches, y’all should know that what follows is not an argument that NDEs are proof of God and angels and the Second Coming of Jesus bringing us all back to heaven. But I can’t stand what the popular press does to scientific research, and to be perfectly honest, I was not impressed by this research anyway. Here’s why. Please read all the way to the end before commenting. I did a lot of original work on this, ALL BY MYSELF.
Take a deep breath, and let’s dive in!
Here's the actual real-live study yapped about in National Geographic:
Right here.
On careful analysis, the results are so far from anything that was reported in National Geographic (or anywhere else) that it’s simply embarrassing. The discrepancies really are that large. But you do have to do some digging.
The claim was that:
Patients with higher petCO2 had significantly more NDEs. Patients with higher pCO2 had significantly more NDEs. Patients with previous NDEs had significantly more NDEs. The NDE score was positively correlated with pCO2 and with the serum level of potassium. Patients with lower pO2 had more NDEs, although the difference was not statistically significant.
I decided not to try to chase down info about potassium levels and the rest, because those weren't the main claims that had been passed down through the popular press (although the part about patients with previous NDEs having more was very interesting). Here we go:
Table 2
Correlation of independent variables with the presence of NDEs
Variable | NDEs group (mean ± SD) | Non-NDEs group (mean ± SD) | P
Age (years) | 57.9 ± 13.8 | 51.8 ± 14.6 | 0.217
Time until ROSC (minutes) | 8.3 ± 6.7 | 8.8 ± 5.3 | 0.772
petCO2 (kPa) | 5.7 ± 1.1 | 4.4 ± 1.2 | < 0.01
pO2 (kPa) | 16.4 ± 11.1 | 25.3 ± 15.1 | 0.108
pCO2 (kPa) | 6.6 ± 2.3 | 5.3 ± 1.4 | 0.041
Serum sodium (mmol/l) | 139.2 ± 6.1 | 140.4 ± 4.0 | 0.439
Serum potassium (mmol/l) | 4.6 ± 1.2 | 4.1 ± 0.8 | 0.118
NDE, near-death experience; petCO2, initial partial end-tidal pressure of carbon dioxide; pCO2, partial pressure of carbon dioxide; pO2, partial pressure of oxygen; ROSC, return of spontaneous circulation; SD, standard deviation.
Klemenc-Ketis et al. Critical Care 2010 14:R56 doi:10.1186/cc8952
If you have a TI-83 Plus calculator, you may want to get it out and follow along. First, I made little charts of the NDErs and non-NDErs, with n, x-bar (sample mean), and standard deviation for each group. For pCO2, for example, the NDE group had n = 11, a sample mean of 6.6, and a standard deviation of 2.3; the non-NDE group had n = 41, a sample mean of 5.3, and a standard deviation of 1.4. The null hypothesis was that the population means are equal. The claim was that population mean one is greater than population mean two. Here's where the TI-83 comes in.
STAT → TESTS → 2-SampTTest. More about why this test was chosen later.
Now enter your data. Your p value is .049469, but this is NOT WHAT THE RESEARCHERS SAY. They claim that the p value is .041. It is not. Get out your stats calculator and enter the numbers yourself. The alpha they chose is the very common .05, which means they've barely, BARELY inched over into showing an effect for pCO2. This is just barely statistically significant, and it has no practical significance by any definition. And if you round to 2 decimal places rather than 3 (which you're not supposed to do, of course, but let's just say for the sake of argument), .049469 becomes .05 and you cannot reject the null hypothesis at all.
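If you don't have a TI-83 handy, here's a quick sketch of the same check in Python with SciPy (my own translation, not anything from the study, and it assumes a reasonably recent SciPy where ttest_ind_from_stats takes an alternative argument). The numbers come straight from Table 2, and the test is the one-tailed Welch (unpooled) two-sample t-test, which is what the TI-83's 2-SampTTest runs when Pooled is set to No:

```python
# Re-running the Table 2 comparisons from the published summary statistics alone.
# This is my own sketch, not the authors' SPSS analysis.
from scipy.stats import ttest_ind_from_stats

# pCO2 (kPa): NDE group vs. non-NDE group (values straight from Table 2)
pco2 = ttest_ind_from_stats(
    mean1=6.6, std1=2.3, nobs1=11,   # NDE group
    mean2=5.3, std2=1.4, nobs2=41,   # non-NDE group
    equal_var=False,                 # Welch test, i.e. the TI-83 with Pooled: No
    alternative="greater",           # H1: NDE population mean > non-NDE population mean
)

# petCO2 (kPa): the same test on the other row
petco2 = ttest_ind_from_stats(
    mean1=5.7, std1=1.1, nobs1=11,
    mean2=4.4, std2=1.2, nobs2=41,
    equal_var=False,
    alternative="greater",
)

print(f"pCO2:   one-tailed p = {pco2.pvalue:.6f}")    # about 0.0495, not .041
print(f"petCO2: one-tailed p = {petco2.pvalue:.6f}")  # about 0.0017
```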
The numbers are more robust for petCO2, where the same test gives a p value of .0017. But there's a tremendous problem here. The multivariate analysis found that:
Higher pCO2 was an independent predictor of NDEs. The logistic regression model explained 46% of the variation (Table 3). A higher NDE score was independently associated with higher pCO2, higher serum levels of potassium, and previous NDEs. The linear regression model explained 34% of the variation (Table 4).
It's all very easy to say that pCO2 was an independent predictor of NDEs, but its effect was so marginal that the "independent predictor" contribution couldn't have been anything but negligible. Nothing plus nothing equals nothing.
Now, a stats expert I am not. But think about this: they said that they first used univariate analysis, considering each variable separately. By the time they got to the results in Table 2, what else could you use for those particular comparisons besides an independent-samples t-test for two means? What they actually said in the study was:
To identify statistically significant differences between different variables, we used an independent samples t-test, chi-squared test, and a Wilcoxon rank sum test.
They used SPSS 13.0, which means this wasn't done by hand. I'm not so sure that's always a good idea, because you can really miss some things. And for this particular set of comparisons, nothing else really makes any sense besides a t-test that I can see, because you're testing two groups separated by one defined characteristic (NDE vs. non-NDE) to see if you can come up with good evidence that they come from populations whose "averages" differ on this, that, and the other defined property (various blood gas levels). (You wouldn't use ANOVA, for instance, because you only have 2 groups. You wouldn't use a chi-square test because it isn't nominal data, and you wouldn't use a Kolmogorov-Smirnov test because it isn't ordinal data.)

Maybe… I don't know, I'm just trying to come up with something here… they used some adjustment factor or other to arrive at the differing number (.041 rather than .049469), to compensate in some way for the fact that n most certainly did not equal 30 for the NDE group (more about that later). A stats expert might know whether that's even possible, but then I think they would have had to say something more specific about it than they did. A straight t-test for 2 means gives a result of .049469, not .041.
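Just to rule out the obvious suspects, here's another little sketch (again mine, and again working only from the Table 2 summary numbers, so it can't reproduce anything rank-based like the Wilcoxon test they mention) that runs the pCO2 row through the pooled and Welch versions of the t-test, one-tailed and two-tailed. None of the four combinations lands on .041:

```python
# Try the obvious t-test variants on the pCO2 row and see whether any gives .041.
# My own sketch from the published summary stats -- not the authors' SPSS output.
from itertools import product
from scipy.stats import ttest_ind_from_stats

for pooled, alt in product([True, False], ["greater", "two-sided"]):
    res = ttest_ind_from_stats(
        mean1=6.6, std1=2.3, nobs1=11,   # NDE group, pCO2 (kPa)
        mean2=5.3, std2=1.4, nobs2=41,   # non-NDE group
        equal_var=pooled,
        alternative=alt,
    )
    label = "pooled" if pooled else "Welch"
    print(f"{label:6s} / {alt:9s}: p = {res.pvalue:.4f}")

# Roughly: pooled/greater ~ .011, pooled/two-sided ~ .022,
#          Welch/greater  ~ .049, Welch/two-sided  ~ .099 -- none of them .041.
```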
The only obvious way to get a number like that, as far as I can see, would be to artificially inflate the sample sizes. For instance, if you pretend that you had 100 subjects in each group (n1 and n2 each = 100), then you'll end up with p < .0001. The effect would look very robust if you could just get a large enough sample size. But that is just about the worst methodology anybody could ever imagine coming up with. The sample consisted of only 52 people, 11 (yes, eleven) of whom had NDEs. This was from an original pool of 400 who had cardiac arrest, and only 76 survived! Ack. (Seven of the 11 NDErs were atheists. I'm not sure whether that's unusual or not, though, because I have no idea what the religious makeup of Slovenia is.)
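And just to show how much that p value hinges on sample size alone, here's the same test with the same means and standard deviations but hypothetical, bigger (and equal) group sizes. These are purely made-up sample sizes to illustrate the point:

```python
# Same pCO2 means and SDs, hypothetical equal group sizes. Purely illustrative --
# nobody actually had these sample sizes (the real groups were 11 and 41).
from scipy.stats import ttest_ind_from_stats

for n in (11, 30, 50, 100):
    res = ttest_ind_from_stats(
        mean1=6.6, std1=2.3, nobs1=n,
        mean2=5.3, std2=1.4, nobs2=n,
        equal_var=False,
        alternative="greater",
    )
    print(f"n = {n:3d} per group: one-tailed p = {res.pvalue:.6f}")

# The p value shrinks fast as n grows; by n = 100 per group it is well under .0001.
# (The n = 11 row won't match the earlier .049 figure because here both groups are 11.)
```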
I don’t know. I'm really, really confused on this point, because I simply do not see how they came up with a figure of .041 from their own data. And even taken at face value, a p value of .041 against an alpha of .05 is not the least bit impressive.
Overall, higher petCO2 was not an independent predictor of NDEs, and higher pCO2 was, but it only just barely crossed into statistical significance in the first place. The only real conclusion to draw is that this study simply does not mean what it has been ballyhooed to mean.
When all is said and done, of course, there’s also a fatal flaw that completely invalidates the kinds of conclusions the popular press has drawn from this study: to draw any real conclusions from these statistical methods, the sample sizes really need to be at least n = 30 in each group. The NDE group was n = 11. The researchers did say clearly that this was a prospective observational study, and that was the responsible thing to say given its limitations, particularly the sample-size problem.
The final lesson to take away from all of this, I think, is to just not believe the version you read in the popular press. Complex neurobiological phenomena do not have simplistic explanations. I’ll be sending all of this information to National Geographic, and I’ll let them know what I think of their fact-checking quality. All of this took me half an hour, and really, most of the people on this board could have done it too.