bobdroege7
Illuminator
- Joined
- May 6, 2004
- Messages
- 4,408
First, your post is lengthy, appropriate, and almost completely true. The answer is the 68-95-99 rule. It's not a trick question, and no, you didn't know any such thing ahead of time.
Because it's the most commonly used of the 68, 95, and 99 figures in the experimental sciences.
I first asked you why Casabianca used a 95% confidence interval. That was because you were reading χ² critical values from a table that included many columns for different confidence intervals. It's important to know why Casabianca thought 95% was the appropriate confidence interval because if you use a different one, the numbers you're using to drive the heterogeneity argument change. If a yes-no determination ends up being highly sensitive to a parameter, you had better be sure to set that parameter at a defensible place.
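To make the sensitivity concrete, here is a minimal sketch of how the χ² critical value moves with the chosen confidence level. It assumes 2 degrees of freedom purely for illustration (the thread does not state the degrees of freedom of the table in question), because that case has a closed-form inverse; the function name is hypothetical.

```python
import math

def chi2_critical_df2(confidence):
    """Critical value of the chi-square distribution with 2 degrees of freedom.

    Assumption: df = 2, chosen only because its CDF, 1 - exp(-x / 2),
    inverts in closed form. Other df values need a table or a stats library.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -2.0 * math.log(1.0 - confidence)

# Same data, different column of the table, different threshold.
for level in (0.68, 0.90, 0.95, 0.99):
    print(f"{level:.0%} -> {chi2_critical_df2(level):.3f}")
```

A test statistic of, say, 6.4 would be rejected at the 95% column (threshold about 5.99) but not at the 99% column (about 9.21), which is why the choice of column has to be defensible.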
But you didn't get it. You answered that Casabianca used 95% because Damon used 95%. That's a true statement but an incomplete answer. It just kicks the question down the road to ask, "Why then did Damon et al. use a 95% CI?"
You still didn't get it. I had to finally lay the 99% (3 standard deviations) figure on you before you had enough to Google the answer.
Now that you know where the 95% CI comes from, I've asked the next question in my line of questioning. In an ordinary, valid data set, what observable, fundamental shifts occur in the underlying data at those {68, 95, 99, ...} CI boundaries that let you say something about data that falls outside them? And yes, when you finally come up with an answer to that question, I'll ask the next question and so forth until you finally understand my rebuttal to Casabianca.
Yes, the answer is that Damon used 68% and 95% because those correspond to μ±1σ and μ±2σ respectively in the normal distribution. Some physics needs 4σ or 5σ confidence to be accepted, while I occasionally want to aim for 6σ. (I.e., Six Sigma, but don't mistake engineering for experimental science. I don't want to have to relitigate that.)
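For anyone who wants to check the figures, a short sketch using only the standard library reproduces the 68-95-99.7 coverage values for a normal distribution. The function name is my own, not anything from Damon or Casabianca.

```python
import math

def normal_coverage(k):
    """Probability that a normal variate lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"mu +/- {k} sigma: {normal_coverage(k):.4f}")
# mu +/- 1 sigma: 0.6827
# mu +/- 2 sigma: 0.9545
# mu +/- 3 sigma: 0.9973
```

Note that 2σ gives 95.45%, so "95%" is itself a rounding; an exact 95% interval sits at about 1.96σ.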
No, this doesn't just hold for the normal distribution.
The normal (Gaussian) distribution is the Mother of All Distributions.
Properties like mean, standard deviation, score, and degrees of freedom are defined for all these distributions, but they're formulated differently in each case because the mathematical definition of each distribution is different. But the concepts transfer. The concept of standard deviation is generic enough to apply to all distributions that have one, and theories developed in those terms for one distribution generally apply to the others, because that's the way we've formulated them to work.
When I asked you why Casabianca used the 95% CI, if not for the 68-95-99 rule of thumb, you insinuated that only he knows—because it couldn't possibly be for that rule, in turn because (according to you) that rule only works for normally-distributed data. But then Damon also used 68% CI and 95% CI, also for reasons that have nothing to do with the 68-95-99 rule? And also for reasons that only he knows?
I'm going to give you a chance to retract that silly, knee-jerk answer. If Casabianca is just making up numbers to use in his analysis and not explaining where they came from, wouldn't that make him a poor scientist? Neither Damon nor Casabianca has to explain why they're using 68% and 95% CIs because everyone already knows. And yes, it's because of the 68-95-99 rule.
The next question then is to examine how those feel-good intervals relate to the measurable properties of the data they describe. We want to reason about coin tosses or fish eggs or carbon isotope atoms, not just abstract statistics. So what marks a change in the underlying data corresponding to those whole-number sigma boundaries? What happens in the data that lets us say—according to where they lie with respect to those intervals—that we should reject that data? The intervals effectively categorize the data. What similar signs of discrete categorization should I be able to see in the actual data that corresponds to those elegant intervals?
It's my rebuttal. You're desperately trying to take it away from me and drive it away from the weak part of the argument. The weak parts are the premises upon which you place such great faith in the rigidity of the statistical analysis and therefore its ability to claim that Damon et al. cheated. In turn, I've determined that your great faith is based on assumptions you've made about elementary statistics rather than an education in it.
You're repeatedly trying to bait me into answering questions that presume your premises are correct. My rebuttal is instead aimed at your premises, hoping to get you to understand why they're not as correct as you may think. You can either deal with that fact or you can admit you're not prepared to have your argument addressed in the proper manner. You aren't entitled to demand an answer that is both correct and simplified to fit your existing knowledge.
It's my rebuttal.
I'm sure you were good at your job, and I'm sure your understanding of statistics was sufficient at the time to allow you to be successful. But you clearly don't have the proper background in statistics to understand why Casabianca's claims are not considered credible by archaeologists.
As we've discussed, that's not really the issue. You've tried to declare the 68-95-99 rule irrelevant to Damon and Casabianca because you wrongly think it applies only to the normal distribution. In your haste to knee-jerk your way around every time you have to admit you just learned something, you've forgotten that your argument relies on those intervals being defined for the χ²-distribution (else where did that table come from and why does it matter?) and having the same meaning as they do for other distributions. And you ended up in a corner where you can't tell us where Casabianca, Damon, et al. got the intervals they used if not from the rule.
So the tl;dr answer to your question is, "It doesn't matter." The subtext is, "By even asking it you're displaying ignorance that we need to keep correcting."
In my job, I did not have the luxury of working with normal distributions; I mostly worked with radioactive decay, which is usually near normal but skewed. There is a gray area between nice normal distributions and those where you cannot use statistics like standard deviation and the others.
That's my point with the Damon data for the shroud: to paraphrase a fictional baseball coach in Japan, coaching actor Tom Selleck, your distribution has a hole in it.
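As a rough illustration of "near normal but skewed", the sketch below assumes decay counts follow a Poisson distribution, which is the conventional model for counting statistics but is my assumption here and not something stated about the Damon data. The function names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of exactly k counts, computed in log space to avoid overflow."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def poisson_skewness(lam):
    """Skewness of a Poisson distribution with mean lam; it shrinks as counts grow."""
    return 1.0 / math.sqrt(lam)

def poisson_coverage(lam, n_sigma):
    """Probability mass within n_sigma standard deviations of the mean."""
    sigma = math.sqrt(lam)
    lo = max(0, math.ceil(lam - n_sigma * sigma))
    hi = math.floor(lam + n_sigma * sigma)
    return sum(poisson_pmf(k, lam) for k in range(lo, hi + 1))

# Few counts: visibly skewed. Many counts: close to the normal 2-sigma figure.
for lam in (4, 100, 10000):
    print(lam, round(poisson_skewness(lam), 3), round(poisson_coverage(lam, 2), 4))
```

With a mean of 4 counts the skewness is 0.5; with 10,000 counts it is 0.01 and the 2σ coverage is within about a percentage point of the normal value.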
Yes, I knew about the 68-95-99.7 correspondence with standard deviations (do not incorrectly round it to 99), but I did not know it was considered a rule. And I knew, and know, that 6 sigma is on the order of a few in a million (3.4 per million under the usual Six Sigma convention).
You cannot apply it to bad data.
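The 6 sigma figure depends on which convention is meant, so here is a hedged sketch of both readings for a normal distribution. The 1.5σ shift is the industrial Six Sigma convention, not something from the Damon or Casabianca papers, and the function names are my own.

```python
import math

def two_sided_tail(k):
    """Probability of falling more than k standard deviations from the mean, either side."""
    return math.erfc(k / math.sqrt(2.0))

def one_sided_tail(k):
    """Probability of falling more than k standard deviations above the mean."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

# Pure 6 sigma, both tails: roughly 2 per billion.
print(two_sided_tail(6.0) * 1e9)

# Six Sigma convention: 6 sigma minus an assumed 1.5 sigma drift, one tail.
print(one_sided_tail(6.0 - 1.5) * 1e6)  # about 3.4 per million
```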
And yes, Damon cheated, unless you consider averaging data points before reporting your data to be acceptable.
That was one of the things revealed by Casabianca et al. in requesting access to the data from Damon et al.'s paper.
And it is true that I don't know what happens at one sigma, two sigma, etc. The inflection points of a normal distribution are at plus and minus one sigma; that might not be the answer you are looking for, I don't know.
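The inflection point claim can be checked directly. This is a minimal sketch using the analytic second derivative of the normal density; it shows only a geometric property of the curve and says nothing about the shroud data. The function names are hypothetical.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_pdf_second_derivative(x, mu=0.0, sigma=1.0):
    """Analytic second derivative of the density: ((z^2 - 1) / sigma^2) * pdf(x)."""
    z = (x - mu) / sigma
    return (z * z - 1.0) / sigma**2 * normal_pdf(x, mu, sigma)

# Curvature is negative inside one sigma, zero at one sigma, positive outside.
for x in (0.5, 1.0, 1.5):
    print(x, normal_pdf_second_derivative(x))
```

The sign change at one standard deviation from the mean is what makes that point an inflection; there is no comparable change in the curve's shape at 2σ or 3σ.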
I will discuss anything you want, but the end goal for me is to discuss the distribution of the data as reported by Damon and Casabianca.
Because it is radioactive decay we are talking about, and it had better be normal (Gaussian) to be acceptable.
That's the question for me: does the Damon data have an acceptable distribution?