I didn't stall, I know you asked a trick question and you thought the answer was the 68, 95, 99 rule...
The answer is the 68-95-99 rule. It's not a trick question, and no, you didn't know any such thing ahead of time. It's one of the most fundamental principles in statistics as used in the experimental sciences—so fundamental, as we discover below, that it doesn't even need to be explicitly acknowledged as such.
It's also critical to your defense of Casabianca because it's the figure that gives you what you expect to be a rigid "specification" that you allege the archaeologists are violating. And I let you flounder enough (and dropped enough hints) to make it clear you were struggling and couldn't credibly excuse that by saying, "Oh, yeah, I knew that all along." You didn't know it all along, though others did. And now you're trying to tap-dance to a wholly different misconception to make it sound like irrelevance, not ignorance, was the reason you couldn't answer the question before.
You're simply not as good at face-saving as you think, so maybe stop trying to win and start trying to learn.
...the question was why Casabianca used the 95% confidence interval.
Because it's the most commonly used of the 68, 95, and 99 figures in the experimental sciences.
I first asked you why Casabianca used a 95% confidence interval. That was because you were reading χ² critical values from a table that included many columns for different confidence intervals. It's important to know which confidence interval Casabianca thought was appropriate, because if you use a different one, the numbers you're using to drive the heterogeneity argument change. If a yes-no determination ends up being highly sensitive to a parameter, you had better be sure to set that parameter at a defensible place.
Now anyone who is even remotely familiar with experimental statistics would have been able to answer immediately why Casabianca is reading from the 95% column. My dad was an academic and an experimenter in the social sciences, so I learned all this when I was about twelve. For most others who want to pursue science, this happens in late high school or early college. But it should be a knee-jerk response to answer, "Because that's two standard deviations away from the mean." No matter when one learned it, it was one of the first things one learned about descriptive statistics, and one that tends to stick.
But you didn't get it. You answered that Casabianca used 95% because Damon used 95%. That's a true statement but an incomplete answer. It just kicks the question down the road to ask, "Why then did Damon et al. use a 95% CI?" And really it just signals that you didn't know where the 95% comes from. But as I already pointed out, Damon didn't use only the 95% CI. He also used the 68% CI—one standard deviation. The dates reported in Table 3 are reported for both CIs. Now that raises the question of why Casabianca didn't provide a χ² analysis for the 68% CI, but we'll get to that question later after we've finished laying some more groundwork.
You still didn't get it. I had to finally lay the 99% (3 standard deviations) figure on you before you had enough to Google the answer. But the point is that other people here had already had their "Oh, right..." moment. The point is not to belittle you personally. But we have to confront the notion that you don't know enough about statistics to understand why Casabianca's claims aren't really taken very seriously in the archaeology world, and why you're having such a hard time understanding why.
Now that you know where the 95% CI comes from, I've asked the next question in my line of questioning. In an ordinary, valid data set, what observable, fundamental shifts occur in the underlying data at those {68, 95, 99, ...} CI boundaries that let you say something about data that falls outside them? And yes, when you finally come up with an answer to that question, I'll ask the next question and so forth until you finally understand my rebuttal to Casabianca.
Is the answer you are looking for the correlation of 1, 2, and 3 standard deviations with the 68, 95, 99 rule? Theoretically we would expect a continuous function, but not with a finite number of measurements.
Yes, the answer is that Damon used 68% and 95% because those correspond to μ±1σ and μ±2σ respectively in the normal distribution. Some physics needs 4σ or 5σ confidence to be accepted, while I occasionally want to aim for 6σ. (I.e., Six Sigma, but don't mistake engineering for experimental science. I don't want to have to relitigate that.)
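For the record, those round-number coverages aren't folklore; they fall straight out of the normal CDF. Here's a minimal sketch in Python (my choice of tool here, not anything from the papers) that recovers the 68-95-99 figures from the error function:

```python
# The probability that a normal variable lands within k standard
# deviations of its mean is erf(k / sqrt(2)); the 68-95-99 rule is
# just this quantity evaluated at k = 1, 2, 3.
import math

def coverage(k):
    """P(|X - mu| <= k*sigma) for X ~ N(mu, sigma^2)."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"{k} sigma: {coverage(k):.4f}")
# 1 sigma: 0.6827, 2 sigma: 0.9545, 3 sigma: 0.9973
```

Which is why "95%" and "two standard deviations" are interchangeable shorthand in lab talk.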
No, this doesn't just hold for the normal distribution.
The normal (Gaussian) distribution is the Mother of All Distributions. By that I mean it was the first to be discovered, and therefore we have been examining its mathematical properties for something like 300 years. It has a formal mathematical definition, and it has an ever-growing set of properties that also have mathematical definitions based on the behavior of the normal distribution. Later came discrete distributions like Poisson, or sample-only distributions like Student's t-distribution, and of course the χ²-distribution. But because the normal distribution is the bedrock of statistical understanding, it's common and appropriate to investigate behavior and develop new theories using the normal distribution first. If a principle can be understood clearly in terms of the normal distribution, its behavior in other distributions lands more gently. This is why we describe the 68-95-99 rule in terms of the normal distribution, not because it applies only to continuous, normally-distributed data.
Properties like mean, standard deviation, score, and degrees of freedom are defined for all these distributions, but they're formulated differently in each case because the mathematical definition of each distribution is different. But the concepts transfer. The concept of standard deviation is generic enough to apply to all distributions that have one, and theories developed in those terms for one distribution generally apply to the others, because that's the way we've formulated them to work.
When I asked you why Casabianca used the 95% CI, if not for the 68-95-99 rule of thumb, you insinuated that only he knows—because it couldn't possibly be for that rule, in turn because (according to you) that rule only works for normally-distributed data. But then Damon also used 68% CI and 95% CI, also for reasons that have nothing to do with the 68-95-99 rule? And also for reasons that only he knows?
I'm going to give you a chance to retract that silly, knee-jerk answer. If Casabianca is just making up numbers to use in his analysis and not explaining where they came from, wouldn't that make him a poor scientist? Neither Damon nor Casabianca has to explain why they're using 68% and 95% CIs because everyone already knows. And yes, it's because of the 68-95-99 rule.
Using a 95% CI for, say, a coin-toss experiment that has to use a binomial distribution lets me directly compare the strength of those findings with those of someone else who studies fish fertility according to the Gaussian distribution, and with those of someone who is dating cloth using a χ²-distribution. It also lets me compare findings within a field, even if different experiments in the field require different statistical models for goodness of fit. We all agree to publish according to those values because they are defined for all the distributions we need for experimental statistics, and we choose those particular values because they correspond to the feel-good intervals defined by a simple, elegant property of the normal distribution.
The next question then is to examine how those feel-good intervals relate to the measurable properties of the data they describe. We want to reason about coin tosses or fish eggs or carbon isotope atoms, not just abstract statistics. So what marks a change in the underlying data corresponding to those whole-number sigma boundaries? What happens in the data that lets us say—according to where they lie with respect to those intervals—that we should reject that data? The intervals effectively categorize the data. What similar signs of discrete categorization should I be able to see in the actual data that correspond to those elegant intervals?
You won't answer my questions and you are dodging as fast as you can.
It's my rebuttal. You're desperately trying to take it away from me and drive it away from the weak parts of the argument. The weak parts are the premises on which you base your great faith in the rigidity of the statistical analysis, and therefore in its ability to support the claim that Damon et al. cheated. In turn, I've determined that your great faith rests on assumptions you've made about elementary statistics rather than an education in it.
You're repeatedly trying to bait me into answering questions that presume your premises are correct. My rebuttal is instead aimed at your premises, hoping to get you to understand why they're not as correct as you may think. You can either deal with that fact or you can admit you're not prepared to have your argument addressed in the proper manner. You aren't entitled to demand an answer that is both correct and simplified to fit your existing knowledge.
You are assuming I was trained, but I was just told which equations were to be used and expected to perform the necessary statistical analyses; no "on the job" training was provided.
You don't treat me with the respect you expect me to treat you with. But I'll suck it up. You might start with answering some of my questions.
It's my rebuttal.
I'm sure you were good at your job, and I'm sure your understanding of statistics was sufficient at the time to allow you to be successful. But you clearly don't have the proper background in statistics to understand why Casabianca's claims are not considered credible by archaeologists.
I could just say the rebuttal to Casabianca is over your head, and that wouldn't even be considered disrespectful. It might come off as dismissive or elitist, but it's not an inappropriate answer. There are many topics that are over many people's heads, mine included. The world in general is not obliged to entertain naive claims, and so is often justified in simply saying, "That's a naive claim," without being tarred as disrespectful.
But because I care, I'm taking the time to teach you the statistics you missed out on. I'm teaching Socratically, so that you end up teaching yourself. Having taught difficult subjects at a major university, I'm sticking with what I know to be effective. Now you could say "Thank you," and walk away knowing more than you did. But given that this is a debate and not a classroom, I'll settle for you just dialing back the attitude.
Is the Damon data a normal distribution or not?
As we've discussed, that's not really the issue. You've tried to declare the 68-95-99 rule irrelevant to Damon and Casabianca because you wrongly think it applies only to the normal distribution. In your haste to knee-jerk your way around every time you have to admit you just learned something, you've forgotten that your argument relies on those intervals being defined for the χ²-distribution (else where did that table come from and why does it matter?) and having the same meaning as they do for other distributions. And you ended up in a corner where you can't tell us where Casabianca, Damon, et al. got the intervals they used if not from the rule.
But see, now that you mention it, the χ²-distribution only works for standard normal variables. That is, it describes how the sum of the squares of some number of such variables should be distributed. A normal variable is based on the normal distribution—i.e., x ~ N{μ, σ²}. The χ²-distribution tells us what we can expect from k such values of x, where each x comes from a different parameterization of N, but where all N purport to model the same underlying phenomenon. But if you want to do a goodness-of-fit test against the χ²-distribution, what do you do when the little empirically-measured distributions that you want to compare for fit against the theoretically expected standard normal variables aren't themselves normally distributed (or known to be normally distributed)?
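To make that definition concrete, here's a quick simulation sketch (mine, not from any of the papers): sum the squares of k standard normal draws and check that the result behaves like a χ² variable with k degrees of freedom, whose mean is k and variance 2k.

```python
# Sum of squares of k independent standard normal variables follows a
# chi-square distribution with k degrees of freedom. Check the first
# two moments empirically: mean should be near k, variance near 2k.
import random

random.seed(0)
k, trials = 3, 100_000
sums = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))
        for _ in range(trials)]

mean = sum(sums) / trials
var = sum((s - mean) ** 2 for s in sums) / trials
print(round(mean, 2), round(var, 2))  # close to 3 and 6
```

Note that the whole construction presupposes the inputs are standard normal, which is exactly the catch in the question above.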
This problem comes up all the time in statistics. We suspect that a certain phenomenon that we just measured might be normally distributed, but we have no way of knowing what the population mean and variance might be. We can assess normality by any of several methods. Then based on those findings, we might "studentize" the data. (This doesn't have anything to do with teaching; it's named for a statistician with the unfortunate pen name of Student.) But that doesn't really work for radiocarbon dating data because there's that horribly convoluted Stuiver and Pearson (not that Pearson) calibration model you have to invoke. That's where Ward and Wilson earned their money—by figuring out how to wrestle radiocarbon dates into a form that can be directly compared to standard normal variables and thus validate a goodness-of-fit test according to the χ²-distribution. They do this in part by reformulating the expected distribution as N{μ(θ), σ²} to let the expected mean vary according to the Stuiver and Pearson graph.
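For the curious, the core of the Ward and Wilson procedure, setting aside the calibration-curve machinery, reduces to a pooled-mean χ² homogeneity statistic. A sketch, using the three laboratory dates widely quoted from Damon et al. (646±31, 750±30, 676±24 radiocarbon years BP) purely as illustrative inputs:

```python
# Pooled-mean chi-square homogeneity statistic in the Ward & Wilson
# style: weight each lab's date by its inverse variance, then sum the
# squared standardized deviations from the pooled mean. The three
# dates below are the widely quoted lab means from Damon et al.
# (1989), used here only as illustrative inputs.
dates = [646.0, 750.0, 676.0]   # radiocarbon years BP
errors = [31.0, 30.0, 24.0]     # one-sigma lab uncertainties

weights = [1.0 / s ** 2 for s in errors]
pooled = sum(w * x for w, x in zip(weights, dates)) / sum(weights)

# Under homogeneity, T should follow chi-square with k - 1 = 2 df.
T = sum(w * (x - pooled) ** 2 for w, x in zip(weights, dates))

print(round(pooled), round(T, 1))  # pooled mean ~689 BP, T ~6.4
# The 95% critical value for chi-square(2) is 5.99, so T sits just
# past the two-sigma cutoff -- which is the whole fight.
```

Treat this as an illustration of the statistic's shape, not a reproduction of anyone's published computation.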
So the tl;dr answer to your question is, "It doesn't matter." The subtext is, "By even asking it you're displaying ignorance that we need to keep correcting."
Now this tees up five or six followup questions for you that lead to where I'm eventually going. Lest you mistake my questions for dodges or distractions, I'm assuring you that there is a plan. And I've already foreshadowed the plot twist that includes the notion that a χ² goodness-of-fit test is not necessarily the only (or the best) way to measure what archaeologists want to know about radiocarbon dates. But please resist the urge to knee-jerkedly respond to that for now. Again, I'm just assuring you that there's a plan. The question on the table now is the one about how you recognize the categorization from the CI in the data themselves.
What do the numbers 6.4, 0.1, 1.3, and 2.4 as reported by Damon mean?
What tests would you run to determine if a data set is normally distributed?
No, you don't get to keep repeating questions that presume the correctness of your premises. No, you don't get to take every detour as a distraction. This is my rebuttal. Stop trying to derail it, and stop trying to recharacterize my right to direct my own rebuttal as some kind of evasion.
I am talking about the statistical test performed by Damon, Casabianca, and Van Haelst alike. Not just Casabianca, not just my authors.
No, you're trying to get out ahead of the rebuttal and head it off at the pass so that you don't have to suffer through it. The problem with your defense of Casabianca is that you don't yet know enough about statistics and how they're used in archaeology (specifically radiocarbon dating) to understand the weakness of his premise, hence you have knee-jerkedly rejected the explanation every time it's been offered. You don't get to pretend that's not the problem, and you don't get to avoid solving it by trying to direct my rebuttal away from it.