If Mosteller and Wallace are stylometry’s best-known success, it is equally important to discuss the best-known failures. The cusum or Qsum technique (both names abbreviate “cumulative sum”) [10, 11, 39, 111] is a visual method for observing similarity between sequences of measures. As applied to sequences in general, one first takes the sequence, e.g., { 8, 6, 7, 5, 3, 0, 9, 2 . . . }, and calculates the mean (in this case, 5). One then calculates the differences from the mean { 3, 1, 2, 0, −2, −5, 4, −3 . . . } and plots their “cumulative sum” { 3, 4, 6, 6, 4, −1, 3, 0 . . . }. This plot measures the homogeneity or stability of a feature; in the case of cusum, the feature is traditionally something like “percentage of words with two or three letters” (but see Section 4).
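The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used in the cited work: it assumes the eight values shown make up the whole sequence (the text's ellipsis suggests the real sequence continues), and the names `cusum` and `short_word_percentage` are hypothetical, introduced here for illustration. The feature helper uses a naive whitespace tokenizer, which is also an assumption.

```python
from itertools import accumulate


def cusum(values):
    """Return the cumulative sum of each value's deviation from the mean."""
    if not values:
        return []
    mean = sum(values) / len(values)
    deviations = [v - mean for v in values]
    return list(accumulate(deviations))


def short_word_percentage(sentence):
    """Hypothetical feature: percentage of words with two or three letters.

    Assumes whitespace tokenization and strips common punctuation.
    """
    words = [w.strip(".,;:!?\"'()") for w in sentence.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    short = sum(1 for w in words if len(w) in (2, 3))
    return 100.0 * short / len(words)


# The example sequence from the text, treated as complete (an assumption).
example = [8, 6, 7, 5, 3, 0, 9, 2]
print(cusum(example))  # [3.0, 4.0, 6.0, 6.0, 4.0, -1.0, 3.0, 0.0]
```

In a stylometric setting, one would presumably compute a feature such as `short_word_percentage` for each sentence of a text and plot the resulting `cusum` values. Because the deviations from the mean sum to zero, the curve always returns to zero at the end of the sequence.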
This technique was rapidly adopted as a forensic technique and used in several English court cases, including The Queen vs. Thomas McCrossen (Court of Appeal, London 1991), The Queen vs. Frank Beck (Leicester Crown Court 1992), and The Queen vs. Joseph Nelson-Wilson (London 1992). Unfortunately, the accuracy of the technique almost immediately came under question; reports such as [24, 52, 53, 55, 60] suggested that the theory was not well-grounded and that the results were not accurate enough to be relied upon (especially given that the cases mentioned were criminal cases). However, the ultimate downfall came when “[Morton] was challenged on live British television to attribute texts that he had never seen. The result was disastrous; despite his impressive statistics and his fancy computer graphics, Morton could not distinguish between the writings of a convicted felon and the Chief Justice of England” [50].
Despite this failure, variations on cusum such as WQsum (“weighted cusum”) continue to be used [15, 135, 136] and have some evidence for their validity. As will be seen, frequency analysis continues to be a major component of many successful algorithms. But the negative publicity of such a failure has cast a substantial shadow over the field as a whole.