I was tinkering this past weekend and helped write some Perl code to investigate a question that interests me, one that combines probability, statistics, and skepticism.
Since some people see random bundles of words from the Bible (selected through an equidistant letter skip, i.e., a systematic sample) as meaningful, I thought I'd investigate how easy it is to get meaningful words from random letters in general.
I created n 'words' of length k, where k varied from 2 to 25 (the minimum and maximum word lengths in my dictionary) and each letter in the 'word' was selected randomly from the letters a through z. Then I checked whether each of these 'words' was in the dictionary.
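The procedure above can be sketched in a few lines. This is a rough Python illustration, not the original Perl code: the names `random_word` and `hit_rate` are made up for the example, the tiny `toy_dictionary` stands in for the real word list, and I am assuming each length from 2 to 25 is equally likely, which the description above does not spell out.

```python
import random
import string


def random_word(rng, min_len=2, max_len=25):
    """Build one 'word' of random length from uniformly random letters a-z.

    Assumption: the length k is drawn uniformly from min_len..max_len.
    """
    k = rng.randint(min_len, max_len)  # both endpoints are included
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(k))


def hit_rate(dictionary, n, rng, min_len=2, max_len=25):
    """Return the fraction of n random 'words' that appear in the dictionary."""
    hits = sum(
        random_word(rng, min_len, max_len) in dictionary for _ in range(n)
    )
    return hits / n


if __name__ == "__main__":
    # Hypothetical stand-in for a real dictionary; a real run would load
    # a full word list into a set so lookups stay fast.
    toy_dictionary = {"an", "at", "be", "cat", "dog", "it", "of", "to"}
    rng = random.Random(0)  # fixed seed so repeated runs match
    for n in (100, 1000, 10000):
        print(n, hit_rate(toy_dictionary, n, rng))
```

Storing the dictionary in a set keeps each lookup at roughly constant cost, which matters once n gets into the millions.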
I ran trials with n = 100, 1000, 10000, 100000, 1000000, and 10000000 random words of random lengths k. On average, for n words created randomly, about 0.12*n of them turned out to be real dictionary words.
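For comparison, the hit rate can also be written down directly. This is only a sketch under assumptions of mine: that the length k is equally likely to be any of the 24 values from 2 to 25, that each letter is uniform over a through z, and that D_k (a symbol I am introducing) is the number of distinct dictionary words of length k.

```latex
p \;=\; \sum_{k=2}^{25} P(K = k)\,\frac{D_k}{26^{k}}
  \;=\; \frac{1}{24}\sum_{k=2}^{25}\frac{D_k}{26^{k}}
```

The expected number of dictionary words among n random 'words' is then n*p, so the observed fraction is an estimate of p. Because 26^k grows so quickly, nearly all of p comes from the shortest word lengths.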
That seems pretty significant to me. If meaningful words can be created from obvious randomness without employing a systematic sample, then how much easier must it be to find meaning in text that is not random when a systematic sample is employed?