now is the time for all good men...

evildave said:
For starters, is it supposedly scanning the text of several of Shakespeare's works with each random set of numbers it generates? Why does it tend to always generate lines like "Leonato. I lea" or "GLOUCESTER. N", always at the starts of lines? You'd think there would be a lot of nice, long matches like "nce of the " from the middles of lines. Watch those numbers. Even scanning only the starts of lines, you would have me believe the number of tries grows two orders of magnitude, from 10^20 to 10^22, in a few minutes? In a Java applet?

No, it doesn't claim to do this. FAQ:
Why is it only the first few letters that count?
The rules could be different so that matches anywhere on a page count. In fact there are many possible variations of the rules, and there are bound to be disagreements on which rule is chosen. In any case, if a whole page is going to match, then the first few letters will have to match too.

Is this a 'real' simulation or just a cheat?
The simulation is based on a random number generator to generate random keystrokes. The simulator does not simulate every level of detail because today's computers are just far too slow, but the probabilities are designed to accurately match those of real life and with the correct element of chance. Just like real life, the results of this simulator cannot be predicted even though the probabilities can.

Will the monkeys ever succeed?
Due to the accelerated time and an unlimited supply of bananas, the monkey population in every simulator doubles every few days! So bookmark this page and come back now and again to check how other people's monkeys are doing and to put your own monkeys to work.

So, I guess it's a simulation: every second it simulates only the best pages that the (now) 1.6e22 monkeys would typically have typed on a given day, based on the mathematical probabilities.
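To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes a 27-key typewriter (A-Z plus space) with every key equally likely, which is my assumption and not something the FAQ states; the function names are made up for illustration.

```python
import math

def expected_tries(prefix_len, alphabet_size=27):
    """Average number of random attempts needed to type one specific
    prefix of prefix_len characters (mean of a geometric distribution)."""
    return alphabet_size ** prefix_len

def matchable_prefix(tries, alphabet_size=27):
    """Prefix length at which the expected number of tries equals `tries`."""
    return math.log(tries) / math.log(alphabet_size)

if __name__ == "__main__":
    print(expected_tries(13))          # tries for a 13-character prefix
    print(matchable_prefix(1.6e22))    # prefix length reachable with 1.6e22 tries
```

Under those assumptions each extra matched character costs another factor of 27 in tries, so a doubling monkey population only buys a new character every so often.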
 
Evil Dave pretty much made my head hurt, and we went from an infinite number of monkeys banging on keyboards to a 32 bit seed generator..

I am aware of the limitations of random number generators, but doesn't this exercise have more to do with the fact that there are only 26 letters in the alphabet (27 counting the space), and randomly picking one of those characters..


Why does the generator need to produce more than 27 different values, i.e. only five bits..


It would seem you could increase the randomness by having multiples of five bits, and randomly select one of the five bit groups, as you randomly sequenced the entire set...
 
evildave said:
And my simple point has been it's not bloody likely from a fast, simple runtime library polynomial generator to produce much text (or particular text) from a seed.

32 bits of internal state is not enough.

This is not a problem on most Unix machines. Their (most current) PRNGs do not use linear combination or polynomial generation. /dev/random on Linux uses network traffic timings (pretty random), deviations in microscopic mouse movements (pretty random), and timer interrupts + processor ticks (not random, but not predictable, either) to form an entropy pool from which it draws its random numbers. I am pretty sure /dev/urandom is statistically indistinguishable from a real RNG, and it is very fast. Note, however, that these are not library calls. Library calls should never be used for anything that needs a statistically significant amount of randomness. Other operating systems might have other things, but Linux pretty much takes its behaviour from a Unix consensus, so I would be willing to bet the other Unices are at least as strong.

Here is the source for the kernel if you are interested.

Experiment: Look at your network traffic lights and mouse position, then "cat /dev/random" on a Linux box. You will notice that after a while, it will stop producing random data. Move the mouse or watch the lights blink... and suddenly some more will pop out. It's actually smart enough to know when it does not have sufficient entropy.
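For the monkey exercise itself, a rough sketch of drawing letters from the operating system's entropy source rather than a library PRNG might look like the following. It uses Python's os.urandom; the 27-symbol alphabet and the function name are assumptions for illustration. Five bits give 32 values, so values 27 to 31 are thrown away to keep the 27 symbols equally likely.

```python
import os

ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # assumed keyboard: space plus A-Z

def random_letters(count):
    """Return `count` uniformly chosen symbols using bytes from os.urandom."""
    out = []
    while len(out) < count:
        for byte in os.urandom(64):
            value = byte & 0b11111       # keep five bits: 0..31
            if value < len(ALPHABET):    # reject 27..31 to avoid bias
                out.append(ALPHABET[value])
                if len(out) == count:
                    break
    return "".join(out)

if __name__ == "__main__":
    print(random_letters(40))
```

Taking the value modulo 27 instead of rejecting would be simpler, but it would make the first five symbols slightly more likely than the rest.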

While your point is well made, it is not much of an issue in modern computing. I think there is even a site that auto-generates random numbers for you using a Geiger counter and a radioactive isotope. You can't get much more random than that.
 
Diogenes said:
Evil Dave pretty much made my head hurt, and we went from an infinite number of monkeys banging on keyboards to a 32 bit seed generator..

I am aware of the limitations of random number generators, but doesn't this exercise have more to do with the fact that there are only 26 letters in the alphabet (27 counting the space), and randomly picking one of those characters..


Why does the generator need to produce more than 27 different values, i.e. only five bits..


It would seem you could increase the randomness by having multiples of five bits, and randomly select one of the five bit groups, as you randomly sequenced the entire set...

27 different values take log2(27) bits each, a little under five. You can pack the data into 216-bit 'bignum' packets by multiplying and adding (and decode by dividing and taking the modulus).

With space = 0 and A = 1 through Z = 26:

"HELLO " = (8*27^0) + (5*27^1) + (12*27^2) + (12*27^3) + (15*27^4) + (0*27^5)....

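A minimal sketch of that multiply/add packing and divide/modulus decoding, assuming base 27 with space = 0 and A = 1 through Z = 26 (the names pack and unpack are mine, and Python's arbitrary-size integers stand in for the 'bignum'):

```python
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # index 0 is space, A=1 ... Z=26

def pack(text):
    """Encode text as one integer, first character in the lowest base-27 digit."""
    n = 0
    for i, ch in enumerate(text):
        n += ALPHABET.index(ch) * 27 ** i
    return n

def unpack(n, length):
    """Decode `length` characters by repeatedly dividing and taking the modulus."""
    chars = []
    for _ in range(length):
        n, digit = divmod(n, 27)
        chars.append(ALPHABET[digit])
    return "".join(chars)

if __name__ == "__main__":
    packed = pack("HELLO ")
    print(packed, repr(unpack(packed, 6)))
```

The length has to be carried separately, because a trailing space is digit 0 and would otherwise vanish from the number.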

The problem is that you can't guarantee the random number generator will create enough of the necessary range of random numbers. The example 1.010010001... is emblematic of this problem. If the random number generator always generates "different" output, but there is a pattern to it (and there absolutely is a pattern in every software-based random number generator), then what it's going to do is repeat that pattern and never produce anything more than snippets... unless that pattern is literally devised to produce your quote. A more direct model would be "ZXCWADZXDWADZXEWAD", except that the repeated character sequence may run thousands, even millions of characters of 'seemingly' random noise, all the while it's only tweaking subsets of the numbers and repeating others, or any of a fabulous variety of defects in output that render them handy for a quick game of solitaire, but utterly useless for the purpose of true randomness.
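To make the pattern visible, here is a toy sketch: a linear congruential generator shrunk to 16 bits of state so the whole cycle can be walked. The constants are an illustrative choice of mine, not taken from any particular runtime library, and real library generators differ in size and detail.

```python
def lcg(state, a=25173, c=13849, m=2**16):
    """One step of a toy 16-bit linear congruential generator."""
    return (a * state + c) % m

def period(seed):
    """Count the steps until the generator's state sequence starts repeating."""
    seen = {}
    state, step = seed, 0
    while state not in seen:
        seen[state] = step
        state = lcg(state)
        step += 1
    return step - seen[state]

def low_bits(seed, count):
    """Lowest bit of each of the next `count` states."""
    bits = []
    state = seed
    for _ in range(count):
        state = lcg(state)
        bits.append(state & 1)
    return bits

if __name__ == "__main__":
    print(period(1))        # cycle length, which cannot exceed 2**16 states
    print(low_bits(1, 12))  # the lowest bit follows a very short pattern
```

With 16 bits of state the output can never run longer than 65536 values before looping, and the same argument caps a 32-bit generator at 2^32. The lowest bit is an example of a subset of the number that repeats far sooner than the whole.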

If you really want truer random number generation, you will need a separate hardware solution to it. Perhaps Something sillier. (The individual mice may have to be well stimulated by external things as well...) Or something more practical, like video capturing a forest on a windy day, or recording a waterfall and keeping only the noisy bits of color/sample information, taking care not to 'capture' artifacts in your sampling methods, that is.
 
Gestahl said:


While your point is well made, it is not much of an issue in modern computing. I think there is even a site that auto-generates random numbers for you using a Geiger counter and a radioactive isotope. You can't get much more random than that.

Until someone solves the 'hidden variable' problem, anyway.
 
The New Problem

There is another problem with using a 27 character vocabulary, or (even worse) ASCII as your source.

Even a well balanced and "perfect" random number generator is vanishingly less likely to produce legible text using the aforementioned straight tables.

There are only five (or six with Y) vowels and 21 (or 20) consonants, yet vowels are needed with equal or even greater frequency than most of the consonants. Rarely used consonants, like 'Z' or 'X', are as likely to appear in a random word as 'E'. That's a big handicap. Space (word break) is always needed, yet there is only a 1/27 chance one will appear. You would need to weight the selection of characters to match how often characters are actually used.
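One way to balance the selection, sketched under the assumption that the weights are simply counted from whatever sample text you hand it (no published frequency table is used, and the function names are hypothetical):

```python
import random
from collections import Counter

def letter_weights(sample):
    """Count how often each symbol (A-Z and space) occurs in the sample text."""
    return Counter(ch for ch in sample.upper() if ch == " " or "A" <= ch <= "Z")

def weighted_typing(sample, length, rng=random):
    """Type `length` random symbols, each chosen in proportion to its count in `sample`."""
    counts = letter_weights(sample)
    symbols = list(counts)
    weights = [counts[s] for s in symbols]
    return "".join(rng.choices(symbols, weights=weights, k=length))

if __name__ == "__main__":
    text = "now is the time for all good men to come to the aid of the party"
    print(weighted_typing(text, 60))
```

Symbols that never occur in the sample get zero weight here, so a short sample will leave letters out entirely; a longer sample text gives a fairer table.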

Of course, extending it to previous notions about dictionary lookups, you would need to run the dictionary through a lookup that approximates the English word-frequency tables to get the right distribution of words to make sentence-like text, if you use words instead of character glyphs. Conjunctions are more likely to appear in sentences than any given proper noun, for instance. Even with the word frequency included, the grammar of the words will seldom be correct.
 
