Lotto: Statistics question

The two aren't the same, certainly. I'd say I have a good understanding of math but not statistics. But I'd also say I have a good understanding of probability.

No, they're not the same, insofar as statistics is a subset of mathematics. I'd be interested to know which mathematics course would not contain statistics - certainly in the UK, statistics forms an important part of key stage 3, GCSE, A-level and degree maths.

cflarsen said:
Are you saying that you only need math classes, and not statistics classes?

I'm saying that for you to say that you're good at maths but not good at statistics seems a little odd. Maths classes incorporate statistics in the UK - perhaps Denmark does things differently.
 
Depends on what level of math. Here, math classes for e.g. high school also incorporate statistics, but not a lot.
 
I've been away and this thread has gotten out of hand, but Claus, the question you asked me referred to a lotto where the drawing, not the RNG, is biased. Here's the simplest way I know of thinking about it.

Forget the lotto balls altogether and just number every single combination.

1 2 3 4 5 6 7 is 1
1 2 3 4 5 6 8 is 2
...
30 31 32 33 34 35 36 is 8,347,680 (the number of ways to choose 7 numbers from 36).

Imagine the drawing is done by pulling poker chips out of a bag. The normal lotto drawing works just the same as drawing from a bag of 8,347,680 poker chips. This is hard for people to understand, because it seems that 1 2 3 4 5 6 7 is somehow close to 1 2 3 4 5 6 8.
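As a rough check on the numbering idea, here is a minimal Python sketch. It assumes a 7-of-36 game (an assumption based on the last combination listed above, 30 through 36); the names `BALLS`, `PICKS` and `first_chips` are made up for illustration.

```python
from itertools import combinations, islice
from math import comb

BALLS = 36  # assumption: numbers run from 1 to 36
PICKS = 7   # assumption: seven numbers per combination

# Total number of poker chips needed, one per combination
total = comb(BALLS, PICKS)

# itertools.combinations yields tuples in lexicographic order,
# so the running index plays the role of the chip number.
first_chips = list(islice(combinations(range(1, BALLS + 1), PICKS), 3))
for chip_number, combo in enumerate(first_chips, start=1):
    print(chip_number, combo)

print("chips in the bag:", total)
```

Under those assumptions the bag holds 8,347,680 chips, and neighbouring chip numbers are no "closer" to each other in the drawing than any other pair.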

If the balls are biased for some reason then we can simulate this by putting extra poker chips for the preferred combinations, or by arranging to have certain chips drawn more often.

So let's look at the very simple case of 5 poker chips numbered 1-5. In the original case of a bogus RNG, we imagine that the RNG generates each of 2, 3, 4 and 5 twice for every time it generates 1. But the drawing is fair.

So there are 5 cases:

1 - generated 1 in nine times
2 - generated 2 in nine times
3 - 2 in 9
4 - 2 in 9
5 - 2 in 9

What are your chances of winning with 1? Simply 1 in 5. Once you have generated the one, the drawing will still match one time of 5.
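If the argument feels too quick, a small simulation can back it up. This is only a sketch: the function name `simulate`, the seed and the trial count are arbitrary choices, and the weights 1:2:2:2:2 are the biased RNG described above.

```python
import random

def simulate(trials, seed=12345):
    """Biased ticket RNG, fair drawing: return the observed win rate."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    chips = [1, 2, 3, 4, 5]
    ticket_weights = [1, 2, 2, 2, 2]   # RNG produces 1 half as often as the rest
    wins = 0
    for _ in range(trials):
        ticket = rng.choices(chips, weights=ticket_weights)[0]  # biased pick
        drawn = rng.choice(chips)                               # fair drawing
        if ticket == drawn:
            wins += 1
    return wins / trials

rate = simulate(200_000)
print(rate)  # should land close to 0.2
```

The observed rate should hover around 1 in 5 whatever weights you give the ticket RNG, as long as the drawing itself stays fair.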

What about the inverse case where the RNG is fair, but the drawing is not?

The Fair RNG looks like this:

1 - generated 2 in ten times
2 - generated 2 in ten times
3 - 2 in 10
4 - 2 in 10
5 - 2 in 10

The biased drawing looks like this:

1 - drawn 1 in 9 times
2 - drawn 2 in 9 times
3 - 2 in 9
4 - 2 in 9
5 - 2 in 9

Assuming you use the now fair RNG, you will have five cases. Each case occurs equally often, one time in five.

Case 1:
Chance of occurring: 1 in 5
Chance of winning: 1 in 9

Case 2:
Chance of occurring: 1 in 5
Chance of winning: 2 in 9

Case 3:
Chance of occurring: 1 in 5
Chance of winning: 2 in 9

Case 4:
Chance of occurring: 1 in 5
Chance of winning: 2 in 9

Case 5:
Chance of occurring: 1 in 5
Chance of winning: 2 in 9

Assuming you don't know that the drawing is biased and you use the (fair) RNG, your overall chances are:

(1/5 * 1/9) + (1/5 * 2/9) + (1/5 * 2/9) + (1/5 * 2/9) + (1/5 * 2/9) = 1/5 * 9/9 = 1/5

I think you can see that it doesn't matter that the chances of the biased drawing were 1/9, 2/9, 2/9, 2/9, 2/9, so long as they all add to one.

If you are using lotto balls rather than poker chips, the calculation of which chances apply to which seven-number combinations can become quite long. With biased balls, a combination like 1 2 3 5 6 9 might be much less common than 10 11 12 13 14 25 26. But it won't matter in the end, because all the odds will add to one and the fraction from the unbiased RNG (1/5 in this example) will factor out.

The case where both the RNG and the drawing are biased can reduce or increase your chances of winning. This is easy to see: just imagine that the RNG always generates 1 but 1 is never drawn.
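The cases above can be checked with exact fractions. This is a sketch, not anything official: `win_probability` is a hypothetical helper that sums, over the chips, the chance the RNG picks a chip times the chance the drawing produces it.

```python
from fractions import Fraction as F

def win_probability(ticket_dist, draw_dist):
    """Sum over chips of P(RNG picks chip) * P(drawing produces chip)."""
    return sum(ticket_dist[chip] * draw_dist[chip] for chip in ticket_dist)

fair = {chip: F(1, 5) for chip in range(1, 6)}
biased = {1: F(1, 9), 2: F(2, 9), 3: F(2, 9), 4: F(2, 9), 5: F(2, 9)}
always_one = {1: F(1), 2: F(0), 3: F(0), 4: F(0), 5: F(0)}
never_one = {1: F(0), 2: F(1, 4), 3: F(1, 4), 4: F(1, 4), 5: F(1, 4)}

biased_rng_fair_draw = win_probability(biased, fair)      # 1/5
fair_rng_biased_draw = win_probability(fair, biased)      # 1/5
both_biased_alike = win_probability(biased, biased)       # 17/81, above 1/5
rng_and_draw_opposed = win_probability(always_one, never_one)  # 0

print(biased_rng_fair_draw, fair_rng_biased_draw,
      both_biased_alike, rng_and_draw_opposed)
```

With only one side biased the answer stays at 1/5. When both are biased the same way it creeps up to 17/81, and when they are biased against each other it can drop all the way to zero.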
 
Here is a code snippet that could be used to draw lottery numbers, either for a retail ticket or for the official drawing. (The specific syntax probably won't run on any real system, but any programmer should be able to correct it to make it work.) This is similar to the code lottery officials pay thousands of dollars for to replace the ping-pong ball machines that they paid tens of thousands of dollars for. [Note to lottery officials: you may use my code for FREE if I get to see the final version and how it's set up.]

Code:
input "Enter date of drawing" seed_date
input "Enter time of drawing" seed_time

# seed the random number generator with the date and time
temp=random(seed_date+seed_time)

input "Enter number of balls" max_ball
input "Enter number to draw" max_draw

print " "
print "Running... count to ten then press a key to draw numbers"

# Cycle the random number generator until a key is pressed
while not key_pressed()
	temp=random()
end while

dim draw[max_draw]

for i = 1 to max_draw
	try_again:
	pick = int(max_ball * random())+1	# Uniformly distributed [1..max_ball]
	# Reject the pick if it matches a number already drawn
	for j = 1 to i-1
		if (pick == draw[j]) goto try_again
	next j
	draw[i] = pick
next i

print "The official numbers for the drawing on", seed_date, "at", seed_time, "are ... "
for i=1 to max_draw
	print draw[i]
next i

print "--END RUN--"
end
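For anyone who wants to run the drawing loop, here is a rough Python rendering of just the pick-and-reject part. It is a sketch under assumptions: `draw_numbers` is a made-up name, the random source is passed in as a parameter, and the date/time seeding and key-press loop are deliberately left out. The usage line swaps in `random.SystemRandom` (operating-system entropy), which is not what the pseudocode above does.

```python
import random

def draw_numbers(max_ball, max_draw, rng):
    """Draw max_draw distinct numbers from 1..max_ball by rejecting repeats."""
    if not 0 <= max_draw <= max_ball:
        raise ValueError("max_draw must be between 0 and max_ball")
    draw = []
    while len(draw) < max_draw:
        # rng.random() is a float in [0, 1), so pick lands in 1..max_ball
        pick = int(max_ball * rng.random()) + 1
        if pick not in draw:  # same check as the inner loop in the pseudocode
            draw.append(pick)
    return draw

# Assumption: a 7-of-36 game, drawn from the OS entropy source
numbers = draw_numbers(36, 7, random.SystemRandom())
print(sorted(numbers))
```

Passing the generator in as `rng` makes it easy to try the same loop with different random sources and compare them.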
 
But you haven't defined the function random(), which is the key point here and the one that was apparently biased in our example. Also, in your example random() seems to generate a floating point number in the interval [0,1), but the usual PRNGs return integers.
 
In addition to what Yllanes said, computers have a really hard time generating truly random numbers - they are generally considered pseudo-random. Which is how the guy in Vegas figured out the RNG for Keno.
 
If you have a random() that returns integers instead of a float just use mod instead of the multiply. The default random() in many systems may be slightly biased but that isn't its biggest problem and why I want the lottery officials to use my code. :)
 
If you have a random() that returns integers instead of a float just use mod instead of the multiply.

That's not always a good idea, sometimes the least significant digits are the less random ones.

The default random() in many systems may be slightly biased but that isn't its biggest problem and why I want the lottery officials to use my code. :)
Of course it is its biggest problem. The whole point of the program is to define the random() function. The rest is trivial.

I am a theoretical physicist and do a lot of Monte Carlo simulations. Believe me, the PRNG you use is very important. A different PRNG with the same code can make the difference between success and failure.
 
The proposed generation scheme is of course entirely unusable. In addition to any possible bias of the random() function (that wouldn't be very difficult to overcome), it has other grave problems: first, no matter how much entropy the PRNG is seeded with, the entropy used for generating the output set is limited by the PRNG's internal state size. In other words, depending on that size, some outputs may be impossible to generate at all, even if the PRNG is initially fed gigabits of entropy.
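The state-size limit can be illustrated with a toy example. Everything here is an assumption chosen to keep the numbers small: an 8-bit linear congruential generator (`lcg_next`, with multiplier 5 and increment 1) and a 4-of-20 game. No real lottery or library is being described.

```python
from math import comb

STATE_BITS = 8
MODULUS = 1 << STATE_BITS  # the toy PRNG has only 256 possible states

def lcg_next(state):
    """Toy full-period LCG; the constants are illustrative only."""
    return (5 * state + 1) % MODULUS

def draw_from_state(state, max_ball, max_draw):
    """Run the pick-and-reject drawing loop starting from a given PRNG state."""
    draw = []
    while len(draw) < max_draw:
        state = lcg_next(state)
        pick = state * max_ball // MODULUS + 1  # scale the state into 1..max_ball
        if pick not in draw:
            draw.append(pick)
    return frozenset(draw)

MAX_BALL, MAX_DRAW = 20, 4
# Try every possible starting state and collect the distinct results
reachable = {draw_from_state(s, MAX_BALL, MAX_DRAW) for s in range(MODULUS)}
possible = comb(MAX_BALL, MAX_DRAW)

print(len(reachable), "of", possible, "combinations can ever come up")
```

However the generator is seeded, it can start in at most 256 states, so at most 256 of the 4,845 combinations are reachable and the rest can never be drawn.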

Secondly, and most importantly, the amount of entropy that the PRNG is seeded with is alarmingly insufficient. Its only source is the date and time of generation and the number of loop cycles before a key is pressed. The date for a draw is usually given, so has zero entropy. The time, if counted in minutes, would always have less than 11 bits of entropy - and most likely much less than that: if the draw is always generated at the same time, for example, it would have zero entropy. I see that the time is also printed out, perhaps with the hope of being made public, for the attacker's further convenience. As for the number of loop cycles before a key is pressed, the operator is instructed to count to ten before pressing a key, apparently in order to keep the entropy as low as possible. If the operator's timing of counting to ten fluctuates in something like a half second interval, and the system can compute the random() function a million times per second, the entropy would be less than 19 bits.

As the Danish lotto would require at least 23 bits of entropy even if the PRNG was flawless, the proposed scheme would fail dramatically. Of course, I have no doubt that it was intentionally designed to contain these flaws, and no competent lottery official, by definition, would ever use something like that.
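The bit counts above are easy to reproduce. This sketch assumes a 7-of-36 game for the 23-bit figure, and otherwise uses the numbers from the post (time counted in minutes, a half-second timing window, a million random() calls per second); the variable names are made up.

```python
from math import ceil, comb, log2

# Time of day counted in minutes: at most 1440 distinct values
minutes_per_day = 24 * 60
time_bits = log2(minutes_per_day)

# Key-press timing: half a second of jitter at a million calls per second
calls_per_second = 1_000_000
timing_window_seconds = 0.5
keypress_bits = log2(calls_per_second * timing_window_seconds)

# Bits needed just to be able to select any one combination
combinations = comb(36, 7)  # assumption: 7 numbers out of 36
required_bits = ceil(log2(combinations))

print(f"time of day: {time_bits:.2f} bits")
print(f"key press:   {keypress_bits:.2f} bits")
print(f"required:    {required_bits} bits for {combinations} combinations")
```

The time of day comes out under 11 bits and the key press under 19 bits, against 23 bits needed to cover every combination.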
 
That's not always a good idea, sometimes the least significant digits are the less random ones.

You're right. It's pretty bad if the modulus is even and especially bad if the modulus is a power of 2.
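Here is a small sketch of why a power-of-2 modulus is so bad. It assumes the PRNG is a linear congruential generator whose own modulus is 2^32, using the widely quoted Numerical Recipes constants purely as an illustration; other generators will behave differently. `lcg_stream` is a made-up helper.

```python
A, C, M = 1664525, 1013904223, 2**32  # illustrative LCG constants

def lcg_stream(seed, count):
    """Return the next `count` raw 32-bit outputs of the LCG."""
    x = seed
    out = []
    for _ in range(count):
        x = (A * x + C) % M
        out.append(x)
    return out

values = lcg_stream(seed=2024, count=16)

low_bit = [v % 2 for v in values]        # "mod 2": uses only the lowest bit
low_two_bits = [v % 4 for v in values]   # "mod 4": uses only the lowest two bits
scaled = [v * 4 // M for v in values]    # multiply instead: uses the highest bits

print(low_bit)       # strictly alternates
print(low_two_bits)  # repeats with period 4
print(scaled)
```

With this kind of generator, taking the output mod 2 just alternates and mod 4 cycles through the same four values forever, while scaling by multiplication draws on the high bits, which have a much longer period.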

Of course it is its biggest problem. The whole point of the program is to define the random() function. The rest is trivial.

I could design a PRNG to meet any need. My need here is to make lots of money beating the lotteries. So I'll let the lottery officials use whatever PRNG they happen to have available (probably a 24 bit LCPRNG written by Microsoft).

A different PRNG with the same code can make the difference between success and failure.

That is true. But there are lots of lotteries. A few of them are bound to use a PRNG that will allow me to succeed.
 
