
Statistics puzzle that's driving me nuts

PiedPiper

Guys, I'd love some help with this math problem that's been bouncing around in my head for a while.

I've posted something similar to this before, and I never really got an answer that I could understand and made sense. It's something that's been bothering me for years, and it's so counter-intuitive that it drives me crazy. If someone with knowledge of statistics says "give up, just accept it, that's the way it is" - that's fine, I can do that. But if there's something deeper here, I'd love to hear about it.

Ok, so here we go.

This statistics problem takes the form of a game. You have a bag with ten marbles. The marbles can either be white or black, but there has to be ten total, and at least one white and at least one black. The way the game goes is this:

You put a combination of ten marbles into the bag, and draw one out. If it's white, the game is over. If it's black, you draw again, and keep drawing one at a time until you get a white one. That then ends the game.

For example, say there's 9 white marbles and 1 black marble. The chance of pulling a white marble on the first draw (ending the game outright) is 9/10, or 90%. The chance of pulling the black marble and then a white marble (to finish the game) is (1/10)*(9/9) = 10%.

Pretty simple stuff. Let's look at an example where there are 8 white marbles and 2 black marbles.

Chance of pulling white marble straight off: 8/10.
Chance of pulling one black, then a white: (2/10)*(8/9)=0.178
Chance of pulling two blacks, then a white: (2/10)*(1/9)*(8/8)=0.022

This makes sense to me, logically. It's harder to draw two blacks then white, than it is to draw one black then white. Once you have one black marble out of the bag, you have to avoid all of those 8 white marbles floating around and find the one black marble. That's hard to do, avoiding all those white marbles.

I won't go through all the math because it's redundant, but this theme plays out as you go through 3 black / 7 white, 4 black / 6 white, 5 black / 5 white, 6 black / 4 white, 7 black / 3 white, and 8 black / 2 white. It becomes harder and harder to get all the black marbles out of the bag (avoiding the white ones, which would end the game) before finally hitting white. Hence the decreasing odds for emptying the bag of black marbles before hitting a white one.

This makes perfectly good sense to me, and it's backed up by the math if you'd care to draw it out, like I have above. It becomes *very* difficult to avoid hitting those white marbles until all the black ones are out.
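For anyone who wants to check the other ratios without doing the multiplication by hand, here is a minimal sketch in Python. The function name first_white_distribution is just something made up for this illustration; it multiplies the draw-by-draw fractions exactly as in the examples above.

```python
from fractions import Fraction

def first_white_distribution(black, white):
    """Exact chances that the first white marble shows up on draw 1, 2, 3, ..."""
    probs = []
    total = black + white
    survive = Fraction(1)               # chance that every draw so far was black
    for drawn in range(black + 1):      # 'drawn' = blacks already out of the bag
        remaining = total - drawn
        probs.append(survive * Fraction(white, remaining))
        survive *= Fraction(black - drawn, remaining)
    return probs

# 2 black / 8 white, the example worked through above
for draw, p in enumerate(first_white_distribution(2, 8), start=1):
    print(draw, p, round(float(p), 3))
```

Calling it with (3, 7), (4, 6) and so on gives the other cases; each list adds up to exactly 1.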

However, something very strange happens when there are 9 black marbles and 1 white marble. Before we draw out the math, think about it: each time, we have to avoid hitting the white marble which becomes more and more difficult as the game continues. But let's look at the math:

Chance of hitting white straight away: 1/10 = 0.1.
Chance of hitting 1 black, then the white marble: (9/10)*(1/9)= 0.1.
Chance of hitting 2 blacks, then the white marble: (9/10)*(8/9)*(1/8)=0.1
......
......
Chance of hitting 5 blacks, then the white marble:
(9/10)*(8/9)*(7/8)*(6/7)*(5/6)*(1/5). As you can probably see, everything cancels, and we're left with odds of 1/10, or 0.1.

Taking it to the far extreme: chance of hitting all 9 blacks, missing the white marble *every single draw* - which in other starting marble ratios, was very difficult to do - before finally there's only one marble left in the bag, the white one, which is pulled out to end the game.

(9/10)*(8/9)*(7/8)*(6/7)*(5/6)*(4/5)*(3/4)*(2/3)*(1/2)*(1/1) = 1/10, or 0.1.

So you have an equal chance of pulling the one white marble out of ten in the bag straight off (10%) as you do pulling 9 black marbles out one at a time, missing the white marble each time - which for previous ratios, was shown to be very difficult! - and the chance is still 10%.
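The cancelling can be checked mechanically. This is only a sketch, and chance_white_after is a name invented for it; it builds the same product of fractions for every possible number of black marbles drawn first.

```python
from fractions import Fraction

def chance_white_after(k, black=9):
    """Chance of drawing exactly k blacks and then the single white marble."""
    total = black + 1                            # one white marble in the bag
    p = Fraction(1)
    for i in range(k):
        p *= Fraction(black - i, total - i)      # another black comes out
    return p * Fraction(1, total - k)            # then the white one

print([chance_white_after(k) for k in range(10)])
```

Every entry in the printed list is the fraction 1/10, from k = 0 (white straight away) to k = 9 (white last).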

I just can't wrap my head around this. I'm not stupid, I consider myself a learned man, but this just has me stumped.

Anyone out there want to put me out of my misery? :)
 
That's just wrong. You are conflating many outcomes with a single outcome. You are in effect claiming that there is only a 10% chance that the last marble will, in fact, be a marble at all. That alone should be sufficient for you to identify your error.
 
However, something very strange happens when there are 9 black marbles and 1 white marble.
...........................
(strange stuff happening)
...........................
So you have an equal chance of pulling the one white marble out of ten in the bag straight off (10%) as you do pulling 9 black marbles out one at a time, missing the white marble each time - which for previous ratios, was shown to be very difficult! - and the chance is still 10%.

I just can't wrap my head around this. I'm not stupid, I consider myself a learned man, but this just has me stumped.

Anyone out there want to put me out of my misery? :)
It is an interesting result but one that is not confined to just 10 marbles. A bit of algebra might help:

Suppose there are N+1 marbles, of which N are black and 1 is white, and we want to know the probability of drawing K black marbles followed by the white marble (where K <= N, obviously). Then the calculation is

N/(N+1) x (N-1)/N x (N-2)/(N-1) x ... x (N-K+1)/(N-K+2) x 1/(N-K+1) = 1/(N+1)

When you cancel out all of the intermediate terms, you find that the result is independent of K. Of course, if you had more than 1 white marble then the intermediate terms wouldn't cancel out so neatly and the answer would depend on K.
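Written out in product notation, the cancellation looks like this. The index i is introduced here only for bookkeeping; N and K are as defined above.

```latex
\[
\underbrace{\prod_{i=0}^{K-1} \frac{N-i}{N+1-i}}_{\text{$K$ blacks in a row}}
\cdot \underbrace{\frac{1}{N-K+1}}_{\text{then the white}}
= \frac{N-K+1}{N+1} \cdot \frac{1}{N-K+1}
= \frac{1}{N+1}
\]
```

The product telescopes: each numerator cancels the denominator of the next factor, leaving (N-K+1)/(N+1), and the final 1/(N-K+1) removes the only remaining trace of K.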
 
It is an interesting result but one that is not confined to just 10 marbles. A bit of algebra might help:

Suppose there are N+1 marbles, of which N are black and 1 is white, and we want to know the probability of drawing K black marbles followed by the white marble (where K <= N, obviously). Then the calculation is

N/(N+1) x (N-1)/N x (N-2)/(N-1) x ... x (N-K+1)/(N-K+2) x 1/(N-K+1) = 1/(N+1)

When you cancel out all of the intermediate terms, you find that the result is independent of K. Of course, if you had more than 1 white marble then the intermediate terms wouldn't cancel out so neatly and the answer would depend on K.
While that's a well-intentioned response, my feeling is that the OP's problem is far more fundamental. I could be right, I could be wrong.
 
There's a standard trick for thinking about questions like this: even though your rule says stop when you first get a white marble, you can imagine that you keep going, but remember that the first white one was the point at which you were supposed to stop. In other words, you just lay the 10 marbles out in a line in a random order, and then the question is where is the *first* white marble.

Now with only 1 white this is just the same as the question where is the white marble, and it's clear that it is equally likely to be in any of the 10 positions.

With more than 1, the first one is more likely to be in an earlier position: to have the first white marble in position x you need a white marble there (equally likely in any position) plus that all the other white marbles come between x+1 and n. There are fewer and fewer ways of meeting the second condition as x increases.

With only one white there aren't any other white marbles in the second part of the argument above.
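The "lay them out in a line" trick is easy to try by simulation. The following is a rough sketch, not a proof: simulate_first_white is a made-up name, and the trial count and seed are arbitrary choices.

```python
import random
from collections import Counter

def simulate_first_white(black, white, trials=100_000, seed=0):
    """Shuffle the whole bag into a line and record where the first white lands."""
    rng = random.Random(seed)           # fixed seed so repeated runs agree
    marbles = ["B"] * black + ["W"] * white
    counts = Counter()
    for _ in range(trials):
        rng.shuffle(marbles)
        counts[marbles.index("W") + 1] += 1     # 1-based position of first white
    return {pos: counts[pos] / trials for pos in range(1, black + 2)}

print(simulate_first_white(9, 1))   # one white: roughly flat
print(simulate_first_white(8, 2))   # two whites: earlier positions more likely
```

With one white marble the frequencies should all hover around 0.1; with two they should fall off as the position increases.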
 
In other words: you lay out your ten marbles in a line. One is white, the others are black. What are the chances that the white marble is in position 1? 0.1. What are the chances that the white marble is in position 2? 0.1. And so on. Picking up your marbles from left to right doesn't change this.
 
You can think of it as fish in a pool. You either catch a fish when you draw a marble, or you reduce the size of the pool, making it more likely you'll catch a fish (and stop) on the next cycle.

When there is a single fish the same thing happens, but the pool starts bigger (relative to the fish). If you've drawn 5 out without a hit, the pool is smaller and the odds of getting a white on the next pull are now 1 in 5, not 1 in 10.

In the case where you've already withdrawn 5 blacks and there is a single white remaining with four blacks, you have the same situation as when you have two whites and eight blacks. One white swimming in a 5-marble pool. So, when the whites are increased, it's as if you are starting later in the one-white-marble problem. And we know the chances go up in that problem as the pool gets smaller.
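One way to see the shrinking pool in numbers is to split each draw into two parts: the chance of getting that far at all, and the chance of a hit once you are there. This is a sketch under that reading of the fish analogy; pool_table and its column names are invented for the illustration.

```python
from fractions import Fraction

def pool_table(black, white):
    """Rows of (draw number, pool size, chance of reaching this draw,
    chance of white on this draw once reached, product of the two)."""
    rows = []
    total = black + white
    reach = Fraction(1)                 # chance that all earlier draws were black
    for drawn in range(black + 1):
        pool = total - drawn            # marbles still in the bag
        hit = Fraction(white, pool)     # chance of white on this draw
        rows.append((drawn + 1, pool, reach, hit, reach * hit))
        reach *= 1 - hit                # missed again, the pool shrinks
    return rows

for row in pool_table(9, 1):
    print(*row)
```

With a single white marble, the chance of reaching a draw falls exactly as fast as the chance of a hit rises, so the last column stays at 1/10 in every row.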
 
In other words: you lay out your ten marbles in a line. One is white, the others are black. What are the chances that the white marble is in position 1? 0.1. What are the chances that the white marble is in position 2? 0.1. And so on. Picking up your marbles from left to right doesn't change this.

That's pretty good.

Unsure if it will sink in with the OP.
 
However, something very strange happens when there are 9 black marbles and 1 white marble.

Chance of hitting 5 blacks, then the white marble:
(9/10)*(8/9)*(7/8)*(6/7)*(5/6)*(1/5). As you can probably see, everything cancels, and we're left with odds of 1/10, or 0.1.

Try doing the same calculation when you have 8 black and 10 black and see what happens...

8 Black: (8/9)*(7/8)*(6/7)*(5/6)*(4/5)*(1/4) = 1/9
9 Black: (9/10)*(8/9)*(7/8)*(6/7)*(5/6)*(1/5) = 1/10
10 Black: (10/11)*(9/10)*(8/9)*(7/8)*(6/7)*(1/6) = 1/11

Why does it only confuse you when you start with 9?

Taking it to the far extreme: chance of hitting all 9 blacks, missing the white marble *every single draw* - which in other starting marble ratios, was very difficult to do - before finally there's only one marble left in the bag, the white one, which is pulled out to end the game.

(9/10)*(8/9)*(7/8)*(6/7)*(5/6)*(4/5)*(3/4)*(2/3)*(1/2)*(1/1) = 1/10, or 0.1.


Once again, let's try this with 8 and 10 black marbles instead...

(8/9)*(7/8)*(6/7)*(5/6)*(4/5)*(3/4)*(2/3)*(1/2)*(1/1) = 1/9
(9/10)*(8/9)*(7/8)*(6/7)*(5/6)*(4/5)*(3/4)*(2/3)*(1/2)*(1/1) = 1/10
(10/11)*(9/10)*(8/9)*(7/8)*(6/7)*(5/6)*(4/5)*(3/4)*(2/3)*(1/2)*(1/1) = 1/11


I don't see why you find the outcome when starting with 9 black marbles particularly surprising.

I just can't wrap my head around this. I'm not stupid, I consider myself a learned man, but this just has me stumped.

What exactly has you stumped? Why are you confused about this outcome occurring with 9 black marbles when you get an equivalent outcome no matter how many marbles you start with?
 
Thanks psion :) That piece of algebra actually did help me quite a lot. I never thought of laying it out like that. Like you say, it's an interesting result, but easily explainable and I "get it" now.

Even though I have a Ph.D. in Organic Chem, that discipline is unfortunately very soft in math. I'm thinking of going back and getting my B.S. in Math from the local university, just for my own sense of accomplishment.

Thanks to everyone who contributed towards my finally grasping this problem :)
 
Why does it only confuse you when you start with 9?

It wasn't 9 black marbles specifically that was confusing, it was that the probability of drawing any number of black marbles before running into the white marble was equal. That is, that it was no less likely that the player would draw the white marble on the first try than on the 10th. That is not the case for 2 or more white marbles out of 10.

Had the OP started with 8 black and 1 white there would have been a similar result and I suspect that that would have been just as confusing, but 10 total marbles was chosen, presumably because it's a nice, round number.
 
I just can't wrap my head around this. I'm not stupid, I consider myself a learned man, but this just has me stumped.

Anyone out there want to put me out of my misery? :)


There are only 10 possible combinations of 9 black marbles and 1 white marble:

WBBBBBBBBB
BWBBBBBBBB
BBWBBBBBBB
BBBWBBBBBB
BBBBWBBBBB
BBBBBWBBBB
BBBBBBWBBB
BBBBBBBWBB
BBBBBBBBWB
BBBBBBBBBW

Each of the combinations is equally likely. So, there's a 10% chance of the white marble being in any given position.

Where you're killing yourself is in thinking there's any difference between one black marble and the next. There's not. If you had 10 different color marbles, the odds of any given order would be 1:10! (1:3,628,800), but the chance of the white one being in any given spot would still be 10%, because there are only 10 places the white one can go.
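The same counting can be done by brute force on a smaller bag. The sketch below uses 5 marbles instead of 10 purely to keep the run short (5! = 120 orders instead of 3,628,800); the colour names and function names are arbitrary.

```python
from itertools import permutations
from collections import Counter

def white_position_counts(marbles):
    """Count, over every possible draw order, where the white marble ends up."""
    return Counter(order.index("white") + 1 for order in permutations(marbles))

# Five marbles, all different colours: 120 distinct orders
counts = white_position_counts(["white", "red", "green", "blue", "yellow"])
print(counts)                                   # white lands in each spot 24 times

# One white and four indistinguishable blacks: only 5 distinct orders
print(sorted(set(permutations("WBBBB")), reverse=True))
```

Whether or not the other marbles can be told apart, the white one lands in each position in the same share of the orders.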
 
<respectful snip>

I just can't wrap my head around this. I'm not stupid, I consider myself a learned man, but this just has me stumped.

Anyone out there want to put me out of my misery? :)
It's an interesting observation, and I'll try to put in my two cents as well. Basically, you're asking: why are those probabilities decreasing when there are two or more white marbles, but remain equal when there is only one?

First a remark on terminology: this is a probability theory problem, not a statistics problem. Probability theory concerns itself with the outcomes of experiments with known underlying probabilities. Statistics is (mathematically rigorously) guessing at an unknown underlying probability from a series of experiments.

The basics of probability theory is: counting. For proper counting here, you'll have to take into account all marbles, also the ones remaining in the bag. You have to virtually extend your drawing procedure to draw all marbles from the bag. In order to know what the probability is that the first white marble is drawn as second, you'll have to count
(1) how many series of B/W marbles there are with the first white marble from the left in second position;
(2) how many series of B/W marbles there are in all.
(both of course with the same number of black and white marbles). Basically, the same approach LossLeader showed in his post.

And for contrast, here are all the combinations possible with 2 white and 3 black marbles (I limit the number of marbles to 5 to keep the number of combinations down):

W | W B B B
W | B W B B
W | B B W B
W | B B B W

B W | W B B
B W | B W B
B W | B B W

B B W | W B
B B W | B W

B B B W | W

I've marked the "tails" in the drawing procedure with a "|": everything to the right of it is the part you wouldn't have drawn anymore. And as you can see, these tails contain both black and white marbles, which can occur in various different orders. In the case of only one white marble, however, the tail only contains black marbles and thus, there is only one possible order for each position of the (first and only) white marble.
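The listing above can be generated and tallied automatically. A small sketch, with arrangements as an invented helper name: it picks which positions hold the white marbles and groups the resulting rows by where the first white one sits.

```python
from itertools import combinations
from collections import Counter

def arrangements(n, w):
    """Yield every row of n marbles containing exactly w white ones."""
    for white_positions in combinations(range(n), w):
        yield "".join("W" if i in white_positions else "B" for i in range(n))

rows = list(arrangements(5, 2))
by_first_white = Counter(row.index("W") + 1 for row in rows)

print(rows)
print(dict(by_first_white))     # how many rows have the first white in each position
```

For 2 white and 3 black the tally is 4, 3, 2, 1 rows for positions 1 to 4, which is the decreasing pattern. Running arrangements(5, 1) instead gives exactly one row per position.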

Okay, let's formalize it a bit with the help of binomial coefficients and Pascal's triangle. And let's introduce a number of symbols:
n: the total number of marbles
w: the number of white marbles
X: the random variable which "measures" the position of the first white marble drawn

In order to calculate the probability distribution of X, all we have to do is
(1) count all possible orders of drawing as listed above;
(2) count those orders where X=k (for each k)
The probability of X=k is then simply the ratio of those two numbers.

Counting (1) is easy: it's the question "in how many ways can I pick w items from n", which is the very definition of the binomial coefficient. So it's C(n, w).

Counting (2) is only slightly more difficult. All the orderings where the first white marble occurs in position k, have the "head" fixed, consisting of k-1 black and then 1 white marble. That leaves us a "tail" of length n-k, with w-1 white marbles in it. So, the number of orderings with the first white marble in position k is: C(n-k, w-1). Thus, we get:

P[X=k] = C(n-k, w-1) / C(n, w)

In terms of Pascal's triangle - which is just a visualization of the binomial coefficients - these numerators are travelling along a path parallel to the left edge, and w-1 positions away from that left edge. You can see that those numbers are decreasing when you go up, and you can easily convince yourself of that fact when you look at the definition of the binomial coefficient.

The only case they're not strictly decreasing, is when you travel along the edge of Pascal's triangle, i.e., when w-1 = 0 or, in other words, when w=1, you only have one white marble. Then there is no choice left how the tail could be drawn.
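The formula P[X=k] = C(n-k, w-1) / C(n, w) is short enough to evaluate directly. A minimal sketch using Python's math.comb for the binomial coefficients; first_white_pmf is a name chosen for this example only.

```python
from fractions import Fraction
from math import comb

def first_white_pmf(n, w):
    """P[X=k] = C(n-k, w-1) / C(n, w) for k = 1 .. n-w+1, as exact fractions."""
    total = comb(n, w)                  # all ways to place w whites among n spots
    return [Fraction(comb(n - k, w - 1), total) for k in range(1, n - w + 2)]

print(first_white_pmf(10, 8))   # 8 white, 2 black
print(first_white_pmf(10, 1))   # 1 white, 9 black
print(first_white_pmf(5, 2))    # the 5-marble listing above
```

For n=10, w=8 this reproduces 8/10, 0.178 and 0.022 from the start of the thread, and for w=1 the numerator C(n-k, 0) is always 1, so every position gets 1/n.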
 
