Stats Question For Stats People - Pill Bottle

Thank you all for your thoughts on this. I hope somebody can crack it fully.



Are you talking about calculating the probability for day 20 when one already knows what happened on the previous 19 days, or are you talking about calculating the probability for day 20 before the bottle is opened? I assumed, perhaps incorrectly, that the OP was referring to the latter. That would make the problem exceptionally difficult, in that on day 20 the number of half pills could be anything between 1 and 19.


The highlighted one. Before you open the bottle, what is the chance of getting half a pill on any given day?

It seems like you would have to figure out all of the possible combinations for any given day, multiply each by the chance of it occurring, and add (or aggregate) them all. How one might do that is a complete mystery to me.
 
Starting with 0 half pills on day 0, there is a 100% chance that there will be 1 half pill on day 1. On day 2 the probabilities are 99/100 = .99 for 2 half pills and 1/100 = .01 for no half pills. On day 3 the probabilities are .99*(98/100) = .9702 for 3 half pills and .99*(2/100) + .01*(99/99) = .0298 for 1 half pill.
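
For anyone who wants to carry this past day 3 without doing the tree by hand, here is a minimal sketch of the same bookkeeping in Python. The function name `half_pill_distribution` and its parameters are made up for illustration, and it assumes the rule from the thread (a whole pill drawn puts one half back, every piece is equally likely to be drawn). The figures above correspond to starting with 100 whole pills.

```python
from fractions import Fraction


def half_pill_distribution(n_whole, days):
    """Return {number_of_half_pills: probability} after `days` draws.

    Hypothetical helper: tracks every reachable (whole, half) state exactly.
    """
    dist = {(n_whole, 0): Fraction(1)}  # before any draw: only whole pills
    for _ in range(days):
        nxt = {}
        for (w, h), p in dist.items():
            total = w + h
            if w:  # whole pill drawn: one half goes back in the bottle
                key = (w - 1, h + 1)
                nxt[key] = nxt.get(key, 0) + p * Fraction(w, total)
            if h:  # half pill drawn: it is simply used up
                key = (w, h - 1)
                nxt[key] = nxt.get(key, 0) + p * Fraction(h, total)
        dist = nxt
    # Collapse the states down to the number of half pills only.
    by_halves = {}
    for (w, h), p in dist.items():
        by_halves[h] = by_halves.get(h, 0) + p
    return by_halves


if __name__ == "__main__":
    for halves, prob in sorted(half_pill_distribution(100, 3).items()):
        print(halves, float(prob))
```

Using exact fractions avoids rounding drift, at the cost of speed for large bottles.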

On average, the number of half pills will start out growing linearly (1 new half pill per day) but this trend will decay (inverse exponentially) towards a 50/50 split of whole/half pills. There will then be a spread around the median similar to a binomial distribution but more concentrated in the center.
 
But that is not true. If you start with 50 whole pills on day 0 (before taking the first dose) then on day 20 you would have taken 20 doses as either both halves of a single pill or separate halves of two pills. On even numbered days there is always an even number of half pills left in the bottle.

I may not be making myself clear. When I say for day 20, I mean 19 doses have been taken - one dose on each of the first 19 days. The day-20 dose has not yet been taken. I agree that at the end of day 20, there will have to be an even number of half-pills remaining.
 


Then you are simply wrong. The initial conditions start at time 0 and for counting discrete events, time 1 is after the first event. This way the time is always in sync with the number of events that have occurred. Let's take you back to your beginning: you were born, one year later you had your first birthday celebration and then you were 1 year old.

Alternatively, look what happens when bean counters that don't grok the concept of zero start meddling: the clocks that start at 12am for instance.
 

I am getting further and further lost.
So "what is the probability that a half pill is drawn on day 20" is not the same as "what is the probability for day 20"?

If I ask "what is the probability that a half pill is drawn on the first day" isn't the answer "zero"?
 
Before you open the bottle, what is the chance of getting half a pill on any given day?

It seems like you would have to figure out all of the possible combinations for any given day, multiply each by the chance of it occurring, and add (or aggregate) them all. How one might do that is a complete mystery to me.

This is what I took a stab at but got burned out after Day 4, because the number of half pills remaining changes up the probabilities each day. Your choices are still either 1 whole or 1 half, but the possible mix of halves and wholes gets larger very quickly. I tried doing it as a kind of "tree" which is what I mean by "brute force." Also when I say 50/50 I mean *probability*, not *odds*.

IMO, the fact that you can conceive of the question and recognize it's complicated means it probably wouldn't remain a complete mystery if you took it day by day. It does become very tedious, however, unless there's a formula I'm missing.
 
It's very simple. The initial condition: "on day 0" there are 50 whole pills in the bottle. On day 1, you draw a pill and now there are 49 whole pills in the bottle and one half pill that you put back. If you mark the transition to the new day by the event of that day, you don't have to deal with messy edge conditions like having day 1 be special, or having to constantly split the nomenclature because there is the time in day 1 before the pill is drawn and the time in day 1 after the pill is drawn, which makes the question of how many pills are in the bottle on day 1 ambiguous. You also avoid perpetual philosophical debates on what is meant by the evening and the morning of the first day.
 
This is what I took a stab at but got burned out after Day 4, because the number of half pills remaining changes up the probabilities each day. Your choices are still either 1 whole or 1 half, but the possible mix of halves and wholes gets larger very quickly. I tried doing it as a kind of "tree" which is what I mean by "brute force." Also when I say 50/50 I mean *probability*, not *odds*.

IMO, the fact that you can conceive of the question and recognize it's complicated means it probably wouldn't remain a complete mystery if you took it day by day. It does become very tedious, however, unless there's a formula I'm missing.


If you think this is challenging, try doing the complete tree for video poker. Somewhere on one of my old disks I have that spreadsheet.

Finding the formula is going to be even more challenging. It will be similar to the binomial distribution but more complex.
 
If I ask "what is the probability that a half pill is drawn on the first day" isn't the answer "zero"?

Yes.

I had hoped this would be as obvious (in retrospect) as Gauss's supposed smart-alec response to a teacher who had him add the integers from 1 to 100. (49 x 101) + 100. But I don't think it is.

Wollery seems to have the best approach.
 
It's very simple. The initial condition: "on day 0" there are 50 whole pills in the bottle. On day 1, you draw a pill and now there are 49 whole pills in the bottle and one half pill that you put back. If you mark the transition to the new day by the event of that day, you don't have to deal with messy edge conditions like having day 1 be special, or having to constantly split the nomenclature because there is the time in day 1 before the pill is drawn and the time in day 1 after the pill is drawn, which makes the question of how many pills are in the bottle on day 1 ambiguous. You also avoid perpetual philosophical debates on what is meant by the evening and the morning of the first day.

I am still floundering.

So if I want to know the probability for that first draw (when there are still 50 pills in the bottle) then which of the following is correct?

What is the probability on the first day?
What is the probability on the zeroth day?
What is the probability on day 0?
What is the probability on day 1?

..............

and everyone else in this thread is using this nomenclature?
If the first draw were on June 1st, then we would say that June 1 is day 0 and June 2 is day 1, and so on.
 
If you think this is challenging, try doing the complete tree for video poker. Somewhere on one of my old disks I have that spreadsheet.

Finding the formula is going to be even more challenging. It will be similar to the binomial distribution but more complex.

I have heard the best game to beat house odds is blackjack. There's a formula for when to hit and when to stand, but with multiple decks and the arbitrary cut it gets much trickier. I play tournament Scrabble. Perfectly tracking tiles and computing probabilities (there are 12 "E" tiles and only 9 "A" tiles to start) would give an edge if both players have perfect memory of the complete tournament-word list but unequal ability to calculate probability.

and everyone else in this thread is using this nomenclature?

I'm using no nomenclature; I am just going literally from day to day. Day 1: 50/50 (probability, not odds) you pick a whole pill. No other choice.

Day 2: You still have 50 pieces. 49/50 probability whole piece; 1/50 probability half piece. Or: .98 and .02.

And so on.
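
The "and so on" can be handed to a computer. What follows is only a sketch under the thread's assumptions (50 whole pills, every piece in the bottle equally likely, a drawn whole pill puts one half back); the function name `half_pill_probabilities` is invented here. It applies the "weight each possible bottle by its chance and add them up" idea to get the exact chance of a half pill on each day, counting the first draw as Day 1.

```python
from fractions import Fraction


def half_pill_probabilities(n_whole=50):
    """Exact P(half pill is drawn on day d) for d = 1 .. 2*n_whole.

    Hypothetical helper; Day 1 is the first draw from a full bottle.
    """
    # Probability of each (whole, half) bottle state just before the draw.
    dist = {(n_whole, 0): Fraction(1)}
    probs = {}
    for day in range(1, 2 * n_whole + 1):
        p_half = Fraction(0)
        nxt = {}
        for (w, h), p in dist.items():
            total = w + h
            if h:  # this state can yield a half pill
                p_half += p * Fraction(h, total)
                key = (w, h - 1)
                nxt[key] = nxt.get(key, 0) + p * Fraction(h, total)
            if w:  # this state can yield a whole pill
                key = (w - 1, h + 1)
                nxt[key] = nxt.get(key, 0) + p * Fraction(w, total)
        probs[day] = p_half
        dist = nxt
    return probs


if __name__ == "__main__":
    probs = half_pill_probabilities(50)
    for day in (1, 2, 3, 20, 50, 100):
        print(day, float(probs[day]))
```

The number of distinct bottle states grows only quadratically with the number of pills, so the tree that is tedious by hand stays small for a program.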

Can't parse "Day Zero" and "Zeroth Day," although it does help me understand why "zeroth" is a legitimate Scrabble word.
 
I am still floundering.

So if I want to know the probability for that first draw (when there are still 50 pills in the bottle) then which of the following is correct?

Why are you injecting confusion by changing axes? If you want to know the probability for that first draw then ask: "What is the probability for the first draw?"

Before you can change axes you need to ensure that the new axis is fully defined relative to the original axis. We got from the OP that there would be one pill drawn per day, so the axes are presumed to be linear and on the same scale. But some origin or equivalence point is needed before you can switch from one axis to the other.

And then there is the issue that "day" is continuous while "draw" is discrete.
 
...

Can't parse "Day Zero" and "Zeroth Day," although it does help me understand why "zeroth" is a legitimate Scrabble word.

Off by 1 errors have been a bane of programming since day 1 (That's what they get for not starting with day 0). To mitigate this issue I use zero as the origin wherever possible.
 
Yes.

I had hoped this would be as obvious (in retrospect) as Gauss's supposed smart-alec response to a teacher who had him add the integers from 1 to 100. (49 x 101) + 100. But I don't think it is.

Again there is that damn off by 1 problem.

(49 * 101) + 100 = 4949 + 100 = 5049.

1+100 + 2+99 + ... + 50+51 = 101*50 = 5050.


Wollery seems to have the best approach.


I would have had it if I hadn't botched the pills/doses issue. I'm surprised nobody caught it.
 
Again there is that damn off by 1 problem.

(49 * 101) + 100 = 4949 + 100 = 5049.

1+100 + 2+99 + ... + 50+51 = 101*50 = 5050.

Heh, I did it as (49*100)+100+50

I've always been awkward like that!
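
A quick arithmetic check of the three groupings mentioned in this exchange, as a small Python sketch. The variable names are made up, and reading (49*100)+100+50 as the pairs 1+99 through 49+51 plus the leftover 100 and 50 is one interpretation, not something the post states.

```python
# Direct sum of the integers 1..100.
direct_sum = sum(range(1, 101))

# Fifty pairs that each add to 101: 1+100, 2+99, ..., 50+51.
fifty_pairs = 50 * 101

# The version quoted earlier in the thread, which comes up one short.
quoted_version = (49 * 101) + 100

# Forty-nine pairs that each add to 100, plus the unpaired 100 and 50.
alternative = (49 * 100) + 100 + 50

print(direct_sum, fifty_pairs, quoted_version, alternative)
```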
 
Prior art.

ETA: Loss Leader has in fact already answered this question:

We're assuming that your choice is random and that the feel/shape of the whole or half pills doesn't matter, nor does the real-world way the smaller halves might migrate (IIRC) to the bottom of the bottle.

1st day = 100 pills, chance of pulling a whole pill is 1

2nd day = 99 pills and a half, chance of pulling out a whole pill is 99/100

3rd day = There are either 99 pills, so the chance of pulling out a whole is 1 OR
there are 98 pills and 2 halves, so the chance of pulling out a whole is 98/100

4th day = There are either 98 pills and 1 half or 97 pills and three halves. So it's either 98/99 or 97/100

5th day = There are either 98 pills, 97 pills and 2 halves, or 96 pills and 4 halves. So the chances are 1, 97/99, or 96/100

Yeah, I'm completely stumped. It looks like the odds are going to either be 1 or some number with the lower limit of 1/100. That number that gets smaller will have an average of about 1/2 (50 pills, 50 halves). So, if the odds are either 1 or 1/2, I'd say that the overall odds of pulling out a whole pill will be about 75% (assuming you know nothing of the conditions before the pull other than how many "units" are in the bottle).

This has made my non-mathematical head hurt very much and I shall like very much to stop thinking about this now.
 
It looks like I've been confused by this for years.

And I'm still confused.
 
Sol's answer is even more correct: on average, the chance of drawing a whole pill is equal to the chance of drawing a half pill. This is a direct result of the fact that for each whole pill that is picked, a half pill is returned to the bottle.

From that result and my earlier observation that initially more whole pills will be picked than halves picked, at the end there must be a period where there are more halves picked than wholes.

To take it further, it can be observed that when there is an excess of whole pills it is more likely that a whole pill will be drawn. And when there is an excess of half pills it is more likely that a half pill will be drawn. This means that the system will be driven to an equilibrium point where the likelihood of picking a whole pill doesn't change as a result of the average pick.

Written as math this is:


Given some current number of (W)hole and (H)alf pills;
after the pick we have:
W' = (W-1)*W/(W+H) + W*H/(W+H)
H' = (H+1)*W/(W+H) + (H-1)*H/(W+H)
if we maintain equilibrium ratio:
W':H' = W:H
or
W'/(W'+H') = W/(W+H)
or
W'/H' = W/H = Er

solving...
W'/H' = W/H
W'H = WH'
[(W-1)*W + W*H]H = W[(H+1)*W + (H-1)*H]
[(WW-W) + W*H]H = W[(WH+W) + (HH-H)]
WWH - WH + WHH = WWH + WW + WHH - WH

0 = WW

Doh! Of course, that was obvious :boxedin:
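
The algebra above can be double-checked numerically. This is just a sketch with invented function names (`expected_after_pick`, `ratio_gap`); it plugs whole numbers into the W' and H' expressions as written and confirms that W'H - WH' always comes out to -WW/(W+H), so the ratio can only stay put when W = 0.

```python
from fractions import Fraction


def expected_after_pick(w, h):
    """Expected (W', H') after one pick, using the expressions above."""
    total = w + h
    w_next = Fraction((w - 1) * w + w * h, total)
    h_next = Fraction((h + 1) * w + (h - 1) * h, total)
    return w_next, h_next


def ratio_gap(w, h):
    """W'H - WH'; zero would mean the whole:half ratio is unchanged."""
    w_next, h_next = expected_after_pick(w, h)
    return w_next * h - w * h_next


if __name__ == "__main__":
    for w, h in [(49, 1), (25, 25), (1, 49)]:
        print(w, h, ratio_gap(w, h))
```

The gap is negative for every W > 0, which matches the intuition that the expected share of whole pills keeps shrinking until they run out.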
 
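For Days 2–99, the probabilities are well-approximated by a cubic function:

[Figure: Monte Carlo estimate of the probability of picking a half-pill on each day (black circles), with the fitted cubic regression curve (red).]
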
The figure shows the results of a Monte Carlo simulation of 100,000 trials. The result for each trial is a vector of length 100, whose ith element is 1 if, for the ith day, the simulation picked a half-pill, or 0 if it picked a whole pill. As per the OP, if a whole pill was picked, the number of whole pills available for the next day was decremented by 1 and the number of half-pills was incremented by 1; if a half-pill was picked, the number of half-pills available for the next day was decremented by one. The probability of picking a half-pill the next day was then updated accordingly.

Each black circle in the plot is the average of the zeros and ones for a particular day over the 100,000 trials, and hence is the Monte Carlo estimate of the probability of picking a half-pill for that day. The red curve is a cubic regression function fitted to the plotted points, excluding the 100th day.
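
A simulation along the lines described above could look like the sketch below. It is not the code behind the figure; the function name, the seed and the default of 2,000 trials (chosen to run quickly, versus the 100,000 mentioned above) are assumptions made for illustration, and it uses only Python's standard `random` module.

```python
import random


def simulate_half_pill_frequencies(n_whole=50, trials=2000, seed=0):
    """Estimate P(half-pill picked on day i) for each of the 2*n_whole days.

    Hypothetical helper: index 0 of the result corresponds to Day 1.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same numbers
    days = 2 * n_whole
    half_counts = [0] * days
    for _ in range(trials):
        whole, half = n_whole, 0
        for day in range(days):
            # Every piece in the bottle is equally likely to be picked.
            if rng.random() < half / (whole + half):
                half_counts[day] += 1
                half -= 1          # the half-pill is used up
            else:
                whole -= 1         # half of the whole pill is taken...
                half += 1          # ...and the other half goes back
    return [count / trials for count in half_counts]


if __name__ == "__main__":
    freqs = simulate_half_pill_frequencies()
    for day in (1, 2, 25, 50, 75, 100):
        print(day, freqs[day - 1])
```

Each trial always contains exactly `n_whole` half-pill picks, so the estimates summed over all days come to `n_whole` regardless of the random draws.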

If we let x represent the day, then for x=1,2,...,100, the probability, p(x), of choosing a half-pill on day x is:

x=1, p(x) = 0;
x=2,3,...,99, p(x) ≈ b0 + b1*x + b2*x^2 + b3*x^3, where
b0 = -.008370,
b1 = .01779,
b2 = -.0001871,
b3 = 9.464e-07;
x=100, p(x) = 1.
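
To read values off the fitted curve without the plot, the piecewise rule can be written as a short Python function. This is a sketch only: the name `p_half_cubic` is invented, the coefficients are copied from the list above, and the result is the regression approximation, not an exact probability.

```python
# Coefficients of the fitted cubic, as listed above.
B0, B1, B2, B3 = -0.008370, 0.01779, -0.0001871, 9.464e-07


def p_half_cubic(x):
    """Approximate probability of picking a half-pill on day x (1..100)."""
    if x == 1:
        return 0.0   # first day: the bottle holds only whole pills
    if x == 100:
        return 1.0   # last day: only a single half-pill can remain
    if not 2 <= x <= 99:
        raise ValueError("day must be between 1 and 100")
    return B0 + B1 * x + B2 * x**2 + B3 * x**3


if __name__ == "__main__":
    for day in (1, 2, 20, 50, 80, 99, 100):
        print(day, round(p_half_cubic(day), 4))
```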
 