
Always 50/50 chance?

I hate to be a nitpicky little b*stard, but "probability" and "odds" aren't the same thing.

"Probability" for event A means
$\frac{N_A}{N}$, where $N$ is the number of total possible outcomes, and $N_A$ is the number of outcomes that result in event A.

"Odds" for event A means
$N_A:N_{A'}$, where $N_A$ is the number of outcomes that result in event A, and $N_{A'}$ is the number of outcomes that don't result in event A.

So the probability of flipping heads in 1 flip is 1/2, or 50%. The odds of flipping heads in the same situation are 1:1.
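To make the distinction concrete, here is a minimal sketch of converting between the two. The function names are my own, and it assumes the probability is strictly less than 1 (otherwise the "against" side is zero).

```python
from fractions import Fraction

def probability_to_odds(p):
    """Return (for, against) in lowest terms for probability p, with 0 <= p < 1."""
    p = Fraction(p)
    ratio = p / (1 - p)  # N_A / N_A'
    return ratio.numerator, ratio.denominator

def odds_to_probability(n_for, n_against):
    """Return N_A / N given odds N_A : N_A'."""
    return Fraction(n_for, n_for + n_against)

print(probability_to_odds(Fraction(1, 2)))  # (1, 1)
print(probability_to_odds(Fraction(3, 4)))  # (3, 1)
print(odds_to_probability(3, 1))            # 3/4
```

So a 75% probability corresponds to odds of 3:1, not 75:25 written as a fraction.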
 
My realization on how this can warp decision making (such as my own, were I to use my previous understanding of probability to gamble on coin tosses :) ) makes me think probability and statistics should be much better integrated in formal education and job training. I don't think I should have been able to advance as far as I have educationally and professionally with my current lack of knowledge in these areas.
 
No. My fault for first bringing up this misunderstanding.

Let's go with coin tosses and assume a perfect 50/50 "fairness". The coin toss is completely independent of how you make your guess as to what it's going to come up. No matter what you guess or how you arrive at it, as long as you guess either heads or tails you have a 50/50 chance of being right. The coin doesn't care a bit about how you arrived at that decision.

And "population" size doesn't matter either because the coin doesn't know how many times you're going to toss it.

I would have thought the mean of your guesses should equal the mean of the population.

Thus for a fair coin, guessing HTHT.... or HHTTHHTT... etc. would, for many tosses, be better than HHHH... or TTTT....

If you know that the coin is only going to be tossed 10 times for each trial then it would make sense (to me) to guess 5 heads and 5 tails (in any order) for each trial, or all heads for M/2 trials and all tails for M/2 trials, where M is the total number of trials, if known.

If someone has already said this I've missed it. If not and I'm wrong (more than 50% likely:) ), please explain why.
 
If I understand your question:

If you have a situation where there is a 75/25 split you achieve "maximal wrongness" by simply always guessing the opposite of whatever has the 75% probability. Beyond that the reasoning for this case is not much different from the 50/50 case. No matter how you make your guess, you have 75/25 odds. The one difference is that if the guesser knows what the odds are they have two choices which they can choose between: They can be 75% right or 75% wrong in the long run*. Either of those two cases indicates "guessing", without real foreknowledge of the outcome.

* Now ask what happens if the guesser alternates between those two strategies.

Hm...

Redid my spreadsheet by changing the probability of girl birth to 75%

Trial 1
GGGB 625 hits
G Only 751 hits
B Only 249 hits
Random Guess 492 hits

Trial 2
GGGB 626 hits
G Only 750 hits
B Only 250 hits
Random Guess 513 hits

Trial 3
GGGB 622 hits
G Only 766 hits
B Only 234 hits
Random Guess 485 hits

Trial 4
GGGB 634 hits
G Only 742 hits
B Only 258 hits
Random Guess 497 hits

Trial 5
GGGB 636 hits
G Only 732 hits
B Only 268 hits
Random Guess 509 hits


I would have expected better by biasing my guesses to GGGB, and less than 50% with the random guess... well, that's math for ya.
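For anyone who wants to repeat this without a spreadsheet, here is a rough Python sketch of the same kind of experiment. It is not the spreadsheet itself: the function name, the seed, and the way the "GGGB" guess cycles through the births are all my own assumptions.

```python
import random

def simulate(n_births=1000, p_girl=0.75, seed=0):
    """Count correct guesses ("hits") for four guessing strategies."""
    rng = random.Random(seed)
    births = ["G" if rng.random() < p_girl else "B" for _ in range(n_births)]
    strategies = {
        "GGGB": lambda i: "GGGB"[i % 4],          # repeat the pattern G, G, G, B
        "G Only": lambda i: "G",
        "B Only": lambda i: "B",
        "Random Guess": lambda i: rng.choice("GB"),  # 50/50 guess
    }
    return {
        name: sum(guess(i) == births[i] for i in range(n_births))
        for name, guess in strategies.items()
    }

for trial in range(1, 6):
    print("Trial", trial, simulate(seed=trial))
```

The exact counts depend on the seed, but "G Only" and "B Only" always add up to the number of births, since one is right exactly when the other is wrong.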
 
I would have thought the mean of your guesses should equal the mean of the population.

Thus for a fair coin, guessing HTHT.... or HHTTHHTT... etc. would, for many tosses, be better than HHHH... or TTTT....

If you know that the coin is only going to be tossed 10 times for each trial then it would make sense (to me) to guess 5 heads and 5 tails (in any order) for each trial, or all heads for M/2 trials and all tails for M/2 trials, where M is the total number of trials, if known.

If someone has already said this I've missed it. If not and I'm wrong (more than 50% likely:) ), please explain why.

Answered my own question!

As drkitten said, at best you can expect to be right 50% of the time. If you stay fixed at heads or tails, the worst you can expect is to be wrong 50% of the time.

Probability does my head in:boggled:
 
If you know that the coin is only going to be tossed 10 times for each trial then it would make sense (to me) to guess 5 heads and 5 tails (in any order) for each trial, or all heads for M/2 trials and all tails for M/2 trials, where M is the total number of trials, if known.
One key thing to remember is that each individual string of results is unique and equally probable. And it doesn't matter that some of them look special to you.

HHHHHHHHHH is just as likely a result as HTHTHTHTHT or HTTHHHTHHT. All three have exactly the same probability (assuming I actually got 10 in each sequence as I intended).

Intuitively I think people immediately look at HHHHHHHHHH and realize it only covers one case out of "millions" (in actuality 1024 cases). They don't automatically recognize that HTHTHTHTHT is also one case in 1024 and seem to assume that it is more likely because it naively "looks like" a more realistic case.

And again, it doesn't matter how you arrive at your guess. As long as you have guessed heads or tails before the throw you have a 50/50 chance of being right. It doesn't matter a bit what strategy you used to form your guess.
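If it helps, the "one case in 1024" claim can be checked by multiplying the per-toss probabilities. This is only a sketch that assumes a fair coin and independent tosses; the function name is mine.

```python
from fractions import Fraction

def sequence_probability(seq, p_heads=Fraction(1, 2)):
    """Probability of seeing exactly this sequence of independent tosses."""
    prob = Fraction(1)
    for toss in seq:
        prob *= p_heads if toss == "H" else 1 - p_heads
    return prob

for seq in ("HHHHHHHHHH", "HTHTHTHTHT", "HTTHHHTHHT"):
    print(seq, sequence_probability(seq))  # each line ends with 1/1024
```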
 
I would have thought the mean of your guesses should equal the mean of the population.

Thus for a fair coin, guessing HTHT.... or HHTTHHTT... etc. would, for many tosses, be better than HHHH... or TTTT....

If you know that the coin is only going to be tossed 10 times for each trial then it would make sense (to me) to guess 5 heads and 5 tails (in any order) for each trial, or all heads for M/2 trials and all tails for M/2 trials, where M is the total number of trials, if known.

If someone has already said this I've missed it. If not and I'm wrong (more than 50% likely:) ), please explain why.

DrKitten convinced me that you're wrong about this -you seem to be where I was earlier in my intuition.

Here's my shot at explaining it to you.

This is for a coin that has a 50% chance of being heads each toss.

Over 1000 tosses it seems like we should have about 500 heads and 500 tails, although the specific order will be random, right?

There's only one specific order out of the 2^1000 possible orders that's completely wrong.

There's an equal chance that that completely wrong order will be 1000 heads as it will be one particular sequence of 750 heads and 250 tails, as it will be an equal number of heads and tails in a particular random sequence THTHHTTTHTTHHH... etc. that adds up to 500 heads and 500 tails. Each of these has the same tiny chance, 1 in 2^1000, of being the given order.

So HHHT is not going to be less likely to be correct than HHHH.

It does seem like overall though there's an expectation that there will be about 500 heads and 500 tails. Why is that more likely than 50 heads and 450 tails, without considering a particular order? I know the rule of multiplying probabilities on sequential coin tosses, but I'm trying to get a more intuitive grasp.

I think I have an intuitive sense why (there's a bell curve of the coin toss possibilities # heads:tails ratios, and the high part of the bell curve is the area where heads and tails are about equal, the wings of the bell curve is where heads & tails counts are most divergent, and overall they'd equal each other numerically). But once again, another's cogent articulation would be helpful.
 
One key thing to remember is that each individual string of results is unique and equally probable. And it doesn't matter that some of them look special to you.

HHHHHHHHHH is just as likely a result as HTHTHTHTHT or HTTHHHTHHT. All three have exactly the same probability (assuming I actually got 10 in each sequence as I intended).

Intuitively I think people immediately look at HHHHHHHHHH and realize it only covers one case out of "millions" (in actuality 1024 cases). They don't automatically recognize that HTHTHTHTHT is also one case in 1024 and seem to assume that it is more likely because it naively "looks like" a more realistic case.

And again, it doesn't matter how you arrive at your guess. As long as you have guessed heads or tails before the throw you have a 50/50 chance of being right. It doesn't matter a bit what strategy you used to form your guess.

It's encouraging that your explanation matches mine. I'm interested in your insight into the other areas of my most recent post.
 
It does seem like overall though there's an expectation that there will be about 500 heads and 500 tails. Why is that more likely than 50 heads and 450 tails, without considering a particular order? I know the rule of multiplying probabilities on sequential coin tosses, but I'm trying to get a more intuitive grasp.

Because there are more "specific orders" with balanced numbers of heads and tails than there are with unbalanced ones. For brevity's sake, I'm only going to throw the coin four times, not ten or a thousand, and you'll see where this expectation comes from.

There's only one way to get all heads : HHHH
There are four ways to get three heads, one tail : HHHT, HHTH, HTHH, THHH
There are six ways to get two and two : HHTT, HTHT, HTTH, THHT, THTH, TTHH
There are four ways to get one and three : HTTT, THTT, TTHT, TTTH
.. and of course, one way to get all tails : TTTT

Now, any given pattern -- e.g. HTTH, is exactly as likely as any other pattern (HHHH). But humans don't look at individual patterns; they look at patterns of patterns. So they don't see, for example, HTTH, they see two-and-two. Any two random-seeming patterns of 490-510 will look a lot alike, but very few patterns look like 0-1000.


I think I have an intuitive sense why (there's a bell curve of the coin toss possibilities # heads:tails ratios, and the high part of the bell curve is the area where heads and tails are about equal, the wings of the bell curve is where heads & tails counts are most divergent, and overall they'd equal each other numerically).

Exactly. What you've done here is you've formalized the idea of "look a lot alike" -- since you're just counting heads, but are explicitly ignoring orders, you're making any two specific patterns appear identical to your histogram.
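The four-toss count can be reproduced by brute force, and the same idea scales to 1000 tosses with binomial coefficients. This is a sketch assuming a fair coin; the function name is mine.

```python
from collections import Counter
from itertools import product
from math import comb

def head_count_histogram(n_tosses):
    """Map number of heads -> how many specific orders have that many heads."""
    sequences = map("".join, product("HT", repeat=n_tosses))
    return Counter(seq.count("H") for seq in sequences)

hist = head_count_histogram(4)
print(sorted(hist.items()))  # [(0, 1), (1, 4), (2, 6), (3, 4), (4, 1)]

# The brute-force counts agree with the binomial coefficients C(4, k).
assert all(hist[k] == comb(4, k) for k in range(5))

# For 1000 tosses, count orders by formula instead of listing them:
# the share of all 2**1000 orders that have between 490 and 510 heads.
share_near_even = sum(comb(1000, k) for k in range(490, 511)) / 2**1000
print(share_near_even)
```

Only one order out of 2^1000 has zero heads, while the band from 490 to 510 heads covers a large share of all orders, which is the bell curve in histogram form.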
 
Because there are more "specific orders" with balanced numbers of heads and tails than there are with unbalanced ones. For brevity's sake, I'm only going to throw the coin four times, not ten or a thousand, and you'll see where this expectation comes from.

There's only one way to get all heads : HHHH
There are four ways to get three heads, one tail : HHHT, HHTH, HTHH, THHH
There are six ways to get two and two : HHTT, HTHT, HTTH, THHT, THTH, TTHH
There are four ways to get one and three : HTTT, THTT, TTHT, TTTH
.. and of course, one way to get all tails : TTTT

Now, any given pattern -- e.g. HTTH, is exactly as likely as any other pattern (HHHH). But humans don't look at individual patterns; they look at patterns of patterns. So they don't see, for example, HTTH, they see two-and-two. Any two random-seeming patterns of 490-510 will look a lot alike, but very few patterns look like 0-1000.




Exactly. What you've done here is you've formalized the idea of "look a lot alike" -- since you're just counting heads, but are explicitly ignoring orders, you're making any two specific patterns appear identical to your histogram.

fantastic explanation. thanks.
 
Hm...

Redid my spreadsheet by changing the probability of girl birth to 75%

Trial 1
GGGB 625 hits
G Only 751 hits
B Only 249 hits
Random Guess 492 hits

Trial 2
GGGB 626 hits
G Only 750 hits
B Only 250 hits
Random Guess 513 hits

Trial 3
GGGB 622 hits
G Only 766 hits
B Only 234 hits
Random Guess 485 hits

Trial 4
GGGB 634 hits
G Only 742 hits
B Only 258 hits
Random Guess 497 hits

Trial 5
GGGB 636 hits
G Only 732 hits
B Only 268 hits
Random Guess 509 hits


I would have expected better by biasing my guesses to GGGB, and less than 50% with the random guess... well, that's math for ya.

So a 50/50 random guess, we would expect:

Guess(G) Actual (G) = 1/2 * 3/4 = 3/8
Guess(G) Actual (B) = 1/2 * 1/4 = 1/8
Guess(B) Actual (G) = 1/2 * 3/4 = 3/8
Guess(B) Actual (B) = 1/2 * 1/4 = 1/8
Accurate: 4/8 = 50%


Well, if you know that p(Girl) is 75%, then shouldn't you change the random guess to also be 75% Girl? Then at least you're slightly closer to my optimal solution of always guessing "Girl"

Guess(G) Actual (G) = 3/4 * 3/4 = 9/16
Guess(G) Actual (B) = 3/4 * 1/4 = 3/16
Guess(B) Actual (G) = 1/4 * 3/4 = 3/16
Guess(B) Actual (B) = 1/4 * 1/4 = 1/16
Accurate: 10/16 = 62.5%
(which also matches your GGGB rate)
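The two tables above are the same calculation with a different guessing rate, so they can be folded into one small function. A sketch, assuming each guess is made independently of the actual birth; the names are mine.

```python
from fractions import Fraction

def expected_accuracy(q_guess_girl, p_girl):
    """Chance a guess is right when you guess Girl with probability q
    and the actual outcome is Girl with probability p (independently)."""
    return q_guess_girl * p_girl + (1 - q_guess_girl) * (1 - p_girl)

p = Fraction(3, 4)
print(expected_accuracy(Fraction(1, 2), p))  # 1/2  (50/50 random guess)
print(expected_accuracy(Fraction(3, 4), p))  # 5/8  (62.5%, the GGGB rate)
print(expected_accuracy(Fraction(1), p))     # 3/4  (always guess Girl)
```

Accuracy rises in a straight line as the share of Girl guesses goes from 0 to 1, so with p(Girl) above 1/2 the best you can do is always guess Girl.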
 
I hate to be a nitpicky little b*stard, but "probability" and "odds" aren't the same thing.

"Probability" for event A means $\frac{N_A}{N}$ where $N$ is the number of total possible outcomes, and $N_A$ is the number of outcomes that result in event A.

"Odds" for event A means $N_A:N_{A'}$ where $N_A$ is the number of outcomes that result in event A, and $N_{A'}$ is the number of outcomes that don't result in event A.

So the probability of flipping heads in 1 flip is 1/2, or 50%. The odds of flipping heads in the same situation are 1:1.

Eek! Did I drop an "odds" comment when I should have said "prob"??? Well spank me sideways with a soggy tuna. Do I have to turn in my "Stats Geek" membership card now or at the door?

Monte Carlo simulation for the win.... Thank you very much, my Lord. (Did I get the address right for a Baron? I rarely hang out with such exalted persons....)

Nah, most people call me, "Hey, S***head!", or "Thanks, a'hole!" The standards of formal addressing have seriously degraded after the French Revolution... :D
 
If repeatedly guessing HHHT led on average to fewer than 50% correct guesses, then repeatedly guessing TTTH would lead on average to greater than 50% correct guesses, because, for any sequence of coin tosses, the latter is right exactly when the former is wrong. But of course there's no way to guess coin tosses with better than 50% accuracy. Q.E.D.

Is that too tricky?

You can't predict coin tosses. They're random. That's what "random" means. And predicting which side won't come up can't be easier than predicting which side will come up, because if you know one then you know the other.
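The "right exactly when the other is wrong" step is easy to check on simulated tosses. A minimal sketch; the helper name and the seed are my own choices.

```python
import random

def hits(pattern, tosses):
    """Count correct guesses when the pattern is repeated over the tosses."""
    return sum(pattern[i % len(pattern)] == toss for i, toss in enumerate(tosses))

rng = random.Random(1)
tosses = [rng.choice("HT") for _ in range(10_000)]

a = hits("HHHT", tosses)
b = hits("TTTH", tosses)
print(a, b, a + b)  # the two counts always sum to the number of tosses
```

Since the two counts always add up to the total, one pattern averaging below 50% would force the other above 50%.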
 
If repeatedly guessing HHHT led on average to fewer than 50% correct guesses, then repeatedly guessing TTTH would lead on average to greater than 50% correct guesses, because, for any sequence of coin tosses, the latter is right exactly when the former is wrong. But of course there's no way to guess coin tosses with better than 50% accuracy. Q.E.D.

Is that too tricky?

You can't predict coin tosses. They're random. That's what "random" means. And predicting which side won't come up can't be easier than predicting which side will come up, because if you know one then you know the other.

Excellent explanation and one that does a good job making this concept intuitive. I'd nominate you, but there should be a limit on how many nominations one can do in a day, like some boards have with karma.
 
A number of people have stressed independence of the coin tosses. I'm not sure why. Can someone explain?

The expectation of the sum of a bunch of random variables is equal to the sum of the expectations of the random variables, whether the random variables are independent or not. (The random variables here are the correctness of the guesses---say, 0 for an incorrect guess, 1 for a correct guess.) So, I don't see what difference independence makes.
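Spelled out as a sketch, with $X_i$ (my notation) standing for the 0/1 correctness of the $i$-th guess over $n$ tosses of a fair coin, where each guess is made before its toss:

```latex
\[
E\left[\sum_{i=1}^{n} X_i\right]
  = \sum_{i=1}^{n} E[X_i]
  = \sum_{i=1}^{n} \tfrac{1}{2}
  = \frac{n}{2}
\]
```

Only the first equality is linearity of expectation, and it needs no independence; the second uses the fact that each single guess is right with probability 1/2.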
 
A number of people have stressed independence of the coin tosses. I'm not sure why. Can someone explain?

The expectation of the sum of a bunch of random variables is equal to the sum of the expectations of the random variables, whether the random variables are independent or not. (The random variables here are the correctness of the guesses---say, 0 for an incorrect guess, 1 for a correct guess.) So, I don't see what difference independence makes.

You need independence for these arguments. Take the case where I have a bag with 5 red balls and 5 blue balls. I now draw each ball out of the bag one at a time.

Person A guesses: RBRBRBRBRB
Person B guesses: RRRRRRRRRR
Person C guesses: BBBRBBBRBB

If we do this enough times, person A should be accurate on average 50% of the time. Sometimes he'll be right 8 times, sometimes wrong 8 times, but on average he'll be right 5 out of 10 times. (full range of 0->10 correct, always an even number)

Person B will be right 5 times out of 10 each and every time. (no range, only 5 correct)

Person C, using the same logic, will definitely have 3 wrong and 3 right. He's only playing with the remaining blue balls. (reduced range of 3->7 correct)

Because the variances are different, the "optimal" guessing strategies can possibly be different.
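The three ranges can be checked by listing every possible order in which the balls come out. A brute-force sketch, assuming all 252 orders of 5 red and 5 blue are equally likely; the function name is mine.

```python
from collections import Counter
from itertools import combinations

def correct_count_distribution(guess, n_red=5):
    """Map number of correct guesses -> number of draw orders giving it."""
    n = len(guess)
    dist = Counter()
    for red_positions in combinations(range(n), n_red):
        reds = set(red_positions)
        draw = "".join("R" if i in reds else "B" for i in range(n))
        dist[sum(g == d for g, d in zip(guess, draw))] += 1
    return dist

for name, guess in [("A", "RBRBRBRBRB"), ("B", "RRRRRRRRRR"), ("C", "BBBRBBBRBB")]:
    dist = correct_count_distribution(guess)
    mean = sum(k * v for k, v in dist.items()) / sum(dist.values())
    print(name, sorted(dist.items()), mean)
# A [(0, 1), (2, 25), (4, 100), (6, 100), (8, 25), (10, 1)] 5.0
# B [(5, 252)] 5.0
# C [(3, 56), (5, 140), (7, 56)] 5.0
```

All three average 5 correct, but the spread differs: person A's count is always even, person B is pinned at 5, and person C can only land on 3, 5, or 7.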
 
A number of people have stressed independence of the coin tosses. I'm not sure why. Can someone explain?
No one has spelled it out IIRC, but most of the conversation is trying to explain the idea that joint probability for two statistically independent events is the simple product of their individual probabilities.
 
