Possibly, but we certainly can't deny that some people have an uncanny ability to have the outcome go in their favor more often than others.
Okay, it's been a while. Let me bring you up-to-date and see if anyone has an explanation for these results other than random chance.
Hubby hasn't played much poker of late due to many other things going on in our lives. However, he started recording all all-in's rather than just the ones we refer to as 'races'. If you don't recall, that was defined as a pair against two over cards.
He's had 10 all-in's and lost nine of them. Here are the hands:
In September, he had one all-in from an on-line game. He went all in after the flop with AK. His opponent had 3, 7. We estimate the probability of loss at 1/3. The flop came 3, 8, 7. He lost.
In October, he played twice with his buddies and had three all-in's each night.
First set of games, at PB, he went all-in
10, 10 against 5, 8. We estimate the probability of loss at 1/3. He lost.
K, K against Q, Q. We estimate the probability of loss at 1/5. He lost.
K, T against A, T. We estimate the probability of loss at 2/3*. He lost.
A, A against 9,9. We estimate the probability of loss at 1/5. He lost.
A, T against A, K. We estimate the probability of loss at 2/3. He lost.
Your AT vs AK the odds are probably closer to around 70:30 against you, but good enough for government work
The following hand is the only non-pre-flop all-in. In this case, he had
A, 6 against K, 3. They went all-in after the turn with 6,3,6,3 showing. We estimate the probability of loss at 0.02. He lost. The river was a 3.
November, he's had two poker nights.
First set of games, at PB, he went all-in
J, J against Q,9. We estimate the probability of loss at 1/5. He won this hand!
A, K (suited) against Q, Q**. We estimate the probability of loss at 1/2. He lost.
Second set of games, at BG, he went all-in only once.
A, A against 9,9***. We estimate the probability of loss at 1/5. He lost.
My computation of the probability of getting one win out of those ten games is 0.00003.
This was computed as 10 * 1/3 * 1/3 * 1/5 * 2/3 * 1/5 * 2/3 * 1/50 * 4/5 * 1/2 * 1/5.
So, any ideas? Are those probabilities reasonable?
Sorry, that was my error in typing it up. I wrote after, but should have said before. This was a pre-flop all-in, not post-flop. Is 1/3 a fair assessment of that probability?
If he went all in AFTER the flop, he was WAY behind.
Thanks for the confirmation on the odds. Your assumptions are good. When it wasn't a pre-flop call, I'll specify what cards were already up in order to assess the win/loss probability at the time the all-in was made.
Pair over two undercards (assuming they're suited) you're 80:20 favorite (approx - can be some fluctuation, you need to use a poker hand calculator for more accurate)
All the above assume you have gone all-in pre-flop, in 'cleanly' dealt hands, without any shenanigans.
Yes, it happens. My husband feels such outcomes go against him more often than is reasonable to expect. I have to admit, he's been collecting data on his poker hands for close to a year now and the statistics so far confirm what he's been complaining about for years.
Yup - guy caught a 1-outer. It happens. It was a 1:46 shot.
In the first example he was a 70:30 favorite. 2nd example is a coin-toss.
Well, I would interpret this run of cards as a little bit unlucky I suppose, but certainly not indicative of some sort of cloud of poor luck that is following him around.
Yes, he's been assiduous with the data collection. Sorry about my error on the description of that first hand.
Again, I assume here you've been assiduous with data collection - the scenario you describe at #1 makes a big difference (going all-in post-flop) as you can see. You can't measure that hand as AK vs 73, once the 73 has flopped two pair.
Yes it was a tournament. That's all he plays with his buddies. They get together twice a month and run two or three tournaments on a Saturday night. Basically, you can assume all the games are tournaments as he isn't currently playing in any other venues.
A couple of general observations that may get better results:
- You said he just sort of chucked in his last few blinds (and was an underdog) - I don't know what sort of stakes he's playing at, but this is a 'loser' move. If he has lost for the evening (at a cash game) and isn't motivated to keep playing, he should pick up his chips and save them for next time / cash in whatever he has, rather than throw them away or shove out of desperation. In tournaments, one is forced to do this, but you need to look at cash games as a continuous run of never-ending games - and it's important to be properly funded each time you sit down, as stack-size matters in poker, big-time.
- In general he seems to be making not bad pre-flop decisions and generally getting involved in preflop contests where he is a reasonable favorite. Only one 'race' and only 3 mistakes. So, in 60% of all-in decisions, he's making 'correct' decisions. That's not a bad start.
Another way to look at this - let's say your state-run scratch & win lotto advertises that 1 in 8 tickets is a winner (of some prize). If you bought 8 tickets, you could reasonably expect to win on at least one of those tickets. In this case, maybe he's bought 10 and hasn't won yet. He's been a little unlucky over an isolated set of examples.
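To put a rough number on the lottery analogy, here is a minimal sketch. It assumes each ticket wins independently with probability 1/8 (the independence is an assumption of this sketch, not something the lottery advertises), and the function name is made up for illustration.

```python
from fractions import Fraction

# Assumption: every ticket wins independently with probability 1/8.
p_win = Fraction(1, 8)

def prob_no_win(tickets: int) -> Fraction:
    """Probability that none of `tickets` independent tickets is a winner."""
    return (1 - p_win) ** tickets

p_eight = float(prob_no_win(8))
p_ten = float(prob_no_win(10))
print(f"no win in 8 tickets:  {p_eight:.3f}")
print(f"no win in 10 tickets: {p_ten:.3f}")
```

Under these assumptions, buying ten tickets and winning nothing happens a little more than a quarter of the time, which is the sense in which it is only mildly unlucky.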
Hopefully soon!
I would expect that if he is making decisions such as described here, the results will revert to the norm, over time. And just as it's rather rare to spot a run of 10 hands where he's only won one, play enough of these sequences, and he'll have one where he wins 9.
You mentioned at one point that the same guy has beaten him with an underpair (which happened to be 9's) twice now. This is hardly a pattern to be excited about, but I would make sure you're following good anti-card cheating mechanisms even at a home game. Use a cut card to make sure the deck is cut before each deal, and keep the cut card on the bottom of the deck. This prevents a lot of the more common ways people try to cheat with cards - and yes, unfortunately, sometimes people DO cheat even at 'the friendly' game for relatively low stakes. It can be like a disease.
So you continue the study until you get the result you're after? You really need to figure out the size N you need to have the power to answer the question you're after and state it up front.
Recomputing the probability with your suggested changes to the odds:
10 * 1/3 * 1/5 * 1/5 * 2/3 * 1/5 * 2/3 * 1/46 * 7/10 * 1/2 * 1/5 = 0.000018 = 0.0018%
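The arithmetic in that product can be checked with exact fractions. This is only a sketch of the shortcut used in the post (multiply the probability of this particular win/loss pattern by 10); the variable names are invented for the example.

```python
from fractions import Fraction as F
from math import prod

# Loss probabilities for the nine hands that were lost, as listed in the product above.
loss_probs = [F(1, 3), F(1, 5), F(1, 5), F(2, 3), F(1, 5),
              F(2, 3), F(1, 46), F(1, 2), F(1, 5)]
# Win probability for the single hand that was won (JJ vs Q9 at 70:30).
win_prob = F(7, 10)

# Probability of this exact pattern of nine losses and one win.
pattern_prob = prod(loss_probs, start=F(1)) * win_prob

# Shortcut from the post: multiply by 10 for the ten places the win could fall.
shortcut = 10 * pattern_prob

print(f"pattern probability:  {float(pattern_prob):.3e}")
print(f"shortcut (x10):       {float(shortcut):.3e}")
print(f"shortcut as percent:  {float(shortcut) * 100:.4f}%")
```

The factor of 10 is only exact when every hand has the same odds, so with unequal odds the result is best read as an approximation.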
If you don't feel this is indicative of consistently poor luck, let me ask you how you feel that 'poor luck' might be established? What data should be collected and how should it be evaluated? These are serious questions, as that is exactly what we are trying to establish is or isn't happening.
Actually, I did analyze it a month ago, I just haven't had time to post results here. I will also be analyzing it a month from now. The analysis method will not change though unless some error in how it is being done is pointed out to me. That's why I'm posting it here, so others can point out errors in the analysis method.
I still think you needed to state your N up front. Why analyze the data now? Why not a month ago or a month from now? The potential for post hoc hypotheses is open when you do it this way.
No. We continue this study of the data until we have either a better way to collect and analyze the outcomes he's reporting or until he gets bored with the project and stops bothering to collect data. Please recall that this was designed to assess how well his subjective observation was matching with reality - i.e. was his luck really as bad as he was claiming.
So you continue the study until you get the result you're after?
You really need to figure out the size N you need to have the power to answer the question you're after and state it up front.
Maybe. I'm not sure why those differences would have an impact on the outcome after an all-in occurs. Can you give a valid reason why they shouldn't be combined?
Also the switch from on-line poker to in-person poker means you are mixing data from two very different things.
Also there is the problem of bias in the data collection since such collection is not done blind (or objectively) but by someone with a pretty strong position. (I've pointed to evidence of the goat and sheep bias in data collection already, I think.)
Yes, that's a fairly succinct way of putting it. Luck is when outcomes are significantly different than expected - i.e. the odds of what actually happens are statistically significantly different from what random chance would predict.
I thought all these questions were answered some time ago. First, you need to define what you mean by "luck". It seems to me it's just being used here as an alternative explanation when you find any result that doesn't line up with expected outcomes.
Yes, I'm aware of that. That's why I multiplied the computation of the probabilities by 10 to account for all possible sequences of one win in ten contests. What I am doing is analogous to computing the probability of getting 3 heads and 3 tails rather than the probability of getting the sequence HTHTHT.
But remember, streakiness in data is expected even in random outcomes. (We don't really expect a series of coin tosses to alternate HTHTHT!)
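The coin analogy can be made concrete with a few lines. This sketch assumes a fair coin and six independent tosses, and compares one specific ordering with the count of three heads in any order.

```python
from math import comb

n_flips = 6

# Probability of one specific ordering, such as HTHTHT, with a fair coin.
p_sequence = 0.5 ** n_flips

# Probability of three heads and three tails in any order:
# the number of orderings times the probability of each one.
p_three_heads = comb(n_flips, 3) * p_sequence

print(f"P(exact sequence HTHTHT) = {p_sequence}")
print(f"P(3 heads, any order)    = {p_three_heads}")
```

The second figure is 20 times the first because there are 20 ways to place three heads among six tosses, which is the same role the factor of 10 plays for one win among ten hands.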
The power of a test is the probability of correctly rejecting the null hypothesis when it is false. The p-value is the probability, assuming the null is true, of a result at least as extreme as the one observed.
So at best, all you're asking is what size N is required to have the power to reject the null hypothesis (that results are due to chance).
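For anyone wanting to see what "stating N up front" could look like for the 'races', here is a hedged sketch of an exact power calculation for a one-sided binomial test. The true win rate of 0.4, the 0.05 significance level, and the 80% power target are all assumptions chosen for illustration, and the function names are invented for this example.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k wins in n independent hands with win chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def critical_wins(n: int, alpha: float = 0.05, p_null: float = 0.5) -> int:
    """Largest win count k with P(X <= k | p_null) <= alpha, or -1 if none."""
    cumulative = 0.0
    k_crit = -1
    for k in range(n + 1):
        cumulative += binom_pmf(k, n, p_null)
        if cumulative > alpha:
            break
        k_crit = k
    return k_crit

def power(n: int, p_alt: float, alpha: float = 0.05, p_null: float = 0.5) -> float:
    """Chance of rejecting the null (too few wins) when the true win rate is p_alt."""
    k_crit = critical_wins(n, alpha, p_null)
    return sum(binom_pmf(k, n, p_alt) for k in range(k_crit + 1))

def smallest_n(p_alt: float, target_power: float = 0.8, alpha: float = 0.05,
               n_max: int = 1000):
    """First sample size whose exact power reaches the target, or None."""
    for n in range(1, n_max + 1):
        if power(n, p_alt, alpha) >= target_power:
            return n
    return None

# Assumption for illustration only: the true win rate in races is 0.4.
n_needed = smallest_n(p_alt=0.4)
print("races needed for 80% power:", n_needed)
print("power at the current 57 races:", round(power(57, 0.4), 3))
```

Exact binomial power is not monotone in N, so the first N that reaches the target is a guide rather than a guarantee for every larger N.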
In this case, rejecting the null is confirming that his observations were correct, that he is consistently losing such contests more often than random chance would expect. Whether you want to call it 'luck' or the blessings of the IPU, I think it does support the hypothesis that his outcomes are not in line with random chance.
If you reject that null, you still haven't supported this ill-defined "luck" hypothesis any more than you have supported the hypothesis that the IPU is affecting the outcomes.
It is, in fact, one fourth of the odds of getting a royal flush and exactly the odds of getting a Royal Flush in a particular suit. That is not the computation I am making. See above.
ETA: Your 0.0018%, is just the Texas Sharpshooter fallacy. Deal out any 5 cards, and the probability of getting that exact hand is lower than the probability of being dealt a Royal Flush. You really have to define your hypothesis at the beginning.
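The royal flush comparison is easy to verify by counting five-card hands. A small sketch, assuming a standard 52-card deck and hands dealt without regard to order:

```python
from math import comb

# Number of distinct five-card hands from a standard 52-card deck.
total_hands = comb(52, 5)

# Any one fully specified hand, e.g. a royal flush in a named suit.
p_specific_hand = 1 / total_hands

# A royal flush in any of the four suits.
p_royal_flush = 4 / total_hands

print("five-card hands:", total_hands)
print(f"P(specific hand) = {p_specific_hand:.3e}")
print(f"P(royal flush)   = {p_royal_flush:.3e}")
```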
Thanks for the correction. The revised p-value is 0.000022 or 0.0022%.
On the AK vs 73 preflop, you're somewhere between a 60/40 and 65/35 favorite (I know it seems like it should be better than that) - depends on how the suits match up. You need a calculator to get it precisely. I used a calculator - assuming AK unsuited and 73 suited (and the AK doesn't take away from possible flush draws), you're 61.33 to 38.23 with a small tie percentage.
We'll continue to collect more. As a statistician, I'm always wanting more data. On the other hand, I constantly work with minimal data sets because the analyses I do professionally are on very expensive tests. Every additional data point costs hundreds of dollars more. The major problem with small datasets is low power, which means that if you don't reject the null, you don't have much confidence that the null is correct. But rejecting the null is done with a confidence of 1 minus the p-value regardless of sample size because the sample size is included in the computation of the p-value.
For me - and I'm not a mathematician - I would say you have insufficient data.
Remove that hand and the p-value for the remaining nine goes all the way up to 0.000894 or 0.0894%. That change takes the confidence level to reject the null all the way down to 99.91%.
If we ran a Monte Carlo simulation, and tried this sequence of 10 hands, I don't know how many standard deviations off the norm he would find the result - I suspect that getting REALLY 'unlucky' on that one hand skews the net result significantly. (Remove that one, and calculate the 'unlucky' quotient based on the 9 hands and you'll get a much 'flatter' result).
Being more than three standard deviations from the mean is a fairly standard criterion to define outliers.
Anyways - I think you'd find that he's a couple of deviations off the norm, but not an 'outlier'.
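The "how many standard deviations" question can be sketched directly. The example below assumes the ten hands are independent and uses the loss probabilities from the recomputed product earlier in the thread (with JJ vs Q9 taken as a 3/10 chance of losing); the simulation function, seed, and trial count are illustrative choices, not anything specified in the discussion.

```python
import random
from math import sqrt

# Assumed per-hand loss probabilities, taken from the recomputed odds.
loss_probs = [1/3, 1/5, 1/5, 2/3, 1/5, 2/3, 1/46, 3/10, 1/2, 1/5]
observed_losses = 9

# Exact mean and standard deviation of the number of losses
# for independent hands with unequal probabilities.
mean_losses = sum(loss_probs)
sd_losses = sqrt(sum(p * (1 - p) for p in loss_probs))
z_score = (observed_losses - mean_losses) / sd_losses

def simulate(trials: int, seed: int = 0) -> list:
    """Monte Carlo: count how often each number of losses (0..10) occurs."""
    rng = random.Random(seed)
    counts = [0] * (len(loss_probs) + 1)
    for _ in range(trials):
        losses = sum(rng.random() < p for p in loss_probs)
        counts[losses] += 1
    return counts

trials = 100_000
counts = simulate(trials)
sim_mean = sum(k * c for k, c in enumerate(counts)) / trials

print(f"expected losses: {mean_losses:.2f} (simulated {sim_mean:.2f})")
print(f"standard deviation: {sd_losses:.2f}")
print(f"observed 9 losses is {z_score:.2f} standard deviations above the mean")
```

With only ten hands the distribution is far from normal, so the standard-deviation figure is a rough guide and the exact tail probability remains the better summary.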
That's understandable. You're only seeing the recorded results. But from my hubby's POV, it's confirming what he suspected previously but did not have data to refute my contention that it was just confirmation bias.
And this also doesn't show that he's consistently 'unlucky'. You would need to capture a number of these types of sequences to start building that sort of case to my mind.
I'm not sure what hypothesis to build around it. I was expecting a far more normal outcome, not what we're seeing. If what I expected had occurred, I had hoped to use the data to talk him into adopting a more positive attitude about such things.
And fundamentally - let's just say that over the next year or so, you show that in 8 of 10 sets of data, he's 2+ standard deviations off into the 'unlucky' realm - what hypothesis would you build around that? To suggest your hubby is a 'cooler' is a paranormal explanation - he could just have been 'unlucky' enough to have a rather unlikely run of cards.
Thanks. I appreciate the sentiments although he had no plans to enter any tournaments of that sort.
I do hope it turns around for him soon - hopefully at the WSOP!
** This qualifies as a race. Our statistics are now 22 wins out of 57 races which has a p-value of .0556.
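The race statistic can be checked with an exact binomial tail. This sketch assumes the quoted p-value is one-sided (the chance of 22 or fewer wins in 57 races when each race is a fair 50/50), which is an assumption about how it was computed; the function name is invented for the example.

```python
from math import comb

def binom_lower_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

wins, races = 22, 57
p_value = binom_lower_tail(wins, races)
print(f"P(at most {wins} wins in {races} fair races) = {p_value:.4f}")
```

The printed value can be compared against the quoted .0556.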
The data are accurate; the probabilities are estimates. I'm more than happy to change them when someone provides me with a better value. I haven't used a poker calculator because I haven't been able to figure out how to use them to help me estimate the probabilities I need. As you point out, without the suits, the probabilities are not completely accurate.
This brings me to my next point: your data aren't accurate.
I agree, but I'm not the one collecting the data. If he chooses not to keep track of suits, that is his choice not mine.
The suits of the cards matter, as the above paragraph shows. Your husband should be keeping track of the exact cards shown up.
Yes, that is a more accurate formula. I've been somewhat lazy about that computation, multiplying by 10 rather than computing out the probability that each hand was the only winner. Thanks for making the computation. I'll try to check your arithmetic later.
We are now down to only a 99.994% confidence level for rejection of the null hypothesis.
The above calculation is wrong. If we play n hands, each with a probability of losing of p_i, i = 1...n, and X is the number of hands lost, then
[qimg]http://jt512.dyndns.org/images/p1.png[/qimg]
which, using your data, I calculate to be 0.000056 (someone should check my arithmetic).
To calculate a one-sided p-value, we have to add to this the probability of losing all 10 hands, which nudges the probability up to 0.000057. This is about double the probability you calculated.
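Since the formula image is not reproduced here, the following is a sketch under an assumption about what it showed: for independent hands, the probability of losing all but one is the sum over hands i of (1 - p_i) times the product of p_j over the other hands. The loss probabilities are the original estimates from the hand list, and the function name is made up for this example.

```python
from fractions import Fraction as F
from math import prod

# Original loss-probability estimates for the ten hands, in the order listed.
# The eighth hand (JJ vs Q9) is the one that was won; its loss probability is 1/5.
loss_probs = [F(1, 3), F(1, 3), F(1, 5), F(2, 3), F(1, 5),
              F(2, 3), F(1, 50), F(1, 5), F(1, 2), F(1, 5)]

def prob_exactly_one_win(ps):
    """Sum over i of (1 - p_i) * product of p_j for j != i (assumed formula)."""
    total = F(0)
    for i, p_i in enumerate(ps):
        others = prod((p for j, p in enumerate(ps) if j != i), start=F(1))
        total += (1 - p_i) * others
    return total

p_nine_losses = prob_exactly_one_win(loss_probs)
p_ten_losses = prod(loss_probs, start=F(1))

# One-sided p-value: nine or more losses out of ten.
p_one_sided = p_nine_losses + p_ten_losses

print(f"P(exactly 9 losses) = {float(p_nine_losses):.6f}")
print(f"P(10 losses)        = {float(p_ten_losses):.7f}")
print(f"one-sided p-value   = {float(p_one_sided):.6f}")
```

Using exact fractions avoids rounding questions when checking the arithmetic; under the stated assumptions the results agree with the 0.000056 and 0.000057 figures quoted above.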
Actually, what we did was start a new experiment based on feedback I had received here - to wit, that when the all-in call is made, the probabilities can be computed and it doesn't matter whether it's pre- or post-flop. Hands that fit the original criteria are added to our previous dataset, but we have also started building a new dataset with the criteria for inclusion being just that it was an all-in hand.
However, until this post, you made the point repeatedly that you were only using all-in pre-flop hands. Yet, in this calculation, you mysteriously have included an all-in post-flop hand. This was not part of the original hypothesis. You've changed the rules of the experiment part way through it to include selective bad beats that don't qualify for inclusion.
Excluding data just because you don't like it isn't kosher. Incidentally, a p-value of 0.0009 is not what I would term an ordinary bad run. It would allow us to reject the null hypothesis at a confidence level of 99.91%.
If we exclude the post-flop bad beat, the revised p-value is 0.0009; that is just under 1 in 1000, and is starting to look more like an ordinary bad run than a near impossibility.
Correct, the p-value is slightly above 0.05 for the 'races'.
So, when you stick with the original hypothesis, your results still are not significant at the 0.05 level (one-tailed).
It means that my dh's complaint seems based on an accurate assessment of his results, not confirmation bias due to his remembering the bad beats and not the wins.
Mathematically, you are testing the hypothesis that the probability of your husband winning is less than 0.5 for showdowns of the type you've described earlier. Let's say that you continue collecting data for another year, and stop the experiment according to some rule not related to the calculation of an interim p-value (arguably, that's impossible at this point, but let's ignore that). Furthermore, let's say that the final calculated p-value is 10^-8, overwhelmingly rejecting the null hypothesis. What could this mean?
Yes, that's one possibility, as it is for any experiment. The p-value tells us exactly what that probability is for the actual outcome. However, it's generally not considered a reasonable conclusion that a p-value of 0.001 or less was due to random chance.
1. The null hypothesis is correct, and an extremely unlikely event occurred by chance.
Yes. That is why I started this thread and have posted our results for discussion. The feedback I received for the 'races' experiment was that the probability of 0.5 was a reasonable estimate of that probability.
2. You made a methodological error in constructing the experiment such that the probability of winning the type of hands that were to be counted was actually less than 0.5.
This is always a possibility. It's certainly a reasonable supposition on your part. However, this is not an explanation for our results that I or my spouse finds convincing.
3. There were errors in data collection.
Quite possible. Again, that's why I'm posting the results here. Thanks for your help in this regard. I'll make the corrections and let you know if I get agreement with your values. However, the corrections don't appear to make much difference in the interpretation of the p-value. It's still very low.
4. There were errors in the analysis.
I actually find this to be quite unlikely given the circumstances of the games he plays. With his buddies, the composition of the other players changes with every game and there are only a few people who are there every time and there's no one person who wins consistently over time. He's been playing with them for years. The 'races' experimental data consist primarily of on-line play money games where cheating seems a very unlikely possibility. Again, while it is a reasonable supposition for you or others reading this, it is not an explanation for our results that I or my spouse finds convincing.
5. Your husband was cheated.
I see no reason to posit a supernatural force. I don't define 'luck' that way.
6. There is a supernatural force, call it luck, that negatively affects your husband while playing poker.
I agree. Still, the data are consistent with a lower than normal win rate for all-in show-downs, which is what he has been complaining about. I've been working to correct any issues with regard to reasons 2 and 4. Reasons 3 and 5 are not reasonable conclusions for us. Reason 1 is not a reasonable conclusion with the p-values we have. So I'm looking for other possible explanations and we are continuing to collect data to see if his win rate will eventually approach something more in line with the expectations of random chance.
It should be evident that no matter how firmly the null hypothesis is rejected, the possibility that one of the first five possible explanations above (and probably more I haven't thought of) is responsible for the result is astronomically more likely than the sixth.
@JT - The original hypothesis from Beth was that her husband was 'unluckier' than could be expected only in 'all-in' situations. As such, I pointed out that once all the money is all-in (whether it was skill or foolishness that got it there is moot), one can simply determine the mathematical expectation of that situation.
For example, AhAd vs QsJs preflop has a specific EV (80.12% for AA, 19.52% for QJs). Whether or not skill was involved in getting to this stage is immaterial. Over the long haul, the player with the AA should win about 80% of the time.
At first the test data was only going to include 'race' type of hands. These hands vary only slightly in value (generally 52/48 range).