But if a state moves over 3 times out of 5 runs of the programme...?
If they have the percentages, surely they don't really run simulations, do they? Surely it's a simple mathematical calculation?
 
The percentages are the outcome of the simulations, not the reverse. If Silver runs 1,000 simulations and Clinton wins MA in 999 of them, then the percentage for MA is 99.9%.
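
To make that concrete, here is a minimal Python sketch of how a win percentage can fall out of repeated runs. It is not 538's actual model: the function name `win_percentage`, the 25-point polling lead, the 8-point uncertainty and the normal error are all assumptions made up for illustration.

```python
import random

def win_percentage(poll_margin, margin_sd, runs=1000, seed=0):
    """Share of simulated elections (in percent) in which the candidate wins."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(runs):
        # Each run draws one plausible final margin around the polling average.
        simulated_margin = rng.gauss(poll_margin, margin_sd)
        if simulated_margin > 0:
            wins += 1
    return 100.0 * wins / runs

# Hypothetical inputs: a 25-point polling lead with an 8-point uncertainty.
ma_pct = win_percentage(poll_margin=25.0, margin_sd=8.0)
print(f"Clinton wins in {ma_pct:.1f}% of runs")
```

Note that the input here is a margin plus an uncertainty, and the percentage only appears at the end as a count of wins divided by the number of runs.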
 
But the simulations start with percentages, right? That's the role of the polls, I presume (weighted, but that matters little).

I should think that if you have percentages which are fed into the simulations, then the simulations are unnecessary. A decent statistician should be able to tell you the odds without running simulations.

I admit that I must be missing something, since Silver is a decent statistician and I am not.
 
The simulation algorithm is described here: http://fivethirtyeight.com/features...ethirtyeights-2016-general-election-forecast/

It's quite complex, and digs down past the numbers into things like demographics, instead of treating the states as independent blocks. This prevents the simulation from doing things like calling Pennsylvania for Trump, but Ohio, with similar demographics but slightly more conservative leanings, for Clinton.
 
I think it's just how they run the simulations. If you win a given state a certain number of times in the simulation, then you are given part of its electoral vote.

I sure hope that's not how they do it because that would be horribly wrong. The proper way is to allocate EC votes per each simulation THEN do the averaging. The two different approaches would not necessarily give the same results.
 
Yep.

I can't believe the people arguing for the sanctity of St. Nate's data. I don't disagree that the data reflects something, but it's shown as EC votes, not Halpern's Hypothesis of EC Votes. The number changes if you hit Now-Cast, Polls Only, or Polls Plus. In none of them does it reflect what those options are predicting. It's just silly. All his percentile scores for the three different models address the probabilities. Since he is best at forecasting state results, and the rules (including ME2 and NE2) are known, you can still count the EC votes.

I can almost do the figures in my head, frankly. It's just a strange number to have with merely a label that says EC Votes.
 
I don't think anyone is arguing for the sanctity of Nate's data. It isn't his data anyway: he doesn't poll himself, he only analyzes others' data, so at most people could be arguing for the sanctity of Nate's modeling.

And also, it isn't silly. It's just that the math is a bit more complicated than you think.

I think it's just how they run the simulations. If you win a given state a certain number of times in the simulation, then you are given part of its electoral vote. It's intended to be a reflection of the distribution of EVs from all the simulations, rather than a whole number - i.e. when you average it all out after the simulations Hillary has a mean of 334.5 EVs.

I sure hope that's not how they do it because that would be horribly wrong. The proper way is to allocate EC votes per each simulation THEN do the averaging. The two different approaches would not necessarily give the same results.

I have to backtrack on my own words again. For the expected number of EC votes, you can simply multiply the chance of winning each state by its number of EC votes and add up the results. For that single number, the expected value, it doesn't matter that the races in the various states are not independent.

But that single number doesn't give the whole picture. It doesn't say anything about the probability distribution among the possible outcomes. Right now, the expected value of Hillary's EV is around 330. If the distribution of outcomes is something "normal", like a Gaussian distribution or more likely a Poisson distribution, yes, then her chance of getting a majority in the EC is something like 90%.

But the distribution could also be something different. In the extreme case, the results in all state races could be completely correlated: win all or lose all. An expected value of 330 EV could also be the result of 50% of simulations coming up with 130 EV, and the other 50% coming up with 530 EV. In which case Hillary's chance of winning the presidency would be only 50%.
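
A small Python sketch of that point, assuming 270 EV are needed to win. The `summarize` helper and the "tight" scenario are made up for illustration; only the 130/530 split comes from the example above.

```python
def summarize(outcomes, needed=270):
    """outcomes: list of (probability, electoral_votes) pairs summing to 1."""
    expected_ev = sum(p * ev for p, ev in outcomes)
    win_chance = sum(p for p, ev in outcomes if ev >= needed)
    return expected_ev, win_chance

# Fully correlated case from the post: win (almost) everything or lose it.
all_or_nothing = [(0.5, 130), (0.5, 530)]
# Hypothetical tight case: outcomes cluster around the same average.
tight = [(0.5, 320), (0.5, 340)]

print(summarize(all_or_nothing))  # (330.0, 0.5)
print(summarize(tight))           # (330.0, 1.0)
```

Both scenarios have the same expected value of 330 EV, yet the chance of reaching 270 is 50% in one and 100% in the other.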
 
538 has now updated their "polls-only" results to 86% for Clinton, and Arizona is now (slightly) blue. Arizona is still red in their "polls-plus" prediction.
 
I have to backtrack on my own words again. For the expected number of EC votes, you can simply multiply the chance of winning in each state by its number of EC votes.

That's what I'd assume.

Consider the following: suppose they had predicted that for every state, Clinton had a 90% chance of winning (ignore the Maine, Nebraska and DC issues). Now, if that is exactly true, then we should expect her to LOSE 5 states. And the average number of electoral votes would be 53.8 for Trump and the rest for Clinton.

The problem is, you don't know which 5 states she would end up losing. Or whether it would be 4, or 6. The chance that she wins them all is only 0.5%. It could be that the 5 states she loses are the 5 largest in electoral votes. In that case, Trump would get a lot more than 50 votes. Alternatively, maybe it is the 5 smallest states, in which case he would get only 15.

Now, in this scenario, either of these outcomes is equally likely. However, in the real case, where probabilities vary all over the place, the math is really hard. So in that situation, it's probably a lot easier to run 10 million simulations using the probabilities for each state and looking at the outcomes that way. If I had the probabilities, I could do it easily.
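
Here is roughly what such a simulation could look like in Python for the toy scenario above. Assumptions: every contest is an independent 90/10 coin flip, Maine and Nebraska are winner-take-all, and DC is included so the total comes to 538. The electoral vote list is the 2012-2020 apportionment in alphabetical order; `simulate_trump_ev` is just a name picked for this sketch, and it uses 20,000 runs rather than 10 million to keep it quick.

```python
import random

# Electoral votes per state plus DC under the 2012-2020 apportionment.
# Maine and Nebraska are treated as winner-take-all here, as in the post.
EV = [9, 3, 11, 6, 55, 9, 7, 3, 3, 29, 16, 4, 4, 20, 11, 6, 6, 8, 8, 4,
      10, 11, 16, 10, 6, 10, 3, 5, 6, 4, 14, 5, 29, 15, 3, 18, 7, 7, 20,
      4, 9, 3, 11, 38, 6, 3, 13, 12, 5, 10, 3]

def simulate_trump_ev(p_clinton=0.9, runs=20_000, seed=1):
    """Return Trump's electoral vote total in each simulated election."""
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        # Independent draw per contest: Trump takes it with prob 1 - p_clinton.
        totals.append(sum(ev for ev in EV if rng.random() > p_clinton))
    return totals

totals = simulate_trump_ev()
mean_ev = sum(totals) / len(totals)
clinton_wins = sum(1 for t in totals if 538 - t >= 270) / len(totals)
print(f"mean Trump EV: {mean_ev:.1f}")
print(f"range of Trump EV: {min(totals)} to {max(totals)}")
print(f"Clinton reaches 270 in {100 * clinton_wins:.1f}% of runs")
```

The mean should land near 53.8, but the spread between the smallest and largest totals shows how much the average hides.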

I will say, however, that I think Silver underestimates the probabilities for the states, or at least has been doing so recently. In fact, his claim that he correctly called 99/100 states in the last two elections would suggest that. Unless his probabilities are in the 99% range, he should be getting a lot more wrong than he is. As I pointed out above, if all the states have a 90% probability, the odds of getting them all right are only 0.5%.

I pointed this out after the last election. Silver's model isn't working as well as he asserts, because if he were right, he'd be getting a lot more wrong. If that makes any sense.
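
The arithmetic behind that can be checked with a few lines of Python. This assumes every call is independent and has the same probability of being right, which is a simplification; `prob_at_least` is a helper written for this sketch.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(at least k correct calls out of n), assuming independent calls."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(f"all 50 right at 90% each: {0.9 ** 50:.4f}")   # about 0.005
for p in (0.90, 0.95, 0.99):
    print(f"p={p:.2f}: P(at least 99 of 100 right) = {prob_at_least(99, 100, p):.4f}")
```

Under those assumptions, getting 99 of 100 right only becomes likely when the per-state probability is up around 99%.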
 
538 has now updated their "polls-only" results to 86% for Clinton, and Arizona is now (slightly) blue. Arizona is still red in their "polls-plus" prediction.

So we're back to where we were in early August. What was the point of the last 2 months of campaigning? :)
 
The highlighted bit is the part that surprises me. Seems that the math could be settled more easily than running simulations.

But, of course, I know very little about statistics and Nate Silver knows a lot about statistics, and so I presume that my surprise is due to ignorance.
 
Probably not, because you basically have to consider every possibility. So "What is the probability of winning these 49 states and losing 1?" That's pretty easy, but now you have to repeat it with each state in turn as the one that is lost. That means 50 calculations right there. And now, what about 48-2? Well, there are 1225 combinations there (50*49/2). And then for 47-3, there are 19600 possibilities. And for 36-14, there are something like 9e11 possibilities.

You could either calculate them all, or you could just run a simulation of 10 million outcomes and see what is most likely.

You can always calculate the average - that's easy. It's just the probability of winning each state times the electoral votes in that state, summed over all the states. But to calculate the most likely individual outcomes? The number of possible outcomes is way too large.
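
For anyone who wants to check the counts, Python's `math.comb` gives them directly. This sketch only counts the win/loss combinations for 50 states; it doesn't compute any probabilities.

```python
from math import comb

# Number of ways to pick which k of the 50 states are lost.
for k in (1, 2, 3, 14):
    print(f"lose {k:>2} of 50: {comb(50, k):,} combinations")

# Every state can go either way, so there are 2**50 win/loss maps in total.
total_maps = 2 ** 50
print(f"all win/loss maps: {total_maps:,}")
```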
 
Now, in this scenario, either of these outcomes is equally likely. However, in the real case, where probabilities vary all over the place, the math is really hard. So in that situation, it's probably a lot easier to run 10 million simulations using the probabilities for each state and looking at the outcomes that way. If I had the probabilities, I could do it easily.
No, it's not as simple as "either of these outcomes is equally likely". The outcomes in the various states are not independent; they're correlated. Someone mentioned Ohio and Pennsylvania above. Those states have similar demographics, with Ohio traditionally more conservative leaning. If Ohio votes more Democratic than the polls indicate (due to something that happens between now and Nov 8), then surely Pennsylvania will also be more Democratic. No way you're going to see on Nov 9 that Ohio voted Democratic and Pennsylvania voted Republican.

I'm not 100% sure on the need for simulations, but I think this is where they come in. Nate's model not only takes into account the various polls out there, with their trustworthiness and their bias, but also contains correlations between the voting habits of the various states. And the easiest way to calculate the result is to run simulations that take those correlations into account. So when one simulation gives Ohio an outcome +2 for Clinton compared to the polls, it will simulate the Pennsylvania outcome also with a +2 slant for Clinton (or something a bit more sophisticated than that).
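
A rough Python sketch of that idea, and emphatically not Nate's actual model: each run draws one polling error shared by both states plus a small state-specific error. The polling margins (Clinton +1 in Ohio, +4 in Pennsylvania), the error sizes and the name `split_rate` are all invented for illustration.

```python
import random

def split_rate(shared_sd, local_sd, runs=20_000, seed=2):
    """How often Ohio goes Clinton while Pennsylvania goes Trump."""
    rng = random.Random(seed)
    # Hypothetical polling margins for Clinton, in points.
    oh_poll, pa_poll = 1.0, 4.0
    splits = 0
    for _ in range(runs):
        # Error common to both states; zero when the states are independent.
        shared = rng.gauss(0.0, shared_sd) if shared_sd > 0 else 0.0
        oh = oh_poll + shared + rng.gauss(0.0, local_sd)
        pa = pa_poll + shared + rng.gauss(0.0, local_sd)
        if oh > 0 and pa < 0:
            splits += 1
    return splits / runs

# Same total uncertainty per state (variance 17) in both setups.
correlated = split_rate(shared_sd=4.0, local_sd=1.0)
independent = split_rate(shared_sd=0.0, local_sd=17 ** 0.5)
print(f"OH blue / PA red, correlated errors:  {100 * correlated:.2f}%")
print(f"OH blue / PA red, independent errors: {100 * independent:.2f}%")
```

With the shared error, the odd "Ohio blue, Pennsylvania red" map should turn up far less often than when the two states are simulated independently.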

I will say, however, that I think Silver underestimates the probabilities for the states, or at least has been doing so recently. In fact, his claim that he correctly called 99/100 states in the last two elections would suggest that. Unless his probabilities are in the 99% range, he should be getting a lot more wrong than he is. As I pointed out above, if all the states have a 90% probability, the odds of getting them all right are only 0.5%.

I pointed this out after the last election. Silver's model isn't working as well as he asserts, because if he were right, he'd be getting a lot more wrong. If that makes any sense.
However, on election day the numbers will have stabilized and the margins of error of the pollsters are small, so the confidence of calling a state will have greatly increased.

And another factor is the correlation. It means you can't treat the 50 states as independent probabilities. Exaggerating a bit, that means he's either spot-on or he's off in a handful of states at a time.
 
NYT: 89%
538: 87%
Daily Kos: 96%
HuffPost: 91%
PredictWise: 89%
PEC: 97%

[Imgw=640]https://i.sli.mg/uHrNAQ.png[/imgw]
 
The snake graph is very important. If you look at RCP, they have her at 260, but haven't conceded her ME1 (2 EC votes) and MN (10). Every other aggregator has those in her column already. She has a very VERY conservative 272, and those states that they dream of in Trump Tower (NH, WI, MI, PA) are not turning back.

More important, though, is that the 272 doesn't include OH, NC, FL or NV. She doesn't need the favorite battleground states of the last two elections, although she will take three and possibly four.

RCP is doing yeoman-like duty trying to make it look close for their conservative readership.