The right-hand sidebar features a meta-analysis directed at the question of who would win the Electoral College in an election held today. Meta-analysis provides more objectivity and precision than looking at one or a few polls, and in the case of election prediction gives a highly accurate current snapshot. In 2004, the median decided-voter calculation captured the exact final outcome.
Calculations are based on recent available state polls, which are used to estimate the probability of an Obama/McCain win, state by state. These are then used to calculate the probability distribution of electoral votes corresponding to all 2.3 quadrillion possible combinations. For a popular article about this calculation, read this article and the follow-up.
Selected reader comments appear at the bottom of this page. - Sam Wang
What’s different about this analysis in 2008 compared with 2004?
There are three major differences.
First, the Meta-Analysis relies entirely on the well-established principle that the median of multiple state polls is an excellent predictor of actual voter behavior. On Election Eve 2004, a calculation based on this principle made a correct prediction of the electoral vote outcome. Additional assumptions were unnecessary and unwarranted. In 2008 the calculation is kept simple - and therefore reliable.
Second, the calculation is automated to allow tracking of trends over time. This allows the Meta-Analysis to be used to identify changes in voter sentiment as seen through the lens of actual electoral mechanisms.
Third, instead of focusing on battleground states, we are tracking all 50 states and the District of Columbia.
Is this Meta-Analysis a prediction of what will happen on Election Day?
Definitely not - at least not until close to Election Day. Between now and then, think of it as a precise snapshot of where the race stands at any given time. In November the Meta-Analysis should come quite close to the actual outcome. You can find predictions at other sites.
In the Meta-Analysis, how can you possibly go through 2.3 quadrillion possibilities? Wouldn’t that take forever?
The Meta-Analysis doesn’t actually calculate the probability of every combination of states one at a time. At a rate of a million combinations per second, that process would take over 71 years. Yet repeated simulation is exactly what other sites do - though they only do thousands of simulations, not quadrillions. Such a laborious approach means that they can only approximate the expectation based on a set of win probabilities.
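The 71-year figure can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes 51 two-way contests (50 states plus the District of Columbia) and the million-per-second rate quoted above, and the variable names are made up for the illustration.

```python
# Rough check of the brute-force cost of enumerating every combination.
n_contests = 51                        # 50 states plus the District of Columbia
combinations = 2 ** n_contests         # each contest has two possible winners
rate_per_second = 1_000_000            # rate assumed in the text
seconds_per_year = 365.25 * 24 * 3600  # average year length in seconds

years = combinations / rate_per_second / seconds_per_year

print(f"{combinations:,} combinations")
print(f"{years:.1f} years at one million combinations per second")
```

The count comes out to roughly 2.25 quadrillion combinations, and the running time lands between 71 and 72 years.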
Instead, the Meta-Analysis uses an overlooked method to calculate the probability of getting an exact number of electoral votes, covering all ways of reaching that number given the individual state win probabilities. This is a much easier problem - it can be solved in less than a second. Here is a simple example.
Imagine that there are just two states. State 1 has EV1 electoral votes and your candidate has a probability P1 of winning that state; in state 2, EV2 electoral votes and a probability P2. Assume that EV1 and EV2 are not equal. Then the possible outcomes have the following probabilities:
EV1+EV2 electoral votes (i.e. winning both): P1 * P2.
EV1 electoral votes: P1 * (1 - P2).
EV2 electoral votes: (1 - P1) * P2.
No electoral votes: (1 - P1) * (1 - P2).
In general, the probability distribution for all possible outcomes is given by the coefficients of the polynomial
((1 - P1) + P1 * x^EV1) * ((1 - P2) + P2 * x^EV2) * … * ((1 - P51) + P51 * x^EV51)
where 1…51 represent the 50 states and the District of Columbia. This polynomial can be calculated in a fraction of a second.
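For readers who want to see the shortcut in action, here is a minimal Python sketch of multiplying out the polynomial one state at a time. It is an illustration, not the site's actual code: the function name ev_distribution and the sample probabilities are made up, and it assumes that states are won independently of one another.

```python
def ev_distribution(states):
    """Return a list dist where dist[k] is the probability of exactly k EV.

    `states` is a list of (win_probability, electoral_votes) pairs.
    ASSUMPTION: each state is won or lost independently of the others.
    """
    dist = [1.0]  # the polynomial "1": zero EV with certainty before any state
    for p, ev in states:
        # Multiply the running polynomial by ((1 - p) + p * x^ev).
        new = [0.0] * (len(dist) + ev)
        for k, prob in enumerate(dist):
            new[k] += prob * (1.0 - p)   # state lost: total stays at k
            new[k + ev] += prob * p      # state won: total rises to k + ev
        dist = new
    return dist


# Two-state example with made-up numbers: P1 = 0.6, EV1 = 3; P2 = 0.7, EV2 = 5.
dist = ev_distribution([(0.6, 3), (0.7, 5)])
for total, prob in enumerate(dist):
    if prob > 0:
        print(total, "EV:", round(prob, 2))
```

With these made-up inputs the nonzero coefficients sit at 0, 3, 5 and 8 EV, with probabilities of about 0.12, 0.18, 0.28 and 0.42, matching the four products in the two-state example. Feeding in 51 pairs works the same way and involves only a few thousand multiplications.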
Why don’t other projection sites use your approach?
Three reasons. First, the Meta-Analysis is unlike, say, fantasy baseball, where a lot of enjoyment comes from thinking about the outcomes of individual simulations. At this site, we’re unsentimental - we just want the exact answer. Second, perhaps it hasn’t occurred to hobbyists at other sites to take the polynomial shortcut, which is made possible by the fact that the Electoral College follows a relatively simple system in which EV are added up. Third, certain aspects of the Meta-Analysis are patented.
Why should I believe the Meta-Analysis? In 2004, didn’t it predict a narrow Kerry victory?
Actually, the method was fine, but its inventor, Prof. Sam Wang, made an error. In the closing weeks of the campaign, he assumed that undecided voters would vote against the incumbent, a tendency that had been noticed in previous pre-election polls. Compensating for the “incumbent rule” had the effect of putting a thumb on the scales, lightly - but unmistakably - biasing the outcome.
Leaving out this assumption, the prediction in 2004 was exactly correct: Bush 286 EV, Kerry 252 EV. In retrospect, it’s clear that the incumbent rule is subjective and cannot be relied upon. You can read about the confirmation of the prediction in the Wall Street Journal (pre-election story here). A second confirmation came in 2006, when, using a related but simpler method, Sam estimated that the odds of a Democratic takeover of the US Senate were 50-50, a higher chance than predicted by either pundits or electronic markets. Indeed, that event did end up occurring.
Sam’s error won’t be repeated this year. Overall, the analysis will be kept as simple as possible as a means of avoiding unintended bias. Both data and the code for doing the calculations will be freely available. That way, anyone can check the results. Everything was open in 2004 as well; readers provided lots of useful feedback, such as this exchange.
State polls are done less often than national polls. Does that introduce a delay into your analysis?
Yes. As of early August this delay is about two weeks in key states. The delay will diminish dramatically as the campaign season progresses. A correction based on national polls is possible, but adds considerable uncertainty to the estimate.
What is the Popular Meta-Margin?
The Popular Meta-Margin is the amount of opinion swing that is needed to bring the Median Electoral Vote Estimator to a tie. It helps you think about how far ahead one candidate really is. For example, if you think support for your candidate is understated by 1%, this can overcome an unfavorable Meta-Margin of less than 1%. If you think that between now and Election Day, 1% of voters will switch from the other candidate to your dude, this is a swing of 2% and can compensate for a Meta-Margin of 2%.
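One way to picture this definition is as a search for the uniform shift in every state's margin that drags the median EV down to a tie. The Python sketch below is an assumption-laden illustration and not the site's actual code: it assumes each state's win probability is a normal CDF of its poll margin divided by a standard error, that states are independent, and that the swing is applied equally everywhere. The names win_probability, median_ev and meta_margin, and the three toy states, are all hypothetical.

```python
import math


def win_probability(margin, sem):
    # ASSUMPTION: the state's poll margin is normally distributed with
    # standard error `sem`, so the win probability is a normal CDF.
    return 0.5 * (1.0 + math.erf(margin / (sem * math.sqrt(2.0))))


def median_ev(states, shift=0.0):
    """Median EV total after adding `shift` points to every state's margin."""
    dist = [1.0]
    for margin, sem, ev in states:
        p = win_probability(margin + shift, sem)
        new = [0.0] * (len(dist) + ev)
        for k, prob in enumerate(dist):
            new[k] += prob * (1.0 - p)
            new[k + ev] += prob * p
        dist = new
    cumulative = 0.0
    for k, prob in enumerate(dist):
        cumulative += prob
        if cumulative >= 0.5:
            return k


def meta_margin(states, lo=-20.0, hi=20.0, tol=1e-4):
    """Swing against the candidate that just brings the median EV to a tie.

    Bisection works because the median never decreases as the shift grows.
    ASSUMPTION: the true answer lies between -hi and -lo points.
    """
    tie = sum(ev for _, _, ev in states) / 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if median_ev(states, mid) >= tie:
            hi = mid
        else:
            lo = mid
    return -hi


# Toy data: (margin in points, standard error in points, electoral votes).
toy_states = [(5.0, 3.0, 10), (1.0, 3.0, 20), (-4.0, 3.0, 15)]
print(round(meta_margin(toy_states), 2))
```

In this toy race the candidate leads, so the result is a positive number of less than one point: a uniform swing of that size against the candidate is enough to pull the median down to the tie line.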
What if I think that polls are biased against my candidate? Do you provide a tool for me to see how a bias changes things?
One tool is the Popular Meta-Margin (see above). Another tool is the map in the right-hand column, which comes in flavors that show single-state probabilities with a 2% swing toward either candidate.
What are jerseyvotes?
Jerseyvotes, invented at this site in 2004, are a way to measure the power of individual votes to sway the election. Conceptually, jerseyvotes are distantly related to the Banzhaf Power Index, but normalized to the power of one individual. If you have ten times as much influence over the win probability as a voter in New Jersey, your vote is worth 10 jerseyvotes. Sadly for the hosts of this site, one jerseyvote is not worth very much.


12 responses so far ↓
1 Frank // Aug 9, 2008 at 2:07 pm
I appreciate your analysis. In addition to the meta-margin, it would be helpful to me to see the % of undecideds that would equalize the median EVs. Can you show both?
Also, am I correct that Obama’s median EV is holding at 309 but his meta-margin is increasing slightly, and if so what would the interpretation be?
2 jcie // Aug 9, 2008 at 4:31 pm
I think the presentation would be a bit clearer with extra parentheses in the polynomial:
((1-P1) + P1*x^EV1) * ((1-P2) + P2*x^EV2) …
3 Sam Wang // Aug 10, 2008 at 12:31 am
jcie, thank you.
Frank: If we assumed 10% undecided, they would have to break about 2-1 in McCain’s favor. However, this is a very unreliable estimate. Some pollsters push harder than others in forcing respondents to choose a candidate. The recent undecided numbers in Pollster.com’s national polls range from 3 to 15% (some pollsters do not even report the figure). At the low end of the range, even 100% of the undecideds voting for McCain wouldn’t do it; at the high end, little more than a 3-2 break would be sufficient.
I noticed the same thing that you did about the meta-margin. Perhaps a state (or states) in the “safe” range for Obama or McCain recently reported a result more favorable to Obama, leading to a subtle shift in the distribution. Maybe the impact of the recent McCain attacks (Obama is a celebrity and/or the anti-Christ) has peaked. However, that would be a lot to hang on this result. We are unfortunately in a period when state polls are sparse and changes take several weeks to be fully seen in the meta-analysis.
4 Pat // Aug 11, 2008 at 10:09 am
Since you have the probability distribution of electoral votes, it might be nice to display a probability of victory for each candidate. Currently, with the distribution almost entirely above 270 Obama EV, I assume there would be a probability of victory of at least 95% for Obama.
It could be a useful indicator to plot over time. And since you made a point of differentiating your approach from that of fivethirtyeight.com, it would be interesting to see the extent to which the two probabilities differ.
5 Sam Wang // Aug 11, 2008 at 2:18 pm
Pat, a probability measure of the type you describe (currently >99%) would only be a snapshot of today. To get a true probability, it would have to be multiplied by the probability that the polls don’t move far enough to flip the outcome. That’s a highly uncertain number. My reading of the situation is that the true probability is about 75% or so. I’ll write about this later.
In the meantime, the history graph in the right sidebar gives something I think you will like, namely an indicator that can be plotted over time.
6 JoeA // Aug 13, 2008 at 7:43 pm
Even though I know it’s a bit of a lagging indicator in August, it would be great if we could get a graph of the metamargin and how it changes over time. Would be a nice way to see how the campaign’s evolving, and until the # of polls picks up, might also be a good indicator of trends likely to continue.
Love the site!
7 Richard Gilman // Aug 15, 2008 at 5:42 pm
I’m not a scholar so I basically read this site for the forecasts in the most basic sense. Right now, at the top of the home page, it says Obama 300, McCain 238. Where is the list of states that you project for each candidate? Thanks.
8 Sam Wang // Aug 15, 2008 at 6:21 pm
Mr. Gilman, there is no such list. The EV totals are the center of the range of the preponderance of likely outcomes. So they do not guarantee any particular outcome. However, it is possible to say with high confidence that if the election were held today, the total EV outcome would be at or near the numbers given.
If you really insist on definite assignment of states based on polls, you can click on the maps at right, which give probabilities. Click the map, at which point every state is forced to be assigned to Obama or McCain. This is more like what sites like electoral-vote.com or Pollster.com provide.
9 Oliver // Aug 16, 2008 at 10:52 pm
With all respect, Sam, the assumption in 2004 that undecideds would vote against the incumbent–an assumption that, by the way, makes sense and empirically is proven valid– was not wrong. The evidence is good that Kerry did win, as the investigative report by Congressman John Conyers says.
But let’s step back a moment. It’s an important tenet of science that we let the evidence lead us to conclusions, not the other way around. You will concede, I hope, that there are legitimate questions about the outcome of the 2004 election. If there are doubts about a piece of data, then one should not make conclusions based on either including or excluding the questionable data point. It’s bad science to do so. The proper approach is to suspend judgment, and let new data prove the matter one way or the other.
Personally, I think elections are mostly cooked by a terrible news media and the alienation and gullibility of less affluent/less educated voters. But there is plenty of evidence of outright cheating around the margins. What the US Civil Rights Commission documented in 2000, the racial disparities that emerged in the Lopategui case in New Mexico in 2004, and the March 2007 conviction of two Cuyahoga election board workers on felony counts for rigging the 2004 election are incontrovertible facts that make it clear that our elections are tilted. How much is still unclear. Until we know, I suggest letting your analytical system be guided purely by data and not by assumptions.
10 Frank // Aug 20, 2008 at 2:02 pm
Is the meta-margin weighted or unweighted by state EVs?
11 Pat // Aug 20, 2008 at 10:58 pm
Currently, the site fivethirtyeight.com projects a composite electoral vote total of 272 for Obama and 268 for McCain. On the other hand, they observe that McCain wins more often in the simulations (52% of the simulations).
You explained before that Nate Silver’s simulations were not only useless, but also inaccurate: indeed the distribution they get is far from a regular gaussian. Could these factors cause that paradox of one candidate getting a larger median EV total and the other winning more often? In principle, if they did not only perform 10,000 simulations, but an arbitrarily large number of simulations, would these situations be avoided? (in other words, can we expect the distribution of outcomes to be totally symmetrical in all cases?)
Thanks for your comments.
12 Sam Wang // Aug 20, 2008 at 11:49 pm
Pat - first, let me say that I have moderated my thoughts of Silver’s site. His commentary is excellent and he is quite knowledgeable about opinion polling. All prediction is intrinsically inaccurate, and singling him out is beside the point. His approach is a bit complex but the assumptions are not radically wrong. As you know, my preference is for the current snapshot.
The fact that the most likely specific EV total doesn’t always favor the probable winner is not the fault of numerical simulation. It is an intrinsic oddity of the Electoral College.
It is true that the spikiness of the probability distribution is closely linked to the fact that the most likely specific EV total (i.e. the mode) doesn’t have to match the median outcome. The median is by definition the most probable outcome since it represents the middle of the probability distribution.
If this is confusing, consider the following simplified example. Imagine that your favorite candidate has a 49% chance of winning in both OH and in VA. The most likely specific combination (the mode) is that he will lose both. Yet it is probable that he will win at least one.
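The arithmetic in this example can be checked with a short enumeration. This is only an illustrative Python sketch: it assumes the two states are decided independently, and the names p_win and combos are made up for the illustration.

```python
from itertools import product

# Win probabilities from the example above.
p_win = {"OH": 0.49, "VA": 0.49}

# Probability of every specific win/lose combination.
# ASSUMPTION: the two states are independent of each other.
combos = {}
for outcome in product([True, False], repeat=len(p_win)):
    prob = 1.0
    for p, won in zip(p_win.values(), outcome):
        prob *= p if won else (1.0 - p)
    combos[outcome] = prob

most_likely = max(combos, key=combos.get)          # the mode
p_at_least_one = 1.0 - combos[(False, False)]      # everything except losing both

print("most likely combination (OH, VA):", most_likely)
print("chance of winning at least one:", round(p_at_least_one, 4))
```

Losing both is the single most likely combination at about 26%, yet the chance of winning at least one of the two states is about 74%.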