On the track record of simple poll aggregation

October 24th, 2008, 8:46am by Sam Wang

As I’ve written many times, you should never get too concerned with a single poll. But what about the flip side – can you trust aggregates of polls? The answer is yes. In Presidential and Congressional elections since 2000, the general approach I advocate has an exceptionally good track record. Simple meta-analysis of polls should do at least as well as more elaborate models and can also outperform electronic markets. Read on.

Of the following evidence, 2004 was amply documented on this website at the time. The rest was based on publicly available data and is easily confirmed.

The evidence from 2004. On Election Eve, the 2004 Meta-Analysis indicated an outcome of Bush 286 EV, Kerry 252 EV – the exact outcome. The exact match was to some extent a lucky hit. But even at the level of single states, the only incorrect call was Wisconsin, which was won by one percentage point. As you can see from the 2004 history (here plotted using the 2008 averaging rule), the race was stable for most of October, and the outcome should have been no surprise.

Median EV estimator from 2004 race

Further lessons from 2004. At the time, I made additional assumptions about undecided voters splitting unequally in favor of Kerry and a difference in turnout from pollster expectations. These assumptions were wrong. This is why I am so critical of speculations that go beyond data. As I have previously written, there is very good evidence that the “Bradley effect” dissipated in the mid-1990s. The possibility of a “reverse Bradley effect” is unfounded. The omission of cell-phone-only voters may undersample support for Obama, a fact documented by the Pew Center.

2000: The origin of the Meta-Analysis. In 2000, I monitored Presidential polls using a proto-aggregate site run by Ryan Lizza at The New Republic. It became clear that the election would hinge on Florida. As we know, that turned out to be correct. This was a different picture from the one painted by national opinion polls, which suggested a Bush popular vote win that never materialized.

2006: Congressional midterm elections. At the level of the Senate and House, high-quality meta-analysis requires more data than are typically available in single races. So the thing to do is to combine data from all states.

In the Senate, polling data suggested a 50-50 chance of a Democratic takeover, which occurred. At the time, Intrade showed the probability as only 25%. Public commentators were, for the most part, caught flatfooted.

The House was harder to predict because many competitive districts had only one poll – or no poll at all. However, it was still possible to aggregate all available polls by simply counting every lead, no matter how small, as a win for that side. Aggregated House polling data showed 231 D, 197 R. The remaining steps are to split equally the seven districts with no polls, and to use binomial statistics to place a confidence interval. This led to a prediction of 234.5 +/- 3.0 D, 200.5 +/- 3.0 R, a gain of 30-35 seats. The actual outcome was 233 D, 202 R, a gain of 31 seats, well within error.
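
The counting rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code used at the time: the function name is made up for this example, and the binomial term covers only the unpolled districts, each treated as a fair coin flip. That term alone comes to about 1.3 seats, so the +/- 3.0 quoted above presumably also reflects uncertainty in the polled districts, which this sketch does not attempt to model.

```python
import math

TOTAL_HOUSE_SEATS = 435


def house_seat_estimate(dem_leads, rep_leads, total=TOTAL_HOUSE_SEATS):
    """Count every polling lead as a win, then split unpolled districts evenly.

    Returns (unpolled districts, Democratic seat estimate, binomial standard
    deviation contributed by the unpolled districts alone).
    """
    unpolled = total - dem_leads - rep_leads
    dem_estimate = dem_leads + unpolled / 2
    # Assumption: each unpolled district is an independent 50-50 proposition.
    sd_unpolled = math.sqrt(unpolled * 0.5 * 0.5)
    return unpolled, dem_estimate, sd_unpolled


unpolled, dem_estimate, sd_unpolled = house_seat_estimate(231, 197)
print(f"unpolled districts: {unpolled}")
print(f"Democratic estimate: {dem_estimate}")
print(f"SD from unpolled districts only: {sd_unpolled:.2f}")
```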

Similar documentation can be found at Andy Tanenbaum’s site. Tanenbaum is the pioneer of poll aggregation and is worth reading on this subject. He has analyzed data since 2000. I will provide a link when I find it.

The only remaining question is…what about this year? Answers on Monday.

Tags: 2008 Election

  • blair alef

    Absolutely great background. Very clear and easy to understand. I’m having a great time this election season switching between 538 (projection) and this site (current reality). I think your median EV estimator graph is a fantastic way to follow events & the election.

    P.S. As someone who values knowledge, try and check the schedule for CSPAN every night. They have had some fantastic coverage lately.

  • Sam Wang

    Yes, from a practical point of view the value added on this site is the history and the voter-power measure. That’s interesting re C-SPAN.

  • gprimos1

    Dr Wang,

    Thanks for responding to my question yesterday. As you mentioned, with the median your n may be very small (isn’t it 3?) leading to discrete jumps. I would be curious to see what the EV estimator would look like with the mean since the tails aren’t that long. My intuition says there would be more day to day variation in the EV estimator, but that the peaks would be muted (in addition to smaller CIs of course), and you would not lag as much as if you increased your n. Weighted median would also be interesting but I don’t know much about it.

  • Hans

    I find your methods very compelling, Sam, but is there any data source that you could use to check earlier elections than 2000? It would be interesting to see how, for example, the method would have worked had it been used in the 1996 election, which was the last real electoral college blowout, and, given current indications, may be a better model for this year’s election.

  • David U

    I teach psychology and will have my students compare your site and another. For the most part, the two are converging, which we would expect as the race draws near and the number of polls grows large. The comparison is to highlight the importance of simplicity and elegance–the fewer assumptions, the better. Thanks for doing this.

  • gprimos1

    More harping on the median…

    So just to investigate the data I grabbed the last 30 data points for Florida off pollster. The O-M difference shows no trend at all; it holds steady. If we do a 3-point moving average, the resulting distribution looks kinda normal (p=.19), with stuff pretty much bunched in the middle. If we do a 3-point moving median, the resulting distribution does not look normal (p<.005), with more on the tails. The results are in the same direction but less pronounced if we increase n to 5.

    Interesting to me, but it would probably have no effect on the model.

  • gprimos1

    Err whoops! I should have taken the range of the proportions themselves not their differences. The median looks much nicer there.

    Please ignore my previous posts, carry on, nothing to see here…

  • blair alef

    As a novice I enjoy the direct simplicity of the Meta-Analysis. One thing I enjoy about 538, though, is comparing current polling averages to the regression analysis portion of their site. So far the comparison of the two values – current polling vs. projection of voting behavior based on socio-economic analysis – has been a very good predictor of where and how the polls will move.

    The 538 regression is a very interesting interactive recipe for predicting voter preferences using a wide range of demographic factors. As an example, in early September, as whites, and then white males in particular, began to warm to Obama, the regression projection jumped way out ahead of the polling coming in at that time, ‘projecting’ the jump in polls and EV’s that appeared later in the month and early this month. Looking at the election in this way there is still some possible movement toward Obama in OH, GA, MT, & WV, although it may not actually change the EV map. From the same viewpoint, CO, FL, MO, NC, NM, NV, PA & VA have hit stability, close to fulfilling the most likely voting patterns for their particular state demographics.

  • Blair T. Johnson

    One thing that is not coming out clearly in this discussion is the stability of the estimates underlying the meta-analytic averages that this site summarizes. I agree we should trust aggregates more than individual poll numbers, but the value of the aggregates hinges on the assumption that the underlying estimates converge on a single population value, that there is no more than sampling error at work (click on my name for more information about meta-analysis and these sorts of assumptions). This assumption can be evaluated by calculating Cochran’s Q or Higgins and Thompson’s (2002) I² statistic, which essentially test the “goodness-of-fit” of the aggregate against poll estimates. I suspect that Sam has not been showing numbers like these because there are too few estimates available for most states within the time-frame that this site aggregates. This article addresses that issue directly: (So the point is that it would work well with states for which many polls are available.)

    The other sites that use regression with the state poll data are essentially assuming the opposite, that the poll numbers are not stable (especially over longer reaches of time) and that these trends can be modeled to produce more precise estimates.

    Any comment, Sam? Keep up the great work!
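
For readers who want to try the heterogeneity check described in this comment, here is a minimal sketch of Cochran’s Q and I² for a set of poll margins. It uses the standard inverse-variance (fixed-effect) weighting; the poll margins and standard errors are invented for illustration and do not come from any real state. Under the assumption of pure sampling error, Q is compared against a chi-square distribution with k − 1 degrees of freedom, a step left out here to keep the example dependency-free.

```python
def cochran_q_i2(estimates, std_errors):
    """Return (pooled estimate, Cochran's Q, degrees of freedom, I^2).

    estimates  : poll margins (e.g. in percentage points)
    std_errors : standard error of each margin, in the same units
    I^2 is returned as a fraction between 0 and 1.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I^2 is the share of variation beyond what sampling error explains.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, df, i2


# Hypothetical margins from five polls of one state, each with SE of 3 points.
margins = [4.0, 6.0, 5.0, 7.0, 3.0]
errors = [3.0] * 5
pooled, q, df, i2 = cochran_q_i2(margins, errors)
print(f"pooled margin = {pooled:.2f}, Q = {q:.2f} on {df} df, I^2 = {i2:.0%}")
```

With only a handful of polls per state, Q has little power, which is the limitation the comment itself points out.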

  • Geoffrey Hellman

    Two questions from a lay observer (expertise in logic, phil of math/science):

    1) the formula for computing EV probs. seems to build in statistical independence of states, which has to be wrong in many cases (e.g. Alabama & Mississippi, RI & Massachusetts, etc.); don’t demographics need to be taken into account, or are they at some other stage?

    2) You say the 2004 presidential race was accurately predicted, but if–as now seems clear from work of Mark Crispin Miller (“Loser Take All: Election Fraud and the Subversion of Democracy”)–Kerry actually did win Ohio and therefore the election, how can such a claim be made? Indeed, even if there is serious doubt about the result, obviously no claim about “being right about the outcome” can be any more certain.
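
To make the independence assumption in question 1 concrete, here is a hedged sketch of how an electoral-vote distribution can be built from per-state win probabilities when states are treated as independent. It is not necessarily the site’s own calculation; the three states and their probabilities are invented. Each state either adds its electoral votes or not, and the distribution is updated one state at a time.

```python
def ev_distribution(states):
    """Exact EV distribution for one candidate, assuming independent states.

    states: list of (electoral_votes, win_probability) pairs.
    Returns a list where entry k is the probability of winning exactly k EV.
    """
    total = sum(ev for ev, _ in states)
    dist = [0.0] * (total + 1)
    dist[0] = 1.0
    for ev, p in states:
        updated = [0.0] * (total + 1)
        for k, mass in enumerate(dist):
            if mass == 0.0:
                continue
            updated[k] += mass * (1.0 - p)   # state lost
            updated[k + ev] += mass * p      # state won
        dist = updated
    return dist


def median_ev(dist):
    """Smallest EV total whose cumulative probability reaches 50%."""
    cumulative = 0.0
    for k, mass in enumerate(dist):
        cumulative += mass
        if cumulative >= 0.5:
            return k
    return len(dist) - 1


# Hypothetical three-state example: (electoral votes, win probability).
example = [(27, 0.5), (20, 0.9), (11, 0.2)]
dist = ev_distribution(example)
print("median EV:", median_ev(dist))
```

If polling errors are correlated across states, as the question suggests, the true distribution would be wider than this one; the sketch makes no attempt to model that.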

  • Sam Wang

    Geoffrey, I have little sympathy for claims of fraud in 2004. Perceptions are colored by the crisis of 2000. There is no denying that irregularities occurred in 2004, but the question is whether they were numerically sufficient to affect the outcome. I submit that they were not. The sign of the polling margin matched the reported outcome in all states except Wisconsin.

    Indeed, it is my view that a careful look at professional polling before an election can serve as one of our better bulwarks against fraud – one that escapes overheated claims of theft.

  • Evans

    Wow… do all polls not include cell-phone only voters? We’re getting pretty old now, and I think more or less the entire 18-25 crowd considers the land-line a novelty…

    Also, we split more than 2-1 for Obama, so that’s definitely a huge effect not being counted in these polls (to say nothing of the way early voting will change our turnout… we don’t like to be rushed :) ) .

  • egc52556

    Can this analysis tell us anything about how a poll SHOULD be conducted to get more accurate results? Is it possible to define a poll’s accuracy given that there is no way to objectively measure what the “real” answer is?

  • Blair T. Johnson

    egc52556 says, “Is it possible to define a poll’s accuracy given that there is no way to objectively measure what the “real” answer is?” Of course, on election day, we’ll see what the ‘real’ numbers are. (Of course some would argue that those are not ‘real,’ either, especially when people who try to vote are not allowed to do so.)

  • Sam Wang

    Evans, many national polls now include a cell-phone sample. This year the discrepancy is probably 1-2%, not that much. I wrote about this, as have the originators of the data at the Pew Center and other analysts. New methods of surveying opinion are inevitable. We’ll see what pollsters do in 2010.

    In regard to the conduct of polls, my own view is that there is a certain advantage to having everyone do a slightly different thing. One person who has worked on increasing transparency and improving methodology is a former pollster, Mark Blumenthal. His writings at his site are quite informative.

  • Collin

    I don’t follow the meta-analysis v. Intrade reference. If meta-analysis said the chance of a Democratic takeover of the Senate in 2006 was 50/50, and the Intrade market showed a 25% probability, I don’t see how to draw any conclusion as to the robustness of either tool given the actual true outcome.

    Likewise, with respect to the adjusted versus unadjusted meta-analysis for the 2004 presidential election, if the actual result was (as I assume) within the error band of both methodologies, can one assume that the unadjusted methodology is better just because it was “right on”?

    I’m not a skeptic of meta-analysis or Dr. Wang’s reasoning, but I’m interested in the issue of evaluating estimation methods. Given two alternative methods of this sort, how many reality checks would it take to distinguish which is better on formal statistical grounds? Do we have to reserve judgement until the year 2300? I’m not sure I can take waiting even until Nov 4!

  • Quadrivium

    I’ve learned a lot from this site; thanks.

    Occasionally I’ve been puzzled by things that look like small inconsistencies, though I assume there are good explanations for them. Right now, for example: the “Power of Your Vote” table shows Obama’s margin as +6.5% in Ohio and +4% in Indiana–but Indiana is darker blue on the map than Ohio. Why is this? Can a smaller margin translate into a higher win probability? (I would think this might be true if there were more reliable data in the state with the smaller margin, but I’ve seen a lot more polling data in Ohio than in Indiana.)

    Also, when I click on “+2% for McCain,” the states flip back–Indiana becomes lighter than Ohio. Is there reason to believe that nationwide improvement in McCain’s numbers would be felt more strongly in Indiana than in Ohio?

  • Nicholas J. Alcock

    Dear Sam,
    I did try and post on this issue a few days ago. Now, Andrea Moro uses Monte Carlo simulation, Pollster uses a regression, 538 I’m not sure about, Realclearpolitics uses simple averages. I have looked on Google about meta-analysis and there is a mathematical tool to estimate this. Now, I would guess this estimation is different to the aforementioned statistical tools. Can you please tell us how it is different and how it is better?

  • Sam Wang

    Collin – your points are valid. There are more tests, which are documented by Andy Tanenbaum. The overall observation is that general election polls are good predictors in the home stretch, especially when averaged.

    Quadrivium – it’s not the margin that is used, it’s the margin divided by the error bar, which gives a z-score. In other words, variation among polls (and sampling error) is used to calculate confidence. In this case Ohio has quite a large error bar. See the data.
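
As a rough illustration of the z-score point, the sketch below turns a margin and an error bar into a win probability using a normal distribution. The normal curve is an assumption made for this example (the site may use a different distribution), and the error bars are invented; only the two margins come from the question above.

```python
import math


def win_probability(margin, error_bar):
    """Probability that the true margin is positive, given a z-score.

    Assumes the margin is normally distributed around its estimate with
    standard deviation equal to the error bar (an illustrative assumption).
    """
    z = margin / error_bar
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Margins from the question; error bars are hypothetical.
ohio = win_probability(margin=6.5, error_bar=5.0)      # z = 1.3
indiana = win_probability(margin=4.0, error_bar=2.0)   # z = 2.0
print(f"Ohio: {ohio:.3f}, Indiana: {indiana:.3f}")
```

With these made-up error bars, the smaller Indiana margin yields the higher win probability, which is the kind of reversal the question describes.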

  • Frank

    If the differences described were correlated across elections, then, using today’s medians and the 2004 differences, this year FL and MO would move into McCain’s quadrant, and there would presumably be other states (on net) doing the same among the 36 states not shown in that graph. Do you think the state differences are correlated across elections? And if so, how much would that lower the EV estimate this year?

  • Frank

    The meta-margin is now 7.88%. If it were based on mean rather than median EV, then the current state median margins would have to decrease by just over 7% (i.e., McCain gaining 3.5%) to result in the narrowest McCain win (270-268; a tie would not be feasible with the current state margins). This just means that McCain would have to pick up all 10 states with current margins of 0 to 7%: FL (0), ND (2), NC (2.5), MO (3), IN (4), NV (4), CO (5), NH (7), NM (7), OH (7).

  • Frank

    Correction: The measure I discuss in my previous comment has nothing to do with mean EV (currently 361.85, very close to median). It just gives the state to the leading candidate. List states by median margin and cumulate EVs until 269 is hit. It’s sometimes helpful to think of the polling data in very simple terms.
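
The “list states by margin and cumulate” measure can be sketched as follows. The margins are the ones listed in the earlier comment; the electoral-vote counts are the 2008 apportionment, and the starting total of 160 EV for McCain is inferred from the 270-268 figure above rather than taken from the site’s data. The function name is made up for this illustration.

```python
def tipping_point(base_ev, contested, target=270):
    """Walk through states from smallest to largest margin, adding EVs.

    contested: list of (state, electoral_votes, margin_to_overcome).
    Returns (state, margin, cumulative EV) at the point the target is reached,
    or None if it is never reached.
    """
    total = base_ev
    for state, ev, margin in sorted(contested, key=lambda s: s[2]):
        total += ev
        if total >= target:
            return state, margin, total
    return None


# (state, 2008 electoral votes, current Obama median margin in points)
contested_states = [
    ("FL", 27, 0.0), ("ND", 3, 2.0), ("NC", 15, 2.5), ("MO", 11, 3.0),
    ("IN", 11, 4.0), ("NV", 5, 4.0), ("CO", 9, 5.0), ("NH", 4, 7.0),
    ("NM", 5, 7.0), ("OH", 20, 7.0),
]

# Assumption: McCain starts from 160 EV in states not listed above.
result = tipping_point(160, contested_states)
print(result)
```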

  • William

    Is it just me, or does the Popular Meta-Margin continue to increase even as the median EV has started jiggling up and down?

  • Evans

    Frank, not sure that works. What is the basis for the assumption that voter turnout differences from polls will be the same this year as in 2004? Isn’t there about as much reason to think the differences will be the exact opposite, given considerations like the early vote differential in both years, increased cell phone users, more enthusiasm in younger voters, etc?

    I think we use the current polls as the basis (with their changed internals to better reflect the population, including from the last election) because using past differences no longer applies…

  • William

    In fact, I’m willing to bet that to some degree past differences are already in the polls and putting them in would actually be a form of “double-dipping”.

  • Sam Wang

    William – A similar fluctuation also occurred in 2004. In the closing weeks of the campaign, the density of polls is quite high. For this reason I will base my actual predictions on a longer time window. One set of predictions will come soon.

    In regard to turnout and other variables, most spectators lack the analytical skills to do a good job of second-guessing the estimates of pollsters. I expect a meta-analysis of all available professional polls to do very well at predicting outcomes.

  • Aaron

    Really appreciate what you’re doing here.
    I graphed the Meta-Margin on top of the Median EV and Obama “Safe” EV; I think it says interesting things. The graph is here.

  • ron

    How can you state that the analysis got the EV for ’04 right but incorrectly called Wisconsin? I’m sure there is a reasonable explanation, but the sentence itself screams inconsistency. Other than that, good post.

  • Frank

    I just heard the suggestion made on a TV program that Obama is investing more in states with close senatorial races.

  • Mona

    I don’t see any of the polls including those who are going to write in Hillary Clinton as their choice for president (and for that reason, I don’t know how many of them there are). But I wonder if these people are included in the polls among those who say they are going to ‘vote Democrat’? That would then take away from Obama’s numbers. Would love to know your thoughts. Thanks! And thanks for your great site!

  • Sam Wang

    Frank – I’ve read that Obama cut an ad in Oregon. This type of activity is smart though overdue. Since he lags other Senate Democrats in support, remaining smart moves would be money transfers and House campaigns. Conversely, McCain (or perhaps better yet, Palin) could help Republicans by going to Georgia, Kentucky, and Mississippi.

    Mona – Polls specifically ask about Obama and McCain. Polling evidence suggests that Clinton supporters have largely fallen in line. Some were probably Republicans who wanted to vote for a woman, and are back in their party’s fold. If there are any holdouts, there’s no obvious reason for them to be more anonymous than supporters of Barr, Badnarik, Paul, and others. In short, polls are likely to reflect adequately the actual state of the race.

    People should look at Aaron’s graph, especially the fact that the Meta-Margin is still moving even as the EV estimator has reached a plateau. It’s Obama’s red ceiling.

  • Todd S. Horowitz

    Has Obama’s “red ceiling” moved up any? A number of states that seemed deep red back in September (e.g. North Dakota, Montana, Georgia) are suddenly looking pale pink at best. On October 4th, you published a graph of the EV estimator as a function of a swing towards Obama in the current polls. Have we just been moving along that curve, or has the curve itself changed in the last three weeks?

  • Jonathan

    I would just like to second Collin’s comment above, specifically: it would be nice to see some statistically rigorous comparison of the performance of poll-based and electronic markets-based predictions of electoral outcomes. I’ve just been over to the site mentioned above and couldn’t find anything about market-based predictions.

    It’s interesting that both poll- and market-based sites (e.g., intrade) claim to be superior, but I have yet to see any solid data to support these claims. Do you know of any that you could point us to?

  • Bing Zhang

    Dear Sam,

    I heard of your website from your friend D. Smith. True or not, your graphs have kept my hope alive for Obama and reduced my anxieties! Let’s hope your prediction will be confirmed. Thanks, Bing

