Princeton Election Consortium

A first draft of electoral history. Since 2004

The power of polls: a protection against fraud?

November 7th, 2008, 10:52pm by Sam Wang

In a radio interview on Wednesday, I pointed out that averaged pre-election polls were very good predictors of final outcomes. Indeed, they are an underappreciated defense against fraud.

First, let’s look at this year’s results. In every state where the median pre-election poll showed a margin of 1% or greater, the leading candidate ended up winning that state.

(Click the map to get a detailed zoom for polling margins less than 15%. Data here. Error bars corrected to be plotted horizontally.)

Overall, the median discrepancy between polls and outcomes was 3% (r=+0.975). The discrepancies are about 1.5 times as large as would be expected from variability among polls and/or sampling error. But because the outcomes tend to be more lopsided than indicated by polls, the larger error doesn’t hurt when it comes to predicting winners.
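For readers who want to run this kind of comparison themselves, here is a minimal sketch in Python of three summary numbers: the median absolute discrepancy, the correlation r, and the slope of outcomes against polls (a slope above 1 means outcomes were more lopsided than the polls). The margins listed are made-up illustrative values, not the 2008 data, and the function names are hypothetical.

```python
import math
from statistics import median


def median_abs_discrepancy(polls, outcomes):
    """Median of |outcome - poll| across states, in percentage points."""
    return median(abs(o - p) for p, o in zip(polls, outcomes))


def _centered_sums(xs, ys):
    """Return (Sxy, Sxx, Syy), the sums of centered products and squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    sxy, sxx, syy = _centered_sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)


def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    sxy, sxx, _ = _centered_sums(xs, ys)
    return sxy / sxx


# Hypothetical margins in percentage points (positive = one party ahead).
# These are illustrative values only, NOT the actual 2008 state data.
poll_margins = [-20.0, -8.0, -1.0, 2.0, 6.0, 15.0]
final_margins = [-25.0, -10.0, 0.0, 1.0, 9.0, 19.0]

print("median |discrepancy|:", median_abs_discrepancy(poll_margins, final_margins))
print("r:", round(pearson_r(poll_margins, final_margins), 3))
print("slope:", round(slope(poll_margins, final_margins), 2))
```

With a slope above 1, the errors grow with the size of the lead but rarely cross zero, which is why a larger-than-expected discrepancy can coexist with correct calls of the winner.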

Of the four states with extremely close polls (Indiana, Missouri, North Carolina, and North Dakota), three were won by one percentage point or less. In the fourth state, North Dakota, pre-election polls were sparse and dominated by partisan organizations. I do not expect the United Transportation Union to become a major polling force in the future.

Pre-election polls did equally well in 2004, when they successfully predicted every battleground state winner except Wisconsin, which was won by less than one percentage point. Indeed, it is often not appreciated that pre-election polls tend to be more accurate than exit polls, which are done face-to-face and as a result suffer from biases. For this reason pre-election polls are an underappreciated bulwark against vote fraud.

The reasoning can be taken even further. In the Alaska Senate race, Senator Ted Stevens (R) is unexpectedly leading Mark Begich (D) in the vote count. Stevens recently received a felony conviction for ethics law violations, and trailed in the last two polls (though they differed, +22% for Begich, then only +8%, suggesting volatility in the race). Skulduggery in vote counting has been suggested. Or Begich’s supporters may have simply voted early, like many Democrats nationwide, and are not yet counted.

If the controversy persists, one possible answer is to commission a professional, nonpartisan post-election poll. The poll could be simple: a question to find out if the person voted; a question about how they voted in the Presidential race, to make sure the sample is representative; and a question about whether they voted for Begich or Stevens. This would probably identify any margin for Begich of a few points or greater. It’s something for a media organization to consider doing.
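As a rough guide to how large such a poll would need to be, the sketch below computes the sampling margin of error on a two-candidate lead. It assumes a simple random sample of actual voters and counts sampling error only; misreported votes or an unrepresentative sample would add error that this calculation ignores. The function names and the 4-point example are illustrative assumptions.

```python
import math


def lead_moe(p_a: float, p_b: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error on the lead (p_a - p_b) from one sample.

    Both shares come from the same n respondents, so the variance of the
    difference is (p_a + p_b - (p_a - p_b)**2) / n.
    z = 1.96 corresponds to a 95% interval under a normal approximation.
    """
    variance = (p_a + p_b - (p_a - p_b) ** 2) / n
    return z * math.sqrt(variance)


def sample_size_for_lead(lead: float, z: float = 1.96) -> int:
    """Smallest n at which a lead of this size exceeds its margin of error.

    Assumes a two-way race with no undecided or third-party share
    (p_a + p_b = 1), which is a simplification.
    """
    return math.ceil(z ** 2 * (1 - lead ** 2) / lead ** 2)


# Example: respondents needed to resolve a hypothetical 4-point lead (52% to 48%).
n = sample_size_for_lead(0.04)
print(n, round(100 * lead_moe(0.52, 0.48, n), 2))
```

Under these assumptions a 4-point lead needs roughly 2,400 respondents who actually voted, and halving the detectable lead roughly quadruples the required sample.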

Tags: 2008 Election

20 Comments so far ↓

  • Sam Wang

    Zev, Tony, Paul – I get into more detail on your question here. No definitive answer, but my suspicion lies in the quantification of “likely voter” status in terms of probability.

    Larry and Observer – That’s interesting regarding the Alabama race. I submit that the track record of pre-election polls is sufficiently good that they should in fact be admissible in such disputes.

  • Observer

    Larry: I’d be very surprised if any judge would look to anything outside the formal voting process itself as even admissible, let alone seriously probative, in a close election dispute.

    Ken: I think your topic, the vulnerable voters, deserves some serious thought. My first reaction is that we ought to have ‘voter aides’ in every precinct. They probably would have to be volunteers, or at least paid very little, like current precinct-level poll workers. And that would mean some kind of careful control to keep partisans from taking over the ‘voter aide’ positions. But there are voters who need help, and they should get it.

    (Currently, you can bring someone of your choice with you to help you vote in some circumstances. If you are officially disabled, for example. But that doesn’t begin to cover all vulnerable voters, and not everyone has such a friend/relative to bring with them.)

  • Ken

    Given that there is an error rate associated with any measurement, including the actual vote count, and that it is important for governance that the results be an accurate reflection of the majority’s voting intention, should there be an extra tier of safeguards when polls indicate a close contest?
    If so, what forms would these take?
    Also, are runoffs useful in this situation?
    What steps can we take to protect the vote of those who have difficulty with the actual physical technology of the voting process, the “vulnerable” voters?

  • Larry

    As a lawyer I have been intrigued by a similar idea: using regression analysis of past election returns to identify unusual results as the election results come in. James Gundlach presented a paper in April 2003 using that technique to look at the questionable results of the 2002 Alabama Governor’s race. I have tried to understand his technique both as a decision point for whether or not to contest an election (a decision that typically has to be made very quickly) and for what, if any, evidentiary value it would have at trial. Your discussion raises the interesting question of whether adding polling data would improve the robustness of the prior-election data exemplified in Gundlach’s study. Gundlach’s paper is available at

  • Sam Wang

    Jim, I can’t believe I let that one slip by me. Yes, of course – they are misplotted. I’ll fix this shortly. I’m glad somebody’s paying attention – not me evidently…

  • Behnam

    Let me be the first to propose that elections be abolished and be replaced with opinion polls, or with a meta-analysis thereof.

  • Jim

    Sam: Wouldn’t it make more sense to put the error bars on the poll results (horizontal axis) rather than on the final results (vertical axis)? The MOEs are for the polled results, not the final results.

  • Paul

    Zev: At a guess, the problem is simply that the reported MOE in polls is wrong — perhaps because it only includes sampling error, and doesn’t take into account the uncertainty of the likely voter model (which is probably just as large as sampling error).

  • James

    Regarding validation in polls…

    Why not just match what people say against precinct logs? If you voted your signature should be recorded in them.

    That in itself could be an interesting poll–how many people lie to pollsters about whether or not they voted, and how does this vary with locale or change with time since the election? E.g., by 1976 it was apparent that nobody voted for Nixon. I imagine the same is true by now for Bush.

  • Tony

    I think Zev raises a good point about why the actual outcome is outside the error bars so often. My guess is that this happens because the error bars take into account only one type of error, i.e., that due to undersampling. Other types of error (e.g., due to an imperfect model of who is going to vote) would not be included, and so the actual error is likely to be larger, as is in fact observed.

    Sam, comments?

  • Zev


    One thing I’m puzzled about is why the actual outcome is outside the error bars in so many states. Shouldn’t this only happen 5% of the time, or is that not a valid expectation because we are looking at a single election not 51 independent elections? Does this suggest there is something wrong with the computation of the error bars? Or is it related to a systematic underestimation of margins in more lopsided states? Any thoughts?

    Thanks again for all your work this year.

  • Lorem

    I don’t think this idea would work, at least not in this case. 538 writes that, in fact, all GOP candidates outperformed expectations, and by approximately the same margin. This means that if there was some irregularity it hit all races, so using the McCain-Palin vote as a control would (almost certainly) get you nowhere.

    It’s a neat plan in the abstract, though, and would be nice if a good control could be implemented. Perhaps a simple comparison of the “yes, I voted” answers with the voter turnout rate? Although this would be effective against some types of fraud and not others…

  • Jack Rems

    Does anybody here know: what is the likely relative timing of
    1.) Final recount in Minnesota
    2.) Official count in AK
    3.) Voting in GA

    Are they currently campaigning in GA? Running TV ads? Early voting? Polling?

    There’s also likely to be a lot of out-of-state money spent to make it seem the other side has too much out-of-state money.

  • Sam Wang

    Wagster – It’s a good point. For instance, in polls taken half a year after the 2000 election, Bush beat Gore, despite the fact that Gore won the popular vote. However, several factors might help matters here: the recency of the election, and a cross-validation in the form of the McCain-Palin question. This was the point of including it as a necessary internal control.

    Evans – Before the election my prediction about MN-Senate was “Too close to call…with a tiny advantage to Coleman.”

    In regard to that scholarship, that’s pretty egregious self-promotion.

  • Evans

    You mention the AK race and polls; what are your thoughts about the MN race in light of the polls? To my half-trained eye it seems like the result (dead heat) was actually predictable given the closeness of the polls in the final week, virtually scattershot around dead even at 40-40.

    If you wanted to really make a statement about the power of polls (and the power of your site in particular), you could call Alaska for Begich right now and then, in the case Stevens wins, ask for a recount based on the irregularity (if the polls were right, the odds of this happening are astronomically against). Of course, the felony Bradley effect also sounds plausible, so I wouldn’t necessarily recommend this path, but it would give you something concrete over the other sites in the future.

    Also, please activate your region of the blogosphere to win a $10k scholarship for my blog, which is a finalist:

    please vote for “Evans Boney”. Thanks fellow Prof. Sam Wang fans.

  • Wagster

    I’m a fan, but this is a bad idea, Sam. In polls there is a strong tendency to over-report voting behavior. Because there is a stigma to admitting that you didn’t vote, the reported results to the question “did you vote last election?” have nearly always outstripped known turnout, so an after-the-fact poll is unlikely to be accurate.

  • Joe Canepa

    How were the votes counted in Alaska? Diebold machines? Is there any way to determine if every vote was tallied?

  • James

    After the 2000 election, it occurred to me that a grassroots organization could dramatically augment the results of a traditional poll.

    By dividing the effort among 1,000 or 10,000 activists (e.g., via DailyKos), you could take a number of precincts and cost-effectively poll every single resident. I imagine that in some precincts you could approach 100% definitive answers (as opposed to “I won’t say”). These could give an extremely sensitive measure of discrepancies in the reported results.

    There could even be a deterrent effect in letting it be known beforehand that certain close races would be subject to this post-election scrutiny.

  • Adam

    So is that North Dakota at 0, -8? And what kind of voting machines do they use?

    Seems like a regression line would be shifted counter-clockwise relative to the diagonal!

  • dave kliman

    can you tell us about GA?