Princeton Election Consortium

A first draft of electoral history. Since 2004

Accounting for poll biases

November 3rd, 2008, 11:57pm by Sam Wang


Here’s your tool for handling many biases: last-minute swings, the Bradley effect, the cell phone effect – and others as well.

The general idea here is to add a fixed percentage to all polls across the board, then redo the Meta-Analysis. The result looks like this:

It works like this. You add up all the biases you think may be present in polling data. Then you look that point up on the horizontal axis and read off the results. For example, if you think there’s a Bradley effect that hurts Obama by 2%, but you also think that landline surveys understate Obama’s support by 1%, then the total bias is -2+1 = Obama -1% (i.e. McCain +1%). In that case the Median EV Estimator is Obama 338 EV, McCain 200 EV.
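The lookup described above can be sketched in a few lines of Python. The numbers are copied from the table of key values in this post; the function names and the sign convention (positive means the true result is better for Obama than the polls show) are assumptions made for this illustration.

```python
# Median EV Estimator (Obama EV) by net bias, copied from the table in this post.
# Keys are net bias in percentage points: positive favors Obama, negative McCain.
OBAMA_MEDIAN_EV = {
    -5: 278, -4: 296, -3: 311, -2: 324, -1: 338, 0: 352,
    1: 364, 2: 378, 3: 381, 4: 391, 5: 396,
}
TOTAL_EV = 538


def net_bias(*effects):
    """Add up signed bias effects, in percentage points."""
    return sum(effects)


def lookup(bias):
    """Return (Obama EV, McCain EV) for a whole-number net bias from -5 to +5."""
    obama = OBAMA_MEDIAN_EV[bias]
    return obama, TOTAL_EV - obama


bradley = -2   # a Bradley effect that hurts Obama by 2%
landline = +1  # landline surveys understating Obama's support by 1%
print(lookup(net_bias(bradley, landline)))  # (338, 200)
```

Only whole-number biases from -5 to +5 are covered, because those are the only values tabulated here.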

Here’s a table of key values:

Net bias      Median EV Estimator (95% CI on Obama EV)
McCain +5%    Obama 278 EV, McCain 260 EV (251,313)
McCain +4%    Obama 296 EV, McCain 242 EV (268,338)
McCain +3%    Obama 311 EV, McCain 227 EV (277,341)
McCain +2%    Obama 324 EV, McCain 214 EV (290,353)
McCain +1%    Obama 338 EV, McCain 200 EV (305,367)
No bias       Obama 352 EV, McCain 186 EV (313,378)
Obama +1%     Obama 364 EV, McCain 174 EV (335,388)
Obama +2%     Obama 378 EV, McCain 160 EV (350,396)
Obama +3%     Obama 381 EV, McCain 157 EV (362,406)
Obama +4%     Obama 391 EV, McCain 147 EV (372,406)
Obama +5%     Obama 396 EV, McCain 142 EV (377,409)

At 0% bias the median and mode happen to be the same. At the moment, my personal estimate of the bias is Obama +1% because of the cell phone effect. That shifts the median EV estimate by only 12 EV, to Obama 364, McCain 174. But because of all the ties (IN, MO, ND), this 1% shift moves the mode all the way to Obama 378, McCain 160.
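For readers who want to see what "add a fixed percentage to all polls, then recompute" might look like in code, here is a minimal sketch. It is not the actual Meta-Analysis code: the normal error model, the function names, and the toy state list are all assumptions made for illustration. It shifts every state margin by the bias, converts each margin to a win probability, builds the exact distribution of electoral votes, and reports both the median and the mode.

```python
import math


def win_prob(margin, sem):
    """P(true margin > 0), assuming normally distributed error (an assumption)."""
    return 0.5 * (1.0 + math.erf(margin / (sem * math.sqrt(2.0))))


def ev_distribution(states, bias=0.0):
    """Exact distribution of one candidate's electoral-vote total.

    states: list of (electoral_votes, margin_pct, sem_pct) tuples.
    bias:   percentage points added to every margin before recomputing.
    Returns a list dist where dist[k] = P(candidate wins exactly k EV).
    """
    dist = [1.0]
    for ev, margin, sem in states:
        p = win_prob(margin + bias, sem)
        new = [0.0] * (len(dist) + ev)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)  # candidate loses this state
            new[k + ev] += mass * p     # candidate wins this state
        dist = new
    return dist


def median_ev(dist):
    """Smallest EV total whose cumulative probability reaches 50%."""
    cum = 0.0
    for k, mass in enumerate(dist):
        cum += mass
        if cum >= 0.5:
            return k
    return len(dist) - 1


def mode_ev(dist):
    """Single most probable EV total."""
    return max(range(len(dist)), key=dist.__getitem__)


# Hypothetical toy inputs (EV, margin %, SEM %), NOT real polling data.
toy = [(10, 4.0, 1.5), (11, 0.0, 2.0), (11, -0.5, 2.0), (3, 0.2, 3.0)]
for bias in (-1.0, 0.0, 1.0):
    d = ev_distribution(toy, bias)
    print(bias, median_ev(d), mode_ev(d))
```

With several near-tied states, a small bias can leave the median almost unchanged while the mode jumps by a whole state's worth of electoral votes, which is the behavior described above.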

Now I want to call your attention to a data resource, which answers a number of your questions.

Some of you have asked for results that are already posted. Many key files are linked from the Geek’s Directory. The most useful files are readable by Excel or any text editor. Here is how to read them:

stateprobs.csv: Each line corresponds to one state, and contains 5 items from left to right:
- % win probability based on polls alone
- the median margin in %
- % win probability if margins move toward Obama by 2%
- % win probability if margins move toward McCain by 2%
- State postal abbreviation
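If you would rather read stateprobs.csv with a script than with Excel, something like the following should work. It is a sketch that assumes the file is plain comma-separated text with exactly the five fields listed above and no header row; the class and field names are invented here, and the sample rows are made up.

```python
import csv
import io
from typing import NamedTuple


class StateProb(NamedTuple):
    win_prob: float           # % win probability based on polls alone
    median_margin: float      # median margin in %
    prob_obama_plus2: float   # % win probability if margins move 2% toward Obama
    prob_mccain_plus2: float  # % win probability if margins move 2% toward McCain
    state: str                # postal abbreviation


def read_stateprobs(lines):
    """Parse stateprobs.csv-style rows from any iterable of text lines."""
    rows = []
    for rec in csv.reader(lines):
        if not rec:
            continue  # skip blank lines
        *nums, state = [field.strip() for field in rec]
        rows.append(StateProb(*map(float, nums), state))
    return rows


# Made-up sample rows for illustration only, not real data.
sample = io.StringIO("100,12.0,100,100,XX\n50,0.0,84,16,YY\n")
for row in read_stateprobs(sample):
    print(row.state, row.win_prob, row.median_margin)
```

To read the real file, pass an open file handle instead of the `io.StringIO` sample.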

EV_estimates and EV_estimate_history – Today’s results and historical results. In both files, each line contains:
- Date code (1 is 1-Jan…365 is 31-Dec)
- Median EV Estimators for Obama and McCain
- Mode EV for Obama and McCain
- Safe Obama and McCain EV (safe means probability>95%)
- Toss-up EV
- 68% confidence band for Obama EV
- 95% confidence band for Obama EV
- Number of polls used in Meta-Analysis
- Popular Meta-Margin (%)
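Counting the items above gives 14 numbers per line (one date code, six pairs, toss-up EV, poll count, Meta-Margin). Here is a hedged sketch of a parser. It assumes the values are comma-separated and appear in exactly the order listed, with each pair ordered Obama then McCain (or low then high for the bands); the names and the sample line are hypothetical placeholders, not real output.

```python
from datetime import date, timedelta
from typing import NamedTuple, Tuple


class EVEstimate(NamedTuple):
    day: date
    median_ev: Tuple[int, int]  # (Obama, McCain)
    mode_ev: Tuple[int, int]    # (Obama, McCain)
    safe_ev: Tuple[int, int]    # (Obama, McCain), probability > 95%
    tossup_ev: int
    band68: Tuple[int, int]     # 68% confidence band on Obama EV
    band95: Tuple[int, int]     # 95% confidence band on Obama EV
    n_polls: int
    meta_margin: float          # Popular Meta-Margin in %


def day_code_to_date(code, year=2008):
    """Assumes code 1 = 1 January of `year`.

    In a leap year such as 2008 this puts 31-Dec at code 366, not 365.
    """
    return date(year, 1, 1) + timedelta(days=code - 1)


def parse_ev_line(line, year=2008):
    f = [x.strip() for x in line.split(",")]
    if len(f) != 14:
        raise ValueError(f"expected 14 fields, got {len(f)}")
    n = [int(float(x)) for x in f[:13]]  # tolerate values written as "352.0"
    return EVEstimate(
        day=day_code_to_date(n[0], year),
        median_ev=(n[1], n[2]),
        mode_ev=(n[3], n[4]),
        safe_ev=(n[5], n[6]),
        tossup_ev=n[7],
        band68=(n[8], n[9]),
        band95=(n[10], n[11]),
        n_polls=n[12],
        meta_margin=float(f[13]),
    )


# Placeholder line for illustration; most values are invented, not real results.
est = parse_ev_line("308,352,186,352,186,200,100,238,330,370,313,378,100,5.0")
print(est.day, est.median_ev, est.band95)
```

For EV_estimate_history, the same function could be applied to each line in turn.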

polls.median.txt – the summarized poll averages, day by day, starting from the most recent day. It’s composed of sets of 51 lines. Each line corresponds to one state, and contains, from left to right:
- The number of polls used to calculate that state
- Median date of oldest poll used
- Median margin in %
- Estimated SEM of margin in %
- Date that the median was calculated

This last file does not contain postal abbreviations. Within each set of 51 lines, the order is (10 per row):
AL,AK,AZ,AR,CA,CO,CT,DC,DE,FL,
GA,HI,ID,IL,IN,IA,KS,KY,LA,ME,
MD,MA,MI,MN,MS,MO,MT,NE,NV,NH,
NJ,NM,NY,NC,ND,OH,OK,OR,PA,RI,
SC,SD,TN,TX,UT,VT,VA,WA,WV,WI,
WY
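Since the file has no state labels, a script has to attach them using the order above. A minimal sketch, assuming each line holds five whitespace-separated fields and that the file length is an exact multiple of 51 lines; the function and key names are invented here, and the two date fields are kept as raw strings because their format is not described in this post.

```python
STATE_ORDER = (
    "AL,AK,AZ,AR,CA,CO,CT,DC,DE,FL,"
    "GA,HI,ID,IL,IN,IA,KS,KY,LA,ME,"
    "MD,MA,MI,MN,MS,MO,MT,NE,NV,NH,"
    "NJ,NM,NY,NC,ND,OH,OK,OR,PA,RI,"
    "SC,SD,TN,TX,UT,VT,VA,WA,WV,WI,"
    "WY"
).split(",")
N_STATES = len(STATE_ORDER)  # 50 states plus DC


def read_poll_medians(lines):
    """Split polls.median.txt-style lines into one dict per day, keyed by state.

    Days come out in file order, which the post says is most recent first.
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    if len(rows) % N_STATES:
        raise ValueError("line count is not a multiple of 51")
    days = []
    for start in range(0, len(rows), N_STATES):
        block = rows[start:start + N_STATES]
        day = {}
        for state, (n, oldest, margin, sem, calc) in zip(STATE_ORDER, block):
            day[state] = {
                "n_polls": int(n),
                "oldest_poll_date": oldest,  # raw string, format not assumed
                "margin": float(margin),     # median margin in %
                "sem": float(sem),           # estimated SEM of margin in %
                "calc_date": calc,           # raw string, format not assumed
            }
        days.append(day)
    return days


# Usage with a real file would look like:
#     with open("polls.median.txt") as fh:
#         latest = read_poll_medians(fh)[0]
```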

Okay, that’s the midnight information dump…

Tags: 2008 Election

7 Comments so far ↓

  • Davos Newbies » Blog Archive » A chart to stop the worrying

    [...] If there are still nervous Obama supporters out there, Sam Wang at Princeton has the perfect tonic. [...]

  • Tapen Sinha

    This is a very very useful picture.
    Thank you very very much.

    Tapen

  • Rathna

    since Sam has also graphed the Meta-Margin vs time, it’s a little less interesting.

  • blair alef

    I would also like to share one small observation that is social rather than political or statistical. There is a chance that five million disadvantaged people of color who have never voted will vote in this election. A good proportion of them will be older than thirty years old. The message in this election is not who will be President but those many millions of individuals who thought no one cared about their opinion and voted this year for the first time.

  • blair alef

    Thank you as always.

    For PEC aficionados and election geeks, just watch Indiana, earliest poll closing. The percentage of counted votes one way or another when announced will give a good indication of the strength of the Obama ground game advantage – if there is one or not. Whatever moves Indiana one way or another will give the best indication of who has a ground game advantage in all of the rest of the seriously contested states.

  • yoyo

    Hm, i remember you posting this, but didn’t it plateau more? is it new polling in red states?

  • Aaron

    I made another image like the last one using EV_estimate_history. But since Sam has also graphed the Meta-Margin vs. time, it’s a little less interesting.
    Either way, here it is.

    Exciting times.