Princeton Election Consortium

A first draft of electoral history. Since 2004

FOO (freakouts over outliers)

July 15th, 2016, 1:00pm by Sam Wang


A CBS/NYT poll released yesterday indicated a 40-40 tie between Clinton and Trump. Cue media freakout. This illustrates the point that news organizations habitually report on outlier events, a bad move when other data points are available.

Four other surveys with mostly overlapping dates show Clinton +1% (Morning Consult), Clinton +3% (YouGov/Economist), Clinton +4% (AP-GfK), and Clinton +12% (Raba Research). Five data points, with CBS at the extreme end. Oldtime PEC readers, all together now: take the median. The median is Clinton +3.0 ± 1.3% (± estimated one-sigma uncertainty). So the race may have narrowed from a 5% gap – maybe because of FBI director Comey’s public announcements? Anyway, it’s not a tie yet.
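
For readers who want to check the arithmetic, here is a minimal sketch. It assumes the one-sigma uncertainty is estimated as the median absolute deviation (MAD), scaled by 1.4826 and divided by the square root of the number of polls. The post itself does not spell out the formula, so treat that as an assumption; the function name `median_and_sem` is made up for illustration.

```python
import math
import statistics

def median_and_sem(margins):
    """Return (median, estimated one-sigma uncertainty) for a list of margins.

    Assumption (not stated in the post): uncertainty = 1.4826 * MAD / sqrt(N),
    where MAD is the median absolute deviation from the median.
    """
    med = statistics.median(margins)
    mad = statistics.median([abs(m - med) for m in margins])
    sem = 1.4826 * mad / math.sqrt(len(margins))
    return med, sem

# Clinton-minus-Trump margins from the five polls above:
# CBS/NYT, Morning Consult, YouGov/Economist, AP-GfK, Raba Research
national = [0, 1, 3, 4, 12]
med, sem = median_and_sem(national)
print(f"Clinton {med:+.1f} ± {sem:.1f}%")  # Clinton +3.0 ± 1.3%
```

Under that assumption the five margins reproduce the Clinton +3.0 ± 1.3% quoted above.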

Same story with state polls. Most state data (and there’s lots of it, yay!) show a fairly steady race. The exception is Florida, where, if we believe it, the race suddenly went from Clinton +6.0% to Trump +3.5%, with the breakpoint happening around June 26th, a full two weeks after the Orlando shooting. However, the post-June 26th data included two partisan pollsters and one pollster (Quinnipiac) whose survey took two weeks to complete, which suggests they didn’t have the resources to reach people in a shorter time period, a possible problem for sampling. So I have doubts.

Now we have three Florida polls, all spanning July 5-11. Throw in the Quinnipiac survey to get Trump +5%, Trump +3%, Clinton +5%, and Clinton +7%. That wide spread gives a median of Clinton +1.0 ± 3.7%. So, Trump may well have caught up a bit in Florida…but it is not clear who is in the lead there.
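
To put a rough number on "not clear who is in the lead," one can convert a margin and its uncertainty into a probability of leading. The sketch below assumes the error on the median is normally distributed, which is a simplification chosen for illustration and not something the post states; with only four polls, a heavier-tailed distribution would give a more cautious answer. The helper name `prob_lead` is hypothetical.

```python
import math

def prob_lead(margin, sigma):
    """P(true margin > 0), assuming normally distributed error (an assumption)."""
    z = margin / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

florida = prob_lead(1.0, 3.7)   # Clinton +1.0 ± 3.7% in Florida
national = prob_lead(3.0, 1.3)  # Clinton +3.0 ± 1.3% nationally

print(f"Florida:  P(Clinton leads) ~ {florida:.2f}")
print(f"National: P(Clinton leads) ~ {national:.2f}")
```

Under these assumptions Florida comes out at roughly 0.6, not far from a coin flip, while the national figure is close to 0.99.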

Update: looking into why the estimator isn’t reflecting the Clinton +1% value. I believe it depends on which polls are included in the “last 7 days” rule. The other possibility is a sluggish XML feed that lags other HuffPollster feeds.

Update #2: fixed. HuffPollster had changed the way they report the candidates’ names. While we’re at it, we expanded the aggregation rule from “last 7 days of polling, using median date of each poll” to “last 7 days of polling, using last date of each poll.” That may be better for both this part of the season, when polls are sparse…and later, when they are abundant.
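
The difference between the two rules can be sketched as follows. The polls here are entirely hypothetical, and so are the details: the window is counted back from a chosen reference day, the cutoff day itself is included, and the "median date" is taken as the midpoint of the field period. The actual estimator code may handle these differently.

```python
from datetime import date, timedelta
from typing import NamedTuple

class Poll(NamedTuple):
    name: str
    start: date   # first day in the field
    end: date     # last day in the field
    margin: float

def reference_date(poll, rule):
    """Date used to decide whether a poll falls inside the window."""
    if rule == "last":
        return poll.end
    if rule == "median":
        # Midpoint of the field period, rounded down to a whole day.
        return poll.start + timedelta(days=(poll.end - poll.start).days // 2)
    raise ValueError(f"unknown rule: {rule}")

def select(polls, as_of, rule, window_days=7):
    """Keep polls whose reference date is within the last `window_days` days."""
    cutoff = as_of - timedelta(days=window_days)
    return [p for p in polls if reference_date(p, rule) >= cutoff]

# Hypothetical polls, for illustration only.
polls = [
    Poll("Poll A", date(2016, 6, 30), date(2016, 7, 11), -3),
    Poll("Poll B", date(2016, 7, 6), date(2016, 7, 12), 5),
    Poll("Poll C", date(2016, 7, 9), date(2016, 7, 11), 7),
    Poll("Poll D", date(2016, 6, 20), date(2016, 6, 26), 6),
]
as_of = date(2016, 7, 15)

by_median = [p.name for p in select(polls, as_of, "median")]
by_last = [p.name for p in select(polls, as_of, "last")]
print(by_median)  # ['Poll B', 'Poll C']
print(by_last)    # ['Poll A', 'Poll B', 'Poll C']
```

A poll with a long field period (Poll A) drops out under the median-date rule but stays in under the last-date rule, which is why the new rule counts as an expansion.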

Tags: 2016 Election · President

39 Comments so far ↓

  • A New Jersey Farmer

    Some new polls out today suggest that the ones showing Trump leading might be the outliers.

    http://elections.huffingtonpost.com/pollster/polls

    • Matt McIrvin

      The national polls that show Trump leading lately are all from Rasmussen, and they are extreme outliers. Excluding them doesn’t change the picture of a race that has narrowed a bit, but Clinton is clearly ahead.

  • Ilan

    If freakouts over outliers is FOO, does that make this website and Sam Wang foo fighters?

    Foo(d) for thought.

    • 538 Refugee

      We can meet at the Foo Bar (https://en.wikipedia.org/wiki/Foobar) where the special is a liberal helping of Hair Flam·bé.

    • Jay Sheckley

      Pronounced “Wong Foo Fighting”.
      [All our brains are FOO Fighters.]
      CHORUS: Bright minds like Wang’s FOO fighting
      –Aggregation fast as lightning–
      PEC finds state polls aren’t frightening;
      Bias proves unenlightening.
      Take pundits & outliers lightly.
      Fight FOO. Doc Wang now posts PEC nightly.
      THANKS, SAM!

  • BillSct

    To help suppress my personal FOO impulse, I went back through the archives and found snapshots of the EV Estimator for 2004, 2008 and 2012 and created this slide ( http://s1012.photobucket.com/user/billsctphoto/media/EV%20History%20Comparison%2007162016.jpg.html ) to compare the past to today. This year’s EV estimator trace, so far, even with this week’s dip, is far more stable than past cycles. We will see if that holds.

    • Jay Sheckley

      Seeing 4 EV Estimators side by side eased my Pence-plotzing and tertiary FOO. How strongly we feel _recent_ bumps! Since the past can’t stir fight/flight, maybe immediacy is grossly distorting. Is that correct, Dr Wang? Info?
      Together, the 4 graphs prove what Sam said on his first WOOcast: This election does resemble others. (Maybe mostly the last? As if a referendum on Obama’s administration.)
      Bill, I’ll refer to this PEC EV chart grouping during the conventions, and also to emotionally survive the debates, gaffes, allegations, lack of fact checking, blackswanophobia, and yes the actual election. Thanks!
      PS Sam, could PEC add a margin link to the charts together? Is an aggregate showing date overlays doable, or an aggregate/overlay maybe comparing +|- during events [conventions, debates..] Either way, PEC and your extant clear event marking is much appreciated!

  • Olav Grinde

    I am a little confused about how The Power of Your Vote is related to the Median Poll Margin in various states.

    Question: Ohio is tied according to present polls – why doesn’t your vote have more power there than in New Hampshire, where Clinton’s margin is +4 %?

    (Yes, I do realize that Ohio has almost ten times the population of NH…)

    Recalling Bush/Gore in Florida, surely it takes fewer votes to swing a “tied” populous state than to overcome/secure a +4 % margin elsewhere?

  • Louis

    Sam,

    Long time reader and veteran of multiple Canadians over the years (have done my share of polling too). I’m very curious why pollster or RealClear choose to include some polls in their average and not others they’ve included even a week earlier. For example, in the last week I notice both included all of the Rasmussen polls. However, I did notice they both failed to include a Reuters/IPSOS poll showing Clinton with a 13 point lead. They do have the earlier 12 point Reuters poll but neither included the one from July 8-12. And just last night Reuters came out with a new one showing a Clinton 12 point lead. Any thoughts? Here’s the latest: http://mobile.reuters.com/article/idUSKCN0ZV2OA

    • jdkbrown

      As far as I understand it, JerseyVotes also depend on how likely a state is to be the tipping point state. So we can infer from the current table that there are fewer scenarios in the EV histogram where flipping Ohio changes the outcome of the election than where flipping New Hampshire changes the outcome.

    • Sam Wang

      For tracking polls, in the past they have set their algorithm to only include every N days, where N days is the length of the running average. This avoids double-counting a sample.
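
A minimal sketch of that kind of thinning, assuming releases are identified by integer day numbers and that the newest release is always kept. The function name and conventions are illustrative, not the aggregators' actual code.

```python
def non_overlapping(release_days, window):
    """Pick releases of a rolling tracking poll whose samples do not overlap.

    release_days: day numbers on which the tracker published a result
    window:       length in days of the tracker's running average (N)
    Keeps the newest release, then each earlier release that is at least
    `window` days older than the last one kept.
    """
    kept = []
    for day in sorted(release_days, reverse=True):
        if not kept or kept[-1] - day >= window:
            kept.append(day)
    return kept

# A tracker with a 5-day running average that publishes daily for 10 days.
print(non_overlapping(range(1, 11), 5))  # [10, 5]
```

Here the release on day 10 covers days 6 to 10 and the release on day 5 covers days 1 to 5, so no respondent is counted twice.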

  • bks

    How long do current events like the Orlando shooting take to show up in the polls? My guess is two weeks. Hard to say for sure.

    • Sam Wang

      Maybe to take full effect, but that is not what I am talking about. Go look at the data. From June 25th I see a flat period, followed two weeks later by a sudden jump. This is an implausible temporal pattern.

  • Shawn Huckaby

    New Marist poll out today should help pull the data back to the previous median:
    http://www.nbcnews.com/storyline/2016-conventions/clinton-leads-trump-diverse-battleground-states-new-polls-n609551

  • Amitabh Lath

    So, the media is full of innumerate J-school types who don’t understand that measuring a +4 difference with tools that have delta-3 uncertainty will occasionally lead to negative values. In other news, water is wet.

    I read these freakouts (or glee-outs, depending on the side) as entertainment. My favorite remains the epic freakout by Andrew Sullivan after the first Obama-Romney debate.
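
A back-of-the-envelope version of the first point, assuming the "delta-3" is a one-sigma normal error on the margin and that polls are independent. Both are simplifying assumptions, and the function name is made up for illustration.

```python
import math

def prob_at_or_below_zero(true_margin, sigma):
    """P(a single poll shows a tie or worse), assuming normal error."""
    z = (0.0 - true_margin) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p_one = prob_at_or_below_zero(4.0, 3.0)
# Chance that at least one of five independent polls shows a tie or worse.
p_any_of_five = 1.0 - (1.0 - p_one) ** 5

print(f"one poll:      {p_one:.2f}")
print(f"any of five:   {p_any_of_five:.2f}")
```

Under these assumptions a single poll shows a tie or worse about 9% of the time, and with five polls in the field the chance that at least one does is well over a third.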

    • Josh

      The downside to this is that it perpetuates a misinformed electorate.

      The upside is that it skews betting markets…

    • MarkS

      I’d like to understand why the meta-margin graph shows a present-time uncertainty of zero. It doesn’t have a gray 95% CL band the way the EV margin does, and its future-uncertainty yellow band collapses to zero width at the present moment. This would seem to mean that the current uncertainty in the meta margin is too small to be visible. Is this correct?

  • Heavenly Blue

    Looking at those 5 polls, CBS’ poll indicating a tie doesn’t strike me as an “extreme outlier” with other polls at +1, +3 and +4. In a logical question of “which doesn’t belong: 0, 1, 3, 4, 12” I’m forced to conclude that the Raba Research poll is the extreme outlier and should probably be thrown out or given a very low weight when taken as part of a predictive weighted average.

    • Olav Grinde

      But Dr Wang doesn’t compute the average (mean); he focuses on the median. Hence the +12 % carries no undue weight.
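
A quick illustration with the five margins from the post. The inflated value of +30 is hypothetical, added only to show how each statistic responds to a more extreme outlier.

```python
from statistics import mean, median

polls = [0, 1, 3, 4, 12]        # the five national margins from the post
without_raba = [0, 1, 3, 4]     # same polls with Raba Research (+12) dropped
inflated = [0, 1, 3, 4, 30]     # hypothetical: the outlier made even larger

print(mean(polls), median(polls))                # 4 3
print(mean(without_raba), median(without_raba))  # 2 2.0
print(mean(inflated), median(inflated))          # 7.6 3
```

Dropping the +12 poll moves the mean by two points and the median by one, and making the outlier larger moves the mean further while leaving the median where it was.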

  • The Ohioan

    I get Sam’s point and I wish they had crafted a better headline. But I wonder: could a headline like this be the price we have to pay if we want to keep the stream of good quality surveys coming? It’d be hard for the people who commissioned this survey to impress upon their organization heads the value of such an expensive survey if they just reported it as a marginal contribution to an aggregated median. The headline overhypes the significance of one survey (from the perspective of poll aggregation*), but such overhyping contributes to feeding the data stream needed for poll aggregation.

    (The other point is that I wonder if we can live with data-driven analysis that has different methodological assumptions. What the NYT is doing is tracking a survey over time that follows similar assumptions and methods of construction over time. I see tracking as being as valuable as aggregating. Its value lies in detecting movements over time. For what it’s worth the survey did find a significant shift and I think it was worth reporting.)

    • Josh

      If people knew that one-off polls mean essentially nothing, they wouldn’t read articles like that–which means those articles wouldn’t get proposed by editors and written by journalists. And the space on the front page would (hopefully) be filled by something more valuable?

    • Matt McIrvin

      In 2012 the RAND Corporation did something interesting: they got a sample of voters and tracked the opinions about the election of those same people over time. The result was a survey that didn’t necessarily do a good job of reflecting the absolute position of the electorate, but maybe gave a better feel for time variations.

      (Though there was a danger of the survey population becoming less representative of the country over time simply because they weren’t naive subjects any more: people were asking them regularly about politics.)

    • Josh

      Neat. Did the survey find that those people became better informed over time?

    • Matt McIrvin

      I’m not sure the questionnaires were capable of detecting that. In any event, they’ve restarted the project for 2016:

      http://www.rand.org/labor/alp/2016-election-panel-survey.html

    • Matt McIrvin

      …here’s their 2012 results page:

      https://alpdata.rand.org/index.php?page=election2012#election-forecast

      It’s interesting to see, and doesn’t necessarily jibe 100% with the fluctuations in national polls over the same period. You can see some of the shifts that happened during the time of the presidential debates, but it’s not as dramatic as it was in the conventional polls.

  • Matt McIrvin

    I’ve noticed that not even news outlets seem inclined to freak out over Rasmussen’s polls this year, which show Trump substantially ahead nationally even when no other polls do.

    • Olav Grinde

      I suspect Rasmussen’s political polling has long since lost its credibility. I would be curious as to what extent this has hurt the rest of this pollster’s business…

      …unless of course their business is to create a false narrative. If so, then Rasmussen is an epic fail. Mittmentum, anyone?

      Seriously, I am astonished that Rasmussen seems unable (unwilling?) to improve their polling methodology, after being demonstrably way off the mark election after election!

  • 538 Refugee

    If I wasn’t so lazy today I might look up some article on polling volatility vs time frame. If memory serves, isn’t this the time to least trust the polls?

    http://election.princeton.edu/2016/05/22/february-national-polls-are-the-best-you-get-until-august/

  • Michael Coppola

    It’s easy to understand why the media insists on hyping the latest polls, with a heavy dose of bias for the outliers, but it would be nice if they’d at least look at the data. Taken at face value, these polls show Clinton losing a little bit of her support. Trump has not gained any; he’s still stuck at ~40%.

  • Olav Grinde

    Sam, in your blog entry you give the median for Florida as Clinton +1.0 ± 3.7 %.

    If so, why do you have Florida colored red, i.e. Trump leading in Florida?

  • azlib

    And you should have read the comments on the story. It had a few folks pointing out that one poll does not a trend make, and others with the usual Bernie Bro “I told you so” story, plus how Hillary is a dishonest hack and the poll proves it.

    The NYT reporter did point out the poll could be an outlier, but the headline swamped that part of the message. Sigh, will reporters never learn.

    • Olav Grinde

      Reporters may learn – but much to their chagrin, their editors are in the business of selling newspapers. In other words, never mind that a headline isn’t really supported by the article!

    • Bill G.

      “The NYT reporter did point out the poll could be an outlier, but the headline swamped that part of the message. Sigh, will reporters never learn.”

      In all fairness, reporters generally do not write their own headlines.

  • Michael Hahn

    Hi Sam: Thanks for the very welcome dose of reality!! I am curious though: For how many states do you now have multiple polling data? It might be interesting for you to also have a map where you show the states for which polls have come in over the past month or so. For example, I keep hearing rumors that my state of Georgia might actually be in play this year. But I see little change in the “trending color”; perhaps because no current polls??

  • A New Jersey Farmer

    Exactly. Thank you for the update.

    Most of the other state polls in Colorado, North Carolina and Virginia show Clinton continuing to hold her lead, so Florida does seem to be the exception.

  • Paul

    Even for someone who has conducted polls, worked on numerous federal election campaigns, and understands the volatility that events (like the FBI report on HRC’s email) can cause, I get caught in the media hype. Thank you Sam for bringing me back from the hyperbolic edge!