Princeton Election Consortium

Innovations in democracy since 2004

Politics & Polls #20: What Just Happened?

November 11th, 2016, 2:51pm by Sam Wang

Despite the importance of understanding this week’s cataclysmic events, I have been slow to write. There are other demands, especially my annual national scientific conference, which begins tomorrow.

The question of what went wrong in polls – and where I went additionally wrong – is an important one. I owe you a serious assessment, but it is not a topic to write about quickly.

This is a wrenching time in national politics. Most supporters of both Donald Trump and Hillary Clinton were surprised by the outcome. As Harry Enten at FiveThirtyEight points out, voters were more partisan than ever, with amazing party loyalty. Despite the few key upsets in close Rust Belt races, voting patterns are nearly identical to 2012. The state-by-state correlation between Romney-Obama and Trump-Clinton is +0.95 – right in line with post-Gingrich polarization. The parties are now fighting over mobilizing and turning out their own voter demographics.

In Politics & Polls #20, our first post-election recording, Julian Zelizer and I react to the results. Among a host of issues, we discuss why the polls might’ve been off, what a Trump presidency means for the nation, and the possible implications for our democracy. Listen.

Tags: 2016 Election

76 Comments so far ↓

  • John

    I have been a devoted apostle to you and (to a lesser degree) the two Nates, Electoral Vote, Larry Sabato, 270towin…I have just two words…Allan Lichtman.

    • wvan014

      In 2000, Allan Lichtman claimed his model predicting Al Gore was victorious because Gore won the popular vote. In 2016, he claims his model got it right in predicting Trump despite the fact that Clinton is going to win the popular vote. He can’t have it both ways. Lichtman always was, and remains, a self promoting hack.

    • Chuck

      Yes, Lichtman is a publicity whore. His model has always predicted the popular vote.

      Plain and simple by his standards he got it wrong!

      It is completely dishonest of reporters not to point it out.

      When developing the models he counted Hayes-Tilden and Harrison-Cleveland as Democratic wins.

  • Alan Cobo-Lewis

    There’s some irony in the higher uncertainty reflected in 538 vs. the low uncertainty reflected in PEC. Namely, by using median-based statistics, PEC is designed to be robust in the statistical sense. Basically, instead of trying to extract maximum statistical information conditional on a specific model being true, PEC uses medians, which are less efficient when a specific model can be assumed true but more robust when the model’s specific assumptions are violated.

    But PEC wasn’t sufficiently robust to violations of the assumptions about interstate correlations (or across-the-board polling errors, which for these purposes amount to the same thing). In that case, it was actually 538 that was more robust. (Perhaps ‘robust’ is the wrong term here–maybe what I really mean is that 538 was less sensitive in a statistical sense. Hard to say, since I don’t know exactly what assumptions were built into 538.)

    Something to keep in mind if you choose to pursue PEC in future elections.

    • Steve

      Of course 538 was more robust…Like you say, they added unknown variables in their calculation to account for potential error that was unmeasured and impossible to quantify.

      I ask you this though: what’s the difference between the 538 model and a pundit simply having said, “HRC is more likely to win based on polling, but it is possible that the polls are wrong and Donald Trump wins”? I think the victory lap that they are taking right now for being the “least wrong” is quite disingenuous.

      In the future, PEC shouldn’t try to account for systematic polling errors as 538 does, but simply be more upfront about the potential for it.

    • Alan Cobo-Lewis


      Yes you may be right

  • Anthony

    I look forward to the deep analysis of the polling data vs actual results as well as where your model went wrong.

    It seems like the national polls this time were a lot more accurate than the state polls, particularly in the Midwest, so the assumption that state polls are the gold standard needs to be reevaluated.

    This may go more into punditry than you like, but I would also love for you to dig deep into the demographic details. Rural whites vs urbanites, income (looking like there is no correlation on income vs voting this election?). Also would love for you to discuss popular vote vs EC since it is looking like we are in unprecedented territory in regards to the popular vote vs EC split. It also looks like this problem is going to get even worse in the future.

    Also, it seems this may be the most unrepresentative government we have ever seen. Democrats got more votes at the presidential and senate level but Republicans control all 3 branches of government.

    • Phoenix Woman

      This all came down to Comey.

      Early voting done pre-Comey had her up by around four points.

  • Reginald

    Further reading: “When genius failed” by R. Lowenstein

  • Bowen Kerins

    I think the state-to-state correlation is something you really need to think about going forward when building the snapshot analysis. It’s clear that the polling errors were correlated, but the snapshot analysis assumes they are not.

    • Phoenix Woman

      Everyone got this wrong. Including the GOP’s own internal pollsters.

    • Sam Wang

      That’s not exactly true. The snapshot does assume that, but it doesn’t make sense to do otherwise. What’s left then can move up and down together, which is what a nationally correlated swing is – and that was used for the prediction.

      It is true that local correlations at a sub-national level, for instance in the Rust Belt, are not captured. That’s a worthwhile thing to think about…though it’s not the reason for the prediction error. The fundamental driver of my error was the 4% systematic error across all polls, amplified by my excessively aggressive estimate of systematic error.

    • JayBoy2k

      I am attracted to this site because of your transparency, integrity, and drive to understand and build a poll aggregation model. It bothers me not a whit that we do not share political views.
      As far as the model, there are many more potholes between here and there.
      As part of your post-election analysis, are you going to focus on WHY “The fundamental driver of my error was the 4% systematic error across all polls, amplified by my excessively aggressive estimate of systematic error.”?
      There are a ton of early interpretations of what exactly caused the polls to be so incorrect. I’m skeptical of quick analysis. It also seems unlikely that with all the different pollster methods and biases, any single cause could create a consistent 4% swing across all polls. It would seem that to develop the 2020 version, you would have to understand what happened and adjust.

  • Remember Kerry

    I will tell you what just happened. People started praising you for having stable predictions and you started seeing stability as a virtuous goal of prediction. So you started looking for justifications of stability in the statistical data and when you found some you latched on to them in your mind and shaped your whole analysis around them. So hubris and self-interest are the main part of the explanation. The second part is your intense bias to the left side of politics. Just like with John Kerry 12 years ago, you started inventing statistical “reasons” why in fact Kerry would win. Your analysis and commentary were replete with bias.

    You are not a credible analyst.

    I hope for your own sake and for the thousands of people you disappointed and misled, that you give this endeavor up.

    • Phoenix Woman

      Hillary won the popular vote by a bigger margin than did Kennedy in 1960, Nixon in 1968, or Gore in 2000.

      But you do you, deplorable.

    • Mary B.

      And I hope, Dr. Wang, you continue with the PEC endeavor. I’ve learned so much. As an ‘Arts and Letters’ sort of person, I struggle to keep up, have to google to understand concepts, and don’t have a lot to offer in return. But I learned a LOT.

    • Sam Wang

      There is much for me to write about, but you are being inaccurate. In the case of 2004, I was rightly criticized for looking for hidden voters among undecideds. Since that time, I have specifically avoided doing that. I agree that I did not think of a covert rule that would say “aha, now split them asymmetrically.” There are no credible analysts who did.

      I do admit completely to setting a systematic-error parameter that led to overstated certainty. This will be central in what I eventually write. However, there is no rule that would have sent the overall win probability to >50% for Trump. It wouldn’t have been possible.

    • E. D.

      Oh, please. Stop this stupid bashing of one site. No one saw this coming, not even Trump himself. I really deplore this kind of blame game. It’s worse than kindergarten. Go play somewhere else.

    • Jeremiah

      This is not an accurate assessment of why Sam’s model did not correctly predict the Electoral Vote outcome. It was the polling. If you look you will find that no poll had Trump leading in Wisconsin since Sept 2015, none. In fact the Huffpost average is +6 Clinton.

      I’m not really sure how pollsters account for people who said they were going to vote for Clinton but then ultimately don’t or even switch their vote to Trump.

  • Phoenix Woman

    Dear trolls attacking PEC:

    Everyone missed the huge surge of Deplorables. Including Trump’s own camp. GOP internal polling had him losing.

    By the way, if anyone is interested in preventing this miscarriage of justice from ever happening again, I suggest you see about getting your state onto the National Popular Vote Interstate Compact, which is 2/3 complete already:

    • pechmerle

      Along with the National Popular Vote effort (my state has passed the bill), if your state doesn’t have a non-partisan mechanism for redistricting Congressional seats, join in the effort to get one.
      In California and Arizona, the two I know something about, that has been a significant change for the better. Maybe someday we can get the correct number of House seats for the popular vote margin on that too.

    • JL

      There was no surge of deplorables. Look at the popular vote: Trump barely got more votes than McCain 2008! What was deplorable was Dems staying home for HRC. Don’t feed the Trump beast by trying to corroborate his claims to be riding some kind of unprecedented wave of support. Whatever else might be said about Prof. Wang’s analysis, I think he remains correct in his assessment of the essential partisan stability, i.e. the election was not a horse-race–if the Dems had come out who were supposed to come out, Trump had no chance.

  • Lorem

    I was probably more overconfident than you (at least earlier in the race), Sam! It’s so tempting to think that a model that performed excellently several times should be at least not-bad the next time.

    On another note, when you do your post-mortem, I’d like to see the MM history graph with the margin shifted 4.5% (or whatever it was) towards Trump. If we assume that the error was consistent, was Hillary ahead for only a few weeks here and there? (Although it’s debatable whether the error could really be consistent, given the – I think – relatively better performance of polls in the primaries.)

    Incidentally, the image for the history of MM graph on appears broken for me.

    • David

      You’ve attempted to predict four presidential elections.

      One of those (2008) was essentially non-competitive, as Obama led McCain substantially in all pre-election estimates. You got that one right, but it was an easy call.

      Of the three competitive elections, you got one right (2012), and two wrong (2004 and 2016).

      No matter how much math and science is behind it, the fact is that the predictions have been less than stellar.

    • fred flint

      There is nothing wrong with Sam’s model. However, if you put garbage in you get garbage out.

    • Sam Wang

      The shift would be about 3.4%.

  • Jeff

    I think there’s a lot we can all learn. “What went wrong” will be discussed for a while. As a physician, I would caution against using probabilities “>99%” because that sends you down a very different path than 75%, for example. Perhaps we might all agree that polling analysis is not an exact science.

  • Paul


    I think a large part of your error was falling too deep into the math. Math and statistics are powerful tools, and because they are powerful, they can lead us anywhere we want to go. If you do decide to continue as a poll analyst (or however you would describe your job, I’m not sure what the technical term is) you will have a lot of trust to regain. In the future, remember that a model is just a model. As my cat is sitting in my lap I’m reminded of the saying that the best model of a cat is a cat. Best of luck with whatever you decide to do, and if you do decide to continue in this line of work I’ll look forward to reading your thoughts!

    • Dan Jacobs

      The model here is relatively simple on purpose with minimum assumptions and there’s no “thumb on the scale.” I recommend reading through the model and Sam’s explanations.

      This election is not a rejection of poll aggregation; rather it is a data point. The model will still work in the future, but now much more error will be considered possible. For example, the same data might point to a 75% probability next time.

      It may be that polls missing badly is a good thing. Perhaps it’s better if we all go into an election with greater uncertainty, which makes every vote more valuable.

  • Kevin

    Dude. This is a comment you’ll probably need to block/delete. But the PAIN caused by false hope from this site. It’s harsh. I’ve followed you since 2004. I trusted your analysis. This is a very, very bitter pill.

    • Koos van Blerk

      I have followed this site since 2012 and this time most of us, judging by the blogs, wanted desperately to believe in the prediction. I would like to venture that the less one knew about science and math the more blind the following was. There were people commenting “Please hold us, Sam”, which is quite bizarre. However, for those with some background (I’m a structural engineer, former professor of Mech. Eng.), it was very interesting to follow. I am truly intrigued by the methodology and hope that it can be improved. Those who are criticizing viciously either don’t really know what they’re talking about or have no understanding that researchers have to trust what they know from earlier research, but then take it a step further. The change in demographical representation in the polls obviously had a significant influence and ultimately proved fatal because of the relatively thin margins. The fact that Clinton is winning the popular vote is a nice consolation prize.

    • Koos van Blerk

      Also, I like the crowdsourcing nature of the site, to help with ideas and debugging, but the blog needs a strong filter indeed.

    • Arun

      Kevin, I certainly feel the pain too. In this case, it is not a model predicting whether it will rain tomorrow, but the future of our country for the next 4-8 years. For some perspective though – any predictive model really only has modeled three presidential elections successfully before this one. As any modeler will tell you, it takes only one experiment to invalidate a model!

    • Steve

      Kevin, I’d just like to echo the sentiments expressed by others here.

      The poll aggregators as a group have done a poor job communicating the systematic errors that have the potential to turn their predictions upside down. It’s a lot sexier to say that HRC had a 99% or 70% (or whatever) chance of winning than to really emphasize that these predictions are completely reliant on accurate polling and consistent voter turnout/demographics. Unfortunately, doing the latter doesn’t get you on TV or make your blog popular and picked up by the NYT. The truth is that nobody can really make a sound prediction on somebody’s % chance to win a national presidential election…There is not a big enough dataset of national presidential elections, since they only happen every four years and so much changes in that time. That’s why they are predictions and not true scientific models.

      Public polling, at its core, is a social science. Many of us in the scientific community who work in the “hard” sciences are always skeptical of predictions that come from social science. For example, predicting how a defined chemical system will consistently behave is extraordinarily challenging to do…It is significantly more challenging to make similar predictions on human behavior.

      With that said, don’t let this outcome impact your opinion of Prof. Wang. Despite his being one of the “most wrong” prior to the election, I have a lot more respect for how he has represented poll aggregation than for others (namely 538, which was the “most right” of the aggregators). His methods are all available and transparent, and he is the first to admit that his models are reliant on polling (albeit he probably didn’t emphasize it enough).

  • Behnam Gharebaghi

    The failure of polling on such a large scale is a unique opportunity for enhancing or replacing existing models.

    If I may be excused for pointing to the obvious, we need answers to the following:

    -Was there an error in estimating turnout for various demographic groups (millennials, the poor, the educated, the religious, the Rust Belt, etc.)?

    -Did different demographic groups vote differently than predicted?

    -Was the behavior of the undecided voters modeled correctly?

    Once one figures out what went wrong, one can also make some headway toward answering this question: would Trump have won if Sanders had been the Democratic nominee?

  • Zoheb

    I have been following your blog and your technical analysis is amazing.

    Given the polls, it was impossible to assign Trump a probability greater than 0.5. In hindsight, the standard deviation for the metamargin was set too low. The problem with the way you picked the standard deviation is that the approach assumes independent and identically distributed samples, which is clearly not true for year-to-year elections. Some judicious editorial skepticism needs to be introduced into the standard deviation. This means that poll aggregator ratings have an editorial aspect. This seems unavoidable. Overall, the methodology seems very sound.

    Another thought is that the way undecided voters split should be considered to be a uniform distribution. This will further dent the final probability ratings but it is perhaps for the best.

    • Arun

      The whole point here is “given the polls”. Assuming the polls are accurate, the probability is correct. Where the model failed was because the polls were off, and consistently off in one direction because the error was correlated. That kills the assumption of independence that is key to the probability calculation.

    • Lorem

      Why should the undecided split be modeled as a uniform distribution? It seems to me like a 100-0 split should be much less likely than a 50-50 split (indeed, otherwise they would probably not be undecided).

    • Zoheb

      @Arun, the meta margin is supposed to take care of correlated errors. The problem was that the standard deviation of the meta margin was set too low.

      @Lorem, a uniform distribution would be an extremely conservative distribution. It would be more helpful to think in terms of cdf, rather than pdf. Basically, cdf(0,33) = cdf(33,67) = cdf(67,100) , quite conservative but not unreasonable.

    • Arun

      Zoheb – I think of the meta margin as the max allowable state polling error before the race shifts to a coin toss. While Sam applies this shift uniformly, it really is only necessary to apply it to the close states – FL, NC, NV, NH, and the Midwest states, since the other states are very secure for one or the other candidate and will not give a meaningful probability shift. Basically, the polling errors in these states needed only to be off by 2.2% to shift the race to a coin toss. This can be seen in the Trump +2% map Sam provided before the election. The state polls in each one of those states were off by well more than that, and all in Trump’s direction.
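To make the role of the standard deviation concrete, here is a minimal sketch of how a meta margin might be turned into a win probability. It assumes a normal error model, which may differ from what the site actually uses; the function name and the three sigma values are hypothetical, and only the 2.2% figure comes from the comment above.

```python
import math

def win_prob_from_meta_margin(meta_margin, sigma):
    """P(leader still wins) if the total polling error is normal with
    mean 0 and standard deviation `sigma` (an assumed error model)."""
    return 0.5 * (1.0 + math.erf(meta_margin / (sigma * math.sqrt(2.0))))

meta_margin = 2.2  # points; the figure quoted in the comment above

# Hypothetical values for the systematic-error standard deviation.
for sigma in (0.8, 1.5, 3.0):
    p = win_prob_from_meta_margin(meta_margin, sigma)
    print(f"sigma={sigma}: win probability {p:.3f}")
```

Under these assumptions, the same lead reads as near-certainty with a small sigma and as a much more modest edge with a large one, which is the sensitivity the thread is arguing about.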

    • Arun

      I also wanted to maybe provide a different way to think about this business of correlated error. The act of voting has nothing to do with correlated error, since masses of people don’t consult with each other before they vote. Rather, it matters in the methodology used in the polling. If the likely voter screen is biased in OH, it is likely to be biased similarly in WI, MI, and PA, for example. The best way to see the effect of a correlated error is to simply apply a uniform median shift as Sam already does (Clinton +2%, Trump +2%) and see how the win probability changes. A future suggestion would be to show a win probability no greater than what would be shown assuming a uniform polling shift similar to what we saw this year. As we saw, polling itself is simply not accurate enough to justify 90+% probabilities in any model.
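The uniform-shift experiment described above can be sketched as follows. This is an illustration under stated assumptions, not the site's code: the state list, the "safe" electoral-vote total, the standard errors, and both function names are hypothetical, and state errors are assumed normal and independent once the common shift has been applied.

```python
import math

def state_win_prob(margin, sem):
    """P(true margin > 0) under an assumed normal error model."""
    return 0.5 * (1.0 + math.erf(margin / (sem * math.sqrt(2.0))))

def ev_win_prob(states, safe_ev, shift, needed=270):
    """P(candidate reaches `needed` electoral votes) when every state
    margin is moved by the same `shift` (the correlated error) and the
    remaining state-level errors are treated as independent."""
    # dist[k] = probability of holding exactly k electoral votes so far.
    dist = {safe_ev: 1.0}
    for ev, margin, sem in states:
        p = state_win_prob(margin + shift, sem)
        nxt = {}
        for k, prob in dist.items():
            nxt[k + ev] = nxt.get(k + ev, 0.0) + prob * p    # state won
            nxt[k] = nxt.get(k, 0.0) + prob * (1.0 - p)      # state lost
        dist = nxt
    return sum(prob for k, prob in dist.items() if k >= needed)

# Hypothetical inputs: (electoral votes, poll margin in points, standard error).
swing = [(29, 1.0, 1.5), (20, 3.0, 1.5), (16, 4.0, 1.5),
         (10, 5.0, 1.5), (15, 1.5, 1.5)]
safe = 230  # hypothetical electoral votes assumed safe for the candidate

for shift in (0.0, -2.0, -4.0):
    print(f"shift {shift:+.1f}: win probability {ev_win_prob(swing, safe, shift):.3f}")
```

Because the shift moves every state in the same direction at once, the overall probability falls much faster than it would if each state's error were drawn separately.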

  • RTR

    Maybe I’m off the rails, but it seems like some of the current detractors of this site are misinterpreting probability. The top bar of this site may have read >99% in the week running up to the election, but Sam made very clear that this was difficult to measure precisely, and that an estimate of 90-95% was more likely (he basically stated he’d have picked a more conservative error model if he had known how extremely the polling was going to favor Clinton). This was before the election. In that sense, is it really unreasonable to suspect that this was somewhere between a 1 in 10 and a 1 in 100 election outcome when solely considering the polling data? Garbage in, garbage out! In this sense, the site really isn’t even comparable to Nate Silver’s. I have a lot of respect for Silver, but his model isn’t pure statistics. We’ll never know, because his methods aren’t on the display table, but are any of you convinced that he has done the appropriate model selection to ensure that adding in all these “fundamental” voodoo variables actually results in a meaningful improvement to 538’s explanatory power? No. Having said that, I was a little perplexed by Sam’s confidence in the model estimate. After all, all models are wrong, only some are useful. I think this is a moot point here though, because there just doesn’t seem to be any way to predict such an across-the-board bungling of the polls to such an incredible magnitude. As Sam stated in the podcast, hopefully the future holds some innovations in terms of figuring out which set of earthlings makes it to the polls on game day. With that in mind, I hope you continue this great experiment, Dr. Wang. I have certainly appreciated the analysis and the community you’ve brought together.

  • Emigre

    Reading these many insightful comments about the validity and usefulness of Dr. Wang’s model reminds me of Erwin Chargaff, who rarely missed a chance to question model building. But even he was somewhat positive. He wrote in Heraclitean Fire: “I should advise to wait and see. Models – in contrast to those that sat for Renoir – improve with age”.

  • SK Platt

    I made a comment about Dr. Wang starting to do punditry now, which appears to have been blocked.

    It was a serious comment.

    I would like to know – what in Dr. Wang’s tenured Princeton life and background in computational neuroscience qualifies him to talk as a public figure about “the implications of Trump on our democracy”.

  • William Henry Oznot

    After the polls got it wrong on Brexit, an objective researcher should have done a root cause analysis and done a post-mortem to see why they got it wrong. Likely many reasons, but the primary one is that one side’s “bullying” made it unpopular, even dangerous to divulge one’s voting intentions in public, much less to a complete stranger. This pattern repeated itself. In a different forum one might discuss the merits or not of such “bullying” but its existence is palpable.
    As for Dr. Wang, I’ve observed a drift from being an objective external observer to letting his personal political views invade his commentary. I won’t go as far as to say he has lost his objectivity but if I were him, I would engage in some serious soul searching and decide whether I want to be an observer or a participant. Hearing him speak on more than one occasion, he was unable to mask his disdain for one candidate and his support for the other. I would have preferred that he failed a political sort of Turing test where I would have been unable to guess his views.
    Now, did this cloud his judgement? He will have to decide, but I believe he was “pleased with the results” which he presented as very assuaging (albeit to one half of the country only). And he gained fans for reasons of his analysis being right for years, but of late, as is evidenced by some of the commentary above, he gained fans because his results were reassuring to people who wanted to hear that Clinton would win. And in addition to the analysis of the data, his message was pretty much that. Since it gave the “right” answer, was there a reason to consider it might be inaccurate? Especially post Brexit?
    BTW this is not “armchair quarterbacking” because I did actually raise the question of the validity of the polls — of course this elicited the usual “bullying” responses. Not by Prof. Wang thankfully, but by several of his colleagues/supporters.
    In fact, the day before that fateful day (which should NOT be described with the hyperbole and vitriol it is, in what should have been an objective blog post), I did an exercise of adjusting the prediction by factoring in an estimate of polls undercounting, using a swag based on Brexit. My prediction was 309 Trump.
    No bragging here, it was just a guess but I think if you came up with a model of the degree of bias, you would have nailed the prediction.

    • E. D.

      It would be difficult for any thinking person to mask disdain for Trump. Get off your high horse.

    • Philip Mathews

      The polling averages in Britain leading up to the Brexit vote indicated a toss up with a significant number still undecided.

      So I don’t know what you’re talking about.

    • Sam Wang

      Philip Mathews is correct.

    • Josh

      This is some good word salad.

    • Jeremiah

      “my prediction was 309 Trump.”

      This would only have been valuable if, before the election, you had published it and made your methods transparent.

  • Richard

    I’ve looked, but I’ve not been able to find anywhere an explanation of how you convert a state poll result to a win probability for that state. I’m not aware of any mathematical means of doing so without simulation. I’d love to be pointed to it.

    • Arun

      What you want is the standard error of the mean (or median in this case). Sam calculates the median of the different polls; the standard deviation of the individual percentages from the polls divided by the square root of the number of polls gives him the standard deviation of the median. Assuming a certain probability distribution, the area of the curve above 50% is the win probability of the candidate whose poll percentages you are using in the calculation. No simulation is needed.
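The recipe in the comment above can be sketched in a few lines. This follows the description as written (median of the polls, sample standard deviation divided by the square root of the number of polls) and assumes a normal distribution; the site's own estimator may differ in detail. The function name and the poll numbers are hypothetical, chosen only for illustration.

```python
import math
import statistics

def win_probability(poll_shares):
    """Probability that the candidate's true two-party share exceeds 50%,
    assuming a normal distribution centered on the poll median."""
    center = statistics.median(poll_shares)
    # Standard error: sample SD of the polls over sqrt(number of polls).
    # Assumes at least two polls that are not all identical.
    sem = statistics.stdev(poll_shares) / math.sqrt(len(poll_shares))
    z = (center - 50.0) / sem
    # Area of the normal curve above 50%.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical two-party shares (in percent) from five polls of one state.
polls = [51.0, 52.5, 50.5, 53.0, 51.5]
print(f"win probability: {win_probability(polls):.4f}")
```

Note that this treats the polls as independent draws around the truth; a shared bias across pollsters is not captured by the standard error and would have to be modeled separately.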

  • Joec

    Take your time Sam. You were excellent this morning on CNN. A calming voice. Sanity still exists.

    Exists not reigns.

  • Michael

    We all make mistakes, Sam. But you were so sure !! A few days before the election you declared, “this election is over!!” I know you are a brilliant and thoughtful man. In the future, make the predictions but without the arrogance and smugness.

    • Dan Jacobs

      Sort of. The model here is not a truly *subjective* prediction. The miss points to some false assumptions, but the majority of the error was the pollsters’.

  • the other Donald

    These poll-aggregating models are fine for states where the results are outside conventional margins of error. Once the results narrow well within MOE, the call should be “too close to poll, we’ll just have to count the votes”. No compelling spectator story there, but elections may not be spectator events.

  • Nick Levinson

    Trump’s pollster, interviewed on BBC Radio within hours of the win, explained his successful prediction by saying he polled “low-propensity” voters. Defining likely voters varies among campaigns and among pollsters. I think the most probable group of low-propensity voters this time would have been those who voted only in a Presidential race and specifically only in 2008 but not 2012, because voting for McCain (military veteran) would be more appealing than voting for Romney (hedge fund manager who closed stores and put people out of work) among likely Trump supporters. I doubt someone could assemble a list of those voters consistently across the U.S. if some jurisdictions disqualify voters for not voting in, e.g., four years and since some voters would have moved, but that problem could have been solved by weighting.

  • Chris Burg

    A month ago there were numerous newspaper stories about the Russians hacking voting machine vendors in twenty states.

    Then this bizarre anomaly occurs.

    We know Russia would hack a US election. We know the machines are vulnerable.

    What are the statistical chances these events are unrelated?

    Is there a way to tell from the data if it has been tampered with?

    • Nancy

      I don’t give in much to conspiracy theories, but I do believe there could be something to this one. I was struck before the election by how calm Donald seemed about Pennsylvania, even though polls had him way down. He kept saying “They’re telling me I’m doing well there.” Then the slow migration of insiders back to him. At the time, I wrote it off to his density.

      He won with the only path he had as an option.

      In any event, it would be interesting to know if hacking is truly possible (many experts say no), and what methods there would be for determining it. From a novelistic point of view, it’s a brilliant way to turn/rig an election, by just enough votes, not over the top.

      Whatever happened, people will move on. They always do.

    • Martin Schiffenbauer

      One way, of course, to tell if any computers counting the votes were hacked is to do a manual recount of the ballots. Clinton could request such a recount in some of the states where the Trump margin was small.

    • Gelatinous_Cube

      Exit polling may also help confirm official results or cast suspicion on them. A problem, though, is that willingness to answer, and answer truthfully, to exit polls, may not be distributed evenly among voters. Possibly there’s a clever way to account for this, but I wouldn’t be surprised if it’s subject to the same pitfalls as likely voter screens.

    • Andrew

      Nancy: “I was struck before the election by how calm Donald seemed about Pennsylvania, even though polls had him way down.”

      The final polls taken in Pennsylvania in the week before the election showed Trump down 1-2%, tied or up 1-2%. The outcome shouldn’t have surprised you.

      Generally across the midwest and north in PA, WI, MI, OH, IN, MN, MO, ME, etc. Clinton hit the mark on her polling number exactly, and all the 5%-9% supposedly “undecided” voters in the polls went and voted Trump, as many of us expected they would due to the “Shy Trump Voter” effect.

  • Tim


    While I agree with the trolls that you were obviously rooting for Hillary, I don’t for a second think it impacted the quality of your analysis. You simply used the polling available, nothing more, nothing less. I think you did a great job of translating the polls into a prediction. I have always found you to be a professional first, putting your politics aside.

    While the prediction was wrong, I blame the polling, not your methods.

    What I think you are more than capable of doing in the future is figuring out a way to ignore the top line of the polls, but instead have your models consume the underlying splits. For example, ignore the top line of Hillary 45%, Trump 42%, and instead model the splits by age, highest obtained education, gender, race etc.

    From there, build a model for each state to try to make two separate predictions, one for vote share and one for turnout, within these segments.

    I assume to build a quality turnout model you would also need to import the polling data for enthusiasm and % likely to the degree it is available.

    I’m not saying any model would have predicted Trump’s EC win, but I do think the method described above would tease out some of the underlying issues with the polling to help us predict more uncertainty in the data than previously understood.
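The segment-based approach described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Segment` class, the `topline` function, and every number below are hypothetical and do not come from any real poll or from the site's actual model. The point is simply that a top line is a turnout-weighted mix of segment-level vote shares, so a turnout miss in one segment moves the top line even when the shares within each segment are measured correctly.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One demographic slice of a state's electorate (all values hypothetical)."""
    name: str
    eligible: float  # eligible voters in the segment
    turnout: float   # modeled fraction of the segment that votes
    share_a: float   # modeled share for candidate A among those who vote


def topline(segments):
    """Turnout-weighted share for candidate A across all segments."""
    votes = sum(s.eligible * s.turnout for s in segments)
    votes_a = sum(s.eligible * s.turnout * s.share_a for s in segments)
    return votes_a / votes


# Hypothetical two-segment electorate.
college = Segment("college", eligible=400.0, turnout=0.70, share_a=0.55)
non_college = Segment("non-college", eligible=600.0, turnout=0.50, share_a=0.45)
base = topline([college, non_college])

# Same vote shares, but non-college turnout is higher than the model assumed.
surge = Segment("non-college", eligible=600.0, turnout=0.60, share_a=0.45)
shifted = topline([college, surge])

print(f"baseline share for A: {base:.4f}")
print(f"share for A with higher non-college turnout: {shifted:.4f}")
```

Keeping turnout and vote share as separate inputs makes it possible to vary one while holding the other fixed, which is one way to express the extra uncertainty the comment is asking for.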

  • Jack

    As a retired math professor/computer programmer/actuary and lifetime political junkie, I ate up this article! Questions: could you describe your algorithm for selecting a manageable set of Congressional district possibilities (connected sets of election districts, perhaps)? More importantly, how do you explain it to a bunch of Supreme Court justices who are probably math-illiterate?

  • PECismyoasisofsanity

    Did people being polled simply lie about their choice for president?

    • Josh

      No. Pollsters used 2012 LV screens which assumed a Dem turnout of 66-67 million. 61 million actually showed up to vote. The difference–5.5 million out of 123 million total votes cast–is about 4%, which matches Trump’s over-performance nationwide.
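The arithmetic in this reply can be checked directly. The figures below are the ones quoted in the comment (66-67 million assumed, 61 million actual, 123 million total); the variable names are made up for illustration, and nothing here verifies the figures themselves.

```python
# Figures as quoted in the comment above, in votes.
assumed_low = 66e6     # low end of the assumed Democratic turnout
assumed_high = 67e6    # high end of the assumed Democratic turnout
actual_dem = 61e6      # Democratic votes that actually showed up
total_votes = 123e6    # total votes cast, per the comment

# Shortfall using the midpoint of the assumed range.
assumed_mid = (assumed_low + assumed_high) / 2
shortfall = assumed_mid - actual_dem

# Shortfall as a fraction of all votes cast.
shortfall_share = shortfall / total_votes

print(f"shortfall: {shortfall / 1e6:.1f} million")
print(f"share of total votes: {shortfall_share:.1%}")
```

The ratio comes out a little under 4.5%, which the comment rounds to "about 4%".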

  • Tony Shifflett

    I think Sam did an excellent job. This was a fluke election, and we all got it wrong.

    Demographics are on our side. Next time will be different. Demographic changes are set to decrease the white population by ~1 percent per year. Go read Ruy Teixeira. In 2020, the changes will really start to tell. There are only so many rural whites, and it gets fewer every year. Go to Vox and read what he wrote.

    I stand by my assertion that the Republicans are looking at long-term doom. Can’t pursue this strategy for too much longer. It’ll be tough for them to pivot once minority views of Republicans get set in concrete. Trump will do that for us.

    • ptuomov

      “This was a fluke election, and we all got it wrong.”

      Nate Silver and 538 didn’t get it wrong. The result was not extremely unlikely based on their predicted distribution.

  • Andrew

    Prof. Wang:

    I pointed out 6 months ago here that there was amazing turnout in the contested GOP primaries, and that possibly comparing GOP and Democrat primary results would be a useful barometer.

    GOP outvoted in primaries in IA-NH-SC-AL-AK-AR-GA-OK-TN-TX-VA-KS-NE-ID-MI-MS-FL-MO-NC-OH-AZ-UT-WI-IN-NE-SD

    Democrats outvoted in primaries in MA-MN-VT-KY-LA-HI-IL-NY-CT-DE-MD-PA-RI-WV-OR-WA-CA-NJ-NM

    In the general election, NH ended up a tie; in VA, the GOP outvoted Democrats for Congress, but due to the #NeverTrump movement, Clinton prevailed for President. The GOP won IA-GA-MI-FL-NC-AZ-WI in the general as in the primary.

    On the Democrat side, PA flipped as I predicted due to Democrat crossover vote and exclusion of Independents (who lean GOP) in closed primaries; WV, LA and KY flipped for similar reasons, as they still have a heavy Democrat registration advantage but don’t vote Democrat for President.

    A couple of odd cases remained – NV and ME had caucuses with very little Democrat turnout, Colorado & North Dakota saw the GOP cancel their primaries.

    As you note, voting patterns are mostly hardened partisanship and turnout. If you had simply gone by the primary elections for your prediction, you could have called the result except for having NH & VA wrong, and leaving ME, CO & NV as tossups. In the end, that prediction would have been very accurate.

  • Nick Spiegel

    Two thoughts: First, people are constantly asking me personal questions (who am I voting for, how much money do I make…), and I make a point to lie to them about it. I want politicians to be more honest, and I have long seen calculated positions based on polling as a force working against personal integrity. I’ve always believed in the “Clinton Bosnia Theory.” Bill polled the public and found that going into Bosnia was going to be very unpopular. He went in anyway, and gained popularity by being willing to take an unpopular stance.

    Second: I can’t help but notice a similarity between aggregate polling and packages of subprime mortgages. Risk is not being reduced if the ground-level processes, from sample to sample, carry the same biases.
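The second point, that aggregation does not reduce risk when every sample shares the same bias, can be illustrated with a small simulation. This is a toy sketch, not anyone's actual aggregation method: each poll's error is assumed to be a bias common to all polls plus independent sampling noise, both normally distributed, and the function names and parameter values are hypothetical.

```python
import random
import statistics


def average_poll_error(n_polls, shared_bias_sd, noise_sd, rng):
    """Error of a simple poll average in one simulated election.

    Every poll gets the same shared bias plus its own independent noise.
    """
    shared = rng.gauss(0.0, shared_bias_sd)
    polls = [shared + rng.gauss(0.0, noise_sd) for _ in range(n_polls)]
    return statistics.fmean(polls)


def rms_error(n_polls, shared_bias_sd, noise_sd, trials=5000, seed=0):
    """Root-mean-square error of the poll average over many simulated elections."""
    rng = random.Random(seed)
    squared = [
        average_poll_error(n_polls, shared_bias_sd, noise_sd, rng) ** 2
        for _ in range(trials)
    ]
    return statistics.fmean(squared) ** 0.5


# Hypothetical settings: 3 points of per-poll noise, 2 points of shared bias.
for n in (1, 5, 20):
    independent = rms_error(n, shared_bias_sd=0.0, noise_sd=3.0)
    correlated = rms_error(n, shared_bias_sd=2.0, noise_sd=3.0)
    print(f"{n:>2} polls: independent {independent:.2f}, shared bias {correlated:.2f}")
```

With purely independent noise, the error of the average keeps shrinking as polls are added. With a shared bias, it levels off near the size of that bias no matter how many polls go into the average, which is the analogy to bundled mortgages.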
