Princeton Election Consortium

Innovations in democracy since 2004

Nov 03: Biden 342 EV (D+5.3% from toss-up), Senate 53 D, 47 R (D+3.9%), House control D+4.6%
Moneyball states: President AZ NE-2 NV, Senate MT ME AK, Legislatures KS TX NC

Presidential Polling Error: slightly smaller than 2016…but in deep Trumpland, larger

November 6th, 2020, 12:17am by Sam Wang

While we wait for the likely conclusion of a Biden win with 306 electoral votes and a 5-6% popular-vote margin…

The graph above shows only states which have been called by media organizations, and where the count is >95% complete (results from NYT). For example, California, Arizona, Nevada, Pennsylvania are left out. Things may change.

In only two states, North Carolina and Florida, did the polls point in the wrong direction, where “wrong” means that the sign of the polling margin and the sign of the outcome are opposite. These states are indicated in red (North Carolina’s not entirely done yet). From a public consumption standpoint, though, that was consequential: before the election, expectations were raised of a possible Election-Night resolution…which was then followed by four days of suspense (I’m updating this on Saturday 11/7, after the networks finally called Pennsylvania).

But if we get into the details, there is a notable error in state polls. It has two components: (1) In states where polls favored Biden, the actual vote margin favored Trump by a median of an additional 2.6 points. (2) In states where polls favored Trump, Trump did better by a lot – 6.4 points median, and increasing steeply with his vote share.
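
As a rough illustration of how the two components above could be computed, here is a minimal sketch. The state names and margins are made-up placeholders, not the data behind the graph, and the helper names are hypothetical. Margins are in points, with positive values meaning a Biden lead.

```python
from statistics import median

# Hypothetical (state, final poll margin, actual margin) rows, in points.
# Positive = Biden lead. Placeholder values, NOT the data used in this post.
rows = [
    ("State A", 8.0, 5.0),
    ("State B", 3.0, 1.0),
    ("State C", 1.5, -1.0),
    ("State D", -6.0, -12.0),
    ("State E", -15.0, -23.0),
]

def median_shift_toward_trump(rows, polls_favored_biden):
    """Median of (poll margin - actual margin) within one group of states."""
    shifts = [poll - actual for _, poll, actual in rows
              if (poll > 0) == polls_favored_biden]
    return median(shifts)

def wrong_direction(rows):
    """States where the poll margin and the outcome have opposite signs."""
    return [name for name, poll, actual in rows if poll * actual < 0]

biden_lean = median_shift_toward_trump(rows, True)   # states polls gave to Biden
trump_lean = median_shift_toward_trump(rows, False)  # states polls gave to Trump
missed = wrong_direction(rows)
print(biden_lean, trump_lean, missed)
```

Using the median rather than the mean keeps one badly missed state from dominating either group, which is presumably why medians are quoted here.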

Before you get all excited about that larger number…I should point out that it’s a known phenomenon in landslide states. A long time ago, I pointed out that states with lopsided margins tend to have polls that understate the winner’s margin. (See “The exuberance of likelier voters,” November 12, 2008.) In this case, the larger bonus in strong Trump states might have arisen from lower enthusiasm for voting among Democratic voters who were aware that their votes were not influential. If true, it’s interesting that there isn’t a similar effect visible yet in strong Biden states (though we don’t have enough data in IL/NY/NJ/CA yet, so it may yet show up).

Pandemic effect? This steeply-increasing error could be a pandemic effect. Democratic voters were more public-health-conscious. It would be worth going back and surveying voters in red states to see if Biden likely-but-didn’t-voters were deterred by the disease risk. One could test this by seeing if systematic errors were smaller in deep-red states with a lot of mail-in voting.

This may have mattered quite a lot in Senate races, where the polling errors were quite large: a median of 4.5 points, the largest such error in the last 20 years. I note that Senate candidates’ vote share was more like the Trump vote share than their own polling:

I think there is more to this Senate story, but will leave it alone for now.

Among states where the voting margin is smaller than 10 points, the largest polling errors occurred in Iowa (7 points), Florida (6 points), and Wisconsin (7.5 points). Errors were smaller in Texas (3.3 points), Michigan (4.5 points), Virginia (3 points), and NH/MN/CO/ME/NM (less than 3 points). The following key states don’t have enough counting to get a good estimate: OH, GA, NC, AZ, PA, NV. They seem unlikely to change the overall pattern.

Difficulty of sampling non-college and Hispanic populations? For purposes of following elections using polls, the other thing that needs explaining is the 2.6-point error. Exit polling shows that Trump gained support this year among non-college and Hispanic voters [GA] [TX] [WI]. Those groups are known to be hard to reach. Maybe the ones who responded to surveys were not a representative sample of those groups.

Or, to put it another way…these aren’t monolithic groups. It would be a good idea to figure out how to stratify within each group and identify distinct subgroups. That might be done by asking more questions about news engagement, social connectedness, and so on. It might not take long, and one could even identify questions that don’t poison later answers, a known problem in polling.

Shy…or just undecided? Another possibility is so-called “undecided” voters. They were thought to be a factor in 2016, but there were fewer of them this year. This year, they tended to vary inversely with Trump support, i.e. if undecideds went down by a point, Trump support went up. It is as if some Trump supporters were unaware of their own leanings. In 2008 I wrote about the cognitive science of how one could be unaware of one’s own preference.
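
One way to check the "traded off" pattern is to regress each candidate's support on the undecided share across a series of polling averages. The sketch below uses invented weekly numbers purely for illustration, and the `slope` helper is a hypothetical name. A slope near -1 for Trump and near 0 for Biden would match the description above.

```python
def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical weekly polling averages in percent (placeholders, not real data).
undecided = [8.0, 7.0, 6.0, 5.0, 4.0]
trump = [42.0, 43.1, 43.9, 45.0, 46.0]
biden = [50.0, 49.9, 50.1, 50.0, 50.0]

trump_slope = slope(undecided, trump)  # change in Trump support per point of undecideds
biden_slope = slope(undecided, biden)  # same for Biden
print(round(trump_slope, 2), round(biden_slope, 2))
```

With real data the three shares are constrained to sum to roughly 100, so the two slopes are not independent. This shows the direction of the trade-off and is not a test of why it happens.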

Tags: 2020 Election

57 Comments so far ↓

  • Peter Norvig

    Sam, could this be differential nonresponse of Trump voters who don’t like answering polls, either because they come from the “fake media” or because the voters have low social trust?

    • Sam Wang

      Maybe, though I am somewhat reluctant to ascribe the exact reasons. To put it in a more boring manner, demographics that favor Trump (noncollege and now, to a greater extent, some Hispanic voters) are known to be hard to reach. It could be that the ones who are reached aren’t representative of those groups.

    • 538_Refugee

      Maybe a large percentage is ‘cell phone only’ and, like me, don’t bother answering unknown numbers.

    • AP

      I think pollsters weight their samples according to major socio-economic factors, and education was added after the 2016 “surprise” by most pollsters, certainly the best ones. Race is traditionally part of these corrections. Of course very large weights for hard-to-reach groups increase the variance of the estimates, but in a well-understood way. So if there is a non-response answer to this polling error, it may require considering other factors. Eventually, there may not be a reasonably small number of socio-economic factors that allow one to predict the non-response rate of supporters of one side or the other.

  • Antonio Soza

    This data could be used by any partisan hack to implicate one side or the other. After the 2020 election, Trump and GOP pundits said that polling was intended to suppress the Republican vote when, in fact, an obvious and clear victory for Biden would actually convince Democrats to stay home because it’s “in the bag,” and would also spur Trumplicans to vote like never before.

    • Sam Wang

      I would word that slightly differently.

      In strongly partisan states, the losing party doesn’t bother to turn out as much. I reported that in 2008. From that point of view, the question is why that pattern stopped working in strongly Democratic states. Maybe it’s like what you say, i.e. Trump supporters turn out as if they are the dominant party wherever they are.

    • Joseph Bland

      “Trump supporters turn out as if they are the dominant party wherever they are.”

      Really well put. And it makes total sense to me. There’s a deep lesson to be learned here by the Democrats….

  • Joe Leahy

    When you speak of “polls” being “off,” in terms of “Trump overperformance,” you are referring to the percent difference between Biden & Trump being wrong, right? But what happens when you just look at whether the polls are off with respect to the performance of one candidate, individually? I am more familiar with 538 and RCP averages, but yours seem to show the same general trend: in many states that ended up being more pro-Trump than expected, the Biden numbers are not really that far off; rather, the Trump number is low and the “other” category is bigger than it turned out to be. (That is what I see overall, but if it does not comport with the actual numbers, please feel free to correct me.) Doesn’t that suggest that the gray pie slice of “other” is largely where the answer to the “error” lies?

    • Sam Wang

      Maybe. During the campaign I did see evidence that undecided voters and Trump support varied inversely to one another. That is, for every point that undecideds went down, Trump support went up. It was as if some Trump supporters were not fully aware of their own leaning. Didn’t publish it, seemed a bit weak.

      I specifically think the shy-Trump idea is wrong. See this piece I wrote in 2008:

    • Joe Leahy

      Thanks for your reply, Sam. Even if the “shy Trump” thing is not the crux of the disparity, when experts talk of polls being “off,” do you know why they are so focused on the margin rather than the absolute numbers? I have seen lots of discussion of how the margins were off, but little mention of the fact that the Biden numbers are mostly right and the problem seems to lie in the Trump numbers. That would seem to be an important nuance on which to focus. (Of course, it is possible I have the polling averages wrong, since I did not do a comprehensive review. But I checked for a bunch of swing states and the Biden numbers all seemed extremely close.)

    • James McDonald

      You raise an interesting point. Given the historical record of many polls it should be relatively simple to correlate day-to-day swings in the undecided vote with contemporaneous swings in the D and R votes. (In an extreme case, if U increases by the exact amount R decreases, while D stays the same that is 100% correlation with R. In the more usual case where all three fluctuate the correlation requires more work to tease out.)

      Those correlations could then be used to estimate how to divvy up the undecided vote to get a heads up D/R comparison. I suspect that would produce results much closer to the final tally than ignoring the undecideds, asking undecideds to choose, or using some seat-of-the-pants allocation.

    • Sam Wang

      Did that. Basically the undecideds and Trump traded off. Consistent with the idea that “undecideds” were nonverbalizing Trump voters. Note this is different from the shy Trump idea.

  • Pechmerle

    Landslide state effect: Reminds me of the adage familiar to ag commodities traders. Big crops get bigger; small crops get smaller – from USDA forecasts in the spring before the planting season to actual amounts harvested in the fall. A known way to play the options markets.

  • Jeremy

    Could this be a function of GOTV dynamics? Republicans went the traditional route of feet on the street, knocking on doors. Democrats encouraged early voting including by mail, the traditional route being too off-brand in a pandemic. Could the result reflect that good old-fashioned arm-twisting is still the best use of campaign resources?

    • Pechmerle

      I think this is a very persuasive point. I did phone banking, and even though most of those called were registered Democrats, many of them just didn’t like the interruption of a cold call.

      In my town, when a person comes knocking at my door for a local candidate, I tend to be impressed by the real support it indicates for that candidate that someone would walk the neighborhood for them. This is especially so if the door-knocker is a young person (as they often are).

  • Jeffrey Davis

    Is the accuracy of polling in predicting the outcome closely linked to paper ballots?

  • Alex

    Given your finding on undecided voters, do you think “reluctant” Trump voters were a factor?

    Maybe many conservative-leaning people had reservations about Trump or voting Republican for most of the cycle, but towards the end, they ended up coming home.

    Similarly, my guess is that there was a lot of wavering soft support in states like South Carolina that ended up reverting to partisan baselines. Hence polls showing a close race, but Graham winning by double digits.

  • Joe

    I heard Dan Pfeiffer state that a close look should be made of exit polls, which will have very large Ns. His concern is that these may show similar skewing with respect to the actual vote, making polling unreliable until corrected. As Peter mentions above, there may be a large cohort of ‘distrustful’ voters, ones who distrust government, media and pollsters. Presuming they strongly favor Trumpism and refuse to participate in polling, the statistical analysis will be flawed since they are not measured in the ‘assay’. If true, I wonder if this will last after Trump is no longer President.

  • BD

    What about Senate races? Gideon led in Maine, but it wasn’t even close.

    • Sam Wang

      Part of it may have been a late swing, partly having to do with her vote against Amy Barrett’s confirmation to the Supreme Court.

    • LondonYoung

      Four polling operations took a swing at Maine in the last month. Each asked about both Prez and Senate. They got president right (only +1.5% more Trumpy), but they missed senate by 13.3%. So, 12% of voters flipped to Collins because she voted Nay on just the last of Trump’s three nominees? Maybe …

    • LondonYoung

      (really, 6% of voters would have needed to flip their position)

    • Kenneth O'Brien

      We have ranked-choice voting, so it likely was a bit closer than indicated. Probably a VERY high percentage of the missing 7% between the two major parties would have gone to Gideon if Collins did not break 50%. So the final result was most likely 52/48%. Still a big miss, but not a 13% miss.

  • Oliver

    Did you consider that maybe the polls actually did reflect voter intentions correctly but votes did not because of the voter suppression strategy used by the GOP? This would certainly explain why the difference between polls and votes gets bigger in redder states where voter suppression is easier to implement.

  • Pierre Lebel

    I am convinced that this is again a very bad case of GIGO. Garbage in, garbage out. The huge reduction of landline phones, people who answer surveys as a joke, in other words pollsters not reaching the right crowd, and the ones they are reaching not being truthful for a number of reasons (fake news, distrust of the establishment, the Facebook & Twitter echo chamber effect), you name it. Thanks for doing all of this, Sam, but I think it’s time to find a new way to get the pulse of the general population.

    • Sam Wang

      It’s not what I do, mostly, but thanks. As most people know I am focused on electoral reform.

      The polling stuff has value – it was on target in 2018 and 2019. See my Columbia Journalism Review article. I see your negativity as being overblown.

    • Amitabh Lath

      I agree with Pierre wholeheartedly.

      If you are constantly recalibrating your instrument after each measurement, your instrument is broken.

      There seems to be a large mass of undetectable voters. They do not interact in the Standard ways: they don’t answer phones, or internet polls, and are not part of any UCLA/USC panel…but they do vote. That is the only visible indication they exist.

      Sometimes they are distributed evenly, and polls work (2012!). Sometimes they clump in one direction (2016, 2020) and throw you off. Sometimes they clump in Florida but not Minnesota.

      Maybe we don’t need a Kepler to rework the whole cosmology of voter preference estimation. Maybe we need a Vera Rubin who can tease out the signatures of this elusive group.

    • AP

      Statistical estimates always have some level of error. Whether that makes them “useless”, “a joke” etc. is normally not decided by opinionated people commenting on blogs, but by those who commission and act on those estimates. Polls are an essential tool of contemporary politics. Candidate time, on-the-ground operations, ad money are all carefully optimized based on polls. If you have a better way of doing it, there is a lot of money to be made (or glory to be achieved, if that is more your thing).

  • Matthew J. McIrvin

    How complete are the results? Do many of these states have a large chunk of mail votes out, that we aren’t paying much attention to because there’s no chance of them flipping the presidential election?

  • Marc

    Not every pollster was bamboozled this cycle. J. Ann Selzer, in a poll for the Des Moines Register last week, seems to have predicted the result in Iowa pretty well. This NYT article says she avoids the use of assumptions from prior years. Maybe we could all learn from her methods.

    Does anyone have more insight into how she conducts her polls? Is this sort of method more expensive or time-consuming than the more common process?

    The following is a quote from the above link.

    Not every pollster fared poorly. Ann Selzer, long considered one of the top pollsters in the country, released a poll with The Des Moines Register days before the election showing Mr. Trump opening up a seven-point lead in Iowa; that appears to be in line with the actual result thus far.

    In an interview, Ms. Selzer said that this election season she had stuck to her usual process, which involves avoiding assumptions that one year’s electorate will resemble those of previous years. “Our method is designed for our data to reveal to us what is happening with the electorate,” she said. “There are some that will weight their data taking into account many things — past election voting, what the turnout was, things from the past in order to project into the future. I call that polling backwards, and I don’t do it.”

  • clayton

    Would you be willing to label the circles in this plot? It would be much more readable. Thanks!

  • Nff

    I do truly think you gentlemen are avoiding the major issue: abortion. Most white women & Catholic POC are now against abortion. They may not have loved 45 but they loved Barrett. While they may have started drifting away from 45 over the summer, Barrett brought them home in the last week.

    I doubt most of you in this thread know or talk to many “average working mom” women & so I think you have to reflect on how this discussion ignores & underweights them. Best wishes!

  • Jones Murphy

    I do truly think many are avoiding the major issue: voter suppression. The USA is one of the world’s most aggressive voting rights abusers. The chief targets of this are black people. Supreme Court Republicans threw out most of the Voting Rights Act (one of Dr King’s most enduring legacies) back in 2013. The reddest states (i.e. the most racist) promptly seized the opportunity to return to their despicable practices of voter suppression in numerous ways. This means big mismatches between polls and election outcomes.

    There’s no “Catholic POC” vote. Black Catholics vote like black people. Asians and Latinos are much more conservative than black people. Right-wingers shifted to abortion as their excuse when openly pursuing segregation was no longer acceptable:

  • Amitabh Lath

    No wait! The major issue was *searches notes…* immigration/ undocumented workers. Yeah that’s it. All you highly credentialed people do not realize the plight of someone in a zero-threshold-to-entry job who is counting on cutting the lawn or patching the roof of the bourgeoisie in order to make rent. This is why you got older immigrant communities (Hispanics, Vietnamese) and young African Americans moving towards Trump.

    Ok, sorry for the snark, but enough of these just-so stories that fit some data-free notion of what is going on. Without numbers (more likely a matrix with many many off-diagonal terms) we got nothing.

  • David Angell

    Ms. Selzer (and Trafalgar group) aside — and it would certainly seem that their methods and accuracy are worthy of discussion — predictive modeling based on polling came out of this with a lot of questions that need to be answered. Unless, as Frank Luntz suggested, this miss is the death knell for political polling.

    Whether it was an effect of the low social trust of Trump voters, other Trump-specific effects, or intrinsic problems with polling methodologies and vectors, the credibility of predictions and data punditry based on polling and aggregation has taken a serious blow. Anecdotally, I know Trump voters who say they would not talk to pollsters — and I wouldn’t be surprised if there are those that would lie to pollsters to advance the narrative that they cannot be trusted. Not to mention the increasing polarization of our politics and the challenges in obtaining representative samplings even when the respondents are all telling the truth.

    While it will be up to the polling industry to determine what went wrong and how they can try to fix it and regain the trust of the public, as Sam has pointed out above, this site is focused on electoral reform and how to allocate resources. With that in mind, I was wondering what the data says regarding the “moneyball” states and races.

    From a cursory look, the final moneyball states, AZ, NV, and NE2 were clearly good places for resources, but so were Michigan, Wisconsin, Pennsylvania, and North Carolina. After 2016, the importance of the first three was obvious and North Carolina wasn’t really a surprise either.

    Which brings us to Georgia, which seems like a miss. The way things stand now, it looks like Georgia has the lowest number of votes to change the outcome per electoral vote. Will you be doing a final “voter power” ranking based on what actually happened once the votes are certified? In retrospect, was there any indication in the data of the importance Georgia would play or was the noise across the map just too high to have predicted it with any confidence?

    In any case, big thanks to Sam and his team for all of their hard work this cycle! I especially appreciate articles like this (and more that I am sure to follow) trying to understand what went wrong and how to improve the methodology and better interpret the results (and the uncertainty in them) in the future.

  • Sam Wang

    The error is notable by historical standards – like I said, 2.6 points in places where it mattered. I am inclined toward problems like noncollege voters, Hispanics, and undecideds. It doesn’t minimize the problem, but we don’t have to invoke “lying to pollsters” until there is evidence for that.

    The bigger error was in Senate races. That is quite something, will show that later.

    If you have something substantive to say, go ahead. But complaints about how you’re not coming back to PEC…do the graceful thing and go quietly! Set an example on how to leave gracefully.

    OK, I will go back to chewing over what this means for the future of redistricting and electoral reform…

    • Marc

      As part of what you suggest above, would appreciate your thoughts on where pollsters got things right (e.g. J. Ann Selzer), and what alternatives to polls might provide more reliable predictions. I’m thinking here of analysis (e.g. machine learning) of social media traffic, web searches etc. Big Data.

    • Amitabh Lath

      A lot of movement among non-college, Hispanics (also Native Americans) etc. seems to be relative, but minuscule in absolute terms. And even if these are found and “fixed”, they will be the 2020 misses. Who knows what new subgroups will form and lead everyone astray in ’22 and ’24?

    • 538_Refugee

      “But complaints about how you’re not coming back to PEC…”

      You/we are working with available data. I’m pretty sure I’m on record here BEFORE the actual election in 2016 saying a miss would be on the pollsters. One famous site that corrects for pollster error was just as off.

      I’d have to believe any concerted effort to mislead pollsters would have shown up on social media sites. I think the draw of Cult Trump was simply missed.

    • Amitabh Lath

      Refugee, anyone who measures things for a living will tell you, you cannot measure to the micron with a tape measure.

      Estimating systematic uncertainties is the hardest thing to teach students. After all, if a fancy electronic gadget is giving you voltage or resistance etc. to 5 significant figures, why wouldn’t you believe it? Getting them to distrust numbers on a screen is difficult.

      Polls probably have a 4% (?) systematic uncertainty due to turnout modeling assumptions. Systematic uncertainties need to be estimated, their correlations understood, and added properly. I do not believe even the A+ pollsters on your namesake site do this.

    • 538_Refugee

      “After all, if a fancy electronic gadget is giving you voltage or resistance etc. to 5 significant figures, why wouldn’t you believe it? Getting them to distrust numbers on a screen is difficult.”

      I had this discussion on the Arduino forum with someone that wanted to build an instrument capable of reading nanometer level changes on a 400 Volt input. I like to tell people that engineering is where science meets reality.

  • Stephen Huegel

    Just a thank you to the PEC. I perceive that a GREAT deal of time and thought went into the many months of effort. Hard to imagine, Dr. W, that this isn’t your “day job”! Again, my thanks for your effort, intent, clarity of thought, etc.

  • David Elk

    The past 2-3 cycles I’ve tried to be involved downballot through ActBlue – I live in an area with no interesting races. But now I’m questioning whether I’ll ever do this again. The downballot polls were bad enough to not be useful indicators of a race’s competitiveness. Senate races that appeared to be within the MoE were called almost immediately on election night.

    Relatedly, another thing I’m struggling with is whether being involved in other states’ races is actually useful. Out-of-state donations are starting to not feel like a net positive, especially if I flip it around and think about my own local races. It almost seems icky – like how it felt in CA when Prop 8 was funded by external groups.

    Does ActBlue cause bad candidates to win primaries? Can a bunch of progressive money force the wrong candidate into a general election? Does out of state money affect candidate behavior or how the candidate is seen locally? Does the money all go to ads which are probably not super effective? Does funding even matter?

    My gut feeling after this year is that I’d much rather be involved with GOTV somehow instead.

    • Pechmerle

      There is no question at all that the best GOTV work is door-to-door in person. Some people do go to a nearby state to participate in that. But especially here in the West, the distances are mostly too great. So we seek other ways to be helpful from out of state.

      To do so, though, we do need good metrics that inform us where to employ our resources most effectively (donations, phone banking, post carding, etc.).

      Worth reminding ourselves that this site’s Moneyball places for Pres. – AZ, NV, and NE-2 – were close, came out right, and would by themselves (no PA, no GA) still have got Biden to 270 on the nose.

      I do strongly agree with Amitabh that the polling industry has to dig deeper and see if it can greatly reduce the systematic error that they seem to encounter just when Trump is on the ballot. (He won’t be in 2022, but he might be – horrors! – in 2024.)

  • Dan


    Can you explain why a median of the polling *errors* across the final polling averages of all contests is a legitimate measure of polling effectiveness?

    Wisconsin was off by about 7% – not one outlier among Wisconsin polls, but the *average*, that was basically solid for months. If the scorecard for the polling endeavor is 2.6, it seems to indicate that those trying to base strategy on polls should afterwards ignore the whole Wisconsin race (its average with its 7% error) as an outlier.

    You also seem to indicate that the sign is somehow instructive–if the winner of Wisconsin happened to be correct because Wikler ignored the polls and managed to eke out .6% in a withering environment the polls said was highly favorable, it’s not clear why this directionality, on a 7% error and well within the margin of error of being wrong about the winner, should in any way count as a plus for the polling endeavor.

    Essentially we’re saying that we can expect to randomly have to disregard certain races when assessing polling after the fact, with every chance that one of these “outliers” decides the presidential outcome in the opposite of the indicated direction.

    Another way to put it: once the worst-case polls in state races–especially when sustained over long periods–have been averaged out, why isn’t the worst-case error in the average poll for any state race (here, say, Wisconsin’s 7.5%) the measure of how useful polling is for planning resource allocation?

    For comparison, vendors of market data tend to tout their average latency, but customers, once they experience the worst-case latency, find this, or the latency determinism, a far more relevant measure of the quality of their experience.

  • James McDonald

    If the landslide effect was well-known, why wasn’t it included in your calculation of the probability distributions?

    Wouldn’t that have significantly extended the tails in both directions?

    (Not trying to be harsh — just genuinely curious.)

    • Sam Wang

      No, it would not. That effect pertains to noncompetitive races. Also, the histogram displayed is a snapshot of current polls without being a prediction.

      If we were publishing a prediction, which was internally generated but never provided to you, that is done by
      which also contains a t-distribution, though in retrospect one could fatten the tails a bit there by reducing the degrees of freedom, i.e. tcdf(*,1) instead of tcdf(*,3).

      Again, this is not a prediction site. See this CJR article. If you’re a sports-fan-like person looking for a focus on predictions, there are other sites for you.
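
The remark about tcdf(*,1) versus tcdf(*,3) can be made concrete with a small sketch. It assumes tcdf(x, ν) denotes the Student t cumulative distribution with ν degrees of freedom (as in MATLAB-style notation) and that the argument is a standardized margin. The value 2.0 is an arbitrary illustration, not a number from the site's calculation. The closed forms for ν = 1 and ν = 3 are standard results.

```python
import math

def t_cdf_df1(x):
    """Student t CDF with 1 degree of freedom (the Cauchy distribution)."""
    return 0.5 + math.atan(x) / math.pi

def t_cdf_df3(x):
    """Student t CDF with 3 degrees of freedom (closed form)."""
    u = x / math.sqrt(3.0)
    return 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi

def normal_cdf(x):
    """Standard normal CDF, for comparison."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

z = 2.0  # hypothetical standardized margin (margin / uncertainty)

# Probability that the trailing side wins despite a z-sigma polling deficit.
p_normal = 1.0 - normal_cdf(z)
p_df3 = 1.0 - t_cdf_df3(z)
p_df1 = 1.0 - t_cdf_df1(z)
print(round(p_normal, 4), round(p_df3, 4), round(p_df1, 4))
```

Fewer degrees of freedom means fatter tails, so under these assumptions the same lead translates into a noticeably larger chance of an upset.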

    • James McDonald

      Fair enough, and I understand you are just trying to summarize the polls, not predict, but the entire point to the polls is to offer a reasonable prediction of how the election will play out. (Otherwise, why bother?) They are trying to sample the population in a manner consistent with the way the election will sample the population.

      So if there is a known effect that biases their sampling, it would seem to make sense for the purpose of aggregation to add that uncertainty to the nominal uncertainty induced by their sampling size, etc.

      Not sure what I’m missing here. If you know (or even strongly suspect) that a series of polls has a significant source of systematic error, it is mis-informative to not clearly show that. Abstractly, a strongly suspected bias should be incorporated as an adjustment to the nominally stated values and/or the uncertainties associated with individual polls. We shouldn’t naively put a finger on the scales, but we also shouldn’t naively look the other way if we believe something else has a finger on them.

  • James McDonald

    Eyeballing polling trends over the past year for various states, there seems to be a pattern in which Trump’s final vote percentage is a much closer match to his maximum over the year, as opposed to the average or any trend line.

    The same does not seem to apply to Biden’s numbers, which seem to track the average or trend lines better.

    Not sure what to make of that, but maybe someone else could.

  • Jorge Payne

    Perhaps polling limitations will always prevent accurate forecasts, especially since distinguishing between likely and not-likely voters is a crapshoot. Why not focus on more non-related yet correlated factors like NASCAR following, NPR subscriptions (the NASCAR-NPR factor?) as well as meta-data like hemlines, the stock market, and the Super Bowl winner? Maybe we’ll get better results with those other factors tossed into the mix.

    • Sam Wang

      The problem there is one of overfitting. You have to throw in the right number of variables or else you get something that is wildly wrong when the next election comes along. Good idea to do *something* though.

      I have something brewing at a newspaper, will be out soon. Also see Dylan Matthews’s interview with David Shor, which is along the lines of your suggestion.

    • Emigre

      Four years ago Dr. Wang made an interesting and somewhat detailed comment on “non-related yet correlated factors” after he was pointed to the use of Google-wide association studies:
      Unfortunately this idea, which appeared to produce quite accurate results, petered away and wasn’t discussed further.
      High time to take another look at that, Sam, or are you inclined to dismiss it as a red herring?

  • Marc

    Interesting article by Nate Cohn in the New York Times about some early theories about what might have happened. Too early to be definitive, but at least he outlines some areas for future investigation.

  • ArcticStones

    Is it incorrect to view these discrepancies as a general failure of Democratic GOTV efforts – and a success of Republican GOTV drives?

    Speaking of which, it might be of interest to correlate the two, although I am not sure how you might quantify GOTV efforts…

  • xian

    I think the nuances among undecided, cognitively unable to verbalize, and shy are semantic ones primarily. Less committed Trump voters were likely well aware of the criticisms aimed at his supporters and felt less comfortable acknowledging their lean before they had to.

    Maybe the reverse would happen if the Dems nominated a socialist.

  • Joseph AP

    A somewhat related question for Prof. Sam. What is your opinion on Prof. Allan Lichtman’s prediction model? He seems to always predict the next President accurately. There is a good discussion on this model here. Even the popular vote % calculation seems to be close.
