Princeton Election Consortium

A first draft of electoral history. Since 2004

What data got right in 2016 – and what’s ahead for PEC

December 20th, 2016, 2:25pm by Sam Wang


Harry Enten points out that areas surrounding Ivy League schools voted predominantly for Clinton. He concludes that these are bubbles. I think there is something more in these numbers.

Undeniably, academics tilt liberal, as do the communities they live in. However, additional forces were at work in 2016. White college-educated voters swung away from the Republican Presidential nominee by double-digit percentages. The numbers above reflect that. In addition, consider that elite universities are institution-oriented. That is, they favor the existing order: meritocracy, a rule-based society, and governmental/private organizations that remain stable over time. Those values are conservative – but they also cross party lines.

This year, the Republican nominee promised to upend that order. His personal actions, and those of advisers such as Stephen Bannon and Michael Flynn, do not count as conservative by any usual definition. Today, conservative columnist Michael Gerson points out that while the Republican Party is at its zenith, conservatism is at a low:

…what is the proper conservative response? It is to live within the boundaries of law and reality. There is no certain way to determine if Russian influence was decisive. And no serious constitutional recourse seems to remain. While open to other options, I see none. It will now fall to citizens and institutions to (1) defend the legislature and judiciary from any encroachment, (2) defend every group of people from organized oppression, including Muslims and refugees, (3) expand and defend the institutions — from think tanks to civil liberty organizations — that make the case for a politics that honors human dignity. And pray for the grass to grow.
 

Indeed, signs of democratic instability have been brewing in multiple Western nations, as documented by Roberto Foa and Yascha Mounk (original article PDF here). In addition to conservative voices, liberal voices have also pointed out the risk; see this essay by Paul Krugman, “How Republics End.”

>>>

Usually, PEC would close down after the election for two years. But this year I’ve heard from many of you about your continued appetite for data-based analysis. More than ever, data is necessary to understand public life. Here are some examples of what we learned this year:

That is just the analysis done here – there was also much excellent work done at FiveThirtyEight and The Upshot.

The failure was in the general election – and even there, polls told us clearly about just how close the race was. The mistake was mine, in July: when I set up the model, my estimate of the home-stretch correlated error (also known as the systematic uncertainty) was too low. To be honest, it seemed like a minor parameter at the time. But in the final weeks, this parameter became important.

The estimate of uncertainty was the major difference between PEC, FiveThirtyEight, and others. Drew Linzer has explained very nicely how a win probability can vary quite a bit, even when the percentage margin is exactly the same (to see this point as a graph, see the diagram). At the Princeton Election Consortium, I estimated the Election-Eve correlated error as being less than a percentage point. At FiveThirtyEight, their uncertainty corresponded to about four percentage points. But we both had very similar Clinton-Trump margins – as did all aggregators.
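
To make this point concrete, here is a minimal sketch in Python. It assumes the final error in the margin is roughly normal, so that the win probability is the normal cumulative distribution evaluated at the margin divided by the uncertainty. The function name `win_probability` and the sample values (a 2-point lead, with uncertainties of 1 and 4 points) are illustrative assumptions, not the actual code or settings used at PEC or FiveThirtyEight.

```python
import math


def win_probability(margin: float, sigma: float) -> float:
    """Chance that the true margin is above zero, assuming normal error.

    margin: estimated lead in percentage points (hypothetical input)
    sigma:  assumed uncertainty of that lead, in percentage points
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Normal CDF at margin / sigma, written with the error function.
    return 0.5 * (1.0 + math.erf(margin / (sigma * math.sqrt(2.0))))


lead = 2.0  # illustrative lead, not a figure from any aggregator
for sigma in (1.0, 4.0):  # illustrative small and large uncertainty
    p = win_probability(lead, sigma)
    print(f"lead {lead:+.1f}, uncertainty {sigma:.1f}: win probability {p:.1%}")
```

Under these assumptions, the same 2-point lead comes out near 98% with the smaller uncertainty and near 69% with the larger one. The margin is identical in both cases; only the assumed error changes.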

For this reason, it seems better to get away from probabilities. When pre-election state polls show a race that is within two percentage points, that point is obscured by talk of probabilities. Saying “a lead of two percentage points, plus or minus two percentage points” immediately captures the uncertainty.

Even a hedged estimate like FiveThirtyEight’s has problems, because it is ingrained in people to read a percentage as a share of votes. Silver, Enten, and others have taken an undeserved shellacking from people who don’t understand that a ~70% probability is not certain at all. Next time around, I won’t focus on probabilities. Instead I will focus on estimated margins, as well as an assessment of which states are the best places for individuals to make efforts. This won’t be as appealing to horserace-oriented readers, but it will be better for those of you who are actively engaged.

>>>

State polls aren’t the only thing that failed. On a larger scale, journalists failed to see the Trump phenomenon coming, and did not take him seriously as a disruptive force. Now the failure takes a new form: an inability to see that if false statements compete on equal ground with truth, the rules have changed. Nowhere is this better evidenced than by this photograph of an off-the-record feasting-and-ritual-humiliation.

As CNN’s Brian Stelter has written, this pairs oddly with Trump’s continued assault on the media. Columbia Journalism Review’s Kyle Pope has written that a “new aggressiveness” is needed when covering politics. Contrary to the coziness shown in the photograph, now seems like a time to redouble the use of fact-based discussion – and to call out inaccuracy.

>>>

This all leads to a question of what to do in the months ahead. The risk to institutions does not allow the luxury of waiting for two years. I plan to use the Princeton Election Consortium as a forum for data analytics in the public interest. The general goals are to understand where we are today, and to identify ways in which individual efforts matter, not in 2020, not in 2018, but right now. For examples, see the right sidebar, which lists some ideas.

I’m looking for partners in this endeavor. All are welcome, of all political persuasions. One common theme is that quantitative analysis and facts will serve as a starting point. Sometimes we will focus on norms of society and government, though that is a domain where external expertise would be helpful.

I hope students and colleagues will be part of this effort. Postings will be less regular than during the election season, but we still aim to bring you good work. I hope you will stay with the Princeton Election Consortium.

Tags: 2016 Election

21 Comments so far ↓

  • Emigre

    Picking up again on your question “what to do in the months ahead,” I wish for PEC to focus also on the one-person-one-vote doctrine (http://www.theconstitutionproject.com/portfolio/one-person-one-vote/) and specifically on strategies to modify the Electoral College. In addition to the much discussed “National Popular Vote Interstate Compact” there are more recent proposals that deserve consideration such as the ‘electoral vote equivalents’ methodology of Kaplan and Barnett: http://som.yale.edu/news/2016/12/prof-edward-kaplan-proposes-cure-for-the-electoral-college

  • Bulgakov's Cat

    Dr. Wang… i dont think u appreciate the gravity of the situation. This election represents the end of republican participation in liberal democracy.
    The single most important finding post-election is the rejection of Trumpist republicanism by college graduates.
    Here is another poll that was wrong–
    http://election.scholastic.com/vote/
    Consider: by 2020 roughly half those students will be eligible to vote.
    There are ~ 20 million new college freshmen every year.
    So the Hispanic death-curve (where the increasing Hispanic curve crosses the declining white curve) is advancing on the GOP, while the non-college educated are dying off and more college-educated (liberal inclination) kids are entering the electoral pool.
    Unless the GOP changes the rules– they are staring down a forever future of losses.

  • gumnaam

    What about assuming that the full undecided percentage could tilt just one way? After all, if there were still undecided people on Oct 27 after everything Trump said and did, they were ripe to flip to him anyway.

  • Matt McIrvin

    What I would like to see is a revised analysis of your history of regional realignments and their principal components, with the final 2016 results.

    Something was clearly going on in the upper Midwest–but was it truly novel, a regional mini-realignment, or just a continuing evolution of an existing trend, exacerbated by a small nationwide swing against the Democrats (which the realignment analysis subtracts out)? I’d been thinking that region might be trending red for some time, but didn’t expect it to swing so hard so soon.

    Meanwhile, it looks to me as if the secular trend of increasing Democratic strength in the West continued.

    • Sam Wang

      The correlation between 2012 and 2016 was 0.952, quite high. I don’t think of nationwide waves as being the key parameter, but how much a state is above or below the national average.

      If you rank the states in order of vote share (see Wikipedia), it seems that Ohio and Iowa moved toward being more Republican than average. Arizona and Georgia may have moved toward Democrats.

      Generally I agree, it is interesting to re-evaluate issues that I’ve written about before, specifically state-by-state polarization and partisan gerrymandering. Anything else?

    • Matt McIrvin

      So, probably best seen as a continuation of existing trends, then.

      The other thing I can think of is the history of Meta-Margin volatility and how it relates to polling error–obviously 2016 broke a relation there that had held for several cycles, being low-volatility but with higher polling error. But is there more to say from a retrospective analysis?

  • Emigre

    Since you plan to use PEC as a forum for data analytics in the public interest perhaps you could consider partnering with other high quality data driven research sites such as the Pew Center to gain direct access to their data.
    For example, many here – me included – favor the elimination of the Electoral College. But the question was raised whether this will increase polarization. Equally valid is the concern that it could lead to a majority suppressing a minority. The ongoing discussion at the Pew site provides a useful snapshot:
    http://www.pewresearch.org/fact-tank/2016/12/20/why-electoral-college-landslides-are-easier-to-win-than-popular-vote-ones/

    Considering the onslaught of fake news, a second suggestion is to provide a link to one of the reputable fact-checking organizations such as PolitiFact or FactCheck.

  • Mark Jordan

    Dr. Wang:

    While I have truly appreciated your solid analysis and especially the creative use of statistics you showed in the primaries, I do feel you have a blind spot which may have contributed to your missed final prediction. You decry Trump’s assault on media and truth but have no mention of the same over the past eight years. Here’s one example, pulled from a Yahoo article:

    “The Justice Department spied extensively on Fox News reporter James Rosen in 2010, collecting his telephone records, tracking his movements in and out of the State Department and seizing two days of Rosen’s personal emails, the Washington Post reported on Monday.”

    Did you ever report on Jonathan Gruber’s comments that indicated the Obama administration had to deceive the American people due to their stupidity? We all know of Obama’s famous line, “If you like your doctor you can keep your doctor,” but somehow this escapes the analysis of those who think Trump represents some kind of new threat on truth-telling.

    I do hope you allow my post to be published. I am truly impressed with your work and fully understand how this election threw a wrench in even the best of models and analyses. I only wish to show you a bit of what I think that wrench looked like.

    • Sam Wang

      Actually, your examples prove the point. Those examples are (a) basically a domestic wiretap, and (b) the kind of loose talk that lower-level people engage in. In particular, Gruber’s remark is understood to be a loose comment that is not thought to have been influential. If those are your best examples, they are not convincing on the grounds that you had to scour years of records to find them, and two of them originated from lower offices.

      It is a different matter for lies to come from the top, repeatedly. Recall that a fact-checking organization (PolitiFact I believe) places Donald Trump as an extreme outlier in the sheer number and audacity of his falsehoods. There’s nobody close.

  • Perry

    It’s the turnout modeling which is at issue. The polling companies seemed to be expecting 2012.

    Also, it is possible that GOP voter suppression may have had a stronger than expected effect.

    And finally, a huge clue was the polling in Iowa and Ohio. If Iowa leans red, then Wisconsin will likely also be more red. Same with Ohio vs Michigan and/or Western Pennsylvania.

    Thanks for great site, Sam.

    • Matt McIrvin

      There was a regional discrepancy that covered the whole Great Lakes region. It’s a really distinct blob on the maps.

      Also, many polls incorrectly had Hillary winning NC and Florida, but these were very close and the discrepancy was small there, really not too unexpected.

      The great success of state polling was in the West. That’s the dog that didn’t bark. Everyone was looking at Nevada and Hillary carried Nevada, just like the aggregates said. Colorado and New Mexico too, but not Arizona, and that was consistent with the polls.

  • Periwinkle

    Do the data available provide any advice on how much money a campaign needs, and how to spend it? Spending millions didn’t seem to achieve much in the Presidential primaries or election, but as you pointed out many times, both candidates were among the best-known living Americans before the election even began. So, I’m sure the Senate and House races would provide much more meaningful numbers, if they are available.

    I’ll be honest – I have a pre-existing bias here. What I’m hoping to hear is that TV advertising is a waste of time and money, and therefore that fundraising for the money to pay for same is also wasted effort. But if the data say my biases are wrong, at least I’ll know what the right answer is.

  • pechmerle

    Here is a data-oriented question that is beginning to surface in policy arguments. Suppose that we did succeed in doing away with the Electoral College, and went to direct popular vote election of the POTUS. Would that decrease, or instead increase, partisan polarization in the electorate?

    The argument that it would increase polarization seems to rest mainly on the notion that today’s EC system forces the candidates to fight for critically important votes of swing voters in battleground states. The (under-articulated) premise is that these swing voters are more centrist than voters in such liberal strongholds as NY and CA.

    Does the data support this argument for the EC or not? Can PEC &/or its readers offer quantitative insight on this topic?

  • pechmerle

    Keeping the “Saving U.S. Institutions” top right sidebar always present is very much the right thing to do.

    I’d suggest that you add a link there to the Brennan Center for Justice at NYU, which does great work in voting rights litigation, fighting the corrosive effect of big money in our politics, government and judicial reform, and several other core topics in defense, and advancement, of democratic institutions.

    A key example from their current reporting:
    http://www.brennancenter.org/publication/florida-outlier-denying-voting-rights — showing the Jim Crow origins of Florida’s laws suppressing voting rights, and its extreme effects.

  • Jeremiah

    I had a conversation over at Brad Delong’s blog before the election about how the probability of Trump winning seemed very small. They indicated that the polls could be off, and that if they are off they are very likely to be off in the same direction (correlated, obviously). I could see that if this was true, the probability of a Trump win is approximately the probability of the “tipping point” state going to Trump. This still seemed fairly unlikely (at about 538's level, ~25 percent) but it did seem more likely than other aggregators had it at. Surely this type of correlation could be built into the model? I would be willing to give it a go although I’m not great at Matlab programming!

    • Sam Wang

      That’s what the systematic error is. It just involves making a larger estimate of the error.

      Or do you mean introducing it at a regional level? That would involve knowing state-to-state relationships, which is what they do over at FiveThirtyEight. I do not think the detail adds much when it comes to bottom-line probability, though it does seem to help with individual-state estimates.

    • Jeremiah

      I’m not sure we are talking about the same thing. When you were discussing the systematic error months before the election you settled on a standard deviation of 3 percent based on the historical evidence.

      To a first order approximation this 3 percent could be applied uniformly across all states and I’m sure your probability of a Clinton win would have been much less than >99%.
      To a second order approximation one could build a model of state to state correlations and apply those to the swing. I’m thinking there could be a 50×50 array to model this. The model could be developed either from fundamentals like demographics and geographical proximity, or by going back through history to estimate the state to state variability (or a combination of both).

    • Sam Wang

      The systematic error, which is synonymous with nationwide correlated error, was 3 percentage points then. I set it to decrease over time, which is why the Clinton win probability increased over time. In other words, I am suggesting that a uniform swing across all states would be a better approximation, which is your first suggestion.

      The second suggestion is not necessary for purposes of making a national-outcome prediction. It is also duplicative of what the FiveThirtyEight people do.

  • Patrick T Cronin

    First and foremost, the blog is great.
    Frankly, we all got it wrong; but that’s not the point. What can we learn from the last few months? Perhaps we should consider other sources of data, e.g. Twitter or Facebook. It has taken us 68 years to get an opportunity like this. Let’s not waste it.
    IMHO,
    Pat C.

  • A Essaji

    Thanks so much for carrying on. Despite getting the overall result wrong, you do us a great service by carefully and intelligently crunching the data. As a fellow data nerd, I am glad you’ll continue confronting punditry with data.

    On a related note: I don’t think you should abandon your work on the meta-margin, or the probabilities. You rightly acknowledged that Nate Silver was correct on the correlated errors. Could your model be adapted to that?

    Thanks again!
