Princeton Election Consortium

A first draft of electoral history. Since 2004

February national polls are the best you get until August

May 22nd, 2016, 12:00pm by Sam Wang

Tuesday 5/24, 8:20am: in the comments, an interesting discussion here.

Some media types are going around with their hair on fire over two unfavorable polls for Hillary Clinton in which she lags Donald Trump. In response in the NYT, Norm Ornstein and Alan Abramowitz are trying to convince you that these polls mean nothing. Nothing, I tell you! Don’t Panic!!!

In a deep sense, they’re right. As I wrote the other day, opinion can move a lot between now and Election Day. And it is inappropriate to trumpet a single poll showing an exceptional result, which is what the news channels do.

However, do not throw out the baby with the bathwater. In fact, we can learn quite a lot from polls by extracting as much value as possible from them. This can be tricky because right around now, national polls are the least informative they are going to be in 2016. To put it another way, polls will be more informative one month from now – and they were also more informative a month ago. How can this be, and what do we really know about the Clinton/Trump November win probability? [Read more →]

→ 73 Comments | Tags: 2016 Election · President

Automatic Voter Registration: The King of Behavioral Interventions?

May 21st, 2016, 7:56am by Sam Wang

Pro-voting activists are constantly trying to increase the rate of voting. They often get interested in behavioral interventions such as voter contact. Successful interventions typically boost turnout by a few percentage points. More generally, the effect of any get-out-the-vote effort is small enough that I don’t have to account for it in any of the polling analysis I do. However, now there’s a game-changer: automatic voter registration.

This Tuesday, the Brennan Center for Justice at NYU held a workshop on Automatic Voter Registration: Why And How. It was a practically-focused but also public event: former Attorney General Eric Holder spoke, as well as California Secretary of State Alex Padilla. Legislators, elections administrators, and activists showed up. I was invited to analyze the brain mechanisms and political consequences of automatic registration – the only time I can recall being asked to do anything in both domains at once!

Automatically registering citizens to vote unless they actively opt out is a big deal. It could potentially increase voter turnout by close to twenty percentage points. Here I’ll publish some notes from what I said. [Read more →]

→ 38 Comments | Tags: Politics

Nate Silver mostly puts his finger on what happened

May 18th, 2016, 1:09pm by Sam Wang

I think journalists have missed the point about Nate Silver’s error. Since Silver personifies data analysis, it is easy to get mixed up about what failed. As I wrote last week, the data didn’t fail – clear signs pointed toward Trump for a long time. However, Silver went beyond the data – in his words, he “acted like a pundit.” Here are his comments. The essay is long, but the title is on point. Basically I agree with points #1 (he didn’t make a real statistical model) and #4 (“fundamentals”-based models might not add that much value).

A reader asks what I think of the claim in point #3 that he was “too frequentist” and that his “Bayesian prior” of a Trump nomination should have been 10-12%. Hmmm. My first thought is that estimation of priors requires a lot of judgment. I don’t fault him for that…but he should own his estimates. To my taste, he leans too hard on political science, which relies on nice, stable trends. In a disruptive race like the 2016 GOP nomination contest, this leads to problems.

Since estimating a prior probability relies on an element of taste, it seems to be an error to cling to apparent quantification (“10-12%”). It creates the appearance of rigor, but not the substance. Revising such a number post hoc doesn’t seem constructive. This move up from 2% goes high enough to reduce embarrassment, but stays low to retain credibility. Hypothetically, if Trump had lost, would we be reading about this revision? Probably not.

Also, a purely poll-based approach such as mine, from January, worked well. Frequentist or Bayesian? You tell me. “Too frequentist” seems to be leaning hard on terminology. If Bayesian means “exercising judgment in interpreting data,” I would say he was too Bayesian. But let’s forget those terms. Basically, I was fortunate enough to notice that multi-election trends like “The Party Decides” were giving strange and contradictory answers. So I went back to polls, which were being quite clear. I say: let’s stick with polls when we can, and since modeling can be intrusive, keep it separate.

Update: Reader Kevin points out: “to support the proposition that ‘relatively few people predicted Trump’s rise,’ Silver links to an article featuring data-less opinions, with the subtext plainly that only cranks were confidently predicting Trump’s success in 2015 (you think we look bad–look at the other side!).” This seems like a good time to recall that data-loving media figures who saw Trump coming include Norm Ornstein, Paul Krugman, Matt Yglesias, and Andrew Prokop at Vox (in September 2015!!).

→ 44 Comments | Tags: 2016 Election · President

Trump expands the battleground…to Utah and the Deep South

May 9th, 2016, 8:22pm by Sam Wang


Historically from 1952 to 2012, the likely range of movement in two-candidate margin from this time until Election Day has been 10 percentage points, which is the standard deviation from the 16 past elections. Therefore, even though Clinton currently leads by a median margin of 7 percent (12 national surveys) and would certainly win an election held today, she could still lose the lead, and from a purely poll-based standpoint, is only narrowly favored to be elected President in November (probability: 70%).
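As a rough check on this kind of arithmetic, here is a minimal sketch that treats the November margin as normally distributed around today's polling margin. The normal model and the function name `win_probability` are my assumptions for illustration, not necessarily the calculation behind the figure above. With a 7-point lead and a 10-point standard deviation it gives about 0.76, a bit higher than the 70% quoted, so the number in the text presumably rests on a more conservative calculation than this one.

```python
import math

def win_probability(margin_pts, sd_pts):
    """P(final margin > 0) if the final margin ~ Normal(margin_pts, sd_pts).

    The normal model is an illustrative assumption, not the post's stated method.
    """
    z = margin_pts / sd_pts
    # Standard normal CDF evaluated at z, written with the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Current median margin (Clinton +7) and historical SD of movement (10 points)
p_clinton = win_probability(7.0, 10.0)
print(f"Clinton win probability under a normal model: {p_clinton:.2f}")
```

The same function makes it easy to see how sensitive the answer is to the assumed standard deviation: a smaller value pushes the probability toward certainty.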

It is also the case that Clinton is the only candidate who is poised for a blowout. Her “plus-one-sigma” outcome (current polls plus one standard deviation) is a popular vote win of 58.5%-41.5%. Trump’s plus-one-sigma outcome is a narrower win, 51.5%-48.5%.
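The plus-one-sigma numbers follow from simple arithmetic on the margin. A small sketch, assuming a two-candidate vote that sums to 100% (the helper name `two_party_shares` is hypothetical):

```python
def two_party_shares(clinton_margin_pts):
    """Convert a Clinton-minus-Trump margin into (Clinton %, Trump %).

    Assumes the two candidates split 100% of the vote between them.
    """
    clinton = 50.0 + clinton_margin_pts / 2.0
    return clinton, 100.0 - clinton

current_margin = 7.0   # Clinton's median lead in current polls
sd = 10.0              # historical standard deviation of movement

# Clinton's plus-one-sigma case: margin moves 10 points in her favor
clinton_best = two_party_shares(current_margin + sd)   # (58.5, 41.5)

# Trump's plus-one-sigma case: margin moves 10 points in his favor
trump_best = two_party_shares(current_margin - sd)     # (48.5, 51.5)

print(clinton_best, trump_best)
```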

I should point out that the last four elections, from 2000 to 2012, have been far less variable than I have calculated above. They show a standard deviation of 4 percentage points. These have been polarized years. But considering the upheaval in the Republican Party, a little voice tells me to open my mind to a wider range of possibilities…including a Trump win.

Of course, the Presidential race is played out through the Electoral College, which is composed of winner-take-all races. [Read more →]

→ 125 Comments | Tags: 2016 Election · President

Among Republicans, Trump supporters have slightly lower incomes. But what really differentiates them?

May 7th, 2016, 1:39pm by Sam Wang

First, the news clips. At The New Yorker, John Cassidy digs further into the question of data journalism vs. data punditry, and cites PEC favorably. He thinks data journalism is at its best when it isn’t trying to make predictions, but helps us understand what is happening now. I mostly agree with that, though I do think future predictions can be useful if they are transparent and put the assumptions on the table where we can see them.

Also, some long perspectives from Scott Lemieux at the New Republic and from me at the Daily News on Trump’s long odds. Based on a current margin of Clinton +7%, I put Clinton’s poll-based November win probability* at 70%. That’s not taking into account my observation yesterday that Trump’s ascent shows that “The Republican Party is broken. It probably broke slowly, from 1994 to 2014.” This is addressed in part by a long analysis piece by Patrick Healy and Jonathan Martin in today’s NYT.

Now let us turn to a recent offering by FiveThirtyEight. They gave income information on Trump voters (which is good data journalism practice!) – and then created a false impression that Trump voters are well-off (which is questionable data punditry). Let me explain. [Read more →]

→ 61 Comments | Tags: 2016 Election · President

Did data journalism lose – or just data pundits?

May 5th, 2016, 10:51pm by Sam Wang

I see in today’s New York Times a column critiquing journalists on their coverage of the Republican primaries. Overall it’s a good piece, but one statement pops out: “data journalists have screwed up this year.” This comment misses an important point. The people who have come under criticism are actually a hybrid of journalist and pundit – which might be the problem.

Data-driven nerds carry the potential to give readers an unvarnished look at politics, free of hype. I think their perceived lack of success over the last six months stems from the fact that they have mixed up two roles a bit: synthesizing what they report (journalism) and stating what they conceptually think should occur (punditry). Let me explain. [Read more →]

→ 68 Comments | Tags: 2016 Election

What head-to-head election polls tell us about November

May 1st, 2016, 3:08pm by Sam Wang

Note: I have found a problem with this calculation. The national-poll-based Clinton win probability is closer to 70%. I will have an update and explanation soon.

General-election matchup polls (e.g. Clinton v. Trump) started to become informative in February. In May, they tell us quite a lot – and give a way to estimate the probability of a Hillary Clinton victory. [Read more →]

→ 68 Comments | Tags: 2016 Election · President

Indiana may not matter any more

April 28th, 2016, 9:00am by Sam Wang

Media types want you to get your knickers in a twist about Indiana. However, the data suggests that it doesn’t matter any more. Rationally speaking, it is probably time to stop writing so much about the Republican race for delegates. Also, may we have a moratorium on “brokered-convention” articles please?

Today I write about the PEC delegate snapshot. It is based on data posted here. All polls are current, including Trump +6% in Indiana (n=3 polls). Based on Tuesday’s voting, in which Cruz underperformed polls by a median of 4 percentage points, I will no longer assign a Cruz bonus. Note that Trump overperformed polls by a median of 8 percentage points.

As of today, for recently-unpolled states (NE, WV, OR, WA, MT, NM, SD) I will start using Google Correlate-based estimates. Of those states, Trump is favored in West Virginia (34 delegates) and is near-tied in Oregon and Washington (proportional representation). The rest are Cruz states.

Put through the PEC delegate simulator, the median delegate count is 1333 (interquartile range 1304-1339). The probability of getting to 1237 delegates is 98%:

What if we assume that Trump will lose Indiana? In that case the median drops to 1284 delegates (interquartile range 1278-1287). The probability of getting to 1237 is now 97%:

The one-percentage-point change in probability is inconsequential. The main effect of forcing a Cruz win in Indiana is to reduce uncertainty in the delegate count, which you can see in the narrowing of the histogram.

Close states (Oregon, Washington, and New Mexico) happen to use proportional rules, so they contribute very little uncertainty. Winner-take-all or nearly-winner-take-all (i.e. district-level rule) states are either strong Cruz (Nebraska, Montana, and South Dakota) or strong Trump (West Virginia, California, and New Jersey).
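To make the logic concrete, here is a minimal Monte Carlo sketch of how a median, an interquartile range, and a probability of reaching 1237 can come out of a delegate simulation. This is not the PEC delegate simulator. The function names, the treatment of proportional states as a fixed haul, and every number in the example call are hypothetical placeholders.

```python
import random
import statistics

def simulate_delegates(base, proportional, winner_take_all, n_sims=10000, seed=0):
    """Simulate final delegate totals.

    base: delegates already won
    proportional: delegates assumed from proportional states (treated as fixed)
    winner_take_all: list of (delegates, win_probability) pairs
    All inputs are illustrative assumptions.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        total = base + proportional
        for delegates, p_win in winner_take_all:
            # Each winner-take-all contest is an independent coin flip
            if rng.random() < p_win:
                total += delegates
        totals.append(total)
    return totals

def summarize(totals, threshold=1237):
    """Return (median, (Q1, Q3), probability of reaching the threshold)."""
    q1, median, q3 = statistics.quantiles(totals, n=4)
    p_reach = sum(t >= threshold for t in totals) / len(totals)
    return median, (q1, q3), p_reach

# Hypothetical inputs, chosen only to show the mechanics
totals = simulate_delegates(
    base=1000,
    proportional=40,
    winner_take_all=[(60, 0.95), (30, 0.05), (150, 0.90)],
)
print(summarize(totals))
```

In a sketch like this, contests with win probabilities near 0 or 1 add almost no spread, and proportional states add none, which is why the uncertainty concentrates in the few contests that are both large and genuinely contested.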

Most of the remaining uncertainty comes from district-level races in California. With California polls showing Trump +18% (Google Correlate says Trump +31%), it will take a highly coordinated effort by Cruz and Kasich to pick up many of its 53 districts. They would use geographic information like this Sextant Strategies survey to guide their efforts. At the moment, the likeliest outcome is for Trump to get at least 160 out of 172 delegates in the Golden State.

→ 53 Comments | Tags: 2016 Election · President

Trump on a glide path (since mid-March)

April 27th, 2016, 9:19am by Sam Wang

The race has been stable for weeks, varying only by factors that are local to each state. Last night’s voting confirmed that – there was nothing new revealed. In terms of voter sentiment, the GOP race has been essentially unchanged since March.

How do we know this? Two reasons. The first is that national polls have been stable for four weeks, since March 22. The second is the remarkable success of a predictive method based on Google Correlate, which relies solely on past voting and web search patterns – and does not use polls or demographics at all. Here is how PEC and N.’s Google Correlate method did: [Read more →]

→ 35 Comments | Tags: 2016 Election · President

East Coast primaries – open thread

April 26th, 2016, 8:30pm by Sam Wang

Based on polls and border counties (see summary), I expect Donald Trump to get over 85% of all the delegates to be voted upon today. I estimate that he will gain about 150 delegates (43 of these are district-level Pennsylvania delegates, whose rate of faithfulness I estimate to be 0.8). Trump’s total number of delegates might exceed 1000 tonight. [live results at HuffPost]

More calculations from N. after the jump.

→ 15 Comments | Tags: 2016 Election · President