Princeton Election Consortium

A first draft of electoral history. Since 2004

GOP update, pre-New York

April 18th, 2016, 3:59pm by Sam Wang


Tomorrow New York votes. This is a critical race in the Republican primary campaign. Above is a final snapshot, based on polls and voting patterns to date. This calculation gives a median Trump outcome of 1265 pledged delegates (interquartile range or IQR, 1210 to 1305 delegates). The probability of getting 1237 or above is 64%. If polls are accurate, Donald Trump appears to be headed to getting 86 or more of New York’s 95 delegates.

The overall picture represents very little change from last week. Below are some technical notes, as well as state-by-state snapshots. I have updated my methods (details documented here). In the biggest new item, I show how to infer likely voting in states for which there are no polls, without use of any demographic assumptions. Using this method, I handicap Indiana as Trump +7%.

Some of you expressed doubt about my generic approach to representing state-to-state variation. Recall that I used the fact that the Trump v. Cruz/Kasich margin must vary around the national average, which should be predicted by national polls. Although this statement is ineluctably true, it comes with high uncertainty in the outcome because it does not take into account state-specific demographics.

I have now imputed the probable vote margin for states where there are no polls (currently South Dakota, Montana, Nebraska, Indiana, and Delaware). Using the New York Times detailed results map, I tabulated how many counties were won by each candidate in the counties directly adjoining each of the five states, as well as states that have already voted.
This table is arranged in order of fraction of border counties that preferred Cruz or Trump. For example, in Oklahoma, 23 out of 28 counties had more votes for Cruz than for Trump. I included states for which the border data is incomplete. Among states that have already voted, the rank correlation between border-counties rank and the rank of the state’s actual margin is +0.77. A predictor based on which candidate won more border counties would give the correct outcome in 15 out of 17 cases (88%), 16 if the threshold is set to include Wisconsin.
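
The two checks described above (a rank correlation and a simple "who won more border counties" predictor) can be sketched as follows. This is only an illustration: the county shares and margins are hypothetical placeholders, not the values tabulated from the New York Times map, and the function names are made up for this example.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tied group
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def border_hit_rate(trump_share, margin, threshold=0.5):
    """Count states where 'Trump won more border counties' matches the actual winner."""
    hits = sum((s > threshold) == (m > 0) for s, m in zip(trump_share, margin))
    return hits, len(margin)

# HYPOTHETICAL inputs: fraction of border counties won by Trump,
# and the statewide Trump-minus-Cruz margin in percentage points.
trump_border_share = [0.18, 0.35, 0.55, 0.70, 0.90, 1.00]
trump_minus_cruz = [-6.0, -13.0, 4.0, -2.0, 15.0, 22.0]

rho = spearman(trump_border_share, trump_minus_cruz)
hits, total = border_hit_rate(trump_border_share, trump_minus_cruz)
print(f"rank correlation = {rho:.2f}; predictor correct in {hits} of {total} states")
```

Changing `threshold` corresponds to the adjustment mentioned above for Wisconsin.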

I then imputed the likely Trump-v.-Cruz margin (right-hand column), and assumed that the values had a one-sigma uncertainty of +/-10%, i.e. one-third of the time the margin will be off by more than 10 percentage points in either direction.
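
As a quick check on the "one-third of the time" statement: if the error on the imputed margin is normally distributed (an assumption made here for illustration; the post does not name the distribution), the chance of missing by more than one sigma in either direction is about 32%.

```python
import math

def prob_outside(k_sigma):
    """Two-sided tail probability P(|X - mean| > k_sigma * sigma) for a normal X."""
    return math.erfc(k_sigma / math.sqrt(2))

sigma_points = 10.0   # one-sigma uncertainty from the text, in percentage points
miss_points = 10.0    # size of miss being asked about
p_miss = prob_outside(miss_points / sigma_points)
print(f"P(margin off by more than {miss_points:.0f} points) = {p_miss:.3f}")
```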

In this table, Indiana is important. Indiana law forbids autodialer-based polling, which makes the survey process expensive and time-consuming. Since Trump finished ahead of Cruz in 77% of the adjacent counties in IL, MI, OH, and KY, I think he is favored to win Indiana as well. Obviously, real polling data would reduce uncertainty – and potentially alter the overall picture considerably.

The resulting state-by-state assumptions are as follows. This table includes the bonus for Ted Cruz that I have written about before. In the right-hand column is the single most common delegate assignment, based on delegate rules. As I have written many times, GOP rules heavily favor the first-place finisher. Thus, even though Trump may only get around 45% of the popular vote in these states, he is likely to get around 531 out of the 761 delegates to be assigned – about 70%.

In all states except New York, I assumed that the district-by-district standard deviation is 5%. For New York, the calculation assigns Trump 2 delegates in each of 27 districts; if he clears 50% in a district, he also gets its third delegate. I assumed a standard deviation of 9% to match Romney 2012, which would give him over 50% of the vote in 18 districts. Adding the 14 statewide delegates, his total is 54+18+14=86 delegates. This may underestimate the number of delegates Trump will win for two reasons: (1) An assumption of SD=5% would give him 2 more delegates. (2) Undecided voters total 6%; if he picks up 1% there, he would get 1 more delegate on average.
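
The New York arithmetic above can be reproduced with a short sketch. It assumes, following the MATLAB expression `tcdf((54-50)/9,3)` that I give in the comments, a Student's t distribution with 3 degrees of freedom and a polling median of 54%. The function names are illustrative and are not taken from the actual code.

```python
import math

def t3_cdf(x):
    """CDF of Student's t with 3 degrees of freedom (closed form)."""
    u = x / math.sqrt(3)
    return 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi

def expected_third_delegates(poll_median, sd, districts=27):
    """Expected number of districts in which the leader tops 50%."""
    return districts * t3_cdf((poll_median - 50) / sd)

def ny_delegates(poll_median, sd, districts=27, statewide=14):
    """2 per district for finishing first, plus third delegates, plus statewide."""
    return 2 * districts + expected_third_delegates(poll_median, sd, districts) + statewide

base = ny_delegates(54, 9)        # SD = 9%, as in the text
tighter = ny_delegates(54, 5)     # SD = 5%, the alternative in point (1)
plus_one = ny_delegates(55, 9)    # one more point from undecideds, point (2)

print(f"SD=9%: {base:.1f} delegates")
print(f"SD=5%: {tighter:.1f} delegates")
print(f"Median 55%, SD=9%: {plus_one:.1f} delegates")
```

Rounding the expected number of third delegates gives 18 districts at SD=9% and 20 at SD=5%, which is where the two extra delegates in point (1) come from.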

Allowing all of the state-by-state margins to vary up and down together (i.e. uniform swing) gives the histogram of likely outcomes that you see at the top of this post.
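
A minimal sketch of the uniform-swing idea follows. Every number in it is a hypothetical placeholder (the state margins, delegate counts, already-won total, and the size of the swing), and each state is reduced to winner-take-all; the real calculation applies each state's delegate rules and district-level variation.

```python
import random

def simulate_totals(states, banked, swing_sd, n_sims=20000, seed=0):
    """Draw one shared swing per simulation and add it to every state's margin."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        swing = rng.gauss(0.0, swing_sd)  # same shift applied to all states
        total = banked
        for margin, delegates in states:
            if margin + swing > 0:        # simplified winner-take-all rule
                total += delegates
        totals.append(total)
    return totals

def prob_at_least(totals, threshold):
    """Fraction of simulations that reach the threshold."""
    return sum(t >= threshold for t in totals) / len(totals)

# HYPOTHETICAL inputs: (leader's margin in points, delegates at stake)
example_states = [(20.0, 95), (7.0, 57), (3.0, 172), (-5.0, 36), (12.0, 71)]
example_banked = 850      # hypothetical delegates already won
example_swing_sd = 5.0    # hypothetical one-sigma size of the shared swing

totals = simulate_totals(example_states, example_banked, example_swing_sd)
print(f"P(total >= 1237) = {prob_at_least(totals, 1237):.2f}")
```

Because the swing is shared, the states rise and fall together, which is why the distribution of totals is wider than it would be if each state varied independently.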

Alternate scenarios: Trump’s Meta-Margin is +1.5%; if he falls by that much, then the probability of clearing 1237 pledged delegates falls to 50%. This calculation excludes uncommitted delegates and assumes all assigned delegates remain faithful to their voters. Under uncertain conditions (I regard 20-80% probability as uncertain territory, and 64% is right in there), within a limited range, each additional (or removed) delegate alters the probability by approximately 0.5%.

For example, this survey of Pennsylvania delegates suggests that 20% of district-level delegates would be likely to defect – usually because they are Cruz supporters. That would cost Trump an average of 11 delegates (assuming he sweeps all districts). If 11 Pennsylvania delegates are faithless, then the probability of getting to 1237 or greater drops from 64% to 59%. At this rate of defection, the calculation gives a median of 1255 delegates, interquartile range 1199-1294. However, I note that we do not know if Trump voters will willingly vote for Cruz-committed delegates.

Conversely, if 20 uncommitted delegates are recruited, then the probability goes up to 74%. I leave it to you, dear reader, to decide how much to add or subtract.
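
The rule of thumb above (about 0.5 percentage points per delegate, valid only in the 20-80% range) can be written as a small helper for trying other scenarios. The function name and the range check are illustrative choices, not part of the actual calculation.

```python
def adjusted_probability(base_pct, delegate_change, pct_per_delegate=0.5,
                         lo=20.0, hi=80.0):
    """Linear rule of thumb; only meant for probabilities in the uncertain midrange."""
    result = base_pct + pct_per_delegate * delegate_change
    if not (lo <= base_pct <= hi and lo <= result <= hi):
        raise ValueError("rule of thumb only applies between 20% and 80%")
    return result

# Pennsylvania: 20% of the 54 district-level delegates defecting
pa_defectors = round(0.20 * 54)
print(pa_defectors, adjusted_probability(64, -pa_defectors))

# Recruiting 20 uncommitted delegates
print(adjusted_probability(64, +20))
```

The linear estimate for 11 defectors is 58.5%, in line with the 59% from the full calculation.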


35 Comments so far ↓

  • mediaglyphic

    Love the automated Trump IQR at the top of the page!

  • Mark F.

    The big question is how close does Trump have to get to 1237 to have a good chance of winning on the first ballot? 20, 50, 100? I don’t know. But chances are he has the delegates locked up after California.

  • Amitabh Lath

    Nate Cohn’s prediction engine at the NY Times Upshot has Cruz/Kasich/Trump at 43/16/40 in Indiana. Which is significantly more pro-Cruz than your model. Apparently Rubio voters prefer Cruz in his extrapolation.

    • mediaglyphic

      Does Nate declare his assumptions as clearly as Dr. Wang’s model? I think most of these predictions are acting as optimal control filters and goal-seeking to the answer they already believe. There may be a black box algorithm inside their minds.

    • Amitabh Lath

      While Nate Cohn does not allow you to scroll through his raw code like Sam does, he has described his algorithm somewhere in the Upshot pages of the Times. Better than the secret sauce peddled by another Nate on another site.

      Basically his calculus had Trump ahead in Indiana after his March 15 performance. But now Rubio’s 8% is assigned to Cruz. How he knows they will go to Cruz and not Trump or just stay home I do not know.

      http://www.nytimes.com/interactive/2016/03/30/upshot/trump-clinton-delegate-calculator.html

    • mediaglyphic

      Amit,
      I bet not even Nate Cohn knows this. I think this assumption is the equivalent of multiply by zero and add the answer.

      I have a question for those who are current on statistical analysis. Why is Dr. Wang using MODE rather than EV? It seems to have worked quite well in NY; I am just wondering what the theory would recommend. I would have thought that we are throwing away a lot of the fat tail data, but it’s been 35 years since my last stats course!

    • Sam Wang

      I just use the mode to generate a table entry showing you what the rule in that state does. I just apply the rule by hand.

      The overall calculation considers all possibilities.

    • Mark F.

      I think this race is over if Trump wins Indiana.

  • Katie

    The border county thing is a neat idea – it will be interesting to see if it holds up for Indiana. Michigan City, Evansville, and most of the rural areas probably match up with the neighbors pretty well. But the South Bend area is a tougher call – it’s significantly denser than the Michigan side of the border and is a college town. Republican leaning, but possibly towards different candidates. And you can’t account for Indianapolis or Fort Wayne at all.

    • Amitabh Lath

      True, urban areas are hard to extrapolate into from surrounding counties. But urban areas tend not to play as large a role in Republican primaries as they do in Democratic ones.

    • Michael

      @Amitabh Lath Urban areas should count just as much for Republicans as Democrats in regard to winning CDs. For example, in NY, the heavily Democratic Bronx had only about 1,000 Republicans show up, yet they had the same 3 delegates assigned as other CDs that had tens of thousands of Republicans. Statewide is different, of course, but I’d still think there are more Republicans in suburbs than in rural areas.

  • Jack Tenold

    I would like to add my thanks for the Herculean effort you put into this blog. The chart is amazing, and I will follow it this spring and carefully follow your updates. Best site on the web. Thank you.

  • JayBoy2k

    Thanks Sam, I have tried to stay away from this topic for a week or 2 while there was little or no voting going on. I am still laid back and likely will be until the 1st week of June. Your insights on California will be appreciated whenever in the course of time, the media probes your expertise and opinions, and we have a front row seat.
    I appreciate the chart and the work to create it. It is the clearest road map that I have seen. I will be coming back to it after each of the primary votes to see how the votes came out.

  • Gary

    Does the surrounding county analysis factor in how many other Republicans were in the race at the time? If, as I have heard, Cruz and Kasich are consolidating more of the Rubio/Bush/Christie etc. vote, it seems like the analysis might skew toward Trump if not…

    • Sam Wang

      I just counted up who finished ahead between Cruz and Trump, and ignore Kasich. Obviously Ohio, where Kasich finished first in a number of counties, is an issue.

      Another issue is how to reassign Rubio voters. They are not that numerous. Looking at national opinion, they should probably be assigned to Trump:Cruz:Kasich in a ratio of 1:2.5:2.5. I can report this – though all the info is available publicly for you to calculate it.

    • Gary

      Got it thanks!

  • Mark F.

    If Trump fails on the first ballot, it’s hard to see them going for anyone but Cruz. The “establishment” simply does not have a viable nominee, and the delegates are not going to have one foisted on them. However, neither Trump nor Cruz can beat Clinton in the general election, in my opinion. Unless there is some “black swan” event.

    Moving to the Democratic side, we are still seeing these ridiculous “Can Sanders catch Clinton?” pieces. (People simply can’t give up on this guy.) The short answer: No, barring a “black swan” event.

  • mediaglyphic

    Thank you for the state by state breakdown.

  • Amitabh Lath

    Love the idea of using adjacent counties in neighboring states. Brilliant.

    Of course, Iowa makes no sense. Cruz should have lost big there.

    The southern states are a problem as Cruz often gets zero counties, but of course will not get 0% of the vote.

    But overall, not bad.

  • JoeC

    Sam, thanks for bringing some light with an understandable approach to complex problems..

    I couldn’t do it myself, but at least I can understand it.

    There’s something to be said for not writing daily posts unless you do what you do – avoid adding to the noise.

    Meet the Press in general and Chuck Todd, in his role as NBC Political Director, are a disaster – not to say anything good about the swill on cable that substitutes for news, fact or innuendo.

    Todd was all too ready in February to predict the imminent demise of Trump’s candidacy based on a deficient outlier NBC/WSJ Poll that had contrary results to a series of other polls, before and after.

    Todd and company for three days followed the line that Trump’s political demise was imminent, that their poll included an extra day that caught the shift of momentum away from Trump.

    Todd/NBC began to drift toward acknowledging that their poll was out of step. Having watched him the day the poll was released, I could only shake my head. I’m not suggesting it was partisan – just that it involved projections unrelated to data reality.

    I don’t vote in primaries and have never been enrolled in a political party.

    Taking the data driven analytical approach you do is expected in your mind and mine to separate the noise from reality. Your reaction to Josh typifies the humility of doing good analysis.

    I found the CBS Evening News Monday irritating (it happens). Was Bernie closing the gap in NY? I expect a result within the range of statistical probability. I’ll take your word for it. The same with Trump on the GOP side: after the primary, I suspect reporting on “the gap” will depend on how close or distant your own pollster was, which is not a good use of data.

    The conjoint reporting of national polling showing Clinton with only a two-point lead over Sanders made no sense. What has it got to do with the math, or with tomorrow’s NYS primary and what the polling indicates?

    Thanks again.

  • Bill Herschel

    I know that I will be accused of extreme laziness for asking this question, but here goes anyway.

    Is there any precedent for a candidate to come within, shall we say, 50/1237 or 4% of the number of delegates needed to secure the nomination and be denied the nomination by a candidate who has done considerably, very considerably worse?

    Put another way, the Republican Party has put a lot of work into defining what enfranchisement means in this Primary season. Will they be able to survive as a Party if they tamper in a major way with their own definition?

    Put yet another way, and this I think is the most serious and important question, what will it take for the Republican Party to have wounded itself mortally: no more Republican Party?

    I think this can be analyzed quantitatively. But I could not do it in a thousand years.

    • Frank

      Some of my random thoughts to add that may answer some of your questions.

      1. It’s looking like Trump will barely make it: between 1260 and 1290. (So start placing your office bets what the exact number will be!) Trump will make it on June 7 when the last states vote, which include CA.

      2. Before June 7, I predict Kasich will suspend, in order to help Cruz to stop Trump. This won’t work.

      3. Cruz and the GOP will try to take away as many delegates as they can, chipping Trump’s lead down, through various schemes. Trump’s campaign will have to work to counter and firewall against this.

      4. A recent poll shows a clear majority of Republican voters want the nominee to be the one who gets the most delegate votes, regardless of whether he gets 1237. What will happen if the GOP denies this? These Republican-leaning voters could stay home, which could flip the Senate back to the Dems.

    • JoeC

      Bill:

      I doubt there is any precedent in Republican history.

      The 1952 GOP Convention is sometimes called a contested convention, but not in the sense of 2016. Eisenhower was recruited to run by the then dominant GOP moderates against Robert Taft.

      Eisenhower had 595 of the 1206 delegates on the first polling while Taft had 500, Earl Warren 81 and Stassen 20, Douglas MacArthur 10. Ike was only nine votes short. He was nominated with 845 votes on the second polling.
      Taft could not win and wasn’t close. Eisenhower’s nomination was secured by a negotiation that included Nixon’s candidacy for VEEP.

      The Party doesn’t want Trump. They don’t want Cruz. Cruz can’t win it without something like 98 percent of the remaining delegates. I don’t think there is the data for a quant answer.

      I doubt Cruz will have power to deal. They don’t like him. Backing in primary season doesn’t have to translate into nomination. The Tea Party helps make Cruz poison, my opinion.

      The GOP can change the rules before the first ballot or after, do almost anything. I don’t think Cruz is going to be all that close to Trump on a first ballot; after that, I think they have a real problem. Dumping Cruz might be easier than dumping Trump. I don’t know.

      Mortally wounded? Probably not, but today’s GOP bears only marginal resemblance to the GOP of the 1960s.

    • Matt McIrvin

      The modern party primary system just hasn’t been around for very many election cycles. I think you have to go to decidedly pre-modern times to even come close to a precedent.

      I think the Republican Party will probably survive, but if they reject Trump it will be a major repudiation of primaries as we know them. It might be better for them to accept Trump and just eat any resulting losses.

    • Rachel Findley

      Garfield 1880 is the classic example: 36 ballots, Garfield not a contender going in to the convention. The party was sharply divided over the issue of patronage. After 35 ballots, someone put Garfield’s name in contention and he won the 36th ballot, much to his surprise. The Republicans survived.

      https://garfieldnps.wordpress.com/2012/11/20/stalwarts-half-breeds-and-political-assassination/

      “Between 1868 and 1952, 18 of the 44 party conventions took more than one ballot to settle on a candidate, usually a sure sign that deals had to be struck.

      “The last multi-ballot convention took place in 1952, when Democrats had to vote three times before settling on Adlai Stevenson as their sacrificial lamb to run against World War II hero Dwight D. Eisenhower. And as late as 1968, Democrats chose a nominee — Vice President Hubert Humphrey — who hadn’t run in a single primary yet was the preferred candidate of the party-machinery chieftains.”

      Read more here: http://www.miamiherald.com/news/politics-government/election/article68111532.html#storylink=cpy

  • Josh

    Thanks for this, Sam.

    Am I mistaken in thinking that NY awards 2 delegates to the winner of a congressional district and the third delegate if and only if the candidate breaks 50% within the district?

    • Sam Wang

      Yes. It is a complex rule. I actually wrote it up wrong in the essay – the code is correct. Thank you for catching this.

      To assign the third delegate in each district I used Trump’s polling median, which is currently 54%, i.e. 4 points above 50%. That leads to a probability (if you can read MATLAB) of tcdf((54-50)/9,3)=0.66, so 18 of the 27 districts in which the third delegate is assigned. 14+54+18=86, which is what the table should say (not 68).

  • Jim

    What does the Y axis represent? If I want the probability of above 1237, do I just calculate the area to the right of the red line?

  • Ethan

    Ah, I see it now in the past post. But this in particular really seems like an ubercritical assumption, since if it’s wrong it completely flips the analysis, and it’s an assumption that doesn’t seem warranted based on how the unbound delegate selection process has gone so far. But others can decide for themselves on its importance.

    • Sam Wang

      I do not agree with the comment about “flipping” the analysis. The most important output is an estimate of the likely number of delegates. Probably I should emphasize that.

      If 20 Pennsylvania delegates are faithless, then the probability drops from 64% to 55%. So in this midrange of highly uncertain outcomes (20-80% probability is highly uncertain), subtract 0.5% of probability per faithless delegate. You, the reader, get to decide how many that will be.

    • Matt McIrvin

      Lots of other analyses have claimed that Trump has almost no chance of getting to a majority. Is the main difference from this analysis the way they treat Pennsylvania?

    • Sam Wang

      If you look at the table in my post, it is not hard to get to 1,237. If one assumes that no Pennsylvania delegates vote for Trump, that obviously makes a difference. However, I find that to be a rather harsh assumption.

      It would help if you cited one of those analyses!

      …ok, here’s one. Over at the Great Argental Satan is a long essay that beats around the bush, and doesn’t make a firm prediction.

      It doesn’t count any Pennsylvania delegates as bound, despite this survey of delegates, which I read as meaning that 72-88 out of 110 delegate candidates would vote for Trump if he was the winner. So that could be a defection rate of 22-38 out of 110, or 20-35%. Scaled to 54 district-level delegates, that would cost Trump 11-19 delegates. However, this is not the biggest difference.

      It also says that Indiana (57 delegates, winner-take-all) leans toward Cruz. Indiana is the reason why I did that whole boundary-county thing. I would say that is a big deal.

    • Andrew

      Well, 538 has a delegate tracker that maps out a path to 1237 and shows Trump about 9% below the level he would need to be at to be on that track. They have been reporting (based on that) as though Trump is unlikely to make it to 1237. I believe this tracker has been influencing other coverage as well.

      http://projects.fivethirtyeight.com/election-2016/delegate-targets/

  • Ethan

    Isn’t it a huge leap to give Trump all 71 PA delegates, when only 17 are bound by the vote and Trump has done horribly among unbound delegates? If you are going to assume he gets all 54 unbound delegates that seems like a huge, huge presumption that needs to be highlighted. No?

    • Sam Wang

      I wrote about this before. This is a calculation of pledged delegates, and all such calculations must contain assumptions. Another such assumption is that unpledged delegates are not counted at all.

      It is tedious to restate every assumption in every essay. Please read past posts before commenting.