Princeton Election Consortium

A first draft of electoral history. Since 2004

2016 Senate Forecast (original post)

This is the original post of the Senate forecast. The final version is posted here.

Update, August 30: Commenters are strongly against the use of an expert prior. One reader, Tony Asdourian, writes: “I understand there are competing values– the simplicity of only using polls vs. the need to be as accurate as possible– but to be honest, I think that your “brand” (not that you want one!) revolves around trying to predict using objective data only. Using pundits starts to blur what you are doing with the methodology of 538 and others. I really admire the fact that you were working hard to be ‘polls only’. I sort of took it as a prior (hah!) that you were the guy who worked hard to avoid all that other stuff.”

The commenters win. The prior will be based entirely on the history of past Senate polling trends: Presidential coattails this year, and “throw the President’s bums out” in midterm years. This guarantees that information is not counted twice, i.e. we are no longer using expert knowledge which itself takes polls into account. PEC offers, once again, a pundit-free prediction.

Close readers of the Princeton Election Consortium know that we calculate not only a snapshot of current Senate conditions, but also predictions of final outcomes. Last week, Josh Katz at The New York Times’s The Upshot started publishing a comparison of models, including PEC’s. Today, I start featuring it in the banner above.

Today, according to our model, Democrats have a 77% chance of winning the Senate. Because the probability of Senate control is in the 20-80% midrange, it is currently an important place for both sides to put resources, Democrats through ActBlue and Republicans through the NRSC.

I will explain PEC’s Senate model. It uses the same math as the Presidential forecast, and consists of three steps:

  1. Taking a snapshot of current Senate conditions and calculating a Meta-Margin (MM);
  2. Projecting random drift in MM; and
  3. Filtering that projection through a separate prior to make a November prediction.

This probability calculation depends most strongly on one parameter, the overall movement in conditions between now and Election Day. This approach has the virtue of simplicity. In models of greater complexity, future change has to be estimated in multiple ways, which may excessively compound the uncertainty of the prediction. In the PEC approach, good estimation of a few parameters gives a prediction that is as confident as possible based on data. However, these few parameters must still be estimated accurately!

Step 3 above is new and represents a substantial improvement over the 2014 forecast. I will describe why that election’s lessons have led me to use an asymmetric prior. This is going to be technical; the most gory details, which can be skipped, are in italics.

In the left sidebar, you can see the current Senate snapshot. This is done using the same strategy as the Presidential snapshot:

  • In each state, take the 3 most recent pollsters or N days, whichever gives more data;
  • Take the median margin and estimated SEM to calculate a current win probability; and
  • Calculate the compounded distribution of all Senate races to get a histogram of all possible outcomes (i.e. 13 tracked races, or 2^13=8192 outcomes).

These three steps give a sharp picture of conditions today.
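The snapshot steps above can be sketched in a few lines of Python. This is only an illustration, not PEC's actual script: the poll margins are made up, the SEM is estimated from the sample standard deviation with a floor of 1 point, and a normal distribution converts margin to win probability. All of those choices are assumptions. The seat histogram is built by convolving one race at a time, which covers all 2^N win/loss combinations without listing them.

```python
import math
import statistics

def win_probability(margins):
    """Median margin and estimated SEM -> probability the Democrat leads.

    margins: recent poll margins in points, D minus R.
    The SEM estimate and its 1-point floor are assumptions for illustration.
    """
    n = len(margins)
    median_margin = statistics.median(margins)
    sd = statistics.stdev(margins) if n > 1 else 0.0
    sem = max(sd / math.sqrt(n), 1.0)
    z = median_margin / sem
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def seat_distribution(win_probs):
    """Histogram of Democratic wins among the tracked races.

    Returns a list where entry k is P(exactly k wins), treating races as
    independent and convolving one race at a time.
    """
    dist = [1.0]
    for p in win_probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)   # Democrat loses this race
            new[k + 1] += mass * p       # Democrat wins this race
        dist = new
    return dist

# Hypothetical polls for three races (not real data).
polls = {"A": [3.0, 5.0, 4.0], "B": [-2.0, 1.0, -1.0], "C": [0.0, 2.0, -3.0]}
probs = [win_probability(m) for m in polls.values()]
histogram = seat_distribution(probs)
print([round(x, 3) for x in histogram])
```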

We also calculate a Meta-Margin, defined as how much all poll margins would have to shift to make Senate control a perfect tossup between Democrats-plus-Independents and Republicans. Because a 50-50 split in the Senate is highly likely to be tiebroken by Democratic Vice-President Tim Kaine, the Meta-Margin calculates a dividing point that usually lies somewhere between 49 and 50 Democratic-plus-Independent seats.
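One way to picture the Meta-Margin definition is as a root-finding problem: find the uniform shift that makes control a 50-50 proposition, then flip its sign. The sketch below does this by bisection. The function names, the normal-distribution win probabilities, and the example numbers are all hypothetical, and `seats_needed` is a plain integer here, whereas the dividing point described above can fall between 49 and 50.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def control_probability(margins, sems, safe_dem_seats, seats_needed, shift=0.0):
    """P(Democrats+Independents reach seats_needed) after shifting every margin."""
    dist = [1.0]
    for margin, sem in zip(margins, sems):
        p = normal_cdf((margin + shift) / sem)
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)
            new[k + 1] += mass * p
        dist = new
    wins_needed = seats_needed - safe_dem_seats
    return sum(mass for k, mass in enumerate(dist) if k >= wins_needed)

def meta_margin(margins, sems, safe_dem_seats, seats_needed,
                lo=-30.0, hi=30.0, tol=1e-6):
    """Bisection for the shift that gives a 50% control probability.

    Control probability rises as margins shift toward Democrats, so the
    tossup shift is bracketed by [lo, hi]. The Meta-Margin is minus that shift.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if control_probability(margins, sems, safe_dem_seats,
                               seats_needed, shift=mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return -0.5 * (lo + hi)

# Hypothetical inputs: five tracked races, 47 seats assumed safe, 50 needed.
example_mm = meta_margin([4.0, 1.5, 0.5, -1.0, -3.0],
                         [1.5, 1.0, 2.0, 1.0, 1.5],
                         safe_dem_seats=47, seats_needed=50)
print(round(example_mm, 2))
```

A positive result means margins would have to move toward Republicans to reach a tossup, i.e. Democrats are currently ahead in the race for control.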

The Senate Meta-Margin is the key parameter for making a November prediction. I use it to ask: on Election Day, will the Meta-Margin be positive (Democratic control) or negative (Republican control)?

First, I make a prediction of where the Meta-Margin will go. This prediction requires knowledge of how fast the Meta-Margin will drift, and whether the average amount of drift has a limit.

Comparing late August with October in past years, the median state-by-state change in Senate polls was

  • 2008: toward Democrats by 4.0% (SD across states=4.9%)
  • 2010: toward Republicans by 2.3% (SD across states=4.7%)
  • 2012: toward Democrats by 3.6% (SD across states=5.1%)
  • 2014: toward Republicans by 1.7% (the Meta-Margin; no SD was calculated)

The SD of these four median changes is 3.4%. Combining this with the additional possibility of an error between final polls and November outcomes, the overall 1-sigma range of movement is +/-4.5%. For modeling change over time, I let drift grow to this ceiling at a rate of 0.6%^2/day.
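The arithmetic in this paragraph can be checked with a short Python sketch. Two things here are assumptions rather than statements from the post: that the rate 0.6%^2/day is a variance added per day (so the drift SD grows as the square root of time until it hits the ceiling), and that the between-year SD and the poll-to-outcome error combine in quadrature. The size of the poll-to-outcome error is not given in the text, so the value computed below is only what the 4.5% total would imply under that assumption.

```python
import math
import statistics

# Median late-August-to-October changes from the list above (+ = toward Democrats).
median_changes = [4.0, -2.3, 3.6, -1.7]
sd_between_years = statistics.stdev(median_changes)  # sample SD, about 3.4

CEILING_SD = 4.5   # overall 1-sigma range quoted in the post, in points
DRIFT_RATE = 0.6   # assumed to mean variance growth in points^2 per day

def drift_sd(days_until_election):
    """1-sigma drift in the Meta-Margin, growing with time up to the ceiling."""
    days = max(days_until_election, 0)
    return min(math.sqrt(DRIFT_RATE * days), CEILING_SD)

# Implied poll-vs-outcome error if the two sources add in quadrature (assumption).
implied_poll_error = math.sqrt(CEILING_SD**2 - 3.4**2)  # about 2.9 points

for days in (0, 7, 30, 70):
    print(days, round(drift_sd(days), 2))
```

Under this reading the ceiling is reached a little over a month out, so in late August the drift term sits at its full 4.5 points.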

Notice a pattern in the 2008-2014 data: In Presidential election years, the movement was toward Democrats – and President Obama was re-elected. In midterm years, the movement was toward Republicans. This type of pattern is driven by two causes: national elections roughly follow the Presidential race in on-years, and penalize the President’s party at midterms; and Democrats have bad turnout in midterm elections. This year, with Hillary Clinton favored to win the Presidency, Senate Democrats are likely to do better than today’s polling conditions would indicate.

This rosy outlook for Democrats is the mirror image of 2014. Two years ago, I assumed symmetric random drift, leading me to make a prediction in August 2014 that was excessively favorable to Democrats. So…how should I introduce a bias into the calculation to favor Democrats this year?

My original plan, which the August 30 update above replaces, was to avoid adding an assumption of asymmetric drift and instead to set a Bayesian prior using expert opinion. That would have been a departure from my usual rule of using polls only. Over at Larry Sabato’s Crystal Ball, adding up all their ratings leads to the conclusion that Democrats have a slight advantage in retaking Senate control. I converted this to a quantitative prior as follows:

In all non-“safe” races, count a race as a probability of 0.8 when it is “Likely,” and 0.6 when it is “Lean.” That leads to an average outcome of 50.4 Democratic+Independent seats. Each one of these races also comes with an uncertainty that can be calculated using binomial statistics, using the same probabilities.

Then comes the question of how to combine these single-state uncertainties. The various races could all vary together, the “wave election” assumption; or they could be independent of one another, the “all politics is local” assumption. To combine the uncertainties, I take a midpoint between the two to get an estimate of 50.4 +/- 4.0 seats. Using a factor of 1.7% Meta-Margin per Senate seat, I convert this to a prior of Democrats +1.9 +/- 6.8%.

The prior has to be asymmetric: a “coattail” effect in Presidential years, and a “throw the bums out” effect in midterm years. The coattail effect can go in either direction, depending on who wins the Presidency. Our Presidential model’s prior has a Clinton win probability of 71%, and a Trump win probability of 29%. So we have a 71% probability that the Senate Meta-Margin will move toward Democrats by 3.8%, and a 29% probability of moving toward Republicans by 5.5%. The latter number is greater because Vice-President Mike Pence would be the tie-breaking vote. The weighted sum of these possibilities gives an average post-August move of 1.1% toward Democrats.
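Written out, the weighted sum uses only the numbers quoted above, with movement toward Democrats counted as positive:

```latex
\[
0.71 \times (+3.8\%) \;+\; 0.29 \times (-5.5\%)
  \;=\; 2.698\% - 1.595\%
  \;=\; 1.103\%
  \;\approx\; +1.1\% .
\]
```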

Throughout August, the average Senate Meta-Margin was D+1.8%. The Senate prior would then be 1.1% above this, or D+2.9%. Using an overall SD on the prior of 7.0%, the prior probability of a Democratic-controlled Senate is 65%.
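The step from "D+2.9% with an SD of 7.0%" to a probability is a tail-area calculation: the chance that the Meta-Margin ends up above zero. The post does not say which distribution was used, so the sketch below shows two candidates, both assumptions: a normal distribution, and a heavier-tailed Student t with 3 degrees of freedom (chosen here only as an example of a fat-tailed alternative).

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def t3_cdf(x):
    """Closed-form CDF of Student's t with 3 degrees of freedom."""
    u = x / math.sqrt(3.0)
    return 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi

august_meta_margin = 1.8   # D+1.8%, from the post
expected_move = 1.1        # toward Democrats, from the weighted sum
prior_sd = 7.0             # from the post

prior_mean = august_meta_margin + expected_move   # D+2.9%
z = prior_mean / prior_sd

p_normal = normal_cdf(z)   # about 0.66
p_t3 = t3_cdf(z)           # about 0.65

print(round(p_normal, 3), round(p_t3, 3))
```

The normal version gives roughly 66% and the heavier-tailed version roughly 65%, so either is consistent with the quoted figure to within rounding.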

This prior matters when the poll-based uncertainty is large, early in the campaign season. It has less impact as the maximum amount of random drift diminishes in the weeks ahead.

Note that today, the prior doesn’t matter at all, since it aligns very well with the polling snapshot. Of course, conditions can change.

Finally, I combine the random-drift calculation and the prior using the MATLAB script Bayesian_November_prediction.m. The ultimate output is PEC’s November Senate control probability, which you can see over at The Upshot. It is drawn from the second column of our file Senate_D_November_control_probability.csv.
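For readers who want a feel for what combining a drift distribution with a prior does, here is a minimal Python sketch. It is not a port of Bayesian_November_prediction.m, whose contents are not shown here. It assumes both the drift distribution and the prior are normal, in which case their product is again normal with a precision-weighted mean. The function names are hypothetical.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def combine(snapshot_mm, drift_sd, prior_mean, prior_sd):
    """Normal-normal combination: precision-weighted mean and combined SD."""
    w_drift = 1.0 / drift_sd**2
    w_prior = 1.0 / prior_sd**2
    post_mean = (w_drift * snapshot_mm + w_prior * prior_mean) / (w_drift + w_prior)
    post_sd = math.sqrt(1.0 / (w_drift + w_prior))
    return post_mean, post_sd

def november_control_probability(snapshot_mm, drift_sd, prior_mean, prior_sd):
    """P(Meta-Margin > 0 on Election Day) under the normal-normal assumption."""
    post_mean, post_sd = combine(snapshot_mm, drift_sd, prior_mean, prior_sd)
    return normal_cdf(post_mean / post_sd)

# Illustrative inputs: the prior from the post, and a snapshot placed at the
# August average of D+1.8% (the snapshot on any given day will differ).
far_out = november_control_probability(1.8, 4.5, 2.9, 7.0)
close_in = november_control_probability(1.8, 1.0, 2.9, 7.0)
print(round(far_out, 3), round(close_in, 3))
```

As the drift SD shrinks close to the election, the polling snapshot carries almost all the weight and the prior has little pull on the result.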

Separate from the overall party-control prediction, I also calculate individual November Democratic win probabilities. These are given in the second column of Senate_stateprobs.csv, and reflect random drift only. They are also listed at The Upshot.
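A drift-only state probability can be sketched as follows. The assumption, not stated in the post, is that today's polling uncertainty (the SEM) and the remaining drift add in quadrature, and that a normal distribution converts the result to a probability. The function name and inputs are hypothetical.

```python
import math

def november_win_probability(median_margin, sem, drift_sd):
    """P(Democrat wins in November) from today's margin plus random drift only."""
    total_sd = math.sqrt(sem**2 + drift_sd**2)
    if total_sd == 0.0:
        # No uncertainty at all: the outcome is decided by the sign of the margin.
        return 1.0 if median_margin > 0 else (0.0 if median_margin < 0 else 0.5)
    z = median_margin / total_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# A hypothetical 3-point lead looks much less certain once drift is included.
today = november_win_probability(3.0, 1.5, 0.0)
november = november_win_probability(3.0, 1.5, 4.5)
print(round(today, 3), round(november, 3))
```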