Princeton Election Consortium

A first draft of electoral history. Since 2004

Home stretch analysis of 2009 special elections: NJ-Gov, NY-23

November 3rd, 2009, 4:02am by Sam Wang


The polls in the Corzine (D) – Christie (R) – Daggett (I) race have been very close. So it’s time for the variance minimization tool that I advocated at the end of the 2008 Presidential race. Using that, I come up with the following conclusions:

Since 10/23 (i.e. the last 10 days), the race has been static. Polls spanning that entire period give
Final margin: Christie over Corzine by 1.0 +/- 1.0% (68% CI, 16 polls).
3-way outcome: Christie 45.5%, Corzine 44.5%, Daggett 10.0%.
Christie win probability: 83%, i.e. 5:1 odds.
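
As a rough check on how a margin and its error bar turn into a win probability, here is a minimal sketch. It assumes the uncertainty in the margin is normally distributed with a standard deviation equal to the 68% error bar; the post does not state which distribution was actually used, and the function name `win_probability` is made up for illustration.

```python
from statistics import NormalDist

def win_probability(margin, sigma):
    """P(true margin > 0), assuming the estimate is normal(margin, sigma).

    The normal shape is an assumption of this sketch, not something
    stated in the post.
    """
    return 1.0 - NormalDist(mu=margin, sigma=sigma).cdf(0.0)

# Margin and error bar quoted above: Christie by 1.0 +/- 1.0 points.
p = win_probability(1.0, 1.0)
odds = p / (1.0 - p)
print(f"win probability {p:.1%}, odds about {odds:.1f}:1")
```

Under that assumption the result is about 84%, close to the 83% quoted above; a distribution with heavier tails would give a slightly lower number.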

More details, and analysis of the NY-23 race, after the break.

As usual, I use data from the fine obsessives over at Pollster.com: NJ-Gov. (Note that they omit a late update to a Fairleigh-Dickinson poll on the grounds that it pools data over 11 days, 7 of which were previously reported. Including that poll does not change my bottom line.)

Last November, I suggested that variation in individual measurements arises from two sources: (1) sampling repeatedly from a fixed distribution, and (2) changes in the distribution. On average, sampling from a fixed distribution gives the same standard deviation, independent of the number of samples taken. If the distribution shifts or changes shape, then the standard deviation would be likely to change.

If we plot the standard deviation of a standard quantity (in this case, the Corzine-Christie margin) over various time windows from Date X until Election Eve, allowing X to vary, the standard deviation should stay approximately constant for periods during which the race did not shift – but rise steadily in periods of true change. The plot looks like this:

[Figure: nj-gov-09]
Please pardon the confusing x-axis label. It indicates the number of the first poll used. There were 48 polls, so the leftmost point is the SD of polls #1-48, the next point is the SD of #2-48, and so on.
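
For readers who want to try this on their own data, the curve described above can be sketched in a few lines. The margins below are invented for illustration (they are not the NJ-Gov polls), and the `min_polls` guard is my own addition to keep very short windows from producing a spuriously small SD.

```python
from statistics import stdev

def trailing_sd(margins, min_polls=5):
    """SD of polls #start..#last for each start (1-indexed, oldest first).

    Windows shorter than min_polls are skipped; that cutoff is an
    assumption of this sketch, not part of the original method.
    """
    n = len(margins)
    return [(start + 1, stdev(margins[start:]))
            for start in range(0, n - min_polls + 1)]

def min_sd_start(margins, min_polls=5):
    """Poll number at which the trailing-window SD is smallest."""
    return min(trailing_sd(margins, min_polls), key=lambda pair: pair[1])[0]

# Hypothetical margins in points: a drifting race that then settles down.
margins = [8, 7, 6, 5, 4, 3, 1, 2, 0, 1, 2, 1, 0, 1]
curve = trailing_sd(margins)
best = min_sd_start(margins)
for start, sd in curve:
    print(f"polls #{start}-{len(margins)}: SD = {sd:.2f}")
print("minimum SD starts at poll #", best)
```

In this toy series the SD is large while the window still includes the drifting early polls and levels off once the window covers only the stable stretch.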

The plot starts to take off at #44 (Quinnipiac, 10/27-11/1). It might be that polls after that, taken over the weekend, show more volatility. But there’s not enough data to tell for certain, and in any case these polls still give a median of Christie by 2 +/- 2%. So instead I will go back to the minimum SD at #33 (PPP, 10/23-26). That set of 16 polls gives the median and error bar (estimated using this method) listed above.
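
One way to get a median and error bar from a set of polls is sketched below. It assumes a robust estimate built from the median absolute deviation (MAD), scaled by 1.4826 to match a normal SD and divided by the square root of the number of polls. That is a common choice, but I am labeling it an assumption here rather than a description of the linked method, and the margins are invented.

```python
from math import sqrt
from statistics import median

def median_with_error(values):
    """Median plus an estimated standard error based on the MAD.

    Assumption of this sketch: SE ~ 1.4826 * MAD / sqrt(N), which
    matches the usual SEM when the values are roughly normal.
    """
    values = list(values)
    med = median(values)
    mad = median([abs(v - med) for v in values])
    sem = 1.4826 * mad / sqrt(len(values))
    return med, sem

# Hypothetical margins in points, not the actual NJ-Gov polls.
example = [1, 2, 0, 1, 2, 1, 0, 1, 3]
med, sem = median_with_error(example)
print(f"median margin {med:.1f} +/- {sem:.1f}")
```

The median and MAD are less sensitive to a single outlying poll than the mean and SD, which matters when pollsters differ in method.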

It’s right down to the wire – the Christie-Corzine margin is not statistically distinguishable from zero. But if we believe that these polls are representative of voters, the odds do favor Christie.

Considering that Republican-leaning votes are being split by Chris Daggett, a Republican running as an Independent, it is clear that Governor Corzine is not a popular guy. If Corzine pulls this one out of the fire, it will be thanks in part to Daggett. Christie has also been hurt by his own weaknesses: leaving the scene of an accident in which he hit a motorcyclist with his car, the lack of a compelling message, and abuse of his position as U.S. Attorney. We’ll see tomorrow.

For other statistical analysis and color commentary, I refer you to Fivethirtyeight by Nate Silver (Christie win probability 57%) and Stochastic Democracy by David Shor (Christie win probability 53%). They and I point in the same direction, toward Christie. But they are less certain, perhaps in part because they use a shorter time window. I consider this a test of my variance minimization (VM) method.

Postscript: The last 16 polls sampled 12,860 respondents. Binomial sampling suggests that from such a sample, it ought to be possible to get a 68% confidence interval of +/-0.4%. Clearly there are additional uncertainties having to do with methodological differences among pollsters and whether n=16 is the right number of polls to use. But it’s not crazy to imagine making an even more accurate estimate than what I have given.
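
The binomial figure is easy to reproduce. The sketch below treats all 12,860 respondents as one simple random sample, which is an idealization, and the function names are mine. It computes the standard error of a single candidate's share and, for comparison, of the margin between two candidates under a multinomial model.

```python
from math import sqrt

def share_se(p, n):
    """Standard error of one candidate's share from n respondents."""
    return sqrt(p * (1.0 - p) / n)

def margin_se(p1, p2, n):
    """Standard error of (p1 - p2) when both come from the same
    multinomial sample: Var = (p1 + p2 - (p1 - p2)**2) / n."""
    return sqrt((p1 + p2 - (p1 - p2) ** 2) / n)

n = 12860                               # respondents in the last 16 polls
se_share = share_se(0.5, n)             # worst case, p = 0.5
se_margin = margin_se(0.455, 0.445, n)  # shares from the 3-way estimate
print(f"share SE:  {100 * se_share:.2f} points")
print(f"margin SE: {100 * se_margin:.2f} points")
```

The share SE comes out near 0.44 points, matching the +/-0.4% quoted above. The SE of the margin itself is roughly twice that, a bit over 0.8 points, which is still below the +/-1.0% error bar.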

NY-23: The Conservative candidate, Doug Hoffman, is favored over the Democrat, Bill Owens. Much of the excitement here comes from the right-wing push for Hoffman, who does not live in the district, and from the fact that Dede Scozzafava (R) has dropped out and endorsed Owens. However, take a look at the polling data, which show that Owens’s support has stayed in the 33-36% range. Basically, Republican voters are breaking toward the Palin-backed Hoffman, which is not a surprise. Considering that Scozzafava wants Owens to win, dropping out was a bad move. She would have been better off staying in the race. For more commentary, read Mark Blumenthal.

The Republican base’s rejection of the moderate Scozzafava looks like a sign that the party is being pushed further to the right by its base, which is probably not a good move for them no matter what happens in this race.

For more on NY-23, see polls at Pollster.com and reporting at TalkingPointsMemo.

Tags: Politics

4 Comments so far ↓

  • Clay Shentrup

    Sam,

    You said:
    “Paul, my first reaction is that it is hard to properly gauge the relative advantages and disadvantages of IRV, Condorcet, and other voting systems. What I mean is that the arguments all seem good, but it is hard to tell which ones would tend to be more important.”

    That problem is solved by using election simulations to calculate Bayesian regret for the respective methods. And that’s precisely what was done by Warren D. Smith, the Princeton math Ph.D. behind most of the material at ScoreVoting.net.

    Here are some sample Bayesian regret calculations, showing the pretty stark superiority of score voting (aka range voting).

    http://scorevoting.net/UniqBest.html
    http://scorevoting.net/StratHonMix.html

  • Sam Wang

    Paul, my first reaction is that it is hard to properly gauge the relative advantages and disadvantages of IRV, Condorcet, and other voting systems. What I mean is that the arguments all seem good, but it is hard to tell which ones would tend to be more important.

    One statistically-based solution would be to simulate/emulate past races with three or more candidates. Then define outcomes that would be considered grossly unfair, and see which method is less likely to produce the unfair outcome in many simulated scenarios.

    My proposal contains an element of subjectivity. It also contains the problem that people will start developing specific opinions based on their preferences in past elections, as opposed to a desire for ideal outcomes. But it’s a project worth tackling.

    This may not be what you meant by a statistical approach, but it seems like it might be useful. It might be the case that many of these systems would constitute enough of an improvement over “plurality-takes-all,” that the big win comes from simply picking one of the good-enough solutions to the problem.

  • Paul

    As we undertake our first day ever of Instant Runoff Voting here in Minneapolis, I wonder if you have any thoughts on the tradeoffs of various vote-counting methods?

    I’m excited about ditching plurality voting myself, but skeptical of IRV. Approval voting, range voting, and various Condorcet methods seem preferable to me. The study of voting systems has traditionally belonged to the field of discrete math. Can statistics shed any new light?

  • Victoria

    Hope you’re right about Christie winning.
    Corzine’s unfavorables couldn’t be any worse.