In the home stretch, I wrote that midterm polling is far less accurate than in Presidential years. Today, in

From a polling standpoint, estimating turnout is likely to be a major source of systematic error. Here are some details of this year’s GOP “bonus.”

**In the Senate, the GOP overperformed by 5 percentage points**

Republican performance above and beyond pre-election Senate polls can be estimated in two ways.

The first is pretty simple: just calculate the difference between outcomes and polls for states that were in contention. This gives a “bonus” for **Republicans of 5.2 ± 1.1% (mean ± SEM; SD=4.2%)**. Comparing with past Senate polling errors, this is the largest error in the FiveThirtyEight database going back to 1990. The next-largest error was a 4.9% bonus for Democrats in 1998.
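A minimal sketch of the mean ± SEM calculation, for readers who want to run the same kind of check. The function names are illustrative, and the per-state errors are not listed in this post, so none are supplied here. The second helper only backs out the number of races implied by the reported SD and SEM, assuming SEM = SD/√n; with SD = 4.2% and SEM = 1.1% that comes to roughly 15 races.

```python
import math
import statistics

def bonus_stats(errors):
    """Return (mean, SD, SEM) for per-state errors (outcome minus poll margin).

    Uses the sample SD (n - 1 denominator) and SEM = SD / sqrt(n).
    """
    n = len(errors)
    if n < 2:
        raise ValueError("need at least two races")
    mean = statistics.mean(errors)
    sd = statistics.stdev(errors)
    return mean, sd, sd / math.sqrt(n)

def implied_race_count(sd, sem):
    """Back out n from a reported SD and SEM, assuming SEM = SD / sqrt(n)."""
    return (sd / sem) ** 2

# Sanity check on the figures quoted above (SD = 4.2, SEM = 1.1).
print(round(implied_race_count(4.2, 1.1), 1))
```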

The second way is to use the Meta-Margin, the amount of swing needed to create a tie for control. This can be calculated by arranging the races in order of winning margin and finding the “tipping-point” margin between the 50th and 51st GOP wins. That’s a little ambiguous because the Louisiana runoff has not occurred yet. But if we assume that the tipping-point margin is halfway between Georgia and Louisiana, it would be R+5.8%. Compared with the pre-election Meta-Margin of R+1.0%, that’s a **Republican bonus of 4.8%**.

I note that one state where turnout cannot account for the polling error is Iowa. There, turnout was quite robust – and Joni Ernst (R) outperformed by about six percentage points. This is worth examining. Nate Cohn at the NYT suggests it’s a sign that Iowa is newly competitive. That is certainly possible. One possible reason is early voting in that state, an area where Republicans did much better than in 2010.

**Similar overperformance in the House**

In pre-election polls, the generic Congressional ballot showed a Republican lead of 1.5 percentage points. The House vote isn’t done being counted yet, but David Wasserman and his team at the Cook Political Report currently have Republicans leading the national popular vote by 6.8%. (They also confirm that turnout was exceptionally low, down by about 42% from the 2012 election.) The difference between polls and the actual House vote is **a bonus for Republicans of 5.3%**, similar to the Senate estimates.

Republicans gained at least 14 seats, with 6 more races outstanding (in these six, 3 Democrats lead and 3 Republicans lead). If those leads hold, the resulting gain of 17 seats is larger than expected (0 to 12 seats gained). However, it is smaller than 26 seats, which is the median post-World War II midterm gain by the President’s opposition party.

Finally, as I wrote last Wednesday, gubernatorial races showed a smaller bonus for Republicans, about 2 percentage points. I don’t know yet why this is smaller. One possibility is that voters are more likely to cross party lines when voting for governor. Many states have a history of electing governors from the non-dominant party. Another possibility is that when pollsters are focused on a particular state’s race, they are better at gauging likely voters than when they have to build a model that applies to many states at once.

It occurred to me to ask whether the direction of the correlation between turnout and polling bias was always the same. The answer to that is no. In 2006 and 2010, Democrats did *better* than expected in low-turnout states. The absolute magnitude of pollster error was similar – but the direction was different. In those years, it was Republicans who tended to show up less often than expected. I’ll post a graph of that later.

Currently, I think the general principle is that pollsters have a tough time identifying people who are marginally committed to voting. In 2010, Republicans seemed angry with the President and Congressional Democrats – but evidently not as mad as pollsters thought.

One interesting note: as David Wasserman points out, Republicans won the popular vote in 2010 by 6.6%, nearly the same margin as in 2014. Yet Republicans led the generic Congressional ballot in 2010 by over 10 percentage points. This year is a particularly vivid example of the difference that voter turnout can make.

>>>

P.S. Later I will do a postmortem examination of long-term forecasting. It’s not far from my mind. More later.

As I wrote in *The New Republic*, last night’s performance by the GOP was remarkable. In close Senate races, Republicans outperformed polls by an average of 5.3 percentage points. Prime examples of that effect could be seen with Republican wins in Kansas and North Carolina, two races that went against pre-election polls.

In gubernatorial races, Republicans outperformed polls by nearly 2 percentage points on average. This was enough to put Paul LePage of Maine (tied), Rick Scott of Florida (tied), and Bruce Rauner of Illinois (Quinn +2.0%) over the top. All in all, Republicans had an excellent night.

Historically, midterm polling is much more prone to large biases than polling in Presidential years. In 2010, Democrats benefited; in 2014, it was Republicans. Of six Senate races that were polling within three percentage points, two were won by the lagging candidate. That is entirely in line with past results. Added to the median poll-based snapshot of 52 Republicans, 48 Democrats+Independents, the result could be as large as a convincing 54-46 majority.

Before the election, I pointed out the possibility that polling bias could go in either direction. It is likely that pollsters face a tough challenge in identifying likely voters in an off-year.

With control of the Senate so closely fought, even a small bias put into question who would control the chamber. And, as I wrote, it also opened the possibility of a GOP blowout. I said we didn’t know what would happen. Maybe we can call that my Peggy Noonan moment.

**Brier scores**

Over the weekend I suggested Brier scores as a way to compare predictions. Aggregators and analysts did worse than in 2012, when polls did not miss any races (PEC Brier score, 0.01; scores close to zero are considered good).

I used final probabilities as listed at The Upshot to calculate Brier scores. The lowest (and therefore best) score came from Drew Linzer (DailyKos Elections), who took a Bayesian polls-only approach and ended up with a Brier score of 0.10. Coming in second was The Washington Post with a mostly-polls approach, at 0.12. Next came HuffPost, FiveThirtyEight, and Betfair at 0.14, followed by The Upshot at 0.15. And finally we have PEC, with 0.18. Although the number of “misses” (i.e. being on the wrong side of 50% probability) was no worse than the other sites, we were done in by an across-the-board lack of certainty, which we predicated on the unreliability of midterm polls. Congratulations to Drew Linzer!

*Postscript: as pointed out by commenter Paul, Drew Linzer shines even more if his calculation’s performance in the several months prior to the election is included.*

*P.P.S.: Doug Rivers at YouGov has evaluated his own organization’s miss of actual-voter behavior, as well as that of other polling organizations. The findings seem consistent with what I’ve reported here.*

**12:10am:** Tonight’s performance by the GOP has been quite remarkable. In close Senate races, Republicans seem to be outperforming polls by around 5 percentage points. That goes a long way toward explaining what is happening in Virginia. In close gubernatorial races, Republicans are outperforming polls by about 3 percentage points.

I did say that historically, midterm polling can be off in either direction by a median of 3 percentage points – far worse than Presidential years. Tonight is certainly consistent with that.

**11:30pm:** Ernst will win Iowa. Other than New Hampshire, it’s looking like a sweep of close races by Republicans. Counting CO, GA, IA, KS, and NC gets to 52. Alaska and Louisiana are still outstanding, but that’s icing on the cake for the GOP.

**11:15pm:** Republican candidates are heading toward victory in Kansas and Colorado. Still outstanding are Virginia (possibly D), North Carolina (probably R), and Iowa (moving fast, heading toward R).

**10:55pm:** Republicans are overperforming polls substantially. The exact amount varies, but in key states (KS, VA, GA Senate; WI, GA governor) the bonus is remarkably large.

**10:45pm:** Rick Snyder (R-MI-Gov) wins.

**10:20pm:** Nathan Deal (R-GA-Gov) wins. In this and other races, Republican governors are looking good.

Senator Pat Roberts (R-KS) is leading. If Orman loses that one, then I think it’s over for Senate Democrats.

**10:08pm:** Scott Brown (R-NH-Sen) loses by an estimated 3-4%.

**10:00pm:** Update on projected margins in early states: KY, McConnell +14%. NH, Shaheen +4%. NC…man, that’s a close race.

Very few governors’ races called yet…but incumbents are running stronger than indicated by surveys.

**9:45pm:** Wyoming Governor and Senate for the R candidates. This is in the category of taking out the recycling…not exciting to talk about, but gotta do it.

**9:30pm:** NYT calls NH-Sen for Shaheen. The margin’s looking like 3-4%, a bit better than the pre-election 2%.

**9:00pm:** Tom Wolf (D) wins Gov-PA. Susan Collins (R) wins Sen-ME. Abbott (R) wins Gov-TX.

The Upshot is estimating Virginia Senate at Warner (D) by 1%. Looks similar over here. A nail-biter for sure. This will depend on large-population counties: Virginia Beach, Prince William.

**8:40pm:** The Upshot has projected counts. For now, use those for your Geek’s Guide. Shaheen (D-NH) around +5% and McConnell (R-KY) around +13%, both ahead of their pre-election polls. Ambiguous for estimating Delta.

**8:27pm:** Reader Forrest asked me how The Upshot estimates vote share from partial returns. I can’t say what they are doing, but look at Jay Boice’s HuffPollster calculation. Basically take the prior history of the state, county by county (or whatever level of granularity you have available). Then slide over all the counts in past comparable elections, and see how each county would have to break in order to reach a 50-50 tie. Use that as an over/under, i.e. calculate whether a candidate is over/underperforming that expectation. Then do a weighted average across counties. That is an estimator of the margin between candidates.
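The over/under idea described above can be sketched roughly as follows. This is one reading of the description, not the actual method used by The Upshot or HuffPollster. It assumes a uniform swing: each county’s tie benchmark is its past margin minus the past statewide margin. The `County` class, its field names, and the choice of past vote totals as weights are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class County:
    name: str
    past_margin: float     # candidate A's margin in a comparable past race, in points
    past_votes: int        # total votes cast in that past race (used as the weight)
    current_margin: float  # candidate A's margin in tonight's partial count, in points

def estimate_statewide_margin(counties, past_state_margin):
    """Weighted average of each reporting county's over/under-performance.

    tie_benchmark is the margin the county would need to show for the
    state to end in a 50-50 tie, assuming a uniform swing (an assumption).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for c in counties:
        tie_benchmark = c.past_margin - past_state_margin
        over_under = c.current_margin - tie_benchmark
        weighted_sum += c.past_votes * over_under
        total_weight += c.past_votes
    if total_weight == 0:
        raise ValueError("no counties reporting")
    return weighted_sum / total_weight
```

Only counties that have reported anything should be passed in. Early on, that list is dominated by a few places, so the estimate is noisy.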

**8:17pm:** Virginia Senate race looking close. But recall that in 2012, Romney led Obama until around midnight. This is going to take a while.

**7:44pm:** Everyone, please note: rural areas tend to come in earlier. That means early returns will tilt GOP. That’s not so useful for projecting results.

**7:38pm:** Kentucky’s projecting ~~around McConnell+7% at the moment, close to polls~~. Actually, less sure. Stand by…

Virginia counts are screwy. Might be technical errors on the ground there.

**7:06pm:** CNN has called Kentucky for McConnell (R). It is too early to say whether he ran ahead of or behind the pre-election polls of McConnell +7.5 +/- 0.6%. If he does diverge from them, it will be by a few percentage points either way.

Commenter Chandra says CNN has McConnell+17%. Don’t pay attention to those raw counts! Uncorrected, that’s of no use in estimating the final margin. I think it will come in super-close to pre-election polls.

Despite the certainty of pundits, we actually don’t know who will win the Senate! In

From 2004 to 2012, only thirteen Senate races have had margins of less than three percentage points in the week before the election. Of these, four were won by the trailing candidate. One more, the Florida 2004 race, was tied in the polls, and was eventually won by the Republican, Mel Martinez, by 2 percentage points. Scoring that one as half correct, the overall rate of wins by a front-runner is 65%, a bit better than chance.

In light of that, the probability that all six close Senate races (AK, CO, IA, KS, NH, and NC) will be won by the candidate in the lead is only 7%. A wrong call is almost inevitable. We should not be surprised to see one to three races won by the candidate who trails this morning. This allows us to hazard a guess as to the most probable path to Democratic retention of the Senate (which PEC currently has at 35%).
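The arithmetic behind those numbers can be sketched as follows. The sketch assumes the six races are independent and that each front-runner wins with the same 65% probability. It also reads the tied Florida 2004 race as one of the thirteen, which is the reading that reproduces 65%.

```python
from math import comb

# Historical record: 13 close races, 4 upsets, 1 tie scored as half correct.
historical_rate = (13 - 4 - 1 + 0.5) / 13   # 8.5 / 13, about 0.65

FRONT_RUNNER_WIN_RATE = 0.65  # rounded rate quoted in the text
N_CLOSE_RACES = 6             # AK, CO, IA, KS, NH, NC

def prob_upsets(k, n=N_CLOSE_RACES, p_win=FRONT_RUNNER_WIN_RATE):
    """Binomial probability of exactly k upsets among n independent races."""
    p_upset = 1.0 - p_win
    return comb(n, k) * p_upset ** k * p_win ** (n - k)

p_no_upsets = prob_upsets(0)                             # all six leaders win
p_one_to_three = sum(prob_upsets(k) for k in (1, 2, 3))  # one to three upsets

print(f"historical front-runner rate: {historical_rate:.3f}")
print(f"P(all six leaders win):       {p_no_upsets:.3f}")
print(f"P(one to three upsets):       {p_one_to_three:.3f}")
```

Real races are not independent, since a shared polling bias moves them together, so the spread of outcomes is wider than this binomial picture suggests.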

Last night, I gave poll-based probabilities for Senate and governorships. Republicans are favored to take over, but what is the likeliest route to Democrats retaining control? To estimate that, I will use the margin/standard-error-of-the-mean (margin/SEM) ratio (see table at the end of this post) as a measure of which margins are flakiest. From this, upsets seem likely in Iowa (margin/SEM=1.0/0.8=1.3), followed by Alaska (margin/SEM=1.0/1.7=0.6). If they flip, that gives:

**Democrats+existing Independents:** 45 safe seats plus NH, NC, IA, and AK. Total: 49 seats.

**Republicans:** 45 safe seats plus AR, KY, LA (runoff), GA (maybe runoff), and CO. Total: 50 seats.

**Kansas:** Orman, who would make the 50th vote for the Democrats+Independents – or facilitate a power-sharing arrangement.

I note that the margin/SEM ratio in New Hampshire is 3.1, suggesting I might escape having to eat a bug.

I’ll have more to say about errors in *The New Yorker* later today. In the meantime, where do you see underdogs winning?

**Useful links (will add as day goes on):**

HuffPollster: Senate Election Live-Tracker.

DailyKos Elections, hour-by-hour guide.

New York Times, The Upshot, Senate tracker.

Here are final polling snapshots for Senate races:

Put your own predictions in comments! Some more notes…

The calculations above will test the question of how well we can do with polls alone. As always, we did not do any house-effects corrections or fundamentals-based modeling. This is a polls-only snapshot.

*Technical notes:* The same methods were used as for the gubernatorial snapshots. The error bars are SEM of the polled demographic. To calculate win probability, I have incorporated an additional possible 2.5% error to account for polling error/bias. To see how many polls were used, see this spreadsheet.

**House:** Republicans win popular vote by 1.5 ± 2.0%, gain 8 ± 6 seats.

Please give your own predictions in comments. What surprises do you predict?

Here are final polling snapshots for gubernatorial races that are either close or likely to switch party control:

Put your own predictions in comments! Some more notes…

The calculations above are a benchmark for how well we can do with polls alone. No house-effects corrections or fundamentals-based modeling was done.

*Technical notes:* For each state, at least three polls were used. The number of polls was determined using the variance-minimization method. All data were taken from HuffPollster. The median (no more than one poll per pollster) and the estimated SEM are shown in the second column. The win probability was calculated assuming an additional possible 2.0% error in the home stretch to account for polling error or bias. This was converted to a probability using the *t*-distribution (3 d.f.). To see how many polls were used, see this spreadsheet.
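A minimal sketch of how a margin and SEM might be turned into a win probability along these lines. The notes above do not say how the extra 2.0% is combined with the SEM. Adding the two in quadrature is an assumption here, as are the function names. The *t*-distribution CDF for 3 degrees of freedom has a closed form, so only the standard library is needed.

```python
import math

def t3_cdf(t):
    """CDF of Student's t-distribution with 3 degrees of freedom (closed form)."""
    x = t / math.sqrt(3.0)
    return 0.5 + (x / (1.0 + x * x) + math.atan(x)) / math.pi

def win_probability(margin, sem, extra_error=2.0):
    """Probability that the poll leader wins, given a median margin and SEM in points.

    extra_error is the additional allowance for polling error or bias;
    combining it with the SEM in quadrature is an assumption of this sketch.
    """
    total_error = math.sqrt(sem ** 2 + extra_error ** 2)
    return t3_cdf(margin / total_error)
```

A tied race (margin 0) comes out at exactly 50%. The heavy tails of the 3-d.f. distribution keep even fairly large leads short of certainty.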

I’ll incorporate this into the *2014 Geek’s Guide*, hopefully by midday Tuesday. In the meantime, please give your own predictions in comments. What surprises do you predict?

My preferred measure is the Brier score. As I explain this concept, I’ll refer to some suggestions from FiveThirtyEight and Drew Linzer.

**Are “well-calibrated” probabilities enough?** FiveThirtyEight has suggested that probabilities should be well-calibrated, i.e. 50% probabilities should be correct 50% of the time, 75% probabilities should be correct 75% of the time, and so on.

This is sufficient if one thinks of prediction as being like gambling, i.e. the avoidance of money loss. The problem is that if I were to give win probabilities of 50% for every race, I could call that well-calibrated. But it would not be informative.

Ideally, we’d want a measure that rewards confidence, but does not reward random guessing – and really sticks it to you if you get a prediction wrong. There’s a simple measure that does this: the Brier score. Here’s how it works.

Basically, you express your win probability as a fraction (i.e. 100% is 1.0, and 0% is 0.0). Then score the outcomes as wins (1.0) and losses (0.0). Calculate the difference between the probability and outcome, and square it. That is a Brier score for one prediction. Average the scores for all your predictions to get your overall Brier score. The lowest score wins.

In this example, the person forecasting races A and B just gave 50% probabilities, which are basically random guesses. He/she ended up with a Brier score of 0.25. The person forecasting races C and D made more confident predictions, and ended up with a Brier score of 0.04.

And of course, two wrong predictions (let’s call that the Dick Morris score) would lead to a Brier score of 1.0, which is very high.
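The recipe above is short enough to write out in code. In this sketch the 50% guesses and the two flat-wrong calls follow the text. The 80% probabilities for the confident forecaster are an assumption, chosen because they reproduce a score of 0.04 when both calls are correct.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between win probabilities (0.0 to 1.0)
    and outcomes (1 for a win, 0 for a loss). Lower is better."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("need one outcome per probability")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)

# Coin-flip forecaster: 50% on both races, whatever happens.
coin_flip = brier_score([0.5, 0.5], [1, 0])

# Confident forecaster (assumed 80% on each race), both calls correct.
confident = brier_score([0.8, 0.8], [1, 1])

# Certain and wrong on both races.
certain_and_wrong = brier_score([1.0, 1.0], [0, 0])

print(coin_flip, round(confident, 2), certain_and_wrong)
```

Because the errors are squared, a confident miss costs far more than a hedge does, which is exactly the property asked for above.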

(Note that the Brier score is not the only way to go. A few weeks ago, reader Forrest Collman made a pitch for a logarithmic-scale evaluation. Check that out.)

The Brier score concept is fairly commonplace. I believe other aggregation sites will be using it for evaluation. In 2012, Rationality.org used Brier scores to evaluate our Senate predictions. We did considerably better than FiveThirtyEight, in large part because of two wrong calls by FiveThirtyEight (North Dakota and Montana). I am not certain we will do better this year – there are so many uncertain races. But we’ll try!

**How do we reward correct predictions made far in advance?** The truth is that all prognosticators should perform at similar levels on Election Eve. A better test is whether we made predictions that were ultimately correct, weeks or months *before* the election. Here is what Drew Linzer says:

A good election forecast zeroes in on the correct outcome as quickly as possible, without overreacting to daily noise http://t.co/vILBieob2q

— Drew Linzer (@DrewLinzer) October 30, 2014

I have to think about what the best measure would be. My first thought is to calculate an average Brier score over the entire campaign, starting in June. If anyone has further ideas, I’m all ears.

*Note: A previous version of this essay made incorrect statements about The Monkey Cage’s stance on how to evaluate probabilities. I regret my error. I am also told by John Sides that HuffPollster will be using Brier scores to evaluate predictions. Good for them!*

Coming into the home stretch, President Obama’s net approval/disapproval rating is at minus 8%. Not good…but 4% better than June. This is what candidates face as in-person voting starts tomorrow morning.
