(Updated to reflect the effect of Senator Stevens’s conviction today. A Begich win over Stevens would make a 59-41 split even more likely. -Sam)
The makeup of next year’s Congress is a simple sum of individual seat outcomes, and therefore poses an easier estimation problem than the Presidential race. Many other sites give only the average estimate. Here I will give a probability-based analysis that (a) estimates the likelihood of Democrats reaching a supermajority in the Senate, and (b) puts confidence intervals on the estimates. This projection will be updated as the election nears.
The Senate. Based on Pollster.com data, Democrats are headed for new wins or continued control in 57 seats (counting Sanders and Lieberman); Republicans, 38 seats. The remaining five races are uncertain: Alaska, Georgia, Kentucky, Minnesota, and Mississippi (B). Two of these states are polling-poor: in Alaska, the oldest of the three most recent polls dates from 10/6, a long time ago on the timescale of the corruption trial of Senator Ted Stevens (R); in Mississippi, it dates from 9/24. So the main weakness in the Senate meta-analysis is that polls lag behind current conditions.
The rule used here is the same as in the Presidential meta-analysis: start with the last 3 polls and the last week of polling, calculate a z-score from the margin, and convert it to a win probability. I use the t-distribution rather than a normal distribution, which statistics aficionados may appreciate. To allow for added uncertainty, I assume that between the last available polls and Election Day the margin may change by an unknown amount S of up to +/-2%.
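To make this concrete, here is a minimal sketch of the per-race calculation. The post does not spell out the exact formula, so the choices here are illustrative assumptions: the margin estimate is the median of the last three polls, its uncertainty is the standard error of those polls, and the drift term S is treated as uniform on +/-2 points and folded in as extra variance.

```python
# Minimal sketch of one race's win probability (illustrative assumptions noted above).
import numpy as np
from scipy import stats

def dem_win_probability(margins, drift=2.0):
    """margins: Dem-minus-GOP margins (in points) from the most recent polls."""
    margins = np.asarray(margins, dtype=float)
    n = len(margins)
    center = np.median(margins)                   # median margin
    sem = np.std(margins, ddof=1) / np.sqrt(n)    # standard error from the polls
    drift_var = (2 * drift) ** 2 / 12.0           # variance of S ~ Uniform(-drift, +drift)
    scale = np.sqrt(sem ** 2 + drift_var)
    # z-like score converted to a probability using the t-distribution (df = n - 1)
    return stats.t.cdf(center / scale, df=n - 1)

# Example: three recent polls showing D +3, D +1, D +5
print(round(dem_win_probability([3, 1, 5]), 2))   # roughly 0.9
```

With these example numbers the result lands near 0.9, in the same neighborhood as the Minnesota row of the table that follows; the exact value of course depends on the assumed error model.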
Under these assumptions, the win probabilities are
State | Dem | GOP | Median margin | Dem win probability |
Minnesota | Franken | Coleman | D +3% | 91% |
Alaska | Begich | Stevens | D +1% | 71% |
Mississippi | Musgrove | Wicker | R +2% | 17% |
Georgia | Martin | Chambliss | R +2% | 8% |
Kentucky | Lunsford | McConnell | R +4% | 3% |
The resulting probability distribution of Senate outcomes gives the following summary.
Current Senate predictions: 58-60 Dem/Ind, 40-42 GOP. The probability of falling within this range is 92% (96% w/Begich). The most likely outcome is 59-41, a pickup of 8 seats, with a probability of 49% (69%), approximately even odds. The probability of reaching 60 or more seats is 21% (23%), or 4-1 against.
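To show how per-race probabilities combine into a seat count, here is a sketch that treats the five contested races as independent Bernoulli outcomes on top of the 57 safe Dem/Ind seats and convolves them into a distribution. Because the table's probabilities are rounded, the output only approximates the figures quoted above rather than reproducing them exactly.

```python
# Sketch: convolve independent race outcomes into a Senate seat distribution.
import numpy as np

safe_dem = 57                                            # safe Dem/Ind seats from above
race_probs = {"MN": 0.91, "AK": 0.71, "MS-B": 0.17, "GA": 0.08, "KY": 0.03}

dist = np.array([1.0])                                   # P(0 wins) = 1 before any race
for p in race_probs.values():
    dist = np.convolve(dist, [1 - p, p])                 # add one Bernoulli(p) race

seats = safe_dem + np.arange(len(dist))                  # total Dem/Ind seats
print("P(59 seats)    =", round(dist[seats == 59][0], 2))
print("P(58-60 seats) =", round(dist[(seats >= 58) & (seats <= 60)].sum(), 2))
print("P(60 or more)  =", round(dist[seats >= 60].sum(), 2))
```

Re-running the convolution with the Alaska probability set to 1 approximately reproduces the parenthetical figures for the case in which Begich wins.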
Note: After I wrote this, Senator Stevens’s conviction on ethics charges was announced. However, the above conclusions are basically unaffected. The figures in parentheses above indicate the case in which Begich wins.
A moderate swing could alter these odds. It would take a swing of about 2 percentage points for the Democrats to reach 60 seats. Such a swing could come from 2% of undecided voters newly committing to the Democratic candidate, or from 1% of voters switching sides (each switched vote moves the margin by two points). I will update this calculation when new polls come in.
Previously I recommended particular races as being on a knife edge, but conditions have now shifted. In terms of resource allocation, current conditions indicate the following categories:
Lean Republican. These races are aggressive investment opportunities for Democrats, conservative for Republicans: Georgia (Martin v. Chambliss) and Kentucky (Lunsford v. McConnell).
Knife edge. Giving to these races provides the maximum leverage for both sides: Mississippi-B (Musgrove v. Wicker) and Alaska (Begich v. Stevens). They are listed on the ActBlue site and at the NRSC.
Lean Democratic. This race is a conservative investment opportunity for Democrats, aggressive for Republicans: Minnesota (Franken v. Coleman). It is no longer on a knife edge.
The House. Here polls are sparse, as few as one per competitive district. However, even if we cannot predict individual races, we can pool all districts to estimate the overall outcome and its uncertainty. This approach worked quite well in 2006.
Today’s data at Pollster.com give 158 strong R, 8 lean R, 24 toss-up, 10 lean D, and 235 strong D. Democrats lead in 13 out of 24 toss-up races. Therefore the median expectation is 158+8+11 = 177 Republican seats and 13+10+235 = 258 Democratic seats. If we assume that the 18 “leaner” seats have win probabilities of 0.8-0.9, and toss-ups have win probabilities of 13/24 or 11/24, binomial math gives a snapshot with confidence intervals (CIs):
Democrats 258 seats (68% CI 255-261, 95% CI 252-264),
Republicans 177 seats (68% CI 174-180, 95% CI 171-183).
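Here is a sketch of that binomial math. The leaders of the 18 leaner seats are assigned a win probability of 0.85, an assumed point inside the 0.8-0.9 range above, and the 24 toss-ups get a Democratic win probability of 13/24; the races are treated as independent, so the output should land close to, though not exactly on, the intervals quoted.

```python
# Sketch: House seat distribution by convolving independent contested seats.
import numpy as np

safe_dem = 235
# Dem win probability per contested seat: 10 lean D, 8 lean R, 24 toss-ups
races = [0.85] * 10 + [1 - 0.85] * 8 + [13 / 24] * 24

dist = np.array([1.0])
for p in races:
    dist = np.convolve(dist, [1 - p, p])

dem_seats = safe_dem + np.arange(len(dist))
cdf = np.cumsum(dist)

def central_interval(level):
    lo = dem_seats[np.searchsorted(cdf, (1 - level) / 2)]
    hi = dem_seats[np.searchsorted(cdf, 1 - (1 - level) / 2)]
    return int(lo), int(hi)

print("median Dem seats:", int(dem_seats[np.searchsorted(cdf, 0.5)]))
print("68% CI:", central_interval(0.68))
print("95% CI:", central_interval(0.95))
print("P(Dem >= 271):", round(float(dist[dem_seats >= 271].sum()), 4))
```

The Republican count is simply 435 minus the Democratic count, and the last line is the kind of event probability discussed below.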
The 2006 Congress came in at 233-202 and is now 235-199 (1 vacancy). The pickup for Democrats is therefore 23 seats (68% CI 20-26 seats, 95% CI 17-29 seats).
One benefit of this calculation is that the full probability distribution provides a means of estimating the probability of particular events, such as whether Democrats will get 271 or more seats in the House. Such events are traded on InTrade and other electronic markets. However, these sites only approximate the true odds. There are discrepancies, which I will write about later.
Dr Wang,
I have been thinking recently about how to evaluate the success of the model. Would only a final EV value (or, in this case, Senate and House seat count) outside the election-eve 95% CI indicate a problem, or would you only feel comfortable with a value within the 68% CI?
Excellent, this is just the kind of post I was looking for. I don’t think nearly enough coverage is given to the Senate and House races, so I’m glad to now see some juicy details.
gprimos1 – It would take a final EV count outside the 95% CI to make me think that polling data alone were insufficient to predict the result.
I also plan to take steps to reduce the size of the CI. In addition to the automatically updated history, I will make a final estimate of the Presidential race that may use more than one week of data.
Thanks for some insightful and intelligent analyses of the polls. Much appreciated. Here’s a question you probably haven’t been asked:
I have several friends who are adamant that the presidential election was stolen in 2000 and 2004 by Republican shenanigans in Florida and Ohio. They make a good case. They are now concerned (terrified) that the fix is in for this election: despite the polls pointing to a possible landslide, they fear another rigged election. So my question is, how far off from the polls would the results have to be to suggest that the vote-counting in any state was significantly tampered with? 2%? 4%? What?
Thanks in advance for considering the question.
I wouldn’t be too worried about fraud: http://www.slate.com/id/2202777/ suggests that both parties are hiring massive armies of lawyers ready to pounce at the slightest irregularity. That, plus the landslide margin Obama’s going to win by anyways, should lead to neither party even bothering to try fraud.
Hmm. Issues with apparent vote tampering are well-known, and they famously always work in favor of Republicans. Even in this election, we have already witnessed evidence of this, shall we say, partisan malfunction. I agree with William: It won’t affect the outcome of the presidential election.
But that conclusion only underscores the other facet: House and Senate seats. The Republicans aren’t hiding the fact that they’re desperate to avoid losing those seats, and the fact of the matter is that they could easily get away with even a severely conspicuous series of apparent upsets on those votes. Watch it happen.
Folks out there might want to look into the legality (in their state) of videoing their vote, so as to capture any of the several funny goings-on as they happen. Perhaps widespread YouTube evidence will at least force a long-overdue dispensing with easily hackable voting systems.
I have heard about a predictive model that doesn’t rely on polling data at all, but rather uses historical and present-day econometric data. What are your thoughts about this type of modeling, and could or should it be incorporated into your own modeling?
I have been asked about voter fraud many times. It is a constant source of questions. One of my replies is here.