Gerrymandering notes: how many votes to take House, 1992-2012
To the reader: This post focuses on technical notes regarding the House prediction. It is not a popular essay, but is for diehard geeks. Additional notes will be added here.
All error bars below are 1-sigma values. Underline indicates a parameter that is used for the calculation.
Part 1: Converting national vote share to seat count.
I have broken this question down into (i) the relationship between national House popular vote, 1946-2010, and seat count; (ii) effects from the immediately preceding Congress (“incumbency effects” and other historical effects); and (iii) the effect of redistricting for the 2012 election.
(i) Seat count as a function of popular vote.
This is calculated using a linear fit of the form
(seat margin) = a0 + a1 * (%vote margin) + a2 * (previous Congress seat margin)
where margins indicate the Democratic-minus-Republican difference. Both a0 and a2 are needed to effectively correct the generic Congressional poll margin.
The addition of a2 decreases the residuals considerably, and leads to a modest increase in parameter uncertainties. As I have written before, adding more parameters fails to meet these criteria, and may constitute overfitting.
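As a minimal sketch of how a fit of this form could be run, the following uses ordinary least squares with NumPy. The function name and the synthetic inputs are hypothetical and are there only to exercise the code; they are not the historical election data, and this is not necessarily the fitting procedure used for the numbers below.

```python
import numpy as np

def fit_seat_model(vote_margin, prev_seat_margin, seat_margin):
    """Least-squares fit of
    seat_margin = a0 + a1*vote_margin + a2*prev_seat_margin.
    Returns (coefficients, 1-sigma standard errors)."""
    y = np.asarray(seat_margin, dtype=float)
    # Design matrix: intercept, vote margin, previous Congress seat margin.
    X = np.column_stack([np.ones(len(y)), vote_margin, prev_seat_margin])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]            # degrees of freedom
    sigma2 = resid @ resid / dof         # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef, np.sqrt(np.diag(cov))

# Synthetic, noise-free inputs (NOT real election results).
rng = np.random.default_rng(0)
vote = rng.normal(0.0, 5.0, 30)      # hypothetical D-minus-R vote margins, %
prev = rng.normal(0.0, 40.0, 30)     # hypothetical previous seat margins
seats = 1.0 + 7.0 * vote + 0.2 * prev
coef, err = fit_seat_model(vote, prev, seats)
```

On noise-free synthetic input the fit recovers the coefficients it was built from. With real data, the standard errors returned here would play the role of the 1-sigma error bars quoted for a0, a1, and a2.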
From 2002-2010, a0 = -3.3 +/- 8.2 seats and a1 = 6.2 +/- 1.1 seats/%vote.
From 1992-2010, a0 = -0.5 +/- 6.2 seats and a1 = 6.8 +/- 1.0 seats/%vote. The ratio a0/a1 is R+0.1+/-0.9%.
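The ratio a0/a1 converts the intercept from seats into units of vote margin. A rough check of the quoted value, assuming first-order error propagation and (an assumption not stated in the post) that the errors on a0 and a1 are uncorrelated, might look like this. Negative values denote a Republican lean.

```python
import math

def ratio_with_error(a0, a0_err, a1, a1_err):
    """Return a0/a1 and its 1-sigma error, assuming uncorrelated errors."""
    r = a0 / a1
    err = math.sqrt((a0_err / a1) ** 2 + (r * a1_err / a1) ** 2)
    return r, err

# 1992-2010 fit values quoted above.
offset, offset_err = ratio_with_error(-0.5, 6.2, 6.8, 1.0)
print(f"{offset:+.1f} +/- {offset_err:.1f} % vote margin")
```

Almost all of the uncertainty comes from a0; the error on a1 contributes very little because the ratio itself is close to zero.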
From 1948-2010, a0 = +5.9 +/- 4.8 seats and a1 = 8.0 +/- 0.5 seats/%vote.
The parameter a1 appears to be smaller over the last 20 years compared with post-WWII. This might be a reflection of increased incumbent advantage and/or redistricting.
(ii) Historical effects (“incumbency”). An incumbent’s advantage has been estimated to be as high as 5-8%. This could affect both a1 and a2. The generic Congressional ballot is a direct measurement of opinion, and therefore is likely to already capture the effects of this advantage. For this model, the question is how to estimate the macro-level advantage.
Because I previously referred to a2 as reflecting incumbency, I will continue to refer to it that way. The macro-incumbency advantage for 2012, based on recent data, gives estimates that go all over the place when even one data point is added or removed. It is not a stable parameter, suggesting other effects that require district-by-district analysis. Here, I use as much data as possible to get the error down. For 1948-2010, a2=0.2+/-0.1, which in units of generic Congressional ballot translates to a macro-incumbency advantage of R+1.2+/-0.4%.
(iii) Redistricting. From 2010 to 2012, the net overall shift in PVI distribution is R+0.62 +/- 0.06%. Because the seats-vs.-vote data above have a similar slope to the PVI distribution, I assume that this shift will translate fully into an effective change in the seats-vs.-vote relationship. Therefore the relationship in (i) requires a redistricting correction of R+1.2+/-0.1%.
Part 2: Estimating the national Congressional vote.
This is done by taking a median of all post-RNC/DNC convention generic Congressional preference polls. Aggregated-poll performance from RealClearPolitics suggests that these polls do a good job of predicting the final national vote. They are not perfect: a discrepancy of up to 2-3% can arise in the home stretch. Therefore the nominal error bar on a polls-now snapshot must include +/-2% uncertainty.
Part 3: Estimating future movement by Election Day.
Movement should be at least comparable to Presidential movement, which at >20 days from the election I have estimated as +/-1.8%. Congressional movement is likely to be greater because of low attention to local Congressional races. I make a baseline assumption that the movement in opinion is +/-2%.
Possible corrections:
Taking into account these and other possibilities I have not thought of, it would seem safe to stay with a symmetric assumption. I will assume +/-2% movement in either direction, symmetric around zero.
The combined errors from Parts 2 and 3 above, added in quadrature, are sqrt(2*2 + 2*2) = 2.8%, which I round to 3%. Therefore the estimate of Election Day generic Congressional preference is the post-convention median, with an error bar of +/-3%.
This is converted to an “effective” margin that takes into account incumbency and redistricting as follows:
(effective margin) = (predicted true generic Congressional preference) + a0/a1 + (incumbency advantage) + (redistricting advantage)
Currently, that is
(D+2.5 +/-3.0) + (R+0.1+/-0.9) + (R+1.2+/-0.4) + (R+1.2+/-0.1) = D+0.0 +/-3.2%.
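The arithmetic in this sum can be checked with a short sketch. It assumes the four terms are independent, so that their errors add in quadrature, and uses a sign convention of my choosing: positive for D, negative for R.

```python
import math

# (value, 1-sigma error) in % vote margin; D positive, R negative.
terms = [
    (+2.5, 3.0),   # predicted generic Congressional preference
    (-0.1, 0.9),   # a0/a1 offset
    (-1.2, 0.4),   # macro-incumbency advantage
    (-1.2, 0.1),   # redistricting correction
]

effective_margin = sum(value for value, _ in terms)
# Independent errors combine as the root of the sum of squares.
effective_err = math.sqrt(sum(err ** 2 for _, err in terms))
```

The sum comes to zero within floating-point rounding and the combined error to about 3.2%, matching the line above. The polling term dominates the error budget; the three corrections together add only about 0.2% to the 3.0% that comes from polling and future movement.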
Converted to seat margins, this gives a seat margin of D+0 +/- 22 seats. 1-sigma prediction: median D 217.5 +/- 11 seats, R 217.5 +/- 11 seats.
Predictions: D+2.5+/-3.0% popular vote, D 217 +/- 11 seats R 218 +/- 11 seats. Democratic control: 50%.
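The conversion from effective margin to seats can be sketched as follows. This assumes the 1992-2010 slope a1 = 6.8 seats/%vote (which appears consistent with the +/- 22 seat figure), treats the slope as exact, assumes a 435-seat House, and models the seat margin as normally distributed. The function name is hypothetical.

```python
import math

TOTAL_SEATS = 435
A1 = 6.8  # seats per % vote margin (1992-2010 fit); treated as exact here

def seats_from_margin(margin_pct, margin_err_pct, slope=A1):
    """Convert an effective vote margin (D positive) into seat estimates."""
    seat_margin = slope * margin_pct
    seat_margin_err = slope * margin_err_pct
    dem_seats = (TOTAL_SEATS + seat_margin) / 2
    dem_seats_err = seat_margin_err / 2
    # Probability that the seat margin is positive, under a normal model.
    z = seat_margin / seat_margin_err
    p_dem_control = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return seat_margin, seat_margin_err, dem_seats, dem_seats_err, p_dem_control

seat_margin, seat_err, dem, dem_err, p_dem = seats_from_margin(0.0, 3.2)
```

With an effective margin of D+0.0 +/- 3.2%, this gives a seat margin of 0 +/- about 22, a Democratic median of 217.5 +/- about 11 seats, and a 50% probability of Democratic control. Propagating the uncertainty in a1 as well would widen the seat error bar whenever the margin is not zero.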
