Princeton Election Consortium

Innovations in democracy since 2004

Were There Really Reverse Coattails?

April 20th, 2021, 12:48am by Sam Wang

A positive development for the left in the last few years has been the renewed interest in running for downticket races. From Congresswoman Alexandria Ocasio-Cortez to legislative candidates nationwide, such efforts are necessary for building a healthy political party.

Last week came a new claim that such efforts are not only directly useful, they can help the top of the ticket. However, this latest evidence for such “reverse coattails” needs some work before it’s ready for prime time. This is not to say that the claim is false. It’s more that the evidence is too preliminary to judge. In basic research, new claims get subjected to close scrutiny through the peer review process. I will offer some comments in that style. I hope they are constructive.

First, the basic claim: in precincts where a legislative race was contested by a Republican and a Democrat, President Joe Biden did 0.3-1.6 percentage points better than in similar precincts that had an uncontested Republican candidate. (News reports claim 0.3-1.5 percentage points, but that appears to be a rounding error.) On the face of it, this result would seem to support the idea that downticket competition gets out the vote for the top of the ticket. In other words, reverse coattails.

The news coverage says who did the research (two Democratic data analytics groups) but it doesn’t link to the original evidence. After some digging I eventually found this slide deck, which appears to contain the analysis. We’ll use that as our starting point.

Rather than re-explain what they did, I will just go through these slides and highlight three major problems that need to be addressed: (1) an inappropriate comparison between states, (2) a statistically significant result from only one of the two models, and (3) the lack of a true confidence interval on the estimate.

General comment: The model. As best I can unpack the methodology, they seem to have begun with two models. Both included all of their predictors (age, income, education, etc.). Then, to account for underlying partisanship, one used Clinton two-way vote share and the other used the TargetSmart Score. I’m guessing they did an incremental variable selection procedure (based on statistical significance), which is why they ended up with a similar, but not identical, set of final predictors in the two models.

I was not expecting the TargetSmart-based model to include both “% bachelor” and “% high school or less education” as variables. I imagine that % high school or less and % bachelor are highly (negatively) correlated. I would have thought post-modeling multicollinearity tests would have led them to remove one of those variables. However, this may not be important, since they may be better off discarding the TargetSmart-based model entirely (see the second point below).
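For readers unfamiliar with such tests, one common multicollinearity check is the variance inflation factor (VIF). Here is a minimal Python sketch for the simplest case of a model with just two predictors, where VIF = 1/(1 - r^2). The correlation value is hypothetical: the slide deck does not report the correlation between the two education variables, and in a model with more predictors the VIF would instead come from regressing each predictor on all the others.

```python
def vif_two_predictors(r: float) -> float:
    """Variance inflation factor when a model has exactly two predictors
    whose Pearson correlation is r: VIF = 1 / (1 - r^2)."""
    if abs(r) >= 1:
        raise ValueError("r must be strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


# Hypothetical correlation between "% bachelor" and "% high school or less".
# This value is an assumption for illustration, not a figure from the slides.
hypothetical_r = -0.8
vif = vif_two_predictors(hypothetical_r)
print(f"VIF = {vif:.2f}")
```

A VIF of 1 means no overlap between predictors. Common rules of thumb treat values above 5 or 10 as a sign that one of the variables should be dropped or combined.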

Major issues

1) Slide #10: Comparisons should be made within-state when possible. The researchers obtained precinct-level voting data in eight states where the downticket legislative race featured (a) an uncontested Democrat, (b) a Democrat-vs-Republican race, or (c) an uncontested Republican. (There were also some “semi-contested” races with a minor-party candidate, which I will ignore.) The basic study design was to compare performance between these groups, to see where Biden did better, after accounting for the underlying partisanship of the precincts.

However, 2,157 out of 2,446 precincts with contested races were in Florida and Ohio, whereas most of the uncontested-R precincts (146 out of 246) were in Kansas. These are not comparable states: Florida and Ohio are high-turnout swing states, and Kansas was uncompetitive for Biden. From this information alone, one might expect Biden to do better in Florida/Ohio precincts, whether or not the legislative race was contested.

In short, it would be good to make at least some comparisons within the same state. For instance, looking at Ohio-only might have given fairly good statistical power. At a minimum, state ID should be included as a variable in the model.
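To see why the state mix matters, here is a deliberately exaggerated toy example in Python. All of the data are made up: two hypothetical states, "A" (a swing state where most races are contested) and "B" (a safe state where most are not), with no true effect of contestation at all. The pooled comparison still shows a large gap, while the within-state comparison correctly shows none.

```python
from statistics import mean

# Hypothetical toy precincts: (state, contested, biden_share_pct).
# Within each state, contested and uncontested precincts are identical,
# so the true effect of a contested race is zero by construction.
def make_precincts():
    rows = []
    rows += [("A", True, 50.0)] * 9 + [("A", False, 50.0)] * 1
    rows += [("B", True, 40.0)] * 1 + [("B", False, 40.0)] * 9
    return rows


def pooled_gap(rows):
    """Mean Biden share in contested minus uncontested precincts,
    ignoring which state each precinct is in."""
    contested = mean(share for _, c, share in rows if c)
    uncontested = mean(share for _, c, share in rows if not c)
    return contested - uncontested


def within_state_gap(rows):
    """Average of the contested-minus-uncontested gaps computed
    separately inside each state."""
    states = sorted({state for state, _, _ in rows})
    return mean(pooled_gap([r for r in rows if r[0] == s]) for s in states)


rows = make_precincts()
print("pooled gap:", pooled_gap(rows))
print("within-state gap:", within_state_gap(rows))
```

Real data would call for a regression with state indicators rather than a simple difference in means, but the logic is the same: the comparison should not be able to pick up differences between states and attribute them to contested races.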

2) Slides #11, #15, and #16: Why two models, especially when one is a negative result? The researchers controlled for a number of variables, including partisanship measured two ways: (a) the Clinton 2016 vote, and (b) partisanship imputed by TargetSmart, a Democratic targeting firm.

They found that these two variables are highly correlated (Pearson’s r = 0.927). But arguably, Clinton 2016 is ground truth. It’s not clear why one would ever want to use TargetSmart estimates, which capture only 0.927^2 = 86% of the variance. Looking at the scatterplot in slide #15, the median error appears to be at least 5 percentage points. Since the estimated effect is at most 1.6 percentage points, error of that size in the partisanship control could easily obscure it.
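The variance arithmetic is easy to check. This short Python snippet uses only the reported correlation; the last quantity, sqrt(1 - r^2), is the standard result for how much of the spread in one variable remains after a linear fit on the other.

```python
import math

r = 0.927  # Pearson correlation reported in the slides

explained = r ** 2                    # share of variance captured
unexplained = 1 - explained           # share of variance left over
residual_sd_fraction = math.sqrt(unexplained)  # leftover spread, as a fraction of the SD

print(f"variance explained:   {explained:.1%}")
print(f"variance unexplained: {unexplained:.1%}")
print(f"residual SD / SD:     {residual_sd_fraction:.3f}")
```

So even at r = 0.927, about 14% of the variance is unexplained, and more than a third of the original standard deviation remains as scatter around the fit.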

And indeed, the TargetSmart model gives p=0.37, which is a negative result. This result does not even tell you whether the reverse-coattail effect is of positive or negative sign. (Slide #15 claims the 0.3-percentage-point effect is interpretable, but this is simply wrong.)
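To make the sign problem concrete, one can back out an approximate confidence interval from the reported numbers. The Python sketch below assumes that p=0.37 is a two-sided p-value for a test of the coefficient against zero and that a normal approximation applies. The slides do not state either of these, so treat the output as a rough illustration and not as the researchers' own interval. The function name `ci_from_p` is my own.

```python
from statistics import NormalDist


def ci_from_p(estimate, p_two_sided, level=0.95):
    """Approximate confidence interval recovered from a point estimate and
    a two-sided p-value, assuming a normally distributed test statistic."""
    if not 0 < p_two_sided < 1:
        raise ValueError("p-value must be strictly between 0 and 1")
    std_normal = NormalDist()
    # z-score implied by the p-value, then the implied standard error
    z_observed = std_normal.inv_cdf(1 - p_two_sided / 2)
    std_error = abs(estimate) / z_observed
    # critical value for the requested confidence level
    z_critical = std_normal.inv_cdf(1 - (1 - level) / 2)
    margin = z_critical * std_error
    return estimate - margin, estimate + margin


# Reported TargetSmart-model values: 0.3-point effect, p = 0.37
low, high = ci_from_p(0.3, 0.37)
print(f"approx. 95% CI: {low:.2f} to {high:.2f} percentage points")
```

Under these assumptions the interval runs from roughly -0.4 to +1.0 percentage points. It spans zero comfortably, which is what it means to say the sign of the effect is undetermined.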

3) Slide #16: No confidence intervals are given. Contrary to news reports, 0.3-1.6 percentage points is not a range of plausible effects. The two ends are point estimates from two different models, not a confidence interval: one model accounts for underlying partisanship with the Clinton two-way vote share, the other with the TargetSmart Partisan Score. The first model gave a contested-race coefficient of 1.6% (significant), and the second gave 0.3% (not significant). Their argument is that because the Clinton two-way vote share and TargetSmart Partisan Score are highly correlated, the second model still “indicates directionality” even though the contested-race term was not significant. But considering that the TargetSmart estimate might have high uncertainty, it should probably be disregarded. In any case, calculating a confidence interval from the Clinton 2016-based model is easy, and is a must.

Tags: 2020 Election · Moneyball

One Comment so far ↓

  • Ken Lawler

    Prof. Wang, here’s an interesting stat we uncovered recently in GA. In 2020, state legislative races that were uncontested had lower turnout by 7.4 percentage points than those that were contested. We estimate approx. 295,000 voters didn’t vote as a result. In a state in which Biden won by under 12,000, we think this is significant. It’s correlation vs. causation, but interesting nonetheless.
