Today I describe how we address pollster heterogeneity. Along the way I will also answer (1) why our state probabilities appear more confident than other aggregators, and (2) why the EV distribution at right is so spiky.
It is well known that pollsters vary in their methods. The American Association for Public Opinion Research has established common standards of practice and encouraged transparency, driven in part by Mark Blumenthal of Pollster (our data source). But poll aficionados know that Rasmussen Reports' results consistently trend more Republican than other organizations'.
Differences like this present a challenge to poll aggregators. An obvious solution is to estimate the size of each pollster’s bias, then subtract it. However, this generates three new problems: (1) Who is the neutral reference point? Gallup? Quinnipiac? Rasmussen? (2) What to do about pollsters who do very few polls? (3) What if the pollster changes methods mid-season?
For my Meta-analysis I have chosen a simple solution that gets rid of most of the bias: use median-based statistics. Here’s how it works. Imagine the two following similar sets of poll margins between candidates A and B:
Data set 1: A +2%, A +4%, tie, A +3%, A +1%.
Data set 2: A +2%, A +4%, tie, A +3%, B +4%.
The difference is that in the second case one pollster is shifted by 5% toward candidate B, approximately corresponding to the Rasmussen effect. This single outlier poll brings the average margin toward candidate B, and increases the uncertainty considerably:
Data set 1 (averages): Candidate A leads by 2.0 +/- 0.7 % (mean +/- SEM), win probability 98%.
Data set 2 (averages): Candidate A leads by 1.0 +/- 1.4%, win probability 74%.
However, now use medians. The two data sets have the same median, 2.0%. Median-based statistics also allow an estimated SD, defined as (median absolute deviation) × 1.4826; the factor 1.4826 makes the estimate equal to the true SD when the underlying data are normally distributed. Dividing by the square root of the number of polls gives an estimated SEM. This gives
Data set 1 (medians): Candidate A leads by 2.0 +/- 0.7% (median +/- estimated SEM), win probability 98%.
Data set 2 (medians): Candidate A leads by 2.0 +/- 1.33%, win probability 90%.
Generally speaking, using medians gets rid of most of the bias from a single outlier. In this example, the race is taken most of the way out of the toss-up category.
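The worked example can be reproduced in a few lines. The post does not state which distribution underlies the win probabilities; a Student t-distribution with n−1 degrees of freedom reproduces the quoted numbers, so that is assumed here:

```python
import numpy as np
from scipy import stats

def mean_based(margins):
    """Mean +/- SEM and win probability (assuming a t-distribution, n-1 df)."""
    n = len(margins)
    m = np.mean(margins)
    sem = np.std(margins, ddof=1) / np.sqrt(n)
    return m, sem, stats.t.cdf(m / sem, df=n - 1)

def median_based(margins):
    """Median +/- estimated SEM via the MAD; the 1.4826 factor matches the
    SD of normally distributed data."""
    n = len(margins)
    med = np.median(margins)
    mad = np.median(np.abs(np.asarray(margins) - med))
    sem = 1.4826 * mad / np.sqrt(n)
    return med, sem, stats.t.cdf(med / sem, df=n - 1)

set1 = [2, 4, 0, 3, 1]    # A+2, A+4, tie, A+3, A+1
set2 = [2, 4, 0, 3, -4]   # one pollster shifted 5 points toward B

# mean_based(set2)   -> margin 1.0, SEM ~1.41, win probability ~74%
# median_based(set2) -> margin 2.0, SEM ~1.33, win probability ~90%
```

The single outlier moves the mean-based estimate a full point and doubles its uncertainty, while the median-based estimate barely budges.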
Which brings me to the second consequence. Increased certainty in individual states makes the EV histogram at right more spiky. This is because at any given moment, few states are actually in play. Today it’s IA, VA, and NC, for a total of 2^3=8 major permutations.
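When only a few states are genuinely uncertain, the full EV distribution collapses to a handful of spikes. A toy enumeration makes the point; the win probabilities and the safe-state EV base below are made-up numbers for illustration, not the Meta-analysis's actual inputs:

```python
from itertools import product

# Hypothetical snapshot: only IA, VA, NC in play; every other state certain.
# (EV, P(Dem win)) -- probabilities are illustrative, not from the model.
in_play = {"IA": (6, 0.6), "VA": (13, 0.55), "NC": (15, 0.4)}
base_ev = 253  # Dem EV from states treated as certain (made up for the example)

dist = {}
for outcome in product([0, 1], repeat=len(in_play)):  # 2^3 = 8 permutations
    ev, p = base_ev, 1.0
    for won, (votes, prob) in zip(outcome, in_play.values()):
        ev += votes * won
        p *= prob if won else 1 - prob
    dist[ev] = dist.get(ev, 0.0) + p

# The histogram has at most 8 bars -- hence the spikes.
```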
Final question: if medians are so great, then why don’t other aggregators like FiveThirtyEight use them? One reason is that intuitively, readers want uncertainty about the future to be baked into the estimate, even if it’s a snapshot of where things are today. Another is that media organizations are under pressure to attract readers, and artificial uncertainty attracts readers. However, to me that seems like spitting in the soup.
Your median-based methods seem ideal for dampening the effect of outliers of any kind. Why do you, in the title, specifically single out Rasmussen polls?
Also, what is it about Rasmussen’s methodology that gives it an inherent Republican skew? Are there reasons to doubt the polling company’s professional neutrality? I have heard that Rasmussen has partaken in Republican fundraisers…
Rasmussen’s organization is a frequent subject of discussion among obsessive consumers of polls. Statistically, the bias of his data is very well-documented. Read this piece by Mark Blumenthal and articles linked from it. Here is a telling chart: http://www.mysterypollster.com/photos/uncategorized/franklinrasmussen.jpg
How does it happen? One possible cause is stratification: all pollsters must weight their results according to how they believe their sample will differ from the real voting population: by sex, ethnicity, party, and so on. An incorrect weighting by party ID would give results that were "good" in the sense of being a real sampling that gave information, but were biased in one direction. The good information could then be extracted by subtracting the bias.
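A toy illustration of that mechanism: a fixed mis-weighting by party ID shifts the topline by a constant amount, which is exactly the kind of bias one could subtract. All numbers below are hypothetical:

```python
def weighted_margin(support_by_party, party_weights):
    """Top-line margin: sum over parties of (assumed share) x (within-party margin)."""
    return sum(party_weights[p] * support_by_party[p] for p in party_weights)

# Hypothetical within-party margins for candidate A, in points:
support = {"Dem": 90.0, "Rep": -90.0, "Ind": 5.0}

true_mix   = {"Dem": 0.38, "Rep": 0.32, "Ind": 0.30}  # actual electorate (made up)
skewed_mix = {"Dem": 0.33, "Rep": 0.37, "Ind": 0.30}  # over-weights Republicans

true_m = weighted_margin(support, true_mix)    # A +6.9
skew_m = weighted_margin(support, skewed_mix)  # A -2.1: same data, shifted topline
house_effect = skew_m - true_m                 # -9.0 points, a constant to subtract
```

The same underlying interviews produce a topline nine points apart depending only on the assumed party mix.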
Sam-
You assume a relatively benign explanation. More suspicious minds see a pattern of state-level Rasmussen polls being custom-made to produce a certain effect: that in some irrelevant states Ras will produce a less-biased, or even Dem-biased, poll to balance out the extra-heavy bias he adds to polls in states where he is trying to drive a particular narrative. I don't know how to test that, but the editorial slant of the analysis he puts out along with his polls is so intensely biased and designed to drive his narrative that it would not be surprising if he monkeys with the poll results when it suits him.
I think the major problem is Ras' LV model and robocalling. There is a sampling error baked in that we can call the White Male Landowner effect: e.g., if you own your house, you are likely to have a landline and be a White Male Romney voter.
The emergent demographic of cell-only voters is inaccessible to Rasmussen robocalls, and much less likely to take a live poll call because they have to pay for minutes.
White males, OTOH, represent the property class, and are also the only demographic Romney carries, besides old people, who don't have cells as a rule.
Rasmussen’s LV model failed in 2010 when he whiffed on Colorado and Nevada, because of the cell phone demographic.
And please…artificial uncertainty attracts a certain type of reader…like FOXnews attracts a certain type of viewer.
The hardcore stat-nerds who comment at 538 regularly question Nate's inappropriate use of Rasmussen.
But the NYT needs pageclicks. And HotAir and other 'conservative' sites won't link Nate if he's too discouragingly truthful.
If one goes back through Nate's archives, the titles of posts often seem staged to attract links from conservative websites, while the body of the text says the opposite of the title. Some of us thought Nate's editors were titling the posts.
But now I suspect Nate is doing it himself.
It's like regulatory capture in the markets.
How can we take account of the fact that Rasmussen publishes polling results every day (a three-day rolling average I believe)? Even with median-based statistics, doesn’t Rasmussen just swamp the other data sources by its sheer volume?
Read the methods section please. We use polls for which there are nonoverlapping samples. Also, I believe you are referring to a national tracking poll. We do not use national polls.
True enough…
I remember that in 2008 there was a popular parlor game of trying to extract raw numbers back from the Gallup and Rasmussen three-day rolling averages, even though this is not mathematically possible. Nate Silver was one of the major practitioners even though he admitted himself that he couldn’t do this reliably.
I fell prey to that. See here. Of course it is impossible without further constraints. I used the idea that the overall variance would be likely to be minimized by the correct answer. That worked fairly well. The code’s here.
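The linked code is not reproduced here, but the idea can be sketched. A w-day rolling average leaves w−1 degrees of freedom undetermined; variance minimization picks the free parameters that make the recovered daily series as smooth as possible. A sketch under that assumption:

```python
import numpy as np

def unroll(avgs, window=3):
    """Recover a daily series x from rolling averages a_t = mean(x[t:t+window]),
    choosing the window-1 free parameters so that Var(x) is minimized.
    A sketch of the variance-minimization idea, not the code linked above."""
    avgs = np.asarray(avgs, dtype=float)
    T, n = len(avgs), len(avgs) + window - 1
    A = np.zeros((T, n))
    for t in range(T):
        A[t, t:t + window] = 1.0 / window
    x0, *_ = np.linalg.lstsq(A, avgs, rcond=None)  # one exact solution
    _, _, Vt = np.linalg.svd(A)                    # rows Vt[T:] span the null space
    N = Vt[T:]                                     # shape (window-1, n)
    C = N - N.mean(axis=1, keepdims=True)          # centered null-space basis
    z = x0 - x0.mean()
    c = np.linalg.solve(C @ C.T, -C @ z)           # minimize ||z + C.T @ c||^2
    return x0 + c @ N
```

Any solution reproduces the published averages exactly; the minimum-variance one tends to land near the true daily numbers when the underlying series is well-behaved.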
Does it work as well to simply toss out the most extreme polls on either end of the spectrum and then just average the rest? So to take a recent example, you’d ignore the Pew poll showing a 9-point Obama lead and you’d ignore the most pro-Romney poll in the mix.
That would achieve a similar effect, but it is a mixture of median and mean. Not a standard statistical technique, very ad hoc. It is better to have a rule that works well in many different situations. Thus the median.
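For comparison, the commenter's proposal is the trimmed mean, which scipy implements directly; on the post's second data set it lands between the plain mean and the median:

```python
from scipy import stats

polls = [2, 4, 0, 3, -4]               # data set 2 from the post
trimmed = stats.trim_mean(polls, 0.2)  # drops the lowest and highest of 5 values
# mean of the remaining [0, 2, 3] = 1.67, vs. mean 1.0 and median 2.0
```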
I’m starting to wonder if “house effect” may be entirely dependent on the capture (or non-capture) of cell phone demographics by the pollster.
That also might explain the gap between nat’l and local polling? There are more state polls of the swing states, and many incorporate cell phone polling.
Maybe a LOESS regression? It's nonparametric!
Maybe LOESS would help if used in an outlier-resistant form. My calculation uses a running median, which seems to be adequate for state polls.
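A running median of the kind mentioned above takes only a few lines; this is an illustrative sketch, not the Meta-analysis code, and the window size is a free parameter:

```python
import numpy as np

def running_median(y, half_window=3):
    """Outlier-resistant smoother: the median over a sliding window."""
    y = np.asarray(y, dtype=float)
    out = np.empty_like(y)
    for i in range(len(y)):
        lo = max(0, i - half_window)           # clamp at the series edges
        out[i] = np.median(y[lo:i + half_window + 1])
    return out

# A single large outlier in a flat series is ignored entirely:
# running_median([0, 0, 0, 10, 0, 0, 0], half_window=1) -> all zeros
```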
For national data, I think one would want to subtract a pollster-specific correction, then smooth, then add it back – the subtraction and addition being net neutral summed across pollsters. Unless there is specific evidence of fraud, departing from neutrality is a problem.
Jason, that would only work if the political behavior bias of the pollsters is symmetrical.
I'm starting to believe in Asymmetrical Political Behavior Theory.
Well… it's not fraud. It's… lack of ethics?
Rasmussen polling undersamples cell-only and smartphone demographics. Those demos go for Obama by about 20 points. The result is Ras overweights landline and landline-plus-cell demos.
Who has landlines? White males (property owners) and old people. The only two demos favorable to Romney.
It's not that Ras does this with evil intent; he just doesn't care. He is aware of the bias. But there just aren't any market forces operating on him to change. Everyone consumes his data. Polling is down 40% since 2008.
Sam,
Sorry for this long-winded statement of a simple question..
Regarding your ~10/1 odds on President Obama’s re-election: could you please explain, in terms of your model, this possible scenario: the vast majority of current battleground states voting red..?
You refer readers to the Huffington Post polling data. Their electoral map currently shows CO,IA,OH,VA,NC,&FL as toss-up states. For the sake of discussion, let’s paint CO blue & the remainder red, as in this 270-to-win map:
http://www.270towin.com/2012_election_predictions.php?mapid=svw
I trust you wouldn’t dismiss this scenario as ludicrous..? (I do see that you have FL as safe-blue, but isn’t that arguable?) In this case, Romney would eke out a victory.
By focusing only on the uncertainty within the battleground states & ignoring everything else, it appears at first glance that the odds might be much tighter, doesn’t it?
Could you please point me to any previous discussion of yours specifically about focusing solely on the battleground-state uncertainty?
Thanks,
Brad
p.s. for anyone curious about my previous mention of buying Rubio/VP at 10-1: the last-minute chatter about Ryan disrupted my original framing of the VP race as a 3-man contest. Given that I couldn't dismiss Ryan out-of-hand (as I could Condi Rice), I felt I had to somehow hedge this new possibility. Ryan had gone from 30-1 to 12-1 overnight on Intrade. Seeing that he was still "cheap", I picked up enough Ryan contracts to fully cover my Rubio/VP stake, at very low cost.
Brad – I have outlined the model as clearly as I know how in postings here and here. To put it briefly, we can learn how much a re-election race is likely to move by looking at previous years’ races. That gives us a sense for how much movement is likely.
In that light, whatever scenario you are linking to is almost certainly of low probability. When I state that re-election will happen with ~90% probability, I also mean that ~10% of the time I will be surprised.
In that light… yes, I am dismissing whatever specific scenario is listed at that website. Generally I will not discuss specific scenarios. Considering that there are 2.3 quadrillion of them (2^51, counting 50 states plus DC), that would be endless. Even ten swing states alone lead to 2^10 = 1024 possibilities.
[…] Sam Wang, whose site Princeton Election Consortium consistently gives Nate Silver’s FiveThirtyEight a run for its money on Presidential elections, offers a spirited defense of the median. […]
[…] Statistics has been dominated by the mean as a measure of central tendency for over a century, but medians often work better. This is true for measures of economic wellbeing and it is true for political poll aggregation. Sam Wang is a biologist at Princeton who created a very simple model for aggregating political polls as a hobby in his spare time which has been more accurate than almost all of the professional pollsters who stake their careers on predicting election outcomes. And one of his simple techniques is to use the median of polls for aggregating them rather than the mean of poll results. [Wang’s] Analysis relies entirely on the well-established principle that the median of multiple state polls is an excellent predictor of actual voter behavior. [One reason is because] median-based statistics correct for outliers. […]