We’ve been experimenting with presenting the probability as a decimal, on the grounds that the false precision of showing the ones-place is misleading. For example, “0.4” means 40%. However, I’m not seeing a lot of love in comments about this change – a bit of a mixed reaction.
Note that the uncertainty (1 sigma) on the probability is at least 0.15, or 15% (and it’s asymmetric; more uncertainty in the D direction). For this reason, aggregators should not be showing a ones-place in the percentage; you don’t see “39%” in weather forecasts, and those are about as accurate as what we’re doing. We could also show it as “40 +/- 15%”.
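As a rough illustration of that display choice, here is a minimal Python sketch that rounds a win probability to the tens place and attaches the 1-sigma uncertainty. The function name `format_probability` is hypothetical, and the sketch treats the uncertainty as symmetric, which the real one is not.

```python
def format_probability(p: float, sigma: float) -> str:
    """Render a probability as a coarse percentage with a 1-sigma uncertainty.

    Assumption: the uncertainty is treated as symmetric, a simplification
    of the asymmetric uncertainty described in the text.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    pct = round(p * 10) * 10   # nearest ten percentage points
    sig = round(sigma * 100)   # nearest whole percentage point
    return f"{pct} +/- {sig}%"


if __name__ == "__main__":
    print(format_probability(0.39, 0.15))
```

The ones-place of the probability is discarded on purpose, so 0.39 and 0.41 display identically.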
If you want to see the precise forecast of many aggregators, they’re all available at The Upshot (NYT). They just added PEC – many thanks to Josh Katz and the team there. The calculations all point in the same direction, a very gentle lean toward Republican control. However, everyone’s using the same polls, so a polling error would make us all wrong. Ponder that!
I’ll say it again – 60% is not that certain. If you flipped a coin weighted like that in favor of heads, 2 out of 5 times it would come up tails. The show’s not over.
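To make the weighted-coin picture concrete, here is a minimal simulation sketch in Python. The function name `tails_fraction`, the number of flips, and the fixed seed are all illustrative assumptions, not part of PEC's calculation.

```python
import random


def tails_fraction(p_heads: float, n_flips: int, seed: int = 0) -> float:
    """Flip a coin that lands heads with probability p_heads, n_flips times,
    and return the fraction of flips that came up tails."""
    rng = random.Random(seed)  # fixed seed so repeated runs match
    tails = sum(1 for _ in range(n_flips) if rng.random() >= p_heads)
    return tails / n_flips


if __name__ == "__main__":
    # A coin weighted 60% toward heads still lands tails about 2 times in 5.
    print(tails_fraction(0.6, 100_000))
```

Over many flips the tails fraction settles near 0.4, which is the sense in which a 60% favorite loses fairly often.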
Update: PEC’s November win probabilities are here, as well as piped over to the NYT.
Dr. Wang,
I think you may have answered this question before, but I can’t find the answer. Does it make any sense to give polls with a greater sample size a larger weighting?
I actually appreciated the switch to probability and one decimal. I had a question similar to @mediaglyphic above. The YouGov polls have a MUCH higher sample than the other polls. In CO it’s 1,634 and in IA it’s 2,359. These are two polls that are showing the D in the lead. Should this give Ds any extra hope?
Look at the MOEs. A properly designed poll with sound methodology can give an accurate read of the electorate with a small sample size. On the other hand, if the poll is poorly designed or has faulty methodology, the results are suspect no matter what the sample size, unless the sample size is well over half the electorate. A big sample size is not necessarily better than a small sample size.
Reported MOEs are usually just the random error determined by the sample size, though; they don’t take possible systematics into account.
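For reference, the sampling-only margin of error mentioned above can be sketched in a few lines of Python. This assumes a simple random sample and the standard 95% formula for a single proportion; the function name `margin_of_error` is hypothetical, and the result deliberately says nothing about systematic error.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Sampling-only margin of error for a proportion p estimated from n
    respondents, assuming simple random sampling (z = 1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    # Sample sizes quoted in the thread for the YouGov CO and IA polls.
    for state, n in (("CO", 1634), ("IA", 2359)):
        moe_points = 100 * margin_of_error(n)
        print(f"{state}: n={n}, sampling MOE = +/-{moe_points:.1f} points")
```

Under these assumptions the two polls come out at roughly 2.4 and 2.0 points, so a larger sample narrows only the random part of the error.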
Instead of “40% +- 15%” why not post “25%-55% with confidence 65%”?
I like this approach. It ensures the reader isn’t seeing “40% [something something]” and limiting toward 0% or 100%; it actually shows two numbers, in this election’s case on either side of 50%, and helps the reader understand that even if they round, either outcome is entirely possible.
To be clear: I fully admit that I am “a reader” who simultaneously knows better, and still experiences the visceral response of “My / other side of 50%! Yay / boo!” This above-and-below approach helps (I think) short-circuit that gut reaction, no matter the ability of the reader to subsequently add and subtract a confidence interval.
It doesn’t seem to me that it’s totally meaningful, at least on its face, to say “I’m 65% confident that there’s between a 25 and 55% chance of a D+I majority being elected in November”. Putting the percentage interpretation of SD onto the percentage representation of Democratic chances of holding the Senate seems to obscure things, I think.
Sorry Sam,
Not feeling the love? Not a lot of love on the internet. Porn, casual sex? Yep. Love? Not so much. Still, some of us appreciate your efforts in providing this very entertaining sandbox.
Look folks, we knew that toward the end Dr. Wang’s model and the others would converge. Unfortunately, it appears at the moment that PEC is doin’ a little bit more of the convergin’. That could change dramatically. It has in the past.
We all knew the race for controlling the Senate was going to be a nail-biter. Why is anyone surprised that the polls now support this? Granted, the early-voting data was good, if somewhat of a mixed bag (e.g., Rs narrowing the 2012 gap in IA). It just means more local effort, including yours, to increase voter turnout, either by driving grandma and grandpa to the polls or cutting a check.
That’s the ticket:
— We could also show it as “40 +/- 15%”.
I think: .4 +/- .15
Imparts the probability and the uncertainty very succinctly while imparting the limited precision of the probability measure. Perhaps overstates the precision of the uncertainty measure, but it seems that dropping a digit would understate it more.
Thanks for the link to the Upshot. It saves me the work of comparing different poll aggregators. The net I get is that – well within the noise level — all are predicting the exact same outcome, even down to the state level. That in itself is pretty amazing — Those with/without fundamentals, house effects, etc all agree.
Seems like there are only 2 topics if these polls stay consistent. 1) Is everyone going to be incorrect on the final result?
and,
2) How much of an effect will D GOTV have on the polls in the next 3 weeks and on the final result?
GOTV is both energizing your base AND generating new voters. Energizing your own base, given the inclusion of cell phones in poll methodology, would seem to be already baked in. Getting absolutely new voters, I think, is not.
I do not see why a better methodology cannot account for all of GOTV: first by measuring both aspects for this election, and then by including an identification in polls for the next election.
Jay, I am also curious to see how GOTV affects predictions. If Dems have significantly changed voting patterns and polls are adjusted to account for outdated patterns, are the polls we’re seeing accurate? We may never know… it seems somewhat likely that at least one race will break in a way that has us scratching our heads for months.
Iowa is still close enough to go either way. In fact, most of the competitive states are still toss ups. Just as Tillis seems to have gained ground in NC, Perdue seems to be losing ground in Georgia. Republicans have a slight advantage, but until they shift the polls decisively in their favor the Senate is still a toss up.
Sam,
Why so many changes this time around? Is it simply because the elections are close this year? You removed the snapshot, toyed with changing the time frame, and now you are switching to decimal. I think you are opening yourself up to a lot of criticism by introducing so many changes in such a short time.
I like the percent. Decimal is confusing because it looks too much like the meta-margin.
I understand that, statistically speaking, this is just about a coin flip. But unfortunately, I see no reason to expect the odds to change for the better (yes, that means for the Democrats).
A few weeks away, my view is that public opinion is essentially settled and that there are very few undecided voters left. I take no solace in the hope that GOTV operation will make much of a difference, because that is the Democrats’ hope every midterm election and it never seems to change much at all.
I’d be quite surprised if the average in the polls in the next few weeks show that Dems will be favored in Alaska, Iowa or Colorado.
It is such a coin flip right now that it won’t be that big of a surprise if the Dems flip Kansas and Georgia. And win Iowa and/or Colorado. Or expand their Senate control enormously in 2016.
Am I allowed to find it amusing that 538 is 5% ‘bluer’ than PEC?
There is that irony, indeed. And it simply highlights the salient issue all along, i.e. there was never that much variation between the two when one considered the closeness of the races and the capability of shifting back and forth with the tiniest of changes in a state or two. What I like about the PEC model is that it is more sensitive to trends and changes in the electorate.
Easy to get burned trusting Politico, but I wonder if the poll alluded to here is real and if the results will be forthcoming:
http://www.politico.com/story/2014/10/georgia-democrats-2014-111864.html
hi Sam, i love your website. i’m getting used to the decimal point (change freaks me out) but visually the heading looks more consistent using the %. anyway.. 🙂 but what do i know.
is the cake baked?
No.
Nice, clear concise answer. Looks like the party is pulling out of Kentucky and maybe letting McConnell have that race. I’m ever an optimist, though, and agree that, if it were a done deal, the likelihood would be over 80%, not 60%, so fingers crossed.
“I’ll say it again – 60% is not that certain. If you flipped a coin weighted like that in favor of heads, 2 out of 5 times it would come up tails. The show’s not over.”
I think my “research” yesterday made me even more aware of this fact. When I looked at 2010 and 2012 the Dems seemed to outperform the RCP averages by about 1.5 to 3.5 points. I can easily imagine a situation where the polls that favor the Republican candidate by 1-3 points go the other way.
It’s also still “early.” Based on what I saw on RCP, there was still movement in the polls at this point in the race in previous elections.
Sorry, I respectfully disagree. There should have been no change in the presentation. I agree 1001.99% that the average person does not understand statistical measures and the related error, but to make a change just when the polling went south is asking to be misinterpreted, and that misinterpretation is worse than wrong inferences from the numbers.
–bks
Out of curiosity, I averaged three different combinations and plugged them into the NYT make-your-own prediction model (polls only minus house effect) this morning.
All prognosticators, 63%R 49/51 R
All prognosticators minus WP, and PW. 54%R 49/51R
Polls only models, 51%D, 49/51R
I realize I am only splitting already split hairs, but the last one is pretty close to what Dr. Wang is expressing in his Meta-margin (I think).
Hi, I have a question about the power of your vote. I am sure you have probably answered this question, but it seems to me that in blue-leaning states like Colorado and Iowa, which are very close, your vote would have more power than in Alaska, where it seems the margin has been 5 points for a while now.
I think it has to do with the sparse population so each vote has a larger impact. Changing 1000 votes in Alaska has more impact than changing, let’s say, 1000 votes in New Jersey as a completely random example.
Well, the power calculation applies to citizens of both persuasions. For someone keen for GOP takeover of the Senate it seems to me that AK would be a good investment of time and money — small population, erratic poll results, etc.
I’ve mentioned this before, so sorry for repeating myself. A lot of races are hinging on votes that aren’t cast either Dem or Rep. Any chance we can get a more detailed look at those effects in some depth in the next week or two?
I am a democrat. Like most of the rest of us on this site, I gather. But is anyone else sick and tired of how shallow many dems are? I know this is politics and that is baked in. But not acknowledging that you voted for the president of your party? Really?
And not one of these dems can read the immense positive consequences of the ACA and stand up for it (I’m talking to you, ALGrimes: 500,000 Kentuckians covered, and she lets people in Kentucky live with their delusions that KYnect is not Obamacare?). Sometimes I think they should just lose, b/c none of them have any courage or, apparently, convictions (except to do anything it takes to win).
Savanna,
I view Ms. Grimes’s silence on her presidential voting preference as her concession speech. It looks like she and the national party have conceded the seat to the Rs. (The national D senate campaign has pulled out all its funding and is no longer running ads in KY.)
Look at her response: she emphasized her support for Sec. Clinton in 2008 and called herself a Clinton Dem. She’s already looking beyond this election. Her sights are set on a run at the governor’s office, and she’s afraid that going on record as voting for the President will hurt her. This also explains her seemingly weird and unnecessary profession of loyalty to Sec. Clinton’s 2008 campaign.
The DSCC pulling funds says little about the race since Ms Grimes has her own large war chest to use. Those DSCC resources are headed to other races.
I think you have to think about the politics in a close election in a purple state. Grimes NEEDS the independents to have a chance. This has been a R seat.
Think about the 2006 Virginia Senate election with Jim Webb (49.6%) beating George Allen (49.2%). What if a reporter asks Allen if he voted for GWB in 2004?
Grimes is trying to win independents who do not particularly like Obama and expects her base to understand and still vote for her over McConnell. If that is not possible, she has no chance.
I think the lack of “love” is due to the probability being expressed in its root form, i.e. a rational number ≤ 1.0. I think you should simply express it as a percent, the ratio most people relate to. Most people have not taken Probability and Statistics. Thanks for all your work. Great poll.
Agreed. I have a PhD and an affinity for structural equation modeling, yet it just doesn’t seem right to change the probability expression or anything else on the banner this late in the cycle.
Sam,
I think listing an uncertainty around a probability is a bit strange. It doesn’t make sense if you’re a pure Bayesian and it doesn’t make sense if you’re a pure frequentist. It’s a little complicated. So I wouldn’t fault lay readers if they find it hard to interpret; a lot of the world’s foremost probability theorists would have at least as many problems with it.
I do sort of get what you mean. Correct me if I’m wrong. You mean to say that if you have Democrats at 35% +/- 15%, you wouldn’t have qualms with a model that showed Democrats with a 50% chance or a 20% chance because of the specification uncertainty in the models.
That said, I think this is an area in which some of the non-academic forecasters have an advantage. They (especially Silver) are unabashedly Bayesians. It makes their probabilities easier to interpret: they’re fairly explicitly meant to be betting odds. So when they get in a huff about something, they’ll challenge you to a bet about it to suss out what your probability really means. It often HASN’T been so clear and this may or may not help.
I am glad to see the extra detail on the state-by-state probabilities. I’d encourage you to take the further step of being more explicit about your assumptions on Orman/Pressler. Without that, it’s harder to compare PEC to other models.
I have a question. A lot of the polls will say something like this.
Dem 44%
GOP 46%
Undecided 10%
So my question is this. How does your model predict what the 10% will do? Don’t undecideds tend to vote more for the incumbent? And if that’s correct, Dems have enough close, contested, incumbent races to maintain the senate.
I saw a poll that had undecideds at 21%
There is no clear pattern as to how undecideds break. Best guess is an even split, which is how we assign them.
That used to be the rule and that was one of the main arguments that Karl Rove made in 2012. However there has been no evidence of that for at least a decade.
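The even split of undecideds described above can be sketched in Python using the example numbers from the question. The function name `split_undecideds` is hypothetical, and this is only an illustration of the stated rule, not PEC's actual code.

```python
def split_undecideds(dem: float, gop: float, undecided: float) -> tuple[float, float]:
    """Assign undecided voters evenly to the two candidates.

    All arguments are percentages of the poll sample. An even split
    raises both shares equally, so the margin between them is unchanged.
    """
    half = undecided / 2
    return dem + half, gop + half


if __name__ == "__main__":
    dem, gop = split_undecideds(44, 46, 10)
    print(f"Dem {dem:.0f}%, GOP {gop:.0f}%, margin R+{gop - dem:.0f}")
```

Because both candidates gain the same amount, the assignment changes the projected vote shares but not who leads or by how much.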
I was reading this article – http://www.dailykos.com/story/2014/10/14/1336609/-Democratic-early-vote-outpacing-2010?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+dailykos%2Findex+%28Daily+Kos%29 – on the Daily Kos (I recognize it has a Democratic bias) and this NYT piece that it linked to http://www.nytimes.com/politics/first-draft/2014/10/14/?entry=2273&_php=true&_type=blogs&partner=rss&emc=rss which seem to indicate that my analysis regarding Democratic GOTV efforts has some merit. It’s early days yet, but considering that the Republicans have implemented a lot of voter suppression laws (curtailing early voting, voter ID laws, eliminating same-day voter registration, etc.) across the country over the past four years, it’s a promising sign at least. We will have to see if it translates to an increase of Democratic voters on Election Day as opposed to the same voters voting early.
The Democratic internals in Georgia that Politico reports would be quite stunning in their implications, if indeed Nunn is not only leading but nearing 50%. We know the registration effort has been phenomenally successful. We also know that the registrations of 40,000 Georgians are being withheld from the rolls because of a mere 25 “suspicious” applications. If these voters are disenfranchised it would account for about 1.5% of the total vote. If Nunn can still lead with that factored in, it will be a well-deserved comeuppance for Republican officials. The polls are now clearly showing the effects of Perdue’s outsourcing comments, and the dynamic of this race has changed markedly.
I suspect that the DSCC’s $1 million investment in Georgia, like their $1 million drop in SD, is a good indication that they’ve seen something that the MSM hasn’t quite seen yet (at least the SurveyUSA poll confirms this trend towards Nunn). If they can continue blasting Perdue over his outsourcing comments and get as many of their voters out as possible, I very much think Michelle Nunn will win outright on Nov 4th, in spite of the Republican efforts to steal the election. (I would go as far as to say that if she comes first but it goes to a runoff, there will be a major incentive for most Democratic voters to participate despite Georgia’s history, especially if Senate control hangs in the balance on that seat. And yes, I am taking into account 2008, which I think is different as it was an incumbent who came first in a Democratic wave year, narrowly missing the 50.1% mark in Round 1, and there wasn’t a guarantee that a Dem victory for that race would result in the Democrats getting their magic “60” seat filibuster-proof supermajority – the Minnesota race was about to go into a legal wrangle rivaling the 1974 New Hampshire Senate battle, and Specter showed no signs of defecting.)
Anyway, at this point in the cycle, I don’t care how the Democrats do it, but as long as they end up with 50 seats and Vice-President Biden keeps them in control of the chamber come January 3rd, it will be a victory (although I think they will end up with 53 seats). And besides, seeing Rove having another meltdown when he’s realized he’s wasted yet another $100 million+ on failing to recapture the Senate will definitely be worth it.
But isn’t Georgia where lots of new registrations for Dems have yet to be entered into the voting lists?
Yet another factor to account for–mishandling of voter registrations, thereby invalidating potential votes, most of which are minority votes.
I watch your optimized picks when deciding how to contribute. But I’m wondering about GA with Nunn seeming to move ahead and Dems pulling out with funding. It isn’t currently one of your picks, but is that likely to change? What do you think?
I agree, it is about to move up the ladder. Perdue is still at +2%, but that might be due to old polls.
I’ll be reevaluating later today. Others entering the mix: SD, NC, KS. Starting to think AR is done, not sure yet.
Well even though the actual Democratic poll showing Nunn near 50 has not surfaced AFAIK, now we have an independent poll from SurveyUSA showing the same thing:
http://www.surveyusa.com/client/PollReport.aspx?g=196fcc85-d1bc-4da9-b4be-4344a384d33d&c=26
What is so extraordinary about SurveyUSA’s numbers is that they reveal in their analysis that Nunn trails Perdue among men by only 3, up from a peak of a 19 point deficit (she already has a huge advantage among women); trails by only 6 in the independent vote, up from a 28 point deficit; and now leads the Greater Atlanta area by 22 points, up from 10. What this shows is that this isn’t just newly registered voters. She is dipping heavily into Perdue voters.
Lots of love here. 40% conveys more comprehensible information to more people than 0.4. This is especially true for those who are not particularly statistically literate. Go for it.
I’m a bit confused by what it means semantically to have a confidence interval around a prediction. The “true” outcome that D+I will hold 50+ seats is either 0% or 100%. The prediction is the best estimate of probability that can be extracted from the data. The CI means…. that any value within this range may be as good an estimate as any other? Maybe the Election Day probability should then be expressed as a range: 15-45% instead of 30±15%.
I have learned a lot from this set, including the commentariat. I also have learned a lot from Nate Silver over the years, although I prefer the approach to election forecasting here at PEC. In many ways it seems that the approach here is a better manifestation of the principles of data analysis that undergird 538. However I have begun to wonder recently whether the recent shift of perceived advantage towards R, if it continues, does give evidence that “special sauce” has its role. I say this quite tentatively, since one data point (election result) does not a theory validate. Nevertheless the beauty of the MetaMargin, whose utility I have come to appreciate more and more recently, also highlights the situations in which “special sauce” may be required. Please help me understand this. If the recent shift in preference towards R holds up and the GOP does take (say) 52 seats, does that not fit with Nate’s view that there are underlying fundamentals that would eventually move the election towards this result, once voters (not just pundits and prognosticators) really begin to focus on the choice at hand, usually around September? At one time Sam chided Nate for giving 52R more of a chance than 52D, but now both models make the former more likely than the latter. On the other hand, if the Dems squeak by and retain the Senate, it will probably be because of systematic polling error (not necessarily bias, just mismeasurement of some sort). The MetaMargin is a great tool for those kinds of hypotheticals, but if that is the only way that we can foresee a Dem victory given the current data, then does that not say that fundamentals have more long-range predictive power?