Housekeeping note

October 11, 2016 by Sam Wang

Thanks to all the readers who noticed something wrong with the poll-aggregation rule. It’s fixed now, thanks to rapid response by PEC co-conspirator Walker Davis. I apologize for the glitch.

The rule should have been “last 3 polls or 7 days, excluding repeat contributions from the same pollster.” Polls are sorted by the end date of the survey period. The reason for the no-repeat rule is to avoid excessive influence from any one pollster’s methods. This rule had been implemented only if the polls overlapped in their date range. Now the rule is implemented for any multiple contributions, even if the samples were taken at different times. The code is here.
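
For readers who want to see the rule spelled out, here is a minimal Python sketch. It is not the PEC code. The `Poll` class, the function names, and the sample numbers are all hypothetical. It also makes two assumptions the text above does not settle: that "last 3 polls or 7 days" means using the 7-day window when it holds at least 3 polls and otherwise falling back to the 3 most recent, and that repeat pollsters are dropped before that window is applied.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from statistics import median


@dataclass(frozen=True)
class Poll:
    pollster: str
    end_date: date   # last day of the survey period
    margin: float    # Clinton minus Trump, in percentage points


def select_polls(polls, today, min_polls=3, window_days=7):
    """Pick the polls that feed a state median (assumed reading of the rule)."""
    # Sort by the end date of the survey period, newest first.
    ordered = sorted(polls, key=lambda p: p.end_date, reverse=True)

    # Keep only the most recent poll from each pollster, whether or not
    # the survey periods overlap.
    seen = set()
    unique = []
    for poll in ordered:
        if poll.pollster not in seen:
            seen.add(poll.pollster)
            unique.append(poll)

    # Use the last 7 days if that gives enough polls; otherwise the last 3.
    cutoff = today - timedelta(days=window_days)
    recent = [p for p in unique if p.end_date >= cutoff]
    return recent if len(recent) >= min_polls else unique[:min_polls]


def state_median(polls, today):
    """Median margin of the selected polls."""
    return median(p.margin for p in select_polls(polls, today))


# Invented example: Pollster X reports twice within the same week.
sample = [
    Poll("Pollster X", date(2016, 10, 9), 1.0),
    Poll("Pollster X", date(2016, 10, 5), 0.0),
    Poll("Pollster Y", date(2016, 10, 6), 7.0),
    Poll("Pollster Z", date(2016, 10, 4), 6.0),
]
print(state_median(sample, today=date(2016, 10, 11)))
```

With these invented numbers, counting both Pollster X entries would give a median of 3.5; counting only the newer one gives 6.0. That is the kind of shift a single prolific pollster can cause when repeats are allowed.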

Use of the correct rule changed the Meta-Margin by about 0.7% toward Clinton, probably because now UPI/CVOTER can only weigh in once per state. There are also some new state polls this morning, which moved the Presidential estimator an additional 0.8%.

One final note – I still have to repair yesterday’s estimate in the history. Dates previous to that should be okay.

64 Comments

Brad says:

Love the transparency of this site, Sam. Absolutely awesome work this election season.

Pat says:

Has everything been corrected? Currently, in the “power of your vote”, Virginia is indicated as “Clinton +1%”. It seems to me the only way for this to happen is if 2 UPI/CVOTER polls are used for the median.

Sam Wang says:

I think it’s okay…check again in a few minutes. I will do the manual calculation later to make sure.

WildIrish says:

Professor Wang, something seems to be weird with the Power of Your Vote listings. When I click on IL (+7%), the difference in the polls is 9.1; AZ shows +18, but the difference in the polls is 10.3. IN and AK say too few polls to calculate, but show +14% and +28%, respectively. NH shows +2%, but the difference in the polls is 0.4.
I am probably misunderstanding how this works, but I wondered if the correction for the duplicate polls today caused downstream issues.
Is the listed percentage supposed to represent the difference between R and D poll values?
And just for the record, I am in absolute AWE of your ability to do any of this. I know just enough stats to know how brilliant you are. Thank you SO much for helping us understand the ins and outs of poll analysis.

Sam Wang says:

Those are HuffPollster’s aggregates, not PEC’s. Thanks for reading.

Matt L says:

Sam,
The update to the algorithm makes sense, but it still makes me nervous that the aggregator has been changed twice this election season, both more favorable toward Clinton (The first one I remember is “sharpening the Presidential forecast.”)
I’m not sure what I’m trying to say, since I understand neither change was “adding secret sauce,” other than that one of the appeals of coming here is that there’s not secret sauce or extensive tinkering.

Sam Wang says:

The main effect of this fix is to reduce the size of fluctuations. It would be hard to introduce a lasting bias in the estimator.
Also, I have a hard time believing that sites who are silent on their fixes have made no fixes at all. They just don’t make a fuss over it. You’re criticizing me for transparency…you always have the option of not reading my disclosures.

Some Body says:

The only potential concern I see here has to do with how this fix came about: several commenters were nervous about a surprising drop in the MM, and quickly came up with a diagnosis, leading to this fix. What’s concerning? Well, we PEC readers are a strongly left-leaning lot, and so a bug that results in a better MM for Clinton (& other Dems) is less likely to be spotted by us. Over time, this could result in a bias.

Josh says:

Your transparency is why this site is the first thing I read when looking for election forecast updates. It’s what sets this site apart and I hope you continue to run it that way; this is a great resource.

Tony Asdourian says:

Sam, he’s not criticizing, he’s just looking for reassurance that the revision of the method is party neutral. In other words, that you would have done the same if Trump was up by 5. I feel sure you would have, but it’s not a critique per se to ask for reassurance of lack of bias.

Sam Wang says:

To take a page from how philosophers and social psychologists deal with cognitive bias, consider-the-opposite is a good strategy. Scenario: in 2020, if a Democratic-leaning pollster floods the zone, would we start letting in multiple polls per pollster again? That would be a bad move!
It is certainly the case that PEC readers will be quicker to notice problems that adversely affect the Democratic candidate. Any response, as long as it is theoretically neutral, should have at most a temporary effect. A trick is to maintain theoretical neutrality.
Response to another reader: it is tempting to recalculate the entire history. However, the whole history will change because we don’t store the date the poll entered the calculation – we only store the date range of the sample. When we recalculate, polls would enter the history at least a day earlier. So my preference is to recalculate after November 8th. Past years’ graphs are, in fact, recalculated that way.

Matt McIrvin says:

This is why Sam was uncomfortable with Huffington Post’s decision to omit landline-only polls from their feed. It seems like a sound decision, except that this is something that probably makes the aggregate look better for Democrats, and it’s not clear that they’d make an equally fiddly omission to make things look worse for Democrats. Better to just leave them in. But I think they’re still out.

Ed Wittens Cat says:

One of my most beloved professors
once said
All humans have cognitive bias
Mathematics is how we deal with it
🙂

Kevin King says:

No need to apologize! Thank you and Walker Davis for all your work and expertise. I’ve learned a lot from you and greatly appreciate it!!! 🙂

bks says:

That will teach Sam to stop saying things like: Here it is: poll-based Presidential prediction is not very hard.

Matt McIrvin says:

In the grand scheme of things, it wasn’t a large move; the most noticeable thing was that UPI/CVOTER sees Virginia as a tossup state.

Paul says:

The evidence suggests it’s software QA that’s hard.

Erik Pescara says:

I think a PEC git would help in that aspect and help readers to collaborate and submit their own bugfixes and/or additions.

Paul says:

Erik: I think that’s a great idea, although I can tell you from experience that being an open source project owner — guiding, reviewing, and accepting changes — takes a _lot_ of work. I’d hardly expect Sam to take that on himself.

Scott J. Tepper says:

If the Senate and the Presidential vote are moving in lockstep, why has the probability of a Democratic takeover of the Senate declined as the probability of a Clinton win has moved to 95-97%? Is it that the decline is not statistically significant, or that the polling data for the Senate lags chronologically behind the Presidential polling?

Josh says:

Senate polling generally lags presidential polling. There are fewer polls done per race and polling isn’t done as frequently.
As an example, in one of the most-polled Senate races, Hassan-Ayotte in NH, the six most “recent” polls on the HuffPollster feed date from September 12 to October 5. By contrast, just in the last 5 days we’ve had 6 new presidential polls come out.

David says:

Speaking of UPI/CVOTER, the difference between their latest survey and the one immediately prior to it is remarkable in its consistency. It seems like in every state they polled, which is most of them, Clinton is doing 3-6% better in the latest poll than she was doing in the prior one.

Ken L says:

Interesting that as Clinton consolidates her lead, the EC map appears more and more polarized.
I hope Sam can illuminate the neuroscience of the undecided voter. It appears to me, that in the age of the internet, the qualifications and records of the candidates are known and the data and information are easy to find.

Erik Pescara says:

The polarization of the map is a natural occurrence. The closer the election gets the less time is left for a change and the more likely is the predicted outcome.
In theory even a 1% lead on election eve should be enough to make a prediction of the outcome if you feel confident in the data.

Rachel Findley says:

Yes please help us to understand what’s going on with the undecideds. I suppose intentions freeze when different pieces of information point different ways. What does neuroscience say?

Daniel Barkalow says:

It’ll be interesting to see if there’s a sudden shift in the EC map now that GOP leadership and proxies are changing positions. The polls haven’t been distinguishing between “Trump” and “generic Republican” in their questions, so it’s entirely plausible that Trump will go back to underperforming in red states without that being any sort of contradiction with recent polls.

Matt McIrvin says:

The map is a snapshot of the most likely outcome if the election were held today; I don’t think that’s in any way dependent on how much time is remaining in the campaign.
I think the polarization increases just because there’s a hard ceiling of very red states that are really difficult to flip or even de-polarize. You can see that in the way the upper tail of Sam’s yellow “watch zone” in the EV chart looks compressed, though it doesn’t in the Meta-Margin chart.

Erik Pescara says:

Matt: Yes, you are completely right, I mixed that up. And I agree with you, it seems that the states are not continuously distributed on the left-right axis.

trlkly says:

Can you just use the data and rerun it with the fix in place?

Swami says:

You must not know many businesspeople or attorneys.

Rick says:

I was an academic for 20 years, and since then I’ve worked with a lot of business people and attorneys. I can tell you that academics are the least likely to admit a mistake.

Nathaniel Hedman says:

A big plus-one for the PEC git idea. There are a lot of us who hang out around here with pretty strong technical backgrounds who could not only spot bugs, but chip in site improvements when we had spare time.
As it is, there’s no real process for either.

Arthur Neelley says:

Thank you Sam for all the excellent and thorough work you do ! I can assure you we all appreciate it, especially this go round.

Jay Bryant says:

Errors happen. You handled it the right way, by fixing it and then owning up to it for your readers. Good on you.

David Elk says:

PEC’s EV prediction in animated form, using (maybe) the fixed history: http://imgur.com/a/Lrzxs

George says:

That is way cool. Thanks.

A New Jersey Farmer says:

Very nice.

Bela Lubkin says:

David, wow! That’s excellent.
Will there be ongoing maintenance of this through the end of the campaign?

Michael Hahn says:

Very nice!! Thanks for doing this!!!

David D. says:

Very cool! Thanks!

Michael says:

You, Sir, are a gentleman and a scholar.

Rachel Findley says:

Is the House in play now? Focus on Senate, on House, or stay with the Presidential race in fear of the 5% chance it might go wrong?
The particulars of whether and when a Republican candidate disowns Trump could matter to the outcome of the House and Senate races.
Also, of course, the policies advocated by the candidates, as well as their character, matter to me.
At this point it probably boils down to GOTV and protecting the vote. I don’t really want to fund a whole bunch of ugly ads.
Any guidance about where to focus at this point?

fred flint says:

Undecideds this late are a myth. Like all those “undecideds” at the debate cheering putting Clinton in Jail. That whole audience was supposedly selected based on being “undecided”.

David Fry says:

The whole audience wasn’t undecided, the smaller number of people on the stage who were queued up to ask questions were to be undecided. The audience was anyone who could get a ticket, most provided by the campaigns.

David Cutler says:

Sam,
This is vaguely off topic here, but suddenly occurred to me as the dumbest thing you could do to demonstrate your central thesis of this election: namely that it has shown very little variability, and that the cause is likely due to political entrenchments. To see this graphically:
Add two vertical lines to your EV moving average picture / projection. Make those lines at 332 and 365 and label them Obama 2012 and 2008. I think it would make a pretty nice picture showing that the Clinton EV estimate has hardly left the range of EV votes Obama got (other than for a few week period where it only slightly dipped below but stayed well above 270).
Cheers,
dave

Ray Jones says:

I don’t see a mention of the change in the comments at the top. I just wanted to make sure I was seeing the latest version. There appears to be a commented out condition in drop_overlapping_polls() that would match this change, but I’m not sure.

Ed Wittens Cat says:

this is simply ridikkulous–
opening a civil war four weeks out from the election?
https://twitter.com/billmon1/status/785884648681443328
the tweets are coming from inside the House!!!

Michael says:

Didn’t the Republicans sign on for this when they went to Trump Tower, hat in hand, and had that public signing of “The Pledge?” Think of the number of crises that have occurred since then, as well as before, and you realize that’s all there’s really been.

Michael says:

Sam,
If you can keep your head when all about you
Are losing theirs and blaming it on you,
If you can trust yourself when all men doubt you,
But make allowance for their doubting too…
Rudyard Kipling would be proud.

538 Refugee says:

Two things.
1. The idea of open source is multi-faceted but the end goal is always to ‘get it right’. The more eyes on the code the better.
2. Arizona. At least one place has it for Clinton and another tied based on polls not included in the Pollster feed. On one hand I’d like to think of this as significant, but even if Clinton takes Arizona it may be more about Trump’s treatment of McCain than anything else.

538 Refugee says:

“Mrs. Clinton’s campaign has concluded that at least two traditionally Republican states, Georgia and Arizona, are realistic targets for her campaign to win over. And Republican polling has found that Mr. Trump is at dire risk of losing Georgia, according to people briefed on the polls, who spoke on the condition of anonymity.”
http://www.nytimes.com/2016/10/12/us/politics/donald-trump-gop.html

A says:

Just a note for those who are interested, a report is out that republicans are cratering in early voting in NC, down by half as of now from 2012…
http://www.dailykos.com/story/2016/10/11/1580971/-Holy-shit-North-Carolina-Republican-voting-is-down-by-HALF-compared-to-2012

Scott J. Tepper says:

Perhaps they’re waiting to see who the Republican nominee will be soon.

Andrew says:

I also keep seeing weird comments in news articles along the line of “Clinton may open up an insurmountable lead due to early voting.”. My thought has been to dismiss this as ratings-generating punditry thus far, but now I’m beginning to wonder.
As interesting as this is, it may not be a trend since in NC the Republicans have had some specific embarrassment at the state level. Also, NC is really a “pink” state but is being governed as if it is deep red, so maybe it doesn’t surprise me that the earliest of early voting is showing some deviations. But if Texas or Tennessee were to show the same result, then I’d say we’ve got a trend! (But if Texas really went much more heavily for the DNC, would we ever find it out? Our experience with Fla. in 2000 suggests our electoral bureaucracy is perhaps not as robust as we would wish.)

Andrew says:

P.S. CNN is looking at early voting for a larger number of states:
http://www.cnn.com/2016/10/11/politics/donald-trump-hillary-clinton-debate-early-voting/index.html
Looks like a mixed bag to me so far, but there are only two states where they discuss proportions of early votes by party affiliation, NC and IA. I assume OH is a mistake or they changed the rules.

Josh says:

I hate to be *that* guy, but NC is going to have about 5 million votes for president this year. That Republicans have cast 8,000 this year at a time when they’d cast 16,000 in 2012–and after a hurricane devastated a number of heavily GOP-leaning counties in the last week–doesn’t seem particularly noteworthy to me. But I could be wrong! If you have more info on this, would love to see it.

Scott J. Tepper says:

If the meta margin increases significantly (5% now) will the number of Electoral Votes predicted to the winner go up, too? Is there a calculable correlation?

Bill says:

Is the rapid uptick in the meta-margin the past couple of days an indicator of changes since the Trump tape leak on Friday, or is it still aftermath of the first debate? Looking through Huff Pollster it looks like the UPI/CVOTER poll is the only one out so far that covers any time after the tape leak, so I’m inclined to think that this is still aftermath of the first debate. I admittedly only looked through swing states though, so I could be missing something.

Matt McIrvin says:

These Ipsos/Reuters polls cover such a long survey period, it’s hard to say what they’re indicating at this point in the race.

A says:

Another interesting tidbit–a new poll (not sure how accurate) has Clinton, Trump and McMullin basically in a 3 way race…
Trump’s support apparently falling very low with Mormons after the tape surfacing…
http://www.deseretnews.com/article/865664606/Poll-Trump-falls-into-tie-with-Clinton-among-Utah-voters.html?pg=all

Sam Wang says:

McMullin’s presence is going to save Mia Love there. If his supporters stayed home, she’d be in trouble. Instead, they turn out – and vote for a Republican downticket.

Matt McIrvin says:

I don’t think most of the big national pollsters are even asking about McMullin. Utah is the one place where I might believe he’s a major player.

PPM says:

I think there is some important and interesting psychology here. I have only anecdote to support the contention that academics can be more stubborn in not admitting mistake. I’m not sure it is true overall, but I can see how it might happen.
Academic scientists need funding to survive. To get funding, it really, really helps to have general interest. This creates an environment in which a bit of sloppiness gets rewarded (at least temporarily) by press attention (or publication in a better journal). I think we see this happen with things like the news that flossing hasn’t been proven to work. I bet there will be some funded grants on flossing next year!
A second factor is what happens _after_ an error is discovered. If you work for a firm as an attorney or a company as a business person, the error is, I think, easier to own. The responsibility is more shared by the whole organization. A good organization recognizes that errors happen and rewards those who are honest (as long as the errors are few and not too consequential!).
Academics really have their reputation on the line every time they publish. Even a minor paper carries reputational significance. A bad figure or poor reasoning can happen even to the best scientists. The shame in publishing an error can be very large and the fear can outweigh honesty. The onus is on the authors entirely, there is no social mechanism in which honesty in these circumstances is praised (generally, but not quite true).
In some ways this is more like being a pop musician than a business person or attorney. You make a mistake, it’s you on the stage. It’s you that gets remembered for messing up.
The way the errors generally get fixed is: (1) you publish a new paper talking about your previous result as a straw man argument (2) an ambitious younger person shows that the previous result was wrong.
These things sort themselves out in the long term, I suppose. The problem comes when policy is built on uncorrected errors. You never want people to be hurt by a social construction that is intolerant of error and so doesn’t correct the error.

Gopa says:

Yes, admitting one’s mistake and then rectifying is a sign of intellectual honesty–both as a professional and as a human being.
I have seen several in academia who do not practice it, and that shocked me as I thought they were above it–as opposed to, say, businesspeople or attorneys.
