Princeton Election Consortium

A first draft of electoral history. Since 2004

A flaw in the NYT Now-meter?

August 2nd, 2012, 10:01am by Sam Wang


Peter Norvig asks: “You give odds of 10-1 and Nate 7-3. Can you explain the difference?” The answer is: yes, I can, though it will take several blog posts over the coming weeks. At the end, I will unveil a model, based on polls alone, that makes a true prediction.

Most poll aggregators restrict themselves to polling data only: electoral-vote.com, Pollster, RealClearPolitics, and so on. All show a stable Presidential race. It is obvious that in an election held today, Obama would win by about 100 EV.

At the NYT’s FiveThirtyEight, Nate Silver does something different. He includes indirect factors: GDP growth and unemployment, as well as corrections to individual pollsters. This is done both for a November prediction, and also for a “Now-meter,” a somewhat mysterious entity that sounds like it measures current conditions – though here I argue that it does not.

I believe that both the Now-meter and the November model take an incorrect approach to forecasting the Presidential race, one that adds noise but not signal. (Disclaimer to frequent readers: I’m not just throwing mud – I have something better in store.)

Think of an analogy to weather. Consider the following map:

[Hurricane forecast map: a red circle marks the storm’s current position, and a white cone shows its projected track.]

We all know how to read this. The red circle marks where the storm is now. The white zone indicates where the storm is likely to go in the future.

In this analogy, pollsters are like weather instruments: they provide a lot of information about where the Presidential “storm” is now. Nate Silver is attempting to be a forecaster, asking where the storm is headed in the future.

One problem is clear immediately: adding indirect factors to the current picture, as is done in the Now-meter, conceals the storm’s current location. We are robbed of information about the current state of the race. Imagine putting the widest part of that big white zone around the red circle. This would be of no use to people in Honduras and Guatemala who wanted to know where the storm was.

A second problem is worse: we do not know how much signal is added by the indirect factors. No known political-science fundamentals model – Douglas Hibbs’s “bread and peace” model, Ray Fair’s econometric model, and their successors – is particularly predictive. A November win probability of 70% makes it clear that Silver’s uncertainties are still rather large. He has probably run regressions on multiple variables, then put them back into a formula. I note that this approach is unverified. That is inherently not his fault…but one should be skeptical.
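To make the concern concrete, here is a minimal sketch of the kind of fundamentals regression at issue – in Python, with made-up numbers standing in for real election data:

```python
# A toy version of a "fundamentals" regression: incumbent-party vote share
# regressed on economic growth. All numbers below are hypothetical placeholders.
import numpy as np

growth = np.array([2.0, -0.5, 3.1, 1.2, 4.0, 0.3])     # growth rate, %
vote = np.array([51.0, 47.5, 53.2, 49.8, 55.1, 48.0])  # incumbent vote share, %

slope, intercept = np.polyfit(growth, vote, 1)          # ordinary least squares
predicted = intercept + slope * growth
rmse = np.sqrt(np.mean((vote - predicted) ** 2))        # in-sample error, % points

print(f"vote = {intercept:.1f} + {slope:.2f} * growth (RMSE {rmse:.1f} pts)")
```

Even in-sample, the residual error of such a fit is typically a few percentage points – which in a close race spans the difference between a likely win and a toss-up.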

Indeed, the information added by the indirect factors might already be manifest in the polling data. And if the polls already reflect economic conditions, the econometric variables are adding noise – but not signal.

So…if the NYT’s Now-meter and the November model don’t measure current conditions, and they’re not really forecasts, what are they? Hmmm. Roughly speaking, I’d say they indicate perceived odds, InTrade-style.

However, there is one situation in which the complex model is of help: the missing-data problem. In House and Senate races, far less information is available. For example, it is not at all clear where the North Dakota Senate race (Heitkamp v. Berg) stands. Here is where a model based on non-polling data might help. It’s an area where Silver can shine. I would be glad if he (and you also, dear reader!) concentrated on that.

In coming essays I will describe an approach to extracting predictive information from polls alone, using the 2004 and 2008 races as a starting point.

Tags: 2012 Election · President

28 Comments so far

  • wheelers cat

    “However, imagine the blowback from Rasmussen fans.”

    Yes. IPOF Tomasky’s post that uses 538’s model has over five thousand comments on it.
    It is a terrible thing that in this country there are two kinds of science, real science and conservative science, and now apparently two kinds of statistical polling.
    Nate shouldn’t be encouraging this.

    “Anti-intellectualism has been a constant thread winding its way through our political and cultural life, nurtured by the false notion that democracy means that ‘my ignorance is just as good as your knowledge.’”
    ― Isaac Asimov

  • wheelers cat

    I am a big fan of robust statistics, and I totally agree that that is the best way to distill information from Rasmussen. My problem with Rasmussen is that Scott deliberately distorts the data, and Nate absolutely knows this. Nate wrote about it in 2010.
    And Rasmussen’s robocalling is probably the single greatest flaw in his likely-voter method.
    Nate is a classic Bayesian, and the cell-only demographic didn’t really exist in 2008.
    For example, smartphone users are going to vote Obama in November 49 to 31. Rasmussen’s LV model failed to capture this demographic in 2010 in Colorado and Nevada.
    The market force – his paycheck from the NYT – actually leads Nate to “spit in the soup”.
    He includes Rasmussen in the interest of prolonging the horserace, and the truly idiotic “fairness policy” of the media – giving representation to both sides.

    I believe in a biological basis for all behavior, and sadly I think Nate’s libertarian tendency also influences his work.
    Libertarians believe that all humans are entitled to their beliefs, even if those beliefs are wrong, stupid, bad or evil. I think it must be very conflicting to be a “glibertarian” and a statistician.

  • wheelers cat

    “Whatever the case, I am confident that he is being loose with treating uncertainty. The “probability” makes it apparent that he is adding more noise than signal.”
    ok, that is genius. I totally agree.
    I actually think Nate is trying to deliberately add uncertainty to his model. And who is to say that is wrong? After all, probability can never be exactly zero, right?
    Nate attempts to remove noise from Ras polling by adjusting for house effects. But Ras isn’t really a good pollster. His LV model was fooled by cellphone demographics in CO and NV in 2010, and he uses crude tricks like poll saturation and poll selection to shape a narrative. So why include Ras at all?
    Uncertainty.

    • Sam Wang

      wheelers cat – I use median-based statistics, which allows Rasmussen to be used without dragging the estimate away from the true value. Using averages, as other sites do, the best way to squeeze out information would be to subtract Rasmussen’s offset. It would have to be done for all pollsters, not just Rasmussen. However, imagine the blowback from Rasmussen fans.
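      A small illustration, with invented poll margins rather than real ones, of why the median resists a single pollster’s house effect while the mean does not:

      ```python
      # Median vs. mean aggregation with one biased "pollster". Margins are
      # invented Obama-minus-Romney leads, in percentage points.
      import statistics

      unbiased = [4.0, 5.0, 3.0, 6.0, 4.5]       # hypothetical honest polls
      with_house_effect = unbiased + [0.0, 1.0]  # same race, one pollster ~4 pts low

      print(statistics.mean(with_house_effect))    # ~3.4: dragged toward the offset
      print(statistics.median(with_house_effect))  # 4.0: barely moved from 4.5
      ```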

      Good point about cellphone demographics – though people in the business tell me this is a big topic. The AAPOR meeting is filled with people who chew over the methods with care.

      Uncertainty added on purpose? That’s like spitting in the soup, blech. However, I will point out that there is no market force that would lead him to do a better job. A better job just leads to fewer things to say.

  • wheelers cat

    Do you use Rasmussen polls in your intake data, Sam Wang?
    This could certainly account for any disparity.
    Your histogram of Romney’s chance at 270 EV is a single data point, presumably corresponding to a black swan event. I can’t tell from the graphic what Nate has calculated Romney’s chance at 270 EV to be. It looks very small. Approaching zero in the limit, perhaps?
    He does have the popular vote probabilities on the sidebar… but the EV probabilities are coupled with popular vote probabilities.

  • BrianTH

    “In contrast, polls-alone perform quite well — almost as well as predicted by their statistical sampling error, which implies that pollster-to-pollster biases cancel one another.”

    I don’t think that proposition nails down either issue I noted above (how to allow for correlation between possible polling errors in the model, and whether or not to use national polls). I might note Gelman specifically explained why the correlation issue meant the results in the 2010 election did not definitively prove Silver’s model had been underconfident.

    • Sam Wang

      One could use national polls with suitable corrections. The trade-off, if one does it right, is a moderate gain in time resolution vs. an increase in uncertainty.

      It was really the table of comparisons Gelman assembled that I wanted you to look at. It is plainly true that in the past, Silver has underestimated probabilities. I regard this as a settled point. In regard to correlation between polling errors, I have never liked the argument, and have written on the subject. Some formal modeling (with illustrative toy simulations) could be used to test the idea.
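      As a flavor of such a toy simulation, here is a sketch – state margins and error sizes are invented for illustration – of how a shared error component pulls a win probability toward 50%:

      ```python
      # Toy simulation: effect of correlated (shared) polling error on win odds.
      # State margins and error magnitudes are invented for illustration only.
      import numpy as np

      rng = np.random.default_rng(0)
      margins = np.array([2.0, 3.0, 1.5, 4.0, 2.5])  # hypothetical leads, % points
      n_sims, state_sd = 100_000, 2.0

      def win_prob(shared_sd):
          # error common to every state (an industry-wide uniform miss)
          shared = rng.normal(0.0, shared_sd, (n_sims, 1))
          # independent state-by-state sampling noise
          local = rng.normal(0.0, state_sd, (n_sims, len(margins)))
          simulated = margins + shared + local
          wins = (simulated > 0).sum(axis=1) >= 3  # carry a majority of the 5 states
          return wins.mean()

      print(win_prob(0.0))  # independent errors only: high confidence
      print(win_prob(2.0))  # shared error added: probability pulled toward 50%
      ```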

  • badni

    BTW: Long-time listener, first-time caller. Love your site. Followed it in 2008 and am glad you’re still putting in the brainpower to put out analysis I couldn’t hope to replicate.

  • BrianTH

    Personally, I’m still just at the stage of trying to understand what methodological differences account for the difference in results. Which methodology is providing “better predictions” would be a question for a later stage.

    Anyway, here is a recent post on the national/state issue:

    http://fivethirtyeight.blogs.nytimes.com/2012/07/24/state-and-national-polls-tell-different-tales-about-state-of-campaign/

  • badni

    I do have a question you may be able to address: what does it really mean to say there is an x% chance someone will win (or a storm will hit) 90 days from now? Could we test the accuracy retrospectively by looking at someone’s 90-days-out predicted odds for 1000 races (or storms) and seeing whether that percentage of races ended up that way, or that percentage of storms hit?

    • Sam Wang

      In a sense, this has been done. Read this and this, which show that two prognosticators, InTrade and Nate Silver, are habitually underconfident in their probabilities. This arises (by definition) from inefficient use of information.

      In contrast, polls-alone perform quite well — almost as well as predicted by their statistical sampling error, which implies that pollster-to-pollster biases cancel one another, especially if one uses median-based statistics, as I do. This means that a sufficient number of polls will outperform the two prognosticators listed above. As I state in this post, the prognosticator approach is mainly useful for the missing-data problem.
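      Badni’s proposed test can be expressed directly in code. A sketch, using simulated forecasts in place of real 90-days-out predictions:

      ```python
      # Calibration check: do events stated to have probability p occur a
      # fraction p of the time? Forecasts and outcomes below are simulated.
      import numpy as np

      rng = np.random.default_rng(1)
      stated_p = rng.uniform(0.0, 1.0, 1000)  # 1000 hypothetical forecasts
      happened = rng.random(1000) < stated_p  # outcomes drawn to match them

      for lo in np.arange(0.0, 1.0, 0.2):
          in_bin = (stated_p >= lo) & (stated_p < lo + 0.2)
          print(f"stated {lo:.1f}-{lo + 0.2:.1f}: observed {happened[in_bin].mean():.2f}")
      ```

      A well-calibrated forecaster’s observed frequencies track the stated bins; a habitually underconfident one shows, for example, events called at 70% coming true far more than 70% of the time.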

  • Obama 2012

    As an Obama supporter I’m hopeful that it’s actually the Rasmussen and Gallup tracking polls that are the “outliers” this year.

    This would make sense in light of how much better Obama is doing state by state in comparison to the very tight election Gallup & Rasmussen are showing nationally…

    The national #s from Pew fit in with those state by state #s quite a bit better.

  • badni

    I am certainly not qualified to agree or disagree with your math. I just thought that accusing a statistician of ad-hoc adjusting outlier polls is a statement about his integrity, and I wanted to clear up the facts.

  • dabni

    1. Nate’s adjustment of the Pew poll was not ad hoc. He simply explained how his model treats polls with a particular kind of history – all Pew polls would be adjusted the same way, based on the house effect and RV/LV effect he has calculated. It may be worthy of criticism, but it absolutely is not ad hoc data fitting.

    2. The “nowcast” is purely poll-based.

    • Sam Wang

      To make my point again: none of these things matter. A better prediction can be made without all of those assumptions and corrections. It adds noise, not signal. Therefore why bother chewing over the details? Better to spend time on something that matters, like figuring out which Senate races to support.

      That said, if you have something for me to look at, please email it to me directly. Thank you for commenting.

  • BrianTH

    Oh, another thing I believe Silver is doing is using national polls in conjunction with state polls, with the state polls appearing relatively more favorable for Obama than the national polls.

    Therefore, using the national polls might at least partially explain why Silver’s Now-Cast assigns lower odds than a model that uses only state polls.

    • Sam Wang

      I’ll look into that, especially if you have documentation. In 2008 his documentation was terrible, and he showed a tendency to add variables along the way in an ad hoc manner. Certainly he could make an attempt to get rid of obviously redundant information, but as far as I am aware, these relationships are not well enough understood to do a complete job. Let’s put it this way: this is not error analysis in a particle accelerator — it’s a bunch of econometric regressions.

      Whatever the case, I am confident that he is being loose with treating uncertainty. The “probability” makes it apparent that he is adding more noise than signal.

      Stay tuned.

  • BrianTH

    I’m not sure I understand this critique of Silver’s Now-Cast. According to Silver, the Now-Cast actually removes economic factors and depends solely on his polling analysis:

    http://fivethirtyeight.blogs.nytimes.com/2012/06/12/a-guide-to-forecast-model-updates/

    “If you want to get a sense for the flow of the polls and polls alone, you can look at the ‘now-cast’ — our estimate of what would happen if the election were held today. The now-cast removes the economic component from the model.”

    So I still don’t quite understand what explains the difference in odds that Peter N. raised.

    I will note that I believe Silver has previously attributed Obama’s seemingly rather modest odds of victory – despite the wealth of polling data showing him ahead in more than enough states for an Electoral College victory – to the possibility that if the polls are wrong, they could be wrong in a fairly coordinated fashion. It seems to me there is a lot of room in this particular area for somewhat different analyses to lead to significantly different results.

  • Matt McIrvin

    Hey, epicycles at least had solid empirical motivation.

  • Olav Grinde

    Thank you for putting this in perspective!

    When time allows, I am very interested in reading your take and prediction on the Senate and House races.

    Whoever ends up in the White House next January, their ability or inability to pass legislation and budgets, and get appointments approved, etc etc, will depend on the makeup of Congress.
    (And, of course, on who has bought votes on both sides of the aisle, but that is a separate and perhaps unallowable topic…?)

  • Matt McIrvin

    The Pew poll could very well be a simple high statistical outlier for Obama. But, by the same token, the recent polls showing good results for Romney could have been outliers too. You really don’t know in advance.

    The right way to deal with outliers isn’t to try to detect them off the cuff, explain them away and discount them. It’s to keep aggregating data. The pure statistical outliers will tend to cancel each other out. Silver does seem to recognize this, but he also keeps tweaking his model in complicated ways.

    • Sam Wang

      Adding more and more parts to a model – it’s bad form to add so many when one doesn’t understand what is in there already. It makes me think of epicycles. Ptolemaic Nate?

  • Olav Grinde

    It was interesting to see Nate Silver’s attempt yesterday to explain away the recent Pew Research Poll that gives Mr Obama a 51–41 lead over Mr Romney.

    This seems an egregious example of a statistician trying to “tweak” a huge data point to make it fit in better with his other data points and — more importantly in this case — core assumptions.

    I think what we are seeing across the board is Mr Romney’s failure to connect with voters, other than those who are already dead set on voting against Mr Obama.

    Moreover, presidential elections are won in the sum of state results, deciding in turn the Electoral College split — which of course is the premise of this site.

    Nationally and in critical states it seems the gap persists, and Mr Romney’s recent attempts to paint himself as the statesman-in-waiting were a dismal failure. In this regard it is worth reading about the GOP candidate’s missteps abroad as analysed by the BBC’s North America Editor Mark Mardell at http://www.bbc.co.uk/news/

    • Sam Wang

      My view is somewhat narrowly focused – so much so that you may find it un-fun. I believe that a major benefit of aggregating polls is that it allows us to see what is really going on, and therefore avoid having to explain individual data points. I believe Nate Silver’s view is that of a color commentator, where individual data points are chat-worthy. In this case…it’s just an outlier which does not change the fundamental truth, which is that Obama has been unambiguously ahead all season, and that the genuine suspense is in the Senate and House.

      All that said, of course I am interested in all the narrative: cerebral Obama, tax-dodging Romney, unable-to-connect Romney, shaky economy…it’s just that I regard it as a parallel story to where the race actually is. Kind of like human-interest stories by reporters broadcasting from a hurricane strike path. The meteorologist should not be doing that!

  • Gerry E.

    It may be helpful to convert Peter N.’s odds from 7-3 to approximately 2.3-1. To put this in perspective, Nate is saying that Romney is about three times as likely to win as Sam is saying. Someone is wrong and I think we know who it is!

    • Sam Wang

      This is not how I think about it, since it is predicated on looking at the output of a black box. As a black-box designer, I am questioning what is inside the box!

      It will not be possible to say who was right based on the election outcome, since the quantities expressed are probabilities. In some sense, if Romney wins, then Silver is 30% right based on today’s prediction, and I am 10% right. However, even this does not make sense because what he and I call probabilities will change over time. What I am really saying is that he has constructed a bad model.

      My point is that his measure (a) is not actually a probability, (b) is statistically poorly constructed, and (c) does not give us a clean read on today or Election Day. I believe what it gives us is conventional-wisdom betting odds. That is fine for entertainment purposes – it’s about like relying on InTrade for hurricane forecasting!
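      For completeness, the odds-to-probability arithmetic behind this exchange (the snippet simply restates the numbers quoted above):

      ```python
      # Convert "a-b against" odds into the underdog's win probability.
      def underdog_prob(a, b):
          return b / (a + b)

      print(underdog_prob(10, 1))  # Sam's 10-1: P(Romney) ~ 0.09
      print(underdog_prob(7, 3))   # Nate's 7-3: P(Romney) = 0.30
      # Ratio 0.30 / 0.09 ~ 3.3: Silver rates a Romney win roughly three
      # times as likely as this site does.
      ```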

  • Olav Grinde

    And one more thing: It detracts greatly from the site that the “comments” shown on the front page for the meta-analysis are from 2008!!

  • Olav Grinde

    A most interesting blog entry!
    I read FiveThirtyEight regularly and have long searched for a pattern in Nate Silver’s methods. Thanks for clarifying the answer: there isn’t any!

    I do have one strong wish for this site, though. It would be far better if you tidied up the front page, so that it refers more clearly to the 2012 elections. Moreover, in my opinion Mr Wang’s latest blog entry belongs on the front page — not buried several layers deep through a left-side menu choice.

    A bit of tweaking might help your website get more of the traffic that it surely deserves!