Princeton Election Consortium

Innovations in democracy since 2004

Nate Silver mostly puts his finger on what happened

May 18th, 2016, 1:09pm by Sam Wang

I think journalists have missed the point about Nate Silver’s error. Since Silver personifies data analysis, it is easy to get mixed up about what failed. As I wrote last week, the data didn’t fail – clear signs pointed toward Trump for a long time. However, Silver went beyond the data – in his words, he “acted like a pundit.” Here are his comments. The essay is long, but the title is on point. Basically I agree with points #1 (he didn’t make a real statistical model) and #4 (“fundamentals”-based models might not add that much value).

A reader asks what I think of the claim in point #3 that he was “too frequentist” and that his “Bayesian prior” of a Trump nomination should have been 10-12%. Hmmm. My first thought is that estimation of priors requires a lot of judgment. I don’t fault him for that…but he should own his estimates. To my taste, he leans too hard on political science, which relies on nice, stable trends. In a disruptive race like the 2016 GOP nomination contest, this leads to problems.

Since estimating a prior probability relies on an element of taste, it seems to be an error to cling to apparent quantification (“10-12%”). It creates the appearance of rigor, but not the substance. Revising such a number post hoc doesn’t seem constructive. This move up from 2% goes high enough to reduce embarrassment, but stays low to retain credibility. Hypothetically, if Trump had lost, would we be reading about this revision? Probably not.

Also, a purely poll-based approach such as mine, from January, worked well. Frequentist or Bayesian? You tell me. “Too frequentist” seems to be leaning hard on terminology. If Bayesian means “exercising judgment in interpreting data,” I would say he was too Bayesian. But let’s forget those terms. Basically, I was fortunate enough to notice that multi-election trends like “The Party Decides” were giving strange and contradictory answers. So I went back to polls, which were being quite clear. I say: let’s stick with polls when we can, and since modeling can be intrusive, keep it separate.

Update: Reader Kevin points out that “to support the proposition that ‘relatively few people predicted Trump’s rise,’” Silver links to an article featuring data-less opinions, with the subtext “plainly that only cranks were confidently predicting Trump’s success in 2015 (you think we look bad–look at the other side!).” This seems like a good time to recall that data-loving media figures who saw Trump coming include Norm Ornstein, Paul Krugman, Matt Yglesias, and Andrew Prokop at Vox (in September 2015!!).

Tags: 2016 Election · President

44 Comments so far ↓

  • AySz88

    Since you’re more a frequentist, I wonder what you think of section 3? He seems to be diagnosing that his original estimates were too frequentist, and points out a Bayesian estimate of ~10-12%.

    • Sam Wang

      I have to read that carefully. My first thought is that estimation of priors requires a lot of judgment. I don’t necessarily fault him for that…but he should own it. To my taste, he leans too hard on political science, which relies on nice, stable trends. In a disruptive year like the 2016 GOP nomination, this leads to problems.

      It seems to me to be an error to cling to quantification (“10-12%”) when such judgments require an element of taste. Revising such a number when looking back smacks of getting the number high enough to reduce embarrassment, while keeping it low enough to remain credible. Tell me, if Trump had lost, would he have performed this revision? Probably not.

      “Too frequentist” sounds like b.s. to rope in the statistics nerds.

    • Amitabh Lath

      Wow, that’s long and rambly.
      Hidden in there in the middle somewhere is a statement about fundamentals not being all that. Perhaps going forward they’ll lay off the secret sauce.

    • Sam Wang

      Yeah, I am slogging my way through it. He is obviously stung by this – otherwise why write such a long essay?

      I am revising my post here as readers respond. Hope that is okay with everyone.

    • 538 Refugee

      I don’t know if I can finish what seems to be nothing more than the long version of, “I didn’t want to believe it so I didn’t.”

    • AySz88

      Sorry, I need to clarify that the phrase “too frequentist” was my interpretation of what he was saying, not a quote from him.

      His actual words were that he was using “a kind of rigid empiricism”, and goes on to cite 6-8 prior candidates who all lost the primary. He contrasts the frequentist interpretation (“0-for-6 or 0-for-8”) with a Bayesian calculation that has an uninformative prior, explaining it as hedging (“the uniform prior has you hedging a bit toward 50-50 in the cases of low information”).

      (I agree that he’s making a leap to say he would have actually published a model with that 10-12 percent number. It occurs to me that if he actually used that methodology, there are still plenty of ways to produce something like the original lowball estimate. Since there were ~16 candidates, he may well have started with a prior with a mean p = 1/16, right?)
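As a rough illustration of the arithmetic in this comment (not Silver's actual calculation, which the essay does not give as code), the sketch below compares the frequentist 0-for-n estimate with the posterior mean under a uniform Beta(1, 1) prior, and with a more skeptical Beta(1, 15) prior whose mean is 1/16. The function name `posterior_mean` and the choice of Beta(1, 15) are assumptions made for illustration; many other priors share that mean.

```python
from fractions import Fraction


def posterior_mean(successes, trials, prior_a=1, prior_b=1):
    """Posterior mean of a Beta(prior_a, prior_b) prior after binomial data.

    With the default uniform prior this is Laplace's rule of succession:
    (successes + 1) / (trials + 2).
    """
    a = prior_a + successes
    b = prior_b + (trials - successes)
    return Fraction(a, a + b)


# Hypothetical inputs: 6 or 8 comparable past candidates, none nominated.
for n in (6, 8):
    frequentist = Fraction(0, n)                 # raw 0-for-n rate
    uniform = posterior_mean(0, n)               # Beta(1, 1) prior
    skeptical = posterior_mean(0, n, 1, 15)      # assumed prior, mean 1/16
    print(f"0-for-{n}: frequentist={float(frequentist):.3f} "
          f"uniform={float(uniform):.3f} skeptical={float(skeptical):.3f}")
```

Under these assumptions the uniform prior gives 1/8 (12.5%) for 0-for-6 and 1/10 (10%) for 0-for-8, in line with the 10-12% range, while the Beta(1, 15) prior gives 1/22 and 1/24, roughly 4-5%. The choice of prior, which is a judgment call, moves the answer by more than a factor of two.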

      The article is definitely difficult to interpret precisely, especially when he is avoiding the standard statistical terminology in favor of common parlance.

    • Josh

      This whole essay seems like a lot of words to say “I ignored all the polling data because now I get paid to create the news”.

  • Tony Shifflett

    After reading his piece I immediately thought, “damn, didn’t I just read that over at Wang’s place…”

  • josh f

    538 was trending towards more polls and fewer other factors. I can’t find the post now, but I swear in 2012 NS wrote something about how most of his fundamental factors added very tiny edges. For their 2016 primary methodology they investigated many more fundamental approaches but went with the simplest ones and then segmented them further.

    I think breaking out their polls-based ‘beta’ type models from ‘alpha’ type approaches like polls-plus is valuable even if they don’t work equally well.

    Most sites don’t have ESPN resources to do lots of exploration, so I think 538’s structure makes sense not only for them but benefits everyone. Also, I believe Dr. Wang alluded that post-ESPN 538 probably has to appeal to a broader audience and get them to read more regularly, so this may have contributed to them emphasizing daily content and punditry over more boring/static descriptions of models (which said the same thing every day). This doesn’t affect their models, but it does affect how their models are perceived.

    That said, NS’s first debrief on Trump was BS, but this latest one he got right. As long as they’re being honest and falling forward, I will continue to be a loyal reader, with outside checks like Dr. Wang’s site here.

    • Matt McIrvin

      If the fundamentals adjustments were small, I wonder why they were even in there. The validity of a small adjustment to a political model is probably very hard to test, and getting rid of it removes a parameter. Simple models that are powerful are what you want.

  • anonymous

    In that post, Nate Silver seems to take pains to not mention your analysis (although he cites the Linzer fundamentals model assessment paper). He also seems peeved at Rutenberg from nytimes. All in all, the long post raises some good points, but does not seem like a good example of reflective and dispassionate data journalism. To be fair though, it does resemble some academic papers I have read amidst scientific disputes.

    • AySz88

      Regarding the Rutenberg article, Nate complained on the podcast that the two apparently had some sort of working conflict when they were at NYT (something about different narratives on Romney’s chances), so I think you have to read the articles (perhaps both of them) with that in mind.

  • Kevin

    Silver’s mea culpa is a step in the right direction, but doesn’t go far enough in diagnosing the problem. It wasn’t just stubborn commitment to lowballing Trump. 538 also confidently predicted early on that the Republican nominee “must” be either Bush, Walker, or Rubio, and gripped that conclusion equally tenaciously in the face of contradictory evidence (including the entrance and rise of Trump). Walker, of course, didn’t even make it to Iowa, and it was relatively easy to see that Bush and Rubio were dead in the water if you actually looked. Insisting that Trump’s support was a bubble or a mirage became a necessary condition for making a case for any of these candidates.

    Despite Silver’s moderation in tone, I still detect something less than humility and transparency in his essay. The insistence in Part 2 on measuring the performance of his models based on “calibration” smacks of cherry picking available measures to find one that seems reasonably favorable. What’s the root mean squared error? This measure would utterly trash his “polls-plus” model, which I assume is why he hasn’t published it.
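To make the distinction between the two kinds of measure concrete, here is a minimal sketch run on made-up numbers rather than any published 538 forecast. The helper names `rmse` and `calibration_table`, the five-bin layout, and the sample data are all hypothetical.

```python
import math
from collections import defaultdict


def rmse(probs, outcomes):
    """Root mean squared error between forecast probabilities and 0/1 outcomes."""
    squared = [(p - o) ** 2 for p, o in zip(probs, outcomes)]
    return math.sqrt(sum(squared) / len(squared))


def calibration_table(probs, outcomes, n_bins=5):
    """Group forecasts into equal-width probability bins.

    Returns a list of (mean forecast, observed frequency, count) per
    non-empty bin, ordered from the lowest bin to the highest.
    """
    bins = defaultdict(list)
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[idx].append((p, o))
    table = []
    for idx in sorted(bins):
        ps = [p for p, _ in bins[idx]]
        obs = [o for _, o in bins[idx]]
        table.append((sum(ps) / len(ps), sum(obs) / len(obs), len(ps)))
    return table


# Made-up forecasts (probability the favorite wins) and results (1 = won).
forecast = [0.9, 0.8, 0.7, 0.7, 0.3, 0.2, 0.1]
result = [1, 1, 1, 0, 0, 0, 1]

print("RMSE:", round(rmse(forecast, result), 3))
for mean_p, freq, count in calibration_table(forecast, result):
    print(f"forecast ~{mean_p:.2f}  observed {freq:.2f}  (n={count})")
```

Calibration asks whether events given roughly 70% actually happen roughly 70% of the time, while RMSE charges every forecast for its distance from what happened, so confident misses cost the most. Reporting both, rather than one, makes it harder to pick whichever looks better.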

    Where Silver really lost credibility this season, imho, was in the immodesty of his predictions. Instead of spending his time disclosing his priors and couching his work in appropriate disclaimers, he chose to spend his time trash talking people who had a different view of Trump’s chances. Somewhere the meaning of the phrase “early primary polling isn’t particularly reliable” got twisted to mean proof that Trump’s candidacy was bound to fail.

    The headline “How I Acted Like A Pundit And Screwed Up On Donald Trump” is dead right.

    • Kevin

      This is going to be harsh, but it also torques me that to support the proposition that “relatively few people predicted Trump’s rise,” Silver links to a Slate article entitled “They Totally Knew: The People Who Foresaw the Rise of Donald Trump.” The pundits/entertainers listed in this article are Joe Scarborough & Mika Brzezinski, John McLaughlin, Scott Adams, Chris Cillizza, Mike Cernovich, Tom Anderson, and Howard Stern. With apologies to Cillizza, the subtext is plainly that only cranks were confidently predicting Trump’s success in 2015 (you think we look bad–look at the other side!).

      Maybe! But Silver doesn’t want to acknowledge that, while his rating of Trump was stuck in the single digits, others like Wang, Krugman, Ornstein, and Yglesias were pointing out that, contrary to conventional wisdom, Trump sure looked like the favorite.

      Why not? Perhaps because Silver’s insecurity impelled him to pick embarrassing fights with Wang, Krugman, and the whole staff of Vox early on, and so he now pretends that none of these people (or their analyses) exist. Also there was an evident effort to rebrand 538 as more conservative-friendly after the move to ESPN, so he probably feels he can’t acknowledge Mann and Ornstein either.

      Schadenfreude is not usually my thing, but I have been taking guilty pleasure in Silver’s comeuppance. Because he deserves it.

    • AySz88

      I have to point out that he has indeed been giving props to Ornstein (on Twitter, and in the “Why Republican Voters Decided on Trump” article a couple weeks ago).

      Yes, there’s a conspicuous absence of recognizing any other rivals, but I wouldn’t interpret this so personally. The whole article is about why people should give him another chance, so it’s not like it’s going to be a neutral review that might drive traffic to competing sites.

    • anonymous


      I think you grasped something I was struggling with. Nate Silver is probably under severe pressures in running the present version of 538. It is a commercial venture that has to maintain reader interest and distinguish itself from the competition. He is also now responsible for the 538 staff, whose livelihood depends on the success of 538. A large part of the success of 538 is tied to his own reputation as an infallible prognosticator. If he fails in just one aspect, it probably has financial consequences.

      That being said, he really has nothing to lose and much to gain in being comprehensive and fair in writing an assessment about who was right, who was wrong, and why. The fact that he is not, and that he takes swipes at people like Rutenberg gratuitously (whatever their history might be), does not reflect well on him.

    • Kevin

      I understand that Silver may have strategic or personal reasons for wanting to avoid driving traffic to people on his enemies list, and that’s his prerogative. But it doesn’t reflect well on him. I don’t think it hurts to be collegial. What he has shown instead is an extremely limited capacity to weather criticism. To the extent that Silver avoids reckoning with or refuting credible contrary analyses, he impoverishes his own work.

    • AySz88

      To be honest, I don’t think there’s much value in seeing someone trying to rate their direct competitors, especially without some sort of controls in place (ex. anonymization, as done in peer review). Whether the potential bias comes from financial or commercial pressure (which anonymous articulated well), or simple ego and reputation, it just wouldn’t be credible.

    • Kevin

      Oh, I’m not terribly concerned about interrating. The 538 style is to pepper their posts with external links. It’s conspicuous that the links are all to certain types of outlets and not others. Also there is an effort to paint a picture of the landscape of election commentary, sometimes by imputation through selection of links and sometimes by direct assertion, which is distorted in ways that superficially bolster 538.

  • Amitabh Lath

    See, this is what really gets my goat. Even in an apology screed like this there is a desire to obfuscate with math. Start with an absurd proposition that Jesse Jackson, Gary Hart, Newt Gingrich, Herman Cain and the rest are somehow topologically equivalent to Trump according to some silly homeomorphism, and because none of the 8 got the nomination… math math math…and viola!…p-value of 10% for Trump! Bayesian, so must be right! Ugh.

    It’s ok to have mistrusted the data when it gave what you considered ridiculous results. It happens in science all the time. But stop trying to use arithmetic to give gravitas to your gut feelings.

    • Olav Grinde

      …and viola!

      As I read the aforementioned mea culpa, I am indeed hearing violins – but methinks the viola has nothing to do with this. ;)

    • Amitabh Lath

      You got me Olav.
      Voila: there it is, vs. viola: stringed instrument.
      I’ve always been prone to malapropism. I know, it goes against the stereotype of US-born South Asians being spelling-bee champions, but some of us have to be in the left-hand tail of the distribution.

    • Joseph

      “I’ve always been prone to malapropism.”

      But it was such a cute malaprop! I’d string it along, but I’d probably end up with a sour note….

  • Olav Grinde

    I believe that Nate Silver’s mea culpa should take the form of a promise: full transparency in the future. Whenever he goes beyond a polls-only interpretation (read: adding his “special sauce”), he should be honest enough to tell us what is in the sauce, and how much.

    In other words, Mr Silver should have the honesty to tell us how hard he is “putting his finger on the scale”, and the precise rationale behind this.

    What I have always admired about Dr. Sam Wang’s approach is his full transparency. Read: anyone who wishes, can duplicate his data-based analysis.

    • Matt McIrvin

      “What I have always admired about Dr. Sam Wang’s approach is his full transparency. Read: anyone who wishes, can duplicate his data-based analysis.”

      Sam actually makes a major mistake once in a while, but because he shows his work, his mistakes get rapidly caught and corrected, and the corrections are the basis for further work.

      Then people on other blogs talk smack about him for making them in the first place, whereas if he were arguing without visible numerical support in the first place, nobody would notice.

  • Truthy

    I completely lost faith in Silver after his disastrous UK election performance in 2015. After that I found out he was equally awful in the UK 2010 election, but he likes to sweep that one under the carpet.

    His excuse then, “no one else saw the UK conservatives winning so you shouldn’t blame me that I got it wrong,” is not a million miles from his main excuse now, “no one else predicted Trump winning back in January so don’t blame me” (even though Sam and others did predict it).

    • P G Vaidya

      I worshiped him in 2008. However, I followed the 2010 British election closely and lost all faith in him. That is why, when I found this website, it was such a breath of fresh air and lovely mathematics beyond linear regression.

  • Zach

    I’d argue fundamentals do matter quite a bit, but that they get reflected in polling once folks pay attention to the race. So once you are close enough to an election polling tells you all you want to know as best as any measure can. If you are way out from an election, fundamentals will tell you more than polling. Trying to use fundamentals to correct polling once we are far enough along for polling to be reasonably accurate is to misunderstand what fundamentals are doing causally. In doing that you end up counting them twice.

    Now of course if polling is sparse and polls disparate, fundamentals might tell you which is more likely to be off (and they might tell you something about the quality of the polling methods, as you could check the internals to see if they make sense). But once there are a lot of polls, trust the polling aggregate of properly conducted polls.

    One spot where Poli Sci might matter is with the rapid polling improvement of Trump v Clinton since Indiana but not v Sanders. Republicans are beginning to accept Trump. Sanders supporters aren’t there yet with Clinton. Now if this trend holds a couple weeks after June 6 that’s interesting. Until then it might not be real.

    • Commentor

      I’d suggest that this election provides more evidence that fundamentals can be a veneer or post hoc rationalization after a more type one (in Kahneman’s terms) style of decision making from voters.

  • whirlaway

    Silver’s article would have been much shorter if the question was “why” rather than “how”. Of course, the reason *why* he screwed up on Donald Trump is that he is part of the establishment media now and wanted to see Trump defeated.

  • CyclicLaw

    The problem with his mea culpa is right there in the title. He isn’t “acting” like a pundit, he “is” a pundit. He should stop making excuses and just fess up…building narratives for an audience pays a lot better than interpreting boring old data.

    • Joseph

      Like it. Fess up and move on.

      Stepping back, nobody seems to be asking WHY. Why did this man come from nowhere to the top of the ticket? We know why Mrs. Clinton is there; why is Mr. Trump?

      There are forces at work in our society that it would behoove us to be aware of….

    • Amitabh Lath

      Good question. When an outsider makes a big splash in a field, coming in without credentials or endorsements, it usually turns out the person was actually well trained, but in an unorthodox way.

      I would argue Trump was being schooled in politics and group dynamics over the past few decades in the celebrity/tabloid world of New York. He didn’t come out of nowhere.

    • Lorem

      In fairness, it’s probably not in a pundit’s best interest to admit they are a pundit at the moment, so Nate probably should not fess up.

      As to where Trump came from, I recall reading that he did a good deal of prep work: talking to a previous primary candidate, designing a platform, checking that said platform had popular appeal. The background in showmanship certainly helped as well. So, I don’t see any particular mystery.

      Which is not to say that I have no questions remaining, but rather that they are narrow ones. I more wonder things like: “What is the proportional contribution of various factors to populism becoming more popular now than it was in the past?” Instead of wide ones like: “Why did Trump stand a decent chance?”

    • alurin

      Actually, a lot of digital ink has been spilled on the “why” question. There are a number of interesting theories. Perhaps most convincing to me at this point is the “authoritarian personality” account. However, it’s not really a question amenable to the types of rigorous quantitative analysis that Dr Wang has been conducting.

    • Joseph


      I think you’re on to something! Literally my first “hit” in typing in “Trump as authoritarian” gave me this:

      And btw, this rests on a statistical analysis that’s right up PEC’s alley….

    • 538 Refugee

      “A PPP poll found that a third of Trump voters Twenty percent said Lincoln shouldn’t have freed the slaves” (from the VOX article quoted above)

      Maybe my standing joke “The Republican Party, the party of Lincoln. We freed ’em, we can bring ’em back” isn’t funny anymore? Well, maybe it never was. ;)

  • Kathryn Zunich

    Of interest in this discussion might be the fact that in February, Washington and Lee University’s mock convention identified Donald Trump as the GOP nominee this year, and has quite an impressive record. See “” and “”

    They got it right despite what the pundits were saying and how they and all the analysts were assessing the data and passing judgement. As the Times said, “A group of about 150 students undertook meticulous research for the last two years to produce Saturday’s announcement. They consulted widely with journalists, academics, party officials and strategists to build profiles of each state’s primary contest and each candidate’s possible paths to victory.”

  • Bill Herschel

    He was wrong, you were right. Come back behind your shield or on it. In the future, I will be at PEC.

    Bayesian does not mean bloviation. There is infinitely too much of the latter at 538. Can’t be slogged through.

  • M J Sheppard

    Silver’s list of excuses misses the main one. He came from the far left “progressive” blog “Daily Kos” and his politics and prejudices simply would not allow him to see what was blindingly obvious. What is also obvious is that his site should be marked “for entertainment purposes only.”

    He continued on with his ludicrous “endorsement primary,” which showed key GOP figures endorsing Bush/Rubio in particular long after it was obvious that such a “the party decides” formula was meaningless.

    Silver assembled a “panel of experts” who presented a list of “How many delegates Trump would win.” From Utah to Oregon they got every single one wrong, 15 primaries in a row.

  • Joel

    Silver was much better as a postdoc than he is as a principal investigator, so to speak.

    • Sam Wang

      I would have said that he’s somewhere between a department chair and a dean. Both come with new motivations that are different from the priorities of individual researchers.
