Princeton Election Consortium

A first draft of electoral history. Since 2004

Peer review…by Twitter?

October 1st, 2014, 2:00pm by Sam Wang


I’ve been an author on 70 scientific publications. They all went through a process called scientific peer review, where anonymous reviewers critique a paper and the authors respond. Eventually an editor decides to accept or reject the work. The process can take a while, and the editor’s role is key. My most recent paper, which provided a new way of looking at the biology of autism, had no fewer than five peer reviewers, one of whom was rather hostile. It’s an arduous process!

To my surprise, unfolding on Twitter is an alternate-universe version of peer review. But the analogy is not quite right. The most prominent reviewer’s name is known to me (and to you). In this analogy, you, the reader, are the editor.

In my experience, only a handful of strategies get a result through peer review successfully: (1) point-by-point response to every critique, (2) pointing out factual error by the reviewer, and (3) new data. With that, let me reply.

>>>

First, I should say that the best part of this discussion is to focus everyone on the national election. In some ways, the national election in 2014 is the closest electoral question since 2000, Gore v. Bush. This year’s contest is critical for shaping the coming two years in the United States.

To the reviewer: I appreciate the comments. Thank you for mentioning the Princeton Election Consortium (PEC) online. It’s driven up traffic to levels I’ve never seen in midterms before. However, I am concerned that you don’t really understand our current methods. The flavor of the statistical approach comes from physical sciences, and may seem unfamiliar. To step outside the usual peer review process for a moment: it seems like something that could be solved over a beer or two.

RESPONSE TO REVIEWER

There are a few general themes in the critique:

  1. Claims that PEC makes excessively precise statements of probability, focusing mainly on work done in 2010.
  2. A lack of accurate mention of PEC’s actual methods, which we have used since 2012.

There are a number of factual misstatements. Where possible, I will redirect the discussion to how PEC actually works.

Other commentators (Drew Linzer, Hans Noel, John Sides, Andrew Gelman, and others) have engaged in a constructive discussion of core issues: Has the Senate race been stable since June, or will it break in one direction? Does the value added by fundamentals make up for the cost of partially concealing a purely polls-based picture? Several of them brought up what I regard as the single most legitimate criticism of the PEC calculation: we assume that the ups and downs of polls since June will predict future movement. This might, or might not, hold up. We won’t know the answer to that until after the election!

RESPONSE TO SPECIFIC COMMENTS

Work done before 2010

Thanks for pointing this out. It’s true that in 2010, PEC mis-called Nevada and Colorado – as did FiveThirtyEight.

Regarding probabilities, these criticisms confuse my statement of current polling conditions (a “snapshot probability”) with a true prediction for the election outcome. In 2010, we did not make the distinction clearly, but have done so since 2012. A true forecast requires an assumption about future change in polls. Our assumptions are here.

The reviewer omitted the fact that in 2012, a polls-only approach led PEC to call every Senate race correctly, which FiveThirtyEight did not.

Accusations of excessive precision:

Again, the reviewer has confused a snapshot with a prediction. The goal of the snapshot is to get a daily distribution of outcomes, and to calculate a single-race margin or a Meta-Margin (defined as how far a multi-race contest would have to swing to create a perfect tie). Of course these daily outcomes fluctuate – as they should. That then feeds our prediction model. In fact, it’s the source of the Election-Day uncertainty – which the reviewer wants to see.
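For readers who want to see the snapshot idea in concrete terms, here is a rough Python sketch. It is an illustration under stated assumptions, not PEC's published code: each race's win probability comes from a normal model with an assumed 3-point spread, a "tie" is taken to mean a 50% chance of reaching the needed number of seats, and the names `win_prob`, `seat_distribution`, `control_prob`, and `meta_margin` are hypothetical.

```python
import math

def win_prob(margin, sd=3.0):
    """Chance the Democrat leads, given a poll margin in points.
    Normal model; the 3-point sd is an assumption for illustration."""
    return 0.5 * (1.0 + math.erf(margin / (sd * math.sqrt(2.0))))

def seat_distribution(margins, sd=3.0):
    """Exact distribution of the number of races won, built one race at a time."""
    dist = [1.0]  # dist[k] = probability of exactly k wins so far
    for m in margins:
        p = win_prob(m, sd)
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1.0 - p)   # this race is lost
            new[k + 1] += q * p       # this race is won
        dist = new
    return dist

def control_prob(margins, needed, shift=0.0, sd=3.0):
    """Chance of winning at least `needed` races after a uniform swing of `shift` points."""
    dist = seat_distribution([m + shift for m in margins], sd)
    return sum(dist[needed:])

def meta_margin(margins, needed, sd=3.0):
    """Uniform swing that would turn the whole contest into a toss-up.
    Positive means the Democrats are ahead by that many points overall."""
    lo, hi = -30.0, 30.0
    for _ in range(60):  # bisection; control_prob rises as shift rises
        mid = (lo + hi) / 2.0
        if control_prob(margins, needed, mid, sd) < 0.5:
            lo = mid
        else:
            hi = mid
    return -(lo + hi) / 2.0
```

Calling `seat_distribution` on one day's margins gives that day's distribution of outcomes, and `meta_margin` collapses it to a single number in units of poll points. Recomputing both every day is what produces the fluctuating time series described above.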

The assumptions that go into the PEC calculation are extremely reasonable and simple. They have done well in every election since 2004. They are also publicly available.

The point regarding Alaska highlights an extreme case, where the scarcity of polls makes daily snapshots volatile. As the election approaches, this issue should resolve itself.

Arguments that fundamentals do not account for differences from polls-only models:

Here, the reviewer is taking advantage of momentary conditions to make a point. However, this is cherrypicking a result from the campaign season, and is highly misleading. Here is a more typical result.

The first data column is the PEC poll median from late August. The next two columns show what a polls-only win probability looks like. Finally, the last three columns show some polls-plus-fundamentals win probabilities. All probabilities are shaded according to who is favored, the Democrat (blue) or the Republican (red). “sum6” is the sum of probabilities (converted to seats) for six key races: AK, AR, CO, IA, LA, and NC.

I agree that polls-only approaches should give similar results to one another. But differences can arise from the specific methods used, including, in some cases, “house-effects” corrections. The reviewer seems to be implying that somehow PEC is doing something to bias the results. That is a serious accusation, and is false. PEC uses polls only, in combination with median-based statistical approaches, with no pollster-specific assumptions.
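As a sketch of what a median-based, pollster-agnostic snapshot can look like, the Python below applies the selection rule from the 2012 FAQ that a reader quotes in the comments: use the last 3 polls, or 1 week's worth of polls, whichever is greater. The function names, the data layout, and the choice to count a poll exactly 7 days old as inside the window are assumptions made for illustration.

```python
from datetime import date, timedelta
from statistics import median

def snapshot_polls(polls, today, min_polls=3, window_days=7):
    """Pick the margins that feed one state's snapshot.
    polls: list of (end_date, margin) pairs, margin in points (Dem minus Rep).
    Uses every poll from the last week, or the most recent 3 if that set is larger."""
    ordered = sorted(polls, key=lambda p: p[0], reverse=True)
    window = timedelta(days=window_days)
    recent = [p for p in ordered if today - p[0] <= window]
    chosen = recent if len(recent) >= min_polls else ordered[:min_polls]
    return [margin for _, margin in chosen]

def snapshot_median(polls, today):
    """Median margin of the selected polls; no pollster-specific corrections."""
    return median(snapshot_polls(polls, today))
```

Because the median ignores how far an outlying poll sits from the pack, a single unusual result moves the snapshot less than it would move an average. No house-effect adjustment appears anywhere in the calculation.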

The Upshot has documented the impact of fundamentals extremely well with their “leave out the fundamentals” tool. There is no serious question about the fact that fundamentals can affect a forecast – it’s a major reason for using them. The question, which we can talk about after the election, is how and where they add value.

Other criticisms:

This is indeed a simplification. Orman’s decision about how to caucus does not lend itself well to a polls-only approach. Showing the result this way is transparent, and lets the reader evaluate the result for him/herself.

On Monday, we had paused the calculation to allow time to re-evaluate whether the model’s assumptions had been violated. Extremely fast movement like we saw last week in Iowa and Colorado is very interesting – and is a chance to learn something new. However, even that movement was within expected fluctuations, so the model stays the same.

As stated yesterday, the model will continue as originally planned, and take account of uncertainty on Election Eve, with a parameter change triggered by feedback from attentive readers. As of today, the November prediction is once again live. Now, current polls are given added weight, which will increase as we get closer to the election.
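To illustrate what "added weight" on current polls can mean, here is a minimal Python sketch. It is not the PEC code. It assumes a linear ramp that starts 35 days before the election (a figure taken from a reader's description in the comments), blends the latest Meta-Margin with the average since June, and uses the historical spread of the Meta-Margin, floored at half a point, as the uncertainty. The names `blend_weight` and `predict` are hypothetical.

```python
import math
from statistics import mean, pstdev

def blend_weight(days_left, ramp_days=35):
    """Weight on the most recent Meta-Margin: 0 at ramp_days or more before
    the election, 1 on Election Day. The linear shape is an assumption."""
    return min(1.0, max(0.0, 1.0 - days_left / ramp_days))

def predict(meta_margins, days_left, ramp_days=35, min_sd=0.5):
    """Return (center, sd, win probability) for the Election-Day Meta-Margin.
    meta_margins: daily snapshot values, oldest first, in points."""
    w = blend_weight(days_left, ramp_days)
    center = w * meta_margins[-1] + (1.0 - w) * mean(meta_margins)
    sd = max(min_sd, pstdev(meta_margins))  # assumed uncertainty, with a floor
    prob = 0.5 * (1.0 + math.erf(center / (sd * math.sqrt(2.0))))
    return center, sd, prob
```

Far from the election the prediction sits at the season-long average. As `days_left` shrinks it slides toward the latest snapshot, so a late move in the polls shows up more strongly in the final weeks than it would have in July.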

SUMMARY

I believe that analysis of polls should not take center stage, but be a service that liberates readers to focus on big questions: where the race is, and what they can do to affect the outcome. To me, politics is not about polls, or even about the horserace. It’s about candidates and issues. When the analyst becomes the story, that’s antithetical to what data-based journalism ought to be. I hope we can get past this, and return to the real issues at hand between now and November 4th.

>>>

I thank readers Amit Lath, Bradley K. Sherman, Bum, Froggy, A New Jersey Farmer, Eidelweiss, and Alan Koczela for commenting on this before publication.

Tags: 2014 Election

78 Comments so far ↓

  • Keith G

    This Silver guy seems like a real jerk.

  • Anita B

    It’s nice to see Peer Review explained to the masses. I’ve always appreciated your analyses as more trustworthy. Never been a fan of Silver–too much influence from outside others.

    For me, it’s sort of like comparing results from a company funded study to an NSF funded study. Even though both claim objectivity, I trust the NSF or NIH research more.

  • atothec

    New polls have Orman +10 in KS, Hagan +4 in NC, Braley +1 in IA and Ernst +2 in IA.

    So minus the obvious outliers (Gravis, Quinnip.) the R’s haven’t gained much if any momentum in the last 5 weeks. I think that bodes well for Democrats.

  • bks

    P(D+I) >= 50 now 51%. Ouch! –bks

    • Sam Wang

      Yes. I am actually a bit puzzled since IA and CO are only at R+1%, the Colorado change being a change from R+2%. Might be the reduced # of days to the election.

  • Debra

    I don’t know enough about stats to contribute, but Sam said a few days ago, about Iowa, that “something is happening”.

    I learned that it was that the Democrat got caught on tape saying that “if you put me in the Senate, you will have a lawyer on the Judiciary Committee, not a farmer”.

    Apparently that video went viral among the many farmers in the state.

  • securecare

    I think calling him on his misstatements is necessary but nothing beyond that. Just say that you have explained your model and why you do it your way to invite seriously interested individuals to check the details on their own.

    No detailed comments on the details unless here for our edification. Most people won’t make the effort to understand no matter what you do.

    You are correct that TL;DR is a real problem to avoid so please avoid it.

  • Hugh J Martin

    Nate Silver’s reputation was made for the lay audience when he called 2012 correctly. That led to his current gig with ESPN, which may have invested real money in Silver.
    The lay audience, including many journalists, regard a discussion of method as inside baseball, at best. So something is going on. Maybe Sam’s New Yorker pieces, a forum journalists respect, prompted questions for Silver. Maybe it’s something else.
    But Silver’s basic argument that Sam is an outlier and not to be taken seriously will have an effect if Republicans take the Senate, even if PEC makes the correct call. It will look to some as if Sam put a finger on the scale to avoid embarrassment. Look at the comments here from informed readers who say Silver’s criticisms have merit.
    Maybe, but it’s Silver who obscures key details of his method and Sam who makes the details public. So I do think Sam should make a measured reply that points out inaccuracies. Explain also that PEC makes everything public because the goal is to increase public knowledge, not to exploit your findings for commercial gain. Keep explanations why PEC has a different and changing forecast nontechnical, and as short as possible. Invite readers to visit your post on how they can do their own forecast.
    Who knows if that will work. But doing nothing risks waiting for an election that “proves” Silver was right all along and Sam’s modeling was an also ran.

  • Philip Diehl

    Well, of course polls-only models are more volatile than those including fundamentals. Fundamentals anchor predictions and reflect an assumption that what happens during a campaign is less significant to the outcome than election analysts and observers believe. This assumption might, or might not, be warranted, or maybe it is in some elections and not in others.

    A good way to test the assumption is to have models that include fundamentals and some that don’t. It’s especially useful, and a lot more interesting, to have a model producing outlier predictions. All the better for testing the hypothesis.

    This business of dissing the outlier model also adds some spice to the discussion, but Silver has taken it into the realm of the schoolyard. This is nothing more than smack-talk at the end of the third quarter of a pickup basketball game. They can be tough words to eat at game over.

  • Kevin

    According to this article, Silver is now incorporating a “now-cast” into his Senate model. http://fivethirtyeight.com/datalab/weve-added-state-by-state-polling-detail-to-our-senate-forecast-interactive/ Copied from PEC’s snapshot?

  • Rob

    A brief reply – perhaps a tweet? ;)
    I’ve followed Nate since 2008, but am quickly losing all respect for him. Rise above, Sam. Continue to do what you do. Written explanations of your decisions in your model are appreciated. A pissing match is just petty. Rise above.

  • AlekseyN

    I think the issue here is that Nate Silver insists on the probability rating being a real probability indicator, while Sam Wang treats it as an abstract indicator. For Nate Silver, a claim of “Dems at a 93% chance of winning the Senate today” means that if some supreme being were able to test the prediction experimentally (e.g. by randomly generating a bunch of possible worlds that would look indistinguishable from the real one to Sam Wang, or to his model, and then forcing the Senate election on that day), then 93 out of 100 such experiments would lead to Dems winning. Nate Silver seems to think that in such an experiment there would be enough surprises to result in more than 7% wins for Republicans, so he reasonably believes that Sam Wang’s prediction is overconfident.

    • Sam Wang

      He does not understand what I’m using those probabilities for: to establish a range of possibilities. They are the very source of the November uncertainty that he emphasizes. I don’t consider this to be a legitimate argument, but more a statement that he doesn’t understand another person’s approach.

  • AySz88

    Some, uh, meta-commentary:

    Here’s another (not necessarily exclusive) possible motivation for Nate: Silver needs a foil with which to teach a lesson. Before, the foil was obvious – all the pundits that ignore or misinterpret polling (“unskewing”, anyone?). That won’t work so well this year. There’s a smaller geekier audience, more competition for the same eyeballs, and the fact that a lot of people have learned that lesson already. So Silver, who probably truly believes that Sam is committing some methodological sin, is picking on Sam’s model to bolster his own standing.

    There’s a problem with that reasoning though – before he was the little guy taking on the establishment. Now Sam is the little guy taking on the establishment.

    It’s true that Silver has more on the line – but he would have more on the line anyway, so he might figure he actually *doesn’t* have anything to lose by picking the fight.

  • Kevin

    Nate Silver has a brand to protect–all that noodling and hocus pocus over pollster ratings and everything else described in his “How the 538 Prediction Works” piece here: http://fivethirtyeight.com/features/how-the-fivethirtyeight-senate-forecast-model-works/#ss-38.

    Dr. Wang’s message that all that work may not be necessary, or might even be counterproductive (while obscuring the immediate reaction to breaking events), and his open source philosophy, are naturally threatening to the brand. It wasn’t long ago that Nate was in the appealing position of taking on the establishment with a collection of self-made algorithms; now he is the establishment.

    It’s too bad that he can’t rise above this and be a booster for experimentation and different ideas such as those represented on this site. Both approaches are interesting, and we’re clearly better off having both in the field and in healthy dialogue with each other. If Nate would lighten up, he would probably find it good for his readership numbers as well. Scrappy and unconventional works well for him; dour does not.

  • Tyson

    I didn’t read through all the comments, so maybe someone has made this point already.

    I know Dr. Wang doesn’t want to spend a lot of time on this debate, but I do think there is an opportunity to avoid confusion by making a change to the website:

    I read in a Salon article today that Nate Silver said this: “In particular, his forecasts are based on an average of his past snapshots since June,” Silver writes. “Since Wang’s is a ‘polls only’ model, this is equivalent to looking at polls back to June.”

    It seems to me from this quote that Silver misunderstands how Dr. Wang calculates his point estimate. I looked at the description of Dr. Wang’s method on the left bar of this site, under About Us, where it says “2004 code, data, and description.” Core principle 1 is that “Polls are the only inputs…” Core principle 4 is that “Polls from earlier in the year are used to predict future outcomes.” Based on those two sentences, what Silver says is correct. There is nothing in that whole page (that I could find) that says anything about more recent polls having more weight.

    Then I go to the FAQ page (from 2012), and it says that “For the current snapshot, the rule for a given state is to use the last 3 polls, or 1 week’s worth of polls, whichever is greater.” This implies that polls going back to June are NOT used for the point estimate (I believe those polls are used, however, for the confidence interval). NOWHERE on the Description page is it explained that the snapshots are only based on recent polls.

    You need to update the Descriptions page to make this clear. If you are using only recent data for the daily snapshot, this needs to be very clear. The description is called “The geeky version” – the geeky version should have the calculations clearly explained! Silver says your forecasts are an average of past snapshots going back to June. Is this true? I don’t think so. But I couldn’t tell if it’s true or not from your description page. That is not good! I’m sure it’s explained somewhere on your site, but there’s no reason to make it difficult for people. The Description link is at the top of your page, most people will just look there. If they can’t see a better explanation, they will assume Silver’s characterization is correct.

    • Tyson

      I just did further investigation. It looks like Silver did NOT mischaracterize Dr. Wang’s method – Dr. Wang has indeed been using the daily snapshots for the forecast, as I learned from reading this post:
      http://election.princeton.edu/2014/09/30/pec-switching-as-planned-to-short-term-forecast/

      This just reinforces my point. It should not be so difficult for the reader to understand something like how the forecast is being calculated! If I go to something called a geeky description, I want a geeky description! I know I could go in and look at the code, but I shouldn’t have to do that for basic questions like how you calculate the forecast.

      I don’t want to waste more time investigating, but I assume that the point estimate forecast is some kind of mean or median of the daily snapshots, and the confidence interval depends on the variance of the snapshots? So since the rise in Republican chances recently, the confidence interval should have become larger recently? Is that correct? I’d like to see that in the description.

    • MarkS

      I completely agree with Tyson. And I have as many peer-reviewed scientific publications as Sam.

      Treating Silver’s politicalwire piece as a referee’s report, each point seems like a valid criticism to me (except for the Orman issue, which is a well explained choice Sam has made). If I were a journal editor, I would want to see a substantive reply to each of Silver’s other points before accepting Sam’s “paper”.

    • Sam Wang

      Except that he has made false statements, for instance the hypothetical about 2012. One rule of the game in academia is: if one side makes demonstrably false statements, that is the most serious possible black mark.

      I think a briefer statement is in order. The risk is tl;dr (too long; didn’t read) syndrome.

    • Tyson

      I had posted two comments, and it looks like only my second comment was posted. My main point was in the first comment – for a visitor to this website, if one looks at the Description under “2014 code, data, and description” it is not clear that the snapshot comes from the median of the three most recent polls or whatever, and it is not clear how the daily snapshots are used to generate a forecast.

      I don’t want you to explain to me how it is done in this comment feed. I want it to be explained in the Description section. It’s not enough to call foul according to the rules of academia. We’re in the world of public opinion. If Silver makes false statements about PEC, and a reader wants to jump over to PEC to see if Silver is being accurate or not, PEC should make it easy for the reader to come to the correct conclusion. I’ve been a fan of PEC since 2004, but I had to do a bit of searching to figure out if Silver was right or wrong in his description, and it shouldn’t be so hard for me. And other readers will not dedicate that kind of time. If Silver makes a characterization and then the reader goes to PEC and can’t efficiently see anything that contradicts the characterization, then the reader will believe Silver’s characterization.

      If you can find it, please post my original comment – I copied and pasted quotes from your site to make my point but I don’t have time to do it again.

      Keep up the good work!

    • Art Brown

      Tyson, Your point is that you can look at all of PEC’s code, but none of the others?

    • Art Brown

      Tyson, in response to your specific questions, the forecast is deduced from the meta-margin time series (mean and variation). The recent data enters in two ways: 1) each day is another point in the time series (where it has relatively little weight) and 2) since 35 days out, the most recent meta-margin is weighted more and more heavily in calculating the election-day meta-margin prediction. (As Prof. Wang says, the process starts resembling a random walk.) You can see the result in the recent decline in the D+I control probability, and of course it’s all there in the code. You can agree or disagree (with the election as final arbiter), but there’s no ambiguity. (I routinely reproduce PEC’s numbers, calculating the way we were intended (with Excel).) Not so with the other sites…

  • SFBay

    Hi Sam,
    I imagine you never thought you would be the subject of such interest from Mr. Silver. I have watched his evolution from humble numbers guy to petulant whiner. His recent attacks aimed your way say a lot more about him than you. He’s not in your league and it shows. I’d say using a fly swatter on him now will only get your fly swatter dirty. Better to ignore him.

  • Amitabh Lath

    I agree, he is to be ignored. Someone who claims that two forecasts, one claiming ~60% probability for heads, and the other claiming ~60% probability for tails are somehow significantly different from each other, does not understand either probability or significance.

  • axt113

    My recommendation Professor is to ignore him

    If he wants to keep sniping at you let him, you lose nothing and he makes himself look worse

  • Riley

    Hi Mr. Wang

    I like the article above – solid response, calculated, professional = integrity.

    I can’t say the same for Nate Silver, didn’t realize how impetuous his Twitter posts were until I saw them here. I lost interest in him when he wrote an article tying the number of rules to sports popularity – was so lightweight I suppose he needed to fill the space with something.

    I’d much rather follow your posts – even if you’re wrong here and there it’s a pleasure to read. Let’s remember these are predictions, no one should take them as certainties.

    Best of luck.

  • Jim

    Regarding Nate’s response on Political Wire:

    Speaking as someone who regularly reads PEC and 538 but not in great depth, I’d like to hear a response. A few of Nate’s comments I can recognize as unpersuasive (such as regarding the likelihood that Orman caucuses with the Democrats). But I would like to understand why PEC uses polls since June, rather than another date.

    Sam has probably addressed that previously, but I suspect that there are many like me who may have missed that particular point.

  • LincolnX

    Wait a minute – who’s the Joker, and who’s Batman?

    You guys both do a great service. This is science – listen to the critique, refine where needed, and move on. Having said that, as far as Nate’s comments are concerned, I’m leaning “Joker” 53%.

    • Steve Scarborough

      Hi Sam. With regard to Mr. Silver’s latest piece you linked to, I say be cool as bks suggests. My suggestion is to not respond at this time. I realize that a non-response might be interpreted by some as agreement.

      However, if you immediately go out with a full rejoinder, that just escalates things. In my opinion, LincolnX’s view of refining where needed and moving on is called for.

    • Lojo

      Sam, yes you should reply. I do this for a living (marketing and a little spin doctoring).

      You need to push back and make the conversation about his model (not yours). He is the one with no consistency. You could wait until later but it might be too late (if the odds are strange and he does a lot better in the fall it is going to be very hard to get your side of the story heard). The conversation is happening now. It’s not peer review, it’s media.

      I would suggest the following as possible gambits:

      Set up a peer review of respected stat guys, academics. Say you are going to submit your model to them and challenge him (and others) to do the same. Have panel review each model and offer up their POV about both. It’ll at least make yours equivalent to 538 and is likely to reveal problems in his (which will be a big story).

      Make a public bet (but make it about the next two cycles, not just one) about which model will be better. Announce the next time you are on MSNBC. Put the money in escrow, get a panel who will come up with how to evaluate success in a tie. He’s going to lose either way (as you are likely to beat expectations he is putting forth).

      Hope this is helpful. Email if you need help. Keep up the good fight.

  • Steve

    Sorry, bks, but those words are inadequate and do not capture what makes a nerd a nerd, or a geek a geek. But I agree, the terms are suffering from over-use – my apologies for contributing to that. Actually, maybe it’s the context that is changing more – as illustrated by 538 circa 2008 vs 538 2014.

  • Steve

    Is Nate afraid that the professor is more of a nerd than he is now, and that his site has more nerd-appeal than the new 538? I suspect that’s part of it.

    There’s not a lot over at 538 these days for a nerd to have fun with. PEC offers more nerd appeal. And if PEC does better than 538 this year, that’s bad news for 538 – which is facing some difficult traffic numbers according to the rumor mill.

    Everyone makes mistakes. 538 made a big mistake with its climate change article soon after its launch. Talk about thumbs on the scale, or poor methodology!

    But I like Nate a lot, and I respect and admire his work. It’s great that we have both Nate and Sam. TPM did a story on the spat today and I suspect things will settle down now. But if not, that’s okay. I’m glad they’re passionate enough about their work to get emotional over it now and then!

    Is there any reason we can’t do 2 windows now – a 6 week window and one going back to June? I personally like having more than one approach running at the same time. It makes everything more interesting.

    • bks

      Could we retire the word Nerd, please? Also Geek. I’ve always liked the word they use in England: Boffin. If you want to be provocative, Realist. –bks

    • Insidious Pall

      Nearly everyone understands the different approaches; they’re more alike than different. But you make the salient point – PEC is more fun. More to do here.

    • Sam Wang

      Check this out.

      Does it require a response?

    • Amitabh Lath

      Nowhere does your model assume that Orman will caucus with Democrats. Even the label says Dem+Ind.

    • bks

      In poker, Sam, the heuristic is to do exactly the opposite of what your opponent is indicating. Silver wants you to respond. It is to the advantage of both Silver and Political Wire for you to respond. I would give consideration to playing it cool. You will get plenty of chances to raise between now and November. –bks

    • Kenny Johnson

      Sam, as a layman… I’d like to see you respond to Nate. I’ve read your site since 2012 and I appreciate your perspective, but I’ll admit that I think there is something to the criticism of using polls from June in your forecast… and the fact that your prediction is the outlier amongst all the other aggregators

    • Sam Wang

      Oddly, June-now is the single most favorable time window to the GOP – other than a one-week snapshot, which is not a prediction any more. But ok, maybe a reply is needed.

    • Art Brown

      1) if/when the meta-margin swings blue, label it “D+I” in the masthead. 2) In football terms … well, football has nothing to do with this matter. I would love to see a detailed comparo between PEC and another polls-only outfit, but no one else publishes enough detail. Mr. Silver, in particular, is in no position to comment. Ignore him.

  • Jim

    I probably should have posed this question in an earlier thread, but perhaps someone can answer anyway.

    What mystifies me about Sam’s approach is the use of historical data to inform his prediction. If I understand correctly (big “if”), his model somehow employs results from prior years to help extrapolate a prediction today. I’d love it if someone could explain exactly how the historical data is used.

    • 538 Refuge

      I don’t think it is historical data in the sense you are thinking. I think it guides the mathematical parameters of the formulas, not like adding specific adjustments. I’d have a better idea but I’ve twice signed up for MIT’s online Python course only to end up with a conflicting contract position.

  • RAJ

    Nice article on the cerebellum in autism. I read the abstract when it was first published. Could you send me the full text if possible? I have published in the Journal of Autism and Developmental Disorders. More recently I have a paper published in a new open access journal OA-Autism. It was an invited article by the Editor-in-Chief Dr. Manuel Casanova. The open access publication fee was waived since it was an invited review. Comments are welcome:

    http://www.oapublishinglondon.com/images/article/pdf/1379116999.pdf

    More articles of interest can be found here:

    http://www.oapublishinglondon.com/oa-autism

  • RB

    Sabato with some ratings changes: Both MT and WV move from Likely R to Safe R. IA moves from Toss Up to Lean R. CO moves from Lean D to Toss Up and KS moves from Lean R to Toss Up.

  • Kathy C

    I’m an average Jane with strong math skills and instincts thanks to my good ole public education! I became interested in Nate Silver when he gained fame in the last presidential election. But if even I can see the difference in what you are doing and in what he is doing, why does he continue to yap at your heels, Sam Wang, trying to discredit you? He is of the entertainment world now, somewhere beneath you. His math and methods don’t really matter now. Keeping them entertained does. A previous commenter was right, people like him come and go.

  • Steve Scarborough

    I like what you wrote, Sam. Well done. As others have said in so many words, stay the course.

    Perhaps you have heard about this: http://www.politico.com/blogs/media/2014/09/the-nate-silver-disaster-at-espn-196288.html. I wonder — given that the reports are accurate — if Mr. Silver is under some pressure.

    • Amitabh Lath

      I read that and felt bad for Nate and went over to his site to give him some traffic. Now I’m really hungry for a burrito, but according to him there aren’t any in New Jersey.

      And anyway, vegetarian.

    • Sam Wang

      Come to think of it, I am curious about why he isn’t doing any of this on his own website.

  • Davey

    Would anyone like to run probabilities on the likelihood of Sam doing a statistics and probabilities refresher course next semester, with a “family and friends” tuition discount for ESPN employees? (maybe see if you can schedule that right before an Internet etiquette course down the hall.)

    • Sam Wang

      Thank you…though I think I am not in a position to teach Internet etiquette. I have written plenty of things that I regret later.

      Also, a few thousand votes one way or the other in Iowa, and it’s quite possible that I’ll be declared to have made a wrong prediction…I guess by the people who most need the refresher course?

    • Davey

      No worries, Sam. Your time is probably better spent working on neuroscience, and I would hate to rob the other guys of time they could be spending further honing the America’s best burrito model.

  • Sean

    “When the analyst becomes the story, that’s antithetical to what data-based journalism ought to be.”

    The key line of the entire story. Kudos Dr. Wang for surviving your peer review. The competitors don’t like you making them look bad.

    I admit I followed electoral-vote.com first, 538 second and only recently found your work. I find that by removing the fundamentals you are removing the “emotion” that blinds people like Karl Rove, when he stormed into the election projection area to “counter” his wrong perception of the data coming in.

    As the election nears, the peer reviews will become nastier and may or may not be right.

    But by sticking to your statistical formula, that has worked in the past, you will either be once again proven correct or sent back to the drawing board to tinker with your formula.

    You aren’t here for money, nor prestige. You are here for the data.

    On behalf of all the electoral data nerds…

    Thank you!

    • Matt McIrvin

      The results in this election cycle, whatever they are, aren’t really going to resolve anything, because as the election nears, the models like Silver’s are going to gradually roll out their fundamentals component, and Sam is going to tighten his prediction to be basically the same thing as his snapshot. By Election Eve, everyone’s going to be saying essentially the same things with only minor variations.

      What’s needed is a comparison of the performance of these longer-term prediction methods over several election cycles. But that requires stability. PEC’s prediction model is more recent than its “if the election were held today” snapshot, and it’s changed in various ways since Sam started using it. I also recall Silver publicly tweaking his constantly during the 2008 cycle (don’t know how much of that he’s done since then).

      Going by success in predicting things way, way out, the winner in 2012 running away was Drew Linzer’s Votamatic, which seemed outright delusional during the period after the first debate, but turned out to be almost eerily prescient. But nobody really knows if that was a fluke.

    • Matt McIrvin

      …In that connection, I guess I should mention that Linzer’s Votamatic-like model at Daily Kos is one of the models Nate Silver prefers over PEC.

      Certainly it’s giving a more pessimistic prediction for Democrats right now, though its distribution is interesting: it’s bimodal, with a peak similar to the PEC prediction and a taller one around 48 Democratic seats. It also looks like he’s trying to incorporate a model of how Orman will decide to caucus.

    • Sam Wang

      Linzer isn’t using a Bayesian prior this year, on the grounds that Senate modeling is too uncertain. It’s a departure from 2012, when he was doing polls+fundamentals. In 2014 he is using polls only.

      That is correct that he is making assumptions about Orman in the case of a 50R/49D+I split. Either 50-50% probability or 25-75%, I forget which.

  • Marc

    We, the stats-erati, know that both 538 and PEC are predicting basically a coin flip. And, that the results of this election will not determine whose method is ‘right’ in a rigorous sense. But, we might also consider this through the lens of public opinion. The public sees things more in b&w — PEC predicts D win, 538 predicts R win. And, that’s how the public will interpret things post-election day. PEC probably doesn’t have much to lose because of this. But others might.

    • Jinchi

      Exactly. I’m amazed at the level of the dispute. We’re basically arguing the difference between a model that predicts a most likely outcome of 51-49 for Republicans against one that predicts 50-50 (plus Biden) for the Democrats.

  • jd351

    Dr. Wang,
    Please just keep doing what you are doing and take ownership of your model. In the end you will be either right or wrong. However, you will learn either way and make the proper adjustment going forward. Everything else is just noise. Thanks for the data.

  • Andrew S.

    Nice post, Prof. Wang. Since I am not a twitter-er, I hadn’t realized just how many of those weirdly critical posts Mr. Silver has made. I think it is very commendable to keep the tone “professional” like you have done! (It made me smile to see a media personality treated like a hostile reviewer – how appropriate! And this strategy sometimes impresses the editors sufficiently to accept the manuscript.)

    Although I didn’t have time to comment, I also appreciated your recent post where you mentioned the 1.5% mean underestimation of Democratic performance by polling. I was using 2% based on the final 2012 pollster poll average vs. the actual vote tally for president. I found it interesting that this effect was present even in non-presidential election years! So maybe this means we might be able to scratch one media talking point. I wonder if any of this is due to likely voter assumptions.

  • A New Jersey Farmer

    Live by the Meta-margin, die by the Meta-margin. Nice post and explanation Sam.

  • Richard Wiener

    I’ve always thought both 538 and PEC make important contributions to the art and science of forecasting elections.

    If an observer just wants to get a feel in advance of election day as to which way the vote is likely to go (and lots of us do want this), about the simplest thing is to take the average of recent polls. I always check the no-toss-up maps on Real Clear Politics and that pretty much tells the story. Anything more sophisticated, like 538 and PEC, is bells and whistles. Of course, it is interesting to see if more sophisticated models can improve the forecast, and public opinion polls provide fertile data for trying out various techniques in applied statistics, which is great fun and scientifically interesting. Nonetheless, whether the weatherperson forecasts a 40% or 60% chance of thundershowers, I’m going to carry an umbrella.

    If I look at RCP’s no-tossup Senate map, it shows Rs lead 52-47 (based on the mean of recent polls) and half a dozen races are pretty close. The PEC meta-margin instead uses median statistics and collapses poll information into a single number. So the meta-margin shows R+0.4%. That gives a slightly more precise measure than Rs 52-47 with several close races. Either way, I know which team is trailing and I have a pretty good sense of the score.

    538 currently gives Rs a 58% chance of winning. I know this is a forecast for election day, but this late in the game it isn’t much different than current conditions. PEC projects Ds have a 65% chance. In other words, RCP, PEC and 538 (as well as the other models) agree it’s a coin flip, but there is slight disagreement as to which way the coin is weighted.

    It would be nice if this were a friendly disagreement with the aim of providing insight into the effectiveness of different modeling approaches.

    I appreciate the tone Sam Wang has struck in responding to criticism from Nate Silver.

    • Daniel Wiener

      When the probabilities are claimed to be this close, there’s no way of determining which person’s model is better than another. Even if the actual election results for one model turn out to be more correct than another, it could easily be by chance rather than superior methodology, since the error bars are so wide. And yet if Sam is “right” on one more Senate race than Nate, or vice versa, that will be seen as a victory in this little spat. After all, they’re arguing over precisely that issue for 2012 (i.e., did Sam get just one race wrong or two?).

      In 2012 the national pollsters were predicting a narrow Romney victory while Sam and Nate relied on collections of state polls to forecast an Obama victory. That split regarding a binary outcome (only one or the other would win) is what got everyone gushing over their methodologies. In 2014 the binary outcome is a foregone conclusion; everyone concedes that the Republicans will keep control of the House and pick up seats in the Senate. Predicting the exact margin of victory — will the Republicans win 6 or more Senate races? — is much more difficult and is necessarily focused on state-level polling rather than national polling. Agglomerating those close state polls into a single predictive number is much dicier: Even the best method could still project the wrong result.

      When Sam and Nate were predicting a Democratic Senate hold two months ago, in contrast to more traditional political pundits who expected a Republican takeover based on fundamental considerations, that was newsworthy (especially since it boosted Democrats’ morale). If they had frozen their predictions and been proven correct, that would have been impressive, to be that accurate that far away from the election. But as they shift their predictions based on shifting polls, it’s indeed hard to distinguish their methodologies from Real Clear Politics. And whatever the outcome is, it will neither entirely validate nor entirely falsify their models. It may, however, excessively inflate or deflate their reputations.

  • securecare

    Nate needs to get over himself and stick to the mathematics.

    • John

      If he understood the mathematics he wouldn’t need to get over himself. The emperor appears to be starting to fear that the adoring masses may notice he’s rather scantily clad. The best defense is not always a good offense, and I’m not sure his is a good offense. I’m sure many of us would have thought this level of personal attack rather beneath him.

    • 538 Refugee

      We ALL need to ratchet it down a notch and stick to the positives or I’m afraid this won’t die. I think some good has come from this on both sides. This site gets more attention and 538 started explaining a lot of what they do. I made use of the site tonight trying to see if I could understand the sudden apparent shift in the polls in close races and turned up something I wasn’t looking for.

      Survey USA puts Nunn at -1% with 6% undecided.
      http://elections.huffingtonpost.com/pollster/polls/surveyusa-11alive-20563

      According to Nate, Survey USA has a good track record:
      http://fivethirtyeight.com/interactives/pollster-ratings/

      There are still lots of close races and the initial projection was 50/50 with Sam saying because it was close he wouldn’t be surprised at 51/49 EITHER way. 538’s 59.3% Republican takeover is like spitting on one side of the coin to make it heavier before you flip it. This is far from decided.

  • James

    Love seeing your very detailed posts about your methodology and your straightforwardness about the entire process. Your ability to seriously engage issues even in the face of caustic, belittling replies from people who need web traffic is nothing short of amazing.

    It’s great to have someone as engaged as you are in the process trying to filter out all the unending excessive noise that comes with every election.

  • Joe Plumber

    It’s all rather silly and somewhat petty. Nobody is winning any fans from the debate, though Nate has more to lose.

    • Sam Wang

      Are you sure? I do this for free, which means what little reputation I obtain for good analysis is all I get.

    • Pat

      I agree. Nate has certainly become pretty annoying and obnoxious and his new site is flirting with irrelevance (who cares about a best burrito contest, seriously?). You are better than that, Sam. While there has been a lot of criticism on his part (although some of it was fair: it was indeed somewhat dishonest to claim you had only missed 1 race in 2010 when you missed 2), we got the point: no need to devote so much of your blog to replying to him again and again… It does get a little petty and silly indeed.

  • NP

    Am I right in deducing that the Election Day Probability percentage is the weighted average movement of the current prediction based on the historical data?

  • RB

    Since this is a polling site here are a couple for you:

    Kasich is up a lot per Quin; Walker is up 50-45 per MU; Ras had Gardner up 1; Suffolk has Orman up 5 (this is the race I feel had the potential to move the most due to the countless variables involved).

    Sam, as a parent of a child with autism, I’ll have to give your paper a look (but if it is in the weeds I probably won’t understand it). If you have suggestions of good books or papers to read for the average Joe, that would be great.

  • Hugh J Martin

    This is sad, and a bit disturbing. Why would Nate Silver be attacking you this way? I’ve followed PEC since you started in 2004 because you were first, and because you’ve been transparent about trying to figure out how to get the best possible prediction.
    Silver benefits from huge reputation effects because of his previous association with the New York Times. Most journalists who cover him don’t have statistical expertise, and cannot evaluate his work. They rely instead on reputation.
    Silver, meanwhile, has posted some analyses on non-election issues that had serious methodological shortcomings. And it’s always been difficult to understand how he does his election forecasts.
    By positioning himself as a popularizer, Silver avoids the knowledgeable scrutiny that comes from a good peer review. He’s not the first, and won’t be the last, to do so. It’s too bad Silver also wants to tear down PEC, when he could be focusing on improving his own work.

    • tfitz

      Actually there has been a lot of coverage in the mainstream media as to the failure of Silver to make the transition to ESPN. It’s a stretch and it hasn’t gone well. He is no longer the ‘darling’ and unfortunately is quite defensive as to the perceived competition.

    • Webster

      Actually, that’s backwards. He got the NYT gig because he already had a good reputation from FiveThirtyEight’s previous independent work.

  • Joseph

    I think the “peer review” simile is amusing. I also think Mr. Silver has, for some reason, gotten personally invested in this – contest. And yes, to some degree, you have as well, Professor Wang, although you come across as much more detached.

    Of far more interest to me is the higher “sensitivity” that I see in your approach to shifting polls. I suspect that is why you saw a larger shift (than most other aggregators) towards the Dems and are now seeing a larger shift towards the Reps. That is why it is so important when you recalculate the Election Day Probability, which, as I understand it, sums up the many “snapshots” you’ve taken. For that matter, I suppose you could also do a summing up of the Election Day Probabilities as well. That might indicate when people’s interest in the election finally takes hold.

    Under the circumstances, I’m glad you chose to keep the model as originally planned. Mr. Silver would have jumped on that, sure as shooting. Next year, however, please consider running two Election Day Probability models, with the second one starting from about 6 weeks out from the election. IMHO, that will aid in capturing the mind of the electorate as it begins to focus more completely.

    • Sam Wang

      I agree, it’s all too easy to get emotionally invested. I am hoping things cool down soon.

      Yes, that’s a good idea to see what emerges from varying the time window. A six-week window might be the sweet spot, since intuitively people don’t like to see a prediction get too far from current conditions. It’s less clear that it’s a prediction…but this is half math, half media to begin with. What to do in that situation is an interesting question!

  • Art Brown

    I at first thought that last week’s jump was the revenge of the fundamentalists, but on reflection, I think that fundamental-type corrections should appear as slow moves in polls (as the fundamentals are gradually phased out of a non-polls-based model).

    I can only imagine the stress induced by such a public twitter-spat. I like your “keep calm and carry on” response.

  • 538 Refugee

    This is from Twain if you don’t know the Pudd’nhead Wilson reference:
    It were not best that we should all think alike; it is difference of opinion that makes horse-races.
    - Pudd’nhead Wilson’s Calendar, 1894
