## Monkeying around with House models

#### September 21st, 2012, 9:00am by Sam Wang

Via Andrew Sullivan, Dylan Matthews asserts that my House outlook (a Democratic takeover probability of 74%) is weakened by a model described over at the Monkey Cage. However, a careful reading of that model reveals problems in using it as a prediction – problems of the type I have been warning about for several months.

Here is a graph that puts their model into proper context.

I will explain – and as a bonus, address an issue raised by Kevin Drum at Mother Jones.

First, let me say explicitly that I do not find a problem per se with the Monkey Cage’s research efforts. They have considered many details which I will read with interest. However, their calculations contain huge uncertainties of the kind that political scientists – and reporters – often choose to downplay.

To start, here is Matthews’s claim:

John Sides and the team at the Monkey Cage have a model that uses GDP, the president’s party and approval rating, incumbency, and district-level presidential vote, rather than House polling. Their model gets the seat margin wrong by 2.61 seats, on average, much lower than Wang’s error. It gives Republicans a three out of four chance of keeping the House.

This is seemingly a convincing criticism. However, at its heart is some misdirection. The “2.61 seats” statement is revealing because it is too small to be realistic. It is the same weakness I detected in the FiveThirtyEight “four-factor” model yesterday: Overfitting of small residuals is basically chasing noise, and leads to massive uncertainties.

Now, the Monkey Cage crew is aware of this issue. To quote them:

The standard error for the vote share estimate is 5.6%; for seat share, 8.7%.  That’s a lot of uncertainty.  It means there is at least a little probability of some pretty crazy outcomes.  It explains why there is still a 1 in 4 chance that the Democrats will get the 25 seats they need to retake the House, when our own median prediction is only one seat.

In other words, the “median gain of one seat” sounds precise…but is meaningless.
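To see how the quoted numbers fit together, here is a rough sketch in Python. It is not the Monkey Cage calculation: it assumes the seat outcome is normally distributed around their median, and that the 8.7% standard error applies to the share of all 435 House seats. Both are simplifications made for illustration, and the function names are hypothetical.

```python
from math import erf, sqrt

TOTAL_SEATS = 435      # size of the House
SEATS_NEEDED = 25      # Democratic gain needed for a takeover (from the quote)
MEDIAN_GAIN = 1        # Monkey Cage median Democratic gain (from the quote)
SEAT_SHARE_SE = 0.087  # quoted standard error on seat share


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def takeover_probability(median_gain, sd_seats, seats_needed=SEATS_NEEDED):
    """P(gain >= seats_needed), ASSUMING gain ~ Normal(median_gain, sd_seats)."""
    z = (seats_needed - median_gain) / sd_seats
    return 1.0 - normal_cdf(z)


# Assumption: convert the seat-share standard error into a number of seats.
sd_seats = SEAT_SHARE_SE * TOTAL_SEATS

# With the model's own stated uncertainty, a takeover is far from ruled out.
p_wide = takeover_probability(MEDIAN_GAIN, sd_seats)

# If the true uncertainty really were 2.61 seats, a takeover would be
# essentially impossible, which contradicts the quoted 1-in-4 chance.
p_narrow = takeover_probability(MEDIAN_GAIN, 2.61)

print(f"SD in seats: {sd_seats:.1f}")
print(f"Takeover probability with 8.7% seat-share SE: {p_wide:.2f}")
print(f"Takeover probability with a 2.61-seat SD: {p_narrow:.2g}")
```

Under these assumptions the wide uncertainty gives a takeover probability of roughly one in four, in line with the figure the Monkey Cage reports, while a 2.61-seat uncertainty would push it to essentially zero. The two numbers in Matthews’s summary cannot both describe the forecast.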

Let me make the point graphically. Here are our two national-popular-vote predictions plotted side by side…but with uncertainties included:

For those of you unfamiliar with this kind of plot, the data points are the values that get reported in the popular press. The horizontal lines are error bars. They indicate the confidence with which we know the median. A large error bar indicates high uncertainty.

As you can see, our two ranges are perfectly consistent – but the PEC estimate gives much more certainty – and information. In contrast, their range, from R+13% to D+9%, contains many possibilities that we can be confident will not happen in November. If the Republicans win by 10 points, I will personally wash Dylan Matthews’s car with a toothbrush.

What about seat count, the ultimate measure of House control? Same story:

In short, their model indicates a three in four chance of GOP control because their uncertainty is massive. Do you think the Republicans will attain a 278-157 majority?

In some sense, our two calculations are consistent. However, what I presented is not a complex model in the same sense, but a precise short-term projection of likely outcomes.

>>>

My general take is that the Monkey Cage model has the potential to identify the broad picture of what influences House elections, especially if they start culling unnecessary parameters in the way that I recommended yesterday. The end product will be a hypothesis that can then be tested by current polls, which give us ground truth of true conditions. Then, in 2014, they can refine their model with the 2012 outcome in hand.

As I wrote this summer, models based on “fundamentals” (GDP growth, previous seat count, and so on) are research tools that set a range for what might happen before an election season starts. To make my favorite analogy to weather forecasting, they are like what climatologists do when they warn that there may be a lot of hurricanes next year.

However, “next year” has already started. And climatologists are not of use when one is trying to identify a hurricane strike zone. At this point the best indicator of opinion is…measurements of opinion. Polls are like a thermometer that tells us what is happening now. As I have pointed out, this is why econometric models for the Presidential race have been all over the place, yet our Meta-Analysis has been tightly clustered around a probable Obama victory since July.

OK, now that we have addressed the Monkey Cage…in a second concern, Kevin Drum expresses skepticism as to whether the generic Congressional ballot is really predictive of national popular vote. Here are some comparisons of final-week polls, courtesy of RealClearPolitics:

2010 Polling average, R+9.4%. Outcome: R+6.6%.
2008 Polling average, D+9.0%. Outcome: D+10.9%.
2006 Polling average, D+11.5%. Outcome: D+7.9%.
2004 Polling average, tie. Outcome: R+2.6%.
2002 Polling average, R+1.7%. Outcome: R+4.6%.

The differences between polls and outcome range from 2.8% toward the Democrats to 3.6% toward the Republicans. This is a larger discrepancy than Presidential polls, which get within 1% when treated the same way. But it’s not too bad – and it is an error that is contained within the error bars above.
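The arithmetic behind that range can be checked directly from the five comparisons listed above. This is a minimal sketch; the sign convention (positive for a Democratic lead) and the variable names are my own choices for illustration.

```python
# Signed margins in percentage points: positive = Democratic lead,
# negative = Republican lead. Values are (final-week polling average, outcome).
FINAL_WEEK = {
    2010: (-9.4, -6.6),
    2008: (9.0, 10.9),
    2006: (11.5, 7.9),
    2004: (0.0, -2.6),
    2002: (-1.7, -4.6),
}

# Outcome minus poll: positive means the result was more Democratic than polled.
errors = {
    year: round(outcome - poll, 1)
    for year, (poll, outcome) in FINAL_WEEK.items()
}

most_dem = max(errors.values())  # largest miss toward the Democrats
most_rep = min(errors.values())  # largest miss toward the Republicans
mean_abs = sum(abs(e) for e in errors.values()) / len(errors)

for year, err in sorted(errors.items(), reverse=True):
    side = "D" if err > 0 else "R"
    print(f"{year}: outcome was {abs(err):.1f} points more {side} than the polls")

print(f"Range: {most_dem:.1f} toward D to {abs(most_rep):.1f} toward R")
print(f"Mean absolute difference: {mean_abs:.2f} points")
```

The extremes come from 2010 (2.8 points toward the Democrats) and 2006 (3.6 points toward the Republicans).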

Update: Kevin Drum drills a little deeper and points out the possibility of pro-Republican drift over the coming six weeks. Hmmm…

>>>

I will certainly entertain further criticisms. There were some good ones in the comments section yesterday.

• wheelers cat

Great title.
One problem with Sides’ model, and also with the 538 forecast model, is that both add in “economic indicators.” If what you say is true and the economy is already baked into the polling, this means Sides and Silver are counting the economy twice.
That is adding artificial uncertainty to the outcome.
Randomness is our friend, uncertainty is not.

• Olav Grinde

Economy is certainly a factor, and the normal state of affairs is that the incumbent President gets the credit or blame. However, poll after poll shows that a majority of voters blame Bush — not Obama — for the depressed economy. Hence the conventional interpretation of economic indicators and their political consequences simply do not apply!

• It’s almost as if the electorate had gone sane. but that’s just crazy-talk. I wonder if economic indicators in this case are divisible somehow…

• Michael Worley

When will your projection start narrowing to reflect our closeness to the election?

• Ralph Reinhold

Several days ago, referring to the presidential race, Nate Silver remarked that from that day forward, the financial models would have less and less influence on his predictions because they would have less and less influence on the outcomes. It would take significant financial news to modify the outcome at this point. This is consistent with what you are saying. I would think that the GDP, etc. would have less influence on House races than on the presidential.

• wheelers cat

meh.
if you want Nate Silver sans artificial uncertainty, just read the nowcast. And if you note, the nowcast IS convergent with PEC’s forecast.

• Olav Grinde

Ms. Jay Sheckley: “It’s almost as if the electorate had gone sane.”

Yes, it’s pretty hard to believe. I keep pinching my arm, half expecting that any moment I’ll wake up from this alternate reality and realize it was just a dream.

• Joseph Marshall

It seems to me that these econometric models make far too many a priori assumptions about what is significant and what isn’t. Why the GDP instead of the unemployment figures, median personal income, housing starts, stock market value, and so on? There seems to me to be no convincing reason for this choice, and no way to gauge its permanent relation to voting behavior.

In the same way, why is the President’s approval rating and his votes in each district privileged over Congress’ approval rating, since this is at least some indication that the incumbent party is in trouble?

Further, it seems to me that the numerical relations between the arbitrary choices are inherently incommensurable: number of votes, number of seats currently occupied, percentage of approval numbers for the President from poll questions, and GDP in terms of either % of gain or loss, and so on.

If this is so, I can see no way to weight these factors that is not unconvincing and arbitrary. Even if all these measures are correlative to electoral success, this is no reason to assume they work together, unless there is a stable and specific causal link that coordinates them. And if one can be identified, why use some standard for the mere summing of the parts instead of some measurement of the causal link?

All inductive science skirts the post hoc ergo propter hoc fallacy when no clear causal hypothesis is present. Rather than skirting it, these arbitrary models land smack in the center of the fallacy.

Here’s a pretty strong (I’m convinced) argument that it probably won’t help whoever wins in the long term:
http://www.realclearpolitics.com/articles/2012/09/21/state_of_the_race_part_3_winning_by_losing_115526.html

Even if Dems do get a majority in both houses, and the presidency, there will be massive squabbling and I am not sure anything strong could be passed. Look at how weakened the stimulus bill and the healthcare bill were even with a 60-seat majority in the Senate!

From where I’m sitting, I’m wishing we could go ahead and plunge off the fiscal cliff, get the pain over with sooner rather than later, make sure that the pain hits everyone fairly, get it over with (with cuts to everything and tax increases to everyone) pull back as many military bases and huge military projects as possible (after all, tanks and stealth bombers are not going to be used to fight Al Qaeda, are we still expecting to fight Russia and China?)
But all this is impossible with the political situation. And inviting the disaster might also invite the modern Republican version of FDR, which would be a disaster for the middle and lower classes.
But let’s get it over with so that I can hand my children a world that is gradually getting better rather than gradually getting worse.

• Jared

The Dems only had 60 votes in the Senate for a few months. The Republicans prevented Al Franken from taking his seat for 6 months or so. Then, Ted Kennedy got sick and died, and there was a brief delay until Paul Kirk took office. And then, Scott Brown won the special election to succeed him. The stimulus bill was enacted without 60 seats; the Affordable Care Act barely got through the Senate during the brief 60-vote period, but was weakened by Joe Lieberman and others even then. It only got enacted because the House swallowed virtually all of the weakening provisions.

I realize this doesn’t affect any of the models, which are interesting, but hardly dispositive. The only model that really counts is the actual vote tallying of an election.

• Philip

Steve,

I think it’s a mistake to minimize the policy consequences of a Democratic sweep in the fall. Here are a few that come to mind:

– Obamacare survives and the Democrats benefit as politically popular portions of the plan take effect (and no one is terminated by the Death Panels).

– The 5-4 conservative SCOTUS majority doesn’t go to 6-3 and lock in for a generation, and just maybe, the court split flips to a 5-4 majority for the moderate/liberals.

– Bernanke is reappointed and continues the Fed’s policy of balancing employment against inflation.

– The House conservatives lose their leverage to threaten the nation with bankruptcy, and a deficit bill that includes spending and tax expenditure cuts along with tax increases is enacted. (Yeah, I know, there’s the supermajority requirement in the Senate, but there’s also reconciliation to overcome it.)

I think these are sufficient policy reasons to care about the outcome whatever your policy preferences might be.

Finally, the Tea Party wing of the GOP seems likely to attribute a Democratic sweep to the nomination of an equivocating moderate at the top of their ticket and nominate one of their own in 2016. That’ll probably set up an electoral disaster and give Democrats a chance to pass more of their agenda. Needless to say, if they were to win in 2016, the policy implications could be big, depending.

• Philip

Jared makes some good points. Here is another.

Democrats have been plagued by the so-called Conservative Coalition since 1937 whereby conservative Dems defect to oppose initiatives of Dem presidents and support those of GOP presidents. Ben Nelson played that role in 2009-2011 depriving the Dems of a supermajority even during the short time there were 60 nominal Dems in the Senate. In the end, Nelson extorted hundreds of millions of dollars for Nebraska for his vote on Obamacare, inflicting yet another wound to the president.

Otherwise, Obama never had the supermajority you reference.

• Ray Fair has the granddaddy of all economic models forecasting the US elections. His model is a three/two factor model with two inputs that are GDP related and another that is inflation related. You can play with his model here
http://fairmodel.econ.yale.edu/vote2012/computev.htm
He also updates it every three months. Next one would be near the end of October.
http://fairmodel.econ.yale.edu/vote2012/index2.htm
He also tells you where it all comes from
http://fairmodel.econ.yale.edu/RAYFAIR/PDF/2010C.pdf

Some years ago, I had lunch with him. He is a firm believer of Occam’s Razor
http://en.wikipedia.org/wiki/Occam's_razor

Tapen

• Philip

Ok, but what’s the fun in waiting until mid October to get a prediction? Three weeks out, most of these models are homing in on solid projections. I, for one, want to see reliable projections at least two and three months out.

• Maybe if you had read this site in July? Over time, I have been trying to show you guys what a true prediction is. I’ll spell it out more soon.

• Andrew

At this point the best indicator of opinion is…measurements of opinion

Thank you for pointing this out. It seems like a lot of people are taking the Monkey Cage model as a prediction of November’s results – when it really is just explaining what the so-called “fundamentals” would predict.

And, sure enough, the range of likely results as shown by the polls (and your model) falls within the range of likely results as shown by the fundamentals.

So, the Monkey Cage and PEC models don’t conflict. It’s just that the latter is much, much more precise than the former.

• BigAngryBubba

To the site host:

Ultimately this activity is a solipsistic exploration in the guise of polling exegesis. However, I appreciate the care with which you address even minor typographical errors.

In my view, the great majority of pollsters purposely oversample Democrats.

I never stooped to handouts even when I was on food stamps.

I am, yrs., etc.,

Bubbus Iratus Maximus III.

• Bill-once

It appears that there’s yet another problem with the Monkey Cage model. Using the graph data from their web site (linked above) it seems that the standard deviation of their model varies substantially (and inexplicably) over time: 1970’s 0.89, 1980’s 1.52, 1990’s 3.11, since 2000 3.82. Not encouraging for precise projections. Using the 2000-2012 error values leads not to the 8.7% value above, but (unless I’ve erred) instead to almost 10.5%.

• FYI, I recently called out Dylan Matthews at the Post for shaky stats and misleading citations of studies:

http://www.washingtoncitypaper.com/blogs/citydesk/2012/09/20/the-statistical-illiteracy-of-washington-post-wonk-blogger-dylan-matthews/

• Spiny Norman

That is one beautiful and technically adept takedown. Highly recommended.

• Joel

Sam,

Just wanted to offer my congratulations for your growing profile. You deserve it. I’ve even gotten over my biochemist-induced skepticism of neurobiologists in this case, and read your blog as regularly as Nate Silver’s.

I cannot say that Dylan Matthews impresses me in the same way…

• charles

1. The econometric models (Presidential) are (collectively) more consistent than yours. If you go to pollyvote.com and look at the predicted vote share from the econometric and index models combined- you find less variance over 9 months than PEC gets over 3. The modellers should be considered as individually insane people and collectively highly intelligent- exactly how we think of pollsters. I am with Nate on this point- yet go to his Is Obama toast? interactive graph from November 2011 and Romney should be winning by a wide margin.

2. One major problem I have with your PEC model on the congressional ballot is that there are virtually no polls that are not 1. robo or 2. internet. YouGov/Ipsos are internet- PPP and Ras are robo. Which ancient polls are included in this median?

3. Can someone find the Stochastic Democracy guy who put everyone else to shame in the last 2 USA elections? He kicked ass in Canada 2011 as well. He was a Princeton student and German.

4. To the Nate slaggers- it may be deja vu all over again. His soccer model (SPI-ESPN) failed to predict the winner of the World Cup and Euro 2012. The Germans aggregated the computer rankings from around the world and predicted both. In fact almost all the computers predicted one and the vast majority both- Nate stood almost alone in screwing up.

5. German-American? Prof-student? I like the counter intuitive. I have been watching too much Fox- in Thailand we get Fox only because they can’t charge since it has no value. The hovels can’t afford CNN, BBC and my mind has deteriorated fast.

• Philip

You can’t hold it against an American who can’t predict soccer tournament outcomes. We don’t understand the sport, we don’t give a damn about it, and the outcomes of too many matches are determined by the random results of shoot-outs and penalty kicks.

• Charles

1. That is an interesting site, and an interesting point. It’s a bit meta, though, to average all us aggregators and modellers since many of us are one layer away already.

2. I use all polls available in the time period stated. Individually insane, wise in the aggregate.

3. That would be David Shor, an American. He was a Florida International University undergraduate visiting me at Princeton for a time. Not a student here. I believe he is working for Obama for America now.

SW

• MarkS

It’s really shocking to me that someone writing for something called “wonkblog” can be so ignorant of statistics. These two sentences, which appear in sequence in Matthews’ post, are obviously incompatible: “Their model gets the seat margin wrong by 2.61 seats, on average” and “It gives Republicans a three out of four chance of keeping the House”. If you can call the election to within 2.61 seats (and can I make a side comment on the extreme silliness of quoting such a number to 3 significant digits?), the odds of a takeover should be much closer to 0% or 100%. Yet Matthews doesn’t notice. Nor Sullivan, who quotes him without comment. The innumeracy of the press (especially the “wonk” press) continues to astound.

• wheelers cat

That is why there is the Stat Nerdcore!
POW! BAM!
Take that Dylan Matthews!

• DaveM

RE Kevin Drum drilling a little deeper:

Seems to me that one would need to consider the relationship between the head of the ticket and the actual vs. projected Congressional vote percentages in Presidential election years.

In this context, might the differences between the Presidential year RCP generic Congressional figures Kevin provides for Sept. 1, and those for the final pre-election week, be said to parallel the movements of the presidential race itself in those years?

In both 2004 and 2008, the Sept. 1 projection understated the Congressional vote margin for the party whose Presidential candidate pulled away at the end of the race, suggesting at least the possibility that the discrepancy between the generic projection and the actual result has as much to do with coattail-riding as any built-in party bias.

In this light, Kevin’s statement that he’d “probably subtract two or three points from the current RCP generic poll average, which has Democrats ahead by 2.2%”–based on the notion that 2008 was a landslide (which apparently explains its outlier status in his analysis, though I’m not sure how)–might be stood on its head: ADD a couple points to the current average, consistent with the behavior over the final two months of the previous two Presidential election years.

• Indeed. I have been looking into the history of congressional generic preference over a campaign season. In midterm years it tends to go against the incumbent President. This would be part of Kevin’s finding. To my knowledge there is not a scholarly study of what happens in Presidential on-years. I’ve been asking around.

• theDAWG

To my mind, discussions of match picking in the 2010 World Cup begin and end with the German Octopus. I’m sure Bubba agrees…

• Pat

Very interesting spiky histogram today! The 2 highest peaks are at 332 EV (realized mostly by Obama wins in all swing states except NC) and 347 EV (including NC).
Where are the other 2 peaks located? I assume they correspond to the same previous scenarios without Florida?

• Hemisphere

What do you make of Dick Morris’ claims that Romney is winning because “1. All of the polling out there uses some variant of the 2008 election turnout as its model for weighting respondents and this overstates the Democratic vote by a huge margin. ” and

2. Almost all of the published polls show Obama getting less than 50% of the vote and less than 50% job approval, and that undecideds overwhelmingly go against the incumbent?

http://www.dickmorris.com/why-the-polls-under-state-romney-vote/

• wheelers cat

Blatant falsehoods designed to keep up the spirit of the conservative base or possibly conservative magical thinking.
I averaged the turnout difference between the midterm and presidential election years for the last five presidential elections.
Presidential election years get an increase of 16.1 percent over midterm elections on the average.
Nate Silver averaged the last five national polls and got an average of dem 35 to repub 30, and this includes Rasmussen, which is likely not accurate.

• pigeon

Morris’s argument is essentially that 2008 voting patterns will not repeat themselves because of lessened enthusiasm — however, the polls do not bear out the “enthusiasm gap.”

http://www.washingtonpost.com/blogs/the-fix/wp/2012/09/20/the-enthusiasm-gap-or-not-in-2-charts/

There is no real evidence for the idea that undecided voters break against the incumbent.

http://fivethirtyeight.blogs.nytimes.com/2012/07/22/do-presidential-polls-break-toward-challengers/

• These claims are contradicted by evidence. As a group, pollsters have done very well in previous elections. Undecided voters could break asymmetrically, but not by much. His opinion is wishful thinking of the type seen in 2004.

• ChrisD

I can’t speak to the specifics of Morris’s claims, but Andrew Sullivan has renamed his annual award for spectacular misprognostication the “Dick Morris Award.”

• Bill N

I think Dick Morris’ job is to find a way to convince important people Romney is winning. It is not to make accurate predictions.