Political science models should not be interpreted as predictions because they are often wrong. The models are research tools to help political scientists in their search for rules that govern behavior.
Many of you keep asking about the latest econometric “model” that some political scientists at the University of Colorado have cooked up. Most of what needs to be said is in an NYT op-ed, Political Scientists Are Lousy Forecasters. And this. If you are impressed because they can account for the last eight elections…read this passage from Innumeracy by John Paulos. Speaking as an academic, I’ll say that predicting past events was probably a minimal condition they had to meet to get published. (Illustration: NYT)
My background is in physics and neuroscience, fields that are more mature in what constitutes theory (especially physics). I read the political-science models to find out what the interesting questions are in their area of research. I view the models as a search for causative principles, as opposed to an application of known laws of cause and effect.
But for information about current conditions during an election season, what they offer pales in comparison with what pure poll aggregators like the Votemaster and I are showing you. In this respect, the researchers are like you — observers. Let me explain, starting with an analogy.
Let us say that we are behavioral scientists, and we are trying to test the idea that rats in a crowded space are more likely to bite one another. The natural experiment would be to put the rats in an enclosure, vary the enclosure’s size, and count the number of bites per hour. We could fit the data to some curve, which could be used to predict future behavior.
However, that’s only one variable. Now we start wondering. Does it happen more when the lights are on or off? When food is abundant or scarce? Does the sex of the rats matter? And so on. In fact, we could build a predictive equation, in many ways quite similar to the “models” of political scientists.
Two problems become apparent.
The first problem is that eventually, we are going to have to come up with a mechanism to explain what we are observing. We will tire of finding new correlations. Mechanisms are essential for understanding. For example, the link between smoking and lung cancer began as an epidemiological link, but matured over decades with many other discoveries, including the effect of tar on mutations, which mutations make lung cells go out of control, and so on. We will want to turn our findings into laws of behavior…and one hopes, neural circuit mechanisms. (In my view, the same is true of economics and political science.)
The second problem is one I’ve written about before. Every time we add a new part to our “model,” we are adding information…but also maybe noise, in the form of uncertainty. Should our model account for every variation in rat behavior? What if one of our statistical associations arose by chance? We’d better be careful of what we allow into our “model.” We’d also want to avoid counting a factor twice. For example, the amount of bedding in the cage is correlated with its area, so we’d probably be wrong to include both bedding and area as parameters.
Let’s add one more condition. We are allowed to make lots of different measurements…but we may only do eight experiments. Now it’s a lot more like the political scientists’ challenge.
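To make the eight-experiments problem concrete, here is a toy sketch in Python. It is not the Colorado model or anyone's published method: the "outcomes" and the seven "predictors" are pure random noise, and all names and numbers are made up for illustration. With eight data points and eight free parameters (an intercept plus seven predictors), the fit to the past is exact even though the predictors mean nothing.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def predict(coeffs, row):
    """Linear prediction: dot product of coefficients and predictors."""
    return sum(c * v for c, v in zip(coeffs, row))


rng = random.Random(2012)  # fixed seed so the run is repeatable
N_ELECTIONS = 8

# Hypothetical incumbent vote shares: noise around 50%, no real signal.
outcomes = [50 + rng.gauss(0, 4) for _ in range(N_ELECTIONS)]

# Each row: an intercept plus seven meaningless random "predictors".
design = [
    [1.0] + [rng.gauss(0, 1) for _ in range(N_ELECTIONS - 1)]
    for _ in range(N_ELECTIONS)
]

coeffs = solve_linear_system(design, outcomes)

# How well does the model "account for" the past eight elections?
max_in_sample_error = max(
    abs(predict(coeffs, row) - y) for row, y in zip(design, outcomes)
)

# A ninth election with fresh random predictors.
next_row = [1.0] + [rng.gauss(0, 1) for _ in range(N_ELECTIONS - 1)]
forecast = predict(coeffs, next_row)

print(f"largest error on past elections: {max_in_sample_error:.2e}")
print(f"'forecast' for the next election: {forecast:.1f}%")
```

The past is reproduced to within rounding error, while the forecast for the ninth election is whatever the noise happens to produce. Real models use fewer parameters than this, but the same tension applies whenever the number of things measured approaches the number of elections available.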
Properly used, a political science “model” is really a tool for discovery. Researchers like those at U. Colorado want to discover quantitative laws of political behavior. Their model is a test of their current thinking. If they’re wrong…they will try again. As for me, I do not find it believable that opinion will swing by 7% between now and November. If Romney wins, it will be by a whisker. In the more probable outcome, Obama will win and there will be ample new material for them to run more regressions. That’s good, though. What, you thought the Coloradans were doing their work for you?
I have been criticized by political scientists for not engaging in what they call “theory.” That is correct, as far as it goes. Instead, I give you, from day to day, an excellent instrument for measuring what is happening. Think of what the Princeton Election Consortium offers as an electoral thermometer, useful to you…and to political scientists too. For instance, the Princeton Election Consortium gave you a highly accurate measurement of two VP bounces.
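As a rough sketch of what a "thermometer" built on a median can look like, the Python below summarizes a handful of poll margins. The poll numbers are invented, the function name `snapshot` is hypothetical, and the MAD-based error estimate is a standard robust statistic chosen for illustration; none of this should be read as the site's actual code.

```python
import math
import statistics


def snapshot(margins):
    """Return (median margin, rough robust standard error).

    The spread comes from the median absolute deviation (MAD),
    scaled by 1.4826 so it matches a standard deviation when the
    polls are roughly normally distributed. This is an assumption
    for illustration.
    """
    center = statistics.median(margins)
    mad = statistics.median(abs(m - center) for m in margins)
    sem = 1.4826 * mad / math.sqrt(len(margins))
    return center, sem


# Hypothetical leader-minus-challenger margins, in percentage points.
polls = [3.0, 2.0, 4.0, 3.0, 2.5]
with_outlier = polls + [15.0]  # one wildly off poll

print("median:", snapshot(polls)[0], "->", snapshot(with_outlier)[0])
print("mean:  ", statistics.mean(polls), "->", statistics.mean(with_outlier))
```

The single outlier drags the mean up by about two points but leaves the median where it was, which is the practical reason to prefer a median when individual polls can be badly off.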
It is true that this year I did something new: I added a prediction. It is unlike the political science models, in the sense that it contains no insights into economic factors. My assumptions (and here’s more) are clear and simple: (1) opinion can be measured, and (2) its movement in a re-election race is somewhat limited, as shown by past races. These undeniable facts do not require political/economic theory, and there are very few components to argue with. In political science, this is viewed as a weakness. For the current purpose, making a sound prediction, I see it as a strength.
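The two assumptions can be turned into a back-of-the-envelope calculation. The Python sketch below is only an illustration under stated assumptions: the 3-point lead, the 2.5-point drift scale, and the choice of a Student t distribution with 3 degrees of freedom as the long-tailed alternative are all hypothetical values picked for the example, not the parameters of the actual prediction.

```python
import math


def normal_cdf(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def t3_cdf(z):
    """CDF of Student's t with 3 degrees of freedom (closed form)."""
    u = z / math.sqrt(3.0)
    return 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi


def win_probability(margin, drift_scale, fat_tailed=True):
    """P(the current leader is still ahead after opinion drifts).

    margin      -- today's lead in percentage points (measured by polls)
    drift_scale -- typical size of movement before the election
    """
    cdf = t3_cdf if fat_tailed else normal_cdf
    return cdf(margin / drift_scale)


def swing_probability(swing, drift_scale, fat_tailed=True):
    """P(opinion moves at least `swing` points against the leader)."""
    cdf = t3_cdf if fat_tailed else normal_cdf
    return 1.0 - cdf(swing / drift_scale)


MARGIN = 3.0       # hypothetical current lead, in points
DRIFT_SCALE = 2.5  # hypothetical scale of future movement, in points

for fat in (False, True):
    label = "long-tailed" if fat else "Gaussian"
    print(
        f"{label:>11}: win {win_probability(MARGIN, DRIFT_SCALE, fat):.3f}, "
        f"7-point swing {swing_probability(7.0, DRIFT_SCALE, fat):.4f}"
    )
```

With these made-up inputs, the long-tailed version assigns a 7-point swing more than ten times the probability the Gaussian does, while still treating it as unlikely. That is the sense in which rare shocks can be "quantified, if only roughly" without any economic theory.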
Great summation, Sam, but I think it will go over some readers’ heads.
Probably. Still gotta try. Thanks.
Weather forecasting only became somewhat accurate with the aggregation of both massive computing power and huge amounts of satellite data, and even now we see it is a chaotic system that is hard to predict (a butterfly flaps its wings in Hawaii and Paul Ryan is nominated as VP, right?) Why shouldn’t the same apply to presidential polling? Is high meteorological variability consistent with a Gaussian distribution of outcomes (but with a higher day-to-day variance than in state polling data, perhaps), or does it require a different, fat-tailed distribution?
Jacob Hartog: That’s what the long-tailed distribution is for. Such events can be quantified, if only roughly. Which is what I did.
Anyway, one should not look to a political-science model for that kind of insight. Polling offers the most information. If that is not enough, there is no better source of information. Except maybe a November 7th newspaper…
AlpsStranger: Holy cow, forgot about the transgender prostitutes. That sounds publishable for sure. You should call Colorado. Maybe they took your idea.
Joking aside, I think you are missing the point of what political science models do. If one of them differs widely from a poll-based snapshot, it is not because of any of the reasons you list.
Iseeurfuture: You were right.
(On varying factors that are mere correlation while searching for causation, I’m inevitably reminded of the famous story of the Hawthorne studies at Western Electric.)
I’m not sure I agree that numbers couldn’t move by 7%.
What if videos of Obama or Romney beating their wives surfaced?
What if Mitt’s taxes were leaked and he really didn’t pay any taxes for a decade?
What if Obama was caught with a transgender prostitute?
I don’t claim to be much of a scholar and I’m possibly simplifying it, but it seems like the fellas from Colorado are going heavily on the idea that no sitting president could win with an economy and unemployment rate like we have now, disregarding the fact that the polls clearly show most voters don’t blame Obama for the previous ditch we were in in the first few months of his administration.
I was being pretty silly, sorry about that.
Well, maybe it’s appropriate that I account for these factors with a “fat-tailed distribution.” Sorry. Hey, at least my calculation includes that stuff. The polisci guys don’t.
When I think about the poli-sci models vs polling, I am somehow reminded of this quote from Memento: Memory can change the shape of a room; it can change the color of a car. And memories can be distorted. They’re just an interpretation, they’re not a record, and they’re irrelevant if you have the facts.
Romney wins Florida, 100% chance.
I’m calling it right now.
http://jacksonville.com/news/florida/2012-08-27/story/democratic-registration-all-dries-new-florida-laws
Sam,
Wondering what’s your take / impact on independents currently favouring Romney by a large margin?
http://www.examiner.com/article/mitt-romney-leads-five-points-by-unskewed-data-from-cbs-news-poll
This was just one of many polls taken many times indicating that Romney is faring much better against independents…
I mean, I recall that in 2004 your personal prediction missed (not your EV median) because you assumed that independents would break against the incumbent.
Simply put, I’m just wondering if you think that independents favouring Romney is a precursor of a dead-heat election (which is nowhere near what the current polls suggest).
Thanks!
No, that was undecideds.
No comment on an individual poll. See the median-based meta-analysis for my opinion.
Furthermore, what that article is talking about is the data unadjusted for party affiliation, on the theory (which seems to have become Republican conventional wisdom this cycle) that the actual poll result, like most polls, is vastly oversampling Democrats.
In other words, it’s second-guessing the poll based on a speculative theory of the pollsters’ systematic error. Which does sound a lot like what Prof. Wang did in 2004.
…hmm, it says “when the data are analyzed to unskew the data”. So it sounds like they’re not even looking at the raw data, they’re applying a different adjustment to get the party balance they think is right.
I certainly know how they feel. Psychologists call this motivated reasoning. If current conditions persist, they will be learning the hard way in November.
If Romney gets a typical convention bounce, they’ll get a few days of happiness real soon now.
Sam, I think you summarized it well. The PoliSci model is weak on two broad points (for prediction):
First, not enough data points. The federal government has only been collecting unemployment figures since 1948, and many of the other economic variables they use suffer from a potential for systematic errors.
Add to that the fact that during that short period there have only been 10 elections (48,56,64,72,76,80,84,92,96,04) of an incumbent vs. challenger (7 if you don’t count incumbents who were VPs who assumed the presidency: Truman/48 & LBJ/64 & Ford/76) and the available data points just keep shrinking. And, we haven’t started junking data b/c of third parties yet.
Second, there is just way too much noise. Short of Dr. Who joining the Univ. of Colorado team we can’t experimentally test using Obama/Romney in past elections, or insert Nixon/McGovern, Reagan/Carter, etc., into 2012.
We will obviously learn something after the votes are counted in November, but one lesson will no doubt be just how far we really are from having an accurate presidential forecast come from an econometric theory with so few (if any) valid data points.
Sam, I think your arguments for not using additional economic variables are strong.
If you want to make the case that they could potentially improve forecasts, it would not be enough to show that there was a correlation between, say, unemployment and the election outcome in the past. You would need to show that this was the case even after controlling for interim polling results.
This might be the case, of course! Maybe people do not care about the economy during polls, only during the election. But you can test that presumption. There would still be enough open questions afterwards. How to find the correct functional form. What about uncertainty. But it would be a starting point.
Until people come up with good evidence that additional variables decrease bias in the prediction it sounds sensible to me to stick to the H0.
I think political “science” is going to be as dead as phrenology.
The new domain science will be called red/blue genetics or neuropolitics.
How many here know what the Savannah Principle is or have read Chris Mooney’s new book “The Republican Brain”?
With fMRI and sMRI technology we can actually find significant between-group differences in brain function and brain morphology between conservatives and liberals.