## The presidential predictor sharpens

#### September 29th, 2012, 4:05pm by Sam Wang

June 2016: This explanation from 2012 captures the concept of the Bayesian predictor, but in 2016 the prior is set differently. As of now, I am using the national Clinton v. Trump poll median for all of 2016, along with a large SD (+/-7%), to set a fairly weak prior. This assumption will eventually be replaced with statistics on the Meta-Margin history once there’s enough of that to be useful.

Today I present the short-term presidential prediction. Both the red and yellow ranges are noticeably more precise than the long-term prediction. This will be updated daily until the election.

This prediction is based on the amount that the Meta-Margin is likely to shift over short periods of time, emphasizing the limits set by the long-range prediction. As of today, the predicted “strike zone” is 314-347 EV and the November Obama re-elect probability is 97%.

The rest is very math-y…

During 2012, the Obama-Romney Meta-Margin’s typical movement (the standard deviation) has been about 1% in 1 week, and about 1.8% in 3 weeks or longer. On average, the movement forward in time has been near zero, as shown by the black curve in the middle of this plot.

As an example, right now the Meta-Margin is 5.04%, and the election is 38 days away. Therefore a movement of +/-1.8% in either direction from today’s value gives a 1-standard-deviation (SD) range of Obama +3.24% to +6.84%.
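For readers who want to check the arithmetic, here is a minimal sketch. The function name is made up for illustration; the two numbers are the ones quoted above.

```python
def one_sd_range(meta_margin, drift_sd):
    """Return the (low, high) 1-SD range around today's Meta-Margin."""
    return meta_margin - drift_sd, meta_margin + drift_sd

# Values quoted in the post: Meta-Margin 5.04%, drift SD 1.8%.
low, high = one_sd_range(5.04, 1.8)
print(f"Obama +{low:.2f}% to +{high:.2f}%")  # prints "Obama +3.24% to +6.84%"
```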

The long-range prediction, which you’ve been watching for weeks, excluded the upper part of this range, suggesting that movement above +6% is relatively unlikely. So that part gets compressed. This is done by using the long-range expectation as a “Bayesian prior,” whose distribution is multiplied by the movement-from-today distribution. This gives the actual prediction. As the election draws near, the red curve below will move toward the black curve, which itself will become very narrow.
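To make the multiplication step concrete, here is a rough sketch that assumes both distributions are Gaussian, which this post does not actually state. The drift numbers are today’s values; the prior’s center and width are placeholders invented for the example, not the site’s real long-range prediction.

```python
def gaussian_product(mean_a, sd_a, mean_b, sd_b):
    """Mean and SD of the normalized product of two Gaussian densities."""
    w_a = 1.0 / sd_a ** 2          # precision (inverse variance) of the first curve
    w_b = 1.0 / sd_b ** 2          # precision of the second curve
    var = 1.0 / (w_a + w_b)        # precisions add when densities are multiplied
    mean = var * (w_a * mean_a + w_b * mean_b)  # precision-weighted average
    return mean, var ** 0.5

drift_mean, drift_sd = 5.04, 1.8   # movement-from-today distribution (from the post)
prior_mean, prior_sd = 3.0, 2.2    # HYPOTHETICAL long-range prior, for illustration only

post_mean, post_sd = gaussian_product(drift_mean, drift_sd, prior_mean, prior_sd)
print(f"combined: center {post_mean:.2f}%, SD {post_sd:.2f}%")
```

Under these assumptions the combined curve sits between the two inputs and is narrower than either, which is the “compression” described above.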

This allows us to calculate a win probability. Based on Drift-from-now alone, the win probability is 97%. If we use the Bayesian prior to sharpen things up, it becomes 99%. To be conservative (in the nonpolitical sense) I report 97% in the topline for nontechnical readers.
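The post does not say which distribution is used to turn drift into a probability. As a rough check, the sketch below compares two hypothetical choices for the drift-from-now case: a normal distribution, and a heavier-tailed Student t with 3 degrees of freedom that treats 1.8% as its scale parameter. Both choices are assumptions made for this example. The normal comes out well above 99%, while the heavier-tailed curve lands much nearer the 97% reported here, so some allowance for fat tails seems plausible.

```python
import math

def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def t3_cdf(t):
    """CDF of Student's t with 3 degrees of freedom (closed form)."""
    u = t / math.sqrt(3.0)
    return 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi

meta_margin = 5.04   # today's Meta-Margin, from the post
drift_sd = 1.8       # typical movement over 3 weeks or longer, from the post

# Obama wins if the Meta-Margin stays above zero, i.e. drift > -meta_margin.
z = meta_margin / drift_sd
p_normal = normal_cdf(z)
p_t3 = t3_cdf(z)     # ASSUMPTION: 3 degrees of freedom is an illustrative choice
print(f"normal: {p_normal:.3f}, t (3 dof): {p_t3:.3f}")
```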

Then it is necessary to convert the predicted Meta-Margin range to electoral votes. This is possible using the relationship between the two quantities as the campaign unfolded this year:

This plot shows all available data since June. Data from mid-August were used for extrapolation.
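One simple way to do such a conversion is a straight-line fit, sketched below. This is only an approximation, since the real relationship flattens where there are gaps between states. The (Meta-Margin, EV) pairs are made-up placeholders chosen to lie on a line; they are not the site’s data, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# PLACEHOLDER (Meta-Margin %, median EV) pairs, invented for illustration only.
history = [(1.0, 285), (2.0, 300), (3.0, 315), (4.0, 330)]
slope, intercept = fit_line(*zip(*history))

def meta_margin_to_ev(meta_margin):
    """Extrapolate EV from a Meta-Margin using the fitted line."""
    return slope * meta_margin + intercept

print(f"{slope:.1f} EV per point of Meta-Margin")
```

Feeding the low and high ends of a predicted Meta-Margin range through `meta_margin_to_ev` would give the corresponding EV range.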

The end result is the electoral vote (EV) range shown in the red “maximum strike zone” above. This is where the Presidential election is most likely to end up (68% confidence interval).

The yellow zone above is more complicated. It starts from the 95% range, calculated in the same way. This is combined with today’s 95% nominal confidence interval, the gray zone you’ve been seeing all along, which includes pollster-to-pollster variations. The yellow zone is defined as the union of these two ranges. Therefore pollster variation and the maximum possible drift are both contained in this wider “watch zone.”
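Taking the union of two ranges is simple enough to sketch in a couple of lines. The EV numbers below are hypothetical, since this post does not list the two 95% ranges, and the sketch assumes the ranges overlap (otherwise it also covers the gap between them).

```python
def union_range(a, b):
    """Smallest single (low, high) interval that covers both a and b."""
    return min(a[0], b[0]), max(a[1], b[1])

# HYPOTHETICAL ranges in electoral votes, for illustration only.
drift_95 = (290, 350)      # 95% range from the drift-based prediction
snapshot_95 = (300, 360)   # today's 95% nominal confidence interval

watch_zone = union_range(drift_95, snapshot_95)
print(watch_zone)  # prints (290, 360)
```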

These red and yellow zones will be updated from now until Election Day.

For geeks, an even more technical description and MATLAB script are provided here.

• How do you make the “Days Forward in Time” vs. “Change in Meta-Margin” plot (the second plot)?

I presume the drift-from-now Gaussian in the third plot is a vertical slice of the second plot.

I apologize if you have already explained that plot, it does look familiar…

• Read this and this. The second plot is two distributions (long-term prediction and the new one today) and their product. The product is normalized and is in red.

• William Ockham

I think you are substantially overestimating the range of possible outcomes. Polls move less in the last month of a presidential campaign. Check your data from 2008 and 2004. I found this statement from Nov. 2 on your blog:

What’s amazing about the Presidential race is how little change has occurred in October in national polls and in meta-analysis of the Electoral College.

At the risk of sounding like Andrew Gelman’s press agent, check this out as well:

http://www.stat.columbia.edu/~gelman/research/published/psq_4021.pdf

The tl;dr is that the closer we get to election day, the more a person’s vote preference is predicted by their demographics (gender, race, age, partisanship, etc.). More support for the theory that we already have enough information from state polls to know the electoral college outcome for all but the closest states by 4 weeks out. This year that might be enough for all the states.

• I could try that. However, I only have detailed information from 2008 and 2004 on post-Labor Day movement. A bit scanty for the autocorrelation. An alternative would be to use the Erikson/Wlezien book for more time series in re-election races.

Practically speaking, this all turns into a drift parameter. It is already small, <2% of SD of margin. Note that the SD of the Meta-Margin this year is less than half what it was in 2008. Polarization is extreme this year. You really think that analysis can make that SD much smaller? Debates do move opinion a bit, especially the first one.

Finally: note that the red strike zone is 31 EV wide, a hair larger than Florida.

• In fact, in 2008 the range in September-October spanned 80 EV. In 2004 it was 50 EV. So you are incorrect about the variability of races.

• The center of your red zone is very close to what I’ve been thinking of as my gut-feeling prediction for a week or two now. Not that you have to be some kind of genius to feel that way.

• I agree. It was a slightly harder problem in July. That prediction was a bigger deal.

• baw1064

The probability distributions are quite asymmetric. Is that essentially saying that Obama is already ahead in all the states he could realistically win (that an additional 1-2% shift in the national popular vote in favor of Obama would not net him any more electoral votes)?

• William Ockham

Absolutely. Because it is harder to shift the votes in swing states than it is nationally (because swing state voters have already been persuaded extensively) and Obama is behind by about 7 points in the nearest states he could pick up, it would take an Edwin Edwards moment* from Romney for Obama to pick up any additional states.

*Edwards told reporters once that “The only way I can lose this election is if I’m caught in bed with either a dead girl or a live boy.” Of course, that was a reasonable possibility for Edwards, but not so much for Mitt Romney…

• William Ockham

Sam,

All the possible swing states have been polled extensively. I don’t believe anybody’s mind is going to be changed about who to vote for, and changing the makeup of the electorate in a swing state at this late date is difficult. We have a pretty clear idea of how the states line up on the D-R axis. Everything from Indiana and Missouri all the way to Utah is solidly for Romney. Nothing is going to change in those states.

Nothing that Romney can say in the debate is going to change the outcome in Ohio. Colorado, Florida, and Nevada are safe for Obama because Latinos are especially motivated to vote this year. There simply aren’t enough voters in play to make it feasible for him to change the outcome. New Hampshire looks safe for Obama as well. That leaves Iowa, Virginia, and North Carolina. The polls in Iowa and Virginia would have to be very wrong for Romney to have a chance there.

Ultimately, the only state in play right now is North Carolina and Obama looks to have the advantage there. I would argue that North Carolina going to Romney would represent an outcome more than 1 std deviation from the mean. Iowa and Virginia going Republican is over two std dev. Everything else is basically impossible to change at this point.

• Steven S

Sam,
A question reflecting my weak statistical background. Why not do the prediction based on median EV instead of Meta-Margin? Could you look at the difference in median EV between the beginning and end of every 38-day window since June and find that standard deviation? Could that avoid needing to convert from Meta-Margin to EV? This question may reveal a lack of understanding on my part about the importance of the Meta-Margin.

Your prediction matches the sense I get eyeballing the EV history – the most radical swing has been the one we just witnessed from late August to now; even a swing of the same magnitude in the opposite direction would not put Obama in danger. It also makes intuitive sense that the tail of your distribution curve crosses 0mm, retaining a 1 to 3 percent chance of a Romney win, even though neither history has been south of the red line except for the edge of the grey.

• The reason for using Meta-Margin is that it is in units of popular sentiment. That is a natural unit for quantifying movement in opinion. Mostly it doesn’t matter, since the relationship between Meta-Margin and EV is approximately linear. At the moment it matters more because Obama is near a plateau: there is a gap between current reachable states (for instance NC) and the next-most-Republican states (AZ, IN…). In EV, one could not distinguish between current conditions and +/-1%. With Meta-Margin, one can.

• Ram

Sam

One research suggestion: the race remained remarkably stable until the middle of August. Some very fundamental underlying factor seems to have triggered the shift to Obama and the Democrats. I am not sure it is the DNC convention or the 47% video. Is it the Ryan pick as VP? He brought the Medicare issue to the table. One interesting approach would be to see whether the shift was driven by the 65+ voter group; comparing it with the 40-65 and 18-40 voter groups may uncover that. If these data can be compared with Medicare-related data points such as Medicare ad spots, dollar amounts, and Google search data, we may be able to confirm it. For example, under Google Trends, the 90-day search trend for the word “Medicare” was stable until August 11, then spiked from a mean of 60 to 100, and has remained stable at 75 since then. The question is whether that accounts for the roughly 2% shift away from the Romney-Ryan ticket. The DNC convention probably increased Democrats’ interest, but that doesn’t account for all of the shift we see in the race. These are just my thoughts.

• Big-picture, this shift isn’t that much larger than the others that have happened in this campaign. If you compare it to past election years, really not much has happened in 2012; the race is very stable. 2008 had bigger swings. 2004 was additionally extremely close overall, so the swings mattered a lot.

It looks to me as if the selection of Ryan gave Romney a significant temporary boost, but it’d dissipated by the time of the conventions. What’s happening now is that most of the US population is starting to pay attention to the election campaign, and Romney just keeps making one false move after another, though the emergence of the “47%” video was a huge gift to Obama.

• E L

Having managed to live 70+ years, I don’t give the chances the sun will come up tomorrow 97%. The US Supreme Court deciding a presidential election?… 0% chance. Oh, wait a minute…

• Joel

You are saying that the sun didn’t come up approximately 200+ days in your lifetime?

• Craig

• Kurtis

Love the hat tip to Laplace’s Rule of Succession.

• In the RAND poll’s “shifts between candidates” chart, the Obama-to-Romney line is about to catch up with the Romney-to-Obama line, which suggests that the big shift to Obama has gone about as long as it’s going to go. I wouldn’t be surprised to see a leveling out or some small motion back toward Romney.

But the other interesting thing I saw there the other day is that with voters over 60 there’s been an amazingly dramatic move away from Romney, which suggests that attacking him and Ryan on Medicare/Social Security has been really effective. Meanwhile Obama is probably maxed out with the younger voters.

• wheelers cat

umm….say what?
I don’t see that. I see Romney-to-Obama as a new plateau, and Obama-to-Romney as a new valley.
I don’t see any “catching up”.
Maybe you can explain that to me.

• wheelers cat

@Matt
actually that raises an interesting question.
Once a respondent has changed their mind, can they change back, can they change their mind AGAIN?

• I think that last little movement wasn’t there when I last looked at the chart…

• …Now both “switch” numbers are dropping in tandem (with Romney-to-Obama still slightly in the lead), which suggests that the race is stabilizing. Of course one could easily overinterpret day-to-day fluctuations in the RAND poll.

Sam, you’ve spent a lot of time talking about the most effective place for voters to send their money (i.e., not the presidential races, but rather knife-edge races in the House, less so in the Senate). Figure #2 (% change in the Meta-Margin as a function of date) got me thinking about a related question over time rather than over space: when is the most effective time to donate money? In other words, when is each dollar most effectively spent on changing opinions? Campaign spending done too early may be ineffective because most of the public either is not paying attention or is not yet engaged with the election. Thus while the public might physically hear the candidates’ messages, they may not really be hearing them yet. Second, given the length of the campaign and the voting date, many people may not be engaging in the same media consumption habits during the summer that they do in the fall. Television networks are aware of this, which is why they put all their new TV shows on in the fall, and generally run crap and re-runs in the summertime. This would have to be balanced against other factors, such as the possibility of cheaper media buys during the summer, and the asymmetry in consequences for buying too early versus buying too late. Spending campaign dollars on media too early means those dollars could’ve been spent more effectively elsewhere, while spending those campaign dollars on media too late means that you may have a much deeper hole to climb out of, requiring even more media buys.

Lastly (and this is a somewhat different question), given your expertise in neuroscience, what do you think is more effective campaigning? Television and radio ads, or individual campaigning? I tune out an awful lot of the radio and television ads I’m subjected to, so much so that it frequently takes weeks of a new commercial being on TV before I recognize what that ad is for, unless of course it really does something to bring you in. Campaign television ads generally fall into the category of easily ignored.

• William Ockham

Sam,

In my comment about the stability of the race later in the campaign, I was focussing on the last 4 weeks. I completely expect there to be significant movement in September. Looking at your graphs from 2004 and 2008, this certainly appears to be the case, lots of movement in August and September, very little in October. However, on further consideration, I don’t think you should try to account for this in your prediction. There simply isn’t enough statistical evidence to justify polluting your formula with what is fundamentally a qualitative judgement. I’ll just say that I would still be shocked if any state’s outcome is different from the current polling direction with the exception of North Carolina, which at this point is too close to call.

• Hans

Dr. Wang, after the election is over I would love to see a little animation of the short-term prediction plots from the moment you started them to the end of the election, similar to the animations the National Weather Service does for hurricanes. No reason other than curiosity for my request; I just want to see the little red region wobble around. :-)