Comments on my sharpening of the Presidential forecast were helpful. The outcome is that I will keep the key new assumption, which is to cap future standard deviation in the Meta-Margin at 3.0%. Since Hillary Clinton’s Meta-Margin (effective popular lead, measured through Electoral College mechanisms) is 6.3%, that means that she is 2.1 standard deviations ahead. That is a lot of standard deviations.
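As a rough check on that arithmetic, here is a minimal sketch in Python. The conversion from standard deviations to a probability assumes normally distributed drift in the Meta-Margin. That is a simplifying assumption made here for illustration, not necessarily the distribution used in the forecast itself, and the function names are hypothetical.

```python
import math

def sds_ahead(meta_margin_pct, sd_cap_pct):
    """Number of standard deviations by which the lead exceeds zero."""
    return meta_margin_pct / sd_cap_pct

def normal_cdf(z):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z = sds_ahead(6.3, 3.0)        # Meta-Margin 6.3%, future SD capped at 3.0%
p_lead_holds = normal_cdf(z)   # assumes normal drift (illustrative only)
print(f"{z:.1f} SDs ahead; P(lead holds) ~ {p_lead_holds:.3f}")
```

A heavier-tailed distribution would give a somewhat lower probability for the same number of standard deviations.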
A summary of the discussion follows. I will start with the key graph, which I produced in response to Joel. Like all my analysis of the 1952-2012 elections, this was made using Wlezien and Erikson’s data. Prof. Wlezien has helpfully provided the original dataset on his website.
Is past performance this year a predictor of future dynamics? Joel wanted to know about “in-sample variance”: is variance in the earlier part of a campaign predictive of what happens in the closing months? That would tell us whether it is kosher for me to use this year’s Meta-Margin history to estimate volatility from now until Election Day.
Jeremiah’s reaction says it well: “I think of all the discussions this is the critical chart to consider….I think the way to look at this chart is to ask oneself what scenarios would point to upsetting the prediction? Even with all of the data the maximum SD for 1-90 days before the election is 4 percent and the average is much less than this. A SD assumption of 3 percent would therefore seem conservative. Also, there are no data points in the upper left quadrant of the chart and there is only one data point where the SD got much larger closer to the election and that was still less than 3 percent.”
Bottom line: there’s no good justification for assuming that future variation will be greater than 3 percent. So I will keep it there.
In retrospect, for purposes of prediction, the graph above would have been enough. However, I think my point that polarization has come with entrenchment of opinion is still useful.
Is 2016 different? This leads to Mike’s general concern about my classifying 2016’s data as being similar to 1996-2012. “I think a lot of people share an intuition that there is something about this race that should discourage us from grouping it with the other post-1996 elections in terms of volatility. It seems like it would be worthy to look for numerical support for that intuition, if only to see what the strongest argument is against the low-variability assumption.”
Certainly I see the point of this objection. Donald Trump’s candidacy is so obviously freakish that surely 2016 is different…right? Actually, not really, from a data standpoint. The strong state-by-state correlation between Trump 2016 and Romney 2012 suggests that not all that much has changed, except that Trump is quite weak within his own party.
I see Trump as a culmination of a 20-year trend in the priorities and culture of the Republican Party. His tactics are familiar to the party base. For example, the questioning of legitimacy: of Obama’s birthplace, and of other Republicans, and even the November election itself…the list goes on. And yet he always had at least 40% of Republican primary voters on his side. I offer the following synthesis of data (2016 has been really stable) and events (crazy Trump): the U.S. is suffering from a near-fatal case of polarization, and Trump is a consequence.
The Gary Johnson factor. Several readers, for example NHM, raised the concern that this year, there are a lot of Gary Johnson supporters. Various hypothetical scenarios were laid out for how that could affect the race.
Here is a general way to think about Gary Johnson, who is currently polling at about 8%. Undecided plus alternative-party votes add up to 20.5%: the Clinton+Trump total is 79.5%, compared with 91.0% for Obama+Romney on the same date in 2012. Because third-party votes are especially fluid in the home stretch, that could lead to more uncertainty in 2016 than in 2012. This is especially important because many of those voters are Republicans who might break toward Trump.
The plausible outcomes for Gary Johnson supporters range from all going for Trump (i.e. 8% toward him) to maybe a 5%-3% split toward Clinton (i.e. net movement of 2% toward her), a total span of 10%. The approximate SD of such a range of possibilities is one-fourth of the total span. So SD_3rd_party = 10%/4 = 2.5%. That’s still within the 3% assumption.
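The range-rule estimate above can be sketched as follows. The sign convention (positive means net movement toward Trump) and the function name are choices made here for illustration. Dividing by four treats the plausible span as covering roughly plus or minus two standard deviations.

```python
def range_rule_sd(low, high, sds_spanned=4.0):
    """Approximate SD as the span divided by the number of SDs it covers."""
    return (high - low) / sds_spanned

# Net movement in percentage points; positive = toward Trump (assumed convention).
all_to_trump = 8.0           # every Johnson supporter breaks for Trump
net_to_clinton = 3.0 - 5.0   # a 5%-3% split favoring Clinton: net of -2

sd_third_party = range_rule_sd(net_to_clinton, all_to_trump)
within_cap = sd_third_party <= 3.0   # compare against the 3% SD assumption
print(f"SD_3rd_party = {sd_third_party:.1f}%, within 3% cap: {within_cap}")
```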
There’s more good discussion. I encourage you to read it.