About Us

About the Meta-Analysis

These calculations are based on state polls from many polling organizations. The data source for 2016 (as in 2008 and 2012) is Pollster.com. The data are fed through MATLAB scripts (1) (2) (3) (4) to mathematically compute all of the results shown.

Our Process

The first step is to calculate the probability of winning a state, taking into account the variability of polls. This is done by calculating simple statistics on the polls: the median and the estimated standard error of the mean (SEM). The median is used as an outlier-resistant estimate of the mean. A floor on the effective SEM is used to account for intrinsic sampling error and inter-pollster variability. The median and effective SEM are then converted to a z-score and a probability of a win using the normal distribution (bell-shaped curve). The last 3 polls are used, or all polls with an end date within 7 days of the most recent poll (rule changed on 7/14/2016), whichever is greater. Occasionally a tie can result in the use of 4 polls. At present, a polling organization is permitted to be the source of more than one poll for a state. From these statistics, a win probability is calculated for each state using the t-distribution. A t-distribution is used instead of the normal distribution because the t-distribution has longer tails, providing a way to allow for surprising outlier events.
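The poll-selection rule and the state win probability described above can be sketched in a few lines. This is an illustrative Python sketch, not the site's MATLAB code. The function names, the (end_date, margin) tuple format, the 3-point SEM floor, and the use of the sample standard deviation to estimate the SEM are all assumptions made for the example. It uses the normal distribution from the standard library; swapping in a t-distribution, as the text mentions, would need a statistics package.

```python
from datetime import date, timedelta
from math import sqrt
from statistics import NormalDist, median, stdev


def select_polls(polls, min_count=3, window_days=7):
    """Pick the polls to use for one state.

    polls: non-empty list of (end_date, margin) tuples (hypothetical format),
    where margin is candidate A minus candidate B in percentage points.
    Returns the last `min_count` polls, or every poll ending within
    `window_days` of the newest poll, whichever set is larger.
    Tie handling (the occasional 4th poll) is not modeled here.
    """
    ordered = sorted(polls, key=lambda p: p[0], reverse=True)
    newest = ordered[0][0]
    window = timedelta(days=window_days)
    recent = [p for p in ordered if newest - p[0] <= window]
    return recent if len(recent) > min_count else ordered[:min_count]


def state_win_probability(margins, sem_floor=3.0):
    """Convert poll margins into a win probability for candidate A.

    sem_floor is an assumed value; the text does not state the floor used.
    """
    center = median(margins)
    if len(margins) > 1:
        sem = stdev(margins) / sqrt(len(margins))
    else:
        sem = sem_floor
    effective_sem = max(sem, sem_floor)  # floor on the effective SEM
    z = center / effective_sem
    return NormalDist().cdf(z)


polls = [
    (date(2016, 7, 1), 1.0),
    (date(2016, 7, 10), 3.0),
    (date(2016, 7, 12), 4.0),
    (date(2016, 7, 14), 5.0),
]
chosen = select_polls(polls)
probability = state_win_probability([margin for _, margin in chosen])
```

With these made-up polls, the three most recent are selected and the median margin of 4 points is divided by the floored SEM of 3 points to give the z-score.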

The second step of the calculation is complex: it calculates the probability distribution of every possible electoral vote (EV) outcome. For 50 states and the District of Columbia the total number of win/loss combinations is 2^51 = 2,251,799,813,685,248 (nearly 2.3 quadrillion). The obvious approach, testing all these combinations one at a time, would take over 70 years at 1,000,000 per second. A far more efficient approach is to calculate the exact probability distribution in closed form (see FAQ; thanks to Lee of Quant Consulting). This reduces computing time drastically, and gives the probability of each possible number of EV (i.e. the probability of Obama 0 Romney 538, of Obama 1 Romney 537, and so on).
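One standard way to get the exact distribution without enumerating 2^51 combinations is to fold in one state at a time, which is equivalent to multiplying out polynomials of the form (1 - p) + p*x^EV. The Python sketch below illustrates that idea under the assumption that states are independent. It is not the site's MATLAB script, and the function name and the (electoral_votes, win_probability) input format are hypothetical.

```python
def ev_distribution(states):
    """Exact distribution of candidate A's electoral votes.

    states: list of (electoral_votes, win_probability) pairs, treated as
    independent. Returns a list where entry k is the probability that
    candidate A wins exactly k electoral votes.
    """
    dist = [1.0]  # before any state is counted: 0 EV with certainty
    for ev, p in states:
        new = [0.0] * (len(dist) + ev)
        for total, prob in enumerate(dist):
            new[total] += prob * (1 - p)   # candidate A loses this state
            new[total + ev] += prob * p    # candidate A wins this state
        dist = new
    return dist


# Two made-up states worth 3 and 4 EV, each a coin flip.
example = ev_distribution([(3, 0.5), (4, 0.5)])
```

The work grows with the number of states times the total number of electoral votes, so 51 contests and 538 EV need only tens of thousands of operations.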

Those probabilities are then tabulated to come up with a number of statistics, most prominently the 50th-percentile (median expected) outcome and a 95-percent confidence interval. The 95-percent confidence interval is particularly useful because, like the famous Margin of Error (MoE), it gives the range of outcomes that would occur 95 percent of the time based on the available information. Note that the ends of this confidence interval are very similar to the 50th-percentile outcome if support for either Obama or Romney were underestimated or overestimated by 1 percentage point. The confidence interval varies in size somewhat, and when large states (such as OH, WI, FL) are toss-ups the confidence band can be up to twice as large.
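The tabulation step can be illustrated by walking the cumulative distribution. In this Python sketch the input is assumed to be a list in which entry k holds the probability of exactly k EV. The function name is hypothetical, and reading the 95-percent interval as the 2.5th to 97.5th percentiles is an assumption about how the interval is defined.

```python
def percentile_ev(dist, q):
    """Smallest EV total whose cumulative probability reaches q (0 < q <= 1)."""
    cumulative = 0.0
    for ev, prob in enumerate(dist):
        cumulative += prob
        if cumulative >= q:
            return ev
    return len(dist) - 1  # guard against floating-point shortfall


# Made-up distribution over 0..4 EV, for illustration only.
dist = [0.1, 0.2, 0.4, 0.2, 0.1]
median_ev = percentile_ev(dist, 0.5)
interval = (percentile_ev(dist, 0.025), percentile_ev(dist, 0.975))
```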

Although this calculation takes into account the variability of polls, it is important to note what it does not do. It is integrated over the last three polls (mostly 1-4 weeks), so fast swings do not show up. It does not reject any polls, nor does it account for potential bias or predict future opinion shift in any way. It is an unbiased snapshot of what all the polls, taken together, tell us about where the race stands on a particular day.

Poll Selection

Polls are unfiltered and equally weighted, in part because selecting data leads to unintended biases. Therefore even though some polling organizations give demonstrable and consistent outlier results (example), all polls are still included. The effects of outliers are reduced by using the median Obama-Romney margin rather than the average.
However, even when all of the above polls are excluded the result is virtually identical. Thus the method can be tailored but is also robust enough to give a reasonable answer even with no selection of data. For a fuller discussion of my methods, see this DailyKos thread from 2004.

Bias

These calculations would be affected if there is an overall poll bias, which can have a large effect in a close race. Bias could happen if polling methods do not accurately sample actual voting patterns. However, little evidence for bias has emerged in the last two Presidential elections. In 2000, Ryan Lizza at The New Republic compiled state polls. On the day before the election, that compilation indicated that the outcome would hinge on Florida. This matches what happened, arguing against major built-in biases in state polls. In 2004, unbiased application of this site’s methods made a correct prediction about the EV outcome.

Various factors may lead opinion poll results to differ from actual voting-booth results. Possibilities include undercounting of cell phone users and reluctance of voters to admit that they would not support a nonwhite candidate. At this time, the evidence suggests that these two factors do not have significant effects. Bias may still occur for other reasons. On Election Eve 2000, polling organizations favored Bush by 2.5%, yet Gore won the popular vote. Possible reasons include last-minute swings of opinion, or the possibility that one side does better or worse at turning out voters than expected.

For this reason, it is useful to ask the question: how much would voter sentiment have to differ from polls to affect the outcome? This leads to the development of the bias calculation and the Popular Meta-Margin.

A key measure of the current closeness of the race is the Popular Meta-Margin (a.k.a. Swing Index). This is the across-the-board percentage shift in opinion (or poll bias) that would be needed to make the electoral college an exact toss-up. This is analogous to the popular margin in national polls, but is more relevant to what it would take in terms of real electoral mechanisms.
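As a rough illustration, the Meta-Margin can be found by searching for the uniform shift at which the race becomes a toss-up. The Python sketch below is an assumption-laden reading, not the site's MATLAB code: "exact toss-up" is taken to mean a 50% chance of an electoral majority, state probabilities use the normal distribution, and the function names, the (electoral_votes, median_margin, effective_sem) input format, and the search bounds are hypothetical.

```python
from statistics import NormalDist


def win_probability(states, shift):
    """Chance that candidate A wins an EV majority after a uniform shift.

    states: list of (electoral_votes, median_margin, effective_sem) tuples,
    with margins as candidate A minus candidate B in percentage points.
    """
    dist = [1.0]
    for ev, margin, sem in states:
        p = NormalDist().cdf((margin + shift) / sem)
        new = [0.0] * (len(dist) + ev)
        for total, prob in enumerate(dist):
            new[total] += prob * (1 - p)
            new[total + ev] += prob * p
        dist = new
    majority = (len(dist) - 1) // 2 + 1
    return sum(dist[majority:])


def meta_margin(states, lo=-20.0, hi=20.0, tol=1e-6):
    """Candidate A's lead, expressed as the shift that would erase it.

    Bisection works because win_probability rises with the shift.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if win_probability(states, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return -(lo + hi) / 2


# Three made-up states, each with candidate A ahead by 2 points.
toy_states = [(10, 2.0, 3.0), (10, 2.0, 3.0), (10, 2.0, 3.0)]
toy_margin = meta_margin(toy_states)
```

A positive result means candidate A is ahead by that many points; a negative result means candidate B is.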

There used to be a feature implementing the bias variable to let you see what would happen if polls are biased, for instance if support for Obama is understated due to the Bradley/Wilder effect. This feature is currently inactive, but may be revived at some point.

Predicting the Future

You can use the bias calculation to estimate where things are headed. If you think turnout efforts will boost your candidate by N points, add that. If you think that one candidate will gain X points at the expense of the other, add 2*X. The map at the right sidebar shows the effect of one candidate gaining 2% (or 1% of voters switching sides). For other amounts of bias, you are welcome to run the MATLAB code yourself.
For instance, if you predict that turnout will increase Obama’s vote by 2 points, but Romney will pick up 1.5% of voters from Obama, then the bias to use is 2 – (1.5 * 2) = -1%, or 1% to Romney.
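The arithmetic in this example can be written as a one-line helper. The Python sketch below uses a hypothetical function name and argument names; a positive result favors your candidate and a negative result favors the opponent.

```python
def net_bias(turnout_boost, share_lost_to_opponent):
    """Net bias in points: turnout gain minus twice the share switching sides.

    A switched voter counts twice because the margin loses a point on one
    side and gains a point on the other.
    """
    return turnout_boost - 2 * share_lost_to_opponent


# The example from the text: +2 points from turnout, 1.5% of voters switching away.
bias = net_bias(2, 1.5)
```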

DATA SOURCES

Pollster.com
electoral-vote.com
RealClearPolitics

RESOURCES FOR LEARNING MORE ABOUT STATISTICS

David Lane’s online statistics textbook
Intuitive Biostatistics by Harvey Motulsky
An Introduction to Error Analysis by John R. Taylor

Database

If you want to delve into the Meta-Analysis further, many files you will need are here. It’s best if you know a little about MATLAB programming. The scripts and data files mentioned here can be found in the Geek’s Directory. All of the files in that directory are linked to the live versions currently running the calculations.

Princeton Election Consortium

Innovations in democracy since 2004
