# About the Princeton Election Consortium

This blog’s mission is to provide informed analysis of US national elections by members of the Princeton academic community. It is open to scholars in the Princeton area from all disciplines, including (but not restricted to) politics, neuroscience, psychology, computer science, and mathematics.

For now, much of the site’s information is about polling. As the campaign season progresses, we will expand to other interesting topics, and we expect diverse contributions. To write for us, please contact Sam Wang.

This blog began in 2004 as a meta-analysis directed at the question of who would win the Electoral College. Meta-analysis of state polls provides more objectivity and precision than looking at a single poll and gives an accurate current snapshot of the state of play. Over the course of the campaign, this site attracted over a million visits. In 2004, the median decided-voter calculation on Election Eve captured the exact final outcome (read this article and the follow-up). The 2008 calculation provided results based on decided-voter polling from all 50 states, and in the closing week of the campaign ended up within 1 electoral vote of the final outcome.

**The Management**

Prof. Sam Wang (email) has academic expertise in biophysics and neuroscience. In these fields he uses probability and statistics to analyze complex experimental data, and has published over eighty papers using these approaches. His research program concerns how the brain learns from experience in adulthood and development, with a special emphasis on autism. He is also the author of two popular books, the prizewinning *Welcome To Your Brain*, and a second book on child brain and mental development, *Welcome To Your Child’s Brain*. These books have been translated into over 20 languages. In 2015 Sam Wang was appointed to the New Jersey Governor’s Council on the Medical Research and Treatment of Autism by Governor Chris Christie.

Prof. Wang originally developed the Meta-Analysis in 2004 to help readers think about how to allocate campaign contributions. He was motivated by the fact that in a close race, one can make the biggest difference by donating at the margin, where probabilities for success are 20-80%. The Meta-Analysis has subsequently been found to be a highly sensitive tracking tool over time, and the concept has become extremely popular thanks to the efforts of FiveThirtyEight and other sites. To read a discussion click here. In 2004, the Meta-Analysis of State Polls got tens of thousands of hits per day, and since that time the Princeton Election Consortium has recorded over five million visits. Prof. Wang’s Meta-Analysis has also been featured on NPR, Fox News, and in the *Wall Street Journal*.

Sam Wang has written numerous articles, including for the *New York Times* and the *American Prospect* (for examples click here and here) and regularly gives public lectures. He also cohosts a Princeton University-sponsored podcast, Politics and Polls, with Prof. Julian Zelizer.

He is also founder of the Princeton Gerrymandering Project, a nonpartisan project that uses law and statistics to understand and prevent partisan abuse of redistricting. His proposed standards were published in the Stanford Law Review, and have been recognized by Common Cause.

Sam can be contacted as **sswang at princeton dot edu**.

**Lucas Manning** (email) is a current Princeton undergrad (class of 2020) studying Computer Science with a research focus on computer graphics/vision. He is responsible for administering the website, creating and managing the 2018 PEC redesign, and gathering data for analysis.

**Alumni:**

**Mark Tengi** is a former Princeton undergrad in the School of Engineering and Applied Science (class of 2016) who took over for Andrew during the 2014 Senate race. He is studying computer science and linguistics, with an emphasis on computer systems.

**Andrew Ferguson** is a 2008 Princeton graduate and was responsible for the site’s data processing and infrastructure for its first six years. While in college, he studied probability, statistics, and computer science. He is currently a computer science graduate student at Brown University, where he works on Software Defined Networks and frameworks for analyzing Big Data.

## About the Meta-Analysis

These calculations are based on state polls from many polling organizations. The data source for 2016 (as in 2008 and 2012) is Pollster.com. The data are fed through MATLAB scripts (1) (2) (3) (4) to mathematically compute all of the results shown.

The first step is to calculate the probability of winning a state, taking into account the variability of polls. This is done by calculating simple statistics on the polls: median and estimated standard error of the mean (SEM). The median is used as an outlier-resistant method of estimating the mean. A floor on the effective SEM is used to account for intrinsic sampling error and inter-pollster variability. The median and effective SEM are then converted to a z-score and probability of a win using the normal distribution (bell-shaped curve). The polls used are either the last 3 polls or all polls with ~~a median~~ *(changed on 7/14/2016)* an end date within 7 days of the most recent poll, whichever set is larger. Occasionally a tie can result in the use of 4 polls. At present, a polling organization is permitted to be the source of more than one poll for a state. From these statistics, a win probability is calculated for each state using the *t*-distribution. A *t*-distribution is used instead of the normal distribution because the *t*-distribution has longer tails, providing a way to allow for surprising outlier events.
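As a rough illustration of this first step, here is a minimal Python sketch (the site's actual scripts are in MATLAB and are not reproduced here). The function name, the SEM estimate (sample standard deviation divided by √n), and the 2-point floor are assumptions made for illustration only. The sketch uses the normal-distribution conversion; a *t*-distribution CDF could be substituted to get longer tails.

```python
from math import sqrt
from statistics import NormalDist, median, stdev


def state_win_probability(margins, sem_floor=2.0):
    """Estimate the win probability for one state from recent poll margins.

    margins   -- candidate-minus-opponent margins in percentage points
    sem_floor -- hypothetical minimum effective SEM (an assumed value)
    """
    center = median(margins)  # outlier-resistant estimate of the mean
    if len(margins) > 1:
        sem = stdev(margins) / sqrt(len(margins))
    else:
        sem = sem_floor  # a single poll gives no spread estimate
    effective_sem = max(sem, sem_floor)
    z = center / effective_sem
    # Convert the z-score to a probability with the normal distribution.
    return NormalDist().cdf(z)


# Three polls showing leads of 3, 5 and 4 points.
print(round(state_win_probability([3.0, 5.0, 4.0]), 3))
```

With these three polls the raw SEM is below the assumed floor, so the floor sets the z-score. That is the role the floor plays in the description above: a handful of polls that happen to agree should not produce overconfident probabilities.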

The second step of the calculation is complex: it calculates the probability distribution of **every possible electoral vote (EV) outcome.** For 50 states and the District of Columbia the total number of combinations is 2^51 = 2,251,799,813,685,248 (nearly 2.3 quadrillion). The obvious approach, testing all these combinations one at a time, would take over 70 years at 1,000,000 per second. A far more efficient approach is to calculate the **exact probability distribution** in closed form (see FAQ; thanks to Lee of Quant Consulting). This reduces computing time drastically, and gives the probability of each possible number of EV (i.e., the probability of Obama 0 Romney 538, of Obama 1 Romney 537, and so on).
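One standard way to get an exact distribution without enumerating every combination is to fold in one state at a time, which is equivalent to multiplying out one small polynomial per state. The Python sketch below illustrates that idea; it is not necessarily the method described in the FAQ. It assumes state outcomes are independent, and the function name and sample numbers are hypothetical.

```python
def ev_distribution(states):
    """Return dist, where dist[k] is the probability of winning exactly k EV.

    states -- list of (electoral_votes, win_probability) pairs
    """
    total = sum(ev for ev, _ in states)
    dist = [1.0] + [0.0] * total  # before any state is counted: 0 EV for sure
    for ev, p in states:
        new = [0.0] * (total + 1)
        for k, prob in enumerate(dist):
            if prob == 0.0:
                continue  # skip totals with no probability so far
            new[k] += prob * (1.0 - p)  # state lost: total unchanged
            new[k + ev] += prob * p     # state won: add its EV
        dist = new
    return dist


# Toy example with two states worth 3 and 4 EV.
dist = ev_distribution([(3, 0.5), (4, 0.5)])
print(dist)
```

The work grows with the number of states times the number of possible EV totals (roughly 51 × 539 updates for a real map), rather than with 2^51.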

Those probabilities are then tabulated to come up with a number of statistics, most prominently the 50th-percentile (median expected) outcome and a 95-percent confidence interval. The 95-percent confidence interval is particularly useful because, like the famous Margin of Error (MoE), it gives the range of outcomes that would occur 95 percent of the time based on the available information. Note that the ends of this confidence interval are very similar to the 50th-percentile outcome if support for either Obama or Romney were underestimated or overestimated by 1 percentage point. The confidence interval varies in size somewhat, and when large states (such as OH, WI, FL) are toss-ups the confidence band can be up to twice as large.
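To illustrate the tabulation, the sketch below reads the median and a 95-percent interval off a list of EV probabilities by accumulating them until a target fraction is reached. The helper name and the toy distribution are made up for the example, and the site's scripts may define the interval endpoints slightly differently.

```python
def ev_percentile(dist, q):
    """Return the smallest EV total whose cumulative probability reaches q.

    dist -- list where dist[k] is the probability of exactly k EV
    q    -- target cumulative probability, between 0 and 1
    """
    cumulative = 0.0
    for ev, prob in enumerate(dist):
        cumulative += prob
        if cumulative >= q:
            return ev
    return len(dist) - 1  # guard against rounding error in the total


# Toy distribution over 0..7 EV (illustrative numbers only).
toy = [0.25, 0.0, 0.0, 0.25, 0.25, 0.0, 0.0, 0.25]
median_ev = ev_percentile(toy, 0.5)
interval = (ev_percentile(toy, 0.025), ev_percentile(toy, 0.975))
print(median_ev, interval)
```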

Although this calculation takes into account the variability of polls, it is important to note what it does *not* do. It is integrated over the last three polls (mostly 1-4 weeks), so fast swings do not show up. It does not reject any polls, nor does it account for potential bias or predict future opinion shift in any way. It *is* an unbiased snapshot of what all the polls, taken together, tell us about where the race stands on a particular day.

### POLL SELECTION

Polls are unfiltered and equally weighted, in part because selecting data leads to unintended biases. Therefore even though some polling organizations give demonstrable and consistent outlier results (example), all polls are still included. The effects of outliers are reduced by using the median Obama-Romney margin rather than the average.

However, even when all of the above polls are excluded the result is virtually identical. Thus the method can be tailored but is also robust enough to give a reasonable answer even with no selection of data. For a fuller discussion of my methods, see this DailyKos thread from 2004.

### BIAS

These calculations would be affected if there is an overall poll bias, which can have a large effect in a close race. Bias could happen if polling methods do not accurately sample actual voting patterns. However, little evidence for bias has emerged in the last two Presidential elections. In 2000, Ryan Lizza at The New Republic compiled state polls. On the day before the election, that compilation indicated that the outcome would hinge on Florida. This matches what happened, arguing against major built-in biases in state polls. In 2004, unbiased application of this site’s methods made a correct prediction about the EV outcome.

Various factors may lead opinion poll results to differ from actual voting-booth results. Possibilities include undercounting of cell phone users and reluctance of voters to admit that they would not support a nonwhite candidate. At this time, the evidence suggests that these two factors do not have significant effects. Bias may still occur for other reasons. On Election Eve 2000, polling organizations favored Bush by 2.5%, yet Gore won the popular vote. Possible reasons include last-minute swings of opinion, or the possibility that one side does better or worse at turning out voters than expected.

For this reason, it is useful to ask the question: how much would voter sentiment have to differ from polls to affect the outcome? This leads to the development of the bias calculation and the Popular Meta-Margin.

A key measure of the current closeness of the race is the **Popular Meta-Margin (a.k.a. Swing Index)**. This is the across-the-board percentage shift in opinion (or poll bias) that would be needed to make the Electoral College an exact toss-up. This is analogous to the popular margin in national polls, but is more relevant to what it would take in terms of real electoral mechanisms.
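One way to picture the Meta-Margin is as a root-finding problem: shift every state's margin by the same amount and search for the shift at which neither side is favored. The Python sketch below does this by bisection. It is only an illustration under stated assumptions: it uses normal-distribution state probabilities, treats states as independent, and defines "toss-up" as equal chances of an EV majority for each side. All names and numbers are hypothetical and this is not the site's MATLAB code.

```python
from statistics import NormalDist


def balance(states, shift):
    """P(candidate wins an EV majority) minus P(opponent does), after shifting.

    states -- list of (electoral_votes, median_margin, effective_sem)
    shift  -- points added to the candidate's margin in every state
    """
    total = sum(ev for ev, _, _ in states)
    dist = [1.0] + [0.0] * total
    for ev, margin, sem in states:
        p = NormalDist().cdf((margin + shift) / sem)
        new = [0.0] * (total + 1)
        for k, prob in enumerate(dist):
            if prob == 0.0:
                continue
            new[k] += prob * (1.0 - p)
            new[k + ev] += prob * p
        dist = new
    half = total / 2
    wins = sum(pr for k, pr in enumerate(dist) if k > half)
    losses = sum(pr for k, pr in enumerate(dist) if k < half)
    return wins - losses


def meta_margin(states, lo=-20.0, hi=20.0, iterations=60):
    """Bisect for the uniform shift that makes the race a toss-up.

    Returns the candidate's lead: the size of the swing against the
    candidate that would be needed to erase the advantage.
    """
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if balance(states, mid) > 0:
            hi = mid  # candidate still favored: shift further against
        else:
            lo = mid
    return -(lo + hi) / 2


# Three equal states, each showing a 1-point lead with an SEM of 2 points.
print(round(meta_margin([(3, 1.0, 2.0)] * 3), 3))
```

In this symmetric toy case the answer is simply the common 1-point lead. With a real map, the result depends on which states sit closest to the tipping point.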

There used to be a feature implementing the bias variable to let you see what would happen if polls are biased, for instance if support for Obama is understated due to the Bradley/Wilder effect. This feature is currently inactive, but may be revived at some point.

### PREDICTING THE FUTURE

You can use the bias calculation to estimate where things are headed. If you think turnout efforts will boost your candidate by N points, add that. If you think that one candidate will gain X points at the expense of the other, add 2*X. The map at the right sidebar shows the effect of one candidate gaining 2% (or 1% of voters switching sides). For other amounts of bias, you are welcome to run the MATLAB code yourself.

For instance, if you predict that turnout will increase Obama’s vote by 2 points, but Romney will pick up 1.5% of voters from Obama, then the bias to use is 2 – (1.5 * 2) = -1%, or 1% to Romney.
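The same arithmetic can be written as a small helper. This is just a restatement of the rule above, with a hypothetical function name: a turnout boost counts once, while voters switching sides count twice because one candidate's gain is the other's loss.

```python
def net_bias(turnout_boost, switchers_gained):
    """Net shift in a candidate's margin, in percentage points.

    turnout_boost    -- points added to the candidate by turnout efforts
    switchers_gained -- points of voters gained from the opponent
                        (negative if voters are lost to the opponent)
    """
    return turnout_boost + 2 * switchers_gained


# Turnout adds 2 points for Obama, but 1.5% of voters switch to Romney.
print(net_bias(2.0, -1.5))  # -1.0, i.e. 1% to Romney
```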

### DATA SOURCES

### RESOURCES FOR LEARNING MORE ABOUT STATISTICS

- David Lane’s online statistics textbook
- *Intuitive Biostatistics* by Harvey Motulsky
- *An Introduction to Error Analysis* by John R. Taylor