Princeton Election Consortium

A first draft of electoral history. Since 2004

For fellow geeks

If you want to delve into the Meta-Analysis further, many files you will need are here. It’s best if you know a little about MATLAB programming. The scripts and data files mentioned here can be found in the Geek’s Directory. All of the files in that directory are linked to the live versions currently running the calculations.

We start with data from Pollster.com, where you can find comprehensive polling information on present and past US political races, as well as running commentary. Good information with partisan commentary of several flavors can be found at electoral-vote.com, FiveThirtyEight, and RealClearPolitics.com. Pollster.com has kindly provided us with an API feed which contains all of the polling information publicly accessible via their website. Each morning, at 8:00am, the Unix script nightly.sh runs the process described here. Later in the day, at 10:00am, noon, 3:00pm, 5:00pm, and 8:00pm, a second script, midday.sh, updates the site with any additional polls released during the day.

First, a Python script, update_polls.py, calculates the state-by-state median margin and its standard error of the mean (SEM) for the Democratic vs. Republican candidate. The median is calculated from the last 3 polls, except for DC, which is safely Democratic and has no polling data. All polls with nonoverlapping samples are included, including partisan polls. The results are added to the top of a running data file, polls.median.txt (each line’s fields: number of polls, median date of oldest poll used, median, SEM, datenum; 51 lines are added per day). The states are in alphabetical order by state name, and can also be found in the script statename.m. After the state-by-state summary statistics are written, the MATLAB code is invoked via EV_runner.m.
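As a rough illustration of the per-state summary step, here is a minimal Python sketch that computes a median margin and an SEM from the most recent polls. The function name summarize_state, the input format (margins listed newest first), and the choice of sample standard deviation divided by sqrt(n) as the SEM estimator are all assumptions made for this example; update_polls.py may well do it differently.

```python
import math
import statistics

def summarize_state(margins, n_recent=3):
    """Return (number of polls used, median margin, SEM) for one state.

    margins: Democratic-minus-Republican margins in percentage points,
    ordered newest first (hypothetical input format).
    """
    recent = margins[:n_recent]
    n = len(recent)
    if n == 0:
        raise ValueError("no polls available for this state")
    median = statistics.median(recent)
    # Assumption: SEM estimated as sample standard deviation / sqrt(n).
    # With a single poll the spread is undefined, so report NaN.
    sem = statistics.stdev(recent) / math.sqrt(n) if n > 1 else float("nan")
    return n, median, sem

# Only the three newest margins (4.0, 1.0, 7.0) are used here.
print(summarize_state([4.0, 1.0, 7.0, -2.0]))
```

A state with no polls at all, such as DC in the description above, would need to be handled separately before calling a function like this.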

Median/SEM are converted to probabilities assuming a normal distribution and exported to the Excel-readable stateprobs.csv (fields: probability, median margin, probability assuming +2% for the Democratic candidate, probability assuming +2% for the Republican candidate, state).
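The conversion can be sketched in a few lines of Python. This assumes the margin is Democratic minus Republican, that the reported probability is the chance the true margin is above zero, and that the +2% cases simply shift the median by two points before converting; the function name and the example numbers are made up for illustration.

```python
import math

def win_probability(median, sem, shift=0.0):
    """P(true margin > 0), assuming margin ~ Normal(median + shift, sem)."""
    if sem <= 0:
        raise ValueError("sem must be positive")
    z = (median + shift) / sem
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical state: median margin +3 points, SEM 2 points.
median, sem = 3.0, 2.0
row = (
    win_probability(median, sem),              # probability
    median,                                    # median margin
    win_probability(median, sem, shift=+2.0),  # +2% for the Democratic candidate
    win_probability(median, sem, shift=-2.0),  # +2% for the Republican candidate
)
print(row)
```

The field order in row mirrors the first four columns listed for stateprobs.csv.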

The core Meta-Analysis happens in EV_estimator.m, which in turn calls the kernel EV_median.m. This last file is the core of the entire calculation – dig it. These scripts run several times a day to generate MATLAB data structures and export output to EV_estimates.csv, EV_histogram.csv, and jerseyvotes.csv, with a once-daily update to EV_estimate_history.csv. The fields for the estimate files are: median EV estimates for the Democratic and Republican candidate; mode EV; safe EV; toss-up EV; 1-sigma confidence interval for the Democratic candidate; 95 percent confidence interval; number of polls used; and the Popular Meta-Margin.
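This page does not spell out what EV_median.m does internally, so the following Python is only a sketch of one standard way to turn state win probabilities into an electoral vote distribution: treat each state as an independent win-or-lose event and convolve the outcomes one state at a time. The independence assumption, the function names, and the toy inputs are illustrative and not taken from the MATLAB code.

```python
def ev_distribution(states):
    """states: list of (electoral_votes, win_probability) pairs.

    Returns dist, where dist[k] is the probability of winning exactly
    k electoral votes, assuming states are independent of one another.
    """
    dist = [1.0]  # before any state is counted, 0 EV with certainty
    for ev, p in states:
        new = [0.0] * (len(dist) + ev)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)  # state lost: total unchanged
            new[k + ev] += mass * p     # state won: add its EV
        dist = new
    return dist

def median_ev(dist):
    """Smallest EV total whose cumulative probability reaches 0.5."""
    cumulative = 0.0
    for k, mass in enumerate(dist):
        cumulative += mass
        if cumulative >= 0.5:
            return k
    return len(dist) - 1

# Toy example with three made-up states, not real polling data.
toy = [(3, 0.9), (10, 0.5), (29, 0.2)]
dist = ev_distribution(toy)
print(median_ev(dist))
```

With 51 contests, this approach gives the exact probability of every possible total without enumerating all 2^51 win/loss combinations, and quantities such as the mode or a confidence interval can be read directly off dist.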

After the meta-analysis is completed, the controlling Unix script nightly.sh finishes by preparing the results for the website. Small text files are written which display the current median electoral votes (Python script), the meta-margin, and the jerseyvotes calculation (Python script). These text files are automatically included by our WordPress theme. Additionally, Python scripts exist to draw the current histogram of electoral votes for the Democratic candidate (histogram.py), plot the history of the median electoral vote estimator (history_plot.py), and prepare data for the electoral vote maps (ev_map.py). The maps themselves are generated by a Java program (pollcalc.java), which is called by an auto-generated shell script (ev_map_runner.sh) and prepared for display by ev_map_postprocess.sh.

Election Day predictions are stored in the python folder, in the file prediction.txt.