Presidential prediction 2012 (final… stay tuned)
Now that all the polls are in, it’s possible to perform variance minimization, a simple procedure to identify the range of polls that can be us...
This is a technical explanation corresponding to this post.
(1) Set a Bayesian prior for the Meta-Margin by calculating the average and SD for June-September 2012, using a t-distribution (3 d.f.) to generate the shape. In practice the tails do not matter, but leave them in. Result: Obama +3.26 ± 1.02%. Note that the script below uses a wider, longer-tailed prior: SD 2.2 with 1 d.f.
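(2) Calculate the distribution of forward-going change in the Meta-Margin during June-September 2012 to estimate how far it is likely to diverge by November 6th. With N the number of days until the election, the divergence is approximately d = 0.4*sqrt(N) percentage points for N<=20 days, capped at d = 1.8 for N>20 days. The sqrt(N) dependence indicates random walk-like behavior. Calculate a Gaussian with width parameter d, centered on today’s Meta-Margin. (The script below substitutes a long-tailed t-distribution with 3 d.f.)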
(3) Multiply the distributions in (1) and (2) to get a final predicted distribution of Meta-Margins. From this calculate the mean, 1-sigma (68%), and 2-sigma (95%) confidence intervals. Convert all three to units of EV using 2012 data to interpolate.
(4) For the red zone, plot the 68% confidence interval. Plot as a diverging zone from today’s snapshot.
(5) For the yellow zone, plot the union of the snapshot 95% CI (gray zone today) and 95% predicted CI (step 3 above). Plot as a zone starting from today’s 95% CI.
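In symbols, steps (2) and (3) can be summarized roughly as follows. The symbols are introduced here for illustration and do not appear in the post: m is the Meta-Margin on Election Day, m_0 is today’s Meta-Margin, N is the number of days remaining, F_pred is the cumulative distribution of the prediction, and Φ is the standard normal CDF. The block assumes the amsmath package for the cases environment.

```latex
% Step (2): empirical drift width, in percentage points
d(N) \approx
\begin{cases}
  0.4\sqrt{N} & N \le 20 \\
  1.8         & N > 20
\end{cases}

% Step (3): prediction = drift from today times the long-term prior
p_{\mathrm{pred}}(m) \;\propto\; p_{\mathrm{drift}}(m \mid m_0, d)\; p_{\mathrm{prior}}(m)

% k-sigma band (k = 1, 2): quantiles of the prediction at normal tail levels
F_{\mathrm{pred}}(m_{k,\mathrm{lo}}) = \Phi(-k), \qquad
F_{\mathrm{pred}}(m_{k,\mathrm{hi}}) = \Phi(+k)
```

For example, N = 9 gives d = 0.4*3 = 1.2, while N = 38 hits the cap at d = 1.8. The tail levels are Φ(±1) ≈ 0.159 and 0.841 for the 68% band, and Φ(±2) ≈ 0.023 and 0.977 for the 95% band.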
And here is the MATLAB script.
>>>>>>>>
% First, input parameters (pass MM to it or leave the first line)
%
% Where are we today?
MM=5.06 % today’s Meta-Margin
MMdrift=1.8
N = 38 % days until election
%N=max(N,1) % seat belt
%N=datenum(2012,11,6)-today; % assuming date is set correctly in machine
%MMdrift=min(0.4*sqrt(N),1.8) % random-walk drift as seen empirically
%MMdrift=max(MMdrift,0.2) % just in case something is screwy with date
% cover range of +/-4 sigma
Mrange=[MM-4*MMdrift:0.02:MM+4*MMdrift];
% What is near-term drift starting from conditions now?
now=tpdf((Mrange-MM)/MMdrift,3); % long-tailed distribution. you never know.
now=now/sum(now);
% What was long-term prediction? (the prior)
M2012=3.26; M2012SD=2.2; % parameters of long-term prediction
prior=tpdf((Mrange-M2012)/M2012SD,1); %make it really long-tailed, df=1
prior=prior/sum(prior);
% Combine to make prediction
pred=now.*prior; % All hail Reverend Bayes
pred=pred/sum(pred);
plot(Mrange,now,'-k') % drift from today
hold on
plot(Mrange,prior,'-g') % the prior
plot(Mrange,pred,'-r') % the prediction
grid on
% Define mean and error bands for prediction
predictmean=sum(pred.*Mrange)/sum(pred)
cumulpredict=cumsum(pred); % running total of the normalized prediction
Msig1lo=Mrange(min(find(cumulpredict>normcdf(-1,0,1))))
Msig1hi=Mrange(min(find(cumulpredict>normcdf(+1,0,1))))
Msig2lo=Mrange(min(find(cumulpredict>normcdf(-2,0,1))))
Msig2hi=Mrange(min(find(cumulpredict>normcdf(+2,0,1))))
% Now convert to EV using data from mid-August and some added points at the
% ends. If the race swings far, these endpoints need to be re-evaluated.
mmf=[-1.48 -.74 0 .74 1.4800 1.8125 2.1383 2.5667 3.3200 3.7400 4.2000 4.6600 5.1050 6 7 8 9 10 11 12];
evf=[247 258 269 280 290 299.25 304.1667 310.0000 321.6667 328 343 347 347 347 347 347 347 358 369 383];
bands = interp1(mmf,evf,[predictmean Msig1lo Msig1hi Msig2lo Msig2hi],'spline');
bands = round(bands)
ev_prediction = bands(1);
ev_1sig_low = bands(2);
ev_1sig_hi = bands(3);
ev_2sig_lo = bands(4);
ev_2sig_hi = bands(5);
bayesian_winprob=sum(pred(find(Mrange>=0)))/sum(pred)
drift_winprob=tcdf(MM/MMdrift,3)
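
The script stops at the band values and does not show the zone plotting of steps (4) and (5). Here is a minimal sketch of one way to turn band values into zone outlines. Every number below is a made-up placeholder, not output from the script, and the straight-line edges from today to Election Day are an assumption, since the post does not say how the zones are shaped in between.

```matlab
% Hypothetical inputs in EV (placeholders only; not results from the post)
snapEV = 332;        % today's snapshot EV estimate
snap95 = [300 340];  % today's snapshot 95% CI (the gray zone)
pred68 = [305 335];  % predicted 68% CI, e.g. [ev_1sig_low ev_1sig_hi]
pred95 = [290 347];  % predicted 95% CI, e.g. [ev_2sig_lo ev_2sig_hi]
N = 38;              % days until the election

% Step (5): yellow zone on Election Day is the envelope of both 95% intervals
yellowEnd = [min(snap95(1),pred95(1)) max(snap95(2),pred95(2))];

% Assumption: zone edges run in straight lines from today to Election Day
t = 0:N;             % days from today
f = t/N;             % 0 today, 1 on Election Day

% Step (4): red zone diverges from today's snapshot to the predicted 68% CI
redLo = snapEV + f*(pred68(1)-snapEV);
redHi = snapEV + f*(pred68(2)-snapEV);

% Step (5): yellow zone starts at today's 95% CI and widens to the envelope
yelLo = snap95(1) + f*(yellowEnd(1)-snap95(1));
yelHi = snap95(2) + f*(yellowEnd(2)-snap95(2));
```

To draw the zones, pass each pair of edges to fill, for example fill([t fliplr(t)],[yelLo fliplr(yelHi)],'y'), plotting the yellow zone first so that the red zone sits on top.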