Although national polls have shown steady movement in Obama’s direction, this weekend the Meta-Analysis made a clear move toward McCain. What’s going on?

Look at the current-polling probability map. There are two surprises: Pennsylvania (21 EV) is tied and Minnesota (10 EV) is very close. This is the closest that either state has been in the post-primary season. Before the conventions, these states appeared to be headed for Obama wins. Now the picture is less clear.

Perhaps this is part of the same bump that McCain received in frontier states such as Idaho, Montana, and the Dakotas. Another possibility is that one or more of the particular polling organizations – Big Ten, Fox/Rasmussen, the Star-Tribune – have some bias. As polls become more frequent, we will adopt a rule that incorporates more than the last three polls, which is all we use now.

In any event, current state polls were mostly taken before last week’s financial events, so look for further changes this week.

You may have noticed that the Meta-Analysis is showing a fairly different result from FiveThirtyEight. The basic reason is that FiveThirtyEight attempts a prediction, whereas the Meta-Analysis is a pure snapshot of state polls. Major additional steps by FiveThirtyEight are (a) a correction to state polls to get them caught up, assuming they track national polls in a certain way; and (b) a projection to November 4th. There are other differences as well, but these two contribute a large amount to the difference at any one moment.

*Update: The median-of-three approach rejects single outliers but does not work well when there are two outliers in the same direction. Not to point fingers, but the Big Ten polls seem somewhat fishy, and might add outliers in many of the eight states they cover. A reasonable solution would build on using more than three polls; for instance, all polls completed in the last seven days. I did this during the home stretch in 2004 with good results. Another attractive possibility is age-dependent weighting, but it’s not clear how to combine that with median-based statistics. I’m open to suggestions.*
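
To make the outlier point concrete, here is a minimal sketch in Python. The margins are made-up numbers (Obama minus McCain, in percentage points), not real polls, and `last_three_median` is a name chosen for illustration rather than the actual Meta-Analysis code.

```python
from statistics import median


def last_three_median(margins):
    """Median of the three most recent margins (list is ordered oldest first)."""
    return median(margins[-3:])


# Hypothetical margins for one state, oldest poll first.
no_outlier = [6, 5, 7]
one_outlier = [6, 5, -2]    # a single low poll is ignored by the median
two_outliers = [6, -1, -2]  # two low polls in the same direction take over

for label, margins in [("no outlier", no_outlier),
                       ("one outlier", one_outlier),
                       ("two outliers", two_outliers)]:
    print(label, last_three_median(margins))
```

With one stray poll the estimate barely moves (6 to 5), but with two stray polls in the same direction the median lands on one of the outliers.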

Jon Schiller // Oct 2, 2008 at 11:49 am

I learned about your methodology reading the daily Caltech email. I believe it has great predictive qualities. I would like to use your methodology to predict stock index movements. I use a 3-layer, 5-input, 1-output neural network for determining OEX trends. I learned about neural networks at last year’s Caltech Seminar lectures. I would like to use your predictive methodology for my Options Trading Software.

Sam Wang // Sep 24, 2008 at 7:31 am

James, note that in this case I was correct that some extreme polls had gotten in. The unexpected dive in the EV estimator reversed once fresh polls came in. Noticing problems like this is how one refines a model. The trick is, I agree, to avoid unintentional bias.

Peter Formaini, some states poll less often. Generally these are smaller states, or states in which the outcome is not in doubt. In those cases I think old polls are adequate for purposes of estimation.

Peter Formaini // Sep 24, 2008 at 1:36 am

Sam:

thanks for the feedback.

I understand the reluctance to use the ‘MEAN POLLS’ stat – but in my mind, using a ‘LAST THREE POLLS’ is equally flawed (since it assumes a common timeframe).

Why not use the previous THREE polls ONLY IF THEY FALL WITHIN a 14-day PERIOD – and discard the oldest poll if they do not? In other words, 3 or more polls are only averaged if they are recent. That way 2 or 3 month old polls from states that do not poll as often get tossed out on a timely basis and only the most recent info (even if there is less of it) is used.

Thanks for your hard work!

James // Sep 23, 2008 at 6:18 pm

Maintaining a previously determined strategy, even in the face of unexpected behavior, enhances credibility and I would not lightly tinker with your algorithms just to make things “look right” or “make sense”.

On the other hand, there is no reason that additional graphs could not be generated over the same entire history, using different methodologies. Associating the differences in the graphs with differences in the methodologies could be illuminating.

William // Sep 23, 2008 at 1:05 pm

Considering that your results have just bounced back to where they were previously, the “two waves of outliers” explanation seems likely. Perhaps increase to a median-of-five, as suggested?

Sam Wang // Sep 23, 2008 at 12:12 pm

These are useful comments. Thanks to all.

There is not yet a definitive reason to suppose that anything is “wrong” with the state polls, especially since state polls perform so well historically. In many cases, the polling organizations are the same ones that do the national surveys. There is a potential exception, which I’ll get to.

Some suggestions carry disadvantages. The mean change rule assumes uniformity, which at this point we know is false. Time-weighted medians are possible and well justified, but remove transparency. For now I’ll avoid those. The “last three polls or last seven days, whichever gives more data” rule was my final rule in 2004, and I am most inclined to follow it. The tradeoff is a loss of time resolution, which is a shame given the upcoming density of debates. Of course I’ll post an announcement if and when I adopt the change.
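
The “last three polls or last seven days, whichever gives more data” rule might be sketched as follows. The function names, the (end date, margin) tuple layout, and the sample polls are all assumptions made for illustration; the actual Meta-Analysis code may differ, for instance in how it counts the seven-day window.

```python
from datetime import date, timedelta
from statistics import median


def select_polls(polls, today, window_days=7, min_polls=3):
    """Pick the polls to use for one state.

    polls: list of (end_date, margin) tuples, in any order.
    Returns every poll that ended within the window if that gives more
    than min_polls polls; otherwise the min_polls most recent polls,
    however old they are.
    """
    ordered = sorted(polls, key=lambda p: p[0], reverse=True)
    cutoff = today - timedelta(days=window_days)
    recent = [p for p in ordered if p[0] >= cutoff]
    return recent if len(recent) > min_polls else ordered[:min_polls]


def state_margin(polls, today):
    """Median margin over the selected polls."""
    return median(m for _, m in select_polls(polls, today))


today = date(2008, 9, 22)

# Hypothetical heavily polled state: four polls in the last week.
busy_state = [(date(2008, 9, 21), 1), (date(2008, 9, 20), 0),
              (date(2008, 9, 18), 8), (date(2008, 9, 16), 2),
              (date(2008, 9, 1), 14)]

# Hypothetical rarely polled state: nothing in the last week.
sparse_state = [(date(2008, 9, 10), 5), (date(2008, 8, 25), 9),
                (date(2008, 8, 1), 12), (date(2008, 7, 15), 10)]

print(state_margin(busy_state, today))    # median of the 4 recent polls
print(state_margin(sparse_state, today))  # median of the 3 newest polls
```

The busy state uses all four polls from the past week and drops the September 1 poll; the sparse state falls back to its three newest polls regardless of age.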

After going over the data and trying different patterns, I find that the current result is fairly robust. The divergence from national polls is curious. One possibility is that two waves of outlier polls hit at the same time. A median of three wouldn’t resist such an event very well.

Another possibility is a systematic difference between state and national surveys. Mark Blumenthal and Nate Silver have recently presented evidence that when national polls use cell phone samples, the results shift a few points toward Obama. This might call into question the accuracy of state surveys. However, it does not explain why there was no apparent discrepancy before the conventions. So I don’t favor this explanation.

David // Sep 23, 2008 at 1:23 am

Sam, please don’t tinker. Especially if you’re not going to rerun the entire meta-analysis using the new methodology for the last 6 months so the graph is completely updated. And if you did do that the whole graph would change and I’m not sure I could trust this anymore. You’ve been the best advocate of keeping things very bare-bones.

MAYBE…and I mean only when you’ve got a lot of polls, you can expand from 3-poll median to 5-poll median if all polls have come in the last two weeks for that state. Be very careful.

BirdLives // Sep 22, 2008 at 9:44 pm

I understand your conservative approach, but if Obama is really up by 3-4 points nationally, there’s no way the race is that close state by state. Something’s wrong, and it’s unlikely to be the national polls, given what’s included in the poll of poll results. The state polls, less frequent and with smaller samples, are likely not picking up the change of sentiment associated with the Wall Street meltdown.

F // Sep 22, 2008 at 8:12 pm

Keep it simple,

FiveThirtyEight does its thing—trying to predict the future from this year’s polls– in an intelligent and entertaining manner. You do something different, and you did it exceedingly well in 2004. You use the data to say what will happen if the election were held TODAY.

Use your 2004 election analysis as your guide. When you let the data alone, you were right on. When you tried to extrapolate what the undecided would do…well you were not right on.

Suggested Methodology (it is just repeating what you already seem to do): Use the median of the last week’s state polls (an empirical decision: in 2004, the results of the last week accurately predicted the election results). If you do not have three, go back as far as you have to, to get three polls. If there is a poll that is constantly off, then never use it (e.g., Zogby Interactive).

My field, clinical psychology, spent too many years doing “theoretically” based therapy, instead of doing empirical clinical studies as its guide. When there is data, USE the data.

Rachel Findley // Sep 22, 2008 at 8:00 pm

It’s hard not to fiddle with it, isn’t it? Especially when it’s heading off in a direction we don’t like. I feel cautious about the idea of tinkering, but it is possible to calculate a weighted median with age-dependent weights, if you want to.

Here’s a definition of median:

The median is that number which puts at least half the total count of data values at that number or below, and at least half the total count of data values at that number or above; if more than one such number exists, there will be an entire interval of such and the median is the midpoint of that interval.

Now, you want to make the more recent data values (polls) count more heavily than the older ones. So why not discount the polls by some weight that starts at 1 and declines over time? If you discount each day by 20%, you’d weight today’s polls at 1, day-old polls at .8, two-day-old polls at .64=.8x.8, three-day-old polls at .512=.8x.8x.8, etc. If you discount each day by ten percent, a week-old poll would be weighted about half today’s poll.
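
As a quick check of that arithmetic, here is a small Python sketch. The function name `age_weight` and its parameter names are mine, chosen only for illustration.

```python
def age_weight(age_days, daily_discount=0.20):
    """Weight for a poll that is age_days old, discounted by a fixed
    fraction per day (1.0 for a poll taken today)."""
    return (1.0 - daily_discount) ** age_days


# 20% per day: 1, 0.8, 0.64, 0.512, ...
for age in range(4):
    print(age, round(age_weight(age), 3))

# 10% per day: a week-old poll counts a little under half.
print(round(age_weight(7, daily_discount=0.10), 3))
```

At ten percent per day the week-old weight is 0.9 to the seventh power, roughly 0.48, which matches the “about half” figure.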

Then define the weighted median as:

that number which puts at least half the total weight of the data values at that number or below, and at least half the total weight of the data values at that number or above; if more than one such number exists, there will be an entire interval of such and the median is the weighted average of that interval.

See https://stat.ethz.ch/pipermail/r-help/2002-February/018614.html for Henrik Bengtsson’s code to calculate a weighted median.

Operationally, you calculate the weights based on age, then add up all the weights and divide by two, to get half the total weight. Then sort the data values, start at one end, and add the weights until you reach half the total weight. Do the same thing from the other end. If you reach exactly half the total at the same data point either way, that’s the time-weighted median. If not, take the weighted average of the data points that have at least half the total weight on each side.

This does allow you to gradually “forget” old polls instead of choosing between keeping them or dropping them completely. But if you get a bunch of similarly biased polls on the same day, they will still pull the results for that day in the direction of the bias. You could, with sufficient statistical evidence, consider weighting the lightweight polls more lightly right from day one.

To get the plain old median, just set all the weights at 1. Adding the weights then is the same as just counting the data points, and the original definition of median is satisfied.
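
Here is one way the procedure might look in Python. This is a sketch of the definition given in this comment, not the R code linked above. In particular, reading “the weighted average of that interval” as the average of the interval’s two endpoints, weighted by the weight sitting on each, is my assumption.

```python
def weighted_median(values, weights):
    """Weighted median: a value with at least half the total weight at or
    below it and at least half at or above it."""
    if len(values) != len(weights) or not values:
        raise ValueError("need equally many values and weights, at least one")

    # Pool the weight of repeated values so each value appears once.
    totals = {}
    for v, w in zip(values, weights):
        totals[v] = totals.get(v, 0.0) + w
    pairs = sorted(totals.items())
    half = sum(totals.values()) / 2.0

    # Walk up from the bottom until half the weight is reached.
    running = 0.0
    for v, w in pairs:
        running += w
        if running >= half:
            low, w_low = v, w
            break

    # Walk down from the top until half the weight is reached.
    running = 0.0
    for v, w in reversed(pairs):
        running += w
        if running >= half:
            high, w_high = v, w
            break

    if low == high:
        return low
    # Both walks hit exactly half at different points: average the two
    # endpoints, weighted by their own weights (an assumed interpretation).
    return (low * w_low + high * w_high) / (w_low + w_high)


print(weighted_median([1, 2, 3], [1, 1, 1]))        # plain median, odd count
print(weighted_median([1, 2, 3, 4], [1, 1, 1, 1]))  # plain median, even count
print(weighted_median([0, 10], [3, 1]))             # heavy poll dominates
```

With all weights equal to 1 the function returns the ordinary median, including the midpoint for an even number of points.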

Some economists have used a weighted median for consumer price index (CPI). The Federal Reserve Bank of Cleveland calculates a weighted median CPI as an alternative to simply excluding highly volatile components of the index. They use expenditures to weight the components. “The weighted median is measured as the central point, as implied by the CPI expenditure weights, in the cross-sectional histogram of inflation each month.”

http://www.clevelandfed.org/research/review/1994/1994-q1.pdf

“The weighted median CPI is easy to calculate and has a higher correlation with past money growth than other inflation measures, resulting in improved forecasts of future inflation.” http://www.clevelandfed.org/research/data/index.cfm

Peter Formaini // Sep 22, 2008 at 7:44 pm

Sam:

Why not take the MEAN average change in ALL the polls and then apply this to ALL the polls, obtaining a MEDIAN change up or down – a tactic that would ‘smooth out’ any really large and unexpected results in one state by ‘spreading it’ out over all states reporting that day (or in that time period)?

Vijay // Sep 22, 2008 at 6:31 pm

Hi Sam,

I don’t understand your comments about the Big Ten Poll. Specifically, you say “the Big Ten polls seem somewhat fishy, and might add outliers in many of the ten (of course) states they cover.”

First of all, I don’t think you are right in assuming that there are ten states covered by the Big Ten Poll; there are in fact only eight states (Ohio, Michigan, Pennsylvania, Indiana, Wisconsin, Iowa, Illinois and Minnesota).

The more important question is what you mean by “fishy.” In both Minnesota and Pennsylvania, the Big Ten Poll seems to be within the margin of error of the other contemporary polls; Obama’s once commanding double digit lead in Minnesota has now been reduced almost to nothing, but this is what Star and Tribune reports too. (Not Fox/Rasmussen, according to which Obama still has an 8-point lead.)

Certainly, including more polls may make the computation of probabilities more robust, but I doubt you are going to see something substantially different with these two states.

I think you should avoid the tendency to fix things when the poll results don’t fit your model of what should be happening or, even worse, what you prefer to be happening. I wonder if you would have thought the Big Ten poll fishy if Obama’s expected EVs hadn’t taken a nose dive.

Cheers,

– Vijay

Matt // Sep 22, 2008 at 5:09 pm

I think incorporating the last seven days is a good idea. Maybe minimum of three (no matter how old) + maximum of seven days’ worth (no matter how many) would work.

I think Nate does a better job of incorporating recent developments. He has usually been pretty accurate in anticipating the state polls. Right now we’ve had three big swings in a short period of time with relatively sparse state data, so in that sense your model is having a hard time keeping up. You are right though, that attempting to predict is pretty meaningless. The snapshot in time is a much more reasonable approach because of how rapidly the polling can shift two points in either direction depending on events. It is near impossible to predict these shifts, apart from conventions.

In the end, when the state polling comes in rapid fire, and barring a last minute break, I think you will both arrive at rather accurate predictions on election eve.

Mark S. // Sep 22, 2008 at 3:53 pm

FiveThirtyEight currently estimates both Minnesota and Pennsylvania as a healthy +6% for Obama. I think the difference occurs largely because FiveThirtyEight includes older polls, though with a reduced weight based on age, rather than just the three most recent. So (for example) their estimate includes polls with Obama +14 in Minnesota (Sept 1) and +9 in Pennsylvania (August 25) along with the more recent +1 and +0.

As you already mentioned, FiveThirtyEight also adjusts for trends so that polls are “caught up” to the present. Without the trend adjustment, they would rate Pennsylvania and Minnesota as +3% and +4% for Obama rather than +6%.

Why was there a sudden jump-down in the EV estimate here? My guess is that the latest polls for MN and PA (and possibly elsewhere) bumped older polls with higher Obama scores off the 3-poll average.