Princeton Election Consortium

Innovations in democracy since 2004

September 9th, 2012, 3:00pm by Sam Wang

Yesterday, Gallup reported a big jump in their three-day rolling average of President Obama’s approval rating. Can we figure out what day it happened? Yes, and it shows how a single speech can move national opinion, even if only briefly.

Here is Gallup’s graph.

The last few individual data points plotted above (downloadable) look like this.

 Dates        Approve  Disappr
 08/26-28     43       47
 08/27-29     44       47
 08/28-30     45       46
 08/29-31     45       46
 08/30-09/1   43       48
 08/31-09/2   45       48
 09/1-3       45       48
 09/2-4       47       47
 09/3-5       49       45
 09/4-6       52       43
 09/5-7       52       42

The final rows (in boldface in the original table) show the post-DNC bump. This is quite notable: the last time the approval number went as high as 49% was June 9-11. This suggests that something happened to drive the numbers up suddenly on September 4th, the night of Michelle Obama’s speech.

Is this even possible? Michelle Obama’s ratings were through the roof. Nielsen estimated the viewership at about 50 million people, outstripping the entire RNC convention. Her speech went viral in China, providing independent verification of her broad appeal. Could it be that a significant fraction of US viewers improved their opinion of Barack Obama after hearing her?

Gallup does not release its single-day numbers. Mathematically, it is impossible to extract a unique set of one-day numbers from a rolling average: N three-day averages depend on N+2 daily values, so two of the unknowns are left free. However, it is possible to surmise the likeliest set of numbers by adding one assumption.

Opinion is unlikely to fluctuate massively from day to day. We can calculate the set of one-day values that fluctuates the least from day to day. This assumption is easily implemented using an algorithm for variance minimization (MATLAB code here). Here is the result:

 Date               Approve  Disappr
 9/1                40.4     51.6
 9/2                49.8     47.8
 9/3                44.8     44.6
 9/4                46.4     48.6
 9/5 (Michelle+1)   55.8     41.8
 9/6 (Clinton+1)    53.8     38.6
 9/7 (Obama+1)      46.4     45.6
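For readers who want to try the unrolling themselves, here is a minimal Python sketch of one way to do it. It is not the MATLAB code linked above, and it makes its own assumption about what "fluctuates the least" means: it picks the daily values that reproduce every 3-day average exactly while minimizing the sum of squared day-to-day changes. The function names (`build_series`, `unroll`, `rolling_mean`) are hypothetical, chosen for illustration.

```python
def build_series(first_two, avgs):
    """One-day values implied by the first two days and the 3-day averages."""
    x = [float(first_two[0]), float(first_two[1])]
    x.append(3 * avgs[0] - x[0] - x[1])  # the first window fixes day 3
    for i in range(1, len(avgs)):
        # Consecutive windows differ only by the day dropped and the day added.
        x.append(x[i - 1] + 3 * (avgs[i] - avgs[i - 1]))
    return x


def diffs(x):
    return [b - a for a, b in zip(x, x[1:])]


def dot(a, b):
    return sum(s * t for s, t in zip(a, b))


def unroll(avgs):
    """Choose the two free days so that squared daily changes are smallest.

    Assumption (not from the post): smoothness is measured as the sum of
    squared day-to-day differences.
    """
    zeros = [0.0] * len(avgs)
    c = build_series((0, 0), avgs)   # particular solution with days 1-2 set to 0
    u = build_series((1, 0), zeros)  # effect of raising day 1 by one point
    v = build_series((0, 1), zeros)  # effect of raising day 2 by one point
    dc, du, dv = diffs(c), diffs(u), diffs(v)
    # 2x2 normal equations for minimizing |dc + p0*du + p1*dv|^2
    a11, a12, a22 = dot(du, du), dot(du, dv), dot(dv, dv)
    r1, r2 = -dot(du, dc), -dot(dv, dc)
    det = a11 * a22 - a12 * a12
    p0 = (r1 * a22 - r2 * a12) / det
    p1 = (a11 * r2 - a12 * r1) / det
    return build_series((p0, p1), avgs)


def rolling_mean(x):
    return [(x[i] + x[i + 1] + x[i + 2]) / 3 for i in range(len(x) - 2)]


approve_avgs = [45, 47, 49, 52, 52]  # Gallup 3-day averages, 9/1-3 through 9/5-7
daily = unroll(approve_avgs)
print([round(d, 1) for d in daily])
print([round(m, 1) for m in rolling_mean(daily)])
```

On these five averages the sketch returns 44.5, 45.5, 45.0, 50.5, 51.5, 54.0, 50.5. Those values reproduce the published averages but differ from the table above, a reminder that the answer depends on the smoothness criterion chosen and on how many days of data are fed in.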

From Sept. 4th to Sept. 5th-6th, the net approve-minus-disapprove margin swung by a total of 16-17 points. A swing of that size corresponds to about 8% of respondents changing sides. The United States has approximately 240 million citizens of voting age, so this means that a net 20 million people were flipped during that period. Evidently Michelle Obama was extremely persuasive, and maybe Bill Clinton too. But judging from the swing back on Sept. 7th, President Obama could not quite sustain the impact of the first two speakers.
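The arithmetic behind the 20 million figure can be checked in a few lines. This is a rough sketch under a simplifying assumption that is mine, not Gallup's: every person who moves goes straight from "disapprove" to "approve", so the share of people switching is half the swing in the net margin. The variable names are hypothetical.

```python
VOTING_AGE_CITIZENS = 240e6  # approximate figure quoted in the post

# Unrolled one-day (approve, disapprove) values from the table above
sept4 = (46.4, 48.6)
later = {"9/5": (55.8, 41.8), "9/6": (53.8, 38.6)}


def net_margin(approve, disapprove):
    return approve - disapprove


flipped = {}
for day, (approve, disapprove) in later.items():
    swing = net_margin(approve, disapprove) - net_margin(*sept4)
    # Assumption: one person switching sides moves the margin by two
    # points' worth, so the share switching is half the swing.
    flipped[day] = swing / 2 / 100 * VOTING_AGE_CITIZENS
    print(day, round(swing, 1), "point swing,",
          round(flipped[day] / 1e6, 1), "million people")
```

Both days come out near 20 million, in line with the estimate in the text.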

There’s a chance that the “unrolling” process did not get things quite right, and Obama’s boost was actually sustained. If the Gallup 3-day average approval comes down to 49% in today’s release, we’ll know that the bump was a short-lived one. We will find out soon.

One imagines that the Obama campaign is planning to deploy the First Lady’s speech in more markets.

>>>

Update, Sunday September 9th, 3:35pm: Here’s today’s data, unrolled. Looks like there’s still some elevation in job approval. So the bounce continues.

Tags: 2012 Election · President

• ChrisD

Gallup’s 9/4-9/6 favorability poll is up now on RealClearPolitics. It’s 50/46.

• ChrisD – Thanks. I’m on it. I was sitting on this post for about 16 hours, which is too bad. Anyway, see the update.

• Carlos Brody

Nice analysis!

Maybe you’ve posted about this before, but I’m curious about why you didn’t do a comparable analysis for the Rasmussen election poll, which is also a 3-day average. Doing both Gallup and Rasmussen might reduce the overall margin of error? On 1-day data (i.e., unrolled data), the margin of error must be pretty big?

• wheelers cat

Carlos, including Rasmussen never decreases error, it increases it.
Rasmussen is a consistent outlier.

• @Carlos Brody

You beat me to it! Today’s Rasmussen approval figure is 52% for Obama – a number that has eluded him since January 2011. It is interesting to see the OTHER Obama having this kind of impact.

We can speculate for the 2016 Democratic ticket: Clinton-Obama. Of course, that will mean Hillary Clinton and Michelle Obama!

Tapen

• Carlos Brody

@wheelers cat: we just want to look at the *changes* over time. It doesn’t matter if Rasmussen is consistently off by -5% or by +5%. The changes over time are what define a “bounce.” Those are informative independently of any bias in the mean.

• Carlos Brody

@Tapen Sinha: I like! HC and MO for 2016!

• @wheelers cat

Rasmussen itself may be biased but if you think about their time series data [X(t)], the level may be wrong but the differenced series will still have value. In other words, [X(t)-X(t-1)] would still be useful. And that is precisely the kind of thing that Sam is examining (e.g., did Akin, Michelle Obama or some other discrete event produce any CHANGE in the time series). So, Carlos’s point is still very valid.

Tapen

• Carlos – In principle that is a good idea. However, in the past I have had trouble with Rasmussen data. Its day-to-day variability is substantially lower than expected from sampling error. Therefore there is some odd normalization, for instance by party ID, that confounds the calculation. So…can’t do it, sorry.

• @Carlos Brody

Damn! You beat me to it again!

Tapen

• Carlos Brody

@Sam– thanks for the response, very interesting! Indeed, if a pollster’s margin of error doesn’t match their day-to-day variability that *has* to mean something is fishy.

And to follow up on @wheelers cat’s comment, the “fishiness” in that case is more than a simple bias in the mean. Which is maybe what @wheelers cat meant all along :)

• An issue is that party ID is not a fixed character, but covaries with candidate preference. If one reweights to get a fixed fraction party ID, low variability results.

• @Sam@Carlos

There goes my speculation! Rasmussen is using some sort of exponential smoothing of the Holt-Winters kind to tamp down the variability. Of course, they do not tell us that. But that seems to me the only rational explanation.

Tapen

• On the party ID question, Gallup does give us that number
http://www.gallup.com/poll/15370/party-affiliation.aspx
If it does stay fixed (perhaps with Rasmussen), it will give us artificially low variability.

On another matter, one of my old students (I am not gonna say which country he/she is from) who is now doing a lot of internal polling of the Republicans for a polling firm tells me that the Republican numbers this week are looking really dismal in Ohio, Virginia and in particular in Wisconsin. Romney’s goose is well and truly cooked.

Tapen

• baw1064

Is minimizing day-to-day variance really the best approach to attempt to deconvolute the three day average? (Realizing that there are N data points and N+2 unknowns, so as you point out, there’s not a unique solution). Since the smoothing function covers three days, a bad guess for the two unconstrained parameters is likely to show up as an artificial periodicity of three days in the reconstructed data. In this case it’s unfortunate that the length of the conventions was also three days. By eye, the reconstructed data does look like it has some three day periodicity.

It may be useful to look at the Fourier transform of the reconstructed data to make sure there isn’t a peak corresponding to a three day periodicity. And also, to delete the last data point, and see how the best fit of the remaining data points varies–hopefully, not much.
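The concern about three-day periodicity can be made concrete. A 3-day average cannot see any pattern that repeats every three days and sums to zero over each window, so the unrolling is undetermined along exactly that direction. Below is a minimal sketch, not code from the post or from any commenter: `ripple` is an arbitrary illustrative pattern and `amplitude_at_period` is a hypothetical helper that gives only a rough measure on a series this short.

```python
import cmath


def rolling_mean(x):
    return [(x[i] + x[i + 1] + x[i + 2]) / 3 for i in range(len(x) - 2)]


def amplitude_at_period(x, period):
    """Rough size of the Fourier component of x at the given period."""
    mean = sum(x) / len(x)
    coeff = sum((v - mean) * cmath.exp(-2j * cmath.pi * t / period)
                for t, v in enumerate(x))
    return 2 * abs(coeff) / len(x)


approve = [40.4, 49.8, 44.8, 46.4, 55.8, 53.8, 46.4]  # unrolled values from the post
ripple = [1, 0, -1, 1, 0, -1, 1]  # repeats every 3 days, sums to 0 in every window
shifted = [a + r for a, r in zip(approve, ripple)]

# Both series produce the same 3-day averages ...
print([round(m, 2) for m in rolling_mean(approve)])
print([round(m, 2) for m in rolling_mean(shifted)])
# ... but their period-3 content differs.
print(amplitude_at_period(approve, 3), amplitude_at_period(shifted, 3))
```

Because the published averages cannot distinguish `approve` from `shifted`, only the added smoothness assumption decides how much period-3 ripple ends up in the reconstruction.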

• baw1064 – good points. If you use too many days of data, bad solutions emerge. One could imagine a second, FFT-based criterion. The strike-one-day approach you suggest is effectively done in the update, which contains one more day of data. Looks ok.

• wheelers cat

@Carlos, Tapen
sry, but it will be Castro vs Rubio in 2016.
the Battle for the Browns.
the trend is for presidential candidates to be in their forties.

re Rasmussen: he smells bad to me. he has no method transparency. Blumenthal and Silver have both written pieces on suspected data manipulation by Scott Rasmussen.
It is true that all information is valuable, and Dr. Wang uses robust statistics to extract the data.

This is simply a brilliant piece. I’m awestruck.
/salutes Dr. Wang with respect

• wheelers cat

oh goof.
I meant to say Rasmussen is an INconsistent outlier.

• dave.james

@wheelers cat
Seems as if it works equally well with and without the “IN”.

• timbaobsessed

My guess, for which I have no data, is that Rasmussen gets its right-leaning results by selecting a sample with a larger percentage of Republicans. If, as I also guess, Republicans are very unlikely to be anywhere remotely close to persuadable, wouldn’t that make the running total less volatile?

• baw1064

Yes, the derived values in the update are pretty much identical, so that’s a good sign!

• Sorry, but these are national polls. Mathematically it is possible for, say, NY, MA, CA… to have gone from 51% to 95% which would move these polls but not change the EV count at all.

Nice deconvolution of the running average polls, but so what? This could all be happening in the low “jerseyvotes” states (including our fair state of NJ).

So how much do the national polls affect, say, OH? Sam could calculate correlation coefficients, no?

The usual way to display these would be a matrix, where the off-diagonal terms tell you how state_i affects state_j. Then if you assume the Gallup national polls are truly random, you make a vector of weights by state population and away you go.

I looked at the stateprobs.csv file. It’s rather coarse in percentile, probably to get adequate statistics per bin, but never mind, it’s got all you need. If you had a time series of this data (or the polls_median.txt) you could extract the correlations.

All you need is a bright undergrad who took linear algebra.

And IF Ohio has a large correlation with national polls then we exhale.

• @Amitabh Lath

Your point is quite valid in that it might be all NY, MA, CA effect. But we ARE getting state poll results that indicate substantial movements. Since you mentioned Ohio, PPP just came out with their results. They note: “PPP’s first post-conventions poll in Ohio finds Barack Obama with a 5 point lead over Mitt Romney, 50-45. This is the largest lead PPP has found for Obama in an Ohio poll since early May. Last month Obama led 48-45.” I have some private polling data from Virginia today, it shows a similar movement.

@wheelers cat
About the age of being the POTUS, how old was St Ron when he became the prez? Hint: He was playing second to the chimp in Bedtime for Bonzo when he was 40. (Once again apologies for off topic – I will not do it again).

Tapen

• timbaobsessed

I’d like to see a study about the volatility of states like OK and NY – states that are usually at least 60% rep. or dem. When Gallup’s national number jumps by 2%, to what extent do these lopsided states jump? Also, Reagan seemed well on his way to Alzheimer’s when he was gov. of CA.

• That connects to my probably-wrong speculations from the other day, concerning the supposed enthusiasm gap. It’d make it much easier to interpret national polls if we had better data for what’s going on in the “safe” states, not just the battleground states, because it’s possible that the national motion in opinion is dominated by phenomena in safe states. (It’d also make it easier to see if any safe states are actually coming into play.)

• wheelers cat

I guess we shall see with this week’s polling, but is it possible that this is not a bounce, but a plateau?
Has that happened in the past?

• Sam, a question on techniques:
Why does the stateprobs.csv file have such large quantization? It looks like most states fall into either the 0-bin or the 100-bin, with a few scattered states with 2, 5, at the low end, 78, 88, 93 at the high end.

Does MATLAB require all bins to be non-zero? Or do the numbers really come out like that? If the former, then maybe one could think about an unbinned procedure.

• Amitabh – my worst fear, being graded…you are asking about the oldest code. I think it’s because it uses a median of 3 polls/1 week (which is almost always an integer) and an estimated SEM. There is a floor on the SEM set in case polls are not variable. Non-swing states might pose a problem because they are so rarely polled. Let me look into this…

• The Republican 2008 bounce lasted for a couple of weeks, and may have been cut short by Lehman Brothers melting down. (To be fair, I don’t think it’s really possible to separate the 2008 convention bounce and the Sarah Palin VP nomination bounce; judging with hindsight from events in 2012, it may have been mostly the latter. Palin was, herself, the big story at the convention.)

At any rate, the debates start on October 3 and go on basically weekly through the month, and they’re likely to overwrite whatever lasting effect the conventions have.