# Princeton Election Consortium

### A first draft of electoral history. Since 2004


## Guest expert: Ed Freeland on wireless phone sampling

#### September 17th, 2012, 9:00am by Sam Wang

I am joined today by guest Ed Freeland, director of the Princeton Survey Research Center. Ed offers a survey expert’s view of wireless phone sampling, and how problems are addressed. Later today he will pop by to answer questions in the comment thread. Ed, welcome! -Sam Wang

Wireless telephones may be a relatively new technology, but pollsters and statisticians have long been familiar with what is known as “dual-frame sampling.” The basic challenge is how to get an accurate view of a target group when one is trying to reach that group by more than one route (“frame”).

Even before the advent of wireless telephony, pollsters had to deal with households with multiple telephone lines and people who could be reached at more than one residence.  There is a well-established literature on how to work with “multiple frame” surveys that combine samples from different sources (for example, a residential address list and a list of randomly generated telephone numbers). So this is an old problem.

The key is to know, or at least be able to estimate, each respondent’s odds of being chosen for a representative sample. This is known as the stratification issue, in which some groups are harder to reach than others. Because the odds of being chosen for an interview might be different from one respondent to the next, there has to be a statistical adjustment or re-weighting of the sample to account for different odds of selection between groups.
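The re-weighting idea can be sketched in a few lines of code. This is a minimal illustration with invented selection odds, not any pollster's actual procedure:

```python
# Minimal sketch of selection-probability re-weighting (invented odds, not any
# pollster's actual procedure). Each respondent's weight is the inverse of the
# estimated odds of selection, normalized so the weights sum to the sample size.

def selection_weights(selection_probs):
    """Return normalized inverse-probability weights."""
    raw = [1.0 / p for p in selection_probs]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]

# Two easy-to-reach respondents (odds 0.02) and one hard-to-reach (odds 0.005):
# the hard-to-reach respondent counts four times as much in weighted estimates.
weights = selection_weights([0.02, 0.02, 0.005])
```

The normalization keeps the effective sample size unchanged while shifting influence toward the harder-to-reach group.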

In the last few years, the wireless-only population has become quite large and quite diverse.  We have a good handle on its parameters thanks to Stephen Blumberg at the NCHS. So the old concepts of re-weighting can now be applied to this new population.

There are several complications in combined wireless phone/landline sampling. One is that for wireless phones, federal law prohibits the use of automatic dialers, so that surveys are less efficient and therefore more costly to do. Another is that a single respondent will have different odds of being selected by the two routes. For example, a particular demographic might be twice as likely to be reached by wireless phone for the newest Gallup poll as by landline telephone.  In this case, statisticians estimate a final weight that reflects the total odds of a person being contacted for a survey through one of several different telephone numbers.

That’s why pollsters always ask (usually at the end of the interview) the number of voice telephone lines leading to your home and the number of wireless phone numbers that might be used to contact you (yes, there are people who regularly carry more than one wireless phone).  By using your answers to these questions, along with other characteristics about you and your household, pollsters use an old method of statistical adjustment to adapt to a very modern phenomenon.
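A rough sketch of how those end-of-interview answers feed the final weight. The proportional-odds approximation and the sampling fractions here are assumptions for illustration, not any specific pollster's formula:

```python
# Illustrative sketch only: assume the chance of being contacted is roughly
# proportional to how many sampled phone numbers reach the respondent, summed
# across the landline and wireless frames. The sampling fractions below are
# invented; this is not any specific pollster's formula.

def contact_odds(n_landlines, n_cells, frac_landline, frac_cell):
    """Approximate contact odds given the respondent's phone counts and
    each frame's sampling fraction."""
    return n_landlines * frac_landline + n_cells * frac_cell

# Hypothetical respondent: one landline, two cell phones, in a survey sampling
# 1 in 10,000 landline numbers and 1 in 20,000 wireless numbers.
odds = contact_odds(1, 2, 1 / 10_000, 1 / 20_000)
weight = 1.0 / odds  # inverse-probability weight, before demographic adjustment
```

A respondent reachable through several numbers has higher contact odds and therefore gets a smaller weight, which is exactly why the line-count questions are asked.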

References:

Biemer, Paul P. 1984. “Methodology for Optimal Dual Frame Sample Design.” Bureau of the Census, Statistical Research Division Report Series, SRD Research Report Number CENSUS/SRD/RR-84/07.

Copeland, Kennon R., Kirk M. Wolter, Nada Ganesh, Meena Kahre, and Stacie M. Gerby. 2012. “Dual-Frame Sample Weighting for a Telephone Survey of Children’s Vaccination.” Paper presented at the Annual Joint Statistical Meetings of the American Statistical Association.

Dutwin, David, Courtney Kennedy, Scott Keeter, and Dale Kulp. 2008. “Dual Frame (Landline and Cell RDD) Estimation in a National Survey of Latinos.” Pew Research Center for the People and the Press, Washington, DC.

Groves, Robert M., and James M. Lepkowski. 1985. “Dual Frame, Mixed Mode Survey Designs.” Journal of Official Statistics, Vol. 1, No. 3, pp. 263–286.

Skinner, C. J., and J. N. K. Rao. 1996. “Estimation in Dual Frame Surveys with Complex Designs.” Journal of the American Statistical Association, Vol. 91, No. 433, pp. 349–356.

Author: Edward Freeland, Director, Survey Research Center, Princeton University.  9/16/2012.


### 33 Comments so far

• Jose M

Thanks for the link to the NCHS article.

• Bill N

First of all, thanks for this overview. It is quite interesting.

One question I have concerns the issue of those persons contacted in a polling survey who decline to participate. Probability sampling methods produce representative samples in a probabilistic sense, and allow quantification of error by way of the probabilities assigned to particular persons or groups of persons being in the sample. It has always seemed to me that the probabilistic nature of the sample is lost as soon as persons decline to participate, and especially when the number of “decliners” is large. How do you factor the percentage of decliners into the estimates of, say, those planning on voting for Obama (or Romney), and especially into the computations for estimating the sampling error?

• wheelers cat

Thank you very much, Mr. Freeland.
But the multi-frame problem doesn’t seem to address the issue of cell-only households, at 31.6% (Marist) and growing.
How do researchers make the stratification adjustment for cell-onlies?

• Olav Grinde

Could you possibly give us a demographic breakdown of cellphone / smartphone only users?

— By party affiliation / independent
— By age group
— By ethnic group
— By geography
(i.e. a US map showing cell phone onlies voter percentage pr state)

Also, what is actually known about how likely cell phone / smartphone only voters are to vote?

I’m also curious whether distinctions are observed between cell phone onlies and smartphone onlies — and if so, what sort of distinctions?

It’s great that you’re doing this.

• Olav and Wheeler’s Cat, I think some of the answers you seek are found in this Pew Center report and article. Also see the NCHS article, which is key.

To restate Ed Freeland’s point, the idea is that a dual frame survey approaches respondents by (1) landline and (2) cell phone. Those are the two frames. Then the pollster has to weight according to various demographics to get a picture of the whole population. In this respect, Wheeler’s Cat’s concern would be an input to the pollster’s formulas. The concern is addressed if cell phones were included in the survey and the pollster has some model based on the NCHS- or Pew-type statistics.

• wheelers cat

Dr. Wang, in the dual frame approach the surveyor has two channels to try to capture the respondent.
What is the second channel for cell-onlies?
Could you use email?

• Olav Grinde

Sam, maybe I’m missing it, but I really don’t see those numbers and demographic breakdowns explicitly stated on the web sources you point to.

• Ralph Reinhold

The Pew 2010 report on phone use is an eternity ago as far as mobile usage is concerned. I believe usage is reaching saturation. It has been increasing at about 1% of the population every couple of months, which would put it in the low 90% range. The number of cell-phone-onlies has been growing at a slightly faster rate; a report I saw says it is around 2% every quarter. Are the weighting factors being corrected for a dynamic demographic?

• wheelers cat

Landline-onlies are 12.9%, nearly all seniors.
As the seniors die off I expect we will see ~98% cell owners.
55% of american households are landline plus cell(s). 31.6% are cell onlies.
Im not using Pew.
Marist May 2012.

• Olav Grinde

Ah, sorry. Those reports do have some very interesting demographic breakdowns. I didn’t dig deeply enough. :)

• Ralph Reinhold

@wheelers cat: Many of the landline onlies who are not seniors are rural. The political demographics of those lean to the right a mite. We have a contract with Comcast that will cost more to get out of than to continue to expiration. When that’s done, so is our land line. We already sent them a notice. Telemarketers ignoring the do-not-call list are driving several to cell phone only… there’s an app for that.

• Chris Bastian

The critical issue is the impact of phone number selection. In a mobile society, wireless customers are likely to hold on to a familiar cell phone number, rather than change when they move. Hence, a search for numbers to call may omit a portion of the population because they don’t have a “local” phone number, or conversely call someone no longer in the relevant geography.

• Chris – Good point. When pollsters dial a number for one area code and find the person is living in a different part of the country, they have several options. If the poll is national, the respondent’s answers stay in the survey with the same initial probability of selection as the first stage weight. If the poll is just for one state, and the respondent will not be voting in that state (in person or by absentee ballot), then we’ve reached an ineligible respondent, so this respondent’s answers will not be included in the data. Same for respondents who answer an in-state landline phone but don’t live (or vote) in the state being polled and for convicted felons in states that don’t allow them to vote. In sum, the first cut on your way to a group of likely voters is to drop the ineligible voters.

• JamesInCA

@wheelers cat: What Dr. Freeland & Dr. Wang are saying is that cell-phone-only respondents are reached by dialing their cell phone. That’s the second channel.

• Thank you for your questions. I’m happy to respond. First to Bill N’s concern: if I could nominate a hymn for my profession it would be “What a Friend We Have in Randomness.” Thanks to writers like Nassim Nicholas Taleb, randomness is finally getting its due. It turns out to be more complex and more pervasive than most people think, and it’s a big part of why survey researchers worry about non-response. Our hope is that non-response is essentially random, namely that there is no systematic pattern to who decides to participate in a survey and who does not. As long as non-response is random, then our results are based on a random sample (the “responders”) of a random sample (the group of all households selected for our survey). While we may observe under- or over-representation of certain categories of respondent, it’s rare to find some groups entirely missing. The post-stratification weights applied at the end of the data-gathering phase help us adjust for over-response among some demographic categories and under-response among others. If you’re still skeptical and would like to see the results of an experiment that tests the impact of getting a high response rate, take a look at this paper by the Pew Center for the People and the Press.
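A toy version of that post-stratification step. All the population and sample shares below are invented purely for illustration:

```python
# Toy post-stratification sketch (all shares invented for illustration): each
# demographic cell's weight is its population share divided by its share of
# completed interviews, scaling up under-responding groups and scaling down
# over-responding ones.

population_share = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}  # assumed targets
sample_share     = {"18-34": 0.15, "35-64": 0.50, "65+": 0.35}  # who responded

post_strat_weight = {
    group: population_share[group] / sample_share[group]
    for group in population_share
}
# The under-responding 18-34 group gets roughly double weight; the
# over-responding 65+ group is scaled to well below 1.
```

Real surveys adjust on several variables at once (often by raking across marginals), but the one-variable version shows the principle.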

Wheeler’s Cat: When we design dual-frame telephone samples, we randomly draw telephone numbers from two pools: landline numbers and wireless numbers. The nation’s telephone carriers are quite meticulous about keeping these two pools separate because (1) they are required to do so by law and (2) the two kinds of phones are billed differently. FCC rules prohibit auto-dialing to wireless numbers, so carriers have to separate landline and wireless numbers. This, in turn, allows the companies that furnish telephone samples for the polling industry (e.g., SSI and Genesys) to know in advance which numbers are wireless and which are landline. Granted, number porting and call forwarding can sometimes mess with this a bit, but nevertheless, this approach gives pollsters coverage of the wireless-only population that’s every bit as good as coverage of the landline population.

Olav Grinde: As Sam mentioned, we have lots of great data on America’s wireless-only population. Interestingly enough, there is also a “wireless-mostly” population that has a landline telephone in their residence but does not answer calls on it. The NCHS report has lots of demographic data on the wireless-only and wireless-mostly populations, but not much on political affiliation. Since the National Health Interview Survey is government-sponsored, it does not ask about political affiliation. The best research I could find on that issue comes from another paper by the Pew Center. The results underscore my earlier point about the wireless-only population being large and diverse.

Ralph Reinhold: While it’s true that the wireless-only population is still growing quickly, that growth means that the wireless-only population looks more like the rest of the population with each passing month. And because of this, the weight adjustments we make to account for differences in the two populations will have a decreasing impact on the results over time.

• Ralph Reinhold

Your point may be valid. However, if the pollsters are coloring their raw data according to preconceived notions of the populace, they are introducing a bias. The one time I was called by a poll, age was one of the questions. If they currently all ask that, then the data may be colored to fit the national statistics.

I would think that they will see more non-responses as smartphones go into greater use. I have an app that allows me to immediately hang up on an unfamiliar number. I don’t use that option, but it reports how many calls I’ve received from that number and allows me to block repeat callers that I don’t know.

• wheelers cat

Well if cell is the second channel, what is the first?
Landlines? Only 12.9 percent are landline only.
nearly a third are cell only.
I still think you are missing a large cohort, and some bias.
Who has landlines? White male property owners and seniors. Lets say the husband has a landline, but everyone in the family also has a cell or a smartphone. But only the husband responds to polls.

I dont answer unknown callers on my iPhone.
ever.
How will pollsters reach me? I’m a college-educated pre-menopausal white woman. We vote Obama, non? I guess they could send me an email.

“that growth means that the wireless-only population looks more like the rest of the population with each passing month” does it?
Aren’t landline onlies mostly seniors?

Wheeler’s Cat- Sounds like you should answer your phone if you want to be counted. Like I said, I had 2 polls taken of me last week and I am cell phone only.
One of them (American Future Fund) was a robocall. Or at least a machine asking me questions rather than a real person.

• Olav Grinde

Thanks! Another question:
Is there any reason to believe that there is an imbalance in the political preference of people willing to answer polls in the first place?

Olav – Good question – gets to the heart of the matter. I haven’t seen any data on this, but even if, for example, right-leaning citizens were less willing to answer than left-leaning citizens, there are still enough people from both groups willing to answer to make reasonably good estimates of how each group will behave in the voting booth.

• Olav Grinde

Dr Freeland, one comment, if I may:
The Pew Center paper to which you refer is based on statistics from 2009. Relative to the development of wireless usage, that seems an eternity ago…

• Ralph Reinhold

His comment to me is valid. As cell phone usage increases, the statistics will come to match the mainstream.

• wheelers cat

No they won’t. Landline onlies are old people and they vote repub. Cell onlies are young people that vote democrat.
And there are twice as many cell onlies as landline onlies.
That is my point. Its not symmetrical!!!!

• In which case the less-sampled group will be weighted more heavily. That is the entire point of this procedure.

• wheelers cat

I guess. But the problem as I see it is that smartphone ownership is a choice. So dont you lose randomness? And internet invitations….random selection, but being able to use the internet is a demographic on its own. I just think our polling methodology is stale and we should be able to exploit social media and modern technology better.

And I absolutely love Taleb. He’s the black swan guy.

• Olav Grinde

@Ralph Reinhold: Absolutely, and that is an astute observation. That’s the reason I also asked about the subset of cell phone onlies: smartphone onlies.

As a clarification, I should add that the FCC prohibition on auto-dialing cell phones doesn’t mean those numbers don’t get called. It just means the interviewer has to hand-dial the number the old-fashioned way – the same way telephone polling was done before the advent of computer-assisted interviewing.

Dear Mr. Freeland, this is all very interesting information. The NCHS link and dual-frame technique make fascinating reading.

Could you bottom-line it for us?

1) Is this (cellphone only) a problem? What is the magnitude of the systematic uncertainty, compared to the typical statistical uncertainty?

2) If yes on 1, then is this a bigger problem for certain pollsters? Which ones, and how much?

3) What is your opinion on non-phone techniques used to mitigate this problem (YouGov’s invited internet sample, for instance)?

1) I think we have the problem solved. Granted, cell phone interviewing has more problems than landline phone interviewing: the interviews are more expensive, result in higher proportions of ineligible voters, and have lower cooperation rates. Nevertheless, we can be assured that our sample frame gets us good coverage of the cell-only and cell-mostly populations in the U.S. and that the interviews we are able to complete by cell phone give us reasonably good estimates of what folks in these populations are thinking.
3) As for web-based surveys: beware the non-probability sample. Invited web panels may have lots of respondents, but then so did the 1936 Literary Digest poll – the one that predicted Alf Landon’s victory over FDR.

• wheelers cat

Mr. Freeland, what do you think of the RAND poll?
They try to insert randomness.
https://mmicdata.rand.org/alp/index.php?page=election#shifts-between-candidates

• Are there techniques being developed to poll cellphone users using non-voice based methods such as text messages?

I ask because I find that a large fraction of communication with my cellphone is not voice.

• wheelers cat

I think that is brilliant.
text polling.

• One last question (if you are still reading this).

What would you consider to be the biggest systematic uncertainties in polling?

To put it another way: let’s say you had unlimited funds and could conduct a poll with a huge sample size. At what value of N would it cease to be worth it? 1000 responses? 5000 responses? At some point the sum of systematic uncertainties will be comparable to the 1/sqrt(N) statistical uncertainty. About how large would N have to be?
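The questioner’s arithmetic can be sketched numerically. For a proportion near 50%, the 95% statistical margin of error is about 1.96·sqrt(p(1−p)/N); the 2-point systematic floor below is an assumed figure for illustration only:

```python
# Sketch of the statistical-vs-systematic crossover. The ~2-point systematic
# floor is an assumption for illustration, not a measured value.
import math

def statistical_moe(n, p=0.5, z=1.96):
    """95% margin of error for a proportion, in percentage points."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

systematic_floor = 2.0  # assumed, in percentage points
for n in (500, 1000, 2500, 5000, 10000):
    moe = statistical_moe(n)
    side = "below" if moe < systematic_floor else "above"
    print(f"N={n:>6}: statistical MOE = {moe:.2f} pts ({side} the assumed floor)")
```

Under these assumed numbers the crossover lands near N ≈ 2,400, past which extra sample size buys little against a fixed systematic error.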