Wednesday, December 21, 2011

F-ing polls -- how do they work?*

Roger Simon has written one of the most anti-intellectual columns of the week, asking whether polls are really "magic." Not only does he appear not to know how polls arrive at the answers they do, but he seems to have no interest in learning. He even falls back on the classic "they never call me" trope. Some highlights:
I have never been called by a political pollster and don’t know anybody who has, but I know some pollsters, who assure me they don’t make the numbers up, and I believe them.
Pollsters, or rather the phone-bankers who make call after call (or computers that make robo-call after robo-call) do get people to talk to them. Not vast numbers of people, but pollsters do not require vast numbers.
[...]
We are a nation of nearly 313 million people. So how many people did the pollsters actually speak to? If you have extremely good eyes, you can find the answer in tiny type at the bottom of a chart: The Post-ABC poll was conducted by phone “among a random sample of 1,005 adults.”
That represents 0.0003 percent of the nation at large. (The number of Republicans and Republican-leaning independents was an even smaller sample of 395 people.)
[...] 
This poll has a very good reputation and I “believe” the results in that I believe they were calculated carefully and (unlike some partisan or campaign polls) without any agenda.
[...] 
Does Obama really lead Gingrich by 8 percentage points in a (currently) imaginary matchup?
I dunno. Sounds right to me. But I am an even smaller sample than 0.0003 percent.
You really don't need to be a statistician to understand this stuff. Why can a survey of 1,100 people be accurate in telling us how the whole nation is thinking? The metaphor I always liked was a blood test. For a doctor to determine if there's a problem with your blood, she doesn't need to remove it all -- she can just extract a small vial. This vial of blood represents the rest of your blood well because it's constantly being mixed up, so that a few cc's of your blood in your arm looks like the blood anywhere else in your body.

It's the same thing in surveys. You can poll a fairly small number of people as long as you can be confident that you're getting a representative sample of American voters. (Talking to your friends and neighbors? Not representative. Calling people randomly across the country? Much better.) And some relatively simple math can tell you just how likely it is that your sample believes what the rest of the country believes. Picking 1,100 people for a survey means you have a margin of error of roughly 3%. That means there's a 95% chance that the actual population is within three percentage points of what your sample believes. Pollsters have settled on that as a pretty reliable margin. You could get it down to 2%, but only by interviewing more than twice as many people (roughly 2,400), driving up the costs of the poll considerably without improving its accuracy by much.
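
For anyone who wants to see the "relatively simple math," here is a minimal sketch in Python. It assumes a simple random sample, the usual normal approximation, and the worst case of a 50/50 split; the function names margin_of_error and sample_size_for are made up for this illustration, and real pollsters also weight their samples, which this ignores.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p
    estimated from a simple random sample of n people."""
    return z * math.sqrt(p * (1 - p) / n)

def sample_size_for(moe, p=0.5, z=1.96):
    """Smallest sample size whose margin of error is at most moe."""
    return math.ceil((z / moe) ** 2 * p * (1 - p))

# Sample sizes mentioned in the post and in the Post-ABC poll.
for n in (395, 1005, 1100):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} points")

# How many interviews a 3-point and a 2-point margin would take.
print(sample_size_for(0.03), sample_size_for(0.02))
```

The margin shrinks with the square root of the sample size, which is why cutting it from 3 points to 2 takes more than double the interviews.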

The sad thing is that Simon has an audience who might really appreciate a better understanding of how polling works, but he decided to waste their time with some blather about how polls are magical and therefore beyond our understanding. They're not, and Simon's readers deserve better.


*Must credit Brendan Nyhan for the Insane Clown Posse reference.

13 comments:

Gordon Danning said...

To be fair, you DO have to be a bit of a statistician to understand this stuff -- it is pretty counter-intuitive, and it helps to have actually looked at the math behind it. And, your blood cell analogy doesn't work -- all blood cells are pretty much identical, or at least are far more identical than people. But you are right that a blogger/pundit/educated person should certainly know the basic principle behind sampling size, calculating margin of error, etc.

Anonymous said...

Actually, you have a 68% chance that your measurement falls within 3% of the true value for the sample. If you want 95% confidence, then you must consider a range twice the standard error, or 6% in this case.

Seth said...

Actually, the margin of error is twice the standard error. So margin of error refers to a 95% confidence interval.

Anonymous said...

1 sigma = sqrt(N)/N ~ 30/1000 = 3%

68% of points will fall within 1 sigma in a normal distribution

95% of points will fall within 2 sigma...

Anonymous said...

Gallup explains sampling:

http://janda.org/c10/Lectures/topic05/GallupFAQ.htm


I can't even remember how many times I've spoken to people who think polling is nonsense and that 1000 people "cannot possibly" explain the behavior of tens or hundreds of millions.

Anti-intellectualism and anti-science at its finest.

Raphael said...

SE is roughly SQRT(p*(1-p)/N) where p is the sample proportion. For most surveys, p ~.5 (.45-.55 seems to be a typical range for political support, although the latest Congressional approvals are really down there). In any case for a sample proportion of approximately .5 and a sample size of 1000, the SE is about 1.6% and a 95% Confidence Interval is roughly +- 3%.

Anonymous said...

Since most reputable polls these days end up very, very close to the actual results in elections, why would someone call it "magic" or doubt their veracity?

THAT seems counter-intuitive.

David Nickerson said...

I like the blood test analogy. I will probably use it in future lectures on sampling. Thanks.

metrichead said...

Seth, you (and a few other bloggers) have stoked and rekindled my interest in political science.

It's really in the math section where I hit the wall; methinks I'm intimidated. Especially when other posters start speaking in "Mathenese" or those formulas.

But as far as Roger Simon's column went, I pretty much get the idea of how using a sample of 1,000-1,200 voters works.

Anyway, enjoy the rest of the holiday - I'll be off the grid until after New Year's. I hope you have a great 2012!

Anonymous said...

Technically, it is not the case that a 95% confidence interval tells you that "there's a 95% chance that the actual population is within three percentage points of what your sample believes".

Think of it this way. Suppose the true population percentage supporting some position is 50%. Pollster A draws a sample that produces an estimate of 54% with a confidence interval from 51%-57%. Pollster B draws a sample with mean 46% and confidence interval 43%-49%. It is not logically possible for the true value to have a 95% chance of falling in A's confidence interval AND at the same time a 95% chance of falling in B's confidence interval, which don't even overlap.

Instead, what the 95% confidence interval is telling us is that in 95% of random samples, the true value will fall in the range we create by adding/subtracting 2 standard errors from our estimate of the mean. In 95% of samples, our confidence interval will include the true value. But the probability that the confidence interval will include the true value in any SINGLE sample is either 0 or 1. The interval produced from our sample either includes the true value or it doesn't. Of course the problem is, we can't tell which world we're in -- whether we've gotten one of the 5% of "bad" samples due to an unlucky draw.
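
The repeated-sampling reading above can be checked with a small simulation. This is only a sketch and not something from the thread: the function name coverage, the true support of 50%, the sample size of 1,000, the number of trials, and the fixed seed are all assumptions chosen for illustration. It draws many independent polls from a population whose true value is known and counts how often each poll's interval (estimate plus or minus 1.96 standard errors) contains that value.

```python
import math
import random

def coverage(true_p=0.5, n=1000, trials=2000, z=1.96, seed=0):
    """Fraction of simulated polls whose confidence interval
    contains the true population proportion true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One simulated poll: n independent yes/no respondents.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        # Each individual interval either contains true_p or it does not.
        if p_hat - z * se <= true_p <= p_hat + z * se:
            hits += 1
    return hits / trials

print(coverage())
```

The printed fraction should land close to 0.95, even though any single interval either covers the true value or misses it.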

CL said...

Great response to a horrible column. This drives me crazy, especially when people say "they never call me."

dmarks said...

"Pollsters, or rather the phone-bankers who make call after call (or computers that make robo-call after robo-call) do get people to talk to them"

I've made my wishes quite clear when I put myself on the do-not-call list. Anyone who ignores this, including pollsters, charities, and politicians, is being rude and harassing me.

I hope some politicians (fat chance) close the loophole so there will be fines and other punishment against the rest of those who ignore the do-not-call list.

Rob Rushing said...

I think the blood analogy is quite persuasive, on two or three grounds. 1) It is scientifically accurate: not only are your blood cells not all the same, but doctors are not looking exclusively at your blood cells when they draw blood; they are looking at the variety and amount of substances that are in a sample (electrolytes, toxins, alcohol, enzymes, iron, etc.), just as pollsters are looking at Dems, Reps, Inds, Libertarians, Crypto-Fascists, and so on. 2) It creates a "well, no duh" response on the part of the listener, who is compelled to agree by common sense and because the analogy is to a common experience. 3) Finally, it subtly but deftly threatens the listener with having all of his or her blood removed, as if to say: "You think polls are magic, huh? Then you won't mind if I slice open an artery, will you? Cause it looks like I need all of it!"