Both the polling business and the blogosphere -- and especially the select group of nerds who dwell where the two intersect -- were thrown into a tizzy on Tuesday with bombshell allegations that the polling firm Research 2000 is a sham operation.
There’s always been a great deal of smoke and mirrors obscuring the polling industry, but the revelations here go beyond mere number-massaging, possibly to the point of data actually being made up.
Until a few weeks ago, the prolific pollster was best known for its work on behalf of the liberal website Daily Kos, although it also counted a number of mainstream newspapers and TV stations among its clients. Daily Kos founder Markos Moulitsas ended his relationship with the firm shortly after Research 2000 repeatedly showed Bill Halter narrowly winning the Arkansas Democratic Senate runoff (he ultimately lost to Blanche Lincoln), and, more important, after Research 2000 ranked near the bottom of numbers guru Nate Silver's pollster ratings.
Several weeks ago, three statistics experts independently approached Moulitsas with concerns that the numbers being generated by Research 2000 weren't the results of random polling. This was based on their reading of the cross tabs, the subsamples of different demographic groups (such as what percentage of 18-34-year-olds would vote for Candidate X or what percentage of women had a favorable opinion of Barack Obama). That this data was available was supposedly a strong suit of Research 2000 -- many pollsters don’t even bother publicly releasing cross tabs. But here their disclosure may, in the end, have been their downfall.
One observation that stood out for the experts was the strange pattern of even and odd numbers in various politicians' approval scores. For instance, in Research 2000's June 3 sample, Obama’s favorable rating was 43 percent among men and 59 percent among women: both odd numbers, so the two add up to an even number (102). That in itself wasn’t unusual. His unfavorable rating was 54 percent among men and 34 percent among women: both even, so they too add up to an even number (88). There’s no reason that should happen again -- the male and female figures are independent of each other, and nothing compels them to be both odd or both even -- but twice in a row isn’t that weird. Undecideds were 3 percent among men and 7 percent among women: three times is getting weird. But it didn’t stop there: The even-odd property matched in a total of 776 out of 778 male-female pairs in weekly polls. That would be like flipping a coin and getting "heads" 776 out of 778 times; the odds against it are astronomical.
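To put a number on "astronomical," the coin-flip comparison can be worked out directly. The sketch below takes the 776 and 778 figures from the experts' count and assumes, as the coin analogy does, that each male-female pair has an independent 50-50 chance of matching in parity. That independence is an assumption of this illustration, not something the polls themselves guarantee.

```python
from math import comb

PAIRS = 778      # male-female pairs examined (figure quoted above)
MATCHES = 776    # pairs where both numbers were odd or both were even

# Assumption: each pair matches independently with probability 1/2,
# exactly like a fair coin flip.
# Count the ways to get 776, 777 or 778 matches out of 778 flips.
tail_count = sum(comb(PAIRS, k) for k in range(MATCHES, PAIRS + 1))

# Every sequence of 778 fair flips is equally likely, so divide by 2**778.
probability = tail_count / 2 ** PAIRS

print(f"Chance of at least {MATCHES} matches in {PAIRS} pairs: {probability:.2e}")
```

Under those assumptions the chance comes out on the order of 1 in 10 to the 228th power, which is why the experts treated the pattern as something other than luck.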
Another pattern that stood out was that there were very few weeks in which there was no change in Obama’s favorable numbers. Now, it seems intuitive that there wouldn’t be many weeks without change; poll numbers are volatile and change a lot. However, this is another instance where the law of averages is at work. If you’re describing a trend that’s basically flat -- and Obama’s approvals have been pretty flat for the last year, once the initial honeymoon wore off -- the single most common result, out of many, many surveys, is going to be "no change." Changes of 1 or 2 points happen less often, while larger changes are rare but not unheard of.
To see this at work, look at the graph of the normal distribution of the week-to-week changes in Gallup polls published in its report: it’s a bell-shaped curve, with the tallest bar for "no change." Then compare the Research 2000 graph: There are lots of -1's and +1's, but very few 0's, which theoretically should be the most common result. The odds of such a distribution occurring naturally, again, are astronomical.
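The expected shape of that bell curve can be checked with a small calculation. The sketch below is an illustration under stated assumptions, not a reconstruction of either pollster's method: it supposes a flat trend with true support of exactly 50 percent, a hypothetical sample of 1,000 respondents per week, simple random sampling, and results reported as whole percentages. The names `N`, `rounded_percent_distribution` and `change_distribution` are made up for this example.

```python
from collections import defaultdict
from math import comb

N = 1000  # hypothetical respondents per weekly poll (an assumption)

def rounded_percent_distribution(n):
    """Exact chance of each reported whole-number percentage
    when true support is 50% and n people are sampled at random."""
    dist = defaultdict(float)
    for k in range(n + 1):
        pct = (200 * k + n) // (2 * n)    # 100*k/n, rounded half up
        dist[pct] += comb(n, k) / 2 ** n  # binomial probability at p = 1/2
    return dist

def change_distribution(dist):
    """Chance of each week-to-week change between two independent polls."""
    diff = defaultdict(float)
    for last_week, p_last in dist.items():
        for this_week, p_this in dist.items():
            diff[this_week - last_week] += p_last * p_this
    return diff

changes = change_distribution(rounded_percent_distribution(N))
for d in range(-3, 4):
    print(f"{d:+d} points: {changes[d]:.3f}")
```

In this setup "no change" is the tallest bar, with +1 and -1 equally likely and a little shorter, and larger swings tapering off on both sides. That ordering does not depend on the particular sample size chosen: for any two independent polls drawn from the same distribution, a difference of zero is at least as likely as any other single value.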
It’s entirely possible that these unusual results aren’t the product of falsified data but of some sort of weighting intended to smooth out the data and make it conform better to expected turnout models (a common pollster practice). But unless Research 2000 is willing to turn over all of its raw data -- something it has said it won’t do -- there’s really no way to know. Daily Kos plans to file suit against Research 2000, and the discovery process is likely to reveal what was happening behind the curtain. If nothing else, this should be a spur for all pollsters to make public all of their underlying data as part of the routine disclosure process. With the previous discrediting of Strategic Vision and now, potentially, Research 2000, the polling industry’s credibility is increasingly on the line.