So You Want to Unskew the Polls

Our modern obsession with presidential polling dates back, of course, to 2008, when Nate Silver’s aggregation method reassured anxious liberals that Obama really was going to win. The seeds of another obsession—critiquing the validity of said polls—also sprouted during that election, with debates about polls conducted via cell phone vs. landline (the lack of the former, it was suggested, accounted for Obama not being even further ahead). But it wasn’t until 2012, when a man named Dean Chambers introduced the term “unskewed” to the electoral lexicon via a (now defunct) website devoted to proving that the polls were wrong about Mitt Romney, that the pastime of poll debunking really took off. The 2012 polls were not wrong—but the 2016 polls were (or, rather, as any data scientist would tell you, they said that Hillary was more likely to win than Trump, and we know how that turned out), and as a result nobody trusts Joe Biden’s 8% lead circa July 2020. As polling analysis has become more sophisticated, so have the ways people talk themselves into believing those polls are wrong. Here’s how to do it properly, and how not to.

Don’t unskew the polls. No, seriously, don’t. Polls are imprecise measurements. They have margins of error for a reason. They aren’t “right” or “wrong”: They’re snapshots of a moment, little bite-size pieces of information.

Do look at polls in more context. I haven’t persuaded you, have I? Fine. If you must start meddling, you can make polls more useful by viewing them in conjunction with other polls—that is, adding the context of other little bite-size pieces of information—and then if you’re really ambitious, do some math to create fancy advanced polling averages, or models that take those averages and turn them into probabilities. But none of that involves looking at a single poll and deciding it’s wrong and then hunting for the proof in the guts of the thing.
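For the curious, here is roughly what the simplest version of "do some math" looks like: a minimal sketch of a polling average weighted by sample size. The three polls are made-up numbers for illustration, and `pooled_lead` is a hypothetical helper name. Real aggregators also adjust for things like recency and pollster track record, none of which this attempts.

```python
def pooled_lead(polls):
    """Average the candidate's lead across polls, weighting each by sample size.

    polls: list of (sample_size, lead_in_points) tuples.
    """
    total_n = sum(n for n, _ in polls)
    return sum(n * lead for n, lead in polls) / total_n


# Hypothetical polls: (respondents, lead in percentage points).
polls = [(500, 12.0), (1000, 6.0), (1500, 8.0)]
average_lead = pooled_lead(polls)
print(average_lead)  # 8.0
```

The small poll showing a 12-point lead gets pulled toward the larger ones, which is the whole point: no single poll gets to be "right" or "wrong" on its own.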

Don’t go into the crosstabs. One of the most common ways that amateur poll sleuthers go awry is by delving into the crosstabs of a single poll (where information about responses by race, gender, age, etc. is housed) in order to declare that it sampled the wrong amounts of different demographics. Hardcore Bernie Sanders supporters during the 2020 Democratic primary, for example, argued that polls were failing to capture his support because the crosstabs showed not enough young people were being interviewed.

This sounds smart, but it is, fundamentally, not how polls work. Polling companies don’t simply ask a random sampling of people their opinion, then write it up and call it a day. When it comes to elections, first they ask somewhere between 300 and 5,000 people their opinion, and then they weight those responses by demographic category so that the final numbers reflect the makeup of the population the poll is trying to measure, not just whoever happened to answer. Get too many old fogeys picking up the phone? Each of them counts for less in the final average.
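To make the weighting idea concrete, here is a minimal sketch of one common approach, usually called post-stratification. Everything in it is an assumption for illustration: the two age groups, the 50/50 population split, the ten invented respondents, and the function name `weighted_topline`. Actual pollsters weight on many variables at once and use more elaborate procedures.

```python
from collections import Counter


def weighted_topline(respondents, population_shares):
    """Reweight responses so each group counts in proportion to its population share.

    respondents: list of (group, choice) tuples.
    population_shares: dict mapping group -> share of the target population.
    """
    n = len(respondents)
    group_counts = Counter(group for group, _ in respondents)
    # Weight = (share of population) / (share of sample) for each group.
    weights = {g: population_shares[g] / (count / n) for g, count in group_counts.items()}

    totals = Counter()
    for group, choice in respondents:
        totals[choice] += weights[group]
    total_weight = sum(totals.values())
    return {choice: w / total_weight for choice, w in totals.items()}


# Invented sample: 8 older respondents split evenly, 2 younger ones both for A.
sample = [("old", "A")] * 4 + [("old", "B")] * 4 + [("young", "A")] * 2
# Assumed target population: half old, half young.
result = weighted_topline(sample, {"old": 0.5, "young": 0.5})
```

Unweighted, candidate A has 60 percent of this sample. After weighting, A lands at about 75 percent, because the two young respondents are counted as if they made up half the sample. That is why "not enough young people were interviewed" does not, by itself, tell you the topline is off.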

Sometimes, polls will have few enough respondents in a given crosstab that they don’t even list the results and instead throw up an n/a (not applicable). An unskewer uninitiated in the ways of polling might think this meant that nobody in that category was interviewed, but that’s not the case. Most pollsters don’t share results when a very small number of people in a given crosstab are reached because the margin of error climbs so high, but that doesn’t mean the group is weighted incorrectly in the poll’s overall numbers.
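How high does the margin of error climb? A rough sketch using the textbook formula for a simple random sample gives a sense of scale. The sample sizes here are illustrative assumptions, and the formula ignores the extra uncertainty that weighting adds, so real subgroup margins tend to be somewhat larger.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p from n respondents.

    Assumes a simple random sample; p=0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)


# A full poll of 1,000 people versus a crosstab of just 40 of them.
print(round(100 * margin_of_error(1000), 1))  # 3.1
print(round(100 * margin_of_error(40), 1))    # 15.5
```

A subgroup number that could be off by 15 points in either direction is not telling you much, which is why pollsters often decline to print it.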

Crosstabs are dangerous things. It’s best to steer clear.
