Why Polls Still Matter
The Purpose of Polls
Polls aren’t meant to predict a winner; though that would be nice, it’s not their primary function. Political analysts, pundits, and even candidates themselves often seem to forget this critical point. As we move within less than a week of the 2020 election, it cannot be overstated. That doesn’t mean that pollsters, pundits, campaigns, or voters shouldn’t try to predict the outcome of a race (they should, and it’s fun), but it is with this baseline understanding of polls that one must begin. Aggregated polls help estimate each candidate’s probability of victory, but there’s only one race every four years. Trump doesn’t need to beat Biden ten times; he needs to beat him once. He doesn’t need a hundred paths to victory; he needs one. They’re both professional politicians, as much as Trump likes to tell everyone he isn’t, and it all comes down to percentages.
Polls are snapshots in time, and they are far from the full picture. Polls can help us prepare and get a sense of where things stand at the moment, but they are useless without an understanding of their purpose and broader context. When the Patriots met a banged-up Eagles team with a back-up quarterback in the 2018 Super Bowl, they were heavily favored. They’re both professional football teams (obviously pretty damn good ones to make it to the final game of the season), and you only need to win one game. More people should heed Bert Bell’s advice: “on any given Sunday, any team in the NFL can beat any other team.” That’s no less true in politics. That isn’t to say you shouldn’t have favorites, or even that you shouldn’t bet on the outcome if you’re confident and have reasons to be. It’s just to say you shouldn’t be surprised by the outcome. Polls in national elections are like odds in football games: there’s a lot more to it than who should win on paper. There was good reason to bet on the Patriots in 2018 and there’s good reason to bet on Joe Biden in 2020, but the underdog only has to win once.
A lot has been made of the 2016 polls, and for good reason. President Trump recently tweeted at FiveThirtyEight’s Nate Silver, saying: “you got  so wrong.” But that isn’t really true, mostly because Trump is ignoring percentages and the purpose of polls and forecasts. There are many outliers in polling, and relying on a single year like 2016 to make predictions about 2020 is a bad idea. Remember, polling has been pretty consistently accurate since the early 1970s. Some years have been better or worse, but presidential polls have been right far more often than they’ve been wrong. One should look at a lot of different factors when determining what’s happening in a race. Polls are just a part of that. What the 2016 polls can do is remind us how to read polls so that we set expectations properly.
Polling Averages and Mean Absolute Error
The best polls are aggregates (averages of individual polls), and two leading sources of national and statewide aggregates are the Real Clear Politics (RCP) averages and FiveThirtyEight. When judging the quality of polls, researching the different methodologies used is key. There are a ton of theories about which methods are best, and they change often, which is why aggregate polls do better over time than individual polls do. It is true that in certain years some individual polls are more accurate than others. Many of the Republican polls in 2016 proved to be more accurate in hindsight, but I’d caution against overcorrection or using a single election as a sample size. Polling averages are only as good as the polls that are in them, but even granting that they are imperfect instruments, there is overwhelming statistical support for aggregated polls.
To determine how accurate polls are, the mean absolute error is calculated, which is simply the average size of the gap between what the polling average showed and the final result, regardless of which candidate the miss favored. In 2016, the mean absolute error in battleground states was 2.7 points. In 2012, it was almost identical; if anything, the polls were slightly less accurate. On the whole, that’s a good performance and well within a normal margin of error. There were a few big misses in 2016, and the miss in Wisconsin turned out to be the most egregious and costly for Clinton. When looking at polls, the size of the miss is much more important than the direction. It’s up to us (average voters, politicians, policy advisors, pollsters) to set expectations based on what we are seeing.
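As a rough illustration of that calculation, here is a minimal Python sketch. The function name, the sign convention (positive margins mean the Democrat leads), and the two states are assumptions made up for this example; they are not real polling data.

```python
def mean_absolute_error(poll_margins, result_margins):
    """Average size of the polling miss, ignoring its direction."""
    if len(poll_margins) != len(result_margins):
        raise ValueError("need one result for every poll margin")
    misses = [abs(poll - result) for poll, result in zip(poll_margins, result_margins)]
    return sum(misses) / len(misses)


# Hypothetical states, NOT real data. Positive = Democrat ahead, in points.
poll_margins = [3.0, -1.0]    # final polling averages
result_margins = [-0.5, 1.5]  # election results

# State 1 missed by 3.5 toward the Democrat, state 2 by 2.5 toward the Republican.
signed = [poll - result for poll, result in zip(poll_margins, result_margins)]
print(sum(signed) / len(signed))                          # 0.5 (misses partly cancel)
print(mean_absolute_error(poll_margins, result_margins))  # 3.0 (size of a typical miss)
```

The signed average looks small because the two misses partly cancel. The absolute version keeps the size of each miss, which is the number that matters when setting expectations.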
State polling data is significantly more useful than national polling numbers if you are trying to predict a winner. There is no doubt some value in understanding national polls and national trends, but if you are looking at polls to see who has the best chance of winning the race, state polls should be your only focus, since presidential elections are won and lost in the key swing states of the Electoral College and not in the national numbers.
In 2016, you heard about the big three: Michigan, Pennsylvania, and Wisconsin. Had 80,000 voters across those three states flipped, Clinton would be President. But of those states, only one was a significant miss: Wisconsin. First, in Michigan, the final RCP average had Clinton leading by a little over 3%. On election night, she lost the state by 0.3%, which is a polling error of about 3.3 points, slightly above the average. In Pennsylvania, we saw almost exactly the same story, a polling error of 2.8 points. Those are pretty good polls; they just got the winner wrong. The problem wasn’t the poll, it was the pundit who looked at a 3-point advantage, expected Clinton to win the state, and was shocked when she didn’t. Wisconsin was a different story. The final RCP average had Clinton up 6.5% and she lost by 0.7%, a polling error of over 7 points. There are a lot of reasons why pollsters think they missed Wisconsin and why late undecided voters broke for Trump (that they often break for the challenger is a main one). But whatever the reasons, they did.
No one was talking about three other big misses in 2016 (Iowa, Minnesota, and Ohio) because they weren’t determinative. The polling errors in Ohio and Iowa were each over 6.5 points, which is a horrible miss for a polling average, but since the polls showed Trump ahead and he won both states, it didn’t make the news. The same was true on the flip side in Minnesota. The last polls in Minnesota had Clinton up by about 10%, a lead considered so safe that a lot of pollsters didn’t even poll the state. Yet on election night, Clinton carried Minnesota by only 1.5%.
So, six states, all in or around the Midwest with similar demographics, missed by more than the 2.7-point average, in some cases by a lot more. But that shouldn’t have been a huge surprise to those who follow polls closely, since we know that polling errors are correlated within similar regions of the country.
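To see why correlated errors matter, consider the small simulation sketched below. Everything in it is an assumption for illustration: the three identical 3-point leads, the normal error model, and the error sizes are hypothetical, not fitted to any real election. It compares how often a trailing candidate sweeps all three states when most of the polling miss is shared across the region versus when each state misses independently.

```python
import random


def sweep_probability(leads, shared_sd, state_sd, trials=20000, seed=1):
    """Estimate how often the trailing candidate wins every listed state."""
    rng = random.Random(seed)
    sweeps = 0
    for _ in range(trials):
        shared_miss = rng.gauss(0, shared_sd)  # one regional miss hits every state
        finals = [lead - shared_miss - rng.gauss(0, state_sd) for lead in leads]
        if all(final < 0 for final in finals):
            sweeps += 1
    return sweeps / trials


# Hypothetical poll leads (in points) for three similar states, NOT real data.
leads = [3.0, 3.0, 3.0]

# Both scenarios give each state the same total error variance (3^2 + 1^2 = 10);
# the only difference is whether most of the error is shared across states.
correlated = sweep_probability(leads, shared_sd=3.0, state_sd=1.0)
independent = sweep_probability(leads, shared_sd=0.0, state_sd=10 ** 0.5)

print(f"correlated misses:  {correlated:.3f}")
print(f"independent misses: {independent:.3f}")
```

Under these assumptions the sweep is far more likely in the correlated case, which is the point of the paragraph above: when one state in a region misses, its neighbors tend to miss the same way.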
The moral of the story that I carry from 2016 is that unless consistent, aggregate polls show one candidate up by 6-7% or more, that state shouldn’t be written off. Of all the polls that mattered, four had polling errors north of 6% in 2016; in 2012 it was only two, and neither of them was determinative for Obama. In 2008, the polls were excellent. The polls are good over time. As pollsters increasingly run into the mode effect, have difficulty reaching voters, need to weight their polls significantly, and continue to rely on live phone calls, we should be giving more leeway, not less, in our expectations of polling accuracy.
How to Add Polling Information into Political Trends
Pollsters have theories about what went wrong in those misses in 2016. Many have concluded that there was a large bloc of white, non-college-educated voters in the Midwest that the polls missed. They say they corrected that flaw in 2018 and 2020 by weighting polls for education. But there were also some big misses in 2018, in Iowa, Florida, Ohio, and Indiana in particular, that should give us some pause. Given that polls can miss for a variety of reasons that we sometimes can’t know until after the election, it isn’t unreasonable to think that could happen again this year. To predict a Trump win, though, you need to use a non-conventional methodology, because all the conventional methodologies of aggregated polling have Biden leading heavily. One would need to use national mood, enthusiasm, anecdotal evidence, and gut instincts to guess how late undecided voters will break and to predict turnout, which polls simply cannot do. The common theme of that evidence, of course, is that it’s qualitative. That does not mean it is useless; it just means you need to take it with a grain of salt, recognize that it isn’t quantitative data, and, in my view, not weight it nearly as highly.
Polls can only tell you so much; then comes experience. Gut instincts (the ability to read and synthesize national trends, mood, voter tendencies, and other non-data-specific factors) are as important in politics as they are in business or everyday life. Life isn’t math, and almost no jobs are strictly quantitative. Neither is politics. Like anything, it takes experience to be good at it. The more you read, the more you study trends and history, the more you listen to a variety of opinions, and the more objective and unbiased you can be, the better the judgement you can make. But walking that line is difficult. It takes a combination of data and gut. It’s part science and part art. Everyone needs to make their own judgement on what combination they think is best, and they’ll often need to reset, but I venture it’s 85% science and 15% art. It’s not enough to look at a poll that has a candidate with a 1% lead and expect her to win. But if you see an aggregated, well-respected poll that has Biden up by 12% and you think Trump is going to win that state, you’re simply being biased. There is no statistically significant data over time that supports that claim, and you must be relying way too heavily on qualitative information. Analyzing polls and elections generally takes nuance, and that’s what makes it frustrating, difficult, and fun.
Example Application: North Carolina
Polling data says Trump is vulnerable in North Carolina, maybe even down by a point or two. Data also says North Carolina has voted Republican in every presidential race since 1980, save one, in 2008 for Barack Obama. Data says North Carolina has a close Senate race, too, and that can play a factor. Data says that North Carolina’s voting patterns are more correlated with the Midwest’s than with Florida’s or Georgia’s. And there are countless more North Carolina demographic trends and shifts that are useful to know. But with a race this close, that won’t always help you figure out who is going to win. Then there’s the anecdotal evidence: I’ve been to an urban center and one of the most densely populated areas in North Carolina in recent weeks, and I talked with some voters; some loved Trump, some didn’t like Trump’s style, but no one was an outspoken Biden supporter. I saw 100 Trump signs and not a single Biden sign. That’s a little odd to me. I got a vibe of the state. Obviously, I would never let yard signs, a vibe, or any anecdotal evidence for that matter be the foundation of how I make my decision, but I can’t ignore it; it doesn’t feel like Joe Biden is going to win North Carolina. So, you look at the national trend, early voting information, changing demographic information, the historical perspective and, yes, a few polls, and make your best assessment. My quantitative analysis, combined with a small amount of qualitative analysis, says Trump wins North Carolina by a point or two, but I would never want to bet on it.
Nate Silver also warned about a number of factors that lead to a misunderstanding of American politics, which are complex. He talks about “the real shortcomings in how American politics are covered, including pervasive groupthink among media elites, an unhealthy obsession with the insider’s view of politics, a lack of analytical rigor, a failure to appreciate uncertainty, a sluggishness to self-correct when new evidence contradicts pre-existing beliefs, and a narrow viewpoint that lacks perspective from the longer arc of American history.” While he is tackling a much larger issue that will take a lifetime of work to analyze, understanding that a poll isn’t predictive can go a long way in making life a lot easier for anyone who cares about elections.
Flaws in Polling
There are inherent flaws in polling that can’t be avoided. Polls will never be 100% accurate. First, you need to look at a small sample, in many cases 700 to 1,200 voters, and draw a larger conclusion based on what you think that universe will do on election day. That means polls must try to predict voter turnout, which is impossible. These polls won’t catch things like voter enthusiasm, the hidden voter who couldn’t be reached or wouldn’t take the poll, and other quasi-qualitative factors that are hard to capture with data. You need to assume everyone tells the pollsters the truth and that, after speaking with them, they will actually turn out to vote. Then you need to take that limited information and make it sensible across the board. Mapping a small universe to a larger one is difficult. When pollsters inevitably can’t reach all types of voters, they run into sampling issues and thus need to weight their polls to adjust for underrepresented populations, which is another educated guess.
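To put a number on what a sample of 700 to 1,200 voters can support, here is a minimal sketch of the textbook 95% sampling margin of error. It assumes a simple random sample and a candidate near 50%, which real polls only approximate, and it covers sampling error alone; turnout, non-response, and weighting problems come on top of it. The function name and defaults are illustrative.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% sampling margin of error for one candidate's share."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


for n in (700, 1200):
    print(f"n={n}: +/- {100 * margin_of_error(n):.1f} points")
# n=700: +/- 3.7 points
# n=1200: +/- 2.8 points
```

Note that this is the margin on a single candidate’s share. The uncertainty on the lead between two candidates is larger, roughly double in a close two-way race, which is one more reason a 2- or 3-point lead in a single poll says little on its own.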
In 2016, RCP averages nailed Arizona, Colorado, Florida, Georgia, Maine, New Hampshire, and Virginia, and they were reasonably close in Michigan, North Carolina, New Mexico, Nevada, and Pennsylvania. That isn’t meant to give pollsters a pass for their failures in Wisconsin, Iowa, Minnesota, or Ohio; it’s just to point out that right before the election, Trump was just a normal polling error behind Clinton, and most experts gave him no chance. They looked at a 2% Clinton advantage in Pennsylvania and thought it was a lock. Understanding elections holistically is really tough, but that’s the art, not the science. Sometimes, it can make dumb people look really smart and smart people look really dumb, and that’s called luck and life. Don’t be afraid to make predictions, but take the long view when you do it and curb as much of your bias as possible. Like aggregated polls, over time it will serve you better.
A final point worth reemphasizing: as we look at 2020 RCP averages, one shouldn’t forget what can happen within the universe of normal polling error. Biden leads the top battleground states by an average of only 3.9%, and only one lead is above 5.5% (Michigan at 8.1%). No one is seriously arguing about who is in the stronger position at this point, and if Trump were being honest with himself, he would change places with Biden in a heartbeat. However, that doesn’t mean Trump can’t win. If correlated state polling errors happen in 2020 like they did in 2016, Trump could win a second term. If they don’t, as they haven’t in many other presidential years, Biden could win in a landslide. The margins are small. FiveThirtyEight has given Biden a 75%, then 80%, and, recently, an 85% chance of winning this election, but as Real Clear Politics’ Sean Trende reminds us: “remember, even at an aggregated level, polls are off, and when FiveThirtyEight tells you there’s a twenty-per-cent chance of something happening, you don’t get to round that down to zero.”
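The arithmetic behind that warning is short enough to sketch. The leads and 2016 miss sizes below are the ones quoted in this article. The assumption that a miss runs entirely against the poll leader is mine, and it is the worst case for the leader rather than a forecast.

```python
def lead_after_miss(poll_lead, miss_against_leader):
    """Leader's margin if the polls overstated it by the given miss."""
    return poll_lead - miss_against_leader


# Leads quoted in this article, in points.
leads = {"battleground average": 3.9, "Michigan": 8.1}

# Miss sizes from 2016 as described in this article; assumed to run against the leader.
misses = {"2016 battleground average": 2.7, "Wisconsin 2016": 6.5 + 0.7}

for place, lead in leads.items():
    for label, miss in misses.items():
        remaining = lead_after_miss(lead, miss)
        verdict = "leader holds" if remaining > 0 else "leader loses"
        print(f"{place} +{lead} with a {label} miss: {remaining:+.1f} ({verdict})")
```

An average-sized miss leaves a 3.9-point lead standing, while a Wisconsin-sized miss erases it; only a lead like Michigan’s survives both.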
----------------------------------------------------
Note on aggregate polls: There’s more to it than just taking averages, but that’s the simple explanation. RCP and FiveThirtyEight do this a little differently. FiveThirtyEight uses aggregated polls to create forecasts and probabilities of likely outcomes in upcoming elections. RCP was the first major website to aggregate polls for political elections, but it doesn’t focus as much on forecasts.