Before our recent presidential election, many polls showed Joe Biden winning (for those not in the US, we recently had an election, although I suspect you know that). And he did, in fact, win. But pollsters are taking a lot of heat for overstating the magnitude of that win and for expecting Biden to be more competitive in some states, like Florida, where the result wasn’t especially close. Will polls ever work? Are we doomed to see this happen in every future election? I’m not sure that we are, but the situation does point to a need to rethink the concept of polling vs. the concept of forecasting.
Polling is largely a data science task. It involves gathering a sample of data and extrapolating predictions from that sample. However, this approach really only works when your sample is truly a random representation of the population (as an aside, one interview question I ask data science job applicants is about sampling bias – you’d be surprised at how many “data scientists” don’t know anything about the importance of random sampling).
But let’s think about “random samples.” Do you ever get contacted by pollsters? Do you respond? For a variety of reasons, I don’t. And I suspect that the majority of people contacted don’t. So, are the samples of voting intention really “random”? Perhaps they’re a random sample of people willing to answer a pollster’s call, but certainly not a random sample of the overall voting population. And in today’s world of spam, spoofing, phishing, and cloaked intentions, people willing to answer a pollster’s call are likely NOT representative of all voters.
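To make the nonresponse problem concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than real polling data: the function names, the 50/50 electorate, and the response rates (10% for supporters of candidate A, 8% for supporters of candidate B) are all made up. It simply shows that when willingness to answer differs between groups, even a perfectly random list of contacts yields a skewed poll.

```python
import random


def expected_poll_share(true_support_a, response_rate_a, response_rate_b):
    """Share of respondents backing A, given differing response rates.

    All inputs are hypothetical probabilities between 0 and 1.
    """
    responders_a = true_support_a * response_rate_a
    responders_b = (1 - true_support_a) * response_rate_b
    return responders_a / (responders_a + responders_b)


def simulate_poll(true_support_a, response_rate_a, response_rate_b,
                  contacts, seed=0):
    """Contact voters at random and tally only those who choose to respond."""
    rng = random.Random(seed)
    responses_a = 0
    responses_b = 0
    for _ in range(contacts):
        # The contact list itself is a fair random draw from the electorate.
        supports_a = rng.random() < true_support_a
        # Whether the person answers depends on which group they belong to.
        response_rate = response_rate_a if supports_a else response_rate_b
        if rng.random() < response_rate:
            if supports_a:
                responses_a += 1
            else:
                responses_b += 1
    total = responses_a + responses_b
    return responses_a / total if total else float("nan")


if __name__ == "__main__":
    # Assumed scenario: a dead-even race where A's supporters answer more often.
    print("true support for A:", 0.50)
    print("expected poll share:", round(expected_poll_share(0.50, 0.10, 0.08), 3))
    print("simulated poll share:", round(simulate_poll(0.50, 0.10, 0.08, 200_000), 3))
```

Under these assumed numbers, the expected poll share for A is about 0.556 even though the race is tied, so a two-point gap in willingness to respond turns into a lead of more than five points. Contacting more people does not help: the simulated estimate just converges more tightly on the wrong answer.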
However, if predicting election outcomes is important (and, for many reasons, it is), there may be better approaches. Perhaps we should be taking cues from defense and security analysts. They can’t reliably ask representatives of another country, “Are you planning to invade us next month?” but they can look for clues that would indicate such a possibility.
If an enemy country is massing troops along your border, it’s well worth considering that they might be planning an invasion. If the executives of a company seem to be selling their stock in that company, it’s worth considering whether that company is about to deliver bad news. If you start to feel chest pains whenever you take a walk, it’s probably worth considering whether you’re headed toward a heart attack.
In 2016, many pollsters expected Trump to lose. However, I remember reports of large numbers of Trump yard signs in various places. A few people (I was not among them), let’s call them forecasters, drew the right conclusions.
Looking for non-polling indicators of some future event is not precise but, at this point, is polling? In the end, perhaps we shouldn’t be talking about political polling but, instead, political forecasting. Such forecasting would use a melange of tools, combining traditional polling with inductive and related data science techniques.