Data Science / Predictive Analytics / Artificial Intelligence Is Largely About Extrapolation
At its heart, predictive analytics is about making educated guesses about the future based on what you know about the past. The field therefore assumes that the future will work the way the past did. But what if the future doesn’t look anything like the past? You might try to adjust your formulae using analogs, but some things are so different today that the results will be imperfect at best. For example, is our current recession analogous to the 2008 recession? Well, they’re both recessions, but in 2008 people weren’t largely homebound. So is it really analogous?
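To make the extrapolation point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the numbers are synthetic, and the names fit_line and forecast are made up for this example. It fits a straight line to a steady past, then scores the forecast against a future that behaves like the past and one that doesn’t.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic "past": ten periods of steady growth (illustrative numbers only).
past_x = list(range(10))
past_y = [100 + 2 * x for x in past_x]
slope, intercept = fit_line(past_x, past_y)


def forecast(x):
    """Extrapolate the fitted trend to period x."""
    return intercept + slope * x


future_x = [10, 11, 12]

# Scenario A: the future keeps following the old pattern.
future_same = [100 + 2 * x for x in future_x]

# Scenario B: a hypothetical shock; the old pattern no longer applies.
future_shock = [60.0, 58.0, 61.0]

errors_same = [abs(forecast(x) - y) for x, y in zip(future_x, future_same)]
errors_shock = [abs(forecast(x) - y) for x, y in zip(future_x, future_shock)]

print("errors when the future resembles the past:", errors_same)
print("errors after the shock:", errors_shock)
```

The model itself is identical in both scenarios. Only the world changed, and the fitted line has no way of knowing that.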
In a funny way, today’s situation is like that of the Not Hotdog app on the show Silicon Valley. There, the app worked when it saw something it recognized, a hot dog, but it didn’t know what to do with anything else.
Today we’re in a unique situation, and predictive analytics isn’t great with unique. For now, smart companies will use their analytics as baselines while adjusting policies and structures so they can react quickly should the underlying assumptions prove incorrect.
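One way to picture "use the forecast as a baseline, but react quickly" is a simple tripwire on forecast error. The sketch below is only an assumption about how that might look: the function names, the 15% tolerance, and the two-period patience are all hypothetical choices, and the numbers are invented.

```python
def relative_errors(forecasts, actuals):
    """Absolute miss as a fraction of each forecast (forecasts assumed nonzero)."""
    return [abs(f - a) / abs(f) for f, a in zip(forecasts, actuals)]


def baseline_broken(forecasts, actuals, tolerance=0.15, patience=2):
    """Return True when the most recent `patience` periods all miss by more
    than `tolerance`, a hint that the underlying assumptions have shifted."""
    errs = relative_errors(forecasts, actuals)
    if len(errs) < patience:
        return False
    return all(e > tolerance for e in errs[-patience:])


# Invented numbers: the baseline forecast is 100 units per period.
baseline = [100, 100, 100]

normal_times = [98, 103, 101]   # small misses, keep trusting the baseline
after_shock = [98, 70, 65]      # two large misses in a row

print(baseline_broken(baseline, normal_times))  # False
print(baseline_broken(baseline, after_shock))   # True
```

Requiring consecutive misses is a design choice: it keeps one noisy period from triggering a policy change, at the cost of reacting one period later.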
Not All Data Science Applications Are Affected
It’s important to note that this issue doesn’t affect all applications. The clearest examples are applications not built on economic or healthcare data. Image recognition, for example, still works. (Let me know if you’ve had trouble unlocking your iPhone with Face ID and I’ll modify this section accordingly.)
We Need to Expand Our Definition of Dirty Data
One more point is important here. We teach that if your data is bad or dirty, your results will be wrong (it’s still a GIGO world, right?). Given what we’re seeing today, perhaps we need to expand our concept of dirty data. Perhaps, rather than counting only data that is incorrect, we should also include data that is not representative of the future.
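If we did treat "not representative" as a kind of dirty data, a first rough check might compare recent observations with the data a model was trained on. The sketch below is one assumed approach, not a standard test: drift_score and the cutoff of 3 are hypothetical, and the values are made up.

```python
import statistics


def drift_score(training, recent):
    """How far the recent mean sits from the training mean, measured in
    training standard deviations. A crude, assumed measure of shift."""
    train_mean = statistics.mean(training)
    train_sd = statistics.stdev(training)
    return abs(statistics.mean(recent) - train_mean) / train_sd


def looks_unrepresentative(training, recent, cutoff=3.0):
    """Flag recent data that sits far outside what training data looked like."""
    return drift_score(training, recent) > cutoff


# Invented values for illustration.
training = [10, 11, 9, 10, 12, 10, 9, 11]
recent_similar = [10, 11, 10]
recent_shifted = [3, 2, 4]

print(looks_unrepresentative(training, recent_similar))  # False
print(looks_unrepresentative(training, recent_shifted))  # True
```

Note that every value here could be perfectly accurate. The flag fires because the data no longer describes the world the model will be asked about, not because anything was recorded wrong.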