How Data Driven do you really have to be?

Nicole Weaver
Sep 4, 2020

Updated: Oct 12, 2020

Data "driven" or data "informed"? What place does the human have in making decisions, now that we have data?

“You HAVE to be data driven.” How often have you heard that in the past couple of years?

Ever since Michael Lewis’ 2003 book ‘Moneyball: The Art of Winning an Unfair Game’ and the 2011 movie adaptation starring Brad Pitt, even the least versed of us can vividly understand how data can offer a massive business advantage over pure instinct or traditional methods of making decisions.

If you haven’t read/seen this story, it’s a must for anyone interested in data and its ability to shed light on how the real world works.

In the story (based on true events), Billy Beane, general manager of the underfunded Oakland A’s, starts using data patterns to build a team that would go on to win more games and ultimately reach the playoffs, despite having a much reduced budget for player talent. In the movie, Billy clashes with his scouts over which players to draft and which to trade. The scouts trusted their experience and instinct, claiming they “know” what talent “looks” like. Beane, with his trusty sidekick, Harvard-educated statistician Paul DePodesta, chose instead to look to historical data and statistical models for these decisions.

Although undoubtedly played up in the movie for dramatic effect, this conflict is likely happening in conference rooms across the world, as organizations grapple with the challenge of changing their culture to regularly use data when making decisions. To the person who formerly made decisions based on their gut or professional experience, the term “data driven” can imply that their skill and expertise have been replaced with a dashboard or a machine learning prediction.

So how data driven SHOULD we be?

I would argue that we need to be data “informed” but we should stop short of blindly following the direction the data points us in. Why?

Firstly, most data is messy and dirty, and it’s not a given that the super shiny chart put out by the data team is 100% accurate or complete. So if a result or recommendation doesn’t seem quite right, there should be a forum to question and investigate it. The human brain is an excellent model builder, and its judgment should be a valued part of the input gathered. Questioning and examining data and model predictions should be a healthy part of the ongoing dialog.

Secondly, data is necessarily historical. There simply is no data about the future, only modeled predictions based on data that reflects the past. So if the future behaves pretty much like the past, the models are in a good position to perform well. However, if there are major shifts in the environment or landscape (such as a new competitor, new regulation, or a change in economic conditions), predictions based on past behavior and results may not hold up. Now, there are certainly models that build in scenarios based on changes to future conditions, but these scenarios rest on assumptions and so are still somewhat speculative.

Thirdly, the models are not pure. By that I mean that all algorithms and models contain some element of human guidance (and therefore bias), based on the assumptions of the model builder. Assumptions can be inserted into models in many ways, both explicit and accidental. For example, the choice of what data to provide to an ML model is an assumption about what is relevant and useful in predicting a future behavior. The choice of what to do with missing data likewise contains an assumption about the relevance of different elements of the data.
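To make the missing-data point concrete, here is a minimal sketch in Python. The spend figures and the names `monthly_spend` and `fill_missing` are invented for illustration and are not from any real data set. The same six records give three different "average spend" numbers depending on which assumption the analyst makes about the gaps.

```python
from statistics import mean, median

# Hypothetical monthly spend per customer; None marks a missing value.
monthly_spend = [120.0, None, 80.0, 200.0, None, 100.0]

# Values we actually observed.
observed = [x for x in monthly_spend if x is not None]


def fill_missing(values, fill_value):
    """Return a copy of values with every None replaced by fill_value."""
    return [fill_value if x is None else x for x in values]


# Assumption 1: customers with missing data look like everyone else,
# so we can simply drop them.
avg_dropped = mean(observed)

# Assumption 2: no record means no spend, so missing is really zero.
avg_zero = mean(fill_missing(monthly_spend, 0.0))

# Assumption 3: a missing customer is a "typical" customer,
# so we fill the gap with the median of what we observed.
avg_median = mean(fill_missing(monthly_spend, median(observed)))

print(f"drop missing:     {avg_dropped:.2f}")
print(f"fill with zero:   {avg_zero:.2f}")
print(f"fill with median: {avg_median:.2f}")
```

None of the three is "the" right answer. Each one quietly encodes a belief about why the data is missing, and that belief ends up baked into whatever chart or model is built on top.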

So, following a practice of being data DRIVEN (by which I mean the data primarily determines the decision) can lead to inaccuracies, narrower consideration and poorer decisions.

On the positive side, data INFORMED decision making, where data is routinely sought and examined before making a decision, has been shown to lead to consistently improved results.

One reason for this is that it removes the subjectivity that is in all of us. When deciding between two logos, it might be tempting to select the one most appealing to the VP of Marketing or the CEO, but that assumes that those people’s taste accurately reflects the target audience. A data informed approach would be to seek data (from focus groups or A/B testing) to see which one performs better with the target audience.
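As a rough illustration of the A/B testing route, the sketch below compares click-through rates for two logos using a standard two-proportion z-test. The click and view counts are made up, and the function name `two_proportion_z_test` is my own; a real test would also need a sample size and significance threshold agreed in advance.

```python
from math import erf, sqrt


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-sided z-test for a difference between two click-through rates."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both logos perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF expressed with the error function.
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return z, p_value


# Hypothetical results: each logo was shown to 2,000 visitors.
z, p_value = two_proportion_z_test(
    clicks_a=100, views_a=2000,  # logo A
    clicks_b=140, views_b=2000,  # logo B
)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference between the logos is unlikely to be chance alone. It still leaves the human questions open: was the test audience really the target audience, and is a click the outcome we care about?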

Another reason is that while humans are generally good at processing small amounts of data and making mental models (which we refer to as our “instinct” or “gut”), when the data set is large, humans lose the ability to see all of it. Faced with tens of thousands of records which contain user behaviors, we are simply not equipped to digest it all.

Finally, and this is really exciting to me, timely analytics can detect and demonstrate changing trends well before a human brain would notice them, allowing organizations to adapt and respond much sooner.

So, I think the sweet spot can be summed up in this quote from Anand Rao, in an article from PwC:

“Artificial intelligence can help people make faster, better, and cheaper decisions. For that to happen, first and foremost, you need an openness of mind to collaborate with the machine, as opposed to treating the technology as either a servant or an overlord.”

More opinions of data driven vs data informed can be found here:

https://www.interana.com/blog/data-intuition-go-hand-hand

https://segment.com/resources/data-strategy/data-driven-vs-data-informed/

https://online.hbs.edu/blog/post/data-driven-decision-making

https://www.tradegecko.com/blog/small-business-growth/data-driven-vs-data-informed

https://www.edq.com/globalassets/tip-sheet/3-roadblocks-to-building-data-driven-business.pdf

#machinelearning #datadecisions
