Business Intelligence Analyst or Data Scientist? What’s the Difference?

I am a huge Thomas Davenport fan. His book “Competing on Analytics: The New Science of Winning” was the first book to make organizations aware of the business potential of analytics, even prior to the craziness brought on by Big Data. I happened upon a recent article of his titled “Looking Outward with Big Data: A Q&A with Tom Davenport” and one item from that article really jumped out at me:

“Initially, I didn’t see much of a distinction [between business analytics and big data], and I thought that I could kind of rest on my laurels and not write a book about big data—because the fact is that the analytical tools and approaches used are not all that different for big data. But when I started talking to companies and data scientists, I realized that there really were some fairly substantial differences—some that have yet to be fully articulated and some that are already in evidence.”

Understanding the Differences

There are significant differences between a Business Intelligence (BI) analyst and a Data Scientist, but many folks are still confused. I recently received the following email from a follower (Felix) of my blog series that highlights some of the challenges that organizations are wrestling with on this difference in definitions.

Dear Mr. Schmarzo,

I recently came across your January 9th blog post entitled “Business Analytics: Moving From Descriptive To Predictive Analytics.”

Our IT department disagrees on the capabilities of OLAP cubes. To me, a cube does not appear useful for parameterized models or most types of scenario analysis. (I am trained in statistics and other forms of financial modeling.) I showed Figure 1 of your blog to my colleagues, but was told that I do not understand OLAP technology.

Felix’s dilemma is typical of what I see in organizations that have spent considerable time and money building out their Business Intelligence capabilities. To me, the situation is similar to a construction worker discovering the “saw.” That doesn’t mean the hammer is no longer important; the saw and the hammer perform entirely different but complementary tasks.

Here was my response to Felix:

Hey Felix, push back on your IT department. There is nothing predictive in OLAP cubes. Cubes are great for slicing and dicing historical data looking for areas of under- and over-performance in the past week/month/quarter, but they don’t answer any of the questions about the future such as: What will sales be for Product X next month? How many customers do we expect to respond to the XYZ promotion? What is the likelihood that wind turbine A101 will fail within the next 30 days?

Answering questions about the future requires developing predictive models and getting results that are qualified by probabilities and confidence levels. The key difference is that a BI Analyst uses OLAP cubes and other BI tools to report on what happened in the past, while a data scientist uses predictive and prescriptive tools to forecast what might happen in the future.
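The contrast between reporting on the past and forecasting the future can be sketched in a few lines of Python. This is only an illustration: the `sales` data, the product names, and the helper functions `totals_by_product` and `linear_forecast` are all hypothetical, and a straight-line trend stands in for whatever predictive model a data scientist would actually choose.

```python
from collections import defaultdict

# Hypothetical monthly unit sales: (month_index, product, units).
sales = [
    (1, "X", 100), (2, "X", 110), (3, "X", 120), (4, "X", 130),
    (1, "Y", 80), (2, "Y", 75), (3, "Y", 70), (4, "Y", 65),
]


def totals_by_product(rows):
    """Descriptive: aggregate what already happened (a cube-style slice)."""
    totals = defaultdict(int)
    for _month, product, units in rows:
        totals[product] += units
    return dict(totals)


def linear_forecast(rows, product, next_month):
    """Predictive: fit a least-squares line to one product and extrapolate."""
    points = [(m, u) for m, p, u in rows if p == product]
    n = len(points)
    mean_x = sum(m for m, _ in points) / n
    mean_y = sum(u for _, u in points) / n
    sxx = sum((m - mean_x) ** 2 for m, _ in points)
    sxy = sum((m - mean_x) * (u - mean_y) for m, u in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * next_month


# The BI-style question: what did we sell?
print(totals_by_product(sales))
# The data-science-style question: what might we sell next month?
print(linear_forecast(sales, "X", 5))
```

The first function can only summarize rows that already exist; the second produces a number for a month that has not happened yet, which is the kind of answer a cube alone does not provide.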

There is a significant difference between what a traditional BI Analyst does and what a Data Scientist does. And one does NOT replace the other; they are complementary. Figure 1 does a nice job of summarizing the differences and how these two critical roles play off of each other.

Figure 1: Differences Between BI Analyst and Data Scientist

Data Science is different from traditional Business Analytics in some key areas. For example, data science…

  • uses predictive and prescriptive analytics to predict what might happen using probabilities and confidence levels, not just reporting tools to report on what did happen.
    • Note: when we’re dealing with historical data, there is a strong desire and need for the data to be 100% accurate. If you have your financial results wrong for the past quarter, folks are likely to go to jail. However, a prediction of next quarter’s performance is usually expressed in probabilities and confidence levels (e.g., “There is a 95% confidence that our revenues next quarter will come in between $200M and $212M”).
  • is used for dealing with and mitigating the uncertainty in the data. It uses several analytic and visualization techniques to understand where uncertainty may lie in the data, and then uses data transformation techniques to massage the data into a workable form. The result is not perfect, but perfection is not necessary when dealing with probabilities rather than absolutes.
  • is able to create as-needed data transformations (versus the traditional ETL process) to put the data into a format so that it can be combined with other data sources in search of insights about customers, products and operations.
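To make the idea of a forecast qualified by a confidence level concrete, here is a minimal sketch of how a revenue range like the one in the note above might be produced. Everything in it is an assumption for illustration: the `past_errors` values are made up, `prediction_interval` is a hypothetical helper, and treating forecast errors as roughly normal is a simplification that a real model would need to check.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical past forecast errors (actual minus forecast), in $M.
past_errors = [-4.0, 2.5, -1.5, 3.0, 0.5, -2.0, 1.5]


def prediction_interval(point_forecast, errors, confidence=0.95):
    """Return (low, high), assuming past errors are roughly normal."""
    # Two-sided z-value for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Shift the forecast by the average historical error (bias).
    center = point_forecast + mean(errors)
    spread = z * stdev(errors)
    return center - spread, center + spread


# Hypothetical point forecast of $206M for next quarter.
low, high = prediction_interval(206.0, past_errors)
print(f"95% interval: ${low:.1f}M to ${high:.1f}M")
```

Asking for a higher confidence level widens the range, which is the trade-off behind statements like "95% confidence": the answer is a band with a stated likelihood, not a single audited number.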

To quote our own Jeffrey Abbott of EMC Global Services Marketing,

“The disconnect is that with BI, people take the historical data and extend the trend lines and factor in cyclical factors. It’s slow, manual, and needs to be rebuilt each month/quarter/year. But with data science, we have the ability to automatically build the predictive apps that actively look for certain combinations of data and trigger a prediction of the future. It’s real time, re-usable, continuous, and automated.”


Summary

A recent blog, “Data Science: The More Data, the Better,” talks about how Federal Reserve Chair Janet Yellen uses a dashboard of job data that doesn’t rely upon a single measure (the unemployment rate) to make economic and labor policy decisions. Instead, she uses a dozen different measures to provide a more holistic, more accurate, and hopefully more actionable view of the United States economic situation. She’s a data scientist at heart who realizes that a single measure of anything complex (whether it’s the U.S. economy or even things like customer satisfaction and predictive maintenance) oversimplifies it to the point of not being useful or actionable.

I have written several blogs trying to highlight the differences between a traditional business analyst and a data scientist, some of which I have listed below. Enjoy!

Bill Schmarzo

About the Author: Bill Schmarzo

Bill Schmarzo is the Customer Advocate for Data Management Innovation at Dell Technologies. He is currently part of Dell Technologies’ core data management leadership team, where he is responsible for spearheading customer co-creation engagement to identify and prioritize the customers' key data management, data science, and data monetization requirements. Bill is the former Chief Innovation Officer at Hitachi Vantara where he was responsible for driving Hitachi Vantara’s Data Science and “co-creation” efforts. Bill also has served as CTO at Dell EMC where he formulated the company’s Big Data Practice strategy, identified target markets, developed solution frameworks, and led Analytics client engagements. As the VP of Analytics at Yahoo, Bill delivered the analytics tools and applications that optimized customers’ online marketing spend. Bill is the author of four books and is currently an Adjunct Professor at Menlo College, an Honorary Professor at the University of Ireland – Galway, and an Executive Fellow at the University of San Francisco, School of Management. Bill holds a Master of Business Administration from University of Iowa and a Bachelor of Science degree in Mathematics, Computer Science and Business Administration from Coe College.