Business Intelligence Analyst or Data Scientist? What’s the Difference?

I am a huge Thomas Davenport fan. His book “Competing on Analytics: The New Science of Winning” was the first book to make organizations aware of the business potential of analytics, even prior to the craziness brought on by Big Data. I happened upon a recent article of his titled “Looking Outward with Big Data: A Q&A with Tom Davenport” and one item from that article really jumped out at me:

“Initially, I didn’t see much of a distinction [between business analytics and big data], and I thought that I could kind of rest on my laurels and not write a book about big data—because the fact is that the analytical tools and approaches used are not all that different for big data. But when I started talking to companies and data scientists, I realized that there really were some fairly substantial differences—some that have yet to be fully articulated and some that are already in evidence.”

Understanding the Differences

There are significant differences between a Business Intelligence (BI) analyst and a Data Scientist, but many folks are still confused. I recently received the following email from Felix, a follower of my blog series, that highlights some of the challenges organizations wrestle with in telling the two roles apart.

Dear Mr. Schmarzo,

I recently came across your January 9th blog post entitled “Business Analytics: Moving From Descriptive To Predictive Analytics.”

Our IT department disagrees on the capabilities of OLAP cubes. To me, a cube does not appear useful for parameterized models or most types of scenario analysis. (I am trained in statistics and other forms of financial modeling.) I showed Figure 1 of your blog to my colleagues, but was told that I do not understand OLAP technology.

Felix’s dilemma is typical of what I see in organizations that have spent considerable time and money building out their Business Intelligence capabilities. To me, the situation is similar to the construction worker discovering the “saw.” That doesn’t mean the hammer is no longer important; the saw and the hammer perform entirely different but complementary tasks.

Here was my response to Felix:

Hey Felix, push back on your IT department. There is nothing predictive in OLAP cubes. Cubes are great for slicing and dicing historical data looking for areas of under- and over-performance in the past week/month/quarter, but they don’t answer any of the questions about the future such as: What will sales be for Product X next month? How many customers do we expect to respond to the XYZ promotion? What is the likelihood that wind turbine A101 will fail within the next 30 days?

Answering questions about the future requires developing predictive models and getting results that are qualified by probabilities and confidence levels. The key difference is that a BI Analyst uses OLAP cubes and other BI tools to report on what happened in the past, while a data scientist uses predictive and prescriptive tools to forecast what might happen in the future.

There is a significant difference between what a traditional BI Analyst does and what a Data Scientist does. And one does NOT replace the other; they are complementary. Figure 1 does a nice job of summarizing the differences and how these two critical roles play off of each other.

Figure 1: Differences Between BI Analyst and Data Scientist
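
The contrast between reporting on the past and forecasting the future can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sales records, the region names, and the function names rollup_by_region and forecast_next_month are all hypothetical, and a straight-line trend is just one of the simplest possible stand-ins for a predictive model.

```python
from collections import defaultdict

# Hypothetical monthly sales records: (month_index, region, units_sold).
sales = [
    (1, "East", 100), (1, "West", 80),
    (2, "East", 110), (2, "West", 90),
    (3, "East", 120), (3, "West", 100),
    (4, "East", 130), (4, "West", 110),
]


def rollup_by_region(records):
    """BI-style question (what happened?): total historical units per region."""
    totals = defaultdict(int)
    for _month, region, units in records:
        totals[region] += units
    return dict(totals)


def forecast_next_month(records):
    """Data-science-style question (what might happen?).

    Fits a least-squares line to the monthly totals and extrapolates
    one month past the last observed month. Returns a point estimate only.
    """
    monthly = defaultdict(int)
    for month, _region, units in records:
        monthly[month] += units

    xs = sorted(monthly)
    ys = [monthly[m] for m in xs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Ordinary least-squares slope and intercept.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    return intercept + slope * (xs[-1] + 1)


print(rollup_by_region(sales))     # {'East': 460, 'West': 380}
print(forecast_next_month(sales))  # 260.0
```

The roll-up is exact because it only summarizes data that already exists. The forecast is an estimate, and a real model would report it together with a measure of its uncertainty.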

Data Science differs from traditional Business Analytics in some key areas. For example, data science…

  • uses predictive and prescriptive analytics to predict what might happen using probabilities and confidence levels, not just reporting tools to report on what did happen.
    • Note: when we’re dealing with historical data, there is a strong desire and need for the data to be 100% accurate. If you have your financial results wrong for the past quarter, folks are likely to go to jail. However, predicted performance for the next quarter is usually expressed in probabilities and confidence levels (e.g., “There is 95% confidence that our revenues next quarter will come in between $200M and $212M”).
  • is used for dealing with and mitigating the uncertainty in the data. It uses several analytic and visualization techniques to understand where uncertainty may lie in the data, and then uses data transformation techniques to massage the data into a workable form. The result is not perfect, but perfection is not necessary when dealing with probabilities rather than absolutes.
  • is able to create as-needed data transformations (versus the traditional ETL process) to put the data into a format so that it can be combined with other data sources in search of insights about customers, products and operations.
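
To make the idea of a forecast qualified by a confidence level concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the point forecast of $206M, the list of past forecast errors, and the function name prediction_interval are hypothetical, and the interval relies on a normal approximation of the forecast errors (with so few data points, a real analysis would likely use a wider, t-based interval).

```python
from statistics import NormalDist, stdev


def prediction_interval(point_forecast, past_errors, confidence=0.95):
    """Return (low, high) bounds around a point forecast.

    Assumes past forecast errors are roughly normal with mean zero,
    which is a simplifying assumption and not a general guarantee.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    spread = z * stdev(past_errors)
    return point_forecast - spread, point_forecast + spread


# Hypothetical errors (actual minus forecast, in $M) from earlier quarters.
past_errors_musd = [-4.0, 2.5, -1.0, 3.5, -2.0, 1.0]

low, high = prediction_interval(206.0, past_errors_musd)
print(f"95% interval for next quarter: ${low:.1f}M to ${high:.1f}M")
```

Raising the confidence level widens the interval and lowering it narrows the interval, which is the trade-off that a single historical figure in a report never has to express.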

To quote our own Jeffrey Abbott of EMC Global Services Marketing,

“The disconnect is that with BI, people take the historical data and extend the trend lines and factor in cyclical factors. It’s slow, manual, and needs to be rebuilt each month/quarter/year. But with data science, we have the ability to automatically build the predictive apps that actively look for certain combinations of data and trigger a prediction of the future. It’s real time, re-usable, continuous, and automated.”


Summary

A recent blog, “Data Science: The More Data, the Better,” talks about how Federal Reserve Chair Janet Yellen uses a dashboard of job data that doesn’t rely upon a single measure (the unemployment rate) to make economic and labor policy decisions. Instead, she uses a dozen different measures to provide a more holistic, more accurate, and hopefully more actionable view of the United States economic situation. She’s a data scientist at heart who realizes that a single measure of anything complex, whether it’s the U.S. economy or even things like customer satisfaction and predictive maintenance, oversimplifies it to the point of not being useful or actionable.

I have written several blogs trying to highlight the differences between a traditional business analyst and a data scientist, some of which I have listed below. Enjoy!

About the Author: Bill Schmarzo

Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business” and “Big Data MBA: Driving Business Strategies with Data Science”, is responsible for setting strategy and defining the Big Data service offerings for Dell EMC’s Big Data Practice. As a CTO within Dell EMC’s 2,000+ person consulting organization, he works with organizations to identify where and how to start their big data journeys. He’s written white papers, is an avid blogger and is a frequent speaker on the use of Big Data and data science to power an organization’s key business initiatives. He is a University of San Francisco School of Management (SOM) Executive Fellow where he teaches the “Big Data MBA” course. Bill also just completed a research paper on “Determining The Economic Value of Data”. Onalytica recently ranked Bill as #4 Big Data Influencer worldwide. Bill has over three decades of experience in data warehousing, BI and analytics. Bill authored the Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements. Bill serves on the City of San Jose’s Technology Innovation Board, and on the faculties of The Data Warehouse Institute and Strata. Previously, Bill was vice president of Analytics at Yahoo where he was responsible for the development of Yahoo’s Advertiser and Website analytics products, including the delivery of “actionable insights” through a holistic user experience. Before that, Bill oversaw the Analytic Applications business unit at Business Objects, including the development, marketing and sales of their industry-defining analytic applications. Bill holds a Master of Business Administration from the University of Iowa and a Bachelor of Science degree in Mathematics, Computer Science and Business Administration from Coe College.