There is a lot of buzz about Hadoop, NoSQL, NewSQL and columnar MPP databases. But where is the actual value for businesses? Businesses need actionable information derived from the data they collect on a regular basis. We know how to collect data and store it in databases of various kinds. We have seen SQL databases evolve over the last several decades, and they have become sophisticated at processing structured data.
With the recent explosion of social media, and with it the proliferation of unstructured data, new technologies such as MapReduce have emerged. So now we have the data, but the real question is: where does the business value actually get delivered? The answer is simple. The value lies in the analytics and end-user reporting layer, as illustrated in the Business Intelligence value pyramid below. While the steps of prepping the data in the database management system are important, one must not forget that the true value of a data system is in delivering actionable insights.
Rewind << for a second:
The 1990s were all about capturing business-relevant data and storing it in a database using business constructs. Typical use cases involved running OLTP (Online Transaction Processing) workloads on that data. Data warehouses evolved as enterprises started to seek more analytical insight from the data stored in the database, which gave rise to OLAP (Online Analytical Processing) workloads. A good example is a case study from the University of Virginia. Its medical college started with a simple database to bring in clinical data for researchers, but over time it realized the value and richness of that data, which led to the development of a sophisticated data warehouse.
Once in the data warehouse, the data was cleansed, filtered and augmented with business rules using traditional ETL (Extract, Transform, Load) or data integration tools, which removed redundancies from the data and normalized it. You would still have to run a Business Intelligence capability against this data to develop dashboards or reports before you could actually derive business insight from it. Enterprises could also decide to go further and perform detailed trend analysis and forecasting using advanced data mining tools.
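To make the cleanse, filter and augment steps concrete, here is a minimal sketch of an ETL pass in plain Python. Everything in it is hypothetical and for illustration only: the sample order records, the field names and the "large order" threshold are assumptions, and a real pipeline would read from source systems and write to an actual warehouse rather than an in-memory list.

```python
# Hypothetical raw records, as they might arrive from a source system.
RAW_ORDERS = [
    {"order_id": "1001", "customer": " Alice ", "amount": "250.00"},
    {"order_id": "1002", "customer": "bob", "amount": "n/a"},       # unusable amount
    {"order_id": "1001", "customer": "Alice", "amount": "250.00"},  # duplicate order
    {"order_id": "1003", "customer": "CAROL", "amount": "1200.50"},
]

# Assumed business rule: orders at or above this amount are flagged as large.
LARGE_ORDER_THRESHOLD = 1000.0


def extract():
    """Extract: pull the raw rows from the (here, in-memory) source."""
    return list(RAW_ORDERS)


def transform(rows):
    """Transform: cleanse, filter, de-duplicate and apply a business rule."""
    seen_ids = set()
    clean_rows = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # filter out rows whose amount cannot be parsed
        if row["order_id"] in seen_ids:
            continue  # remove redundant copies of the same order
        seen_ids.add(row["order_id"])
        clean_rows.append({
            "order_id": int(row["order_id"]),
            "customer": row["customer"].strip().title(),  # normalize names
            "amount": amount,
            "is_large_order": amount >= LARGE_ORDER_THRESHOLD,  # augment
        })
    return clean_rows


def load(rows, warehouse):
    """Load: append the cleaned rows to the target store."""
    warehouse.extend(rows)
    return warehouse


warehouse = load(transform(extract()), [])
for record in warehouse:
    print(record)
```

Even after this pass, the data is only prepared; a reporting or analytics layer still has to sit on top of it to produce insight.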
Fast forward >> to today:
As EDWs (Enterprise Data Warehouses) grew in size, IT soon realized that managing a monolithic data warehouse was cumbersome. Hence the birth of departmental and function-specific data marts. But that was not enough, since data marts did not address the core issues of scalability, performance, agility and the ability to handle large transaction volumes. Over the years some viable alternatives, such as database sharding, have been used, but even these have had limited success in terms of scalability. It is also worth noting that some of these core issues stemmed from the limitations of the underlying DBMSs (Database Management Systems), such as MySQL not being able to scale.
Hence investigating alternative DBMS technologies to address these issues has been a focal point for IT managers, and we continue to see the emergence of new DBMS technologies like NoSQL and NewSQL.
Similarly, we have seen the emergence of MapReduce (Hadoop) for handling unstructured data. The core use case for MapReduce remains its ability to store massive amounts of data, pre-process it and perform exploratory analytics.
The reality for enterprises is that there are now multiple types of databases in the form of EDWs, data marts, columnar MPP (Massively Parallel Processing) stores and MapReduce clusters. Some industry analysts commonly refer to this ecosystem as Data Lakes.
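For readers who have not seen the MapReduce pattern, the sketch below imitates its three stages (map, shuffle, reduce) in a single Python process. It is only an illustration of the programming model, not of Hadoop itself: the log lines and function names are made up, and a real cluster would distribute these stages across many machines.

```python
from collections import defaultdict

# Hypothetical unstructured input: free-text activity log lines.
LOG_LINES = [
    "alice viewed laptop",
    "bob viewed mouse",
    "alice bought laptop",
]


def map_phase(line):
    """Map: emit a (key, value) pair for each input record."""
    user = line.split()[0]
    yield (user, 1)


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


pairs = [pair for line in LOG_LINES for pair in map_phase(line)]
event_counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(event_counts)
```

The output is a simple count of events per user, which is typical of the pre-processing and exploratory work described above; turning such counts into a business decision is still the job of the analytics layer.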
So if you step back and look at the broader BI space, you will notice that a lot of effort is being spent on getting the plumbing right so that the data (structured as well as unstructured) is massaged and primed. While businesses continue to figure out the optimal data management solution, they should not do so without also investing in the analytics and reporting capabilities needed to extract actionable insights. Some examples of successful business insight implementations include (but are not limited to):
- Recommendation engines: increase average order size by recommending complementary products based on predictive analysis for cross-selling (commonly seen on Dell, eBay and other online retail websites),
- Customer loyalty programs: many prominent insurance companies have implemented these solutions to gather useful customer trends, and
- Large-scale clickstream analytics: many ecommerce websites use clickstream analytics to correlate customer demographic information with their buying behavior.
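As a rough illustration of the first example, the sketch below recommends complementary products by counting how often items appear together in past orders. This is a deliberately simple co-occurrence approach chosen for illustration; it is not the method used by any of the retailers mentioned, and the order data and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order history: each order is the set of products bought together.
ORDERS = [
    {"laptop", "mouse", "laptop bag"},
    {"laptop", "mouse"},
    {"laptop", "monitor"},
    {"mouse", "mouse pad"},
]


def build_cooccurrence(orders):
    """Count, for every product, how often each other product shares an order."""
    counts = {}
    for basket in orders:
        for a, b in combinations(sorted(basket), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def recommend(product, counts, top_n=2):
    """Return up to top_n products most often bought with the given product."""
    return [item for item, _ in counts.get(product, Counter()).most_common(top_n)]


counts = build_cooccurrence(ORDERS)
print(recommend("laptop", counts, top_n=1))
```

A shopper with a laptop in the cart would be shown the item most frequently bought alongside it, which is the cross-selling effect that raises average order size.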
The takeaway here is that enterprises should stay focused on the value their data can provide in enabling intelligent business decisions. In other words, businesses have to keep the big picture in mind.
So how do you measure the impact of a Business Intelligence implementation for your organization? Please leave us comments in the box below!