Bridging the Data Divide with Dell Technologies and Vertica

Learn how object-based data lakes are giving data teams the power of choice in data analytics workloads.

As I look back on the world of data analytics, I see a huge divide: there are organizations making use of their data and those that are not. Even 10 years ago, it was not uncommon for organizations to fall short of taking full advantage of their own unstructured data. Sure, they were doing reporting or building dashboards, but only to look back at what had already happened. A recent IDC report states that business leaders are working to enable 90% of new enterprise apps to be cloud native.¹ However, many of these projects are over budget or miss their timelines. How do we bridge this data divide? By giving data teams the power of choice in their data architecture.

Everywhere we turn, data analytics is top of mind. Rare diseases are now identified quickly with the help of analytics. An evening out to dinner is replete with analytics, as restaurants use handy apps to predict how to staff for the day based on historical data, current events, and the weather. However, the challenge for data teams is only compounding. Sure, budgets are larger than before, but that comes with greater scrutiny and higher expectations. Gone are the days when data teams could provide results six months down the road. The democratization of data has helped the business understand both the power of data and the risk involved in not utilizing data correctly. Results are needed in weeks, not months.

So, how do we enable data teams to leverage the data deluge? The answer is simple: by separating compute and storage.

Utilizing Resources in Data Analytics Architectures

IDC predicts that by 2023, file and object capacity will grow by 300%, requiring organizations to invest in rapidly scalable and adaptable technologies to handle the deluge of data. We have the power to choose when it comes to many aspects of our lives. Why then, when it comes to data analytics, must we add compute and storage at the same time?

Suppose you are working on a project that is heavy on historical data, with a huge capacity footprint, but doesn’t require a lot of compute resources. Typically, data teams are forced to deploy compute resources they don’t need just to cover the storage needs. Repeat this process two or three more times, and your team has a lot of underutilized compute. Many customers I talk to are at 40-50% compute utilization.
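To make the underutilization concrete, here is a rough back-of-the-envelope sketch in Python. The node size (50 TB and 16 cores per node) and the workload (400 TB of data needing 64 cores) are hypothetical figures chosen only for illustration; they are not Dell or Vertica specifications.

```python
import math

# Hypothetical node shape for a coupled architecture: every node
# brings a fixed bundle of storage and compute. Illustrative only.
TB_PER_NODE = 50
CORES_PER_NODE = 16

# Hypothetical workload: heavy on historical data, light on compute.
STORAGE_NEEDED_TB = 400
CORES_NEEDED = 64


def coupled_nodes(storage_tb, cores):
    """Nodes required when storage and compute must be added together.

    The larger of the two requirements dictates the node count.
    """
    nodes_for_storage = math.ceil(storage_tb / TB_PER_NODE)
    nodes_for_compute = math.ceil(cores / CORES_PER_NODE)
    return max(nodes_for_storage, nodes_for_compute)


def compute_utilization(cores, nodes):
    """Fraction of the deployed cores that the workload actually uses."""
    return cores / (nodes * CORES_PER_NODE)


nodes = coupled_nodes(STORAGE_NEEDED_TB, CORES_NEEDED)
utilization = compute_utilization(CORES_NEEDED, nodes)

# With storage and compute separated, compute is sized on its own.
decoupled_compute_nodes = math.ceil(CORES_NEEDED / CORES_PER_NODE)

print(f"Coupled: {nodes} nodes, {utilization:.0%} compute utilization")
print(f"Decoupled: {decoupled_compute_nodes} compute nodes, storage scaled separately")
```

With these assumed numbers, storage alone forces eight nodes while the compute requirement would fit on four, so half of the deployed cores sit idle.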

The Power of Choice in Data Architecture

We recently announced our collaboration with unified analytics warehousing leader Vertica. Our joint customers’ data teams now have the power of choice in their data architecture. By switching from Vertica Enterprise Mode to Vertica EON Mode, using Dell EMC’s ECS object storage system as the underlying data lake, your analytics workloads can dictate how much compute or storage is needed.

This unique joint solution enables your data teams to:

  • Extend infrastructure resources individually: Storage can be scaled up without adding expensive compute, and compute can be scaled up or down to match variable or intermittent workloads.
  • Isolate the workload: Business analysts and data scientists can work independently from a single source of truth, without competing for resources.
  • Simplify database operations: Customers can experience improved node recovery, better workload balancing, and faster compute provisioning.
  • Hibernate compute nodes: Customers can start and stop analysis more efficiently by hibernating compute nodes when they are not needed.

Now, the next time your data team needs to crunch an enormous dataset, it doesn’t have to wonder whether it can add that data to its workflow. Scaling storage independently eliminates the worry about bringing on new datasets because of cost or storage availability.

Bridge the Data Divide Today

By combining ECS enterprise object storage with Vertica EON Mode, data teams can now extend infrastructure resources independently. Ready to give data teams more choice when it comes to their analytics architecture? Check out the Dell Technologies session at Vertica’s Unify Conference on July 21st with Mark Guerra, of Analytics at Jaguar Racing: “Speed: Driven by Vertica | Powered by Dell Technologies.” Registration is free and sessions will be available on-demand.

¹ IDC FutureScape: Worldwide IT Industry 2020 Predictions, October 2019, Doc# US45599219

About the Author: Thomas Henson

Thomas Henson is an Unstructured Data Solutions Systems Engineer at Dell Technologies with a passion for Streaming Analytics, Internet of Things, and Machine Learning. He brings experience in Machine Learning Anomaly Detection, Open Source Data Analytics Frameworks, and Simulation Analysis. Thomas is also heavily involved in the Data Analytics community.