Shattering Bottlenecks with GPU Acceleration in Cloudera Data Platform

NVIDIA GPU-accelerated Cloudera Data Platform is now available on NVIDIA Certified Systems from Dell Technologies to drive faster data workflows.

In today’s enterprises, data flows like water cascading down a mountain river in the spring. Massive amounts of data continually stream into edge, core and cloud systems, creating both challenges and opportunities for IT and business leaders.

On the challenges side, IT administrators need to capture, curate and store data in a mix of structured, semi-structured and unstructured formats and make it all readily available to modern applications, like those for data analytics and machine learning. On the opportunity side, data scientists and business leaders can now innovate with data to gain insights, optimize processes and help the business move faster than the competition.

And this is where NVIDIA GPU-accelerated Cloudera Data Platform comes into play. This groundbreaking analytics solution, now available on NVIDIA Certified Systems from Dell Technologies, integrates NVIDIA’s RAPIDS Accelerator for Apache Spark 3.0 to accelerate data pipelines and push the performance boundaries of data and machine learning workflows. With this proven combination of leading-edge technologies, organizations have what they need to accelerate the development and delivery of high-quality data and analytics applications that power AI across the enterprise, without changing code or reworking projects.
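To make the "without changing code" point concrete, here is a minimal sketch of how the RAPIDS Accelerator is typically switched on through Spark configuration rather than application changes. The property names are the plugin's documented Spark settings, but the helper functions, the default values and the `my_job.py` script name are hypothetical, and the sketch assumes the RAPIDS Accelerator jar is already on the cluster's classpath. How CDP surfaces these settings in its own tooling is not covered here.

```python
def rapids_spark_conf(gpus_per_executor=1, tasks_per_gpu=4):
    """Build Spark properties that load the RAPIDS Accelerator plugin.

    Hypothetical helper: the defaults are illustrative, not tuned values.
    """
    if gpus_per_executor < 1 or tasks_per_gpu < 1:
        raise ValueError("GPU and task counts must be at least 1")
    return {
        # Loads the RAPIDS Accelerator as a Spark 3 plugin.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        # Lets the plugin run supported SQL/DataFrame operations on the GPU.
        "spark.rapids.sql.enabled": "true",
        # Standard Spark 3 GPU scheduling: GPUs requested per executor.
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        # Fraction of a GPU each task claims, so several tasks share one GPU.
        "spark.task.resource.gpu.amount": str(1 / tasks_per_gpu),
    }


def to_submit_args(conf):
    """Turn a property dict into spark-submit style --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # my_job.py stands in for an existing, unmodified Spark application.
    command = ["spark-submit", *to_submit_args(rapids_spark_conf()), "my_job.py"]
    print(" ".join(command))
```

The application script itself stays as it was. Only the submit-time configuration changes, which is what lets existing Spark SQL and DataFrame jobs pick up GPU acceleration.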

This is great news for data scientists and others who wrestle with bottlenecks created by massive amounts of data and slow compute. These bottlenecks directly impact the cost and speed at which companies can train and deploy models across the organization. But now, that’s old news. Today, with an NVIDIA GPU-accelerated Cloudera Data Platform, data scientists can execute end-to-end data science and analytics pipelines on NVIDIA Certified Systems to improve machine learning model accuracy by iterating on models faster and deploying them more frequently.

Wide-ranging use cases

Cloudera Data Platform supports a wide range of use cases that span from improving operational efficiency to driving business transformation. On the operational side, CDP use cases include data warehouse augmentation by offloading ETL (extract, transform, load) workloads, log aggregation and analytics, dual storage and active archive, and archive-intensive and tiered Hadoop storage.

On the business transformation side, CDP supports diverse use cases for marketing, finance, healthcare, pharmaceutical and manufacturing applications. We’re talking about applications that help organizations anticipate customer needs, detect fraud and reduce risk, improve patient care and reduce healthcare costs, ensure regulatory compliance and validation, and achieve continuous process improvement.

And now, with NVIDIA GPU-accelerated Cloudera Data Platform on NVIDIA Certified Systems from Dell Technologies, organizations can accelerate these analytics use cases while reducing data science infrastructure costs.

What’s under the hood

With NVIDIA Certified Systems from Dell Technologies, organizations deploying NVIDIA GPU-accelerated Cloudera Data Platform can take advantage of the latest and greatest hardware to accelerate the development and delivery of high-quality data and analytics applications.

Validated for running accelerated workloads with optimum performance, manageability, scalability, and security, these systems include the Dell EMC PowerEdge R750xa. It’s a two-socket server with the latest 3rd generation Intel® Xeon® Scalable processors and capacity for up to 4x double-width or 6x single-width PCIe NVIDIA GPUs. NVIDIA NVLink bridges allow pairs of A100 PCIe GPUs to share memory, while Multi-Instance GPU (MIG) allows for up to seven independent instances per A100, making it easier to designate and share accelerated resources. For a look inside this incredibly flexible server, check out the Dell EMC PowerEdge R750xa video above.
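As a rough back-of-the-envelope illustration of what those numbers mean for sharing, the sketch below multiplies the GPU count by the per-GPU instance limit quoted above. It assumes the four double-width slots are all filled with A100 PCIe cards and that every card is split into the maximum seven instances. The function and constant names are hypothetical, and real deployments may choose fewer, larger instances.

```python
# Figures taken from the paragraph above.
MAX_DOUBLE_WIDTH_GPUS = 4        # double-width PCIe GPUs per R750xa
MAX_INSTANCES_PER_A100 = 7       # independent GPU instances per A100


def max_gpu_instances(gpu_count, instances_per_gpu=MAX_INSTANCES_PER_A100):
    """Upper bound on independent GPU instances in one server."""
    if not 0 <= gpu_count <= MAX_DOUBLE_WIDTH_GPUS:
        raise ValueError("gpu_count must be between 0 and 4")
    if not 1 <= instances_per_gpu <= MAX_INSTANCES_PER_A100:
        raise ValueError("instances_per_gpu must be between 1 and 7")
    return gpu_count * instances_per_gpu


if __name__ == "__main__":
    print(max_gpu_instances(MAX_DOUBLE_WIDTH_GPUS))  # 28
```

Under these assumptions, a single fully populated server could present up to 28 separately assignable GPU slices to different users or jobs.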

And with Cloudera End of Support (EoS) dates approaching for many legacy products, this is a great time to migrate to CDP on NVIDIA Certified Systems — and futureproof your data center for AI.

As you accelerate into this move, Dell Technologies has the full range of Ready Solutions for Data Analytics available, including Dell EMC PowerEdge R650 server admin/head nodes and Dell EMC PowerScale Isilon H600 storage, along with your Dell EMC PowerEdge R750 accelerated worker nodes.

To test drive NVIDIA GPU-accelerated Cloudera Data Platform, visit one of our worldwide Customer Solution Centers. And to explore the wide range of Dell EMC Ready Solutions for Data Analytics, visit here.

About the Author: Janet Morss

Passionate about data analytics, including machine learning and high-performance computing, Janet Morss works in product marketing. Her favorite part: sharing the amazing impact our customers have on people’s lives using technology. Prior to joining Dell EMC, Janet worked in HPE Server Strategy, Planning and Operations with a focus on SMB and Enterprise solutions. With multiple degrees and a love of learning, Janet is a start-up style marketer from Colorado who loves to snowboard.