Making AI, HPC and GPUs Easier for Data Scientists

With new GPU-ready containers and server infrastructure, organizations are simplifying the deployment of artificial intelligence applications and the clusters they run on.

Regardless of the domains they work in, researchers, data scientists and developers want to focus on the work they do, not the systems and tools they use to run their workloads. While that’s a simple proposition, the reality on the ground is often far removed from this ideal world.

Those who have built artificial intelligence and other high-performance computing applications from the ground up — starting with bare-metal servers, computing frameworks, software libraries and more — understand that there is a lot of heavy lifting required up front. This work always comes first, before you can focus on the work you really want to do, like training your model, running your inference workloads and seeing the results.

Together with its partners, NVIDIA is working to rewrite this story. Its NVIDIA GPU Cloud, or NGC, is designed to simplify the deployment of GPU-accelerated systems.

The NGC container registry

Via the NGC container registry, NVIDIA provides a catalog of containers that package GPU-accelerated software for AI, machine learning and HPC in an easy-to-deploy form. These containers, which are available to download at no charge, alleviate many of the headaches that come with setting up software. They help you get up and running quickly with tested, optimized and updated frameworks and applications. The containers are designed to take full advantage of NVIDIA GPUs, on-premises or in the cloud, and to work across a wide variety of NVIDIA GPU platforms.

The NGC container registry is simple to use. When you visit the site, you are prompted to answer one question: “What are you interested in working on?” To move forward, you select one of six options from a catalog: High Performance Computing, Deep Learning, Machine Learning, Inference, Visualization, or Infrastructure. Once you make a selection, you can begin working with, say, the Caffe2 deep-learning framework or the LAMMPS software application for molecular dynamics simulations. After that, you can move forward with the confidence that comes with knowing your software is correctly configured.
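Once you have picked a container from the catalog, starting it usually comes down to a single container-runtime command. The sketch below assembles such a command in Python. It assumes Docker 19.03 or later with NVIDIA's container runtime support installed, which is what makes the `--gpus` flag available. The repository name `nvidia/caffe2` and the tag `xx.yy-py3` are placeholders for illustration, not values taken from the catalog, and the helper names are hypothetical.

```python
import shlex

# Host name of the NGC container registry.
NGC_REGISTRY = "nvcr.io"


def ngc_image(repository, tag):
    """Build a full image reference of the form nvcr.io/<repository>:<tag>."""
    return f"{NGC_REGISTRY}/{repository}:{tag}"


def docker_run_command(image, gpus="all"):
    """Return the argument list that starts an interactive container with GPU access.

    Assumes Docker 19.03+ with NVIDIA container runtime support, which
    provides the --gpus flag.
    """
    return ["docker", "run", "--rm", "-it", "--gpus", gpus, image]


# Placeholder repository and tag: look up the real values in the NGC catalog.
image = ngc_image("nvidia/caffe2", "xx.yy-py3")
command = docker_run_command(image)

# Print the command as a copy-and-paste shell line.
print(shlex.join(command))
```

Returning the command as a list rather than a string means it can be handed directly to `subprocess.run` without shell quoting concerns.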

The NGC-Ready hardware

Software, of course, is only part of the problem when it comes to deploying large-scale HPC and AI applications. You also have to find the right hardware infrastructure to run your workloads. NVIDIA makes this process easier with its NGC-Ready program. Through this program, hardware vendors validate that the NGC containers run correctly on their servers and workstations.

Dell EMC is an active participant in this program. Our NGC-Ready infrastructure, including the Dell EMC PowerEdge C4140 server, has been tested and validated to run containers from the NVIDIA GPU Cloud. This back-end work allows organizations to deploy GPU-accelerated Dell EMC systems knowing they are ready to run NGC containers.

And that’s just one part of the work that Dell EMC does with NVIDIA. My team works closely with the engineers at NVIDIA to optimize systems, perform benchmark testing and take other steps to help ensure that you and others can get the full value of GPU acceleration when deploying NGC containers on Dell EMC hardware.

For example, some users worry that containerizing an application will hurt performance. Through our lab and benchmark testing, we’ve been able to show that little, if any, performance is lost when you containerize your software, as opposed to starting with software that you deploy on bare-metal servers. We put a special focus on proving the potential of using containers in large-scale simulations and deep learning applications. In our lab, we find containers to be portable and efficient as we scale out deep learning workloads.
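One way to put a number on "little, if any" is to run the same workload on bare metal and in a container, then express the difference as a percentage of the bare-metal result. The helper below is a minimal sketch of that arithmetic. The function name and the choice of throughput as the metric are assumptions made for illustration. This is not the lab's actual benchmark method, and no measured results are shown.

```python
def container_overhead_percent(bare_metal_throughput, container_throughput):
    """Return the throughput lost to containerization, as a percentage.

    Both arguments use the same unit (for example, images per second).
    A positive result means the container run was slower; a negative
    result means it was faster than bare metal.
    """
    if bare_metal_throughput <= 0:
        raise ValueError("bare_metal_throughput must be positive")
    lost = bare_metal_throughput - container_throughput
    return lost / bare_metal_throughput * 100.0


# Usage: substitute throughput figures from your own benchmark runs, e.g.
# overhead = container_overhead_percent(bare_metal_result, container_result)
```

Comparing throughput rather than wall-clock time keeps the sign convention intuitive: a larger percentage always means more performance lost.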

Even better, we’re demonstrating how the container approach makes life easier for data scientists and other users who are venturing into the brave new world of AI and machine learning. With the right frameworks incorporated into an NGC container and deployed on our NGC-Ready hardware, you can take advantage of GPUs for deep learning by adding just a few lines of code — literally. You add the lines of code and you gain the benefits of GPU acceleration, with no need to port your application code to a new platform.
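To give a sense of what "a few lines of code" can look like, here is a minimal sketch. It assumes PyTorch as the framework inside the container. The article does not name a specific framework, so treat that choice, and the function names, as illustrative. The GPU-specific part is the device selection plus the `.to(device)` call; the rest of the code is the same on CPU and GPU.

```python
def select_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is the framework in the container
    except ImportError:
        # Without the framework installed, fall back to the CPU label.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def forward_on_best_device():
    """Run one forward pass of a tiny model on the selected device."""
    import torch

    device = select_device()
    # Moving the model and creating the data on `device` are the only
    # GPU-related lines.
    model = torch.nn.Linear(4, 2).to(device)
    batch = torch.randn(8, 4, device=device)
    return tuple(model(batch).shape)


print(select_device())
```

Because the device is chosen at run time, the same script runs unchanged on a workstation without a GPU and on a GPU-equipped server.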

Ready Solutions for AI

In another ongoing initiative, our engineers at Dell EMC work closely with their counterparts at NVIDIA to bring systems to market that are optimized for GPU-enabled deep learning applications. That’s the case with our Dell EMC Ready Solutions for AI, which are optimized for deep learning with NVIDIA GPUs.

These Ready Solutions provide a GPU‑optimized stack that can shave valuable time off deep learning projects. Dell EMC engineers can help you configure, test and tune your GPU‑enabled hardware and software, so you can get up and running quickly with a top-tier deep learning platform based on a framework that can use both CPUs and GPUs. These solutions even include services to help your data scientists discover insights from data in less time.

Key takeaways

If you take a step back and look at the big picture, you’ll see that there are a lot of resources available to help your organization deploy AI and HPC-driven applications that take advantage of GPU acceleration. You can do so with the confidence that comes with validated and optimized hardware and software solutions.

I think you’ll also see that it’s becoming much easier to get into this game than it was just a handful of years ago. Today, end users no longer need to become experts in the underlying technology to capitalize on GPU-accelerated systems and the power of AI and other HPC-fueled applications. Instead, they can keep their eyes on the real prize: the work they do.

To learn more

For a closer look at the technologies that make it all go, explore Dell EMC solutions for high-performance computing and artificial intelligence.

About the Author: John Lockman

Programmer, developer and evangelist for containerization and orchestration with Kubernetes, John Lockman works in the Dell EMC HPC and AI Innovation Lab. He specializes in nature-inspired programming, deep learning, and artificial intelligence. John brings a passion for building tools to make advanced computing accessible to a larger audience.