You’ve probably heard a lot about high-performance computing (HPC) lately. Companies in all types of industries — from medical to finance to manufacturing — are using HPC to solve their most complex science and engineering problems.
But there’s not a lot of publicity around the behind-the-scenes networking and configuration of HPC environments, which can be huge challenges for any organization. Whether they’re deploying new servers or replacing switches, IT administrators usually have to suffer through time-consuming manual processes to get HPC clusters up and running smoothly.
I saw these challenges up close recently when one of our customers — a large global chemical research company — started to implement new internal HPC compute clusters. The firm knew its compute resources would require an interconnect with the right amount of redundancy, manageability and performance. The company also needed the ability to easily replace the environment’s networking switches without the need for pre- or post-configuration work.
Speeding research insights
This customer was the perfect candidate for network automation, which can greatly simplify server and switch deployments. We worked with them to deploy an HPC cluster connected with Dell Networking S-Series switches, which feature the Dell Networking Open Automation framework to help automate the deployment and provisioning of Dell switches. Specifically, the firm chose to use the Open Automation’s Bare Metal Provisioning (BMP) functionality to configure its Dell Networking 1GbE and 10GbE switches for self-installation. By combining DHCP Option 82 (RFC 3046) with the ability to run pre and post-configuration scripts, the customer was able to assign startup configuration files determined by the state of the uplink switch ports. The customer also used the Open Automation event framework to trigger custom actions when particular conditions occurred. For instance, when a link goes down, a switch can send an alarm or email notification.
Now, relying on scripting and auto-provisioning provided by Dell Open Automation, the chemical research firm can automatically add switching routes, change paths or create security protocols as needed. And by using scripts that run at certain times of the day, the firm can enable automatic configuration backups. With these new capabilities, the company can more quickly enable an HPC infrastructure that powers new insights for its researchers.
Saving time, cutting costs and eliminating errors
In addition to all of the switch configuration being automated, the same location-based deployment is used for supercomputer support servers. The servers get IP address, name and other information based on the 1GbE or 10GbE switch port they are plugged into. Also, by using the same location-based configuration deployment for additional HPC cluster deployments, the company can facilitate setup much faster. This plug-and-play approach also simplifies server provisioning, taking only 10 percent of the time it usually takes for initial configuration.
The research organization is also saving money because its network engineers have significantly reduced the time they dedicate to managing and configuring Dell switches, thus freeing them to perform higher-value activities. The company has also eliminated configuration errors, because the initial configurations are generated just once and never changed. Automation takes care of all the dynamic aspects of the process.
At the end of the day, the chemical research firm is realizing all the benefits of Open Automation: fast, simple deployment and configuration, lower costs and fewer configuration errors. And most importantly, the company is using its new HPC solution to power groundbreaking new scientific research.