Turnkey Open Source Cluster Integrated with 70 Apps
“As a very large university, it’s important to work with a partner that understands the needs and dynamics of institutions like ours. Because of our long relationship, we have come to trust that Nor-Tech provides high level services in terms of quality, speed and efficiency.”
-Sam Behseta, CSUF Professor of Mathematics and Director of CCAM
Their Challenge
California State University, Fullerton (CSUF) Center for Computational and Applied Mathematics (CCAM) in the College of Natural Sciences and Mathematics needed a powerful cluster capable of solving a wide range of computational problems in science and math. To that end, they applied for and received a $583,900 grant from the U.S. Army to fund the project. This new machine would join two other high-performance computing clusters at CSUF built by Nor-Tech in 2012 and 2015.
CCAM HPC Director Emerio Martinez is the cluster architect on the CSUF end, involved in grant proposals to optimize performance and ensure compatibility with existing technology.
While no high performance technology project is without issues, this one was especially challenging. Nor-Tech Senior HPC Account Executive Tom Morton explained, “Challenges included the pandemic, with supply chain issues like scarcity and rising costs, and being able to meet CSUF goals within their budget and deliver in a timely manner. Despite the rising costs, we were able to hold down expenditures for CSUF.”
Nor-Tech has a long history with CSUF, highlighted by the delivery of the Kepler Cluster in 2015. Since then, CCAM has worked with Nor-Tech almost continuously for additional nodes, service and support.
Professor of Mathematics and Director of CCAM Sam Behseta said, “Because of our long relationship, we have come to trust that Nor-Tech provides high level services in terms of the quality, speed and efficiency of their service; that is important for large scale projects.”
Before coming to CSUF, Emerio worked in the aerospace industry for several years sourcing technology from some of the largest vendors in the HPC space. “I know what a difference accessibility makes and that was one of the reasons I enjoy working with Nor-Tech,” he said.
Our Solution
Once CSUF decided to go with Nor-Tech for the new cluster, Emerio worked closely with Tom and Nor-Tech’s engineering staff to review components and maximize the technology and capacity to make sure CSUF was spending its money where it would have the most impact.
Nor-Tech developed the cluster from the ground up, configuring it to Emerio’s specifications and integrating 70 software applications. The end result was a turnkey cluster with an open source job scheduler and open source cluster management software, featuring:
• Intel Cascade Lake-R Compute Cores and GPUs
• 200TB NVMe Storage
• HDR InfiniBand Interconnect
• Q-Chem, OrangeFS, TensorFlow, Nagios, NOVOPlasty
• Seismic Region Bolt-down Kit
“We wanted to go with open source because we had a grant with a one-time budget and we had to use that upfront,” Emerio explained. “So there would be no money to pay additional licensing fees down the road.” After discussing their situation with Tom and Nor-Tech Vice President of Engineering Dom Daninger, Emerio opted for OpenHPC because of the depth of support resources.
“Tom thinks like an engineer,” Emerio said. “And Dom understands that technology is immense and moves fast. He provided me with a lot of informative articles and we were able to brainstorm.”
Emerio continued, “We needed a lot of applications installed, partly because some applications are prerequisites for other applications and partly because there are six high-usage researchers who each needed their own applications. Installing all of these applications prior to deployment would eliminate a lot of downtime later on.”
“We built the cluster, customized it and invited Emerio to remotely dial in to run jobs and tests to get benchmarks,” Tom said. “We wanted to make sure they were getting the kind of performance they expected.”
As with all clusters, Nor-Tech’s engineers labeled all of the cables and ports before packing, shipping and conducting the onsite install, which included racking and stacking the nodes and reconnecting all cables. Nor-Tech also created a customized Quick Start Guide. By the time Nor-Tech engineers left the site, researchers were running jobs.
Their Success
The new high performance cluster is named “Turing,” after English mathematician Alan Turing. “It is performing well,” Emerio said. “But it’s like a Ferrari or any other high performance machine, there is always a tuning process.”
“They had no upfront costs and will have no recurring costs for cluster management and job scheduling,” Tom said. “Having open source cluster management and job scheduling utilities will save CSUF $5,000-$10,000 per year. Emerio’s overall knowledge of clusters and software packages was instrumental to being able to get the cluster built and shipped on time. Sam and Emerio are both outstanding people and have been a pleasure to work with.”
“I would definitely recommend working with Nor-Tech,” Emerio said. “They are a great HPC vendor to partner with. I have been working with Tom for about four years. He is at the top end of high performance computing expertise and is well connected in the distribution channel. This makes it easier to coordinate with vendors and schedule meetings. When Tom speaks with a vendor on our behalf, he usually comes back with the response we were hoping for. I was very impressed as I worked with him throughout the whole process.”
Sam agreed. “I worked with Tom very closely. As a large university with 40,000 students and a staff of 10,000, it’s important to work with a partner that understands the needs and dynamics of institutions like ours. It is also important to work with partners that are responsive. Tom’s leadership has been very vital. At the end of the day it’s people like Tom that make these things happen. We would of course recommend working with Nor-Tech.”
Sam concluded, “This cluster will benefit thousands of students and faculty in our institution. It’s a major contribution. We depend on grant money; that is extremely important to us and why we are very careful in choosing our partners.”
To learn the benefits of an Open Source Cluster
Contact Nor-Tech
Email: info@nor-tech.com
Call 952-808-1000; toll free: 877-808-1010