Cray Supercomputing as a Service Becomes a Reality

For a mature company that kickstarted supercomputing as we know it, Cray has done a rather impressive job of reinventing itself over the years.

From its original vector machines, to HPC clusters with proprietary interconnects and custom software stacks, to graph analytics appliances engineered in-house, and now to machine learning, the company tends not to let trends in computing slip by without a new machine.

However, all of this engineering and tuning comes at a cost, something that, arguably, has kept Cray at bay when it comes to reaching the new markets that sprang up in the “big data” days of Hadoop, when storage and compute performance at scale could be captured on the cheap. Cray systems are a lot of things, but inexpensive is not one of them. No one denies this, but dedicated Cray shops overlook it when mission-critical performance is on the line.

With both the reinvention and cost elements in mind, it is surprising that until today, renting a Cray super has been out of reach, especially when live, remote testing of a workload on a commodity cluster versus a Cray cluster could bring new converts. In short, the key to continued reinvention is giving previously out-of-league users the ability to commit and add to the base that informs next-generation machines.

The first company to provide “Cray as a Service” is Boston-based cloud and data center co-lo provider Markley, which intends to find rich business with the local (Cambridge, MA) genomics and biotech set. While Markley’s CTO, Patrick Gilmore, tells The Next Platform that the company is tracking how Cray’s XC line of more traditional supercomputers might fit the needs of a wider set of users, Markley is the first to offer the Urika-GX graph analytics appliance for rent on a reservation basis.

Markley was an early player in the datacenter space and has firm footing in biotech-rich Boston. It is housed in a nearly million-square-foot facility that serves as the telco routing and switching hub for all of New England. For those local users with fiber in-house, Markley charges a port fee to get around the major problem of bioinformatics clouds: expensive data movement. The company also does web-based business with other life sciences companies elsewhere in the U.S. Gilmore says that while the larger cloud providers, including Amazon, Google, and others, have indeed built tooling to draw in these same target users in bioinformatics and pharma, their platforms are not tailored for performance on key applications, and data movement makes them an expensive proposition.

The Urika line first emerged when the “big data” hype had the market creating new architectures, hardware and software alike, to grapple with the sudden influx of new information. At that time, Cray’s roots in supercomputing gave it a sturdy springboard for large-scale customers that had complex, HPC-like problems with analytics. Further, the company was able to leverage the custom Aries interconnect story to work beyond just high performance computing by developing hooks for Hadoop, Spark, and other data analytics frameworks.

Benchmark shared by Cray comparing its on-prem Urika cluster with standard EC2 nodes. (Data from 2016)

While we do not have pricing for the Urika service, which users have to schedule in advance via Markley, the ROI question is different from the one that applies to more general purpose clouds. As Ted Slater, global head of life sciences and healthcare at Cray, tells us, the company worked with a next generation sequencing research center in Cambridge with a focus on variant analysis. “The organization used the Urika-GX and was able to hit up to 5X speedup on parts of their overall workflow over the standard cluster they were using.” This means faster time to market in the competitive drug discovery space and, for researchers, less time between results, so they can stay on the same track rather than waiting a week or more for their project data.

“Next generation sequencing is the wild west right now. There are about as many workloads as there are teams working in this space,” Slater says. “Graphs are one way to tackle this problem. There is the Aries interconnect and the Cray Graph Engine, which works with Aries to compute over really massive graphs.”

While this is both a geographically and workload-targeted Cray as a service offering that is focused on one appliance, we would hope to see Cray providing its more general purpose, high performance XC line of machines to other cloud providers in the future. These are expensive acquisitions for on-demand providers since these are not systems that can be carved apart for multitenancy, but with a broad enough base of users in an application segment or region, the investment could make sense.

It is difficult for new users to understand what might be required in terms of portability and operation without taking a Cray system for a spin, and it would open new doors for Cray to have an increased base of users testing out its hardware. Since not every company can afford a Cray system, this could expand Cray’s reach by convincing users of the value through a trial, with the potential of those users becoming on-prem customers.
