Cisco Connects With SGI For Big NUMA Iron
June 28, 2016 Timothy Prickett Morgan
When supercomputer maker SGI tweaked its NUMA server technology to try to pursue sales in the datacenter, the plan was not to go it alone but rather to partner with the makers of workhorse Xeon servers that did not – and would not – make their own big iron but that nonetheless want to sell high-end machines to their customers.
This, company officials have said all along, is the only way that SGI, which is quite a bit smaller than many of the tier one server makers, can reach the total addressable market that the company has forecast for its UV 300 series of machines.
To that end, SGI inked a reseller agreement with Dell last July, which lets the second-largest peddler of servers in the world add UV 300 machines to its portfolio and chase sales of systems that are intended to run SAP HANA on a larger scale than is possible on the generic four-socket machines that Dell sells based on Intel Xeon E7 processors. In February this year, SGI made a deal with Hewlett Packard Enterprise, which has its own “Kraken” 16-socket Xeon E7 system for running HANA, to have the world’s largest server maker put its brand on the eight-socket variant of the UV 300 machine and resell it as the Integrity MC990 X.
This week, SGI has lined up another big reseller partner – Cisco Systems – which also has shown no desire to make its own high-end NUMA machines but which definitely wants to be able to ride the wave of SAP HANA sales up and also take on other in-memory workloads, like Oracle databases or Spark frameworks, where having a large shared memory space provides substantial performance benefits.
Under the deal with Cisco Systems, which was formalized at the Sapphire customer event that SAP held a few weeks ago, the networking giant and server upstart has a broad agreement to resell the UV 300 product line, including the UV 300H machines tailored specifically for running SAP HANA, as well as the UV 300RL for running Oracle databases in memory and the generic UV 300, which is not tuned for anything in particular.
No matter what style of UV 300, the machines employ the NUMAlink 7 interconnect to lash together Intel’s Xeon E7 processors and do so in an all-to-all topology that provides consistent latency between the nodes in the NUMA cluster. The sub-500 nanosecond, consistent latency made possible by the NUMAlink 7 interconnect is the critical thing for in-memory applications, Bill Dunmire, senior director of product marketing at SGI, tells The Next Platform.
The in-memory and large-scale relational database markets are very big ones to chase, with SAP having over 291,000 customers and Oracle and Intel sharing another 200,000 customers. With somewhere between 5 percent and 10 percent of those customers needing something larger than the eight-socket machines that server makers can build using stock Intel Xeon E7 parts, SGI has a chance to significantly expand its UV systems business. That business has largely been focused on the UV 3000 line, which stretches NUMAlink interconnects over much larger systems and, as happens by necessity in a supercomputer, has higher latencies that vary within the system.
While Cisco has built a significant server business with its Unified Computing System blade and rack machines over the past seven years and it could make its own NUMA node controller to extend the Intel architecture beyond four or eight sockets, for the moment it is focused primarily on the SAP HANA opportunity. While data warehousing applications can be scaled across multiple nodes and run decently, when running online transaction processing, HANA really requires a single memory space to play in and that is why a shared memory machine is important. With this agreement, Cisco can sell UV 300H machines that are certified to support up to 20 sockets with up to 20 TB of main memory, and in a pinch customers who need more capacity than this can run on UV 300H machines that can scale to up to 32 sockets with up to 32 TB of shared memory. (The SAP certification process is rigorous and takes time.)
It is interesting to note that HPE has certified its Kraken Superdome X system to support 16 sockets and 16 TB of memory, but that it is OEMing the UV 300H to get an eight-socket machine and is using Intel’s chipsets in its ProLiant DL580 system for a four socket system. This means that customers starting from the ProLiant cannot easily expand from 4 to 8 to 16 sockets on the same system, something you can do with the UV 300H. HPE could just resell the entire UV 300H line, as Cisco and Dell do, but that might cramp sales of the homegrown Superdome X system. HPE has updated the Superdome X, which is a variant of its high-end Superdome Itanium system that was tweaked to support Xeon E7 processors, to use the “Broadwell” Xeon E7 v4 chips announced in June by Intel, just as SGI is able to use these Broadwell Xeon E7s in the UV 300 lineup.
The Superdome X machines scale up from two to sixteen sockets, and as long as customers know they won’t need more capacity than that for their SAP HANA workloads and they start with Superdome X instead of a ProLiant, they could stay in a single server architecture. The Superdome X architecture could scale to 32 sockets and 48 TB of memory using 64 GB memory sticks if HPE wanted to push it, by the way, if the Superdome scalability is any guide. In the long run, as HPE focuses on other efforts like The Machine, it would not be surprising to see HPE more broadly adopt SGI’s UV 300 machines – or perhaps even go so far as to acquire SGI to take on a resurgent Dell and an awakening IBM in the HPC sector. (Lenovo could be a potential acquirer of SGI, too.)
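As a rough check on the 32-socket, 48 TB figure, the arithmetic can be sketched in a few lines of Python. The function names here are illustrative, and the sketch assumes that the article's "TB" means 1,024 GB and that memory sticks are spread evenly across sockets, neither of which is stated above.

```python
GB_PER_TB = 1024  # assumption: binary units, as is usual for memory capacity


def dimms_needed(total_tb: int, dimm_gb: int) -> int:
    """Number of memory sticks of a given size needed to reach a total capacity."""
    return total_tb * GB_PER_TB // dimm_gb


def dimms_per_socket(total_tb: int, dimm_gb: int, sockets: int) -> float:
    """Memory sticks per socket, assuming an even spread across sockets."""
    return dimms_needed(total_tb, dimm_gb) / sockets


if __name__ == "__main__":
    total = dimms_needed(48, 64)
    per_socket = dimms_per_socket(48, 64, 32)
    print(f"{total} sticks in total, {per_socket:.0f} per socket")
```

Under those assumptions, 48 TB from 64 GB sticks works out to 768 sticks, or 24 per socket on a 32-socket machine.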
As for SGI, Dunmire says that the company is working on support for 128 GB memory sticks in the UV 300 line, which would allow for main memory to be pushed up to 128 TB across a single system image if it were not for the 46-bit physical memory addressing limit on the Xeon processor and its 64 TB capacity ceiling. The UV 300H would probably top out at 64 TB using these fat memory sticks, which is more than three times the certified top-end capacity for SAP HANA that SGI has attained thus far. When this fatter main memory will be available is a bit of a mystery.
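The ceiling is simple arithmetic on the width of the physical address: 64 TB corresponds to 46 address bits, while 48 bits would reach 256 TB. A minimal Python sketch, with illustrative names and the assumption that "TB" here means 2^40 bytes, shows the numbers, including the ratio to the 20 TB certified for SAP HANA.

```python
BYTES_PER_TB = 2 ** 40  # assumption: binary terabytes, as used for memory capacity


def addressable_tb(address_bits: int) -> int:
    """Capacity in TB that a given number of physical address bits can reach."""
    return 2 ** address_bits // BYTES_PER_TB


CERTIFIED_HANA_TB = 20  # certified UV 300H capacity quoted in the article

ceiling_tb = addressable_tb(46)
ratio = ceiling_tb / CERTIFIED_HANA_TB

if __name__ == "__main__":
    print(f"46-bit ceiling: {ceiling_tb} TB")
    print(f"48-bit ceiling: {addressable_tb(48)} TB")
    print(f"Ceiling versus certified capacity: {ratio:.1f}x")
```

The 64 TB ceiling is 3.2 times the 20 TB certified configuration, which is where "more than three times" comes from.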
The UV 300 line is able to use a bunch of different Xeon E7-4800 and E7-8800 processors from the “Haswell” v3 and “Broadwell” v4 processor generations, but for production workloads, the 18-core E7-8890 v3 and the 24-core Xeon E7-8890 v4 are the only two that are certified by both SGI and SAP.
So who is next to ink a reseller or OEM deal with SGI? Lenovo is an obvious choice. Fujitsu, Hitachi, NEC, and Bull, which all have Xeon server businesses, are possibilities, as are upstarts in China like Inspur, Huawei Technologies, and Sugon.
“We will certainly entertain opportunities as they present themselves to us,” says Dunmire. “Today, we have Dell, HPE, and Cisco, and tomorrow we will see. We are moving to a very channel-focused strategy and to leverage the presence and sales forces that these companies bring. We look at each deal individually and we are not opposed to any other agreements.”