OpenPower Collective Opens For System Business
March 20, 2015 Timothy Prickett Morgan
It has been about a year since the OpenPower Foundation was formed to provide a development locus for systems based on the Power8 processor and its varied I/O technologies. It has taken time for IBM to open up the technology behind the Power8 chip and for others, such as Google, to create open source versions of the microcode that runs on the processors. The first non-IBM variant of the Power8 chip is sampling, at least one Linux is fully ported to it, and fifteen different components and systems were on display at the inaugural OpenPower Summit in San Jose this week. Now it will be up to system builders to make their cases to customers, and they will likely find that people are receptive to what the OpenPower partners have to say.
And not just to give them leverage with their X86 system vendors, either. There is more to this OpenPower effort than that.
First and foremost, the Open part is more important than the Power part of the effort that IBM started back in August 2013 with the help of search engine giant Google, GPU maker Nvidia, switch chip maker Mellanox Technologies, and motherboard maker Tyan. There are now 113 different organizations in the OpenPower Foundation, and they range from national labs and government agencies that are funding pre-exascale supercomputers to system builders who are tightly linking components like GPUs, FPGAs, and networking cards to the Power8 chip through specialized interconnects that will allow low latency and sometimes high bandwidth connections between those peripherals and the Power8 chip. In many cases, memory on these devices can be shared in a coherent way with the memory subsystems in the Power8 processor complex, and that not only simplifies programming but can significantly boost performance. With IBM opening up the Power8 chip – IBM has not open sourced the Power8 chip, but rather is making it available for licensing for a fee – and Google standing behind the effort (but not quite admitting that it will ever use Power-based machines in its hyperscale infrastructure), the Power chip can indeed become the kernel of a new kind of hybrid system. Or, even more precisely, Power8 and its follow-ons can become the kernel from which a large number of customized systems that are tailored for very specific uses can come to market. (This was one of the topics of conversation that The Next Platform had with Gordon MacKean, who is chairman of the OpenPower Foundation and who also runs server and storage development at Google.)
The fact that the U.S. Department of Energy is spending $325 million to build two massive, pre-exascale systems based on the future Power9 chips married to the future “Volta” Tesla GPU coprocessors from Nvidia doesn’t hurt the OpenPower collective from a public relations standpoint, either. One may argue that the U.S. government likes to have at least two architectures and multiple suppliers for its largest systems, so that a win for the IBM-Nvidia-Mellanox team for the Summit and Sierra machines being installed at Oak Ridge National Laboratory and Lawrence Livermore National Lab, respectively, was almost inevitable once Intel and Cray were tapped to build the other big pre-exascale machines. But a win is a win, and this one is important because unlike some other IBM-led supercomputer deals, the CORAL procurement results in a system that will have direct application to both enterprise and hyperscale applications even before the Summit and Sierra systems are built in 2017. This is the first time that we can recall since the advent of Linux-based Beowulf clusters that a technology approach will trickle up to the stratosphere of supercomputing. IBM’s BlueGene massively parallel machine and the failed Power-based Blue Waters system did not trickle down and find commercial adoption. In fact, the Power7 variant of Blue Waters would have cost $1.2 billion for Big Blue to build and that is why the company pulled the plug on the project.
No one has money for such wasted effort anymore. It is that simple.
The CORAL funding gives the OpenPower partners some time to actually build an ecosystem of motherboard makers, peripherals for accelerating compute and memory and storage, system suppliers, and applications tuned to run the whole gamut. That was the key theme of the OpenPower Summit this week, which was co-hosted inside of Nvidia’s annual GPU Technology Conference in San Jose. The first fruits of the labors of the OpenPower partners were on display, including a number of components, full-blown systems, and some application stacks tuned up to run atop hybrid systems based on Power8 processors.
John Lockwood, CEO at Algo-Logic, which makes high frequency trading systems for the financial services market, gave a presentation at the GTC-OpenPower conferences, outlining a heterogeneous system that combined the Power8 chip with Nvidia GPUs and FPGAs running in both an Ethernet network interface made by Mellanox and a PCI adapter card made by Nallatech that has an Altera FPGA on it. The Tick-to-Trade system that Algo-Logic created was already implemented on X86 iron, but with the Power8 systems, Algo-Logic is able to make use of the Coherent Accelerator Processor Interface (CAPI) to radically reduce the latency between the network card and FPGA card to handle market data feeds and execute transactions based on algorithmic trading.
Speaking to The Next Platform after his presentation, Lockwood said that the hybrid idea is simple enough to express: FPGAs are deployed where you need low latency on transactions, GPUs are used where you need high throughput calculations for the parts of the application that can be parallelized, and CPUs are used for those portions of the code that need fast execution on single threads. Any HFT application has all of these needs, and the interesting thing is that Algo-Logic has put them all into a single system. Lockwood thinks that this hybrid idea is going to catch on, even among the hyperscalers that have tended to like homogeneous architectures because they can drive their costs down by having only a few kinds of systems in their massive fleets.
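Lockwood's rule of thumb can be written down as a small placement function. What follows is only an illustrative sketch in Python, not Algo-Logic's software: the `Task` type, its two flags, and the task names are hypothetical, invented here to show the decision order he describes (latency first, then parallelism, then single-thread speed).

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of one piece of a trading application."""
    name: str
    latency_critical: bool  # must react within a very tight deadline
    parallelizable: bool    # work can be spread across many threads


def place(task: Task) -> str:
    """Pick a device class following the heuristic quoted in the article."""
    if task.latency_critical:
        return "FPGA"  # low latency on transactions
    if task.parallelizable:
        return "GPU"   # high throughput on parallel calculations
    return "CPU"       # fast execution on a single thread


# Hypothetical task mix for a tick-to-trade style application.
tasks = [
    Task("market_data_feed", latency_critical=True, parallelizable=False),
    Task("risk_model", latency_critical=False, parallelizable=True),
    Task("order_logic", latency_critical=False, parallelizable=False),
]

placement = {t.name: place(t) for t in tasks}
print(placement)
```

A real scheduler would weigh data movement between devices as well, which is the cost that CAPI is meant to reduce.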
“If you tried to propose a heterogeneous system like this to Google or Facebook a decade ago, you would have been laughed out of their datacenters,” Lockwood told The Next Platform. “While open source software is great and Google and Facebook have made billions of dollars off this software, maybe if you need a million machines to run that software there is a better way.”
To be fair, the hyperscalers are, by definition, operating at quite a different scale than most enterprises, and particularly high frequency trading firms, which tend to have a few racks of gear except for the very largest players. But Lockwood’s point is no less valid and hyperscale companies could find that by breaking up their workloads and reimplementing them on a collection of Power8 systems accelerated through CAPI this year and NVLink with Nvidia Tesla GPU coprocessors next year, they could get a lot more work done with a lot less iron because the systems have lower latency, higher thread count, and more memory bandwidth than X86 systems.
Big Blue Meets Bigger Red
The OpenPower Foundation showed off fifteen different pieces of hardware at the event, a sort of coming out party for the ecosystem. The two-socket Google and one-socket Tyan system boards that IBM showed off last year to whet everyone’s appetite were on display again, and so was a board under development by Rackspace Hosting, which has committed to developing its own Power8-based system, deploying it in Open Compute form factors, and putting it into its cloud alongside X86 iron. (We will cover the Rackspace “Barreleye” system in a separate story.)
The CP1 chip that is under development by Suzhou PowerCore was also on display, and according to Brad McCredie, who is vice president of Power Systems development at IBM and president of the OpenPower Foundation, the CP1 chip is a modest variant of IBM’s own Power8 designs, with some modifications related to security which he did not elaborate on further. (That the Chinese market is concerned with security will come as no surprise to anyone.) Based on what it takes to design a new chip, it will probably take Suzhou PowerCore about two years to use the chip design software it licensed along with the Power8 chip to create its own custom variant. All the Chinese chipmaker, which has done custom PowerPC chips for the embedded markets in the past, has said so far about its plan is that it is making a variant of the chip that is suitable for the enterprise market in China.
The CP1 chip will be fabricated by GlobalFoundries in the former IBM chip plant in East Fishkill, New York. The first system to use the CP1 processor will come from Inspur, one of the larger and fast-growing system makers in China, which is putting the processor inside of a 4U system that has two processors, 64 memory slots for a total of 2 TB of memory, a dozen PCI-Express 3.0 slots, and four 1 Gb/sec Ethernet ports. The system has eight 2.5-inch drive bays for local storage. Conceptually, the Inspur machine is not so much different from IBM’s own Power S824 system, except this one will have a processor and a system manufacturer that is thought of as being indigenous.
Indigenous is important to the Chinese government, which has funded its own clones of the Sparc and MIPS architectures. Through various agencies, China put a lot of money into developing the 64-bit “Godson” variants of the MIPS architecture, specifically to have a homegrown chip that could be deployed in enterprise and HPC systems. Given all of the system development around the Power architecture thanks to the OpenPower Foundation and, as we now know, the U.S. Department of Energy, it will not be at all surprising to see a Top 500-class supercomputer based on the PowerCore CP1 chip come to market that looks similar to the Summit and Sierra machines. Nvidia has said that it is willing to license its GPU designs and, as far as we know, has not had any takers, but the Chinese government would be an obvious one if it really wants to have a hybrid supercomputer that has a Made In China stamp on it from top to bottom.
The state owned governments in China want systems that are “safe and secure,” as McCredie delicately put it. They want fast, efficient machines but they don’t want to use chips and systems controlled by vendors outside of China for the same reason the U.S. government gets nervous about deploying machines from Huawei Technologies. Everybody is worried about backdoors that they don’t control, even if they might not exist. This is one reason why Chinese startups are taking a shine to OpenPower.
Zoom Netcom, another system maker from China, will be launching a line of two-socket machines based on the CP1 chip, with the brand name RedPower Systems on them, which pretty eloquently sums up the attitude of the Chinese vendors as they embrace the opened up Power8 chip and architecture. The first Zoom Netcom machine will come in a two-socket configuration with 64 memory slots, eight drive bays, a dozen PCI-Express 3.0 slots, and four 1 Gb/sec Ethernet ports – all crammed into a 4U configuration and very similar to the Inspur machine above. There will be two different variants of the RedPower Systems, the C210 and the C220, which will differ from each other in their processing and memory capacities; exactly how was not divulged. Zoom Netcom will also be creating its own Linux variant for these systems, according to McCredie.
ChuangHe, another server maker with expertise in the telecommunications space, has created the OP-1X, a single-socket Power8 system that comes in a 1U form factor and has 32 memory slots. This machine is heavy on the memory, reasonably heavy on the compute, and light on the I/O, and is using a motherboard designed by Taiwanese motherboard and system maker Tyan.
The first commercial Power8 system to come out under the auspices of the OpenPower Foundation will actually come from Tyan. That company’s TN-71-BP012 system, which is nicknamed “Habanero,” is designed for large-scale cloud deployments and, significantly, will be used by IBM as it rolls out Power-based systems into its SoftLayer cloud. If you don’t think that IBM is changing its tune about the openness of hardware, remember that it is not using its own Power S812L or Power S824L systems in its own public cloud, but rather machines designed and built by a rival. The Habanero system will be available to IBM and any other buyer in the second quarter of this year, and has 32 memory slots for a total of 1 TB of memory, and the “Centaur” memory controller chips used in the system implement 64 MB of L4 cache. The Habanero system has a single Power8 socket and will use the merchant variants of the Power8 processor that IBM is selling, which have different core counts and clock speeds from the ones it deploys in its own Power Systems lineup. The Habanero system has a dozen 2.5-inch drive bays that can handle hot-swap SAS or SATA units and has four PCI-Express 3.0 slots, two of which are enabled to support CAPI. The system has a dual-port ConnectX-3 10 Gb/sec network adapter card, which is not CAPI-enabled. (Mellanox is showing off a ConnectX-4 adapter that supports InfiniBand and Ethernet protocols running at anywhere from 10 Gb/sec to 100 Gb/sec speeds that does support CAPI links to the Power8 chip, however.)
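The memory figures quoted for the Inspur and Habanero machines imply the same memory stick capacity and the same number of slots per socket. A quick back-of-the-envelope check in Python, using only the slot counts, socket counts, and maximum capacities given above and assuming binary units (1 TB = 1,024 GB) and fully populated slots:

```python
GB_PER_TB = 1024  # assumption: binary terabytes

# name: (sockets, memory slots, maximum memory in GB), as quoted in the article
systems = {
    "Inspur 4U": (2, 64, 2 * GB_PER_TB),
    "Tyan Habanero": (1, 32, 1 * GB_PER_TB),
}

# Capacity each slot must hold to reach the quoted maximum.
gb_per_slot = {
    name: max_gb // slots for name, (_, slots, max_gb) in systems.items()
}

# Memory slots hanging off each Power8 socket.
slots_per_socket = {
    name: slots // sockets for name, (sockets, slots, _) in systems.items()
}

for name in systems:
    print(f"{name}: {gb_per_slot[name]} GB per slot, "
          f"{slots_per_socket[name]} slots per socket")
```

Both machines work out to 32 GB per slot and 32 slots per socket, so the two-socket Inspur box is, in memory terms, simply double the single-socket Habanero.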
Wistron, another ODM based in Taiwan, was showing off its “Firestone” system, which is a prototype machine aimed at the HPC market in particular and one that IBM says is a stepping stone to the Summit and Sierra systems being deployed as part of the CORAL procurement. The Firestone system puts two Power8 chips and two Nvidia K40 GPUs in a 2U enclosure, and it has room for two 2.5-inch drives and supports up to 1 TB of memory across its eight riser cards. Interestingly, these riser cards do not use IBM’s high density memory, but regular DDR3 memory that is less dense but a lot cheaper. Incidentally, IBM will be reselling this Firestone machine with its own brand on it.
That leaves the Cirrascale RM4950 developer platform that is based on Tyan’s single-socket Power8 motherboard and puts four Tesla K40 coprocessors and a PCI-Express switch, called the SR3514, developed by Cirrascale into a 4U chassis. This switch is used to glue the four GPUs and the Power8 processor together with peer-to-peer access, a bit akin to a slow-mo version of the NVLink technology that IBM and Nvidia will deploy with the Power 8’ (that apostrophe means “prime”) processors next year.
That is a reasonable amount of development and production iron for a nascent market. McCredie said that there were nearly 100 other pieces of hardware currently in development among the 113 members of the OpenPower Foundation for future systems and that next year he would need a much bigger table.
“No one company alone can spark the magnitude or diversity of the kind of innovation that we are going to need for the growing hyperscale datacenters or what I think of as the next generation of warehouse-scale computing,” McCredie said during his keynote address introducing all the new OpenPower hardware. “Up until now, in warehouse-scale datacenters, one size had to fit all – namely servers running X86 processors. But with computer companies and cloud service operators now able to license and modify Power technology, they are no longer limited to using off-the-shelf components controlled by a single company. Today the limits of the X86 architecture are profound, and reliance on the X86 architecture is constraining companies, industries, and countries that need innovation. The OpenPower community is breaking the locks and opening the doors to the future.”
One other thing that McCredie said, hinting at yet another difference that would help foster the OpenPower ecosystem: “We are going to share the profits so no one company gets more than its fair share of the profits.”
That barb is aimed right at Intel and could start to resonate with other server makers who currently push Xeon chips so hard, particularly if there end up being multiple sources of Power processors and a wider set of feeds and speeds for the chips. It is still early days for OpenPower and its ecosystem, but it only takes a few hyperscalers buying fairly steadily to turn this from an idea into a business. With IBM working with Inspur and Zoom Netcom and, as McCredie pointed out to The Next Platform, also working on the Scorpio common rack project with Chinese hyperscalers Tencent, Baidu, and Alibaba, it looks like the OpenPower partners could get their first wave of big business in the Middle Kingdom. IBM will very likely push OpenPower gear very hard, too, and has committed to reselling two of the machines – the Wistron and Tyan systems above – and will likely tap others as appropriate for its enterprise, hyperscale, and HPC customers.