IBM’s 2018 Rollout Plan For Power9 Systems
February 6, 2018 Timothy Prickett Morgan
In a way, the processor market started moving in slow motion through 2017 as server makers and their customers were awaiting a veritable cornucopia of processor options, something the industry has not seen in many a year. We have been predicting that there would be a Cambrian Explosion of compute, first in 2017, but it has taken a bit longer for many of these processors to come to market and it looks like 2018 might be the year.
This might be, in fact, the year when IBM’s Power RISC processors see a long-awaited resurgence, and frankly, if it doesn’t happen this year, it is hard to imagine the conditions under which such beasts as the “Nimbus” and “Cumulus” variants of the Power9 chips could do better, against both the prior chips in the Power family (the competition there is really Power7, Power7+, and Power8) as well as against the X86 and Arm competition and the vast installed base of older Intel Xeon iron.
IBM launched the first of its Power9 machines in early December last year, the “Newell” Power AC922, which was also previously known as the “Witherspoon” system in some earlier roadmaps from Big Blue. This machine has two Power9 processors, with up to 1 TB of memory per socket using industry-standard 64 GB DDR4 RDIMM memory sticks. We are beginning to think that Witherspoon is the water-cooled variant with support for six “Volta” Tesla V100 accelerators that is being employed in the “Summit” supercomputer at Oak Ridge National Laboratory, and the Newell machine is the more standard variant that has only four Volta GPUs in it and is like the system being deployed in the “Sierra” supercomputer being installed at Lawrence Livermore National Laboratory.
In any event, as we pointed out back in December, the air-cooled version of the Power AC922 started shipping on December 22, and from what we hear from customers looking to buy one, IBM’s order book is jam-packed and the back of the waiting line is now out to June. (This is thin data, so take it with a grain of salt.) The water-cooled version with four GPUs will ship in the second quarter, as will the six GPU variant, which does not have an air-cooled option.
The new bit of data we have learned is that IBM will also be shipping a 128 GB memory stick starting in the second quarter that will boost the memory capacity on the Newell/Witherspoon system to 2 TB per socket, as well as a skinny 8 GB memory stick that, frankly, probably will not be all that useful except in cases where memory bandwidth is important but memory capacity is not; using 8 GB memory sticks across sixteen memory slots only yields 128 GB of capacity, but it delivers the full 170 GB/sec of peak bandwidth that the 24 core “Nimbus” scale out version of the Power9 chip can deliver with standard memory. You can get higher memory bandwidth with the “Centaur” memory buffer and L4 cache chip, but this adds a lot to the cost and apparently will only be used in NUMA systems with four, eight, twelve, or sixteen sockets in a single system image. IBM is shipping Power9 chips with either 16 cores or 20 cores that run at a base frequency of 2.25 GHz in these machines within a 190 watt thermal envelope. The 16 core version can turbo up to 3.12 GHz and stay within 250 watts, while the 20 core has to gear down to 2.8 GHz to hit that thermal ceiling.
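The capacity figures above are simple multiplication, and a minimal sketch makes the three memory stick options easy to compare. It takes the article's numbers at face value: sixteen memory slots, populated identically with 8 GB, 64 GB, or 128 GB sticks. The function name `capacity_gb` and the constant `SLOTS` are made up for illustration.

```python
# Assumption taken from the article: sixteen memory slots, all filled
# with the same size of memory stick.
SLOTS = 16


def capacity_gb(dimm_gb, slots=SLOTS):
    """Return total memory capacity in GB for identical sticks in every slot."""
    return dimm_gb * slots


for dimm_gb in (8, 64, 128):
    total = capacity_gb(dimm_gb)
    print(f"{dimm_gb:>3} GB sticks x {SLOTS} slots = {total} GB ({total / 1024:g} TB)")
```

The 8 GB option lands at 128 GB, the 64 GB option at 1 TB, and the 128 GB option at 2 TB, matching the figures quoted above. Peak bandwidth is left out of the sketch because, as the article notes, it depends on populating the slots and not on the size of the sticks.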
The Power AC922 is aimed at data analytics, HPC simulation and modeling, and machine learning workloads where GPU acceleration is required.
The rumors going around now are that sometime before the first quarter is done — and we thought it was very likely to be at IBM’s Think 2018 extravaganza in Las Vegas being held from March 19 through 22 but now we are hearing it will be sooner, on February 13 — IBM is going to launch a system code-named “ZZ” that will be the plain vanilla Power9 machine aimed at customers who want systems with one or two sockets and who only want CPU computing. The ZZ system, presumably code-named for the rock band ZZ Top, is aimed at traditional AIX and IBM i customers doing back-end processing and who need a fair amount of local storage to run their applications and databases.
AIX and IBM i are big endian operating systems (describing the order in which bytes are stored and processed), and big endian variants of Red Hat Enterprise Linux and SUSE Linux Enterprise Server can be run on top of PowerVM in logical machines as well. (The Power8 and Power9 chips can support bytes stored in big or little endian formats.) The ZZ systems run IBM’s PowerVM hypervisor and a set of homegrown microcode to manage their hardware components. This is in contrast to the Newell/Witherspoon machines, which sport the OPAL microcode that was developed in conjunction with Google as part of the OpenPower consortium and that is used in conjunction with the OpenKVM implementation of the KVM hypervisor that IBM created explicitly for its
These AIX and IBM i customers are important to HPC and AI shops because they are the backbone of the Power Systems business, and while they do not drive big revenues per system, it does add up across a few hundred thousand global customers. IBM needs to keep these customers happy to extend Power into new areas.
As far as we know, IBM plans to deliver a Power S914, which is a single-socket machine with a 4U enclosure that has lots of room for local disk and flash storage; there will also be a two-socket Power S924 variation on the ZZ system theme. IBM also plans to have a Power S922 machine, which is a two-socket system that crams all of the components into a 2U form factor, but which obviously has a lot less room for storage expansion even if it can support up to 2 TB per socket with those fat 128 GB memory sticks. It is not clear if IBM is supporting variants of the Nimbus Power9 chip in these ZZ systems with SMT4 simultaneous multithreading, which provides a maximum of 24 cores per socket, but we have heard that it will support the SMT8 variants that provide up to 12 fatter cores per socket.
A variant of this latter machine, called the Power S922L, will be essentially the same two-socket, 2U machine, but will only run the OPAL/OpenKVM software and will only support Linux operating systems in little endian mode. If history is any guide, the base machine, processor cards, core activations, memory, and storage costs on this Power S922L machine will all be lower than on the plain vanilla Power S922, helping IBM make inroads against the Intel Xeon base running Linux without sacrificing the margins on AIX and IBM i machines.
IBM is also reportedly planning another variant, code-named “Boston” and said to be branded the Power S921LC and the Power S922LC, and as the name suggests, it will be another Linux-only machine running the OPAL/OpenKVM stack and the little endian Linuxes from Red Hat and Canonical (Ubuntu Server in the latter case). This Boston server will come with one or two sockets in that 2U form factor and will have its memory cut back a bit, room for two Nvidia PCI-Express GPU accelerators, and lots of storage bays. It is aimed at HPC workloads with the need for local disk and flash storage as well as open source databases like PostgreSQL and MySQL and data stores such as Redis and Memcached. We hear that a variant of this Boston machine will also be used as the main node in future hyperconverged storage clusters based on the Nutanix Enterprise Computing Platform, which was ported from X86 to Power last year.
We caught wind of the “Fleetwood” high-end NUMA servers and their “Mack” system interconnect controllers last October, so you already know about the Power E970 and E980 high-end boxes, which will support from four to sixteen sockets. The word we hear is that this big bad iron will be launched in the third quarter of this year, a bit later than we expected but consistent with past rollouts of Power7 and Power8 high-end machinery. We have subsequently learned that the “Zeppelin” machine, presumably to be sold as the Power E950, will also come out in the third quarter. These big iron machines will run the “Cumulus” Power9 variant of the processors, which have SMT8 threading and only twelve cores per socket; they will also use memory that is buffered with those Centaur chips made by IBM. Zeppelin and Fleetwood are big endian systems, and that means they support AIX, IBM i, and the big endian Linuxes from Red Hat and SUSE.
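For readers who have not had to care about the big endian versus little endian split that runs through this product line, here is a minimal sketch of what byte order means in practice. It is generic Python using the standard `struct` module, not anything specific to Power, AIX, or IBM i, and the sample value is arbitrary.

```python
import struct
import sys

# An arbitrary 32-bit sample value, chosen so each byte is easy to spot.
value = 0x0A0B0C0D

# ">I" packs an unsigned 32-bit integer most significant byte first (big endian).
big = struct.pack(">I", value)

# "<I" packs the same integer least significant byte first (little endian).
little = struct.pack("<I", value)

print("big endian bytes:   ", big.hex())
print("little endian bytes:", little.hex())

# Byte order of the machine running this script, for comparison.
print("this machine is:", sys.byteorder, "endian")
```

The same four bytes come out in opposite order, which is why data and binaries written for one byte order cannot simply be dropped onto an operating system running in the other.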