IBM Rounds Out Power9 Systems For HPC, Analytics
May 9, 2018 Timothy Prickett Morgan
Back in the early 1990s, when IBM was having its near-death experience as the mainframe business faltered, Unix systems were making huge inroads into the datacenter, and client/server computing was pulling work off central systems and onto PCs, the company was on the ropes and probably close to bankruptcy. At the time, the Wall Street Journal ran a central A1 column story, where a bunch of CIOs who were unhappy with Big Blue were brutally honest about how they felt.
One of them – and we have never been able to forget this quote – who had moved to other systems quipped: “You can find better, but you can’t pay more.”
Ouch. A lot of things have happened in those 25 years, and one of them is that IBM turned around its Power chip and system business and eventually vanquished rivals Sun Microsystems and Hewlett Packard from the glass house. To be sure, the Unix server business is considerably smaller than it once was, but IBM has embraced Linux and has created a compelling set of processors, with lots of I/O that is not available on Intel Xeon or AMD Epyc processors, that give it a credible chance of stealing some market share from X86 platforms. At least by IBM’s numbers, it is the “Skylake” Xeon SP platform that is the pricey one, and with the “Boston” Linux-only Power LC921 and Power LC922 machines and updated “Newell” Power AC922 machines that sport more GPUs as well as water cooling that IBM is launching this week, Big Blue is rubbing it in a little in the hopes of taking more market share away from Xeons for data analytics and traditional HPC simulation and modeling.
New Twist On The Newell
Let’s start with the tweaks to the Newell system. The Power AC922 is the commercial version of the server nodes that are used in the “Summit” supercomputer at Oak Ridge National Laboratory and its companion “Sierra” system at Lawrence Livermore National Laboratory (which has a different configuration). The initial AC922 system, which launched in December last year, was an air-cooled machine that crammed two “Nimbus” Power9 processors and four “Volta” Nvidia Tesla V100 GPU accelerators into a single 2U node; this is the node that is used in Sierra. Back in December, IBM promised that it would deliver a variant of the machine that had six Tesla V100s in a water-cooled chassis, which is the node that is used in the Summit machine. With this week’s launch, IBM is making good on that promise.
The air-cooled Power AC922 supports the Volta GPUs with either 16 GB or 32 GB of HBM2 memory, and has two processor options: a pair of 16 core Power9s running at 2.7 GHz (3.3 GHz turbo speed) or a pair of 20 core Power9s running at 2.4 GHz (3 GHz turbo speed). The machine has eight memory slots per socket and supports memory sticks in 8 GB, 16 GB, 32 GB, 64 GB, or 128 GB capacities, and that memory gets more expensive per GB as the sticks get fatter. The machine tops out at 2 TB of main memory with those 128 GB sticks, although most HPC shops won’t shell out for more than 256 GB or 512 GB of main memory, so the point is moot. (It is hard to say what AI shops will do once they figure out coherence between the CPU and GPU memories.) This air-cooled AC922 has two 2.5-inch media bays, an integrated (non-RAID) SATA controller, a single 100 Gb/sec InfiniBand or Ethernet slot, and four PCI-Express 4.0 slots, with three of them able to support IBM’s CAPI coherence with system memory across those PCI-Express slots. The machine supports Red Hat Enterprise Linux 7.5 or Canonical Ubuntu Server 18.04.
The base air-cooled Power AC922 costs $5,100. The 16 core Power9 processor costs $2,999 and the 20 core Power9 costs $3,999. IBM is charging $11,499 for a Tesla V100 with 16 GB of frame buffer memory and $15,499 for the V100 with 32 GB. For workloads sensitive to memory bandwidth, you want to fill all 16 memory slots on the box. That means using either 16 GB sticks, which cost $619 or $39 per GB, to reach 256 GB, or 32 GB sticks, which cost $1,179 or $37 per GB, to reach 512 GB. The 64 GB sticks cost $2,699 or $42 per GB, and the 128 GB sticks cost $9,880 or $77 per GB. With two 3.2 TB NVM-Express flash drives, 256 GB of main memory, and a dual-port 100 Gb/sec EDR InfiniBand adapter, a Power AC922 with two 20 core Power9s and four of the 32 GB Voltas costs $133,215 at list price. That extra GPU memory really adds up in the overall cost, but the other components are not cheap, either.
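To make the arithmetic behind those memory and GPU figures easier to check, here is a minimal Python sketch that uses only the list prices quoted above. The constant and function names are our own, and the "unitemized" remainder is simply whatever is left after subtracting the components the article prices individually (it presumably covers the flash drives and the InfiniBand adapter, but that split is an assumption, not something IBM has published here).

```python
# List prices quoted in the article, in US dollars.
BASE_CHASSIS = 5_100
CPU_20_CORE = 3_999
V100_16GB = 11_499
V100_32GB = 15_499
DIMM_PRICES = {16: 619, 32: 1_179, 64: 2_699, 128: 9_880}  # stick size in GB -> price
MEMORY_SLOTS = 16        # eight slots per socket, two sockets
QUOTED_TOTAL = 133_215   # quoted list price of the four-GPU configuration


def price_per_gb(capacity_gb: int) -> float:
    """Dollars per GB for one memory stick of the given size."""
    return DIMM_PRICES[capacity_gb] / capacity_gb


def full_population(capacity_gb: int) -> tuple:
    """Return (total GB, total cost) when every slot holds this stick size."""
    return capacity_gb * MEMORY_SLOTS, DIMM_PRICES[capacity_gb] * MEMORY_SLOTS


for size in DIMM_PRICES:
    total_gb, cost = full_population(size)
    print(f"{size:>3} GB sticks: ${price_per_gb(size):.0f}/GB, "
          f"{total_gb} GB for ${cost:,}")

# Components of the quoted configuration that the article prices individually.
gpu_cost = 4 * V100_32GB
itemized = BASE_CHASSIS + 2 * CPU_20_CORE + gpu_cost + full_population(16)[1]

# ASSUMPTION: the remainder is everything not itemized above (flash, networking).
unitemized = QUOTED_TOTAL - itemized
gpu_share = gpu_cost / QUOTED_TOTAL
gpu_memory_premium = 4 * (V100_32GB - V100_16GB)

print(f"Itemized components: ${itemized:,}; remainder: ${unitemized:,}")
print(f"GPUs are {gpu_share:.1%} of the quoted price")
print(f"Choosing 32 GB over 16 GB V100s adds ${gpu_memory_premium:,}")
```

On these numbers, the four 32 GB GPUs account for a bit under half of the quoted list price, and the step up from 16 GB to 32 GB cards is worth $16,000 across the node.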
The water-cooled Power AC922 node supports four or six V100s and is akin to the node used in the Summit machine. (The version launched back in December had only four V100s, and they had only the 16 GB memory capacity on those GPU accelerators.) This system has two CPU options, but they are different from those in the machine above: an 18 core Power9 chip running at 3.15 GHz (3.45 GHz turbo, costing $3,499) and a 22 core Power9 running at 2.8 GHz (3.1 GHz turbo, costing $4,499). The peripheral slots and options are the same, as are the operating systems supported. The top-end version of this Power AC922 machine, using the 22 core Power9s, with 512 GB of main memory, six of the 32 GB V100s, and the same InfiniBand setup, will set you back $166,213 at list price.
Both of these machines will be available on May 25.
These may sound like high prices, but those GPU accelerators are a lot more expensive than anyone thought they would be, and this has as much to do with the complex manufacturing for them as it does the demand for these chips among cryptocurrency enthusiasts and gamers. There is more GPU compute demand than there is supply, and that has driven up prices considerably, as we have calculated recently. Memory and flash prices are also higher than anyone expected two years ago. It is a good thing that HPC and AI shops can justify these high prices, and those memory, flash, and GPU price increases affect all system makers, including those peddling Xeon and Xeon Phi gear as well as AMD Epyc gear.
Big Data Engines
While you can add GPU or FPGA accelerators to the two other machines IBM is launching this week, code-named Boston, their main purpose is to run Hadoop batch and Spark in-memory processing or to support NoSQL data stores such as MongoDB, Cassandra, or Redis.
There are two Boston machines, the Power LC921 and the Power LC922. As the names suggest, these are Linux-only compute engines based on the Power9 processor; they have two processors in a box, and they come in either a 1U or 2U chassis.
Here is what the Power LC922 motherboard, which is a minimalist board indeed, looks like:
Here is what the two machines look like:
The Power LC922 is distinct from the “ZZ” Power L922 Linux-only system that was launched earlier this year, which has a lot more peripheral expansion room. The Power9 chips used in the Power LC922 come in three varieties:
- 16 core running at 2.9 GHz, costing $2,399
- 20 core running at 2.7 GHz, costing $2,799
- 22 core running at 2.6 GHz, costing $2,999
The Power LC922 is aimed mostly at data analytics workloads where a certain amount of local storage is expected on each node. The machines support up to 2 TB of DDR4 memory across 16 memory slots, and the box also has room for two dozen 2.5-inch or a dozen 3.5-inch drives, supporting up to 120 TB of disk or 45.6 TB of flash storage. The spec says there is only room for a dozen 2.5-inch drives, but it is wrong. Here is a picture showing the small form factor drives that proves it:
There are four 2.5-inch bays that are enabled with the NVM-Express protocol to boost the speed of the flash. The machine has six PCI-Express 4.0 slots of various sizes (one x16 and three x8) and three of the slots are CAPI 2.0 capable.
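The storage ceilings quoted for the Power LC922 can be sanity-checked with a few lines of Python. This is only a sketch: the bay counts, the 10 TB disk size, and the 3.2 TB NVM-Express drive size come from this article, while the per-drive flash capacity is inferred by assuming the 45.6 TB maximum is reached with all two dozen 2.5-inch bays populated, which IBM's spec may or may not intend.

```python
# Figures taken from the article.
LFF_BAYS = 12          # 3.5-inch bays
SFF_BAYS = 24          # 2.5-inch bays
NVME_BAYS = 4          # 2.5-inch bays enabled for NVM-Express
DISK_TB = 10           # largest disk drive mentioned for this machine
NVME_DRIVE_TB = 3.2    # largest NVM-Express drive mentioned
MAX_FLASH_TB = 45.6    # quoted flash ceiling

# A dozen 10 TB disks should reproduce the quoted 120 TB disk ceiling.
disk_capacity_tb = LFF_BAYS * DISK_TB

# ASSUMPTION: the flash ceiling uses all 24 small form factor bays.
implied_flash_drive_tb = MAX_FLASH_TB / SFF_BAYS

# Capacity if only the four NVM-Express bays hold 3.2 TB drives.
nvme_capacity_tb = NVME_BAYS * NVME_DRIVE_TB

print(f"Disk ceiling: {disk_capacity_tb} TB")
print(f"Implied flash drive size: {implied_flash_drive_tb:.2f} TB")
print(f"NVM-Express bays only: {nvme_capacity_tb:.1f} TB")
```

Under that assumption, the flash ceiling works out to roughly 1.9 TB per bay, which is at least consistent with the claim that the chassis holds two dozen small form factor drives rather than one dozen.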
IBM does not charge for the base Power LC922 node, which is one way it keeps the cost of the Linux-only machines lower and therefore competitive with Xeon machines. Prices for memory, network adapters, disks, and flash are all over the place, depending on capacity and speed, as you might expect. Oddly enough, the memory is a little more expensive on the Power LC922 than on the Power AC922, particularly for those dense 128 GB sticks, which cost $13,299 each or $104 per GB. The premium is only 1.8 percent on 16 GB sticks, is a little more than twice that on the 32 GB and 64 GB sticks, but is 35 percent on the fat 128 GB sticks. NVM-Express drives in a 2.5-inch form factor are not cheap, either, costing $1,391 for a 960 GB unit, $5,574 for a 1.6 TB unit, and $11,999 for a 3.2 TB unit. If you get a pair of the fastest CPU cards, put in 1 TB of memory (filling all of the slots so as to get the maximum memory bandwidth), add four of those 3.2 TB midrange NVM-Express drives, eight 10 TB drives, and a two-port 25 Gb/sec Ethernet adapter, this machine comes in at $88,169. This is a very hefty configuration, and you can get an all-disk, lightly configured node for a lot less: with the 16 core Power9 chips, 512 GB of memory, and a dozen of the 10 TB drives (perhaps good for Hadoop), this machine comes in at $46,097 per node.
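For readers who want to reproduce the memory premium and see how much of the hefty configuration is flash, here is a small Python sketch built only from the prices quoted in this article. The names are ours, and the 960 GB NVM-Express price is read as $1,391, which is our interpretation of the figure as printed.

```python
# 128 GB memory stick list prices quoted for each machine, in US dollars.
AC922_DIMM_128GB = 9_880
LC922_DIMM_128GB = 13_299

# 2.5-inch NVM-Express list prices: capacity in GB -> price.
# ASSUMPTION: the 960 GB unit is read as $1,391.
NVME_PRICES = {960: 1_391, 1_600: 5_574, 3_200: 11_999}

HEFTY_CONFIG_TOTAL = 88_169  # quoted list price of the heavily configured node


def premium_percent(price: float, baseline: float) -> float:
    """Percentage by which price exceeds baseline."""
    return (price / baseline - 1) * 100


dimm_premium = premium_percent(LC922_DIMM_128GB, AC922_DIMM_128GB)
lc922_per_gb = LC922_DIMM_128GB / 128
print(f"128 GB stick premium on the LC922: {dimm_premium:.1f} percent")
print(f"LC922 128 GB stick: ${lc922_per_gb:.0f} per GB")

# Cost per GB rises with capacity on these flash drives.
for capacity_gb, price in NVME_PRICES.items():
    print(f"{capacity_gb:>5} GB NVM-Express: ${price / capacity_gb:.2f} per GB")

# Four 3.2 TB drives as a share of the quoted hefty configuration.
flash_cost = 4 * NVME_PRICES[3_200]
flash_share = flash_cost / HEFTY_CONFIG_TOTAL
print(f"Four 3.2 TB drives: ${flash_cost:,} ({flash_share:.1%} of the total)")
```

The premium on the 128 GB sticks comes out at just under 35 percent, matching the rounded figure above, and the four 3.2 TB flash drives alone make up more than half of the $88,169 list price.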
You have to really price out configurations for different scenarios. And you also have to realize this is list pricing, which no one pays and particularly not customers displacing Xeon iron and buying dozens to hundreds of nodes.
The Power LC921 is a skinnier 1U machine with two sockets, and it is aimed more at data ingest than at data analytics workloads, says Dylan Boday, senior offering manager for Power Servers within the Cognitive Systems division at IBM. The Power LC921 offers a 16 core Nimbus chip running at 2.2 GHz with a 140 watt thermal rating or a 20 core chip running at 2.13 GHz that comes in at 160 watts. This machine has four drive bays, all of which can hold either 2.5-inch or 3.5-inch media and support NVM-Express. With the 20 core chip, 512 GB of memory, and four 10 TB drives, the Power LC921 has a list price of $32,737. It costs $48,153 if you swap out those disks for the 3.2 TB NVM-Express flash drives.
Up next, we will walk through IBM’s own competitive analysis for these Power9 platforms against the Skylake Xeons from Intel.