It is certainly true that no technology company can grow if it is not able to do business in China. But it is equally true that no technology company can grow without doing business in the United States or Europe, too. And sometimes we forget, in watching the eagerness of US-based tech companies to do backflips to get access to China, that the many big tech companies in The Middle Kingdom want to do business here in the US of A. They are setting up local companies to do so and leveraging every opportunity to make the case as to why they are as much an indigenous supplier as Dell, Hewlett Packard Enterprise, Supermicro, Lenovo, or IBM.
Inspur is probably the best example of this in the datacenter. The company has vaulted into the upper echelons of the server market, thanks in large part to its large deals with online retailer Alibaba, social media and gaming giant Tencent, and telecom giant China Mobile, among many other companies operating at or near hyperscale. We will take a much deeper dive into who Inspur is and the many arenas that it plays in within the IT sector at another time, but some statistics are in order here before we talk about what Inspur is doing in conjunction with the Open Compute Project, the open source hardware foundation started by Facebook almost eight years ago.
The Chinese IT company was founded in 1945, and has four different divisions that are listed separately on the stock exchanges in Shenzhen, Hong Kong, Shanghai, and Beijing (NEEQ over the counter in that last case). The conglomerate had 27,200 employees (about 14,100 of them are hardware and software engineers of some sort) as 2017 came to a close and booked $12.6 billion in revenues for products sold across 108 countries. Inspur has a small but fast-growing presence in the United States, with its regional headquarters located in Fremont, California, a hotbed of system integration that also is the home of Supermicro and a number of other server players.
In the hyperscale and cloud sectors, Inspur has a very big piece of the action in China. Alibaba is its flagship customer, and Inspur accounts for more than 50 percent of the server purchases that the Chinese online retailer makes. Inspur started shipping “Skylake” Xeon SP systems to Alibaba in May 2017, months ahead of their launch by Intel, as part of a $735 million server upgrade cycle that was booked in the first half of last year. At Baidu, Inspur has attained 60 percent market share of servers and 100 percent share of GPU accelerated servers for machine learning workloads. At Tencent, Inspur has 30 percent share of the server base, and has been instrumental in bringing AMD Epyc server chips and various FPGA accelerators into the Tencent datacenters. Add it all up and Inspur has about half of the plain vanilla server shipments and about 80 percent of the GPU accelerated machine learning shipments among the hyperscalers and cloud builders in China.
John Wu, who is chief technology officer at Inspur, tells The Next Platform that the company’s server business had a compound annual growth rate of 40 percent between 2014 and 2017, inclusive, and that through September of this year Inspur had server bookings of $4 billion and thinks it will reach $6 billion for all of 2018. Inspur’s business in the United States is growing a lot faster, and has tripled over the past year; Inspur did not divulge what the revenue stream is here. Clearly, Inspur wants it to be more, and it thinks that it can compete successfully against all of the players mentioned above, plus Cisco Systems, a number of niche players in the HPC market, and the ODMs in the hyperscale market. (It will be tough, if not impossible, to dislodge Quanta Computer and Foxconn from Google and WiWynn from Facebook, to be sure.)
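To put those growth figures in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers Wu cited; the function names are our own, and it assumes that 2014 through 2017 represents three compounding intervals and that the first three quarters of 2018 were booked evenly, neither of which Inspur confirmed.

```python
def growth_multiple(cagr: float, intervals: int) -> float:
    """Total growth factor after compounding `cagr` over `intervals` years."""
    return (1.0 + cagr) ** intervals


def remaining_to_target(booked: float, target: float) -> float:
    """Bookings still needed to reach the full-year target."""
    return target - booked


# Assumption: 2014 -> 2017 is treated as three compounding intervals.
multiple = growth_multiple(0.40, 3)

# Figures from the article, in billions of US dollars.
booked_through_september = 4.0
full_year_target = 6.0

# Assumption: the first three quarters were booked evenly.
avg_quarter_so_far = booked_through_september / 3
q4_needed = remaining_to_target(booked_through_september, full_year_target)
q4_uplift = q4_needed / avg_quarter_so_far

print(f"Growth multiple 2014-2017: {multiple:.2f}x")
print(f"Q4 bookings needed: ${q4_needed:.1f} billion")
print(f"Q4 versus average prior quarter: {q4_uplift:.2f}x")
```

Under those assumptions, the server business was roughly 2.7X larger in 2017 than in 2014, and Inspur would need about $2 billion in fourth quarter bookings, or 1.5X its average quarter so far this year, to hit the $6 billion mark.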
But the server business is no longer about buying parts in bulk and bending metal, and has not been for quite some time. There are volume economics that come into play, to be sure, and Inspur, as one of the top server manufacturers in the world, can command the best prices from parts suppliers. But as we have pointed out before, a lot of the profit within a server ends up with the CPU, memory, flash, and operating system makers (the latter at least for enterprise customers who do not create their own Linux distributions and who often also adopt Microsoft’s Windows Server). There certainly is very little profit left in server designs, which is why Inspur is enthusiastic about opening up its hardware specifications.
“We have a very good footprint in open source designs, with Open Compute Project, the “Scorpio” Open Data Center Committee, and the Open19 efforts,” says Wu, referring to the Facebook effort as well as the companion one launched seven years ago by Alibaba, Baidu, and Tencent as well as the 19-inch rack derivative of the OCP designs that is a separate effort. “We think that open source hardware will be a foundation for hyperscale and cloud services of the future, and that sharing the designs is important because in the future, the competition will extend to the full stack, not only the software. These companies take open source hardware and change it to provide differentiation in the market, and now based on their competition between themselves, they need more intensive competition with the hardware. We think that the server business comes down to providing services like system design and even manufacturing is a kind of service. Server vendors have value in this market because they can provide those services. We do not expect to get extra money to keep designs confidential. We use our labor to get the revenue as we serve the customers, and this is how we are different from the old server business model.”
Inspur has been involved in some fashion with the OCP effort for the past two years, and started out by providing its own manufacturing capability for the existing OCP machinery. Now, it is moving on to donate designs of its own to the community, backed by its own manufacturing capability both in the United States and abroad, while also allowing competing manufacturers to adopt its OCP designs and sell them.
Every hyperscaler and cloud builder wants to spread out its risk across a broader supply chain while at the same time negotiating with as few suppliers as possible to get the best volume economics. The open designs allow this to happen more fluidly, and that is preferable to having two different OEMs with radically different server designs, baseboard management controllers, and system management software stacks sharing the revenue stream and bifurcating the datacenter.
The first Inspur design, Compute Node 1 (also known as ON5163M5), is a two-socket machine that is 1OU high and a third of an Open Rack unit wide, which allows for up to 96 nodes to be put into an Open Rack V2.0 cabinet. This system supports two processors (they are “Skylake” Xeons at the moment, but there is no reason a motherboard for AMD “Naples” Epyc or IBM “Nimbus” Power9 or Cavium ThunderX2 chips could not be hacked into the node) and has sixteen memory slots across the two sockets. This one has a single M.2 flash card interface plus one Open Compute NIC and a single PCI-Express x16 slot that can additionally be equipped with a 10 Gb/sec, 25 Gb/sec, or 40 Gb/sec network interface card (or any other PCI-Express 3.0 x16 peripheral card). This particular machine is aimed at search engine processing, machine learning inference, and data analytics, according to Wu.
Compute Node 2 (product number ON5283M5) is aimed at data processing acceleration, I/O expansion, transaction processing, and image search applications, according to Wu. It is a two-socket machine that is 2OU high and that has sixteen memory slots, one M.2 flash slot, four 2.5-inch media bays that can support disk drives or NVM-Express flash drives, and three PCI-Express 3.0 x8 slots.
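The density claim for Compute Node 1 is easy to sanity check. The following is a rough sketch, not anything from Inspur: the `NodeSpec` class and `rack_totals` function are hypothetical names of our own, and the only inputs are the figures given above (1OU high, three nodes across, two sockets and sixteen memory slots per node, 96 nodes per cabinet).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures taken from the article (names are illustrative)."""
    height_ou: int      # node height in OpenU
    nodes_per_row: int  # nodes that fit side by side across the rack
    sockets: int
    dimm_slots: int


def rack_totals(node: NodeSpec, max_nodes: int) -> dict:
    """Aggregate rack-level totals for a cabinet filled with `max_nodes` nodes."""
    # Ceiling division: a partly filled row still occupies its full height.
    rows = -(-max_nodes // node.nodes_per_row)
    return {
        "compute_ou": rows * node.height_ou,
        "sockets": max_nodes * node.sockets,
        "dimm_slots": max_nodes * node.dimm_slots,
    }


compute_node_1 = NodeSpec(height_ou=1, nodes_per_row=3, sockets=2, dimm_slots=16)
totals = rack_totals(compute_node_1, max_nodes=96)

for name, value in totals.items():
    print(f"{name}: {value}")
```

By this arithmetic, a full cabinet of 96 nodes takes up 32OU of compute space and holds 192 processor sockets and 1,536 memory slots, with the remainder of the rack presumably left for power shelves and switching.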
Compute Node 3 (product number ON5273M5) is aimed at network function virtualization workloads at hyperscalers, cloud builders, and telecommunications companies, and as such it is a two-socket, three-wide node that has sixteen DIMMs across those two sockets plus two M.2 flash cards, two drive bays (again supporting disk drives or NVM-Express flash drives) and two PCI-Express 3.0 x16 slots.
For disaggregated storage servers, Inspur is opening up its Storage Node design (product ON5266M5), shown above, which is two rack units high and a full rack unit wide and which creates a JBOD array with 34 drive bays that can support a mix of disk, flash, and NVM-Express flash drives. This can be used as a storage expansion module for compute nodes or as a storage pool for a rack of servers connected to it using SAS switches embedded in the JBOD.
That leaves the AI Node (product ON5488M5), shown above, which is a two-socket Skylake server that uses a two-tier PCI-Express switch network to lash the PCI-Express GPUs – sixteen of them – to each other and to the CPU complex. This is a single-wide server that is 4OU high.
This is as much GPU expansion as Nvidia is offering with its DGX-2 system but without having to resort to using NVSwitch. It would be very interesting indeed to pit these two machines against each other and see what the performance and bang for the buck is between them. We have no doubt that the DGX-2 would win on performance, but the Inspur AI Node machine might handily beat it on price/performance, performance per watt, and cost per performance per watt.