Shifting Tides for China’s Next Wave of Supercomputers
September 11, 2015 Nicole Hemsoth
While the news of the last few years has clearly indicated that China is at the top of the world when it comes to chart-topping supercomputers, the storyline is evolving—and the picture for high performance computing in Asia is not as rosy as many believe.
To be fair, however, the landscape for high performance computing in China was much more picturesque at this time last year and certainly the one before. The sky appeared to be the limit, with announcements at several of the country’s supercomputer sites, from those dedicated to weather to those crunching larger research problems, about massive upgrades and extensions. But then, as is the case in quite a few other areas inside China over the last several months, the tone changed.
Putting aside the larger discussion of the current economic woes in China, there have been two critical blows that have left the country reeling when it comes to the future of its large-scale supercomputing infrastructure. First, and most publicly discussed, is the fact that Intel is restricted from shipping supercomputing processors to China, a major blow for the world’s fastest supercomputer, Tianhe-2, which was set to be upgraded this year. The upgrade cycle for that machine has been pushed out at least a year and, as The Next Platform reported this summer, China will now use an interesting blend of homegrown architectures to power its top super, including an accelerator that is based on digital signal processors (DSPs), which have long been a research target at the National University of Defense Technology (NUDT), where the Tianhe-2 machine is based.
But it’s not just the very top machine in China that will feel the burn. Funding cutbacks throughout China will ripple through all 19 supercomputer sites, with a big impact on the top ten systems and the sharpest cuts falling on the top three or four, mostly because their upgrades will be the most expensive. Considering that two machines in China were set to get a lift past the 100 petaflop barrier and both were reliant on Intel and government funding, those plans are pushed out a year or more. And another large supercomputer in China, the Sunway BlueLight system, which uses non-Intel (ShenWei) processors, will have a tough time getting its petaflop boost as well due to the funding problems.
As one might logically guess, this coupling of funding issues and Intel restrictions might spur an already burgeoning domestic semiconductor and HPC software program that could be best represented by the forthcoming DSP architecture that will be featured on the upcoming Tianhe-2 machine. Indeed, this appears to be the plan, with ramped-up investments up and down the stack, from broad machine learning and more specific HPC-focused software initiatives for both research and industry to ongoing hardware investments like the Godson processors that are cropping up in a new field of Chinese-built machines. But as Earl Joseph, program vice president for HPC at IDC, tells The Next Platform, the current norm on the hardware front is hard to beat.
“At this point in time, all of those systems that are built with domestic [Chinese] processors as the primary processor have not been doing well from a performance standpoint—and they are also costly, so the price performance is still not quite competitive.”
Even still, there are options in China that aren’t solely reliant on homegrown architectures or Intel parts that may never make regulation muster. For instance, consider that IBM has already sold IP rights to China to bring OpenPower based systems into the Asian supercomputing fold. And while these chips might be based on IBM tech, as more roll into production on Chinese systems, these will be acceptable in comparison to the whole swath of critical infrastructure in the country that was moved off IBM and other U.S. companies (banks in particular) before the IBM and Lenovo deal went down.
On that note, there are other efforts in China to take a bite out of American tech vendors as the pullback from non-Chinese hardware and software for big infrastructure continues. For instance, the Chinese government provides funding to companies to displace services that American outfits like Google and hardware vendors like Cisco, IBM, and EMC offer. This push, called the “De-IOE” movement (IOE stands for IBM, Oracle and EMC), is not to be overlooked and has received a fair bit of attention after Alibaba, Asia’s answer to Amazon, kicked out its IBM and Oracle systems in favor of a homegrown set of approaches in 2013.
As one might imagine, there is something to be said for the true reliability of time-tested systems from these vendors, but the risk now could mean a reward in the form of a more vibrant Chinese server and enterprise software market in the coming years. And Europe is thinking this way as well as it watches what is happening with ARM throughout the world (although for now, of course, ARM is just a license-driven company versus a hardware manufacturer). Not to make too much of the European side point, but consider that when Lenovo set up its worldwide applications development center, it wasn’t in China, it was in Stuttgart, Germany, not to mention the fact that Lenovo is part of the European Technology Platform for HPC supported by the European Commission (a government-sponsored vendor organization to create new HPC technologies).
A Quiet Risk: The Reliability Factor
But all of these efforts aside, the point is, to find their competitive edges, countries are ditching the years of development into IBM mainframes, Oracle’s systems and software, and countless other tech products to blaze these new trails. And that is not without its own risk to quality. It may not be a permanent struggle, but as new platforms based on old ideas spin off, they will not be fail proof (not to suggest any system is free from failures, of course).
And remember the DSP architecture that will be the unexpected accelerator upgrade for what is far and away the fastest supercomputer on the planet? It is an interesting compute possibility–but how will it even perform at scale?
As countries like China and others seek their homegrown approaches to time-worn technologies from the world’s largest hardware and software providers, learning by doing (read as error) is the new development framework. After all, consider the pressures.
If one considers that the vast majority of critical infrastructure in China was running on the one bit of hardware and software that is most time-tested, the mighty mainframe, progress toward homegrown competitive efforts might be lumbering for a time. While it is possible to build equivalent technologies, the years of refinement are not to be discounted, according to Joseph. In the course of conversation, he referenced a presentation at the International Supercomputing Conference from Inspur, the Chinese HPC hardware company that is behind the Tianhe-2 machine, in which a national HPC project for the country’s transportation and rail reservation system was able to handle peak demand for one of the world’s largest traveling populations within China, at 95% accuracy. That might sound great for something where there is a margin of error allowed. But consider an airline or travel booking site that would, 5% of the time, improperly handle the reservation. This would be considered unacceptable, of course, but it comes down to the reliability of the systems.
Still, Chinese companies like Sugon and Inspur are hard at work to deliver the next generation of products that will lock out the need for American tech, even at the risk of having slightly less performance and price attractiveness. However, on the flip side, there are certainly pressures beyond the more recent restrictions on Intel processors. Just as the United States has made it illegal for its own government branches to buy supercomputers from a non-U.S. company, so too does a similar approach stand in other countries, and where it is not formalized in legal language, there is a definite “favoritism” for in-country or in-region vendors. For instance, French supercomputer maker Bull does a large swath of its business in France and Europe (although it has secured systems in Japan as well), and Inspur’s business has not translated outside of China.
So what happens with a company like Lenovo then, which is now the representative force for IBM’s pre-existing X86 HPC customers? Lenovo has made it clear that its focus is right where it has the most chance of success: in China, followed by Europe, and then the U.S. market. Remember, this is for high performance computing systems versus general consumer devices, but the company will face an uphill battle for even non-government supercomputers in oil and gas, financial services, and other areas as IBM works to bolster its OpenPower story in the U.S. We have already watched this U.S. and China tension play out this year in the context of the $44 million NOAA supercomputer deal, which was originally awarded to IBM but, following the Lenovo acquisition, moved to home court advantage with Cray taking over the contract.
IDC’s Joseph says that while he has a great deal of belief in what the Chinese supercomputer market will produce in terms of its own technologies in the coming years, it’s an uphill battle, although having Lenovo as a supplier of X86 parts will lighten the burden. Still, he says, the supercomputing world outside of the U.S. is keenly scouting for alternatives to American-based technology giants and this has European and Asian governments in “all ears” mode when it comes to emerging and established options alike.