A Glimmer of Light Against Dark Silicon
February 24, 2017 Jeffrey Burt
Moore’s Law has been the driving force behind computer evolution for more than five decades, fueling the relentless innovation that led to more transistors being added to increasingly smaller processors that rapidly increased the performance of computing while at the same time driving down the cost.
Fifty-plus years later, as the die continues to shrink, there are signs that Moore’s Law is getting more difficult to keep up with. For example, Intel – the keeper of the Moore’s Law flame – has pushed back the transition from 14 nanometers to 10nm by more than a year as it worked through issues around the smaller manufacturing process, with the first of the 10nm Cannonlake chips due later this year. In addition, as we’ve talked about before, hyperscale organizations like Google continue to diversify their computing platforms, leveraging GPUs from the likes of Nvidia and AMD and, in Google’s case, its own Tensor Processing Unit (TPU) ASICs along with Xeon server chips from Intel.
With the continued shrinking of the die, other factors also are coming into play, including how much power can be delivered to the processors. Many-core processors are playing an increasingly important role in HPC and hyperscale environments as companies and institutions look for ways to more rapidly run their workloads and analyze the data. The continued scaling of CMOS technologies that came from adding more transistors to ASIC designs every two years should lead to the scaling of many-processor systems, according to Milos Krstic, a professor at the University of Potsdam and team leader of Innovations for High Performance (IHP) Microelectronics in Germany. However, as the die shrinks – 20nm to 14nm to 10nm to 8nm and smaller – the power that can be supplied to the silicon becomes a problem because it can’t continue to scale along with the transistor count. The result is a situation called “dark silicon.”
“As a consequence, even if the increased number of available transistors is used to implement additional processor cores, all available cores cannot be powered at the same time, in order not to overload the thermal budget of the chip,” Krstic wrote in a recent study. “Dark silicon is potentially very significant issue, practically disabling further simple scaling of homogenous processor architectures.”
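The arithmetic behind that statement can be sketched in a few lines of Python. This is only an illustration, not a model from Krstic's study: the function name `dark_fraction`, the assumption that every core draws the same fixed power, and the example numbers are all hypothetical.

```python
def dark_fraction(num_cores, watts_per_core, power_budget_watts):
    """Return the fraction of cores that must stay unpowered ("dark").

    Hypothetical simplification: all cores are identical and each one
    draws a fixed watts_per_core whenever it is switched on.
    """
    if num_cores <= 0 or watts_per_core <= 0:
        raise ValueError("num_cores and watts_per_core must be positive")
    # Number of cores the thermal/power budget can keep on at once.
    affordable = int(power_budget_watts // watts_per_core)
    powered = max(0, min(num_cores, affordable))
    return 1 - powered / num_cores


# Illustrative numbers only: 64 cores at 2 W each under an 80 W budget.
print(dark_fraction(64, 2.0, 80.0))
```

With these made-up figures, only 40 of the 64 cores fit within the budget, so more than a third of the chip stays dark. Adding transistors without raising the budget only increases that fraction.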
In his assessment of asynchronous design methods for dark silicon chips, Krstic argues that chip and system designers, faced with the growing challenge of dark silicon, will be forced to adopt new architectures and designs, in particular asynchronous circuits, that can help them address the growing power consumption issue. Rather than scaling through the use of many homogeneous programmable cores, future designs will need to use a lot of dedicated co-processors that run particular tasks in a power-optimized fashion, using energy only when they’re needed and sitting idle when they’re not. They’ll have to rely even more heavily on such power reduction technologies as power gating, dynamic voltage and frequency scaling, and adaptive voltage scaling, and they’ll have to ensure that performance is available when the application needs it but reduced when it’s not needed. The designs also will need to utilize intelligent power management to make sure the power control mechanisms are used in the most efficient ways.
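To see why voltage and frequency scaling matter so much, it helps to recall the standard first-order CMOS relationship in which dynamic power is proportional to switched capacitance times voltage squared times frequency. The sketch below applies that textbook relationship. It is not taken from Krstic's study, it ignores leakage power, and the function names and example values are assumptions made for illustration.

```python
def dynamic_power(capacitance_farads, voltage_volts, frequency_hz, activity=1.0):
    """First-order CMOS dynamic power: P = activity * C * V^2 * f.

    Leakage (static) power is deliberately ignored in this sketch.
    """
    return activity * capacitance_farads * voltage_volts ** 2 * frequency_hz


def scaled_power_ratio(voltage_scale, frequency_scale):
    """Dynamic power after scaling, relative to the unscaled design."""
    return voltage_scale ** 2 * frequency_scale


# Illustrative only: lowering both voltage and frequency to 80% of nominal.
print(scaled_power_ratio(0.8, 0.8))
```

Because voltage enters as a square, lowering voltage and frequency together by 20% cuts dynamic power roughly in half in this simplified model. Setting `activity` to zero for an idle block is a crude stand-in for gating it off.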
Asynchronous circuits aren’t new. They’ve been around for many years, with proponents pushing them as energy-efficient alternatives to traditional synchronous designs. While synchronous circuits use a single global clock signal, which is active even at times when no processing is needed in a particular pipeline, asynchronous circuits are only active when workloads are in local pipelines. However, adoption of asynchronous circuits was hindered by complex designs and other issues, and they were only embraced by academic researchers and a handful of industry players, Krstic wrote. More recently, as the issue of power consumption and efficiency has come to the fore, new design methods have emerged for asynchronous circuits. Among the new methods is the idea of desynchronization, in which a synchronous circuit can be converted into an asynchronous design.
“Using such approaches it has been shown that the significant advantages of the asynchronous circuits in ultra-low power domains can be obtained,” he wrote, adding that in near-threshold regimes, because of the lack of global timing and other factors, there can be a 40% improvement in power consumption. “Moreover, the novel aggressive fault tolerant voltage scaling approaches on the asynchronous side, such as recently proposed BLADE show important improvements in comparison with state of the art synchronous power saving architectures such as Bubble-RAZOR.”
Another method for reducing power consumption in asynchronous circuits – proposed by Prof. Alex Yakovlev at the University of Newcastle in England – is “energy modulated computing,” in which asynchronous logic uses the power available to it and adjusts the performance to meet that energy level. It’s a method that’s been used for ultra-low power cases, such as wireless sensor nodes.
However, while asynchronous circuit designs are being used effectively in environments where low power consumption is needed, such as the Internet of Things and mobile applications, there are other use cases. “High-performance computing is also affected and dark silicon issues and causes the need to the radical paradigm change,” Krstic wrote. “In this context there is a significant chance for the asynchronous logic design methods to address dark silicon problems.”
That includes being used with co-processors, which are more suitable for asynchronous designs than many-core processors. They can be optimized for power control and made event-driven, and they don’t need a clock source. In addition, such low-power techniques as adaptive voltage scaling and power gating can more easily be incorporated into co-processor designs. Krstic pointed to a study showing that an asynchronous co-processor for Elliptic Curve Cryptography (ECC) consumes a third less power than a similar synchronous chip.
Ultra-low power techniques like Near Threshold Voltage Computing also can be used in co-processor designs in dark silicon chips to increase their performance or reduce power consumption.
“Since synchronous design style still has significant merits when it comes to the high performance, due to the maturity of the design tools and simpler control protocol, it is plausible to propose also the use of mixed-mode synchronous-asynchronous logic,” he wrote. “In the context of dark silicon the specific co-processors can be designed in such way that their pipelines can be, with control signal, turned into asynchronous (bundled-data) pipelines. This asynchronous mode of operation could be used in near threshold voltage mode to further reduce the power consumption of the system. In high-performance mode, the regular synchronous pipeline could be activated. The overhead of this technique could be limited to the additional control gates which can be in scaled technologies, with billions of gates available, tolerated in many applications.”
In addition, systems laden with dark silicon will find their way into interconnect environments, where the idea of “networks of chips” has been the focus of academic research but could find its way into commercial environments. The power consumption of switches using asynchronous logic could be reduced by 45% (in active operation phases) to 91% (in idle phases).
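How much a switch saves overall depends on how its energy use splits between active and idle phases. The sketch below blends the two reported figures under a stated assumption: `active_energy_share` is a hypothetical input giving the share of the synchronous baseline's energy that is spent in active phases. Neither that parameter nor the example value comes from the study.

```python
ACTIVE_SAVING = 0.45  # reported reduction in active operation phases
IDLE_SAVING = 0.91    # reported reduction in idle phases


def blended_saving(active_energy_share):
    """Overall energy reduction for a given active/idle energy split.

    active_energy_share is the (hypothetical) fraction of the baseline's
    energy consumed during active phases; the remainder is idle energy.
    """
    if not 0.0 <= active_energy_share <= 1.0:
        raise ValueError("active_energy_share must be between 0 and 1")
    idle_energy_share = 1.0 - active_energy_share
    return (active_energy_share * ACTIVE_SAVING
            + idle_energy_share * IDLE_SAVING)


# Illustrative only: a baseline that spends a quarter of its energy active.
print(blended_saving(0.25))
```

The result always falls between the two reported figures, and it moves toward the idle-phase number the more of its energy a switch spends waiting for traffic.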