There are two trends converging in AI inference, and so far only a small number of companies sit at their intersection.
The first trend takes us back to the future with analog compute engines, which promise much lower power and potentially lower cost, but at the price of considerable design complexity. More on that shortly.
The second trend is for any company entering the inference market with an analog approach to hedge its bets with a dual focus on datacenter and edge inference. More specifically, the same architecture has to play just as well in edge devices as it does crammed onto a PCIe card as a datacenter accelerator.
And while we are on the subject of trends in AI inference, let’s remember that in the datacenter, the CPU is still king. Offload acceleration models have not caught on for this particular part of the workflow, but as trained model complexity (and ROI) continues to grow, it will make sense, enough so for the big companies to go and build their own. That is, unless someone makes it economically and technically infeasible to do so.
One of those few inference startups that falls into the convergence category just secured more cash to finance their foray into the datacenter and tackle the above challenges. Mythic, which we profiled last year, announced a $30 million Series B-1 round, which brings total funding to $86 million.
If you must skip our architecture profile, the short story is this: beginning in 2013, the company dusted off decades of existing work on analog devices for a wide range of workloads and claims to have perfected some of the ultra-tricky analog-to-digital (and back again) circuitry. To that it added optimizations for complex neural network inference operations (CNNs, RNNs, with an eye on transformer networks and other novel approaches to AI coming out of Google and other hyperscale research labs).
One might say Mythic’s datacenter ambitions are tempered by just enough emphasis on devices at the edge, which is where the real mainstream opportunity is. This dual market focus should calm investors who know finding footing in the datacenter acceleration market is no small feat, especially since that market is composed mostly of hyperscalers who will either test a new device just long enough to decide to build their own, or put a startup in the precarious position of years of qualification and development before its chips ever see the inside of such datacenters.
While margins can be maintained in that emerging edge mainstream, the mindshare potential in the datacenter is the holy grail. This is something several AI chip startups aimed at training and inference alike have sought, but it’s hard to say if any of those hopes are well-placed considering the limited number of potential big customers, their inclination to build versus buy at scale, their extended hardware qualification times, the need for dramatic levels of software integration, their requirements for advanced roadmaps…shall the list go on? The blunt way to say this is that the datacenter might be a far-flung hope, even for some of the best tech and development.
But, as Mike Henry, CEO of Mythic, tells us, there are ways around many of those barriers to the big customers. And he thinks they might be able to make an inference offer the largest companies cannot refuse. Well, to be more exact, they won’t be willing to refuse because it wouldn’t make sense from a development and cost perspective.
“The key is having something that is truly differentiated to the point that those hyperscalers would not build it themselves, and low cost enough that it would not drive them to do so,” Henry says, and he does have a point. “Those companies have large systems and hardware teams, but they are good at things like building large-scale digital integration and systems. I have not seen any complex analog chip come out of these companies, aside from maybe some network communication structures.” This is not to say teams are not hard at work on this, but as Henry explains, it took several years to get all of the analog/digital conversions right, not to mention the many other optimizations required for their devices to work at the edge or in the datacenter.
“If you look at a relatively simple analog device, like an automotive sensor from Texas Instruments, there might be eight analog-to-digital converters on the chip. They’re fast, with high sample rates and 8-16 bits of precision,” he says. “Our problem was putting around 22,000 of these on a chip while keeping roughly the same kind of power budget. The scale of these converters on a chip is far greater than anything that’s ever been made before. And we had to figure out how to make them small and thin enough to line up against the flash memory without blowing our power budget.”
Teaching an analog device to speak digitally trained neural networks is no small task. Henry says the complex data flow of most deep learning jobs is not at all suited to analog. “We had to build a lot of digital surround to have a programmable and flexible architecture for these networks. And as we thought about what would be required in the next several years, we realized it was about a lot of raw, brute-force matrix math capability in terms of compute, no specific network accelerators of any kind, but rather a built-in digital capability to shift data in and out to run CNNs, RNNs, attention and transformer networks, and those new things. It took us over five years, but as the topology changes we can keep pace. But the key is that analog-to-digital conversion (and vice versa).”
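A rough way to picture the analog/digital handoff Henry describes: weights sit as conductances in flash cells, a DAC turns digital activations into voltages, the analog array does the multiply-accumulate via Ohm’s and Kirchhoff’s laws, and an ADC digitizes the column currents. The toy simulation below sketches that pipeline, with uniform quantizers standing in for the converters; all bit widths and dimensions here are illustrative assumptions, not Mythic’s actual specs.

```python
import numpy as np

def quantize(x, bits, x_max):
    """Uniform quantizer standing in for a DAC or ADC stage."""
    levels = 2 ** bits
    step = 2 * x_max / (levels - 1)
    return np.clip(np.round(x / step), -(levels // 2), levels // 2 - 1) * step

def analog_mvm(weights, x, dac_bits=8, adc_bits=8):
    # DAC: digital activations become analog drive levels (quantized)
    v = quantize(x, dac_bits, x_max=1.0)
    # Weights stored as cell conductances; currents sum on each column,
    # so the physics does the multiply-accumulate "for free"
    i = weights @ v
    # ADC: column currents come back to digital at limited precision
    return quantize(i, adc_bits, x_max=np.abs(i).max() + 1e-12)

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 64)) / 8.0   # stand-in for one weight tile
x = rng.uniform(-1, 1, 64)                # one activation vector

exact = W @ x
approx = analog_mvm(W, x)
print("max abs error vs. digital result:", np.abs(exact - approx).max())
```

The point of the exercise is that the matrix math itself is trivial; the accuracy and power story lives entirely in the converter stages, which is where Henry says the five years went.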
“AI is a completely new workload from what the semiconductor industry has known. It focuses on low precision, it is memory intensive, and the computations are simple from a control flow perspective. Analog compute with existing memory can address the bottlenecks that Moore’s Law scaling will never be able to solve.”
The point here is hard to argue with. Google, Facebook, Amazon, and others could certainly go find and leverage the expertise to build analog devices, but why would they bother when the complexity and ramp-up would be long, with much reinvention of wheels? And since the devices themselves take advantage of existing and relatively cheap memory technology, the cost/benefit starts to look somewhat lopsided. The only advantage would be controlling the architecture completely to enable speed-of-development software progress, but even that is a stretch.
And the flipside is that there is a lot of analog expertise on the planet, a lot of easily accessible memory, and making inference chips can become anyone’s game at the edge, where things are easier—and those startups can take the same dual-market approach Mythic has. But that’s just the way business and competition work, and as we heard during a recent VC panel, inference is still anyone’s game.
“Sure, in grad school, someone can put together some circuit that shows analog compute, but that’s a far cry from mass production; that’s not shipping it into a bunch of devices or loading trained networks in TensorFlow. Loading that with a negligible loss of accuracy and ensuring consistency across millions of chips? That’s the hard part,” Henry adds.
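Henry’s point about consistency across millions of chips can be illustrated with a crude model: if every chip programs the same trained weights into its analog cells with a small random deviation, outputs drift from chip to chip. The sketch below assumes a 1% per-cell programming error (our illustrative figure, not Mythic’s) and measures the worst-case output drift across a fleet of simulated chips.

```python
import numpy as np

# Illustrative sketch (not Mythic's method): chip-to-chip consistency
# when weights are stored as analog cell conductances. Each simulated
# "chip" programs the same trained weights with a small random deviation.

rng = np.random.default_rng(1)
W = rng.standard_normal((10, 32)) / 6.0   # stand-in for trained weights
x = rng.uniform(-1, 1, 32)                # one input activation vector
reference = W @ x                         # ideal digital result

program_noise = 0.01  # assumed 1% relative programming error per cell
outputs = []
for chip in range(1000):
    W_chip = W * (1 + program_noise * rng.standard_normal(W.shape))
    outputs.append(W_chip @ x)
outputs = np.array(outputs)

worst_drift = np.abs(outputs - reference).max()
print(f"worst-case output drift across 1000 chips: {worst_drift:.4f}")
```

A grad-school demo only has to get one chip right once; shipping means bounding that drift for every cell on every die, which is the mass-production problem Henry is pointing at.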
The research and development risks Mythic has shouldered over the last several years can be mitigated by an emphasis on the edge, where the markets are still largely undefined and possibly explosive for certain niches. And this makes a fresh round of investment in a dual-market strategy sound less scary than the money that has been pumped into AI accelerators that finally hit the market long after the frameworks and de facto devices changed (as we saw with training accelerators, for instance).
We fully expect to be hearing from an entirely new crop of chip startups trying to carve a slice off the edge. And we also expect to hear from established companies that have made a mint in analog devices for decades and finally see a chance to enliven their business with the year’s hottest workload. And like so many other things in this space, there will be far more “solutions” on the market than people with problems complex enough to warrant anything too far outside the box. But if problems are persistent enough, and ones rooted in AI will be over time, the real battle for hardware begins.
On that note, the biggest analog AI inference chip story comes from IBM, which is also pursuing 8-bit phase change memory-based devices. More on that coming this week.
What Mythic’s story also shows us is that this is a good time to pay attention to what’s going on with the memory makers of the world. There aren’t many, but one of the biggest, Micron, is involved in Mythic’s news, with Micron Ventures adding to the funding round.
The B-1 funding was led by Valor Equity Partners (SpaceX, Tesla) with new investors Future Ventures (relevant to us because of its investments in quantum computer maker D-Wave; in Nervana Systems, which was acquired by Intel; and in silicon photonics and spin memory), Atreides, Lam Research, and, as mentioned, Micron Ventures. Mythic secured its initial rounds from SoftBank, Threshold Ventures, Lux Capital, Data Collective, AME Cloud Ventures, and Lockheed Martin Ventures.
Stay tuned here this week for a much deeper piece that looks at the architectures, challenges, and opportunities for analog inference (with emphasis on datacenter applications). There’s a lot of momentum here, and unlike the ASIC-driven startups that set about to capture AI share early on with training/inference chips, the manufacturability is far easier, the roadmaps are clearer, and the device diversity is, well, less diverse. Software will matter more than ever before, so the winner will be the company that finds the most efficient, usable way to get TensorFlow (and a host of even more complex future frameworks for transformer NNs and beyond) to speak analog, and that’s no simple matter.