With the AI chip startup hype cycle spinning down from its feverish pace in 2018 and giving way to 2021 expectations for real-world deployments, it is still difficult to see which company will steal what little share is left in the Nvidia/Intel/AMD-dominated datacenter.
The list of companies that could grab some AI training/inference market share for datacenter use cases is not long, but the player most often left out of the conversation in the last year has been Groq.
While SambaNova, Cerebras, and Graphcore have all made headway finding public use cases at well-known research centers and organizations, Groq has been less visible. In fact, it wasn’t until recently that they surfaced in industry chatter, and not because of a new installation but because they’ve taken on more funding ($300 million in Series C). The investment round was not led by the superstar VCs other chip startups have attracted, but it is happening because investors can see real customer momentum, even if it is not public, Groq co-founder and CEO Jonathan Ross tells us.
“We have three of the labs and we’ve shipped to two lighthouse customers,” he explains. The details are under NDA, other than that one of the big customers is in fintech and another is on the autonomous vehicles (AV) side. What is interesting about the two largest is that they chose Groq for completely different reasons. “In the autonomous case they were able to replace four GPUs with a single one of our chips, the energy savings was crucial. In the datacenter case [fintech] they needed to scale up and while they could get performance on a GPU and scaled performance on CPU they couldn’t get both. We could give them better than GPU performance on a per-socket basis and can scale to hundreds of chips,” Ross adds.
When asked why we aren’t hearing much about Groq in 2021, Ross says they’ve been focused on real-world problems instead of making noise. “That’s no way to run a startup,” he says. For those who watch this segment carefully, though, more noise (or at least more public use cases) tends to mean more traction in the overhyped world of AI chip upstarts.
The real question for Groq is what that list of secret customers is buying and why, especially when there are so many options for custom AI acceleration, along with general purpose devices that do the job well and efficiently enough.
In terms of market segments, Ross tells us the balance between autonomous and datacenter is about even in the existing customer base. He says it is difficult to determine the exact split, however, because so much of what is happening in AV is simulation-driven and thus has a datacenter angle, since customers don’t want dueling architectures.
The unit of deployment Groq counts by is per rack or per four racks, with the ability to connect more to continue scaling. Ross says to date they’ve deployed “hundreds of racks” with between one and four per site/user. “When we give them linearly scaled compute they get excited about using it in different ways. Instead of a massive CPU deployment they can do something more contained in a Groq environment and actually get more done using these ‘pods’ of compute. There’s real work to be done and often it was overnight. We can allow them to complete a run in near real-time, which is a game-changer for how they operate and make decisions going forward,” Ross argues.
“The two areas we shine are in low latency, which lines up well with real-time applications,” Ross says. “We have about 10x lower latency than other architectures because our chip is single core, all transistors are dedicated to a single problem at a time. The other is in scaling up. Our chip is deterministic, we can solve larger scale problems and address tail latency.” Ross adds that they are finding success with the obvious embarrassingly parallel problems mainstream hardware can tackle, but their approach is closer to sequential processing, something he says works especially well with training.
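To illustrate why determinism and tail latency become more important as chip counts grow, here is a minimal sketch. It assumes a synchronous job in which every step waits for the slowest device, and it assumes each device is independently "slow" on a given step with some small probability. The function name, the 1 percent figure, and the device counts are hypothetical and chosen only for illustration; none of them are Groq measurements.

```python
def straggler_probability(devices: int, p_slow: float) -> float:
    """Chance that at least one of `devices` independent devices is slow on a step.

    Assumption (not from the article): slow events are independent across
    devices, so the chance that none is slow is (1 - p_slow) ** devices.
    """
    if devices < 1:
        raise ValueError("devices must be at least 1")
    if not 0.0 <= p_slow <= 1.0:
        raise ValueError("p_slow must be a probability")
    return 1.0 - (1.0 - p_slow) ** devices


if __name__ == "__main__":
    # Hypothetical 1% chance per device per step of landing in the latency tail.
    for devices in (1, 8, 100, 400):
        print(devices, round(straggler_probability(devices, 0.01), 3))
```

Under these assumptions, a rare per-device hiccup turns into a likely per-step stall once hundreds of devices are involved, which is one way to read the claim that a deterministic chip helps with scaling.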
Instead of going it alone on the CAPEX-intensive appliance route, Groq is working any angle possible to secure a spot in the datacenter. While OEM partners have not been publicly announced, when we asked specifically about HPE, Dell, and Supermicro, Ross told us two of the three are shipping Groq systems now. These systems usually have eight of Groq’s Tensor Streaming Processor (TSP) chips alongside two AMD Epyc CPUs, with hefty interconnect between the TSPs and sixteen 120Gb/s links per chip, along with whatever custom tooling OEM partners add.
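As a back-of-envelope check on the figures quoted above, the following sketch multiplies out the link counts. It uses only the numbers in the paragraph (eight TSPs per system, sixteen links per chip, 120Gb/s per link). The totals are nominal: the sketch assumes the quoted rate applies to every link, ignores protocol overhead, and does not say whether the rate is per direction. The node-level figure simply sums every chip's links, so it is not a count of unique chip-to-chip connections.

```python
# Figures quoted in the article.
TSPS_PER_SYSTEM = 8     # "eight of the TSP chips"
LINKS_PER_TSP = 16      # sixteen links per chip
GBITS_PER_LINK = 120    # 120Gb/s per link


def per_chip_gbits(links: int = LINKS_PER_TSP, rate: int = GBITS_PER_LINK) -> int:
    """Nominal aggregate link bandwidth for one chip, in Gb/s."""
    return links * rate


def per_system_gbits(chips: int = TSPS_PER_SYSTEM) -> int:
    """Sum of per-chip link bandwidth across one system, in Gb/s."""
    return chips * per_chip_gbits()


if __name__ == "__main__":
    chip = per_chip_gbits()
    print(f"per chip:   {chip} Gb/s ({chip / 8:.0f} GB/s)")
    print(f"per system: {per_system_gbits()} Gb/s (sum over all chips)")
```

That works out to a nominal 1,920Gb/s, or 240GB/s, of link bandwidth per chip.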
“Our flexibility tactically lets us meet customers where they want to be. Having said that, and also, channels being what they are, it means we can deploy at a colo as part of a cloud provider’s infrastructure or customers can go hybrid with that. We don’t see the need to commit to one strategy,” Ross explains.
Ross says that they’ll be pushing to be more public with customer wins this coming year and insists that the customers they have are able to deploy models and projects they couldn’t afford to run on traditional compute before. “We’ve also doubled the size of the company since July and will probably do that again this year. We have tremendous customer traction evidenced by this latest investor round. We’ve been focused on product and less on marketing, we’re busy talking with customers and building the business instead of making noise.”
Back to the original question: can the company come back into the public eye with the same excitement the market felt about their ex-Google TPU engineering team led by Ross? In our view at TNP, it will take some big public wins, rivaling those that competing chip startups have secured, to prove the value of their minimalist scalable architecture which, from all we’ve seen, is solid, both in terms of hardware performance and the software underbelly.
With multiple form factors and ways of delivering Groq hardware into customer hands, it is still hard to say why they haven’t worked to gather public steam in the last two years. Their entrance into the market made big waves, especially because of the pedigree of the founding team and initial entry defined by a simple architecture and compiler/software-first approach. Although scoring public deals with national labs and Fortune 500s might seem like a waste of time for a company that wants to just focus on engineering, noise matters. And if they don’t start making some soon via public deals, they’ll continue slipping off the list of viable (market reach-wise) AI hardware makers.