VAST Data: What Controls The Data Is More Important Than What Stores It

At times, VAST Data has been pigeonholed as a data storage company that competes with the likes of Pure Storage, IBM, Dell, Hewlett Packard Enterprise, and others. To be fair, there is a strong storage component to its products. But as VAST Data executives, including co-founder Jeff Denworth, have argued throughout most of the company’s ten years, VAST Data is not a storage company – it’s a data company. And one with a vision – its “thinking machine” strategy – that dovetails nicely with what is happening with AI and agents today.

Given that, Denworth, co-founder and chief executive officer Renen Hallak, and others have continued to evolve VAST Data along those lines, positioning it as an AI infrastructure platform company with data at its core and touting the industry’s first AI operating system, one that pulls together data, storage, and compute. That is on display this week at the vendor’s first major user conference – VAST Forward, in Salt Lake City, Utah – where VAST executives unveiled a number of new platform features aimed at building trust in the data that AI systems use, accelerating what the platform can do, and expanding partnerships with hyperscalers, server companies, and other tech vendors.

“VAST is becoming this ubiquitous brand where data meets AI,” Denworth told journalists during a briefing before the conference opened.

That’s being proven in the numbers, he said. The company, since rolling out its first product seven years ago, has eclipsed $4 billion in software bookings – finishing its latest fiscal year with three times the bookings growth of the previous twelve months – has $1 billion in the bank, and has annual recurring revenue of more than $500 million.

At VAST Forward, executives said they want to address what they view as the key hurdle to broader AI adoption: trust.

“Trust in what the models have been trained upon, trust in what the models are allowed to do with different data or different tools, and then ultimately, trust in agents that use these models to talk to each other and talk down to tools,” Denworth said. “So we started to work through all of the things that are required from a security perspective to build trust – or zero trust – into a platform. What we realized is that these things actually weren’t divergent from our objectives of building a thinking machine, in that if you’re going to build a system that can recursively compute and improve upon itself, one of the things it uses to improve upon itself is data, and that data also needs to be trusted.”

The company isn’t only talking about data found in databases or file systems, but also data in communications and any activity that can be logged and audited, all of which is important because “data becomes the thing that ultimately dominates computing in today’s world,” he said.

To build the necessary trust, VAST Data unveiled two additions to its AI OS platform: PolicyEngine and TuningEngine. Combined, the two services govern the data – not just access to it, but also how agents communicate with each other and with other tools, and the memories they retain – based on policies set by the organizations using them. They also ensure that what gets fed into the models is approved and allowed according to those policies.

Specifically, PolicyEngine is involved with every event happening within a system, acting as both a framework for decision-making and a tool for determining the types of data and the ways the data is presented to agents. Interpretations of the data will be made and, at times, data will be transformed to ensure it’s safe for an endpoint to see and use.

“The objective here is to basically put something in between agents and other agents, agents and MCP [Model Context Protocol] tools, agents and the memory and the RAG [Retrieval-Augmented Generation] reserves that the agents draw upon in order to complete their activities and essentially ensure that every single operation that happens within a VAST system, within the AI Operating System, is trusted,” Denworth said, adding that every action is tracked and put into audit logs.

PolicyEngine works with VAST’s Policy Enforcement Point (PEP), which mediates everything that happens in the system according to policies. Organizations will set policies for blocking or allowing certain events, and PEP may censor some activities according to the policies.
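The mediation pattern described above can be sketched in a few lines. This is a hypothetical illustration, not VAST’s actual API: every event between agents, tools, and memory is checked against organization-defined rules, then allowed, blocked, or redacted, and always written to an audit log. All class, field, and rule names here are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Event:
    actor: str        # e.g. "agent-42"
    target: str       # e.g. "mcp:payments-tool"
    action: str       # e.g. "read", "invoke"
    payload: str

@dataclass
class PolicyEnforcementPoint:
    # Each rule inspects an event and returns "allow", "block",
    # "redact", or None (no opinion). Illustrative only.
    rules: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def enforce(self, event: Event) -> Optional[Event]:
        decision = "allow"                      # default-allow in this sketch
        for rule in self.rules:
            verdict = rule(event)
            if verdict is not None:
                decision = verdict
                break
        self.audit_log.append((decision, event))  # every action is tracked
        if decision == "block":
            return None                         # event censored entirely
        if decision == "redact":
            # Transform the data so it is safe for the endpoint to see
            return Event(event.actor, event.target, event.action, "[REDACTED]")
        return event

# Example policy: agents may not pass raw secrets to MCP tools
def no_secrets_to_tools(e: Event):
    if e.target.startswith("mcp:") and "secret" in e.payload:
        return "redact"
    return None

pep = PolicyEnforcementPoint(rules=[no_secrets_to_tools])
out = pep.enforce(Event("agent-42", "mcp:payments-tool", "invoke", "secret token"))
# The tool receives a transformed event; the original is in the audit log.
```

The key design point the article describes is that enforcement sits in the data path between every pair of endpoints, so the audit trail is complete by construction rather than bolted on per agent.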

“The second thing we realized is that if you’re going to have trust in a model, then you’re going to have to trust what that model was also trained on and fine-tuned with,” Denworth said.

This is where TuningEngine comes in. A year ago, VAST unveiled AgentEngine, an agent deployment and orchestration system – what the company calls the “application management layer” of its AI OS platform – that provides the tools, runtime environment, and observability for deploying and managing AI agents at scale. TuningEngine is a new step for those agents.

TuningEngine collects the data that comes off of these agents and runs it through an ETL process that creates artifact tables. Those artifact tables are then fed into a set of tuners – Low-Rank Adaptation (LoRA) tuners, supervised fine-tuning tuners, and reinforcement learning tuners, to name a few. Out of that come fine-tuned models that can be deployed again via AgentEngine as a new version of the model.
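The flow just described – agent data through ETL into artifact tables, artifact tables into tuners, tuners out to a redeployable model version – can be sketched as a simple pipeline. Every function and field name below is hypothetical; the stand-in tuner just tags a new model version rather than running an actual LoRA job.

```python
# Illustrative sketch of the described TuningEngine flow; names are invented.

def etl(raw_agent_logs: list) -> list:
    # Extract/transform: keep only completed interactions, shaped into
    # rows suitable for tuning (the "artifact tables")
    return [
        {"prompt": r["input"], "response": r["output"], "reward": r.get("score", 0.0)}
        for r in raw_agent_logs
        if r.get("status") == "complete"
    ]

def lora_tuner(artifacts: list, base_model: str) -> str:
    # Stand-in for a LoRA fine-tuning job; a real tuner would train
    # low-rank adapter weights here. Returns a new model version tag.
    return f"{base_model}+lora-v{len(artifacts)}"

def tuning_pipeline(raw_agent_logs, base_model="agent-model:1"):
    artifacts = etl(raw_agent_logs)            # artifact tables
    tuned = lora_tuner(artifacts, base_model)  # one of several tuner types
    return tuned                               # redeployed via AgentEngine

logs = [
    {"input": "q1", "output": "a1", "status": "complete", "score": 1.0},
    {"input": "q2", "output": "a2", "status": "failed"},
]
new_model = tuning_pipeline(logs)   # -> "agent-model:1+lora-v1"
```

The point of the closed loop is that each deployed model version generates the interaction data that trains its successor – the “recursively compute and improve upon itself” idea Denworth describes.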

Both PolicyEngine and TuningEngine will roll out this year.

VAST also is expanding its partnership with Nvidia to more fully accelerate its data stack. It was an active collaboration even before the conference, with the two companies working on more than two dozen joint engineering projects.

“One of the things that is very clear is that accelerated computing hasn’t made its way all the way to the data level,” Denworth said. “That is something that Nvidia has been working on and it hasn’t necessarily played out in the market yet, and we’re taking an aggressive approach to basically building accelerators into the VAST platform.”

The extended work with Nvidia will allow VAST to bring its AI Operating System’s data services together with the compute layer, reducing operational complexity and streamlining AI operations. It also will allow VAST’s platform to run on Nvidia-based servers from the likes of Supermicro and Cisco.

VAST also introduced the CNode-X, a new Nvidia-certified system that will run VAST’s high-performance storage services for clusters powered by Nvidia GPUs.

“What this allows us to do is to take our software and then power it with a bunch of very specific Nvidia libraries that are being built into the system,” Denworth said.

Those libraries include cuVS for accelerated vector search and retrieval, NIM microservices to run RAG pipelines natively in the cluster, and Sirius for accelerated SQL queries.

VAST also introduced its redesigned Polaris control plane for provisioning, operating, and orchestrating VAST clusters on major public cloud platforms, one that will be “extended to others with ‘fleet-level scale’ operational challenges,” Denworth said.

Jonsi Stefansson, VAST’s general manager of cloud, said the offering started off as a simple cloud control plane to manage VAST in the cloud via the hyperscaler market, offering lifecycle management and day-two operations like expansion, replacement, and non-disruptive upgrades.

It’s now “that unified control plane that delivers centralized intelligence with distributed execution across all VAST environments, from on-premises clusters to public clouds to sovereign neoclouds, and not having to deploy the full stack everywhere, but by deploying a lightweight Kubernetes operator and the Polaris agent locally on each site,” Stefansson said. “The architecture enables a single global view for policy management, governance, upgrades, and orchestration, without centralizing the data path.”