The best technology companies have always taken something that was complex and done a whole lot of engineering or in many cases re-engineering of it to make it usable and consumable – with the right pricing – so it can go mainstream.
Pure Storage was not the pioneer in enterprise flash storage, but it is the unicorn that has grown its product line, customer base, and revenues steadily since its founding in 2009 in the belly of the Great Recession. And like other innovative storage upstarts – Nutanix comes to mind – Pure Storage has had issues in making money as it grows its business, but Wall Street seems to have patience that in the long run the flash storage appliance maker will be able to grow sales faster than costs and flip to profitability and stay there.
It was not at all obvious at the time when Pure Storage was founded that it would emerge as one of the leaders in all-flash arrays. Four years earlier, Fusion-io created flash cards that took a long time to define the flash storage market, and eventually the cost of flash came down enough that both Apple and Facebook used Fusion-io devices to underpin database acceleration from 2011 through 2013, to the tune of $150 million to $200 million a year; these two accounts comprised more than half of the company’s revenues. Violin Memory, also founded in 2005, created storage appliances that, like those from Pure Storage many years later, were based on homegrown flash modules, not stock SATA or SAS solid state drives, and an operating system that masked many of the reliability and durability issues of flash from the systems that accessed them. A slew of all-flash array makers have launched since then, and some of them are still here, chasing the incumbent storage giants that serve the enterprise datacenter, many of which have long histories as disk array makers and have added flash to make hybrid devices to try to defend against the all-flash onslaught.
We have said it before, and we will say it again: In the fullness of time, except for the largest hyperscalers, cloud builders, and HPC centers with absolutely monstrous, exascale-class storage problems, most enterprises will at some point be able to move to all-flash storage for their applications and they will abandon disk-based arrays entirely. The mantra for a long time was that disk is the new tape and flash is the new disk, but we think if you look carefully – and let’s take the instances and storage services at Amazon Web Services as an example – tape is the new tape, since the AWS Glacier service is based on tape libraries with very good caching on the front end, and flash is the new disk and, unless you need to store trillions of cat videos or immense simulation and modeling inputs and outputs, you probably don’t need disk arrays.
Good data that allows comparisons is hard to come by, but it is not a coincidence that according to data from IDC for the first quarter of 2019, the all-flash array market was about the same size as the hybrid flash array market – $2.47 billion versus $2.81 billion – but there was still another $8.09 billion in disk-based storage. And, we strongly suspect, a lot more capacity was sold on disk than on hybrid and certainly than on all flash. Some of that is short-stroking disk arrays for performance (putting data only on the outside third of tracks, where the linear velocity is highest on the platter and therefore the average access time on files is lower).
At the current rate of revenue decline for disk storage, it will take until 2027 for disk array sales to fall to the level of all-flash arrays and hybrids. This decline could happen more abruptly – or not at all. Extrapolation works like that. Still, it is funny to think that, in the longest of runs, when the hyperscalers and the cloud builders are the last customers to be buying disk drives for massive exascale storage farms, enterprises will have outsourced disk arrays to them, and consumers, too. And the real comparison over time is between on-premises flash and cloud storage.
This is one of the big bets that Pure Storage made more than a decade ago, and inasmuch as it has built a business that generated $1.64 billion in sales in its fiscal 2020, which ended in January, with around 7,500 customers, and is on track to kiss $2 billion in sales in fiscal 2021 (which is for the most part resident in calendar 2020), Pure Storage has been successful precisely within its wildest dreams.
Pure Storage is right in there, competing with the other storage array makers once you take the internal disk arrays that server makers bundle inside their server skins as well as the ODM storage servers that the hyperscalers and cloud builders buy out of the mix – call it about 5 percent share. If it were possible to focus just on large enterprise accounts, we think that Pure Storage’s share could be somewhere in the range of 2X to 3X this level, which is remarkable for any storage startup and is akin to the rise of EMC back in the 1990s, when it cut its teeth making RAID clusters of cheap disks with smart controllers with giant cache memories that emulated IBM 3880 and 3990 mainframe disk arrays and then rode the Unix server wave up with a POSIX file system on its Symmetrix arrays, completely changing enterprise storage.
However, EMC was smaller when it went public in 1986, and it was profitable, so it could be smaller and still raise a substantial amount of money relative to its size. That year, Dick Egan and Roger Marino cashed in some of their shares on Wall Street and raised $30 million for their company, which posted $66.6 million in sales that year (twice the level of the year before) and had $18.6 million in net income. There was no venture capital beyond the bootstraps of Egan and Marino. When EMC hired Moshe Yanai, who has designed the Symmetrix, XIV, and Infinidat storage systems, the company was able to ride up the rocket as IT underwent the Unix revolution and then the dot-com boom. In 1990, when the Symmetrix array launched after three years of development, EMC had 0.2 percent share of the mainframe disk market. Five years later, EMC had 41 percent share and IBM had 35 percent share, and IBM had to buy the Storagetek disk array business to cover the embarrassment. In 1994, when EMC was roughly the size of Pure Storage, it generated $1.37 billion in sales but $251 million in net income – and it was growing five times faster.
That was a different time, and in the 21st century, companies grow faster than that long ramp between 1979 when EMC was founded as a company to sell furniture and 1990, when the company created the product that would define it. And storage upstarts struggle to make profits as they drive the revenue growth that their venture capital investors demand and, if they are lucky, that their Wall Street investors expect when and if they get to go public before being acquired by a storage incumbent who recognizes the threat they pose and has the cash to acquire them before they walk down to lower Manhattan with some empty wheelbarrows.
In the six fiscal years that we have been tracking Pure Storage, the company has generated an aggregate of $5.37 billion in sales and net losses of $1.19 billion, and in general, as you can see from the chart and table above, it is closing the gap between what it costs to support its revenues and what it needs to generate to cover those costs.
Some of those losses in the early years were covered in part by the $530.9 million in venture funding the company raised in eight rounds between 2009 and 2014, and the $425 million the company raised in its initial public offering in October 2015. Pure Storage had a market value of $3.1 billion when it went public, and it has a value of $3.9 billion today after a pretty bad week for all stocks thanks to the coronavirus; the stock had been trading higher in the summer of 2018, when the company had a market capitalization of $7.2 billion, and presumably those VCs and other post-IPO Wall Street investors cashed out and made some tidy profits from their investments. As the fiscal 2020 year came to a close in January, Pure Storage had $697 million in deferred revenue in the bank and $1.3 billion in cash and investments, so it is in pretty good shape for investing in the future and chasing more customers and more deals – even if it incurs losses as it has in the past. But somewhere around $500 million per quarter, it is at breakeven, and even though its revenue guidance for fiscal 2021 is only for 16 percent growth year on year, that would put the company at $1.93 billion and closer to breakeven. If the company keeps costs level, that is. Pure Storage may decide to keep raising its spending to fuel growth; it may have no choice, in fact, particularly if the economy starts to stumble.
The key to the success of Pure Storage to date is that it innovated heavily atop flash storage and created an appliance experience, akin to what we get from our iPhone and its services for those of us who use Apple products, to make complex storage easier to deploy without sacrificing underlying sophistication. The company’s innovative Evergreen upgrade program builds in controller upgrades every three years so companies know they can boost the performance of their flash arrays without having to dump their flash storage modules. The company has also expanded its product lines over the years to address new markets, and continued to do so this week with the launch of the FlashArray//X R3 block storage product.
Pure Storage started out back in 2011 with the FlashArray//M block storage, and the FlashBlade follow-on announced in 2016 was about creating less expensive flash and more scalable arrays that could tackle the big object storage jobs enterprises were beginning to wrestle with thanks to the mountains of unstructured data that they started piling up in the hopes of converting it into money. The FlashBlade initially could scale to two controllers and 30 storage blades, for a maximum of 3.2 PB of usable capacity employing its fattest storage blades, which weigh in at 52 TB. Today, Pure Storage can scale a single object store across ten enclosures for a total of 150 blades, or five times that amount. Just as important, the flash object storage has 150 GB/sec of bandwidth across those ten chassis and can deliver an aggregate of 24 million IOPS running the NFS file system.
The beefier FlashArray//X series block storage arrays debuted in 2017, and they included new DirectFlash Modules that employed NVM-Express internally and that dropped the access time of data on benchmark tests run by Pure Storage from 1 millisecond for a FlashArray//M to around 500 microseconds with the FlashArray//X. (The average enterprise-class disk array using SAS spinning rust drives was around 30 milliseconds, by comparison.) With the new and improved FlashArray//X systems announced a year ago, which included a revamped NVM-Express over Fabrics (NVMe-oF) interconnect for the flash called DirectFlash Fabric, that access time on the benchmarks used by Pure Storage was cut in half again to 250 microseconds. The net effect of the last nine years of innovation is that a database query running on block disk storage from 2011 might take 5 minutes to complete, but on the FlashArray//X R2 systems announced last year, that would drop down to 2.5 seconds – an improvement that comes just because of the shift from disk to flash block storage. The goal, as the company set it out, was to get that average file access down to 100 microseconds and to get that database query down to 1 second. Here is the evolution:
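As a back-of-the-envelope check, the query times quoted above line up with the access times if you assume that query time scales linearly with average access time. That proportionality is our assumption for illustration, not something Pure Storage has stated, and the function and variable names in this Python sketch are hypothetical:

```python
# Access times quoted in the text, in microseconds.
DISK_ACCESS_US = 30_000          # SAS disk array, roughly 30 milliseconds
BASELINE_QUERY_SECONDS = 5 * 60  # the 5-minute query on 2011-era disk

ACCESS_TIMES_US = {
    "SAS disk array": 30_000,
    "FlashArray//M": 1_000,
    "FlashArray//X": 500,
    "FlashArray//X R2": 250,
    "Stated goal": 100,
}


def scaled_query_seconds(access_us: float) -> float:
    """Estimate query time, ASSUMING it scales linearly with access time."""
    return BASELINE_QUERY_SECONDS * access_us / DISK_ACCESS_US


def speedup_over_disk(access_us: float) -> float:
    """How many times faster than the disk baseline a given access time is."""
    return DISK_ACCESS_US / access_us


if __name__ == "__main__":
    # Print access time, speedup over disk, and estimated query time per tier.
    for name, access_us in ACCESS_TIMES_US.items():
        print(f"{name:18s} {access_us:>7,d} us  "
              f"{speedup_over_disk(access_us):6.0f}x  "
              f"{scaled_query_seconds(access_us):7.1f} s")
```

Under that assumption, 250 microseconds is a 120X improvement over disk and maps to the 2.5 second query, while the 100 microsecond goal maps to the 1 second query. The figures the sketch produces for the FlashArray//M and the first FlashArray//X are our extrapolations, not numbers from Pure Storage.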
In the middle of all of that, at the end of 2018, Pure Storage took its Purity storage software stack and carved it up to run on flash-based instances on the Amazon Web Services public cloud to create a virtual FlashArray of sorts called Cloud Block Store.
The FlashArray family is a system with a pair of redundant controllers based on Xeon processors from Intel that uses a mix of off-the-shelf flash SSDs or custom flash modules manufactured by Pure Storage. The top end FlashArray//X90 R3 will top out at over 3 PB, which is a lot of capacity for a single instance of block storage in the enterprise. In fact, Matt Kixmoeller, vice president of strategy at the company, tells The Next Platform that the typical customer who wants to do rack-scale block flash external from the servers but accessible by all the servers in that rack tends to buy arrays in the range of 500 TB to 1 PB of capacity. A blast radius larger than a few racks makes enterprises uncomfortable, and consequently customers tend to go with the FlashArray//X50 and FlashArray//X70 models, not the top end box. Very roughly speaking, the arrays range from around $100,000 for a loaded up FlashArray//X10 to $1 million or more for a FlashArray//X90. You can upgrade from entry level to the big bad box, and upgrade as the box changes.
With the new FlashArray//X R3 models announced this week, the machines are only being shipped with the DirectFlash Modules as standard, and while off-the-shelf flash SAS SSDs are still supported in the R3 machines, that support is really about letting customers upgrade their controllers now and their flash storage later. Pure Storage is also delivering a new 1 TB DirectFlash module, based on 3D TLC NAND.
“DirectFlash has been an awesome architecture for Pure, and we want to bring it across the product line,” says Kixmoeller. “We have been measuring performance and reliability and power efficiency of our DirectFlash Modules from the early days and we also believed that it would be more reliable as well. We have seen over the years of shipping DFMs, that DFMs have about half the failure rate of SSDs.”
The new FlashArray//X R3 systems are also getting a controller boost, with a move to the new “Cascade Lake” Xeon refresh processors that Intel announced earlier this week. The more capacious variants of the FlashArrays get heftier processors, so it is not just that there is one chip chosen but rather each model gets its own processors. But in general, the //X R3 controller has about 25 percent more raw compute oomph than the //X R2 controller from last year. For those who are coming from the FlashArray//M R2 models from 2016 and 2017, the controller performance increase is more like 50 percent, says Kixmoeller. And while the 100 microsecond access time is still a goal, the new //X R3 machines have got that down to 150 microseconds, which is still a 40 percent reduction in latency. By the way, customers that want to add persistent memory into their FlashArrays can do so with Intel Optane 3D XPoint SSD modules, which slide into the enclosures. You don’t have to do anything special to access these – it’s just another kind of storage. In some cases, thanks to the lower latency of Optane, it can be cheaper, says Kixmoeller, to boost the performance of I/O sensitive applications like databases by plugging in some Optane SSDs rather than upgrading the compute on the controller. The Optane upgrade is also less expensive, which is a plus.
One last thing: Pure Storage seems uninterested in adding compute functions to its arrays to run them natively on the devices in a hyperconverged fashion, just as it has not been interested every time we have asked.