Server virtualization juggernaut VMware was long reluctant to take on the storage establishment, most aptly represented by its parent company, EMC, with virtual storage area network software aimed at commodity hardware. But starting last year, the company scaled out and scaled up its Virtual SAN product and allied it with its ESXi 6.0 hypervisor, creating a potent combination that could take on the hyperconverged storage upstarts.
But now, VMware wants more, particularly with its core ESXi and vSphere server virtualization technologies reaching their peak. With VSAN 6.2, which launches today along with a refreshed and repackaged vRealize cloud controller suite, the company will be extending the capabilities – and therefore the use cases – for its VSAN software as the hyperconverged storage layer for virtualized workloads and private clouds alike. It is also rebuffing the advances of Nutanix and others who are trying to storm the 500,000-strong VMware installed base with their own hyperconverged storage and server virtualization infrastructure.
Some key features found in real physical arrays, notably data reduction techniques such as data compression and data de-duplication, have been missing from VSAN and are now being added to the mix. Moreover, erasure coding approaches to distributing and protecting data, made popular by the hyperscalers who have created their own object storage as well as the several handfuls of storage companies that followed suit, are also being added to VSAN. The net result is that VSAN software makes much more efficient use of storage, which is particularly important for the all-flash configurations that are becoming more popular for latency-sensitive workloads.
The interesting bit, Gaetan Castelein, senior director, product management and marketing of storage and availability at VMware, tells The Next Platform, is that the data reduction technologies are not being provided on hybrid VSAN configurations that use a mix of disk drives and flash devices for their I/O and storage capacity.
“There are a number of reasons for this,” Castelein explains, adding that VMware does not mean these will never be supported, but just that they are not supported now. “These features are most valuable on flash where the cost of the media is much higher, and they also require more I/O operations from the underlying storage, and we think it is technically a better fit with flash than for disk. And finally, the market is continuing to move to all-flash as prices keep coming down faster than for disks, and all-flash becomes very, very attractive.”
Given the relative cost and performance of all-flash arrays and now hyperconverged storage that is based on servers but can also deliver all-flash configurations, enterprises can make an argument for buying the Cadillac version of the storage rather than settle for the Chevy. And to stretch this metaphor even further, we think that VMware’s storage experts had better be working on a cheap and virtualized object storage of some kind to complement VSAN for those shops that want inexpensive storage for largely unstructured data. “This is an interesting idea,” Castelein said with a laugh when we suggested it to him.
Obviously, EMC has a lot of experience with de-duplication and data compression techniques across the many storage lines it has created or acquired over the years, but VMware operates separately and Castelein says that these routines were developed in-house by VMware’s engineers. The data reduction techniques are called the “space efficiency” feature in VSAN 6.2, and it will be enabled at the cluster level within a VMware infrastructure stack. The de-duplication works on fixed block lengths of 4 KB and is performed as information is moved from the caching tier of the VSAN stack, which is implemented in a mix of main memory and flash-based PCI-Express cards in servers, to the capacity tier in the VSAN, which will be based on flash drives. The compression algorithms kick in after data has been de-duplicated.
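To make the order of operations concrete, here is a minimal sketch of de-duplicating fixed 4 KB blocks as data is destaged and compressing only the blocks that survive. VMware has not said which fingerprinting or compression algorithms it uses, so the SHA-256 hash, the zlib compressor, and the `destage` and `read_back` helpers are our assumptions for illustration, and the dictionary merely stands in for the capacity tier.

```python
import hashlib
import zlib
from typing import Dict, List

BLOCK_SIZE = 4096  # fixed 4 KB block length, as described in the article


def destage(data: bytes, store: Dict[str, bytes]) -> List[str]:
    """De-duplicate fixed-size blocks, then compress only the unique ones.

    `store` is a stand-in for the capacity tier: it maps a block fingerprint
    to the compressed block. The returned list of fingerprints is the recipe
    needed to rebuild the original data.
    """
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        fingerprint = hashlib.sha256(block).hexdigest()  # assumed hash choice
        if fingerprint not in store:
            # Compression runs after de-duplication, on unique blocks only.
            store[fingerprint] = zlib.compress(block)  # assumed compressor
        recipe.append(fingerprint)
    return recipe


def read_back(recipe: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the original data from its recipe of fingerprints."""
    return b"".join(zlib.decompress(store[fp]) for fp in recipe)


# Four logical blocks, but only two distinct ones, as with cloned VDI images.
image = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
capacity_tier: Dict[str, bytes] = {}
recipe = destage(image, capacity_tier)
print(len(recipe), "logical blocks,", len(capacity_tier), "stored blocks")
```

Doing the compression second means the CPU is only spent on blocks that actually land on the capacity tier, which is consistent with the low overhead Castelein cites.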
The amount of data reduction customers can expect depends on the workload, but averages around 7:1 or so. For virtual desktop infrastructure, where PC images are running the same code, the reduction factor will be very high, while for databases, which are dense and not full of replicated data, the reduction will be much less pronounced. Neither the de-duplication nor compression routines make use of specific functions in any recent Intel Xeon processors to get acceleration, so VSAN 6.2 will work on older systems. The overhead for the data reduction technologies is on the order of 5 percent or so of the CPU, says Castelein, which is not very much at all, particularly on a multicore processor that might have a dozen or more cores.
On top of these data reduction techniques, VMware is also giving VSAN erasure coding that makes use of RAID 5 or RAID 6 data striping across multiple ESXi hosts that underpin the VSAN clustered storage. The RAID 5 erasure coding technique takes a minimum of four ESXi hosts and spreads one slice of parity data and three slices of actual data across those four nodes. To replicate datasets once, as VSAN was doing by default prior to the VSAN 6.2 release, it would take 40 GB to secure 20 GB of data. But with the RAID 5 erasure coding method, it takes only 27 GB to secure that 20 GB of data, or a 33 percent overhead. RAID 6 erasure coding takes at least six ESXi hosts and keeps two sets of parity data, so two different parts of the dataset can be nuked and the virtual SAN will keep running. In this setup, there is a 50 percent overhead on the storage capacity to implement the data protection.
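The capacity figures follow directly from the ratio of total slices to data slices, and a short sketch makes the arithmetic easy to check. We assume a three-data-plus-one-parity stripe for RAID 5 and a four-data-plus-two-parity stripe for RAID 6, the latter inferred from the six-host minimum and the 50 percent overhead. The `raw_gb` and `xor_slices` helpers are hypothetical names, and the XOR demonstration covers only the single-parity case; how VSAN actually computes its parity is not something VMware has detailed here.

```python
def raw_gb(usable_gb, data_slices, parity_slices):
    """Raw capacity consumed to protect usable_gb with a data+parity stripe."""
    return usable_gb * (data_slices + parity_slices) / data_slices


def xor_slices(slices):
    """Byte-wise XOR of equal-length slices (single-parity math only)."""
    out = bytearray(len(slices[0]))
    for s in slices:
        for i, byte in enumerate(s):
            out[i] ^= byte
    return bytes(out)


mirror_gb = 20 * 2           # one full replica of 20 GB
raid5_gb = raw_gb(20, 3, 1)  # assumed 3 data + 1 parity across four hosts
raid6_gb = raw_gb(20, 4, 2)  # assumed 4 data + 2 parity across six hosts
print(mirror_gb, round(raid5_gb, 1), raid6_gb)

# Lose the middle data slice, then rebuild it from the survivors plus parity.
data = [b"host", b"disk", b"vsan"]
parity = xor_slices(data)
rebuilt = xor_slices([data[0], data[2], parity])
print(rebuilt == data[1])
```

The trade is the usual one for erasure coding: less raw capacity than mirroring, in exchange for extra I/O and computation when writing or rebuilding data.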
If you add up all of the effects of these features, then an all-flash VSAN cluster now offers around a factor of 10X more data density at the 6.2 release level compared to the prior release, and that can drive the cost of an all-flash VSAN array – including software and hardware – down to as low as $1 per usable GB, according to Castelein.
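VMware did not break down the 10X figure for us, but one plausible reading is that it combines the roughly 7:1 data reduction with the move from mirroring, which needs 2 GB of raw capacity per usable GB, to RAID 5 erasure coding, which needs about 1.33 GB. The sketch below is our own back-of-the-envelope check under that assumption, and the `density_gain` helper is a hypothetical name.

```python
def density_gain(reduction_ratio, old_raw_per_usable, new_raw_per_usable):
    """Combined gain from data reduction and a cheaper protection scheme."""
    return reduction_ratio * old_raw_per_usable / new_raw_per_usable


# 7:1 data reduction, mirroring (2x raw) replaced by assumed 3+1 RAID 5 (4/3x raw).
gain = density_gain(7.0, 2.0, 4 / 3)
print(round(gain, 1))
```

Under those assumptions the result lands a little above ten, which squares with the round number VMware is quoting.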
So how does this all stack up? Here are the competitive comparisons that VMware is making, and not surprisingly, you won’t see EMC’s own hybrid or flash arrays in the comparisons:
On the left, VMware is lining up a VSAN array with four nodes against a Nutanix NX-3160 hybrid disk-flash configuration, including hardware, software, and three years of support costs for the stack. The VSAN software is running on white box machines from Supermicro. The idea here is that you can get an all-flash setup for about 48 percent less than a hybrid Nutanix machine.
On the right hand side of the chart above, VMware is comparing VSAN against an unnamed all-flash array maker (we think it is Pure Storage, but Castelein is not saying), and the case is even stronger. We are trying to get the details behind this comparison and are particularly intrigued by the stipulation that the hardware costs for VSAN 6.2 only include SSDs because the compute is mostly allocated to running virtual machines. (In the comparison on the left, both the Nutanix and VSAN setups imply a mix of compute and storage workloads.) It would be far better to add compute to the all-flash appliance than to take compute out of the VSAN picture, we think. But this does suggest that the storage portion is far cheaper on the VSAN front.
Neither of these comparisons takes into account the performance of the storage, which is obviously a key factor in any comparison.
What VMware is focused on is that it has 500,000 customers for its server virtualization wares, an installed base that it has spent more than a decade and a half building, while Nutanix, the market leader by some measures in hyperconverged storage, has 2,100 customers as of earlier this year when it announced it was going public.
VMware now has more than 3,000 customers who have deployed VSAN through the end of 2015, and is adding 500 new customers per quarter. Interestingly, the business has a $100 million annualized booking rate, which is up around 200 percent compared to a year ago, and it has deployed VSAN on more than 20,000 processor sockets (VMware calls them CPUs) in the fourth quarter alone. That is more than 10,000 servers, and that probably means VMware has an installed base of VSAN server nodes that is several times larger. This is but a dent in the number of servers that VMware has its ESXi software running on, linking to other storage. (With 50 million virtual machines under management, this suggests several million machines at least, more if the VM densities are not too high on the boxes.)
In other words, hyperconverged storage is the company’s market to lose, and for a while there when it was ignoring Nutanix and other players and offering a limited scale virtual storage array, it looked like VMware might just do that. But now, VMware is adding the features that will give it a chance to convert its vSphere base to virtual SANs, and that could go a long way toward making up for the decline in ESXi and vSphere revenues that had to come sooner or later.
With the launch of VSAN 6.2, VMware is offering three different editions of its virtual storage. The Standard Edition, which costs $2,495 per socket, has the base functionality. With this release, this includes support for pure IPv6 networking for those customers who are moving away from the earlier IPv4, such as government organizations and service providers with large numbers of devices on their networks. The core VSAN software also includes software-based checksum to detect and resolve disk errors across a VSAN cluster, which is done through disk scrubbing and moving in copies of replicated data. The core software also now supports write through on read memory caches, which are local to virtual machines and which have a big impact on VM performance. The software also has sparse swap algorithms to reclaim space left over after memory swaps.
VSAN Advanced Edition includes the all-flash setup as well as the new compression, de-duplication, and RAID 5/6 erasure coding across nodes, and it costs $3,995 per socket.
The full-on VSAN Enterprise Edition costs $5,495 and adds in quality of service features, such as eliminating noisy neighbor issues, pegging performance for VMs to service level agreements regardless of the order they are provisioned, and providing deep visibility in the IOPS consumed per VM in the cluster for performance management. The Enterprise Edition will also allow for a single VSAN cluster to stretch across multiple physical sites in an active-active manner, provided the network latency is low enough and the bandwidth is high enough, of course.
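To put the list prices in context, here is a rough sketch of the software bill for a small cluster. The four-node count echoes VMware's own comparison above, but the two sockets per node are our assumption, as is treating the Enterprise Edition price as per socket like the other two editions. The `license_cost` helper is a hypothetical name, and hardware and support are left out.

```python
# List prices from the article; Enterprise is assumed to be per socket too.
PRICE_PER_SOCKET = {"standard": 2495, "advanced": 3995, "enterprise": 5495}


def license_cost(edition, nodes, sockets_per_node=2):
    """VSAN list-price license cost for a cluster of identical nodes."""
    return PRICE_PER_SOCKET[edition] * nodes * sockets_per_node


for edition in PRICE_PER_SOCKET:
    print(edition, license_cost(edition, nodes=4))
```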
VSAN 6.2 will be available in early March, and will be trying to get its slice of the $1.5 billion in hyperconverged system sales that IDC expects this year.