Why IT Shops Are Going For All-Flash Datacenters
June 1, 2015 Timothy Prickett Morgan
If space, power consumption, and raw I/O performance were not an issue, most datacenters would continue to use disk drives for their tier one storage until the end of time. They would go with what they know and keep adding more spinning rust to their storage racks. But datacenters are running out of space and power, and as they ramp up their compute, they often cannot ramp up their storage to keep pace using disk.
And that is why Hewlett-Packard is starting to think that the all-flash datacenter is going to become more of a common occurrence starting this year. The first reason that HP believes this to be true, Craig Nunes, vice president of marketing for the HP Storage division, explains to The Next Platform, is that the cost of capacity on flash arrays keeps coming down as the performance and effective capacity of the arrays keep going up.
This time last year, says Nunes, HP was upgrading its 3PAR StoreServ line with the 7450 all-flash array, which provided the same price per effective capacity as a disk array using 15K RPM drives (around $2.00 per GB) and not using de-duplication and data compression. Those features generally were not done in-line as data was coming and going through the storage and, even when they were, had a dramatic and adverse effect on performance. With the StoreServ 20000 announced this week at the company’s Discover 2015 customer and partner event in Las Vegas, Nunes says that HP can get the price of an all-flash array down to the same cost per usable gigabyte as a disk array using much less expensive 10K RPM disk drives, or around $1.50 per GB. When you take into account snapshot data sharing, you can drive the cost down even further.
Here’s how the math breaks down according to HP using the 3PAR all-flash arrays to get the cost down to disk arrays using 10K RPM drives:
The HP arrays start out with a 3.84 TB SSD based on consumer-grade multi-level cell flash. The adaptive sparing, wear leveling, and overprovisioning features of the 3PAR controller stretch another 20 percent or so of capacity out of the raw flash. The in-line thin de-duplication features of the Thin Express ASIC Gen5 chip that goes onto the controller give another 4:1 data reduction. That gets the cost per effective capacity on the StoreServ 20000 down to around $1.50 per GB. Disk arrays sell for somewhere between $1.00 and $1.50 per GB, says Nunes.
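The chain of multipliers can be sketched in a few lines of Python. HP does not quote a raw flash price here, so the raw figure below is back-calculated from the $1.50 result and is an assumption, as is the reading that the 20 percent sparing gain and the 4:1 de-duplication ratio simply multiply. The function and constant names are illustrative only.

```python
# Back-of-the-envelope model of the cost chain described above.
# Assumption: the sparing gain and the dedup ratio compound multiplicatively.

def effective_cost_per_gb(raw_cost_per_gb, sparing_gain, dedup_ratio):
    """Spread a raw $/GB price over the capacity gained from sparing and dedup."""
    capacity_multiplier = (1.0 + sparing_gain) * dedup_ratio
    return raw_cost_per_gb / capacity_multiplier

SPARING_GAIN = 0.20  # "another 20 percent or so", per the article
DEDUP_RATIO = 4.0    # 4:1 in-line de-duplication, per the article
TARGET = 1.50        # $/GB effective, per the article

# Hypothetical raw price, implied by working backward from the $1.50 target.
implied_raw = TARGET * (1.0 + SPARING_GAIN) * DEDUP_RATIO
effective = effective_cost_per_gb(implied_raw, SPARING_GAIN, DEDUP_RATIO)

print(f"implied raw flash price: ${implied_raw:.2f} per GB")
print(f"effective price: ${effective:.2f} per GB")
```

Under those assumptions, the raw flash would have to cost roughly $7.20 per GB for the effective price to land at $1.50 per GB.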
To push the comparison even further, Nunes says that companies moving to 3PAR all-flash arrays will be able to be much more aggressive about using active snapshot copies of production data, something that they do not normally do. Citing data from the analysts at Wikibon, Nunes says that IT administrators will take idle snapshots of production data for protection purposes, often making eight to ten copies of such data. This is as opposed to making active snapshots of data, which point back to the original dataset and which impose I/O and compute overhead on the storage controller. With flash, this overhead is minimal, so companies can switch from making physical copies to making logical ones and still use them for production, development, and testing. On average, Nunes says, customers should expect a further 6:1 reduction by shifting to active snapshots, which means they need less storage. The implication is that a dataset is less expensive to store through snapshots, and the effective price per gigabyte drops by a factor of six, from $1.50 to 25 cents.
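The snapshot arithmetic is a single division, shown here as a minimal Python sketch. The $1.50 starting price and the 6:1 ratio come from HP; the function name is illustrative.

```python
def apply_reduction(cost_per_gb, ratio):
    """Divide a per-GB cost by an N:1 data reduction ratio."""
    if ratio <= 0:
        raise ValueError("reduction ratio must be positive")
    return cost_per_gb / ratio

flash_effective = 1.50  # $/GB after sparing and dedup, per HP
snapshot_ratio = 6.0    # average gain HP cites for active snapshots

with_snapshots = apply_reduction(flash_effective, snapshot_ratio)
print(f"${with_snapshots:.2f} per GB with active snapshots")
```

The result only holds for shops that actually replace physical copies with active snapshots; a site that keeps fewer copies to begin with would see a smaller ratio.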
All things being equal, even cost parity without taking into account the snapshot effect might not be enough to make customers jump from disk to flash. But the flash arrays offer an 80 percent to 90 percent reduction in datacenter footprint – so, it takes around seven racks of disk arrays to provide the capacity of one rack of flash – and there are similar differences when it comes to power consumption and datacenter cooling related to disk and flash storage, too. If you want to think of it this way, you get the superior I/O of flash arrays for free and you pay for all of these other benefits.
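As a quick sanity check on the footprint claim, the sketch below converts between a rack ratio and a percentage reduction. It assumes racks of equal size and equal capacity per comparison, which the article does not spell out.

```python
def footprint_reduction(disk_racks, flash_racks=1.0):
    """Fraction of floor space saved when flash_racks replace disk_racks."""
    return 1.0 - flash_racks / disk_racks

def disk_racks_per_flash_rack(reduction):
    """Inverse view: disk racks displaced by one flash rack at a given reduction."""
    return 1.0 / (1.0 - reduction)

seven_to_one = footprint_reduction(7)
print(f"7:1 consolidation saves {seven_to_one:.1%} of the footprint")

for pct in (0.80, 0.90):
    racks = disk_racks_per_flash_rack(pct)
    print(f"{pct:.0%} reduction is about {racks:.0f} disk racks per flash rack")
```

A 7:1 consolidation sits inside the 80 percent to 90 percent range HP quotes, which corresponds to somewhere between five and ten racks of disk per rack of flash.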
Here is a case in point. One large multinational bank with a datacenter in the United Kingdom, which has to remain anonymous as you might expect, was used to building new datacenters to house new systems as the bank expanded. And then the board of directors of the bank turned off the concrete trucks and said it would not spend $20 million on a new datacenter just to add storage capacity to underpin the server virtualization environment hosting some of its applications. So the IT staff stepped back and rethought the problem and did the math on moving to all-flash arrays. Then they actually did it. The upshot is that the bank got 60 percent of its datacenter space back, doubled the performance of the virtualization environment, and didn’t have to spend $20 million on that new datacenter. Nunes was not at liberty to say how much the company paid for its all-flash arrays, but as long as it was under $20 million, it was a win.
“It is more than just flash arrays going mainstream right now,” says Nunes. “We think based on what we are observing from some of our customer deployments, is that all-flash datacenter, which I think a lot of folks think is far off into the future, is actually now. And maybe this is affecting the high end of the storage industry, where we are certainly seeing a decline.”
(Obviously, because flash uses a lot less power and cooling than disk per unit of capacity, the move to flash does not present a problem in terms of the thermal density of the racks. Cramming more servers into a smaller space, on the contrary, creates power and cooling issues, at least until we switch to a new processing technology some years hence that does not rely on general purpose CPUs as we know them.)
The 3PAR StoreServ arrays are powered by HP’s own Thin Express Gen5 ASIC, as we mentioned above, which can lash up to eight controllers together to create a virtual storage cluster that can scale to 1,024 flash-based SSDs with up to 3.6 TB of on-node cache memory. Such a machine is called the StoreServ 20850, and it has up to 3.9 PB of raw capacity using the 3.84 TB SSDs. (HP sells lower-capacity SSDs for the StoreServ 20000 series that come in 480 GB, 920 GB, and 1.92 TB sizes.) There is a “converged flash” variant that can have up to 1,920 disk drives hanging off the controllers next to the flash, which tops out at 6 PB of total raw capacity.
With data reduction techniques, the all-flash StoreServ 20850 crams 280 TB in a 2U enclosure, 5.5 PB in a rack, and 15 PB in an eight-node system; with RAID data protection overhead, the effective capacity gets knocked down to 12 PB. Such a beast can deliver 3.2 million I/O operations per second with a latency that ranges from 200 microseconds to 800 microseconds, depending on the file size tested. With elastic resource pooling and 16 Gb/sec Fibre Channel links between the arrays, four such StoreServ beasts can be federated so that workloads and data can be balanced across those arrays. With this storage cluster, HP can deliver 60 PB of total capacity (without RAID overhead) with a maximum of 10 million IOPS and 300 GB/sec of aggregate storage bandwidth.
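The capacity figures quoted for the StoreServ 20850 can be cross-checked with some simple arithmetic. This sketch assumes decimal units (1 PB = 1,000 TB), which is how storage vendors usually quote capacity; the article does not say so explicitly, and the variable names are illustrative.

```python
TB_PER_PB = 1000.0   # assumption: decimal units

ssd_count = 1024     # maximum SSDs in an eight-controller system
ssd_tb = 3.84        # largest SSD size
effective_pb = 15.0  # with data reduction, before RAID overhead
after_raid_pb = 12.0 # with RAID overhead
federated_arrays = 4

raw_pb = ssd_count * ssd_tb / TB_PER_PB
implied_reduction = effective_pb / raw_pb
raid_overhead = 1.0 - after_raid_pb / effective_pb
federated_pb = federated_arrays * effective_pb

print(f"raw capacity: {raw_pb:.2f} PB")
print(f"implied data reduction: {implied_reduction:.1f}:1")
print(f"RAID overhead: {raid_overhead:.0%}")
print(f"federated capacity: {federated_pb:.0f} PB")
```

The raw figure lands on the 3.9 PB that HP quotes, and the implied reduction ratio comes out just under the 4:1 de-duplication figure cited earlier. Note that the federated IOPS number HP gives (10 million) is less than four times the single-array figure, so performance does not appear to scale as linearly as capacity.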
Entry street pricing for a StoreServ 20000 starts at under $100,000 for two controllers, four SSDs, and a single drive enclosure. Based on the pricing example above, a fully loaded all-flash StoreServ 20850 with an effective capacity of 15 PB would have a list price of around $22.5 million without a discount, and that does not include the benefits of being able to use active snapshotting. (Nunes says that HP “will give you a good discount” on that 15 PB setup, if you want it.) So, in the example above, the unnamed bank could get a fully loaded StoreServ 20850 for about the same price as the datacenter that it did not have to build.
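The $22.5 million list price follows directly from the per-gigabyte figure. Here is that multiplication as a short Python sketch, again assuming decimal units (1 PB = 1,000,000 GB); the function name is illustrative.

```python
GB_PER_PB = 1_000_000  # assumption: decimal units

def list_price(effective_pb, cost_per_gb):
    """Undiscounted price for a given effective capacity at a flat $/GB rate."""
    return effective_pb * GB_PER_PB * cost_per_gb

price = list_price(15, 1.50)
datacenter_cost = 20_000_000  # the new datacenter the bank declined to build

print(f"list price: ${price:,.0f}")
print(f"gap versus the datacenter: ${price - datacenter_cost:,.0f}")
```

Before any discount, the fully loaded array lists for $2.5 million more than the datacenter the bank avoided.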
At this point, the datacenter swap-outs from disk to flash have been fairly modest in size, says Nunes, usually in the range of a few petabytes of effective capacity. HP and its rivals in the all-flash array market hope this trend continues; disk drive makers Western Digital and Seagate Technology probably do not.
“It is one thing to deploy flash for virtual desktops or database acceleration, but when you start thinking about it at a datacenter level, where you do want to consolidate workloads and there is a payback in total cost of ownership or worker productivity, then they will look at a 400 TB high-end array and make sure they have high availability features,” says Nunes. “A lot of the first-gen all-flash platforms don’t have them, and they are working on them and they will get there eventually.”
The thing about moving to an all-flash datacenter is that companies don’t have to do it all at once. They can roll flash in and roll disk out for each set of workloads over a period of time. This is precisely what HP is expecting customers to do and, frankly, what it needs customers to do. HP’s all-flash array business grew by a factor of ten last year and now has an annual run rate of $342 million, and all-flash and hybrid disk-flash systems, depending on who you ask, accounted for somewhere between 30 percent and 40 percent of the $20 billion market for storage arrays in 2014.