It used to be far easier to talk about storage performance, cost, and options. But in the last few years, and certainly going forward, things are set to get far more complicated for both technology creators and end users.
As with every set of transitions in one part of the stack, the complexity will be mitigated by natural selection among vendors who build on core technologies and previous best practices, but there is a long road ahead for storage makers and consumers at scale. Long gone are the simple days of disk, tape, and Fibre Channel.
The current ecosystem still has plenty of disk to go around (and more tape than anyone cares to admit). But with SSDs, NVMe (over fabrics or not), and rapidly evolving storage class memory devices in the mix, there is not yet a set of best practices to help storage designers and end users strike the right performance/cost balance for their workloads, which must now contend with multiple ways data can be shuttled around based on priority or policy.
Getting to the heart of this complexity and rapid evolution is central to the storage section of the day at The Next I/O Platform on September 24 in San Jose. The day’s conversations will cover quite a bit of ground on this front from both technology creator and end user perspectives with LinkedIn, Netflix, Fred Hutchinson Cancer Center, and others weighing in on how they’re thinking about the new layers introduced to the modern storage stack.
Among the perspectives that day, we’ll hear from noted storage architect Curtis Anderson (Panasas), who will pick apart the lack of best practices for an increasingly diverse storage stack and home in on a few ways users might deal with this lack of coherence.
“In those good old days of disk and tape, there was relative consistency over time. Tape had its complexities and disk had a known performance profile, and it was up to the storage industry to define a few rules of thumb on how to architect for desired performance levels in hardware and software. At that point we were simply limited by the speed of the hard drive and it was a matter of lashing things together to get aggregates with higher performance levels,” Anderson says. “That is no longer the case. There are now three and a half (or four, depending on how you define it) layers of performance: HDDs, SATA SSDs, NVMe, and storage class memory (SCM). As a storage designer, what strikes me is that this creates a lot of complexity.”
Anderson says that even though high performance awaits, so do high costs. Without advanced policy engines and best practices that provide a seamless way for users to maximize investments in the higher-cost elements (NVMe, SCM), the promise of all those layers of performance can fall short of reality.
SCM is a good example of this lack of best practices, Anderson explains. “SCM has been around for several years now but we are all still figuring out how best to use it. Ultimately, the cross product of all the individual layers and how data gets moved between them is something the whole industry is still trying to understand. For instance, we had just absorbed the idea of capacity-optimized flash, managing SATA SSDs, and the relationship between those and the hard drives. Then along came NVMe, then SCM, and storage designers and users alike are left struggling to make use of all of the layers most cost-effectively.”
In short, because of this lack of best practices there is a lot left on the table performance-wise. For example, with something like SCM, there might be low latency, high bandwidth, and some byte addressability, but all of that is landlocked on a single node; if that node goes down, all is lost. The only way to disaggregate it is over the network, which adds latency, and the replication users turn to for durability introduces latency of its own. “The best practice now, or what has become the default, is treating SCM like a fast flash device where you use sectors just like on normal storage devices with a file system on top. That cuts into the advantage since you’re not getting byte addressability or low latency support. The industry as a whole just does not know how best to use SCM,” Anderson argues.
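The contrast Anderson draws can be sketched in a few lines. The snippet below uses an ordinary file as a stand-in for the device (real SCM would typically be exposed through a DAX-mounted filesystem or character device, which is an assumption here): block-style access must read, modify, and rewrite a whole sector to change one byte, while a memory-mapped, byte-addressable path stores the byte in place.

```python
import mmap
import os
import tempfile

SECTOR = 4096

# A regular file stands in for the SCM device (illustrative assumption).
path = os.path.join(tempfile.mkdtemp(), "scm_demo")
with open(path, "wb") as f:
    f.write(b"\x00" * SECTOR * 4)  # a tiny four-"sector" region

# Block-style access: read-modify-write an entire sector to flip one byte.
# This is the access pattern SCM inherits when treated like a fast SSD.
with open(path, "r+b") as f:
    sector = bytearray(f.read(SECTOR))
    sector[10] = 0xAB
    f.seek(0)
    f.write(bytes(sector))

# Byte-addressable access: map the region and store a single byte in place,
# the capability that block emulation gives up.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:
        m[SECTOR + 10] = 0xCD  # one byte, no sector read-modify-write

with open(path, "rb") as f:
    data = f.read()
print(data[10], data[SECTOR + 10])  # 171 205
```

On real persistent memory the mapped store would also need explicit cache-line flushes to be durable, which is exactly the kind of detail the missing best practices would have to pin down.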
Of course, treating SCM like a super-fast SSD might be somewhat wasteful, but it is the easiest thing to do. There are other questions, too. For instance, how do you map SCM into the application so it knows when to use byte addressability? And running through all of this is the question of cost per byte, which goes beyond the simple policies or least-recently-used placement schemes of yore.
“With these layers, where does a piece of data go when accessed? You’re balancing the temperature of each piece of data against the cost of moving it up and down the stack, and there are a lot of combinations. There isn’t anyone I know of putting much theory into exploring that. With these performance layers you don’t have to move data all the way to the top or bottom; it can go from HDD to NVMe, or, for the hotter data, even to SCM,” Anderson says. “As the industry as a whole tries to figure this out, something new will rise to the top, but it will take a while.”
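The trade-off Anderson describes can be made concrete with a toy scoring function: for each candidate tier, weigh the datum's access temperature against capacity cost and the cost of migrating between tiers. The tier names, dollar figures, and scoring formula below are illustrative assumptions only, not anyone's shipping policy engine.

```python
TIERS = ["hdd", "sata_ssd", "nvme", "scm"]  # cold -> hot (half-layers ignored)
COST_PER_GB = {"hdd": 0.03, "sata_ssd": 0.10, "nvme": 0.30, "scm": 1.00}
MOVE_COST_PER_HOP = 0.02  # stand-in for migration overhead between adjacent tiers

def best_tier(temperature: float, current: str, size_gb: float) -> str:
    """Pick the tier whose benefit (temperature-weighted speed) best offsets
    capacity cost plus the cost of migrating from the current tier."""
    best, best_score = current, float("-inf")
    for i, tier in enumerate(TIERS):
        hops = abs(i - TIERS.index(current))
        benefit = temperature * (i + 1)  # hotter data rewards faster tiers
        cost = size_gb * COST_PER_GB[tier] + hops * MOVE_COST_PER_HOP
        score = benefit - cost
        if score > best_score:
            best, best_score = tier, score
    return best

# Hot data climbs all the way; cold data is not worth moving off disk;
# lukewarm data can stop partway up, as Anderson notes.
print(best_tier(temperature=5.0, current="hdd", size_gb=1.0))   # scm
print(best_tier(temperature=0.01, current="hdd", size_gb=1.0))  # hdd
```

Even this toy version shows why the combinatorics get hard: the right answer shifts with every change to the cost constants, and a real engine must also predict temperature rather than being handed it.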
We’ll be talking about the different ways of thinking about this problem from performance, application, and cost perspectives. On the latter, we’ll just say that prices across these tiers range from around $30 per terabyte to over $1,000 per terabyte for cutting-edge NVMe-oF technology.
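That spread is worth a moment of quick arithmetic. The working-set size and the 90/10 hot/cold split below are hypothetical, but they show why nobody simply buys the fast tier for everything.

```python
# Rough per-terabyte figures from the text; the deployment numbers are invented.
PRICE_PER_TB = {"hdd": 30, "nvmeof": 1000}  # dollars

working_set_tb = 100
all_hdd = working_set_tb * PRICE_PER_TB["hdd"]
all_nvmeof = working_set_tb * PRICE_PER_TB["nvmeof"]

# A tiered split: 90 TB of cold data on HDD, 10 TB of hot data on NVMe-oF.
tiered = 90 * PRICE_PER_TB["hdd"] + 10 * PRICE_PER_TB["nvmeof"]

print(all_hdd, all_nvmeof, tiered)  # 3000 100000 12700
```

A 33x gap between the all-disk and all-NVMe-oF builds collapses to roughly 4x with a sensible split, which is exactly the kind of outcome the missing placement policies are supposed to deliver automatically.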
“It is straightforward enough to think about performance, but add the other big element of cost, and how, as a storage architect, one can minimize cost and maximize performance, and it gets quite complicated,” Anderson says.
We’ll hear much more about this from Anderson and many others at The Next I/O Platform on September 24. Our thanks to Panasas for being a sponsor of the technical event.