Media companies and supercomputing centers have similar compute, storage, and bandwidth issues, but their workloads and users vary enough that they require different kinds of systems.
As in other parts of the datacenter, convergence is coming to the storage that underpins media content and distribution. In this case, the convergence is a reflection of the new reality of big media companies, which need systems that are suitable not only for content creation but also for content distribution. And rather than having to manage a bunch of disparate boxes with human beings babysitting the movement of data across production arrays and archival storage on tape or in the cloud, companies want integration and automation.
This is why DataDirect Networks, which got its start designing storage for media companies fifteen years ago and was immediately pulled into the HPC market by scoring its first deal with NASA, has come up with an uber-appliance called MEDIAScaler tuned up specifically for media and entertainment companies.
Over half of DDN's 1,200-strong worldwide customer base is in the media and entertainment sector, and its products are in use at about half of the 1,200 or so large media companies. These media companies have workloads that run the gamut: animation and rendering, content origination and delivery, archiving, post production, transcoding, and broadcast.
Laura Shepherd, director of HPC markets at the company, tells The Next Platform that media and content distribution drives about 15 percent of DDN’s revenues, but that share changes depending on what technology transitions are under way on the various devices we use to consume media. The HPC business can be very choppy, with a lot of big system and storage deals increasingly tied to Intel Xeon E5 product cycles, and as it turns out, the media industry has its own waves. (This is one reason why the financials of all of the major supercomputer makers like SGI and Cray as well as players like DDN have to be evaluated against their own product cycles, not on a quarterly and maybe not even on an annual basis.) DDN has not released financial figures for its latest fiscal year, but its revenue is on the order of several hundred million dollars.
“The amount of data that media companies have to deal with fluctuates based on how recently there has been a transition of the media workflow to a new format,” Shepherd explains. “In most markets, where you are dealing with high data rate instruments – such as in life sciences or surveillance or oil and gas – there are waves in devices that do drive big changes in data rates, but all the while you are having minor changes in data rates through the proliferation of new devices and the enhancements in definition of new devices and the number of devices involved in operations. So you have this nice gradual gain of data with these milder peaks. But with media, the whole industry moves to UltraHD, and the whole industry is going to move to 8K. And the more advanced the infrastructure, the more likely that they are going to have to work in uncompressed formats.”
These transitions are more like punctuated equilibrium, and they tend to be about three to five years apart. The media industry is in the middle of a transition to 4K resolution devices right now, as it turns out, and this is an opportune time for DDN to craft a storage appliance that has the oomph to tackle the higher-definition jobs and, importantly, bring all of the needed storage components together into a seamless whole.
The central component of the MEDIAScaler appliance is a parallel file system running on high performance disk array nodes, and DDN is starting out with IBM’s GPFS. (But Intel’s distribution of the Lustre file system could just as easily be used, according to Shepherd.) The GPFS file system can deliver about 4 GB/sec of media streaming capability per client. MEDIAScaler starts out as small as 80 TB and scales up to tens of petabytes of capacity, depending on which SFA disk arrays are chosen from the DDN lineup.
The active archive nodes comprise both object storage and tape storage. The object storage in this case is WOS, which also serves as a cloud gateway to object storage that is compatible with the Amazon S3 or OpenStack Swift protocols. The GPFS file system also has been integrated with tape libraries for long-term storage, and the secret sauce in the MEDIAScaler is that data stored across these devices is presented in a single view that is transparent to both applications and administrators. Competing products in the media sector can span different kinds of disk, tape, and cloud storage, but Shepherd says that with each of them, there is a hop somewhere or a need to stop and move data from one storage device to another or to a third party service out on the cloud.
Precisely what disk arrays customers will pick to build their MEDIAScaler systems depends on the number of streams that they need, says Shepherd. For those wanting to push three to seven uncompressed HD streams, DDN would start them out with the SFA770-class arrays – provided their capacity is not doubling every year. Bigger media companies are trying to push tens of streams so they can parallelize content creation jobs and get more work through their shops in less time (imagine that), and if a company wants to push 35 uncompressed UltraHD (4K) streams – which Shepherd says no one else in the industry can do – then DDN would suggest the higher-end SFA12K arrays as the foundation for a MEDIAScaler setup. The MEDIAScaler can scale out to hundreds of gigabytes per second of media stream performance and up to hundreds of petabytes of capacity, and has features to automatically tier data between disk and flash in the SFA arrays and across tape and object storage, with all data presented through a single namespace.
Here’s how DDN sizes up the storage competition in the media market. It says that its basic MEDIAScaler building block can deliver those 35 UltraHD streams, but an Isilon setup from EMC can deliver around one stream and the Quantum and NetApp alternatives can deliver around eight streams. (These are uncompressed.) DDN can push 4 GB/sec for the streams, compared to less than 1 GB/sec for the alternatives mentioned above.
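To get a feel for why stream counts translate into such large bandwidth numbers, here is a rough back-of-the-envelope sketch in Python. It is not DDN's sizing method. The frame size (3840 x 2160), the 10-bit 4:2:2 sampling (20 bits per pixel), and the 24 frames per second rate are all assumptions chosen for illustration, since the article does not specify the format behind the 35-stream figure, and the function names are made up for this example.

```python
def stream_rate_bytes(width, height, bits_per_pixel, fps):
    """Bytes per second for one uncompressed video stream.

    Ignores audio, metadata, and container overhead.
    """
    bits_per_frame = width * height * bits_per_pixel
    return bits_per_frame * fps // 8


def aggregate_rate_gb(streams, per_stream_bytes):
    """Aggregate rate in decimal gigabytes per second for N concurrent streams."""
    return streams * per_stream_bytes / 1e9


# Assumed format, not taken from the article:
# UltraHD frame, 10-bit 4:2:2 sampling (20 bits per pixel), 24 frames per second.
UHD_WIDTH, UHD_HEIGHT = 3840, 2160
BITS_PER_PIXEL = 20
FPS = 24

if __name__ == "__main__":
    per_stream = stream_rate_bytes(UHD_WIDTH, UHD_HEIGHT, BITS_PER_PIXEL, FPS)
    print(f"one stream: {per_stream / 1e9:.2f} GB/sec")
    # 35 is the stream count quoted in the article.
    print(f"35 streams: {aggregate_rate_gb(35, per_stream):.2f} GB/sec")
```

Under these assumptions a single uncompressed UltraHD stream needs roughly half a gigabyte per second, and 35 of them together need on the order of 17 GB/sec. Higher frame rates or bit depths push the numbers up proportionally.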
Workload Overlap Drives Convergence
The balance of the storage and the compute at media companies depends on the balance of content creation and distribution at those companies. Distribution companies, which are transcoding media for different formats and archiving them over time, obviously have a much higher focus on storage than compute, while those who are doing more content creation will lean toward a more traditional split between compute and storage. One of the factors that is driving the creation of MEDIAScaler is that there is an increasing amount of overlap in the workloads.
“Distribution used to be completely separate all the time,” Shepherd explains. “Now you have content distributors going into content creation, and creators going into distribution. You can think of it as distributors, creators, and hybrids, but the truth of the matter is that almost everybody has some hybrid going on right now.”
If customers are focused mainly on high-performance content creation, then they would be looking at SFA12K arrays, she says. And if they focus just on media distribution, then Shepherd would expect that they would be talking to DDN about WOS. Customers doing both will use a mix of both, and hence MEDIAScaler integrates both. Moreover, because the WOS object storage is a cloud solution that is latency aware, it is valuable not only for collaboration among artists on content creation, but is also well suited for content distribution because it can figure out which archive is closest to you on the Internet and serve that one up to you. It all comes down to particular cases and specific capacity, performance, and price points. MEDIAScaler is available now. Pricing information on the devices was not divulged.