An Inside Look at One Major Media Outlet’s Cloud Transition
May 3, 2017 Nicole Hemsoth
When it comes to large media in the U.S. with a broad reach into television and digital, the Scripps Networks Interactive brand might not come to mind first, but many of the channels and sources are household names, including HGTV, Food Network, and The Travel Channel, among others.
Delivering television and web-based content and services is a data- and compute-intensive task, which just over five years ago was handled by on-premises machines in the company’s two local datacenters. To keep up with peaks in demand during popular events or programs, Scripps Interactive had to overprovision, leaving those servers and their mostly enterprise/proprietary stacks sitting idle in down seasons.
The cost of this overprovisioning and the lack of innate scalability were only part of the reason Scripps Networks Interactive made the on-prem to cloud transition. The real driver was agility: being able to develop and deploy new services quickly, with the reliability and resiliency to stick to their service delivery targets. As Mark Kelly, the company’s director of cloud and infrastructure architecture, tells The Next Platform, between 60 and 70 percent of their infrastructure is on Amazon Web Services, usage that amounts to around 3000 instances per day and two petabytes of S3 object storage.
Kelly says that the first hurdle was getting financial decision makers to see the costs and benefits, since so much early on was “back of napkin” and difficult to pin down. “It is always a struggle to present that, but cost was only one benefit. We were looking for something to give us an edge. It was really about competition, which were startups that were more agile. We were stuck in a traditional enterprise datacenter model where it took weeks or longer to make changes,” he explains, noting that once they got through the initial hurdles, the benefits were clear, leading to an uptick in the number of applications handled in the Amazon cloud.
Instead of relying on local datacenters, the company can now scale with demand to serve the 30-40 million daily visits across its numerous video-heavy web properties, and can quickly usher in new services that are part of Amazon. For instance, Kelly says video teams at Scripps Networks Interactive are experimenting with MXNet for machine learning to better analyze viewer data and, more uniquely, are using the Amazon Rekognition service to train for video elements that need to be blurred. He gave the example of a brand name showing in the background (for example, a can with the Coke name clearly visible) and says that in the past, that kind of editing was a manual human task. By building a database of brands and other video no-nos, they can be far faster at blurring, something that was an inexact process before.
Around 70% of the consumer-facing side of Scripps Networks Interactive’s business is on AWS, but some of the latency-sensitive workloads with custom or specific software stacks still run on site. Among such workloads is video transcoding, an application that is both data and compute intensive. Kelly says they are working on moving some of that as a batch workload to AWS, but the problem is more one of a lack of support on the cloud vendor side for transcoding applications. “Transcoding is a unique animal. The proper codecs for transformations and such don’t exist on major clouds yet, but a lot of those batch jobs can take advantage of scaling better than our limited on-prem transcoding. That does get cost-prohibitive because of data movement at a certain point, but bursting into the cloud for this works well when it’s necessary.”
For these and other workloads, Kelly says his teams are using various instance types on AWS as they fit the application, including GPUs for 3D rendering and other video services. Many of the same hardware options are available via other cloud offerings, and he says they do follow what the other major cloud providers are doing and even have some workloads running on Azure (only because they are tied to Windows Server). Ultimately, though, Kelly says AWS is the most mature with the richest options.
Although the media company has found the cloud transition to be a successful undertaking, there are still difficult elements, especially as data volumes grow and the need to be competitive for the sake of both viewers and advertisers increases. “Given the volume of infrastructure and supporting services and operations, data management is a critical issue for us,” Kelly explains.
“We were doing traditional on-prem support of all of our infrastructure and hunting manually for needles in the proverbial haystack. Aggregating that into a centralized form with its own process has made that easier; we can resolve problems easier.” For much of the IT operations side, Kelly says they deploy Sumo Logic, the log analysis and monitoring suite of tools from a company that, seven years after its founding, has raised $160 million and has over 1300 customers. “The data as a whole from our logs, operations and application services among other services translates to around 500 GB per day. That is a lot to process and dig through to see our operations challenges clearly. Applying filters, searches, and queries is making our operational lives easier and we can centralize around these main repositories,” Kelly says.
Aside from the IT operational issues that Scripps Networks Interactive addressed with Sumo Logic, the larger analytics side comprises several different open source and custom-developed tools. This is a big difference from the largely proprietary stack that was in place for similar needs when Kelly joined six years ago, and it is a stack that is evolving quickly to incorporate machine learning and other tools. With between 100 and 200 developers on staff and a mostly Java-based focus, Kelly says that AWS has been a good fit and that his teams pay close attention to the new tools and partnerships AWS adds.
As with other companies that have shifted from on-prem to cloud for many of their operations, system administrators’ jobs have been on the line. Kelly says they did not have a major layoff, but rather had their sysadmins learn more about automation and generalized development, so their skills became more valuable rather than going to waste.