IBM, Aspera Transfer Network Lessons to Wider Markets
August 17, 2015 Nicole Hemsoth
Before it was acquired by IBM in early 2014, high-speed data transfer company Aspera had a story that was best told through the performance improvements reported by its end users. That customer list, which included some of the top content providers on the planet, was largely focused on media and entertainment. Since moving under the Big Blue umbrella and expanding its capabilities, the company has set its sights on a broader set of opportunities in supercomputing research networks, hospital networks, and global enterprise settings.
While it might be worthwhile to do a more in-depth technical overview of the next-generation protocols that are set to show 100 Gb/s transfers from disk to disk with full encryption, all in a FIPS-compliant way to satisfy future enterprise users, the important note here is that Aspera, which simply tacked “an IBM company” onto its name rather than being absorbed, has its eye on the future of large-scale file transfer. This is both a hardware and software story, as one of the company’s lead software architects, Charles Shiflett, explains to The Next Platform. It comes at a time when bolstering the reliability, robustness, and of course speed of massive file transfers with full encryption is more important than ever, especially following IBM’s investment in another acquisition, the SoftLayer cloud business.
Aspera’s foray into the fast-growing field of high-speed file transfer was fed by frustration with TCP-based file transfer protocols, which were difficult to accelerate to the point required by ever-growing data sizes and speed needs. To counter this, the company rolled out its FASP software, which significantly improved on what was possible using FTP and HTTP approaches, with all the required security and encryption for data in flight. In essence, Aspera’s software upends TCP’s way of handling rate control and reliability by removing a big bottleneck: the TCP acknowledgment of packet receipt from the receiving end of the network. There is still acknowledgment, but the handling of packet loss has been altered so that the transfer does not automatically slow down because an acknowledgment has not come back fast enough. This is one of several enhancements that Aspera has made to existing file transfer systems, but with FASP and the R&D resources, partnerships, and customer base of IBM, Aspera has a chance to see bigger returns on its technology investments.
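To put a rough number on why acknowledgment-driven rate control becomes a bottleneck over long distances, the sketch below uses the widely cited Mathis approximation for loss-based TCP throughput, along with the bandwidth-delay product. It is a back-of-envelope model only and says nothing about how FASP is implemented; the function names, the 1,460-byte segment size, the 50 ms round trip, and the 0.01 percent loss rate are all illustrative assumptions, not figures from Aspera.

```python
import math


def tcp_loss_limited_gbps(mss_bytes, rtt_s, loss_rate):
    """Mathis approximation: rate <= (MSS / RTT) * (C / sqrt(p)), C = sqrt(3/2).

    Gives an upper bound for a single loss-based TCP flow, in Gb/s.
    """
    c = math.sqrt(3.0 / 2.0)
    bits_per_s = (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))
    return bits_per_s / 1e9


def window_bytes_needed(rate_gbps, rtt_s):
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return rate_gbps * 1e9 / 8 * rtt_s


if __name__ == "__main__":
    # Assumed values for illustration: 1,460-byte segments, 50 ms RTT, 0.01% loss.
    limit = tcp_loss_limited_gbps(1460, 0.050, 0.0001)
    print(f"Loss-limited TCP ceiling: {limit * 1000:.1f} Mb/s")
    print(f"Window to fill 10 Gb/s at 50 ms: {window_bytes_needed(10, 0.050) / 1e6:.1f} MB")
```

Under those assumed conditions a single flow tops out in the tens of megabits per second regardless of link capacity, which is the kind of gap that protocols decoupling loss recovery from rate control aim to close.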
This is not to say that IBM and Aspera have been close partners in the past. In fact, a great deal of the work Aspera did was with Intel. Prior to the acquisition, Aspera worked with the chipmaker on some noteworthy transfer performance improvements, including one demonstration at the beginning of the partnership (just over two years ago) in which they used four 10-gigabit cards without encryption and 20 SSDs connected to RAID controllers to prove out read speeds of 40 Gb/sec, then sent that data over the network to write to an identically configured node. Last year at SC14, they upped the ante and showed how the same concept, this time with two 40-gigabit network adapters and NVM-Express drives (instead of the RAID cards), could be proven both in the LAN and over a 2,000-mile distance, with 61 Gb/sec in effective throughput in the LAN and, further, 70 Gb/sec from memory to memory.
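For a sense of scale, the SC14 figures quoted above can be set against the raw capacity of two 40-gigabit adapters. The short calculation below is plain arithmetic on those numbers; the helper names and the 1 TB example file are assumptions added for illustration, and protocol overhead is ignored.

```python
def utilization(effective_gbps, ports, port_gbps):
    """Fraction of aggregate line rate achieved as effective throughput."""
    return effective_gbps / (ports * port_gbps)


def transfer_seconds(size_terabytes, rate_gbps):
    """Time to move a payload of the given size (decimal TB) at a steady rate."""
    bits = size_terabytes * 1e12 * 8
    return bits / (rate_gbps * 1e9)


if __name__ == "__main__":
    print(f"Disk to disk:     {utilization(61, 2, 40):.1%} of 80 Gb/s")
    print(f"Memory to memory: {utilization(70, 2, 40):.1%} of 80 Gb/s")
    # Hypothetical 1 TB file at the disk-to-disk rate.
    print(f"1 TB at 61 Gb/s:  {transfer_seconds(1, 61):.0f} seconds")
```

In other words, the disk-to-disk result uses roughly three quarters of the available line rate, and a terabyte would move in a little over two minutes at that pace.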
The Intel involvement was with the Data Plane Development Kit (DPDK), a low-level toolkit that bypasses the kernel for network operations. The project, which is designed to provide a better interface to Intel’s own hardware, is not being furthered by Intel alone. Others, including Mellanox, are part of the open source DPDK development. In essence, for the NVM-Express drive work Aspera did, it is possible to set up the drives and the file system so that, rather than reading from the drives by way of the kernel (which involves memory copies and other overhead), data can be copied straight from disk to a defined memory region. There are other aspects to the DPDK, including the ability to take advantage of memory and processor alignment and synchronization primitives, all of which Aspera tapped into to demonstrate its performance improvements.
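The idea of landing data in a region of memory chosen ahead of time, rather than letting each read allocate and copy into fresh buffers, can be shown in miniature with ordinary Python file I/O. This is only an analogy: it does not use DPDK (a C library), it still passes through the kernel, and the function name and chunk size are assumptions made for the example.

```python
def read_into_region(path, region, chunk=1 << 20):
    """Fill a preallocated, writable buffer from a file.

    Each readinto() call writes straight into a slice of the caller's
    buffer, so no intermediate bytes objects are created per read.
    Returns the number of bytes placed in the region.
    """
    view = memoryview(region)
    filled = 0
    # buffering=0 gives a raw FileIO object, avoiding Python's own buffer layer.
    with open(path, "rb", buffering=0) as f:
        while filled < len(view):
            n = f.readinto(view[filled:filled + chunk])
            if not n:  # 0 at end of file
                break
            filled += n
    return filled
```

A caller would allocate the region once, for example `region = bytearray(64 << 20)`, and reuse it across reads, which is the same design instinct behind a fixed memory pool, even though a kernel-bypass stack removes far more overhead than this does.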
The Intel and Aspera partnership has been valuable, but according to Shiflett, being part of IBM will add further capabilities. Although the acquisition is still recent and there is progress to be made (Shiflett says Aspera is still operating very much as a distinct company), the interesting benefit is that the new IBM ties allow for even closer cooperation with Intel. This means the opportunity to get even earlier access to pre-release hardware, but it also opens up a host of new tools to integrate with and develop on. These include GPFS (Aspera has already done extensive work with Lustre) and the IBM security package, GSKit. Aspera has already developed a FIPS-compliant offering on its own, but with IBM it will work on expanding its FIPS-compliant transfer protocol. “In addition to this, we are working with IBM and their compression software for higher speed compression that is compliant with older protocols and lets us do new things like adaptive compression, which lets us look at how many compute cycles are available before sending a block over the wire.”
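The adaptive compression Shiflett describes, deciding per block whether spare compute is available before compressing, might look something like the minimal sketch below. It is not IBM’s or Aspera’s implementation: it uses Python’s standard zlib module as a stand-in, and the `cpu_idle_fraction` input, the 25 percent threshold, and the one-byte `Z`/`R` flags are hypothetical choices made for illustration.

```python
import zlib


def encode_block(block, cpu_idle_fraction, threshold=0.25):
    """Return (flag, payload) for one block.

    Compress only when enough CPU is idle (hypothetical threshold) and
    compression actually shrinks the block; otherwise send it raw.
    """
    if cpu_idle_fraction >= threshold:
        packed = zlib.compress(block, 1)  # level 1: fastest setting
        if len(packed) < len(block):
            return b"Z", packed
    return b"R", block


def decode_block(flag, payload):
    """Reverse encode_block on the receiving side."""
    return zlib.decompress(payload) if flag == b"Z" else payload
```

With plenty of idle CPU, a repetitive block such as `b"a" * 10000` is sent compressed; on a busy sender the same block goes out untouched, trading bandwidth for compute cycles that are needed elsewhere, such as encryption.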
IBM and Aspera are also working with large companies like Netflix, which uses Aspera’s existing software to move content from its distribution points to elsewhere on its CDN almost instantly. More specifically, Netflix uses the software to handle 30 TB of new content each month from over 300 global partners and has been able to boost its transfer rates by 10x over approaches it took in the past, according to Aspera. In this case, as in other use cases the companies are experimenting with, the goal is to take advantage of changes in storage hardware. “Right now, we are limited to direct attached storage, as is the case with the NVM-Express based disks, but this can still apply to spinning disk. What is ahead is when that model changes, when it becomes a question of how many nodes you have and can those serve the client fast enough and with all the same transformations (i.e., encryption), send it over the network and have that stream rebuilt in a reliable way at a remote site.”
This applies to areas outside of large-scale content delivery shops like Netflix, Shiflett says, pointing to what Aspera hopes to demo for the supercomputing set in November at SC15. Aspera’s traditional market has been media and content providers, but the company is shifting its focus to the wider world of high performance computing, including large scientific and research networks. “Our goal is to move into general purpose HPC. If you look at what Caltech is doing in academia, they are showing the same 100 Gb/s transfer but they’re limited to going from memory to memory, but we are going disk to disk, which is a tougher problem, especially when you add encryption to that, which we do.” The company is working with the USDA, the Brazilian research networks, hospitals, and other institutions to move forward its next-generation approach to high-speed, secure transfer. In these cases, the target is not the historical market of content provider networks, but research data and collaborative efforts.