Any aspiring server processor architecture seems to have a new rite of passage, and that is to be deployed in the Packet bare metal cloud. The original 48-core “Thunder” X1 processors from Cavium (since acquired by Marvell) were added to instances at Packet way back in November 2016, and racks of servers using the 24-core variants of AMD’s “Naples” Epyc 7000s were rolled into its five datacenters in Parsippany (outside of New York City in New Jersey), San Jose, Dallas/Fort Worth, Amsterdam, and Tokyo in February 2018. Now, it is Ampere’s turn with its “Skylark” eMAG 8180 Arm processor.
Ampere dropped out of stealth mode back in February 2018 after tapping Renee James, who was in line to run Intel at one point, to take the helm of the chip upstart, which bought the server chip business from Arm server chip innovator Applied Micro with money supplied by The Carlyle Group, a private equity giant based in New York. The eMAG 8180 chip is based on the work that Applied Micro was doing on its third generation of X-Gene Arm server processors, which started sampling in March 2017 but which Ampere said needed some more work to make them hum. The eMAG 8180 has 32 cores, with what appears to be a base clock speed of 3 GHz and a turbo mode that has been revealed to rev up to 3.3 GHz. Like AMD’s Epycs, IBM’s Power9, and Marvell’s “Vulcan” ThunderX2 processors, the Skylark chip from Ampere has eight DDR4 memory channels per socket, which gives it 33 percent more peak memory bandwidth than the six-channel “Skylake” Xeon SPs that are current and the impending plain vanilla “Cascade Lake” Xeon SP follow-ons that are widely expected to be launched next week. (The Cascade Lake-AP variants will put two chips into a single socket to double up the memory controllers as well as cramming a lot more cores into a socket, but we are not counting those as competition until we see more about them.) The eMAG 8180 has 42 lanes of PCI-Express 3.0 connectivity coming off the peripheral bus (a little less than the 48 lanes that a single Xeon SP socket offers) and four direct SATA 3 ports for storage, all packed into a 125 watt thermal envelope.
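The 33 percent figure falls straight out of the channel counts, and a rough sketch of the arithmetic is below. It assumes DDR4-2666 memory on both platforms and a 64-bit data path per channel; those speeds are our assumption for illustration rather than something either vendor confirmed here, and the function names are made up. These are peak theoretical numbers, not measured bandwidth.

```python
# Peak theoretical DDR4 bandwidth per socket = channels x transfer rate x bus width.
# ASSUMPTION: both platforms run DDR4-2666 with an 8-byte (64-bit) data bus per channel.
MEGATRANSFERS_PER_SEC = 2666
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(channels, mts=MEGATRANSFERS_PER_SEC):
    """Return peak theoretical bandwidth for one socket in GB/sec (decimal)."""
    return channels * mts * BYTES_PER_TRANSFER / 1000


def percent_advantage(channels_a, channels_b):
    """Return how much more peak bandwidth socket A has than socket B, in percent."""
    return (peak_bandwidth_gbs(channels_a) / peak_bandwidth_gbs(channels_b) - 1) * 100


if __name__ == "__main__":
    eight = peak_bandwidth_gbs(8)  # eMAG 8180, Epyc, Power9, ThunderX2
    six = peak_bandwidth_gbs(6)    # Skylake and Cascade Lake Xeon SP
    print(f"8 channels: {eight:.1f} GB/sec")
    print(f"6 channels: {six:.1f} GB/sec")
    print(f"advantage:  {percent_advantage(8, 6):.1f} percent")
```

Because the memory speed cancels out of the ratio, the percentage holds for any DDR4 speed so long as both sockets run the same one.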
As we discussed last September when the Skylark chip first became available, the top bin eMAG 8180 chip with all 32 cores had its price dropped to $850 (down from $950), and the middle bin part with 16 cores had its price set to $550. These pricing moves and the sales pitch are being steered by Matthew Taylor, who managed the relationships that Intel had with Cisco Systems during its early years of creating and ramping its UCS blade server line and with Amazon Web Services between 2011 and 2015; Taylor also headed up marketing for Qualcomm’s ill-fated “Amberwing” Centriq 2400 Arm server chip effort. Now Taylor is senior vice president of worldwide sales and business development at Ampere, and he wants to chip away at Intel’s hegemony in the cloud and hyperscale datacenters.
According to Taylor, the eMAG 8180 started shipping to server OEM Lenovo and server ODM Mitac at the end of 2018, and it is in proofs of concept at a number of large companies right now based on these system designs. Adding bare metal instances on the Packet cloud will help spur the PoC tire kicking, since companies will not have to invest in whole servers but just time slices on them to give their code a whirl.
“We have been in the Arm developer community for a long time, which has been great for us, and this is just the next chapter in that,” Jacob Smith, cofounder with his brother Zachary Smith of Packet, tells The Next Platform. “We are pretty excited and have a lot of pent up demand to fulfill. We have sold a lot of the inventory that we brought in based on Ampere, so we are suffering a little from our success there.”
At the moment, the eMAG instances are available only in Packet’s Sunnyvale, California datacenter (in the San Jose area). FYI: Packet names its datacenters after the closest airport to them, so that one is called SJC1, and the one in Parsippany, New Jersey is called EWR1 after Newark Airport.
Packet may be on the leading edge of bare metal serving, but it probably has on the order of 10,000 servers across its datacenters; Smith says that the company buys in lots of hundreds and has rolled blocks of that size each into its five core datacenters. One particularly keen user of the Ampere machines is called Hatch, which runs online Android games in emulation mode on its Intel Xeon and AMD Epyc iron but can now run Android slices natively on the eMAG servers. Another early adopter is using the eMAG machines for developing IoT applications that will eventually be deployed at the edge on Arm-based iron, according to Smith.
According to Taylor, the early adopters of eMAG are focused on core infrastructure, such as Web hosting and content hosting as well as on new-fangled databases and datastores such as Cassandra and MongoDB. All of this software runs on a Linux stack that has been ported and – importantly – tuned and tested to run on the Armv8 architecture, and it just works at this point.
Smith says that Packet is not trying to live out there on the bleeding edge, but sees itself more as a channel for any particular technology that customers actually want to deploy. This is how and why it has tapped the Skylark chip from Ampere and not the Vulcan ThunderX2 processor from Marvell for its upgrade to Arm infrastructure on its bare metal cloud. This is what customers want to take a look at. (It is important to not read too much into that. ThunderX2 servers have been reasonably widely available for quite some time now.) But without a doubt, there are some synergies going on between Packet and Ampere, Smith says, in that they are both relatively small companies with respect to the size of the ecosystems they play in – Marvell is one of the largest chip makers in the world, by contrast – and both can focus in tightly on a set of customers and learn from them fast.
There absolutely was a bakeoff for the second generation of Arm instances at Packet, by the way. “We actually did pick Ampere,” Smith explains. “We were actually closely involved with Qualcomm and its Centriq 2400, and we all know how that went. To tell you the truth, there are not a lot of different things to buy, but we chose the Ampere chip because it was a good fit for the broadest set of workloads. When it comes to ThunderX2, we think it is a great focus, but Marvell’s focus is mainly on the HPC sector and they have been a little less engaged in a relationship with them. But we are interested in the ThunderX roadmap as well. But for this rollout, we definitely chose Ampere.”
Packet is deploying a single-socket Ampere eMAG 8180 server (base speed of around 3 GHz is our guess) that has 128 GB of memory, 480 GB of SSD local storage, and a dual-port 10 Gb/sec ConnectX network interface card from Mellanox Technologies that has the ports bonded to deliver 20 Gb/sec of aggregate bandwidth. This sells for $1 per hour. That is the same price that Packet is charging for a single-socket AMD Epyc 7401P server, which has 24 cores running at 2.2 GHz, 64 GB of main memory, 960 GB of SSD, and the same 20 Gb/sec connectivity. The old dual-socket ThunderX machines, which have 96 cores running at 2 GHz plus 128 GB of memory, 250 GB of SSD, and 20 Gb/sec of connectivity, cost 50 cents per hour at this point. The closest Xeon setup that Packet offers to the eMAG 8180 and Epyc 7401P setups is a two-socket system with a pair of Xeon SP-5120 Gold processors, for a total of 28 cores running at 2.2 GHz, with 384 GB of memory, a pair of 240 GB SSDs, 3.8 TB of NVM-Express flash, and 40 Gb/sec of networking. This is obviously a much heftier machine, which is why it costs $2.25 per hour. It would be interesting to see what heftier two-socket Epyc and eMAG servers might look like and cost, and what skinnier one-socket Xeon SPs might cost, so a direct comparison on the Packet bare metal cloud could be made.
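For a rough sense of how those hourly prices compare, the sketch below normalizes each configuration by core count and by memory capacity. The prices, core counts, and memory sizes are the ones quoted above; the Xeon machine is counted as 28 cores in total across its two sockets, which is our reading of that configuration, and the dictionary keys and function names are made up for illustration. Raw core counts ignore per-core performance, clock speed, storage, and networking, so treat this as a first-pass normalization and not a benchmark.

```python
# Hourly prices and specs as quoted in the text above.
# ASSUMPTION: the two-socket Xeon Gold 5120 system is counted as 28 cores in total.
INSTANCES = {
    "eMAG 8180 (1 socket)": {"cores": 32, "memory_gb": 128, "usd_per_hour": 1.00},
    "Epyc 7401P (1 socket)": {"cores": 24, "memory_gb": 64, "usd_per_hour": 1.00},
    "ThunderX (2 sockets)": {"cores": 96, "memory_gb": 128, "usd_per_hour": 0.50},
    "Xeon Gold 5120 (2 sockets)": {"cores": 28, "memory_gb": 384, "usd_per_hour": 2.25},
}


def cents_per_core_hour(spec):
    """Hourly price divided by core count, in US cents."""
    return spec["usd_per_hour"] * 100 / spec["cores"]


def cents_per_gb_hour(spec):
    """Hourly price divided by main memory capacity in GB, in US cents."""
    return spec["usd_per_hour"] * 100 / spec["memory_gb"]


if __name__ == "__main__":
    print(f"{'instance':<28}{'cents/core-hr':>15}{'cents/GB-hr':>13}")
    for name, spec in INSTANCES.items():
        print(f"{name:<28}{cents_per_core_hour(spec):>15.2f}{cents_per_gb_hour(spec):>13.2f}")
```

On these numbers the eMAG instance comes in at a little over 3 cents per core-hour against a little over 4 cents for the Epyc instance at the same $1 price, while the old ThunderX machines are far cheaper per core and the Xeon machine is cheaper per gigabyte of memory than either $1 instance.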
Maybe we will see that with the next generation of eMAG processors, code-named “Quicksilver,” which is slated to sample and then move into production later this year, based on the 7 nanometer chip etching techniques from Taiwan Semiconductor Manufacturing Corp. The current eMAG 8180 is etched using TSMC’s 16 nanometer processes. By the time Quicksilver is shipping, Intel will have “Cooper Lake” Xeon SPs in the field and AMD will have had “Rome” Epycs out there for about six months or so.