After years of false starts and delays with various products, we are finally at a point where Intel will truly start to test the breadth of its heterogeneous computing strategy, thanks to the release of its new Gaudi2 machine learning chips and the upcoming launch of its much-anticipated “Ponte Vecchio” GPU that will power Argonne National Laboratory’s “Aurora” exascale supercomputer.
But if Intel hopes to compete with the growth engine that is Nvidia’s accelerated computing business and prove that these new products won’t fade away like previous accelerator efforts, it will need to win over an untold number of developers and show that it has a better programming model than CUDA, Nvidia’s proprietary parallel programming environment that has allowed the GPU giant to claim a massive chunk of the accelerated computing market for itself.
Enter oneAPI, Intel’s cross-platform parallel programming model that aims to simplify development across a broad range of compute engines: CPUs, GPUs, FPGAs, and other kinds of processors, even those from competitors. The effort, which Intel built as an open standard, comes with a variety of toolkits serving applications that are important to readers of The Next Platform, such as high performance computing, deep learning, and AI analytics, as well as other areas like rendering and the Internet of Things.
There are many things Intel needs to get right with oneAPI if the chipmaker wants its multi-pronged compute strategy to work the same kind of magic Nvidia has conjured with CUDA, and the first thing it needs is an easy way for developers to port code from CUDA.
Intel upped its CUDA migration efforts this month by open sourcing the technologies powering the Intel DPC++ Compatibility Tool, which is used for moving code from CUDA to oneAPI’s Data Parallel C++ language. But rather than herding developers into oneAPI, the new open source tool, called SYCLomatic, focuses on simply helping move that code to SYCL, the royalty-free, cross-architecture programming abstraction layer that underpins Intel’s parallel-friendly C++ implementation.
Knowing full well that Nvidia built its empire with a walled-garden approach, Intel is positioning SYCLomatic as a community-minded effort that will build support for SYCL faster and free developers “from a single-vendor proprietary ecosystem.” The company says it will accomplish this by giving them an “easier, shorter pathway to enabling hardware choice.”
What Intel means by this is that SYCLomatic, which is hosted on GitHub with an Apache 2.0 license, can automatically port up to 95 percent of CUDA code before developers have to step in and make manual adjustments so that applications run optimally on their architecture of choice.
One organization that plans to take advantage of SYCLomatic is none other than Argonne National Laboratory, which expects to use the open source tool to cover gaps in the Intel DPC++ Compatibility Tool for preparing the in-development CRK-HACC cosmological simulation code for the Aurora supercomputer, which is set to go online later this year.
“To prepare for Aurora, the Intel DPC++ Compatibility Tool allowed us to quickly migrate over 20 kernels to SYCL,” Esteban Rangel, a computer scientist at Argonne, explains. “Since the current version of the code migration tool does not support migration to functors, we wrote a simple clang tool to refactor the resulting SYCL source code to meet our needs. With the open-source SYCLomatic project, we plan to integrate our previous work for a more robust solution and contribute to making functors part of the available migration options.”
Intel may have a long road ahead in winning developer mindshare from Nvidia, but at least it’s showing there is another way to build a software ecosystem for our brave, new compute world. Or more precisely, it is doing exactly what AMD has been doing with the HIP tool for its ROCm environment, which ports CUDA code to its own Instinct GPU accelerators.