When Jim Keller talks about compute engines, you listen. And when Keller name-drops a programming language and AI runtime environment, as he did in a recent interview with us, you do a little research and you keep an eye out for developments.
The name Keller dropped was Chris Lattner, one of the co-founders of a company called Modular AI, which has just released a software development kit for a new programming language called Mojo for Linux platforms. Lattner is probably one of the most important people in compilers since Dennis Ritchie created the C programming language at AT&T Bell Labs in the early 1970s for the original Unix. Ken Thompson created its predecessor, B, also while at Bell Labs, and interestingly now works for Google – which we might even call Ma Search – where he co-created the Go language for system-level programming (Kubernetes is written in it), a language that is, like many, many others, a derivative of C.
Every new domain has its programming languages that evolve with that domain, and every platform has its preferences, as does every programmer. And thus the cornucopia of programming languages has evolved in that Cambrian explosion manner. This magnificent programming language genealogy tree from Wikipedia, created by Maximilian Dörrbecker – it ends in 2008, so it is missing a bunch of stuff – gives a pretty good overview of the first couple of decades of the dominant programming languages beyond plug boards and assembler and of how they relate to each other – or don’t. Love this:
For many of us of a certain age, the dawn of the PC era meant that BASIC was transformative in our lives because it gave us the power of programming at a price that we could afford. But no one would call it a great language for creating complex, modern applications. Even in 1985. Many of us learned FORTRAN (now less shouty as Fortran) and COBOL and got a taste of other esoteric languages – ours were Pascal and Smalltalk. We talked about why Fortran persists and its skills gap problem recently.
Java was created because C and C++ were too damned hard – and unnecessarily so, in a macho kind of way – for programmers in the enterprise who just wanted to craft business logic and not have to worry about system architecture. They wanted the ease of COBOL, RPG, and similar high-level languages, and they wanted the complexity of the underlying hardware – things you really need to understand with C and C++ – hidden underneath a virtual machine that interprets the language, allows a certain degree of compilation for performance, and ensures maximum platform portability. While some have joked that Java has the portability of C++ and the readability of assembly, it was an obvious improvement, which is why Java took off in the enterprise in application areas where performance, in an absolute sense, was not as much of a concern as programmer productivity and quick turnaround on code. We had so many clock cycles and threads laying around, and you know the server makers were happy to sell machines to run this interpreted language rather than a strictly compiled one that ran like a bat out of hell on cheap iron. Still, when it comes to performance, a lot of modules of code within a Java stack are still written in C or C++.
Making such statements, we are bracing for – and absolutely expecting – contrarian opinions. There is nothing like talking about programming languages to bring out the prides and prejudices. Which is fun and fine. But make no mistake, and don’t just listen to all of the AI washing that Modular AI is doing, which helped it raise $100 million in venture funding two weeks ago. Lattner and his Modular AI co-founder, Tim Davis, don’t just want to fix AI programming. They want to fix programming. And if there are two people who can do that – in the way that Dennis Ritchie did with C, James Gosling did with Java, and Rasmus Lerdorf did with PHP, originally short for Personal Home Page – Lattner and Davis are those two.
The Lovechild of C++ And Python
Here in the third decade of the 21st century, if you asked programmers what language a newbie should learn, we think most of them would say Python, which is a language created by Guido van Rossum more than three decades ago to make it easier for programmers to do neat things.
Pythonic languages (and there are many) are not in love with their own complexity, just like languages such as BASIC, RPG, and COBOL are not. Other languages are definitely in love with their own complexity – and they are macho about it. But in this era where software makes up by far the dominant portion of the overall IT budget, we don’t have time for this.
If you put a gun to the heads of most programmers and asked the same question – what programming language should you learn? – they would probably say C or C++ because they are universally applicable and ultimately more useful in terms of portability and performance, for both the code and their careers. In the corporate enterprise, the programmers would no doubt say Java, which is C++ playing on idealized hardware in a sense.
Lattner and Davis think this is a false dichotomy, and with Mojo they are creating a superset of Python that can do all of the same high-level stuff – and can even absorb and run literal Python code in its runtime – while at the same time offering the kind of low-level performance programming that C and C++ provide and that has given those languages such longevity in the market. (Maybe we can think of it as the lovechild of C++ and Python.) The Mojo runtime can run CPython code, but it can do its own stuff, too. And this is the key.
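To make the superset idea concrete, here is a sketch in plain Python – the function name and values are our own invention, not Modular AI sample code. Per the superset claim, dynamically typed code like this is meant to run unchanged under Mojo’s runtime, with Mojo’s own typed declarations available when you want compiled speed:

```python
# Plain CPython code: per Modular AI's superset claim, code like this
# should run unchanged under Mojo. (Illustrative sketch only.)
def dot(a, b):
    # Dynamically typed, as in all Python: works on any numeric sequences.
    total = 0
    for x, y in zip(a, b):
        total += x * y
    return total

# In Mojo proper, the same logic could be declared with `fn` and typed
# arguments so the compiler can emit fast machine code. That variant is
# Mojo syntax, not Python, so we show it only as a comment:
#
#   fn dot(a: List[Float64], b: List[Float64]) -> Float64: ...

print(dot([1, 2, 3], [4, 5, 6]))  # 32
```

The point is the migration path: you start with the untyped Python version, then opt in to Mojo’s static types and compilation for the hot loops.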
The pedigree of Mojo’s creators is impeccable. Lattner got a bachelor’s degree in computer science at the University of Portland and was for a while a developer on the Dynix/ptx Unix variant for the big X86 NUMA boxes from Sequent Computer Systems. After the Dot-Com Boom, Lattner got his master’s and PhD degrees from the University of Illinois at Urbana-Champaign, where he created the Low Level Virtual Machine project with his advisor, Vikram Adve. LLVM is, of course, integral to most compilers today. It is a set of compiler tools built around a language-independent intermediate representation: any high-level programming language can be lowered into that representation on the front end, and from it code can be compiled down to any instruction set for any kind of device on the back end. The important thing about LLVM, according to Lattner, is that it is modular and extensible through APIs – the LLVM toolchain is like the non-monolithic code that programmers are trying to create, which means they can tweak it to do all sorts of things.
While at Apple, Lattner drove the development of the Clang compiler front end and the Swift language, then did a brief stint at Tesla as vice president in charge of its Autopilot software before moving to Google to be the senior director for the TensorFlow AI framework that Google had open sourced a few years prior.
It was at Google that Lattner met Davis, who has a bachelor’s degree in chemical engineering from the University of Melbourne and a law degree from Monash University in Australia as well as studying computer science at Stanford University. Davis founded his own machine learning startup, called CrowdSend, in 2011 and did the engineering for another startup called Fluc in 2013, where he spent three years before becoming a product manager in the Google advertising business in 2016. Davis joined the Google Brain team to work on TensorFlow in 2018 and was one of the creators of the TensorFlow Lite variant, and eventually became the group product leader for machine learning at the company, leading the development of machine learning APIs, compilers, and runtimes.
In September 2019, both Lattner and Davis were the tag team distinguished engineer and product manager for TensorFlow when the Multi Level Intermediate Representation, or MLIR, compiler infrastructure for heterogeneous compute engines was contributed to the non-profit LLVM Foundation, for which Lattner is the benevolent dictator for life. The two left Google in January 2022 to start Modular AI.
Modularity Is The Key
The name Modular AI is important – the Modular part more than the AI part, except for the fact that AI is a great place to start a new programming language like Mojo. There are so many different ways to get code onto so many different kinds of compute engines that it is enough to make your stomach churn, and there is a modicum of lock-in, intended (as with Nvidia’s CUDA stack) or not (as with AMD’s ROCm and Intel’s OneAPI).
“The tools used to deploy AI models today are strikingly similar to compilers and tools in the 1990s and 2000s,” wrote Lattner and Davis when Modular AI was unveiled back in April 2022. “We see severe fragmentation across these systems, with a wide variety of hardware, each having bespoke tools. The world’s biggest tech companies have built multiple in-house toolchains specific to different hardware products over the years, and these are often incompatible and share little code. How many flaky converters and translators does one industry really need?”
One should do. Ahem.
What got us writing about Mojo was the fact that the first software development kit, for Linux of course, was released yesterday, with SDKs for the Windows and MacOS/iOS platforms coming down the pike.
Mojo itself was unveiled in May of this year, and over 120,000 developers have come and kicked the tires on this “new” language, with over 19,000 of them chatting away about it on Discord and GitHub. It is illustrative to hear directly from Lattner and Davis why they created Mojo, so you should read that manifesto.
One of the big differences between Python and Mojo is that Mojo is multithreaded and can run across multiple cores, and because of this, Modular AI can show a 35,000X speedup calculating Mandelbrot sets compared to the CPython runtime in Python 3. That is a much bigger performance increase than using the PyPy variant of Python, which has a just-in-time compiler, and it even beats the tar out of moving the code to C++, which is neat:
To our eyes, this just shows how interpreted languages should never be left uncompiled, given the tremendous need for compute and the flattening of cost curves as the Moore’s Law reductions in the cost of transistors on compute engines get creaky.
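For those who have not played with it, the Mandelbrot kernel being benchmarked boils down to iterating z = z² + c and counting how long each point takes to escape. A minimal, single-threaded CPython baseline – our own illustrative sketch, not Modular AI’s benchmark code – looks like this:

```python
def mandelbrot_iters(c: complex, max_iters: int = 200) -> int:
    """Count iterations of z = z*z + c before |z| exceeds 2.

    Points that never escape within max_iters are treated as members
    of the Mandelbrot set. (Illustrative baseline, not benchmark code.)
    """
    z = 0 + 0j
    for i in range(max_iters):
        z = z * z + c
        if abs(z) > 2:
            return i
    return max_iters

# An interior point never escapes; a far-exterior point escapes at once.
print(mandelbrot_iters(0 + 0j))  # 200 (in the set)
print(mandelbrot_iters(2 + 2j))  # 0 (escapes immediately)
```

Every pixel in the image is an independent run of this loop, which is exactly why a compiler that can vectorize the arithmetic and fan the pixels out across cores – as Mojo claims to – can beat the single-threaded CPython interpreter by such a wide margin.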
To show that the Mojo pudding has some proof, Modular AI created its own Modular Inference Engine, which can take trained AI models from the TensorFlow and PyTorch machine learning frameworks, the JAX numerical function transformer, and the XGBoost gradient boosting library, accelerate the heck out of them, and then output transformed and accelerated inference models for compute engines based on Intel and AMD X86, various Arm, and various RISC-V CPUs as well as for GPUs from Nvidia, AMD, and Intel.
Here is the speedup of the Modular Inference Engine compared to TensorFlow and PyTorch on various Amazon Web Services instances based on Intel Xeon SP, AMD Epyc, and Graviton2 processors: