Concept: San Francisco-based OpenAI has released Triton, an open-source, Python-like programming language that lets researchers write efficient GPU code for AI workloads. OpenAI says Triton makes it feasible to reach peak hardware performance with minimal effort, producing code on par with what an expert could write in as few as 25 lines.
Nature of Disruption: According to OpenAI, Triton enables the construction of specialized kernels that are significantly faster than those in general-purpose libraries. Its compiler automatically simplifies, optimizes, and parallelizes the code, converting it into code that runs on current Nvidia GPUs. CPUs, AMD GPUs, and platforms other than Linux are not supported at this time. One challenge posed by Triton's programming model is work scheduling: how the work done by each program instance should be partitioned for efficient execution on modern GPUs. To address this, the Triton compiler makes extensive use of block-level data-flow analysis, a technique for statically scheduling iteration blocks based on the target program's control- and data-flow structure. The result is a system in which the compiler can perform a wide range of optimizations on its own.
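The idea of splitting work across program instances can be illustrated with a minimal sketch. The code below is plain Python that only simulates the model; it is not Triton's actual API and does not run on a GPU. The names `add_kernel`, `launch`, and `BLOCK_SIZE` are hypothetical and chosen for illustration. Each simulated program instance handles one fixed-size block of a vector addition and skips offsets that fall past the end of the data.

```python
# Plain-Python simulation of a block-based kernel; NOT Triton's actual API.
BLOCK_SIZE = 4  # hypothetical block size, chosen only for illustration


def add_kernel(pid, x, y, out, n):
    """One simulated program instance: add one block of x and y into out."""
    start = pid * BLOCK_SIZE
    for offset in range(start, start + BLOCK_SIZE):
        if offset < n:  # mask offsets that run past the end of the data
            out[offset] = x[offset] + y[offset]


def launch(x, y):
    """Run one program instance per block (sequentially here, for clarity)."""
    n = len(x)
    out = [0] * n
    num_programs = (n + BLOCK_SIZE - 1) // BLOCK_SIZE  # ceiling division
    for pid in range(num_programs):
        add_kernel(pid, x, y, out, n)
    return out


if __name__ == "__main__":
    print(launch(list(range(10)), [1] * 10))
```

In this sketch the instances run one after another. On a GPU they would run in parallel, and deciding how each block's work maps onto the hardware is the scheduling question the paragraph above describes.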
Outlook: The Triton language was first proposed in a paper presented at the International Workshop on Machine Learning and Programming Languages in 2019. The project’s GitHub repository contains the first stable version of Triton as well as tutorials.