[LANGUAGE REQUEST]: Triton #5530

Open
siboehm opened this issue Sep 28, 2023 · 1 comment · May be fixed by #5531
Labels
new-language request Request for something

Comments

siboehm commented Sep 28, 2023

Language name

Triton

Language version

No response

Language homepage

https://triton-lang.org/main/index.html

Compiler homepage

https://github.com/openai/triton

Compiler version

v2.0

Motivation

Triton is an MLIR-based JIT compiler that compiles Python programs for accelerators. It's already pretty popular for writing GPU kernels that would otherwise be written in CUDA, and more hardware vendors are implementing backends for it. It's interesting as a language because it raises the abstraction level, from operating on scalars (CUDA) to operating on blocks of values.

(I hope classifying this as a new language is ok. It's not really just another library for Python, since CE will need very different disassembly steps + device code display.)

siboehm added the new-language and request labels on Sep 28, 2023

siboehm commented Sep 28, 2023

I started working on this yesterday; here are some notes for posterity.

Adding Triton as a JIT is not a good fit for CE, as we want to be able to target different NVIDIA microarchitectures. (Interestingly, the CE machines have GPUs, T40s IIRC.)

Triton recently added an AOT feature, which is this script.

  1. It currently does not allow you to specify the microarch version (it doesn't pass cc to compile()).
  2. It takes a Python file that defines the kernel and emits a .h and a .c file. The compiled kernel itself is hex-encoded into a C char array in the .c file.
  3. The .h and .c files can be compiled using a standard C compiler.
  4. Unfortunately, the resulting binary is not a cubin, so using nvdisasm / cuobjdump does not work for getting at the SASS & PTX. We'd have to first extract the cubin from the char array.
  5. The JIT, by contrast, does emit proper cubin files, plus IR files for some of the intermediate dialects.

It seems hacky for CE to extract the cubin file from the char array, dump it to disk and then run nvdisasm on it. I haven't come up with a better idea yet though.
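For reference, the extraction step described above could look roughly like the sketch below. It assumes the generated .c file stores the kernel as a C array of hex byte literals such as `unsigned char kernel_cubin[4] = { 0x7f, 0x45, ... };`. The array name, the exact declaration, and the helper names `extract_embedded_binary` and `dump_cubin` are hypothetical and would need checking against what the AOT script really emits.

```python
import re
from pathlib import Path
from typing import Tuple

# Assumption: the kernel is embedded as `... char NAME[N] = { 0x.., 0x.., ... };`.
# The real declaration emitted by the AOT script may differ.
ARRAY_RE = re.compile(
    r"char\s+(?P<name>\w+)\s*\[\s*\d*\s*\]\s*=\s*\{(?P<body>[^}]*)\}",
    re.DOTALL,
)
HEX_BYTE_RE = re.compile(r"0[xX]([0-9a-fA-F]{1,2})\b")


def extract_embedded_binary(c_source: str) -> Tuple[str, bytes]:
    """Return (array name, raw bytes) for the first hex char array found."""
    match = ARRAY_RE.search(c_source)
    if match is None:
        raise ValueError("no hex-encoded char array found")
    # Convert every 0xNN literal inside the braces into one byte.
    data = bytes(int(h, 16) for h in HEX_BYTE_RE.findall(match.group("body")))
    return match.group("name"), data


def dump_cubin(c_file: Path, out_file: Path) -> int:
    """Write the embedded binary to out_file and return its size in bytes."""
    _, data = extract_embedded_binary(c_file.read_text())
    out_file.write_bytes(data)
    return len(data)
```

Once the bytes are on disk (say as `kernel.cubin`), nvdisasm or cuobjdump could be pointed at that file instead of at the compiled C binary. This is still the "dump to disk, then disassemble" approach, so it does not remove the hackiness, it only shows how small the extraction step would be.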

siboehm linked a pull request on Sep 29, 2023 that will close this issue