BarraCUDA Open-source CUDA compiler targeting AMD GPUs

Repository: Zaneham/BarraCUDA on GitHub. License: Apache-2.0.

An open-source CUDA compiler that targets AMD GPUs, with more architectures planned. Written in 15,000 lines of C99. Zero LLVM dependency. Compiles .cu files straight to GFX11 machine code and spits out ELF .hsaco binaries that AMD GPUs can actually run.

This is what happens when you look at NVIDIA’s walled garden and think “how hard can it be?” The answer is: quite hard, actually, but I did it anyway.

What It Does

Takes CUDA C source code, the same .cu files you’d feed to nvcc, and compiles them to AMD RDNA 3 (gfx1100) binaries. No LLVM. No HIP translation layer. No “convert your CUDA to something else first.” Just a lexer, a parser, an IR, and roughly 1,700 lines of hand-written instruction selection that would make a compiler textbook weep.
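To give a feel for the first stage named above, here is a minimal lexer sketch in C99. It is not BarraCUDA's actual code: the names `Token`, `TokKind`, `next_token`, and `count_tokens` are hypothetical, and the token set is cut down to identifiers, integer literals, single-character punctuation, and a handful of CUDA qualifiers such as `__global__`. A real CUDA front end also has to handle comments, string and floating-point literals, multi-character operators, and the preprocessor.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds for this sketch; not taken from BarraCUDA. */
typedef enum { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_CUDA_QUAL, TOK_PUNCT } TokKind;

typedef struct {
    TokKind kind;
    const char *start; /* points into the source buffer, not NUL-terminated */
    size_t len;
} Token;

/* CUDA qualifiers that this sketch treats as keywords. */
static const char *const cuda_quals[] = {
    "__global__", "__device__", "__host__", "__shared__", "__constant__"
};

/* Return 1 if the len-byte span at s is one of the qualifiers above. */
int is_cuda_qual(const char *s, size_t len) {
    for (size_t i = 0; i < sizeof cuda_quals / sizeof cuda_quals[0]; i++) {
        if (strlen(cuda_quals[i]) == len && memcmp(cuda_quals[i], s, len) == 0)
            return 1;
    }
    return 0;
}

/* Scan one token starting at *cursor and advance the cursor past it. */
Token next_token(const char **cursor) {
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;
    while (isspace((unsigned char)*p))
        p++;

    Token t = { TOK_EOF, p, 0 };
    if (*p == '\0') {
        *cursor = p;
        return t;
    }

    if (isalpha((unsigned char)*p) || *p == '_') {
        /* Identifier or CUDA qualifier: [A-Za-z_][A-Za-z0-9_]* */
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        t.len = (size_t)(p - t.start);
        t.kind = is_cuda_qual(t.start, t.len) ? TOK_CUDA_QUAL : TOK_IDENT;
    } else if (isdigit((unsigned char)*p)) {
        /* Decimal integer literal only; suffixes and floats are out of scope. */
        while (isdigit((unsigned char)*p))
            p++;
        t.len = (size_t)(p - t.start);
        t.kind = TOK_NUMBER;
    } else {
        /* Anything else is emitted as a one-character punctuation token. */
        p++;
        t.len = 1;
        t.kind = TOK_PUNCT;
    }
    *cursor = p;
    return t;
}

/* Count the tokens of one kind in a NUL-terminated source string. */
size_t count_tokens(const char *src, TokKind kind) {
    size_t n = 0;
    for (Token t = next_token(&src); t.kind != TOK_EOF; t = next_token(&src)) {
        if (t.kind == kind)
            n++;
    }
    return n;
}
```

Calling `next_token` repeatedly on a kernel signature such as `__global__ void add(float *a, int n)` yields a `TOK_CUDA_QUAL` first and then ordinary identifiers and punctuation. Tokens point back into the source buffer instead of copying text, which keeps the sketch allocation-free. Whether BarraCUDA does the same is not stated in the README.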
┌──────────────────────────────────────────────────────────────┐
│                      BarraCUDA Pipeline                      │
├──────────────────────────────────────────────────────────────┤
│  Source (.cu)                                                │
│       ↓                                                      │
│  Preprocessor → #include, #define, macros, conditionals      │
│       ↓                                                      │
│  Lexer → Tokens                                              │
│       ↓                                                      │
│  Parser (Recursive Descent) → AST                            │
│       ↓                                                      │
│  Semantic Analysis → Type checking, scope resolution         │
│       ↓                                                      │
│  BIR (BarraCUDA IR) → SSA form, typed instructions           │
│       ↓                                                      │
│  mem2reg → Promotes allocas to SSA registers                 │
│       ↓                                                      │
│  Instruction Selection → AMDGPU machine instructions         │
│       ↓                                                      │
│  Register Allocation → VGPR/SGPR assignment                  │
│       ↓                                                      │
│  Binary Encoding → GFX11 instruction words                   │
│       ↓                                                      │
│  ELF Emission → .hsaco ready for the GPU                     │
│       ↓                                                      │
│  Your kernel runs on ya silicon                              │
└──────────────────────────────────────────────────────────────┘

Every single encoding has been validated against llvm-objdump with zero decode failures. I didn’t use LLVM to compile, but I did use it to check my homework.

Building

# It’s C99. It builds with gcc. There are no dependencies.
make
# That’s it. No cmake. No autoconf. No 47-step build process.
# If this doesn’t work, your gcc is bro

Source: Hacker News | Original Link