5 changes: 5 additions & 0 deletions .gitignore
@@ -7,6 +7,11 @@ __pycache__/
*.pyo
*.so
*.a
*.obj
*.lib
*.exp
*.dll
*.pyd

# Wheel files
*.whl
4 changes: 3 additions & 1 deletion CMakeLists.txt
@@ -50,7 +50,9 @@ set(PIP_INSTALL_ENVS) # Pip install environmental vars
if(DEFINED PIP_INSTALL_TARGET)
  list(APPEND PIP_INSTALL_ENVS "PIP_INSTALL_TARGET=${PIP_INSTALL_TARGET}")
endif()
-list(APPEND PIP_INSTALL_ENVS "TRITON_BUILD_WITH_CLANG_LLD=true")
+if(NOT WIN32)
+  list(APPEND PIP_INSTALL_ENVS "TRITON_BUILD_WITH_CLANG_LLD=true")
+endif()

add_custom_target(triton-build ALL
COMMAND ${PIP_INSTALL_ENVS}
112 changes: 112 additions & 0 deletions README.md
@@ -119,3 +119,115 @@ AIR_TRANSFORM_TILING_SCRIPT=transform_aie2.mlir python matmul_bf16_m64_n64_k64.p
```

**Note:** The `transform_aie2.mlir` transform dialect IR is specifically designed for the AIE2 architecture. For AIE2P architecture, use `transform_aie2p.mlir` instead.

## Windows Support

Native Windows builds are supported using MSVC — no WSL or Linux required. The full
compilation pipeline (Triton → MLIR → xclbin → XRT dispatch) runs natively on Windows.

### Windows Requirements

- **Windows 10/11** (x64)
- **Visual Studio 2022** with "Desktop development with C++" workload
- **Python 3.12+**
- **CMake 3.20+** and **Ninja** (via pip or standalone)
- **AMD NPU driver** (installs `xrt_coreutil.dll` runtime)

### Windows Quick Start

```powershell
git clone https://github.com/amd/Triton-XDNA.git
cd Triton-XDNA
git submodule update --init

python -m venv venv
.\venv\Scripts\activate
pip install --upgrade pip setuptools wheel
```

Prepare XRT development files (headers, import library, xclbinutil). Download
`xrt_windows_sdk.zip` from [Xilinx/XRT releases](https://github.com/Xilinx/XRT/releases)
and extract the `xrt/` directory to `C:\Program Files\AMD\xrt`:

```powershell
# The xrt/ folder inside the zip should end up at:
# C:\Program Files\AMD\xrt\include\xrt\xrt_bo.h
# C:\Program Files\AMD\xrt\lib\xrt_coreutil.lib
```
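
Before starting the build, the layout described above can be sanity-checked with a short script. This is a minimal sketch and not part of the repository: the helper name `missing_xrt_files` is hypothetical, and the only paths it checks are the two listed above.

```python
from pathlib import Path

# Files the README expects after extracting the SDK, relative to the XRT root.
EXPECTED_XRT_FILES = (
    Path("include") / "xrt" / "xrt_bo.h",
    Path("lib") / "xrt_coreutil.lib",
)


def missing_xrt_files(xrt_root):
    """Hypothetical helper: list the expected SDK files absent under xrt_root."""
    root = Path(xrt_root)
    return [str(rel) for rel in EXPECTED_XRT_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_xrt_files(r"C:\Program Files\AMD\xrt")
    if missing:
        print("Missing XRT SDK files:", ", ".join(missing))
    else:
        print("XRT SDK layout looks complete.")
```

An empty list means both files are where the build expects them. Anything else usually points to the zip having been extracted one directory level too deep or too shallow.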

Run the automated build:

```powershell
.\utils\build_windows.ps1
```

This installs pre-built wheels (triton-windows, mlir-aie, llvm-aie), builds mlir-air
from source, and installs the Triton-XDNA backend. The whole process takes approximately 30–60 minutes.

### Windows Manual Build

```powershell
pip install cmake ninja lit numpy PyYAML nanobind scipy
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install triton-windows
pip install mlir-aie -f https://github.com/Xilinx/mlir-aie/releases/expanded_assets/latest-wheels-no-rtti
pip install llvm-aie -f https://github.com/Xilinx/llvm-aie/releases/expanded_assets/nightly
```

mlir-air must be built from source (no Windows wheels yet):

```powershell
git clone https://github.com/Xilinx/mlir-air.git
cd mlir-air
git checkout <commit-from-utils/mlir-air-hash.txt>
git submodule update --init --recursive

cmake -G Ninja -DCMAKE_BUILD_TYPE=Release `
-DCMAKE_C_COMPILER=cl -DCMAKE_CXX_COMPILER=cl `
-DMLIR_DIR=<mlir-distro>/lib/cmake/mlir `
-DLLVM_DIR=<mlir-distro>/lib/cmake/llvm `
-DAIE_DIR=<mlir-aie-python-pkg>/lib/cmake/aie `
-DLLVM_ENABLE_RTTI=OFF -DBUILD_SHARED_LIBS=OFF `
-DAIR_RUNTIME_TARGETS="" -DAIR_ENABLE_GPU=OFF `
-B build -S .
ninja -C build -j $env:NUMBER_OF_PROCESSORS
ninja -C build install
```

Install Triton-XDNA:

```powershell
$env:TRITON_PLUGIN_DIRS = "$PWD\third_party\triton_shared;$PWD\amd_triton_npu"
pip install -e . --no-build-isolation -v
```

### Additional Windows Tools

**xclbinutil** and **aiebu-asm** — Included in the XRT Windows SDK zip. Ensure they
are on PATH or in `<mlir_aie_install>/bin/`.

**DIA SDK** — If the mlir-air cmake build can't find DIA SDK:
```powershell
subst Z: "C:\Program Files\Microsoft Visual Studio\2022\Community\DIA SDK"
```

### Run examples (Windows)

```powershell
cd examples\vec-add
$env:AIR_TRANSFORM_TILING_SCRIPT = "transform_aie2p.mlir"
python vec-add.py
```

### Windows Environment Variables

| Variable | Purpose |
|----------|---------|
| `AIR_TRANSFORM_TILING_SCRIPT` | Path to MLIR transform dialect IR |
| `XILINX_XRT` | (Optional) Override XRT SDK location if not in `C:\Program Files\AMD\xrt` |
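
The override behaviour described for `XILINX_XRT` can be sketched as follows. This is an illustration only: the helper name `resolve_xrt_root` is hypothetical, and how the actual build scripts read the variable is not shown in this README.

```python
import os
from pathlib import Path

# Default location taken from the README; the override variable is XILINX_XRT.
DEFAULT_XRT_ROOT = Path(r"C:\Program Files\AMD\xrt")


def resolve_xrt_root(environ=None):
    """Hypothetical helper: use XILINX_XRT when set and non-empty, else the default."""
    env = os.environ if environ is None else environ
    override = env.get("XILINX_XRT", "").strip()
    return Path(override) if override else DEFAULT_XRT_ROOT


if __name__ == "__main__":
    print("Using XRT SDK at:", resolve_xrt_root())
```

Treating an empty or whitespace-only value as "unset" is a design choice of this sketch, so that a stray `$env:XILINX_XRT = ""` does not resolve to the current directory.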

### Windows Known Limitations

- mlir-air must be built from source (no Windows wheels published)
- xclbinutil and aiebu-asm must be on PATH (from XRT Windows SDK)
- NPU driver must be installed
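
The PATH requirement for the two tools can be checked up front with a few lines of Python. This is a hedged sketch rather than anything shipped in the repository: `find_missing_tools` is a hypothetical helper, and it only inspects PATH, not the `<mlir_aie_install>/bin/` alternative mentioned earlier.

```python
import shutil

# Tool names from the README. On Windows, shutil.which consults PATHEXT,
# so the ".exe" suffix does not need to be spelled out here.
REQUIRED_TOOLS = ("xclbinutil", "aiebu-asm")


def find_missing_tools(tools=REQUIRED_TOOLS, path=None):
    """Hypothetical helper: return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools are on PATH.")
```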
8 changes: 7 additions & 1 deletion amd_triton_npu/backend/compiler.py
@@ -13,14 +13,18 @@
import shutil
import subprocess
import functools
import sys
from pathlib import Path

IS_WINDOWS = sys.platform == "win32"


def _get_amd_triton_npu_opt_path() -> str:
    binary_name = "triton-shared-opt.exe" if IS_WINDOWS else "triton-shared-opt"
    path = (
        Path(__file__).resolve().parent.parent.parent
        / "triton_shared"
-       / "triton-shared-opt"
+       / binary_name
    )
    if not os.path.isdir(path.parent):
        raise RuntimeError(f"Could not find triton-shared binaries at {path}")
@@ -31,6 +35,8 @@ def _get_llvm_bin_path(bin_name: str) -> str:
    path = os.getenv("LLVM_BINARY_DIR", "")
    if path == "":
        raise Exception("LLVM_BINARY_DIR is not set.")
    if IS_WINDOWS and not bin_name.endswith(".exe"):
        bin_name += ".exe"
    return os.path.join(path, bin_name)

