docs/en/Dialect-creation.md
Lines changed: 204 additions & 5 deletions
@@ -1,4 +1,4 @@
1
-
Creating an MLIR dialect out-of-tree means describing your operations in TableGen (.td) files and then implementing the connection in C++. Here’s how and why each part matters—presented as a logical workflow, not as a step-by-step tutorial for beginners, but as a technical, narrative explanation.
1
+
Creating an out-of-tree MLIR dialect means describing your operations in TableGen (.td) files and then wiring the generated code into MLIR with a small amount of C++. This document explains how and why each part matters, as a technical, narrative walkthrough of the workflow rather than a step-by-step tutorial for beginners.
2
2
3
3
---
4
4
@@ -19,6 +19,7 @@ def Tuto_Dialect : Dialect {
19
19
}
20
20
```
21
21
22
+
22
23
On the same principle, you then describe the dialect’s operations in a dedicated TableGen file, typically `include/TutoDialect/TutoDialectOps.td`. This file gathers all operations: each operation is declared with its input/output types and names, and its MLIR assembly format. For instance, an addition of floats:
This file is the blueprint for generating the full C++ class for the operation via TableGen, ensuring parsing, syntax, and printing are consistent and correct.
48
49
49
-
Once these definitions are written, TableGen (invoked by CMake during the build) generates all the backend C++ (headers and intermediate sources). Then, you just need to implement the minimal glue in C++: the main dialect file, for example `lib/TutoDialect/TutoDialect.cpp`, is responsible for registering all operations of the dialect within MLIR. This is done with a simple `initialize()` method that adds your operations to the dialect’s table. Nothing magic—this is the key that makes your operations usable in tools like `mlir-opt` or your own binary (`tuto-opt`).
50
+
## Articulating TableGen and C++: The Skeleton of an Out-of-Tree MLIR Dialect
51
+
52
+
Designing an MLIR dialect outside the LLVM source tree is fundamentally about **separating declaration from implementation**. This architectural split, with declarative TableGen on one side and connecting C++ on the other, is what allows MLIR to scale and remain maintainable, even as dialects grow.
53
+
54
+
### 1. TableGen: Declarative Core of the Dialect
55
+
56
+
Everything starts with TableGen `.td` files.
57
+
58
+
- **`TutoDialect.td`**: This file defines the dialect, its MLIR name, summary, and C++ namespace. It is the canonical description from which TableGen generates all symbols and basic metadata.
59
+
60
+
- **`TutoDialectOps.td`**: Here, you describe all operations of your dialect. Each op is defined with its operands/results, assembly syntax, documentation, and interfaces. TableGen will turn this into a complete C++ class, including parsing, printing, and basic verification logic.
61
+
62
+
63
+
> **Key Point:**
64
+
> TableGen `.td` files are the _sole source of truth_ for the syntax, signatures, and metadata of your dialect and ops.
65
+
> All boilerplate and repetitive code (parsing, printing, verification stubs, etc.) is generated from here.
66
+
67
+
---
68
+
69
+
### 2. The C++ Headers: Connecting Generated Code to the Project
70
+
71
+
The glue between TableGen and the MLIR C++ API consists of several headers:
72
+
73
+
#### `TutoOps.h`
74
+
75
+
```cpp
76
+
#ifndef TUTO_TUTOOPS_H
77
+
#define TUTO_TUTOOPS_H
78
+
79
+
#include "mlir/IR/BuiltinTypes.h"
80
+
#include "mlir/IR/Dialect.h"
81
+
#include "mlir/IR/OpDefinition.h"
82
+
#include "mlir/Interfaces/InferTypeOpInterface.h"
83
+
#include "mlir/Interfaces/SideEffectInterfaces.h"
84
+
85
+
#define GET_OP_CLASSES
86
+
#include "TutoDialect/TutoOps.h.inc"
87
+
88
+
#endif // TUTO_TUTOOPS_H
89
+
```
90
+
91
+
- This header gathers all operation class declarations that TableGen generates into `TutoOps.h.inc`, so you can use them from C++.
92
+
93
+
- It also includes all relevant MLIR headers (types, base classes, interfaces).
94
+
95
+
96
+
#### `TutoDialect.h`
97
+
98
+
```cpp
99
+
#ifndef TUTO_TUTODIALECT_H
100
+
#define TUTO_TUTODIALECT_H
101
+
102
+
#include "mlir/IR/Dialect.h"
103
+
#include "TutoDialect/TutoOpsDialect.h.inc"
104
+
105
+
#endif // TUTO_TUTODIALECT_H
106
+
```
107
+
108
+
- This header links MLIR to your dialect class, which is also generated by TableGen as `TutoOpsDialect.h.inc`.
109
+
110
+
- It makes your dialect visible and instantiable within MLIR.
111
+
112
+
113
+
---
114
+
115
+
### 3. The .cpp Files: Registration and Implementation
116
+
117
+
#### `TutoOps.cpp`
118
+
119
+
```cpp
120
+
#include "TutoDialect/TutoOps.h"
121
+
#include "TutoDialect/TutoDialect.h"
122
+
#include "mlir/IR/OpImplementation.h"
123
+
124
+
#define GET_OP_CLASSES
125
+
#include "TutoDialect/TutoOps.cpp.inc"
126
+
```
127
+
128
+
- This file pulls in all TableGen-generated implementations for your ops.
129
+
130
+
- Here you would also add any custom verification/builders/etc. for your operations if needed.
131
+
132
+
133
+
#### `TutoDialect.cpp`
134
+
135
+
```cpp
136
+
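#include "TutoDialect/TutoDialect.h"
137
+
#include "TutoDialect/TutoOps.h"
138
+
#include "mlir/IR/DialectImplementation.h"
139
+
140
+
using namespace mlir;
141
+
using namespace mlir::tuto;

// Generated dialect definitions (constructor, destructor). The file name is
// an assumption that mirrors the TutoOpsDialect.h.inc naming used above.
#include "TutoDialect/TutoOpsDialect.cpp.inc"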
142
+
143
+
void TutoDialect::initialize() {
144
+
addOperations<
145
+
#define GET_OP_LIST
146
+
#include "TutoDialect/TutoOps.cpp.inc"
147
+
>();
148
+
}
149
+
```
150
+
151
+
- This is where the dialect and its operations are **registered** with MLIR.
152
+
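`addOperations<>` is a variadic template method of the dialect class, not a macro. The included op list expands to the comma-separated names of all generated operation classes, which become its template arguments and are thereby registered with your dialect.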
153
+
154
+
- This registration is what makes your ops discoverable and usable in `mlir-opt`-style tools such as your own `tuto-opt` binary.
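
The registration pattern described above can be modelled in plain C++ without MLIR. The following is a toy sketch, not MLIR's implementation: `ToyDialect`, `AddOp`, `MulOp`, and the `TOY_OP_LIST` macro are hypothetical stand-ins. The macro plays the role of the comma-separated class list that the `GET_OP_LIST` section of `TutoOps.cpp.inc` expands to, and `addOperations` is a variadic template that records each class's operation name.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-ins for TableGen-generated operation classes.
struct AddOp {
  static const char *getOperationName() { return "tuto.add"; }
};
struct MulOp {
  static const char *getOperationName() { return "tuto.mul"; }
};

// Stand-in for the comma-separated class list that the GET_OP_LIST section
// of a generated .cpp.inc file would expand to.
#define TOY_OP_LIST AddOp, MulOp

// Toy dialect: registers its operation names at construction time.
class ToyDialect {
public:
  ToyDialect() { initialize(); }

  bool hasOp(const std::string &name) const { return ops.count(name) != 0; }
  std::size_t numOps() const { return ops.size(); }

private:
  // Variadic template: records the name of every class in the pack.
  template <typename... Ops>
  void addOperations() {
    // Pack expansion inside a braced list performs one insert per class.
    int expand[] = {0, (ops.insert(std::string(Ops::getOperationName())), 0)...};
    (void)expand;
  }

  // Mirrors the shape of TutoDialect::initialize() shown above.
  void initialize() { addOperations<TOY_OP_LIST>(); }

  std::set<std::string> ops;
};
```

Constructing a `ToyDialect` and calling `hasOp("tuto.add")` returns `true`, while an unlisted name such as `"tuto.sub"` returns `false`. The point of the design is that adding an op to the list is enough to register it; no per-op C++ registration code is written by hand.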
155
+
156
+
157
+
---
158
+
159
+
### 4. The Build Process: Automation via CMake and TableGen
160
+
161
+
- When you build, **TableGen** runs and emits all the necessary `.inc` headers from your `.td` files (`TutoOps.h.inc`, `TutoOps.cpp.inc`, `TutoOpsDialect.h.inc`, etc.).
162
+
163
+
- Your C++ files include these headers, and the compiler stitches everything together.
164
+
165
+
- You **never manually edit** the generated `.inc` files; they are regenerated automatically whenever your TableGen definitions change.
166
+
167
+
168
+
---
169
+
170
+
### 5. Using the Dialect
171
+
172
+
After this pipeline is in place, you can write, parse, and print your custom operations in `.mlir` files.
173
+
Your driver binary (`tuto-opt`) is now able to:
174
+
175
+
- Parse and validate your dialect/ops,
176
+
177
+
- Print them in MLIR syntax,
178
+
179
+
- Serve as a testbed for further extensions: types, canonicalizations, lowerings, etc.
180
+
181
+
182
+
---
183
+
### Key Takeaways
184
+
185
+
- **TableGen** is for declarative structure: syntax, types, interfaces, and signatures.
186
+
187
+
- **C++** is for connecting, registering, and (optionally) extending with custom logic.
188
+
189
+
- The generated `.inc` files are the automatic "bridge" between declarative TableGen and the runtime/compilable C++ world.
190
+
191
+
- The build system keeps everything in sync, allowing you to focus on high-level definitions and advanced extensions.
192
+
193
+
Once these definitions are written, TableGen (invoked by CMake during the build) generates the supporting C++ (the `.inc` headers and sources). You then only need to write minimal glue in C++: the main dialect file, for example `lib/TutoDialect/TutoDialect.cpp`, registers all operations of the dialect with MLIR. This is done in an `initialize()` method that adds your operations to the dialect. There is nothing magic here: this registration is what makes your operations usable in an `mlir-opt`-style tool such as your own binary (`tuto-opt`).
In an out-of-tree MLIR project, the main driver source (as shown) serves as the interface between your dialect and the MLIR ecosystem. Its purpose is not to hard-code logic but to **register the set of dialects you want your tool to support**, including your own, and to delegate all actual IR handling, verification, parsing, pass execution, and pretty-printing to MLIR's existing infrastructure.
239
+
240
+
Including the core dialect headers alongside your own makes the dialect classes available to the driver; inserting them into the registry is what tells MLIR which operations and types to recognize and parse. The dialect registry object is a central component: by inserting your dialect (`mlir::tuto::TutoDialect`) and any others (arith, math, tensor, affine, linalg, memref, LLVM, func), you make their ops available as first-class citizens in your IR. This registry becomes the catalogue that MLIR uses at runtime for all dialect resolution and IR manipulation.
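
To make the idea of the registry as a catalogue concrete, here is a toy model in plain C++. It is a simplified sketch under stated assumptions, not MLIR's `DialectRegistry`: `ToyRegistry`, `ToyTutoDialect`, and `ToyArithDialect` are hypothetical names, and resolution is reduced to checking the dialect prefix of an operation name of the form `<dialect>.<op>`.

```cpp
#include <set>
#include <string>

// Hypothetical stand-ins for dialect classes. Generated MLIR dialect classes
// likewise expose a static getDialectNamespace().
struct ToyTutoDialect {
  static const char *getDialectNamespace() { return "tuto"; }
};
struct ToyArithDialect {
  static const char *getDialectNamespace() { return "arith"; }
};

// Toy registry: remembers which dialect namespaces were inserted.
class ToyRegistry {
public:
  // Inserts one or more dialects, in the style of registry.insert<A, B>().
  template <typename... Dialects>
  void insert() {
    int expand[] = {
        0, (namespaces.insert(std::string(Dialects::getDialectNamespace())), 0)...};
    (void)expand;
  }

  // An operation name is resolvable only if the part before the first '.'
  // matches an inserted dialect namespace.
  bool canResolve(const std::string &opName) const {
    const std::string prefix = opName.substr(0, opName.find('.'));
    return namespaces.count(prefix) != 0;
  }

private:
  std::set<std::string> namespaces;
};
```

After `insert<ToyTutoDialect, ToyArithDialect>()`, the names `tuto.add` and `arith.constant` resolve, whereas `math.sqrt` does not because no `math` dialect was inserted. This mirrors why an op from an unregistered dialect is rejected by the driver by default.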
241
+
242
+
The key function is `MlirOptMain`, which is a generic driver for IR files and passes, provided directly by MLIR. It expects to be handed a dialect registry and takes care of everything else: loading IR, handling passes, running analyses, producing diagnostics, and emitting transformed IR. It abstracts away boilerplate so that your binary focuses solely on **declaring support for dialects**, not reimplementing existing tooling.
243
+
244
+
There is no stepwise logic or custom orchestration here; the code is deliberately minimal, reflecting the **compositional, declarative design** MLIR encourages. Once registered, your dialect can appear in the same IR as the standard dialects and be handled by generic passes. The out-of-tree nature shows in the lack of special-casing: your dialect is just another extension, managed at runtime via the registry and never hardwired into MLIR itself.
245
+
246
+
This is the architectural pattern that enables scalability, extensibility, and modularity in the MLIR ecosystem.
247
+
248
+
---
249
+
50
250
51
-
Note: the C++ operations file (`TutoDialectOps.cpp`) is usually empty at first, unless you want to add custom verification, builders, or canonicalization logic. The parsing, syntax, and type signatures are already handled by TableGen.
52
251
53
252
With these files in place, you simply build the project. The dialect can then be used in a `.mlir` file like:
54
253
@@ -65,13 +264,13 @@ And you can test it using your binary:
65
264
./bin/Tuto-opt test/TutoTest.mlir
66
265
```
67
266
68
-
Result: your dialect and operations are fully integrated into MLIR and ready to be extended—add types, patterns, lowerings, whatever you need.
267
+
Result: your dialect and operations are fully integrated into MLIR and ready to be extended: add types, patterns, lowerings, whatever you need.
69
268
70
269
---
71
270
72
271
**Logical summary:**
73
272
74
-
-`.td`: all declarative stuff—syntax, signatures, metadata.
273
+
- `.td`: everything declarative (syntax, signatures, metadata).