Commit af086bc

committed
explanations
1 parent 3c1cf73 commit af086bc

2 files changed

+210
-10
lines changed

.obsidian/workspace.json

Lines changed: 6 additions & 5 deletions
@@ -78,7 +78,8 @@
7878
}
7979
],
8080
"direction": "horizontal",
81-
"width": 300
81+
"width": 200,
82+
"collapsed": true
8283
},
8384
"right": {
8485
"id": "2843680f571045af",
@@ -168,17 +169,17 @@
168169
"command-palette:Open command palette": false
169170
}
170171
},
171-
"active": "c3a0c2a858aa43f0",
172+
"active": "cdc20bcbc91f2c90",
172173
"lastOpenFiles": [
173174
"docs/en/Environment-Setup.md",
175+
"docs/en/index.md",
176+
"docs/en/Introduction.md",
177+
"docs/fr/index.md",
174178
"docs/en/Dialect-creation.md",
175179
"docs/en/index.md~",
176180
"_config.yml~",
177-
"docs/en/Introduction.md",
178-
"docs/en/index.md",
179181
"docs/index.md",
180182
"README.md",
181-
"docs/fr/index.md",
182183
"docs/en/passes.md",
183184
"docs/en/tests.md",
184185
"en.md"

docs/en/Dialect-creation.md

Lines changed: 204 additions & 5 deletions
@@ -1,4 +1,4 @@
1-
Creating an MLIR dialect out-of-tree means describing your operations in TableGen (.td) files and then implementing the connection in C++. Here’s how and why each part matterspresented as a logical workflow, not as a step-by-step tutorial for beginners, but as a technical, narrative explanation.
1+
Creating an out-of-tree MLIR dialect means describing your operations in TableGen (.td) files and then implementing the connection in C++. Here is how and why each part matters, presented as a technical, narrative walkthrough of the workflow rather than a step-by-step tutorial for beginners.
22

33
---
44

@@ -19,6 +19,7 @@ def Tuto_Dialect : Dialect {
1919
}
2020
```
2121

22+
2223
On the same principle, you then describe the dialect’s operations in a dedicated TableGen file, typically `include/TutoDialect/TutoDialectOps.td`. This file gathers all operations: each operation is declared with its input/output types and names, and its MLIR assembly format. For instance, an addition of floats:
2324

2425
```tablegen
@@ -46,9 +47,207 @@ def AddOp : Tuto_Op<"add", [Pure]> {
4647

4748
This file is the blueprint for generating the full C++ class for the operation via TableGen, ensuring parsing, syntax, and printing are consistent and correct.
4849

49-
Once these definitions are written, TableGen (invoked by CMake during the build) generates all the backend C++ (headers and intermediate sources). Then, you just need to implement the minimal glue in C++: the main dialect file, for example `lib/TutoDialect/TutoDialect.cpp`, is responsible for registering all operations of the dialect within MLIR. This is done with a simple `initialize()` method that adds your operations to the dialect’s table. Nothing magic—this is the key that makes your operations usable in tools like `mlir-opt` or your own binary (`tuto-opt`).
50+
## Articulating TableGen and C++: The Skeleton of an Out-of-Tree MLIR Dialect
51+
52+
Designing an MLIR dialect outside the LLVM source tree is fundamentally about **separating declaration from implementation**. This architectural split, with declarative TableGen on one side and connecting C++ on the other, is what allows MLIR to scale and remain maintainable, even as dialects grow.
53+
54+
### 1. TableGen: Declarative Core of the Dialect
55+
56+
Everything starts with TableGen `.td` files.
57+
58+
- **`TutoDialect.td`**: This file defines the dialect, its MLIR name, summary, and C++ namespace. It is the canonical description from which TableGen generates all symbols and basic metadata.
59+
60+
- **`TutoDialectOps.td`**: Here, you describe all operations of your dialect. Each op is defined with its operands/results, assembly syntax, documentation, and interfaces. TableGen will turn this into a complete C++ class, including parsing, printing, and basic verification logic.
61+
62+
63+
> **Key Point:**
64+
> TableGen `.td` files are the _sole source of truth_ for the syntax, signatures, and metadata of your dialect and ops.
65+
> All boilerplate and repetitive code (parsing, printing, verification stubs, etc.) is generated from here.
66+
67+
---
68+
69+
### 2. The C++ Headers: Connecting Generated Code to the Project
70+
71+
The glue between TableGen and the MLIR C++ API consists of several headers:
72+
73+
#### `TutoOps.h`
74+
75+
```cpp
76+
#ifndef TUTO_TUTOOPS_H
77+
#define TUTO_TUTOOPS_H
78+
79+
#include "mlir/IR/BuiltinTypes.h"
80+
#include "mlir/IR/Dialect.h"
81+
#include "mlir/IR/OpDefinition.h"
82+
#include "mlir/Interfaces/InferTypeOpInterface.h"
83+
#include "mlir/Interfaces/SideEffectInterfaces.h"
84+
85+
#define GET_OP_CLASSES
86+
#include "TutoDialect/TutoOps.h.inc"
87+
88+
#endif // TUTO_TUTOOPS_H
89+
```
90+
91+
- This header gathers all operation class declarations that TableGen generates into `TutoOps.h.inc`, so you can use them from C++.
92+
93+
- It also includes all relevant MLIR headers (types, base classes, interfaces).
94+
95+
96+
#### `TutoDialect.h`
97+
98+
```cpp
99+
#ifndef TUTO_TUTODIALECT_H
100+
#define TUTO_TUTODIALECT_H
101+
102+
#include "mlir/IR/Dialect.h"
103+
#include "TutoDialect/TutoOpsDialect.h.inc"
104+
105+
#endif // TUTO_TUTODIALECT_H
106+
```
107+
108+
- This header links MLIR to your dialect class, which is also generated by TableGen as `TutoOpsDialect.h.inc`.
109+
110+
- It makes your dialect visible and instantiable within MLIR.
111+
112+
113+
---
114+
115+
### 3. The .cpp Files: Registration and Implementation
116+
117+
#### `TutoOps.cpp`
118+
119+
```cpp
120+
#include "TutoDialect/TutoOps.h"
121+
#include "TutoDialect/TutoDialect.h"
122+
#include "mlir/IR/OpImplementation.h"
123+
124+
#define GET_OP_CLASSES
125+
#include "TutoDialect/TutoOps.cpp.inc"
126+
```
127+
128+
- This file pulls in all TableGen-generated implementations for your ops.
129+
130+
- Here you would also add any custom verification/builders/etc. for your operations if needed.
131+
132+
133+
#### `TutoDialect.cpp`
134+
135+
```cpp
136+
#include "TutoDialect/TutoDialect.h"
137+
#include "TutoDialect/TutoOps.h"
138+
#include "mlir/IR/DialectImplementation.h"
139+
140+
using namespace mlir;
141+
using namespace mlir::tuto;
142+
143+
void TutoDialect::initialize() {
144+
addOperations<
145+
#define GET_OP_LIST
146+
#include "TutoDialect/TutoOps.cpp.inc"
147+
>();
148+
}
149+
```
150+
151+
- This is where the dialect and its operations are **registered** with MLIR.
152+
`addOperations<>` is a member function template, not a macro: instantiated with the included op list, it registers all the generated operation classes with your dialect.
153+
154+
- This registration is what makes your ops discoverable and usable in tools like `mlir-opt` and your own binaries.
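
To make the registration step more concrete, here is a minimal, self-contained sketch of the idea behind `addOperations<...>`. It is a toy model and not MLIR's implementation: `ToyDialect`, `makeToyDialect`, and `MulOp` are hypothetical names introduced only for illustration, and the op structs stand in for the classes TableGen would generate. The one detail borrowed from MLIR is that each op class exposes a static `getOperationName()`.

```cpp
// Toy model of dialect op registration. NOT MLIR's actual implementation:
// ToyDialect, makeToyDialect and MulOp are hypothetical, for illustration.
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-ins for TableGen-generated op classes. Each one exposes its
// fully qualified name through a static member function.
struct AddOp {
  static constexpr const char *getOperationName() { return "tuto.add"; }
};
struct MulOp {  // hypothetical second op, not part of the original dialect
  static constexpr const char *getOperationName() { return "tuto.mul"; }
};

class ToyDialect {
public:
  // Variadic member template: a single call records every op type in the
  // template argument list.
  template <typename... Ops>
  void addOperations() {
    // C++17 fold expression over the comma operator.
    (registered.push_back(Ops::getOperationName()), ...);
  }

  // Linear lookup by operation name.
  bool hasOperation(const std::string &name) const {
    for (const std::string &op : registered)
      if (op == name)
        return true;
    return false;
  }

  std::size_t numOperations() const { return registered.size(); }

private:
  std::vector<std::string> registered;
};

// Mirrors the shape of TutoDialect::initialize(). In the real project the
// type list is not written by hand: it comes from the GET_OP_LIST section
// of the generated TutoOps.cpp.inc.
ToyDialect makeToyDialect() {
  ToyDialect dialect;
  dialect.addOperations<AddOp, MulOp>();
  assert(dialect.numOperations() == 2);
  return dialect;
}
```

Calling `makeToyDialect()` yields a dialect in which `hasOperation("tuto.add")` is true, while a name that was never passed to `addOperations`, such as `"tuto.sub"`, is not found. In the real dialect the same pattern means an op that is missing from the generated list is simply unknown to the parser.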
155+
156+
157+
---
158+
159+
### 4. The Build Process: Automation via CMake and TableGen
160+
161+
- When you build, **TableGen** runs and emits all the necessary `.inc` headers from your `.td` files (`TutoOps.h.inc`, `TutoOps.cpp.inc`, `TutoOpsDialect.h.inc`, etc.).
162+
163+
- Your C++ files include these headers, and the compiler stitches everything together.
164+
165+
- You **never manually edit** the generated `.inc` files: they are regenerated automatically whenever your TableGen definitions change.
166+
167+
168+
---
169+
170+
### 5. Using the Dialect
171+
172+
After this pipeline is in place, you can write, parse, and print your custom operations in `.mlir` files.
173+
Your driver binary (`tuto-opt`) is now able to:
174+
175+
- Parse and validate your dialect/ops,
176+
177+
- Print them in MLIR syntax,
178+
179+
- Serve as a testbed for further extensions: types, canonicalizations, lowerings, etc.
180+
181+
182+
---
183+
### Key Takeaways
184+
185+
- **TableGen** is for declarative structure: syntax, types, interfaces, and signatures.
186+
187+
- **C++** is for connecting, registering, and (optionally) extending with custom logic.
188+
189+
- The generated `.inc` files are the automatic "bridge" between declarative TableGen and the runtime/compilable C++ world.
190+
191+
- The build system keeps everything in sync, allowing you to focus on high-level definitions and advanced extensions.
192+
193+
Once these definitions are written, TableGen (invoked by CMake during the build) generates the corresponding C++ (the `.inc` headers and sources). You then only need to implement the minimal glue in C++: the main dialect file, for example `lib/TutoDialect/TutoDialect.cpp`, is responsible for registering all operations of the dialect within MLIR. This is done with a simple `initialize()` method that adds your operations to the dialect’s table. There is nothing magic here: this registration is what makes your operations usable in tools like `mlir-opt` or your own binary (`tuto-opt`).
194+
195+
The driver's `main` looks like this:
196+
197+
```cpp
198+
#include "mlir/IR/Dialect.h"
199+
#include "mlir/IR/MLIRContext.h"
200+
#include "mlir/InitAllDialects.h"
201+
#include "mlir/InitAllPasses.h"
202+
#include "mlir/Pass/Pass.h"
203+
#include "mlir/Pass/PassManager.h"
204+
#include "mlir/Support/FileUtilities.h"
205+
#include "mlir/Tools/mlir-opt/MlirOptMain.h"
206+
#include "llvm/Support/CommandLine.h"
207+
#include "llvm/Support/InitLLVM.h"
208+
#include "llvm/Support/SourceMgr.h"
209+
#include "llvm/Support/ToolOutputFile.h"
210+
#include "mlir/Dialect/Linalg/IR/Linalg.h"
211+
#include "mlir/Dialect/Math/IR/Math.h"
212+
#include "mlir/Dialect/MemRef/IR/MemRef.h"
213+
#include "mlir/Dialect/SCF/IR/SCF.h"
214+
#include "mlir/Dialect/Tensor/IR/Tensor.h"
215+
#include "mlir/Dialect/Affine/IR/AffineOps.h"
216+
#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
217+
#include "TutoDialect/TutoDialect.h"
218+
#include "TutoDialect/TutoOpsDialect.cpp.inc"
219+
#include "mlir/Dialect/Func/IR/FuncOps.h"
220+
221+
int main(int argc, char **argv) {
222+
223+
mlir::DialectRegistry registry;
224+
registry.insert<mlir::tuto::TutoDialect>();
225+
registry.insert<mlir::arith::ArithDialect>();
226+
registry.insert<mlir::math::MathDialect>();
227+
registry.insert<mlir::tensor::TensorDialect>();
228+
registry.insert<mlir::affine::AffineDialect>();
229+
registry.insert<mlir::linalg::LinalgDialect>();
230+
registry.insert<mlir::memref::MemRefDialect>();
231+
registry.insert<mlir::LLVM::LLVMDialect>();
232+
registry.insert<mlir::func::FuncDialect>();
233+
return mlir::asMainReturnCode(
234+
mlir::MlirOptMain(argc, argv, "Tuto optimizer driver\n", registry));
235+
}
236+
```
237+
238+
In an out-of-tree MLIR project, the main driver source (as shown) serves as the interface between your dialect and the MLIR ecosystem. Its purpose is not to hard-code logic but to **register the set of dialects you want your tool to support**, including your own, and to delegate all actual IR handling (verification, parsing, pass execution, and pretty-printing) to MLIR’s infrastructure.
239+
240+
Including the dialect headers, alongside your own, only makes the dialect classes visible to the compiler; it is the dialect registry that tells MLIR which operations and types should be recognized and parsed. The registry object is the central component: by inserting your dialect (`mlir::tuto::TutoDialect`) and any others (arith, math, tensor, affine, linalg, memref, LLVM, func), you make their ops available as first-class citizens in your IR. This registry becomes the catalogue that MLIR uses at runtime for all dialect resolution and IR manipulation.
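
The catalogue role of the registry can be sketched with a small, self-contained model. This is an assumption-laden illustration and not how `mlir::DialectRegistry` is implemented: `ToyRegistry`, `ToyDialectBase`, `ToyTutoDialect`, `ToyFuncDialect`, and `makeToyRegistry` are hypothetical names. The sketch only shows the general idea of mapping a dialect namespace to a factory, so that a dialect is constructed when it is actually needed.

```cpp
// Toy model of a dialect registry. NOT mlir::DialectRegistry: every type
// and function here is hypothetical and exists only for illustration.
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

struct ToyDialectBase {
  virtual ~ToyDialectBase() = default;
  virtual std::string getNamespace() const = 0;
};

// Each toy dialect exposes its namespace statically, so the registry can
// record it without constructing the dialect.
struct ToyTutoDialect : ToyDialectBase {
  static std::string getDialectNamespace() { return "tuto"; }
  std::string getNamespace() const override { return getDialectNamespace(); }
};
struct ToyFuncDialect : ToyDialectBase {
  static std::string getDialectNamespace() { return "func"; }
  std::string getNamespace() const override { return getDialectNamespace(); }
};

class ToyRegistry {
public:
  // Record a factory under the dialect's namespace; nothing is built yet.
  template <typename DialectT>
  void insert() {
    factories[DialectT::getDialectNamespace()] =
        []() -> std::unique_ptr<ToyDialectBase> {
      return std::make_unique<DialectT>();
    };
  }

  bool isRegistered(const std::string &ns) const {
    return factories.count(ns) != 0;
  }

  // Construct the dialect on demand; returns nullptr if it is unknown.
  std::unique_ptr<ToyDialectBase> load(const std::string &ns) const {
    auto it = factories.find(ns);
    if (it == factories.end())
      return nullptr;
    return it->second();
  }

private:
  std::map<std::string, std::function<std::unique_ptr<ToyDialectBase>()>>
      factories;
};

// Mirrors the shape of the driver's main(): declare which dialects the
// tool supports, nothing more.
ToyRegistry makeToyRegistry() {
  ToyRegistry registry;
  registry.insert<ToyTutoDialect>();
  registry.insert<ToyFuncDialect>();
  assert(registry.isRegistered("tuto"));
  return registry;
}
```

With this model, `load("tuto")` returns a dialect object, while `load("arith")` returns a null pointer because that namespace was never inserted. The analogous situation in a real driver is an input file that uses ops from a dialect the tool did not register.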
241+
242+
The key function is `MlirOptMain`, which is a generic driver for IR files and passes, provided directly by MLIR. It expects to be handed a dialect registry and takes care of everything else: loading IR, handling passes, running analyses, producing diagnostics, and emitting transformed IR. It abstracts away boilerplate so that your binary focuses solely on **declaring support for dialects**, not reimplementing existing tooling.
243+
244+
There is no stepwise logic or custom orchestration here; the code is deliberately minimal, reflecting the **compositional, declarative design** MLIR encourages. Your dialect integrates seamlessly with all standard passes and dialects simply by being registered. The out-of-tree nature is reflected in the lack of special-casing: your dialect is just another extension point, managed at runtime via the registry, never hardwired into MLIR itself.
245+
246+
This is the architectural pattern that enables scalability, extensibility, and modularity in the MLIR ecosystem.
247+
248+
---
249+
50250

51-
Note: the C++ operations file (`TutoDialectOps.cpp`) is usually empty at first, unless you want to add custom verification, builders, or canonicalization logic. The parsing, syntax, and type signatures are already handled by TableGen.
52251

53252
With these files in place, you simply build the project. The dialect can then be used in a `.mlir` file like:
54253

@@ -65,13 +264,13 @@ And you can test it using your binary:
65264
./bin/Tuto-opt test/TutoTest.mlir
66265
```
67266

68-
Result: your dialect and operations are fully integrated into MLIR and ready to be extendedadd types, patterns, lowerings, whatever you need.
267+
Result: your dialect and operations are fully integrated into MLIR and ready to be extended: add types, patterns, lowerings, whatever you need.
69268

70269
---
71270

72271
**Logical summary:**
73272

74-
- `.td`: all declarative stuffsyntax, signatures, metadata.
273+
- `.td`: all the declarative parts (syntax, signatures, metadata).
75274

76275
- TableGen: generates classes/headers.
77276
