2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -261,7 +261,7 @@ jobs:
# Temporary CI gate: skip cases that still error/flap on the remote NPU.
# Update this list as we fix the underlying issues.
DEFAULT_SKIP_CASES: >-
- mix_kernel,vadd_validshape,vadd_validshape_dynamic,print,storefp
+ mix_kernel,vadd_validshape,vadd_validshape_dynamic,print,storefp,Gemvmx
steps:
- name: Resolve validation parameters
shell: bash
89 changes: 89 additions & 0 deletions docs/PTO_IR_manual.md
@@ -1400,6 +1400,95 @@ pto.tgemv.bias ins(%a, %b, %bias : !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.

---

##### `pto.tgemv.mx` - Mixed-Precision Matrix-Vector Multiply

**Summary:** Mixed-precision GEMV with explicit A/B scaling tiles.

**Semantics:**

```
dst = gemv(a, b) // quantization/mixed-precision behavior is target-defined
```

**Arguments:**

| Name | Type | Description |
|------|------|-------------|
| `a` | `pto.tile_buf` | Matrix tile (`loc=left`) |
| `a_scale` | `pto.tile_buf` | Scale tile associated with `a` |
| `b` | `pto.tile_buf` | Vector tile (`loc=right`) |
| `b_scale` | `pto.tile_buf` | Scale tile associated with `b` |
| `dst` | `pto.tile_buf` | Destination accumulator tile (`loc=acc`) |

**Results:** None. The result is written into `dst`, following the destination-passing style (DPS) pattern.

**Constraints & Verification:**

- `a`, `b`, and `dst` are subject to the same GEMV shape and location checks as `pto.tgemv`.
- `a_scale` and `b_scale` must be valid tile buffers.

**Hardware Mapping:**

- Executes on the **Matrix pipeline** (`PIPE_M`)

**Basic Example:**

```mlir
pto.tgemv.mx ins(%a, %a_scale, %b, %b_scale : !pto.tile_buf<...>, !pto.tile_buf<...>,
!pto.tile_buf<...>, !pto.tile_buf<...>)
outs(%c : !pto.tile_buf<...>)
```
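
Because the scaling behavior is target-defined, the following Python snippet is only an illustrative reference model, not the hardware semantics. It assumes, hypothetically, that `a` is an M x K matrix, `b` is a length-K vector, and each scale tile holds one multiplier per block of `block` consecutive elements along K. The function name `gemv_mx_reference` and the `block` parameter are invented for this sketch.

```python
def gemv_mx_reference(a, a_scale, b, b_scale, block=2):
    """Illustrative scaled GEMV (assumed semantics, not the target's).

    a:       M rows of K values
    a_scale: M rows of K // block multipliers (assumed layout)
    b:       K values
    b_scale: K // block multipliers (assumed layout)
    """
    dst = []
    for row, row_scale in zip(a, a_scale):
        acc = 0.0
        for k, (a_val, b_val) in enumerate(zip(row, b)):
            # Apply each operand's block scale before multiplying.
            scaled_a = a_val * row_scale[k // block]
            scaled_b = b_val * b_scale[k // block]
            acc += scaled_a * scaled_b
        dst.append(acc)
    return dst


a = [[1, 2, 3, 4], [0, 1, 0, 1]]
a_scale = [[0.5, 2.0], [1.0, 1.0]]
b = [1, 1, 2, 2]
b_scale = [2.0, 0.5]
result = gemv_mx_reference(a, a_scale, b, b_scale)
```

With these inputs, every element of the first block is multiplied by its block scale before the dot product is accumulated; a real target may instead apply scales after accumulation or use a different block size.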

---

##### `pto.tgemv.mx.acc` - Mixed-Precision GEMV with Accumulation

**Summary:** Accumulating form of the mixed-precision GEMV, using scale tiles.

**Semantics:**

```
dst = c_in + gemv(a, b)
```

**Arguments:** `c_in, a, a_scale, b, b_scale, dst`

**Hardware Mapping:** Matrix pipeline (`PIPE_M`)

**Basic Example:**

```mlir
pto.tgemv.mx.acc ins(%c_in, %a, %a_scale, %b, %b_scale : !pto.tile_buf<...>, !pto.tile_buf<...>,
!pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>)
outs(%c_out : !pto.tile_buf<...>)
```
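
A typical reason for an accumulating form is to split the reduction dimension K into chunks and chain the partial products. The Python sketch below illustrates that arithmetic under the stated semantics `dst = c_in + gemv(a, b)`. Scaling is deliberately omitted to isolate the accumulation step, and the helper names `gemv` and `gemv_acc` are hypothetical.

```python
def gemv(a, b):
    """Plain matrix-vector product: one dot product per row of a."""
    return [sum(x * y for x, y in zip(row, b)) for row in a]


def gemv_acc(c_in, a, b):
    """Accumulating form: dst = c_in + gemv(a, b)."""
    return [c + g for c, g in zip(c_in, gemv(a, b))]


a = [[1, 2, 3, 4], [5, 6, 7, 8]]
b = [1, 0, 2, 1]

# Full-K product in one step.
full = gemv(a, b)

# Same product computed as two K-chunks chained through the accumulator.
first_half = gemv([row[:2] for row in a], b[:2])
chained = gemv_acc(first_half, [row[2:] for row in a], b[2:])
```

Here `chained` equals `full`, which is the property that lets a long reduction be tiled along K.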

---

##### `pto.tgemv.mx.bias` - Mixed-Precision GEMV with Bias

**Summary:** Bias form of the mixed-precision GEMV, using scale tiles.

**Semantics:**

```
dst = gemv(a, b) + bias
```

**Arguments:** `a, a_scale, b, b_scale, bias, dst`

**Hardware Mapping:** Matrix pipeline (`PIPE_M`)

**Basic Example:**

```mlir
pto.tgemv.mx.bias ins(%a, %a_scale, %b, %b_scale, %bias : !pto.tile_buf<...>, !pto.tile_buf<...>,
!pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>)
outs(%c : !pto.tile_buf<...>)
```
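
The bias semantics `dst = gemv(a, b) + bias` can be sketched in Python as follows. This assumes, for illustration only, that `bias` holds one value per output element; the actual bias tile shape is not specified here, scaling is omitted, and the name `gemv_bias` is hypothetical.

```python
def gemv_bias(a, b, bias):
    """Bias form: dst[i] = dot(a[i], b) + bias[i] (assumed bias layout)."""
    return [sum(x * y for x, y in zip(row, b)) + bias_val
            for row, bias_val in zip(a, bias)]


dst = gemv_bias([[1, 2], [3, 4]], [1, 1], [10, 20])
```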

---

### 4.5 Vector Arithmetic Operations

All vector arithmetic operations execute on the **Vector pipeline** (`PIPE_V`) and use `ins`/`outs` with tile buffers in the **VEC (UB)** memory space.
101 changes: 101 additions & 0 deletions include/PTO/IR/PTOOps.td
@@ -918,6 +918,107 @@ def TGemvBiasOp : PTO_TOp<"tgemv.bias", [
}];
}

def TGemvMxOp : PTO_TOp<"tgemv.mx", [
PTO_DpsInitOpInterface,
OpPipeInterface,
DeclareOpInterfaceMethods<MemoryEffectsOpInterface>
]> {
let summary = "Mixed-precision GEMV with scale tiles (tile world, ins/outs).";

let arguments = (ins
PTODpsType:$a,
PTODpsType:$a_scale,
PTODpsType:$b,
PTODpsType:$b_scale,
PTODpsType:$dst
);

let results = (outs Optional<AnyRankedTensor>:$result);
let hasVerifier = 1;

let assemblyFormat = [{
`ins` `(` $a `,` $a_scale `,` $b `,` $b_scale
`:` type($a) `,` type($a_scale) `,` type($b) `,` type($b_scale) `)`
`outs` `(` $dst `:` qualified(type($dst) ) `)`
attr-dict
(`->` qualified(type($result))^)?
}];

let extraClassDeclaration = [{
static StringRef getIntrinsicName() { return "TGEMV_MX"; }
::mlir::pto::PIPE getPipe() { return ::mlir::pto::PIPE::PIPE_M; }
::mlir::MutableOperandRange getDpsInitsMutable() { return getDstMutable(); }
}];
}

def TGemvMxAccOp : PTO_TOp<"tgemv.mx.acc", [
PTO_DpsInitOpInterface,
OpPipeInterface,
DeclareOpInterfaceMethods<MemoryEffectsOpInterface>
]> {
let summary = "Mixed-precision GEMV accumulate with scale tiles (tile world, ins/outs).";

let arguments = (ins
PTODpsType:$c_in,
PTODpsType:$a,
PTODpsType:$a_scale,
PTODpsType:$b,
PTODpsType:$b_scale,
PTODpsType:$dst
);

let results = (outs Optional<AnyRankedTensor>:$result);
let hasVerifier = 1;

let assemblyFormat = [{
`ins` `(` $c_in `,` $a `,` $a_scale `,` $b `,` $b_scale
`:` type($c_in) `,` type($a) `,` type($a_scale) `,` type($b) `,` type($b_scale) `)`
`outs` `(` $dst `:` qualified(type($dst) ) `)`
attr-dict
(`->` qualified(type($result))^)?
}];

let extraClassDeclaration = [{
static StringRef getIntrinsicName() { return "TGEMV_MX"; }
::mlir::pto::PIPE getPipe() { return ::mlir::pto::PIPE::PIPE_M; }
::mlir::MutableOperandRange getDpsInitsMutable() { return getDstMutable(); }
}];
}

def TGemvMxBiasOp : PTO_TOp<"tgemv.mx.bias", [
PTO_DpsInitOpInterface,
OpPipeInterface,
DeclareOpInterfaceMethods<MemoryEffectsOpInterface>
]> {
let summary = "Mixed-precision GEMV with bias and scale tiles (tile world, ins/outs).";

let arguments = (ins
PTODpsType:$a,
PTODpsType:$a_scale,
PTODpsType:$b,
PTODpsType:$b_scale,
PTODpsType:$bias,
PTODpsType:$dst
);

let results = (outs Optional<AnyRankedTensor>:$result);
let hasVerifier = 1;

let assemblyFormat = [{
`ins` `(` $a `,` $a_scale `,` $b `,` $b_scale `,` $bias
`:` type($a) `,` type($a_scale) `,` type($b) `,` type($b_scale) `,` qualified(type($bias)) `)`
`outs` `(` $dst `:` qualified(type($dst) ) `)`
attr-dict
(`->` qualified(type($result))^)?
}];

let extraClassDeclaration = [{
static StringRef getIntrinsicName() { return "TGEMV_MX"; }
::mlir::pto::PIPE getPipe() { return ::mlir::pto::PIPE::PIPE_M; }
::mlir::MutableOperandRange getDpsInitsMutable() { return getDstMutable(); }
}];
}

def TMovOp : PTO_TOp<"tmov", [
PTO_DpsInitOpInterface,
OpPipeInterface,