This repository contains the official implementation and experimental artifacts for:
"Hybrid Quantum-MambaVision: A Quantum-enhanced State Space Model for Calibrated Mixed-type Wafer Defect Detection"
Accepted Paper: EFMxDM Workshop at PAKDD 2026, Hong Kong
We propose a hybrid quantum-classical architecture that augments MambaVision-T with a Quantum Context Adapter (QCA) and LoRA fine-tuning for multi-label semiconductor wafer defect classification. The QCA injects a lightweight PennyLane-based variational quantum circuit between the backbone stages, providing a quantum-enhanced channel re-calibration signal whose influence is governed by a learnable gate scalar. Combined with Focal Loss and a cosine-annealed training schedule, the model achieves 97.84% subset accuracy and 0.9947 Micro-F1 on the MixedType Wafer Defect dataset, outperforming classical MambaVision, ResNet-50 and ViT baselines while adding only ~5.8K quantum-adapter parameters.
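The two reported metrics can be computed as sketched below. This is a minimal illustration, not the repository's evaluation code: the function names `subset_accuracy` and `micro_f1` are hypothetical, and the sketch assumes predictions have already been thresholded into 0/1 label vectors.

```python
from typing import Sequence


def subset_accuracy(y_true: Sequence[Sequence[int]],
                    y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of samples whose full label vector is predicted exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


def micro_f1(y_true: Sequence[Sequence[int]],
             y_pred: Sequence[Sequence[int]]) -> float:
    """F1 computed from TP/FP/FN pooled over every sample and label."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy data (made up for illustration): 3 wafers, 3 labels each.
    truth = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
    preds = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
    print(subset_accuracy(truth, preds), micro_f1(truth, preds))
```

Subset accuracy only credits a wafer when every label matches, so a single spurious label (as in the second toy sample) costs the whole sample, while Micro-F1 penalizes only that one label.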
| Component | Description |
|---|---|
| Backbone | MambaVision-Tiny (nvidia/MambaVision-T-1K), hybrid SSM + attention |
| Quantum Context Adapter | 4-qubit PennyLane circuit (AngleEmbedding → StronglyEntanglingLayers), inserted after Level 2 via forward hook |
| LoRA | Rank-64, α=128, applied to in_proj, out_proj, x_proj, dt_proj, fc1, fc2 |
| Classification Head | LayerNorm → Linear (640 → 8) |
| Loss | Focal Loss (γ=2) |
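For reference, the Focal Loss row above can be illustrated with a per-label computation. This is a hedged sketch in plain Python, not the training code from this repository: it assumes one independent sigmoid per output label and no α class-balancing term, and the helper names are hypothetical.

```python
import math
from typing import Sequence


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def binary_focal_loss(logit: float, target: int, gamma: float = 2.0) -> float:
    """Focal loss for one label: -(1 - p_t)^gamma * log(p_t)."""
    p = _sigmoid(logit)
    # p_t is the probability the model assigns to the correct outcome.
    p_t = p if target == 1 else 1.0 - p
    p_t = min(max(p_t, 1e-12), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def multilabel_focal_loss(logits: Sequence[Sequence[float]],
                          targets: Sequence[Sequence[int]],
                          gamma: float = 2.0) -> float:
    """Mean focal loss over every (sample, label) pair."""
    losses = [
        binary_focal_loss(z, t, gamma)
        for row_z, row_t in zip(logits, targets)
        for z, t in zip(row_z, row_t)
    ]
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    # A confident correct label contributes far less than a confident mistake.
    print(binary_focal_loss(4.0, 1), binary_focal_loss(-4.0, 1))
```

With γ=0 the expression reduces to ordinary binary cross-entropy; raising γ shrinks the contribution of labels the model already predicts confidently, which shifts the gradient toward hard or rare defect labels.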
Full architecture:
| ID | Layer | Type | Output | Status |
|---|---|---|---|---|
| 0 | base_model.model.model.patch_embed.proj | Identity | [1, 3, 384, 384] | Frozen |
| 1 | base_model.model.model.patch_embed.conv_down.0 | Conv2d | [1, 32, 192, 192] | Frozen |
| 2 | base_model.model.model.patch_embed.conv_down.1 | BatchNorm2d | [1, 32, 192, 192] | Frozen |
| 3 | base_model.model.model.patch_embed.conv_down.2 | ReLU | [1, 32, 192, 192] | Frozen |
| 4 | base_model.model.model.patch_embed.conv_down.3 | Conv2d | [1, 80, 96, 96] | Frozen |
| 5 | base_model.model.model.patch_embed.conv_down.4 | BatchNorm2d | [1, 80, 96, 96] | Frozen |
| 6 | base_model.model.model.patch_embed.conv_down.5 | ReLU | [1, 80, 96, 96] | Frozen |
| 7 | base_model.model.model.levels.0.blocks.0.conv1 | Conv2d | [1, 80, 96, 96] | Frozen |
| 8 | base_model.model.model.levels.0.blocks.0.norm1 | BatchNorm2d | [1, 80, 96, 96] | Frozen |
| 9 | base_model.model.model.levels.0.blocks.0.act1 | GELU | [1, 80, 96, 96] | Frozen |
| 10 | base_model.model.model.levels.0.blocks.0.conv2 | Conv2d | [1, 80, 96, 96] | Frozen |
| 11 | base_model.model.model.levels.0.blocks.0.norm2 | BatchNorm2d | [1, 80, 96, 96] | Frozen |
| 12 | base_model.model.model.levels.0.blocks.0.drop_path | Identity | [1, 80, 96, 96] | Frozen |
| 13 | base_model.model.model.levels.0.blocks.0 | ConvBlock | [1, 80, 96, 96] | Frozen |
| 14 | base_model.model.model.levels.0.downsample.reduction.0 | Conv2d | [1, 160, 48, 48] | Frozen |
| 15 | base_model.model.model.levels.0.downsample.reduction | Sequential | [1, 160, 48, 48] | Frozen |
| 16 | base_model.model.model.levels.0.downsample | Downsample | [1, 160, 48, 48] | Frozen |
| 17 | base_model.model.model.levels.0 | MambaVisionLayer | [1, 160, 48, 48] (+1 aux) | Frozen |
| 18 | base_model.model.model.levels.1.blocks.0.conv1 | Conv2d | [1, 160, 48, 48] | Frozen |
| 19 | base_model.model.model.levels.1.blocks.0.norm1 | BatchNorm2d | [1, 160, 48, 48] | Frozen |
| 20 | base_model.model.model.levels.1.blocks.0.act1 | GELU | [1, 160, 48, 48] | Frozen |
| 21 | base_model.model.model.levels.1.blocks.0.conv2 | Conv2d | [1, 160, 48, 48] | Frozen |
| 22 | base_model.model.model.levels.1.blocks.0.norm2 | BatchNorm2d | [1, 160, 48, 48] | Frozen |
| 23 | base_model.model.model.levels.1.blocks.0.drop_path | DropPath | [1, 160, 48, 48] | Frozen |
| 24 | base_model.model.model.levels.1.blocks.0 | ConvBlock | [1, 160, 48, 48] | Frozen |
| 25 | base_model.model.model.levels.1.blocks.1.conv1 | Conv2d | [1, 160, 48, 48] | Frozen |
| 26 | base_model.model.model.levels.1.blocks.1.norm1 | BatchNorm2d | [1, 160, 48, 48] | Frozen |
| 27 | base_model.model.model.levels.1.blocks.1.act1 | GELU | [1, 160, 48, 48] | Frozen |
| 28 | base_model.model.model.levels.1.blocks.1.conv2 | Conv2d | [1, 160, 48, 48] | Frozen |
| 29 | base_model.model.model.levels.1.blocks.1.norm2 | BatchNorm2d | [1, 160, 48, 48] | Frozen |
| 30 | base_model.model.model.levels.1.blocks.1.drop_path | DropPath | [1, 160, 48, 48] | Frozen |
| 31 | base_model.model.model.levels.1.blocks.1 | ConvBlock | [1, 160, 48, 48] | Frozen |
| 32 | base_model.model.model.levels.1.blocks.2.conv1 | Conv2d | [1, 160, 48, 48] | Frozen |
| 33 | base_model.model.model.levels.1.blocks.2.norm1 | BatchNorm2d | [1, 160, 48, 48] | Frozen |
| 34 | base_model.model.model.levels.1.blocks.2.act1 | GELU | [1, 160, 48, 48] | Frozen |
| 35 | base_model.model.model.levels.1.blocks.2.conv2 | Conv2d | [1, 160, 48, 48] | Frozen |
| 36 | base_model.model.model.levels.1.blocks.2.norm2 | BatchNorm2d | [1, 160, 48, 48] | Frozen |
| 37 | base_model.model.model.levels.1.blocks.2.drop_path | DropPath | [1, 160, 48, 48] | Frozen |
| 38 | base_model.model.model.levels.1.blocks.2 | ConvBlock | [1, 160, 48, 48] | Frozen |
| 39 | base_model.model.model.levels.1.downsample.reduction.0 | Conv2d | [1, 320, 24, 24] | Frozen |
| 40 | base_model.model.model.levels.1.downsample.reduction | Sequential | [1, 320, 24, 24] | Frozen |
| 41 | base_model.model.model.levels.1.downsample | Downsample | [1, 320, 24, 24] | Frozen |
| 42 | base_model.model.model.levels.1 | MambaVisionLayer | [1, 320, 24, 24] (+1 aux) | Frozen |
| 43 | base_model.model.model.levels.2.blocks.0.norm1 | LayerNorm | [4, 196, 320] | Frozen |
| 44 | base_model.model.model.levels.2.blocks.0.mixer.in_proj.base_layer | Linear | [4, 196, 320] | Frozen |
| 45 | base_model.model.model.levels.2.blocks.0.mixer.in_proj.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 46 | base_model.model.model.levels.2.blocks.0.mixer.in_proj.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 47 | base_model.model.model.levels.2.blocks.0.mixer.in_proj.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 48 | base_model.model.model.levels.2.blocks.0.mixer.in_proj | Linear | [4, 196, 320] | Frozen |
| 49 | base_model.model.model.levels.2.blocks.0.mixer.x_proj.base_layer | Linear | [784, 36] | Frozen |
| 50 | base_model.model.model.levels.2.blocks.0.mixer.x_proj.lora_dropout.default | Identity | [784, 160] | Frozen |
| 51 | base_model.model.model.levels.2.blocks.0.mixer.x_proj.lora_A.default | Linear | [784, 64] | Trainable |
| 52 | base_model.model.model.levels.2.blocks.0.mixer.x_proj.lora_B.default | Linear | [784, 36] | Trainable |
| 53 | base_model.model.model.levels.2.blocks.0.mixer.x_proj | Linear | [784, 36] | Frozen |
| 54 | base_model.model.model.levels.2.blocks.0.mixer.dt_proj.base_layer | Linear | [784, 160] | Frozen |
| 55 | base_model.model.model.levels.2.blocks.0.mixer.dt_proj.lora_dropout.default | Identity | [784, 20] | Frozen |
| 56 | base_model.model.model.levels.2.blocks.0.mixer.dt_proj.lora_A.default | Linear | [784, 64] | Trainable |
| 57 | base_model.model.model.levels.2.blocks.0.mixer.dt_proj.lora_B.default | Linear | [784, 160] | Trainable |
| 58 | base_model.model.model.levels.2.blocks.0.mixer.dt_proj | Linear | [784, 160] | Frozen |
| 59 | base_model.model.model.levels.2.blocks.0.mixer.out_proj.base_layer | Linear | [4, 196, 320] | Frozen |
| 60 | base_model.model.model.levels.2.blocks.0.mixer.out_proj.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 61 | base_model.model.model.levels.2.blocks.0.mixer.out_proj.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 62 | base_model.model.model.levels.2.blocks.0.mixer.out_proj.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 63 | base_model.model.model.levels.2.blocks.0.mixer.out_proj | Linear | [4, 196, 320] | Frozen |
| 64 | base_model.model.model.levels.2.blocks.0.mixer | MambaVisionMixer | [4, 196, 320] | Frozen |
| 65 | base_model.model.model.levels.2.blocks.0.drop_path | DropPath | [4, 196, 320] | Frozen |
| 66 | base_model.model.model.levels.2.blocks.0.norm2 | LayerNorm | [4, 196, 320] | Frozen |
| 67 | base_model.model.model.levels.2.blocks.0.mlp.fc1.base_layer | Linear | [4, 196, 1280] | Frozen |
| 68 | base_model.model.model.levels.2.blocks.0.mlp.fc1.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 69 | base_model.model.model.levels.2.blocks.0.mlp.fc1.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 70 | base_model.model.model.levels.2.blocks.0.mlp.fc1.lora_B.default | Linear | [4, 196, 1280] | Trainable |
| 71 | base_model.model.model.levels.2.blocks.0.mlp.fc1 | Linear | [4, 196, 1280] | Frozen |
| 72 | base_model.model.model.levels.2.blocks.0.mlp.act | GELU | [4, 196, 1280] | Frozen |
| 73 | base_model.model.model.levels.2.blocks.0.mlp.drop1 | Dropout | [4, 196, 1280] | Frozen |
| 74 | base_model.model.model.levels.2.blocks.0.mlp.norm | Identity | [4, 196, 1280] | Frozen |
| 75 | base_model.model.model.levels.2.blocks.0.mlp.fc2.base_layer | Linear | [4, 196, 320] | Frozen |
| 76 | base_model.model.model.levels.2.blocks.0.mlp.fc2.lora_dropout.default | Identity | [4, 196, 1280] | Frozen |
| 77 | base_model.model.model.levels.2.blocks.0.mlp.fc2.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 78 | base_model.model.model.levels.2.blocks.0.mlp.fc2.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 79 | base_model.model.model.levels.2.blocks.0.mlp.fc2 | Linear | [4, 196, 320] | Frozen |
| 80 | base_model.model.model.levels.2.blocks.0.mlp.drop2 | Dropout | [4, 196, 320] | Frozen |
| 81 | base_model.model.model.levels.2.blocks.0.mlp | Mlp | [4, 196, 320] | Frozen |
| 82 | base_model.model.model.levels.2.blocks.0.drop_path | DropPath | [4, 196, 320] | Frozen |
| 83 | base_model.model.model.levels.2.blocks.0 | Block | [4, 196, 320] | Frozen |
| 84 | base_model.model.model.levels.2.blocks.1.norm1 | LayerNorm | [4, 196, 320] | Frozen |
| 85 | base_model.model.model.levels.2.blocks.1.mixer.in_proj.base_layer | Linear | [4, 196, 320] | Frozen |
| 86 | base_model.model.model.levels.2.blocks.1.mixer.in_proj.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 87 | base_model.model.model.levels.2.blocks.1.mixer.in_proj.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 88 | base_model.model.model.levels.2.blocks.1.mixer.in_proj.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 89 | base_model.model.model.levels.2.blocks.1.mixer.in_proj | Linear | [4, 196, 320] | Frozen |
| 90 | base_model.model.model.levels.2.blocks.1.mixer.x_proj.base_layer | Linear | [784, 36] | Frozen |
| 91 | base_model.model.model.levels.2.blocks.1.mixer.x_proj.lora_dropout.default | Identity | [784, 160] | Frozen |
| 92 | base_model.model.model.levels.2.blocks.1.mixer.x_proj.lora_A.default | Linear | [784, 64] | Trainable |
| 93 | base_model.model.model.levels.2.blocks.1.mixer.x_proj.lora_B.default | Linear | [784, 36] | Trainable |
| 94 | base_model.model.model.levels.2.blocks.1.mixer.x_proj | Linear | [784, 36] | Frozen |
| 95 | base_model.model.model.levels.2.blocks.1.mixer.dt_proj.base_layer | Linear | [784, 160] | Frozen |
| 96 | base_model.model.model.levels.2.blocks.1.mixer.dt_proj.lora_dropout.default | Identity | [784, 20] | Frozen |
| 97 | base_model.model.model.levels.2.blocks.1.mixer.dt_proj.lora_A.default | Linear | [784, 64] | Trainable |
| 98 | base_model.model.model.levels.2.blocks.1.mixer.dt_proj.lora_B.default | Linear | [784, 160] | Trainable |
| 99 | base_model.model.model.levels.2.blocks.1.mixer.dt_proj | Linear | [784, 160] | Frozen |
| 100 | base_model.model.model.levels.2.blocks.1.mixer.out_proj.base_layer | Linear | [4, 196, 320] | Frozen |
| 101 | base_model.model.model.levels.2.blocks.1.mixer.out_proj.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 102 | base_model.model.model.levels.2.blocks.1.mixer.out_proj.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 103 | base_model.model.model.levels.2.blocks.1.mixer.out_proj.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 104 | base_model.model.model.levels.2.blocks.1.mixer.out_proj | Linear | [4, 196, 320] | Frozen |
| 105 | base_model.model.model.levels.2.blocks.1.mixer | MambaVisionMixer | [4, 196, 320] | Frozen |
| 106 | base_model.model.model.levels.2.blocks.1.drop_path | DropPath | [4, 196, 320] | Frozen |
| 107 | base_model.model.model.levels.2.blocks.1.norm2 | LayerNorm | [4, 196, 320] | Frozen |
| 108 | base_model.model.model.levels.2.blocks.1.mlp.fc1.base_layer | Linear | [4, 196, 1280] | Frozen |
| 109 | base_model.model.model.levels.2.blocks.1.mlp.fc1.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 110 | base_model.model.model.levels.2.blocks.1.mlp.fc1.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 111 | base_model.model.model.levels.2.blocks.1.mlp.fc1.lora_B.default | Linear | [4, 196, 1280] | Trainable |
| 112 | base_model.model.model.levels.2.blocks.1.mlp.fc1 | Linear | [4, 196, 1280] | Frozen |
| 113 | base_model.model.model.levels.2.blocks.1.mlp.act | GELU | [4, 196, 1280] | Frozen |
| 114 | base_model.model.model.levels.2.blocks.1.mlp.drop1 | Dropout | [4, 196, 1280] | Frozen |
| 115 | base_model.model.model.levels.2.blocks.1.mlp.norm | Identity | [4, 196, 1280] | Frozen |
| 116 | base_model.model.model.levels.2.blocks.1.mlp.fc2.base_layer | Linear | [4, 196, 320] | Frozen |
| 117 | base_model.model.model.levels.2.blocks.1.mlp.fc2.lora_dropout.default | Identity | [4, 196, 1280] | Frozen |
| 118 | base_model.model.model.levels.2.blocks.1.mlp.fc2.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 119 | base_model.model.model.levels.2.blocks.1.mlp.fc2.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 120 | base_model.model.model.levels.2.blocks.1.mlp.fc2 | Linear | [4, 196, 320] | Frozen |
| 121 | base_model.model.model.levels.2.blocks.1.mlp.drop2 | Dropout | [4, 196, 320] | Frozen |
| 122 | base_model.model.model.levels.2.blocks.1.mlp | Mlp | [4, 196, 320] | Frozen |
| 123 | base_model.model.model.levels.2.blocks.1.drop_path | DropPath | [4, 196, 320] | Frozen |
| 124 | base_model.model.model.levels.2.blocks.1 | Block | [4, 196, 320] | Frozen |
| 125 | base_model.model.model.levels.2.blocks.2.norm1 | LayerNorm | [4, 196, 320] | Frozen |
| 126 | base_model.model.model.levels.2.blocks.2.mixer.in_proj.base_layer | Linear | [4, 196, 320] | Frozen |
| 127 | base_model.model.model.levels.2.blocks.2.mixer.in_proj.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 128 | base_model.model.model.levels.2.blocks.2.mixer.in_proj.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 129 | base_model.model.model.levels.2.blocks.2.mixer.in_proj.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 130 | base_model.model.model.levels.2.blocks.2.mixer.in_proj | Linear | [4, 196, 320] | Frozen |
| 131 | base_model.model.model.levels.2.blocks.2.mixer.x_proj.base_layer | Linear | [784, 36] | Frozen |
| 132 | base_model.model.model.levels.2.blocks.2.mixer.x_proj.lora_dropout.default | Identity | [784, 160] | Frozen |
| 133 | base_model.model.model.levels.2.blocks.2.mixer.x_proj.lora_A.default | Linear | [784, 64] | Trainable |
| 134 | base_model.model.model.levels.2.blocks.2.mixer.x_proj.lora_B.default | Linear | [784, 36] | Trainable |
| 135 | base_model.model.model.levels.2.blocks.2.mixer.x_proj | Linear | [784, 36] | Frozen |
| 136 | base_model.model.model.levels.2.blocks.2.mixer.dt_proj.base_layer | Linear | [784, 160] | Frozen |
| 137 | base_model.model.model.levels.2.blocks.2.mixer.dt_proj.lora_dropout.default | Identity | [784, 20] | Frozen |
| 138 | base_model.model.model.levels.2.blocks.2.mixer.dt_proj.lora_A.default | Linear | [784, 64] | Trainable |
| 139 | base_model.model.model.levels.2.blocks.2.mixer.dt_proj.lora_B.default | Linear | [784, 160] | Trainable |
| 140 | base_model.model.model.levels.2.blocks.2.mixer.dt_proj | Linear | [784, 160] | Frozen |
| 141 | base_model.model.model.levels.2.blocks.2.mixer.out_proj.base_layer | Linear | [4, 196, 320] | Frozen |
| 142 | base_model.model.model.levels.2.blocks.2.mixer.out_proj.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 143 | base_model.model.model.levels.2.blocks.2.mixer.out_proj.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 144 | base_model.model.model.levels.2.blocks.2.mixer.out_proj.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 145 | base_model.model.model.levels.2.blocks.2.mixer.out_proj | Linear | [4, 196, 320] | Frozen |
| 146 | base_model.model.model.levels.2.blocks.2.mixer | MambaVisionMixer | [4, 196, 320] | Frozen |
| 147 | base_model.model.model.levels.2.blocks.2.drop_path | DropPath | [4, 196, 320] | Frozen |
| 148 | base_model.model.model.levels.2.blocks.2.norm2 | LayerNorm | [4, 196, 320] | Frozen |
| 149 | base_model.model.model.levels.2.blocks.2.mlp.fc1.base_layer | Linear | [4, 196, 1280] | Frozen |
| 150 | base_model.model.model.levels.2.blocks.2.mlp.fc1.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 151 | base_model.model.model.levels.2.blocks.2.mlp.fc1.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 152 | base_model.model.model.levels.2.blocks.2.mlp.fc1.lora_B.default | Linear | [4, 196, 1280] | Trainable |
| 153 | base_model.model.model.levels.2.blocks.2.mlp.fc1 | Linear | [4, 196, 1280] | Frozen |
| 154 | base_model.model.model.levels.2.blocks.2.mlp.act | GELU | [4, 196, 1280] | Frozen |
| 155 | base_model.model.model.levels.2.blocks.2.mlp.drop1 | Dropout | [4, 196, 1280] | Frozen |
| 156 | base_model.model.model.levels.2.blocks.2.mlp.norm | Identity | [4, 196, 1280] | Frozen |
| 157 | base_model.model.model.levels.2.blocks.2.mlp.fc2.base_layer | Linear | [4, 196, 320] | Frozen |
| 158 | base_model.model.model.levels.2.blocks.2.mlp.fc2.lora_dropout.default | Identity | [4, 196, 1280] | Frozen |
| 159 | base_model.model.model.levels.2.blocks.2.mlp.fc2.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 160 | base_model.model.model.levels.2.blocks.2.mlp.fc2.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 161 | base_model.model.model.levels.2.blocks.2.mlp.fc2 | Linear | [4, 196, 320] | Frozen |
| 162 | base_model.model.model.levels.2.blocks.2.mlp.drop2 | Dropout | [4, 196, 320] | Frozen |
| 163 | base_model.model.model.levels.2.blocks.2.mlp | Mlp | [4, 196, 320] | Frozen |
| 164 | base_model.model.model.levels.2.blocks.2.drop_path | DropPath | [4, 196, 320] | Frozen |
| 165 | base_model.model.model.levels.2.blocks.2 | Block | [4, 196, 320] | Frozen |
| 166 | base_model.model.model.levels.2.blocks.3.norm1 | LayerNorm | [4, 196, 320] | Frozen |
| 167 | base_model.model.model.levels.2.blocks.3.mixer.in_proj.base_layer | Linear | [4, 196, 320] | Frozen |
| 168 | base_model.model.model.levels.2.blocks.3.mixer.in_proj.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 169 | base_model.model.model.levels.2.blocks.3.mixer.in_proj.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 170 | base_model.model.model.levels.2.blocks.3.mixer.in_proj.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 171 | base_model.model.model.levels.2.blocks.3.mixer.in_proj | Linear | [4, 196, 320] | Frozen |
| 172 | base_model.model.model.levels.2.blocks.3.mixer.x_proj.base_layer | Linear | [784, 36] | Frozen |
| 173 | base_model.model.model.levels.2.blocks.3.mixer.x_proj.lora_dropout.default | Identity | [784, 160] | Frozen |
| 174 | base_model.model.model.levels.2.blocks.3.mixer.x_proj.lora_A.default | Linear | [784, 64] | Trainable |
| 175 | base_model.model.model.levels.2.blocks.3.mixer.x_proj.lora_B.default | Linear | [784, 36] | Trainable |
| 176 | base_model.model.model.levels.2.blocks.3.mixer.x_proj | Linear | [784, 36] | Frozen |
| 177 | base_model.model.model.levels.2.blocks.3.mixer.dt_proj.base_layer | Linear | [784, 160] | Frozen |
| 178 | base_model.model.model.levels.2.blocks.3.mixer.dt_proj.lora_dropout.default | Identity | [784, 20] | Frozen |
| 179 | base_model.model.model.levels.2.blocks.3.mixer.dt_proj.lora_A.default | Linear | [784, 64] | Trainable |
| 180 | base_model.model.model.levels.2.blocks.3.mixer.dt_proj.lora_B.default | Linear | [784, 160] | Trainable |
| 181 | base_model.model.model.levels.2.blocks.3.mixer.dt_proj | Linear | [784, 160] | Frozen |
| 182 | base_model.model.model.levels.2.blocks.3.mixer.out_proj.base_layer | Linear | [4, 196, 320] | Frozen |
| 183 | base_model.model.model.levels.2.blocks.3.mixer.out_proj.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 184 | base_model.model.model.levels.2.blocks.3.mixer.out_proj.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 185 | base_model.model.model.levels.2.blocks.3.mixer.out_proj.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 186 | base_model.model.model.levels.2.blocks.3.mixer.out_proj | Linear | [4, 196, 320] | Frozen |
| 187 | base_model.model.model.levels.2.blocks.3.mixer | MambaVisionMixer | [4, 196, 320] | Frozen |
| 188 | base_model.model.model.levels.2.blocks.3.drop_path | DropPath | [4, 196, 320] | Frozen |
| 189 | base_model.model.model.levels.2.blocks.3.norm2 | LayerNorm | [4, 196, 320] | Frozen |
| 190 | base_model.model.model.levels.2.blocks.3.mlp.fc1.base_layer | Linear | [4, 196, 1280] | Frozen |
| 191 | base_model.model.model.levels.2.blocks.3.mlp.fc1.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 192 | base_model.model.model.levels.2.blocks.3.mlp.fc1.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 193 | base_model.model.model.levels.2.blocks.3.mlp.fc1.lora_B.default | Linear | [4, 196, 1280] | Trainable |
| 194 | base_model.model.model.levels.2.blocks.3.mlp.fc1 | Linear | [4, 196, 1280] | Frozen |
| 195 | base_model.model.model.levels.2.blocks.3.mlp.act | GELU | [4, 196, 1280] | Frozen |
| 196 | base_model.model.model.levels.2.blocks.3.mlp.drop1 | Dropout | [4, 196, 1280] | Frozen |
| 197 | base_model.model.model.levels.2.blocks.3.mlp.norm | Identity | [4, 196, 1280] | Frozen |
| 198 | base_model.model.model.levels.2.blocks.3.mlp.fc2.base_layer | Linear | [4, 196, 320] | Frozen |
| 199 | base_model.model.model.levels.2.blocks.3.mlp.fc2.lora_dropout.default | Identity | [4, 196, 1280] | Frozen |
| 200 | base_model.model.model.levels.2.blocks.3.mlp.fc2.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 201 | base_model.model.model.levels.2.blocks.3.mlp.fc2.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 202 | base_model.model.model.levels.2.blocks.3.mlp.fc2 | Linear | [4, 196, 320] | Frozen |
| 203 | base_model.model.model.levels.2.blocks.3.mlp.drop2 | Dropout | [4, 196, 320] | Frozen |
| 204 | base_model.model.model.levels.2.blocks.3.mlp | Mlp | [4, 196, 320] | Frozen |
| 205 | base_model.model.model.levels.2.blocks.3.drop_path | DropPath | [4, 196, 320] | Frozen |
| 206 | base_model.model.model.levels.2.blocks.3 | Block | [4, 196, 320] | Frozen |
| 207 | base_model.model.model.levels.2.blocks.4.norm1 | LayerNorm | [4, 196, 320] | Frozen |
| 208 | base_model.model.model.levels.2.blocks.4.mixer.qkv | Linear | [4, 196, 960] | Frozen |
| 209 | base_model.model.model.levels.2.blocks.4.mixer.q_norm | Identity | [4, 8, 196, 40] | Frozen |
| 210 | base_model.model.model.levels.2.blocks.4.mixer.k_norm | Identity | [4, 8, 196, 40] | Frozen |
| 211 | base_model.model.model.levels.2.blocks.4.mixer.proj | Linear | [4, 196, 320] | Frozen |
| 212 | base_model.model.model.levels.2.blocks.4.mixer.proj_drop | Dropout | [4, 196, 320] | Frozen |
| 213 | base_model.model.model.levels.2.blocks.4.mixer | Attention | [4, 196, 320] | Frozen |
| 214 | base_model.model.model.levels.2.blocks.4.drop_path | DropPath | [4, 196, 320] | Frozen |
| 215 | base_model.model.model.levels.2.blocks.4.norm2 | LayerNorm | [4, 196, 320] | Frozen |
| 216 | base_model.model.model.levels.2.blocks.4.mlp.fc1.base_layer | Linear | [4, 196, 1280] | Frozen |
| 217 | base_model.model.model.levels.2.blocks.4.mlp.fc1.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 218 | base_model.model.model.levels.2.blocks.4.mlp.fc1.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 219 | base_model.model.model.levels.2.blocks.4.mlp.fc1.lora_B.default | Linear | [4, 196, 1280] | Trainable |
| 220 | base_model.model.model.levels.2.blocks.4.mlp.fc1 | Linear | [4, 196, 1280] | Frozen |
| 221 | base_model.model.model.levels.2.blocks.4.mlp.act | GELU | [4, 196, 1280] | Frozen |
| 222 | base_model.model.model.levels.2.blocks.4.mlp.drop1 | Dropout | [4, 196, 1280] | Frozen |
| 223 | base_model.model.model.levels.2.blocks.4.mlp.norm | Identity | [4, 196, 1280] | Frozen |
| 224 | base_model.model.model.levels.2.blocks.4.mlp.fc2.base_layer | Linear | [4, 196, 320] | Frozen |
| 225 | base_model.model.model.levels.2.blocks.4.mlp.fc2.lora_dropout.default | Identity | [4, 196, 1280] | Frozen |
| 226 | base_model.model.model.levels.2.blocks.4.mlp.fc2.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 227 | base_model.model.model.levels.2.blocks.4.mlp.fc2.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 228 | base_model.model.model.levels.2.blocks.4.mlp.fc2 | Linear | [4, 196, 320] | Frozen |
| 229 | base_model.model.model.levels.2.blocks.4.mlp.drop2 | Dropout | [4, 196, 320] | Frozen |
| 230 | base_model.model.model.levels.2.blocks.4.mlp | Mlp | [4, 196, 320] | Frozen |
| 231 | base_model.model.model.levels.2.blocks.4.drop_path | DropPath | [4, 196, 320] | Frozen |
| 232 | base_model.model.model.levels.2.blocks.4 | Block | [4, 196, 320] | Frozen |
| 233 | base_model.model.model.levels.2.blocks.5.norm1 | LayerNorm | [4, 196, 320] | Frozen |
| 234 | base_model.model.model.levels.2.blocks.5.mixer.qkv | Linear | [4, 196, 960] | Frozen |
| 235 | base_model.model.model.levels.2.blocks.5.mixer.q_norm | Identity | [4, 8, 196, 40] | Frozen |
| 236 | base_model.model.model.levels.2.blocks.5.mixer.k_norm | Identity | [4, 8, 196, 40] | Frozen |
| 237 | base_model.model.model.levels.2.blocks.5.mixer.proj | Linear | [4, 196, 320] | Frozen |
| 238 | base_model.model.model.levels.2.blocks.5.mixer.proj_drop | Dropout | [4, 196, 320] | Frozen |
| 239 | base_model.model.model.levels.2.blocks.5.mixer | Attention | [4, 196, 320] | Frozen |
| 240 | base_model.model.model.levels.2.blocks.5.drop_path | DropPath | [4, 196, 320] | Frozen |
| 241 | base_model.model.model.levels.2.blocks.5.norm2 | LayerNorm | [4, 196, 320] | Frozen |
| 242 | base_model.model.model.levels.2.blocks.5.mlp.fc1.base_layer | Linear | [4, 196, 1280] | Frozen |
| 243 | base_model.model.model.levels.2.blocks.5.mlp.fc1.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 244 | base_model.model.model.levels.2.blocks.5.mlp.fc1.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 245 | base_model.model.model.levels.2.blocks.5.mlp.fc1.lora_B.default | Linear | [4, 196, 1280] | Trainable |
| 246 | base_model.model.model.levels.2.blocks.5.mlp.fc1 | Linear | [4, 196, 1280] | Frozen |
| 247 | base_model.model.model.levels.2.blocks.5.mlp.act | GELU | [4, 196, 1280] | Frozen |
| 248 | base_model.model.model.levels.2.blocks.5.mlp.drop1 | Dropout | [4, 196, 1280] | Frozen |
| 249 | base_model.model.model.levels.2.blocks.5.mlp.norm | Identity | [4, 196, 1280] | Frozen |
| 250 | base_model.model.model.levels.2.blocks.5.mlp.fc2.base_layer | Linear | [4, 196, 320] | Frozen |
| 251 | base_model.model.model.levels.2.blocks.5.mlp.fc2.lora_dropout.default | Identity | [4, 196, 1280] | Frozen |
| 252 | base_model.model.model.levels.2.blocks.5.mlp.fc2.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 253 | base_model.model.model.levels.2.blocks.5.mlp.fc2.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 254 | base_model.model.model.levels.2.blocks.5.mlp.fc2 | Linear | [4, 196, 320] | Frozen |
| 255 | base_model.model.model.levels.2.blocks.5.mlp.drop2 | Dropout | [4, 196, 320] | Frozen |
| 256 | base_model.model.model.levels.2.blocks.5.mlp | Mlp | [4, 196, 320] | Frozen |
| 257 | base_model.model.model.levels.2.blocks.5.drop_path | DropPath | [4, 196, 320] | Frozen |
| 258 | base_model.model.model.levels.2.blocks.5 | Block | [4, 196, 320] | Frozen |
| 259 | base_model.model.model.levels.2.blocks.6.norm1 | LayerNorm | [4, 196, 320] | Frozen |
| 260 | base_model.model.model.levels.2.blocks.6.mixer.qkv | Linear | [4, 196, 960] | Frozen |
| 261 | base_model.model.model.levels.2.blocks.6.mixer.q_norm | Identity | [4, 8, 196, 40] | Frozen |
| 262 | base_model.model.model.levels.2.blocks.6.mixer.k_norm | Identity | [4, 8, 196, 40] | Frozen |
| 263 | base_model.model.model.levels.2.blocks.6.mixer.proj | Linear | [4, 196, 320] | Frozen |
| 264 | base_model.model.model.levels.2.blocks.6.mixer.proj_drop | Dropout | [4, 196, 320] | Frozen |
| 265 | base_model.model.model.levels.2.blocks.6.mixer | Attention | [4, 196, 320] | Frozen |
| 266 | base_model.model.model.levels.2.blocks.6.drop_path | DropPath | [4, 196, 320] | Frozen |
| 267 | base_model.model.model.levels.2.blocks.6.norm2 | LayerNorm | [4, 196, 320] | Frozen |
| 268 | base_model.model.model.levels.2.blocks.6.mlp.fc1.base_layer | Linear | [4, 196, 1280] | Frozen |
| 269 | base_model.model.model.levels.2.blocks.6.mlp.fc1.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 270 | base_model.model.model.levels.2.blocks.6.mlp.fc1.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 271 | base_model.model.model.levels.2.blocks.6.mlp.fc1.lora_B.default | Linear | [4, 196, 1280] | Trainable |
| 272 | base_model.model.model.levels.2.blocks.6.mlp.fc1 | Linear | [4, 196, 1280] | Frozen |
| 273 | base_model.model.model.levels.2.blocks.6.mlp.act | GELU | [4, 196, 1280] | Frozen |
| 274 | base_model.model.model.levels.2.blocks.6.mlp.drop1 | Dropout | [4, 196, 1280] | Frozen |
| 275 | base_model.model.model.levels.2.blocks.6.mlp.norm | Identity | [4, 196, 1280] | Frozen |
| 276 | base_model.model.model.levels.2.blocks.6.mlp.fc2.base_layer | Linear | [4, 196, 320] | Frozen |
| 277 | base_model.model.model.levels.2.blocks.6.mlp.fc2.lora_dropout.default | Identity | [4, 196, 1280] | Frozen |
| 278 | base_model.model.model.levels.2.blocks.6.mlp.fc2.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 279 | base_model.model.model.levels.2.blocks.6.mlp.fc2.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 280 | base_model.model.model.levels.2.blocks.6.mlp.fc2 | Linear | [4, 196, 320] | Frozen |
| 281 | base_model.model.model.levels.2.blocks.6.mlp.drop2 | Dropout | [4, 196, 320] | Frozen |
| 282 | base_model.model.model.levels.2.blocks.6.mlp | Mlp | [4, 196, 320] | Frozen |
| 283 | base_model.model.model.levels.2.blocks.6.drop_path | DropPath | [4, 196, 320] | Frozen |
| 284 | base_model.model.model.levels.2.blocks.6 | Block | [4, 196, 320] | Frozen |
| 285 | base_model.model.model.levels.2.blocks.7.norm1 | LayerNorm | [4, 196, 320] | Frozen |
| 286 | base_model.model.model.levels.2.blocks.7.mixer.qkv | Linear | [4, 196, 960] | Frozen |
| 287 | base_model.model.model.levels.2.blocks.7.mixer.q_norm | Identity | [4, 8, 196, 40] | Frozen |
| 288 | base_model.model.model.levels.2.blocks.7.mixer.k_norm | Identity | [4, 8, 196, 40] | Frozen |
| 289 | base_model.model.model.levels.2.blocks.7.mixer.proj | Linear | [4, 196, 320] | Frozen |
| 290 | base_model.model.model.levels.2.blocks.7.mixer.proj_drop | Dropout | [4, 196, 320] | Frozen |
| 291 | base_model.model.model.levels.2.blocks.7.mixer | Attention | [4, 196, 320] | Frozen |
| 292 | base_model.model.model.levels.2.blocks.7.drop_path | DropPath | [4, 196, 320] | Frozen |
| 293 | base_model.model.model.levels.2.blocks.7.norm2 | LayerNorm | [4, 196, 320] | Frozen |
| 294 | base_model.model.model.levels.2.blocks.7.mlp.fc1.base_layer | Linear | [4, 196, 1280] | Frozen |
| 295 | base_model.model.model.levels.2.blocks.7.mlp.fc1.lora_dropout.default | Identity | [4, 196, 320] | Frozen |
| 296 | base_model.model.model.levels.2.blocks.7.mlp.fc1.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 297 | base_model.model.model.levels.2.blocks.7.mlp.fc1.lora_B.default | Linear | [4, 196, 1280] | Trainable |
| 298 | base_model.model.model.levels.2.blocks.7.mlp.fc1 | Linear | [4, 196, 1280] | Frozen |
| 299 | base_model.model.model.levels.2.blocks.7.mlp.act | GELU | [4, 196, 1280] | Frozen |
| 300 | base_model.model.model.levels.2.blocks.7.mlp.drop1 | Dropout | [4, 196, 1280] | Frozen |
| 301 | base_model.model.model.levels.2.blocks.7.mlp.norm | Identity | [4, 196, 1280] | Frozen |
| 302 | base_model.model.model.levels.2.blocks.7.mlp.fc2.base_layer | Linear | [4, 196, 320] | Frozen |
| 303 | base_model.model.model.levels.2.blocks.7.mlp.fc2.lora_dropout.default | Identity | [4, 196, 1280] | Frozen |
| 304 | base_model.model.model.levels.2.blocks.7.mlp.fc2.lora_A.default | Linear | [4, 196, 64] | Trainable |
| 305 | base_model.model.model.levels.2.blocks.7.mlp.fc2.lora_B.default | Linear | [4, 196, 320] | Trainable |
| 306 | base_model.model.model.levels.2.blocks.7.mlp.fc2 | Linear | [4, 196, 320] | Frozen |
| 307 | base_model.model.model.levels.2.blocks.7.mlp.drop2 | Dropout | [4, 196, 320] | Frozen |
| 308 | base_model.model.model.levels.2.blocks.7.mlp | Mlp | [4, 196, 320] | Frozen |
| 309 | base_model.model.model.levels.2.blocks.7.drop_path | DropPath | [4, 196, 320] | Frozen |
| 310 | base_model.model.model.levels.2.blocks.7 | Block | [4, 196, 320] | Frozen |
| 311 | base_model.model.model.levels.2.downsample.reduction.0 | Conv2d | [1, 640, 12, 12] | Frozen |
| 312 | base_model.model.model.levels.2.downsample.reduction | Sequential | [1, 640, 12, 12] | Frozen |
| 313 | base_model.model.model.levels.2.downsample | Downsample | [1, 640, 12, 12] | π Frozen |
| 314 | base_model.model.quantum_adapter.reducer | Linear | [1, 4] | π’ Trainable |
| 315 | base_model.model.quantum_adapter.q_layer | TorchLayer | [1, 4] | π’ Trainable |
| 316 | base_model.model.quantum_adapter.expander | Linear | [1, 640] | π’ Trainable |
| 317 | base_model.model.quantum_adapter | QuantumContextAdapter | [1, 640, 12, 12] | π’ Trainable |
| 318 | base_model.model.model.levels.2 | MambaVisionLayer | [1, 640, 12, 12] (+1 aux) | π Frozen |
| 319 | base_model.model.model.levels.3.blocks.0.norm1 | LayerNorm | [4, 49, 640] | π Frozen |
| 320 | base_model.model.model.levels.3.blocks.0.mixer.in_proj.base_layer | Linear | [4, 49, 640] | π Frozen |
| 321 | base_model.model.model.levels.3.blocks.0.mixer.in_proj.lora_dropout.default | Identity | [4, 49, 640] | π Frozen |
| 322 | base_model.model.model.levels.3.blocks.0.mixer.in_proj.lora_A.default | Linear | [4, 49, 64] | π’ Trainable |
| 323 | base_model.model.model.levels.3.blocks.0.mixer.in_proj.lora_B.default | Linear | [4, 49, 640] | π’ Trainable |
| 324 | base_model.model.model.levels.3.blocks.0.mixer.in_proj | Linear | [4, 49, 640] | π Frozen |
| 325 | base_model.model.model.levels.3.blocks.0.mixer.x_proj.base_layer | Linear | [196, 56] | π Frozen |
| 326 | base_model.model.model.levels.3.blocks.0.mixer.x_proj.lora_dropout.default | Identity | [196, 320] | π Frozen |
| 327 | base_model.model.model.levels.3.blocks.0.mixer.x_proj.lora_A.default | Linear | [196, 64] | π’ Trainable |
| 328 | base_model.model.model.levels.3.blocks.0.mixer.x_proj.lora_B.default | Linear | [196, 56] | π’ Trainable |
| 329 | base_model.model.model.levels.3.blocks.0.mixer.x_proj | Linear | [196, 56] | π Frozen |
| 330 | base_model.model.model.levels.3.blocks.0.mixer.dt_proj.base_layer | Linear | [196, 320] | π Frozen |
| 331 | base_model.model.model.levels.3.blocks.0.mixer.dt_proj.lora_dropout.default | Identity | [196, 40] | π Frozen |
| 332 | base_model.model.model.levels.3.blocks.0.mixer.dt_proj.lora_A.default | Linear | [196, 64] | π’ Trainable |
| 333 | base_model.model.model.levels.3.blocks.0.mixer.dt_proj.lora_B.default | Linear | [196, 320] | π’ Trainable |
| 334 | base_model.model.model.levels.3.blocks.0.mixer.dt_proj | Linear | [196, 320] | π Frozen |
| 335 | base_model.model.model.levels.3.blocks.0.mixer.out_proj.base_layer | Linear | [4, 49, 640] | π Frozen |
| 336 | base_model.model.model.levels.3.blocks.0.mixer.out_proj.lora_dropout.default | Identity | [4, 49, 640] | π Frozen |
| 337 | base_model.model.model.levels.3.blocks.0.mixer.out_proj.lora_A.default | Linear | [4, 49, 64] | π’ Trainable |
| 338 | base_model.model.model.levels.3.blocks.0.mixer.out_proj.lora_B.default | Linear | [4, 49, 640] | π’ Trainable |
| 339 | base_model.model.model.levels.3.blocks.0.mixer.out_proj | Linear | [4, 49, 640] | π Frozen |
| 340 | base_model.model.model.levels.3.blocks.0.mixer | MambaVisionMixer | [4, 49, 640] | π Frozen |
| 341 | base_model.model.model.levels.3.blocks.0.drop_path | DropPath | [4, 49, 640] | π Frozen |
| 342 | base_model.model.model.levels.3.blocks.0.norm2 | LayerNorm | [4, 49, 640] | π Frozen |
| 343 | base_model.model.model.levels.3.blocks.0.mlp.fc1.base_layer | Linear | [4, 49, 2560] | π Frozen |
| 344 | base_model.model.model.levels.3.blocks.0.mlp.fc1.lora_dropout.default | Identity | [4, 49, 640] | π Frozen |
| 345 | base_model.model.model.levels.3.blocks.0.mlp.fc1.lora_A.default | Linear | [4, 49, 64] | π’ Trainable |
| 346 | base_model.model.model.levels.3.blocks.0.mlp.fc1.lora_B.default | Linear | [4, 49, 2560] | π’ Trainable |
| 347 | base_model.model.model.levels.3.blocks.0.mlp.fc1 | Linear | [4, 49, 2560] | π Frozen |
| 348 | base_model.model.model.levels.3.blocks.0.mlp.act | GELU | [4, 49, 2560] | π Frozen |
| 349 | base_model.model.model.levels.3.blocks.0.mlp.drop1 | Dropout | [4, 49, 2560] | π Frozen |
| 350 | base_model.model.model.levels.3.blocks.0.mlp.norm | Identity | [4, 49, 2560] | π Frozen |
| 351 | base_model.model.model.levels.3.blocks.0.mlp.fc2.base_layer | Linear | [4, 49, 640] | π Frozen |
| 352 | base_model.model.model.levels.3.blocks.0.mlp.fc2.lora_dropout.default | Identity | [4, 49, 2560] | π Frozen |
| 353 | base_model.model.model.levels.3.blocks.0.mlp.fc2.lora_A.default | Linear | [4, 49, 64] | π’ Trainable |
| 354 | base_model.model.model.levels.3.blocks.0.mlp.fc2.lora_B.default | Linear | [4, 49, 640] | π’ Trainable |
| 355 | base_model.model.model.levels.3.blocks.0.mlp.fc2 | Linear | [4, 49, 640] | π Frozen |
| 356 | base_model.model.model.levels.3.blocks.0.mlp.drop2 | Dropout | [4, 49, 640] | π Frozen |
| 357 | base_model.model.model.levels.3.blocks.0.mlp | Mlp | [4, 49, 640] | π Frozen |
| 358 | base_model.model.model.levels.3.blocks.0.drop_path | DropPath | [4, 49, 640] | π Frozen |
| 359 | base_model.model.model.levels.3.blocks.0 | Block | [4, 49, 640] | π Frozen |
| 360 | base_model.model.model.levels.3.blocks.1.norm1 | LayerNorm | [4, 49, 640] | π Frozen |
| 361 | base_model.model.model.levels.3.blocks.1.mixer.in_proj.base_layer | Linear | [4, 49, 640] | π Frozen |
| 362 | base_model.model.model.levels.3.blocks.1.mixer.in_proj.lora_dropout.default | Identity | [4, 49, 640] | π Frozen |
| 363 | base_model.model.model.levels.3.blocks.1.mixer.in_proj.lora_A.default | Linear | [4, 49, 64] | π’ Trainable |
| 364 | base_model.model.model.levels.3.blocks.1.mixer.in_proj.lora_B.default | Linear | [4, 49, 640] | π’ Trainable |
| 365 | base_model.model.model.levels.3.blocks.1.mixer.in_proj | Linear | [4, 49, 640] | π Frozen |
| 366 | base_model.model.model.levels.3.blocks.1.mixer.x_proj.base_layer | Linear | [196, 56] | π Frozen |
| 367 | base_model.model.model.levels.3.blocks.1.mixer.x_proj.lora_dropout.default | Identity | [196, 320] | π Frozen |
| 368 | base_model.model.model.levels.3.blocks.1.mixer.x_proj.lora_A.default | Linear | [196, 64] | π’ Trainable |
| 369 | base_model.model.model.levels.3.blocks.1.mixer.x_proj.lora_B.default | Linear | [196, 56] | π’ Trainable |
| 370 | base_model.model.model.levels.3.blocks.1.mixer.x_proj | Linear | [196, 56] | π Frozen |
| 371 | base_model.model.model.levels.3.blocks.1.mixer.dt_proj.base_layer | Linear | [196, 320] | π Frozen |
| 372 | base_model.model.model.levels.3.blocks.1.mixer.dt_proj.lora_dropout.default | Identity | [196, 40] | π Frozen |
| 373 | base_model.model.model.levels.3.blocks.1.mixer.dt_proj.lora_A.default | Linear | [196, 64] | π’ Trainable |
| 374 | base_model.model.model.levels.3.blocks.1.mixer.dt_proj.lora_B.default | Linear | [196, 320] | π’ Trainable |
| 375 | base_model.model.model.levels.3.blocks.1.mixer.dt_proj | Linear | [196, 320] | π Frozen |
| 376 | base_model.model.model.levels.3.blocks.1.mixer.out_proj.base_layer | Linear | [4, 49, 640] | π Frozen |
| 377 | base_model.model.model.levels.3.blocks.1.mixer.out_proj.lora_dropout.default | Identity | [4, 49, 640] | π Frozen |
| 378 | base_model.model.model.levels.3.blocks.1.mixer.out_proj.lora_A.default | Linear | [4, 49, 64] | π’ Trainable |
| 379 | base_model.model.model.levels.3.blocks.1.mixer.out_proj.lora_B.default | Linear | [4, 49, 640] | π’ Trainable |
| 380 | base_model.model.model.levels.3.blocks.1.mixer.out_proj | Linear | [4, 49, 640] | π Frozen |
| 381 | base_model.model.model.levels.3.blocks.1.mixer | MambaVisionMixer | [4, 49, 640] | π Frozen |
| 382 | base_model.model.model.levels.3.blocks.1.drop_path | DropPath | [4, 49, 640] | π Frozen |
| 383 | base_model.model.model.levels.3.blocks.1.norm2 | LayerNorm | [4, 49, 640] | π Frozen |
| 384 | base_model.model.model.levels.3.blocks.1.mlp.fc1.base_layer | Linear | [4, 49, 2560] | π Frozen |
| 385 | base_model.model.model.levels.3.blocks.1.mlp.fc1.lora_dropout.default | Identity | [4, 49, 640] | π Frozen |
| 386 | base_model.model.model.levels.3.blocks.1.mlp.fc1.lora_A.default | Linear | [4, 49, 64] | π’ Trainable |
| 387 | base_model.model.model.levels.3.blocks.1.mlp.fc1.lora_B.default | Linear | [4, 49, 2560] | π’ Trainable |
| 388 | base_model.model.model.levels.3.blocks.1.mlp.fc1 | Linear | [4, 49, 2560] | π Frozen |
| 389 | base_model.model.model.levels.3.blocks.1.mlp.act | GELU | [4, 49, 2560] | π Frozen |
| 390 | base_model.model.model.levels.3.blocks.1.mlp.drop1 | Dropout | [4, 49, 2560] | π Frozen |
| 391 | base_model.model.model.levels.3.blocks.1.mlp.norm | Identity | [4, 49, 2560] | π Frozen |
| 392 | base_model.model.model.levels.3.blocks.1.mlp.fc2.base_layer | Linear | [4, 49, 640] | π Frozen |
| 393 | base_model.model.model.levels.3.blocks.1.mlp.fc2.lora_dropout.default | Identity | [4, 49, 2560] | π Frozen |
| 394 | base_model.model.model.levels.3.blocks.1.mlp.fc2.lora_A.default | Linear | [4, 49, 64] | π’ Trainable |
| 395 | base_model.model.model.levels.3.blocks.1.mlp.fc2.lora_B.default | Linear | [4, 49, 640] | π’ Trainable |
| 396 | base_model.model.model.levels.3.blocks.1.mlp.fc2 | Linear | [4, 49, 640] | π Frozen |
| 397 | base_model.model.model.levels.3.blocks.1.mlp.drop2 | Dropout | [4, 49, 640] | π Frozen |
| 398 | base_model.model.model.levels.3.blocks.1.mlp | Mlp | [4, 49, 640] | π Frozen |
| 399 | base_model.model.model.levels.3.blocks.1.drop_path | DropPath | [4, 49, 640] | π Frozen |
| 400 | base_model.model.model.levels.3.blocks.1 | Block | [4, 49, 640] | π Frozen |
| 401 | base_model.model.model.levels.3.blocks.2.norm1 | LayerNorm | [4, 49, 640] | π Frozen |
| 402 | base_model.model.model.levels.3.blocks.2.mixer.qkv | Linear | [4, 49, 1920] | π Frozen |
| 403 | base_model.model.model.levels.3.blocks.2.mixer.q_norm | Identity | [4, 16, 49, 40] | π Frozen |
| 404 | base_model.model.model.levels.3.blocks.2.mixer.k_norm | Identity | [4, 16, 49, 40] | π Frozen |
| 405 | base_model.model.model.levels.3.blocks.2.mixer.proj | Linear | [4, 49, 640] | π Frozen |
| 406 | base_model.model.model.levels.3.blocks.2.mixer.proj_drop | Dropout | [4, 49, 640] | π Frozen |
| 407 | base_model.model.model.levels.3.blocks.2.mixer | Attention | [4, 49, 640] | π Frozen |
| 408 | base_model.model.model.levels.3.blocks.2.drop_path | DropPath | [4, 49, 640] | π Frozen |
| 409 | base_model.model.model.levels.3.blocks.2.norm2 | LayerNorm | [4, 49, 640] | π Frozen |
| 410 | base_model.model.model.levels.3.blocks.2.mlp.fc1.base_layer | Linear | [4, 49, 2560] | π Frozen |
| 411 | base_model.model.model.levels.3.blocks.2.mlp.fc1.lora_dropout.default | Identity | [4, 49, 640] | π Frozen |
| 412 | base_model.model.model.levels.3.blocks.2.mlp.fc1.lora_A.default | Linear | [4, 49, 64] | π’ Trainable |
| 413 | base_model.model.model.levels.3.blocks.2.mlp.fc1.lora_B.default | Linear | [4, 49, 2560] | π’ Trainable |
| 414 | base_model.model.model.levels.3.blocks.2.mlp.fc1 | Linear | [4, 49, 2560] | π Frozen |
| 415 | base_model.model.model.levels.3.blocks.2.mlp.act | GELU | [4, 49, 2560] | π Frozen |
| 416 | base_model.model.model.levels.3.blocks.2.mlp.drop1 | Dropout | [4, 49, 2560] | π Frozen |
| 417 | base_model.model.model.levels.3.blocks.2.mlp.norm | Identity | [4, 49, 2560] | π Frozen |
| 418 | base_model.model.model.levels.3.blocks.2.mlp.fc2.base_layer | Linear | [4, 49, 640] | π Frozen |
| 419 | base_model.model.model.levels.3.blocks.2.mlp.fc2.lora_dropout.default | Identity | [4, 49, 2560] | π Frozen |
| 420 | base_model.model.model.levels.3.blocks.2.mlp.fc2.lora_A.default | Linear | [4, 49, 64] | π’ Trainable |
| 421 | base_model.model.model.levels.3.blocks.2.mlp.fc2.lora_B.default | Linear | [4, 49, 640] | π’ Trainable |
| 422 | base_model.model.model.levels.3.blocks.2.mlp.fc2 | Linear | [4, 49, 640] | π Frozen |
| 423 | base_model.model.model.levels.3.blocks.2.mlp.drop2 | Dropout | [4, 49, 640] | π Frozen |
| 424 | base_model.model.model.levels.3.blocks.2.mlp | Mlp | [4, 49, 640] | π Frozen |
| 425 | base_model.model.model.levels.3.blocks.2.drop_path | DropPath | [4, 49, 640] | π Frozen |
| 426 | base_model.model.model.levels.3.blocks.2 | Block | [4, 49, 640] | π Frozen |
| 427 | base_model.model.model.levels.3.blocks.3.norm1 | LayerNorm | [4, 49, 640] | π Frozen |
| 428 | base_model.model.model.levels.3.blocks.3.mixer.qkv | Linear | [4, 49, 1920] | π Frozen |
| 429 | base_model.model.model.levels.3.blocks.3.mixer.q_norm | Identity | [4, 16, 49, 40] | π Frozen |
| 430 | base_model.model.model.levels.3.blocks.3.mixer.k_norm | Identity | [4, 16, 49, 40] | π Frozen |
| 431 | base_model.model.model.levels.3.blocks.3.mixer.proj | Linear | [4, 49, 640] | π Frozen |
| 432 | base_model.model.model.levels.3.blocks.3.mixer.proj_drop | Dropout | [4, 49, 640] | π Frozen |
| 433 | base_model.model.model.levels.3.blocks.3.mixer | Attention | [4, 49, 640] | π Frozen |
| 434 | base_model.model.model.levels.3.blocks.3.drop_path | DropPath | [4, 49, 640] | π Frozen |
| 435 | base_model.model.model.levels.3.blocks.3.norm2 | LayerNorm | [4, 49, 640] | π Frozen |
| 436 | base_model.model.model.levels.3.blocks.3.mlp.fc1.base_layer | Linear | [4, 49, 2560] | π Frozen |
| 437 | base_model.model.model.levels.3.blocks.3.mlp.fc1.lora_dropout.default | Identity | [4, 49, 640] | π Frozen |
| 438 | base_model.model.model.levels.3.blocks.3.mlp.fc1.lora_A.default | Linear | [4, 49, 64] | π’ Trainable |
| 439 | base_model.model.model.levels.3.blocks.3.mlp.fc1.lora_B.default | Linear | [4, 49, 2560] | π’ Trainable |
| 440 | base_model.model.model.levels.3.blocks.3.mlp.fc1 | Linear | [4, 49, 2560] | π Frozen |
| 441 | base_model.model.model.levels.3.blocks.3.mlp.act | GELU | [4, 49, 2560] | π Frozen |
| 442 | base_model.model.model.levels.3.blocks.3.mlp.drop1 | Dropout | [4, 49, 2560] | π Frozen |
| 443 | base_model.model.model.levels.3.blocks.3.mlp.norm | Identity | [4, 49, 2560] | π Frozen |
| 444 | base_model.model.model.levels.3.blocks.3.mlp.fc2.base_layer | Linear | [4, 49, 640] | π Frozen |
| 445 | base_model.model.model.levels.3.blocks.3.mlp.fc2.lora_dropout.default | Identity | [4, 49, 2560] | π Frozen |
| 446 | base_model.model.model.levels.3.blocks.3.mlp.fc2.lora_A.default | Linear | [4, 49, 64] | π’ Trainable |
| 447 | base_model.model.model.levels.3.blocks.3.mlp.fc2.lora_B.default | Linear | [4, 49, 640] | π’ Trainable |
| 448 | base_model.model.model.levels.3.blocks.3.mlp.fc2 | Linear | [4, 49, 640] | π Frozen |
| 449 | base_model.model.model.levels.3.blocks.3.mlp.drop2 | Dropout | [4, 49, 640] | π Frozen |
| 450 | base_model.model.model.levels.3.blocks.3.mlp | Mlp | [4, 49, 640] | π Frozen |
| 451 | base_model.model.model.levels.3.blocks.3.drop_path | DropPath | [4, 49, 640] | π Frozen |
| 452 | base_model.model.model.levels.3.blocks.3 | Block | [4, 49, 640] | π Frozen |
| 453 | base_model.model.model.levels.3 | MambaVisionLayer | [1, 640, 12, 12] (+1 aux) | π Frozen |
| 454 | base_model.model.model.norm | BatchNorm2d | [1, 640, 12, 12] | π Frozen |
| 455 | base_model.model.model.avgpool | AdaptiveAvgPool2d | [1, 640, 1, 1] | π Frozen |
| 456 | base_model.model.model.head | Linear | [1, 1000] | π Frozen |
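
The four `quantum_adapter` rows above give enough shape information for a rough parameter tally. The sketch below is an estimate under stated assumptions, not the repository's code: it assumes the reducer and expander are `Linear` layers with bias mapping 640 → 4 and 4 → 640, that the circuit's `StronglyEntanglingLayers` weights have PennyLane's usual shape `(n_layers, n_wires, 3)`, and that the gate is a single scalar. The number of entangling layers is not stated here, so `n_layers` is a hypothetical parameter.

```python
def linear_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Weights plus optional bias of a dense layer."""
    return in_features * out_features + (out_features if bias else 0)


def qca_param_count(channels: int = 640, n_qubits: int = 4, n_layers: int = 2) -> int:
    """Rough trainable-parameter tally for the adapter (assumed layout)."""
    reducer = linear_params(channels, n_qubits)    # 640 -> 4
    circuit = n_layers * n_qubits * 3              # three rotation angles per qubit per layer
    expander = linear_params(n_qubits, channels)   # 4 -> 640
    gate = 1                                       # learnable gate scalar
    return reducer + circuit + expander + gate


for layers in (1, 2, 3):
    print(layers, qca_param_count(n_layers=layers))
# 1 5777
# 2 5789
# 3 5801
```

Under these assumptions the two linear maps dominate, so the total stays close to the ~5.8K adapter parameters quoted for the model regardless of a small layer count.
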
wafer-mamba/
├── notebooks/
│   ├── training/                        # Kaggle training notebooks
│   │   ├── wafer-mamba-hybrid.ipynb     # Hybrid Quantum-MambaVision (proposed)
│   │   ├── wafer-mamba-classical.ipynb  # Classical MambaVision + LoRA
│   │   ├── wafer-resnet.ipynb           # ResNet-50 + LoRA baseline
│   │   └── wafer-vit.ipynb              # ViT-Small baseline
│   └── analysis/
│       ├── analysis_for_paper.ipynb     # Figures & tables for the paper
│       ├── detailed_analysis.ipynb      # Extended analysis
│       └── requirements.txt             # Dependencies for analysis
├── models/                              # Trained model checkpoints (.pth)
│   ├── hybrid-mamba/
│   ├── classical-mamba/
│   ├── resnet/
│   └── vit/
├── results/                             # Per-epoch predictions & logs (.json)
│   ├── hybrid-mamba/
│   ├── classical-mamba/
│   ├── resnet/
│   └── vit/
├── figures/                             # Publication-ready figures (PDF)
│   ├── overal_architecture_diagram.pdf
│   ├── quantum_adapter_diagram.pdf
│   ├── lora_diagram.pdf
│   ├── training_dynamics_*.pdf
│   ├── defect_distribution.pdf
│   └── ... (18 figures total)
└── data/
    └── Description.pdf                  # Dataset description
MixedType Wafer Defect Datasets (available on Kaggle).
- Format: `.npz` (wafer map images + multi-hot labels)
- 8 defect classes, multi-label
- Split: 70% train / 10% validation / 20% test (stratified)
- Input resolution: 384 × 384 (grayscale → 3-channel)
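
The split above is described only as stratified; the stratification key is not stated. One plausible reading, sketched below in plain Python, treats each distinct multi-hot label vector as its own stratum and divides every stratum 70/10/20. The function name `stratified_split`, the toy labels, and the reuse of seed 42 are illustrative assumptions, not the notebooks' actual code.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=42):
    """Split sample indices into train/val/test, stratifying on the full
    multi-hot label vector (each distinct combination is one stratum)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for idx, label in enumerate(labels):
        strata[tuple(label)].append(idx)

    train, val, test = [], [], []
    for members in strata.values():
        rng.shuffle(members)
        n_train = round(fractions[0] * len(members))
        n_val = round(fractions[1] * len(members))
        train += members[:n_train]
        val += members[n_train:n_train + n_val]
        test += members[n_train + n_val:]  # remainder goes to test
    return train, val, test


# Toy data: three label combinations with 100 samples each (hypothetical).
toy = [[1, 0, 0, 0, 0, 0, 0, 0]] * 100 + [[0, 1, 1, 0, 0, 0, 0, 0]] * 100 + [[0] * 8] * 100
train, val, test = stratified_split(toy)
print(len(train), len(val), len(test))  # 210 30 60
```

Stratifying on the whole label combination keeps rare mixed-type patterns represented in all three subsets, at the cost of uneven rounding for very small strata.
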
All four models were trained on Kaggle notebooks with a T4 GPU. Each notebook's first cell installs the required dependencies:
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1
pip install causal-conv1d==1.4.0 --no-build-isolation
pip install mamba-ssm==2.2.4 --no-build-isolation
pip install mambavision timm pillow
pip install "transformers>=4.43" "accelerate>=0.33" "datasets>=2.19" "evaluate>=0.4" "peft>=0.12.0" "scikit-learn>=1.3" "bitsandbytes>=0.43.0"
pip install pennylane pennylane-lightning[gpu]

To reproduce training:
- Upload the MixedType Wafer Defect Dataset as a Kaggle dataset
- Open the desired notebook from `notebooks/training/` in a Kaggle notebook environment with GPU T4 × 2 accelerator
- Run all cells; each notebook is self-contained
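
Because the Mamba kernels are built against specific PyTorch versions, it can help to confirm the exact pins before running a notebook. The following is a small optional check, not part of the notebooks; `PINS` copies the `==` versions from the install commands above, and `check_pins` is a hypothetical helper built on the standard-library `importlib.metadata`.

```python
from importlib import metadata

# Exact pins copied from the install commands above.
PINS = {"torch": "2.4.1", "torchvision": "0.19.1", "torchaudio": "2.4.1",
        "causal-conv1d": "1.4.0", "mamba-ssm": "2.2.4"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every mismatch or missing package."""
    problems = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Local build tags such as "2.4.1+cu121" still count as a match.
        if found is None or found.split("+")[0] != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means every pinned package is installed at the expected version.
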
| Parameter | Value |
|---|---|
| Epochs | 15 |
| Batch size | 64 × 2 (gradient accumulation) |
| Optimizer | AdamW (fused) |
| Learning rate | 6e-4 (Mamba, ViT) / 5e-5 (ResNet) |
| Scheduler | Cosine with 10% warmup |
| Early stopping | Patience = 3 (on Macro-F1) |
| Precision | FP16 |
| Seed | 42 |
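
To make the scheduler row concrete, here is a minimal sketch of a cosine schedule with 10% linear warmup at the 6e-4 peak rate. It assumes the common form (linear ramp from zero, then a half-cosine decay to zero); the notebooks may use a library scheduler with different details. `lr_at_step` and the step count of 1000 are hypothetical, since the real number of optimizer steps depends on the dataset size and effective batch size.

```python
import math


def lr_at_step(step, total_steps, peak_lr=6e-4, warmup_frac=0.10):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = max(1, int(warmup_frac * total_steps))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total = 1000  # hypothetical number of optimizer steps over the 15 epochs
for s in (0, 50, 100, 550, 1000):
    print(s, f"{lr_at_step(s, total):.2e}")
```

The rate peaks at the end of warmup (step 100 here), falls to half the peak midway through the decay, and reaches zero at the final step.
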
For running the analysis notebooks locally:
# Create virtual environment (Python 3.11)
python -m venv .venv
# Activate
# Windows:
.venv\Scripts\activate
# Linux/macOS:
source .venv/bin/activate
# Install analysis dependencies
pip install -r notebooks/analysis/requirements.txt

The analysis notebooks load pre-computed results from `results/` and generate all figures in `figures/`.
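
Before opening the analysis notebooks, a quick check that the pre-computed results are in place can save a failed run. The sketch below only lists the `.json` files under each model directory named in the repository tree; it makes no assumption about the JSON schema, and `collect_result_files` is a hypothetical helper rather than something the notebooks define.

```python
from pathlib import Path

# Sub-directories as listed in the repository tree.
MODEL_DIRS = ("hybrid-mamba", "classical-mamba", "resnet", "vit")


def collect_result_files(results_root="results"):
    """Map each model directory to its sorted list of .json result files."""
    root = Path(results_root)
    return {name: sorted((root / name).glob("*.json")) for name in MODEL_DIRS}


if __name__ == "__main__":
    for model, files in collect_result_files().items():
        print(f"{model}: {len(files)} JSON file(s)")
```

Run it from the repository root; a count of zero for any model suggests the corresponding results were not downloaded.
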