igorls · Mar 8, 2026 · Copilot
diff --git a/.jules/bolt.md b/.jules/bolt.md
@@ -0,0 +1,3 @@
+## 2024-03-08 - Fast Path Packet Classification
+**Learning:** In Zig, standard `switch` statements on integers compile to jump tables. For network packet classification where one type (data plane packets) vastly outnumbers others, the jump table overhead and potential branch mispredictions can be a bottleneck. Furthermore, small utility functions on the hot path may incur call overhead across module boundaries if not explicitly inlined.
+**Action:** Extract the dominant case (`msg_type == 4` for `.wg_transport`) into an explicit `if` branch before the `switch` statement to improve branch prediction and avoid jump table overhead for the most common packets. Also mark the function with the `inline` keyword.
-**Learning:** In Zig, standard `switch` statements on integers compile to jump tables. For network packet classification where one type (data plane packets) vastly outnumbers others, the jump table overhead and potential branch mispredictions can be a bottleneck. Furthermore, small utility functions on the hot path may incur call overhead across module boundaries if not explicitly inlined.
-**Action:** Extract the dominant case (`msg_type == 4` for `.wg_transport`) into an explicit `if` branch before the `switch` statement to improve branch prediction and avoid jump table overhead for the most common packets. Also mark the function with the `inline` keyword.
+**Learning:** In Zig, `switch` statements on integers are lowered by LLVM using heuristics (jump tables, comparison chains, or binary search) depending on the number and density of cases. For network packet classification where one type (data plane packets) vastly outnumbers others, the structure of the branch logic and potential branch mispredictions can be a bottleneck. Furthermore, small utility functions on the hot path may incur call overhead across module boundaries if not explicitly inlined.
+**Action:** Extract the dominant case (`msg_type == 4` for `.wg_transport`) into an explicit `if` branch before the `switch` statement to improve branch prediction and minimize overhead for the most common packets. Also mark the function with the `inline` keyword.
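Since the note above describes hoisting one case out of a `switch`, a quick equivalence check may help confirm the refactor leaves classification results unchanged. The following is a minimal sketch, not the project's actual code: `classifySwitchOnly` and `classifyFastPath` are hypothetical names, the enum is a reduced stand-in for `PacketType`, and the STUN branch is deliberately left out. It assumes a Zig version where `std.mem.readInt` takes a pointer to a fixed-size array, as in the diff.

```zig
const std = @import("std");

// Reduced stand-in for the PacketType enum in the diff (assumption: STUN omitted).
const PacketType = enum {
    wg_handshake_init,
    wg_handshake_resp,
    wg_cookie,
    wg_transport,
    unknown,
};

// Baseline: every message type is handled inside the switch.
fn classifySwitchOnly(data: []const u8) PacketType {
    if (data.len < 4) return .unknown;
    const msg_type = std.mem.readInt(u32, data[0..4], .little);
    return switch (msg_type) {
        1 => .wg_handshake_init,
        2 => .wg_handshake_resp,
        3 => .wg_cookie,
        4 => .wg_transport,
        else => .unknown,
    };
}

// Variant from the note: the dominant case is tested before the switch.
fn classifyFastPath(data: []const u8) PacketType {
    if (data.len < 4) return .unknown;
    const msg_type = std.mem.readInt(u32, data[0..4], .little);

    // Fast path: transport data (type 4) is checked first.
    if (msg_type == 4) return .wg_transport;

    return switch (msg_type) {
        1 => .wg_handshake_init,
        2 => .wg_handshake_resp,
        3 => .wg_cookie,
        else => .unknown,
    };
}

test "fast path classifies identically to the plain switch" {
    // Sweep every possible first byte with the next three bytes zeroed.
    var first: u16 = 0;
    while (first < 256) : (first += 1) {
        const b: u8 = @intCast(first);
        const packet = [4]u8{ b, 0, 0, 0 };
        try std.testing.expectEqual(classifySwitchOnly(&packet), classifyFastPath(&packet));
    }

    // Inputs shorter than four bytes must still be rejected.
    try std.testing.expectEqual(PacketType.unknown, classifyFastPath(&[_]u8{ 4, 0, 0 }));
}
```

Saved to a file, this can be run with `zig test`. It only checks that behavior is preserved. Whether the hoisted branch is actually faster depends on how LLVM lowers the remaining `switch` and would need to be measured on representative traffic.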
diff --git a/src/wireguard/device.zig b/src/wireguard/device.zig
@@ -24,16 +24,22 @@ pub const PacketType = enum {
     stun, // STUN binding response
     unknown,
 
-    pub fn classify(data: []const u8) PacketType {
+    /// Optimization: extract the dominant data-plane path (.wg_transport)
+    /// into an explicit check ahead of the switch.
+    /// This avoids jump table overhead and improves branch prediction for 99%+ of packets.
-    /// This avoids jump table overhead and improves branch prediction for 99%+ of packets.
+    /// This is a speculative micro-optimization intended to streamline the common fast path.
+    pub inline fn classify(data: []const u8) PacketType {
-    pub inline fn classify(data: []const u8) PacketType {
+    pub fn classify(data: []const u8) PacketType {
         if (data.len < 4) return .unknown;
 
         // WireGuard messages: first byte is type, next 3 are zeros
         const msg_type = std.mem.readInt(u32, data[0..4], .little);
+
+        // Fast path: Type 4 (Transport Data) is overwhelmingly the most common
+        if (msg_type == 4) return .wg_transport;
+
         return switch (msg_type) {
             1 => .wg_handshake_init,
             2 => .wg_handshake_resp,
             3 => .wg_cookie,
-            4 => .wg_transport,
             else => blk: {
                 // STUN: check for magic cookie at bytes 4-7
                 if (data.len >= 8) {