bug: ev.Group in .gather mode panics when children exceed io_uring SQ depth #354

@manishrjain

Description

Summary: When an ev.Group in .gather mode has more children than the io_uring submission queue depth (default 256), submitting the group panics with "reached unreachable code" at task.zig:263.

Root cause: loop.addInternal for a .group op walks the children list and calls addInternal on each child (loop.zig:514-520). Each child's addInternal calls getSqe(). When the SQ is full, getSqe flushes via poll(state, .zero) and retries once (io_uring.zig:836-840). If the kernel hasn't completed any in-flight SQEs yet, the retry also returns SubmissionQueueFull. The error propagates up and leaves the group in an inconsistent state: it is marked .running but only some of its children were actually submitted, so the assertion fires when the waiting task tries to yield.
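For illustration, here is a paraphrased sketch of the current getSqe behavior described above (io_uring.zig:836-840) — the type and method names are approximations of zio internals, not the actual code:

```zig
const std = @import("std");
const linux = std.os.linux;

// Sketch only: `Loop`, `self.ring`, and `self.poll` approximate zio internals.
fn getSqe(self: *Loop) !*linux.io_uring_sqe {
    return self.ring.get_sqe() catch {
        // SQ full: flush pending SQEs without blocking...
        try self.poll(.zero);
        // ...then retry exactly once. If the kernel hasn't completed
        // anything yet, no slot has freed and this fails again with
        // error.SubmissionQueueFull.
        return self.ring.get_sqe();
    };
}
```

The single non-blocking retry is the crux: with 300 children and a 256-deep SQ, the 257th child is essentially guaranteed to hit the full queue before any completion has landed.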

Reproduction:

const ev = zio.ev;

fn repro() !void {
    // Create a file with 300 pages
    const file = try std.fs.createFileAbsolute("/tmp/zio_group_bug", .{});
    defer std.fs.deleteFileAbsolute("/tmp/zio_group_bug") catch {};
    var page: [4096]u8 = undefined;
    @memset(&page, 0xAA);
    for (0..300) |_| try file.writeAll(&page);
    file.close();

    const f = try zio.fs.open("/tmp/zio_group_bug", .{});
    defer f.close();

    var reads: [300]ev.FileRead = undefined;
    var iovs: [300][1]zio.os.iovec = undefined;
    var bufs: [300][4096]u8 = undefined;

    var group: ev.Group = .init(.gather);
    for (0..300) |i| {
        reads[i] = ev.FileRead.init(f.fd, .fromSlice(&bufs[i], &iovs[i]), i * 4096);
        group.add(&reads[i].c);
    }
    try zio.waitForIo(&group.c); // panics
}

test "group gather 300 reads" {
    const rt = try zio.Runtime.init(std.testing.allocator, .{});
    defer rt.deinit();
    var h = try rt.spawn(repro, .{});
    try h.join();
}

Actual behavior:

[zio] (err): Failed to get io_uring SQE for file_read
[default] (err): Event loop error during yield: error.SubmissionQueueFull
thread panic: reached unreachable code
  task.zig:263 -- assert(self.state.load(.acquire) == .ready)

Expected behavior: Either all 300 reads complete successfully (by waiting for SQ slots to free up), or the group returns a clean error without panicking.

Suggested fixes (in order of preference):

  1. getSqe should wait for a free slot instead of failing after one retry. When the SQ is full and the flush doesn't free any slots, getSqe could wait for at least one CQE to complete (freeing an SQ slot), then retry. This makes large groups work transparently.
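A minimal sketch of what fix 1 could look like, assuming `self.ring` is a std.os.linux.IoUring and `reapCompletions` stands in for whatever zio-internal helper drains the CQ ring (both names are hypothetical):

```zig
const std = @import("std");
const linux = std.os.linux;

// Hypothetical sketch of fix 1: loop until an SQ slot is available.
fn getSqe(self: *Loop) !*linux.io_uring_sqe {
    while (true) {
        return self.ring.get_sqe() catch {
            // SQ is full: submit the pending SQEs and block until the
            // kernel completes at least one, which frees at least one slot.
            _ = try self.ring.submit_and_wait(1);
            // Reap the CQEs so the completed children make progress
            // (hypothetical helper; zio's actual mechanism may differ).
            self.reapCompletions();
            continue;
        };
    }
}
```

Blocking here is safe for the group case because every in-flight SQE belongs to an operation that will eventually complete, so waiting for one CQE cannot deadlock the submitter.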

  2. The group add loop should handle getSqe failure gracefully: cancel all already-added children, set the group error, and propagate SubmissionQueueFull to the caller instead of panicking.
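A sketch of what fix 2 could look like inside the group branch of loop.addInternal — `cancelChild`, `group.children`/`c.next`, and the `.failed` state are all hypothetical stand-ins for zio's actual structures:

```zig
// Hypothetical sketch of fix 2: roll back partially submitted children
// instead of leaving the group half-running.
var added: usize = 0;
errdefer {
    // Cancel every child that already received an SQE so the group
    // never waits on operations it no longer tracks.
    var node = group.children;
    var i: usize = 0;
    while (node) |c| : (node = c.next) {
        if (i == added) break;
        self.cancelChild(c); // hypothetical cancel helper
        i += 1;
    }
    group.state = .failed; // clean terminal state instead of .running
}
var cur = group.children;
while (cur) |c| : (cur = c.next) {
    try self.addInternal(c); // on failure, the errdefer above rolls back
    added += 1;
}
```

With this in place, the caller of zio.waitForIo would see error.SubmissionQueueFull instead of a panic.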
