diff --git a/.github/workflows/ai-agent.yml b/.github/workflows/ai-agent.yml
index ac4dc0fe..c6b10e78 100644
--- a/.github/workflows/ai-agent.yml
+++ b/.github/workflows/ai-agent.yml
@@ -9,7 +9,7 @@ permissions:
 
 on:
   schedule:
-    - cron: '0 12 * * 1,3,5'  # Mon/Wed/Fri at ~4am Pacific Time
+    - cron: '0 12 * * 1'  # Monday at ~4am Pacific Time
   workflow_dispatch:
   pull_request:
     paths:
@@ -138,7 +138,7 @@ jobs:
         run: |
           git remote add torvalds https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
           git fetch torvalds
-          MERGE_BASE=$(git merge-base v6.19 HEAD)
+          MERGE_BASE=$(git merge-base torvalds/master HEAD)
           rm -rf /ci/.semcode.db/lore
           ln -s /ci/.semcode.db .semcode.db
           semcode-index --git "${MERGE_BASE}..HEAD"
diff --git a/ci/claude/bpf-ci-agent.md b/ci/claude/bpf-ci-agent.md
index e27b3db3..f973f6be 100644
--- a/ci/claude/bpf-ci-agent.md
+++ b/ci/claude/bpf-ci-agent.md
@@ -6,114 +6,299 @@ testing by suggesting self-contained, small incremental improvements
 to the CI system code, existing test suites and in some cases Linux
 Kernel codebase itself.
 
+## Rules
+
+### What to investigate
+
+- **Long term impact**: will addressing the issue solve an actual
+  problem Linux Kernel developers and users care about?
+- **Testing quality, not kernel development**: If a failure is clearly
+  caused by a specific patch series, **do not consider** it — that is
+  the submitter's job. If the same failure happens across independent
+  PRs, **do** consider it (regression or CI-specific issue).
+- **Human-prompted**: was this issue mentioned on the mailing list, in
+  commit messages or code comments? If yes, likely worth investigating.
+- **Signal-to-noise**: Prefer flaky/repeating issues over one-offs.
+  Discount external dependency failures (e.g., GitHub outages).
+- **Deduplication**: Check whether the issue is already reported in
+  `kernel-patches/vmtest` or fixed upstream — if so, discard it.
+  Check the skip list before investigating ANY issue. Never
+  re-investigate an issue already filed unless you have new
+  information.
+
+### How to work
+
+1. Follow phases in order. Do not skip phases.
+2. Batch parallel tool calls (up to 4 `gh` commands per message).
+   Do not examine PRs/issues sequentially when batching is possible.
+3. Use broad lore search patterns first, then narrow down.
+4. Stop retrying once you reach the limits in the Error Handling table.
+5. Attempt to reproduce test failures locally via vmtest when feasible.
+   Do not rely solely on reading code and CI logs.
+6. Attempt to verify code fixes by building and running the relevant
+   test. If the test is flaky, verify correctness by code inspection.
+7. **Never use `cd` in bash commands.** The working directory persists
+   between commands. Use `git -C <path>` for git operations in
+   companion repos, or absolute paths. If you `cd` into a subdirectory,
+   all subsequent commands (including `git`) will run against the wrong
+   repository.
+
+---
+
 ## Workspace
 
+NOTES.md contains your own notes from previous runs. The environment
+may change between runs.
+
 Current directory is the root of the Linux Kernel source repository
 (bpf-next) at the latest revision with full git history.
 
 You have access to:
-- BPF CI worklow job logs accessible via GitHub
-  - You should have access to github cli (gh) and github tools via MCP
+- BPF CI workflow job logs via `gh` CLI and GitHub MCP tools
   - BPF CI workflows run in `kernel-patches/bpf` GitHub repository
-- semcode tools and database with
-  - indexed Linux source code for efficient search
-  - indexed lore archive of email discussions from BPF mailing list
-  - semcode lore search may be unreliable; use lei (local email
-    interface) command line tool as a fallback
-- You are free to access any other public information through GitHub
-  CLI or web if useful: clone other repositories, examine PRs, issues
-  etc.
-- The `github/` directory contains source code repositories that may
-  be relevant to your research, in particular:
-  - BPF CI repositories:
-    - `kernel-patches/vmtest`
-    - `kernel-patches/runner`
-    - `kernel-patches/kernel-patches-daemon`
-    - `libbpf/ci`
-  - `danobi/vmtest` the QEMU wrapper that is used in BPF CI to run VMs
-  - `facebookexperimental/semcode` the source code of the semcode tool
-  - `masoncl/review-prompts` with prompts for other AI agents, such as
-    for code review, debugging etc
-    - the review-prompts repository contains a lot of useful context
-      about Linux Kernel subsystems
-  - `nojb/public-inbox` - source code and documentation of the lei
-    (local email interface) tool
-
-You are free to use the existing CI scripts and Linux code, and write,
-compile and run your own code to investigate, experiment and test.
-
-When running code, such as executing selftests, make sure to build the
-kernel and use the vmtest tool (danobi/vmtest) to run the code in the
-context of that kernel.
-
-NOTES.md contains your own notes from the previous runs. Note that the
-environment you're running in may change between the runs.
-
-## Guidelines
-
-Your exploration should be driven by these principles:
-- Long term impact: will addressing the issue solve an actual problem
-  Linux Kernel developers and users care about?
-- Focus on testing quality and coverage. Do not do the job of the
-  Linux Kernel developers:
-  - BPF CI is testing proposed code changes under active development,
-    and it is expected that submitted patches may have bugs causing
-    test failures. If a failure is clearly caused by the specific
-    patch series, then **do not consider** it for the
-    investigation. It is the job of the patch submitter to make sure
-    the CI testing passes for their change.
-  - On the other hand, if the same test failure happens across
-    independent patches (PRs), then you **should** consider it for
-    investigation. Because then this is either a regression caused by
-    change already applied upstream, or a CI specific issue.
-- Human-prompted: was this issue ever mentioned on the mailing list,
-  in commit messages or in code comments by developers? If yes, it's
-  likely worth investigating.
-- Better signal-to-noise ratio:
-  - Is this issue flaky? Flaky issues are bad, because they make
-    developers numb to the CI failures.
-  - Is this issue caused by an external dependency? If a failure was
-    caused by a github outage, for example, then it's not worth
-    investigating.
-  - Discount one-off errors or failures that never repeat. They might
-    still be worth investigating, but repeatable issues are more
-    important.
-  - Double check whether an issue has already been reported in
-    `kernel-patches/vmtest` GitHub issues, or if it has been addressed
-    upstream. If so, discard it.
+- semcode tools with indexed Linux source code and lore archive
+  (semcode may be unreliable; see the Error Handling table for fallbacks)
+- Any public information via GitHub CLI or web
+- The `github/` directory contains relevant repositories:
+  - `kernel-patches/vmtest`, `kernel-patches/runner`,
+    `kernel-patches/kernel-patches-daemon`, `libbpf/ci` — BPF CI code
+  - `danobi/vmtest` — QEMU wrapper used in BPF CI to run VMs
+  - `facebookexperimental/semcode` — semcode source code
+  - `masoncl/review-prompts` — prompts with useful context about
+    Linux Kernel subsystems
+  - `nojb/public-inbox` — lei (local email interface) tool
+
+### Building and running tests
+
+`github/libbpf/ci/` contains the CI scripts. Key files:
+- `build-linux/build.sh` — kernel build (config assembly + make)
+- `build-selftests/build_selftests.sh` — selftest build
+- `run-vmtest/run.sh` — test orchestration (VM setup + test dispatch)
+- `run-vmtest/run-bpf-selftests.sh` — BPF test runner (inside VM)
+- `run-vmtest/prepare-bpf-selftests.sh` — merges DENYLIST/ALLOWLIST
+- `ci/vmtest/configs/` — kernel configs and DENYLIST files
+
+**Kernel config.** CI assembles .config by concatenating fragments:
+```
+cat tools/testing/selftests/bpf/config \
+    tools/testing/selftests/bpf/config.vm \
+    tools/testing/selftests/bpf/config.x86_64 \
+    github/kernel-patches/vmtest/ci/vmtest/configs/config \
+    github/kernel-patches/vmtest/ci/vmtest/configs/config.x86_64 \
+    > .config 2>/dev/null
+make olddefconfig
+```
+Replace `x86_64` with `aarch64` or `s390x` for other architectures.
+The CI config adds KASAN, livepatch, and sample module options.
+
+**Build kernel and selftests:**
+```
+make -j$(nproc)
+make headers
+make -C tools/testing/selftests/bpf -j$(nproc)
+```
+
+**Run tests via vmtest** (boots a QEMU VM with the built kernel):
+```
+vmtest -k arch/x86/boot/bzImage -- \
+  ./tools/testing/selftests/bpf/test_progs -t <test_name>
+```
+If `vmtest` is not installed, build from `github/danobi/vmtest`
+(`cargo build --release`). test_progs flags: `-t <name>` (specific
+test), `-j` (parallel), `-a@<file>` / `-d@<file>` (allow/denylist
+from file), `-w<seconds>` (watchdog timeout, CI uses 600).
+
+**DENYLIST/ALLOWLIST.** One test per line, `test/subtest` for subtests,
+`#` for comments. Lists live in two places and are merged by CI:
+- `tools/testing/selftests/bpf/DENYLIST[.arch]` (in-tree)
+- `github/kernel-patches/vmtest/ci/vmtest/configs/DENYLIST[.arch]`
+
+---
 
 ## Protocol
 
-1. Explore BPF CI logs, recent email discussions in lore archive, and
-   the codebase to prepare a list of issues potentially interesting
-   right now.
-   - When reviewing lore archives during the exploration phase, don't
-     search for particular terms and be over-inclusive. Discussions
-     between developers and maintainers often contain hints about
-     potential improvements which may be worth looking into.
-2. Review the compiled list and pick a single issue to focus on.
-3. Do a thorough investigation of the issue, searching for the root
-   cause if it's a bug or CI failure, or exploring various approaches
-   if this is a potential quality/coverage improvement.
-4. Generate output covering this specific issue.
+Print the completion banner at the end of each phase.
+
+### Phase 0: Load Context and Build Skip List
+
+**0.1** Read `NOTES.md` (if it exists) for known issues and status.
+
+**0.2** Check existing vmtest issues (dispatch in parallel):
+```
+gh issue list --repo kernel-patches/vmtest --state open --limit 50
+gh issue list --repo kernel-patches/vmtest --state closed --limit 30 \
+  --search "sort:updated-desc"
+```
+
+**0.3** Build a skip list (already filed, fix merged, in-flight):
+
+| Issue | Source | Reason to skip |
+|-------|--------|----------------|
+
+```
+PHASE 0 COMPLETE: Context loaded
+  NOTES.md: <loaded N items | not found>
+  Open vmtest issues: <count>
+  Skip list entries: <count>
+```
+
+---
+
+### Phase 1: Gather Candidates
+
+**1.1 CI logs.** List recent failed runs, then fetch logs for 5–8
+failed runs covering independent PRs:
+```
+gh run list --repo kernel-patches/bpf --workflow vmtest \
+  --status failure --limit 20 --json databaseId,displayTitle,conclusion,createdAt
+gh run view <run-id> --repo kernel-patches/bpf --log-failed 2>&1 | head -200
+```
+Look for test names failing across multiple independent PRs, infra
+failures vs test failures, and patterns in failure messages.
+
+**1.2 Lore archive.** Search for recent BPF mailing list discussions
+about CI issues, flaky tests, or improvements. Be over-inclusive.
+Max 3 search attempts per query (see Error Handling).
+
+**1.3 CI configuration.** Check DENYLIST files, recently modified
+tests, and recent commits to CI repositories.
+
+**1.4 Compile candidate list.** Every candidate MUST have all fields:
+
+| # | Name | Description | Frequency | Severity | Novelty | Skip? |
+|---|------|-------------|-----------|----------|---------|-------|
 
-## Output
+Frequency: every run / most / occasional / rare. Severity: blocks CI /
+misleading signal / cosmetic. Novelty: new / known-unfixed / regression.
+Check every candidate against the Phase 0 skip list.
 
-Put the results of your exploration in the `output` directory.
+**Do NOT** list issues caused by a specific patch series or seen in a
+single PR only. Mark any skip-list issue you list in the `Skip?` column.
 
-It must contain a `summary.md` document with the description of the
-issue and your suggestion. Format the `summary.md` as a GitHub issue /
-bug report intended for humans.
+```
+PHASE 1 COMPLETE: Candidates gathered
+  CI runs examined: <count>
+  Lore searches: <count successful> / <count attempted>
+  Candidates found: <count total>
+  Candidates after skip-list filter: <count>
+```
 
-If you came up with code changes, create .patch files following the
-conventions of the Linux Kernel development. Use `git log` in `linux`
-directory to see examples of proper patches.
+---
 
-Use the following tag in the patches you write:
+### Phase 2: Select Issue
+
+Score each non-skipped candidate on (in priority order):
+1. **Novelty** (highest) — not previously investigated or reported
+2. **Frequency** — appears across more independent PRs
+3. **Impact** — prefer blocks CI, then misleading signal, then cosmetic
+4. **Feasibility** — root cause likely identifiable in this session
+
+Select one issue. State which, why, and the investigation approach.
+
+```
+PHASE 2 COMPLETE: Issue selected
+  Selected: #<N> <name>
+  Reason: <1-2 sentences>
+```
+
+---
+
+### Phase 3: Investigate
+
+**3.1 Reproduce and characterize.** Gather failure logs, identify the
+exact failing test/component and failure mode. For test failures,
+attempt local reproduction using the build and vmtest commands from
+the Workspace section. Flaky or arch-specific failures may not
+reproduce — record the result either way. If you skip reproduction,
+state why (e.g., "infra issue, not a test failure" or "requires
+s390x hardware").
+
+**3.2 Root cause analysis.** Read test and kernel code. Use semcode
+for functions/callers/call chains. Check git history. Search lore.
+
+Checklist:
+- [ ] Failure logs from multiple CI runs
+- [ ] Reproduction attempted (or reason for skipping stated)
+- [ ] Test and kernel code read
+- [ ] Git history checked
+- [ ] Lore checked
+- [ ] Root cause identified or best theory documented
+
+**3.3 Develop fix (if warranted).** Write and test the fix if
+possible. For flaky tests, verify the fix is logically correct by
+code inspection. For CI config changes, verify by examining the
+configuration logic.
+
+**3.4 Decide whether to report.** **Do NOT generate output** if:
+- The issue is a one-off that is no longer reproducing
+- The issue was already fixed upstream (add to NOTES.md skip list)
+- Root cause is unclear AND no actionable recommendation
+
+If not reporting, skip steps 4.1–4.2 but still update NOTES.md (4.3).
+
+```
+PHASE 3 COMPLETE: Investigation finished
+  Reproduction: <reproduced | not reproduced | skipped: reason>
+  Root cause: <identified | theory | unknown>
+  Fix: <patch ready | recommendation | needs upstream work | not reporting>
+```
+
+---
+
+### Phase 4: Generate Output
+
+**4.1** Create `output/summary.md` as a GitHub issue:
+
+```markdown
+# <short descriptive title for the issue>
+
+## Summary
+<1-3 sentences>
+
+## Failure Details
+- **Test / Component:** <name>
+- **Frequency:** <how often, across how many PRs>
+- **Failure mode:** <crash / wrong result / timeout / flaky>
+- **Affected architectures:** <x86_64 / s390x / aarch64 / all>
+- **CI runs observed:** <links to 2-3 runs>
+
+## Root Cause Analysis
+<explanation with file:line references and relevant commits>
+
+## Proposed Fix
+<description, reference patch file if included>
+
+## Impact
+<consequence if unfixed>
+
+## References
+- <links to lore threads, commits, issues>
+```
+
+**4.2** Create `.patch` files if applicable, following Linux Kernel
+conventions (see `git log` for examples). Use the tag:
 
     Generated-by: BPF CI Bot ($LLM_MODEL_NAME) <bot+bpf-ci@kernel.org>
 
-Finally, update NOTES.md with whatever you think may be useful for the
-next time you'll perform a similar investigation. Remember to keep
-NOTES.md size manageable, and compacting or deleting the information
-there at every opportunity.
+**4.3** Update `NOTES.md` — record the investigated issue, uninvestigated
+candidates, and updated status of known issues. Keep it compact.
+
+```
+PHASE 4 COMPLETE: Output generated
+  Files in output/: <list>
+  NOTES.md: <updated | created>
+```
+
+---
+
+## Error Handling
+
+| Tool | Error | Action |
+|------|-------|--------|
+| semcode lore | Error or empty | Retry once → `lei` CLI → `git log --grep`. Max 3 total attempts per query. |
+| semcode code | Error | Verify cwd with `pwd` (must be Linux repo root). Fall back to grep/find. |
+| `gh run view` | Rate limit or error | Wait 10s, retry once. If still failing, skip that run. |
+| `gh issue list` | Error | Retry once. If failing, proceed with empty skip list. |
+| `lei` | Unavailable | Fall back to `git log --grep`. |
+| `git` | Unexpected output | Run `pwd` to verify cwd is the Linux repo root. If wrong, run `cd "$GITHUB_WORKSPACE"` to return to the workspace root; this recovery step is the only permitted use of `cd`. |
+| Build / vmtest | Failure | Record error, do not retry more than once. |