[codex] Redesign Minimap as a lean navigation graph #1
Draft
himattm wants to merge 4 commits into
`android layout` emits a flat array of nodes with hyphenated keys (`content-desc`, `resource-id`) and a stringified center `"[x,y]"`. `resolve_selector_point` was looking for the legacy UIAutomator shape (camelCase `contentDescription`/`testTag`, bounds object) and never matched against real CLI output. Live smoke confirmed: `minimap tap --selector content_desc=Settings` always failed with "Selector not found" against the running emulator.

Extend the resolver to accept both shapes so existing fake-adb test fixtures keep working and real CLI output is now supported:

- Selector key lookup tries hyphenated first, then camelCase.
- `center_of` falls back to parsing the `"[x,y]"` string when no bounds object is present.

Two new tests: `tap_selector_resolves_real_cli_shape` exercises the end-to-end selector path against a real-CLI-shaped layout fixture; `parse_center_string_parses_bracketed_pair` covers the parser edges.

Test count: 49 -> 51 passing, 0 failed, 1 ignored.
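The dual-shape lookup described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Node` alias (a plain string map standing in for a parsed JSON node), `lookup_attr`, and `resolve_content_desc` are hypothetical names, and the bounds-object path of `center_of` is left out so that only the `"[x,y]"` fallback is shown.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one layout node: attribute name -> string value.
pub type Node = HashMap<String, String>;

/// Parse a stringified center such as "[540,1200]" into (x, y).
/// Returns None for anything that is not a bracketed pair of integers.
pub fn parse_center_string(s: &str) -> Option<(i32, i32)> {
    let inner = s.trim().strip_prefix('[')?.strip_suffix(']')?;
    let (x, y) = inner.split_once(',')?;
    Some((x.trim().parse().ok()?, y.trim().parse().ok()?))
}

/// Look up an attribute, trying the hyphenated (real CLI) key first and
/// the camelCase (legacy fixture) key second.
pub fn lookup_attr<'a>(node: &'a Node, hyphenated: &str, camel: &str) -> Option<&'a str> {
    node.get(hyphenated)
        .or_else(|| node.get(camel))
        .map(String::as_str)
}

/// Find the first node whose content description equals `wanted` and
/// return its tap point, parsed from the stringified "center" field.
pub fn resolve_content_desc(nodes: &[Node], wanted: &str) -> Option<(i32, i32)> {
    nodes
        .iter()
        .find(|n| lookup_attr(n, "content-desc", "contentDescription") == Some(wanted))
        .and_then(|n| n.get("center"))
        .and_then(|c| parse_center_string(c))
}
```

Trying the hyphenated key first means real CLI output resolves on the first lookup, while legacy camelCase fixtures still match on the second.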
Summary
This PR resets Minimap around the narrower product goal we aligned on: an Android-only navigation memory layer for agents. It replaces the heavier proposal/journal-oriented model with a lean graph of semantic places and deterministic UI edges that agents can grow, replay, and validate over time.
Key changes:
- `.minimap/graph` model with semantic places, place variants, deterministic edge files, explicit viewport compatibility for coordinate taps, and one active Android app profile.
- `init`, `doctor`, `whereami`, `layout`, `tap`, `scroll`, `back`, and `go`.
- `android layout` shape.

Review Focus
Please review this as a breaking v1 redesign rather than an incremental compatibility patch.
High-value review areas:
- `crates/minimap-cli/src/main.rs` and `crates/minimap-cli/tests/cli_contract.rs`.
- `crates/minimap-core/src/lib.rs`.
- `crates/minimap-graph/src/lib.rs`.
- `crates/minimap-repo/src/lib.rs`.
- `crates/minimap-schemas/src/lib.rs`.
- `plugins/minimap-claude-code/skills/minimap-app-navigation/SKILL.md`.
- `docs/MINIMAP_BENCHMARK_NOTES.md` and `docs/MINIMAP_CHANGE_BENCHMARK_PROTOCOL.md`.

Known local-only files intentionally excluded from the PR:

- .claude/settings/checkpoints.

Validation
Ran locally:
Also ran a controlled change smoke against the installed Minimap CLI using fake `android layout` and fake `adb` commands:

Smoke coverage:
- `known_changed`, one place variant, no edge churn.
- `needs_label`, then new place plus new edge.
- `go` succeeded, destination variant added.
- `config_error`, repair recorded a replacement edge.
- `config_error`, no new edge recorded.

Notes
The real Compose sample changed-app benchmark is not included yet. The current benchmark evidence covers known-path replay on Jetsnack plus deterministic change-case smoke. A follow-up should run the same protocol against a modified Compose sample build before treating the performance numbers as product claims.
Update (2026-06-11): hardening + device targeting
Two commits landed since the original push, closing the CI failure and the validated hardening backlog:
Harden matching tolerance, graph writes, and CLI safety

- `edge_id` panic on long selectors; `no_compatible_path` vs `no_known_path` reachability; atomic graph writes; pending-transition TTL + dangling-edge guard; cross-device cache bleed (serial-less -> no cache); arg validation before mutation; duplicate-id detection in `validate_graph`; overlay `android:id/button1` false-positive removed; `doctor` exit code routed through `exit_code_for_status`; 0600/0700 cache-file perms; default-deny redaction with tightened email/numeric heuristics.
- `normalize_label` uses deunicode transliteration with a never-empty fallback; duplicate slugs surface `label_mismatch` unless `--allow-duplicate-label`.
- `KNOWN_CHANGED_THRESHOLD` tuned to 0.80, band-center of the measured clean gap [0.689, 0.902] on Jetsnack. Sibling detail screens (e.g. two product details) intentionally merge into one item-agnostic place.
- `manual_option_zip`, rustfmt drift).

Thread device serial through adb and android CLI calls

- `--serial` flag with `ANDROID_SERIAL` env fallback; every `adb` call now carries `-s <serial>`, `android layout` carries `--device=<serial>`, and `android screen` subcommands get `ANDROID_SERIAL` on the child process. With a configured serial, cache scoping no longer depends on `adb get-serialno` succeeding.
- `doctor` now detects the multiple-devices-without-serial condition and reports an actionable hint instead of a raw adb failure; with a serial it reports the targeted device.

Validation:
- `cargo fmt --check`, `cargo clippy --all-targets -- -D warnings`, `cargo test` -> 100 passed, 0 failed. Earlier live e2e on Jetsnack validated the full loop (init -> label -> grow -> re-identify known -> `go` replay -> viewport-mismatch refusal).

Deferred follow-ups (intentionally out of scope for v1):
- `android layout` CLI emits their buttons as `text` without the resource-ids the detector scans).
- `whereami` can report a stale place if the screen changed via raw adb in between.
- `skipped_edges` diagnostics are noisy.
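For reviewers of the "Thread device serial" commit, the flag-over-environment precedence and the per-command argument shapes might look roughly like the sketch below. The function names (`resolve_serial`, `adb_args`, `android_layout_args`) are hypothetical and not taken from the PR, and treating an empty or whitespace-only value as absent is an assumption of this sketch. Only the `-s <serial>` and `--device=<serial>` shapes come from the description above.

```rust
/// Pick the device serial: the `--serial` flag value wins, otherwise fall
/// back to the ANDROID_SERIAL value. Assumption of this sketch: blank
/// values count as "not set".
pub fn resolve_serial(flag: Option<&str>, env_value: Option<&str>) -> Option<String> {
    flag.into_iter()
        .chain(env_value)
        .map(str::trim)
        .find(|s| !s.is_empty())
        .map(str::to_owned)
}

/// Build the argument list for an `adb` call, prefixing `-s <serial>`
/// when a serial is configured.
pub fn adb_args(serial: Option<&str>, rest: &[&str]) -> Vec<String> {
    let mut args = Vec::new();
    if let Some(s) = serial {
        args.push("-s".to_string());
        args.push(s.to_string());
    }
    args.extend(rest.iter().map(|a| a.to_string()));
    args
}

/// Build the argument list for `android layout`, appending
/// `--device=<serial>` when a serial is configured.
pub fn android_layout_args(serial: Option<&str>) -> Vec<String> {
    let mut args = vec!["layout".to_string()];
    if let Some(s) = serial {
        args.push(format!("--device={s}"));
    }
    args
}
```

A caller would typically read the fallback with `std::env::var("ANDROID_SERIAL").ok()` and hand the resulting vectors to `std::process::Command::args`. Keeping the argument builders pure makes the precedence rule testable without a device attached.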