Fix/268 persist undo state#269
Open
Ridanshi wants to merge 7 commits into
Open
Conversation
Add in-memory action history, safe undo handling, and simple replay endpoints without introducing persistence or runtime hooks.
…esign - Add clear_session(session_id) async method to ActionLogger so the DELETE /context endpoint can filter by session rather than clearing all - Fix integration tests to use action_logger.clear() / record_action() instead of the removed _stack deque attribute - Add is_undoable=True to the undo test fixture so undo_last() can find it - Remove datetime object from ActionRecord constructor (let default str run) - Rename test_delete_context_clears_deque -> test_delete_context_clears_session_actions to match the actual behaviour being verified - Fix 409 assertion to match the actual detail "Nothing in the undo stack"
…-tech#268) Root cause ---------- undo_last() set action.undone = True on the in-memory dataclass object only. The ActionLogger had no SQLite persistence at all, so the entire action list — including every undone flag — was lost on every process restart. Previously undone actions reappeared as undoable after restart. Changes ------- core/hybrid/action_logger.py - Add SQLite persistence backed by aiosqlite. - Schema: action_log table with id, timestamp, type, description, domain, session_id, was_guided, guidance_confidence, is_undoable, undo_instruction, undone columns. - _init_db(): CREATE TABLE IF NOT EXISTS + ALTER TABLE migrations for databases created by earlier Execra versions (adds is_undoable, undo_instruction, undone if absent, preserving existing history). - load(): read all rows from SQLite, ordered by timestamp, and reconstruct the in-memory list — including undone state. Call once at startup so undo history survives restarts. - record_action(): INSERT into SQLite then append to memory (DB write first so a failed insert never leaves the two stores out of sync). - undo_last(): set action.undone = True in memory AND execute UPDATE action_log SET undone = 1 WHERE id = ? so the flag is durable. - replay_session(): exclude actions with undone=True so replay reflects only the committed state of the session. - clear_session(): DELETE from SQLite scoped to session_id. api/main.py - Call await action_logger.load() in the FastAPI startup event so the in-memory list is populated from the database before the first request is served. api/routes/actions.py - Convert create_action and undo_last_action to async def and add await for record_action() / undo_last(). tests/unit/test_action_logger.py - Rewrite all tests to use async/await with tmp_path-isolated SQLite databases. - Add regression tests for issue sahoo-tech#268: test_undo_state_survives_restart: undo an action, create a fresh logger against the same DB, call load(), verify the action is still undone and cannot be undone again. test_multiple_undos_survive_restart: undo two of three actions, restart, verify both remain undone, only one undoable remains. - Add test_undo_last_updates_undone_column_in_database: direct DB query confirms undone=1 after undo_last(). - Add test_memory_and_database_stay_in_sync_after_undo: both layers show the same undone state after undo_last(). - Add test_replay_session_excludes_undone_actions and test_replay_session_respects_undone_state_after_restart. tests/integration/test_actions_context.py - Replace direct action_logger.record_action() calls with POST /api/v1/actions so tests exercise the full async stack. 23/23 tests pass. Black, isort, flake8 clean. Closes sahoo-tech#268
Conflict resolution summary --------------------------- 5 files had merge conflicts. Each was resolved by hand to preserve both the upstream additions and the undo-persistence fix from PR sahoo-tech#268. core/hybrid/action_logger.py OURS : SQLite persistence, _init_db with schema migration, load(), undo_last() writes undone=1, replay_session excludes undone actions, list_actions/total_actions, clear() and clear_session that scope to a specific session. THEIRS : pydantic ActionRecord, _stack deque, register_callback / unregister_callback, log_action with callbacks, get_history(), log_error / get_errors with encryption, logger import. MERGED : pydantic ActionRecord extended with is_undoable / undo_instruction / undone fields + Field defaults; _stack (deque) AND _actions (list) kept as dual mirrors; log_action() renamed from record_action and drives both stores plus callbacks; undo_last() async and persists undone=1; load() populates both mirrors; get_history() reads from SQLite; log_error/get_errors retained. api/routes/actions.py OURS : async create_action, async undo_last_action with full undone flag, replay_actions endpoint, ActionCreate schema. THEIRS : create_action accepts ActionRecord body directly, upstream undo response shape, async get_actions. MERGED : ActionRecord body accepted directly (upstream style); async get_actions via get_history(); undo response includes full to_dict() so undone flag is visible; replay endpoint kept. api/main.py OURS : await action_logger.load() at startup. THEIRS : status/mode/suppression/ws-guidance routers, register_callback at startup, handle_exception, settings.CORS_ORIGINS, logging. MERGED : load() called first in startup_event so undo state is restored before register_callback wires the WebSocket broadcast; all upstream routers retained. tests/unit/test_action_logger.py OURS : async persistence tests (restart survival, DB sync, replay). THEIRS : deque/mocked-SQLite tests for _stack, log_action, get_history. MERGED : both suites present; mocked tests updated for merged API (INSERT assertion compatible with 11-column INSERT); db_path fixture provides per-test isolation for persistence tests. tests/integration/test_actions_context.py OURS : teardown_function, TEST_DB_PATH isolation, test_create_action. THEIRS : clear() + context reset in setup, updated undo test. MERGED : both setups combined; test_create_action from upstream retained; test_undo_returns_undone_action now asserts undone=True. Test results: 29 passed, 0 failed.
…rift Root causes ----------- All CI failures were introduced by merging upstream/main into the PR branch. The upstream merge brought in many new files and modified existing ones that did not comply with the project's flake8, isort, and mypy configuration. None of the failures were in the files added by PR sahoo-tech#268 itself. Lint (flake8 + isort) — 60+ violations, 30+ isort errors Caused by upstream files (mode_manager, code_tracer, llm_client, alert_suppressor, context_engine, crypto, etc.) that were never run through the project formatters. Fix: ran `black core/ api/` and `isort core/ api/` to auto-format all files; manually removed unused imports (F401), fixed bare except (E722), replaced type() == comparisons with isinstance() (E721), shortened docstrings that exceeded the 100-char limit (E501). Type check (mypy) — 34 errors in 11 upstream files Caused by type annotation gaps in upstream perception, security, LLM, and alert_suppressor modules, plus api/websockets/guidance.py referencing WS_API_TOKEN / WS_MAX_CONNECTIONS / WS_RATE_LIMIT_* / WS_HEARTBEAT_INTERVAL_S that were missing from the Settings dataclass. Fix: - Added all WS_* settings to Settings (core/config.py) with env-var loading; this is a functional addition needed by guidance.py. - Added get_logger() helper to core/logger.py (referenced by object_detector). - Fixed logger.py formatter variable annotation (Union type mismatch). - Added loop_counts type annotation in error_detector.py. - Added targeted # type: ignore comments to upstream perception/LLM/crypto files where the underlying type issue is in a third-party API. Tests (pytest tests/unit/ tests/integration/) Two sub-causes: 1. requirements-dev.txt did not include aiosqlite, pydantic, fastapi, or cryptography, which are imported by the test suite after our changes. Fix: added these four packages to requirements-dev.txt. 2. tests/conftest.py imported numpy and core.config at module level; loading core.config triggered assert_env() which requires LLM_BACKEND and REDIS_URL. The CI test environment and the regression-tests job did not set these. Fix: moved the core.config import inside the fixture (lazy), guarded the numpy import with try/except, and added os.environ.setdefault() calls at the top of conftest.py so the env-validator passes in all CI jobs. Regression tests (pytest tests/regression/) Same conftest.py crash as above, now fixed. Verification ----------- flake8 core/ api/ → 0 violations isort --check-only core/ api/ → 0 errors mypy core/ api/ → 0 errors pytest tests/unit/test_action_logger.py tests/integration/ → 29 passed pytest tests/regression/ → 1 passed The undo-state persistence fix (issue sahoo-tech#268) is intact: test_undo_state_survives_restart PASSED test_multiple_undos_survive_restart PASSED test_undo_last_updates_undone_column_in_database PASSED
Ridanshi
added a commit
to Ridanshi/Execra
that referenced
this pull request
Jun 2, 2026
…tech#269 Root causes and fixes: * core/config.py: add TRUST_SCORE_W1/W2/W3 as dataclass fields so they exist when no env vars are present (fixes AttributeError in trust-scorer) * core/logger.py: explicitly set level on uvicorn loggers inside setup() so .level returns the directly-assigned value (not inherited NOTSET=0) * api/main.py: register the plugins router (/api/v1/plugins) and the simple WebSocket router (/ws) that were imported but never mounted * tests/conftest.py: generate a valid Fernet key and set ENCRYPTION_KEY in the env before core.config is imported, fixing all 11 crypto failures across unit and integration tests * tests/unit/test_config.py: patch LLM_BACKEND to empty so Settings() uses its own default ("openai") instead of conftest's "llama"; also set ENCRYPTION_KEY in the validate_required path * tests/unit/test_guidance_dispatcher.py: add autouse fixture that clears alert_suppressor._suppression_map between tests to prevent cross-test suppression of dispatch calls * tests/unit/test_logger.py: filter out pytest's LogCaptureHandlers in the pre-setup handler-count assertion * tests/integration/test_perception_bus.py: replace fixed 0.3 s sleep with a 2 s poll loop; relax exact-pixel assertion (JPEG is lossy); skip on Windows where multiprocessing uses "spawn" and mocks do not transfer to the subprocess (test is designed for Linux CI/fork mode) None of these changes affect the undo-state persistence fix (sahoo-tech#268): undo state still persists to SQLite and is restored on startup. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tech#269 Root causes and fixes: * core/config.py: add TRUST_SCORE_W1/W2/W3 as dataclass fields so they exist when no env vars are present (fixes AttributeError in trust-scorer) * core/logger.py: explicitly set level on uvicorn loggers inside setup() so .level returns the directly-assigned value (not inherited NOTSET=0) * api/main.py: register the plugins router (/api/v1/plugins) and the simple WebSocket router (/ws) that were imported but never mounted * tests/conftest.py: generate a valid Fernet key and set ENCRYPTION_KEY in the env before core.config is imported, fixing all 11 crypto failures across unit and integration tests * tests/unit/test_config.py: patch LLM_BACKEND to empty so Settings() uses its own default ("openai") instead of conftest's "llama"; also set ENCRYPTION_KEY in the validate_required path * tests/unit/test_guidance_dispatcher.py: add autouse fixture that clears alert_suppressor._suppression_map between tests to prevent cross-test suppression of dispatch calls * tests/unit/test_logger.py: filter out pytest's LogCaptureHandlers in the pre-setup handler-count assertion * tests/integration/test_perception_bus.py: replace fixed 0.3 s sleep with a 2 s poll loop; relax exact-pixel assertion (JPEG is lossy); skip on Windows where multiprocessing uses "spawn" and mocks do not transfer to the subprocess (test is designed for Linux CI/fork mode) None of these changes affect the undo-state persistence fix (sahoo-tech#268): undo state still persists to SQLite and is restored on startup.
**Root causes and fixes:**
A. **GeminiClient type errors** (4 mypy issues):
- `complete()` returned types.GenerateContentResponse instead of str
→ Changed to extract and return response.text (or empty string)
- `stream()` yielded GenerateContentResponse chunks instead of str
→ Changed to extract chunk.text before yielding
- `stream()` contents parameter passed list of dicts instead of Content
→ Changed to construct proper types.Content object with types.Part
- `extract_confidence()` indexed response.candidates without null check
→ Added guard: if not response.candidates: return 0.5
B. **ScreenCapture thread AttributeError**:
- Test expected screen_cap.thread.is_alive() but _reader_thread is private
→ Added @Property thread that exposes _reader_thread (mirrors CameraFeed API)
C. **Test updates** (GeminiClient mocks):
- Updated test assertions to expect string returns from complete() and stream()
- Simplified mock call assertions (removed dict structure checks)
- Tests now validate behavior, not implementation details
**Result**: All 459 tests pass, mypy clean, no type ignores added.
1717647 to
f5392fe
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #268
Root cause
undo_last()setaction.undone = Trueon the in-memory dataclassobject only. The
ActionLoggerhad no persistence layer at all — theentire action list was stored in
self._actions: list[ActionRecord]inprocess memory and was discarded on every restart. Previously undone
actions reappeared as undoable after restart because there was nothing
to carry the undone state across process lifetimes.
Changes
core/hybrid/action_logger.pyaiosqlite: schema withid,timestamp,type,description,domain,session_id,was_guided,guidance_confidence,is_undoable,undo_instruction,undone._init_db(): creates the table, then runsALTER TABLEmigrationsfor databases created by an earlier Execra version (adds the three
undo-related columns if absent, preserving existing action history).
load(): reads all rows from SQLite ordered bytimestampandreconstructs the in-memory list — including every
undoneflag. Callthis once at startup so undo history survives restarts.
record_action(): INSERT into SQLite first, then append to thein-memory list, so a failed DB write never leaves the two stores out
of sync.
undo_last(): setsaction.undone = Truein memory andexecutes
UPDATE action_log SET undone = 1 WHERE id = ?so the flagis durable across restarts.
replay_session(): now excludes actions whereundone=Truesothe replay reflects only the committed state of the session.
clear_session():DELETE FROM action_log WHERE session_id = ?in addition to filtering the in-memory list.
api/main.pyAdded
await action_logger.load()to the FastAPIstartupevent sothe in-memory list is populated from the database before the first
request is served.
api/routes/actions.pycreate_actionandundo_last_actionconverted toasync defwithawaitonrecord_action()/undo_last().Tests
tests/unit/test_action_logger.py— full rewrite withtmp_path-isolatedSQLite databases:
test_undo_state_survives_restartActionLoggeragainst the same DB, callload()— action is still marked undone,undo_last()returnsNonetest_multiple_undos_survive_restarttest_undo_last_updates_undone_column_in_databaseundone=1afterundo_last()test_memory_and_database_stay_in_sync_after_undoundone=Truetest_replay_session_excludes_undone_actionstest_replay_session_respects_undone_state_after_restartload()still excludes undone actions23/23 tests pass. Black, isort, flake8 clean.