AiML Continuation Lab is currently a static React prototype for an AI/ML research-agent IDE. The working app lives in prototype/AiML Prototype.html and renders a multi-panel workspace for experiment runs, traces, policy overlays, graph views, and chat.
- No package manager or build manifest is present at the repo root.
- The prototype is browser-run from a single HTML entrypoint.
- React 18.3.1, ReactDOM 18.3.1, and Babel Standalone 7.29.0 are loaded from CDNs in the HTML shell.
- Application state is hydrated from static browser globals defined in `prototype/data.js`, `prototype/backend_slice_loader.js`, and `prototype/first_trace_payload.js`; first-run LLM, retrieval, tabular, and vision workspaces resolve generated backend slices before fixture fallbacks.
- Repository verification is currently a deterministic Node smoke baseline captured in `.supervisor/project.json` and executed without npm scripts or a build toolchain.
- The current machine verify baseline includes nineteen deterministic Node smokes: `prototype/first_trace_render_smoke.js`, `prototype/live_resume_smoke.js`, `prototype/codex_resume_bridge_smoke.js`, `prototype/codex_resume_stream_smoke.js`, `prototype/codex_resume_persistence_smoke.js`, `prototype/queued_followup_execution_smoke.js`, resume history, queued-followup UI, local bridge preflight UI/data smokes, runtime contract and frontend slice checks, orchestration and retriever smokes, graph and memory schema checks, structured corpus validation, the Codex runtime adapter smoke, and `knowledge/codex_first_loop_smoke.js`.
- Product direction is Codex-first: the agent loop is intended to run through Codex CLI or SDK integration first, with Claude compatibility designed in from the start and implemented later.
- The orchestration control plane is documented in `ARCHITECTURE.md`.
- Multi-turn continuation now includes one deterministic executable Codex-first follow-up path on top of the canonical checkpoint/resume contract, while staying runtime-agnostic at the orchestration boundary. The browser prototype hydrates the canonical resumed-turn follow-up slice and can apply local bridge resume and queued follow-up transport results through the same frontend slice/current-run selection path. Bridge-produced resume outputs are also persisted as repo-owned runtime state under `knowledge/generated/runtime_state/codex_resume/`.
- When served through `prototype/codex_resume_bridge.js`, the multi-turn run panel shows a compact local-runtime preflight surface before live resume and queued follow-up controls. It reports the status probe result, bridge id, scoped resume/stream/queued-follow-up/history capabilities, supported queued action kinds, and runtime-state history counts without executing actions.
- Multi-turn run panels also include a read-only checkpoint lineage branch inspector that groups checkpoint creation, resume, branch, and queued-follow-up runtime-history refs by checkpoint/action provenance without adding branch mutation controls.
- Multi-turn turn/run/checkpoint legality is now centralized in `knowledge/lib/multi_turn_state_machine.js`, and adapter/loop/frontend continuation projections consume that deterministic state machine instead of ad hoc transition logic.
- When multiple backend slices contribute competing `current` runs for one domain, `prototype/data.js` applies an explicit deterministic policy: continuation-bearing resumed slices outrank queued-follow-up slices, which outrank other runtime-backed backend currents, which outrank legacy `current` tags and the final first-run fallback. Tabular and vision now use backend-backed first-run current slices while the older tabular static rows remain additive history.
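The precedence rule above can be sketched as a rank table. This is a minimal illustration under stated assumptions, not the code in `prototype/data.js`: the `kind` labels, the `runId` field, and the `pickCurrentRun` helper are hypothetical names introduced here, and the tie-break on input order is also an assumption.

```javascript
// Hypothetical slice kinds, ordered from highest to lowest precedence.
const CURRENT_RUN_PRECEDENCE = [
  "resumed_continuation", // continuation-bearing resumed slices
  "queued_followup",      // queued-follow-up slices
  "runtime_backed",       // other runtime-backed backend currents
  "legacy_current_tag",   // legacy `current` tags
  "first_run_fallback",   // final first-run fallback
];

// Pick the winning candidate for one domain. Ties keep the earliest
// candidate, so the result is deterministic for a given input order.
function pickCurrentRun(candidates) {
  let best = null;
  let bestRank = Infinity;
  for (const candidate of candidates) {
    const rank = CURRENT_RUN_PRECEDENCE.indexOf(candidate.kind);
    if (rank === -1) continue; // unknown kinds never win
    if (rank < bestRank) {
      best = candidate;
      bestRank = rank;
    }
  }
  return best;
}

// Illustrative candidates only; the run ids are placeholders.
const winner = pickCurrentRun([
  { runId: "run-legacy", kind: "legacy_current_tag" },
  { runId: "run-queued", kind: "queued_followup" },
  { runId: "run-resumed", kind: "resumed_continuation" },
]);
console.log(winner.runId); // run-resumed
```

Keeping the order in a single array means a new slice kind needs only one insertion at the right position.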
- `prototype/AiML Prototype.html`: browser entrypoint and root `App`.
- `prototype/ide-components.jsx`: shared icons and chart helpers used by the IDE shell.
- `prototype/ide-sidebar.jsx`: left explorer and related navigation UI.
- `prototype/ide-run.jsx`: main run tab and experiment detail panels.
- `prototype/ide-overlays.jsx`: overlays such as graph, policy, sources, and command palette.
- `prototype/backend_slice_loader.js`: loads generated-first backend slice artifacts into browser globals, with runtime examples as fallback fixtures for LLM, retrieval, tabular, vision, and resumed follow-up contexts.
- `prototype/codex_resume_bridge.js`: dependency-free local Node bridge that serves the prototype and exposes scoped checkpoint resume, streamed resume, and persisted queued follow-up endpoints.
- `prototype/codex_resume_transport.js`: optional browser-side transport registration that enables local resume, streamed resume, and queued follow-up calls only when the local bridge status endpoint is available.
- `prototype/data.js`: compatibility projections from the canonical slice plus legacy prototype datasets.
- `prototype/first_trace_payload.js`: first-trace recommendation payload.
- `prototype/ide.css`: main IDE styling.
- `prototype/styles.css`: older prototype styling for alternate views.
- `knowledge/README.md`: corpus, graph, memory, and checkpoint layout for the future orchestration substrate.
- `knowledge/lib/runtime_state_store.js`: dependency-free writer for durable Codex resume runtime state manifests, refreshed checkpoint snapshots, applied frontend slices, runtime transcript snapshots, queued follow-up state, and checkpoint lineage refs.
- `knowledge/lib/queued_followup_execution.js`: validates persisted queued continuation state and dispatches supported `experiment.queue_run` actions through the Codex-first follow-up loop.
- `knowledge/schemas/`: canonical record schemas for playbooks, metric rules, stage rules, interventions, trajectories, and policy rules.
- `knowledge/sources/sources_seed.csv`: first curated source inventory for corpus acquisition.
Open prototype/AiML Prototype.html in a browser. Because dependencies are loaded from public CDNs, internet access is required unless those assets are vendored later. Backend slices now resolve generated artifacts under knowledge/generated/ before falling back to knowledge/runtime_examples/, so prefer serving the repo from the root with a simple static server such as python3 -m http.server instead of relying on file:// mode.
To run the prototype with the local Codex resume bridge, use:
node prototype/codex_resume_bridge.js
Then open http://127.0.0.1:4177/prototype/AiML%20Prototype.html. This bridge is a local development/runtime boundary for checkpoint-backed resume, not a standalone inference API.
Each successful POST /aiml/codex/resume writes durable repo artifacts under knowledge/generated/runtime_state/codex_resume/. The manifest links the refreshed checkpoint, applied frontend slice, runtime transcript, queued follow-up action state, lineage refs, and the canonical generated slice path loaded by prototype/backend_slice_loader.js after a browser reload or bridge restart. POST /aiml/codex/queued-followup can then select a persisted pending experiment.queue_run action by checkpointId and actionId, execute it through the same Codex-first path, and refresh the persisted artifacts.
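A queued follow-up call against the local bridge might look like the sketch below. The endpoint path and port come from the text above. The JSON body shape (top-level `checkpointId` and `actionId` keys) is an assumption, and `buildQueuedFollowupRequest` and `executeQueuedFollowup` are hypothetical helpers, not functions from the repository.

```javascript
// Bridge origin, taken from the URL used to open the prototype.
const BRIDGE_ORIGIN = "http://127.0.0.1:4177";

// Build (but do not send) a queued follow-up request.
// Assumption: the bridge reads `checkpointId` and `actionId` from a JSON body.
function buildQueuedFollowupRequest(checkpointId, actionId) {
  if (!checkpointId || !actionId) {
    throw new Error("checkpointId and actionId are both required");
  }
  return {
    url: `${BRIDGE_ORIGIN}/aiml/codex/queued-followup`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ checkpointId, actionId }),
    },
  };
}

// Send the request with the global fetch available in Node 18+ and browsers.
async function executeQueuedFollowup(checkpointId, actionId) {
  const { url, init } = buildQueuedFollowupRequest(checkpointId, actionId);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`queued follow-up failed with HTTP ${response.status}`);
  }
  return response.json();
}

// Placeholder ids for illustration only; real ids come from persisted state.
const example = buildQueuedFollowupRequest("checkpoint-example", "action-example");
console.log(example.url);
```

Separating request construction from the network call keeps the request shape checkable without a running bridge.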
Run the verified repository checks:
node prototype/first_trace_render_smoke.js
node prototype/live_resume_smoke.js
node prototype/codex_resume_bridge_smoke.js
node prototype/codex_resume_stream_smoke.js
node prototype/codex_resume_persistence_smoke.js
node prototype/queued_followup_execution_smoke.js
node prototype/codex_resume_history_smoke.js
node prototype/queued_followup_ui_smoke.js
node prototype/codex_resume_preflight_smoke.js
node knowledge/runtime_contract_smoke.js
node knowledge/frontend_slice_smoke.js
node knowledge/orchestration_core_smoke.js
node knowledge/retriever_query_contract_smoke.js
node knowledge/filesystem_corpus_retriever_smoke.js
node knowledge/graph_access_smoke.js
node knowledge/experiment_memory_smoke.js
node knowledge/structured_corpus_smoke.js
node knowledge/codex_runtime_adapter_smoke.js
node knowledge/codex_first_loop_smoke.js
The same 19-command machine baseline is registered in .supervisor/project.json as one chained sequential testCommand. It intentionally excludes lint, typecheck, build, coverage, and browser automation because those toolchain stages do not exist in the current repository.
Several prototype and Codex loop smokes intentionally write canonical files under knowledge/generated/ so generated-first hydration can be verified. Run those smokes sequentially; if two shared-writer smokes overlap, the smoke lock fails fast with a generated-artifact lock message instead of allowing ambiguous partial JSON reads. Smokes that do not need canonical loader paths should pass an isolated generatedDir through the Codex loop.
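One way to get the fail-fast behavior described above is an exclusive-create lock file. The sketch below is an assumption about the mechanism, not the repository's smoke lock: `acquireGeneratedLock`, the lock file name, and the error text are hypothetical, and the demo uses a temporary directory instead of knowledge/generated/.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Create the lock file exclusively. The "wx" flag makes openSync throw
// EEXIST when the file already exists, so a second writer fails fast.
function acquireGeneratedLock(lockPath, owner) {
  let fd;
  try {
    fd = fs.openSync(lockPath, "wx");
  } catch (error) {
    if (error.code === "EEXIST") {
      throw new Error(`generated-artifact lock is already held: ${lockPath}`);
    }
    throw error;
  }
  // Record who holds the lock to make a stale lock easier to diagnose.
  fs.writeSync(fd, JSON.stringify({ owner, pid: process.pid }));
  fs.closeSync(fd);
  // Return a release function for the caller's finally block.
  return () => fs.rmSync(lockPath, { force: true });
}

// Demo in a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "aiml-lock-"));
const demoLock = path.join(demoDir, "generated.lock");

const release = acquireGeneratedLock(demoLock, "smoke-a");
let secondWriterBlocked = false;
try {
  acquireGeneratedLock(demoLock, "smoke-b"); // overlapping writer
} catch (error) {
  secondWriterBlocked = true;
}
release();
console.log(secondWriterBlocked); // true
```

Failing on an existing lock, rather than waiting for it, matches the stated goal of surfacing overlap immediately instead of risking partial JSON reads.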
To run the optional live Codex CLI integration check on top of the deterministic loop coverage, use:
AIML_ENABLE_LIVE_CODEX_SMOKE=1 node knowledge/codex_first_loop_smoke.js
The loop smoke evaluates the higher-level codex_first_loop path locally, including canonical envelope normalization, command auditing with fail-closed mismatch behavior, provenance propagation, and append-only memory writes across repeated runs.
- The canonical multi-turn continuation contract is defined in `knowledge/schemas/runtime_checkpoint.schema.json` and the paired fixtures under `knowledge/runtime_examples/`.
- Repo-owned checkpoints are the source of truth for resume semantics. Runtime-managed handles such as OpenAI `previous_response_id` chains or Claude session resume remain adapter-local implementation details.
- The canonical resumed turn uses `turn.resume`, re-checks corpus and policy state, appends memory, and can queue bounded follow-up actions named by `pending_follow_up_action_ids`.
- Frontend continuation state is additive through optional `runtime_session`, `turn_timeline`, and `continuation` objects so existing single-turn slice consumers stay valid.
- `prototype/backend_slice_loader.js` and `prototype/data.js` now hydrate first-run slices plus canonical resumed LLM, retrieval, and additive tabular follow-up slices through the existing backend fixture path. The tabular continuation slice is generated/fallback inspectable but is not a bridge/UI execution target yet.
- The frontend-visible multi-turn flow remains generated-slice/fallback friendly, and the resumed LLM card can now call `window.AIML_CODEX_RESUME_TRANSPORT.resumeCheckpointFollowup(request)` or `window.AIML_CODEX_RESUME_TRANSPORT.executeQueuedFollowup(request)` when the local bridge is available. Runtime session handles stay adapter-local in the request/response boundary; repo checkpoint ids and continuation records remain canonical in the applied frontend slice.
- Local bridge streaming is intentionally narrow: `POST /aiml/codex/resume/stream` accepts the same scoped `codex.resume_checkpoint_followup` JSON body as `POST /aiml/codex/resume` and returns newline-delimited JSON. Ordered `runtime_event` records are emitted first and projected into `turn_timeline.messages` while the resume is running; the final `frontend_slice` record carries the same canonical response shape used by the non-streaming endpoint, and `prototype/data.js` still applies that final slice through `applyBackendSlice(...)`.
- Bridge resume results are durable across browser reloads and bridge restarts because the bridge persists repo-owned runtime state manifests plus checkpoint, frontend slice, runtime transcript, queued action, and lineage artifacts while leaving generated-first slice hydration on the existing loader paths.
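The streamed response described above (ordered runtime events, then one final slice) could be consumed roughly as follows. This is a sketch under assumptions: the `type` discriminator field, the `message` and `slice` payload fields, and the `parseResumeStream` helper are hypothetical, since the record layout is not specified here.

```javascript
// Split an NDJSON payload into ordered runtime events and the final slice.
// Assumption: each record carries a `type` field naming its record kind.
function parseResumeStream(ndjsonText) {
  const runtimeEvents = [];
  let frontendSlice = null;
  for (const line of ndjsonText.split("\n")) {
    if (line.trim() === "") continue; // tolerate a trailing newline
    const record = JSON.parse(line);
    if (record.type === "runtime_event") {
      if (frontendSlice !== null) {
        throw new Error("runtime_event arrived after the final frontend_slice");
      }
      runtimeEvents.push(record);
    } else if (record.type === "frontend_slice") {
      frontendSlice = record;
    }
  }
  if (frontendSlice === null) {
    throw new Error("stream ended without a frontend_slice record");
  }
  return { runtimeEvents, frontendSlice };
}

// Illustrative payload; the field values are placeholders.
const sample = [
  JSON.stringify({ type: "runtime_event", message: "resume started" }),
  JSON.stringify({ type: "runtime_event", message: "policy re-checked" }),
  JSON.stringify({ type: "frontend_slice", slice: { domain: "llm_finetuning" } }),
].join("\n") + "\n";

const parsed = parseResumeStream(sample);
console.log(parsed.runtimeEvents.length, parsed.frontendSlice.slice.domain);
```

Rejecting a stream that lacks a final slice keeps the consumer aligned with the non-streaming endpoint, where the slice is the whole response.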
- No `package.json`, lockfile, or formal dev environment.
- No automated lint, typecheck, build, coverage, or browser test pipeline.
- No documented product requirements beyond the prototype assets and research notes.
- The knowledge corpus now includes normalized seed slices for `llm_finetuning`, `retrieval_reranking`, `tabular_classification`, and `vision_object_detection`, but ingestion is still intentionally curated and incomplete.
- The local Codex resume bridge is intentionally narrow and deterministic; broader streamed runtime updates and broader queued action-family execution are still future work.
The first usable corpus slice targeted llm_finetuning, aligned with the current prototype's strongest example flow:
- `knowledge/playbooks/llm_finetuning_first_run_review.yaml`
- `knowledge/metric_rules/llm_finetuning.yaml`
- `knowledge/stage_rules/first_run_review.yaml`
- `knowledge/intervention_library/llm_overfitting_after_step_inflection.yaml`
- `knowledge/policy/llm_baseline_guardrails.yaml`
- `knowledge/trajectories/human_curated/llm_first_run_overfit_trace.json`
The structured corpus now also covers high-signal first-run review scenarios for retrieval, tabular classification, and object detection. The newest vision_object_detection records add:
- low small-object AP/recall despite acceptable aggregate bbox mAP
- validation mAP regression after aggressive augmentation while train loss improves
- provenance-backed playbook, metric, intervention, policy, trajectory, run, trace, and lineage records for both scenarios
The intended execution model is:
- primary runtime: `Codex` via CLI or SDK
- secondary runtime: `Claude` via CLI or SDK later
- not a standalone inference API as the primary product shape
This means the core system should be designed around agent orchestration, tool execution, corpus traversal, experiment memory, and run-control workflows. Any API surface should be treated as optional glue, not the center of the architecture.
The current source of truth for that control plane is ARCHITECTURE.md, which defines the runtime-agnostic loop, adapter boundaries, and checkpoint model.