AGENTS.md — Operating Rules for Codex CLI

Purpose

This repository implements an artifact-driven educational lab for exploring:

  • Supervised Autoencoders (AE)
  • Gaussian Mixture VAEs (GM-VAE)
  • Latent projections (PCA / UMAP 2D & 3D)
  • kNN neighbor graphs
  • Interactive visualization and interpretation tools

Your role as an agent is to extend and improve the system while preserving its architectural guarantees.

This document defines those guarantees.


Core Architectural Principles (Non-Negotiable)

1. Artifact-Driven Visualization

The frontend must never recompute ML results.

All embeddings, projections, distances, neighbors, uncertainty metrics, and derived model quantities:

  • Are computed in Python
  • Persisted as artifacts
  • Loaded and rendered by the frontend

Frontend may compute:

  • Pixel transforms
  • Camera transforms
  • Pure UI state
  • Visual scaling based on persisted data

Frontend must not compute:

  • Latent embeddings
  • PCA/UMAP
  • Distance matrices
  • kNN graphs
  • Sampling from model distributions

2. Backend Owns All Computation

Training, projections, neighbors, sampling, feature extraction, uncertainty metrics, etc. are:

  • Implemented in Python
  • Versioned via artifact output
  • Persisted per-epoch

Visualization work must not silently modify training semantics.

If training behavior changes:

  • It must be explicit
  • It must be documented
  • Tests must reflect it

3. Per-Epoch Persistence Contract

All time-varying artifacts must follow:

  • *_epoch_<N>.json
  • *_latest.json

Runs live under:

runs/<run_id>/

Typical structure:

config.json
events.jsonl
projections/
neighbors/
samples/
checkpoints/

When introducing a new artifact type:

  • Provide per-epoch + latest
  • Keep schema stable and minimal
  • Ensure deterministic regeneration
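
As an illustration of the per-epoch + latest pattern, here is a minimal sketch of a writer. The helper name `write_epoch_artifact`, its parameters, and the example artifact name `pca_2d` are assumptions for illustration, not names taken from this repository.

```python
import json
import os
import tempfile
from pathlib import Path


def write_epoch_artifact(run_dir: Path, kind: str, name: str, epoch: int, payload: dict) -> Path:
    """Write <name>_epoch_<N>.json and refresh <name>_latest.json under run_dir/kind/.

    Hypothetical helper: the function name and arguments are illustrative only.
    """
    out_dir = run_dir / kind
    out_dir.mkdir(parents=True, exist_ok=True)

    # sort_keys and fixed separators keep the bytes identical when the
    # same payload is regenerated.
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"))

    epoch_path = out_dir / f"{name}_epoch_{epoch}.json"
    latest_path = out_dir / f"{name}_latest.json"
    for target in (epoch_path, latest_path):
        # Write to a temp file in the same directory, then replace the target,
        # so a reader never observes a partially written file.
        fd, tmp = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
        os.replace(tmp, target)
    return epoch_path
```

Calling `write_epoch_artifact(Path("runs/<run_id>"), "projections", "pca_2d", 3, payload)` would produce `projections/pca_2d_epoch_3.json` and an identical `projections/pca_2d_latest.json`.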

4. Determinism

For a given:

  • run
  • epoch
  • projection method
  • visualization settings

Rendering should be stable and reproducible.

Avoid:

  • Random jitter
  • Order instability
  • Implicit sorting differences

The exception is randomness that is intentional and persisted.
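
One way to keep intentional randomness reproducible is to fix the ordering explicitly and store the seed alongside the result. The sketch below assumes a hypothetical `build_sample_artifact` helper and string point IDs; neither comes from this repository.

```python
import random


def build_sample_artifact(point_ids: list[str], k: int, seed: int) -> dict:
    """Pick k points reproducibly and record the seed that produced the choice.

    Hypothetical helper, shown only to illustrate the determinism rule.
    """
    # Sort first so the result does not depend on the caller's input order.
    ordered = sorted(point_ids)
    # Use a local RNG so global random state is neither read nor modified.
    rng = random.Random(seed)
    chosen = sorted(rng.sample(ordered, k))
    # Persisting the seed makes the randomness intentional and replayable.
    return {"seed": seed, "point_ids": chosen}
```

With this shape, the same run, epoch, and seed yield the same artifact regardless of the order in which points were collected.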


SSE Event Contract

SSE powers live updates.

Rules:

  • Prefer additive schema changes
  • Do not rename event types casually
  • Update backend schemas and frontend TypeScript types together
  • Do not silently break event parsing
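
To make "additive schema changes" concrete, here is a minimal sketch of an event serializer and a tolerant reader. The event type `epoch_end` and the fields `epoch`, `loss`, and `val_loss` are hypothetical; the real event names live in the backend schemas and the frontend TypeScript types. The reader is written in Python only to show the idea that applies on the consuming side.

```python
import json


def format_sse(event_type: str, data: dict) -> str:
    """Serialize one event in the text/event-stream wire format."""
    body = json.dumps(data, sort_keys=True)
    # An SSE message is a set of "field: value" lines ended by a blank line.
    return f"event: {event_type}\ndata: {body}\n\n"


def parse_epoch_end(data: dict) -> dict:
    """Read known fields only; unknown fields are ignored.

    Hypothetical payload shape, used to illustrate an additive change.
    """
    return {
        "epoch": int(data["epoch"]),
        "loss": float(data["loss"]),
        # Field added later: older producers omit it, so default to None.
        "val_loss": data.get("val_loss"),
    }
```

Because the reader ignores unknown keys and defaults the optional one, adding `val_loss` does not break consumers of the older payload, whereas renaming `loss` would.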

Visualization & Performance Guardrails

The system must remain usable at ~100–200 focused points.

When modifying visualization logic:

  • Avoid O(N²) recomputation on slider drag
  • Memoize expensive transforms
  • Avoid recomputing neighbor lists unnecessarily
  • Keep projection scaling method-aware (PCA ≠ UMAP)
  • Use correct dimensional distance formulas (2D vs 3D)

When adding new gating/scaling logic, expose debug stats in a collapsible debug panel. Do not rely only on console logging.


Frontend Rules

  • Functional React components only
  • Explicit numeric clamping
  • No implicit type coercion
  • npx tsc --noEmit must pass
  • No silent failure paths
  • Errors must be surfaced in UI
  • Derived data should use useMemo

Rendering must remain artifact-driven.


Backend Rules

  • No training logic inside route handlers
  • Long-running work happens in background jobs
  • Always check cancellation flags
  • Never block the request loop
  • Keep modules small and testable
  • Prefer additive changes over refactors
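
The cancellation rule can be sketched with a standard-library `threading.Event` as the flag. The function `run_training_job` and the `train_one_epoch` callback are hypothetical names; the repository's actual job runner may be structured differently.

```python
import threading
from typing import Callable


def run_training_job(num_epochs: int, cancel: threading.Event,
                     train_one_epoch: Callable[[int], None]) -> int:
    """Run epochs until finished or cancelled; return the completed count.

    Hypothetical job body, meant to run in a background worker rather
    than inside a route handler.
    """
    completed = 0
    for epoch in range(num_epochs):
        # Check the flag before starting each epoch so a cancel request
        # takes effect at the next epoch boundary.
        if cancel.is_set():
            break
        train_one_epoch(epoch)
        completed += 1
    return completed
```

A route handler would then only create the `Event`, start the job on a worker such as `threading.Thread(target=run_training_job, args=(...))`, and return immediately; a separate cancel route calls `cancel.set()`.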

Agent Workflow Modes

Explore Mode (default)

  • Inspect code
  • Propose improvements
  • Offer 1–2 structured approaches
  • Recommend a direction

Execute Mode

  • Implement scoped changes
  • Keep diffs minimal
  • Run safe checks
  • Provide verification steps

Switch to Execute Mode automatically if all of the following hold:

  • Change is localized
  • No schema contracts are altered
  • No training semantics are modified

Be cautious if touching:

  • Artifact schema
  • SSE schema
  • Training loop
  • Run lifecycle

Call out those changes explicitly.


Safe Command Autonomy Policy

You are allowed to run the following without approval:

Read-only

  • ls, tree, find
  • cat, sed, rg, grep
  • git status, git diff, git log

Frontend checks

  • cd frontend && npx tsc --noEmit
  • cd frontend && npm run build
  • cd frontend && npm run lint
  • cd frontend && npm test (if configured)

Backend checks

  • uv run pytest -q backend/tests
  • uv run python -m compileall .

Rule: If pytest fails due to environment issues, install missing dev dependencies with uv add --dev <pkg> and retry. Do not fall back to unittest.

You must ask before:

  • Adding major new dependencies
  • Upgrading large dependency trees
  • Deleting runtime data (runs/)
  • Running destructive filesystem commands
  • Rewriting git history

Before Finishing Any Task

Confirm:

  • Frontend still loads prior runs deterministically
  • No ML recomputation was introduced in frontend
  • SSE still functions
  • TypeScript compiles
  • Tests pass (or explain failures)
  • Backend tests were run with uv run pytest
  • New artifacts (if added) follow per-epoch + latest pattern

Philosophy

Favor:

  • Stability over novelty
  • Additive improvements over refactors
  • Smooth visual behavior over mathematically abrupt gating
  • Clear debug instrumentation over hidden logic
  • Architectural consistency over clever shortcuts

When uncertain, choose the path that preserves artifact purity and determinism.


This is the governing document for all future development.