StableSteering is a research prototype for studying interactive prompt-embedding steering in text-to-image systems.
If you want the easiest conceptual introduction, start with student_tutorial.md.
Yes, if you install the inference dependencies, prepare a local model snapshot, and run with the Diffusers backend enabled.
The real Diffusers path is GPU-only and explicitly requires CUDA. If a GPU is not available, the default app server does not start. It never falls back to mock automatically, and the mock generator is reserved for tests only.
Mock renders make the core system easier to implement, debug, and test. They let us validate:
- session creation
- round generation
- feedback submission
- update logic
- replay export
without requiring real image generation in every automated test.
The normal app runtime uses the real Diffusers backend on GPU. The mock generator exists only for explicit test harnesses.
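The mock path described above can be approximated with a small sketch: a deterministic byte pattern derived from the prompt and seed stands in for a real render, which is enough to exercise session creation, rounds, feedback, and replay in tests. `mock_render` and its signature are hypothetical, not the project's actual generator:

```python
import hashlib

def mock_render(prompt: str, seed: int, size: int = 8) -> bytes:
    """Hypothetical mock generator: derives deterministic pixel bytes
    from the prompt and seed instead of running a diffusion model."""
    digest = hashlib.sha256(f"{prompt}:{seed}".encode()).digest()
    # Repeat the digest until we have size*size*3 RGB bytes.
    repeats = (size * size * 3) // len(digest) + 1
    return bytes((digest * repeats)[: size * size * 3])
```

Because the output depends only on the inputs, automated tests can assert on round output byte-for-byte without a GPU.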
Yes.
The normal workflow begins on /setup, where the user enters a text prompt and then starts a session from that prompt.
Yes.
Run:
```bash
python scripts/setup_huggingface.py
```

This prepares a local model snapshot directory and writes a manifest describing what was downloaded.
The current MVP stores structured state in a local SQLite database:
data/stablesteering.db
Generated images, trace logs, and per-session reports are stored alongside it under data/.
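Because the structured state lives in a plain SQLite file, it can be inspected with the standard library alone. The sketch below only queries `sqlite_master`, since the actual table schema is project-specific and not documented here:

```python
import sqlite3

def list_tables(db_path: str = "data/stablesteering.db") -> list[str]:
    """List table names in the session database. Only the file path is
    taken from the docs; no assumptions are made about the schema."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```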
The current generated artifacts are stored under:
data/artifacts/
Trace files are stored under:
data/traces/
Per-session trace bundles are stored under:
data/traces/sessions/<session_id>/
Each session bundle contains:
- backend-events.jsonl
- frontend-events.jsonl
- report.html
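Since the event streams are JSONL, a session bundle can be loaded line by line for ad-hoc analysis. The file names follow the bundle layout above; the event fields shown in the test are illustrative, not the real schema:

```python
import json
from pathlib import Path

def load_events(bundle_dir: str) -> list[dict]:
    """Read backend and frontend event streams from a session bundle.
    Each non-empty line in a .jsonl file is one JSON event object."""
    events: list[dict] = []
    for name in ("backend-events.jsonl", "frontend-events.jsonl"):
        path = Path(bundle_dir) / name
        if path.exists():
            for line in path.read_text().splitlines():
                if line.strip():
                    events.append(json.loads(line))
    return events
```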
Round generation and feedback submission run as async jobs.
In the browser, you will see:
- a progress bar
- a status label
- automatic refresh after success
- inline error text if the job fails
Behind the scenes, the UI submits async requests and polls a job-status endpoint until the work completes.
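The poll-until-done pattern described above can be sketched as follows. `fetch_status` stands in for the HTTP call to the job-status endpoint, and the `state` values are illustrative rather than the real API contract:

```python
import time

def poll_job(fetch_status, job_id: str,
             interval: float = 0.5, timeout: float = 60.0) -> dict:
    """Poll a job-status callable until it reports a terminal state.
    Hypothetical sketch: status dicts and state names are assumptions."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(job_id)
        if status.get("state") in ("succeeded", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```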
Yes.
Run:
```bash
python scripts/create_real_e2e_example.py
```

This creates a real GPU-backed example bundle under:
output/examples/real_e2e_example_run/
That bundle includes generated images, a manifest, a standalone walkthrough HTML file, and the session trace report.
The schema supports:
- scalar rating
- pairwise comparison
- top-k ranking
- winner-only selection
- approve/reject with preferred approved winner
The current UI renders a dedicated set of controls for whichever feedback mode is selected.
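A feedback schema like the one above typically pairs each mode with a required payload shape. The sketch below illustrates that dispatch; the mode identifiers and payload keys are hypothetical stand-ins for the real schema:

```python
def validate_feedback(mode: str, payload: dict) -> None:
    """Sketch of per-mode payload validation. Raises ValueError on an
    unknown mode or a payload missing that mode's required fields."""
    required = {
        "scalar_rating": {"candidate_id", "rating"},
        "pairwise": {"winner_id", "loser_id"},
        "top_k": {"ranked_ids"},
        "winner_only": {"winner_id"},
        "approve_reject": {"approved_ids", "preferred_id"},
    }
    if mode not in required:
        raise ValueError(f"unknown feedback mode: {mode}")
    missing = required[mode] - payload.keys()
    if missing:
        raise ValueError(f"{mode} feedback missing fields: {sorted(missing)}")
```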
Yes, on the roadmap.
The current implementation focuses on prompt-first text-to-image steering, but the roadmap now also includes:
- image-prompt or image-variation steering
- inpainting steering
- ControlNet-guided steering
The current MVP includes:
- random_local
- exploit_orthogonal
- uncertainty_guided
- axis_sweep
- incumbent_mix
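As an illustration, a `random_local` strategy can be sketched as sampling candidates by adding small Gaussian noise to the incumbent embedding. The function name, parameters, and list-of-floats representation are illustrative assumptions, not the MVP's actual code:

```python
import random

def random_local_candidates(incumbent: list[float], n: int = 4,
                            sigma: float = 0.05, rng=None) -> list[list[float]]:
    """Hypothetical `random_local` sketch: n candidates, each the
    incumbent embedding plus per-dimension Gaussian noise."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    return [[x + rng.gauss(0.0, sigma) for x in incumbent] for _ in range(n)]
```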
The current MVP includes:
- winner_copy
- winner_average
- linear_preference
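To make the update side concrete, a `winner_average`-style rule can be sketched as blending the incumbent embedding toward the winning candidate. The `alpha` step size and list-of-floats representation are illustrative assumptions:

```python
def winner_average(incumbent: list[float], winner: list[float],
                   alpha: float = 0.5) -> list[float]:
    """Hypothetical `winner_average` sketch: convex blend of the
    incumbent and the winner; alpha=1.0 would reduce to winner_copy."""
    return [(1 - alpha) * a + alpha * b for a, b in zip(incumbent, winner)]
```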
The mock-generation path is deterministic for the same session state and seed logic, which is useful for tests and replay. Real generation still persists seeds and configuration for auditability, but exact image-level determinism depends on the runtime stack and model behavior.
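One common way to get the kind of determinism described above is to derive each round's seed from stable session state, so replays reproduce the same draws. The derivation below is a hypothetical sketch, not the project's actual seed logic:

```python
import hashlib

def round_seed(session_id: str, round_index: int) -> int:
    """Hypothetical stable per-round seed: hash the session id and round
    index, then take the first 4 bytes as a 32-bit integer seed."""
    digest = hashlib.sha256(f"{session_id}:{round_index}".encode()).digest()
    return int.from_bytes(digest[:4], "big")
```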
Install dependencies:
```bash
python -m pip install -e .[dev]
```

Start the app:

```bash
python scripts/run_dev.py
```

Run the tests:

```bash
python -m pytest
```

Run browser tests:

```bash
npm install
npm run test:e2e:chrome
```

No. It is a research-oriented MVP intended to exercise the architecture described in the specification set.