diff --git a/AGENTS.md b/AGENTS.md
index a5f0e1c..c243b03 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -64,8 +64,8 @@ uv run pytest
 # Run with coverage report
 uv run pytest --cov=embedding_cluster --cov-report=term-missing
 
-# Run with coverage enforcement (70% minimum)
-uv run pytest --cov=embedding_cluster --cov-report=term-missing --cov-fail-under=70
+# Run with coverage enforcement (90% minimum, matches CI)
+uv run pytest --cov=embedding_cluster --cov-report=term-missing --cov-fail-under=90
 
 # Run a single test file
 uv run pytest tests/test_settings.py -v
@@ -110,7 +110,7 @@ E2E tests require pre-indexed ChromaDB data. The `webServer` config in
 GitHub Actions workflow in `.github/workflows/ci.yml` runs on push/PR:
 - **lint** job: `ruff check` + `ruff format --check`
 - **typecheck** job: `mypy embedding_cluster/`
-- **test** job: `pytest --cov --cov-fail-under=70`
+- **test** job: `pytest --cov` (90% minimum enforced by coverage report)
 
 All jobs use `uv sync --all-extras` for dependency installation.
 
@@ -172,16 +172,35 @@ embedding_cluster/
   settings.py          # Pydantic Settings (env var config)
   utils.py             # Shared utilities (logging, ChromaDB helpers, image downloader)
   indexer.py           # INDEX mode: CSV parsing, embedding generation, ChromaDB storage
-  scatter_plot.py      # PLOT mode: Clustering, t-SNE, Dash visualization
+  scatter_plot.py      # PLOT mode: Clustering, dimensionality reduction, visualization data
+  ai_naming.py         # LLM-powered cluster naming via LiteLLM
+  annotations.py       # Cluster annotation persistence (JSON sidecar files)
   csv/                 # Sample data files
+  server/
+    app.py             # FastAPI app factory, SPA serving
+    models.py          # Pydantic request/response models
+    tasks.py           # Background task registry
+    ws.py              # WebSocket manager for live progress
+    routes/
+      ai.py            # AI cluster naming endpoints
+      annotations.py   # Cluster annotation CRUD
+      collections.py   # ChromaDB collection management
+      csv.py           # CSV upload and preview
+      index.py         # Indexing jobs with WebSocket progress
+      plot.py          # Plot computation, cluster detail, sub-clustering
+      search.py        # Semantic search (text and image)
+frontend/
+  src/
+    App.tsx            # Router, QueryClient, Zustand provider
+    api/               # Typed API client layer
+    components/        # UI components organized by page
+    hooks/             # useIndexWebSocket, usePlotData
+    pages/             # HomePage, IndexPage, PlotPage, SettingsPage
+    stores/            # Zustand plotStore (plot state management)
+    types/             # TypeScript interfaces mirroring backend models
 tests/
-  __init__.py
   conftest.py          # Shared fixtures
-  test_settings.py     # Settings env var parsing tests
-  test_utils.py        # Utilities, Singleton, ImageDownloader tests
-  test_indexer.py      # Indexer pipeline tests (mocked ML models)
-  test_scatter_plot.py # Scatter plot tests (mocked data)
-  test_main.py         # Entry point dispatch tests
+  test_*.py            # Unit tests for each backend module and route
 ```
 
 ### Key Dependencies
@@ -191,10 +210,10 @@ Runtime:
 - `chromadb` - Vector database for embedding storage
 - `transformers` / `sentence-transformers` - Text and image embedding models
 - `torch` - ML framework backend
-- `dash` / `plotly` - Interactive 3D visualization
-- `scikit-learn` - KMeans clustering and t-SNE
+- `fastapi` / `uvicorn` - Web server and REST API
+- `scikit-learn` - KMeans clustering and dimensionality reduction
 - `aiohttp` - Async HTTP for image downloads
-- `openai` - Optional GPT-based cluster naming
+- `litellm` - Multi-provider LLM integration for cluster naming
 - `numpy` / `Pillow` - Numerical and image processing
 
 Dev:
@@ -202,6 +221,7 @@ Dev:
 - `mypy` - Static type checking
 - `ruff` - Linting and formatting
 - `pre-commit` - Git hook management
+- `httpx` - Test client for FastAPI routes
 
 ## Git & Commit Conventions
 
@@ -231,9 +251,15 @@ Extensive pre-commit setup. Key hooks:
 
 ## Data Flow
 
-1. **INDEX mode**: CSV -> parse rows -> generate embeddings (CLIP for images,
-   SentenceTransformer for text) -> store in ChromaDB collections
-2. **PLOT mode**: ChromaDB collection -> StandardScaler -> KMeans clustering ->
-   t-SNE 3D projection -> Dash/Plotly interactive scatter plot
-
-ChromaDB data is persisted to `./chromadb/` directory (gitignored).
+1. **INDEX mode**: CSV → parse rows → generate embeddings (CLIP for images,
+   SentenceTransformer for text) → store in ChromaDB collections
+2. **PLOT mode**: ChromaDB collection → StandardScaler → KMeans clustering →
+   dimensionality reduction (t-SNE/UMAP/PCA) → 3D point data via REST API
+3. **SERVER mode**: FastAPI serves REST API + built React SPA. Long-running
+   jobs (indexing, plot computation) use a task registry with WebSocket
+   progress streaming.
+
+Persistent data:
+- `./chromadb/` — Vector database (gitignored)
+- `./uploads/` — Uploaded CSV files (gitignored)
+- `./annotations/` — Cluster annotations as JSON sidecar files (gitignored)
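
Note on the SERVER-mode data flow in the AGENTS.md hunk above: the task registry with progress reporting could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's code. The names `TaskRegistry` and `TaskState` and the status lifecycle come from docs/ARCHITECTURE.md; every field name, the `run` method, and the `count_rows` job are hypothetical. WebSocket broadcasting is left out; a real server would push `state.progress` to subscribers instead of only storing it.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Awaitable, Callable


class TaskStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


@dataclass
class TaskState:
    # Field names are assumptions made for this sketch.
    status: TaskStatus = TaskStatus.PENDING
    progress: dict[str, Any] = field(default_factory=dict)
    result: Any = None
    error: str | None = None
    cancel_event: asyncio.Event = field(default_factory=asyncio.Event)


# A job receives its own state so it can report progress and observe cancellation.
Job = Callable[[TaskState], Awaitable[Any]]


class TaskRegistry:
    """In-memory map from job ID to task state."""

    def __init__(self) -> None:
        self._tasks: dict[str, TaskState] = {}

    def get(self, job_id: str) -> TaskState:
        return self._tasks[job_id]

    async def run(self, job_id: str, job: Job) -> TaskState:
        state = self._tasks.setdefault(job_id, TaskState())
        state.status = TaskStatus.RUNNING
        try:
            state.result = await job(state)
            state.status = (
                TaskStatus.CANCELLED
                if state.cancel_event.is_set()
                else TaskStatus.COMPLETED
            )
        except Exception as exc:
            # Record the failure instead of letting it escape the registry.
            state.error = str(exc)
            state.status = TaskStatus.FAILED
        return state


async def count_rows(state: TaskState) -> int:
    """Hypothetical job: pretend to process five rows, reporting progress."""
    total = 5
    for done in range(1, total + 1):
        if state.cancel_event.is_set():
            break
        state.progress = {"done": done, "total": total}
        await asyncio.sleep(0)  # yield to the event loop between rows
    return state.progress.get("done", 0)


def demo() -> TaskState:
    registry = TaskRegistry()
    return asyncio.run(registry.run("job-1", count_rows))
```

Passing the state object into the job keeps cancellation cooperative: the job checks `cancel_event` at its own safe points rather than being interrupted mid-write.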
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..2e629ff
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,99 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+Python + React application for generating, indexing, and visualizing embedding clusters from CSV data. Uses CLIP/SentenceTransformer for embeddings, ChromaDB for vector storage, k-means for clustering, and a React/Three.js frontend for 3D visualization.
+
+- **Python 3.13**, managed with [uv](https://docs.astral.sh/uv/)
+- **Package name**: `embedding_cluster` (underscore, not hyphen)
+- **Entry point**: `python -m embedding_cluster` dispatches to INDEX, PLOT, or SERVER mode via `RUNNING_MODE` env var
+
+## Commands
+
+### Backend
+
+```bash
+uv sync --all-extras                                    # Install all dependencies
+RUNNING_MODE=SERVER uv run python -m embedding_cluster  # Start server on :8000
+uv run ruff check embedding_cluster/ tests/             # Lint
+uv run ruff check --fix embedding_cluster/ tests/       # Lint with auto-fix
+uv run ruff format embedding_cluster/ tests/            # Format
+uv run mypy embedding_cluster/                          # Type check (strict mode)
+uv run pytest                                           # Run all tests
+uv run pytest tests/test_settings.py -v                 # Run single test file
+uv run pytest tests/test_settings.py::test_fn -v        # Run single test function
+uv run pytest --cov=embedding_cluster --cov-report=term-missing --cov-fail-under=90  # Coverage (90% CI min)
+uv run pre-commit run --all-files                       # All pre-commit hooks
+```
+
+### Frontend
+
+```bash
+cd frontend && npm install                  # Install deps
+cd frontend && npm run dev                  # Dev server on :5173
+cd frontend && npm run build                # Production build (output: frontend/dist)
+cd frontend && npm run lint                 # ESLint
+cd frontend && npm run test:e2e             # Playwright E2E tests
+cd frontend && npx playwright test e2e/search.spec.ts  # Single E2E test
+```
+
+E2E tests require pre-indexed ChromaDB data and a built frontend. The Playwright config auto-starts the FastAPI backend.
+
+## Architecture
+
+### Three Running Modes
+
+All controlled by `RUNNING_MODE` env var, dispatched in `__main__.py`:
+- **INDEX**: `indexer.py` — CSV parsing → embedding generation → ChromaDB storage
+- **PLOT**: `scatter_plot.py` — ChromaDB → StandardScaler → k-means → dimensionality reduction (t-SNE/UMAP/PCA)
+- **SERVER**: `server/app.py` — FastAPI backend serving REST API + built React SPA from `frontend/dist`
+
+### Backend Structure
+
+- `settings.py` — All config via env vars using `pydantic-settings` `BaseSettings`
+- `server/app.py` — FastAPI app factory, mounts route modules and serves SPA
+- `server/routes/` — API routes split by domain: `ai.py`, `annotations.py`, `collections.py`, `csv.py`, `index.py`, `plot.py`, `search.py`
+- `server/tasks.py` — Background task management for long-running operations
+- `server/ws.py` — WebSocket support for live progress
+- `ai_naming.py` — LLM-powered cluster naming via LiteLLM (supports OpenAI, Google, Anthropic, Ollama)
+- `annotations.py` — Cluster annotation persistence (JSON sidecar files in `annotations/`)
+- `utils.py` — ChromaDB helpers, image downloader with retry, singleton pattern
+
+### Frontend Structure
+
+React 19 + TypeScript + Vite + Tailwind CSS 4:
+- `pages/` — `HomePage`, `IndexPage`, `PlotPage`, `SettingsPage`
+- `components/` — Organized by page: `home/`, `index/`, `plot/`, `csv/`
+- `stores/plotStore.ts` — Zustand store for plot state
+- `api/` — API client layer
+- `hooks/` — React Query hooks
+- 3D visualization uses React Three Fiber (`@react-three/fiber` + `@react-three/drei`)
+
+## Code Style
+
+### Python
+- **ruff**: line length 90, target py313
+- **mypy strict mode** — all functions need type annotations
+- Use `from __future__ import annotations` in every module
+- Modern syntax: `str | None` (not `Optional`), `list[str]` (not `List`)
+- Absolute imports only: `from embedding_cluster.settings import Settings`
+- Heavy imports behind `TYPE_CHECKING` blocks where possible
+- Logger per module: `logger = logging.getLogger(__name__)`
+
+### Git Conventions
+- **Conventional commits** enforced by commitizen: `type(scope): description`
+- Types: `feat`, `fix`, `docs`, `test`, `refactor`
+- **No direct commits to master** (enforced by pre-commit hook)
+- Branch naming: `type/short-description` style (e.g., `feat/ollama-provider-integration`)
+
+### Pre-commit Hooks
+Extensive setup including: ruff, commitizen, yamllint, markdownlint, shellcheck, gitleaks (secret detection), hadolint, check-jsonschema, no-commit-to-branch. Install with:
+```bash
+uv run pre-commit install --install-hooks -t pre-commit -t commit-msg
+```
+
+## CI
+
+GitHub Actions (`.github/workflows/ci.yml`): lint → typecheck → test (90% coverage minimum). All jobs use `uv sync --all-extras`.
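
Note on the Python rules in the CLAUDE.md Code Style section above: a module that follows them might be laid out as in the sketch below. The function `mean_vector` is hypothetical and exists only to give the annotations something to describe. `collections.abc.Sequence` stands in for a genuinely heavy import (for example a model library) so that the example runs with the standard library alone.

```python
from __future__ import annotations

import logging
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported only for type checkers; skipped at runtime.
    # A real module would place expensive imports here.
    from collections.abc import Sequence

# One logger per module, named after the module.
logger = logging.getLogger(__name__)


def mean_vector(vectors: Sequence[list[float]]) -> list[float] | None:
    """Average equal-length vectors component-wise; return None for no input."""
    if not vectors:
        logger.debug("no vectors to average")
        return None
    size = len(vectors)
    return [sum(column) / size for column in zip(*vectors)]
```

Because `from __future__ import annotations` defers annotation evaluation, `Sequence` can appear in the signature even though it is never imported at runtime.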
diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
new file mode 100644
index 0000000..cde6c60
--- /dev/null
+++ b/CODE_OF_CONDUCT.md
@@ -0,0 +1,18 @@
+# Code of Conduct
+
+This project follows the
+[Contributor Covenant Code of Conduct v2.1](https://www.contributor-covenant.org/version/2/1/code_of_conduct/).
+
+Please read the full text at the link above. In summary, we are committed
+to providing a welcoming and inclusive experience for everyone.
+
+## Reporting
+
+If you experience or witness unacceptable behavior, please contact the
+project maintainer at **asafgallea@gmail.com**. All reports will be
+handled with discretion.
+
+## Attribution
+
+This Code of Conduct is adapted from the
+[Contributor Covenant](https://www.contributor-covenant.org), version 2.1.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..753b542
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,171 @@
+# Contributing
+
+Thanks for your interest in contributing to embedding-clusters! This guide
+covers everything you need to get started.
+
+## Prerequisites
+
+- [Python 3.13+](https://www.python.org/downloads/)
+- [uv](https://docs.astral.sh/uv/getting-started/installation/) package
+  manager
+- [Node.js 18+](https://nodejs.org/) (for frontend development)
+
+## Setup
+
+```bash
+git clone https://github.com/aGallea/embedding-clusters.git
+cd embedding-clusters
+uv sync --all-extras
+uv run pre-commit install --install-hooks -t pre-commit -t commit-msg
+```
+
+For frontend work:
+
+```bash
+cd frontend
+npm install
+```
+
+## Running Locally
+
+Start the full application (backend + frontend):
+
+```bash
+RUNNING_MODE=SERVER uv run python -m embedding_cluster
+```
+
+For frontend development with hot reload:
+
+```bash
+# Terminal 1 — backend
+RUNNING_MODE=SERVER uv run python -m embedding_cluster
+
+# Terminal 2 — frontend dev server (proxies API to backend)
+cd frontend && npm run dev
+```
+
+The Vite dev server runs on `http://localhost:5173` and proxies `/api` and
+`/ws` requests to the backend on port 8000.
+
+## Testing
+
+### Backend (Python)
+
+```bash
+uv run pytest                                  # Run all tests
+uv run pytest tests/test_settings.py -v        # Single file
+uv run pytest tests/test_settings.py::test_fn  # Single test
+uv run pytest --cov=embedding_cluster \
+  --cov-report=term-missing --cov-fail-under=90  # With coverage
+```
+
+Tests use `pytest-asyncio` in auto mode. CI enforces a **90% minimum
+coverage** threshold.
+
+### Frontend (E2E)
+
+```bash
+cd frontend
+npx playwright install chromium     # First-time setup
+npm run build                       # Build required before E2E
+npm run test:e2e                    # Run tests
+npm run test:e2e:ui                 # Run with interactive UI
+```
+
+E2E tests require pre-indexed data in ChromaDB. See the
+[AGENTS.md](AGENTS.md) E2E section for setup instructions.
+
+## Code Style
+
+### Python
+
+- **ruff** for linting and formatting (line length 90, target py313)
+- **mypy** in strict mode — all functions require type annotations
+- `from __future__ import annotations` in every module
+- Modern type syntax: `str | None`, `list[str]`, `dict[str, Any]`
+- Absolute imports only: `from embedding_cluster.settings import Settings`
+- Heavy imports behind `TYPE_CHECKING` blocks where possible
+- Logger per module: `logger = logging.getLogger(__name__)`
+
+```bash
+uv run ruff check embedding_cluster/ tests/       # Lint
+uv run ruff check --fix embedding_cluster/ tests/  # Auto-fix
+uv run ruff format embedding_cluster/ tests/       # Format
+uv run mypy embedding_cluster/                     # Type check
+```
+
+### Frontend (TypeScript)
+
+- ESLint with TypeScript and React hooks plugins
+- Tailwind CSS 4 for styling
+
+```bash
+cd frontend && npm run lint
+```
+
+## Pre-commit Hooks
+
+The project uses extensive pre-commit hooks that run automatically on
+commit. Key hooks include:
+
+- **ruff** — linting (with auto-fix) and formatting
+- **commitizen** — commit message validation
+- **gitleaks** — secret detection
+- **yamllint** / **markdownlint** — config file linting
+- **no-commit-to-branch** — prevents direct commits to master
+
+Run all hooks manually:
+
+```bash
+uv run pre-commit run --all-files
+```
+
+## Commit Messages
+
+This project uses [Conventional Commits](https://www.conventionalcommits.org/)
+enforced by [commitizen](https://commitizen-tools.github.io/commitizen/).
+
+Format: `type(scope): description`
+
+| Type | Use for |
+|------|---------|
+| `feat` | New features |
+| `fix` | Bug fixes |
+| `docs` | Documentation changes |
+| `test` | Adding or updating tests |
+| `refactor` | Code changes that neither fix bugs nor add features |
+
+Examples:
+
+```text
+feat(search): add image URL search support
+fix(indexer): handle empty CSV rows gracefully
+docs(readme): update quick start instructions
+test(server): add collection deletion tests
+```
+
+## Pull Request Process
+
+1. Create a branch from `master` (e.g. `feat/my-feature`)
+2. Make your changes and ensure all checks pass:
+   ```bash
+   uv run ruff check embedding_cluster/ tests/
+   uv run ruff format --check embedding_cluster/ tests/
+   uv run mypy embedding_cluster/
+   uv run pytest --cov=embedding_cluster --cov-fail-under=90
+   ```
+3. Push and open a pull request against `master`
+4. CI will run lint, typecheck, and test jobs automatically
+5. All conversations must be resolved before merging
+6. At least one approving review is required
+
+## Project Structure
+
+See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for the full system
+design and component breakdown.
+
+## Good First Issues
+
+Look for issues labeled
+[`good first issue`](https://github.com/aGallea/embedding-clusters/labels/good%20first%20issue)
+for beginner-friendly tasks.
diff --git a/README.md b/README.md
index 23661f9..879a10e 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,15 @@
 # embedding-clusters
 
-![python-version][python-version]
+[![python-version][python-badge]][python-url]
+[![CI][ci-badge]][ci-url]
+[![License: MIT][license-badge]][license-url]
 
-Turn raw CSV data into beautiful, interactive embedding clusters with fast
+Turn raw CSV data into beautiful, interactive 3D embedding clusters with
 semantic search and a web UI.
 
 ![3D cluster plot](docs/screenshots/3d-cluster-plot.png)
 
-## Quick Start (Web UI)
+## Quick Start
 
 ```bash
 git clone https://github.com/aGallea/embedding-clusters.git
@@ -18,143 +20,117 @@ RUNNING_MODE=SERVER uv run python -m embedding_cluster
 
 Open <http://localhost:8000>.
 
-## Features
+> Requires Python 3.13+ and [uv](https://docs.astral.sh/uv/).
 
-- **Embeddings & Storage**: CLIP (images) + SentenceTransformer (text) with
-  ChromaDB persistence.
-- **Clustering & Plot**: k-means clusters with 3D t-SNE, UMAP, or PCA.
-- **Search & Collections**: semantic search by text or image URL, collection
-  browsing and deletion.
-- **Web UI**: CSV upload, live progress, plot controls, and multiple render
-  modes.
+## Features
 
-## Visual Highlights
+- **Embeddings** — CLIP (images) + SentenceTransformer (text) with
+  ChromaDB persistence
+- **Clustering** — k-means with automatic cluster count suggestion
+- **3D Visualization** — interactive scatter plot with t-SNE, UMAP, or PCA
+  reduction and multiple render modes (particles, sprites, instanced spheres)
+- **Semantic Search** — find similar items by text or image URL, highlighted
+  directly in the 3D view
+- **Cluster Drill-Down** — inspect cluster items, sub-cluster within a
+  cluster, and explore hierarchical structure
+- **AI-Powered Naming** — auto-label clusters using OpenAI, Google,
+  Anthropic, or Ollama via LiteLLM
+- **Annotations** — rename, tag, and annotate clusters with persistent
+  notes
+- **Web UI** — CSV upload, live indexing progress via WebSocket, plot
+  controls, and collection management
+
+## Screenshots
 
 ![Semantic search demo](docs/gifs/semantic-search-mini.gif)
 
-![Home dashboard](docs/screenshots/home-page.png)
-![Index page config](docs/screenshots/index-page-config.png)
-![Index page progress](docs/screenshots/index-page-progress.png)
-![Semantic search results](docs/screenshots/semantic-search.png)
-
-## How Things Work
-
-The tool turns a CSV file into an interactive 3D cluster
-visualization in a few steps:
-
-1. **Upload CSV** -- Provide a CSV file containing your data.
-   The web UI lets you drag-and-drop; the CLI accepts a file path.
-2. **Select fields** -- Choose which columns to embed. Text fields
-   (e.g. product names) use a SentenceTransformer model; image URL
-   fields use a CLIP model. You can embed both in the same dataset.
-3. **Model download** -- The selected model is pulled from
-   [HuggingFace](https://huggingface.co) on first use and cached
-   locally for subsequent runs.
-4. **Embedding & storage** -- Each row is converted into a vector
-   embedding by the chosen model. Embeddings are stored in a
-   [ChromaDB](https://www.trychroma.com/) collection for
-   persistent, queryable vector storage.
-5. **Plot configuration** -- Pick a collection, set the number of
-   k-means clusters (or let the tool suggest one), and choose a
-   dimensionality reduction algorithm (t-SNE, UMAP, or PCA).
-6. **3D visualization** -- The reduced vectors are rendered as an
-   interactive 3D scatter plot. Hover for metadata, toggle cluster
-   visibility, switch render modes, or go fullscreen.
-7. **Semantic search** -- Enter a text query or paste an image URL
-   to find the most similar items. Matching points are highlighted
-   directly in the 3D view.
-8. **Cluster groupings** -- Toggle individual clusters on/off to
-   focus on specific groups. Use the optional GPT-powered naming
-   to label each cluster automatically.
-
-## Cluster Drill-Down and Annotation
-
-After generating a plot you can inspect, subdivide, and annotate
-individual clusters directly from the web UI.
-
-### Cluster Detail Panel
-
-Click a cluster name in the legend to open a side panel listing every
-item in that cluster. Items are sorted by distance to the centroid so
-the most representative points appear first. The panel supports
-pagination, displays item metadata, and shows image thumbnails when
-an image field is available.
-
-### Sub-Clustering
-
-Inside the detail panel, toggle **Sub-cluster** to re-run k-means
-within a single cluster. The result is rendered as a mini 3D scatter
-plot (PCA-reduced) so you can explore hierarchical structure without
-leaving the page.
-
-### Annotations
-
-Each cluster can be renamed, tagged, and annotated with free-form
-notes. Changes are saved automatically (debounced) and persisted as
-JSON sidecar files in the `annotations/` directory. Annotations
-survive page reloads and are scoped per plot job.
-
-### API Endpoints
-
-The feature exposes the following REST endpoints under `/api`:
-
-- `GET /plot/{job_id}/cluster/{index}` -- paginated cluster detail
-- `POST /plot/{job_id}/cluster/{index}/sub-cluster` -- sub-cluster
-  a single cluster with configurable k
-- `GET /annotations/{job_id}` -- fetch all annotations for a job
-- `PUT /annotations/{job_id}` -- update annotations
-- `DELETE /annotations/{job_id}` -- delete annotations
+| | |
+|---|---|
+| ![Home](docs/screenshots/home-page.png) | ![Index config](docs/screenshots/index-page-config.png) |
+| ![Index progress](docs/screenshots/index-page-progress.png) | ![Search](docs/screenshots/semantic-search.png) |
+
+## How It Works
 
 ```text
-CSV --> Select Fields --> Download Model --> Embed & Store
-  --> Configure Plot --> 3D Visualization --> Search & Explore
+CSV → Select Fields → Download Model → Embed & Store
+  → Configure Plot → 3D Visualization → Search & Explore
 ```
 
-## Using CLI
+1. **Upload CSV** — drag-and-drop in the web UI or pass a file path via CLI
+2. **Select fields** — choose text columns (SentenceTransformer) and/or
+   image URL columns (CLIP) to embed
+3. **Embed & store** — rows are converted to vector embeddings and stored in
+   a ChromaDB collection
+4. **Plot** — pick a collection, set cluster count (or auto-suggest), choose
+   a reduction algorithm, and render an interactive 3D scatter plot
+5. **Search** — enter a text query or image URL to find and highlight the
+   most similar items
+6. **Drill down** — click a cluster to inspect items, sub-cluster for
+   hierarchical exploration, and annotate with AI-generated or custom names
+
+For detailed usage of all three modes (SERVER, INDEX, PLOT), environment
+variables, and API endpoints, see [docs/USAGE.md](docs/USAGE.md).
+
+## Architecture
+
+The application has three running modes controlled by the `RUNNING_MODE`
+environment variable:
+
+- **SERVER** — FastAPI backend + React SPA (the web UI)
+- **INDEX** — CLI-only CSV embedding pipeline
+- **PLOT** — CLI-only cluster visualization
+
+See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for the full system design,
+data flow diagrams, and component breakdown.
 
-### Index (CLI)
+## Development
 
 ```bash
-RUNNING_MODE=INDEX \
-  LOCAL_CSV_FILENAME=./embedding_cluster/csv/fashion_small.csv \
-  ID_FIELD=id \
-  IMAGE_EMBEDDING_FIELDS='["imageUrl"]' \
-  CHROMADB_COLLECTION_PREFIX=fashion_ \
-  NUMBER_OF_ASYNC_TASKS=10 \
-  uv run python -m embedding_cluster
+uv sync --all-extras                           # Install dependencies
+uv run ruff check embedding_cluster/ tests/    # Lint
+uv run ruff format embedding_cluster/ tests/   # Format
+uv run mypy embedding_cluster/                 # Type check (strict)
+uv run pytest                                  # Run tests
 ```
 
-### Plot (CLI)
+Frontend (React 19 + TypeScript + Vite + Tailwind CSS 4):
 
 ```bash
-RUNNING_MODE=PLOT \
-  CHROMADB_COLLECTION_NAME=fashion_imageUrl \
-  TEXT_DISPLAY_FIELDS='["productDisplayName"]' \
-  IMAGE_FIELD=imageUrl \
-  uv run python -m embedding_cluster
+cd frontend && npm install && npm run dev      # Dev server on :5173
+cd frontend && npm run build                   # Production build
+cd frontend && npm run test:e2e                # Playwright E2E tests
 ```
 
-Key environment variables:
-
-- `RUNNING_MODE`: `INDEX`, `PLOT`, or `SERVER`
-- `TEXT_MODEL_NAME`: SentenceTransformer model name
-- `IMAGE_MODEL_NAME`: CLIP model name
-- `NUM_CLUSTERS`: k-means cluster count
-- `REDUCTION_ALGORITHM`: `tsne`, `umap`, or `pca`
+See [CONTRIBUTING.md](CONTRIBUTING.md) for the full development guide.
 
-## Development
+## Tech Stack
 
-```bash
-uv sync --all-extras
-uv run ruff check embedding_cluster/ tests/
-uv run ruff format embedding_cluster/ tests/
-uv run mypy embedding_cluster/
-uv run pytest
-```
+| Layer | Technology |
+|-------|-----------|
+| Backend | Python 3.13, FastAPI, Pydantic |
+| Embeddings | SentenceTransformers, CLIP (HuggingFace) |
+| Vector DB | ChromaDB |
+| ML | scikit-learn (KMeans, t-SNE, PCA), UMAP |
+| AI Naming | LiteLLM (OpenAI, Google, Anthropic, Ollama) |
+| Frontend | React 19, TypeScript, Vite, Tailwind CSS 4 |
+| 3D | React Three Fiber, Three.js, drei |
+| State | Zustand, TanStack React Query |
+| Testing | pytest, Playwright |
+| CI | GitHub Actions (lint, typecheck, test with 90% coverage) |
 
 ## Contributing
 
-Pull requests are welcome. For major changes, please open an issue first.
+See [CONTRIBUTING.md](CONTRIBUTING.md) for setup, code style, and PR
+guidelines.
+
+## License
+
+[MIT](LICENSE)
 
 <!-- MARKDOWN LINKS & IMAGES -->
-[python-version]: https://img.shields.io/badge/python-3.13-blue.svg
+[python-badge]: https://img.shields.io/badge/python-3.13-blue.svg
+[python-url]: https://www.python.org/downloads/
+[ci-badge]: https://github.com/aGallea/embedding-clusters/actions/workflows/ci.yml/badge.svg
+[ci-url]: https://github.com/aGallea/embedding-clusters/actions/workflows/ci.yml
+[license-badge]: https://img.shields.io/badge/License-MIT-green.svg
+[license-url]: LICENSE
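
Note on the three running modes in the README Architecture section above: selecting a mode from `RUNNING_MODE` could be sketched as follows. This is not the project's `__main__.py`. The `RunningMode` enum and `resolve_mode` helper are hypothetical, and both the case-insensitive matching and the decision to reject a missing value (rather than fall back to a default) are assumptions of this sketch.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from enum import Enum


class RunningMode(str, Enum):
    INDEX = "INDEX"
    PLOT = "PLOT"
    SERVER = "SERVER"


def resolve_mode(env: Mapping[str, str] | None = None) -> RunningMode:
    """Read RUNNING_MODE from env (os.environ by default) and validate it."""
    source = os.environ if env is None else env
    raw = source.get("RUNNING_MODE", "").strip().upper()
    try:
        return RunningMode(raw)
    except ValueError:
        allowed = ", ".join(mode.value for mode in RunningMode)
        raise ValueError(
            f"RUNNING_MODE must be one of {allowed}; got {raw!r}"
        ) from None
```

Accepting the environment as a parameter makes the helper testable without mutating `os.environ`.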
diff --git a/SECURITY.md b/SECURITY.md
new file mode 100644
index 0000000..d1dea2c
--- /dev/null
+++ b/SECURITY.md
@@ -0,0 +1,39 @@
+# Security Policy
+
+## Reporting a Vulnerability
+
+If you discover a security vulnerability, please report it responsibly by
+emailing **asafgallea@gmail.com**. Do not open a public issue.
+
+You can expect:
+
+- An acknowledgment within **48 hours**
+- A status update within **7 days**
+- Coordinated disclosure once a fix is available
+
+## Supported Versions
+
+| Version | Supported |
+|---------|-----------|
+| Latest on `master` | Yes |
+| Older releases | No |
+
+## Scope
+
+This policy covers the `embedding-clusters` application code, including:
+
+- The Python backend (FastAPI server, indexer, plot computation)
+- The React frontend
+- Configuration and build tooling
+
+## Security Considerations
+
+- **File uploads** — CSV uploads are saved to a sandboxed `./uploads/`
+  directory. The server validates file paths to prevent directory traversal.
+- **AI credentials** — LLM API keys are configured per-session in the
+  browser and sent per-request. They are not stored server-side.
+- **ChromaDB** — runs embedded (no network exposure). Data is stored
+  locally in `./chromadb/`.
+- **No authentication** — the application is designed for local or trusted
+  network use. Do not expose it to the public internet without adding an
+  authentication layer.
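
Note on the file-upload item in SECURITY.md above: one common way to validate upload paths against directory traversal is to resolve the candidate path and check that it still lies inside the upload directory. The sketch below shows that technique with `pathlib`; `resolve_upload_path` is a hypothetical helper, not the server's actual validation code.

```python
from __future__ import annotations

from pathlib import Path


def resolve_upload_path(upload_dir: Path, filename: str) -> Path:
    """Return the absolute path for filename, refusing anything outside upload_dir."""
    base = upload_dir.resolve()
    # resolve() collapses ".." segments and follows symlinks, and joining an
    # absolute filename discards base, so escapes show up in the check below.
    candidate = (base / filename).resolve()
    if candidate == base or not candidate.is_relative_to(base):
        raise ValueError(f"unsafe upload path: {filename!r}")
    return candidate
```

Checking the resolved path, rather than scanning the raw string for `..`, also covers absolute paths and symlinks that point outside the directory.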
diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
new file mode 100644
index 0000000..6815476
--- /dev/null
+++ b/docs/ARCHITECTURE.md
@@ -0,0 +1,268 @@
+# Architecture
+
+This document describes the system design, component responsibilities, and
+data flow of **embedding-clusters**.
+
+## Overview
+
+The application converts CSV data into interactive 3D embedding
+visualizations. It has three running modes, all dispatched from a single
+entry point (`python -m embedding_cluster`):
+
+| Mode | Entry | Purpose |
+|------|-------|---------|
+| `SERVER` | `server/app.py` | FastAPI backend + React SPA |
+| `INDEX` | `indexer.py` | CLI embedding pipeline |
+| `PLOT` | `scatter_plot.py` | CLI cluster visualization |
+
+```text
+                          __main__.py
+                         /     |     \
+                        /      |      \
+                  INDEX      SERVER     PLOT
+                    |          |          |
+               indexer.py   FastAPI   scatter_plot.py
+                    |       /  |  \       |
+                    |   routes |  SPA     |
+                    |          |          |
+                    +--- ChromaDB --------+
+```
+
+## Backend Components
+
+### Configuration (`settings.py`)
+
+All configuration is driven by environment variables, parsed by
+`pydantic-settings` `BaseSettings`. Each setting has a `Field()` with a
+default value and description. List fields accept JSON-encoded strings
+(e.g. `'["field1","field2"]'`).
+
+### Indexing Pipeline (`indexer.py`)
+
+Responsible for the INDEX mode and also used by the server's indexing route.
+
+1. Read CSV rows (with optional start/stop line range)
+2. Load embedding models:
+   - **SentenceTransformer** for text fields
+   - **CLIP** (via HuggingFace Transformers) for image URL fields
+3. Generate embeddings in batches with semaphore-controlled concurrency
+4. Store embeddings + metadata in ChromaDB collections
+5. Report progress via callback (used by WebSocket in server mode)
+6. Support cancellation via `asyncio.Event`
+
+Images are downloaded asynchronously with exponential backoff retry
+(up to 6 attempts) using a singleton `ImageDownloader` backed by
+`aiohttp.ClientSession`.
+
+### Plot Computation (`scatter_plot.py`)
+
+Responsible for the PLOT mode and used by the server's plot route.
+
+1. Load embeddings from a ChromaDB collection
+2. Standardize with `StandardScaler`
+3. Cluster with KMeans
+4. Reduce dimensions using t-SNE, UMAP, or PCA
+5. Compute silhouette scores, centroids, and per-point distances
+6. Return structured point and cluster data
+
+Additional capabilities:
+- **Optimal cluster suggestion** — evaluates k=2..30 with inertia and
+  silhouette scores
+- **Sub-clustering** — re-run KMeans within a single cluster or on a
+  selected subset of points
+
+### AI Naming (`ai_naming.py`)
+
+Uses [LiteLLM](https://github.com/BerriAI/litellm) as a universal gateway
+to call any LLM provider (OpenAI, Google, Anthropic, Ollama) with a single
+interface. Generates short (max 5 words) descriptive names for clusters
+based on sampled items.
+
+### Annotations (`annotations.py`)
+
+Persists cluster metadata (name, notes, tags) as JSON sidecar files in
+`./annotations/`, one file per plot job. The `AnnotationManager` handles
+read/write with automatic timestamping.
+
+### Utilities (`utils.py`)
+
+- **Logging** — colored console formatter
+- **ChromaDB helpers** — collection creation, batch document initialization
+- **ImageDownloader** — singleton async image fetcher with retry logic
+- **ID generator** — random alphanumeric IDs for jobs and documents
+
+## Server Architecture
+
+The `SERVER` mode runs a FastAPI application that serves both the REST API
+and the built React SPA.
+
+### App Factory (`server/app.py`)
+
+`create_app()` assembles the FastAPI app:
+- Registers all API route modules under `/api`
+- Adds CORS middleware for frontend dev server (`localhost:5173`)
+- Serves the React SPA from `frontend/dist` (if built), with catch-all
+  fallback to `index.html` for client-side routing
+
+### Task Management (`server/tasks.py`)
+
+Long-running operations (indexing, plot computation) run as background
+async tasks tracked by an in-memory `TaskRegistry`:
+
+- Each job gets a unique ID and a `TaskState` with status, progress dict,
+  result, error, and a cancel event
+- Status lifecycle: `PENDING` → `RUNNING` → `COMPLETED` | `FAILED` | `CANCELLED`
+- Clients poll status via REST or subscribe via WebSocket
+
+### WebSocket Manager (`server/ws.py`)
+
+Manages per-job WebSocket connections for real-time progress streaming.
+Broadcasts JSON messages (progress, log, heartbeat, completed, error) to
+all connected clients for a given job ID.
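+
+The per-job broadcast can be sketched as below. This is an illustration
+under assumptions, not the code in `server/ws.py`: `JobBroadcaster` and
+`FakeSocket` are hypothetical names, the message shape is invented, and
+the only requirement on a connection is an async `send_text` method.
+
+```python
+import asyncio
+import json
+from collections import defaultdict
+
+
+class JobBroadcaster:
+    """Hypothetical per-job connection registry."""
+
+    def __init__(self) -> None:
+        self._connections = defaultdict(set)
+
+    def connect(self, job_id: str, ws) -> None:
+        self._connections[job_id].add(ws)
+
+    def disconnect(self, job_id: str, ws) -> None:
+        self._connections[job_id].discard(ws)
+
+    async def broadcast(self, job_id: str, message: dict) -> int:
+        """Send one JSON message to every client of a job; return the count."""
+        payload = json.dumps(message)
+        delivered = 0
+        # Iterate over a copy so failed sockets can be dropped mid-loop.
+        for ws in list(self._connections.get(job_id, ())):
+            try:
+                await ws.send_text(payload)
+                delivered += 1
+            except Exception:
+                self.disconnect(job_id, ws)
+        return delivered
+
+
+class FakeSocket:
+    """Test double that records what would have been sent."""
+
+    def __init__(self) -> None:
+        self.sent = []
+
+    async def send_text(self, text: str) -> None:
+        self.sent.append(text)
+
+
+async def demo():
+    hub = JobBroadcaster()
+    first, second, other = FakeSocket(), FakeSocket(), FakeSocket()
+    hub.connect("job-a", first)
+    hub.connect("job-a", second)
+    hub.connect("job-b", other)
+    message = {"type": "progress", "done": 5, "total": 10}  # assumed shape
+    delivered = await hub.broadcast("job-a", message)
+    return delivered, first.sent, other.sent
+
+
+delivered, first_sent, other_sent = asyncio.run(demo())
+```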
+
+### API Routes (`server/routes/`)
+
+| Route module | Prefix | Responsibility |
+|-------------|--------|---------------|
+| `csv.py` | `/api/csv` | Upload and preview CSV files |
+| `index.py` | `/api/index` | Start/cancel indexing jobs, WebSocket progress |
+| `collections.py` | `/api/collections` | List, detail, delete ChromaDB collections |
+| `plot.py` | `/api/plot` | Compute plots, cluster detail, sub-clustering, suggest k |
+| `search.py` | `/api/search` | Semantic search (text or image query) |
+| `ai.py` | `/api/ai` | LLM cluster naming, connection testing, Ollama proxy |
+| `annotations.py` | `/api/annotations` | CRUD for cluster annotations |
+
+### Request/Response Models (`server/models.py`)
+
+All API contracts are defined as Pydantic models. The frontend TypeScript
+types in `frontend/src/types/index.ts` mirror these models.
+
+## Frontend Architecture
+
+The frontend is a React 19 SPA built with Vite and Tailwind CSS 4.
+
+### Routing (`App.tsx`)
+
+Four pages mapped via React Router:
+
+| Path | Page | Purpose |
+|------|------|---------|
+| `/` | `HomePage` | Collection browser, quick actions |
+| `/index` | `IndexPage` | CSV upload, embedding config, progress |
+| `/plot` | `PlotPage` | 3D visualization, search, annotations |
+| `/settings` | `SettingsPage` | AI provider configuration |
+
+### State Management
+
+- **Zustand** (`stores/plotStore.ts`) — single store for all plot-related
+  state: points, clusters, visibility, search results, drill-down path,
+  annotations, render mode, algorithm parameters
+- **TanStack React Query** — server state (collections, plot data polling)
+
+### 3D Visualization
+
+Uses [React Three Fiber](https://github.com/pmndrs/react-three-fiber)
+(`@react-three/fiber`) with `drei` helpers. Three render modes:
+
+1. **Particles** — GPU-accelerated point cloud (default, best performance)
+2. **Sprites** — image thumbnails at each point (when image field available)
+3. **Instanced Spheres** — 3D sphere meshes with lighting
+
+### API Client Layer (`api/`)
+
+Typed fetch wrappers organized by domain (`client.ts`, `indexing.ts`,
+`plot.ts`, `ai.ts`, `collections.ts`, `csv.ts`). All requests go through
+a shared `apiFetch<T>()` utility with error handling.
+
+### Hooks
+
+- `useIndexWebSocket` — real-time indexing progress with stuck detection
+  (warning after 15s, error after 30s of silence)
+- `usePlotData` — starts plot computation, polls for results every 2s
+
+## Data Flow
+
+### Indexing (Web UI)
+
+```text
+Browser                          Server                    Storage
+  |                               |                         |
+  |-- POST /csv/upload ---------->|                         |
+  |<---- filename, columns -------|                         |
+  |                               |                         |
+  |-- POST /index/start --------->|                         |
+  |<---- job_id ------------------|                         |
+  |                               |-- load models           |
+  |== WS /index/ws/{job_id} =====>|                         |
+  |                               |-- read CSV              |
+  |<--- progress messages --------|-- embed rows ---------->|
+  |<--- log messages -------------|-- store in ChromaDB --->|
+  |<--- completed message --------|                         |
+```
+
+### Plot Generation
+
+```text
+Browser                          Server                    Storage
+  |                               |                         |
+  |-- POST /plot/compute -------->|                         |
+  |<---- job_id ------------------|                         |
+  |                               |-- load embeddings <-----|
+  |-- GET /plot/data/{id} ------->|-- reduce dimensions     |
+  |<---- ready: false ------------|-- KMeans clustering     |
+  |-- GET /plot/data/{id} ------->|-- compute centroids     |
+  |<---- ready: true, data -------|                         |
+  |                               |                         |
+  |-- render 3D scene             |                         |
+```
+
+### Semantic Search
+
+```text
+Browser                          Server                    Storage
+  |                                |                         |
+  |-- POST /search -------------->|                         |
+  |                                |-- infer model type      |
+  |                                |-- embed query           |
+  |                                |-- ChromaDB.query() <----|
+  |<---- results + distances -----|                         |
+  |                                |                         |
+  |-- highlight in 3D scene       |                         |
+```
+
+## Storage
+
+| Directory | Contents | Persistence |
+|-----------|----------|-------------|
+| `./chromadb/` | Vector database (embeddings + metadata) | Persistent, gitignored |
+| `./uploads/` | User-uploaded CSV files | Persistent, gitignored |
+| `./annotations/` | Cluster annotation JSON files | Persistent, gitignored |
+
+## Design Decisions
+
+### Why ChromaDB?
+
+ChromaDB provides embedded, in-process vector storage with no external
+service to run. Collections persist to disk automatically, support metadata
+filtering, and offer nearest-neighbor search out of the box, which covers
+this tool's needs without a separate database server.
+
+### Why LiteLLM?
+
+Rather than coupling to a single LLM provider, LiteLLM provides a unified
+interface to OpenAI, Google, Anthropic, and Ollama. Users can switch
+providers from the settings page without code changes.
+
+### Why React Three Fiber?
+
+The 3D visualization needs to render thousands of points interactively.
+React Three Fiber provides an idiomatic React API over Three.js, enabling
+declarative scene composition while retaining GPU-level performance through
+instanced rendering and point clouds.
+
+### Job-Based Architecture
+
+Embedding generation and plot computation can take seconds to minutes. The
+task registry pattern decouples request handling from execution, allowing
+the frontend to poll or subscribe via WebSocket without blocking HTTP
+connections.
diff --git a/docs/USAGE.md b/docs/USAGE.md
new file mode 100644
index 0000000..d3fa07e
--- /dev/null
+++ b/docs/USAGE.md
@@ -0,0 +1,199 @@
+# Usage
+
+This guide covers all three running modes and their configuration.
+
+## Web UI (SERVER mode)
+
+The recommended way to use embedding-clusters:
+
+```bash
+RUNNING_MODE=SERVER uv run python -m embedding_cluster
+```
+
+Open <http://localhost:8000>. The web UI provides:
+
+1. **Home** — browse existing collections, see item counts and model info
+2. **Index** — upload a CSV, select fields to embed, configure models,
+   and watch real-time progress via WebSocket
+3. **Plot** — pick a collection, set clustering parameters, and interact
+   with a 3D scatter plot
+4. **Settings** — configure AI provider for cluster naming
+
+### Workflow
+
+1. Navigate to the **Index** page
+2. Upload a CSV file (drag-and-drop or file picker)
+3. Select which columns to embed:
+   - **Text fields** use a SentenceTransformer model
+   - **Image URL fields** use a CLIP model
+4. Click **Start** and watch the progress bar
+5. Navigate to the **Plot** page
+6. Select the new collection and configure:
+   - Number of clusters (or click **Suggest** for auto-detection)
+   - Reduction algorithm: t-SNE, UMAP, or PCA
+   - Algorithm-specific parameters (perplexity, learning rate, etc.)
+7. Click **Compute** to generate the 3D visualization
+8. Explore:
+   - **Hover** points to see metadata
+   - **Search** by text or image URL to highlight similar items
+   - **Click** a cluster in the legend to drill down
+   - **Sub-cluster** within a cluster for hierarchical exploration
+   - **Annotate** clusters with names, tags, and notes
+   - **AI Name** clusters using your configured LLM provider
+
+### Render Modes
+
+The 3D plot supports three render modes, switchable from the plot controls:
+
+- **Particles** — GPU-accelerated point cloud (default, best for large
+  datasets)
+- **Sprites** — image thumbnails at each point (requires an image field)
+- **Instanced Spheres** — 3D sphere meshes with lighting effects
+
+## CLI: INDEX mode
+
+Embed CSV data into ChromaDB collections from the command line:
+
+```bash
+RUNNING_MODE=INDEX \
+  LOCAL_CSV_FILENAME=./embedding_cluster/csv/fashion_small.csv \
+  ID_FIELD=id \
+  IMAGE_EMBEDDING_FIELDS='["imageUrl"]' \
+  CHROMADB_COLLECTION_PREFIX=fashion_ \
+  NUMBER_OF_ASYNC_TASKS=10 \
+  uv run python -m embedding_cluster
+```
+
+You can also embed text fields:
+
+```bash
+RUNNING_MODE=INDEX \
+  LOCAL_CSV_FILENAME=./data/products.csv \
+  ID_FIELD=product_id \
+  TEXT_EMBEDDING_FIELDS='["name", "description"]' \
+  CHROMADB_COLLECTION_PREFIX=products_ \
+  uv run python -m embedding_cluster
+```
+
+Or both text and image fields in the same run:
+
+```bash
+RUNNING_MODE=INDEX \
+  LOCAL_CSV_FILENAME=./data/catalog.csv \
+  ID_FIELD=id \
+  TEXT_EMBEDDING_FIELDS='["title"]' \
+  IMAGE_EMBEDDING_FIELDS='["thumbnail_url"]' \
+  CHROMADB_COLLECTION_PREFIX=catalog_ \
+  uv run python -m embedding_cluster
+```
+
+## CLI: PLOT mode
+
+Generate a cluster visualization from an existing collection:
+
+```bash
+RUNNING_MODE=PLOT \
+  CHROMADB_COLLECTION_NAME=fashion_imageUrl \
+  TEXT_DISPLAY_FIELDS='["productDisplayName"]' \
+  IMAGE_FIELD=imageUrl \
+  NUM_CLUSTERS=8 \
+  REDUCTION_ALGORITHM=umap \
+  uv run python -m embedding_cluster
+```
+
+## Environment Variables
+
+All configuration is supplied through environment variables, parsed by
+[pydantic-settings](https://docs.pydantic.dev/latest/concepts/pydantic_settings/).
+
+### General
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `RUNNING_MODE` | — | `INDEX`, `PLOT`, or `SERVER` |
+| `DEVICE` | `cpu` | PyTorch device (`cpu`, `mps`, `cuda`) |
+
+### Indexing
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `LOCAL_CSV_FILENAME` | — | Path to CSV file |
+| `ID_FIELD` | — | Column name for unique row IDs |
+| `TEXT_EMBEDDING_FIELDS` | `[]` | JSON array of text columns to embed |
+| `IMAGE_EMBEDDING_FIELDS` | `[]` | JSON array of image URL columns to embed |
+| `TEXT_MODEL_NAME` | `BAAI/bge-small-en-v1.5` | SentenceTransformer model |
+| `IMAGE_MODEL_NAME` | `openai/clip-vit-base-patch32` | CLIP model |
+| `CHROMADB_COLLECTION_PREFIX` | — | Prefix for collection names |
+| `NUMBER_OF_ASYNC_TASKS` | `5` | Concurrency limit for async operations |
+| `BULK_SIZE` | `100` | Batch size for ChromaDB upserts |
+| `START_LINE` | — | First CSV line to process (optional) |
+| `STOP_LINE` | — | Last CSV line to process (optional) |
+
+### Plotting
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `CHROMADB_COLLECTION_NAME` | — | Collection to visualize |
+| `TEXT_DISPLAY_FIELDS` | `[]` | JSON array of metadata fields to show on hover |
+| `IMAGE_FIELD` | — | Metadata field containing image URLs |
+| `NUM_CLUSTERS` | `10` | Number of k-means clusters |
+| `REDUCTION_ALGORITHM` | `tsne` | `tsne`, `umap`, or `pca` |
+| `TSNE_PERPLEXITY` | `30` | t-SNE perplexity parameter |
+| `TSNE_LEARNING_RATE` | `200` | t-SNE learning rate |
+| `UMAP_N_NEIGHBORS` | `15` | UMAP neighbors parameter |
+| `UMAP_MIN_DIST` | `0.1` | UMAP minimum distance |
+| `UMAP_METRIC` | `cosine` | UMAP distance metric |
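+
+List-valued variables such as `TEXT_EMBEDDING_FIELDS` must be valid JSON
+arrays, so the shell quoting matters. The sketch below uses only the
+standard library to show the expected format; `read_json_list` is a
+hypothetical helper and does not describe how pydantic-settings parses
+values internally.
+
+```python
+import json
+import os
+
+
+def read_json_list(name: str, environ=os.environ) -> list:
+    """Parse an environment variable holding a JSON array of strings."""
+    raw = environ.get(name, "[]")  # unset variables default to an empty list
+    value = json.loads(raw)
+    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
+        raise ValueError(f"{name} must be a JSON array of strings, got {raw!r}")
+    return value
+
+
+# A plain dict stands in for os.environ so the example has no side effects.
+fake_env = {"TEXT_EMBEDDING_FIELDS": '["name", "description"]'}
+text_fields = read_json_list("TEXT_EMBEDDING_FIELDS", fake_env)
+image_fields = read_json_list("IMAGE_EMBEDDING_FIELDS", fake_env)
+```
+
+A bare value such as `TEXT_EMBEDDING_FIELDS=name` is not valid JSON and
+fails to parse in this sketch, which is why the CLI examples wrap the
+array in single quotes.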
+
+## API Endpoints
+
+In SERVER mode, the following REST endpoints, plus one WebSocket endpoint
+for indexing progress, are available under `/api`:
+
+### CSV
+
+- `POST /api/csv/upload` — upload a CSV file
+- `POST /api/csv/preview` — preview columns and sample rows
+
+### Indexing
+
+- `POST /api/index/start` — start an indexing job
+- `GET /api/index/status/{job_id}` — poll job progress
+- `POST /api/index/cancel/{job_id}` — cancel a running job
+- `WS /api/index/ws/{job_id}` — real-time progress via WebSocket
+
+### Collections
+
+- `GET /api/collections` — list all collections
+- `GET /api/collections/{name}` — collection detail with metadata fields
+- `DELETE /api/collections/{name}` — delete a collection
+
+### Plot
+
+- `POST /api/plot/compute` — start plot computation
+- `GET /api/plot/data/{job_id}` — fetch computed plot data
+- `GET /api/plot/{job_id}/cluster/{index}` — paginated cluster items
+- `POST /api/plot/{job_id}/cluster/{index}/sub-cluster` — sub-cluster
+- `POST /api/plot/{job_id}/sub-cluster` — sub-cluster by point IDs
+- `POST /api/plot/suggest-clusters` — auto-suggest optimal cluster count
+- `POST /api/plot/{job_id}/suggest-k` — suggest k for sub-clustering
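+
+A client-side sketch of the compute-then-poll flow, using only the
+standard library. It assumes the server runs at `localhost:8000` and that
+the `GET /api/plot/data/{job_id}` response carries a boolean `ready`
+field; `wait_for_plot` and `http_get_json` are hypothetical helpers, and
+the rest of the response shape is not specified here.
+
+```python
+import json
+import time
+import urllib.request
+
+BASE_URL = "http://localhost:8000/api"  # assumed default server address
+
+
+def http_get_json(url: str) -> dict:
+    """Fetch a URL and decode the JSON body."""
+    with urllib.request.urlopen(url) as resp:
+        return json.load(resp)
+
+
+def wait_for_plot(job_id: str, fetch=http_get_json,
+                  interval: float = 2.0, max_polls: int = 150) -> dict:
+    """Poll the plot data endpoint until the payload reports ready."""
+    url = f"{BASE_URL}/plot/data/{job_id}"
+    for _ in range(max_polls):
+        payload = fetch(url)
+        if payload.get("ready"):
+            return payload
+        time.sleep(interval)
+    raise TimeoutError(f"plot job {job_id} not ready after {max_polls} polls")
+
+
+# Offline demonstration: a fake fetcher replaces the real HTTP call.
+responses = iter([{"ready": False}, {"ready": False}, {"ready": True}])
+seen_urls = []
+
+
+def fake_fetch(url: str) -> dict:
+    seen_urls.append(url)
+    return next(responses)
+
+
+result = wait_for_plot("abc123", fetch=fake_fetch, interval=0)
+```
+
+Against a running server, call `wait_for_plot(job_id)` with the job ID
+returned by `POST /api/plot/compute`; the 2-second default interval is a
+choice made for this example.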
+
+### Search
+
+- `POST /api/search` — semantic search by text or image URL
+
+### AI Naming
+
+- `POST /api/ai/name-clusters` — generate cluster names via LLM
+- `POST /api/ai/name-sub-clusters` — generate sub-cluster names
+- `POST /api/ai/test-connection` — validate LLM credentials
+- `POST /api/ai/ollama/models` — list available Ollama models
+
+### Annotations
+
+- `GET /api/annotations/{job_id}` — fetch annotations for a job
+- `PUT /api/annotations/{job_id}/cluster/{index}` — update annotation
+- `DELETE /api/annotations/{job_id}` — delete all annotations for a job
+
+### Health
+
+- `GET /api/health` — returns `{"status": "ok"}`