n0x

Run the full AI stack in a browser tab.

n0x puts an LLM, an autonomous agent, document Q&A, a Python runtime, image generation, and web search into one browser tab. No server. No account. No API keys. Open a tab, pick a model, start working.

The default path is fully local: your prompts, files, and model weights never leave your machine. WebGPU inference runs at roughly 35–60 tok/s on a typical laptop GPU with the small and mid-sized models; larger models are slower (see Models). If you want more power, switch to Ollama or plug in a cloud API (Groq, OpenRouter, or any OpenAI-compatible endpoint) and run the same tool stack against bigger models.

Why this exists

Every AI tool I tried either wanted my data, wanted my money, or both. I wanted something I could open in a browser and just use — no Docker, no Python venv, no sign-up wall, no "you've hit your free tier limit."

So I built n0x. It's an actual workstation, not a chatbot wrapper.

What you get

Pick your backend. Three providers, one interface:

Provider	What runs	Setup
Browser (WebGPU)	40+ open-source models, from 360M to 70B parameters, running on your GPU via WebLLM	Zero. Just pick a model.
Ollama	Any model from your local Ollama server	`ollama serve` — n0x auto-detects it
Cloud API	Groq, OpenRouter, or any OpenAI-compatible endpoint	Paste a key + base URL

Switch between them mid-conversation. Your chat history stays.
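
One way to keep the chat history independent of the backend is to store messages in a provider-neutral shape and build the HTTP request only at send time. The sketch below assumes the Ollama and cloud providers are reached through an OpenAI-style `/chat/completions` endpoint; `ProviderConfig`, `ChatMessage`, and `buildChatRequest` are hypothetical names, not n0x's actual code.

```typescript
// Hypothetical types: n0x's real provider layer may be shaped differently.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ProviderConfig {
  kind: "webgpu" | "ollama" | "cloud";
  model: string;
  baseUrl?: string; // OpenAI-compatible base URL (HTTP providers only)
  apiKey?: string; // usually only needed for cloud providers
}

interface HttpRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build a streaming chat request for any OpenAI-compatible endpoint.
// The history is passed through unchanged, so it survives a provider switch.
function buildChatRequest(cfg: ProviderConfig, history: ChatMessage[]): HttpRequest {
  if (cfg.kind === "webgpu") {
    throw new Error("webgpu runs in-process; there is no HTTP request to build");
  }
  if (!cfg.baseUrl) {
    throw new Error("baseUrl is required for HTTP providers");
  }
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (cfg.apiKey) {
    headers["Authorization"] = `Bearer ${cfg.apiKey}`;
  }
  return {
    url: `${cfg.baseUrl.replace(/\/+$/, "")}/chat/completions`,
    headers,
    body: JSON.stringify({ model: cfg.model, messages: history, stream: true }),
  };
}
```

Switching providers mid-conversation then amounts to calling `buildChatRequest` with the same `history` and a different `cfg`, for example `{ kind: "ollama", model: "llama3.2", baseUrl: "http://localhost:11434/v1" }`.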

Agent mode. A ReAct reasoning loop that actually works. The LLM chains tool calls autonomously — web search, document lookup, Python execution, memory recall — and you watch it think in real time, token by token. It handles malformed JSON output, recovers from loops, budgets its context window, and times out gracefully. The trace UI shows every step with timing and token cost.
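
To make the loop concrete, here is a minimal sketch of a ReAct-style driver with three of the safeguards mentioned above: tolerant JSON parsing, repeated-action detection, and a step budget. It is an illustration under assumptions, not n0x's implementation: the JSON shapes (`{"tool", "input"}` and `{"final"}`), the function names, and the injected `llm` and `tools` are all hypothetical, and context budgeting and timeouts are left out.

```typescript
// All names here are illustrative; the model call and tools are injected.
type Tool = (input: string) => Promise<string>;
type Step =
  | { kind: "action"; tool: string; input: string }
  | { kind: "final"; answer: string };

// Tolerant parse: extract the outermost {...} span and ignore surrounding chatter.
function parseStep(raw: string): Step | null {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    const obj = JSON.parse(raw.slice(start, end + 1));
    if (typeof obj.final === "string") return { kind: "final", answer: obj.final };
    if (typeof obj.tool === "string") {
      return { kind: "action", tool: obj.tool, input: String(obj.input ?? "") };
    }
  } catch {
    // Malformed JSON: fall through and report a parse failure.
  }
  return null;
}

async function runAgent(
  llm: (transcript: string) => Promise<string>,
  tools: Record<string, Tool>,
  question: string,
  maxSteps = 6,
): Promise<string> {
  let transcript = `Question: ${question}\n`;
  const seen = new Set<string>();
  for (let i = 0; i < maxSteps; i++) {
    const step = parseStep(await llm(transcript));
    if (step === null) {
      transcript += "Observation: output was not valid JSON, try again.\n";
      continue;
    }
    if (step.kind === "final") return step.answer;
    const key = `${step.tool}:${step.input}`;
    const tool = tools[step.tool];
    let observation: string;
    if (seen.has(key)) {
      observation = "repeated action, choose a different step."; // loop recovery
    } else if (!tool) {
      observation = `unknown tool "${step.tool}".`;
    } else {
      observation = await tool(step.input).catch((e) => `tool error: ${String(e)}`);
    }
    seen.add(key);
    transcript += `Action: ${key}\nObservation: ${observation}\n`;
  }
  return "Stopped: step budget exhausted.";
}
```

Feeding failures back as observations, rather than throwing, lets the model correct itself on the next turn while the step budget bounds the total cost.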

Document Q&A. Drop a PDF, DOCX, TXT, CSV, or markdown file into the chat. n0x chunks it with sentence-boundary-aware splitting, embeds it with MiniLM-L6 (in a Web Worker, so the UI doesn't freeze), indexes it with Voy cosine search, and re-ranks results with MMR so you get relevant answers without duplicate noise. Vectors are cached in IndexedDB — upload once, query forever.
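
The chunking and re-ranking steps can be sketched as follows. This is a simplified illustration with hypothetical function names: it works on plain `number[]` vectors, skips the MiniLM embedding and Voy index entirely, and uses a basic punctuation rule for sentence boundaries. The MMR weight `lambda` is an assumed parameter; the value n0x uses is not stated here.

```typescript
// Split after sentence-ending punctuation, then pack whole sentences into chunks.
// A single sentence longer than maxChars is kept as one oversized chunk.
function chunkBySentence(text: string, maxChars: number): string[] {
  const sentences = text
    .replace(/([.!?])\s+/g, "$1\u0000")
    .split("\u0000")
    .filter((s) => s.trim().length > 0);
  const chunks: string[] = [];
  let current = "";
  for (const s of sentences) {
    if (current && current.length + 1 + s.length > maxChars) {
      chunks.push(current);
      current = s;
    } else {
      current = current ? `${current} ${s}` : s;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    na += x * x;
    nb += y * y;
  }
  return na > 0 && nb > 0 ? dot / Math.sqrt(na * nb) : 0;
}

// Maximal Marginal Relevance: greedily pick k documents, scoring each candidate as
// lambda * relevance-to-query minus (1 - lambda) * max similarity to already-picked docs.
function mmr(query: number[], docs: number[][], k: number, lambda = 0.7): number[] {
  const rest = docs.map((vec, index) => ({ index, vec, rel: cosine(query, vec) }));
  const picked: typeof rest = [];
  while (picked.length < k && rest.length > 0) {
    let bestPos = 0;
    let bestScore = -Infinity;
    rest.forEach((cand, pos) => {
      const redundancy = picked.reduce((m, p) => Math.max(m, cosine(cand.vec, p.vec)), 0);
      const score = lambda * cand.rel - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestPos = pos;
      }
    });
    picked.push(...rest.splice(bestPos, 1));
  }
  return picked.map((p) => p.index);
}
```

With `lambda = 1` the ranking is pure cosine relevance; lowering it penalizes near-duplicate chunks, which is what removes the "duplicate noise" from the retrieved context.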

Python runtime. Pyodide runs in a WASM sandbox. Code output feeds back into the conversation. If execution fails, the error goes to the LLM automatically for a fix.
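
The run-then-repair cycle can be sketched as a small retry loop. The Python runner and the LLM are injected as plain functions here, so the sketch makes no claim about how n0x calls Pyodide; `runWithRepair`, `RunResult`, and the attempt limit are hypothetical.

```typescript
type RunResult = { ok: true; output: string } | { ok: false; error: string };

// Run code; on failure, hand the code and the error to a fixer and retry.
async function runWithRepair(
  code: string,
  run: (code: string) => Promise<RunResult>, // e.g. a wrapper around a Pyodide call
  fix: (code: string, error: string) => Promise<string>, // e.g. an LLM call
  maxAttempts = 3,
): Promise<{ code: string; result: RunResult; attempts: number }> {
  let current = code;
  let result = await run(current);
  let attempts = 1;
  while (!result.ok && attempts < maxAttempts) {
    current = await fix(current, result.error);
    result = await run(current);
    attempts++;
  }
  return { code: current, result, attempts };
}
```

Capping the attempts matters: without a limit, a model that keeps producing the same broken code would retry forever.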

Image generation. Type "generate an image of..." and Pollinations (Flux, z-image-turbo, klein, and more) handles the rest. Works with a free API key or no key at all. Loading skeleton, retry on failure, zoom and download built in.

Web search. DuckDuckGo + Wikipedia synthesis. Plug in a Tavily key for deeper, citation-rich results.

Memory. The agent stores and recalls facts across sessions. Persistent in IndexedDB.

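Voice. Speech-to-text and text-to-speech via the Web Speech API. Offline use depends on the browser: some speech recognition engines, including Chrome's, send audio to a remote service.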

Branching. Fork any message into an alternate conversation thread. Both branches persist in the sidebar.
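
A common way to model this kind of forking is a message tree with parent links, where each branch is just a different leaf. The sketch below is an assumption about the data structure, not a description of n0x's store; `ConversationTree` and its methods are hypothetical.

```typescript
interface Message {
  id: string;
  parentId: string | null; // null marks the first message of a conversation
  role: "user" | "assistant";
  content: string;
}

class ConversationTree {
  private messages = new Map<string, Message>();
  private nextId = 0;

  // Adding a second child under the same parent is what creates a branch.
  add(parentId: string | null, role: Message["role"], content: string): Message {
    if (parentId !== null && !this.messages.has(parentId)) {
      throw new Error(`unknown parent ${parentId}`);
    }
    const msg: Message = { id: `m${this.nextId++}`, parentId, role, content };
    this.messages.set(msg.id, msg);
    return msg;
  }

  // Walk parent links from a leaf back to the root to rebuild one thread.
  thread(leafId: string): Message[] {
    const path: Message[] = [];
    let cur = this.messages.get(leafId);
    while (cur) {
      path.unshift(cur);
      cur = cur.parentId === null ? undefined : this.messages.get(cur.parentId);
    }
    return path;
  }
}
```

Because branches share their common prefix, forking copies nothing: both threads keep pointing at the same earlier messages.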

Personas. Five system prompts — Default, Senior Engineer, Writer, Tutor, Analyst — each with their own tone, formatting rules, and domain focus.

How it works

                              ┌─────────────────────┐
                              │    Provider Layer   │
                              │  WebGPU · Ollama ·  │
                              │  Cloud API (OpenAI) │
                              └────────┬────────────┘
                                       │
┌──────────┐     ┌───────────┐    ┌────▼────┐
│  User    │────▶│  Router   │───▶│  LLM    │
│  Input   │     │           │    │ Stream  │
└──────────┘     │  direct / │    └────┬────┘
                 │  agent /  │         │
                 │  image    │    ┌────▼─────────────────────────┐
                 └───────────┘    │  Agent (ReAct Loop)          │
                                  │  thought → action →          │
                                  │  observation → repeat        │
                                  │                              │
                                  │  Tools:                      │
                                  │   ├ Web Search (DDG/Tavily)  │
                                  │   ├ RAG (Voy + MMR)          │
                                  │   ├ Python (Pyodide WASM)    │
                                  │   ├ Memory (IndexedDB)       │
                                  │   └ Image Gen (Pollinations) │
                                  └──────────────────────────────┘

Everything in this diagram runs in the browser when you use the WebGPU provider. The only network calls are optional: search queries go through a CORS proxy, and image prompts go to Pollinations. Disable both and, once your model is cached, you have a fully air-gapped AI workstation.
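
The router's three outcomes in the diagram (direct, agent, image) can be illustrated with a tiny dispatch function. The image trigger phrase comes from the feature list above; treating agent mode as an explicit flag is an assumption, and `route` is a hypothetical name. The real router may use different or additional signals.

```typescript
type Route = "direct" | "agent" | "image";

// Assumed rules: an image request is recognized by its opening phrase,
// and everything else goes to the agent only when agent mode is switched on.
function route(input: string, agentMode: boolean): Route {
  if (/^\s*generate an image of\b/i.test(input)) return "image";
  return agentMode ? "agent" : "direct";
}
```

For example, `route("Generate an image of a fox", false)` returns `"image"`, while `route("Summarize this PDF", true)` returns `"agent"`.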

Models

40+ models. MLC-compiled, quantized, cached in browser storage after first download. Real inference, not API calls.

Tier	Examples	Size	Speed
Tiny	SmolLM2 360M, Qwen 0.5B	360MB–900MB	60+ t/s
Balanced	Qwen 2.5 1.5B (default), Phi-3.5, Llama 3.2 3B	700MB–2.2GB	35–50 t/s
Heavy	Mistral 7B, Qwen 2.5 7B, Llama 3.1 8B, Gemma 2 9B	4–6GB	15–25 t/s
Flagship	Qwen 2.5 32B, Llama 3.3 70B, R1 Llama 70B	10–30GB	8–15 t/s
Code	Qwen Coder 1.5B/7B/32B, DeepSeek Coder	800MB–20GB	varies

Start with Qwen 2.5 1.5B (~1GB). It loads in seconds on a warm cache and handles most tasks well. Scale up from there.

Run it yourself

Chrome or Edge (WebGPU required). Node 18+.

git clone https://github.com/ixchio/n0x.git
cd n0x
npm install
npm run dev

Open http://localhost:3000. The first launch downloads the default model (~1GB); after that it loads from the browser cache.

Optional env vars

TAVILY_API_KEY=        # Better search results (server-side, never touches the client)
POLLINATIONS_API_KEY=  # Image gen without watermarks + higher rate limits

Both are optional. Everything works without them.

Privacy

On the default WebGPU path, your prompts, documents, and model weights stay in your browser.

The only data that leaves your machine on that path:

Search queries — routed through a DuckDuckGo CORS proxy (if you use search)
Image prompts — sent to Pollinations API (if you generate images)

Turn both off and, on the WebGPU path, nothing leaves your machine: no metadata, no telemetry. If you switch to a cloud API, your prompts go to that provider.

Stack

Next.js 14 · TypeScript · WebLLM (WebGPU) · Transformers.js · Voy · Pyodide · Zustand · Tailwind · IndexedDB

MIT · ixchio
