OpenJet sets up the local model backend for your hardware and gives you a Claude-Code-style coding agent that can read files, edit code, and run commands fully on your own machine.
If you are new to local LLMs, OpenJet is the fastest way to get started without spending hours figuring out models, runtimes, and config.
If you have already tried local LLMs and got frustrated piecing together a model backend, a frontend, and an actual coding agent, OpenJet removes that setup tax.
OpenJet is built for people looking for a Claude Code alternative, easy local LLM setup, or a self-hosted local coding agent.
```bash
pipx install open-jet
openjet setup
```

If you do not use pipx, install with Python directly:

```bash
python -m pip install --user open-jet
openjet setup
```

The PyPI package is `open-jet`; the installed command is `openjet`.
Recommended hardware: Apple silicon with 24GB+ unified memory, or a GPU with 14GB+ VRAM.
The tables below list the setup catalog entries from `src/config.py`; `max_ram_gb` is the configured setup target for each row.
General (any GPU/RAM; no `unified_memory_only` flag):
| Model | Configured `max_ram_gb` |
|---|---|
| Qwen3.5 4B | 6.0 |
| Qwen3.5 9B | 12.0 |
| Qwen3.6 27B UD-IQ2_XXS | 12.0 |
| Qwen3.6 27B UD-IQ3_XXS | 16.0 |
| Qwen3.6 27B Q4_K_M MTP | 20.0 |
Unified memory only (`unified_memory_only: True`, `llama_cpu_moe: True`):
| Model | Configured `max_ram_gb` |
|---|---|
| Gemma 4 26B A4B | 24.0 |
| Qwen3.6 35B A3B UD-Q3_K_XL | 24.0 |
| Qwen3.6 35B A3B | 32.0 |
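To make the columns concrete, here is a minimal sketch of what one catalog entry could look like, assuming entries are plain dicts in `src/config.py`. Only the model label and the `max_ram_gb`, `unified_memory_only`, and `llama_cpu_moe` fields come from the tables above; the dict layout itself is an assumption.

```python
# Hypothetical shape of one setup catalog entry. The field names
# max_ram_gb, unified_memory_only, and llama_cpu_moe are taken from
# the tables above; the overall dict layout is assumed.
ENTRY = {
    "name": "Qwen3.6 35B A3B UD-Q3_K_XL",  # model label from the table
    "max_ram_gb": 24.0,                     # configured setup target
    "unified_memory_only": True,            # only offered on unified-memory machines
    "llama_cpu_moe": True,                  # llama.cpp CPU-MoE setting (meaning assumed)
}
```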
Setup detects your hardware, picks a model that fits your RAM, downloads it, and gets everything running. Already have a `.gguf`? It finds that too.
Then run:
```bash
openjet
```

Other entrypoints from the same install:

```bash
openjet benchmark --sweep
openjet fix
```

```python
from openjet.sdk import OpenJetSession, recommend_hardware_config
```

| What it does | Why it matters |
|---|---|
| Easy local LLM setup | Get a working local coding agent without manually learning the entire backend and runtime stack first |
| Unified backend + harness | One local system instead of separately wiring together a model runtime, config layer, frontend, and agent workflow |
| Claude-Code-style workflow | Read files, edit code, run commands, and work in a terminal agent instead of a plain chat window |
| Hardware-aware setup | OpenJet picks sensible defaults for your machine instead of leaving you to trial-and-error every setting |
| Fully local | Your code stays on your machine, with no cloud dependency required |
| Remote execution support | Run the model on one machine and execute on another |
| SDK + benchmarks included | Script the same runtime from Python and measure performance on your own hardware |
| Tool | Backend setup | Local runtime provisioning | Hardware auto-config | Terminal agent | Memory persistence |
|---|---|---|---|---|---|
| OpenJet | Built in: install + `openjet setup` | Yes: model discovery/download + llama.cpp config | Yes | Full TUI | Yes: global + project memory |
| Aider | Manual: choose API, local endpoint, or provider config | No | No | Terminal chat | No persistent agent memory |
| Cline | Manual: extension/CLI plus provider or local model config | No | No | Editor-first; CLI available | Yes: Memory Bank/rules |
| OpenCode | Manual: install CLI plus provider/local model config | No | No | Full TUI | Sessions/config persist |
An agent in your terminal that can actually do useful work:
- **Read and edit your code** - Search files, apply edits, and write new ones
- **Run shell commands** - Explicit approval before commands execute
- **Resume sessions** - Close the terminal, come back later, keep going
- **Work on constrained hardware** - Automatic context condensing and model unload/reload around heavy tasks
- **Connect to devices** - Cameras, microphones, GPIO, and remote devices for edge and embedded workflows
- **Connect MCP tools** - Optionally expose trusted MCP server tools through OpenJet's normal tool registry
- **Use the Python SDK** - Automate the same runtime from scripts and external apps
- **Auto-configure local inference** - Hardware profiling and recommended settings for local llama.cpp
- **Benchmark your setup** - Sweep GPU layers, batch sizes, and thread counts on your own hardware
Interactive local agent work in the terminal.
Embed sessions, profile hardware, and automate workflows from Python.
```python
from openjet.sdk import OpenJetSession, recommend_hardware_config
```
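A minimal sketch of embedding the SDK, assuming a no-argument constructor and a `run` method on `OpenJetSession`; only the two imported names are confirmed by this README, the rest is illustrative:

```python
# Hypothetical usage sketch: OpenJetSession and recommend_hardware_config
# are the names exported by openjet.sdk, but the constructor arguments,
# methods, and return shapes below are assumptions, not the documented API.
from openjet.sdk import OpenJetSession, recommend_hardware_config

# Profile this machine and print the recommended local inference settings.
config = recommend_hardware_config()
print(config)

# Drive an agent session from a script (method name assumed).
session = OpenJetSession()
session.run("List the TODO comments in src/ and summarize them")
```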
Measure prompt and generation performance on your active model profile.

```bash
openjet benchmark --sweep
```

Cloud coding agents need API keys, send your code to someone else's server, and charge per token.
Most local tools stop at chat. You can run a model, but you still do not have a real coding workflow.
OpenJet closes that gap. It is built for people who want the speed, control, and privacy of local LLMs without becoming experts in runtimes, config, and frontend/backend glue just to get started.
Everything runs on your machine.
- Usage: CLI
- Usage: Slash commands
- Usage: Skills
- Usage: MCP
- Usage: Device sources
- Usage: Workflow harness
- Usage: Session state and logging
Benchmarkers and testers are appreciated.
OpenJet core is licensed under Apache-2.0.
That means individual developers and companies can use, modify, and embed the core SDK and CLI freely under the Apache terms. Future paid offerings for hosted, team, or enterprise functionality may be shipped separately under commercial terms.
External contributions are accepted under the contributor terms in CONTRIBUTING.md and CLA.md.
