Skip to content

Open-source AI robotics framework — tiered brain, 8 AI providers, multi-robot swarm, self-improving loop

License

Notifications You must be signed in to change notification settings

craigm26/OpenCastor

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

272 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

OpenCastor

OpenCastor

The Universal Runtime for Embodied AI

PyPI CI License Python Discord

99,547 lines of Python · 3,431 tests · Python 3.10–3.13
Connect any AI model to any robot hardware through a single YAML config.


🚀 Install in 10 Seconds

curl -sL opencastor.com/install | bash
Other platforms

Windows 11 (PowerShell):

irm https://raw.githubusercontent.com/craigm26/OpenCastor/main/scripts/install.ps1 | iex

Docker:

docker compose up

Manual:

git clone https://github.com/craigm26/OpenCastor.git
cd OpenCastor
python3 -m venv venv && source venv/bin/activate
pip install -e ".[dev]"

Supports Linux, macOS (Apple Silicon & Intel), Windows 11, Raspberry Pi, Docker. Installer flags: --dry-run, --no-rpi, --skip-wizard

✨ What's New in v2026.2.26.3

  • Google setup reliability hardening (2026-02-26) — setup preflight now reports Google model fallback usage accurately, model availability probing only runs when GOOGLE_API_KEY is present, and Google auth guidance now clearly distinguishes optional ADC setup from required Gemini API-key model calls

  • Stability pass (2026-02-25) — full-suite hardening complete: 3,431 tests passing; fixed cross-platform daemon path rendering, JWT fallback edge cases, plugin SHA newline normalization, and async warning cleanup in Teams/WhatsApp channels

  • Messaging channels now drive hardware — fixed a silent VFS ACL bug where the channel principal was denied write access to /dev/motor, causing every WhatsApp/Telegram/Discord motor command to reply but never move the wheels

  • Clean channel replies — AI replies sent to messaging channels and TTS no longer include the raw JSON action block ({"type": ...}) the runtime uses internally

  • WaypointNav ESC floor — added min_drive_s = 0.4 s so RC ESCs have time to spool up on short-distance commands (configurable via physics.min_drive_s)

  • neonize 0.3.14 audio — voice messages on WhatsApp work again; uses download_any() API introduced in neonize 0.3.14

  • OpenRouter provider — one API key unlocks 100+ models (GPT-4.1, Claude, Gemini, Mistral, DeepSeek, LLaMA 3.3…)

  • IMU driver — MPU6050 / BNO055 / ICM-42688 accelerometer + gyroscope via smbus2, auto-detected

  • 2D LiDAR driver — RPLidar A1/A2/C1/S2, 4-sector obstacle mapping, REST API

  • Reactive obstacle avoidance — fuses LiDAR + OAK-D depth; e-stop zone, slow zone, configurable via REST

  • LLM response cache — SQLite cache keyed by SHA-256(instruction + image); dramatically cuts API cost on repeated scenes

  • JS/TS SDK (sdk/js/) — zero-dependency TypeScript client: command(), stream(), status(), stop()

  • Fine-tune exportcastor export-finetune / GET /api/finetune/export — JSONL in OpenAI/Anthropic format from episode memory

  • Sentence Transformers provider — dense embeddings for semantic search and RAG pipelines

  • VLA provider — Vision-Language-Action model support (OpenVLA / Octo / pi0)

  • 3D point cloud (castor/pointcloud.py) — PLY export from OAK-D depth frames

  • Object detection (castor/detection.py) — YOLOv8/DETR real-time detection with REST overlays

  • Simulation bridge (castor/sim_bridge.py) — generate MuJoCo MJCF / Gazebo SDF from RCAN spec

  • Personality profiles — inject tone characters (friendly, military, scientist…) into every brain prompt

  • Wake-word voice loop (castor/voice_loop.py) — Porcupine → STT → brain pipeline

  • Groq provider presetgroq_rover.rcan.yaml for LPU-accelerated inference

  • OAK-4 Pro presetoak4_pro.rcan.yaml with 4K + depth + IMU auto-detection

Previous highlights (v2026.2.22.x): ROS2 driver · swarm CLI · dead-reckoning nav · YAML behavior runner · multi-user JWT · WebRTC streaming · MediaPipe gestures · video recording · outbound webhooks · BM25 episode search

🔄 Self-Improving Loop (Sisyphus Pattern)

Your robot learns from its mistakes. After each task, the Sisyphus Loop analyzes what happened, identifies failures, generates fixes, verifies them, and applies improvements automatically.

Episode → PM (Analyze) → Dev (Patch) → QA (Verify) → Apply
  • Disabled by default — opt-in via castor wizard or YAML config
  • 4 cost tiers — from $0 (local Ollama) to ~$5-15/mo (Claude)
  • Auto-apply preferences — config tuning only, behavior rules, or manual review
  • Rollback — undo any improvement with castor improve --rollback <id>
  • ALMA consolidation — cross-episode pattern analysis for deeper learning
castor improve --episodes 10    # Analyze last 10 episodes
castor improve --status         # View improvement history

🧠 Tiered Brain Architecture

OpenCastor doesn't send every decision to a $0.015/request cloud API. Instead, it routes through three layers — only escalating when needed:

┌─────────────────────────────────────────────────────────┐
│                    MESSAGING LAYER                       │
│         WhatsApp · Telegram · Discord · Slack            │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│                   API GATEWAY                            │
│            FastAPI · REST · Webhooks · JWT               │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│               TIERED BRAIN STACK                         │
│                                                          │
│  ┌────────────────────────────────────────────────────┐  │
│  │  Layer 3: PLANNER         Claude Opus · ~12s       │  │
│  │  Complex reasoning, multi-step plans, novel tasks  │  │
│  ├────────────────────────────────────────────────────┤  │
│  │  Layer 2: FAST BRAIN      HF / Gemini · ~500ms    │  │
│  │  Classification, simple Q&A, routine decisions     │  │
│  ├────────────────────────────────────────────────────┤  │
│  │  Layer 1: REACTIVE        Rule engine · <1ms       │  │
│  │  Obstacle stop, boundary enforce, emergency halt   │  │
│  └────────────────────────────────────────────────────┘  │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│                  PERCEPTION                              │
│  Hailo-8 NPU (YOLOv8) · OAK-D Depth · Camera · IMU    │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│               RCAN SAFETY KERNEL                         │
│    Physical bounds · Anti-subversion · Audit chain       │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│                  DRIVER LAYER                            │
│       PCA9685 · Dynamixel · GPIO · Serial · I2C         │
└──────────────────────┬──────────────────────────────────┘
                       │
                  [ YOUR ROBOT ]

Cost-effective by default. The reactive layer handles 80% of decisions at zero cost. The fast brain handles another 15%. The planner only fires for genuinely complex tasks.

🤖 Supported AI Providers

Provider Models Latency Best For
Anthropic claude-opus-4-6, claude-sonnet-4-6 ~12s Complex planning, safety-critical reasoning
Google gemini-2.5-flash, gemini-2.5-pro ~500ms Multimodal, video, speed
OpenAI gpt-4.1, gpt-4.1-mini ~2s Instruction following, 1M context
OpenRouter 100+ models (Mistral, DeepSeek, LLaMA…) Varies Multi-model fallback, cost optimization
HuggingFace Transformers, any hosted model ~500ms Fast brain layer, classification
Ollama llava:13b, any local model Varies Privacy, offline, zero cost
llama.cpp GGUF models ~200ms Edge inference, Raspberry Pi
MLX Apple Silicon native (mlx-lm, vLLM-MLX) ~50ms Mac M1–M4, 400+ tok/s
Apple Foundation Models apple-balanced, apple-creative, apple-tagging ~50-300ms Apple Intelligence on eligible Macs
Groq llama-3.3-70b, mixtral-8x7b ~100ms LPU-accelerated, fastest API
VLA OpenVLA, Octo, pi0 Varies End-to-end robot action models
Sentence Transformers all-MiniLM-L6-v2, others ~10ms Semantic search, RAG, embeddings
ONNX Runtime Any .onnx model ~50ms Quantized on-device inference
Kimi / MiniMax / Qwen Chinese LLMs Varies Chinese language, local Qwen3
Claude OAuth Proxy-authenticated Claude ~12s Team/org deployments

Swap providers with one YAML change:

agent:
  provider: "anthropic"
  model: "claude-opus-4-6"

👁️ Vision & Perception

Hailo-8 NPU — On-Device Object Detection

No cloud API calls needed. The Hailo-8 neural processing unit runs YOLOv8 locally:

  • 80 COCO object classes — people, vehicles, animals, furniture, and more
  • ~250ms inference — fast enough for real-time obstacle avoidance
  • Zero API cost — all processing happens on the edge

OAK-D Stereo Depth Camera

RGB + depth streaming via DepthAI v3. Get 3D spatial awareness for navigation, manipulation, and mapping.

perception:
  camera:
    type: "oakd"
    depth: true
    resolution: "1080p"
  npu:
    type: "hailo8"
    model: "yolov8n"
    confidence: 0.5

🛡️ Safety First

OpenCastor implements defense-in-depth safety, inspired by ContinuonAI principles and fully RCAN spec compliant:

Layer What It Does
Physical Bounds Workspace limits, joint constraints, force capping
Anti-Subversion Prompt injection defense, input sanitization
Work Authorization Dangerous commands require explicit approval
Tamper-Evident Audit Hash-chained logs at /proc/safety — any tampering is detectable
Emergency Stop Hardware and software e-stop, reactive layer < 1ms
castor audit --verify        # Verify audit chain integrity
castor approvals             # View/approve dangerous commands
castor privacy --config r.yaml  # Show sensor access policy

📦 Quick Start

🧩 Agent Skills

External agents can load production-ready runbooks from skills/:

  • skills/opencastor-operator/SKILL.md — robot operations, diagnostics, config checks, safety guardrails, and failure recovery.
  • skills/opencastor-developer/SKILL.md — API checks, RCAN validate/lint/migrate workflow, and dashboard/watch debugging flow.

If you are integrating with coding agents (Codex, Claude, etc.), point them to these skill files first so they follow OpenCastor-safe command sequences.

1. Install & Configure

curl -sL opencastor.com/install | bash
castor wizard

The wizard now uses a shared CLI/web setup-v2 flow: device probe, stack selection, model profiles, provider preflight, guided fallback, hardware preset, and optional messaging setup (WhatsApp/Telegram). It remembers your previous choices.

Setup now includes device-aware stack profiles:

  • apple_native — Apple Native (Recommended on eligible Mac)
  • mlx_local_vision — MLX Local Vision
  • ollama_universal_local — Ollama Universal Local
Apple Profile Meaning
apple-balanced Apple Balanced
apple-creative Apple Creative
apple-tagging Apple Tagging

If Apple Foundation Models are unavailable, setup shows guided fallback choices automatically.

Apple Foundation Models Troubleshooting

  • Apple Intelligence disabled: enable it in macOS System Settings and wait for model assets.
  • Device not eligible: use MLX or Ollama fallback stack.
  • Model not ready: keep device online and retry setup later.
  • Xcode requirement: install Xcode 26+ and ensure xcodebuild is available.

2. Run

castor run --config my_robot.rcan.yaml

3. Open the Dashboard

http://localhost:8501

4. Diagnose Issues

castor doctor

Minimal Python Example

from castor.providers import get_provider
from castor.drivers.pca9685 import PCA9685Driver

brain = get_provider({"provider": "anthropic", "model": "claude-opus-4-6"})
driver = PCA9685Driver(config["drivers"][0])

while True:
    frame = camera.capture()
    thought = brain.think(frame, "Navigate to the kitchen, avoid obstacles.")
    if thought.action:
        driver.move(thought.action["linear"], thought.action["angular"])

🏪 Community Hub

Browse, install, and share robot recipes:

castor hub search "delivery bot"
castor hub install @alice/warehouse-picker
castor hub publish my_robot.rcan.yaml

Recipes are shareable RCAN configs — complete robot personalities with perception, planning, and safety settings bundled together.

🔧 CLI Reference

Setup & Config
castor wizard                          # Interactive setup wizard
castor quickstart                      # Wizard + demo in one command
castor configure --config robot.yaml   # Interactive config editor
castor install-service --config r.yaml # Generate systemd unit file
castor learn                           # Step-by-step tutorial
castor doctor                          # Full system health check
castor fix                             # Auto-fix common issues
Run & Monitor
castor run --config robot.yaml         # Perception-action loop
castor run --config robot.yaml --simulate  # No hardware needed
castor gateway --config robot.yaml     # API gateway + messaging
castor dashboard                       # Streamlit web UI
castor demo                            # Simulated demo
castor shell --config robot.yaml       # Interactive command shell
castor repl --config robot.yaml        # Python REPL with robot objects
castor status                          # Provider/channel readiness
castor logs -f                         # Structured colored logs
castor benchmark --config robot.yaml   # Performance profiling
Hardware & Recording
castor test-hardware --config robot.yaml  # Test motors individually
castor calibrate --config robot.yaml      # Interactive calibration
castor record --config robot.yaml         # Record a session
castor replay session.jsonl               # Replay a recorded session
castor watch --gateway http://127.0.0.1:8000  # Live telemetry
Hub & Fleet
castor hub search "patrol bot"         # Browse community recipes
castor hub install @user/recipe        # Install a recipe
castor hub publish config.yaml         # Share your recipe
castor discover                        # Find RCAN peers on LAN
castor fleet                           # Multi-robot status (mDNS)
Safety & Admin
castor approvals                       # View/approve dangerous commands
castor audit --since 24h               # View audit log
castor audit --verify                  # Verify chain integrity
castor privacy --config robot.yaml     # Sensor access policy
castor token --role operator           # Issue JWT
castor upgrade                         # Self-update + health check

🏗️ Supported Hardware

Pre-made RCAN presets for popular kits, or generate your own:

Kit Price Preset
Waveshare AlphaBot / JetBot ~$45 config/presets/waveshare_alpha.rcan.yaml
Adeept RaspTank / DarkPaw ~$55 config/presets/adeept_generic.rcan.yaml
SunFounder PiCar-X ~$60 config/presets/sunfounder_picar.rcan.yaml
Robotis Dynamixel (X-Series) Varies config/presets/dynamixel_arm.rcan.yaml
OAK-4 Pro + Depth + IMU ~$150 config/presets/oak4_pro.rcan.yaml
DIY (ESP32, Arduino, custom) Any Generate with castor wizard

🏫 STEM & Second-Hand Hardware

OpenCastor explicitly supports the parts that students, educators, and hobbyists actually have — donated kits, school surplus, eBay finds, and sub-$50 Amazon staples. If you found it at Goodwill, a school auction, or a makerspace scrap bin, there's probably a preset for it.

Kit Typical New Price Where to Find Used Preset
LEGO Mindstorms EV3 ~$300 new School surplus, eBay $30-80 config/presets/lego_mindstorms_ev3.rcan.yaml
LEGO SPIKE Prime ~$320 new STEM program donations, eBay $80-150 config/presets/lego_spike_prime.rcan.yaml
VEX IQ System ~$250 new Robotics team surplus, school auctions $50-120 config/presets/vex_iq.rcan.yaml
Makeblock mBot ~$50 new eBay $10-25, Amazon Warehouse config/presets/makeblock_mbot.rcan.yaml
Arduino + L298N (DIY) ~$8-15 total Makerspace bins, AliExpress config/presets/arduino_l298n.rcan.yaml
ESP32 + Motor Driver (DIY) ~$6-12 total AliExpress, hackerspaces config/presets/esp32_generic.rcan.yaml
Yahboom ROSMASTER X3 ~$150-200 Amazon Warehouse, eBay config/presets/yahboom_rosmaster.rcan.yaml
Elegoo Tumbller / Smart Car ~$35-40 new Amazon Warehouse $15-25, eBay config/presets/elegoo_tumbller.rcan.yaml
Freenove 4WD Car (Pi-based) ~$40 new eBay $15-25 (Pi not included) config/presets/freenove_4wd.rcan.yaml
Cytron Maker Pi RP2040 ~$10 new Hackerspaces, STEM lab surplus config/presets/cytron_maker_pi.rcan.yaml

🔍 Not sure what you have? See the Hardware Identification Guide for a decision tree: "I found this at a thrift store, now what?"

Tips for Second-Hand Hardware

  • Test first, code later. Run castor test-hardware --config config/presets/<preset>.rcan.yaml -y to verify motion and stop behavior before writing autonomy code.
  • Cables are the most common failure point. LEGO connector cables, USB-B ports, and servo leads are all cheap to replace.
  • Clone boards are fine. Arduino Uno clones with CH340 USB chips work perfectly. You may need to install the CH341SER driver on Windows.
  • Battery health matters. Test battery packs under load — many donated robots have degraded cells that drop voltage and confuse motor drivers.
  • Firmware toolchains vary by kit. Use each preset's notes section for the expected firmware/runtime path and connection mode.

🤝 Contributing

OpenCastor is fully open source (Apache 2.0) and community-driven.

Areas we need help with: driver adapters (ODrive, VESC, ROS2), new AI providers (Mistral, Grok, Cohere), messaging channels (Matrix, Signal), sim-to-real (Gazebo / MuJoCo), and tests.

📄 License

Apache 2.0 — built for the community, ready for the enterprise.


Built on the RCAN Spec by Craig Merry

About

Open-source AI robotics framework — tiered brain, 8 AI providers, multi-robot swarm, self-improving loop

Topics

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •