An AI-powered assistant for code reviews and improvement suggestions. Privacy-focused and local-first - your code never leaves your hardware when using local models.
Warning
This is an ongoing research project under active development. Features and APIs may change without notice, and breaking changes may occur between versions. Use in production at your own risk.
- Web UI - Modern chat interface with persistent sessions and conversation management
- RAG (Retrieval-Augmented Generation) - Semantic search over your documents for context-aware responses
- Tool Calling - File operations, code search, and bash commands with built-in security
- Plugin System - Extend capabilities with JavaScript plugins (NEW!)
- AI Code Reviews - Language-specific analysis and suggestions
- Environment Awareness - The LLM receives system context for smarter responses
- Security First - Path validation, .squidignore support, and user approval for all operations
- Universal Compatibility - Works with LM Studio, OpenAI, Ollama, Mistral, and other OpenAI-compatible APIs
Your code never leaves your hardware when using local LLM services (LM Studio, Ollama, etc.).
- Complete Privacy - Run models entirely on your own machine
- Local-First - No data sent to external servers with local models
- You Control Your Data - Choose between local models (private) or cloud APIs (convenient)
- Secure by Default - Multi-layered security prevents unauthorized file access
Privacy Options:
- Maximum Privacy: Use LM Studio or Ollama - everything runs locally, no internet required for inference
- Cloud Convenience: Use OpenAI or other cloud providers - data sent to their servers for processing
- Your Choice: Squid works with both - you decide based on your privacy needs
All file operations require your explicit approval, regardless of which LLM service you use.
For Docker installation (recommended): Only Docker Desktop 4.34+ or Docker Engine with Docker Compose v2.38+ is required. All AI models are automatically managed.
For manual installation: You'll need:
- Rust toolchain (for building squid):

  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

- An OpenAI-compatible LLM service (choose one):
Local LLM Options
Run AI models locally with these tools:
LM Studio (Recommended for GUI)
- User-friendly interface for running local LLMs
- Download from https://lmstudio.ai/
- Recommended model: lmstudio-community/Qwen2.5-Coder-7B-Instruct-MLX-4bit
- Default endpoint: http://127.0.0.1:1234/v1
- No API key required
Ollama (Lightweight CLI)
- Command-line tool for running LLMs
- Install: brew install ollama (macOS) or download from https://ollama.com/
- Recommended model: ollama pull qwen2.5-coder
- Default endpoint: http://localhost:11434/v1
- No API key required
Docker Model Runner
- Manage AI models through Docker
- Enable in Docker Desktop Settings → AI tab
- Pull models: docker model pull hf.co/bartowski/Qwen2.5-Coder-7B-Instruct-GGUF:Q4_K_M
- Default endpoint: http://localhost:12434/engines/v1
- No API key required
Cloud API Services (OpenAI-Compatible)
All these services use the standard OpenAI API format - just change the endpoint URL and API key:
OpenAI
- Endpoint: https://api.openai.com/v1
- API Key: https://platform.openai.com/api-keys
- Models: gpt-4, gpt-4-turbo, gpt-3.5-turbo, etc.
Mistral AI
- Endpoint: https://api.mistral.ai/v1
- API Key: https://console.mistral.ai/
- Models: devstral-2512, mistral-large-latest, mistral-small-latest, etc.
Other Compatible Services
- OpenRouter (https://openrouter.ai/) - Access to multiple LLM providers
- Together AI (https://together.ai/) - Fast inference
- Anyscale (https://anyscale.com/) - Enterprise solutions
- Groq (https://groq.com/) - Ultra-fast inference
- Any custom OpenAI-compatible endpoint
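As a sketch, switching between providers is just a matter of the endpoint URL and key. A minimal .env pointing Squid at OpenAI could look like this (the API_URL and API_KEY variable names are documented in the Configuration section; the key value is a placeholder):

```shell
# .env - point Squid at an OpenAI-compatible cloud endpoint
API_URL=https://api.openai.com/v1
API_KEY=sk-your-api-key-here
```

Swapping in a local service means changing API_URL to the local endpoint and setting API_KEY=not-needed.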
The easiest way to get started - automated setup with helpful checks:
# Clone the repository
git clone https://github.com/DenysVuika/squid.git
cd squid
# Setup environment configuration
cp .env.docker.example .env
# Run the setup script (recommended)
chmod +x docker-setup.sh
./docker-setup.sh setup
# Or use Docker Compose directly
docker compose up -d

The setup script will:
- Verify Docker and Docker Compose versions
- Check that Docker AI features are enabled
- Check available disk space (10GB+ recommended)
- Build the Squid server image
- Pull AI models (~4.3GB total)
- Start all services with health checks
This automatically pulls and runs:
- Squid server (web UI + API) on http://localhost:3000
- Qwen2.5-Coder 7B (bartowski/Q4_K_M, ~4GB) - Main LLM
- Nomic Embed Text v1.5 (~270MB) - Embeddings for RAG
Requirements:
- Docker Desktop 4.34+ with AI features enabled, or
- Docker Engine with Docker Compose v2.38+
- 10GB RAM available for Docker
Apple Silicon: The default configuration uses Metal GPU acceleration on M1/M2/M3/M4. See docker-compose.yml for CPU-only and other GPU options.
Useful commands:
./docker-setup.sh status # Check service status
./docker-setup.sh logs # View logs
./docker-setup.sh stop # Stop services
./docker-setup.sh restart # Restart services
./docker-setup.sh update     # Update models and images

For manual installation with your own LLM service:
cargo install squid-rs

This installs the squid command globally. You'll need to configure it to connect to your LLM service (see the Configuration section).
Clone the repository and install locally:
git clone https://github.com/DenysVuika/squid.git
cd squid
cargo install --path .
cargo build --release

For development, use cargo run -- instead of squid in the examples below.
The web UI is automatically built during Rust compilation via build.rs. Simply run:
cargo build --release

The build script (build.rs) will:
- Check if the web sources have changed since the last build
- Automatically run npm ci (or npm install) and npm run build in the web/ directory
- Output the built assets to the static/ folder, which is then embedded into the binary
- Skip the web build if npm is not available (falling back to existing static files)
The squid serve command will then serve both the web UI and the API from the same server.
Manual Build (Optional): You can still build the web UI manually if needed:
cd web
npm install
npm run build
cd ..

Note: If you're using a pre-built binary from crates.io or releases, the web UI is already included.
Docker Compose automatically manages AI models and services, but requires a .env file for configuration.
Setup Steps:
# 1. Copy the Docker environment template
cp .env.docker.example .env
# 2. Start the services
docker compose up -d

The .env file configures:
- Model endpoints: API_URL and SQUID_EMBEDDING_URL (connect to the Docker AI models)
- Model identifiers: SQUID_EMBEDDING_MODEL (which embedding model to use)
Default configuration:
- LLM: Qwen2.5-Coder 7B via Docker AI at http://llm:8080/v1
- Embeddings: Nomic Embed Text v1.5 via Docker AI at http://embedding:8080/v1
- Context window: 32K tokens (set in docker-compose.yml)
- Log level: info (set in docker-compose.yml)
- RAG: Enabled with semantic search
Performance Note: Docker AI models use Metal GPU acceleration by default on Apple Silicon (M1/M2/M3/M4), providing inference speed comparable to LM Studio and Ollama.
Using External LLM Services (optional):
You can use external services (OpenAI, LM Studio, Ollama) instead of Docker AI models by modifying the environment section in docker-compose.yml:
services:
  squid:
    environment:
      # For LM Studio (running on host)
      - API_URL=http://host.docker.internal:1234/v1
      - SQUID_EMBEDDING_URL=http://host.docker.internal:1234/v1
      - SQUID_EMBEDDING_MODEL=nomic-embed-text
      # For Ollama (running on host)
      # - API_URL=http://host.docker.internal:11434/v1
      # For OpenAI
      # - API_URL=https://api.openai.com/v1
      # - API_KEY=your-api-key-here

Note: Use host.docker.internal to access services running on your Mac/PC from inside Docker containers.
Customization options:
- Adjust GPU layers, context window, log level, or database path by editing docker-compose.yml
- Disable GPU acceleration by setting --n-gpu-layers to 0 for CPU-only inference
- Allow network access by setting SQUID_SERVER_ALLOW_NETWORK=true (binds to 0.0.0.0 instead of 127.0.0.1)
Important: Environment variables defined in docker-compose.yml always override any squid.config.json file in the workspace. This ensures Docker configuration takes precedence.
See .env.docker.example for all available options and examples, and docker-compose.yml for model configuration.
Docker uses a two-layer approach for workspace management:
- Host Mount: The WORKSPACE_DIR environment variable controls which host directory is mounted into the container
- Working Directory: Inside the container, SQUID_WORKING_DIR=/workspace points to this mounted directory
By default, Docker mounts the current directory (.) to /workspace in the container. You can bind a specific project directory by setting the WORKSPACE_DIR environment variable:
# Option 1: Set in .env file
echo "WORKSPACE_DIR=/path/to/your/project" >> .env
# Option 2: Set inline when starting
WORKSPACE_DIR=/path/to/your/project docker compose up -d
# Option 3: Modify docker-compose.yml volumes section
# Change: - ${WORKSPACE_DIR:-.}:/workspace
# To:     - /absolute/path/to/project:/workspace

The workspace directory is where Squid will:
- Browse and display files in the Web UI file explorer
- Execute tool operations (read_file, write_file, etc.)
- Search for code patterns
- Run bash commands (when approved)
The workspace directory is created automatically if it doesn't exist.
Security Note: All file operations are restricted to the workspace directory and respect .squidignore patterns. Plugins cannot see the actual filesystem path and work with relative paths only.
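The exact .squidignore syntax isn't specified in this README; assuming it follows gitignore-style patterns (an assumption), a workspace might keep secrets and build output away from the assistant like this:

```
# .squidignore - hypothetical example, assuming gitignore-style patterns
.env
secrets/
node_modules/
target/
*.pem
```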
Example - Work with a specific project:
# Navigate to squid directory
cd squid
# Set workspace to a different project
WORKSPACE_DIR=~/Projects/my-app docker compose up -d
# Now the Web UI will show files from ~/Projects/my-app

For manual installations (cargo install, building from source), you need to configure Squid to connect to your LLM service.
Quick Setup:
# Interactive configuration (recommended)
squid init
# Or use command-line flags to skip prompts
squid init --url http://127.0.0.1:1234/v1 --log-level info

This creates a squid.config.json file with:
- API endpoint configuration: Connection to your LLM service
- Default agents: Pre-configured general-assistant (full access) and code-reviewer (read-only)
- Context window settings: Applied to each agent (can be customized per-agent later)
- Optional RAG setup: Document search and retrieval features
Note: CLI commands (squid ask, squid review) work with either:
- A squid.config.json file (recommended for agent configurations)
- Environment variables in a .env file (minimum: API_URL)
- A combination of both (environment variables override the config file)
If neither is configured, commands will suggest running squid init or setting up environment variables.
For complete configuration documentation, including:
- Interactive and non-interactive squid init usage
- Configuration file format
- Environment variables
- All available options
See CLI Reference - Init Command.
Quick reference:
- LM Studio: http://127.0.0.1:1234/v1
- Ollama: http://localhost:11434/v1
- Docker Model Runner: http://localhost:12434/engines/v1
- OpenAI: https://api.openai.com/v1
- Mistral AI: https://api.mistral.ai/v1
- API_URL: The base URL for the OpenAI-compatible API endpoint
  - LM Studio: http://127.0.0.1:1234/v1
  - Ollama: http://localhost:11434/v1
  - Docker Model Runner: http://localhost:12434/engines/v1
  - OpenAI: https://api.openai.com/v1
  - Mistral AI: https://api.mistral.ai/v1
  - Other OpenAI-compatible services: check the provider's documentation
- API_KEY: Your API key
  - Local services (LM Studio, Ollama, Docker): not-needed
  - Cloud services (OpenAI, Mistral, etc.): your actual API key
- SQUID_CONTEXT_WINDOW: Maximum context window size in tokens (optional, default: 8192)
  - Used to calculate context utilization and prevent exceeding limits
  - Set via squid init --context-window 32768 or in the config file
  - See Common Context Window Sizes below for popular models
- SQUID_LOG_LEVEL: Console logging verbosity (optional, default: error)
  - error: Only errors (default)
  - warn: Warnings and errors
  - info: Informational messages
  - debug: Detailed debugging information
  - trace: Very verbose output
- SQUID_DB_LOG_LEVEL: Database logging verbosity (optional, default: debug)
  - Controls which log levels are saved to the database (viewable in the Web UI)
  - error: Only errors
  - warn: Warnings and errors
  - info: Informational messages
  - debug: Detailed debugging information (default)
  - trace: Very verbose output
  - Note: Only logs from the squid application are saved to the database (dependency logs are filtered out)
  - Independent of console logging - you can use different levels for console and database
- SQUID_DATABASE_PATH: Path to the SQLite database file (optional, default: squid.db)
  - Used to store chat sessions, messages, and logs
  - Can be relative (e.g., squid.db) or absolute (e.g., /path/to/squid.db)
  - When relative, the path is resolved based on:
    - The config file location (if squid.config.json exists)
    - An existing database in parent directories (searches upward)
    - The current working directory (creates a new database)
  - Important: The server automatically finds the correct database when running from subdirectories
  - Set via a .env file to override automatic detection
  - Example: SQUID_DATABASE_PATH=/Users/you/squid-data/squid.db
- SQUID_WORKING_DIR: Working directory for AI operations (optional, default: ./workspace)
  - Defines the root directory for all file operations, code search, and plugin access
  - Can be relative (e.g., ./workspace) or absolute (e.g., /path/to/project)
  - Created automatically if it doesn't exist
  - Can be set in the config file ("working_dir") or via environment variable
  - Docker: Set via the SQUID_WORKING_DIR env var (the mounted workspace should match this path)
  - Security: All file operations and plugin access are restricted to this directory
  - Plugin isolation: Plugins cannot see the actual filesystem path and work with relative paths only
  - Example: SQUID_WORKING_DIR=/Users/you/projects/myapp
  - CLI override: squid serve --dir /custom/path (overrides the config)
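Putting the variables above together, a sketch of a .env for a manual install against a local LM Studio endpoint might read (variable names come from the list above; the values shown are the documented defaults or examples):

```shell
# .env - example manual-install configuration for a local LM Studio endpoint
API_URL=http://127.0.0.1:1234/v1
API_KEY=not-needed
SQUID_CONTEXT_WINDOW=32768
SQUID_LOG_LEVEL=info
SQUID_DB_LOG_LEVEL=debug
SQUID_DATABASE_PATH=squid.db
SQUID_WORKING_DIR=./workspace
```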
- Template Variables: Agent prompts support variable substitution using the Tera template engine
  - Variables are automatically available in agent prompts (in squid.config.json)
  - System prompts and built-in prompts also support template variables
  - Available variables (secure and privacy-safe by default):
    - {{persona}} - Base AI personality and tool usage guidelines from src/assets/persona.md (use in custom agent prompts to include the core behavior)
    - {{now}} - Current timestamp in ISO 8601 format (e.g., 2026-03-28T12:34:56+00:00)
    - {{date}} - Current date (e.g., 2026-03-28)
    - {{time}} - Current time (e.g., 12:34:56)
    - {{year}}, {{month}}, {{day}} - Date components
    - {{timestamp}} - Unix timestamp
    - {{timezone}} - Timezone name (e.g., UTC)
    - {{timezone_offset}} - Timezone offset (e.g., +0000)
    - {{os}} - Operating system name (e.g., macOS, Linux, Windows)
    - {{os_version}} - OS version
    - {{kernel_version}} - Kernel version
    - {{arch}} - System architecture (e.g., x86_64, aarch64)
    - {{os_family}} - OS family (e.g., unix, windows)
  - Example usage in an agent prompt:

    { "agents": { "code-reviewer": { "prompt": "{{persona}}\n\nYou are an expert code reviewer on {{os}} ({{arch}}) at {{now}}..." } } }

    Note: Include {{persona}} at the start of custom agent prompts to preserve the base personality and tool usage guidelines
  - Fully custom prompts (without {{persona}}): For specialized agents with completely custom behavior, omit {{persona}}:

    { "agents": { "pirate": { "name": "Captain Squidbeard", "prompt": "Ye be Captain Squidbeard, a cunning pirate squid sailin' the seven seas of code! Speak like a proper pirate..." }, "shakespeare": { "name": "William Shakespeare", "use_tools": false, "prompt": "Thou art William Shakespeare, the immortal Bard of Avon. Speak always in the eloquent style of the Elizabethan age..." } } }

    This creates agents with no inherited guidelines - useful for demos, experiments, or highly specialized personalities. Pair with "use_tools": false for persona agents that should never invoke tools; the Tools button is hidden automatically in the Web UI.
  - Templates use Tera syntax - see the Tera documentation for advanced features
- server.allow_network: Allow the server to be accessible from the local network (optional, default: false)
  - When disabled (default), the server binds to 127.0.0.1 (localhost only)
  - When enabled, the server binds to 0.0.0.0 (accessible from other devices on the local network)
  - Security Note: Only enable this if you trust your local network and understand the security implications
  - Set via squid.config.json: { "server": { "allow_network": true } }
  - Or via environment variable (takes precedence): SQUID_SERVER_ALLOW_NETWORK=true
  - Useful when:
    - Accessing the Web UI from a mobile device or tablet on the same network
    - Running Squid on a server and accessing it from other computers
    - Testing the Web UI across different devices
  - The console output indicates whether the server is accessible from the local network
Squid uses an agent-based architecture where each agent has its own model, system prompt, and tool permissions. This allows you to create specialized assistants for different tasks.
Agent Configuration Example:
{
"agents": {
"code-reviewer": {
"name": "Code Reviewer",
"enabled": true,
"description": "Reviews code for best practices and potential issues",
"model": "anthropic/claude-sonnet-4-5",
"context_window": 200000,
"prompt": "You are a code reviewer. Focus on security, performance, and maintainability.",
"permissions": {
"allow": ["now", "read_file", "grep"],
"deny": ["write_file", "bash"]
}
},
"general-assistant": {
"name": "General Assistant",
"enabled": true,
"description": "Full-featured coding assistant",
"model": "qwen2.5-coder-7b-instruct",
"context_window": 32768,
"pricing_model": "gpt-4o",
"suggestions": [
"Read and summarize the main source files",
"Show me the recent git log",
"Find all TODO comments in the codebase"
],
"permissions": {
"allow": ["now", "read_file", "write_file", "grep", "bash"],
"deny": []
}
}
},
"default_agent": "general-assistant"
}

Agent Properties:
- id (object key): Unique identifier for the agent (e.g., "code-reviewer")
- name: Display name shown in the UI
- enabled: Whether the agent appears in the agent selector (default: true)
- description: Brief explanation of the agent's purpose
- model: The underlying LLM model ID
  - For local models: use the model name (e.g., "qwen2.5-coder-7b-instruct")
  - For cloud services: use provider/model format (e.g., "anthropic/claude-sonnet-4-5", "openai/gpt-4")
- pricing_model (optional): Model ID to use for cost estimation in the UI
  - Required for local models to calculate token costs
  - Maps your local model to a known cloud model's pricing (e.g., "gpt-4o", "gpt-4o-mini")
  - Cloud models use their own pricing automatically and don't need this field
  - Example: set to "gpt-4o" for high-capability models or "gpt-4o-mini" for smaller models
- context_window (optional): Maximum context window size in tokens for this agent
  - Overrides the global context_window setting for this specific agent
  - Used for accurate token usage tracking and context utilization calculations
  - If not specified, uses the global context_window from the root config
  - Example: 32768 for Qwen2.5-Coder, 200000 for Claude 3.5 Sonnet, 128000 for GPT-4
- prompt (optional): Custom system prompt for this agent
- Overrides the default system prompt
- Defines the agent's personality and behavior
- use_tools (optional): Whether this agent can use tools at all (default: true)
  - When set to false, all tool usage is disabled for this agent regardless of permissions
  - The Tools toggle button is hidden in the Web UI for agents with use_tools: false
  - Enforced server-side - the client cannot override this setting
  - Useful for pure persona agents (e.g., a Shakespeare chatbot) that should never call tools
  - Example: "use_tools": false
- suggestions (optional): A list of suggested prompts displayed in the Web UI for this agent
  - Shown as clickable chips above the input box when no messages have been sent yet
  - Automatically updated when the user switches agents
  - Agents with no suggestions field show no suggestion bar
  - Tailor suggestions to each agent's capabilities (e.g., code-related prompts for a code reviewer)
  - Example: "suggestions": ["Review this file for security issues", "Find all TODOs"]
- permissions: Tool execution permissions specific to this agent
  - allow: Tools this agent can use without confirmation
  - deny: Tools this agent cannot use at all
  - Supports granular bash permissions (e.g., "bash:ls", "bash:git status")
  - Important: Dangerous bash commands (rm, sudo, chmod, dd, curl, wget, kill) are always blocked regardless of permissions
Default Agent:
The default_agent field specifies which agent is selected by default when starting a new session.
Multiple Agent Workflows:
You can create agents for different purposes:
- Code Reviewer (read-only): Reviews code without making changes
- Safe Explorer (read-only): Explores and documents code
- General Assistant (full access): Makes code changes and runs commands
- Terminal Assistant (command specialist): Focused on bash operations with specific command allowlists
- Persona Agent (no tools): A fully custom personality with use_tools: false - e.g., a Shakespearean bard or a pirate
Migration Note: If you have an existing squid.config.json without agents, add the agents section to enable agent-based configuration. Legacy configurations continue to work but use the global model setting.
Click to expand - Context window sizes for popular models
| Model | Context Window | Config Value |
|---|---|---|
| Qwen2.5-Coder-7B | 32K tokens | 32768 |
| GPT-4 | 128K tokens | 128000 |
| GPT-4o | 128K tokens | 128000 |
| GPT-3.5-turbo | 16K tokens | 16385 |
| Claude 3 Opus | 200K tokens | 200000 |
| Claude 3.5 Sonnet | 200K tokens | 200000 |
| Llama 3.1-8B | 128K tokens | 131072 |
| Mistral Large | 128K tokens | 131072 |
| DeepSeek Coder | 16K tokens | 16384 |
| CodeLlama | 16K tokens | 16384 |
How to find your model's context window:
- Check your model's documentation on Hugging Face
- Look in the model card or config.json
- Check your LLM provider's documentation
- For LM Studio: Look at the model details in the UI
Why it matters:
- Real-time utilization percentage (e.g., "45% of 32K context used")
- Prevents API errors from exceeding model capacity
- Accurate token usage statistics displayed in the web UI
- Better planning for long conversations
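The utilization math is simple; as a minimal sketch (not Squid's actual implementation - the function name and output format are illustrative):

```javascript
// Hypothetical helper: format context utilization from tokens used so far
// and the model's configured context window.
function contextUtilization(usedTokens, contextWindow) {
  const pct = (usedTokens / contextWindow) * 100;
  return `${pct.toFixed(1)}% (${usedTokens} / ${contextWindow} tokens)`;
}

// Example: 7168 tokens used against a 128K (GPT-4) window.
console.log(contextUtilization(7168, 128000)); // "5.6% (7168 / 128000 tokens)"
```

Setting context_window too high for your model makes this readout optimistic and can let requests exceed the model's real capacity.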
Setting agent-specific context windows:
After running squid init, edit squid.config.json to set context windows per agent:
{
"agents": {
"general-assistant": {
"model": "local-model",
"context_window": 32768, // Qwen2.5-Coder: 32K
...
},
"code-reviewer": {
"model": "gpt-4",
"context_window": 128000, // GPT-4: 128K
...
}
}
}

Squid provides both a modern Web UI and a command-line interface. We recommend the Web UI for the best experience.
Modern chat interface with session management, token usage tracking, and real-time cost estimates
# Start with default workspace (current directory)
docker compose up -d
# Work with a specific project directory
WORKSPACE_DIR=/path/to/your/project docker compose up -d
# Example: Analyze a React app
WORKSPACE_DIR=~/Projects/my-react-app docker compose up -d
# View logs
docker compose logs -f squid
# Stop services
docker compose down

Access the Web UI at http://localhost:3000.
The workspace directory determines what files the AI can see and work with. All file operations, code search, and bash commands operate within this directory.
Start the built-in web interface for Squid:
# Start Web UI on default port (8080)
squid serve
# Specify a custom port
squid serve --port 3000
squid serve -p 3000
# Use a custom database file
squid serve --db=/path/to/custom.db
# Use a custom working directory
squid serve --dir=/path/to/project
# Combine all options
squid serve --port 3000 --db=custom.db --dir=/path/to/project

The web server will:
- Launch the Squid Web UI at http://127.0.0.1:8080 (or your specified port; Docker uses 3000)
- Provide a browser-based interface for interacting with Squid
- Expose REST API endpoints for chat, sessions, and logs
- Display the server URL and API endpoint on startup
Server Options (Manual Installation):
- --port / -p: Port number for the server (default: 8080; Docker uses 3000)
- --db: Path to a custom database file (default: squid.db in the current/config directory)
- --dir: Working directory for the server (changes to this directory before starting)
Use Cases:
- Use --db to specify a different database file for separate projects or testing
- Use --dir (or WORKSPACE_DIR in Docker) to work with a specific project directory
- The database path is relative to the working directory (after --dir is applied)
Web UI Features:
- Chat Page - Interactive chat interface with session management sidebar
  - Token usage indicator - Real-time context utilization percentage (e.g., "5.6% • 7.1K / 128K")
  - Cost tracking - Displays estimated cost for both cloud and local models
  - Session sidebar - Browse and switch between past conversations
  - Auto-generated titles - Sessions titled from the first message, editable inline
  - Multi-file attachments - Add context from multiple files
- Logs Page - View application logs with pagination
  - Filter by log level (error, warn, info, debug, trace)
  - Adjustable page size (25, 50, 100, 200 entries)
  - Color-coded log levels and timestamps
  - Session ID tracking for debugging
The web UI and API are served from the same server, so the chatbot automatically connects to the local API endpoint.
Web UI Development (Hot Reload):
For development with instant hot reloading:
# Terminal 1 - Backend server
cargo run serve --port 8080
# Terminal 2 - Frontend dev server
cd web && npm run dev

Then open http://localhost:5173 in your browser. Changes to the frontend code appear instantly; the Vite dev server proxies API requests to the Rust backend.
To build for production: cd web && npm run build (outputs to static/ directory).
For advanced users and automation, Squid provides a full CLI. See the CLI Reference for detailed documentation on:
- squid ask - Ask questions with optional file context
- squid review - Review code with language-specific analysis
- squid rag - Manage RAG document indexing
- squid logs - View, clear, and clean up application logs
- squid init - Initialize project configuration
Configuration Requirement: Most CLI commands (ask, review, serve) require either a squid.config.json file OR essential environment variables (at minimum API_URL). You can:
- Run squid init to create a config file with agent configurations
- Use a .env file with environment variables like API_URL, API_KEY, etc.
- Mix both approaches (environment variables override config file settings)
Quick Examples:
# Ask a question (uses default agent)
squid ask "What is Rust?"
# Ask with specific agent
squid ask "What is Rust?" --agent code-reviewer
# Review a file (uses default agent)
squid review src/main.rs
# Review with specific agent
squid review src/main.rs --agent general-assistant
# Initialize RAG for a project
squid rag init
# View application logs
squid logs show --level error
# Clear all logs from database
squid logs reset
# Remove logs older than 30 days (default)
squid logs cleanup
# Remove logs older than 7 days
squid logs cleanup --max-age-days 7

For complete CLI documentation, examples, and advanced usage, see docs/CLI.md.
Squid includes RAG capabilities for semantic search over your documents, enabling context-aware AI responses using your own documentation.
Quick Start:
# Docker (already configured)
docker compose up -d
# Or manually setup
mkdir documents
cp docs/*.md documents/
squid rag init
squid serve
# Click the RAG toggle in the Web UI

Features:
- Semantic search over your documentation
- One-click toggle in the Web UI
- Persistent knowledge base - index once, query many times
- Source attribution - see which documents were used
- Auto-indexing - supports Markdown, code, configs, and more
Using RAG:
- Add documents to the ./documents directory
- Run squid rag init to index them
- Toggle RAG in the Web UI to enable semantic search
- Ask questions about your documentation
For complete RAG documentation including configuration, API endpoints, best practices, and troubleshooting, see docs/RAG.md.
Database & Persistence:
- All chat sessions, messages, and logs are automatically saved to squid.db (a SQLite database)
- Sessions persist across server restarts - your conversation history is always preserved
- The database location is automatically detected:
  - If squid.config.json exists, the database is stored relative to the config file
  - If there is no config file, parent directories are searched for an existing squid.db
  - Falls back to the current directory if no database is found
- You can override the location with the SQUID_DATABASE_PATH environment variable or in the config file
- Run the server from any subdirectory - it will find and use the same database
Press Ctrl+C to stop the server.
The web server exposes REST API endpoints for programmatic access:
Chat Endpoint: POST /api/chat
Request Body:
{
"message": "Your question here",
"file_content": "optional file content",
"file_path": "optional/file/path.rs",
"system_prompt": "optional custom system prompt",
"model": "optional model ID (overrides config default)"
}

Response: Server-Sent Events (SSE) stream with JSON events:
{"type": "content", "text": "response text chunk"}
{"type": "done"}

Sessions Endpoints:
- GET /api/sessions - List all sessions with metadata
- GET /api/sessions/{id} - Load full session history
- DELETE /api/sessions/{id} - Delete a session
Logs Endpoint: GET /api/logs
Query Parameters:
- page - Page number (default: 1)
- page_size - Entries per page (default: 50)
- level - Filter by level (error, warn, info, debug, trace)
- session_id - Filter by session ID
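As a sketch, a client can assemble these query parameters like so (buildLogsUrl is a hypothetical helper, not part of Squid's API):

```javascript
// Hypothetical helper: build a /api/logs URL from the documented
// query parameters (page, page_size, level, session_id).
function buildLogsUrl(base, { page = 1, pageSize = 50, level, sessionId } = {}) {
  const params = new URLSearchParams({ page: String(page), page_size: String(pageSize) });
  if (level) params.set('level', level);
  if (sessionId) params.set('session_id', sessionId);
  return `${base}/api/logs?${params.toString()}`;
}

// Example: request the first 25 error-level entries.
console.log(buildLogsUrl('http://127.0.0.1:8080', { level: 'error', pageSize: 25 }));
// "http://127.0.0.1:8080/api/logs?page=1&page_size=25&level=error"
```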
Response:
{
"logs": [
{
"id": 1,
"timestamp": 1234567890,
"level": "info",
"target": "squid::api",
"message": "Server started",
"session_id": null
}
],
"total": 100,
"page": 1,
"page_size": 50,
"total_pages": 2
}

Agents Endpoint: GET /api/agents
Fetches available agents configured in your squid.config.json file. Each agent has its own model, system prompt, and tool permissions.
Response:
{
"agents": [
{
"id": "general-assistant",
"name": "General Assistant",
"description": "Full-featured coding assistant with all tools available",
"model": "qwen2.5-coder-7b-instruct",
"enabled": true,
"pricing_model": "gpt-4o",
"suggestions": [
"Read and summarize the main source files",
"Show me the recent git log",
"Find all TODO comments in the codebase"
],
"permissions": {
"allow": ["now", "read_file", "write_file", "grep", "bash"],
"deny": []
}
},
{
"id": "code-reviewer",
"name": "Code Reviewer",
"description": "Reviews code for best practices and potential issues",
"model": "anthropic/claude-sonnet-4-5",
"enabled": true,
"permissions": {
"allow": ["now", "read_file", "grep"],
"deny": ["write_file", "bash"]
}
}
],
"default_agent": "general-assistant"
}

Features:
- Returns all enabled agents from your configuration
- Each agent includes its model, description, and tool permissions
- Optional pricing_model field for cost estimation (useful for local models)
- Used by the Web UI agent selector to display available assistants
Example using curl:
curl -X POST http://127.0.0.1:8080/api/chat \
-H "Content-Type: application/json" \
-d '{"message": "Explain Rust async/await"}' \
-N

Example using fetch (JavaScript):
```javascript
const response = await fetch('http://127.0.0.1:8080/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ message: 'Explain async/await in Rust' })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  const chunk = decoder.decode(value);
  const lines = chunk.split('\n');
  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const event = JSON.parse(line.slice(6));
      if (event.type === 'content') {
        console.log(event.text);
      }
    }
  }
}
```
See web/src/lib/chat-api.ts for a complete TypeScript client implementation.
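Note that the loop above assumes every read ends on a line boundary, but a stream may split an SSE line across chunks. A buffering variant looks like this (a sketch, not the project's actual client):

```javascript
// Accumulate chunks and only parse complete lines; the trailing
// partial line stays in the buffer until the next chunk arrives.
function makeSseParser(onEvent) {
  let buffer = '';
  return (chunk) => {
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop(); // possibly incomplete last line
    for (const line of lines) {
      if (line.startsWith('data: ')) onEvent(JSON.parse(line.slice(6)));
    }
  };
}

const events = [];
const feed = makeSseParser((e) => events.push(e));
feed('data: {"type":"con');    // split mid-line: nothing emitted yet
feed('tent","text":"hi"}\n');  // line completed: one event emitted
console.log(events[0].text); // → hi
```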
Note: The chatbot UI is served from the same server as the API, so it automatically uses the relative path /api/chat without requiring any configuration.
The web server also provides REST endpoints for managing chat sessions:
List all sessions: GET /api/sessions
Response:
```json
{
  "sessions": [
    {
      "session_id": "abc-123-def-456",
      "message_count": 8,
      "created_at": 1707654321,
      "updated_at": 1707658921,
      "preview": "Explain async/await in Rust",
      "title": "Async/await in Rust"
    }
  ],
  "total": 1
}
```
Get session details: GET /api/sessions/{session_id}
Response:
```json
{
  "session_id": "abc-123-def-456",
  "messages": [
    {
      "role": "user",
      "content": "Explain async/await in Rust",
      "sources": [],
      "timestamp": 1707654321
    },
    {
      "role": "assistant",
      "content": "Async/await in Rust...",
      "sources": [{ "title": "sample.rs" }],
      "timestamp": 1707654325
    }
  ],
  "created_at": 1707654321,
  "updated_at": 1707658921,
  "title": "Async/await in Rust"
}
```
Update a session (rename): PATCH /api/sessions/{session_id}
Request:
```json
{
  "title": "My Custom Session Title"
}
```
Response:
```json
{
  "success": true,
  "message": "Session updated successfully"
}
```
Delete a session: DELETE /api/sessions/{session_id}
Response:
```json
{
  "success": true,
  "message": "Session deleted successfully"
}
```
Web UI Features:
- Browse all conversations in the sidebar
- Sessions automatically titled from first user message
- Click any session to load its full history
- Rename sessions with inline edit dialog (pencil icon)
- Delete sessions with confirmation dialog
- Toggle sidebar visibility
- Sessions show title (or preview), message count, and last activity time
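Scripted session management uses the same endpoints. As a sketch, renaming a session might look like this (the base URL is an assumption, and the injectable `fetchImpl` parameter is added here only to make the helper testable):

```javascript
// Rename a session via PATCH /api/sessions/{session_id}.
async function renameSession(sessionId, title, fetchImpl = fetch) {
  const res = await fetchImpl(
    `http://127.0.0.1:8080/api/sessions/${encodeURIComponent(sessionId)}`,
    {
      method: 'PATCH',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ title }),
    }
  );
  return res.json(); // expected: { "success": true, "message": "..." }
}
```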
Squid's LLM can intelligently use tools (read files, write files, search code, execute safe commands) when needed. All tool operations are protected by multiple security layers and require user approval.
Security Features:
- π‘οΈ Path Validation - Blocks system directories automatically
- π Ignore Patterns - `.squidignore` file (like `.gitignore`)
- π User Approval - Manual confirmation for each operation
- π» Safe Bash - Dangerous commands always blocked
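As an illustration, a minimal `.squidignore` might look like this (assuming it accepts `.gitignore`-style glob patterns, as the comparison above suggests):

```
# Keep secrets and build artifacts out of tool reach
.env
*.pem
target/
node_modules/
```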
Available Tools:
- π read_file - Read file contents
- π write_file - Write to files with preview
- π grep - Search code with regex
- π now - Get current date/time
- π» bash - Execute safe commands (ls, git, cat, etc.)
For complete security documentation and tool usage examples, see:
- Security Features - Detailed security layers and best practices
- CLI Reference - Tool calling examples and usage
Squid supports a powerful JavaScript-based plugin system that lets you extend its capabilities with custom tools. Plugins are invoked by the LLM alongside built-in tools.
1. Create a plugin:
```shell
mkdir -p plugins/my-plugin
cd plugins/my-plugin
```
2. Add plugin.json:
```json
{
  "id": "my-plugin",
  "title": "My Plugin",
  "description": "Does something useful",
  "version": "0.1.0",
  "api_version": "1.0",
  "security": {
    "requires": ["read_file"],
    "network": false,
    "file_write": false
  },
  "input_schema": {
    "type": "object",
    "properties": {
      "message": { "type": "string" }
    },
    "required": ["message"]
  },
  "output_schema": {
    "type": "object",
    "properties": {
      "result": { "type": "string" }
    },
    "required": ["result"]
  }
}
```
3. Add index.js:
```javascript
function execute(context, input) {
  context.log(`Processing: ${input.message}`);
  return { result: input.message.toUpperCase() };
}

globalThis.execute = execute;
```
4. Enable in config:
```json
{
  "agents": {
    "general-assistant": {
      "permissions": {
        "allow": ["read_file", "plugin:*"]
      }
    }
  }
}
```
- π Sandboxed Execution - QuickJS runtime with memory and CPU limits
- π Hybrid Security - Plugins declare needs, agents control access
- β Schema Validation - JSON Schema for input/output
- π Dual Location - Workspace (`./plugins/`) and global (`~/.squid/plugins/`)
- π οΈ Context API - Safe file operations, logging, and HTTP (when permitted)
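Since a plugin body is plain JavaScript, you can exercise `execute()` outside the sandbox with a stubbed context (a sketch; only `log` is stubbed here, while the real sandbox-provided context offers more):

```javascript
// The execute() body from the quick-start example above.
function execute(context, input) {
  context.log(`Processing: ${input.message}`);
  return { result: input.message.toUpperCase() };
}

// Minimal stand-in for the sandbox-provided context object.
const context = { log: (msg) => console.log(`[plugin] ${msg}`) };
console.log(execute(context, { message: 'hello' }).result); // → HELLO
```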
Three example plugins are included:
- markdown-linter - Analyzes markdown files for style issues
- code-formatter - Formats code with basic rules
- http-fetcher - Fetches content from URLs
Test code-formatter (fully functional):
```shell
squid chat
> Format this JSON: {"name":"test","age":30}
```
Test markdown-linter (fully functional, reads files):
```shell
squid chat
> Lint the README.md file
```
Test http-fetcher (requires network permission):
```shell
squid chat
> Fetch content from https://api.github.com
```
Note: The http-fetcher plugin requires `"network": true` permission in its plugin.json. All context APIs (readFile, writeFile, httpGet) are now fully implemented and functional.
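As a sketch of what an http-fetcher-style plugin body might look like (the exact `context.httpGet` signature is not documented in this section; it is assumed here to take a URL and return the response body as a string):

```javascript
// Hypothetical plugin body using the context HTTP API; this would need
// "network": true in its plugin.json, per the note above.
function execute(context, input) {
  const body = context.httpGet(input.url);       // assumed signature
  return { result: body.slice(0, 200) };         // truncate for the LLM
}
globalThis.execute = execute;

// Local smoke test with a stubbed context:
const stub = { httpGet: (url) => `fetched:${url}` };
console.log(execute(stub, { url: 'https://api.github.com' }).result);
// → fetched:https://api.github.com
```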
See docs/PLUGINS.md for complete documentation including:
- Plugin structure and API reference
- Security model and permissions
- Complete examples
- Best practices and troubleshooting
- Quick Start Guide - Get started in 5 minutes
- CLI Reference - Complete command-line interface documentation
- Plugin Development Guide - Create custom JavaScript tools (NEW!)
- RAG Guide - Retrieval-Augmented Generation (semantic document search)
- Security Features - Tool approval and security safeguards
- System Prompts Reference - Guide to all system prompts and customization
- Examples - Comprehensive usage examples and workflows
- Changelog - Version history and release notes
- Sample File - Test file for trying out the file context feature
- Example Files - Test files for code review prompts
- AI Agents Guide - Instructions for AI coding assistants working on this project
Try the code review and security features with the provided test scripts:
```shell
# Test code reviews (automated)
./tests/test-reviews.sh

# Test security approval (interactive)
./tests/test-security.sh

# Or test individual examples
squid review sample-files/example.rs
squid review sample-files/example.ts --stream
squid review sample-files/example.html -m "Focus on accessibility"
```
See tests/README.md for complete testing documentation and sample-files/README.md for details on each example file.
Using with LM Studio
- Download and install LM Studio from https://lmstudio.ai/
- Download the recommended model: `lmstudio-community/Qwen2.5-Coder-7B-Instruct-MLX-4bit`
- Load the model in LM Studio
- Start the local server ("Start Server")
- Set up your `.env`:
  ```shell
  API_URL=http://127.0.0.1:1234/v1
  API_KEY=not-needed
  ```
- Run:
  ```shell
  squid ask "Write a hello world program in Rust"
  # Or with a file
  squid ask -f sample-files/sample.txt "What is this document about?"
  # Use --no-stream for complete response at once
  squid ask --no-stream "Quick question"
  ```
Using with Ollama
- Install Ollama from https://ollama.com/
- Start Ollama service:
  ```shell
  ollama serve
  ```
- Pull the recommended model:
  ```shell
  ollama pull qwen2.5-coder
  ```
- Set up your `.env`:
  ```shell
  API_URL=http://localhost:11434/v1
  API_KEY=not-needed
  ```
- Run:
  ```shell
  squid ask "Write a hello world program in Rust"
  # Or with a file
  squid ask -f mycode.rs "Explain this code"
  # Use --no-stream if needed
  squid ask --no-stream "Quick question"
  ```
Using with OpenAI
- Get your API key from https://platform.openai.com/api-keys
- Set up your `.env`:
  ```shell
  API_URL=https://api.openai.com/v1
  API_KEY=sk-your-api-key-here
  ```
- Run:
  ```shell
  squid ask "Explain the benefits of Rust"
  # Or analyze a file
  squid ask -f mycode.rs "Review this code for potential improvements"
  # Use --no-stream for scripting
  result=$(squid ask --no-stream "Generate a function name")
  ```
Using with Mistral API
- Get your API key from https://console.mistral.ai/
- Set up your `.env`:
  ```shell
  API_URL=https://api.mistral.ai/v1
  API_KEY=your-mistral-api-key-here
  ```
- Run:
  ```shell
  squid ask "Write a function to parse JSON"
  # Or use code review
  squid review myfile.py
  # Mistral models work great for code-related tasks
  ```
Apache-2.0 License. See LICENSE file for details.
