# llama-cpp

Here are 1,255 public repositories matching this topic...

mozilla-ai / llamafile

Distribute and run LLMs with a single file.

cross-platform speech-to-text local-inference llama-cpp local-llm local-ai gguf open-source-ai single-file-executable

Updated Jun 9, 2026
C++

getumbrel / llama-gpt

A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!

ai self-hosted openai llama gpt gpt-4 llm chatgpt llamacpp llama-cpp gpt4all localai llama2 llama-2 code-llama codellama

Updated Apr 23, 2024
TypeScript

SciSharp / LLamaSharp

A C#/.NET library to run LLMs (🦙LLaMA/LLaVA) efficiently on your local device.

chatbot llama gpt multi-modal llm llava semantic-kernel llamacpp llama-cpp llama2 llama3

Updated Jun 1, 2026
C#

SharpAI / DeepCamera

Open-Source AI Camera Skills Platform, AI NVR & CCTV Surveillance. Local VLM video analysis with Qwen, DeepSeek, SmolVLM, LLaVA, YOLO26. LLM-powered agentic security camera agent — watches, understands, remembers & guards your home via Telegram, Discord or Slack. Pluggable AI skills. OpenAI, Google, Anthropic or local AI. Runs on Mac Mini & AI PC.

Updated Jun 18, 2026
JavaScript

Luce-Org / lucebox-hub

Fast LLM speculative inference server for consumer hardware.

spark kernel cuda cuda-kernels luce poolside rtx3090 llama-cpp local-ai qwen speculative-decoding dflash megakernel speculative-prefill pflash lucebox

Updated Jun 18, 2026
C++

Mobile-Artificial-Intelligence / maid

Maid is a free and open source application for interfacing with llama.cpp models locally, and with Anthropic, DeepSeek, Ollama, Mistral and OpenAI models remotely.

android facebook chatbot openai llama mistral claude chatgpt anthropic llama-cpp ollama gguf mobile-artificial-intelligence deepseek

Updated Apr 7, 2026
TypeScript

alichherawalla / off-grid-mobile-ai

The Swiss Army Knife of Offline AI. Chat, Speak, and Generate Images - Privacy First, Zero Internet. Download an LLM and use it on your mobile device. No data ever leaves your phone. Supports text-to-text, vision, and text-to-image.

privacy-first edge-ai ondevice mobile-ai llama-cpp local-ai offline-llm gguf stable-diffusion-android offline-ai whisper-android tool-calling ondevice-ai

Updated Jun 18, 2026
TypeScript

withcatai / node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.

Updated Jun 18, 2026
TypeScript

antoinezambelli / forge

A Python framework for self-hosted LLM tool-calling and multi-step agentic workflows

python self-hosted agents llm llama-cpp function-calling ollama llamafile agentic-workflow agentic-ai tool-calling

Updated Jun 18, 2026
Python

Light-Heart-Labs / DreamServer

Turn your PC, Mac, or Linux box into an AI server. LLM inference, chat UI, voice, agents, workflows, RAG, and image generation.

docker text-to-speech amd self-hosted nvidia speech-to-text workflow-automation ai-agents rag n8n llm llama-cpp comfyui local-ai open-webui strix-halo

Updated Jun 18, 2026
Shell

undreamai / LLMUnity

Create characters in Unity with LLMs!

chat gamedev ai unity chatbot game-development dialogue unity3d character npc llama unity2d conversational-ai rag llm generative-ai llama-cpp

Updated Apr 29, 2026
C#

RunanywhereAI / RCLI

Talk to your Mac, query your docs, no cloud required. On-device voice AI + RAG

text-to-speech metal speech-to-text voice-assistant rag parakeet on-device-ai apple-silicon ai-assistant llm llama-cpp local-ai tool-calling kokoro-tts qwen3 lfm2 kitten-tts

Updated Mar 16, 2026
C++

gotzmann / llama.go

llama.go is like llama.cpp in pure Golang!

llama gpt alpaca vicuna gpt3 gpt4 llm chatgpt dalai llama-cpp gpt4all

Updated Sep 20, 2024
Go

ggml-org / Llama-macOS

A cosy home for your LLMs.

macos swift ai llms llama-cpp

Updated Jun 18, 2026
Swift

docker / compose-for-agents

Build and run AI agents using Docker Compose. A collection of ready-to-use examples for orchestrating open-source LLMs, tools, and agent runtimes.

docker docker-compose examples openai-gym self-hosted ai-agents large-language-models llama-cpp agentic-workflows

Updated Jun 4, 2026
TypeScript

mybigday / llama.rn

React Native binding of llama.cpp

android ios react-native llama llm llama-cpp

Updated Jun 13, 2026
C++

FuJacob / cotabby

Cotabby is local AI autocomplete for your entire Mac. Open source. On device. Everywhere you type.

macos productivity autocomplete ai llama menu-bar writing-assistant llama-cpp local-ai cotypist cotabby

Updated Jun 14, 2026
Swift

Anbeeld / beellama.cpp

DFlash & TurboQuant in llama.cpp with up to 3x faster generation and 7.5x more KV cache in same VRAM

inference quantization kv-cache llm llm-serving llama-cpp ggml llm-inference speculative-decoding dflash turboquant

Updated Jun 17, 2026
C++

the-crypt-keeper / can-ai-code

Self-evaluating interview for AI coders

ai transformers humaneval llm langchain llama-cpp ggml

Updated Jun 21, 2025
Python

withcatai / catai

Run an AI ✨ assistant locally, with a simple API for Node.js 🚀

nodejs ai chatbot openai chatui vicuna ai-assistant llm chatgpt dalai llama-cpp vicuna-installation-guide localai wizardlm local-llm catai ggmlv3 gguf node-llama-cpp

Updated Nov 16, 2025
TypeScript
