GitHub - AISecurityConsortium/AIGoat: AI Goat - Learn AI security by attacking and defending a real AI-powered e-commerce application. Built for Red Teamers, security researchers, AI enthusiasts, and students to learn about adversarial attacks on AI/LLM systems. It is strictly for educational use, and the authors disclaim responsibility for any misuse.

AI Goat - Learn AI security by attacking and defending a real AI-powered e-commerce application.

This application is intentionally vulnerable. Run it only on your local machine for learning purposes. Do not expose it to the internet.

Why AI Goat Exists

Modern AI applications introduce new attack surfaces that traditional security testing does not cover. Prompt injection, data poisoning, system prompt leakage, and RAG manipulation are real risks in production AI systems -- but most teams have never seen them in action.

AI Goat was created to help developers and security professionals learn how AI systems can be attacked and defended through hands-on experimentation. Instead of reading about vulnerabilities in theory, you exploit them yourself in a safe, local environment.

What is AI Goat Shop?

AI Goat Shop is an online store with a built-in AI chatbot called Cracky. The store looks and works like a real e-commerce site -- you can browse products, place orders, leave reviews, and chat with the AI assistant.

The catch: Cracky has real security vulnerabilities that you can exploit. Every vulnerability maps to the OWASP Top 10 for LLM Applications, the industry standard for AI/LLM security risks.

The platform gives you:

9 Attack Labs -- guided exercises that teach you how to exploit specific AI vulnerabilities
9 CTF Challenges -- capture-the-flag challenges where you earn points by successfully attacking the chatbot
3 Defense Levels -- see how the same attack behaves when defenses are turned on
A poisonable Knowledge Base -- inject fake documents and watch the AI trust them

Everything runs on your computer. No cloud accounts. No API keys. No internet required after setup.

Who Is AI Goat For?

Security engineers — studying LLM vulnerabilities and building defenses
AI/ML engineers — understanding how the systems they build can be exploited
Red teamers — practicing adversarial AI techniques in a controlled environment
Security trainers — running workshops and demonstrations on AI security risks
Researchers — investigating prompt injection, data poisoning, and guardrail effectiveness
Students — learning about AI security for the first time through hands-on exercises

Typical Training Workflow

Whether you're using AI Goat for self-study or running a workshop, the typical flow looks like this:

Deploy the environment — Start the vulnerable AI chatbot on your local machine
Interact with the assistant — Browse the store, chat with the AI, and understand normal behavior
Perform attacks — Follow the attack labs to exploit vulnerabilities like prompt injection, data poisoning, and system prompt leakage
Capture flags — Complete CTF-style challenges by successfully exploiting the chatbot and submitting the flags you earn
Apply defenses — Switch to higher defense levels and observe how guardrails block or mitigate attacks
Re-test — Try the same attacks again with defenses active and see what still works

Quick Start

Prerequisites

Tool	Purpose	Install
Python 3.11+	Backend server	python.org
Node.js 18+	Frontend app	nodejs.org
Ollama	Local AI model	ollama.ai

One-Command Start

git clone https://github.com/AISecurityConsortium/AIGoat.git
cd AIGoat
./scripts/start.sh

That's it. The script will:

Check that Ollama is running (starts it if not)
Download the Mistral AI model if it's not already installed
Set up the database with demo data
Start the backend API server
Start the frontend web application

Once you see "AI Goat is running!", open your browser:

What	URL
Application	http://localhost:3000
API Docs	http://localhost:8000/docs

Login Credentials

Username	Password	Role
`alice`	`password123`	Regular user
`bob`	`password123`	Regular user
`charlie`	`password123`	Regular user
`admin`	`admin123`	Admin

Stopping the Application

./scripts/stop.sh

Starting Fresh (Reset Database)

./scripts/stop.sh --clean
./scripts/start.sh

Or use the shorthand:

./scripts/start.sh --fresh

Docker (Alternative)

Requires Docker Desktop with at least 12 GB RAM allocated. See Hardware Requirements for details.

# One-time setup: create the persistent model volume
docker volume create ollama_models

# Start the application
cd docker
docker-compose up --build

The Docker setup starts three containers: backend, frontend (via Nginx), and Ollama. On first run the backend automatically pulls the Mistral model (~4.5 GB).

The ollama_models volume is external -- it survives docker-compose down -v so the model is only downloaded once. To reset app data (database, challenges) without re-downloading the model:

docker-compose down -v   # removes app data, keeps model
docker-compose up -d     # re-seeds on next startup

How the Application Works

The AI Chatbot (Cracky)

Cracky is the AI shopping assistant. It can answer questions about products, look up orders, and help customers. At Defense Level 0 (Vulnerable), Cracky has no security protections -- it will freely share internal data, follow injected instructions, and adopt fake personas.

You interact with Cracky through the chat widget available on every page.

Defense Levels

Use the toggle in the navigation bar to switch between:

Level	Name	What Happens
0	Vulnerable	No protections. All attacks work. This is where you start.
1	Hardened	Input validation, intent classification, and output filtering are active. Some attacks still work with creative phrasing.
2	Guardrailed	Full NVIDIA NeMo Guardrails are active. Most direct attacks are blocked. Only advanced techniques have a chance.

Knowledge Base

The Knowledge Base page lets you add, edit, and delete documents that the chatbot uses as reference material. This is an intentional attack surface -- you can inject fake information and watch the chatbot trust it.

After adding or modifying entries, click "Sync to Vector DB" to push changes into the chatbot's retrieval pipeline.

Attack Labs

Attack Labs are guided exercises. Each lab targets a specific OWASP LLM vulnerability, provides example prompts, and explains what to look for at each defense level.

Navigate to the Attack Labs page from the sidebar.

OWASP	Lab	What You Learn
LLM01	Prompt Injection (3 labs)	Override chatbot instructions, inject hidden commands, chain multi-turn attacks
LLM02	Sensitive Info Disclosure (3 labs)	Extract admin credentials, customer data, internal config from the chatbot's context
LLM04	Data Poisoning (3 labs)	Inject fake info through reviews and tips that the chatbot repeats as fact
LLM05	Insecure Output (XSS)	Make the chatbot generate HTML/JavaScript that executes in the browser
LLM07	System Prompt Leakage	Extract the chatbot's hidden system instructions, including its confidential config block
LLM08	RAG Weaknesses (3 labs)	Poison the Knowledge Base, manipulate vector retrieval, flood the context window
LLM09	Misinformation (3 labs)	Trick the chatbot into fabricating certifications, endorsements, and safety data

Challenges

Challenges are CTF-style (Capture The Flag) tasks. You earn points by successfully exploiting a vulnerability and submitting the flag that appears in the chatbot's response.

Navigate to the Challenges page from the sidebar.

How Challenges Work

Each challenge has its own dedicated chat window, completely separate from the main shop chatbot.

Click a challenge card to open its detail view
Click "Start Challenge" to activate the dedicated challenge chat
Use the challenge chat to craft your attack (for KB challenges, enable the KB toggle in the chat header)
When you succeed, a flag (like AIGOAT{a1b2c3d4...}) appears in the chat response
Copy the flag and paste it into the submission field on the left panel

Challenge List

#	Name	Difficulty	Points	How to Interact
1	Prompt Injection	Beginner	100	Chatbot
2	System Prompt Extraction	Beginner	100	Chatbot
3	RAG Knowledge Poisoning	Beginner	150	Knowledge Base + Chatbot
4	Context Override	Beginner	100	Chatbot
5	Multi-turn Escalation	Intermediate	250	Chatbot (3+ messages)
6	Identity Hijacking	Intermediate	200	Chatbot
7	Authoritative Context Poisoning	Intermediate	300	Knowledge Base + Chatbot
8	Chained KB + Injection	Intermediate	400	Knowledge Base + Chatbot
9	Guardrail Erosion	Intermediate	500	Chatbot (4+ messages)

Total possible points: 2,100

Flags are unique per user and generated dynamically -- you cannot find them in the source code.

For full walkthrough solutions, see docs/challenges-walkthrough.md.

Platform Architecture

A simplified view of how the main components connect:

User
  │
  ▼
AI Chatbot Interface (React frontend)
  │
  ▼
FastAPI Backend
  │
  ├── Defense Pipeline (Level 0 / 1 / 2)
  │
  ├── Large Language Model (Ollama + Mistral)
  │
  ├── Knowledge Base (ChromaDB + RAG retrieval)
  │
  ├── Guardrails / Security Filters (NeMo Guardrails)
  │
  └── Challenge Evaluation Engine (flag generation)

The user interacts with the AI chatbot through the React frontend. Every message passes through the FastAPI backend, which routes it through the defense pipeline before reaching the language model. The defense level determines what checks are applied — from no protections at Level 0, to full NeMo Guardrails at Level 2. The Knowledge Base provides context via RAG retrieval, and the challenge engine evaluates exploit attempts and generates flags when an attack succeeds.

Detailed Architecture

graph TB
    subgraph User["User's Browser"]
        FE["React 18 + Material-UI<br/>(Port 3000)"]
    end

    subgraph Backend["FastAPI Backend (Port 8000)"]
        API["API Routes<br/>/api/chat, /api/products,<br/>/api/auth, /api/knowledge-base"]
        MW["JWT Auth Middleware"]

        subgraph Defense["Defense Pipeline"]
            L0["Level 0: Vulnerable<br/>(no checks, passthrough)"]
            L1["Level 1: Hardened<br/>Input Validator → Intent Classifier<br/>→ Policy Engine → Output Moderator"]
            L2["Level 2: Guardrailed<br/>NeMo Guardrails + Colang Rules"]
        end

        CE["Challenge Engine<br/>9 Evaluators + HMAC Flag Generator"]
        RAG["RAG Subsystem<br/>Query Rewriter → Retriever → Context Builder"]
    end

    subgraph Storage["Data Layer"]
        DB[("SQLite<br/>Users, Orders, Products,<br/>Challenges, Telemetry")]
        VDB[("ChromaDB<br/>Vector Embeddings<br/>for Knowledge Base")]
    end

    subgraph AI["AI Layer"]
        OLLAMA["Ollama<br/>(Local LLM - Mistral)"]
        EMB["Sentence Transformers<br/>(all-MiniLM-L6-v2)"]
    end

    subgraph Guardrails["NeMo Guardrails (Level 2)"]
        INPUT_RAILS["Input Rails<br/>Injection, Jailbreak, Sensitive Data,<br/>Social Engineering, Off-topic"]
        OUTPUT_RAILS["Output Rails<br/>PII Detection, Prompt Leak,<br/>HTML/XSS, Hallucination"]
    end

    FE -->|"REST API + SSE (streaming)"| MW
    MW --> API
    API --> Defense
    L0 -->|"No checks"| OLLAMA
    L1 -->|"Validated + classified"| OLLAMA
    L2 --> INPUT_RAILS
    INPUT_RAILS -->|"Blocked → canned response"| API
    INPUT_RAILS -->|"Allowed"| OLLAMA
    OLLAMA --> OUTPUT_RAILS
    OUTPUT_RAILS -->|"Clean"| API
    OUTPUT_RAILS -->|"Blocked → safe response"| API
    API --> CE
    CE -->|"Evaluate exploit → emit flag"| API
    API --> RAG
    RAG --> VDB
    RAG --> EMB
    EMB --> VDB
    API --> DB
    OLLAMA -.->|"Local inference<br/>Port 11434"| AI

How data flows through the system:

The React frontend sends API requests to the FastAPI backend.
Every request passes through JWT authentication middleware.
Chat messages enter the Defense Pipeline, which applies checks based on the current defense level (0, 1, or 2).
At Level 2, NeMo Guardrails inspect the input first. If a threat is detected, the message never reaches the AI -- a canned refusal is returned immediately.
If the message passes input checks, it goes to Ollama (the local AI model) for a response.
The AI's response passes through output rails (PII detection, prompt leak detection, HTML/XSS detection) before reaching the user.
Challenges have their own dedicated chat endpoint (/api/challenges/{id}/chat) with challenge-specific prompts. The Challenge Engine evaluates whether the exploit succeeded and injects a dynamic flag into the response.
The RAG subsystem retrieves relevant Knowledge Base documents from ChromaDB when KB integration is enabled.

Project Structure

AIGoat/
├── app/                    Python backend (FastAPI)
│   ├── api/                API route handlers
│   ├── challenges/         Flag engine and 9 exploit evaluators
│   ├── core/               Config, database, security utilities
│   ├── defense/            Input validation, intent classification, output moderation
│   ├── models/             Database models (SQLAlchemy)
│   ├── rag/                Knowledge Base retrieval (ChromaDB + embeddings)
│   └── services/           Business logic (cart, orders, chat)
├── config/
│   ├── config.yml          Main configuration file
│   └── labs.yml            Attack lab definitions
├── frontend/               React application (Material-UI)
├── prompts/                System prompts for each defense level and lab
│   ├── level0/             Vulnerable (no restrictions)
│   ├── level1/             Hardened (with security rules)
│   ├── level2/             Guardrailed (strict containment)
│   ├── labs/               Lab-specific vulnerable prompts
│   └── challenges/         Challenge-specific system prompts (one per challenge)
├── guardrails/             NeMo Guardrails config (Level 2)
├── scripts/
│   ├── start.sh            Start the application
│   ├── stop.sh             Stop the application
│   └── seed.py             Database seeding script
├── docs/
│   ├── workshop-guide.md   Instructor guide for workshops
│   └── challenges-walkthrough.md  Full challenge solutions
├── media/                  Product images and logo
└── docker/                 Docker Compose setup

Configuration

All settings are in config/config.yml:

app:
  secret_key: "aigoat-dev-secret-change-in-production"

ollama:
  base_url: "http://localhost:11434"
  model: "mistral"

defense:
  level: 0          # Default defense level (0, 1, or 2)

rag:
  enabled: true
  top_k: 5          # Number of KB documents retrieved per query

Hardware Requirements

The Mistral 7B model alone needs ~4.5 GB of RAM to load. If your machine does not have at least 8 GB of free RAM, the AI model will fail to start and the chatbot will not work.

Resource	Minimum	Recommended
RAM	8 GB free (not total -- free)	16 GB+ total
Disk	6 GB (app + Mistral model weights)	10 GB
CPU	4 cores	8+ cores
GPU	Not required, but strongly recommended	Any NVIDIA/Apple Silicon GPU with 6 GB+ VRAM

Why You Want a GPU

Without a GPU, every chat response runs on your CPU and takes 10-30 seconds. With a GPU (NVIDIA CUDA or Apple Silicon Metal), responses come back in 1-3 seconds. If you have a supported GPU, Ollama will use it automatically -- no configuration needed.

Docker Users

Docker containers share the same pool of RAM. You must allocate at least 12 GB of RAM to Docker Desktop -- the Mistral model (4.5 GB) plus the backend (PyTorch, sentence-transformers: ~3 GB) plus OS overhead leaves no room at the default 8 GB setting.

Docker Desktop → Settings → Resources → Memory → set to 12 GB or higher

If your machine only has 8 GB of total RAM, either:

Run Ollama natively on the host (outside Docker) and point the backend at it, or
Switch to a smaller model like tinyllama in docker/config.yml and update the ollama-pull entrypoint in docker-compose.yml

For GPU passthrough in Docker, use the NVIDIA Container Toolkit (--gpus all) or run Ollama natively on the host and point the backend at it.

Troubleshooting

"Ollama not reachable": Install Ollama from ollama.ai and make sure it's running (ollama serve).

Chatbot is slow: Ollama runs the AI model on your CPU by default. A GPU significantly improves speed. You can also try a smaller model by changing ollama.model in config/config.yml to "tinyllama".

"Port already in use": Run ./scripts/stop.sh first, or manually kill processes on ports 8000 and 3000.

Frontend shows blank page: Check that the backend is running at http://localhost:8000. The frontend depends on the API.

Knowledge Base not affecting chatbot: After adding/editing KB entries, you must click "Sync to Vector DB" on the Knowledge Base page, and enable the KB toggle in the chatbot.

Security Notice

AI Goat Shop is intentionally vulnerable software. The vulnerabilities are features, not bugs.

Intentional vulnerabilities (do not report): Prompt injection, system prompt extraction, RAG poisoning, data leakage, XSS via chatbot output at Level 0, weak default credentials.

Unintentional vulnerabilities (please report): Authentication bypass, arbitrary code execution, container escape, SQL injection in the backend, path traversal.

See SECURITY.md for the full disclosure policy.

Project Status

AI Goat is an actively evolving open-source platform for learning how modern AI systems can be attacked and defended.

The project is currently in early development and new labs, defense mechanisms, and learning modules will be added over time.

Project Evolution

AI Goat started as an experimental research project exploring how AI-powered applications can be attacked. Over time, it has evolved into a structured open-source training platform for AI security — with guided labs, CTF challenges, multiple defense levels, and a growing body of educational content.

The recent introduction of formal governance, structured licensing, and contribution guidelines reflects this maturity. The goal is to build a reliable, well-documented platform that individuals, teams, and organizations can confidently use for AI security training.

See GOVERNANCE.md for details on project ownership and decision-making.

Community and Research

AI Goat welcomes participation from anyone interested in AI security:

Researchers studying prompt injection, RAG vulnerabilities, or guardrail effectiveness
Educators building AI security workshops or university courses
Developers experimenting with defensive techniques and guardrail configurations
Security teams evaluating AI risks in their organizations

If you have questions, ideas, or feedback:

Open an issue on the GitHub repository
Submit a pull request — see CONTRIBUTING.md for guidelines
Start a discussion — we're happy to hear from the community

Licensing

AI Goat uses two licenses to keep the platform open while protecting training content.

Platform Code — Apache License 2.0

The application code is open source under the Apache License 2.0. This includes:

app/ — backend (FastAPI, Python), except app/challenges/
frontend/ — frontend (React, Material-UI)
guardrails/ — NeMo Guardrails configuration
scripts/ — startup and utility scripts
docker/ — Docker Compose setup
config/config.yml — runtime configuration

Anyone can use, modify, and distribute the platform code — including for commercial purposes.

Training Content — CC BY-NC-SA 4.0

The educational material is licensed under Creative Commons BY-NC-SA 4.0. This includes:

app/challenges/ — flag generation engine and exploit evaluators
prompts/ — system prompts for defense levels, labs, and challenges
docs/ — workshop guides and challenge walkthroughs
media/ — images, logos, and training assets
config/labs.yml — attack lab definitions

You can freely:

Use it for personal learning
Run labs and challenges locally
Experiment, modify, and research
Share modifications under the same license

You need permission for:

Selling AI Goat based training
Paid workshops using the labs
Certification programs built from AI Goat content

See TRAINING_LICENSE.md for details on commercial usage.

Made with ❤️ by Farooq and Nal

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
.github/workflows		.github/workflows
app		app
config		config
docker		docker
docs		docs
frontend		frontend
guardrails/config		guardrails/config
media		media
prompts		prompts
scripts		scripts
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CONTRIBUTING.md		CONTRIBUTING.md
GOVERNANCE.md		GOVERNANCE.md
LICENSE		LICENSE
NOTICE		NOTICE
README.md		README.md
RELEASE_NOTES_v0.3.md		RELEASE_NOTES_v0.3.md
SECURITY.md		SECURITY.md
TRAINING_LICENSE.md		TRAINING_LICENSE.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

AI Goat - Learn AI security by attacking and defending a real AI-powered e-commerce application.

Why AI Goat Exists

What is AI Goat Shop?

Who Is AI Goat For?

Typical Training Workflow

Quick Start

Prerequisites

One-Command Start

Login Credentials

Stopping the Application

Starting Fresh (Reset Database)

Docker (Alternative)

How the Application Works

The AI Chatbot (Cracky)

Defense Levels

Knowledge Base

Attack Labs

Challenges

How Challenges Work

Challenge List

Platform Architecture

Detailed Architecture

Project Structure

Configuration

Hardware Requirements

Why You Want a GPU

Docker Users

Troubleshooting

Security Notice

Project Status

Project Evolution

Community and Research

Licensing

Platform Code — Apache License 2.0

Training Content — CC BY-NC-SA 4.0

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages