PromptShields

Secure AI Applications in 3 Lines of Code

Stop prompt injection, jailbreaks, and data leaks in production LLM applications.

Installation

pip install promptshields

Quick Start

from promptshield import Shield

shield = Shield.balanced()
result = shield.protect_input(user_input, system_prompt)

if result['blocked']:
    print(f"Blocked: {result['reason']} (score: {result['threat_level']:.2f})")
    print(f"Breakdown: {result['threat_breakdown']}")

That's it. Production-ready security in 3 lines.

Why PromptShields?

Feature	PromptShields	DIY Regex	Paid APIs
Setup Time	3 minutes	Weeks	Days
Cost	Free	Free	$$$$
Privacy	100% Local	Local	Cloud
F1 Score	0.97 (RF) / 0.96 (DeBERTa)	~0.60	~0.95
ML Models	4 + DeBERTa	None	Black box
Async	Native	DIY	Varies

What We Block

Prompt injection attacks (direct + indirect)
Jailbreak attempts (DAN, persona replacement)
System prompt extraction
PII leakage
Session anomalies
Encoded/obfuscated attacks (Base64, URL, Unicode)

Security Modes

Choose the right tier for your application:

Shield.fast()       # ~1ms  - High throughput (pattern matching only)
Shield.balanced()   # ~2ms  - Production default (patterns + session tracking)
Shield.strict()     # ~7ms  - Sensitive apps (+ 1 ML model + PII detection)
Shield.secure()     # ~12ms - Maximum security (4 ML models ensemble)

New in v2.5.0

Per-Layer Threat Breakdown

Every response now shows exactly which layer triggered:

result = shield.protect_input(user_text, system_prompt)
print(result["threat_breakdown"])
# {"pattern_score": 0.0, "ml_score": 0.994, "session_score": 0.0}

DeBERTa Support

shield = Shield(models=["deberta"])  # Auto-downloads from HuggingFace

Async Support

from promptshield import AsyncShield

shield = AsyncShield.balanced()
result = await shield.aprotect_input(user_text, system_prompt)

FastAPI Middleware

from promptshield import Shield
from promptshield.integrations.fastapi import PromptShieldMiddleware

app.add_middleware(PromptShieldMiddleware, shield=Shield.balanced())

Allowlist & Custom Rules

shield = Shield(
    patterns=True,
    models=["random_forest"],
    allowlist=["summarize this document", "translate to french"],
    custom_patterns=[r"jailbreak|dan mode|evil\s*bot"],
)

Benchmark Results

Trained on neuralchemy/Prompt-injection-dataset:

Model	F1	ROC-AUC	FPR	Latency
Random Forest	0.969	0.994	6.9%	<1ms
Logistic Regression	0.964	0.995	6.4%	<1ms
Gradient Boosting	0.961	0.994	7.9%	<1ms
LinearSVC	0.959	0.995	10.3%	<1ms
DeBERTa-v3-small	0.959	0.950	8.5%	~50ms

Pre-trained models: neuralchemy/prompt-injection-detector · neuralchemy/prompt-injection-deberta

Documentation

Full Documentation — Complete guide with framework integrations

Quickstart Guide — Get running in 5 minutes

License

MIT License — see LICENSE

Built by NeurAlchemy — AI Security & LLM Safety Research

Name		Name	Last commit message	Last commit date
Latest commit History 64 Commits
.github		.github
models		models
promptshield		promptshield
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
DOCUMENTATION.md		DOCUMENTATION.md
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PromptShields

Installation

Quick Start

Why PromptShields?

What We Block

Security Modes

New in v2.5.0

Per-Layer Threat Breakdown

DeBERTa Support

Async Support

FastAPI Middleware

Allowlist & Custom Rules

Benchmark Results

Documentation

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

PromptShields

Installation

Quick Start

Why PromptShields?

What We Block

Security Modes

New in v2.5.0

Per-Layer Threat Breakdown

DeBERTa Support

Async Support

FastAPI Middleware

Allowlist & Custom Rules

Benchmark Results

Documentation

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages