English · 中文说明
Encrypt PII, not meaning. Locally.
The privacy layer between you and AI. Your identity stays on your device — AI gets the meaning, not you.
Rated PRvL-Gold on the PRvL reference suite — see the spec for what it measures.
from argus_redact import redact
redacted, key = redact("张三的电话是13812345678,身份证号110101199003074610", names=["张三"], lang="zh", salt=42)
print(redacted)
# expected: P-83811的电话是138****5678,身份证号ID-03292
print(sorted(key.items()))
# expected: [('138****5678', '13812345678'), ('ID-03292', '110101199003074610'), ('P-83811', '张三')]

pip install argus-redact

| | Promise | How |
|---|---|---|
| 🛡️ | Protected — your PII never leaves your device | 3-layer local detection: regex → NER → local LLM |
| 🧠 | Usable — AI can still understand and help you | Pseudonym replacement preserves meaning and context |
| 🔄 | Reversible — substring-level inverse via per-message key | One-line restore() for verbatim LLM echoes; paraphrase / coref handled by compose layer, best-effort |
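The Reversible row above comes down to substring replacement against the per-message key. A minimal sketch, assuming the key is a plain dict mapping pseudonym to original value (as the quick-start output suggests); `restore_sketch` is a hypothetical helper for illustration, not the library's `restore()`:

```python
def restore_sketch(text: str, key: dict[str, str]) -> str:
    """Hypothetical stand-in for restore(): swap each pseudonym back verbatim."""
    # Replace longer pseudonyms first so a code that is a prefix of
    # another (e.g. "P-1" vs "P-12") cannot clobber it.
    for pseudonym in sorted(key, key=len, reverse=True):
        text = text.replace(pseudonym, key[pseudonym])
    return text


key = {
    "P-83811": "张三",
    "138****5678": "13812345678",
    "ID-03292": "110101199003074610",
}
llm_echo = "P-83811的电话是138****5678,身份证号ID-03292"
restored = restore_sketch(llm_echo, key)
print(restored)  # 张三的电话是13812345678,身份证号110101199003074610
```

This only works when the LLM echoes the pseudonym verbatim, which is why paraphrased or coreferent mentions are described as best-effort.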
Other tools shred your PII — it's gone forever. argus-redact encrypts it with a different key every time. ETH Zurich research shows LLMs can deanonymize users for $1-4/person when pseudonyms are fixed. We generate fresh random keys per call — the cloud sees unrelated pseudonyms every time.
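To make the per-call key idea concrete, here is a rough sketch of salted pseudonym derivation. SHAKE-256 is named elsewhere in this README; the exact construction (salt concatenated with the value, reduced to five digits) and the `pseudonym` helper are assumptions for illustration only.

```python
import hashlib
import secrets


def pseudonym(value: str, salt: bytes, prefix: str = "P") -> str:
    """Hypothetical derivation: SHAKE-256(salt || value) reduced to a 5-digit code."""
    digest = hashlib.shake_256(salt + value.encode("utf-8")).digest(8)
    number = int.from_bytes(digest, "big") % 100_000
    return f"{prefix}-{number:05d}"


# Same salt -> same pseudonym, so one message stays internally consistent.
salt = secrets.token_bytes(16)
assert pseudonym("张三", salt) == pseudonym("张三", salt)

# A fresh random salt per call -> codes that do not line up across requests.
codes = {pseudonym("张三", secrets.token_bytes(16)) for _ in range(20)}
print(sorted(codes)[:3])
```

Because the salt changes on every call, an observer who sees many requests cannot join them on a stable pseudonym for the same person.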
redact() emits per-type pseudonym codes, not Chinese label literals:
>>> redact("员工张三,身份证110101199003074610,电话13812345678", mode='fast', lang='zh')
('员工P-83811,身份证ID-89732,电话138****5678',
{'P-83811': '张三', 'ID-89732': '110101199003074610', '138****5678': '13812345678'})

| Type group | Default output | Strategy | Reversible |
|---|---|---|---|
| person / organization | P-NNNNN / O-NNNNN | pseudonym | ✓ |
| phone / email / bank_card | 138****5678 (partial digits visible) | mask | ✗ |
| id_number / medical / ssn / ... | ID-NNNNN / MED-NNNNN / SSN-NNNNN | remove → per-type code | ✓ |
| self_reference | 我 / 我妈 (kept verbatim) | keep | ✓ |
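The strategies in the table can be pictured as a small dispatch. This is a simplified sketch: `mask_phone` and `apply_strategy` are hypothetical names, and the 3-head / 4-tail split is inferred from the `138****5678` example rather than taken from the library's configuration.

```python
def mask_phone(phone: str, head: int = 3, tail: int = 4) -> str:
    """mask: keep the first `head` and last `tail` characters, star the rest."""
    hidden = len(phone) - head - tail
    if hidden <= 0:
        return "*" * len(phone)  # too short to show anything safely
    return phone[:head] + "*" * hidden + phone[-tail:]


def apply_strategy(value: str, strategy: str, code: str = "") -> str:
    """Dispatch on the strategy names from the table (simplified)."""
    if strategy == "keep":
        return value              # e.g. self-references such as 我 / 我妈
    if strategy == "mask":
        return mask_phone(value)
    if strategy in ("pseudonym", "remove"):
        return code               # caller supplies P-NNNNN / ID-NNNNN etc.
    raise ValueError(f"unknown strategy: {strategy}")


print(apply_strategy("13812345678", "mask"))                # 138****5678
print(apply_strategy("张三", "pseudonym", code="P-83811"))   # P-83811
print(apply_strategy("我妈", "keep"))                        # 我妈
```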
To unify all reversible types under one prefix (hides PII type from the LLM):
redact(
    text,
    unified_prefix="R",
    config={
        "phone": {"strategy": "remove"},  # mask types must opt in to participate
        "email": {"strategy": "remove"},
    },
)
# → "员工R-83811,身份证R-89732,电话R-12345"

The <TYPE_N> 1-based sequential token style is on the future-release candidate list (no committed timeline). See docs/configuration.md for the current strategy reference.
argus-redact evaluates your text from your perspective, not a regulator's:
🟢 Safe — nothing about you is exposed
🟡 Caution — contains personal info, not dangerous alone
🟠 Danger — can narrow down to you specifically
🔴 Exposed — directly identifies you
from argus_redact import redact
report = redact("身份证110101199003074610,手机13812345678,确诊糖尿病", report=True)
report.risk.level # "critical"
report.risk.score # 1.0
report.risk.reasons # ("id_number (critical)", "phone (high)", "medical (critical)", ...)

This is what compliance frameworks don't tell you: how dangerous is it to share this specific text with AI?
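One plausible way to read the report above is "the overall level is the worst single finding". The sketch below illustrates that reading only. The per-type severities come from the reasons shown in the sample; the `low` / `medium` labels, the empty-input default, and the `assess` helper are assumptions, not the library's scoring model.

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]

# Per-type severities as they appear in the sample reasons above.
TYPE_SEVERITY = {"id_number": "critical", "phone": "high", "medical": "critical"}


def assess(detected_types: list[str]) -> tuple[str, tuple[str, ...]]:
    """Return (overall level, reasons); the level is the worst single finding."""
    if not detected_types:
        return "low", ()  # assumed default when nothing is detected
    severities = [TYPE_SEVERITY.get(t, "low") for t in detected_types]
    reasons = tuple(f"{t} ({s})" for t, s in zip(detected_types, severities))
    worst = max(severities, key=SEVERITY_ORDER.index)
    return worst, reasons


level, reasons = assess(["id_number", "phone", "medical"])
print(level)    # critical
print(reasons)  # ('id_number (critical)', 'phone (high)', 'medical (critical)')
```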
Layer 1 Rust+Regex phone, ID, bank card, email, self-reference, ... <0.2ms
│
produce_hints() → text_intent, pii_density, self_reference_tier
│
Layer 2 NER ← hints locations, organizations, standalone names 10-100ms
Layer 3 Local LLM implicit PII — symptoms→disease, behavior→belief ~20s
Layers are not independent — L1 passes hints to L2, enabling collaborative detection. Instruction text ("帮我看看这段代码") skips NER entirely. High PII density lowers NER thresholds. Cross-layer agreement boosts confidence.
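A toy version of how Layer 1 hints might steer Layer 2. The three field names come from produce_hints() above; the field values, the density cut-off, the integer thresholds, and the `plan_ner` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hints:
    text_intent: str          # assumed values: "instruction" or "narrative"
    pii_density: float        # assumed unit: Layer 1 hits per 100 characters
    self_reference_tier: int  # carried through unchanged in this sketch


def plan_ner(hints: Hints, base_threshold: int = 80) -> Optional[int]:
    """Return None to skip NER, else the confidence threshold in percent."""
    if hints.text_intent == "instruction":
        return None                   # e.g. "帮我看看这段代码": skip NER entirely
    if hints.pii_density >= 1.0:      # hypothetical cut-off for "high density"
        return base_threshold - 15    # dense PII: accept weaker NER matches
    return base_threshold


print(plan_ner(Hints("instruction", 0.0, 0)))  # None
print(plan_ner(Hints("narrative", 2.5, 1)))    # 65
print(plan_ner(Hints("narrative", 0.1, 0)))    # 80
```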
Unicode-hardened: NFKC normalization, zero-width stripping, Cyrillic/Greek confusable defense, Chinese digit detection (一三八零零一三八零零零 → detected as phone).
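The normalization steps can be approximated with the standard library. This sketch covers NFKC, zero-width stripping, and Chinese digit runs; it does not attempt the Cyrillic/Greek confusable defense, and the minimum run length of 7 is an arbitrary choice made for this example.

```python
import re
import unicodedata

# Characters to delete: zero-width space/joiners, word joiner, BOM.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))
CN_DIGITS = str.maketrans("零〇一二三四五六七八九", "00123456789")
CN_DIGIT_RUN = re.compile(r"[零〇一二三四五六七八九]{7,}")


def harden(text: str) -> str:
    """Normalize text before regex matching (illustrative, not exhaustive)."""
    text = unicodedata.normalize("NFKC", text)  # full-width digits -> ASCII
    text = text.translate(ZERO_WIDTH)           # drop invisible separators
    # Only rewrite long runs, so ordinary words containing 一 or 三 survive.
    return CN_DIGIT_RUN.sub(lambda m: m.group().translate(CN_DIGITS), text)


print(harden("一三八零零一三八零零零"))          # 13800138000
print(harden("138\u200b1234\u200d5678"))        # 13812345678
print(harden("\uff11\uff13\uff18"))             # 138
print(harden("一起去三亚"))                      # 一起去三亚
```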
Core engine (regex matching, entity merging, restore, pseudonym generation) is written in Rust via PyO3 for maximum performance. Python handles orchestration, NER models, and LLM integration.
60+ PII types across 3 layers — from phone numbers to medical diagnoses, religious beliefs, political opinions. Default is mode="fast" (Layer 1 only, zero deps, sub-ms). Opt in: mode="ner" (+ NER models) → mode="auto" (all three layers).
Telemetry: ARGUS_PERF_LOG=perf.jsonl for per-call timing breakdown. Details →
Deployment fit — modes have very different latency budgets; pick by where you sit in the request path:
| Mode | Latency (per doc) | Suitable as |
|---|---|---|
| fast | <1ms | Inline gateway plugin / hot LLM proxy path |
| ner | 10–100ms | Sidecar / pre-flight middleware |
| auto | ~20s (LLM-bound) | Async batch / offline review queue |
Don't put auto in front of an interactive LLM call. Use fast inline + auto in a parallel audit lane.
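The "fast inline + auto in a parallel audit lane" pattern might look like the following. `fast_redact` and `deep_audit` are placeholders standing in for the real `mode="fast"` and `mode="auto"` calls, so that the sketch runs without the library installed.

```python
from concurrent.futures import Future, ThreadPoolExecutor

audit_pool = ThreadPoolExecutor(max_workers=2)


def fast_redact(text: str) -> str:
    """Placeholder for the inline mode="fast" call."""
    return text.replace("13812345678", "138****5678")


def deep_audit(text: str) -> list[str]:
    """Placeholder for the slow mode="auto" pass; returns extra findings."""
    return ["medical"] if "糖尿病" in text else []


def handle_request(text: str) -> tuple[str, Future]:
    # Queue the slow pass first; it never blocks the user-facing path.
    audit = audit_pool.submit(deep_audit, text)
    return fast_redact(text), audit


safe_text, audit = handle_request("手机13812345678,确诊糖尿病")
print(safe_text)       # 手机138****5678,确诊糖尿病
print(audit.result())  # ['medical']
```

The interactive call only waits for the sub-millisecond lane; whatever the audit lane finds later can feed alerts or a review queue.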
argus-redact is a PII data minimization aid, not an anonymization or compliance certification:
- L1 fast (regex) matches well-defined formats. Novel or obfuscated variants, cross-field inference attacks pass through.
- L2 NER is statistical inference; out-of-distribution text (informal, typo-heavy, minority names) has higher miss rate. See benchmark results for measured numbers.
- No guarantee against adversarial inputs — attackers can craft text that evades detection.
- Removing explicit PII ≠ anonymity. LLM agents can re-identify individuals by combining residual, individually-non-identifying cues with public data — even on redacted text, even during benign tasks (Ko et al. 2026). Reversible substitution protects explicit identifiers and preserves LLM utility; it does not defend against inference-based re-identification, which a per-document redactor cannot fully prevent — the residual comes from combinations of quasi-identifiers, not single fields (why coarsening one field doesn't fix it).
- Not a GDPR / PIPL anonymization framework — anonymization is a compliance process decision, not a single-library output.
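The quasi-identifier point in the list above is easy to see on a toy population (the four records below are invented for illustration). Each clue alone leaves several candidates; together they single out one person.

```python
people = [
    {"city": "Hangzhou", "job": "nurse",   "hobby": "climbing"},
    {"city": "Hangzhou", "job": "nurse",   "hobby": "chess"},
    {"city": "Hangzhou", "job": "teacher", "hobby": "climbing"},
    {"city": "Chengdu",  "job": "nurse",   "hobby": "climbing"},
]


def candidates(clues: dict[str, str]) -> int:
    """Count people consistent with every clue left in a redacted text."""
    return sum(all(p[k] == v for k, v in clues.items()) for p in people)


print(candidates({"city": "Hangzhou"}))   # 3
print(candidates({"job": "nurse"}))       # 3
print(candidates({"hobby": "climbing"}))  # 3
print(candidates({"city": "Hangzhou", "job": "nurse", "hobby": "climbing"}))  # 1
```

Coarsening any one field (say, city to province) still leaves the other two to do the narrowing, which is why single-field fixes do not close the gap.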
When to use argus-redact: reversible pseudonymization for LLM pipelines where you need redact() → LLM → restore() with zero PII crossing the network boundary.
When to consider alternatives: if you need one-way English PII masking with a single model call, OpenAI Privacy Filter and similar model-based maskers may fit better. argus-redact's strongest suit is reversible pseudonymization with per-message keys; Chinese has the deepest support (HanLP + native validators), the other 7 languages have regex + spaCy NER coverage. Pick by the workload, not by exclusivity.
Combine argus-redact with audit logging, rate limiting, and upstream policy — no single layer is sufficient.
| | zh | en | ja | ko | de | uk | in | br |
|---|---|---|---|---|---|---|---|---|
| Phone | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| National ID | MOD11-2 + legacy 15-digit | SSN | My Number | RRN | Tax ID | NINO | Aadhaar | CPF/CNPJ |
| Bank/Card | Luhn | Luhn | — | — | IBAN | — | PAN | — |
| Person names | HanLP | spaCy | spaCy | spaCy | spaCy | spaCy | spaCy | spaCy |
| | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Mix freely: lang=["zh", "en", "de"]. Pass known names: names=["王一", "张三"].
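The Luhn and MOD11-2 entries in the table are standard checksums. For reference, a plain-Python version of both (the library's own validators live in the Rust core; these standalone functions are a sketch, not that code):

```python
def luhn_ok(number: str) -> bool:
    """Luhn checksum used by payment card numbers."""
    digits = [int(c) for c in number if c.isdecimal()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d = d * 2 - 9 if d > 4 else d * 2
        total += d
    return bool(digits) and total % 10 == 0


def mod11_2_ok(id_number: str) -> bool:
    """ISO 7064 MOD 11-2 check for 18-character mainland China ID numbers."""
    if len(id_number) != 18 or not id_number[:17].isdecimal():
        return False
    weights = [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2]
    total = sum(int(c) * w for c, w in zip(id_number[:17], weights))
    return "10X98765432"[total % 11] == id_number[17].upper()


print(luhn_ok("4111 1111 1111 1111"))    # True
print(luhn_ok("4111111111111112"))       # False
print(mod11_2_ok("110101199003074610"))  # True
print(mod11_2_ok("110101199003074611"))  # False
```

Validating the checksum before emitting a match is what keeps fast mode high-precision: a random 18-digit string passes MOD11-2 only about one time in eleven.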
Rust core (PyO3), mode="fast" — p50, Apple M-series, Python 3.11. Reproduce
with python tests/benchmark/bench_l1_rust_vs_python.py:
| Text | redact() | restore() | Throughput |
|---|---|---|---|
| Short (17 chars) | 0.03ms | <0.01ms | ~29,000 docs/sec |
| Medium (770 chars) | 0.75ms | 0.15ms | ~1,330 docs/sec |
| Long (10K chars) | 9.3ms | 0.18ms | ~107 docs/sec |
Pre-built wheels for all major platforms — no Rust toolchain needed to install:
✓ Linux x86_64 (glibc + musl/Alpine)
✓ Linux aarch64 (Raspberry Pi + Alpine ARM)
✓ macOS (Apple Silicon + Intel)
✓ Windows x64
× Python 3.10 / 3.11 / 3.12 / 3.13
Detection accuracy
| Mode | Precision | Recall | F1 |
|---|---|---|---|
| fast (regex) | 81.6% | 31.9% | 45.8% |
| ner (+ spaCy) | 74.9% | 42.8% | 54.4% |
| auto (+ Ollama 32B) | skipped this run |
ai4privacy en, 500 samples, v0.7.9. auto mode skipped on the maintainer's hardware — see benchmark-report.md for full matrix + reproduction commands.
For context: fast mode is high-precision / low-recall by design — it only emits formats it can validate (Luhn, MOD11-2, etc.). Recall comes from ner and auto at the cost of latency. Pick the mode for your deployment shape (see Deployment fit above). Full benchmarks → | Performance →
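As a quick sanity check on the accuracy table, F1 is the harmonic mean of precision and recall. Recomputing it from the rounded percentages lands within about 0.1 point of the published F1 values, which is consistent with the table having been computed from unrounded counts.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for mode, p, r in [("fast", 0.816, 0.319), ("ner", 0.749, 0.428)]:
    print(f"{mode}: F1 = {f1(p, r):.1%}")
```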
| Dimension | Current (v0.7.13) | Next milestone |
|---|---|---|
| Protected | 60+ PII types, L1-L3. 0% PII leak on default profile across GPT-5 / Claude-Opus-4.5 / Gemini-2.5-Pro / GLM-4.5 in the PRvL reference suite. pseudonym-llm profile: 100% on three of four models; 96% / Bronze on Claude-Opus-4.5 (single reroll cell). Not a guarantee against adversarial inputs; see prvl-standard.md for full matrix. Cross-layer hints in 8 langs (zh/en/ja/ko/de/uk/in/br). SHAKE-256 derivation + full-salt entropy + faker identity-pass guard. State export omits salt by default; HTTP server refuses no-auth start; CLI writes O_NOFOLLOW + key files mode 0600; MCP token store TTL+LRU (v0.6.2). Windows CI + property-tested invariants + mutation-tested core (v0.6.3) + perf budget CI gate (v0.6.4) + session-isolation in integrations (v0.6.6) + README pinned-to-doctest + version-sync CI guard (v0.6.6) + compose namespace + pure-layer purity guard (v0.6.7) + seed→salt API rename + PIITypeDef SSOT + Presidio bridge through public redact + 3 new types (v0.6.8) + compose helpers shipped (v0.6.9) + Layer 1 freeze guards + KDF replay vectors + dead code subtract + manylinux digest pin (v0.6.10) + Adapter authoring surface (compose.register_pii_type / PIITypeDef / PatternMatch) + KDF replay edge cases (full-FF salt fix) + Layer 2 signature snapshot (v0.6.11) + HK/Macao travel permits + housing-fund zh L1 coverage (v0.6.12). v0.7.x (100% Rust core SSOT): argus-redact-core crate + crates.io publish, with patterns/validators/normalization/replace+restore/fakers/person-scoring + the full L1 redact/restore engine ported to Rust (v0.7.0–v0.7.8) + fail-closed hardening & detection-correctness (v0.7.9–v0.7.10) + in-browser wasm build (v0.7.11). v0.7.12 (quasi-identifier detection breadth): evidence-gated zh bare-region, occupation, medical condition/allergy, and hobby (new type) detection via a shared evidence_detector framework, plus a re-identification eval (PRvL+ X-axis); the unreleased generalize strategy removed | Adversarial testing |
| Usable | PRvL U=100%. Pseudonym codes + realistic mode (zh + en + RFC shared) + per-call strategy overrides + keep strategy (whitelisted) + resumable streaming sessions + incremental streaming default + cross-language alias restore (zh ↔ en) | Task-aware guidance |
| Reversible | PRvL R by task: reference 100%, extract 50%, creative 0% (by design). Cross-language LLM rewrites (张三 → Zhang San) auto-restored via result.aliases + restore(text, key, aliases=...) | Task-aware guidance |
| Compliance | Meets PIPL Art.28 sensitive PII categories, risk assessment + profiles | PIPL/GDPR/HIPAA (byproduct) |
| Coverage | 8 langs, 4 LLMs benchmarked, 6 frameworks | Browser extension |
# Assess risk before sending to AI
report = redact(text, report=True)
report.risk.level # "critical"
report.risk.pipl_articles # ("PIPL Art.28", "PIPL Art.51", ...)
report.entities # detected PII details
report.stats # per-layer timing

# CLI
argus-redact assess <<< "身份证110101199003074610"

Compliance profiles: redact(text, profile="pipl") / "gdpr" / "hipaa".
Type filtering: redact(text, types=["phone", "id_number"]) / types_exclude=["address"].
Default redaction emits placeholder labels ([TEL-79329], P-164) — clear for audit, but breaks downstream LLM reasoning because the message structure is gone. The pseudonym-llm profile replaces PII with realistic-looking but reserved-range fake values (e.g., 19999... mobile, 999... ID, 999999... bank card). LLMs reason correctly; humans can still tell it's synthetic if they know the convention.
Each call returns three text forms sharing one key dict:
| Form | Example | Use for |
|---|---|---|
| audit_text | 请拨打 [TEL-79329] 联系 P-164 | Compliance archive: placeholder labels are auditable |
| downstream_text | 请拨打 19999123456 联系张明 | LLM input: semantic structure preserved |
| display_text | 请拨打 19999123456ⓕ 联系张明ⓕ | UI rendering: visible ⓕ marker prevents confusion |
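The relationship between downstream_text and display_text in the table can be sketched as "append ⓕ after each synthetic value". The `to_display` helper and its `fake_values` argument are hypothetical; how the library tracks which spans are synthetic is not shown here.

```python
import re

FAKE_MARK = "ⓕ"


def to_display(downstream_text: str, fake_values: list[str]) -> str:
    """Append the marker after every synthetic value in a single pass."""
    if not fake_values:
        return downstream_text
    # Longest alternatives first, so a fake that contains a shorter fake
    # is matched whole and marked exactly once.
    ordered = sorted(fake_values, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(v) for v in ordered))
    return pattern.sub(lambda m: m.group() + FAKE_MARK, downstream_text)


print(to_display("请拨打 19999123456 联系张明", ["19999123456", "张明"]))
# 请拨打 19999123456ⓕ 联系张明ⓕ
```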
from argus_redact import redact_pseudonym_llm, restore
# Chinese
zh = redact_pseudonym_llm("请拨打 13912345678 联系王建国", lang="zh")
zh.downstream_text # "请拨打 19999123456 联系张明" → LLM
zh.display_text # "请拨打 19999123456ⓕ 联系张明ⓕ" → UI
# English
en = redact_pseudonym_llm("Call (415) 555-1234, SSN 123-45-6789", lang="en")
en.downstream_text # "Call (555) 555-0142, SSN 999-37-2811" → LLM
en.audit_text # "Call [PHONE-23801], SSN [SSN-15772]" → audit
# Mixed (auto-detect)
mx = redact_pseudonym_llm("客户Wang at user@company.com", lang="auto")
# Round-trip works on any of the three forms, in any language
restore(zh.downstream_text, zh.key) # → original
restore(en.downstream_text, en.key) # → original
restore(mx.downstream_text, mx.key) # → original

# CLI emits all three forms as JSON
echo "Call (415) 555-1234" | \
argus-redact redact -k key.json --profile pseudonym-llm -l en | \
jq .downstream_text
# "Call (555) 555-0142"

Reserved ranges:
- zh: 199-99-XXXXXX mobile (sub-segment unassigned by MIIT, 工信部), 099- landline (no such area code), 999XXX ID address code (GB/T 2260 unassigned), 999999 bank BIN (unassigned by UnionPay, 银联), 滨海市 (a fictional city).
- en: (555) 555-01XX phone (FCC permanent fictional reservation), 999-XX-XXXX SSN (SSA never assigns 9XX), 999999 credit card BIN, John Doe / Jane Roe person, 1313 Mockingbird Lane address.
- shared (RFC): example.com/.org/.net email (RFC 2606), 192.0.2.0/24 / 198.51.100.0/24 / 203.0.113.0/24 IPv4 (RFC 5737), 2001:db8::/32 IPv6 (RFC 3849), 00:00:5E:00:53:xx MAC (RFC 7042).
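Because the fake values come from reserved ranges, a consumer can recognize them mechanically. The sketch below checks a few of the en and shared ranges listed above using only the standard library; `looks_synthetic` is a hypothetical helper, and the patterns cover just the formats shown in this README.

```python
import ipaddress
import re

DOC_NETS = [ipaddress.ip_network(n) for n in (
    "192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24",  # RFC 5737
    "2001:db8::/32",                                      # RFC 3849
)]
FICTIONAL_US_PHONE = re.compile(r"\(555\) 555-01\d{2}")
RESERVED_SSN = re.compile(r"999-\d{2}-\d{4}")
EXAMPLE_DOMAIN = re.compile(r"@(?:[\w-]+\.)*example\.(?:com|org|net)$")


def looks_synthetic(value: str) -> bool:
    """True if the value falls in one of the reserved ranges checked here."""
    try:
        ip = ipaddress.ip_address(value)
    except ValueError:
        pass  # not an IP address; fall through to the text patterns
    else:
        return any(ip in net for net in DOC_NETS)
    return bool(
        FICTIONAL_US_PHONE.fullmatch(value)
        or RESERVED_SSN.fullmatch(value)
        or EXAMPLE_DOMAIN.search(value)
    )


print(looks_synthetic("192.0.2.10"))      # True
print(looks_synthetic("8.8.8.8"))         # False
print(looks_synthetic("(555) 555-0142"))  # True
print(looks_synthetic("123-45-6789"))     # False
```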
Argus Gateway integration: response headers should include X-Argus-Redact-Profile: pseudonym-llm; UI clients render display_text, LLM clients consume downstream_text. Storage of downstream_text as business truth is unsafe — it's synthetic by design.
Real users named like canonical fakes (e.g., a real customer named 张三 or John Doe): pass reserved_names={"person_zh": ()} (or person_en) to disable that locale's canonical-name pollution detection so the real user's name flows through normal redaction.
For chat sessions or long-form input where text arrives in chunks, use StreamingRedactor (input side) and StreamingRestorer (output side). Both require complete logical units per chunk (sentence / paragraph / turn) — entities split across chunk boundaries are not handled.
from argus_redact.streaming import StreamingRedactor, StreamingRestorer

# Input side: redact each chunk; same original value across chunks → same fake
r = StreamingRedactor(salt=b"my-secret-salt", lang="zh")
for chunk in input_stream:  # one sentence/paragraph/turn each
    res = r.feed(chunk)
    send_to_llm(res.downstream_text)

# Output side: restore LLM output stream at sentence boundaries
restorer = StreamingRestorer(r.aggregate_key())
for chunk in llm_output_stream:
    restored = restorer.feed(chunk)
    if restored:
        print(restored, end="")
print(restorer.flush(), end="")

True byte-level streaming (entities crossing chunk boundaries) needs full incremental detection and is roadmapped for a later release.
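To show why restoring at sentence boundaries tolerates token-sized LLM chunks, here is a self-contained sketch of a buffering restorer with the same feed() / flush() shape. `SentenceBufferedRestorer` is hypothetical and much simpler than StreamingRestorer; it treats only 。!? and their ASCII equivalents plus newline as boundaries, and deliberately skips the ASCII period, which can occur inside fake values such as email addresses.

```python
import re

SENTENCE_END = re.compile(r"[。!?!?\n]")


class SentenceBufferedRestorer:
    """Hypothetical sketch: hold text until a sentence ends, then restore it."""

    def __init__(self, key: dict[str, str]):
        self.key = key
        self.buffer = ""

    def _restore(self, text: str) -> str:
        for pseudonym in sorted(self.key, key=len, reverse=True):
            text = text.replace(pseudonym, self.key[pseudonym])
        return text

    def feed(self, chunk: str) -> str:
        self.buffer += chunk
        ends = [m.end() for m in SENTENCE_END.finditer(self.buffer)]
        if not ends:
            return ""  # no complete sentence yet; keep buffering
        cut = ends[-1]
        done, self.buffer = self.buffer[:cut], self.buffer[cut:]
        return self._restore(done)

    def flush(self) -> str:
        done, self.buffer = self.buffer, ""
        return self._restore(done)


demo = SentenceBufferedRestorer({"P-83811": "张三"})
print(repr(demo.feed("你好,P-83")))  # ''  (pseudonym is still incomplete)
print(demo.feed("811。请"))          # 你好,张三。
print(demo.flush())                  # 请
```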
⚠️ Realistic-mode output must not be re-redacted (it would corrupt the key dict). redact_pseudonym_llm will raise PseudonymPollutionError if called on already-faked input; call restore() first.
Full API → · Design constraints →
| | Install |
|---|---|
| LangChain / LlamaIndex / FastAPI | core |
| Presidio bridge | pip install argus-redact[presidio] |
| MCP Server (Claude Desktop / Cursor) | pip install argus-redact[mcp] |
| HTTP API Server | pip install argus-redact[serve] |
| Structured data (JSON / CSV) | core |
| Streaming restore | core |
| Docker | slim 157MB / full 5GB |
PII never leaves your device. Per-message keys prevent cross-request profiling. Full security model →
Meets PIPL · GDPR · HIPAA technical requirements as a byproduct of its privacy-first design. Details →
| Getting Started | Install, first redact/restore, key management |
| API Reference | All parameters, return types, streaming, structured data |
| CLI Reference | Commands, flags, serve, MCP server |
| Configuration | Per-type strategies, enterprise mask rules, false positive reduction |
| Sensitive Info | Taxonomy, privacy levels, roadmap |
| PII Type Catalog | All PII types — strategy, sensitivity, PIPL/GDPR/HIPAA mapping (auto-generated) |
| Architecture | Three-layer engine, cross-layer hints, pure/impure separation |
| Language Packs | Adding new languages |
| Security Model | Threat model, compliance, per-message keys |
| PRvL Standard | Open evaluation standard: Privacy × Reversibility × Language |
| Layer 3 Benchmark | LLM model comparison, prompt design, regulatory analysis |
| Benchmarks | Evaluation against 9 public PII datasets |
| Performance | Latency, throughput, benchmark results |
CONTRIBUTING.md — language packs, test scenarios, framework integrations welcome.
| Who | Contribution |
|---|---|
| @aiedwardyi | Brazilian Portuguese language pack (CPF, CNPJ, phone) |