This is a fork of kipeum86/document-redactor — an excellent offline DOCX redaction tool originally built for Korean legal practice. This fork adds detection rules for the legal system of England & Wales, with a focus on clinical negligence proceedings and inquests.
All credit for the core architecture, security model, OOXML handling, and verification pipeline belongs to the original project. This fork only extends the detection rules.
The upstream tool ships with detection rules tuned for Korean and US legal documents. This fork adds 20 UK-specific rules across four new files, without modifying any existing rules:
| Rule | Example | Tier |
|---|---|---|
| National Insurance number | QQ 12 34 56 C | All |
| NHS number (Modulus 11 validated) | 943 476 5919 | All |
| UK domestic phone (all Ofcom formats) | 07700 900123, 020 7946 0958, 0117 496 0123 | All |
| UK postcode | SW1A 1AA, B2 4QA | Standard |
| GMC number (context-gated) | GMC No: 1234567 | Standard |
| NMC PIN (context-gated) | NMC PIN: 12A3456B | Standard |
| UK driving licence | SMITH 861215 J99KA 12 | Standard |
| Hospital number / MRN (context-gated) | MRN: RXH 123456 | Standard |
| UK bank sort code (context-gated) | Sort Code: 12-34-56 | Standard |
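
The Modulus 11 check on NHS numbers is what lets that rule reject arbitrary 10-digit strings. A minimal sketch of the standard check-digit algorithm follows; `isValidNhsNumber` is a hypothetical helper written for illustration and is not taken from the fork's source.

```typescript
// Hypothetical helper: standard NHS number Modulus 11 check.
// Accepts digits optionally separated by spaces or hyphens.
function isValidNhsNumber(candidate: string): boolean {
  const digits = candidate.replace(/[\s-]/g, "");
  if (!/^\d{10}$/.test(digits)) return false;

  // Weight the first nine digits by 10, 9, ..., 2 and sum them.
  let sum = 0;
  for (let i = 0; i < 9; i++) {
    sum += Number(digits.charAt(i)) * (10 - i);
  }

  // Check digit is 11 minus the remainder; 11 maps to 0, 10 is never valid.
  const remainder = sum % 11;
  const check = remainder === 0 ? 0 : 11 - remainder;
  if (check === 10) return false;

  return check === Number(digits.charAt(9));
}
```

For the example in the table, the weighted sum of `943476591` is 299, the remainder modulo 11 is 2, and the expected check digit is 9, so `943 476 5919` passes.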

| Rule | Example | Tier |
|---|---|---|
| Court claim number (all divisions) | KB-2024-001234, county court refs | Standard |
| Coroner's reference (context-gated) | Inquest Ref: 2024-0123 | Standard |
| UK legal context scanner | Claim No: ..., Inquest into the death of ... | Standard |
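
"Context-gated" rules only fire when a recognisable label precedes the value, which keeps generic number patterns from flooding the candidate list. The sketch below shows the idea for a coroner's reference. The label variants, the `NNNN-NNNN` value shape (inferred from the example above), and the names `Candidate` and `findCoronerRefs` are all assumptions for illustration, not the fork's actual rule.

```typescript
// Hypothetical shape for a redaction candidate.
interface Candidate {
  value: string;
  start: number; // index of the first character of the value
  end: number;   // index just past the last character of the value
}

// Hypothetical context-gated matcher: the reference is only reported
// when a label such as "Inquest Ref:" appears immediately before it.
function findCoronerRefs(text: string): Candidate[] {
  const pattern =
    /\b(?:Inquest|Coroner(?:'s)?)\s+Ref(?:erence)?\s*[:#]?\s*(\d{4}-\d{4})\b/gi;
  const found: Candidate[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const value = match[1];
    if (value === undefined) continue;
    // The captured value sits at the end of the match, so only the value
    // (not the label) is marked for redaction.
    const end = match.index + match[0].length;
    found.push({ value, start: end - value.length, end });
  }
  return found;
}
```

With this sketch, `findCoronerRefs("Inquest Ref: 2024-0123")` reports the reference, while the same digits after an unrelated word such as `Invoice` are ignored.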

| Rule | Example | Tier |
|---|---|---|
| DD/MM/YYYY date (calendar-validated) | 15/03/2024, 15.03.2024 | Standard |
| DD/MM/YY short date | 15/03/24 | Paranoid |
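
Calendar validation means a string shaped like a date is only treated as one if the day actually exists in that month. A minimal sketch, assuming day-first order, a four-digit year, and the same separator (`/` or `.`) in both positions; `isValidUkDate` is a hypothetical name and the fork's real rule may differ in detail.

```typescript
// Hypothetical helper: accept DD/MM/YYYY or DD.MM.YYYY only when the
// date exists on the Gregorian calendar.
function isValidUkDate(candidate: string): boolean {
  // \2 forces the second separator to equal the first.
  const m = /^(\d{1,2})([\/.])(\d{1,2})\2(\d{4})$/.exec(candidate);
  if (m === null) return false;

  const day = Number(m[1]);
  const month = Number(m[3]);
  const year = Number(m[4]);
  if (month < 1 || month > 12 || day < 1) return false;

  const isLeap = (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
  const daysInMonth = [31, isLeap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
  return day <= (daysInMonth[month - 1] ?? 0);
}
```

Under these assumptions `15/03/2024` and `29/02/2024` are accepted, while `31/02/2024`, `29/02/2023`, and the month-first `03/15/2024` are rejected.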

| Rule | Example | Tier |
|---|---|---|
| NHS Trust / Health Board / ICB | Barts Health NHS Trust, Betsi Cadwaladr University Health Board | Standard |
| UK judicial titles + name | His Honour Judge Smith, Mrs Justice Andrews, HHJ Taylor | Standard |
| KC / QC + name | Sarah Jones KC | Standard |
| Medical professional titles + name | Consultant Smith, Staff Nurse Patel | Paranoid |
| Medical record context labels | Patient:, D.O.B:, GP:, Ward: | Standard |
| Inquest context | Touching the death of, Deceased:, The late | Standard |

Neutral citations ([2024] EWHC 123 (KB)), law report citations ([2024] 1 WLR 123), statute references (s.11 Limitation Act 1980), and CPR references are not flagged. These are public legal knowledge — they don't identify any person, case, or place.
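
One possible way to keep such references out of the candidate list is to test each span against citation patterns before flagging it. The sketch below is an assumption about how this could be done, not a description of the fork's code. `isPublicCitation` is a hypothetical name, and the court and report abbreviations listed are a small illustrative subset.

```typescript
// Hypothetical filter: returns true for spans that look like a neutral
// citation or a law report citation, so a caller can skip them.
function isPublicCitation(span: string): boolean {
  // e.g. "[2024] EWHC 123 (KB)" or "[2023] EWCA Civ 45"
  const neutral =
    /^\[\d{4}\]\s+(?:UKSC|UKPC|EWCA\s+(?:Civ|Crim)|EWHC|EWCOP|EWFC)\s+\d+(?:\s+\((?:KB|QB|Ch|Fam|Admin|Comm|TCC)\))?$/;
  // e.g. "[2024] 1 WLR 123" (optional volume number before the series)
  const lawReport = /^\[\d{4}\]\s+(?:\d+\s+)?(?:WLR|AC|QB|KB|Ch|All ER)\s+\d+$/;
  const trimmed = span.trim();
  return neutral.test(trimmed) || lawReport.test(trimmed);
}
```

A court claim number such as `KB-2024-001234` does not match either pattern, so it would still be flagged.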
Everything else comes from the original project:
- Zero-network architecture: CSP `default-src 'none'`, ESLint network bans, build-time ship gate
- Single HTML file: download, double-click, redact
- OOXML deep traversal: body, headers, footers, footnotes, endnotes, comments, metadata, relationship files
- Round-trip verification: the output DOCX is re-parsed and checked before download
- Metadata stripping: scrubs author, company, tracked changes, comments, custom properties
- Field and hyperlink flattening: catches hidden URLs in OOXML instruction text
- Manual additions: type any string to add it as a redaction target
- 1,700+ automated tests with 90% coverage thresholds
See the upstream README and USAGE.md for full documentation.

- Go to Releases and download `document-redactor.html`
- Double-click to open in your browser
- Drop a `.docx` file
- Review candidates, add any the tool missed
- Click Apply and verify
- Download the `.redacted.docx`

```bash
git clone https://github.com/banterny/document-redactor.git
cd document-redactor
bun install
bun run test
bun run build
open dist/document-redactor.html
```

This fork tracks the upstream main branch. To pull in future improvements:

```bash
git fetch upstream
git merge upstream/main
```

The UK rules live in separate files (`*-uk.ts`), so merge conflicts should be rare.

Apache 2.0 — same as the upstream project.
This fork exists because kipeum86 built something genuinely excellent. The security architecture (defence-in-depth with three enforcement layers), the round-trip verification pipeline, and the single-file distribution model are all outstanding engineering decisions. This fork just teaches it to recognise UK postcodes and NHS numbers.