Commit 5ec346e

Aditya Kaushal and claude authored and committed
fix: replace Mermaid runtime with static SVG diagrams
Mermaid CDN was failing to render on GitHub Pages causing blank diagram areas.
All 7 diagrams now reference pre-rendered SVG files from docs/diagrams/:

  diagram-1 → Architecture (BASE + Clusters)
  diagram-2 → Case Flow
  diagram-3 → Document Ingestion
  diagram-4 → Entity Resolution
  diagram-5 → Reasoning Loop
  diagram-6 → Privacy Gateway
  diagram-7 → Audit Chain

Removed Mermaid CDN loader script and unused .mermaid CSS.
Applied to docs/ire-builder-guide.html, root HTML, and releases/v1.2.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 268cf4e commit 5ec346e

3 files changed

Lines changed: 24 additions & 564 deletions
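
The commit message makes two checkable claims: the Mermaid runtime is gone, and all 7 diagrams point at pre-rendered SVG files. A minimal sketch of a check for both follows. The function name `check_html` is hypothetical, and it assumes the SVG paths resolve relative to a directory you pass in, as the `src="diagrams/diagram-N.svg"` attributes in the diff suggest; nothing here is part of the commit itself.

```python
import re
from pathlib import Path

# Matches the src of the <img> tags this commit adds, e.g. diagrams/diagram-3.svg
IMG_SRC = re.compile(r'<img\s+src="(diagrams/diagram-(\d+)\.svg)"')


def check_html(html: str, base_dir: Path, expected: int = 7) -> list[str]:
    """Return a list of problems found; an empty list means the page looks clean."""
    problems = []
    # Any surviving mention of "mermaid" means the loader or its CSS was not fully removed.
    if re.search(r"mermaid", html, flags=re.IGNORECASE):
        problems.append("leftover Mermaid reference")
    # Map diagram number -> referenced path.
    found = {int(num): src for src, num in IMG_SRC.findall(html)}
    for n in range(1, expected + 1):
        if n not in found:
            problems.append(f"diagram-{n}.svg is not referenced")
        elif not (base_dir / found[n]).is_file():
            problems.append(f"{found[n]} is missing on disk")
    return problems
```

Calling it once per changed HTML file, with `base_dir` set to that file's own directory, would also show whether the root `IRE_Builder_Guide.html` resolves `diagrams/...` to files that exist, given that the message places the SVGs under `docs/diagrams/`.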

IRE_Builder_Guide.html

Lines changed: 8 additions & 188 deletions
@@ -4,30 +4,7 @@
 <meta charset="UTF-8">
 <meta name="viewport" content="width=device-width, initial-scale=1.0">
 <title>IRE Builder Guide — Institutional Reasoning Engines</title>
-<script>
-// Inline Mermaid loader — loads from CDN after page ready
-window.addEventListener('load', function() {
-var s = document.createElement('script');
-s.src = 'https://cdnjs.cloudflare.com/ajax/libs/mermaid/10.6.1/mermaid.min.js';
-s.onload = function() {
-mermaid.initialize({
-startOnLoad: false,
-theme: 'default',
-themeVariables: {
-primaryColor: '#1B3A5C',
-primaryTextColor: '#ffffff',
-primaryBorderColor: '#2E6DA4',
-lineColor: '#2E6DA4',
-secondaryColor: '#D5E8F0',
-tertiaryColor: '#F4F6F9'
-},
-flowchart: { htmlLabels: true, curve: 'basis' }
-});
-mermaid.run({ querySelector: '.mermaid' });
-};
-document.head.appendChild(s);
-});
-</script>
+
 <style>
 :root {
 --navy:#1B3A5C; --mid:#2E6DA4; --light:#D5E8F0;
@@ -130,7 +107,6 @@
 box-shadow:0 2px 7px rgba(0,0,0,.07);overflow-x:auto}
 .dt{font-size:12px;font-weight:700;color:var(--mid);margin-bottom:14px;
 letter-spacing:.5px;text-transform:uppercase}
-.mermaid{min-height:80px}
 
 /* Steps */
 .steps{margin:16px 0}
@@ -260,37 +236,7 @@ <h2 class="sh">What It Is vs What It Is Not</h2>
 <h1 class="page-title">Architecture <span>at a Glance</span></h1>
 <p class="page-sub">Two layers: a mandatory base every deployment must include, and modular clusters that activate based on your investigation context.</p>
 <div class="dw"><div class="dt">Full IRE Architecture — Base + 7 Clusters</div>
-<div class="mermaid">
-graph TD
-subgraph BASE["🔒 MANDATORY BASE — Always Active"]
-B1[B1 Case Isolation] --- B2[B2 Retrieval Verifier]
-B2 --- B3[B3 Evidence Grounder]
-B3 --- B4[B4 Human Review Gate]
-B4 --- B5[B5 Audit Logger]
-B5 --- B6[B6 Hash Chain]
-B6 --- B7[B7 Model Pinning]
-B7 --- B8[B8 Bias Monitor]
-end
-subgraph CLUSTERS["📦 MODULAR CLUSTERS"]
-CA[Cluster A\nDocumentary]
-CB[Cluster B\nBehavioural]
-CC[Cluster C\nNetwork]
-CD[Cluster D\nReasoning]
-CE[Cluster E\nPrivacy]
-CF[Cluster F\nMemory]
-CG[Cluster G\nIntegrity]
-end
-BASE --> CLUSTERS
-CA --> CC
-CA --> CD
-CB --> CC
-CC --> CD
-CD --> CE
-CD --> CF
-BASE --> CG
-style BASE fill:#1B3A5C,color:#fff,stroke:#2E6DA4
-style CLUSTERS fill:#EBF5FB,stroke:#2E6DA4,color:#1B3A5C
-</div></div>
+<img src="diagrams/diagram-1.svg" alt="Diagram 1" style="width:100%;max-width:100%;display:block;"></div>
 <h2 class="sh">The 8 Mandatory Base Components</h2>
 <div class="tw"><table>
 <tr><th>#</th><th>Component</th><th>Plain English</th><th>Why Non-Negotiable</th></tr>
@@ -504,33 +450,7 @@ <h2 class="sh">Phase 6 — Report Generation + Go Live (Month 10+)</h2>
 <h1 class="page-title">Workflow: <span>End-to-End Case Flow</span></h1>
 <p class="page-sub">How a case moves through the full IRE system — from opening to locked report.</p>
 <div class="dw"><div class="dt">Full Case Lifecycle</div>
-<div class="mermaid">
-flowchart TD
-A([Investigator Opens Case]) --> B[Case Namespace Created\nAudit Chain Initialised]
-B --> C[Evidence Upload]
-C --> D{Evidence Type?}
-D -->|Documents| E[Cluster A: Ingest → Chunk → Embed → Index]
-D -->|Transcripts| F[Cluster B: Transcribe → Chunk → PII Scrub → Index]
-E --> G[Entity Extraction]
-F --> G
-G --> H[Cluster C: Entity Resolution HITL]
-H --> I[Graph Build: Nodes + Edges]
-I --> J[Cluster D: Recursive Reasoning Loop]
-J --> K{Score >= 0.80?}
-K -->|No| L[Iterate: New Queries]
-L --> J
-K -->|Yes| M[Draft Report Generated in 60-90s]
-M --> N[Human Review Gate:\nSection-by-Section Approval]
-N --> O{All Approved?}
-O -->|No| P[Investigator Edits]
-P --> N
-O -->|Yes| Q[Final Report Locked\nCitation Map Appended]
-Q --> R[Audit Chain Complete:\nReport Hash Stored]
-R --> S([Case Closed])
-style A fill:#1B3A5C,color:#fff
-style S fill:#27AE60,color:#fff
-style N fill:#D4A017,color:#fff
-</div></div>
+<img src="diagrams/diagram-2.svg" alt="Diagram 2" style="width:100%;max-width:100%;display:block;"></div>
 <h2 class="sh">Investigator Accountability Schema</h2>
 <p>Logging that a human approved a finding is necessary but not sufficient. For outputs to be genuinely audit-ready, the approval record must capture the quality and basis of the human judgment — not just the fact that it occurred.</p>
 <div class="tw"><table>
@@ -565,28 +485,7 @@ <h2 class="sh">What Happens at Each Stage</h2>
 <h1 class="page-title">Workflow: <span>Document Ingestion</span></h1>
 <p class="page-sub">How a document goes from upload to queryable evidence in the case namespace.</p>
 <div class="dw"><div class="dt">Cluster A — Document Ingestion Pipeline</div>
-<div class="mermaid">
-flowchart LR
-A([File Upload]) --> B{File Type?}
-B -->|PDF text| C[PyMuPDF]
-B -->|PDF scanned| D[Tesseract OCR]
-B -->|Excel/CSV| E[pandas]
-B -->|Email| F[extract-msg]
-B -->|Word| G[python-docx]
-C --> H[Chunker]
-D --> H
-E --> H
-F --> H
-G --> H
-H --> I[Document-Type Strategy]
-I --> J[Metadata Tagging\nchunk_id · source · page_ref\nentity_tags · date_range]
-J --> K[Embedding Model\nBGE-M3 via Ollama]
-K --> L[Qdrant Indexer\nCase-Scoped Namespace]
-L --> M[Entity Extract\nFeed to Resolution Queue]
-L --> N([Queryable Evidence])
-style A fill:#1B3A5C,color:#fff
-style N fill:#27AE60,color:#fff
-</div></div>
+<img src="diagrams/diagram-3.svg" alt="Diagram 3" style="width:100%;max-width:100%;display:block;"></div>
 <div class="callout warn"><div class="cl">Most Common Ingestion Failures</div>
 <p><strong>Scanned PDFs without OCR layer:</strong> If PyMuPDF returns no text, auto-route to Tesseract. <strong>Password-protected files:</strong> Require investigator to decrypt before upload. <strong>Non-standard CSV encodings:</strong> Detect encoding with chardet before parsing.</p></div>
 </div>
@@ -596,33 +495,7 @@ <h1 class="page-title">Workflow: <span>Document Ingestion</span></h1>
 <h1 class="page-title">Workflow: <span>Entity Resolution</span></h1>
 <p class="page-sub">How the system determines "Rajesh Kumar", "R. Kumar", and "RJSH_KMR" are the same person — and what happens when it is not sure.</p>
 <div class="dw"><div class="dt">Three-Tier Entity Resolution</div>
-<div class="mermaid">
-flowchart TD
-A([Raw Entities from Parser]) --> B[Tier 1: Exact Match\nNational ID · Account No · Tax ID]
-B --> C{Exact match found?}
-C -->|Yes| D[Auto-Resolve: Merge Nodes\nLog to Audit Chain]
-D --> E[Notify Investigator\n24hr Override Window]
-C -->|No| F[Tier 2: Fuzzy Match\nName variants · Address · Phonetic]
-F --> G{Confidence?}
-G -->|60-94%| H[BLOCKING: HITL Screen\nEvidence For + Against + Score]
-H --> I{Investigator Decision}
-I -->|Approve| J[Merge + Log]
-I -->|Reject| K[Keep Separate + Log]
-I -->|Defer| L[Add to Queue]
-G -->|Below 60%| M[Tier 3: Contextual Match\nShared director · address · agent]
-M --> N[BLOCKING: HITL Screen\nThree Options]
-N --> O{Decision}
-O -->|Merge| J
-O -->|Separate| K
-O -->|Alias Edge| P[Link as Related Party\nNot Merged + Log]
-J --> Q([Graph Build Proceeds])
-K --> Q
-P --> Q
-style A fill:#1B3A5C,color:#fff
-style Q fill:#27AE60,color:#fff
-style H fill:#D4A017,color:#fff
-style N fill:#D4A017,color:#fff
-</div></div>
+<img src="diagrams/diagram-4.svg" alt="Diagram 4" style="width:100%;max-width:100%;display:block;"></div>
 <div class="callout danger"><div class="cl">Critical: Do Not Skip Entity Resolution</div>
 <p>A 10% entity duplication rate in a 500-node graph produces 50 phantom nodes — enough to break circular flow detection entirely. Build Tier 2/3 resolution before building the graph layer.</p></div>
 </div>
@@ -632,29 +505,7 @@ <h1 class="page-title">Workflow: <span>Entity Resolution</span></h1>
 <h1 class="page-title">Workflow: <span>Recursive Reasoning Loop</span></h1>
 <p class="page-sub">How the AI forms a hypothesis, tests it against evidence, and iterates until confident or flagged. This is Cluster D — the reasoning engine.</p>
 <div class="dw"><div class="dt">Cluster D — Recursive Reasoning Loop</div>
-<div class="mermaid">
-flowchart TD
-A([Initialise: Case State + Goal + Model Version]) --> B[PLAN\nWhat evidence do I need next?]
-B --> C[RETRIEVE\nRAG Query + Graph Query]
-C --> D[ANALYZE\n70B Agent temp=0\nRetrieval-only mode]
-D --> E[VERIFY — DETERMINISTIC\nDoes chunk exist? Is claim recoverable?]
-E --> F{Claim passes?}
-F -->|Yes| G[GROUND\nAttach chunk_id + source + page_ref]
-F -->|No| H[Strip Claim\nLog to Unverified Register]
-G --> I[CRITIQUE\nLogical consistency + Gaps]
-H --> I
-I --> J[SCORE\nUpdate 4-component Evidence Score]
-J --> K{Score >= threshold\nAND >= 3 citations?}
-K -->|No| L{Ceiling reached?}
-L -->|No| B
-L -->|Yes| M[Flag: Send to Human Review]
-K -->|Yes| N[Draft Finding + Citation Map]
-N --> O([Human Review Gate])
-style A fill:#1B3A5C,color:#fff
-style O fill:#D4A017,color:#fff
-style M fill:#C0392B,color:#fff
-style E fill:#2E6DA4,color:#fff
-</div></div>
+<img src="diagrams/diagram-5.svg" alt="Diagram 5" style="width:100%;max-width:100%;display:block;"></div>
 <h2 class="sh">Evidence Score Thresholds</h2>
 <div class="cards">
 <div class="card"><h4>0.80 — Section Lock</h4><p>Individual report sections lock at this score. Still requires minimum 3 citations.</p></div>
@@ -668,25 +519,7 @@ <h2 class="sh">Evidence Score Thresholds</h2>
 <h1 class="page-title">Workflow: <span>Privacy Gateway</span></h1>
 <p class="page-sub">How PII is removed before data reaches any external AI, and restored only after investigator approval — inside your environment.</p>
 <div class="dw"><div class="dt">Cluster E — Pseudonymisation Pipeline</div>
-<div class="mermaid">
-flowchart TD
-A([Raw Case Data with PII]) --> B[Presidio NER + Custom Recognisers\nDetect: names · IDs · accounts · spoken refs]
-B --> C[Token Map Store\nIn-memory · Encrypted · Never leaves environment]
-C --> D[Pseudonymised Payload\nNames to IND_T001 · Accounts to ACCT_T089\nAmounts/dates/types RETAINED]
-D --> E[External SOTA API\nReasons on tokens only]
-E --> F[Tokenised Reasoning Output]
-F --> G[Human Checkpoint\nCheck for re-identification risk]
-G --> H{Investigator Approves?}
-H -->|No| I[Edit or Reject + Log]
-H -->|Yes| J[De-tokeniser\nRestores real identifiers\nON-PREMISE ONLY]
-J --> K[Final Report with Real Names]
-K --> L[Audit Log: pseudonymisation + approval + de-tokenisation]
-L --> M([Complete])
-style A fill:#1B3A5C,color:#fff
-style M fill:#27AE60,color:#fff
-style G fill:#D4A017,color:#fff
-style E fill:#2E6DA4,color:#fff
-</div></div>
+<img src="diagrams/diagram-6.svg" alt="Diagram 6" style="width:100%;max-width:100%;display:block;"></div>
 <div class="callout warn"><div class="cl">Known Limitation — Inference Re-Identification</div>
 <p>Pseudonymisation controls what enters the AI — not what it infers. A model given pseudonymised data may produce output that, combined with other information, re-identifies a data subject. The Human Checkpoint is the primary control. Get legal counsel to review your approach before going live.</p></div>
 </div>
@@ -696,20 +529,7 @@ <h1 class="page-title">Workflow: <span>Privacy Gateway</span></h1>
 <h1 class="page-title">Workflow: <span>Audit Chain</span></h1>
 <p class="page-sub">How every action is permanently recorded and cryptographically linked — making tampering mathematically detectable.</p>
 <div class="dw"><div class="dt">Hash-Chain Immutability Design</div>
-<div class="mermaid">
-flowchart LR
-A[Event 1: Case Open\npayload_hash: a1b2\nchain_hash: SHA256] --> B[Event 2: File Ingestion\npayload_hash: c3d4\nchain_hash: SHA256 + prev]
-B --> C[Event 3: Entity Resolution\npayload_hash: e5f6\nchain_hash: SHA256 + prev]
-C --> D[Event N: Human Approval\npayload_hash: g7h8\nchain_hash: SHA256 + prev]
-D --> E{Cluster G Active?}
-E -->|Yes| F[AWS QLDB\nExternal Timestamp]
-E -->|No| G[Institutional-grade\nTamper detectable internally]
-F --> H([Independently Verifiable Audit Integrity\nIndependently Verifiable])
-G --> I([Institutional-Grade Integrity])
-style A fill:#1B3A5C,color:#fff
-style H fill:#27AE60,color:#fff
-style I fill:#2E6DA4,color:#fff
-</div></div>
+<img src="diagrams/diagram-7.svg" alt="Diagram 7" style="width:100%;max-width:100%;display:block;"></div>
 <h2 class="sh">What Gets Logged</h2>
 <div class="tw"><table>
 <tr><th>Event</th><th>Key Fields</th><th>Why It Matters</th></tr>
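
The removed diagram-7 source labels each event with a `payload_hash` and a `chain_hash` described as "SHA256 + prev". A minimal sketch of that linking idea follows, for readers who now see only the static SVG. It assumes JSON-serialised payloads, an all-zero genesis value, and hypothetical helper names (`append_event`, `verify_chain`); the guide's actual record schema is not shown in this diff.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed stand-in for the "previous hash" of the first event


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def _payload_hash(event: str, payload: dict) -> str:
    # Hash the event label together with the payload so neither can change unnoticed.
    return _sha256(json.dumps({"event": event, "payload": payload}, sort_keys=True))


def append_event(chain: list[dict], event: str, payload: dict) -> dict:
    """Append a record whose chain_hash covers its own payload_hash and the previous chain_hash."""
    prev = chain[-1]["chain_hash"] if chain else GENESIS
    payload_hash = _payload_hash(event, payload)
    record = {
        "event": event,
        "payload": payload,
        "payload_hash": payload_hash,
        "chain_hash": _sha256(prev + payload_hash),
    }
    chain.append(record)
    return record


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash from the first record; any edited record or link fails."""
    prev = GENESIS
    for record in chain:
        payload_hash = _payload_hash(record["event"], record["payload"])
        if payload_hash != record["payload_hash"]:
            return False
        if _sha256(prev + payload_hash) != record["chain_hash"]:
            return False
        prev = record["chain_hash"]
    return True
```

This only gives internal tamper detection: someone who can rewrite the whole chain can also recompute every hash. The external timestamping branch shown in the old diagram (Cluster G) addresses that case and is outside this sketch.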
