Evidentia

A comprehensive medical fact-checking skill for Claude Code.

Evaluates any medical content — research papers, news articles, social media posts, YouTube/podcast transcripts, conference slides, clinical guidelines, pharma marketing, patient leaflets, AI-generated text, and more — across 15 criteria, then generates a structured Markdown report with an overall A–F score and actionable improvement suggestions.

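Summary (translated from Japanese): A Claude Code skill for fact-checking and critically appraising medical information. It evaluates any medical content (papers, articles, social media posts, videos and podcasts, conference slides, clinical guidelines, pharma marketing materials, patient leaflets, AI-generated content, and more) across 15 items and generates a structured report.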

Features

15-item evaluation framework — covers evidence quality, citation accuracy, statistics, causation, bias, ethics, and more
AI hallucination detection — cross-references DOIs against actual publications to catch fabricated citations (4-tier classification)
12 media types supported: research papers, news, social media, newsletters, patient materials, video/podcast transcripts, slides, guidelines, marketing, health apps and AI-generated content, textbooks, infographics
Media-adaptive evaluation — automatically adjusts evaluation criteria weights based on content type
Public health risk assessment — flags content with LOW / MEDIUM / HIGH misinformation risk
Structured report generation — produces a Markdown report with an A–F overall score, per-item ratings, and concrete fix suggestions
Post-correction re-verification — re-evaluates articles after edits to confirm issues are resolved (Step 9)
Multi-language support — evaluates content in its original language

Supported Media Types

Category	Examples	Key Focus
Research papers	Journal articles, preprints, systematic reviews	Evidence level, methodology, statistical rigor
News & articles	Health news, medical blogs, magazine articles	Accuracy of claims, source attribution, exaggeration
Social media	X (Twitter), Instagram, TikTok, Reddit, note	Brevity-induced omissions, clickbait, misinformation risk
Newsletters	Email newsletters, Substack, medical columns	Citation completeness, audience calibration
Patient materials	Leaflets, brochures, hospital handouts	Readability, completeness, fear-mongering
Video/audio transcripts	YouTube, podcasts, webinar transcripts	Verbal exaggeration, missing nuance, source attribution
Presentations	Conference slides, lecture materials, grand rounds	Slide oversimplification, citation on slides
Clinical guidelines	Practice guidelines, protocols, algorithms	AGREE II compliance, evidence grading, COI
Marketing materials	Pharma ads, device brochures, supplement claims	Regulatory compliance, selective data, COI
Health apps & digital	App descriptions, chatbot outputs, AI-generated content	Hallucination detection, accuracy of automated advice
Textbooks & education	Textbook chapters, CME/CPD materials	Currency, completeness, pedagogical accuracy
Infographics	Visual summaries, data visualizations, social cards	Data integrity, oversimplification, source attribution

Evaluation Criteria

#	Item	Description
1	Evidence level & study design	Quality of RCTs, meta-analyses, observational studies
2	Citation & source accuracy	DOI cross-check, hallucination detection
3	Statistical interpretation	Relative vs. absolute risk, p-values, effect sizes
4	Causation vs. correlation	Validity of causal claims
5	Bias & conflicts of interest	COI disclosure, publication bias
6	Exaggeration & overclaiming	Clickbait, overgeneralization
7	Target population fit	Match between study population and audience
8	Temporal validity	Currency of information, guideline alignment
9	Jargon–readability balance	Terminology appropriate for the target audience
10	Ethical considerations	Stigma avoidance, fear-mongering detection
11	Logical consistency	Coherence between claims and evidence
12	Images & figures	Data visualization integrity and sourcing
13	Alternative explanations	Balanced presentation of competing viewpoints
14	Clinical relevance	Real-world applicability and significance
15	Information completeness	Coverage of risks, benefits, and alternatives
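
As an illustration of item 3, the sketch below shows why a reviewer separates relative from absolute risk. The event counts are hypothetical and the helper name `risk_summary` is not part of the skill; it only computes the standard quantities (absolute risk reduction, relative risk reduction, number needed to treat).

```python
def risk_summary(events_control, n_control, events_treated, n_treated):
    """Compute basic risk measures for a two-arm comparison."""
    if n_control <= 0 or n_treated <= 0:
        raise ValueError("group sizes must be positive")
    risk_control = events_control / n_control
    risk_treated = events_treated / n_treated
    arr = risk_control - risk_treated  # absolute risk reduction
    # Relative risk reduction is undefined when the control risk is zero.
    rrr = arr / risk_control if risk_control > 0 else float("nan")
    # Number needed to treat is only meaningful when treatment lowers risk.
    nnt = 1 / arr if arr > 0 else float("inf")
    return {
        "risk_control": risk_control,
        "risk_treated": risk_treated,
        "arr": arr,
        "rrr": rrr,
        "nnt": nnt,
    }


# Hypothetical numbers: 20 of 1000 controls vs. 10 of 1000 treated have the event.
summary = risk_summary(20, 1000, 10, 1000)
print(f"RRR = {summary['rrr']:.0%}, ARR = {summary['arr']:.1%}, NNT = {summary['nnt']:.0f}")
```

With these made-up counts, a headline such as "halves the risk" (relative reduction of 50%) corresponds to an absolute reduction of one percentage point, or roughly 100 people treated per event avoided.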

Scoring

Each item is rated Excellent / Good / Fair / Poor. The overall score is derived as follows:

Score	Criteria
A	12+ Excellent, 0 Poor
B	12+ Excellent or Good, ≤1 Poor
C	12+ Fair or better, ≤2 Poor
D	3+ Poor
F	5+ Poor, or critical ethical issues
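
The table above can be read as a small decision procedure. The sketch below is one possible reading, not the skill's actual logic: it assumes the rules are checked from most severe to least severe (so 5 or more Poor ratings yield F rather than D), and the function name `overall_score` and the `critical_ethical_issue` flag are hypothetical.

```python
from collections import Counter

RATINGS = ("Excellent", "Good", "Fair", "Poor")


def overall_score(ratings, critical_ethical_issue=False):
    """Map 15 per-item ratings to an A-F grade (assumed precedence: F, D, A, B, C)."""
    if len(ratings) != 15:
        raise ValueError("expected exactly 15 item ratings")
    unknown = set(ratings) - set(RATINGS)
    if unknown:
        raise ValueError(f"unknown ratings: {sorted(unknown)}")

    counts = Counter(ratings)
    poor = counts["Poor"]
    excellent = counts["Excellent"]
    good_or_better = excellent + counts["Good"]

    if critical_ethical_issue or poor >= 5:
        return "F"
    if poor >= 3:
        return "D"
    if excellent >= 12 and poor == 0:
        return "A"
    if good_or_better >= 12 and poor <= 1:
        return "B"
    # With 15 items and at most 2 Poor, at least 13 items are Fair or better,
    # so the C condition (12+ Fair or better, at most 2 Poor) always holds here.
    return "C"


print(overall_score(["Excellent"] * 12 + ["Good"] * 3))
```

Under this reading every combination of 15 ratings maps to exactly one grade.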

Installation

Prerequisites

Claude Code installed and working

Setup

# 1. Clone this repository
git clone https://github.com/kgraph57/evidentia.git

# 2. Copy to Claude Code skills directory
mkdir -p ~/.claude/skills/medical-fact-check
cp -r evidentia/SKILL.md ~/.claude/skills/medical-fact-check/
cp -r evidentia/references ~/.claude/skills/medical-fact-check/
cp -r evidentia/templates ~/.claude/skills/medical-fact-check/

That's it. Claude Code automatically discovers skills in ~/.claude/skills/.
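
If the skill does not seem to load, a quick check that the copy step put everything in place can help. The script below is a convenience sketch, not part of the repository: the helper `missing_skill_files` is hypothetical, and the expected paths are taken from the File Structure section of this README.

```python
from pathlib import Path

# Relative paths as listed in the File Structure section.
EXPECTED_FILES = (
    "SKILL.md",
    "references/checklist.md",
    "references/evidence-levels.md",
    "templates/report-template.md",
)


def missing_skill_files(skill_dir):
    """Return the expected files that are absent from skill_dir."""
    root = Path(skill_dir).expanduser()
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_skill_files("~/.claude/skills/medical-fact-check")
    print("All files present" if not missing else f"Missing: {missing}")
```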

How to Use (Invocation)

The skill activates automatically when Claude Code detects a fact-checking intent. There are several ways to invoke it:

Trigger Phrases (English)

Type any of these in the Claude Code chat:

Fact-check this article

Check the evidence in this post

Evaluate this medical content

Is this health claim accurate?

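Japanese trigger examples: 「ファクトチェックして」 ("fact-check this"), 「エビデンスチェック」 ("evidence check"), 「この記事を評価して」 ("evaluate this article"), 「この投稿の問題点を教えて」 ("tell me the problems with this post"), 「この医学情報を確認して」 ("check this medical information")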

Input Methods

You can provide content in several ways:

1. Paste text directly

Fact-check this article:

[paste your article text here]

2. Provide a file path

Fact-check this file: ~/Documents/my-article.md

3. Provide a URL

Fact-check this: https://example.com/health-article

4. Provide a video/podcast transcript

Fact-check this YouTube transcript: [paste transcript]

Output

A structured Markdown report is saved to the working directory:

medical-fact-check-report-YYYY-MM-DD.md

The report includes:

Overall A–F score and public health risk level
Per-item ratings with specific issues and suggestions
Citation verification results (4-tier classification)
Before/after correction examples
References used during evaluation

Post-Correction Re-Check (Step 9)

After fixing issues, ask Claude Code to re-evaluate:

Re-check the corrected article: ~/Documents/my-article-v2.md

The updated report is saved with a -rev2 suffix.
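
For scripts that collect reports, the naming scheme can be sketched as follows. This is an assumption-laden illustration: the README only states the base pattern and the -rev2 suffix, so placing the suffix between the date and the extension, and continuing with -rev3 and beyond, are guesses, and `report_filename` is a hypothetical helper.

```python
from datetime import date


def report_filename(on=None, revision=1):
    """Build a report file name; revision 1 is the initial report."""
    if revision < 1:
        raise ValueError("revision must be 1 or greater")
    day = (on or date.today()).isoformat()  # YYYY-MM-DD
    # Assumed placement: suffix goes after the date, before the extension.
    suffix = "" if revision == 1 else f"-rev{revision}"
    return f"medical-fact-check-report-{day}{suffix}.md"


print(report_filename(date(2025, 1, 31)))
print(report_filename(date(2025, 1, 31), revision=2))
```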

AI Hallucination Detection

A key feature of Evidentia is its ability to detect fabricated citations commonly found in AI-generated medical content.

Citations are classified into 4 tiers:

Tier	Description
Verified	Paper exists and content matches the citation
Content mismatch	Paper exists but is cited out of context
Bibliographic mismatch	Paper exists but DOI, author, or journal info is wrong
Hallucination	DOI points to an unrelated paper, or the paper does not exist at all

Rather than stopping at "could not verify," the skill actively cross-references DOIs to determine whether a citation is merely unverifiable or provably fabricated.
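
The four tiers can be pictured as a decision over three yes/no findings from the lookup. The sketch below is illustrative only: the names `Tier`, `CitationCheck`, and `classify` are hypothetical, and the order of checks (existence, then bibliographic details, then content) is an assumption rather than the skill's documented behavior. In this reading, a DOI that resolves to an unrelated paper counts as "paper does not exist" only when no real paper matching the cited title and authors can be found.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    VERIFIED = "Verified"
    CONTENT_MISMATCH = "Content mismatch"
    BIBLIOGRAPHIC_MISMATCH = "Bibliographic mismatch"
    HALLUCINATION = "Hallucination"


@dataclass
class CitationCheck:
    paper_exists: bool      # a real paper matching the cited work was found
    metadata_matches: bool  # DOI, authors, and journal agree with the real record
    content_matches: bool   # the paper supports the claim it is cited for


def classify(check):
    """Assign one of the four tiers (assumed precedence, most severe first)."""
    if not check.paper_exists:
        return Tier.HALLUCINATION
    if not check.metadata_matches:
        return Tier.BIBLIOGRAPHIC_MISMATCH
    if not check.content_matches:
        return Tier.CONTENT_MISMATCH
    return Tier.VERIFIED


print(classify(CitationCheck(True, True, False)).value)
```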

Workflow (9 Steps)

1. Acquire & analyze: identify content type, media format, audience, main claims, and public health risk level
2. Load checklist: read the 15-item evaluation criteria
3. Assess evidence levels: apply GRADE methodology where applicable
4. Verify citations: search DOI/PMID, cross-check against originals, detect hallucinations
5. Detailed evaluation: rate each of the 15 items with media-specific adjustments
6. Determine overall score: aggregate item ratings into A–F, assign risk level
7. Generate report: produce structured Markdown from the template
8. Deliver report: save file and summarize findings
9. Post-correction re-verification (optional): re-evaluate after article revisions

File Structure

evidentia/
├── SKILL.md                    # Main skill definition (9-step workflow + media-specific handling)
├── references/
│   ├── checklist.md            # Detailed 15-item evaluation checklist with media-specific notes
│   └── evidence-levels.md      # Evidence hierarchy, GRADE, & quality assessment tools
└── templates/
    └── report-template.md      # Report template (9 sections incl. citation verification)

Customization

Evaluation criteria

Edit references/checklist.md to add domain-specific check items (e.g., oncology-specific criteria, drug interaction checks).

Report format

Edit templates/report-template.md to modify section structure or add custom sections.

Evidence levels

Edit references/evidence-levels.md to add specialty-specific assessment standards (e.g., pediatrics, cardiology, emergency medicine, mental health).

Limitations

This is an AI-based evaluation and does not replace expert medical judgment
Full-text review of cited papers is limited to what is accessible via web search (abstracts, open-access articles, bibliographic metadata)
Image, video, and audio evaluation is limited to text-based analysis (transcripts, captions)
Rapidly evolving fields may have evidence not yet indexed
Final medical decisions should always be made by qualified healthcare professionals

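Disclaimer (translated from Japanese): This skill provides an AI-based evaluation and does not replace the judgment of medical professionals. Final medical decisions should be made by qualified healthcare professionals.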

Contributing

Bug reports and feature requests are welcome via Issues.

License

MIT License
