
Fact-Checking Workflow Documentation

This document describes the step-by-step fact-checking pipeline, explaining what each step does and the parameters available for customization.


Quick Overview

The workflow processes a statement provided by the user through 11 main steps:

flowchart LR
    A[Statement] --> B[Critical Question Generation]
    B --> C[Critical Question Refinement]
    C --> D[Sources Generation]
    D --> E[Source Collection]
    E --> F[Source Refinement]
    F --> G[Download Sources]
    G --> H[RAG Processing]
    H --> I[Article Generation]
    I --> J[Critical Question Answer Generation]
    J --> K[Metadata Generation]

    A --> L[Image Generation]

Examples

We provide three different 📓 Jupyter notebooks that demonstrate how to use the fact-checking software:

  • 📓 notebooks/run_workflow.ipynb: Runs the full pipeline in a single call based on a workflow configuration file.
  • 📓 notebooks/test_stepwiseworkflow-local.ipynb: Runs the workflow manually, step by step, which lets you inspect the output of each step and iterate on it if needed.
  • 📓 notebooks/test_stepwiseworkflow.ipynb: Runs the workflow manually, step by step, using a locally hosted model (Latxa 8B).

Pipeline Configuration

Pipeline configurations are YAML files that define which models and parameters to use for each step of the workflow. Configuration files are stored in configs/pipeline_configs/. A configuration file is necessary if you want to run the pipeline in a single step, as demonstrated in notebooks/run_workflow.ipynb, or if you want to use the evaluator.

Available Configurations

| Config File | Description |
| --- | --- |
| gemini_rag.yaml | Uses Gemini models with RAG enabled |
| gemini_full.yaml | Uses Gemini models without RAG |
| gpt_rag.yaml | Uses GPT models with RAG enabled |
| gpt_full.yaml | Uses GPT models without RAG |
| latxa8b_rag.yaml | Uses local Latxa 8B model with RAG |
| latxa70b_rag.yaml | Uses local Latxa 70B model with RAG |

Configuration Structure

# Workflow type
workflow_type: "FactCheckingWithPipelineWorkflow"

# Model configuration - specify different models for each agent
models:
  critical_question_agent: "google/gemini-2.5-pro"    # Critical question generation
  gen_searches_agent: "google/gemini-2.5-flash"      # Search query generation
  web_search_provider: "serper"                       # Web search API provider
  embedding: "gemini-embedding-001"                   # Embedding model for RAG
  question_agent: "google/gemini-2.5-flash"          # Question answering
  article_writer: "google/gemini-2.5-flash"          # Article generation
  metadata_agent: "google/gemini-2.5-flash"          # Metadata extraction
  image_prompt: "google/gemini-2.5-flash"            # Image description
  image_model: "flux-replicate"                       # Image generation

# Web search configuration
web_search:
  top_k: 5                    # Results per query
  max_searches: 5             # Maximum search queries
  ban_domains: null           # Domains to exclude (or list of domains)
  download_text: true         # Download full page content

# RAG configuration
rag:
  do_rag: true                # Enable/disable RAG
  chunk_size: 128             # Characters per chunk
  chunk_overlap: 0            # Overlap between chunks
  split_separators: ["\n"]    # Text splitting separators
  fp16_storage: true          # Use float16 for memory efficiency
  l2_normalise: true          # L2-normalize embeddings
  top_k: 3                    # Passages to retrieve per query
  get_scores: false           # Include similarity scores

# Question generation
questions:
  max_questions: 5            # Maximum critical questions

# Image generation
image:
  size: "1280x256"            # Output image size
  style: "Digital art..."     # Style guidance for the image

# General settings
general:
  max_history_size: null      # History limit (null = unlimited)
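As a quick illustration, a configuration file can be inspected with PyYAML before launching the pipeline (a minimal sketch; the actual runner that consumes these files is demonstrated in notebooks/run_workflow.ipynb):

import yaml

# Load one of the provided pipeline configurations
with open("configs/pipeline_configs/gemini_rag.yaml") as f:
    config = yaml.safe_load(f)

print(config["workflow_type"])              # "FactCheckingWithPipelineWorkflow"
print(config["models"]["article_writer"])   # e.g., "google/gemini-2.5-flash"
print(config["rag"]["do_rag"])              # True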

Project Structure

The fact-checking pipeline follows a modular, agent-based architecture. Each workflow step is implemented as an independent agent with well-defined inputs, outputs, and prompts.

Agent Architecture

All agents are defined in veridika/src/agents/ and inherit from a common BaseAgent class:

veridika/src/agents/
├── baseagent.py              # Base class with common functionality
├── CriticalQuestionAgent.py  # Steps 1-2: Question generation/refinement
├── GenSearchesAgent.py       # Steps 3, 5: Search query generation/refinement
├── WebSearchAgent.py         # Step 4: Web search execution (Serper)
├── RagAgent.py               # Step 7: Retrieval-Augmented Generation
├── ArticleWriterAgent.py     # Step 8: Fact-check article generation
├── QuestionAgent.py          # Step 9: Critical question answering
├── MetadataAgent.py          # Step 10: Metadata extraction
├── ImagePromptAgent.py       # Step 11a: Image description generation
└── ImageGenAgent.py          # Step 11b: Image generation (Flux)

Key Features

Structured Outputs

Agents use Pydantic models to enforce structured JSON outputs from the LLM. This ensures consistent, parseable responses:

from typing import Literal

from pydantic import BaseModel

class Metadata(BaseModel):
    title: str
    categories: list[str]
    label: Literal["Fake", "True", "Undetermined"]
    main_claim: str

# LLM returns validated structured output 
response, cost = self.model(messages=conversation, pydantic_model=Metadata)

Cost Tracking

Every agent call returns a cost value that is automatically tracked:

# Each agent.run() returns: (result, cost, history_entry)
result, cost, history = agent.run(...)

# Access accumulated costs
print(agent.cost)           # Total cost
print(agent.get_stats())    # Detailed statistics

Execution History

Agents maintain a history of all operations for debugging and auditing:

history = agent.get_history()
# Returns: [{"date": "...", "run_time": 1.23, "cost": 0.001, "history_entry": {...}}, ...]

Async Execution

Agents support asynchronous execution for parallel processing:

import asyncio

# Agents can be called directly and awaited
result, cost = await agent(statement="...", langid="es")

# Or run multiple agents in parallel
results = await asyncio.gather(agent1(...), agent2(...))

Agent Interface

Each agent implements the following pattern:

| Method | Description |
| --- | --- |
| __init__(model_name, ...) | Initialize with model and configuration |
| run(*args, **kwargs) | Execute the agent's core logic |
| get_history() | Get execution history |
| get_stats() | Get performance statistics |
| reset() | Reset agent state and history |
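To make the pattern concrete, here is a minimal, self-contained sketch of an agent that follows this interface. It is an assumption about the shape of the pattern, not the actual BaseAgent implementation in veridika/src/agents/baseagent.py:

import time

class MinimalAgent:
    """Hypothetical agent illustrating the interface above (not the real BaseAgent)."""

    def __init__(self, model_name: str):
        self.model_name = model_name
        self.cost = 0.0
        self._history = []

    def run(self, *args, **kwargs):
        start = time.time()
        # A real agent would call the LLM here; we return placeholders instead
        result, cost, history_entry = "result", 0.0, {"args": args, "kwargs": kwargs}
        self.cost += cost
        self._history.append(
            {"run_time": time.time() - start, "cost": cost, "history_entry": history_entry}
        )
        return result, cost, history_entry

    def get_history(self):
        return self._history

    def get_stats(self):
        return {"total_cost": self.cost, "runs": len(self._history)}

    def reset(self):
        self.cost, self._history = 0.0, []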

Fact-Checking Steps

1. Critical Question Generation

Agent: CriticalQuestionAgent

Generates critical research questions to guide the fact-checking investigation. These questions identify what information is needed to verify or refute the statement.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| statement | str | The input statement to analyze and fact-check |
| model | str | LLM model name (e.g., "google/gemini-2.5-flash") |
| language | str | Language code for output (e.g., "es", "en") |
| location | str | Location context for regional relevance (e.g., "es") |

Example

Input:

statement = "Los coches eléctricos son más contaminantes que los coches de gasolina"
language = "es"
location = "es"

Output:

{
    "questions": [
        "¿Cuáles son las emisiones de carbono promedio en la producción de un coche eléctrico (incluyendo la batería) en comparación con la producción de un coche de gasolina en 2025?",
        "¿Qué estudios de análisis de ciclo de vida (LCA) publicados entre 2020 y 2025 comparan el impacto ambiental total de los coches eléctricos y los coches de gasolina?",
        "¿Cuál es la huella de carbono de la generación de electricidad para cargar un coche eléctrico en países con diferentes mezclas energéticas?"
    ]
}

2. Critical Question Refinement (Optional)

Agent: CriticalQuestionRefinementAgent

Allows users to refine the generated questions based on additional requirements or feedback. You can add new questions, modify existing ones, or request additional focus areas.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| questions | list[str] | Current list of critical research questions |
| input | str | Original statement being fact-checked |
| refinement | str | User feedback describing desired changes |
| language | str | Language code for output |
| location | str | Location context |
| model | str | LLM model name |

Example

Input:

refinement = "También quiero que consideres la contaminación del proceso de reciclaje de la batería"

Output:

{
    "questions": [
        "¿Cuáles son las emisiones de carbono promedio en la producción de un coche eléctrico...?",
        "¿Qué estudios de análisis de ciclo de vida (LCA)...?",
        "¿Cuál es el impacto ambiental y los niveles de contaminación asociados específicamente al proceso de reciclaje de las baterías de coches eléctricos en 2025?"
    ]
}

3. Sources Generation

Agent: GenSearchesAgent

Converts critical questions into optimized search queries. Because web searches are expensive, the agent condenses the critical questions into as few queries as possible and optimizes each search string so that the most relevant sources are found with the fewest searches.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| statement | str | Original statement being fact-checked |
| questions | list[str] | Critical research questions to guide searches |
| langid | str | Language code for search queries |
| override_prompt | str \| None | Optional custom prompt |
| override_pydantic_model | BaseModel \| None | Optional custom output model |
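The override_pydantic_model parameter lets you replace the default output schema with your own. A hedged sketch of what such a model could look like (the SearchQueries name and its field are assumptions, not the agent's built-in schema):

from pydantic import BaseModel

class SearchQueries(BaseModel):
    queries: list[str]   # one optimized search string per entry

# Hypothetical call: the agent would then return a validated SearchQueries instance
# result, cost = await gen_searches_agent(..., override_pydantic_model=SearchQueries)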

Example

Output (Search Queries):

emisiones carbono producción coche eléctrico vs gasolina 2025 estudio
análisis ciclo vida coche eléctrico gasolina 2020-2025 impacto ambiental
huella carbono electricidad coche eléctrico mezcla energética países europeos 2025
agencia europea medio ambiente impacto ambiental coches eléctricos vs gasolina
reciclaje baterías coche eléctrico impacto ambiental 2025 desafíos

4. Source Collection (Serper)

Agent: WebSearchAgent

Executes web searches using the generated queries and retrieves source metadata. We use the Serper API to get Google Search results. This API requires pre-paid credits, although when creating an account you get a generous amount of free credits. It is not possible to get high-quality web search results (Google, Bing, ...) locally without an API service, since the search engines block your IP after a few scraped searches, so a service like Serper is required.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| queries | list[str] | — | List of search queries to execute |
| language | str | "en" | Language code for search results |
| location | str \| None | None | Location code for regional results |
| top_k | int | 10 | Number of results per query |
| ban_domains | list[str] \| None | None | Domains to exclude from results |
| download_text | bool | True | Whether to download full text content |
| download_favicon | bool | True | Whether to download favicon |
| web_search_provider | str | "serper" | Search provider identifier |

Example

Output (Source Entry):

{
    "title": "El coche eléctrico es sostenible a partir de los 17.000 km",
    "link": "https://revista.dgt.es/es/motor/noticias/2025/...",
    "snippet": "El estudio estima en 63 gramos de CO2 por kilómetro las emisiones de un coche...",
    "favicon": "https://revista.dgt.es/Galerias/_config_/favicon.ico",
    "base_url": "https://revista.dgt.es"
}
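For instance, the ban_domains parameter excludes unwanted sites from the results. A minimal sketch of that filtering idea, applied to source entries shaped like the one above (an illustration, not the WebSearchAgent's internal code):

banned = ["example-tabloid.com"]   # hypothetical banned domain

def filter_sources(sources: list[dict], ban_domains: list[str]) -> list[dict]:
    """Drop any source whose base_url contains a banned domain."""
    return [
        s for s in sources
        if not any(domain in s["base_url"] for domain in ban_domains)
    ]

sources = [
    {"title": "...", "base_url": "https://revista.dgt.es"},
    {"title": "...", "base_url": "https://example-tabloid.com"},
]
print(filter_sources(sources, banned))  # keeps only the revista.dgt.es entry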

5. Source Refinement (Optional)

Agent: GenSearchesRefinementAgent

Allows users to request additional sources based on feedback. Generates new search queries to fill gaps in the current source collection.

Parameters

Parameter Type Description
current_searches list[str] Existing search queries
statement str Original statement being fact-checked
user_feedback str Feedback describing additional information needs
language str Search language code
location str Search location/region
model str LLM model for query generation
web_search_provider str Search provider (default: "serper")
top_k int Results per query (default: 5)
ban_domains list[str] | None Domains to exclude

Example

Input:

user_feedback = "Quiero que también busques información oficial de la Union Europea o algún comunicado de prensa"

Output (New Search):

Unión Europea comunicado prensa coches eléctricos vs gasolina impacto ambiental 2025

6. Download of the Sources

The system downloads the full text content of each source URL. This step:

  • Fetches HTML content from each source URL
  • Extracts clean text from the HTML (using Newspaper3k)
  • Filters out sources with insufficient content (< 50 words)
  • Prepares documents for indexing

Note that only sources whose content is available in the HTML (no paywall, no subscription required, no dynamic content, etc.) will be downloaded successfully. Fortunately, most of the websites in our tests were static and could be downloaded successfully.

Parameters (in FactCheckingWorkflow)

These parameters affect how sources are processed:

| Parameter | Type | Description |
| --- | --- | --- |
| sources | list[dict] | List of source dicts with link, title, favicon, base_url, snippet |
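A minimal sketch of the download-and-filter behavior described above, using Newspaper3k (the 50-word threshold matches the filter mentioned earlier; the helper itself is an illustration, not the workflow's internal downloader):

from newspaper import Article

def download_source(url: str, min_words: int = 50) -> str | None:
    """Fetch a page and extract clean text; drop it if too short."""
    article = Article(url)
    article.download()   # fetch the HTML
    article.parse()      # extract the main text
    text = article.text
    if len(text.split()) < min_words:
        return None      # insufficient content: source is filtered out
    return text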

7. RAG (Retrieval-Augmented Generation)

Agent: RagAgent

Note

This step is optional and can be disabled by setting use_rag=False.

Builds an embedding index from downloaded documents and retrieves the most relevant passages for each question. This improves answer quality by focusing on the most pertinent information.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| embedding_model | str | — | Embedding model identifier (e.g., "gemini-embedding-001") |
| chunk_size | int | 128 | Character window for text splitting |
| chunk_overlap | int | 0 | Overlap between chunks |
| split_separators | tuple | ("\n",) | Separators for text splitting |
| fp16_storage | bool | True | Store embeddings in float16 (memory efficient) |
| l2_normalise | bool | True | L2-normalize embeddings |
| top_k | int | 5 | Passages to retrieve per query |
| use_rag | bool | True | Enable/disable RAG processing |

What happens if you disable RAG?

If you disable RAG, instead of retrieving a set of relevant passages (paragraphs, by default) and using them as the context for article generation and question answering, the system uses the full text of all the documents as context. This creates a massive context with a huge number of tokens. Modern LLMs with context sizes of 256K+ tokens can handle this, and in many cases they achieve better performance than with RAG, as well as faster end-to-end times (since no embedding index needs to be built). However, since the number of tokens is much higher and APIs are billed by token, it can increase the cost significantly. For local LLMs, the current version of Latxa only supports 8K tokens, which is not enough for this configuration. If you use a local model such as Qwen or GPT-OSS, which do support long context sizes, take into account that it will require a significant amount of GPU memory.
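To illustrate how chunk_size, chunk_overlap, and split_separators interact, here is a minimal character-window splitter. It is a sketch of the general technique, not the RagAgent's internal splitter:

def chunk_text(text: str, chunk_size: int = 128, chunk_overlap: int = 0) -> list[str]:
    """Slide a character window of chunk_size over the text."""
    step = max(chunk_size - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

document = "First paragraph about EV emissions.\nSecond paragraph about LCA studies."
# split_separators=("\n",): split on newlines first, then window each piece
chunks = [chunk for part in document.split("\n") for chunk in chunk_text(part)]
print(len(chunks), chunks[0][:40])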


8. Article Generation

Agent: ArticleWriterAgent

Generates a comprehensive fact-checking article that analyzes the statement, presents evidence from sources, and provides a reasoned conclusion. Articles always contain three paragraphs: Supporting Arguments, Refuting Arguments, and Conclusion. If no supporting or refuting arguments are found, the model explains this to the user. The system will not produce these paragraphs if prompted with a harmful statement; instead, it will produce a single paragraph refusing to perform the task (although we recommend pre-screening statements before running them through the pipeline, using a safeguard model or an API such as the OpenAI Moderation API).

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| search_results | dict[str, list[dict]] | — | Unified results from WebSearch/RAG |
| fact_checking_topic | str | — | The statement being fact-checked |
| lang | str | "es" | Language code for the article |
| citation_manager | CitationManager \| None | None | Shared citation manager for consistent numbering |
| article_writer_model | str | — | LLM model for article generation |

Output Format

The article includes:

  • Arguments supporting the claim
  • Arguments refuting the claim
  • Evidence from cited sources (using [X] citation format)
  • A clear conclusion with verdict
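Citations use the numeric [X] format, and the same CitationManager instance is shared with Step 9 so that the article and the per-question answers number sources consistently. A minimal sketch of that idea (an assumption about its behavior, not the actual CitationManager class):

class SimpleCitationManager:
    """Assigns a stable number to each cited URL."""

    def __init__(self):
        self._ids: dict[str, int] = {}

    def cite(self, url: str) -> int:
        if url not in self._ids:
            self._ids[url] = len(self._ids) + 1
        return self._ids[url]

cm = SimpleCitationManager()
print(cm.cite("https://revista.dgt.es"))      # 1
print(cm.cite("https://example.org/report"))  # 2 (hypothetical second source)
print(cm.cite("https://revista.dgt.es"))      # 1 again: same source, same number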

9. Critical Question Answer Generation

Agent: QuestionAgent

Generates individual answers to each critical question using the retrieved sources. Each answer is properly cited.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| search_results | dict[str, list[dict]] | — | Unified results from WebSearch/RAG |
| lang | str | "es" | Language code for answers |
| citation_manager | CitationManager \| None | None | Shared citation manager |
| question_answer_model | str | — | LLM model for QA generation |

Example Output

{
    "¿Cuáles son las emisiones de carbono...?": "La producción de un vehículo eléctrico genera inicialmente más emisiones que un coche de gasolina [1] [2]. Sin embargo, esta diferencia se compensa en aproximadamente 17,000 km de uso [1]...",
    "¿Qué estudios de análisis de ciclo de vida...?": "Según el estudio del ICCT, los coches eléctricos producen un 73% menos de emisiones durante su ciclo de vida completo [3] [4]..."
}

10. Metadata Generation

Agent: MetadataAgent

Extracts structured metadata from the generated fact-checking article, including title, categories, verdict label, and main claim.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| article_output | dict[str, str] | — | Output from ArticleWriterAgent containing {"article": "..."} |
| fact_checking_title | str | — | Original title/topic of the fact-check |
| lang | str | "es" | Language code |
| metadata_model | str | — | LLM model for metadata extraction |

Output Format

{
    "title": "Los coches eléctricos son más contaminantes que los coches de gasolina",
    "categories": ["Automoción", "Medio ambiente", "Ciencia"],
    "label": "Fake",
    "main_claim": "La afirmación de que los coches eléctricos son más contaminantes que los coches de gasolina es FALSA."
}
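This output matches the Metadata model shown in the Structured Outputs section, so it can be re-validated with Pydantic when post-processing results (a sketch; the dict below abbreviates the output above):

from typing import Literal
from pydantic import BaseModel

class Metadata(BaseModel):
    title: str
    categories: list[str]
    label: Literal["Fake", "True", "Undetermined"]
    main_claim: str

metadata_dict = {
    "title": "Los coches eléctricos son más contaminantes que los coches de gasolina",
    "categories": ["Automoción", "Medio ambiente", "Ciencia"],
    "label": "Fake",
    "main_claim": "La afirmación ... es FALSA.",
}
metadata = Metadata.model_validate(metadata_dict)  # raises ValidationError on a bad label
print(metadata.label)  # "Fake"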

Label Values

| Label | Description |
| --- | --- |
| "Fake" | The statement is false |
| "True" | The statement is true |
| "Undetermined" | Cannot be conclusively verified |

11. Image Generation

Agents: ImagePromptAgent + ImageGenAgent

Generates a header image for the fact-checking article in two steps:

  1. ImagePromptAgent: Creates a concise image description from the article title
  2. ImageGenAgent: Generates an image using Flux image generator

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| statement | str | — | The article title/statement to illustrate |
| model | str | — | LLM model for prompt generation |
| size | str | "1280x256" | Output image dimensions |
| style | str | (see below) | Style guidance for image generation |
| image_model | str | "flux_replicate" | Image generation model |

Default Style:

"The style is bold and energetic, using vibrant colors and dynamic compositions to create visually engaging scenes that emphasize activity, culture, and lively atmospheres. Digital art."

Example

Input:

statement = "Los coches eléctricos son más contaminantes que los coches de gasolina"

Output:

{
    "image_description": "Split-screen image comparing electric and gasoline cars with CO2 emission cloud visualization",
    "image_url": "data:image/webp;base64,...",
    "size": "1280x256"
}
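Since image_url is returned as a base64 data URL, it can be written to disk in a few lines (a sketch based on the output format above; the output filename is arbitrary):

import base64

data_url = image_result["image_url"]        # "data:image/webp;base64,..."
header, encoded = data_url.split(",", 1)    # strip the "data:image/webp;base64" prefix
with open("header_image.webp", "wb") as f:
    f.write(base64.b64decode(encoded))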

Complete Workflow Example

from veridika.src.workflows.StepwiseWorkflows import (
    CrititalQuestionWorkflow,
    CrititalQuestionRefinementWorkflow,
    FactCheckingWorkflow,
    GenSearchesWorkflow,
    ImageGenerationWorkflow,
    SourceRefinementWorkflow,
)

# Statement to fact-check (defined once, reused by the later steps)
statement = "Los coches eléctricos son más contaminantes que los coches de gasolina"

# 1. Generate critical questions
critical_question_workflow = CrititalQuestionWorkflow()
critical_questions, cost = await critical_question_workflow.run(
    statement=statement,
    language="es",
    location="es",
    model="google/gemini-2.5-flash",
)

# 2. (Optional) Refine questions
critical_question_refinement_workflow = CrititalQuestionRefinementWorkflow()
new_critical_questions, cost = await critical_question_refinement_workflow.run(
    questions=critical_questions["questions"],
    input=statement,
    refinement="También quiero que consideres la contaminación del reciclaje de batería",
    language="es",
    location="es",
    model="google/gemini-2.5-flash",
)

# 3-4. Generate searches and collect sources
gen_searches_workflow = GenSearchesWorkflow()
sources, searches, cost = await gen_searches_workflow.run(
    statement=statement,
    questions=new_critical_questions["questions"],
    language="es",
    location="es",
    model="google/gemini-2.5-flash",
)

# 5. (Optional) Refine sources
source_refinement_workflow = SourceRefinementWorkflow()
new_sources, new_searches, cost = await source_refinement_workflow.run(
    statement=statement,
    current_searches=searches,
    language="es",
    location="es",
    model="google/gemini-2.5-flash",
    user_feedback="Quiero información oficial de la Union Europea",
)
sources = sources + new_sources

# 6-10. Run fact-checking pipeline (download, RAG, article, QA, metadata)
article_fact_checking_workflow = FactCheckingWorkflow()
result, cost = await article_fact_checking_workflow.run(
    question=new_critical_questions["questions"],
    statement=statement,
    language="es",
    location="es",
    sources=sources,
    model="google/gemini-2.5-flash",
    use_rag=True,
    embedding_model="gemini-embedding-001",
    article_writer_model="google/gemini-2.5-flash",
    question_answer_model="google/gemini-2.5-flash",
    metadata_model="google/gemini-2.5-flash",
)

print(f"Label: {result['metadata']['label']}")
print(f"Article: {result['answer']}")

# 11. Generate header image
image_generation_workflow = ImageGenerationWorkflow()
image_result, cost = await image_generation_workflow.run(
    statement=statement,
    model="google/gemini-2.5-flash",
)
print(f"Image URL: {image_result['image_url']}")

Cost Tracking

Each workflow step returns a cost value representing API usage. The costs are tracked for:

  • LLM calls (critical questions, searches, article, QA, metadata, image prompts)
  • Web search API calls (Serper)
  • Embedding operations (RAG)
  • Image generation (Flux/Replicate)

Example cost output:

💰 Step cost: 0.0019604 (Critical Questions)
💰 Step cost: 0.0055707 (GenSearches + WebSearch)
💰 Step cost: 0.0096946 (FactChecking Pipeline)
💰 Step cost: 0.0031064 (Image Generation)
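In the stepwise notebooks each step returns its own cost, so the total spend is just the running sum (a trivial sketch using the figures above):

step_costs = {
    "Critical Questions": 0.0019604,
    "GenSearches + WebSearch": 0.0055707,
    "FactChecking Pipeline": 0.0096946,
    "Image Generation": 0.0031064,
}
print(f"💰 Total cost: {sum(step_costs.values()):.7f}")  # 💰 Total cost: 0.0203321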

Summary Table

| Step | Agent | Optional | Description |
| --- | --- | --- | --- |
| 1 | CriticalQuestionAgent | No | Generate research questions |
| 2 | CriticalQuestionRefinementAgent | Yes | Refine questions with user feedback |
| 3 | GenSearchesAgent | No | Generate search queries |
| 4 | WebSearchAgent | No | Execute web searches (Serper) |
| 5 | GenSearchesRefinementAgent | Yes | Add more sources via user feedback |
| 6 | Download Utility | No | Download source content |
| 7 | RagAgent | Yes | Index and retrieve relevant passages |
| 8 | ArticleWriterAgent | No | Generate fact-check article |
| 9 | QuestionAgent | No | Answer each critical question |
| 10 | MetadataAgent | No | Extract structured metadata |
| 11 | ImagePromptAgent + ImageGenAgent | Yes | Generate header image |