FactCheck Documentation

Welcome to the FactCheck documentation. This guide covers the fake news detection system: its technical methodology, API reference, and usage examples.

Table of Contents

  1. Getting Started
  2. Documentation Index
  3. Quick Reference
  4. Project Architecture
  5. Module Overview
  6. Next Steps

Getting Started

FactCheck is a machine learning system for detecting fake news articles. Before diving into the documentation, ensure you have:

  1. Python 3.8+ installed
  2. A virtual environment set up and activated
  3. Dependencies installed via pip install -r requirements.txt
  4. The dataset placed in the dataset/ directory
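
The prerequisites above can be checked with a short script before running anything else. This is a minimal sketch, not part of FactCheck: the function name check_prerequisites is hypothetical, and it only confirms that requirements.txt exists, not that every dependency is actually installed.

```python
import sys
from pathlib import Path


def check_prerequisites(project_root="."):
    """Return a dict mapping each prerequisite name to True or False.

    Hypothetical helper for illustration; not part of the FactCheck codebase.
    """
    root = Path(project_root)
    return {
        # Python 3.8 or newer
        "python_3_8_plus": sys.version_info >= (3, 8),
        # Inside a venv, sys.prefix differs from sys.base_prefix
        "virtual_env_active": sys.prefix != sys.base_prefix,
        # Only checks that the file exists, not that packages are installed
        "requirements_file_present": (root / "requirements.txt").is_file(),
        "dataset_dir_present": (root / "dataset").is_dir(),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'OK' if ok else 'MISSING':8}{name}")
```

Run it from the project root; any line marked MISSING points at a step to revisit.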

Documentation Index

Core Documentation

Document         | Description
-----------------|---------------------------------------------------------
Methodology      | Technical approach, algorithms, and model architecture
API Reference    | Complete function and class documentation
Results Analysis | Detailed performance analysis and insights

Guides

Guide              | Description
-------------------|----------------------------------
Installation Guide | Step-by-step setup instructions
Training Guide     | How to train and evaluate models
Deployment Guide   | Deploying models in production

Quick Reference

Training a Model

# Train all models
python train.py

# Train a specific model
python train.py --model logistic_regression

Making Predictions

# Command line
python predict.py "Your article text here"

# Interactive mode
python predict.py --interactive

Python API

from predict import FakeNewsPredictor

predictor = FakeNewsPredictor()
result = predictor.predict("News article text...")
print(result['prediction'])  # 'REAL' or 'FAKE'
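
To classify several articles at once, the same predict call can be wrapped in a small helper. The sketch below relies only on the predict() method and the 'prediction' key shown above. StubPredictor and classify_many are hypothetical names, and the stub's keyword rule is a placeholder so the example runs without a trained model.

```python
from collections import Counter


class StubPredictor:
    """Hypothetical stand-in for FakeNewsPredictor (no trained model needed)."""

    def predict(self, text):
        # Toy rule for illustration only; not how FactCheck classifies text.
        label = "FAKE" if "shocking" in text.lower() else "REAL"
        return {"prediction": label}


def classify_many(predictor, texts):
    """Call predictor.predict on each text and collect the labels in order."""
    return [predictor.predict(text)["prediction"] for text in texts]


articles = [
    "Shocking secret the government does not want you to know",
    "City council approves budget for road repairs",
]
labels = classify_many(StubPredictor(), articles)
print(labels)           # ['FAKE', 'REAL']
print(Counter(labels))  # label frequencies across the batch
```

In real use, pass a FakeNewsPredictor instance instead of the stub.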

Project Architecture

┌─────────────────────────────────────────────────────────────┐
│                      FactCheck System                        │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │    Input     │───▶│ Preprocessing │───▶│   Feature    │  │
│  │    Text      │    │   Pipeline    │    │  Extraction  │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
│                                                  │           │
│                                                  ▼           │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │    Output    │◀───│   Ensemble   │◀───│   TF-IDF     │  │
│  │  Prediction  │    │    Model     │    │   Vectors    │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Module Overview

src/preprocessing.py

Text cleaning and feature extraction utilities.

  • TextPreprocessor: Clean and normalize text
  • FeatureExtractor: TF-IDF vectorization
  • load_and_prepare_data(): Dataset loading
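
As a rough illustration of what text cleaning followed by TF-IDF vectorization involves, here is a standard-library sketch. It is not the TextPreprocessor or FeatureExtractor implementation: the function names, the cleaning rules, and the TF-IDF variant (tf = count / length, idf = log(N / df)) are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


def clean_text(text):
    """Lowercase, drop URLs and non-letter characters, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tfidf(documents):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = log(N / df),
    one of several common TF-IDF variants.
    """
    tokenized = [clean_text(doc).split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


print(clean_text("Breaking NEWS!!! Visit https://example.com now"))
# breaking news visit now
print(tfidf(["fake news", "real news"])[0])
```

A term that appears in every document, like "news" here, gets weight 0 under this variant, which is why TF-IDF emphasizes distinctive words.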

src/models.py

Machine learning model implementations.

  • ModelFactory: Create model instances
  • FakeNewsClassifier: Main classifier wrapper
  • EnsembleModel: Voting ensemble
  • ModelEvaluator: Metric calculations
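
A voting ensemble combines the labels from several models into one decision. The sketch below shows hard (majority) voting with toy rule-based "models". Whether EnsembleModel uses hard or soft voting is not stated here, so treat this as an assumption; every name in the block is hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    if not labels:
        raise ValueError("need at least one prediction to vote")
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(models, text):
    """Ask every model for a label, then combine them by hard voting."""
    return majority_vote([model(text) for model in models])


# Toy stand-ins for trained classifiers (illustration only)
def exclamation_rule(text):
    return "FAKE" if "!" in text else "REAL"


def all_caps_rule(text):
    return "FAKE" if text.isupper() else "REAL"


def keyword_rule(text):
    return "FAKE" if "miracle" in text.lower() else "REAL"


models = [exclamation_rule, all_caps_rule, keyword_rule]
print(ensemble_predict(models, "MIRACLE CURE FOUND!"))                    # FAKE
print(ensemble_predict(models, "Miracle on Ice anniversary celebrated"))  # REAL
```

In the second call only keyword_rule votes FAKE, so the two REAL votes win.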

src/visualization.py

Plotting and visualization functions.

  • plot_confusion_matrix(): Confusion matrix heatmap
  • plot_model_comparison(): Model performance comparison
  • plot_feature_importance(): Feature importance charts
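
A confusion matrix heatmap is drawn from a small table of counts. The following sketch computes that table and prints it as text. It assumes the two labels 'REAL' and 'FAKE' from the Python API example. The function names are hypothetical and no plotting library is used.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, labels=("REAL", "FAKE")):
    """Return nested counts: counts[actual_label][predicted_label]."""
    pairs = Counter(zip(y_true, y_pred))
    return {
        actual: {pred: pairs[(actual, pred)] for pred in labels}
        for actual in labels
    }


def format_matrix(counts, labels=("REAL", "FAKE")):
    """Render the counts as a plain-text table (rows: actual, columns: predicted)."""
    header = "actual".ljust(8) + "".join(label.rjust(6) for label in labels)
    rows = [
        actual.ljust(8)
        + "".join(str(counts[actual][pred]).rjust(6) for pred in labels)
        for actual in labels
    ]
    return "\n".join([header] + rows)


y_true = ["REAL", "REAL", "FAKE", "FAKE", "FAKE"]
y_pred = ["REAL", "FAKE", "FAKE", "FAKE", "REAL"]
counts = confusion_counts(y_true, y_pred)
print(format_matrix(counts))
```

The diagonal entries (REAL/REAL and FAKE/FAKE) are the correct predictions; the off-diagonal entries are the two kinds of error.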

src/utils.py

Helper utilities and configuration.

  • Config: Project configuration
  • save_model() / load_model(): Model persistence
  • print_metrics(): Formatted metric display
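
Model persistence usually means serializing a trained object to disk and reading it back later. The sketch below uses the standard pickle module. The signatures are assumptions, and the real save_model() / load_model() in src/utils.py may take different arguments or use a different serializer.

```python
import pickle
from pathlib import Path


def save_model(model, path):
    """Serialize `model` to `path`, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a model previously written by save_model.

    Only unpickle files you trust: pickle can execute arbitrary code.
    """
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

Any picklable object round-trips this way, so the same pair of functions can store a fitted vectorizer alongside the classifier.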

Next Steps