Progati00/FDA-FAERS-2025

FDA FAERS Q2 2025 – Data Cleaning, Integration & Dashboard Analysis

📄 Project Overview

This project processes the FDA Adverse Event Reporting System (FAERS) Q2 2025 dataset (April–June 2025, published July 29, 2025).
The objective was to clean, standardize, and merge multiple raw data sources into one analysis-ready dataset, then visualize key insights using Power BI.


📂 Dataset Information

Source: FDA FAERS Quarterly Data Extract Files
The files contain raw data extracted from the FAERS database for the indicated time range and are not cumulative.

The Q2 2025 dataset contains six separate sheets:

Sheet Name    Description
INDI25Q2      Drug indication details
OUTC25Q2      Reported outcomes
REAC25Q2      Reported reactions
RPSR25Q2      Report sources
DRUG25Q2      Drug details
DEMO25Q2      Patient demographics

🛠 Data Preparation Process (MySQL Workbench)

1. Data Import

  • Imported all six datasets into MySQL Workbench as separate tables.

2. Data Cleaning

  • Dropped irrelevant or redundant columns.
  • Standardized formats and replaced missing or blank values with placeholders ('Unknown' for text fields, 0 for numeric fields).
  • Removed duplicates and records with NULL values in key fields.
  • Normalized text and numeric fields for consistency.
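
The cleaning steps above can be sketched in a few statements. This is a minimal illustration, not the project's actual script: it uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs anywhere, and the trimmed-down demo table, the sample rows, and the clean_demo helper are all assumptions made for the example.

```python
import sqlite3


def clean_demo(conn):
    """Apply the three cleaning steps to a hypothetical `demo` table."""
    # 1. Remove records with NULL values in the key fields.
    conn.execute("DELETE FROM demo WHERE primaryid IS NULL OR caseid IS NULL")
    # 2. Replace missing/blank values with placeholders.
    conn.execute("UPDATE demo SET sex = 'Unknown' WHERE sex IS NULL OR TRIM(sex) = ''")
    conn.execute("UPDATE demo SET age = 0 WHERE age IS NULL")
    # 3. Remove exact duplicates, keeping the first copy of each record.
    #    (rowid is SQLite-specific; MySQL needs a different dedupe approach.)
    conn.execute("""
        DELETE FROM demo
        WHERE rowid NOT IN (
            SELECT MIN(rowid) FROM demo GROUP BY primaryid, caseid, age, sex
        )
    """)
    return conn.execute(
        "SELECT primaryid, caseid, age, sex FROM demo ORDER BY primaryid"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (primaryid INTEGER, caseid INTEGER, age REAL, sex TEXT)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?, ?, ?)",
    [
        (1001, 501, 67, "F"),
        (1001, 501, 67, "F"),   # exact duplicate
        (1002, 502, None, ""),  # missing age, blank sex
        (None, 503, 40, "M"),   # NULL key field
    ],
)
cleaned = clean_demo(conn)
```

The order matters in this sketch: placeholders are filled in before deduplication so that rows differing only by NULL versus blank collapse into one record.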

3. Data Integration

  • Performed INNER JOIN and LEFT JOIN operations on primaryid and caseid.
  • Merged all six tables into a single table named combined_faers.
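
A rough sketch of the integration step follows, again using sqlite3 in place of MySQL. The README does not say which tables were inner-joined and which were left-joined, so the choice below (INNER JOIN for drug and reaction rows, LEFT JOIN for outcomes) is an assumption, as are the reduced column lists, the sample rows, and the helper names.

```python
import sqlite3


def make_sample_db():
    """Build a tiny in-memory database with four of the six tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE demo (primaryid INTEGER, caseid INTEGER, sex TEXT);
        CREATE TABLE drug (primaryid INTEGER, caseid INTEGER, drug_seq INTEGER, drugname TEXT);
        CREATE TABLE reac (primaryid INTEGER, caseid INTEGER, pt TEXT);
        CREATE TABLE outc (primaryid INTEGER, caseid INTEGER, outc_cod TEXT);
        INSERT INTO demo VALUES (1001, 501, 'F'), (1002, 502, 'M');
        INSERT INTO drug VALUES (1001, 501, 1, 'ACTEMRA'), (1002, 502, 1, 'ASPIRIN');
        INSERT INTO reac VALUES (1001, 501, 'Nausea'), (1002, 502, 'Headache');
        INSERT INTO outc VALUES (1001, 501, 'HO');  -- report 1002 has no outcome row
    """)
    return conn


def build_combined(conn):
    """Merge the tables on primaryid and caseid into combined_faers."""
    conn.execute("""
        CREATE TABLE combined_faers AS
        SELECT d.primaryid, d.caseid, d.sex,
               g.drug_seq, g.drugname,
               r.pt,
               o.outc_cod
        FROM demo AS d
        INNER JOIN drug AS g ON g.primaryid = d.primaryid AND g.caseid = d.caseid
        INNER JOIN reac AS r ON r.primaryid = d.primaryid AND r.caseid = d.caseid
        LEFT JOIN outc AS o ON o.primaryid = d.primaryid AND o.caseid = d.caseid
    """)
    return conn.execute("SELECT * FROM combined_faers ORDER BY primaryid").fetchall()


combined = build_combined(make_sample_db())
```

With this join choice, a report without an outcome row is kept with a NULL outc_cod instead of being dropped. Because a report can have several drugs, reactions, and outcomes, joins like these can yield more than one row per report.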

📑 Final Table Structure – combined_faers

Column Name                                             Description
primaryid, caseid                                       Unique report identifiers
age, sex, wt                                            Patient demographics
reporter_country, occr_country                          Reporting & occurrence country
drug_seq, role_cod, drugname, prod_ai, val_vbm, route   Drug details
indi_drug_seq, indi_pt                                  Medical indication
outc_cod                                                Reported outcome
pt                                                      Reported reaction

📊 Power BI Data Transformation

  • Added Age Groups:
    • Child/Adolescent: <18
    • Young Adult: 18–44
    • Middle-aged Adult: 45–64
    • Older Adult: 65+
    • Unknown: Missing values
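
The age bands above amount to a simple bucketing rule. The dashboard builds them inside Power BI. The Python function below is only an illustrative equivalent, assuming age is expressed in years and a missing value arrives as None. The function name age_group is hypothetical.

```python
from typing import Optional


def age_group(age_years: Optional[float]) -> str:
    """Map an age in years to the dashboard's age-group label."""
    if age_years is None:
        return "Unknown"
    if age_years < 18:
        return "Child/Adolescent"
    if age_years < 45:
        return "Young Adult"       # 18-44
    if age_years < 65:
        return "Middle-aged Adult"  # 45-64
    return "Older Adult"            # 65+


labels = [age_group(a) for a in (7, 18, 44, 45, 64, 65, None)]
```

Checking the upper bound of each band in ascending order keeps the boundaries unambiguous: 18 falls in Young Adult, 45 in Middle-aged Adult, and 65 in Older Adult.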

📈 Dashboard Insights

Key Metrics (KPIs)

  • Total Reports: 114.71K
  • Distinct Drugs Reported: 63
  • Countries Reporting: 5
  • Most Common Outcomes: 4 categories (DE = death, HO = hospitalization, LT = life-threatening, OT = other serious)

Demographic Analysis

  • Older adults (mostly female) dominate the dataset.
  • Middle-aged adults follow; fewer cases in younger groups.

Top 10 Drugs by Report Count

  • Actemra: 41K reports
  • Aspirin & herbal products: ~8K each

Outcome by Drug Role

  • Primary Suspect: 39.29%
  • Secondary Suspect: 32.14%
  • Concomitant: 28.57%

Route of Administration

  • Unknown: 37.5%
  • Oral: 33.33%

Geographical Trends

  • Canada: 113.18K reports (largest share).
  • Most Canadian reports listed the route of administration as Unknown or Intravenous.

📦 Tools & Technologies

  • SQL (MySQL Workbench) – Data cleaning, transformation, and integration
  • Power BI – Visualization & dashboard creation
  • Dataset Source: FDA FAERS Q2 2025 (April–June 2025, published July 29, 2025)

📷 Dashboard Preview

For a visual walkthrough of the dashboard insights, view the full report here:
View FAERS Dashboard (PDF)


📌 Summary

This project transformed six separate datasets into a single, clean, analysis-ready table for drug safety analysis.
The Power BI dashboard provides clear insights into demographics, drug usage patterns, outcomes, and geographical trends.


About

End-to-end project processing FDA Adverse Event Reporting System (FAERS) April–June 2025 data. Includes SQL-based cleaning, transformation, and merging of six raw datasets into a unified table, followed by interactive Power BI dashboards analyzing demographics, drug usage, outcomes, and geographical trends.
