UmairAmjad-developer/SDA-Project-2026

GDP Analysis System (SDA-Project-2026)

A comprehensive data analysis and visualization system for global GDP statistics. This is a semester project for Structured Data Analysis (SDA) - Phase 1.

Group Members:

  • Umair Amjad
  • Muhammad Ali

Project Overview

The GDP Analysis System is a Python-based data processing and visualization tool that analyzes global GDP statistics. It loads GDP data from a CSV dataset, processes it according to user-defined configurations, and generates four sequential visualizations:

  1. Bar Chart - GDP comparison across countries/regions
  2. Histogram - GDP distribution frequency
  3. Dot Plot - Individual GDP values with reference lines
  4. Pie Chart - Each country's share of total GDP, with a final statistics report

Features

  • Data Loading: Efficiently loads and parses large GDP datasets
  • Data Processing: Supports multiple operations (average, sum, min, max, etc.)
  • Regional Filtering: Analyze specific regions or continents
  • Year-based Analysis: Query data for any year in the dataset
  • Sequential Visualizations: Displays graphs one-by-one with user control
  • Automatic Graph Saving: All graphs saved to out/ directory
  • Cross-Platform Support: Works on Linux, macOS, and Windows
  • Comprehensive Logging: Detailed output for debugging and tracking

Project Structure

SDA-Project-2026/
├── main.py                          # Entry point of the application
├── config.json                      # Configuration file (region, year, operation)
├── requirements.txt                 # Python dependencies
├── fix_csv.py                       # Data cleaning utility
├── README.md                        # This file
│
├── data/
│   ├── gdp_dataset.csv             # Original GDP dataset
│   └── gdp_dataset_fixed.csv       # Cleaned dataset (generated by fix_csv.py)
│
├── src/
│   ├── __init__.py                 # Package initialization
│   ├── loader.py                   # Data loading module
│   ├── processor.py                # Data processing logic
│   └── visualizer.py               # Visualization module
│
└── out/
    ├── 01_bar_*.png               # Bar chart outputs
    ├── 02_hist_*.png              # Histogram outputs
    ├── 03_dot_*.png               # Dot plot outputs
    └── 04_pie_*.png               # Pie chart outputs

📦 Requirements

  • Python 3.8+
  • Dependencies (see requirements.txt):
    • pandas >= 1.3.0
    • matplotlib >= 3.4.0
    • numpy >= 1.21.0

🚀 Installation & Setup

Step 1: Clone/Download the Repository

cd /path/to/SDA-Project-2026

Step 2: Install Dependencies

pip install -r requirements.txt

Step 3: Prepare the Data

Run the data cleaning script to fix the CSV file:

python fix_csv.py

This creates data/gdp_dataset_fixed.csv with cleaned data.

Step 4: Configure Settings

Edit config.json to set your analysis parameters:

{
    "region": "Asia",
    "year": 2020,
    "operation": "average",
    "output": "dashboard"
}

Configuration Options:

  • region: Target region/continent (e.g., "Asia", "Europe", "Africa")
  • year: Analysis year (e.g., 2020, 2021)
  • operation: Calculation type ("average", "sum", "min", "max")
  • output: Output mode ("dashboard" for visualizations)
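
As a rough illustration of how these options could be checked before the analysis runs, here is a minimal sketch. The helper names `validate_config` and `load_config` are hypothetical and not taken from the project's code; the allowed operations are copied from the list above.

```python
import json

# Assumed from the options documented in this README, not from the project source.
ALLOWED_OPERATIONS = {"average", "sum", "min", "max"}
REQUIRED_KEYS = ("region", "year", "operation", "output")


def validate_config(config: dict) -> dict:
    """Return the config unchanged if it looks valid, otherwise raise ValueError."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"Missing config keys: {', '.join(missing)}")
    year = config["year"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(year, bool) or not isinstance(year, int):
        raise ValueError("'year' must be an integer, e.g. 2020")
    if config["operation"] not in ALLOWED_OPERATIONS:
        raise ValueError(f"Unsupported operation: {config['operation']!r}")
    return config


def load_config(path: str = "config.json") -> dict:
    """Read a JSON config file and validate its contents."""
    with open(path, encoding="utf-8") as handle:
        return validate_config(json.load(handle))
```

Failing early with a clear message tends to be easier to debug than an empty chart produced by a misspelled operation.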

💻 Usage Guide

Basic Usage

python main.py

Program Flow

  1. Loads configuration from config.json
  2. Reads and validates the GDP dataset
  3. Processes data based on the configuration
  4. Displays 4 sequential visualizations

Interactive Graph Viewing

  • Each graph displays in your system's default image viewer
  • Close the current graph to see the next one
  • All graphs are saved to the out/ folder automatically
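
One common way to hand a saved image to the system's default viewer on each platform is sketched below. This is an assumption about how such a step could work, not the project's actual code; `viewer_command` and `open_image` are hypothetical names. Note that this sketch returns as soon as the viewer is launched, so it does not by itself implement the close-to-advance behavior described above.

```python
import subprocess
import sys


def viewer_command(image_path: str, platform: str = sys.platform) -> list:
    """Build the command that opens a file with the platform's default application."""
    if platform.startswith("win"):
        # 'start' is a cmd.exe builtin; the empty string is the window title argument.
        return ["cmd", "/c", "start", "", image_path]
    if platform == "darwin":
        return ["open", image_path]
    # Most Linux desktops provide xdg-open.
    return ["xdg-open", image_path]


def open_image(image_path: str) -> None:
    """Launch the default viewer; failures are ignored because the file is already saved."""
    subprocess.run(viewer_command(image_path), check=False)
```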

Example Run

$ python main.py
---------------------------------------
   GDP Analysis System (SDA 2026)     
---------------------------------------
Step 1: Loading Configuration...
   -> Region: Asia
   -> Year: 2020

Step 2: Loading Dataset from 'data/gdp_dataset_fixed.csv'...
   -> Data loaded successfully.

Step 3: Processing Data...
   -> Calculation (average): 1,523,398,324,074.54

Step 4: Launching Visualizations...
   (Graphs will open sequentially. Close one to see the next.)
   -> Opening Graph 1/4: Bar Chart...
   -> Opening Graph 2/4: Histogram...
   -> Opening Graph 3/4: Dot Plot...
   -> Opening Graph 4/4: Pie Chart & Final Report...

Configuration

config.json Example

{
    "region": "Asia",
    "year": 2020,
    "operation": "average",
    "output": "dashboard"
}

Supported Regions

  • Africa
  • Asia
  • Europe
  • North America
  • South America
  • Oceania

Supported Operations

Operation   Description
average     Mean GDP value
sum         Total GDP
min         Minimum GDP
max         Maximum GDP
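
The mapping from operation name to calculation can be sketched with a small dispatch table. This version uses only the standard library so it runs without dependencies; the project itself lists pandas, and `apply_operation` is a hypothetical name. The sample values are made up for illustration.

```python
from statistics import mean

# Operation names as documented above, mapped to plain Python aggregations.
OPERATIONS = {
    "average": mean,
    "sum": sum,
    "min": min,
    "max": max,
}


def apply_operation(values, operation: str) -> float:
    """Aggregate a sequence of GDP values with the named operation."""
    if operation not in OPERATIONS:
        raise ValueError(f"Unsupported operation: {operation!r}")
    values = list(values)
    if not values:
        raise ValueError("No GDP values to aggregate (check region and year)")
    return OPERATIONS[operation](values)


# Illustrative numbers only, not real GDP figures.
sample = [100.0, 200.0, 600.0]
print(apply_operation(sample, "average"))  # 300.0
print(apply_operation(sample, "max"))      # 600.0
```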

Testing

Test Version 1.0.0

  • Date: February 2026
  • Status: Initial Release
  • Features Tested:
    • Data loading from CSV ✓
    • Basic data processing ✓
    • Single visualization output ✓

Test Version 1.1.0

  • Date: February 2026
  • Status: Sequential Visualization Release
  • Features Added:
    • Sequential graph display ✓
    • System image viewer integration ✓
    • User-controlled graph flow (close to advance) ✓
    • Improved error handling ✓

Test Version 1.2.0 (Current)

  • Date: February 2026
  • Status: Production Ready
  • Features:
    • Complete documentation ✓
    • Multiple visualization types ✓
    • Regional filtering ✓
    • Year-based analysis ✓
    • Robust error handling ✓
    • Cross-platform compatibility ✓

Running Tests

Test Case 1: Basic Execution

python main.py

Expected: All 4 graphs display sequentially

Test Case 2: Data Validation

python fix_csv.py

Expected: Creates gdp_dataset_fixed.csv without errors

Test Case 3: Custom Configuration

Edit config.json with different values and run python main.py

Test Case 4: Error Handling

  • Delete config.json → Should show error message
  • Delete data files → Should show file not found error
  • Invalid region → Should filter to empty results gracefully

Troubleshooting

Issue: "File not found: gdp_dataset_fixed.csv"

Solution: Run python fix_csv.py first to generate the cleaned dataset

Issue: Graphs not opening/showing blank screen

Solution:

  • Ensure you have an image viewer installed (default system viewer)
  • Check that X11 display is available on Linux systems
  • Graphs are always saved to out/ folder as backup

Issue: ModuleNotFoundError for pandas/matplotlib

Solution: Install dependencies

pip install -r requirements.txt

Issue: Different years show no data

Solution: Check if the year exists in the dataset (1960-2024)

Issue: Empty results for region

Solution: Verify region name matches dataset (check data file for available regions)
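
To see which region names the dataset actually contains, a short standard-library sketch like the following may help. The column name "Continent" is an assumption (this README does not state the CSV header), so substitute whatever column holds the region in your file; `available_values` is a hypothetical helper.

```python
import csv


def available_values(csv_path: str, column: str) -> list:
    """Return the sorted, distinct, non-empty values found in one CSV column."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"Column {column!r} not found; header is {reader.fieldnames}")
        # Skip rows where the cell is missing or blank.
        return sorted({row[column].strip() for row in reader if row[column]})


if __name__ == "__main__":
    # "Continent" is a hypothetical column name used for illustration.
    print(available_values("data/gdp_dataset_fixed.csv", "Continent"))
```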


Visualization Details

1. Bar Chart

  • Displays GDP values as horizontal bars
  • Countries sorted by GDP value
  • Color: Teal
  • Includes grid for readability

2. Histogram

  • Shows distribution of GDP values across bins
  • Default bins: 10
  • Color: Sky Blue
  • Helps identify GDP concentration patterns
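
For readers unfamiliar with binning, the sketch below shows how equal-width bins can be counted by hand. It is an illustration of the idea with the default of 10 bins mentioned above, not the project's plotting code, and `histogram_counts` is a hypothetical name. The maximum value is placed in the last bin, which matches the usual convention in plotting libraries.

```python
def histogram_counts(values, bins: int = 10) -> list:
    """Count how many values fall into each of `bins` equal-width intervals."""
    values = list(values)
    if not values:
        return [0] * bins
    low, high = min(values), max(values)
    if low == high:
        # All values identical: put everything in the first bin.
        return [len(values)] + [0] * (bins - 1)
    width = (high - low) / bins
    counts = [0] * bins
    for value in values:
        # Clamp so the maximum value lands in the last bin instead of overflowing.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return counts


# Illustrative numbers only.
print(histogram_counts([0, 1, 2, 3, 4, 5, 6, 7, 8, 10], bins=5))  # [2, 2, 2, 2, 2]
```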

3. Dot Plot

  • Individual data points with reference lines
  • Color: Purple with gray reference lines
  • Useful for outlier detection
  • Sorted by GDP value

4. Pie Chart + Final Report

  • Top 8 countries shown individually
  • Remaining countries grouped as "Others"
  • Percentage labels for each segment
  • Final statistics box showing:
    • Region name
    • Analysis year
    • Operation type
    • Calculated result
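
The "top 8 plus Others" grouping can be sketched as follows. This is a standard-library illustration under the assumption that GDP values are available as a country-to-value mapping; `top_n_with_others` and `percentage_shares` are hypothetical names, not functions from visualizer.py.

```python
def top_n_with_others(gdp_by_country: dict, n: int = 8) -> list:
    """Keep the n largest entries and fold the remainder into one 'Others' slice."""
    ranked = sorted(gdp_by_country.items(), key=lambda item: item[1], reverse=True)
    slices = ranked[:n]
    remainder = sum(value for _, value in ranked[n:])
    if remainder > 0:
        slices.append(("Others", remainder))
    return slices


def percentage_shares(slices: list) -> list:
    """Convert (label, value) pairs into (label, percent of total) pairs."""
    total = sum(value for _, value in slices)
    return [(label, 100.0 * value / total) for label, value in slices]
```

Grouping the tail keeps the pie readable when a region has many small economies, at the cost of hiding their individual values.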

Module Documentation

loader.py

Handles CSV data loading and initial validation

load_data(file_path: str) -> DataFrame

processor.py

Processes data based on configuration

process_data(df: DataFrame, config: dict) -> tuple

visualizer.py

Creates and displays visualizations

show_dashboard(data: DataFrame, result_value: float, config: dict) -> None
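
A minimal sketch of how the three functions above might be chained is shown below. The functions are passed in as arguments so the sketch runs on its own; in the project they would be imported from src/. The layout of the tuple returned by process_data is an assumption here (filtered data first, calculated value second), inferred only from the parameters of show_dashboard.

```python
def run_pipeline(config: dict, data_path: str, load_data, process_data, show_dashboard):
    """Load the dataset, process it, show the dashboard, and return the result value."""
    df = load_data(data_path)
    # Assumed tuple layout: (filtered data, calculated value).
    data, result_value = process_data(df, config)
    show_dashboard(data, result_value, config)
    return result_value
```

Passing the functions as parameters also makes the flow easy to exercise with stand-in stubs when the real modules or dataset are unavailable.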

Version History

Version   Date       Status     Changes
1.0.0     Feb 2026   Released   Initial release with basic functionality
1.1.0     Feb 2026   Released   Added sequential graph display
1.2.0     Feb 2026   Current    Complete documentation and production ready

Contributors

  • Umair Amjad - Co-Developer
  • Muhammad Ali - Co-Developer

License

This is a semester project for educational purposes.


Support

For issues or questions, please contact the project contributors or refer to the troubleshooting section above.

About

This is our semester project for SDA (Phase 1). Umair Amjad and Ali Khan Lodhi are the group members for this project.
