GDP Analysis System (SDA-Project-2026)

A comprehensive data analysis and visualization system for global GDP statistics. This is a semester project for Structured Data Analysis (SDA) - Phase 1.

Group Members:

Umair Amjad
Muhammad Ali

Project Overview

The GDP Analysis System is a Python-based data processing and visualization tool that analyzes global GDP statistics. It loads GDP data from a CSV dataset, processes it according to user-defined configurations, and generates four sequential visualizations:

Bar Chart - GDP comparison across countries/regions
Histogram - GDP distribution frequency
Dot Plot - Individual GDP values with reference lines
Pie Chart - Each country's share of total GDP, with a final statistics report

Features

Data Loading: Efficiently loads and parses large GDP datasets
Data Processing: Supports four operations (average, sum, min, max)
Regional Filtering: Analyze specific regions or continents
Year-based Analysis: Query data for any year in the dataset
Sequential Visualizations: Displays graphs one-by-one with user control
Automatic Graph Saving: All graphs are saved to the out/ directory
Cross-Platform Support: Works on Linux, macOS, and Windows
Comprehensive Logging: Detailed output for debugging and tracking

Project Structure

SDA-Project-2026/
├── main.py                          # Entry point of the application
├── config.json                      # Configuration file (region, year, operation)
├── requirements.txt                 # Python dependencies
├── fix_csv.py                       # Data cleaning utility
├── README.md                        # This file
│
├── data/
│   ├── gdp_dataset.csv             # Original GDP dataset
│   └── gdp_dataset_fixed.csv       # Cleaned dataset (generated by fix_csv.py)
│
├── src/
│   ├── __init__.py                 # Package initialization
│   ├── loader.py                   # Data loading module
│   ├── processor.py                # Data processing logic
│   └── visualizer.py               # Visualization module
│
└── out/
    ├── 01_bar_*.png               # Bar chart outputs
    ├── 02_hist_*.png              # Histogram outputs
    ├── 03_dot_*.png               # Dot plot outputs
    └── 04_pie_*.png               # Pie chart outputs

📦 Requirements

Python 3.8+
Dependencies (see requirements.txt):
- pandas >= 1.3.0
- matplotlib >= 3.4.0
- numpy >= 1.21.0

🚀 Installation & Setup

Step 1: Clone/Download the Repository

After cloning or downloading the repository, change into the project directory:

cd /path/to/SDA-Project-2026

Step 2: Install Dependencies

pip install -r requirements.txt

Step 3: Prepare the Data

Run the data cleaning script to fix the CSV file:

python fix_csv.py

This creates data/gdp_dataset_fixed.csv with cleaned data.

Step 4: Configure Settings

Edit config.json to set your analysis parameters:

{
    "region": "Asia",
    "year": 2020,
    "operation": "average",
    "output": "dashboard"
}

Configuration Options:

region: Target region/continent (e.g., "Asia", "Europe", "Africa")
year: Analysis year (e.g., 2020, 2021)
operation: Calculation type ("average", "sum", "min", "max")
output: Output mode ("dashboard" for visualizations)
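
The options above could be read and checked before any analysis runs. The following is a minimal sketch, not the project's actual code: the function names `parse_config` and `load_config` and the specific validation rules are assumptions made for illustration.

```python
import json

# Allowed values taken from the options listed above.
SUPPORTED_OPERATIONS = {"average", "sum", "min", "max"}
REQUIRED_KEYS = ("region", "year", "operation", "output")


def parse_config(text):
    """Parse config JSON text and check the four documented keys."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"Missing config keys: {', '.join(missing)}")
    if config["operation"] not in SUPPORTED_OPERATIONS:
        raise ValueError(f"Unsupported operation: {config['operation']!r}")
    if not isinstance(config["year"], int):
        raise ValueError("year must be an integer, e.g. 2020")
    return config


def load_config(path="config.json"):
    """Read the config file from disk and validate it (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return parse_config(handle.read())
```

Failing early with a clear message keeps a typo such as "avarage" from surfacing later as a confusing processing error.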

💻 Usage Guide

Basic Usage

python main.py

Program Flow

Step 1: Loads configuration from config.json
Step 2: Reads and validates GDP dataset
Step 3: Processes data based on configuration
Step 4: Displays 4 sequential visualizations

Interactive Graph Viewing

Each graph displays in your system's default image viewer
Close the current graph to see the next one
All graphs are saved to the out/ folder automatically
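
One common way to hand a saved image to the system's default viewer on each platform is sketched below. This is an assumption about how such a step could work, not a description of the project's visualizer; the function names are hypothetical, and the sketch does not reproduce the close-to-advance behaviour, which depends on how the program waits for the viewer.

```python
import os
import subprocess
import sys


def viewer_command(image_path, platform=sys.platform):
    """Return the command that opens image_path in the default viewer.

    Returns None on Windows, where os.startfile is used instead.
    """
    if platform.startswith("win"):
        return None
    if platform == "darwin":
        return ["open", image_path]
    # Assumption: a Linux desktop with xdg-utils installed.
    return ["xdg-open", image_path]


def open_image(image_path):
    """Open a saved graph with the operating system's default viewer."""
    command = viewer_command(image_path)
    if command is None:
        os.startfile(image_path)  # available on Windows only
    else:
        subprocess.run(command, check=False)
```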

Example Run

$ python main.py
---------------------------------------
   GDP Analysis System (SDA 2026)     
---------------------------------------
Step 1: Loading Configuration...
   -> Region: Asia
   -> Year: 2020

Step 2: Loading Dataset from 'data/gdp_dataset_fixed.csv'...
   -> Data loaded successfully.

Step 3: Processing Data...
   -> Calculation (average): 1,523,398,324,074.54

Step 4: Launching Visualizations...
   (Graphs will open sequentially. Close one to see the next.)
   -> Opening Graph 1/4: Bar Chart...
   -> Opening Graph 2/4: Histogram...
   -> Opening Graph 3/4: Dot Plot...
   -> Opening Graph 4/4: Pie Chart & Final Report...

Configuration

config.json Example

{
    "region": "Asia",
    "year": 2020,
    "operation": "average",
    "output": "dashboard"
}

Supported Regions

Africa
Asia
Europe
North America
South America
Oceania

Supported Operations

| Operation | Description |
| --- | --- |
| `average` | Mean GDP value |
| `sum` | Total GDP |
| `min` | Minimum GDP |
| `max` | Maximum GDP |
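
The four operations in the table above can be thought of as a lookup from name to aggregation function. The sketch below uses only the standard library so it runs without dependencies; the project itself relies on pandas, and `apply_operation` is a hypothetical name. The input values in the usage lines are made up.

```python
from statistics import mean

# Maps each documented operation name to a function over a list of GDP values.
OPERATIONS = {
    "average": mean,
    "sum": sum,
    "min": min,
    "max": max,
}


def apply_operation(values, operation):
    """Apply one of the supported operations to a list of GDP values."""
    if operation not in OPERATIONS:
        raise ValueError(f"Unsupported operation: {operation!r}")
    if not values:
        raise ValueError("No GDP values to aggregate")
    return OPERATIONS[operation](values)


# Illustrative values only, not taken from the dataset.
gdp_values = [1.0, 2.0, 3.0]
print(apply_operation(gdp_values, "average"))  # 2.0
print(apply_operation(gdp_values, "max"))      # 3.0
```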

Testing

Test Version 1.0.0

Date: February 2026
Status: Initial Release
Features Tested:
- Data loading from CSV ✓
- Basic data processing ✓
- Single visualization output ✓

Test Version 1.1.0

Date: February 2026
Status: Sequential Visualization Release
Features Added:
- Sequential graph display ✓
- System image viewer integration ✓
- User-controlled graph flow (close to advance) ✓
- Improved error handling ✓

Test Version 1.2.0 (Current)

Date: February 2026
Status: Production Ready
Features:
- Complete documentation ✓
- Multiple visualization types ✓
- Regional filtering ✓
- Year-based analysis ✓
- Robust error handling ✓
- Cross-platform compatibility ✓

Running Tests

Test Case 1: Basic Execution

python main.py

Expected: All 4 graphs display sequentially

Test Case 2: Data Validation

python fix_csv.py

Expected: Creates gdp_dataset_fixed.csv without errors

Test Case 3: Custom Configuration

Edit config.json with different values and run python main.py

Test Case 4: Error Handling

Delete config.json → Should show error message
Delete data files → Should show file not found error
Invalid region → Should filter to empty results gracefully

Troubleshooting

Issue: "File not found: gdp_dataset_fixed.csv"

Solution: Run python fix_csv.py first to generate the cleaned dataset

Issue: Graphs not opening/showing blank screen

Solution:

Ensure you have an image viewer installed (default system viewer)
Check that X11 display is available on Linux systems
Graphs are always saved to out/ folder as backup

Issue: ModuleNotFoundError for pandas/matplotlib

Solution: Install dependencies

pip install -r requirements.txt

Issue: Different years show no data

Solution: Check if the year exists in the dataset (1960-2024)

Issue: Empty results for region

Solution: Verify region name matches dataset (check data file for available regions)
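
To see which region names a CSV actually contains, a small helper like the one below can list the distinct values in a column. It is a sketch using the standard library; `distinct_values` is a hypothetical helper, and the column name `Continent` and the sample rows are made up, since the real file's column names may differ.

```python
import csv
import io


def distinct_values(csv_text, column):
    """Return the sorted distinct values found in one CSV column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if column not in (reader.fieldnames or []):
        raise KeyError(f"Column not found: {column!r}")
    return sorted({row[column].strip() for row in reader if row[column]})


# Made-up sample; replace with the contents of the real data file.
sample = "Country,Continent\nA,Asia\nB,Europe\nC,Asia\n"
print(distinct_values(sample, "Continent"))  # ['Asia', 'Europe']
```

To apply it to the real dataset, read data/gdp_dataset_fixed.csv into a string and pass the name of the column that holds the region. The value in config.json must match one of the printed names exactly, including capitalization.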

Visualization Details

1. Bar Chart

Displays GDP values as horizontal bars
Countries sorted by GDP value
Color: Teal
Includes grid for readability

2. Histogram

Shows distribution of GDP values across bins
Default bins: 10
Color: Sky Blue
Helps identify GDP concentration patterns
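
The idea behind a 10-bin histogram is equal-width binning: the range between the smallest and largest GDP value is split into 10 intervals and each value is counted in the interval it falls into. The sketch below illustrates that idea in plain Python. `bin_counts` is a hypothetical helper, and matplotlib's own handling of edge cases (such as all values being identical) may differ.

```python
def bin_counts(values, bins=10):
    """Count how many values fall into each of `bins` equal-width intervals."""
    if not values:
        return []
    low, high = min(values), max(values)
    if low == high:
        # All values identical: put everything in the first bin.
        return [len(values)] + [0] * (bins - 1)
    width = (high - low) / bins
    counts = [0] * bins
    for value in values:
        # The maximum value belongs to the last bin, not a new one.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return counts
```

A region where most counts land in the first one or two bins has many small economies and a few very large ones, which is the concentration pattern the histogram is meant to reveal.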

3. Dot Plot

Individual data points with reference lines
Color: Purple with gray reference lines
Useful for outlier detection
Sorted by GDP value

4. Pie Chart + Final Report

Top 8 countries shown individually
Remaining countries grouped as "Others"
Percentage labels for each segment
Final statistics box showing:
- Region name
- Analysis year
- Operation type
- Calculated result
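
The top-8 plus "Others" grouping described above can be sketched as follows. This is an illustration under assumptions, not the project's visualizer code: `pie_slices` is a hypothetical helper that takes a mapping from country name to GDP.

```python
def pie_slices(gdp_by_country, top_n=8):
    """Return (label, percentage) pairs: the top_n countries plus "Others"."""
    total = sum(gdp_by_country.values())
    if total <= 0:
        return []
    ranked = sorted(gdp_by_country.items(), key=lambda item: item[1], reverse=True)
    top, rest = ranked[:top_n], ranked[top_n:]
    slices = [(name, 100 * value / total) for name, value in top]
    if rest:
        # Everything outside the top_n is merged into a single slice.
        others_total = sum(value for _, value in rest)
        slices.append(("Others", 100 * others_total / total))
    return slices
```

Grouping the tail keeps the chart readable: with dozens of countries, individual slices for the smallest economies would be too thin to label.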

Module Documentation

loader.py

Handles CSV data loading and initial validation

load_data(file_path: str) -> DataFrame

processor.py

Processes data based on configuration

process_data(df: DataFrame, config: dict) -> tuple

visualizer.py

Creates and displays visualizations

show_dashboard(data: DataFrame, result_value: float, config: dict) -> None
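
The three signatures above suggest how the modules chain together. The sketch below passes the functions in as parameters so it can run on its own. `run_pipeline` is a hypothetical name, and the shape of the tuple returned by `process_data` (filtered data, calculated value) is an assumption inferred from the parameters of `show_dashboard`.

```python
def run_pipeline(config, data_path, load_data, process_data, show_dashboard):
    """Chain the three module functions: load, then process, then display."""
    df = load_data(data_path)
    # Assumption: the tuple is (filtered data, calculated value).
    filtered, result_value = process_data(df, config)
    show_dashboard(filtered, result_value, config)
    return result_value
```

In the real project the three callables would come from src/loader.py, src/processor.py, and src/visualizer.py.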

Version History

| Version | Date | Status | Changes |
| --- | --- | --- | --- |
| 1.0.0 | Feb 2026 | Released | Initial release with basic functionality |
| 1.1.0 | Feb 2026 | Released | Added sequential graph display |
| 1.2.0 | Feb 2026 | Current | Complete documentation and production ready |

Contributors

Umair Amjad - Co-Developer
Muhammad Ali - Co-Developer

License

This is a semester project for educational purposes.

Support

For issues or questions, please contact the project contributors or refer to the troubleshooting section above.
