A Python-based ETL pipeline that fetches real-time weather data from Open-Meteo (no API key req) for multiple cities, transforms the data using Pandas, and stores it into a PostgreSQL database. You can later perform analysis on historical weather patterns such as average temperatures, wind speeds, etc.
- Python 3
- Pandas – for data transformation
- Requests – for interacting with the API
- PostgreSQL – for storing time-series weather data
- psycopg2-binary – to connect Python to PostgreSQL
-
Extract:
Read city coordinates from a CSV and fetch current weather data using the Open-Meteo API. -
Transform:
Normalize the JSON response into a clean Pandas DataFrame with consistent schema. -
Load:
Insert the cleaned weather data into a PostgreSQL table (weather_data). -
Analyze (optional):
Run queries to calculate daily average, min, max temperatures, etc.
git clone https://github.com/your-username/weather_data_collector.git cd weather_data_collector
python -m venv .venv source .venv/bin/activate # macOS/Linux .venv\Scripts\activate # Windows
pip install -r requirements.txt
-- Run in psql terminal CREATE USER postgres(username) WITH PASSWORD 'your_password'; CREATE DATABASE weatherdb postgres;
DB_CONFIG = {
'host': 'localhost',
'port': '5432',
'user': 'postgres',
'password': 'your_password',
'dbname': 'weatherdb'
}
python main.py --- Expected output: 🚀 Fetching weather data... ✅ Data loaded successfully.
psql -U postgres -d weatherdb -- Then in the psql shell: SELECT * FROM weather_data LIMIT 10;
Kavin Kishore
B.Tech Student, DTU
Built as a real-time data engineering mini project.