Pasindu-Nimsara/companies_dataEngineering

🌍 Global Companies Data Pipeline and BI Dashboard

This solo project demonstrates a complete data engineering and BI pipeline using:

🔧 Tools: Python | Azure Data Factory | Azure Data Lake | Azure SQL | Power BI

🔁 Workflow

  1. Extracted company data
  2. Cleaned the data using Python (Google Colab); Azure Data Factory data flows are the recommended alternative
  3. Stored raw data in Azure Data Lake
  4. Moved cleaned data to Azure SQL using Azure Data Factory
  5. Visualized insights using Power BI
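
The cleaning step (step 2) could look roughly like the sketch below. It is a minimal illustration only: the column names (`name`, `country`, `industry`, `revenue_usd_m`, `employees`) and the sample rows are hypothetical, not taken from the project's dataset.

```python
import pandas as pd


def clean_companies(raw: pd.DataFrame) -> pd.DataFrame:
    """Normalize a scraped company table (column names are hypothetical)."""
    df = raw.copy()
    # Strip stray whitespace from the text columns.
    for col in ("name", "country", "industry"):
        df[col] = df[col].str.strip()
    # Remove currency symbols and thousands separators, then convert to numbers.
    df["revenue_usd_m"] = pd.to_numeric(
        df["revenue_usd_m"].astype(str).str.replace(r"[$,]", "", regex=True),
        errors="coerce",  # unparseable values such as "n/a" become NaN
    )
    df["employees"] = pd.to_numeric(
        df["employees"].astype(str).str.replace(",", "", regex=False),
        errors="coerce",
    )
    # Drop rows without a usable revenue figure, then de-duplicate by name.
    df = df.dropna(subset=["revenue_usd_m"]).drop_duplicates(subset="name")
    return df.reset_index(drop=True)


# Made-up sample rows, for illustration only.
raw = pd.DataFrame({
    "name": [" Alpha Corp", "Beta Ltd ", "Beta Ltd ", "Gamma Inc"],
    "country": ["US", "CN ", "CN ", " DE"],
    "industry": ["Retail", "Energy", "Energy", "Automotive"],
    "revenue_usd_m": ["$1,200", "950", "950", "n/a"],
    "employees": ["10,000", "5,500", "5,500", "300"],
})
cleaned = clean_companies(raw)
print(cleaned)
```

The cleaned frame can then be written to CSV (for example with `cleaned.to_csv(...)`) before Azure Data Factory copies it into Azure SQL.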

📊 Dashboard Features

  • Top 10 companies by revenue
  • Revenue by country (map view)
  • Revenue by industry (pie chart)
  • Average number of employees
  • Slicers for filtering by rank, country, and industry
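
The dashboard measures above can be sanity-checked outside Power BI with a few pandas aggregations. This is only a sketch under assumptions: the column names and the four sample rows are hypothetical, and the actual dashboard computes these values inside Power BI.

```python
import pandas as pd

# Made-up sample rows, for illustration only.
companies = pd.DataFrame({
    "name": ["A Corp", "B Corp", "C Corp", "D Corp"],
    "country": ["US", "US", "CN", "DE"],
    "industry": ["Retail", "Tech", "Energy", "Automotive"],
    "revenue_usd_m": [600, 400, 500, 300],
    "employees": [2000, 1000, 3000, 2000],
})

# Top 10 companies by revenue (fewer rows are returned if the table is smaller).
top10 = companies.nlargest(10, "revenue_usd_m")[["name", "revenue_usd_m"]]

# Total revenue per country, as plotted on the map view.
revenue_by_country = (
    companies.groupby("country")["revenue_usd_m"].sum().sort_values(ascending=False)
)

# Share of total revenue per industry, as shown in the pie chart.
industry_share = (
    companies.groupby("industry")["revenue_usd_m"].sum()
    / companies["revenue_usd_m"].sum()
)

# Average number of employees across all companies.
avg_employees = companies["employees"].mean()

print(top10, revenue_by_country, industry_share, avg_employees, sep="\n\n")
```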

🔗 Technologies Used

  • Python (BeautifulSoup, Pandas)
  • Azure Data Factory (ETL)
  • Azure Data Lake (Storage)
  • Azure SQL (Database)
  • Power BI (Dashboard)

🙋‍♂️ Built By

Pasindu Nimsara
LinkedIn Profile

About

This project showcases a solo end-to-end data engineering pipeline built on Microsoft Azure and Power BI. It analyzes the Top 50 Global Companies by Revenue, from raw data collection through to a fully interactive BI dashboard.
