In retail data processing, transactional records are often stored in a "Tall" format (one row per transaction). To enable per-customer performance analysis, this data must be transformed into a "Wide" format.
Goal: Automate the transformation of multi-row customer transactions into a single, consolidated record using ADF Mapping Data Flows.
- Source: Delimited text files (CSV/TXT) stored in the `input` container of Azure Blob Storage.
- Dataset: Parameterized DelimitedText dataset to handle incoming sales records.
- Pivot Logic: Grouped by `CustomerID` and pivoted on the `Product` key (see the pandas sketch after the example tables below).
- Dynamic Aggregations:
  - Calculated `SUM(Quantity)` with prefix `Qty_`
  - Calculated `SUM(Amount)` with prefix `Amt_`
- Type Casting: Converted string inputs to integers in the data flow expression builder so the SUM aggregations operate on numeric values rather than text.
- Target: Optimized output stored as a single partitioned CSV in the `output` container for direct ingestion by BI tools.
- ADF Version: V2
- Compute: Auto-resolve Integration Runtime (Central India)
- Optimization: Used a Single Partition setting in the Sink to ensure a consolidated output file for reporting compatibility.
- Security: Public network access restricted to specific Azure services.
From (Transactional):
| CustomerID | Product | Month | Quantity | Amount |
|---|---|---|---|---|
| 101 | Pen | Jan | 10 | 100 |
| 101 | Notebook | Jan | 5 | 250 |
To (Analytical):
| CustomerID | Qty_Notebook | Qty_Pen | Amt_Notebook | Amt_Pen |
|---|---|---|---|---|
| 101 | 5 | 10 | 250 | 100 |
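Before building the data flow, the same tall-to-wide reshape can be prototyped locally. The following is a minimal pandas sketch of the logic (illustration only; the production transformation runs entirely inside the ADF Mapping Data Flow), assuming the raw file is comma-delimited with a header row:

```python
# Minimal local prototype of the ADF pivot: tall transactions -> wide per-customer row.
import pandas as pd

df = pd.read_csv("sales_raw_data.txt")  # columns: CustomerID, Product, Month, Quantity, Amount

# Mirror the data flow's toInteger() casts so sums are numeric, not string concatenation.
df[["Quantity", "Amount"]] = df[["Quantity", "Amount"]].astype(int)

wide = df.pivot_table(
    index="CustomerID",      # Group By
    columns="Product",       # Pivot Key
    values=["Quantity", "Amount"],
    aggfunc="sum",
    fill_value=0,
)

# Flatten the (measure, product) column pairs into Qty_<Product> / Amt_<Product>.
prefix = {"Quantity": "Qty", "Amount": "Amt"}
wide.columns = [f"{prefix[measure]}_{product}" for measure, product in wide.columns]
print(wide.reset_index())
```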
- Raw data from external sources is stored in an Azure Storage Account.
- Requirement: Transform multiple rows → a single row per customer using the ADF Pivot Transformation.
- Store the transformed data into a separate target storage for reporting & analytics.
- Source → Raw data (CSV/TXT) stored in Azure Blob / Data Lake.
- Data Flow in ADF
- Pivot Transformation → Reshape rows into columns per `CustomerID`.
- Derived Column (optional) → Calculate totals or new metrics.
- Sink → Final transformed dataset stored in the `output` container.
Input (`sales_raw_data.txt`):
| CustomerID | Product | Month | Quantity | Amount |
|---|---|---|---|---|
| 101 | Pen | Jan | 10 | 100 |
| 101 | Notebook | Jan | 5 | 250 |
| 101 | Pencil | Feb | 20 | 200 |
| 101 | Pen | Feb | 15 | 150 |
| 102 | Pen | Jan | 8 | 80 |
| 102 | Notebook | Jan | 12 | 600 |
| 102 | Pencil | Feb | 10 | 100 |
| 102 | Pen | Mar | 20 | 200 |
| 103 | Notebook | Jan | 7 | 350 |
| 103 | Pencil | Feb | 15 | 150 |
| 103 | Pen | Feb | 12 | 120 |
| 103 | Notebook | Mar | 9 | 450 |
| 104 | Pen | Jan | 25 | 250 |
| 104 | Pencil | Jan | 30 | 300 |
| 104 | Notebook | Feb | 20 | 1000 |
| 104 | Pen | Mar | 10 | 100 |
| 105 | Notebook | Jan | 18 | 900 |
| 105 | Pen | Feb | 22 | 220 |
| 105 | Pencil | Mar | 12 | 120 |
| 105 | Notebook | Mar | 10 | 500 |
Expected Output:
| CustomerID | Qty_Notebook | Qty_Pen | Qty_Pencil | Amt_Notebook | Amt_Pen | Amt_Pencil |
|---|---|---|---|---|---|---|
| 101 | 5 | 25 | 20 | 250 | 250 | 200 |
| 104 | 20 | 35 | 30 | 1000 | 350 | 300 |
| 102 | 12 | 28 | 10 | 600 | 280 | 100 |
| 103 | 16 | 12 | 15 | 800 | 120 | 150 |
| 105 | 28 | 22 | 12 | 1400 | 220 | 120 |
- Name: `sharadstorageaccount`
- Region: Asia Pacific (South India)
- Performance: Standard
- Redundancy: Locally Redundant Storage (LRS)
- Access Tier: Hot
- Networking: Public network access → Enabled
Containers:
- `input` → For raw data (`sales_raw_data.txt`)
- `output` → For transformed data
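The container setup and raw-file upload can also be scripted. Below is a sketch using the `azure-storage-blob` SDK; the connection-string environment variable is an assumed placeholder, not part of the original project:

```python
# Sketch: create the input/output containers and upload the raw file.
# Assumes AZURE_STORAGE_CONNECTION_STRING is set for sharadstorageaccount (placeholder).
import os
from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string(os.environ["AZURE_STORAGE_CONNECTION_STRING"])

for name in ("input", "output"):
    try:
        service.create_container(name)
    except ResourceExistsError:
        pass  # container already present

with open("sales_raw_data.txt", "rb") as f:
    service.get_blob_client(container="input", blob="sales_raw_data.txt").upload_blob(f, overwrite=True)
```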
- Name: `SharadDataFactory1`
- Region: Central India
- Version: V2
- Launch ADF Studio. (A programmatic alternative to these portal steps is sketched below.)
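For completeness, the portal steps above have an equivalent in the `azure-mgmt-datafactory` management SDK. A minimal sketch, assuming placeholder subscription and resource-group values:

```python
# Sketch: create the data factory programmatically (mirrors the portal steps above).
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory

subscription_id = "<subscription-id>"  # placeholder
resource_group = "<resource-group>"    # placeholder

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)
factory = adf_client.factories.create_or_update(
    resource_group, "SharadDataFactory1", Factory(location="centralindia")
)
print(factory.provisioning_state)
```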
- Type: Azure Blob Storage → DelimitedText
- Linked Service: Connected to `sharadstorageaccount`
- File Path: `input/sales_raw_data.txt`
- Verified with Test Connection ✅ (a programmatic spot check is sketched below)
- Source → Dataset from step 4.
- Data Preview → Verified data ingestion.
Pivot Transformation
- Group By → `CustomerID`
- Pivot Key → `Product`
- Aggregations:
  - `sum(toInteger(Quantity))` → prefix `Qty_`
  - `sum(toInteger(Amount))` → prefix `Amt_`
Sink
- Dataset: Azure Blob Storage → DelimitedText
- Linked Service: Same as source
- File Path: `output/`
- Optimized with Single Partition → single output file.
- Created a new pipeline.
- Dragged & dropped the Data Flow activity into the pipeline.
- Publish → Trigger Now.
- Monitor tab → Verified successful pipeline execution. (An SDK-based trigger-and-monitor sketch follows.)
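Trigger Now and the Monitor tab have SDK equivalents. A sketch, assuming the pipeline was saved as `PivotPipeline` (hypothetical name) and the same placeholder subscription and resource group as above:

```python
# Sketch: trigger the pipeline and poll its run status via the management SDK.
import time
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

subscription_id = "<subscription-id>"  # placeholder
resource_group = "<resource-group>"    # placeholder

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)
run = adf_client.pipelines.create_run(resource_group, "SharadDataFactory1", "PivotPipeline")  # hypothetical pipeline name

while True:
    status = adf_client.pipeline_runs.get(resource_group, "SharadDataFactory1", run.run_id).status
    print("Pipeline run status:", status)
    if status not in ("Queued", "InProgress"):
        break
    time.sleep(15)
```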
- Final transformed file stored in the `output` container.
- Verified the correct pivoted structure (see the validation sketch below).
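The structural check can also be automated. A small validation sketch, again assuming the placeholder connection string, pulls the single output file and asserts the pivoted schema:

```python
# Sketch: download the pivoted CSV from the output container and check its columns.
import io
import os
import pandas as pd
from azure.storage.blob import ContainerClient

container = ContainerClient.from_connection_string(
    os.environ["AZURE_STORAGE_CONNECTION_STRING"], container_name="output"
)
blob_name = next(iter(container.list_blobs())).name  # Single Partition -> one file
result = pd.read_csv(io.BytesIO(container.download_blob(blob_name).readall()))

expected = {"CustomerID", "Qty_Notebook", "Qty_Pen", "Qty_Pencil",
            "Amt_Notebook", "Amt_Pen", "Amt_Pencil"}
assert expected <= set(result.columns), "pivoted columns missing"
print(result.sort_values("CustomerID").to_string(index=False))
```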
- Storage Account Creation
- Container Setup
- Azure Data Factory Creation
- Dataset Configuration
- Data Flow Creation
- Pipeline Creation
- Triggering
- Monitoring
- Destination
- Azure Storage Account setup (Blob & Data Lake Gen2).
- Dataset and Linked Service creation in ADF.
- Pivot Transformation for reshaping data.
- End-to-end pipeline execution & monitoring.
- Data validation in Sink.
This project demonstrates how to use Azure Data Factory to transform raw transactional data into a customer-centric summary using pivot transformations.
It can be extended for real-world analytics pipelines such as sales aggregation, reporting dashboards, and data warehouse ETL.