🏗️ System Architecture

AI-Driven Manufacturing Intelligence — Track A

Track: A — Predictive Modelling Specialization
Domain: Pharmaceutical Tablet Manufacturing
Data: _h_batch_process_data.xlsx + _h_batch_production_data.xlsx


Table of Contents

  1. High-Level Architecture
  2. Layer-by-Layer Breakdown
  3. Module Designs
  4. Full Pipeline Flow
  5. Project Folder Structure
  6. API Design
  7. Dashboard Design
  8. Data Flow Between Files

1. High-Level Architecture

┌──────────────────────────────────────────────────────────────────────────────────┐
│                              DATA INGESTION LAYER                                │
│                                                                                  │
│  ┌─────────────────────────────┐       ┌──────────────────────────────────────┐  │
│  │  _h_batch_process_data.xlsx │       │   _h_batch_production_data.xlsx      │  │
│  │   (Time-Series Sensors)     │       │    (Batch Outcome Records)           │  │
│  │                             │       │                                      │  │
│  │   • 211 rows, 1 batch       │       │   • 60 batches (T001–T060)           │  │
│  │   • 8 manufacturing phases  │       │   • 8 input process features         │  │
│  │   • Power + Vibration sigs  │       │   • 6 quality/yield/perf targets     │  │
│  └──────────────┬──────────────┘       └──────────────────┬───────────────────┘  │
└─────────────────┼────────────────────────────────────────┼─────────────────────┘
                  │                                         │
                  ▼                                         ▼
┌──────────────────────────────────────────────────────────────────────────────────┐
│                      PREPROCESSING & FEATURE ENGINEERING                         │
│                                                                                  │
│  ┌────────────────────────────────┐   ┌──────────────────────────────────────┐  │
│  │   Time-Series Processing       │   │   Batch-Level Engineering            │  │
│  │                                │   │                                      │  │
│  │  • Phase segmentation          │   │  • MinMax normalization              │  │
│  │  • Phase-wise aggregation      │   │  • IQR outlier detection             │  │
│  │  • Rolling statistics (5-min)  │   │  • Physics-based energy simulation   │  │
│  │  • FFT vibration features      │   │  • Carbon footprint derivation       │  │
│  │  • Anomaly injection (sim)     │   │  • Feature correlation analysis      │  │
│  └───────────────┬────────────────┘   └──────────────────────┬───────────────┘  │
│                  └──────────────────────────┬─────────────────┘                 │
│                                             ▼                                   │
│                          ┌──────────────────────────────┐                       │
│                          │     MERGED FEATURE MATRIX     │                       │
│                          │   60 batches × ~22 features   │                       │
│                          └──────────────────────────────┘                       │
└──────────────────────────────────────────────────────────────────────────────────┘
                  │
                  ▼
┌──────────────────────────────────────────────────────────────────────────────────┐
│                               ML MODEL LAYER                                     │
│                                                                                  │
│  ┌──────────────────────────────────┐   ┌────────────────────────────────────┐  │
│  │  MODULE 1                        │   │  MODULE 2                          │  │
│  │  Multi-Target Regressor          │   │  Energy Pattern Analyser           │  │
│  │                                  │   │                                    │  │
│  │  Input:  8 process params        │   │  Input:  Power_kW, Vibration_mm_s  │  │
│  │  Output: 7 simultaneous targets  │   │          Time_Minutes, Phase       │  │
│  │                                  │   │                                    │  │
│  │  Stack:                          │   │  Stack:                            │  │
│  │  • XGBoost MultiOutputRegressor  │   │  • Isolation Forest (batch-level)  │  │
│  │  • Random Forest                 │   │  • LSTM Autoencoder (time-series)  │  │
│  │  • MLP Neural Network            │   │  • Phase-wise z-score rules        │  │
│  │  • Ridge Stacking Meta-Learner   │   │                                    │  │
│  │                                  │   │  Output:                           │  │
│  │  Target: R² ≥ 0.90 all targets   │   │  • Anomaly score (0–1)             │  │
│  └────────────────┬─────────────────┘   │  • Root cause attribution          │  │
│                   │                     │  • Phase health flags              │  │
│                   │                     └──────────────────┬─────────────────┘  │
│                   └──────────────────────────┬─────────────┘                    │
│                                              ▼                                  │
│                   ┌──────────────────────────────────────────┐                  │
│                   │   MODULE 3: SHAP Explainability Engine    │                  │
│                   │   • Per-target feature importance         │                  │
│                   │   • Per-batch waterfall explanations      │                  │
│                   │   • Beeswarm + bar summary plots          │                  │
│                   └──────────────────────────────────────────┘                  │
│                                              │                                  │
│                   ┌──────────────────────────────────────────┐                  │
│                   │   MODULE 4: Carbon Footprint Tracker      │                  │
│                   │   • CO₂e per batch  (Energy × 0.716)      │                  │
│                   │   • Adaptive target setting               │                  │
│                   │   • Regulatory compliance tracking        │                  │
│                   └──────────────────────────────────────────┘                  │
└──────────────────────────────────────────────────────────────────────────────────┘
                  │
                  ▼
┌──────────────────────────────────────────────────────────────────────────────────┐
│                        SERVING & VISUALIZATION LAYER                             │
│                                                                                  │
│   ┌──────────────────────────────────┐   ┌────────────────────────────────────┐ │
│   │   FastAPI REST Backend           │   │   Next.js Web Dashboard            │ │
│   │                                  │   │   (React 19 + TypeScript)          │ │
│   │   POST /api/predict              │   │   Tab 1: Predictions               │ │
│   │   POST /api/anomaly              │   │   Tab 2: Energy Monitor            │ │
│   │   GET  /api/explain/{batch_id}   │   │   Tab 3: Batch Comparison          │ │
│   │   GET  /api/carbon/{batch_id}    │   │   Tab 4: Carbon Footprint          │ │
│   │   GET  /api/batches              │   │   Tab 5: What-If Optimizer         │ │
│   │   GET  /api/carbon_history       │   │   Tab 6: Benchmark Report          │ │
│   │   GET  /api/model_metrics        │   │                                    │ │
│   │   GET  /api/health               │   │   Recharts visualizations          │ │
│   │   < 100ms inference time         │   │   Real-time slider predictions     │ │
│   │   Swagger auto-docs at /docs     │   │   http://localhost:3000            │ │
│   └──────────────────────────────────┘   └────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────────────┘

2. Layer-by-Layer Breakdown

Layer 1 — Data Ingestion

Two source files feed the system. They join on Batch_ID, but T001 is the only batch present in both files. The join therefore serves sensor simulation: the T001 sensor profile becomes the template that is scaled to all 60 batches.

| File | What It Represents | Primary Use |
|---|---|---|
| _h_batch_process_data.xlsx | Sensor log of one batch (T001), minute-by-minute | Energy pattern module + sensor simulation template |
| _h_batch_production_data.xlsx | Summary records of 60 batches | Multi-target prediction + target derivation |
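The overlap between the two files can be checked before any join is attempted. The snippet below is a minimal sketch that assumes the Batch_ID columns have already been read into plain lists; the list names are hypothetical and file loading is left out.

```python
# Hypothetical stand-ins for the Batch_ID columns of the two source files.
process_ids = ["T001"] * 211                           # sensor log: one batch, 211 rows
production_ids = [f"T{i:03d}" for i in range(1, 61)]   # batch records: T001..T060

# Batches that have real sensor data in both files.
common = sorted(set(process_ids) & set(production_ids))

# Batches that exist only in the production file and need simulated sensors.
missing_sensor_data = sorted(set(production_ids) - set(process_ids))

print(common)
print(len(missing_sensor_data))
```

A check like this makes the dependence on simulation explicit: every batch outside `common` has no measured sensor profile.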

Layer 2 — Preprocessing & Feature Engineering

Two parallel pipelines that merge before model training:

Time-Series Pipeline (from File 1):

Raw sensor data (211 rows)
  → Segment by Phase (8 phases)
  → Aggregate per phase: mean, max, std of Power and Vibration
  → Compute rolling 5-min power slope (trend indicator)
  → FFT dominant frequency of vibration (motor frequency signature)
  → Result: 1 row of 32 features representing T001's phase profile
  → Simulate for T002–T060 using physics-based scaling
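The aggregation and slope steps above can be sketched without any third-party dependency. This is an illustrative sketch, not the project's `feature_engineering.py`: the function names are hypothetical, the column names are taken from the Module 2 inputs, and population standard deviation is an assumption. It emits three statistics per signal per phase, so the exact feature set behind the quoted 32 features is not reproduced here.

```python
from collections import defaultdict
from statistics import mean, pstdev

SIGNALS = ("Power_kW", "Vibration_mm_s")

def aggregate_by_phase(rows):
    """Collapse a sensor log into one flat feature dict.

    rows: iterable of dicts with keys "Phase", "Power_kW", "Vibration_mm_s".
    """
    by_phase = defaultdict(lambda: {s: [] for s in SIGNALS})
    for row in rows:
        for signal in SIGNALS:
            by_phase[row["Phase"]][signal].append(row[signal])

    features = {}
    for phase, signals in by_phase.items():
        for signal, values in signals.items():
            prefix = f"{phase}_{signal}"
            features[f"{prefix}_mean"] = mean(values)
            features[f"{prefix}_max"] = max(values)
            features[f"{prefix}_std"] = pstdev(values)  # population std (assumption)
    return features

def slope(values):
    """Least-squares slope of values against their sample index."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den

def rolling_slope(values, window=5):
    """Slope over each trailing window; window must be at least 2 samples."""
    return [slope(values[i - window + 1 : i + 1])
            for i in range(window - 1, len(values))]
```

With minute-by-minute data, a window of 5 samples corresponds to the 5-minute trend indicator mentioned above.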

Batch-Level Pipeline (from File 2):

Raw batch records (60 rows)
  → Validate: no nulls, check value ranges
  → Detect outliers using IQR (flag, do not drop)
  → Derive Energy_kWh using physics formula
  → Derive Carbon_kgCO2e = Energy_kWh × 0.716
  → Normalize with MinMaxScaler (fit on train only)
  → Result: 60 rows × 10 engineered features
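Two of the steps above, IQR flagging and the carbon derivation, are simple enough to sketch directly. The helper names are hypothetical, and the 1.5 multiplier is the conventional IQR fence rather than a value stated in this document. Flags are returned alongside the data so that no row is dropped.

```python
from statistics import quantiles

EMISSION_FACTOR = 0.716  # kg CO2e per kWh, the factor used throughout this document

def iqr_flags(values, k=1.5):
    """Return one boolean per value: True if outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional fence multiplier (an assumption here).
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]

def carbon_kg(energy_kwh):
    """Carbon_kgCO2e = Energy_kWh x 0.716."""
    return energy_kwh * EMISSION_FACTOR

# Example: the last reading is flagged but stays in the dataset.
readings = [10, 11, 12, 13, 14, 15, 100]
flags = iqr_flags(readings)
print(list(zip(readings, flags)))
```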

Merge:

Join both pipelines on Batch_ID
→ Final matrix: 60 rows × ~22 features
→ 80/20 split: 48 train, 12 test
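Fitting the scaler on the training rows only is what keeps test information out of preprocessing. The sketch below illustrates the idea in plain Python; the function names and the seed are hypothetical, and the project itself uses MinMaxScaler.

```python
import random

def train_test_split_ids(ids, test_fraction=0.2, seed=42):
    """Shuffle IDs reproducibly and return (train_ids, test_ids)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_minmax(train_column):
    """Learn min and max from training values only."""
    return min(train_column), max(train_column)

def apply_minmax(column, low, high):
    """Scale with previously fitted bounds; constant columns map to 0.0."""
    span = high - low
    return [(v - low) / span if span else 0.0 for v in column]

batch_ids = [f"T{i:03d}" for i in range(1, 61)]
train_ids, test_ids = train_test_split_ids(batch_ids)
print(len(train_ids), len(test_ids))

low, high = fit_minmax([10.0, 20.0, 30.0])   # training values
print(apply_minmax([40.0], low, high))       # test value can fall outside [0, 1]
```

A test value outside the training range scales to a number outside [0, 1]. That is expected behaviour when the bounds come from the training split alone.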

Layer 3 — ML Model Layer

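Four modules, each built and serialized independently. Modules 1 and 2 are trained models, Module 3 explains the Module 1 XGBoost model, and Module 4 is a rule-based formula: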

| Module | Algorithm(s) | Input | Output |
|---|---|---|---|
| 1 — Prediction | XGBoost + RF + MLP → Ridge Stacking | 8 process params + 4 sim features | 7 targets (6 quality + Energy_kWh) |
| 2 — Anomaly | Isolation Forest + LSTM Autoencoder | Phase-aggregated sensor features / raw time-series | Anomaly score + root cause |
| 3 — Explainability | SHAP TreeExplainer | XGBoost model + input features | Feature contribution scores |
| 4 — Carbon | Rule-based formula | Predicted Energy_kWh | CO₂e + dynamic targets |

Layer 4 — Serving & Visualization

FastAPI (port 8000):
  → Loads all serialized models at startup
  → Handles prediction, anomaly, explanation, carbon requests
  → All responses in JSON

Next.js Dashboard (port 3000):
  → Calls FastAPI endpoints via REST
  → Renders Recharts visualizations for all charts
  → Real-time slider updates trigger /api/predict calls (<100ms)

3. Module Designs

Module 1 — Multi-Target Prediction

Architecture

Input (8 features):
  Granulation_Time, Binder_Amount, Drying_Temp, Drying_Time,
  Compression_Force, Machine_Speed, Lubricant_Conc, Moisture_Content
  + sim features: phase_power_compression, phase_vibration_milling, etc.

                    ┌─────────────────┐
                    │   5-Fold CV     │
                    └────────┬────────┘
                             │ out-of-fold predictions
          ┌──────────────────┼──────────────────┐
          ▼                  ▼                  ▼
   ┌────────────┐    ┌────────────┐    ┌────────────┐
   │  XGBoost   │    │  Random    │    │    MLP     │
   │  MultiOut  │    │  Forest    │    │  7-output  │
   └──────┬─────┘    └─────┬──────┘    └──────┬─────┘
          │                │                  │
          └────────────────┼──────────────────┘
                           ▼
                  ┌─────────────────┐
                  │  Ridge Stacking │  ← meta-learner
                  │  Meta-Learner   │
                  └────────┬────────┘
                           ▼
Output (7 targets):
  Hardness, Friability, Dissolution_Rate, Content_Uniformity,
  Disintegration_Time, Tablet_Weight, Energy_kWh

Model Comparison Table

| Model | Strength | Weakness | Role in Ensemble |
|---|---|---|---|
| XGBoost | Best single-model performance; handles non-linearity | Sensitive to hyperparameters | Primary base learner |
| Random Forest | Stable, low-variance; bagging diversity | Slightly lower accuracy than XGBoost | Diversity provider |
| MLP | Captures inter-target correlations via shared layers | Needs more data; slower to train | Non-linear interaction catcher |
| Ridge Stacking | Learns optimal weights for blending | Linear only | Final combiner |
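The meta-learner is fitted on out-of-fold predictions so that it never sees a base-learner prediction for a row that learner was trained on. The sketch below shows only that mechanism. The helper names are hypothetical, and a mean predictor stands in for XGBoost, Random Forest, and the MLP. In scikit-learn the same result is usually obtained with `cross_val_predict`.

```python
def kfold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, val
        start = stop

def out_of_fold_predictions(X, y, fit, predict, k=5):
    """Predict every row with a model that did not see that row during fitting."""
    oof = [None] * len(X)
    for train_idx, val_idx in kfold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in val_idx])
        for i, pred in zip(val_idx, preds):
            oof[i] = pred
    return oof

# Placeholder base learner: predicts the training mean of the target.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)

def predict_mean(model, X_val):
    return [model for _ in X_val]

X = [[float(i)] for i in range(10)]
y = [float(i) for i in range(10)]
oof = out_of_fold_predictions(X, y, fit_mean, predict_mean, k=5)
print(oof)
```

Stacking one OOF column per base learner (and per target) side by side gives the training matrix for the Ridge meta-learner.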

Module 2 — Energy Pattern Analyser

Two-Layer Detection Design

Batch sensor data (211 timesteps × 2 channels: Power, Vibration)
                         │
          ┌──────────────┴──────────────┐
          ▼                             ▼
  ┌───────────────┐             ┌──────────────────────┐
  │ Layer 1:      │             │ Layer 2:             │
  │ Isolation     │             │ LSTM Autoencoder     │
  │ Forest        │             │                      │
  │               │             │ Encoder:             │
  │ Input: 32     │             │  LSTM(64) → LSTM(32) │
  │ phase-agg     │             │  → Dense(16) [z]     │
  │ features      │             │                      │
  │               │             │ Decoder:             │
  │ Output:       │             │  Dense(32) →         │
  │ anomaly_flag  │             │  LSTM(32) →          │
  │ score (0-1)   │             │  LSTM(64) →          │
  │               │             │  TimeDistributed(2)  │
  │ Speed: ~5ms   │             │                      │
  └───────┬───────┘             │ Output:              │
          │                     │ reconstruction_error │
          │                     │ per timestep         │
          │                     │                      │
          │                     │ Speed: ~50ms         │
          │                     └──────────┬───────────┘
          └──────────────────────────┬─────┘
                                     ▼
                         ┌──────────────────────┐
                         │  Root Cause Engine   │
                         │  (Rules + ML fusion) │
                         │                      │
                         │ IF vib_milling > 8.5 │
                         │ → bearing wear       │
                         │                      │
                         │ IF pwr_comp > 58.0   │
                         │ → motor overload     │
                         │                      │
                         │ IF pwr_dry > 28.0    │
                         │ → damp raw material  │
                         └──────────────────────┘

Anomaly Types & Signals

| Anomaly Type | Affected Phase | Signal Pattern | Root Cause Message |
|---|---|---|---|
| Bearing wear | Milling | Vibration > 8.5 mm/s | "Bearing wear suspected — schedule inspection" |
| Motor overload | Compression | Power > 58.0 kW | "Motor stress — check tooling and die fill" |
| Process drift | Drying | Power > 28.0 kW consistently | "High moisture in raw material — check intake specs" |
| Gradual degradation | Any | Batch-to-batch power increase | "CUSUM drift detected — maintenance review needed" |
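The three threshold rules can be expressed as a small lookup table. This is a minimal sketch of the rule part only: the feature keys and limits come from the Root Cause Engine diagram, while the function name and return shape are hypothetical. The CUSUM drift rule is omitted because its parameters are not given here.

```python
# (feature key, limit, anomaly type); limits as listed in the table above.
RULES = [
    ("vib_milling", 8.5, "Bearing wear"),     # milling vibration, mm/s
    ("pwr_comp", 58.0, "Motor overload"),     # compression power, kW
    ("pwr_dry", 28.0, "Process drift"),       # drying power, kW
]

def root_causes(features):
    """Return the anomaly types whose limit is strictly exceeded.

    Keys missing from `features` are treated as not exceeding any limit.
    """
    return [
        label
        for key, limit, label in RULES
        if key in features and features[key] > limit
    ]

print(root_causes({"vib_milling": 11.2, "pwr_comp": 50.0, "pwr_dry": 28.0}))
```

Keeping the limits in one table makes them easy to move into `src/config.py`, which is described as holding all thresholds.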

Module 3 — SHAP Explainability Engine

Trained XGBoost models (one per target)
               │
               ▼
    shap.TreeExplainer(xgb_model)
               │
    ┌──────────┴──────────────────────┐
    ▼                                 ▼
Global view                     Local view
(all 60 batches)                (1 specific batch)
                                
  Beeswarm plot:                  Waterfall plot:
  Feature vs SHAP value           Base value
  spread across batches           + Compression_Force (+18N)
                                  + Moisture_Content  (-7N)
  Bar chart:                      + Machine_Speed     (+2N)
  Mean |SHAP| ranking             = Final prediction: 95N
  per target

Output per API Call

{
  "target": "Dissolution_Rate",
  "base_value": 90.93,
  "prediction": 87.2,
  "feature_contributions": {
    "Compression_Force": -2.8,
    "Moisture_Content":  -1.5,
    "Machine_Speed":      0.6,
    "Binder_Amount":     -0.3,
    "Drying_Temp":        1.2,
    "Granulation_Time":  -0.9,
    "Drying_Time":       -0.1,
    "Lubricant_Conc":     0.07
  },
  "top_driver": "Compression_Force is pulling Dissolution_Rate down by 2.8%"
}
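SHAP values are additive: the base value plus all feature contributions equals the prediction. The sketch below checks that property on the example payload and derives the top driver. The payload is written as strict JSON, which does not allow a leading plus sign on numbers; the variable names are illustrative.

```python
import json
import math

payload = json.loads("""
{
  "target": "Dissolution_Rate",
  "base_value": 90.93,
  "prediction": 87.2,
  "feature_contributions": {
    "Compression_Force": -2.8,
    "Moisture_Content": -1.5,
    "Machine_Speed": 0.6,
    "Binder_Amount": -0.3,
    "Drying_Temp": 1.2,
    "Granulation_Time": -0.9,
    "Drying_Time": -0.1,
    "Lubricant_Conc": 0.07
  }
}
""")

contribs = payload["feature_contributions"]

# Additivity: base value + sum of contributions should reproduce the prediction.
reconstructed = payload["base_value"] + sum(contribs.values())
is_additive = math.isclose(reconstructed, payload["prediction"], abs_tol=0.01)

# The top driver is the feature with the largest absolute contribution.
top_driver = max(contribs, key=lambda name: abs(contribs[name]))

print(round(reconstructed, 2), is_additive, top_driver)
```

A check like this is a cheap regression test for the explain endpoint, since a mismatch usually means the explainer and the served model are out of sync.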

Module 4 — Carbon Footprint Tracker

Predicted Energy_kWh (from Module 1)
               │
               ▼
  Carbon_kgCO2e = Energy_kWh × 0.716
  (India grid emission factor, CEA FY 2022-23)
               │
               ▼
  Adaptive Target Algorithm:
  ┌────────────────────────────────────────────┐
  │  1. Look at last 20 batches' emissions     │
  │  2. best_10p = percentile(emissions, 10)   │
  │  3. stretch = best_10p × 0.95              │
  │  4. target = max(stretch, regulatory_floor)│
  └────────────────────────────────────────────┘
               │
               ▼
  Output: {
    dynamic_target, current_avg,
    trend (📉/📈), on_target (✅/❌),
    gap_to_target
  }
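The four steps of the adaptive target algorithm translate almost line for line into code. This is a sketch under stated assumptions: the function names are hypothetical, the percentile uses linear interpolation (NumPy's default convention), and `on_target` compares the recent average with the target, which this document does not define explicitly. The trend indicator is left out for the same reason.

```python
def percentile(values, q):
    """Linear-interpolation percentile for q in 0..100."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    return data[lower] + (data[upper] - data[lower]) * (pos - lower)

def adaptive_target(emissions, regulatory_floor, window=20):
    """Derive a dynamic CO2e target from the most recent batches."""
    recent = emissions[-window:]               # 1. last 20 batches
    best_10p = percentile(recent, 10)          # 2. 10th percentile (lower is better)
    stretch = best_10p * 0.95                  # 3. 5% stretch beyond the best decile
    target = max(stretch, regulatory_floor)    # 4. never below the floor
    current_avg = sum(recent) / len(recent)
    return {
        "dynamic_target": target,
        "current_avg": current_avg,
        "on_target": current_avg <= target,    # assumption: compare the average
        "gap_to_target": current_avg - target,
    }

history = [float(v) for v in range(40, 60)]    # illustrative emissions, kg CO2e
print(adaptive_target(history, regulatory_floor=30.0))
```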

4. Full Pipeline Flow

START
  │
  ▼
[1] LOAD DATA
  ├── _h_batch_process_data.xlsx   → df_process (211 × 11)
  └── _h_batch_production_data.xlsx → df_prod  (60 × 15)
  │
  ▼
[2] VALIDATE
  ├── Assert zero nulls (both files confirmed clean)
  ├── IQR outlier detection → flag extremes
  └── Value range checks per column
  │
  ▼
[3] FEATURE ENGINEERING
  ├── Phase-aggregate df_process → 32 phase features (1 row)
  ├── Physics-based sensor simulation for T002–T060
  ├── Inject anomalies into 10% of simulated batches
  ├── Derive Energy_kWh per batch (physics formula)
  ├── Derive Carbon_kgCO2e = Energy_kWh × 0.716
  └── Merge all → merged_dataset (60 × ~22)
  │
  ▼
[4] TRAIN/TEST SPLIT
  ├── 80/20 → 48 train / 12 test
  └── MinMaxScaler fit on train only, transform both
  │
  ▼
[5] TRAIN MODULE 1 — MULTI-TARGET PREDICTION
  ├── 5-fold CV → out-of-fold predictions
  ├── XGBoost MultiOutputRegressor (Optuna tuning, 50 trials)
  ├── Random Forest MultiOutputRegressor
  ├── MLP Neural Network (shared layers + 7 output heads)
  ├── Ridge Stacking Meta-Learner on OOF predictions
  └── Evaluate: assert R² ≥ 0.90 on all primary targets
  │
  ▼
[6] TRAIN MODULE 2 — ANOMALY DETECTION
  ├── Extract phase-aggregated features from simulated sensors
  ├── Isolation Forest (contamination=0.05, n_estimators=200)
  ├── LSTM Autoencoder (train on normal batches only)
  ├── Compute reconstruction threshold = mean + 3×std
  └── Evaluate: Precision, Recall, AUC-ROC on injected anomalies
  │
  ▼
[7] SET UP MODULE 3 — SHAP
  ├── shap.TreeExplainer on trained XGBoost models
  ├── Compute SHAP values for all 60 batches
  └── Pre-generate beeswarm + bar plots for dashboard
  │
  ▼
[8] SET UP MODULE 4 — CARBON
  ├── Compute Carbon_kgCO2e for all 60 batches
  └── Initialize adaptive target state from batch history
  │
  ▼
[9] SERIALIZE ALL MODELS
  ├── models/xgb_multitarget.pkl
  ├── models/rf_multitarget.pkl
  ├── models/mlp_model.keras
  ├── models/stacking_meta.pkl
  ├── models/isolation_forest.pkl
  ├── models/lstm_autoencoder.keras
  └── models/scaler.pkl
  │
  ▼
[10] START SERVICES
  ├── uvicorn api.main:app --port 8000  (assumes the FastAPI instance in api/main.py is named app)
  └── cd dashboard && npm run dev  (http://localhost:3000)
  │
  ▼
END — System is live
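Step [6] sets the LSTM reconstruction threshold to mean + 3×std of the errors observed on normal batches. The rule itself can be sketched independently of the autoencoder. The function names are hypothetical, and the use of population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def reconstruction_threshold(normal_errors, n_sigma=3.0):
    """Threshold = mean + n_sigma * std of errors from normal batches."""
    return mean(normal_errors) + n_sigma * pstdev(normal_errors)

def flag_anomalies(errors, threshold):
    """True for each reconstruction error strictly above the threshold."""
    return [e > threshold for e in errors]

normal_errors = [0.01, 0.02, 0.03]   # illustrative values only
threshold = reconstruction_threshold(normal_errors)
print(threshold, flag_anomalies([0.042, 0.05], threshold))
```

Because the threshold is learned from normal batches only, it is a natural candidate for the value stored in `models/lstm_threshold.json`.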

5. Project Folder Structure

manufacturing-intelligence/
│
├── README.md
├── SETUP.md                            ← Environment setup guide
├── PIPELINE.md                         ← Plain-English pipeline walkthrough
├── BENCHMARK.md                        ← Model benchmark report
│
├── data/
│   ├── raw/
│   │   ├── _h_batch_process_data.xlsx   ← T001 sensor log (211 rows × 11 cols)
│   │   └── _h_batch_production_data.xlsx ← T001–T060 batch records
│   ├── processed/
│   │   ├── merged_dataset.csv           ← 60 batches × ~22 features (final ML input)
│   │   ├── phase_features.csv           ← Phase-aggregated sensor features
│   │   ├── batch_outcomes.csv           ← Cleaned production data + derived targets
│   │   └── carbon_history.csv           ← Per-batch CO₂e with adaptive targets
│   └── simulated/
│       └── simulated_sensors.csv        ← Physics-based sensor data T001–T060
│
├── notebooks/                           ← Core analysis notebooks (run in order)
│   ├── 01_EDA.ipynb
│   ├── 02_feature_engineering.ipynb
│   ├── 03_multitarget_models.ipynb
│   ├── 04_anomaly_detection.ipynb
│   └── 05_explainability.ipynb
│
├── analysis/                            ← Deep-dive analysis notebooks
│   ├── 01_data_profiling.ipynb
│   ├── 02_correlation_deep_dive.ipynb
│   ├── 03_phase_energy_analysis.ipynb
│   ├── 04_model_comparison.ipynb
│   └── 05_business_impact.ipynb
│
├── src/
│   ├── __init__.py
│   ├── config.py                        ← All constants: paths, thresholds, emission factors
│   ├── preprocessing.py                 ← Load, validate, normalize
│   ├── simulate_sensors.py              ← Physics-based simulation for T002–T060
│   ├── feature_engineering.py           ← Phase aggregation, FFT, derived features
│   ├── multi_target_model.py            ← XGBoost + RF + MLP + stacking
│   ├── anomaly_detector.py              ← Isolation Forest + LSTM Autoencoder
│   ├── shap_explainer.py                ← SHAP value computation + plots
│   ├── carbon_calculator.py             ← CO₂e computation + adaptive targets
│   ├── run_pipeline.py                  ← Master script: runs all training steps
│   └── utils.py                         ← Metrics, plot helpers, serialization
│
├── models/
│   ├── xgb_multitarget.pkl
│   ├── rf_multitarget.pkl
│   ├── mlp_model.keras                  ← Native Keras format (.keras archive)
│   ├── stacking_meta.pkl
│   ├── isolation_forest.pkl
│   ├── lstm_autoencoder.keras           ← Native Keras format (.keras archive)
│   ├── scaler.pkl
│   ├── shap_values.pkl
│   ├── lstm_threshold.json
│   ├── lstm_norm_params.json
│   ├── evaluation_results.json          ← Per-target R², MAE, RMSE, MAPE
│   └── pipeline_summary.json
│
├── api/
│   ├── main.py                          ← FastAPI app + all route handlers
│   └── schemas.py                       ← Pydantic request/response models
│
├── dashboard/                           ← Next.js web dashboard
│   ├── package.json
│   ├── next.config.ts
│   └── src/
│       ├── app/
│       │   ├── layout.tsx
│       │   ├── page.tsx
│       │   └── ClientLayout.tsx         ← Tab-based navigation
│       └── components/
│           ├── MetricCard.tsx
│           ├── Slider.tsx
│           └── tabs/
│               ├── PredictionsTab.tsx
│               ├── EnergyTab.tsx
│               ├── ComparisonTab.tsx
│               ├── CarbonTab.tsx
│               ├── WhatIfTab.tsx
│               └── BenchmarkTab.tsx     ← NEW: model benchmark report
│
├── tests/
│   ├── test_preprocessing.py
│   ├── test_models.py
│   └── test_api.py
│
└── requirements.txt

6. API Design

Endpoints Overview

Base URL: http://localhost:8000

POST   /api/predict                 → Quality + energy prediction for given parameters
POST   /api/anomaly                 → Energy pattern anomaly check for a batch
GET    /api/explain/{batch_id}      → SHAP explanation for a specific prediction
GET    /api/carbon/{batch_id}       → Carbon footprint + adaptive targets
GET    /api/batches                 → List all available batch IDs
GET    /api/carbon_history          → Full carbon history for all batches
GET    /api/model_metrics           → Full benchmark (R², MAE, RMSE, MAPE, anomaly metrics)
GET    /api/health                  → Service health check (all model load statuses)

Request / Response Schemas

POST /api/predict

Request:
{
  "granulation_time":  16.0,
  "binder_amount":     9.0,
  "drying_temp":       60,
  "drying_time":       29,
  "compression_force": 12.0,
  "machine_speed":     170,
  "lubricant_conc":    1.2,
  "moisture_content":  2.0
}

Response:
{
  "hardness":                89.4,
  "friability":              0.81,
  "dissolution_rate":        90.7,
  "content_uniformity":      98.2,
  "disintegration_time":     8.3,
  "tablet_weight":           202.1,
  "energy_kwh":              72.4,
  "carbon_kg_co2e":          51.8,
  "composite_quality_score": 82.3
}
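The request body maps onto a fixed set of eight numeric fields. `api/schemas.py` is described as holding Pydantic models; the sketch below is a dependency-free stand-in that uses a dataclass, not the project's actual schema. The class and method names are hypothetical, and no value-range checks are included because the valid ranges are not stated here.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class PredictRequest:
    granulation_time: float
    binder_amount: float
    drying_temp: float
    drying_time: float
    compression_force: float
    machine_speed: float
    lubricant_conc: float
    moisture_content: float

    @classmethod
    def from_json(cls, payload: dict) -> "PredictRequest":
        """Reject missing or unexpected keys, then coerce every value to float."""
        expected = {f.name for f in fields(cls)}
        received = set(payload)
        missing, extra = expected - received, received - expected
        if missing or extra:
            raise ValueError(f"missing={sorted(missing)} unexpected={sorted(extra)}")
        return cls(**{name: float(payload[name]) for name in expected})

    def as_feature_row(self) -> list:
        """Values in declaration order, ready to pass to the scaler."""
        return [getattr(self, f.name) for f in fields(self)]

request = PredictRequest.from_json({
    "granulation_time": 16.0, "binder_amount": 9.0,
    "drying_temp": 60, "drying_time": 29,
    "compression_force": 12.0, "machine_speed": 170,
    "lubricant_conc": 1.2, "moisture_content": 2.0,
})
print(request.as_feature_row())
```

Fixing the field order in one place avoids a silent mismatch between the order of slider values sent by the dashboard and the column order the models were trained on.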

POST /api/anomaly

Request:  { "batch_id": "T045" }

Response:
{
  "batch_id":    "T045",
  "is_anomaly":  true,
  "severity":    "HIGH",
  "root_causes": [
    {
      "phase":           "Milling",
      "signal":          "Vibration spike (11.2 mm/s vs normal 6.5)",
      "interpretation":  "Bearing wear suspected in milling unit",
      "action":          "Schedule inspection before next batch run"
    }
  ],
  "isolation_forest_score": -0.18,
  "lstm_reconstruction_error": 0.042
}

GET /api/explain/{batch_id}?target=Dissolution_Rate

Response:
{
  "target":     "Dissolution_Rate",
  "base_value": 90.93,
  "prediction": 87.2,
  "feature_contributions": {
    "Compression_Force": -2.8,
    "Moisture_Content":  -1.5,
    "Machine_Speed":      0.6,
    "Binder_Amount":     -0.3,
    "Drying_Temp":        1.2,
    "Granulation_Time":  -0.9,
    "Drying_Time":       -0.1,
    "Lubricant_Conc":     0.07
  },
  "top_driver": "Compression_Force reduced Dissolution_Rate by 2.8% below average"
}

GET /api/carbon/{batch_id}

Response:
{
  "batch_id":              "T045",
  "energy_kwh":            74.2,
  "carbon_kg_co2e":        53.1,
  "grid":                  "India (0.716 kg CO2e/kWh)",
  "dynamic_target":        48.5,
  "on_target":             false,
  "trend":                 "📈 Worsening",
  "gap_to_target_kg":      4.6
}

7. Dashboard Design

Tab Layout (Next.js Dashboard — http://localhost:3000)

Next.js App (port 3000)
│
├── Tab 1 — 🔮 Predictions
│   ├── 8 input sliders for process parameters
│   ├── Metric cards (Hardness, Friability, Dissolution, Energy, Carbon)
│   └── Composite Quality Score display
│
├── Tab 2 — ⚡ Energy Monitor
│   ├── Batch selector dropdown (T001–T060)
│   ├── Power + Vibration Recharts line chart colored by Phase
│   └── Anomaly score + root cause alerts (🔴 HIGH / 🟡 MEDIUM / 🟢 OK)
│
├── Tab 3 — 📊 Batch Comparison
│   ├── Two batch selectors
│   ├── Normalized radar charts for each batch
│   └── Delta table showing improved/worsened targets
│
├── Tab 4 — 🌍 Carbon Footprint
│   ├── Carbon trend chart (all 60 batches)
│   ├── Adaptive target line
│   └── Grid selector (India / EU / US / Renewable)
│
├── Tab 5 — 🎛️ What-If Optimizer
│   ├── All 8 parameter sliders (real-time update < 100ms)
│   └── Live predictions update as sliders move
│
└── Tab 6 — 📈 Benchmark (NEW)
    ├── Full model performance table (R², MAE, RMSE, MAPE)
    ├── Per-target breakdown for XGBoost / RF / MLP / Stacking
    ├── Anomaly detector metrics (Precision, Recall, F1, AUC-ROC)
    └── Dataset metadata (batch count, feature count, CV folds)

8. Data Flow Between Files

_h_batch_process_data.xlsx (T001 only)
         │
         ├──→ Phase energy profile extraction
         │    (8 phase segments → aggregated stats)
         │
         ├──→ Physics-based scaling with _h_batch_production_data params
         │    → Simulated sensor profiles for T002–T060
         │
         └──→ LSTM Autoencoder training
              (learns "normal" power + vibration patterns)

_h_batch_production_data.xlsx (T001–T060)
         │
         ├──→ Input features → XGBoost / RF / MLP training
         │
         ├──→ Target variables → model output supervision
         │
         ├──→ Scaling params for sensor simulation
         │    (Machine_Speed, Drying_Temp, Compression_Force per batch)
         │
         └──→ Energy_kWh derivation
              → Carbon_kgCO2e computation
              → Adaptive target history

         BOTH join on Batch_ID (T001)
              → Correlation between sensor patterns and quality outcomes
              → Used to validate simulation realism

Architecture version: 2.0 | AI-Driven Manufacturing Intelligence Hackathon