Hackathon Project · Pharmaceutical Tablet Manufacturing · Track A: Predictive Modelling
| Tool | Minimum Version | Check |
|---|---|---|
| Python | 3.10+ | python --version |
| pip | 23+ | pip --version |
| Git | Any | git --version |
```shell
# If you cloned via git:
cd "d:\CODE FILES\Projects\manufacturing-intelligence"
# Otherwise just open a terminal in the project root.

# Create venv
python -m venv venv

# Activate (Windows PowerShell)
venv\Scripts\Activate.ps1

# Activate (Windows CMD)
venv\Scripts\activate.bat

# Activate (Git Bash / WSL / macOS / Linux)
source venv/bin/activate
```

You should see a `(venv)` prefix in your terminal after activation.
```shell
pip install --upgrade pip
pip install -r requirements.txt
```

Note: TensorFlow and XGBoost are large packages — this may take 3–5 minutes on first install.
Put the two Excel files into `data/raw/`:

```
data/
└── raw/
    ├── _h_batch_process_data.xlsx      ← T001 minute-by-minute sensor log
    └── _h_batch_production_data.xlsx   ← T001–T060 batch production records
```
The `data/raw/` folder is created automatically when `config.py` is imported. Just drop the files in.
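For reference, a minimal sketch of the kind of import-time folder creation `config.py` presumably performs (the exact directory constants and variable names here are assumptions, not the project's actual code):

```python
from pathlib import Path

# Project-relative data directories (names assumed from the layout above).
PROJECT_ROOT = Path(__file__).resolve().parent
RAW_DIR = PROJECT_ROOT / "data" / "raw"
PROCESSED_DIR = PROJECT_ROOT / "data" / "processed"
SIMULATED_DIR = PROJECT_ROOT / "data" / "simulated"

# Runs at import time; exist_ok makes repeated imports harmless.
for _dir in (RAW_DIR, PROCESSED_DIR, SIMULATED_DIR):
    _dir.mkdir(parents=True, exist_ok=True)
```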
This single command runs all 7 steps end-to-end:
```shell
# Quick run (no hyperparameter tuning — ~2 min):
python src/run_pipeline.py

# With Optuna tuning for best XGBoost params (~7 min):
python src/run_pipeline.py --tune
```

| Step | Description | Output |
|---|---|---|
| 1 | Load & validate raw data | data/processed/batch_outcomes.csv |
| 2 | Simulate sensors for T002–T060 | data/simulated/simulated_sensors.csv |
| 3 | Extract phase features & merge | data/processed/merged_dataset.csv |
| 4 | Train XGBoost + RF + MLP + Stacking | models/*.pkl / models/*.keras |
| 5 | Train Isolation Forest + LSTM AE | models/isolation_forest.pkl + models/lstm_autoencoder.keras |
| 6 | Compute SHAP values | models/shap_values.pkl + reports/shap_plots/ |
| 7 | Build carbon footprint history | data/processed/carbon_history.csv |
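A minimal sketch of how a `--tune` switch like the one above is typically wired up with `argparse` (only the flag name comes from the command shown; the parser structure is illustrative):

```python
import argparse

def parse_args(argv=None):
    """Parse pipeline CLI options; --tune enables hyperparameter search."""
    parser = argparse.ArgumentParser(description="Run the training pipeline")
    parser.add_argument(
        "--tune",
        action="store_true",
        help="run an Optuna study for XGBoost hyperparameters (slower)",
    )
    return parser.parse_args(argv)

args = parse_args(["--tune"])
print(args.tune)  # → True
```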
Expected console output at the end:
```
════════════════════════════════════════════════════════════════
🏁 PIPELINE COMPLETE in ~X s
📊 MODEL PERFORMANCE SUMMARY:
   Model             | Overall R²
   ─────────────────────────────────
   ✅ XGBoost           | 0.9200
   ✅ Random Forest     | 0.8900
   ✅ Stacking Ensemble | 0.9350
════════════════════════════════════════════════════════════════
```
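Step 4's stacking ensemble combines base learners through a meta-model trained on their out-of-fold predictions. A toy single-target scikit-learn sketch of the idea (the real pipeline uses XGBoost and a Keras MLP as base models across 7 targets; here plain sklearn estimators stand in):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge

# Synthetic data: one linear and one quadratic effect plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=200)

# Base learners' cross-validated predictions feed a Ridge meta-model.
stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
        ("ridge", Ridge()),
    ],
    final_estimator=Ridge(),
)
stack.fit(X, y)
print(round(stack.score(X, y), 3))  # training R², typically high
```

The meta-model learns how much to trust each base learner, which is why the stacked R² in the summary above can exceed either base model alone.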
```shell
uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload
```

- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- Health check: http://localhost:8000/api/health
| Method | URL | Description |
|---|---|---|
| GET | `/api/health` | Model load status |
| POST | `/api/predict` | Predict quality + energy |
| POST | `/api/anomaly` | Detect energy anomalies |
| GET | `/api/explain/{batch_id}` | SHAP feature explanations |
| GET | `/api/carbon/{batch_id}` | CO₂e + adaptive target |
| GET | `/api/batches` | List all batch IDs |
| GET | `/api/carbon_history` | Full carbon history |
Open a new terminal (keep API running), then:
```shell
cd dashboard
npm install   # first run only
npm run dev
```

- Dashboard: http://localhost:3000
| Tab | Feature |
|---|---|
| 🔮 Predictions | Predict all 7 quality targets from sliders |
| ⚡ Energy Monitor | Phase-colored sensor charts + anomaly detection |
| 📊 Batch Comparison | Radar chart fingerprinting of two batches |
| 🌍 Carbon Tracker | CO₂e trend + adaptive targets + grid scenarios |
| 🎛️ What-If Optimizer | Real-time parameter explorer (<100ms) |
| 📈 Benchmark | Full model performance report (R², MAE, RMSE, MAPE) |
```shell
pytest tests/ -v
```

Note: the `test_api.py` tests can be run without the API server — they use FastAPI's `TestClient`. However, they require trained models to be present, so run the pipeline first.
```shell
jupyter notebook notebooks/
```

| Notebook | Description |
|---|---|
| `01_EDA.ipynb` | Sensor profiling, distributions, correlations |
| `02_feature_engineering.ipynb` | Simulation, phase features, LSTM sequences |
| `03_multitarget_models.ipynb` | Train all models, R² comparison charts |
| `04_anomaly_detection.ipynb` | IF + LSTM AE training, confusion matrices |
| `05_explainability.ipynb` | SHAP beeswarm, waterfall, summary |
```shell
jupyter notebook analysis/
```

| Notebook | Description |
|---|---|
| `01_data_profiling.ipynb` | Full stats, missing values, outliers, heatmap |
| `02_correlation_deep_dive.ipynb` | Pearson/Spearman/VIF multicollinearity |
| `03_phase_energy_analysis.ipynb` | Phase energy breakdown, CUSUM drift |
| `04_model_comparison.ipynb` | CV scores, residuals, timing benchmarks |
| `05_business_impact.ipynb` | ROI, carbon savings, grid scenarios |
```
models/
├── xgb_multitarget.pkl        ← XGBoost multi-output model
├── rf_multitarget.pkl         ← Random Forest model
├── mlp_model.keras            ← Keras MLP model
├── stacking_meta.pkl          ← Stacking ensemble bundle
├── isolation_forest.pkl       ← Isolation Forest (anomaly)
├── lstm_autoencoder.keras     ← LSTM Autoencoder (anomaly)
├── scaler.pkl                 ← MinMaxScaler (feature scaling)
├── shap_values.pkl            ← Pre-computed SHAP values
├── lstm_threshold.json        ← LSTM anomaly threshold
├── lstm_norm_params.json      ← LSTM normalization params
├── pipeline_summary.json      ← Full run summary
└── evaluation_results.json    ← Per-target R², MAE, RMSE, MAPE
```
```
reports/
├── shap_plots/                ← Beeswarm + waterfall PNGs
├── sensor_profile_T001.png
├── phase_energy_breakdown.png
├── correlation_heatmap.png
├── model_r2_comparison.png
├── actual_vs_predicted.png
├── business_impact.png
└── ...
```
| Problem | Fix |
|---|---|
| `ModuleNotFoundError` | Make sure venv is activated: `venv\Scripts\Activate.ps1` |
| `FileNotFoundError` on data | Place Excel files in `data/raw/` |
| API 503 error | Run the pipeline first: `python src/run_pipeline.py` |
| TensorFlow GPU warning | Ignore — CPU training works fine for this dataset size |
| `ExecutionPolicy` error on Activate | Run: `Set-ExecutionPolicy RemoteSigned -Scope CurrentUser` |
To leave the virtual environment when you're done:

```shell
deactivate
```