YL-Raj/Sustainable-Manufacturing-DEA


Sustainable Manufacturing — Data Envelopment Analysis (DEA)


Benchmarking the operational and environmental efficiency of four global FMCG manufacturers (Nestlé, Henkel, Procter & Gamble, and Unilever) using Data Envelopment Analysis, a non-parametric linear-programming technique for measuring relative efficiency.

This project turns a research study into a fully reproducible data-science pipeline: clean datasets, a from-scratch DEA solver, a tested analysis script, publication-quality visualizations, and a narrative Jupyter notebook.


Why this project

Manufacturers face a hard question: are we converting resources (water, energy, labour) into sustainability outcomes (lower emissions, recyclable packaging, worker safety) as efficiently as our peers? Traditional ratios compare one input to one output at a time. DEA handles many inputs and outputs at once, needs no pre-assigned prices or weights, and identifies an efficient frontier of best performers against which every other unit is measured. It is widely used in operations research, for example in bank-branch performance, hospital productivity, supply-chain benchmarking, and ESG analytics.


Headline result

[Figure: Technical efficiency trends]

| Company  | Technical Efficiency (2018–2022) | Read |
|----------|----------------------------------|------|
| Nestlé   | ~1.00                            | On the efficient frontier almost every year; lean, stable operations |
| Henkel   | ~0.98                            | Consistently near-frontier, marginal slack in 2020–21 |
| P&G      | high Φ-slack                     | Largest measured improvement headroom, gradually improving |
| Unilever | rising                           | Started furthest from the frontier but shows the clearest upward trend |

[Figure: Technical efficiency heatmap]

Full numbers: data/processed/reported_efficiency_summary.csv and the model-recomputed scores in results/computed_efficiency.csv.


The method in one minute

For each Decision-Making Unit (here a company-year), DEA solves one linear program. The output-oriented, variable-returns-to-scale (BCC) model asks:

Holding inputs fixed, by what factor Φ ≥ 1 could this unit expand its outputs to reach the frontier?

Efficiency is reported as TE = 1 / Φ, where 1.0 means the unit is already efficient. The solver (src/dea.py) also supports the input-oriented and constant-returns (CCR) formulations.
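
For reference, the question above corresponds to the standard textbook envelopment form of the output-oriented BCC model. This is a sketch in conventional notation, not a transcription of src/dea.py: the symbols x_{ij} (input i of unit j), y_{rj} (output r of unit j), λ_j (intensity weights), and the index o for the unit under evaluation are assumptions introduced here.

```latex
\begin{aligned}
\max_{\Phi,\,\lambda}\quad & \Phi \\
\text{s.t.}\quad
  & \sum_{j=1}^{n} \lambda_j x_{ij} \le x_{io},        && i = 1,\dots,m \\
  & \sum_{j=1}^{n} \lambda_j y_{rj} \ge \Phi\, y_{ro}, && r = 1,\dots,s \\
  & \sum_{j=1}^{n} \lambda_j = 1,                      && \text{(variable returns to scale)} \\
  & \lambda_j \ge 0,                                   && j = 1,\dots,n
\end{aligned}
\qquad\Longrightarrow\qquad
\mathrm{TE}_o = \frac{1}{\Phi^{*}}
```

Dropping the convexity constraint \sum_j \lambda_j = 1 gives the constant-returns (CCR) variant. Since λ_o = 1, Φ = 1 is always feasible, the optimum satisfies Φ* ≥ 1 and therefore TE ≤ 1.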

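# Assumes src/ is on the import path (for example, run from inside src/).
from dea import efficiency_scores

# One row per DMU (here, one year); one column per input or output.
scores = efficiency_scores(
    inputs=[[8359, 53000], [8324, 52450]],     # water, labour
    outputs=[[682, 11938, 125], [665, 11618, 113]],  # emissions, waste, safety
    names=["2018", "2019"],
    orientation="output", rts="VRS",
)
print(scores)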
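
To show what a solver of this kind does internally, here is a minimal sketch of the output-oriented VRS model built on `scipy.optimize.linprog` with the HiGHS backend. It is not the code in src/dea.py: the function name `bcc_output_efficiency` and the three-unit toy data are hypothetical and chosen only so the result can be checked by hand.

```python
import numpy as np
from scipy.optimize import linprog


def bcc_output_efficiency(inputs, outputs):
    """Output-oriented BCC (VRS) technical efficiency, TE = 1 / phi, per DMU.

    Hypothetical helper for illustration; rows are DMUs, columns are
    inputs (or outputs). Outputs must be strictly positive.
    """
    X = np.asarray(inputs, dtype=float)    # shape (n_dmu, n_inputs)
    Y = np.asarray(outputs, dtype=float)   # shape (n_dmu, n_outputs)
    n, m = X.shape
    s = Y.shape[1]
    scores = []
    for o in range(n):
        # Decision vector is [phi, lambda_1, ..., lambda_n].
        # linprog minimises, so maximise phi by minimising -phi.
        c = np.zeros(n + 1)
        c[0] = -1.0
        # Input rows:  sum_j lambda_j * x_ij <= x_io
        A_in = np.hstack([np.zeros((m, 1)), X.T])
        # Output rows: phi * y_ro - sum_j lambda_j * y_rj <= 0
        A_out = np.hstack([Y[o].reshape(-1, 1), -Y.T])
        A_ub = np.vstack([A_in, A_out])
        b_ub = np.concatenate([X[o], np.zeros(s)])
        # VRS convexity constraint: sum_j lambda_j = 1
        A_eq = np.hstack([[0.0], np.ones(n)]).reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * (n + 1), method="highs")
        if not res.success:
            raise RuntimeError(f"LP failed for DMU {o}: {res.message}")
        scores.append(1.0 / res.x[0])
    return np.array(scores)


# Toy data (made up): one input, one output, three units.
# Unit C uses as much input as unit B but produces half the output.
toy_scores = bcc_output_efficiency(
    inputs=[[1.0], [2.0], [2.0]],
    outputs=[[1.0], [3.0], [1.5]],
)
print(toy_scores)
```

Units A and B lie on the frontier (TE = 1), while C could double its output at the same input level, so Φ = 2 and TE = 0.5.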

Repository structure

Sustainable-Manufacturing-DEA/
├── data/
│   ├── raw/                      # cleaned per-company input/output CSVs
│   │   └── nestle.csv  henkel.csv  pg.csv  unilever.csv
│   └── processed/
│       └── reported_efficiency_summary.csv
├── src/
│   ├── dea.py                    # DEA solver (CCR/BCC, input/output oriented)
│   ├── visualize.py              # matplotlib plotting helpers
│   └── run_analysis.py           # end-to-end pipeline -> results/
├── notebooks/
│   └── 01_DEA_walkthrough.ipynb  # narrated analysis with inline charts
├── results/
│   ├── computed_efficiency.csv
│   └── figures/                  # generated PNGs
├── docs/
│   ├── research_paper.docx / .pdf
│   └── source_files/             # original Excel workbooks
├── requirements.txt
├── LICENSE
└── README.md

Quickstart

# 1. clone and enter
git clone <your-repo-url>
cd Sustainable-Manufacturing-DEA

# 2. install dependencies
pip install -r requirements.txt

# 3. reproduce all results and figures (subshell, so you stay in the repo root)
(cd src && python run_analysis.py)

# 4. (optional) open the guided walkthrough
jupyter notebook notebooks/01_DEA_walkthrough.ipynb

run_analysis.py regenerates everything in results/ from the raw CSVs, so the analysis is fully reproducible from scratch.


Data

Sustainability metrics were compiled from the four companies' public ESG / annual sustainability reports (2018–2022; Unilever 2010–2022). Each company is modelled with its own inputs and outputs:

| Company  | Example inputs | Example outputs |
|----------|----------------|-----------------|
| Nestlé   | water usage, energy, % covered by collective bargaining | renewable electricity, recyclable packaging, safety |
| Henkel   | water consumption, labour | recycled waste, packaging recyclability, occupational safety |
| P&G      | energy consumption, recyclable packaging | GHG scopes, fresh-water metrics |
| Unilever | water, value of contributions | emissions, waste disposed, safety at work |

A note on rigour (DEA discrimination)

DEA loses discriminating power when the number of DMUs is small relative to the number of inputs + outputs — a common rule of thumb is n_DMU ≥ 3 × (inputs + outputs). With only 4–5 years per company, a textbook VRS model labels nearly every unit "efficient" (see results/computed_efficiency.csv). The original study addresses this with a two-phase subjective/objective weighting and aggregated outputs, which sharpens the comparison between firms. Both views are included here on purpose — surfacing the limitation is part of doing the analysis honestly.
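
The rule of thumb is easy to check before running a model. The helper below is a hypothetical sketch (the name `meets_dea_rule_of_thumb` is not part of this repository), applied to a case of five yearly observations with two inputs and three outputs:

```python
def meets_dea_rule_of_thumb(n_dmu, n_inputs, n_outputs, factor=3):
    """Return (ok, required) for the rule n_DMU >= factor * (inputs + outputs)."""
    required = factor * (n_inputs + n_outputs)
    return n_dmu >= required, required


# Five company-years, two inputs, three outputs.
ok, required = meets_dea_rule_of_thumb(n_dmu=5, n_inputs=2, n_outputs=3)
print(ok, required)  # False 15
```

With 5 units against a suggested minimum of 15, the model has too few observations to separate the units, which is consistent with nearly every unit being labelled "efficient".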

Possible extensions: pooled cross-company DEA on common normalised metrics, Malmquist productivity indices for year-over-year change, super-efficiency models to rank frontier units, and bootstrapped confidence intervals (Simar & Wilson).


Tech stack

Python · NumPy · pandas · SciPy (HiGHS LP solver) · Matplotlib · Jupyter

Author

Likhith Raj Yesala — research, modelling, and implementation.

License

Released under the MIT License.

About

the sustainability efficiency of Nestlé, Henkel, P&G & Unilever across 50+ ESG KPIs — reproducible pipeline, tests, and an honest note on the method's limits.
