
BAFO Framework: Dynamic System Monitoring with Regime Detection and Risk Projection

⚑ TL;DR

A framework to model real-world dynamic systems under regime changes and quantify future operational risk using probabilistic simulation.

🔍 Integrated View: Signal, Regimes and Degradation


Figure: Integrated view of signal, regimes and degradation

This visualization summarizes the core idea of the framework:

  • Raw process signals (temperature and energy)
  • Operational regimes identified by HMM
  • Degradation evolution estimated by BAFO

It highlights how:

  • regime changes impact degradation
  • short critical periods lead to accumulation
  • system health is not fully recovered after deviations

📊 Key Results

  • Critical state occurrence: ~7% of total time
  • Disproportionate degradation impact during short events
  • Estimated time to risk: ~24 cycles

🧠 Motivation

Many models fail on real-world systems because they assume the system behaves consistently over time.

In practice, this is not true.

Operational regimes change, behavior shifts, and uncertainty is inherent.

Instead of trying to build a single global model, this framework focuses on:

Detecting regime changes and evaluating system health within each regime.


⚙️ Framework Overview

The pipeline follows:

data → features → regimes (HMM) → degradation (BAFO) → risk projection (Monte Carlo)

  • HMM identifies hidden operational regimes
  • BAFO estimates degradation dynamically
  • Monte Carlo simulates future trajectories and risk exposure

🏭 Applications

This framework is designed for real-world industrial systems, including:

  • Manufacturing processes (thermal, chemical, mechanical)
  • Energy systems and utilities
  • Continuous production lines
  • Equipment health monitoring

It is especially useful in systems where:

  • behavior is non-stationary
  • regimes shift over time
  • degradation is cumulative and not immediately visible

⚑ What makes this different

Unlike traditional approaches, this framework:

  • does not assume stationarity
  • does not rely on fixed thresholds
  • explicitly models regime-dependent behavior
  • combines state detection + degradation + future risk

This transforms monitoring from:

anomaly detection → dynamic system understanding


🧠 BAFO β€” Bayesian Adaptive Fault Observer

The BAFO (Bayesian Adaptive Fault Observer) is the core component responsible for estimating the dynamic degradation of the system over time.

Unlike static anomaly detection methods, BAFO:

  • adapts its baseline online
  • incorporates system dynamics (signal + slope)
  • models uncertainty through Bayesian inference
  • accumulates degradation with memory

⚙️ How it works

At each time step:

  1. The signal is smoothed and normalized relative to a dynamic baseline
  2. A slope term is incorporated to capture transient behavior
  3. A latent variable combines level and dynamics
  4. A Bayesian update estimates fault probability (posterior)
  5. A degradation state (deg_est) is updated with memory

🔍 Key mechanisms

  • Adaptive baseline → updated only in healthy conditions
  • Bayesian inference → balances normal vs fault likelihood
  • Forgetting factor → adapts to slow system drift
  • Degradation memory → accumulates past deviations

📈 Output

  • posterior → probability of fault
  • deg_est → accumulated degradation
  • detected_index → first critical detection
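
The steps, mechanisms and outputs described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in src/bafo.py: the function name `bafo_sketch`, every parameter and default value, the warm-up initialization, and the choice of unit-variance Gaussian likelihoods for the normal and fault hypotheses are all hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bafo_sketch(signal, alpha=0.3, forget=0.02, slope_weight=0.5,
                prior_fault=0.05, fault_shift=4.0, memory=0.98,
                healthy_below=0.5, critical_level=5.0, warmup=10):
    """Illustrative BAFO-style observer; all names and defaults are hypothetical."""
    # Initial baseline from the first samples, which are assumed healthy.
    head = signal[:warmup]
    base_mean = sum(head) / len(head)
    base_var = max(sum((v - base_mean) ** 2 for v in head) / len(head), 1e-9)

    smooth, prev_z = signal[0], None
    belief, deg = prior_fault, 0.0
    posterior, deg_est, detected_index = [], [], None

    for i, x in enumerate(signal):
        # 1. Smooth, then normalize against the dynamic baseline.
        smooth = alpha * x + (1.0 - alpha) * smooth
        z = (smooth - base_mean) / math.sqrt(base_var)
        # 2. Slope term (first difference) to capture transient behavior.
        slope = 0.0 if prev_z is None else z - prev_z
        prev_z = z
        # 3. Latent variable combining level and dynamics.
        latent = abs(z) + slope_weight * abs(slope)
        # 4. Bayesian update in log-odds form. Assumed likelihoods: unit-variance
        #    Gaussians centred at 0 (normal) and at fault_shift (fault).
        prior = min(max(belief, 0.01), 0.99)
        log_lr = fault_shift * (latent - 0.5 * fault_shift)
        belief = sigmoid(math.log(prior / (1.0 - prior)) + log_lr)
        # 5. Degradation state with memory (leaky accumulation).
        deg = memory * deg + belief
        # Adaptive baseline: updated only while the system looks healthy;
        # `forget` sets how quickly it follows slow drift.
        if belief < healthy_below:
            base_mean += forget * (smooth - base_mean)
            base_var += forget * ((smooth - base_mean) ** 2 - base_var)
            base_var = max(base_var, 1e-9)
        if detected_index is None and deg >= critical_level:
            detected_index = i
        posterior.append(belief)
        deg_est.append(deg)

    return {"posterior": posterior, "deg_est": deg_est,
            "detected_index": detected_index}
```

In this sketch `detected_index` stays `None` when the accumulated degradation never reaches `critical_level`. Freezing the baseline while the posterior is high keeps a developing fault from being absorbed into what counts as normal.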

🔎 Regime Detection (HMM)

The HMM segments the process into distinct operational states, each with different statistical behavior.

Figure: HMM states
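
To illustrate how an HMM assigns each sample to a regime, here is a small Viterbi decoder for a Gaussian-emission HMM in plain Python. It is a sketch only: the function names, the two-regime setup, and all parameter values are made up for the example, and the parameters are passed in by hand rather than fitted. In practice they would be estimated from data (for instance with Baum-Welch, as provided by libraries such as hmmlearn); this README does not state which implementation the notebook uses.

```python
import math


def log_gauss(x, mean, std):
    """Log-density of a univariate Gaussian."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def viterbi_gaussian(obs, means, stds, trans, start):
    """Most likely regime sequence; all probabilities must be strictly positive."""
    n = len(means)
    # score[s] = log-probability of the best path that ends in state s.
    score = [math.log(start[s]) + log_gauss(obs[0], means[s], stds[s])
             for s in range(n)]
    back = []
    for x in obs[1:]:
        new_score, pointers = [], []
        for s in range(n):
            best_prev = max(range(n),
                            key=lambda p: score[p] + math.log(trans[p][s]))
            pointers.append(best_prev)
            new_score.append(score[best_prev] + math.log(trans[best_prev][s])
                             + log_gauss(x, means[s], stds[s]))
        score = new_score
        back.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical two-regime example: regime 0 around 0.0, regime 1 around 5.0.
regimes = viterbi_gaussian(
    obs=[0.1, -0.2, 0.0, 5.1, 4.9, 5.2, 0.1],
    means=[0.0, 5.0], stds=[1.0, 1.0],
    trans=[[0.9, 0.1], [0.1, 0.9]], start=[0.5, 0.5],
)
```

The "sticky" transition matrix (large diagonal entries) is what lets the decoder prefer sustained regimes over sample-by-sample switching.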


🚨 Risk Projection (Monte Carlo)

Figure: Monte Carlo simulation

Future degradation is simulated under uncertainty, allowing estimation of:

  • probability of reaching critical states
  • expected time to risk
  • cumulative exposure
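
The three quantities above can be estimated with a simple simulation loop. The sketch below is an assumption-laden illustration rather than the contents of src/simulation.py: the function name `simulate_risk`, the Markov regime switching, the per-regime degradation gain, the noise model, the threshold, and the definition of exposure as "cycles spent at or above the threshold" are all hypothetical choices.

```python
import random


def simulate_risk(deg0, trans, gain, memory=0.98, threshold=5.0,
                  horizon=50, n_paths=2000, noise=0.05, state0=0, seed=0):
    """Monte Carlo projection of degradation (illustrative parameters only)."""
    rng = random.Random(seed)
    states = range(len(trans))
    hit_times, exposure_total = [], 0

    for _ in range(n_paths):
        deg, state, first_hit, exposure = deg0, state0, None, 0
        for t in range(1, horizon + 1):
            # Sample the next regime from the current row of the transition matrix.
            state = rng.choices(states, weights=trans[state])[0]
            # Regime-dependent degradation increment with additive noise.
            increment = max(0.0, gain[state] + rng.gauss(0.0, noise))
            deg = memory * deg + increment
            if deg >= threshold:
                exposure += 1
                if first_hit is None:
                    first_hit = t
        if first_hit is not None:
            hit_times.append(first_hit)
        exposure_total += exposure

    return {
        # Fraction of simulated paths that reach the critical level.
        "p_critical": len(hit_times) / n_paths,
        # Mean first-passage time, computed over the paths that did reach it.
        "mean_time_to_risk": sum(hit_times) / len(hit_times) if hit_times else None,
        # Mean number of cycles spent at or above the threshold.
        "mean_exposure": exposure_total / n_paths,
    }


# Hypothetical two-regime setup: regime 1 is rare but degrades much faster.
risk = simulate_risk(deg0=1.0, trans=[[0.95, 0.05], [0.40, 0.60]],
                     gain=[0.02, 0.80])
```

Note that `mean_time_to_risk` is conditional on the threshold being reached within the horizon, so it should be read together with `p_critical`.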

📈 Key Insights

  • The system operates mostly in stable conditions
  • Short critical periods drive disproportionate degradation
  • The process exhibits degradation memory
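
One way to make "disproportionate" concrete is to compare each regime's share of time with its share of accumulated degradation. The helper below is a hypothetical sketch (the name `regime_impact` and the toy inputs are invented for illustration and are not results from the dataset); it assumes per-sample regime labels and a degradation series of the same length.

```python
from collections import defaultdict


def regime_impact(states, deg_est):
    """Per regime: share of time versus share of degradation increase."""
    time_count = defaultdict(int)
    deg_gain = defaultdict(float)
    for i, state in enumerate(states):
        time_count[state] += 1
        if i > 0:
            # Only increases count as degradation attributed to this regime.
            deg_gain[state] += max(0.0, deg_est[i] - deg_est[i - 1])
    total_gain = sum(deg_gain.values())
    return {
        s: {
            "time_share": time_count[s] / len(states),
            "deg_share": deg_gain[s] / total_gain if total_gain > 0 else 0.0,
        }
        for s in time_count
    }


# Toy input: one critical sample out of ten carries most of the increase.
toy_states = ["stable"] * 9 + ["critical"]
toy_deg = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 4.0]
impact = regime_impact(toy_states, toy_deg)
```

A regime whose `deg_share` is well above its `time_share` is one where short periods contribute more than their duration suggests.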

🚀 Outcome

This approach enables, in practical terms:

  • Early detection of operational risk
  • Transition from reactive to predictive monitoring
  • Decision-making based on probabilistic future scenarios
  • Reduction of unplanned downtime
  • Early identification of hidden degradation
  • Risk-aware operational decisions

💡 Takeaway

It is not about predicting the system.
It is about tracking its dynamics and managing risk over time.


▶️ How to Run

1. Clone the repository

git clone https://github.com/VictorAMachado/bafo-framework.git
cd bafo-framework

2. Create environment

python -m venv venv
source venv/bin/activate  # Linux / Mac

# or

venv\Scripts\activate     # Windows

3. Install dependencies

pip install -r requirements.txt

4. Run the notebook

jupyter notebook

Open

notebooks/bafo_framework_analysis.ipynb

5. Execute

Run all cells to reproduce:

  • Data ingestion and preprocessing
  • Feature engineering
  • HMM regime detection
  • BAFO degradation estimation
  • Monte Carlo risk simulation

📌 Notes

  • The dataset is already anonymized (data/data_anon.csv)
  • No external data is required
  • All results and figures can be reproduced directly from the notebook

📁 Repository Structure

bafo-framework/
│
├── README.md
├── requirements.txt
├── LICENSE
│
├── data/
│   └── data_anon.csv
│
├── notebooks/
│   └── bafo_framework_analysis.ipynb
│
├── outputs/
│   ├── hmm_states.png
│   ├── integrated_process.png
│   └── monte_carlo.png
│
└── src/
    ├── bafo.py
    ├── simulation.py
    └── metrics.py

⚠️ Data Note

All data has been anonymized and transformed to preserve structural behavior while removing any sensitive or proprietary information.


⚠️ Operational Context

The estimated time to risk (~24 cycles) should be interpreted within the context of the process.

If each cycle lasts approximately t seconds, ~24 cycles corresponds to roughly 24·t seconds, a relatively short window for the system to reach critical degradation levels.

This indicates that:

  • risk escalation can occur rapidly
  • short sequences of non-ideal operation may have significant impact
  • early detection is essential for maintaining process stability

Note that this estimate is based solely on operational data.

Further refinement of the model can be achieved by incorporating:

  • historical failure events
  • maintenance records
  • real downtime occurrences

This would allow calibration of degradation thresholds and improve alignment between modeled risk and actual system failures.


👤 Author

Victor Augusto Machado Dutra
Engineer - Control Systems, Industrial Data & Decision Intelligence