A framework to model real-world dynamic systems under regime changes and quantify future operational risk using probabilistic simulation.
Integrated view of signal, regimes and degradation
This visualization summarizes the core idea of the framework:
- Raw process signals (temperature and energy)
- Operational regimes identified by HMM
- Degradation evolution estimated by BAFO
It highlights how:
- regime changes impact degradation
- short critical periods lead to accumulation
- system health is not fully recovered after deviations
Key results:
- Critical state occurrence: ~7% of total time
- Disproportionate degradation impact during short events
- Estimated time to risk: ~24 cycles
Many models break down on real-world systems because they assume the system behaves consistently over time.
In practice, it does not: operational regimes change, behavior shifts, and uncertainty is inherent.
Instead of trying to build a single global model, this framework focuses on detecting regime changes and evaluating system health within each regime.
The pipeline follows:
data → features → regimes (HMM) → degradation (BAFO) → risk projection (Monte Carlo)
- HMM identifies hidden operational regimes
- BAFO estimates degradation dynamically
- Monte Carlo simulates future trajectories and risk exposure
This framework is designed for real-world industrial systems, including:
- Manufacturing processes (thermal, chemical, mechanical)
- Energy systems and utilities
- Continuous production lines
- Equipment health monitoring
It is especially useful in systems where:
- behavior is non-stationary
- regimes shift over time
- degradation is cumulative and not immediately visible
Unlike traditional approaches, this framework:
- does not assume stationarity
- does not rely on fixed thresholds
- explicitly models regime-dependent behavior
- combines state detection + degradation + future risk
This transforms monitoring from:
anomaly detection → dynamic system understanding
The BAFO (Bayesian Adaptive Fault Observer) is the core component responsible for estimating the dynamic degradation of the system over time.
Unlike static anomaly detection methods, BAFO:
- adapts its baseline online
- incorporates system dynamics (signal + slope)
- models uncertainty through Bayesian inference
- accumulates degradation with memory
At each time step:
- The signal is smoothed and normalized relative to a dynamic baseline
- A slope term is incorporated to capture transient behavior
- A latent variable combines level and dynamics
- A Bayesian update estimates fault probability (posterior)
- A degradation state (deg_est) is updated with memory
- Adaptive baseline → updated only in healthy conditions
- Bayesian inference → balances normal vs fault likelihood
- Forgetting factor → adapts to slow system drift
- Degradation memory → accumulates past deviations
Outputs:
- posterior → probability of fault
- deg_est → accumulated degradation
- detected_index → first critical detection
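The per-step update described above can be sketched as follows. This is a minimal illustration, not the code in src/bafo.py: the function name `bafo_sketch`, all parameter names and defaults (`alpha`, `forget`, `slope_weight`, `fault_shift`, `prior_fault`, `healthy_below`, `memory`, `critical_deg`), and the choice of unit-variance Gaussian likelihoods are assumptions made for the example.

```python
import math

def bafo_sketch(signal, alpha=0.3, forget=0.98, slope_weight=0.5,
                fault_shift=3.0, prior_fault=0.05, healthy_below=0.5,
                memory=0.95, critical_deg=3.0):
    """Illustrative adaptive fault observer. Every default is an assumed value."""
    smooth = float(signal[0])
    base_mean, base_var = smooth, 1.0          # dynamic baseline (mean, variance)
    prev_z, p_fault, deg = 0.0, prior_fault, 0.0
    posterior, deg_est, detected_index = [], [], None

    for t, x in enumerate(signal):
        # 1. Smooth the raw signal with an exponential moving average.
        smooth = alpha * x + (1.0 - alpha) * smooth
        # 2. Normalize relative to the dynamic baseline.
        z = (smooth - base_mean) / math.sqrt(base_var)
        # 3. Slope term captures transient behavior.
        slope = z - prev_z
        prev_z = z
        # 4. Latent variable combines level and dynamics.
        latent = abs(z) + slope_weight * abs(slope)
        # 5. Bayesian update. Assumed model: latent ~ N(0, 1) when normal and
        #    N(fault_shift, 1) when faulty, so the log-likelihood ratio is linear.
        prior = 0.5 * p_fault + 0.5 * prior_fault   # lets the belief recover
        log_lr = fault_shift * (latent - 0.5 * fault_shift)
        logit = math.log(prior / (1.0 - prior)) + log_lr
        logit = max(-30.0, min(30.0, logit))        # keep exp() well-behaved
        p_fault = 1.0 / (1.0 + math.exp(-logit))
        # 6. Degradation accumulates with memory (slow decay, no full reset).
        deg = memory * deg + p_fault
        # 7. Baseline adapts only in healthy conditions, with a forgetting factor.
        if p_fault < healthy_below:
            dev = smooth - base_mean
            base_mean += (1.0 - forget) * dev
            base_var = max(forget * base_var + (1.0 - forget) * dev * dev, 1e-6)
        # Assumed detection rule: first step where degradation crosses the threshold.
        if detected_index is None and deg >= critical_deg:
            detected_index = t
        posterior.append(p_fault)
        deg_est.append(deg)

    return posterior, deg_est, detected_index
```

Freezing the baseline while the fault probability is high is what keeps a sustained deviation from being absorbed as the new normal; the forgetting factor only tracks drift during healthy operation.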
The HMM segments the process into distinct operational states, each with different statistical behavior.
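As a rough illustration of regime segmentation, the sketch below decodes the most likely state sequence with the Viterbi algorithm for a one-dimensional Gaussian HMM. It assumes the regime means and standard deviations are already known and uses a single hypothetical `stay_prob` for regime persistence; in practice these parameters would be fitted from data, and the notebook may well use a library for that. The function names are illustrative only.

```python
import math

def log_gauss(x, mean, std):
    """Log-density of a normal distribution."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))

def viterbi_regimes(obs, means, stds, stay_prob=0.95):
    """Most likely regime path for a Gaussian HMM with assumed, fixed parameters.

    Requires at least two regimes. Transitions stay in the same regime with
    probability stay_prob and spread the remainder evenly over the others.
    """
    n = len(means)
    log_stay = math.log(stay_prob)
    log_move = math.log((1.0 - stay_prob) / (n - 1))

    def trans(j, k):
        return log_stay if j == k else log_move

    # Uniform initial distribution over regimes.
    score = [math.log(1.0 / n) + log_gauss(obs[0], means[k], stds[k])
             for k in range(n)]
    back = []
    for x in obs[1:]:
        new_score, pointers = [], []
        for k in range(n):
            # Best predecessor regime for landing in regime k at this step.
            best_j = max(range(n), key=lambda j: score[j] + trans(j, k))
            new_score.append(score[best_j] + trans(best_j, k)
                             + log_gauss(x, means[k], stds[k]))
            pointers.append(best_j)
        score = new_score
        back.append(pointers)

    # Backtrack from the best final regime.
    state = max(range(n), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

A high `stay_prob` makes the decoder ignore isolated outliers and switch regimes only when several consecutive observations support the change.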
Future degradation is simulated under uncertainty, allowing estimation of:
- probability of reaching critical states
- expected time to risk
- cumulative exposure
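The three quantities above can be estimated with a simple Monte Carlo loop, sketched below under strong simplifying assumptions: each future cycle is independently critical with probability `p_critical` (the default 0.07 mirrors the ~7% critical occupancy reported above), and degradation follows a memory recursion with fixed per-regime increments. The function name, the increments, the memory factor, and the threshold are all placeholders, not values from the repository.

```python
import random

def simulate_risk(deg_now, critical_deg, horizon, n_paths=2000,
                  p_critical=0.07, inc_stable=0.01, inc_critical=0.5,
                  memory=0.98, seed=0):
    """Monte Carlo projection of degradation. All defaults are assumed values."""
    rng = random.Random(seed)          # seeded for reproducible runs
    hits, hit_times, exposure = 0, [], 0

    for _ in range(n_paths):
        deg, first_hit = deg_now, None
        for step in range(1, horizon + 1):
            # Draw the regime for this cycle, then apply its degradation increment.
            inc = inc_critical if rng.random() < p_critical else inc_stable
            deg = memory * deg + inc
            if deg >= critical_deg:
                exposure += 1          # one more cycle spent above the threshold
                if first_hit is None:
                    first_hit = step
        if first_hit is not None:
            hits += 1
            hit_times.append(first_hit)

    return {
        # Fraction of simulated paths that reach the critical level.
        "p_reach_critical": hits / n_paths,
        # Mean first-crossing cycle, over the paths that cross at all.
        "expected_time_to_risk": (sum(hit_times) / len(hit_times)
                                  if hit_times else None),
        # Average number of cycles per path spent at or above the threshold.
        "mean_exposure_cycles": exposure / n_paths,
    }
```

Drawing regimes from the HMM transition matrix instead of independently per cycle would preserve regime persistence, at the cost of a slightly longer loop.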
Key insights:
- The system operates mostly in stable conditions
- Short critical periods drive disproportionate degradation
- The process exhibits degradation memory
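One way to quantify "disproportionate" is to compare the share of time spent in the critical regime with the share of degradation accumulated there. The helper below is a hypothetical sketch: `regime_impact` and its arguments are illustrative names, and it assumes a decoded state sequence and a degradation series of equal length.

```python
def regime_impact(states, deg_est, critical_state):
    """Return (time share, degradation share) for the critical regime."""
    n = len(states)
    time_share = sum(1 for s in states if s == critical_state) / n

    # Positive degradation gained on the step into each index i >= 1,
    # attributed to the regime active at index i.
    gains = [max(deg_est[i] - deg_est[i - 1], 0.0) for i in range(1, n)]
    total = sum(gains)
    critical_gain = sum(g for s, g in zip(states[1:], gains)
                        if s == critical_state)
    deg_share = critical_gain / total if total > 0 else 0.0
    return time_share, deg_share
```

A degradation share well above the time share indicates that short critical periods dominate the accumulated damage.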
This approach enables, in practical terms:
- Early detection of operational risk
- Transition from reactive to predictive monitoring
- Decision-making based on probabilistic future scenarios
- Reduction of unplanned downtime
- Early identification of hidden degradation
- Risk-aware operational decisions
It is not about predicting the system.
It is about tracking its dynamics and managing risk over time.
git clone https://github.com/VictorAMachado/bafo-framework.git
cd bafo-framework
python -m venv venv
source venv/bin/activate # Linux / Mac
# or
venv\Scripts\activate # Windows
pip install -r requirements.txt
jupyter notebook
Open notebooks/bafo_framework_analysis.ipynb and run all cells to reproduce:
- Data ingestion and preprocessing
- Feature engineering
- HMM regime detection
- BAFO degradation estimation
- Monte Carlo risk simulation
- The dataset is already anonymized (data/data_anon.csv)
- No external data is required
- All results and figures can be reproduced directly from the notebook
bafo-framework/
│
├── README.md
├── requirements.txt
├── LICENSE
│
├── data/
│   └── data_anon.csv
│
├── notebooks/
│   └── bafo_framework_analysis.ipynb
│
├── outputs/
│   ├── hmm_states.png
│   ├── integrated_process.png
│   └── monte_carlo.png
│
└── src/
    ├── bafo.py
    ├── simulation.py
    └── metrics.py
All data has been anonymized and transformed to preserve structural behavior while removing any sensitive or proprietary information.
The estimated time to risk (~24 cycles) should be interpreted within the context of the process.
Considering that each cycle lasts approximately t seconds, this corresponds to a relatively short time window for the system to reach critical degradation levels.
This indicates that:
- risk escalation can occur rapidly
- short sequences of non-ideal operation may have significant impact
- early detection is essential for maintaining process stability
It is important to note that this estimation is based solely on operational data.
Further refinement of the model can be achieved by incorporating:
- historical failure events
- maintenance records
- real downtime occurrences
This would allow calibration of degradation thresholds and improve alignment between modeled risk and actual system failures.
Victor Augusto Machado Dutra
Engineer | Control Systems, Industrial Data & Decision Intelligence

