A Django-based AI service that provides machine learning predictions for FarmInsight Food Production Facilities.
- The FarmInsight Project
- Overview
- Features
- Model Documentation
- Development Setup
- Running the Application
- API Endpoints
- Model Training
- Contributing
- License
Welcome to the FarmInsight Project by ETCE!
The FarmInsight platform brings together advanced monitoring of "Food Production Facilities" (FPF), enabling users to document, track, and optimize every stage of food production seamlessly.
All FarmInsight Repositories:
- Dashboard-Frontend
- Dashboard-Backend
- FPF-Backend
- AI-Backend

Link to our deployed System: FarmInsight.etce.isse.tu-clausthal.de
The AI-Backend provides predictive models for FarmInsight. It runs as an independent service that the Dashboard-Backend queries periodically for forecasts. Currently supported model types:
- Water Model: Predicts water tank levels, soil moisture, and optimal irrigation plans
- Energy Model: Forecasts battery state-of-charge (SoC), solar production, and proactive energy actions
The service exposes a standardized REST API that allows easy integration of new model types.
Additional ML libraries: LightGBM (Water Model), scikit-learn GradientBoostingRegressor (Energy Model)
- Water Level & Soil Moisture Forecasting: Physics-first hybrid model with ML residual correction, including best/average/worst-case scenarios
- Energy Forecasting: Battery SoC & solar production predictions with multi-scenario support (expected/optimistic/pessimistic)
- Proactive Actions: Water model recommends optimal irrigation schedules; Energy model triggers grid-connection when battery drops below threshold
- Weather Integration: Live weather forecasts from Open-Meteo (free, no API key), historical weather from Open-Meteo Archive API and DWD
- Beam Search Optimization: Intelligent irrigation plan optimization using beam search instead of exponential enumeration
- Conformal Prediction: Quantile-based uncertainty intervals for scenario generation without model retraining
- REST API: Standardized endpoints for easy Dashboard-Backend integration
- Model Training: Endpoints to trigger retraining with new data
Both models follow a tiered approach that selects different model types depending on the amount and quality of available training data:
| Data Availability | Water Model Strategy | Energy Model Strategy |
|---|---|---|
| No data / minimal data | Pure physics model (TankPhysics + SoilPhysics equations) | Physics-based energy balance (charging rates calibrated from historical data per month) |
| Some data (~25+ days) | Physics + ML (LightGBM quantile regression for residual correction) | GradientBoosting / RandomForest single-step model |
| Sufficient data (~100+ days) | Physics + ML + Conformal Prediction uncertainty intervals | Multi-horizon ML model (separate models per forecast horizon, backtest-validated) |
| Synthetic data available | Synthetic data generation for bootstrap training when real data is scarce | — |
This ensures that the system provides useful predictions even before enough training data is collected, and improves accuracy progressively as more data becomes available.
The Water Model predicts tank water levels and soil moisture for a configurable forecast period (default: 7 days), and computes optimal irrigation plans across three scenarios.
The Water Model employs a physics-first, ML-residual architecture. Core physical equations are used to compute a baseline prediction, and a machine learning model provides bounded residual corrections on top:
Prediction = Physics_Baseline + clamp(ML_Residual, ±20% of physics value)
This ensures that the ML model can never override physical constraints (e.g., water conservation, tank capacity limits).
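The bounded correction can be illustrated with a minimal sketch. The function name `combine_prediction` and its `max_fraction` parameter are hypothetical and not taken from the repository; only the ±20% bound comes from the formula above.

```python
def combine_prediction(physics_value: float, ml_residual: float,
                       max_fraction: float = 0.20) -> float:
    """Add an ML residual to a physics baseline, clamped to +/- max_fraction of the baseline."""
    bound = abs(physics_value) * max_fraction
    # Clamp the residual into [-bound, +bound] before applying it.
    clamped = max(-bound, min(bound, ml_residual))
    return physics_value + clamped


print(combine_prediction(100.0, 35.0))  # residual is cut back to the +20% bound
print(combine_prediction(100.0, -5.0))  # residual is within bounds and applied as-is
```

Because the clamp is relative to the physics value, a baseline of zero leaves no room for any ML correction in this sketch.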
- TankPhysics (`physics_model.py`): mass-balance based tank level calculation, `tank_level_new = tank_level_prev + inflow - outflow - evaporation_loss`
  - Clips values to `[0, tank_capacity]`
  - Tracks overflow separately
- SoilPhysics (`physics_model.py`): bucket/ET0-proxy soil moisture model, `soil_new = soil_prev + k_irrig × irrigation_mm + k_rain × rain_mm - k_evap × (temp / 30)`
  - Coefficients: `k_irrig = 1.5`, `k_rain = 0.3`, `k_evap = 0.4`
  - ML corrections bounded to max ±20% of the physics value
  - Output clipped to `[0%, 100%]`
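As a rough illustration of the two physics steps, the following sketch applies the mass balance and the bucket equation for a single day. The function names `tank_step` and `soil_step` are hypothetical; the actual classes in `physics_model.py` may be structured differently.

```python
K_IRRIG, K_RAIN, K_EVAP = 1.5, 0.3, 0.4  # coefficients listed above


def tank_step(prev_l, inflow_l, outflow_l, evaporation_l, capacity_l):
    """One mass-balance step; returns (clipped tank level, overflow), both in liters."""
    raw = prev_l + inflow_l - outflow_l - evaporation_l
    overflow = max(0.0, raw - capacity_l)      # tracked separately
    level = min(max(raw, 0.0), capacity_l)     # clip to [0, tank_capacity]
    return level, overflow


def soil_step(prev_pct, irrigation_mm, rain_mm, temp_c):
    """One bucket-model step for soil moisture, clipped to [0, 100] percent."""
    raw = prev_pct + K_IRRIG * irrigation_mm + K_RAIN * rain_mm - K_EVAP * (temp_c / 30)
    return min(max(raw, 0.0), 100.0)


print(tank_step(180.0, 40.0, 10.0, 0.0, 200.0))  # tank fills up, excess reported as overflow
print(soil_step(50.0, 2.0, 10.0, 30.0))
```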
- Algorithm: `MultiOutputRegressor(LGBMRegressor)` with quantile loss
- Quantiles trained: Q10 (pessimistic), Q50 (median), Q90 (optimistic)
- Target: Residuals (difference between true values and physics predictions)
- Hyperparameter tuning: `GridSearchCV` with `TimeSeriesSplit` cross-validation
- Model storage: Serialized via `joblib` to `trained_models/synthetic_quantile_q{10,50,90}.pkl`
The Water Model receives the following input data:
| Input | Source | Type | Description |
|---|---|---|---|
| `soil_moisture` | FPF Sensor | float (%) | Current soil moisture reading |
| `water_level` | FPF Sensor | float (L) | Current water tank level in liters |
| `soil_moisture_prev` | Derived | float (%) | Previous day's soil moisture |
| `water_level_prev` | Derived | float (L) | Previous day's water level |
| `day_of_year` | System | int | Day of year (1–366) |
| `month` | System | int | Month (1–12) |
| `temp_today` | Open-Meteo / DWD | float (°C) | Today's max temperature |
| `rain_today` | Open-Meteo / DWD | float (mm) | Today's rain amount |
| `temp_tomorrow` | Open-Meteo | float (°C) | Tomorrow's max temperature (forecast) |
| `rain_tomorrow` | Open-Meteo | float (mm) | Tomorrow's rain (forecast) |
| `irrigation_last_h_days` | Derived | float (L) | Total irrigation over last 7 days |
| `pump_usage` | FPF System | int (0/1/2) | Current pump activation level |
| `inflow_forecast_l_today` | Computed | float (L) | Greenhouse roof rainwater inflow |
Rain inflow computation (greenhouse_calculator.py):
The rainwater inflow is computed using a physics-based greenhouse roof model that accounts for:
- Roof geometry (slope angle `30.26°`, face azimuth `30°`, face area `6.668 m²`)
- Wind direction bias (affects rain distribution on roof faces A/B)
- Runoff coefficient (`0.9`) and first-flush loss (`3%`)
- Wind exposure factors per roof face
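A heavily simplified sketch of the inflow arithmetic follows. It uses only the runoff coefficient and first-flush loss and deliberately ignores the wind direction bias and exposure factors, whose formulas are not given here. The function name `roof_inflow_liters` is hypothetical, and whether the face area must first be projected onto the horizontal plane is not stated, so the catchment area is left as a plain parameter.

```python
RUNOFF_COEFFICIENT = 0.9   # from the list above
FIRST_FLUSH_LOSS = 0.03    # 3% first-flush loss


def roof_inflow_liters(rain_mm: float, catchment_area_m2: float) -> float:
    """Estimate collected rainwater in liters (1 mm of rain on 1 m^2 equals 1 liter)."""
    gross_l = rain_mm * catchment_area_m2
    return gross_l * RUNOFF_COEFFICIENT * (1.0 - FIRST_FLUSH_LOSS)


# Assumption: a single roof face of 6.668 m^2 used directly as the catchment area.
print(round(roof_inflow_liters(10.0, 6.668), 2))
```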
Weather data sources:
- Training: Historical weather from Open-Meteo Archive API (temperature, rain, wind direction, sunshine) — no API key required
- Inference: Live forecasts from Open-Meteo Forecast API (daily: rain, temperature, wind direction)
| Output | Unit | Description |
|---|---|---|
| `tank_l` | Liters | Predicted tank water level |
| `qin_l` | Liters | Water inflow (rainwater) |
| `qout_l` | Liters | Water outflow (irrigation) |
| `overflow_l` | Liters | Tank overflow |
| `soil_mm` | % | Predicted soil moisture |
| `irrigation_mm` | mm | Irrigation amount applied |
| `pump_usage` | 0/1/2 | Recommended pump activation level |
The model generates three scenarios using Conformal Prediction (preferred) or legacy multipliers:
| Scenario | Quantile | Method |
|---|---|---|
| best_case | Q90 (optimistic) | Adds 90th percentile residual to prediction |
| average_case | Q50 (median) | Baseline physics + median ML correction |
| worst_case | Q10 (pessimistic) | Adds 10th percentile residual to prediction |
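The quantile-based scenarios in the table can be sketched as follows. This is a simplified illustration that takes empirical quantiles of held-out residuals and adds them to a point prediction; it is not necessarily how the repository calibrates its intervals, and the names `empirical_quantile` and `scenario_predictions` are hypothetical.

```python
def empirical_quantile(values, q):
    """Linearly interpolated empirical quantile for q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def scenario_predictions(prediction, calibration_residuals):
    """Shift a point prediction by the Q10/Q50/Q90 residuals of a calibration set."""
    return {
        "worst_case": prediction + empirical_quantile(calibration_residuals, 0.10),
        "average_case": prediction + empirical_quantile(calibration_residuals, 0.50),
        "best_case": prediction + empirical_quantile(calibration_residuals, 0.90),
    }


# Residuals = observed value minus predicted value on held-out days (made-up numbers).
residuals = [3, -5, 0, 4, -2, 1, -4, 5, -1, 2, -3]
print(scenario_predictions(100.0, residuals))
```

Since only the residual quantiles change, the intervals can be refreshed from new calibration data without retraining the underlying model.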
Legacy multiplier fallback (when conformal prediction is disabled):
- Best case: temp ×0.9, rain ×1.1, inflow ×1.1
- Worst case: temp ×1.1, rain ×0.9, inflow ×0.9
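A minimal sketch of the legacy fallback is shown below. The dictionary layout and the function name `apply_scenario` are hypothetical, and the neutral factors for the average case are an assumption, since only the best and worst cases are listed above.

```python
LEGACY_MULTIPLIERS = {
    "best_case": {"temp": 0.9, "rain": 1.1, "inflow": 1.1},
    "average_case": {"temp": 1.0, "rain": 1.0, "inflow": 1.0},  # assumed neutral
    "worst_case": {"temp": 1.1, "rain": 0.9, "inflow": 0.9},
}


def apply_scenario(weather: dict, scenario: str) -> dict:
    """Scale each weather input by the scenario's multiplier (1.0 if none is defined)."""
    factors = LEGACY_MULTIPLIERS[scenario]
    return {key: value * factors.get(key, 1.0) for key, value in weather.items()}


today = {"temp": 20.0, "rain": 10.0, "inflow": 50.0}
print(apply_scenario(today, "worst_case"))
```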
Irrigation plan optimization uses beam search (beam_search.py) instead of exponential plan enumeration:
- Beam width: 200 candidates (default), greedy fallback for >14 day forecasts
- Adaptive beam: Larger beam early in forecast, smaller later (enabled by default)
- Scoring function: Multi-objective with weights for:
- Moisture deficit penalty (α = 10.0)
- Irrigation cost (β = 0.1)
- Final tank level reward (γ = 0.3)
- Overflow penalty (δ = 1.0)
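The idea behind beam search with a weighted score can be sketched on a toy problem. Everything except the four weights and the default beam width is an assumption made for illustration: the toy soil dynamics in `toy_step`, the 40% moisture target, the liters per pump level, and the function names are hypothetical and do not reflect `beam_search.py`.

```python
ALPHA, BETA, GAMMA, DELTA = 10.0, 0.1, 0.3, 1.0  # weights from the list above


def plan_score(deficit, irrigation, final_tank=0.0, overflow=0.0):
    """Higher is better: penalize deficit, cost and overflow, reward remaining water."""
    return -ALPHA * deficit - BETA * irrigation + GAMMA * final_tank - DELTA * overflow


def toy_step(soil_pct, pump_level):
    """Hypothetical daily dynamics: soil dries by 3 points, each pump level adds 2."""
    irrigation_l = 5.0 * pump_level
    new_soil = soil_pct - 3.0 + 2.0 * pump_level
    deficit = max(0.0, 40.0 - new_soil)  # assumed moisture target of 40%
    return new_soil, plan_score(deficit, irrigation_l)


def beam_search(initial_state, num_days, step_fn, actions=(0, 1, 2), beam_width=200):
    """Keep only the beam_width best partial plans after each day."""
    beam = [(0.0, [], initial_state)]
    for _ in range(num_days):
        candidates = []
        for score, plan, state in beam:
            for action in actions:
                new_state, reward = step_fn(state, action)
                candidates.append((score + reward, plan + [action], new_state))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    best_score, best_plan, _ = beam[0]
    return best_score, best_plan


best_score, best_plan = beam_search(40.0, 3, toy_step)
print(best_score, best_plan)
```

With three pump levels, full enumeration grows as 3^days, while the beam evaluates at most `3 × beam_width` candidates per day. The trade-off is that a narrow beam can discard a plan that would have turned out best later.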
The Energy Model predicts battery state-of-charge (SoC) in Watt-hours and solar energy production for up to 14 days (336 hours), including proactive energy management actions.
The Energy Model uses a cascade of prediction strategies:
graph TD
A[Energy Forecast Request] --> B{Multi-Horizon ML Model available?}
B -- Yes --> C[ML-Based Prediction]
B -- No --> D{Enhanced ML Model available?}
D -- Yes --> E[Hybrid ML + Physics Blend]
D -- No --> F[Physics-Based Prediction]
C --> G[Generate Proactive Actions]
E --> G
F --> G
G --> H[Return Forecast + Actions]
- When: Trained model file `energy_forecast_models.pkl` exists
- Algorithm: Separate models per forecast horizon (e.g., +1h, +3h, +6h); the +1h model is currently used for iterative prediction
- Method: Iterative. Predict 1 hour ahead, then use that prediction as input for the next hour
- Features: 16 engineered features (see table below)
- Validated: With backtesting on historical data
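The iterative rollout can be sketched with a stand-in predictor. The function `iterative_forecast` and the constant-drain lambda are hypothetical placeholders; the real +1h model takes the full engineered feature vector rather than only the previous SoC.

```python
from typing import Callable, List


def iterative_forecast(initial_soc_wh: float, hours: int,
                       predict_next_hour: Callable[[float, int], float]) -> List[float]:
    """Roll a one-step-ahead model forward by feeding each prediction back as input."""
    forecast = []
    soc = initial_soc_wh
    for hour in range(hours):
        soc = predict_next_hour(soc, hour)
        forecast.append(soc)
    return forecast


# Stand-in for the trained +1h model: a flat 12 Wh/h drain, floored at zero.
print(iterative_forecast(800.0, 3, lambda soc, hour: max(0.0, soc - 12.0)))
```

One consequence of this design is that prediction errors compound over the horizon, because every step consumes the previous step's output.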
- When: `energy_model.pkl` + `energy_scaler.pkl` exist
- Algorithm: `GradientBoostingRegressor` (scikit-learn) or `RandomForestRegressor`
- Method: Hybrid blend with time-decay: `SoC = ml_weight × time_decay × ML_prediction + (1 - ml_weight × time_decay) × Physics_prediction`
  - Expected: ML weight 60%, Optimistic: 40%, Pessimistic: 30%
  - `time_decay = max(0.3, 1.0 - (hour / 336))`, so ML influence decreases over time
- Features: 22 engineered features including temporal, weather, lag, and rolling statistics
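The blend formula can be written out as a short sketch. The names `ML_WEIGHTS`, `time_decay`, and `blend_soc` are hypothetical; the weights and the decay expression are the ones listed above.

```python
ML_WEIGHTS = {"expected": 0.6, "optimistic": 0.4, "pessimistic": 0.3}
FORECAST_HOURS = 336


def time_decay(hour: int) -> float:
    """ML influence shrinks linearly with the forecast hour but never drops below 0.3."""
    return max(0.3, 1.0 - hour / FORECAST_HOURS)


def blend_soc(ml_prediction: float, physics_prediction: float,
              hour: int, scenario: str = "expected") -> float:
    """Weighted blend of the ML and physics SoC predictions (both in Wh)."""
    weight = ML_WEIGHTS[scenario] * time_decay(hour)
    return weight * ml_prediction + (1.0 - weight) * physics_prediction


print(blend_soc(1000.0, 900.0, hour=0))    # ML contributes 60% at the start
print(blend_soc(1000.0, 900.0, hour=336))  # ML contributes 0.6 * 0.3 = 18% at the end
```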
- When: No ML models available
- Method: Uses historically calibrated monthly charging rates and consumption patterns:
  - Charging window: 9:00–12:00 local time (site-specific, terrain-dependent)
  - Charging rates by month (from historical analysis):

    | Month | Charging Rate (Wh/hour) | Note |
    |---|---|---|
    | Jan | 30 | Minimal winter sun |
    | Feb | 40 | - |
    | Mar | 50 | - |
    | Apr | 60 | - |
    | May | 70 | - |
    | Jun | 80 | High sun partially blocked by terrain |
    | Jul | 56 | Measured: 55.9 Wh/h |
    | Aug | 91 | Measured: 91.0 Wh/h; lower sun angle clears terrain better |
    | Sep | 96 | Measured: 95.5 Wh/h; peak performance |
    | Oct | 61 | Measured: 60.7 Wh/h |
    | Nov | 24 | Measured: 24.0 Wh/h |
    | Dec | 20 | Estimated |

  - Consumption rates: Night (~5 Wh/h), Day (~20 Wh/h), Battery drain average ~12 Wh/h
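A minimal sketch of an hourly physics step built from these numbers follows. It rests on several assumptions that are not stated in this README: the charging window is treated as the half-open interval 9 ≤ hour < 12, a flat 12 Wh/h drain is applied to every hour, and the charging rate is treated as a gross figure from which the drain is subtracted. The function name `physics_soc_step` is hypothetical.

```python
# Monthly charging rates in Wh per hour, copied from the table above.
CHARGING_RATE_WH_PER_HOUR = {
    1: 30, 2: 40, 3: 50, 4: 60, 5: 70, 6: 80,
    7: 56, 8: 91, 9: 96, 10: 61, 11: 24, 12: 20,
}


def physics_soc_step(soc_wh: float, month: int, hour: int,
                     battery_max_wh: float = 1600.0,
                     drain_wh_per_hour: float = 12.0) -> float:
    """Advance the battery SoC by one hour using the calibrated monthly charging rate."""
    in_window = 9 <= hour < 12  # assumed reading of the 9:00-12:00 window
    charge = CHARGING_RATE_WH_PER_HOUR[month] if in_window else 0.0
    # Clip to the physical range of the battery.
    return min(battery_max_wh, max(0.0, soc_wh + charge - drain_wh_per_hour))


print(physics_soc_step(800.0, month=9, hour=10))  # charging hour in September
print(physics_soc_step(800.0, month=9, hour=13))  # outside the window, drain only
```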
| Feature | Type | Description |
|---|---|---|
| `capacity_wh` | float (Wh) | Current battery SoC in Watt-hours |
| `power_watts` | float (W) | Current power consumption (~12 W average) |
| `hour_sin`, `hour_cos` | float | Cyclical encoding of hour of day |
| `month_sin`, `month_cos` | float | Cyclical encoding of month |
| `shortwave_radiation` | float (W/m²) | Shortwave solar radiation from Open-Meteo |
| `cloud_cover` | float (%) | Cloud cover percentage from Open-Meteo |
| `sunshine_minutes` | float (min) | Sunshine duration per hour from Open-Meteo |
| `effective_sunshine` | float | `sunshine_minutes × in_solar_window` |
| `in_solar_window` | boolean (0/1) | 1 if hour ∈ [7, 11], else 0 |
| `capacity_lag_1h` | float (Wh) | Battery SoC 1 hour ago |
| `capacity_lag_3h` | float (Wh) | Battery SoC 3 hours ago |
| `capacity_lag_6h` | float (Wh) | Battery SoC 6 hours ago |
| `capacity_rolling_6h` | float (Wh) | Rolling mean SoC over last 6 hours |
| `capacity_rolling_24h` | float (Wh) | Rolling mean SoC over last 24 hours |
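To make the feature definitions concrete, here is a small standard-library sketch that derives a subset of them. The function names `cyclical` and `build_features` are hypothetical, and two details are assumptions: the month is encoded with a period of 12 without an offset, and the rolling mean includes the current hour.

```python
import math


def cyclical(value: float, period: float):
    """Map a periodic value onto the unit circle so that e.g. hour 23 is close to hour 0."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def build_features(hour: int, month: int, sunshine_minutes: float, soc_history: list) -> dict:
    """soc_history holds hourly SoC values in Wh, oldest first, current hour last."""
    hour_sin, hour_cos = cyclical(hour, 24)
    month_sin, month_cos = cyclical(month, 12)
    in_solar_window = 1 if 7 <= hour <= 11 else 0
    last_6h = soc_history[-6:]
    return {
        "capacity_wh": soc_history[-1],
        "hour_sin": hour_sin, "hour_cos": hour_cos,
        "month_sin": month_sin, "month_cos": month_cos,
        "in_solar_window": in_solar_window,
        "effective_sunshine": sunshine_minutes * in_solar_window,
        "capacity_lag_1h": soc_history[-2],
        "capacity_lag_3h": soc_history[-4],
        "capacity_lag_6h": soc_history[-7],
        "capacity_rolling_6h": sum(last_6h) / len(last_6h),
    }


history = [700.0, 710.0, 720.0, 730.0, 740.0, 750.0, 760.0]
print(build_features(hour=10, month=9, sunshine_minutes=45.0, soc_history=history))
```

The sine/cosine pair avoids the artificial jump between hour 23 and hour 0 that a raw integer hour would introduce.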
| Feature | Type | Description |
|---|---|---|
| `dow_sin`, `dow_cos` | float | Cyclical encoding of day of week |
| `capacity_lag_12h`, `capacity_lag_24h` | float (Wh) | Extended lag features |
| `power_rolling_6h`, `power_rolling_24h` | float (W) | Rolling power consumption averages |
| `radiation_rolling_6h` | float (W/m²) | Rolling radiation average |
| Parameter | Type | Default | Description |
|---|---|---|---|
| `latitude` | float | 51.9 | Location latitude for weather forecast |
| `longitude` | float | 10.4 | Location longitude for weather forecast |
| `forecast_hours` | int | 336 | Forecast period in hours (14 days) |
| `max_solar_output_watts` | float | 600 | Maximum solar panel output in watts |
| `avg_consumption_watts` | float | 50 | Average power consumption (auto-injected from consumers) |
| `initial_soc_wh` | float | 800 | Current battery SoC (auto-injected from battery sensor) |
| `battery_max_wh` | float | 1600 | Maximum battery capacity in Wh |
Training data sources:
- Battery capacity: `Anker Daten ALL Kapazität in Wh.json`, historical Anker Solar Bank capacity readings
- Power consumption: `Power Daten ALL Vebrauch in Watt.json`, historical power consumption in Watts
- Weather: Historical hourly data from Open-Meteo Archive API (shortwave radiation, cloud cover, sunshine duration, temperature, humidity); no API key required
- Location: Goslar (51.90°N, 10.43°E, 289m elevation)
| Output | Unit | Description |
|---|---|---|
| Battery SoC Forecast | Wh | Hourly predicted battery state-of-charge |
| Solar Production | Wh | Hourly estimated solar energy production |
Both outputs include three scenarios: expected, optimistic, pessimistic.
| Scenario | Solar Factor | Consumption Factor | ML Charge Bonus (solar hours) |
|---|---|---|---|
| expected | ×1.0 | ×1.0 | ±0 Wh |
| optimistic | ×1.4 | ×0.75 | +30 Wh/h |
| pessimistic | ×0.5 | ×1.35 | −20 Wh/h |
Additionally, weather inputs are adjusted per scenario (cloud cover ±20–30%, sunshine ×0.5–1.3).
The Energy Model generates proactive energy management actions:
| Action | Trigger | Description |
|---|---|---|
| `connect_grid = true` | Predicted SoC < 15% | Connect FPF to electrical grid to prevent battery depletion |
| `connect_grid = false` | Predicted SoC > 50% (after grid was connected) | Disconnect grid, battery has recovered |
Actions are scheduled 2 hours before the threshold is predicted to be hit (buffer time for Dashboard-Backend to execute the action).
Note: Individual consumer shutdowns (e.g., turning off specific equipment when battery is low) are handled by the Dashboard-Backend based on per-consumer threshold configuration, not by the AI-Backend.
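The two thresholds form a hysteresis band, which can be sketched as below. The function name `plan_grid_actions` and the action dictionary layout are hypothetical. Applying the 2-hour lead time to both the connect and the disconnect action is an assumption based on the general wording above.

```python
def plan_grid_actions(soc_forecast_wh, battery_max_wh=1600.0,
                      low_pct=15.0, high_pct=50.0, lead_hours=2):
    """Scan an hourly SoC forecast and schedule grid actions with a lead time."""
    actions = []
    grid_connected = False
    for hour, soc in enumerate(soc_forecast_wh):
        pct = 100.0 * soc / battery_max_wh
        if not grid_connected and pct < low_pct:
            # Battery is about to run low: connect ahead of time.
            actions.append({"hour": max(0, hour - lead_hours), "connect_grid": True})
            grid_connected = True
        elif grid_connected and pct > high_pct:
            # Battery has recovered: release the grid connection.
            actions.append({"hour": max(0, hour - lead_hours), "connect_grid": False})
            grid_connected = False
    return actions


forecast = [400.0, 300.0, 250.0, 230.0, 300.0, 500.0, 700.0, 900.0]
print(plan_grid_actions(forecast))
```

Using separate connect and disconnect thresholds keeps the grid connection from toggling repeatedly when the SoC hovers around a single limit.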
The Energy Model includes site-specific calibration (site_config.py) for the Goslar installation:
- Site shading factor: `0.35`; only 35% of theoretical solar radiation reaches the panels (terrain/building shading)
- Effective solar window: 9:00–12:00 local time (peak at 10:00–11:00)
- Battery specs: Anker Solar Bank, 1600 Wh max, 160 Wh practical minimum (10%)
- Seasonal calibration: Monthly solar factors derived from real measurements (Aug/Sep are peak months due to low sun angle clearing terrain)
- Python 3.13 or higher
- `pip` (Python package manager)
- `virtualenv` (recommended for isolated environments)
- Navigate to the model_service directory: `cd model_service`
- Create and activate a virtual environment: `python -m venv .venv`, then `source .venv/bin/activate` (Linux/Mac) or `.venv\Scripts\activate` (Windows)
- Install the required dependencies: `pip install -r requirements.txt`

Start the development server: `python manage.py runserver 8002`

The default port is 8002 to avoid conflicts with other FarmInsight services.
| Method | Endpoint | Description |
|---|---|---|
| GET | `/water/params` | Returns model configuration and input parameters |
| GET | `/water/farm-insight` | Returns water level & soil moisture forecasts with optimal irrigation plans |
| POST | `/water/train` | Triggers model retraining with latest sensor data |
| Method | Endpoint | Description |
|---|---|---|
| GET | `/energy/params` | Returns model configuration and input parameters |
| GET | `/energy/farm-insight` | Returns battery SoC & solar production forecasts with proactive actions |
| POST | `/energy/train` | Triggers model retraining (prepares training data from JSON + weather, trains GradientBoosting model) |
| Method | Endpoint | Description |
|---|---|---|
| GET | `/alive` | Returns 200 if service is running |
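A minimal client sketch using only the Python standard library is shown below. The base URL assumes the development server from the setup instructions is running locally on port 8002, and the helper names `endpoint_url`, `fetch_json`, and `is_alive` are hypothetical. That the GET endpoints return JSON bodies is also an assumption; the response schema is not documented here.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

BASE_URL = "http://localhost:8002/"  # assumption: local development server


def endpoint_url(path: str, base_url: str = BASE_URL) -> str:
    """Build the full URL for an endpoint path such as '/water/params'."""
    return urljoin(base_url, path.lstrip("/"))


def fetch_json(path: str, base_url: str = BASE_URL, timeout: float = 10.0):
    """GET an endpoint and decode the body as JSON (assumed response format)."""
    with urlopen(endpoint_url(path, base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def is_alive(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Return True if /alive answers with HTTP 200, False on any connection error."""
    try:
        with urlopen(endpoint_url("/alive", base_url), timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False
```

With the service running, `is_alive()` can serve as a quick health check before calling `fetch_json("/energy/farm-insight")`.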
Trained models are stored in model_service/model_service/trained_models/.
Run `python train_hybrid_model.py` to train the LightGBM quantile regression models.

Output files:
- `synthetic_quantile_q10.pkl`: Pessimistic scenario model
- `synthetic_quantile_q50.pkl`: Median (expected) model
- `synthetic_quantile_q90.pkl`: Optimistic scenario model
Training uses TimeSeriesSplit cross-validation with GridSearchCV for hyperparameter tuning. Supports training on real or synthetic data (generated via synthetic_historical_data.py when real data is insufficient).
Via the /energy/train API endpoint or manually:
Run `cd model_service`, then `python -m model_service.utils.energy_model_trainer`.

Pipeline:
- Load battery capacity data (`Anker Daten ALL Kapazität in Wh.json`) and power data (`Power Daten ALL Vebrauch in Watt.json`)
- Resample to hourly intervals, merge by timestamp
- Fetch historical weather from Open-Meteo Archive API
- Engineer features: cyclical temporal encoding, lag features (1h, 3h, 6h, 12h, 24h), rolling statistics (6h, 24h)
- Target: `capacity_next_hour` (SoC one hour in the future)
- Time-series split (80/20), `StandardScaler`, train `GradientBoostingRegressor`
- Save model, scaler, and metadata

Output files:
- `energy_model.pkl`: Trained GradientBoosting model
- `energy_scaler.pkl`: StandardScaler for feature normalization
- `energy_model_metadata.pkl`: Training metrics and feature importance
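Two steps of the pipeline, building the `capacity_next_hour` target and the 80/20 time-series split, can be sketched without any ML library. The function names `make_supervised` and `chronological_split` are hypothetical, and the single-feature rows are a simplification of the full engineered feature set.

```python
def make_supervised(soc_series):
    """Pair each hour's SoC with the following hour's SoC (the capacity_next_hour target)."""
    return [(soc_series[i], soc_series[i + 1]) for i in range(len(soc_series) - 1)]


def chronological_split(rows, train_fraction=0.8):
    """Split without shuffling so the test set lies strictly after the training set in time."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Eleven made-up hourly SoC readings in Wh give ten (input, target) pairs.
soc = [800.0, 790.0, 780.0, 770.0, 860.0, 950.0, 1040.0, 1030.0, 1020.0, 1010.0, 1000.0]
pairs = make_supervised(soc)
train, test = chronological_split(pairs)
print(len(train), len(test))
```

Keeping the split chronological avoids leaking future readings into training, which a random shuffle would do for lag and rolling features.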
We welcome contributions! Please follow these steps:
- Fork the repository.
- Create a new branch: `git checkout -b feature/your-feature`
- Make your changes and commit them: `git commit -m 'Add new feature'`
- Push the branch: `git push origin feature/your-feature`
- Create a pull request.
This project was developed as part of the Digitalisierungsprojekt at DigitalTechnologies WS24/25 by:
- Tom Luca Heering
- Theo Lesser
- Mattes Knigge
- Julian Schöpe
- Marius Peter
- Paul Golke
- Niklas Schaumann
- M. Linke
Project supervision:
- Johannes Mayer
- Benjamin Leiding
This project is licensed under the AGPL-3.0 license.
For more information or questions, please contact the ETCE-Lab team.