diff --git a/.gitignore b/.gitignore
index 18df872..cc8467b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -36,4 +36,5 @@ site/
 
 # Scratch directories
 .scratch/
+.sisyphus/
 scratch/
diff --git a/AGENTS.md b/AGENTS.md
index e776070..1544305 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -5,6 +5,7 @@ These guidelines are specific to **AI/LLM agents** working on this codebase. Hum
 ## Core Principles
 
 - **Read CONTRIBUTING.md first**: Before making changes, read [CONTRIBUTING.md](./CONTRIBUTING.md) for coding standards, testing conventions, and documentation sync rules that apply to all contributors (agents included). AGENTS.md covers agent-specific behavior; CONTRIBUTING.md covers everything else.
+- **Work in branches only**: All work must be done in feature or topic branches unless the user explicitly specifies otherwise. Commits directly to `main` are forbidden without explicit instruction. Before beginning any task, check what branch you're on and create a new one if needed (e.g., `feat/description`, `fix/description`).
 - **Build before committing**: The code MUST compile (`cargo build`), pass all tests (`cargo test --all-targets`), and be clean under clippy (`cargo clippy --all-targets -- -D warnings`) before any git commit. Never ship broken code. Always match CI commands exactly — `--all-targets` includes test targets which may have lint warnings not visible otherwise.
 - **Conventional commits**: All git commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) format: `type(scope): description`. See section below.
 - **Commit frequently when stable**: Make atomic, logical commits whenever the codebase is in a working state (builds, tests pass). Do not batch unrelated changes into a single commit. Each commit should represent one coherent unit of change.
@@ -13,6 +14,9 @@ These guidelines are specific to **AI/LLM agents** working on this codebase. Hum
   - For larger units of work (major refactoring, big new feature), split into small, manageable commits rather than one massive commit to preserve history granularity and make rollbacks easier.
 - **Follow existing patterns first**: Before proposing new patterns or structures, search for and follow established conventions in the codebase. When in doubt, match what's already there.
 - **Graceful degradation over panics**: Metric collectors return `Result` types and fall back to zero values on failure. The daemon continues operating even when individual metrics are unavailable.
+- **Descriptive comments are encouraged**: Comments that explain non-obvious intent, arithmetic expectations, or why a particular approach was chosen should be kept — especially in tests where the "what" is clear but the "why" and expected values may not be. Docstrings on public APIs and complex algorithms (e.g., accumulation logic, security-critical code) are welcome. Avoid comments that merely restate what the code already says ("increment counter by one"), but keep those that add context a reader wouldn't get from reading alone.
+- **Docs document current state only**: All documentation must describe how things work now — never reference "previous behaviour", "this replaces", or any historical comparison. Documentation is read against the current codebase; past implementation details belong in git history, not docs.
+- **Use todos tool for task tracking**: Always use the `todos` tool to track tasks and keep it updated as you progress. When interrupted or new requests are made during work, update the todos list ordering by priority. This ensures continuity across session boundaries and prevents lost context on resumption.
 
 ### Agent-Specific Rules (do NOT apply to human developers)
 
@@ -96,6 +100,7 @@ Use the affected module as scope: `service`, `config`, `gpu`, `cpu`, `network`,
 ## Logging Conventions
 
 - Use the `tracing` crate (`debug!`, `info!`, `warn!`, `error!` macros).
+- **Log level priority chain**: When resolving the effective tracing log level, always follow this exact order: CLI `-l` flag > RUST_LOG env var > config.log_level from any loaded config file > default of `'info'`. Never reorder these — the function `resolve_tracing_log_level()` in main.rs implements this and must not be changed.
 - **State-change-only logging**: When tracking persistent states (inhibition, connection status), only emit INFO logs on actual state transitions. Do not log every polling cycle when state is unchanged. Track previous state and compare at the end of each tick/loop iteration.
 
 ## Error Handling Conventions
@@ -114,8 +119,8 @@ Use the affected module as scope: `service`, `config`, `gpu`, `cpu`, `network`,
 ## Configuration Conventions
 
 - TOML format via the `toml` crate with serde derive macros.
-- All config values have sensible defaults defined as `fn default_*() -> T` helper functions.
-- Optional fields use `#[serde(default)]`; required overrides use `#[serde(default = "default_fn")]`.
+- All config values have sensible defaults defined in `config/rouser.toml`, embedded at compile time via `include_str!()`. Struct fields use bare `#[serde(default)]`; Duration fields may need explicit helper functions only when humantime_serde requires a function-typed default (e.g., `default_history_length()` for 30-day history).
+- Explicit `Default` trait impls on config structs hardcode values from `config/rouser.toml`. Apart from the humantime_serde case above, never add `fn default_*() -> T` helper functions; the TOML file is the single source of truth.
 - Duration parsing uses `humantime_serde` for human-readable format (e.g., `"5s"`, `"30m"`).
 
 ## XDG Base Directory Compliance
@@ -219,7 +224,7 @@ The old `/org/freedesktop/PowerManagement.Inhibit` API is obsolete (deprecated ~
 `config/rouser.toml` is the single source of truth for all configuration defaults — not `src/config.rs`, not documentation, not code comments. When updating default values:
 
 1. **Always update `config/rouser.toml` first** with the new default value
-2. Then update `src/config.rs` to match (default helper functions like `default_ema_alpha_cpu()`)
+2. Then update `src/config.rs` to match (hardcoded values in `Default` trait impls)
 3. Then update all documentation (`docs/configuration.md`, `docs/metrics-overview.md`, etc.)
 
 The code defaults in `config/rouser.toml` are embedded at compile time via `include_str!()` and served as both the shipped config file AND the binary's built-in fallback. Never change a default value without updating all three locations simultaneously.
@@ -299,3 +304,20 @@ echo "https://github.com/{owner}/{repo}/actions/runs/RUN_ID"
 - **Missing `needs` dependencies**: If a job references another via `needs: [foo]`, and `foo` is conditional (`if:`), the dependent job inherits that condition — it will skip if the dependency was skipped. Always verify both jobs have matching trigger conditions.
 - **Container vs runner environment mismatch**: Steps running in containers (e.g., `container: fedora:latest`) cannot access tools on the host runner (like `gh` CLI). Split containerized build steps from upload/CLI steps that run on `ubuntu-latest` without a container.
 - **Artifact download path defaults to `.`**: When using `actions/download-artifact@v4`, always specify `path: some-dir/` explicitly, then move files with `mv some-dir/* .` before consuming them — default behavior may merge artifacts unpredictably.
+
+## XDG State Directory Migration
+
+History data was migrated from `$XDG_DATA_HOME/rouser` (or `~/.local/share/rouser`) to `$XDG_STATE_HOME/rouser` (or `~/.local/state/rouser`). This is a breaking change: existing history files at the old path are not read by new binaries. The fallback for read-only `/home` with no writable state dir uses `/tmp/rouser-history.<pid>` with 0700 permissions to minimize TOCTOU risk on shared systems. When updating config defaults or docs, always reference `XDG_STATE_HOME`, never `XDG_DATA_HOME`.
+
+## Prediction Model Refactoring (In Progress)
+
+The prediction module is undergoing a major refactoring to replace the histogram-based TimeKey approach with an unsupervised ML model using NG-RC reservoir computing from the [irithyll](https://crates.io/crates/irithyll) crate. See [`docs/prediction-todo.md`](./docs/prediction-todo.md) for the complete task tracker and architecture decisions.
+
+**Key changes:**
+- **TimeKey deprecation**: The `(year, week_of_year, seconds_into_week)` histogram key is being removed. Year provides no pattern-matching value (it's monotonically increasing), and 604800 buckets/week is wasteful for sparse data. The ML approach eliminates bucketing entirely — each history entry becomes a feature vector.
+- **Feature vectors**: Six normalized values per entry: CPU max, CPU avg, GPU max, GPU avg, network MB/s, disk MB/s. No time-key bucketing; temporal patterns learned via reservoir delay embeddings.
+- **Unsupervised learning**: NG-RC updates weights at each prediction `update_interval` (default 30s) without labeled data. Anomaly score maps to cooldown extension.
+- **Gap-filled entries preserved**: Unlike the previous approach that filtered out zero-value gap entries, these represent valid idle states and contribute to baseline anomaly scoring.
+- **GPU deltas added**: EntryDeltas now includes `gpu_delta_per_gpu_max` and `gpu_delta_total_average`, updated in TrendSignal alongside CPU/network/disk trends.
+
+**Config changes:** New fields planned for `[prediction]`: `ml_hidden_dim: usize (default 16)`, `ml_delay_buffer_size: usize (default 8)` to control reservoir capacity.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index d54eecd..142d656 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -155,8 +155,7 @@ Include contextual identifiers in log messages: GPU device IDs (`card0(nvidia)`)
 ### Configuration Conventions
 
 - TOML format via the `toml` crate with serde derive macros.
-- All config values have sensible defaults defined as `fn default_*() -> T` helper functions.
-- Optional fields use `#[serde(default)]`; required overrides use `#[serde(default = "default_fn")]`.
+- All config values have sensible defaults defined in `config/rouser.toml`, embedded at compile time via `include_str!()`. Struct fields use bare `#[serde(default)]`; explicit `Default` trait impls on config structs hardcode these same values. Never add `fn default_*() -> T` helper functions — the TOML file is the single source of truth.
 - Duration parsing uses `humantime_serde` for human-readable format (e.g., `"5s"`, `"30m"`).
 
 ---
diff --git a/Cargo.toml b/Cargo.toml
index 2aa7aa1..55dc178 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -41,10 +41,16 @@ libc = "0.2"
 serde = { version = "1.0", features = ["derive"] }
 humantime-serde = "1.0"
 
+# Binary serialization for history log (lightweight, serde-compatible via bincode v2)
+bincode = { version = "2", features = ["serde"] }
+
 # CLI parsing
 clap = { version = "4", features = ["derive"] }
 humantime = "2.1"
 
+# Streaming machine learning (unsupervised NG-RC reservoir computing for cooldown prediction)
+irithyll = { version = "9.9", features = ["serde-bincode"] }
+
 
 [dev-dependencies]
 tempfile = "3.0"
diff --git a/README.md b/README.md
index 897d1b3..5e1c349 100644
--- a/README.md
+++ b/README.md
@@ -19,6 +19,7 @@ rouser keeps headless servers and desktops awake during active use. It monitors
 - **Multi-metric monitoring**: CPU (per-core frequency-weighted), GPU (NVIDIA/AMD/Intel), network I/O, disk activity
 - **Configurable thresholds**: Independent per-core and total-CPU thresholds, per-GPU reporting
 - **EMA smoothing**: Per-metric exponential moving average for stable readings
+- **Predictive cooldown**: Learns from historical usage patterns to extend the idle cooldown duration, reducing premature sleep during typical active-use hours
 - **Systemd integration**: Uses `org.freedesktop.login1.Manager.Inhibit` D-Bus API
 - **TOML configuration**: Embedded default config; auto-installs to user or system paths on first run, merges `/etc/rouser/config.toml` and `~/.config/rouser/config.toml` if present
 - **Dry-run mode**: Test without inhibiting sleep
@@ -71,6 +72,7 @@ See [Configuration Reference](docs/configuration.md) for all options with defaul
 | [Configuration Reference](docs/configuration.md) | All config options with embedded-default values |
 | [Command Line](docs/command-line.md) | CLI arguments and usage examples |
 | [Metrics Overview](docs/metrics-overview.md) | How CPU, GPU, network, disk metrics are collected |
+| [Prediction Model](docs/prediction-model.md) | How adaptive cooldown extension works from historical patterns |
 | [GPU Usage Measurement](docs/gpu-usage-measurement.md) | What NVML, amdgpu, and i915 actually measure |
 | [D-Bus Inhibition](docs/d-bus-inhibition.md) | How sleep inhibition works under the hood |
 
diff --git a/config/rouser.toml b/config/rouser.toml
index 004bf93..dc215cd 100644
--- a/config/rouser.toml
+++ b/config/rouser.toml
@@ -13,8 +13,9 @@ total_threshold = 25.0
 ema_alpha = 0.7
 
 [metrics.gpu]
-threshold = 15.0      # GPU usage threshold (percentage)
-ema_alpha = 0.7       # EMA smoothing factor
+per_gpu_threshold = 25.0    # Per-GPU utilization percentage that triggers inhibition
+total_threshold = 40.0      # System-wide average GPU utilization threshold (both thresholds use OR logic)
+ema_alpha = 0.7             # EMA smoothing factor
 
 [metrics.network]
 threshold = 10.0      # Network I/O threshold (Mbps)
@@ -35,3 +36,12 @@ cooldown_duration = "10s"     # Time below threshold before releasing inhibition
 [inhibitor]
 what = "shutdown:idle"     # Lock type: idle, sleep, suspend, shutdown (colon-separated)
 mode = "block"             # Mode: block, delay, block-weak
+
+# Predictive cooldown — learns from historical usage patterns to dynamically extend or reduce the cooldown duration.
+# Requires a longer history (days/weeks of data). Enabled by the defaults below; set update_interval to "0s" to disable.
+[prediction]
+update_interval = "30s"              # Interval between averaged snapshots written to history log; must be >= root update_interval
+history_length = "30d"               # Keep this much historical data; older entries are pruned periodically
+max_extension_time = "1h"            # Maximum additional time for predictive cooldown extension
+ml_hidden_dim = 16                   # Number of hidden neurons in NG-RC reservoir computing model (controls capacity, O(n^2) memory)
+ml_delay_buffer_size = 8             # Size of delay buffer for temporal feature creation from past states
diff --git a/docs/averaging.md b/docs/averaging.md
index 2eadec7..c2615d3 100644
--- a/docs/averaging.md
+++ b/docs/averaging.md
@@ -94,8 +94,9 @@ threshold = 80.0
 ema_alpha = 0.1        # Default smoothing for CPU
 
 [metrics.gpu]
-threshold = 90.0
-ema_alpha = 0.2        # More responsive for GPU
+per_gpu_threshold = 90.0    # Per-GPU max usage threshold
+total_threshold = 85.0      # System-wide average threshold (both use OR logic)
+ema_alpha = 0.2             # More responsive for GPU
 
 [metrics.network]
 threshold = 100.0
@@ -117,15 +118,16 @@ ema_alpha = 0.1        # Standard smoothing for disk I/O
 
 ### Per-GPU EMA Smoothing
 
-Each detected GPU applies the same `ema_alpha` from `[metrics.gpu]`, but independently. There is no per-GPU config override — the threshold and smoothing factor apply uniformly to all GPUs:
+Each detected GPU applies the same `ema_alpha` from `[metrics.gpu]`, but independently. There is no per-GPU config override — both thresholds and the smoothing factor apply uniformly to all GPUs:
 
 ```toml
 [metrics.gpu]
-threshold = 90.0    # Applies to ALL detected GPUs
-ema_alpha = 0.2     # Applied per-device, not globally averaged
+per_gpu_threshold = 90.0    # Per-GPU max usage threshold (applies to ALL detected GPUs)
+total_threshold = 85.0      # System-wide average threshold
+ema_alpha = 0.2             # Applied per-device, not globally averaged
 ```
 
-This means card0(nvidia) at 95% and card1(amdgpu) at 87% are each compared against the same threshold independently — one exceeding it triggers inhibition regardless of the other's state.
+This means card0(nvidia) at 95% and card1(amdgpu) at 87% are each compared against the `per_gpu_threshold` independently; one exceeding it triggers inhibition regardless of the other's state. The system-wide average is also checked: if both GPUs sit above `total_threshold` but below `per_gpu_threshold` (for example, both at 87%), the average alone triggers inhibition even though neither per-GPU value exceeds its threshold.
 
 ## Threshold Evaluation
 
@@ -323,8 +325,9 @@ total_threshold = 60.0
 ema_alpha = 0.1        # Default smoothing for CPU
 
 [metrics.gpu]
-threshold = 90.0
-ema_alpha = 0.2
+per_gpu_threshold = 90.0    # Per-GPU max usage threshold
+total_threshold = 75.0      # System-wide average threshold (both use OR logic)
+ema_alpha = 0.2             # EMA smoothing for GPU
 
 [metrics.network]
 threshold = 50.0
@@ -354,8 +357,9 @@ total_threshold = 70.0
 ema_alpha = 0.15       # More responsive for compilation bursts
 
 [metrics.gpu]
-threshold = 95.0
-ema_alpha = 0.2        # Responsive for GPU workloads
+per_gpu_threshold = 95.0    # Per-GPU max usage threshold (high for gaming)
+total_threshold = 80.0      # System-wide average threshold (both use OR logic)
+ema_alpha = 0.2             # EMA smoothing for GPU
 
 [metrics.network]
 threshold = 100.0
@@ -385,8 +389,9 @@ total_threshold = 60.0
 ema_alpha = 0.2        # Quick spike detection
 
 [metrics.gpu]
-threshold = 90.0
-ema_alpha = 0.25       # Very responsive for gaming GPU activity
+per_gpu_threshold = 90.0     # Per-GPU max usage threshold (high for gaming)
+total_threshold = 85.0       # System-wide average threshold (both use OR logic)
+ema_alpha = 0.25             # Very responsive for gaming GPU activity
 
 [timing]
 duration_threshold = "15s"   # Shorter threshold — gamers prefer instant response
diff --git a/docs/configuration.md b/docs/configuration.md
index 6cd105d..698d2d1 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -42,8 +42,9 @@ total_threshold = 25.0
 ema_alpha = 0.7
 
 [metrics.gpu]
-threshold = 15.0      # GPU usage threshold (percentage)
-ema_alpha = 0.7       # EMA smoothing factor
+per_gpu_threshold = 25.0    # Per-GPU utilization percentage that triggers inhibition
+total_threshold = 40.0      # System-wide average GPU utilization threshold (both use OR logic)
+ema_alpha = 0.7             # EMA smoothing factor
 
 [metrics.network]
 threshold = 10.0      # Network I/O threshold (Mbps)
@@ -61,6 +62,13 @@ exclude_device_prefixes = ["loop", "fd", "sr", "cdrom"]
 duration_threshold = "5s"    # Min time above threshold before inhibiting sleep
 cooldown_duration = "10s"     # Time below threshold before releasing inhibition
 
+[prediction]
+update_interval = "30s"              # Interval between averaged snapshots; must be >= root update_interval
+history_length = "30d"               # Keep this much historical data; older entries pruned periodically
+max_extension_time = "1h"            # Maximum additional time for predictive cooldown extension
+ml_hidden_dim = 16                   # Hidden neurons in NG-RC reservoir computing model (O(n^2) memory)
+ml_delay_buffer_size = 8             # Delay buffer size for temporal feature creation from past states
+
 [inhibitor]
 what = "shutdown:idle"     # Lock type: idle, sleep, suspend, shutdown (colon-separated)
 mode = "block"             # Mode: block, delay, block-weak
@@ -87,11 +95,12 @@ CPU usage is measured per-core (frequency-weighted from sysfs cpufreq data) and
 
 ### `[metrics.gpu]` — GPU Usage Threshold
 
-Per-device GPU collection (NVIDIA via NVML, AMD/Intel via sysfs). Each detected GPU is compared independently against this threshold.
+Per-device GPU collection (NVIDIA via NVML, AMD/Intel via sysfs). Both thresholds use OR logic — exceeding either one triggers inhibition.
 
 | Key | Type | Default (0–100) | Description |
 |-----|------|-----------------|-------------|
-| `threshold` | f64 | `15.0` | GPU usage percentage above which to inhibit sleep |
+| `per_gpu_threshold` | f64 | `25.0` | Per-GPU utilization percentage above which to inhibit sleep |
+| `total_threshold` | f64 | `40.0` | System-wide average GPU utilization threshold (both use OR logic) |
 | `ema_alpha` | f64 | `0.7` | EMA smoothing factor for per-GPU readings |
 
 ### `[metrics.network]` — Network Throughput Threshold
@@ -126,6 +135,22 @@ Disk activity is calculated as total bytes transferred across monitored devices
 
 **Note**: There is no `idle_duration` field — the cooldown mechanism replaces it. A metric exceeding threshold for at least `duration_threshold` triggers inhibition; all metrics below their respective thresholds for at least `cooldown_duration` releases inhibition. See [d-bus-inhibition.md](d-bus-inhibition.md) for details on how inhibition works.
 
+## Prediction Configuration
+
+### `[prediction]` Section — Adaptive Cooldown Extension
+
+The prediction module uses an unsupervised NG-RC (next-generation reservoir computing) model to learn historical system metric patterns over days and weeks, then dynamically extends the post-idle cooldown duration when learned patterns indicate likely continued active use. This reduces premature sleep during typical work hours while still allowing sleep during known idle periods (e.g., late nights). See [prediction-model.md](prediction-model.md) for a detailed explanation of how the model works.
+
+| Key | Type | Default | Description |
+|-----|------|---------|-------------|
+| `update_interval` | duration | `"30s"` | Interval between averaged snapshots written to the history log. Must be greater than or equal to the root `update_interval`. Metrics from each tick are accumulated and averaged, then a single snapshot is flushed every N ticks, where N = `[prediction].update_interval` / root `update_interval`. Set to `"0s"` to disable prediction entirely. |
+| `history_length` | duration | `"30d"` | Amount of historical data to retain. Older entries and files are pruned automatically. Uses humantime format: `"7d"`, `"30d"`, `"90d"` |
+| `max_extension_time` | duration | `"1h"` | Maximum additional time added to the cooldown duration by prediction. The model will never extend beyond this cap, even if historical patterns suggest it. Uses humantime format: `"5m"`, `"30m"`, `"1h"` |
+| `ml_hidden_dim` | usize | `16` | Number of hidden neurons in the NG-RC reservoir computing model. Controls model capacity; larger values capture more complex temporal patterns but use O(n^2) memory (e.g., 16 → ~4KB, 32 → ~16KB). Adjust based on pattern complexity and available memory. |
+| `ml_delay_buffer_size` | usize | `8` | Size of the delay buffer used by the NG-RC model to create polynomial features from past states. Controls how far back in time the model looks for temporal patterns. Should be <= `history_length` / `update_interval` (e.g., with a 30-day history and 30s intervals, the maximum is 86,400). |
+
+**Data storage**: Historical data is stored as binary files (`history.log.YYYYMMDD`) using bincode v2 serialization under `$XDG_STATE_HOME/rouser/` (defaults to `~/.local/state/rouser/`, or `/var/lib/rouser/` when running as root). Files are date-partitioned for efficient pruning.
+
 ## Inhibition Configuration
 
 ### `[inhibitor]` Section
@@ -183,7 +208,7 @@ There are no `ROUSER_*` environment variable overrides for configuration values
 
 ## Best Practices
 
-1. **Start with conservative thresholds**: Begin with higher per-core CPU (80%) and GPU (15%) thresholds, then lower them based on observed baselines from dry-run logs
+1. **Start with conservative thresholds**: Begin with the default per-core CPU (80%), per-GPU (25%), and total GPU average (40%) thresholds, then lower them based on observed baselines from dry-run logs
 2. **Use EMA smoothing**: Default alpha values provide a good balance between responsiveness and noise filtering for your workload
 3. **Test before production**: Always use `--dry-run` mode to verify thresholds before deploying in daemon mode
 4. **Review logs regularly**: Use debug logging (`RUST_LOG=debug`) to understand your system's baseline activity before finalizing thresholds
@@ -192,4 +217,5 @@ There are no `ROUSER_*` environment variable overrides for configuration values
 
 - [Command Line Reference](command-line.md) — All CLI arguments and usage examples
 - [Metrics Overview](metrics-overview.md) — How CPU, GPU, network, and disk metrics are collected
+- [Prediction Model](prediction-model.md) — How adaptive cooldown extension works from historical patterns
 - [D-Bus Inhibition](d-bus-inhibition.md) — How sleep inhibition works under the hood
diff --git a/docs/developer-guide.md b/docs/developer-guide.md
index dad5b5c..c45a259 100644
--- a/docs/developer-guide.md
+++ b/docs/developer-guide.md
@@ -351,10 +351,10 @@ impl ThresholdManager {
     
     pub fn check(&self, config: &Config) -> bool {
         // Check each metric against its threshold using smoothed values
-        let cpu_ok = self.check_metric(&self.cpu_state, metrics.cpu.usage(), config.metrics.cpu.threshold);
-        let gpu_ok = self.gpu_states.iter().all(|state| {
-            self.check_metric(state, /* GPU value */, config.metrics.gpu.threshold)
-        });
+        let cpu_ok = self.check_metric(&self.cpu_state, metrics.cpu.usage(), config.metrics.cpu.per_core_threshold);
+        let gpu_agg = GpuAggregate::from_gpus(&metrics.gpu_usage);
+        let gpu_ok = gpu_agg.per_gpu_max > config.metrics.gpu.per_gpu_threshold
+            || gpu_agg.total_average > config.metrics.gpu.total_threshold;
         // ... similar for network and disk
         cpu_ok || gpu_ok || /* others */ false
     }
@@ -445,18 +445,24 @@ Add inhibitor selection to config:
 ```rust
 #[derive(Debug, Deserialize)]
 pub struct InhibitionConfig {
-    #[serde(default = "default_inhibitor_type")]
+    #[serde(default)]
     pub inhibitor_type: String,  // "login1", "custom", etc.
-    
-    #[serde(default = "default_what")]
+
+    #[serde(default)]
     pub what: String,
-    
-    #[serde(default = "default_mode")]
+
+    #[serde(default)]
     pub mode: String,
 }
 
-fn default_inhibitor_type() -> String {
-    "login1".to_string()
+impl Default for InhibitionConfig {
+    fn default() -> Self {
+        Self {
+            inhibitor_type: "login1".to_string(),
+            what: "shutdown:idle".to_string(),
+            mode: "block".to_string(),
+        }
+    }
 }
 ```
 
diff --git a/docs/gpu-usage-measurement.md b/docs/gpu-usage-measurement.md
index 77df62f..7b8e20e 100644
--- a/docs/gpu-usage-measurement.md
+++ b/docs/gpu-usage-measurement.md
@@ -59,11 +59,12 @@ Measured via PMU (Performance Monitoring Unit) counters from the GuC (Graphics M
 
 ### Why This Matters for Sleep Inhibition
 
-rouser applies a single configurable GPU utilization threshold across all GPUs regardless of vendor:
+rouser applies **two configurable thresholds** with OR logic across all GPUs regardless of vendor: either a single GPU exceeding its per-GPU threshold, or the system-wide average total utilization exceeding its own threshold. Both trigger sleep inhibition independently.
 
 ```toml
 [metrics.gpu]
-threshold = 20   # Inhibit sleep if any GPU exceeds this percentage
+per_gpu_threshold = 25.0   # Per-GPU max usage that triggers inhibition
+total_threshold = 40.0     # System-wide GPU average that triggers inhibition (both use OR logic)
 ema_alpha = 0.3
 ```
 
diff --git a/docs/index.md b/docs/index.md
index e74ca20..40425bb 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -18,6 +18,7 @@ A Linux daemon that monitors system metrics and inhibits sleep when activity thr
 - [Command Line](command-line.md) — CLI arguments and usage examples
 - [Systemd User Service](systemd-user-service.md) — Running rouser as a service
 - [Metrics Overview](metrics-overview.md) — How CPU, GPU, network, disk metrics are collected
+- [Prediction Model](prediction-model.md) — How adaptive cooldown extension works from historical patterns
 - [GPU Usage Measurement](gpu-usage-measurement.md) — What NVML, amdgpu, and i915 actually measure
 
 ## Links
diff --git a/docs/installation.md b/docs/installation.md
index 8df9861..dc84034 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -110,7 +110,8 @@ per_core_threshold = 80.0       # Per-core CPU max usage % above which to inhibi
 total_threshold = 25.0          # Total averaged CPU usage % (default: 25.0)
 
 [metrics.gpu]
-threshold = 15.0                # GPU usage % per device (default: 15.0)
+per_gpu_threshold = 25.0        # Per-GPU max usage that triggers inhibition (default: 25.0)
+total_threshold = 40.0          # System-wide average threshold (both use OR logic)
 ema_alpha = 0.7                 # EMA smoothing factor for GPU readings
 
 [metrics.network]
@@ -311,8 +312,9 @@ total_threshold = 50.0
 ema_alpha = 0.3
 
 [metrics.gpu]
-threshold = 85.0
-ema_alpha = 0.3
+per_gpu_threshold = 85.0    # Per-GPU max usage that triggers inhibition
+total_threshold = 70.0      # System-wide average threshold (both use OR logic)
+ema_alpha = 0.3             # EMA smoothing factor for GPU readings
 
 [metrics.network]
 threshold = 50.0
@@ -345,11 +347,12 @@ total_threshold = 70.0
 ema_alpha = 0.3
 
 [metrics.gpu]
-threshold = 95.0       # Gaming or GPU workloads
-ema_alpha = 0.3
+per_gpu_threshold = 95.0    # Per-GPU max usage that triggers inhibition (gaming/GPU workloads)
+total_threshold = 80.0      # System-wide average threshold (both use OR logic)
+ema_alpha = 0.3             # EMA smoothing factor for GPU readings
 
 [metrics.network]
-threshold = 200.0      # Large downloads/uploads
+threshold = 200.0           # Large downloads/uploads
 ema_alpha = 0.2
 exclude_interfaces = ["lo"]
 
diff --git a/docs/metrics-overview.md b/docs/metrics-overview.md
index 41f261c..63a73b3 100644
--- a/docs/metrics-overview.md
+++ b/docs/metrics-overview.md
@@ -163,18 +163,21 @@ rocm-smi --showgpuutilization
 
 NVML, amdgpu, and i915 all report a 0–100% value but measure different things under the hood. NVIDIA's SM kernel utilization, AMD's aggregate IP core activity via SMU firmware, and Intel's GT engine ticks are not directly comparable as percentages. See [GPU Usage Measurement](gpu-usage-measurement.md) for a detailed breakdown of what each driver reports and why this doesn't affect rouser's sleep inhibition behavior in practice.
 
-### Aggregation Strategy — Per-Device Reporting Over Averaging
+### Aggregation Strategy — Dual Thresholds with Per-Device Reporting
 
-rouser reports each physical GPU **individually** rather than aggregating across devices. Each detected GPU is compared independently against the configured threshold:
+rouser collects each physical GPU **individually** (independent EMA smoothing per device) but uses two aggregate metrics for inhibition decisions: the maximum per-GPU utilization and the system-wide average across all GPUs. Either metric exceeding its configured threshold triggers sleep inhibition (OR logic).
 
 ```
-card0(nvidia): 95%   ← above 90% threshold → inhibits sleep
-card1(amdgpu): 78%  ← below 90% threshold → does not inhibit alone
+card0(nvidia): 95%   ← above per_gpu_threshold → inhibits sleep (per-device max exceeded)
+card1(amdgpu): 78%    total_average = (95+78)/2 = 86.5%
+                      both thresholds evaluated independently — either triggers inhibition
 ```
 
-A single GPU exceeding its threshold triggers inhibition regardless of other GPUs' states. This provides accurate per-GPU logging and prevents one low-usage card from masking a high-usage card's activity.
+A single GPU exceeding `per_gpu_threshold` OR the system-wide average (`total_average`) exceeding `total_threshold` inhibits sleep. This prevents one low-usage card from masking a high-usage card while also catching scenarios where all GPUs are moderately loaded simultaneously (high aggregate even if no single card exceeds its individual threshold).
 
-**EMA Smoothing**: Each device has independent EMA smoothing applied to its readings before comparison against the threshold. The `ema_alpha` value in `[metrics.gpu]` controls smoothing strength uniformly across all GPUs.
+**History Format**: Each flushed history entry stores a fixed-size `GpuSnapshot { per_gpu_max, total_average }` rather than a variable-length vector of per-GPU values. This ensures consistent serialization regardless of GPU count — adding or removing GPUs does not break historical data comparison.
+
+**EMA Smoothing**: Each device has independent EMA smoothing applied to its readings before both the debug display and aggregate computation. The `ema_alpha` value in `[metrics.gpu]` controls smoothing strength uniformly across all GPUs.
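+
+As a rough illustration of the OR logic above, the sketch below reduces already-smoothed per-GPU readings to the two aggregates and compares each against its own threshold. Only the `GpuSnapshot` field names and the two threshold names come from this page; `summarize` and `should_inhibit` are hypothetical helpers, and reading "exceeding" as a strict `>` is an assumption.
+
+```rust
+/// Fixed-size summary of all GPUs for one tick (field names as described above).
+#[derive(Debug, Clone, Copy, PartialEq)]
+pub struct GpuSnapshot {
+    pub per_gpu_max: f64,
+    pub total_average: f64,
+}
+
+/// Reduce already-smoothed per-GPU readings (0-100%) to the two aggregates.
+/// Returns `None` when no GPU was detected. (Hypothetical helper.)
+pub fn summarize(smoothed: &[f64]) -> Option<GpuSnapshot> {
+    if smoothed.is_empty() {
+        return None;
+    }
+    let per_gpu_max = smoothed.iter().copied().fold(f64::MIN, f64::max);
+    let total_average = smoothed.iter().sum::<f64>() / smoothed.len() as f64;
+    Some(GpuSnapshot { per_gpu_max, total_average })
+}
+
+/// OR logic: either aggregate above its own threshold inhibits sleep.
+/// Strict `>` is an assumption about how "exceeding" is implemented.
+pub fn should_inhibit(s: &GpuSnapshot, per_gpu_threshold: f64, total_threshold: f64) -> bool {
+    s.per_gpu_max > per_gpu_threshold || s.total_average > total_threshold
+}
+```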
 
 ## Network I/O
 
diff --git a/docs/prediction-model.md b/docs/prediction-model.md
new file mode 100644
index 0000000..805b7c0
--- /dev/null
+++ b/docs/prediction-model.md
@@ -0,0 +1,182 @@
+# Prediction Model
+
+The prediction module provides adaptive cooldown extension based on historical system usage patterns. When metrics drop below inhibition thresholds, rouser consults its learned model to decide whether to extend the idle wait period before releasing sleep inhibition, which reduces premature release of the inhibitor (and the unwanted sleep that follows) during typical active-use hours.
+
+## Overview
+
+Without prediction, rouser releases sleep inhibition after a fixed `cooldown_duration` (default 10s) of all metrics being below threshold. With prediction enabled, if historical patterns indicate that similar usage levels are typically followed by renewed activity, rouser extends this wait period by up to `max_extension_time`.
+
+The model uses an **unsupervised streaming neural network**, specifically a Next Generation Reservoir Computing (NG-RC) architecture from the [irithyll](https://crates.io/crates/irithyll) crate. Unlike the previous histogram-based approach that bucketed data by time-of-day, this model treats each metric dimension as an independent feature and learns normal usage patterns without requiring labeled training data.
+
+### Architecture: Feature Vectors → Unsupervised Learning
+
+Each history entry (flushed every `[prediction].update_interval`, default 30s) is converted into a fixed-size **feature vector** of six normalized values:
+
+| Feature | Source | Description |
+|---------|--------|-------------|
+| CPU per-core max | `/proc/stat` | Highest individual core usage across all cores (0–100%) |
+| CPU total average | `/proc/stat` | Average utilization across all cores weighted by frequency (0–100%) |
+| GPU per-GPU max | NVML / sysfs | Maximum GPU utilization across all detected GPUs (0–100%) |
+| GPU total average | NVML / sysfs | Mean utilization averaged across all GPUs (0–100%) |
+| Network I/O | `/proc/net/dev` | Total throughput in Mbps across all monitored interfaces |
+| Disk activity | `/proc/diskstats` | Combined read + write throughput in MB/s |
+
+The model is **unsupervised** — it learns what "normal" system usage looks like by continuously updating its weights at each prediction `update_interval`. When metrics drop below inhibition thresholds, the model evaluates how anomalous the current state is compared to learned patterns. Higher anomaly scores produce longer cooldown extensions.
+
+### Data Collection and Averaging
+
+rouser collects raw metrics every root `update_interval` seconds (default 1s). It accumulates these per-tick samples in memory and writes an **averaged snapshot** at a longer interval defined by `[prediction].update_interval` (default 30s).
+
+For example, with root `update_interval = "1s"` and prediction `update_interval = "30s"`, rouser collects 30 raw samples per flush window, computes their arithmetic mean for each metric dimension, then writes one averaged data point to the history log. This produces smoother historical data that represents sustained usage patterns rather than momentary spikes.
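+
+The accumulate-then-average step can be sketched as below. This is a minimal illustration, not rouser's code: the `Accumulator` type and its six-element sample array (one slot per feature listed in the table above) are assumptions.
+
+```rust
+/// Accumulates per-tick samples and emits their arithmetic mean on flush.
+/// (Hypothetical type; six slots, one per feature dimension.)
+#[derive(Debug, Default)]
+pub struct Accumulator {
+    sums: [f64; 6],
+    ticks: u32,
+}
+
+impl Accumulator {
+    /// Add one raw sample collected at the root `update_interval`.
+    pub fn push(&mut self, sample: [f64; 6]) {
+        for (sum, value) in self.sums.iter_mut().zip(sample) {
+            *sum += value;
+        }
+        self.ticks += 1;
+    }
+
+    /// Return the per-dimension mean and reset the accumulator.
+    /// Returns `None` if no samples were pushed since the last flush.
+    pub fn flush(&mut self) -> Option<[f64; 6]> {
+        if self.ticks == 0 {
+            return None;
+        }
+        let n = f64::from(self.ticks);
+        let mean = self.sums.map(|s| s / n);
+        *self = Self::default();
+        Some(mean)
+    }
+}
+```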
+
+### Rate-of-Change (Delta) Features
+
+Deltas are computed on-the-fly at prediction time by comparing consecutive flushed entries: `delta = (current - previous) / elapsed_time`. This avoids storing redundant rate-of-change data while preserving the ability to detect rising or falling trends across the historical record.
+
+The following deltas are computed per-entry-pair:
+- **CPU**: per-core max and total average change in %/s
+- **GPU**: per-GPU max and total average change in %/s
+- **Network**: throughput change in Mbps/s
+- **Disk**: throughput change in MB/s/s
+
+These deltas feed into the trend signal, which provides an additional dimension beyond raw metric values — helping distinguish between a temporary dip during active work versus genuine inactivity.
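+
+The delta formula above can be sketched as follows. The `Entry` layout (a nanosecond timestamp plus the six feature values) and the `deltas` helper are assumptions made for illustration, not rouser's actual types.
+
+```rust
+/// Simplified stand-in for a flushed history entry (hypothetical layout).
+#[derive(Debug, Clone, Copy)]
+pub struct Entry {
+    pub timestamp_ns: u64,
+    pub values: [f64; 6],
+}
+
+/// Per-second rate of change between two consecutive entries:
+/// `(current - previous) / elapsed_time`.
+/// Returns `None` when the timestamps do not advance, which avoids a division by zero.
+pub fn deltas(previous: &Entry, current: &Entry) -> Option<[f64; 6]> {
+    let elapsed_ns = current.timestamp_ns.checked_sub(previous.timestamp_ns)?;
+    if elapsed_ns == 0 {
+        return None;
+    }
+    let elapsed_s = elapsed_ns as f64 / 1e9;
+    let mut out = [0.0; 6];
+    let pairs = current.values.iter().zip(previous.values.iter());
+    for (slot, (cur, prev)) in out.iter_mut().zip(pairs) {
+        *slot = (cur - prev) / elapsed_s;
+    }
+    Some(out)
+}
+```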
+
+### Gap Handling via Zero-Fill Interpolation
+
+When the computer is shut down or sleeping, no data points are written to the history log. Without correction, the history would contain only active-period data: the model would never observe the idle time that the gaps represent, so it would overfit to active use and incorrectly predict future activity.
+
+To address this, rouser detects gaps between consecutive entries at prediction time — any gap exceeding `[prediction].update_interval` is considered a large gap (e.g., >30s with default config). Rouser inserts **synthetic zero-value entries** at `update_interval` intervals within such gaps. These synthetic records have all metric values set to 0 and `inhibited: false`, representing idle periods where no activity was recorded because the system was powered off or sleeping. Synthetic entries exist only in memory during prediction; they are never written to history log files.
+
+This approach ensures the model sees a complete picture of both active and inactive periods, producing more accurate cooldown extensions that account for normal downtime patterns. Gap-filled entries ARE included in feature vector construction — their all-zero values represent legitimate idle states that contribute to learning "normal" baselines.
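+
+One possible shape for the in-memory gap fill is sketched below. The `Entry` struct (second-resolution timestamp, six values, `inhibited` flag) and the `fill_gaps` helper are hypothetical; the sketch only mirrors the behavior described above and assumes the input is sorted by timestamp.
+
+```rust
+/// Simplified stand-in for a history entry (hypothetical layout).
+#[derive(Debug, Clone, PartialEq)]
+pub struct Entry {
+    pub timestamp_s: u64,
+    pub values: [f64; 6],
+    pub inhibited: bool,
+}
+
+/// Return a copy of `entries` with synthetic all-zero, uninhibited entries
+/// inserted every `interval_s` seconds inside any gap longer than `interval_s`.
+/// Nothing is written to disk; the result lives only in memory.
+pub fn fill_gaps(entries: &[Entry], interval_s: u64) -> Vec<Entry> {
+    if interval_s == 0 {
+        return entries.to_vec();
+    }
+    let mut out: Vec<Entry> = Vec::with_capacity(entries.len());
+    let mut last_real: Option<u64> = None;
+    for entry in entries {
+        if let Some(prev_ts) = last_real {
+            // Step forward from the previous real entry until the next real one.
+            let mut t = prev_ts + interval_s;
+            while t < entry.timestamp_s {
+                out.push(Entry { timestamp_s: t, values: [0.0; 6], inhibited: false });
+                t += interval_s;
+            }
+        }
+        out.push(entry.clone());
+        last_real = Some(entry.timestamp_s);
+    }
+    out
+}
+```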
+
+## Storage Layout
+
+History files follow the naming pattern `history.log.YYYYMMDD` under:
+
+- **User mode**: `$XDG_STATE_HOME/rouser/` (defaults to `~/.local/state/rouser/`)
+- **Root mode**: `/var/lib/rouser/`
+
+Each file contains only data points from that specific calendar day. Files are appended sequentially — new entries are written as binary blobs with a 4-byte length prefix followed by the bincode-encoded serde struct. This allows efficient streaming reads without loading entire files into memory for size estimation.
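+
+A length-prefixed stream like this can be read one record at a time. The sketch below uses only the standard library and stops before decoding; it assumes a little-endian `u32` prefix (the byte order is not stated here), and `read_frame` and `MAX_FRAME_LEN` are hypothetical names.
+
+```rust
+use std::io::{self, Read};
+
+/// Upper bound on a single record, to avoid huge allocations on corrupt input.
+/// (Hypothetical limit chosen for this sketch.)
+const MAX_FRAME_LEN: usize = 1 << 20;
+
+/// Read one `[4-byte length][payload]` record.
+/// Returns `Ok(None)` when the stream ends while reading the length prefix
+/// (this also covers a truncated prefix), and an error if the payload is cut short.
+/// Assumption: the prefix is a little-endian `u32`.
+pub fn read_frame<R: Read>(reader: &mut R) -> io::Result<Option<Vec<u8>>> {
+    let mut len_buf = [0u8; 4];
+    match reader.read_exact(&mut len_buf) {
+        Ok(()) => {}
+        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(None),
+        Err(e) => return Err(e),
+    }
+    let len = u32::from_le_bytes(len_buf) as usize;
+    if len > MAX_FRAME_LEN {
+        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame length exceeds limit"));
+    }
+    let mut payload = vec![0u8; len];
+    reader.read_exact(&mut payload)?;
+    // The payload would then be handed to the decoder (bincode, per the text above).
+    Ok(Some(payload))
+}
+```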
+
+## How Prediction Works
+
+### Step 1: Load and Normalize History Entries
+
+On initialization, rouser scans all existing history files and loads entries. At prediction time (when metrics drop below thresholds), it:
+
+1. Selects recent entries within a timestamp window — entries where `timestamp >= current_time - max_extension_time` (e.g., the last hour with default config).
+2. Fills gaps with synthetic zero-value entries as described under Gap Handling above. These entries are kept for feature vector construction, since an all-zero, uninhibited state is a legitimate idle baseline, and they exist only in memory.
+3. Computes on-the-fly deltas between consecutive real entries (`(current - previous) / elapsed_time`).
+
+### Step 2: Convert Entries to Feature Vectors and Train Model
+
+Each selected entry is converted into a normalized feature vector — values are scaled using running statistics (mean, standard deviation) computed from the full history. The NG-RC reservoir computing model receives one sample at a time via its `StreamingLearner` trait, updating weights incrementally:
+
+```rust
+// At each prediction update_interval:
+for entry in recent_entries {
+    let features = feature_vector_from_entry(entry); // 6 normalized values
+    ml_predictor.train(&features, &target_value)?;   // Online weight update
+}
+```
+
+The NG-RC architecture uses a fixed random reservoir of neurons with delay embeddings to capture temporal patterns. Its key properties:
+- **O(n²) memory** where n = hidden_dim (default 16 → ~4KB for weights + reservoir)
+- **One sample at a time** training — no batches, no retraining from scratch
+- **Temporal awareness** through delay buffers that create polynomial features from past states
+- **Concept drift adaptation** via automatic weight adjustment when data distribution shifts
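+
+Step 2 mentions scaling with a running mean and standard deviation. One common way to maintain those statistics one sample at a time is Welford's algorithm, sketched below for a single feature. `RunningStats` and its methods are hypothetical names, and using the population standard deviation is an assumption; rouser's actual normalization may differ.
+
+```rust
+/// Running mean and variance for one feature (Welford's online algorithm).
+#[derive(Debug, Default, Clone)]
+pub struct RunningStats {
+    count: u64,
+    mean: f64,
+    m2: f64, // sum of squared deviations from the current mean
+}
+
+impl RunningStats {
+    /// Fold one new observation into the statistics.
+    pub fn update(&mut self, x: f64) {
+        self.count += 1;
+        let delta = x - self.mean;
+        self.mean += delta / self.count as f64;
+        self.m2 += delta * (x - self.mean);
+    }
+
+    pub fn mean(&self) -> f64 {
+        self.mean
+    }
+
+    /// Population standard deviation; 0.0 until at least two samples are seen.
+    pub fn std_dev(&self) -> f64 {
+        if self.count < 2 {
+            0.0
+        } else {
+            (self.m2 / self.count as f64).sqrt()
+        }
+    }
+
+    /// Z-score of `x`; returns 0.0 when the spread is zero to avoid dividing by zero.
+    pub fn normalize(&self, x: f64) -> f64 {
+        let sd = self.std_dev();
+        if sd > 0.0 {
+            (x - self.mean) / sd
+        } else {
+            0.0
+        }
+    }
+}
+```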
+
+### Step 3: Anomaly Scoring and Extension Mapping
+
+The model evaluates the current metrics as a feature vector. Since this is unsupervised, scoring is based on reconstruction error or prediction confidence: how well can the model predict the current state given what it has learned from historical patterns?
+
+If the anomaly score exceeds a configurable threshold (default 0.3), rouser extends the cooldown:
+
+```
+if anomaly_score > min_threshold {
+    additional_time = interpolate(anomaly_score, max_extension_time)
+} else {
+    additional_time = 0  // Use standard cooldown_duration
+}
+```
+
+The score-to-extension mapping uses linear interpolation between `min_threshold` (default 0.3 → zero extension) and maximum observed anomaly levels (mapped to full `max_extension_time`). This produces smooth transitions rather than binary on/off behavior.
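+
+A minimal sketch of that linear mapping, assuming the score is a plain `f64` and that the upper anchor (`max_score`) is supplied by the caller. `extension_for` and its parameter names are hypothetical; only `min_threshold` and `max_extension_time` correspond to names used on this page.
+
+```rust
+use std::time::Duration;
+
+/// Map an anomaly score to an additional cooldown using linear interpolation.
+/// Scores at or below `min_threshold` give zero; scores at or above `max_score`
+/// give the full `max_extension`. Non-finite inputs yield no extension.
+pub fn extension_for(
+    score: f64,
+    min_threshold: f64,
+    max_score: f64,
+    max_extension: Duration,
+) -> Duration {
+    let span = max_score - min_threshold;
+    let fraction = (score - min_threshold) / span;
+    if !fraction.is_finite() || span <= 0.0 || fraction <= 0.0 {
+        return Duration::ZERO;
+    }
+    // Clamp so that scores above `max_score` never exceed the cap.
+    max_extension.mul_f64(fraction.min(1.0))
+}
+```
+
+With `min_threshold = 0.25`, `max_score = 1.0`, and a one-hour cap, a score of 0.625 sits halfway between the anchors and maps to roughly 30 minutes.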
+
+### Step 4: Confidence Scaling
+
+The model reports a confidence value based on total data points collected:
+
+| Data Points | Confidence | Interpretation |
+|-------------|-----------|----------------|
+| <50 | 0.1 | Insufficient data — extension unlikely to be meaningful |
+| <500 | 0.3 | Some pattern recognition, but noisy |
+| <5,000 | 0.6 | Good statistical basis for predictions |
+| >=5,000 | 0.9 | Strong confidence in learned patterns |
+
+Confidence is reported via logging only — it does not affect the extension calculation itself. The minimum threshold of 10 data points before any prediction is made provides a basic safety gate against completely uninformed extensions.
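+
+The tiers in the table translate directly into a lookup. The function names and the `MIN_DATA_POINTS` constant below are illustrative, and reading the gate as "at least 10 data points" is an assumption.
+
+```rust
+/// Minimum number of data points before any prediction is attempted.
+/// (Assumption: the gate is inclusive, i.e. 10 points are enough.)
+pub const MIN_DATA_POINTS: usize = 10;
+
+/// Confidence tier for a given number of collected data points,
+/// following the table above. Used for logging only.
+pub fn confidence(data_points: usize) -> f64 {
+    match data_points {
+        0..=49 => 0.1,
+        50..=499 => 0.3,
+        500..=4_999 => 0.6,
+        _ => 0.9,
+    }
+}
+
+/// Safety gate: skip prediction entirely when there is too little history.
+pub fn may_predict(data_points: usize) -> bool {
+    data_points >= MIN_DATA_POINTS
+}
+```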
+
+## Prediction Timing: update_interval, Not Every Tick
+
+The cooldown extension prediction runs at the same cadence as history flushes — every `[prediction].update_interval` seconds (default 30s). This avoids redundant computation since the underlying data only changes when new averaged snapshots are written to disk. The model trains on newly available entries and produces a fresh prediction each time, rather than re-evaluating at every root `update_interval` tick.
+
+## Pruning
+
+History files older than `history_length` are pruned automatically. Pruning is invoked on each tick cycle but does real work at most once per calendar day (see step 5). The pruning function:
+
+1. Computes a cutoff date by subtracting `history_length` duration from today
+2. Scans the history directory for files matching `history.log.YYYYMMDD` pattern
+3. Validates that filenames contain exactly 8 ASCII digits after the prefix (preventing path traversal via malicious filenames)
+4. Deletes only confirmed regular files (symlinks and directories skipped)
+5. Deduplicates by date — pruning runs at most once per calendar day
+
+Pruning activity is logged: debug-level for each file removed, info-level summary when files are actually deleted. If no files need pruning (either because retention period hasn't passed or already pruned today), the operation returns silently.
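+
+Step 3 of the list above (filename validation) might look like the following. Both helpers are hypothetical, and the sketch checks only the shape of the name, not whether the digits form a real calendar date. Comparing the suffix as a string works because zero-padded `YYYYMMDD` sorts in date order.
+
+```rust
+/// Return the `YYYYMMDD` suffix if `file_name` is exactly
+/// `history.log.` followed by 8 ASCII digits; otherwise `None`.
+pub fn history_date_suffix(file_name: &str) -> Option<&str> {
+    let suffix = file_name.strip_prefix("history.log.")?;
+    if suffix.len() == 8 && suffix.bytes().all(|b| b.is_ascii_digit()) {
+        Some(suffix)
+    } else {
+        None
+    }
+}
+
+/// True if the file's date is strictly older than the cutoff (`YYYYMMDD`).
+/// Names that fail validation are never considered for deletion.
+pub fn is_prunable(file_name: &str, cutoff_yyyymmdd: &str) -> bool {
+    matches!(history_date_suffix(file_name), Some(date) if date < cutoff_yyyymmdd)
+}
+```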
+
+## Configuration Tuning
+
+### When to Increase `max_extension_time`
+
+If rouser frequently releases inhibition and then re-inhibits within minutes during active work sessions, increase the extension cap:
+
+```toml
+[prediction]
+max_extension_time = "2h"   # Extend up to 2 hours beyond standard cooldown
+```
+
+### When to Decrease `max_extension_time`
+
+If rouser keeps the system awake longer than necessary (e.g., on a server that only needs brief inhibition during maintenance windows), reduce the cap:
+
+```toml
+[prediction]
+max_extension_time = "15m"  # Short maximum extension for bursty workloads
+```
+
+### Disabling Prediction
+
+Set `update_interval` to zero to disable all prediction while keeping metrics collection active:
+
+```toml
+[prediction]
+update_interval = "0s"   # Disables prediction entirely
+```
+
+## Debugging
+
+Enable debug logging to see the full prediction lifecycle:
+
+```bash
+RUST_LOG=debug rouser --dry-run
+```
+
+Key log messages:
+
+- **Startup**: `Loaded N history entries from ...` followed by `Prediction model initialized with M historical data points` — shows raw entries loaded; gap-filling and trend computation happen at prediction time, not during startup
+- **Per-interval flush**: `Flushed averaged snapshot #N (CPU max=X.X%, GPU max=Y.Y% avg=Z.Z%, net=X.XXMB/s, disk=X.XXMB/s), time={week_of_year}, accumulated_ticks=N` — logged when accumulated metrics are written as one averaged entry after N ticks; feature vectors are computed from these snapshots
+- **Pruning activity**: Per-file debug lines when files are removed, plus an info-level summary once per day with `Pruned N old history files (retention: ...)`
+- **Prediction query**: `Predicted cooldown: +Xdur (base_score=S.SS, trend_multiplier=T.TT, adjusted_score=S.SS, data_points=N, confidence=C.CC)` — shown when transitioning from inhibited to below-threshold state; includes the base anomaly score and the trend multiplier applied from delta features
+
+## See Also
+
+- [Configuration Reference](configuration.md) — All `[prediction]` config options with defaults
+- [Metrics Overview](metrics-overview.md) — How CPU, GPU, network, disk metrics are collected
+- [D-Bus Inhibition](d-bus-inhibition.md) — How sleep inhibition works under the hood
diff --git a/docs/prediction-todo.md b/docs/prediction-todo.md
new file mode 100644
index 0000000..0dcd1e6
--- /dev/null
+++ b/docs/prediction-todo.md
@@ -0,0 +1,108 @@
+# Prediction Model Refactoring — Task Tracker
+
+This file tracks all tasks needed to replace the histogram-based prediction model with an unsupervised ML approach using NG-RC reservoir computing from the [irithyll](https://crates.io/crates/irithyll) crate.
+
+## Completed Tasks
+
+| # | Status | Description |
+|---|--------|-------------|
+| 1 | ✅ | Added GPU per-GPU-max and total-average deltas to `EntryDeltas` struct |
+| 2 | ✅ | Updated `TrendSignal::compute()` to include GPU trends alongside CPU/network/disk |
+| 3 | ✅ | Updated trend multiplier in `predict_cooldown()` to use GPU delta contribution |
+| 4 | ✅ | Rewrote `docs/prediction-model.md` with ML architecture and all user corrections |
+
+## Remaining Tasks — In Priority Order
+
+### Phase 1: Foundation (Must complete before any model work)
+
+| # | Task | Details | Files | Dependencies |
+|---|------|---------|-------|-------------|
+| 5 | Add `irithyll` crate to Cargo.toml | Version `9.9.x`, feature flags: `serde_support`. Justify as lightweight streaming ML with NG-RC reservoir computing for temporal pattern learning, zero unsafe blocks, O(1) per-sample memory | `Cargo.toml` | — |
+| 6 | Add ML config options to `PredictionConfig` | New fields: `hidden_dim: usize (default 16)`, `delay_buffer_size: usize (default 8)`. Keep existing `update_interval`, `history_length`, `max_extension_time`. Update `Default` impl. Update `config/rouser.toml` with new defaults. Sync all three locations per AGENTS.md rules | `src/config.rs`, `config/rouser.toml`, `docs/configuration.md` | — |
+| 7 | Create `src/prediction/ml_model.rs` | New module for ML predictor wrapper: `MlPredictor` struct wrapping irithyll's NG-RC. Methods: `new(config)`, `train(features, target)`, `predict(features) -> f64`, `save(path)`, `load(path)` | `src/prediction/ml_model.rs` (new), `src/prediction/mod.rs` (add module) | Task 5, 6 |
+
+### Phase 2: Feature Pipeline
+
+| # | Task | Details | Files | Dependencies |
+|---|------|---------|-------|-------------|
+| 8 | Create `FeatureVector` struct | Fixed-size array of 6 normalized f64 values (cpu_max, cpu_avg, gpu_max, gpu_avg, network, disk). Implement conversion from `HistoryEntry`. Include normalization statistics tracking (running mean/std) for consistent scaling across time | `src/prediction/ml_model.rs` | Task 7 |
+| 9 | Replace TimeKey histogram with feature pipeline in `PredictionModel` | Remove `inhibited_timekeys: HashMap<TimeKey, u64>`. Add `ml_predictor: MlPredictor`, `normalization_stats: NormalizationStats { mean[6], std[6] }`. Update `new()` to load history and initialize stats. Update `record()` to build feature vectors | `src/prediction/model.rs` | Task 7, 8 |
+
+### Phase 3: Model Integration
+
+| # | Task | Details | Files | Dependencies |
+|---|------|---------|-------|-------------|
+| 10 | Implement unsupervised training loop in `predict_cooldown()` | When called (at each prediction update_interval), iterate recent entries, build feature vectors, train model incrementally. Use reconstruction error as anomaly score instead of histogram inhibition rate | `src/prediction/model.rs` | Task 9 |
+| 11 | Replace `score_inhibition_rate()` with ML scoring | Remove TimeKey-based lookup and fallback matching. New method: `ml_predictor.score(&features) -> f64` returning normalized anomaly score (0–1). Map to cooldown extension via same interpolation logic as before | `src/prediction/model.rs` | Task 9, 10 |
+| 12 | Remove TimeKey struct and all histogram-related code | Delete `TimeKey::from_timestamp_ns()`, `TimeKey::display()`, `TimeKey::hour_of_day()`, `score_from_count()`, linear day computation. Update debug logging to remove "time=year=X week=Y sec=Z" from output | `src/prediction/model.rs` | Task 10, 11 |
+| 13 | Fix gap-filled entry handling | Remove filter-out of zero-value entries before feature vector construction (user: '"All metrics at 0 with no inhibition" is a valid state'). Keep them in history for baseline learning. Only exclude from training if they represent extended shutdown periods (>24h) | `src/prediction/model.rs` | Task 10 |
+
+### Phase 4: TimeKey Simplification (Optional — only if partial time info useful)
+
+| # | Task | Details | Files | Dependencies |
+|---|------|---------|-------|-------------|
+| 14 | Evaluate if `week_of_year + minutes_into_week` should be added as features | User suggested `(week_of_year, minutes_into_week)` for efficiency. In ML context this could be two additional features (week: 0–52, minutes: 0–10079) to encode temporal position without bucketing. Decide based on model performance experiments | `src/prediction/ml_model.rs` | Task 8, 10 |
+
+### Phase 5: Testing and Verification
+
+| # | Task | Details | Files | Dependencies |
+|---|------|---------|-------|-------------|
+| 15 | Add unit tests for `FeatureVector::from_entry()` | Test normalization with known values. Edge cases: all-zero entries, single-GPU systems, no GPUs (all zero) | `src/prediction/ml_model.rs` | Task 8 |
+| 16 | Update existing prediction model tests | All tests in `model.rs #[cfg(test)] mod tests` need updating to work with ML pipeline instead of histogram. Test training → scoring → extension flow end-to-end | `src/prediction/model.rs` (tests) | Task 10, 11 |
+| 17 | Add integration test for full prediction cycle | Spin up PredictionModel, feed synthetic history entries at known intervals, verify that anomalous patterns produce expected extensions | New file or existing tests | All previous tasks |
+
+### Phase 6: Documentation and CI
+
+| # | Task | Details | Files | Dependencies |
+|---|------|---------|-------|-------------|
+| 18 | Update AGENTS.md with new architecture section | Document ML-based prediction, TimeKey deprecation, irithyll dependency policy. Add "Prediction Model Refactoring" to Lessons Learned if relevant patterns emerge | `AGENTS.md` | All code tasks complete |
+| 19 | Run full CI: build + clippy + test on final branch | Verify all changes pass before merging | — | All previous tasks |
+
+## Architecture Decision Record
+
+### Why NG-RC Reservoir Computing (irithyll)?
+
+**Requirements:**
+- Unsupervised learning (no labeled "inhibited" data for training)
+- Online/iterative weight updates at each 30s prediction interval
+- Small memory footprint (<1MB total model state)
+- No external binary dependencies, pure Rust preferred
+- Temporal awareness (learn patterns over time series)
+
+**Alternatives considered:**
+| Approach | Pros | Cons for this use case |
+|----------|------|------------------------|
+| NG-RC (irithyll) | Streaming O(1) memory per sample, temporal via delay buffers, concept drift adaptation, pure Rust zero unsafe | Requires one new crate dep |
+| Isolation Forest (`extended-isolation-forest`) | Simple anomaly scoring, no training needed | Batch-only, no online updates, must reload on every prediction |
+| Random Cut Forest (`anomstream`) | Streaming anomaly detection, low memory | No temporal awareness, less suited for time-series patterns |
+| Autoencoder (xneuron) | Unsupervised reconstruction error as score | Fixed-point arithmetic only, minimal feature set, no online learning yet |
+| LightRiver | Fast online ML, TinyML optimized | Primarily focused on tree-based anomaly detection (Half-Space Trees), not neural networks for regression |
+
+**Decision**: NG-RC from irithyll provides the best combination of temporal awareness, streaming updates, small memory footprint, and pure-Rust implementation with zero unsafe blocks.
+
+### TimeKey Deprecation Rationale
+
+The current `TimeKey` struct `(year, week_of_year, seconds_into_week)` has fundamental issues:
+1. **Year is monotonically increasing** — it provides no pattern-matching value, only timestamp reconstruction capability
+2. **604800 buckets/week is wasteful** — most buckets have zero or one entries even after years of data
+3. **Exact-match fallback is brittle** — sparse data means frequent misses requiring hour-of-day fallback which loses precision
+
+The ML approach eliminates bucketing entirely: each history entry becomes a feature vector, and the model learns temporal patterns through delay embeddings in the reservoir computing architecture. This removes all histogram-related complexity while improving generalization across time periods.
+
+## Estimated Effort
+
+| Phase | Tasks | Est. Complexity |
+|-------|-------|-----------------|
+| 1: Foundation | #5–7 | Low — setup and config |
+| 2: Feature Pipeline | #8–9 | Medium — new data structures |
+| 3: Model Integration | #10–13 | High — core logic rewrite |
+| 4: TimeKey Simplification | #14 | Low — optional feature addition |
+| 5: Testing | #15–17 | Medium — comprehensive coverage needed |
+| 6: Documentation/CI | #18–19 | Low — final verification |
+
+## Notes for Implementers
+
+- **AGENTS.md constraints**: No background tasks (sequential workers only), prefer stdlib/crates over binary deps, never introduce `unsafe` without explicit instruction, build/clippy/tests must pass before committing
+- **Config defaults must match** `config/rouser.toml` — AGENTS.md source-of-truth rule applies to all three locations simultaneously
+- **Breaking changes**: TimeKey removal and ML pipeline change will break existing history file format. Plan for migration or backward compatibility if needed (e.g., log warning when loading old-format entries)
+- **Performance target**: Prediction should complete in <100ms at each 30s interval with ~86,400 history entries (86,400 s/day ÷ 30s flush = 2,880 entries/day; 30 days × 2,880 entries/day = 86,400 entries max)
diff --git a/docs/systemd-user-service.md b/docs/systemd-user-service.md
index 0900ff7..6ef3798 100644
--- a/docs/systemd-user-service.md
+++ b/docs/systemd-user-service.md
@@ -40,7 +40,8 @@ per_core_threshold = 80.0   # CPU max usage % (0–100) above which to inhibit s
 total_threshold = 25.0      # Total averaged CPU usage % (default: 25.0)
 
 [metrics.gpu]
-threshold = 15.0            # GPU usage % per device (default: 15.0)
+per_gpu_threshold = 25.0    # Per-GPU max usage that triggers inhibition
+total_threshold = 40.0      # System-wide average threshold (both use OR logic)
 ema_alpha = 0.7             # EMA smoothing factor for GPU readings
 
 [metrics.network]
@@ -77,134 +78,8 @@ Wants=network-online.target
 
 [Service]
 Type=simple
-ExecStart=/home/%i/.local/bin/rouser --config /home/%i/.config/rouser/config.toml
-Restart=on-failure
-RestartSec=5s
-
-# Security hardening
-NoNewPrivileges=true
-ProtectSystem=strict
-PrivateTmp=true
-ProtectHome=read-only
-ReadWritePaths=%h/.config/rouser
-
-[Install]
-WantedBy=default.target
-```
-
-### Step 3: Create Log Directory
-
-```bash
-mkdir -p ~/.local/log/rouser
-```
-
-### Step 4: Configure and Start Service
-
-```bash
-# Reload user systemd daemon
-systemctl --user daemon-reload
-
-# Enable service to start on login
-systemctl --user enable rouser
-
-# Start the service
-systemctl --user start rouser
-
-# Check status
-systemctl --user status rouser
-```
-
-Expected output:
-
-```
-● rouser.service - Rouser - User Sleep Inhibition Daemon
-     Loaded: loaded (/home/username/.config/systemd/user/rouser.service; enabled)
-     Active: active (running) since Mon 2026-03-26 10:00:00 UTC; 5min ago
-   Main PID: 1234 (rouser)
-      Tasks: 4 (limit: 4915)
-     Memory: 2.5M
-```
-
-### Step 5: Verify Inhibition
-
-Check active inhibitors:
-
-```bash
-# List active inhibitors
-loginctl list-inhibitors
-
-# Should show rouser as an inhibitor
-```
-
-## Service Management
-
-### Start/Stop/Restart
-
-```bash
-# Start service
-systemctl --user start rouser
-
-# Stop service
-systemctl --user stop rouser
-
-# Restart service
-systemctl --user restart rouser
-
-# Reload configuration (without restart)
-systemctl --user reload rouser
-```
-
-### Check Status
-
-```bash
-# Check if running
-systemctl --user is-active rouser
+ExecStart=%h/.local/bin/rouser
 
-# View detailed status
-systemctl --user status rouser
-
-# View logs
-journalctl --user -u rouser -f
-
-# View last 50 lines
-journalctl --user -u rouser -n 50
-
-# View logs for specific time range
-journalctl --user -u rouser --since "2024-03-26 00:00:00" --until "2024-03-26 23:59:59"
-```
-
-### Enable/Disable
-
-```bash
-# Enable on login
-systemctl --user enable rouser
-
-# Disable (but keep file)
-systemctl --user disable rouser
-
-# Check if enabled
-systemctl --user is-enabled rouser
-```
-
-## Alternative: Systemd System Service
-
-For system-wide installation (requires root):
-
-### Create System Service File
-
-Create `/etc/systemd/system/rouser.service`:
-
-```ini
-[Unit]
-Description=Rouser - System Sleep Inhibition Daemon
-Documentation=https://github.com/owaindjones/rouser
-After=network.target
-
-[Service]
-Type=simple
-User=root
-Group=root
-ExecStart=/usr/local/bin/rouser --config /etc/rouser/config.toml
 Restart=on-failure
 RestartSec=5s
 
@@ -250,8 +125,9 @@ User=%u
 NoNewPrivileges=true
 ProtectSystem=strict
 PrivateTmp=true
-ProtectHome=true
-ReadWritePaths=%h/.config/rouser %h/.local/log/rouser
+ReadOnlyPaths=%h/.local/bin %h/.config/rouser
+ReadWritePaths=%h/.local/state/rouser
+ProtectHome=read-only
 ```
 
 ### System Service (More Restrictive)
@@ -263,7 +139,8 @@ For system-wide installation with enhanced security:
 Type=simple
 User=rouser
 Group=rouser
-ExecStart=/usr/local/bin/rouser --config /etc/rouser/config.toml
+ExecStart=/usr/local/bin/rouser
+
 Restart=on-failure
 RestartSec=5s
 
@@ -318,7 +195,7 @@ Prevent resource exhaustion:
 ```ini
 [Service]
 Type=simple
-ExecStart=/usr/local/bin/rouser --config /etc/rouser/config.toml
+ExecStart=/usr/local/bin/rouser
 
 # Resource limits
 MemoryLimit=256M
diff --git a/mkdocs.yml b/mkdocs.yml
index c2e504f..4201177 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -11,6 +11,7 @@ nav:
   - Command Line: command-line.md
   - Systemd User Service: systemd-user-service.md
   - Metrics Overview: metrics-overview.md
+  - Prediction Model: prediction-model.md
   - Averaging & EMA Smoothing: averaging.md
   - D-Bus Inhibition: d-bus-inhibition.md
   - Security: security.md
diff --git a/scripts/install.sh b/scripts/install.sh
index b153e2d..ef65799 100755
--- a/scripts/install.sh
+++ b/scripts/install.sh
@@ -117,6 +117,15 @@ else
     fi
 fi
 
+# Create history data directory so systemd ReadWritePaths works with ProtectHome=read-only.
+mkdir -p "${XDG_STATE_HOME:-$HOME/.local/state}/rouser"
+
+# Install config if not present (only when building from repo).
+if [ "$FROM_REPO" = true ] && [ ! -f "${XDG_CONFIG_HOME:-$HOME/.config}/rouser/config.toml" ]; then
+    mkdir -p "${XDG_CONFIG_HOME:-$HOME/.config}/rouser"
+    cp "$PWD/config/rouser.toml" "${XDG_CONFIG_HOME:-$HOME/.config}/rouser/config.toml" 2>/dev/null && info "Config installed to ${XDG_CONFIG_HOME:-$HOME/.config}/rouser/config.toml (not present before)" || true
+fi
+
 info "Enabling rouser systemd user service..."
 systemctl --user daemon-reload
 systemctl --user enable --now rouser.service || warn "Failed to enable/start service (is logind lingering enabled?)"
diff --git a/src/config.rs b/src/config.rs
index fbe854e..5252a27 100644
--- a/src/config.rs
+++ b/src/config.rs
@@ -16,75 +16,65 @@ pub struct Config {
     pub metrics: Metrics,
     pub timing: TimingConfig,
     pub inhibitor: InhibitionConfig,
+    pub prediction: PredictionConfig,
 }
 
-fn default_gpu_threshold() -> f64 {
-    15.0
-}
-
-fn default_network_io() -> f64 {
-    10.0
-}
-
-fn default_disk_activity() -> f64 {
-    10.0
-}
-
-#[allow(dead_code)]
+/// CPU metrics configuration with per-core and total thresholds.
 #[derive(Debug, Clone, Serialize, Deserialize)]
-pub struct Thresholds {
-    #[serde(default = "default_cpu_usage_threshold")]
-    pub cpu_usage: f64,
-    #[serde(default = "default_gpu_threshold")]
-    pub gpu_usage: f64,
-    #[serde(default = "default_network_io")]
-    pub network_io: f64,
-    #[serde(default = "default_disk_activity")]
-    pub disk_activity: f64,
-}
-
-fn default_cpu_usage_threshold() -> f64 {
-    80.0
-}
-
-#[allow(dead_code)]
-#[derive(Debug, Clone, Serialize, Deserialize)]
-pub struct MetricsConfig {
-    #[serde(default = "default_ema_alpha_cpu")]
-    pub ema_alpha: f64,
-}
-
-#[derive(Debug, Clone, Serialize, Deserialize, Default)]
+// Container-level default: missing fields fall back to `CpuConfig::default()`, not `f64::default()` (0.0).
+#[serde(default)]
 pub struct CpuConfig {
-    #[serde(default = "default_per_core_threshold")]
+    /// Per-core CPU usage threshold (percentage). Exceeding this triggers inhibition.
     pub per_core_threshold: f64,
-    #[serde(default = "default_total_threshold")]
+    /// Total averaged CPU usage threshold (percentage). Exceeding this triggers inhibition.
     pub total_threshold: f64,
-    #[serde(default = "default_ema_alpha_cpu")]
+    /// EMA smoothing factor for CPU readings.
     pub ema_alpha: f64,
 }
 
-fn default_per_core_threshold() -> f64 {
-    80.0
-}
-
-fn default_total_threshold() -> f64 {
-    25.0
+impl Default for CpuConfig {
+    fn default() -> Self {
+        Self {
+            per_core_threshold: 80.0,
+            total_threshold: 25.0,
+            ema_alpha: 0.7,
+        }
+    }
 }
 
-#[derive(Debug, Clone, Serialize, Deserialize, Default)]
+/// GPU metrics configuration with per-GPU and aggregate thresholds.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+// Container-level default: missing fields fall back to `GpuConfig::default()`, not `f64::default()` (0.0).
+#[serde(default)]
 pub struct GpuConfig {
-    #[serde(default = "default_gpu_threshold")]
-    pub threshold: f64,
-    #[serde(default = "default_ema_alpha_gpu")]
+    /// GPU usage threshold per individual card (percentage). Any single GPU above this triggers inhibition.
+    pub per_gpu_threshold: f64,
+    /// System-wide aggregate GPU threshold (average across all GPUs, percentage). The average GPU load exceeding this triggers inhibition.
+    pub total_threshold: f64,
+    /// EMA smoothing factor for GPU readings.
     pub ema_alpha: f64,
 }
 
-#[derive(Debug, Clone, Serialize, Deserialize, Default)]
+impl Default for GpuConfig {
+    fn default() -> Self {
+        Self {
+            per_gpu_threshold: 25.0,
+            total_threshold: 40.0,
+            ema_alpha: 0.7,
+        }
+    }
+}
+
+/// Network metrics configuration.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+// Container-level default: a missing `threshold` or `ema_alpha` falls back to `NetworkConfig::default()`, not `f64::default()` (0.0).
+#[serde(default)]
 pub struct NetworkConfig {
-    #[serde(default = "default_network_io")]
+    /// Network throughput threshold (Mbps). Exceeding this triggers inhibition.
     pub threshold: f64,
-    #[serde(default = "default_ema_alpha_network")]
+    /// EMA smoothing factor for network I/O readings.
     pub ema_alpha: f64,
     #[serde(default)]
     pub exclude_interfaces: Vec<String>,
@@ -92,29 +82,51 @@ pub struct NetworkConfig {
     pub include_interfaces: Vec<String>,
 }
 
-#[derive(Debug, Clone, Serialize, Deserialize, Default)]
+impl Default for NetworkConfig {
+    fn default() -> Self {
+        Self {
+            threshold: 10.0,
+            ema_alpha: 0.5,
+            exclude_interfaces: vec!["lo".to_string()],
+            include_interfaces: Vec::new(),
+        }
+    }
+}
+
+/// Disk metrics configuration.
+#[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct DiskConfig {
-    #[serde(default = "default_disk_activity")]
+    /// Disk I/O threshold (MB/s). Exceeding this triggers inhibition.
+    #[serde(default)]
     pub threshold: f64,
-    #[serde(default = "default_ema_alpha_disk")]
+    /// EMA smoothing factor for disk activity readings.
+    #[serde(default)]
     pub ema_alpha: f64,
     #[serde(default)]
     pub exclude_device_prefixes: Vec<String>,
 }
 
-fn default_cpu() -> CpuConfig {
-    Default::default()
-}
-
-fn default_gpu() -> GpuConfig {
-    Default::default()
+impl Default for DiskConfig {
+    fn default() -> Self {
+        Self {
+            threshold: 10.0,
+            ema_alpha: 0.5,
+            exclude_device_prefixes: vec![
+                "loop".to_string(),
+                "fd".to_string(),
+                "sr".to_string(),
+                "cdrom".to_string(),
+            ],
+        }
+    }
 }
 
-#[derive(Debug, Clone, Serialize, Deserialize)]
+/// Aggregated metrics configuration.
+#[derive(Debug, Clone, Serialize, Deserialize, Default)]
 pub struct Metrics {
-    #[serde(default = "default_cpu")]
+    #[serde(default)]
     pub cpu: CpuConfig,
-    #[serde(default = "default_gpu")]
+    #[serde(default)]
     pub gpu: GpuConfig,
     #[serde(default)]
     pub network: NetworkConfig,
@@ -122,52 +134,82 @@ pub struct Metrics {
     pub disk: DiskConfig,
 }
 
-fn default_duration_threshold() -> Duration {
-    Duration::from_secs(30)
-}
-
-fn default_cooldown_duration() -> Duration {
-    Duration::from_secs(60)
-}
-
-fn default_ema_alpha_cpu() -> f64 {
-    0.7
+/// Timing configuration for threshold evaluation.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct TimingConfig {
+    /// Minimum continuous time metrics must exceed threshold before inhibiting sleep.
+    #[serde(with = "humantime_serde")]
+    pub duration_threshold: Duration,
+    /// Time after releasing inhibition during which the daemon won't re-inhibit even if thresholds are exceeded again.
+    #[serde(default, with = "humantime_serde")]
+    pub cooldown_duration: Duration,
 }
 
-fn default_ema_alpha_gpu() -> f64 {
-    0.7
+impl Default for TimingConfig {
+    fn default() -> Self {
+        Self {
+            duration_threshold: Duration::from_secs(5),
+            cooldown_duration: Duration::from_secs(10),
+        }
+    }
 }
 
-fn default_ema_alpha_network() -> f64 {
-    0.5
+/// Inhibition configuration.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct InhibitionConfig {
+    /// Operations to inhibit (colon-separated). See D-Bus login1 API for options.
+    #[serde(default)]
+    pub what: String,
+    /// Mode of inhibition: block, delay, or block-weak.
+    #[serde(default)]
+    pub mode: String,
 }
 
-fn default_ema_alpha_disk() -> f64 {
-    0.5
+impl Default for InhibitionConfig {
+    fn default() -> Self {
+        Self {
+            what: "shutdown:idle".to_string(),
+            mode: "block".to_string(),
+        }
+    }
 }
 
+/// Predictive cooldown configuration.
 #[derive(Debug, Clone, Serialize, Deserialize)]
-pub struct TimingConfig {
-    #[serde(default = "default_duration_threshold", with = "humantime_serde")]
-    pub duration_threshold: Duration,
-    #[serde(default = "default_cooldown_duration", with = "humantime_serde")]
-    pub cooldown_duration: Duration,
-}
-
-fn default_what() -> String {
-    "shutdown:idle".to_string()
+pub struct PredictionConfig {
+    /// Interval between averaged snapshots written to the history log; must be >= root update_interval.
+    #[serde(default, with = "humantime_serde")]
+    pub update_interval: Duration,
+    /// Keep this much historical data; older entries are pruned periodically.
+    #[serde(default = "default_history_length", with = "humantime_serde")]
+    pub history_length: Duration,
+    /// Maximum additional time for predictive cooldown extension.
+    #[serde(default, with = "humantime_serde")]
+    pub max_extension_time: Duration,
+    /// Number of hidden neurons in the NG-RC reservoir computing model. Controls model capacity; larger values capture more complex patterns but use more memory (O(n^2) for n = hidden_dim).
+    #[serde(default)]
+    pub ml_hidden_dim: usize,
+    /// Size of the delay buffer used by the NG-RC model to create polynomial features from past states. Must be <= history_length / update_interval.
+    #[serde(default)]
+    pub ml_delay_buffer_size: usize,
 }
 
-fn default_mode() -> String {
-    "block".to_string()
+fn default_history_length() -> Duration {
+    // 30 days, matching config/rouser.toml. Kept as a function because a bare
+    // `#[serde(default)]` would yield Duration::default() (zero), not 30 days.
+    Duration::from_secs(30 * 24 * 60 * 60)
 }
 
-#[derive(Debug, Clone, Serialize, Deserialize)]
-pub struct InhibitionConfig {
-    #[serde(default = "default_what")]
-    pub what: String,
-    #[serde(default = "default_mode")]
-    pub mode: String,
+impl Default for PredictionConfig {
+    fn default() -> Self {
+        Self {
+            update_interval: Duration::from_secs(30),
+            history_length: Duration::from_secs(30 * 24 * 60 * 60),
+            max_extension_time: Duration::from_secs(3600),
+            ml_hidden_dim: 16,
+            ml_delay_buffer_size: 8,
+        }
+    }
 }
 
 #[derive(Clone)]
@@ -381,32 +423,12 @@ mod tests {
 
     #[test]
     fn test_metrics_defaults() {
-        let metrics = Metrics {
-            cpu: CpuConfig {
-                per_core_threshold: default_per_core_threshold(),
-                total_threshold: default_total_threshold(),
-                ema_alpha: default_ema_alpha_cpu(),
-            },
-            gpu: GpuConfig {
-                threshold: default_gpu_threshold(),
-                ema_alpha: default_ema_alpha_gpu(),
-            },
-            network: NetworkConfig {
-                threshold: default_network_io(),
-                ema_alpha: default_ema_alpha_network(),
-                exclude_interfaces: vec![],
-                include_interfaces: vec![],
-            },
-            disk: DiskConfig {
-                threshold: default_disk_activity(),
-                ema_alpha: default_ema_alpha_disk(),
-                exclude_device_prefixes: vec![],
-            },
-        };
+        let metrics = Metrics::default();
 
         assert_eq!(metrics.cpu.per_core_threshold, 80.0);
         assert_eq!(metrics.cpu.total_threshold, 25.0);
-        assert_eq!(metrics.gpu.threshold, 15.0);
+        assert_eq!(metrics.gpu.per_gpu_threshold, 25.0);
+        assert_eq!(metrics.gpu.total_threshold, 40.0);
         assert_eq!(metrics.network.threshold, 10.0);
         assert_eq!(metrics.disk.threshold, 10.0);
         assert_eq!(metrics.cpu.ema_alpha, 0.7);
@@ -417,12 +439,9 @@ mod tests {
 
     #[test]
     fn test_timing_defaults() {
-        let timing = TimingConfig {
-            duration_threshold: default_duration_threshold(),
-            cooldown_duration: default_cooldown_duration(),
-        };
+        let timing = TimingConfig::default();
 
-        assert_eq!(timing.duration_threshold.as_secs(), 30);
-        assert_eq!(timing.cooldown_duration.as_secs(), 60);
+        assert_eq!(timing.duration_threshold.as_secs(), 5);
+        assert_eq!(timing.cooldown_duration.as_secs(), 10);
     }
 }
diff --git a/src/inhibit.rs b/src/inhibit.rs
index ec6546a..849bc25 100644
--- a/src/inhibit.rs
+++ b/src/inhibit.rs
@@ -1,8 +1,26 @@
 use dbus::blocking::Connection;
-use tracing::debug;
+use tracing::{debug, warn};
+
+/// The `what` parameter that works on desktop systems without polkit rules.
+const FALLBACK_INHIBIT_TYPE: &str = "sleep";
+
+/// Check if a D-Bus error indicates an interactive authentication requirement.
+fn is_auth_error(error_msg: &str) -> bool {
+    const AUTH_INDICATORS: &[&str] = &[
+        "interactive authentication",
+        "requires interactive authentication",
+        "access denied",
+        "org.freedesktop.login1.notauthorized",
+        "not authorized",
+        "not authenticated",
+    ];
+    let lower = error_msg.to_lowercase();
+    AUTH_INDICATORS
+        .iter()
+        .any(|indicator| lower.contains(indicator))
+}
 
-/// Sleep inhibitor using lower-level dbus crate
-/// The dbus crate properly handles file descriptors (h: UNIX_FD type)
+/// Sleep inhibitor using lower-level dbus crate.
 pub struct SleepInhibitor {
     #[allow(dead_code)] // Connection kept for inhibitor lifetime
     conn: Connection,
@@ -11,41 +29,92 @@ pub struct SleepInhibitor {
 }
 
 impl SleepInhibitor {
-    pub async fn new(what: &str, who: &str, why: &str, mode: &str) -> anyhow::Result<Self> {
-        let dbus_mode = mode;
-
-        // Connect to system D-Bus
+    /// Attempt the D-Bus Inhibit call with the requested `what` type. Returns a `SleepInhibitor`
+    /// holding the OwnedFd that keeps inhibition active for its lifetime. Passes `dbus_mode` through
+    /// unchanged; use acquire_with_fallback(), which maps "block-weak" to "block" first.
+    async fn acquire_inhibition(
+        what: &str,
+        who: &str,
+        why: &str,
+        dbus_mode: &str,
+    ) -> anyhow::Result<Self> {
         let conn = Connection::new_system()
             .map_err(|e| anyhow::anyhow!("Failed to connect to system D-Bus: {}", e))?;
 
-        // Use with_proxy to create a wrapper for the target object
         let proxy = conn.with_proxy(
             "org.freedesktop.login1",
             "/org/freedesktop/login1",
             std::time::Duration::from_millis(3000),
         );
 
-        // Call Inhibit - returns (file_descriptor,) tuple
-        // The dbus crate handles file descriptors properly via OwnedFd
         let result: (dbus::arg::OwnedFd,) = proxy
             .method_call(
                 "org.freedesktop.login1.Manager",
                 "Inhibit",
-                (
-                    what.to_string(),
-                    who.to_string(),
-                    why.to_string(),
-                    dbus_mode.to_string(),
-                ),
+                (what, who, why, dbus_mode),
             )
             .map_err(|e| anyhow::anyhow!("Failed to call Inhibit: {}", e))?;
 
-        // Keep the file descriptor alive for the lifetime of the inhibition
-        // The fd is what keeps the inhibition active - it must not be dropped
         let fd = result.0;
 
         Ok(Self { conn, _fd: fd })
     }
+
+    /// Attempt inhibition with the requested `what` type. On desktop systems without polkit rules,
+    /// `"shutdown:idle"` may fail with an authentication error — in that case this method falls back
+    /// to using `"sleep"` which is less restrictive but more widely available. Only auth errors trigger fallback; other D-Bus failures propagate unchanged.
+    pub async fn acquire_with_fallback(
+        what: &str,
+        who: &str,
+        why: &str,
+        mode: &str,
+    ) -> anyhow::Result<Self> {
+        let dbus_mode = match mode {
+            "block-weak" => {
+                warn!("D-Bus API does not support 'block-weak' mode. Using 'block' instead.");
+                "block"
+            }
+            m => m,
+        };
+
+        // First attempt: try with the requested `what` type (e.g., "shutdown:idle").
+        match Self::acquire_inhibition(what, who, why, dbus_mode).await {
+            Ok(inhibitor) => Ok(inhibitor),
+            Err(e) if is_auth_error(&e.to_string()) => {
+                // Auth error — retry with the more widely-available "sleep" type.
+                match Self::acquire_inhibition(FALLBACK_INHIBIT_TYPE, who, why, dbus_mode).await {
+                    Ok(fb) => {
+                        warn!(
+                            "Requested inhibition type '{}' requires polkit interactive authentication — \
+                             falling back to '{}'. To fix this, add a polkit rule or set inhibitor.what=sleep in config.",
+                            what, FALLBACK_INHIBIT_TYPE
+                        );
+                        Ok(fb)
+                    }
+                    Err(e2) => {
+                        warn!(
+                            "Inhibition failed with '{}' (auth error indicator detected). \
+                             Also tried fallback type '{}': {}",
+                            what, FALLBACK_INHIBIT_TYPE, e2
+                        );
+                        Err(anyhow::anyhow!(
+                            "Failed to acquire inhibition with both '{}' and fallback '{}'",
+                            what,
+                            FALLBACK_INHIBIT_TYPE
+                        ))
+                    }
+                }
+            }
+            Err(e) => {
+                // Not an auth error — report the original failure without masking it.
+                Err(anyhow::anyhow!(
+                    "Inhibition failed for type '{}': {} (not an auth error)",
+                    what,
+                    e
+                ))
+            }
+        }
+    }
 }
 
 pub struct InhibitionState {
@@ -61,7 +130,6 @@ impl InhibitionState {
         }
     }
 
-    #[allow(dead_code)]
     pub async fn acquire(
         &mut self,
         what: &str,
@@ -74,12 +142,16 @@ impl InhibitionState {
             return Ok(());
         }
 
-        let inhibitor = SleepInhibitor::new(what, who, why, mode).await?;
-
-        self.inhibitor = Some(inhibitor);
-        self.is_inhibited = true;
+        let inhibitor = SleepInhibitor::acquire_with_fallback(what, who, why, mode).await;
 
-        Ok(())
+        match inhibitor {
+            Ok(inh) => {
+                self.inhibitor = Some(inh);
+                self.is_inhibited = true;
+                Ok(())
+            }
+            Err(e) => Err(e),
+        }
     }
 
     pub async fn release(&mut self) {
diff --git a/src/lib.rs b/src/lib.rs
index 0bcfab6..50954a0 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -1,4 +1,5 @@
 pub mod config;
 pub mod inhibit;
 pub mod metrics;
+pub mod prediction;
 pub mod service;
diff --git a/src/main.rs b/src/main.rs
index f09d24a..8a66739 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -1,6 +1,7 @@
 mod config;
 mod inhibit;
 mod metrics;
+mod prediction;
 mod service;
 
 use anyhow::Result;
@@ -9,6 +10,9 @@ use std::path::PathBuf;
 use std::process::ExitCode;
 use tracing::{error, info, warn};
 
+// Import the prelude for .with() method on subscribers.
+use tracing_subscriber::prelude::*;
+
 use config::ConfigLoader;
 use service::DataService;
 
@@ -38,11 +42,26 @@ struct Args {
     log_level: Option<String>,
 }
 
-fn resolve_initial_log_level(args: &Args) -> String {
+/// Resolve the effective tracing log level after config is loaded.
+/// Priority chain: CLI > RUST_LOG > config.log_level > 'info'.
+fn resolve_tracing_log_level(args: &Args, config: &config::Config) -> String {
     if let Some(ref cli_val) = args.log_level {
         return cli_val.to_string();
     }
-    std::env::var("RUST_LOG").unwrap_or_else(|_| "info".to_string())
+
+    // Environment variable is the next source — transient overrides persistent defaults.
+    if let Ok(val) = std::env::var("RUST_LOG") {
+        if !val.is_empty() {
+            return val;
+        }
+    }
+
+    // Config file log_level is the fallback for a persistent default.
+    if !config.log_level.is_empty() {
+        return config.log_level.clone();
+    }
+
+    "info".to_string()
 }
 
 fn load_single_config(path: &std::path::Path) -> Result<config::Config> {
@@ -52,46 +71,80 @@ fn load_single_config(path: &std::path::Path) -> Result<config::Config> {
         .map_err(|e| anyhow::anyhow!("Failed to load config from {}: {}", path.display(), e))
 }
 
-fn init_tracing(log_level: &str) {
-    tracing_subscriber::fmt()
-        .with_env_filter(
-            tracing_subscriber::EnvFilter::try_new(log_level).unwrap_or_else(|e| {
-                eprintln!("Invalid log level '{}': {}. Using 'info'.", log_level, e);
-                tracing_subscriber::EnvFilter::new("info")
-            }),
-        )
-        .with_target(true)
-        .with_level(true)
-        .with_thread_ids(false)
-        .with_thread_names(false)
-        .init();
-}
-
 #[tokio::main]
 async fn main() -> ExitCode {
     let args = Args::parse();
 
-    // Initialize tracing early so that auto-install logs during config load are captured.
-    init_tracing(&resolve_initial_log_level(&args));
+    // Phase 1 — init tracing at DEBUG so auto-install logs during config load are captured.
+    // RUST_LOG takes priority, then CLI flag, then fallback to debug.
+    let startup_level = std::env::var("RUST_LOG")
+        .ok()
+        .filter(|s| !s.is_empty())
+        .or_else(|| args.log_level.clone())
+        .unwrap_or_else(|| "debug".to_string());
+
+    // Build reloadable filter and install subscriber inline to avoid complex type annotations.
+    let env_filter = match tracing_subscriber::EnvFilter::try_new(&startup_level) {
+        Ok(f) => f,
+        Err(e) => {
+            eprintln!(
+                "Invalid log level '{}': {}. Using 'info'.",
+                startup_level, e
+            );
+            tracing_subscriber::EnvFilter::new("info")
+        }
+    };
+
+    let (env_filter, reload_handle) = tracing_subscriber::reload::Layer::new(env_filter);
 
-    // --print-config: merge all configs and serialize back to TOML.
+    let tracing_installed = match tracing_subscriber::registry()
+        .with(
+            tracing_subscriber::fmt::layer()
+                .with_target(true)
+                .with_level(true)
+                .with_thread_ids(false)
+                .with_thread_names(false),
+        )
+        .with(env_filter)
+        .try_init()
+    {
+        Ok(_) => true,
+        Err(e) if e.to_string().contains("global default") => false,
+        Err(e) => {
+            eprintln!("Failed to install tracing subscriber: {}", e);
+            false
+        }
+    };
+
+    // --print-config: serialize config as TOML and exit.
     if args.print_config {
-        match ConfigLoader::load_merged() {
-            Ok((config, _)) => {
-                if let Err(e) = ConfigLoader::print_config_toml(&config, &mut std::io::stdout()) {
-                    eprintln!("Error: {}", e);
+        let config = if let Some(ref path) = args.config {
+            match load_single_config(path) {
+                Ok(cfg) => cfg,
+                Err(e) => {
+                    error!(
+                        "Failed to load configuration from {}: {}",
+                        path.display(),
+                        e
+                    );
                     return ExitCode::FAILURE;
                 }
             }
-            Err(e) => {
+        } else {
+            let (cfg, _) = ConfigLoader::load_merged().unwrap_or_else(|e| {
                 error!("Failed to load and merge configuration: {}", e);
-                return ExitCode::FAILURE;
-            }
+                std::process::exit(1);
+            });
+            cfg
+        };
+        if let Err(e) = ConfigLoader::print_config_toml(&config, &mut std::io::stdout()) {
+            eprintln!("Error: {}", e);
+            return ExitCode::FAILURE;
         }
         return ExitCode::SUCCESS;
     }
 
-    // Load config with log_level for tracing init.
+    // Load configuration.
     let (config, _searched): (config::Config, Vec<String>) = if let Some(ref path) = args.config {
         match load_single_config(path) {
             Ok(cfg) => (cfg, vec![]),
@@ -111,6 +164,33 @@ async fn main() -> ExitCode {
         })
     };
 
+    // Phase 2: swap the filter to the resolved level (CLI > RUST_LOG > config.log_level > 'info') if our subscriber is active.
+    let final_level = resolve_tracing_log_level(&args, &config);
+    if tracing_installed {
+        match tracing_subscriber::EnvFilter::try_new(&final_level) {
+            Ok(new_filter) => {
+                reload_handle
+                    .modify(|filter| *filter = new_filter)
+                    .unwrap_or_else(|e| {
+                        warn!("Failed to modify tracing filter: {}", e);
+                    });
+                info!("Log level reconfigured to: {}", final_level);
+            }
+            Err(e) => {
+                eprintln!("Invalid log level '{}': {}. Using 'info'.", final_level, e);
+                reload_handle
+                    .modify(|filter| *filter = tracing_subscriber::EnvFilter::new("info"))
+                    .unwrap_or_else(|e| {
+                        warn!("Failed to modify tracing filter: {}", e);
+                    });
+            }
+        }
+    } else {
+        warn!(
+            "Tracing subscriber was not installed by rouser (a global default was already set, or installation failed). config.log_level will not take effect."
+        );
+    }
+
     let should_validate = args.validate_config;
 
     info!("rouser starting...");
@@ -170,8 +250,10 @@ async fn run_dry_run(config: &config::Config) -> Result<()> {
         config.metrics.cpu.ema_alpha
     );
     info!(
-        "  - GPU threshold: {}%, EMA alpha: {:.2}",
-        config.metrics.gpu.threshold, config.metrics.gpu.ema_alpha
+        "  - GPU per-GPU threshold: {}%, total threshold: {}%, EMA alpha: {:.2}",
+        config.metrics.gpu.per_gpu_threshold,
+        config.metrics.gpu.total_threshold,
+        config.metrics.gpu.ema_alpha
     );
     info!(
         "  - Network threshold: {} Mbps, EMA alpha: {:.2}",
diff --git a/src/metrics/gpu.rs b/src/metrics/gpu.rs
index 40078e6..f24ade7 100644
--- a/src/metrics/gpu.rs
+++ b/src/metrics/gpu.rs
@@ -69,7 +69,7 @@ impl GpuCollector {
 
     /// Returns true if any physical GPU cards exist on this system.
     pub fn has_gpus(&self) -> bool {
-        self.enumerate_gpus().is_empty()
+        !self.enumerate_gpus().is_empty()
     }
 
     /// Collect utilization data from all detected GPUs.
@@ -390,6 +390,16 @@ impl std::fmt::Display for GpuError {
 
 impl std::error::Error for GpuError {}
 
+/// Aggregate GPU metrics across all GPUs on the system.
+/// Mirrors CpuUsage pattern: per-GPU max + average for inhibition decisions.
+#[derive(Debug, Clone, Default)]
+pub struct GpuAggregate {
+    /// Maximum individual GPU usage across all devices (0-100).
+    pub per_gpu_max: f64,
+    /// Average usage across all GPUs (sum / count) (0-100).
+    pub total_average: f64,
+}
+
 #[derive(Debug, Clone)]
 pub struct GpuData {
     pub device_id: String,
@@ -397,6 +407,37 @@ pub struct GpuData {
     pub usage: f64,
 }
 
+impl GpuAggregate {
+    /// Compute aggregate metrics from individual GPU data.
+    #[allow(dead_code)] // Kept for potential future use with full GpuData inputs.
+    pub(crate) fn from_gpus(gpus: &[GpuData]) -> Self {
+        if gpus.is_empty() {
+            return Self::default();
+        }
+        let max = gpus.iter().map(|g| g.usage).fold(0.0f64, f64::max);
+        let sum: f64 = gpus.iter().map(|g| g.usage).sum();
+        let avg = sum / gpus.len() as f64;
+        Self {
+            per_gpu_max: max,
+            total_average: avg,
+        }
+    }
+
+    /// Compute aggregate metrics from raw GPU usage values (e.g., after EMA smoothing).
+    pub fn from_values(values: &[f64]) -> Self {
+        if values.is_empty() {
+            return Self::default();
+        }
+        let max = values.iter().cloned().fold(0.0f64, f64::max);
+        let sum: f64 = values.iter().sum();
+        let avg = sum / values.len() as f64;
+        Self {
+            per_gpu_max: max,
+            total_average: avg,
+        }
+    }
+}
+
 #[cfg(test)]
 mod tests {
     use super::*;
@@ -778,3 +819,156 @@ mod enumerate_tests {
         assert!(!GpuCollector::is_valid_gpu_card("", &empty));
     }
 }
+#[cfg(test)]
+mod has_gpus_tests {
+    use super::*;
+
+    #[test]
+    fn test_has_gpus_consistent_with_enumerate() {
+        let collector = GpuCollector::new();
+        let enumerated = collector.enumerate_gpus();
+
+        // has_gpus and enumerate results must agree:
+        // has_gpus is true iff enumerate returns non-empty.
+        assert_eq!(collector.has_gpus(), !enumerated.is_empty());
+    }
+
+    #[test]
+    fn test_enumerate_returns_known_driver_types() {
+        let collector = GpuCollector::new();
+        let cards = collector.enumerate_gpus();
+
+        for card in &cards {
+            // All enumerated cards should have recognized drivers, not "unknown"
+            assert_ne!(
+                card.driver_name, "unknown",
+                "Card {} has unrecognized driver '{}'",
+                card.device_id, card.driver_name
+            );
+        }
+
+        if !cards.is_empty() {
+            println!("Enumerated GPUs: {:?}", cards);
+        }
+    }
+
+    #[test]
+    fn test_has_gpus_false_on_empty_sysfs_simulation() {
+        let base = tempfile::tempdir().unwrap();
+
+        // Verify is_valid_gpu_card rejects all entries in empty temp dir.
+        let entries = fs::read_dir(base.path()).ok();
+        let mut found_any = false;
+        if let Some(entries) = entries {
+            for entry in entries.flatten() {
+                let path = entry.path();
+                let name = match path.file_name().and_then(|s| s.to_str()) {
+                    Some(n) => n,
+                    None => continue,
+                };
+                if GpuCollector::is_valid_gpu_card(name, &path) {
+                    found_any = true;
+                }
+            }
+        }
+
+        // Empty temp dir should have no valid GPU cards.
+        assert!(
+            !found_any,
+            "tempdir unexpectedly contains valid gpu card entries"
+        );
+    }
+
+    #[test]
+    fn test_has_gpus_true_when_fake_card_present() {
+        let base = tempfile::tempdir().unwrap();
+        let card_path = base.path().join("card0");
+        fs::create_dir_all(card_path.join("device")).unwrap();
+
+        // Verify is_valid_gpu_card accepts the fake card.
+        assert!(GpuCollector::is_valid_gpu_card("card0", &card_path));
+    }
+}
+
+#[cfg(test)]
+mod gpu_aggregate_tests {
+    use super::*;
+
+    #[test]
+    fn test_gpu_aggregate_empty_values_returns_default() {
+        let agg = GpuAggregate::from_values(&[]);
+        assert_eq!(agg.per_gpu_max, 0.0);
+        assert_eq!(agg.total_average, 0.0);
+    }
+
+    #[test]
+    fn test_gpu_aggregate_single_value_both_metrics_equal() {
+        let agg = GpuAggregate::from_values(&[50.0]);
+        // With one GPU, max and average are the same value.
+        assert!((agg.per_gpu_max - 50.0).abs() < f64::EPSILON);
+        assert!((agg.total_average - 50.0).abs() < f64::EPSILON);
+    }
+
+    #[test]
+    fn test_gpu_aggregate_two_gpus_max_and_average_correct() {
+        let agg = GpuAggregate::from_values(&[30.0, 70.0]);
+        // max is 70 (highest GPU)
+        assert!((agg.per_gpu_max - 70.0).abs() < f64::EPSILON);
+        // average is (30+70)/2 = 50
+        assert!((agg.total_average - 50.0).abs() < f64::EPSILON);
+    }
+
+    #[test]
+    fn test_gpu_aggregate_three_gpus_correct() {
+        let agg = GpuAggregate::from_values(&[10.0, 50.0, 90.0]);
+        // max is 90
+        assert!((agg.per_gpu_max - 90.0).abs() < f64::EPSILON);
+        // average is (10+50+90)/3 = 50
+        assert!((agg.total_average - 50.0).abs() < f64::EPSILON);
+    }
+
+    #[test]
+    fn test_gpu_aggregate_all_zeros() {
+        let agg = GpuAggregate::from_values(&[0.0, 0.0, 0.0]);
+        assert!((agg.per_gpu_max - 0.0).abs() < f64::EPSILON);
+        assert!((agg.total_average - 0.0).abs() < f64::EPSILON);
+    }
+
+    #[test]
+    fn test_gpu_aggregate_default_impl_is_zero() {
+        let agg = GpuAggregate::default();
+        assert_eq!(agg.per_gpu_max, 0.0);
+        assert_eq!(agg.total_average, 0.0);
+    }
+
+    #[test]
+    fn test_gpu_aggregate_from_gpus_empty_returns_default() {
+        let gpus: Vec<GpuData> = vec![];
+        let agg = GpuAggregate::from_gpus(&gpus);
+        assert_eq!(agg.per_gpu_max, 0.0);
+        assert_eq!(agg.total_average, 0.0);
+    }
+
+    #[test]
+    fn test_gpu_aggregate_from_gpus_matches_from_values() {
+        let gpus = vec![
+            GpuData {
+                device_id: "card0".into(),
+                driver_name: "nvidia".into(),
+                usage: 40.0,
+            },
+            GpuData {
+                device_id: "card1".into(),
+                driver_name: "amdgpu".into(),
+                usage: 80.0,
+            },
+        ];
+        let values = vec![40.0, 80.0];
+
+        let agg_from_gpus = GpuAggregate::from_gpus(&gpus);
+        let agg_from_values = GpuAggregate::from_values(&values);
+
+        assert!((agg_from_gpus.per_gpu_max - agg_from_values.per_gpu_max).abs() < f64::EPSILON);
+        assert!((agg_from_gpus.total_average - agg_from_values.total_average).abs() < f64::EPSILON);
+    }
+}
diff --git a/src/metrics/mod.rs b/src/metrics/mod.rs
index 766bfe0..20d3bd0 100644
--- a/src/metrics/mod.rs
+++ b/src/metrics/mod.rs
@@ -8,7 +8,7 @@ use std::fmt;
 
 pub use cpu::{CpuCollector, CpuUsage};
 pub use disk::{DiskCollector, DiskThroughput};
-pub use gpu::{GpuCollector, GpuData};
+pub use gpu::{GpuAggregate, GpuCollector, GpuData};
 pub use network::{NetworkCollector, NetworkThroughput};
 
 #[derive(Debug, Clone)]
diff --git a/src/prediction/history.rs b/src/prediction/history.rs
new file mode 100644
index 0000000..fba95d1
--- /dev/null
+++ b/src/prediction/history.rs
@@ -0,0 +1,1204 @@
+//! Binary history log for predictive cooldown.
+//!
+//! Uses bincode v2 (serde-compatible binary serialization) with date-partitioned files.
+//! Each file is named `history.log.YYYYMMDD` and stored under XDG-compliant paths:
+//! - User state dir: `$XDG_STATE_HOME/rouser/history.log.*` or `~/.local/state/rouser/history.log.*` (falls back to `/tmp/rouser-history` if primary is unavailable)
+//! - Root path: `/var/lib/rouser/history.log.*`
+
+use chrono::{DateTime, Local, Utc};
+use serde::{Deserialize, Serialize};
+use std::collections::BTreeMap;
+use std::fs::{self, File};
+use std::io::{BufReader, BufWriter, Read, Write};
+use std::os::unix::fs::PermissionsExt;
+use std::path::{Path, PathBuf};
+use tracing::{debug, info, warn};
+
+/// Aggregate GPU metrics stored in history entries (mirrors CpuSnapshot pattern).
+#[derive(Debug, Clone, Default, Serialize, Deserialize)]
+pub struct GpuSnapshot {
+    /// Maximum individual GPU usage across all devices (0-100).
+    pub per_gpu_max: f64,
+    /// Average usage across all GPUs (sum / count) (0-100).
+    pub total_average: f64,
+}
+
+/// A single data point recorded at each tick.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct HistoryEntry {
+    /// Nanoseconds since the Unix epoch (1970-01-01T00:00:00 UTC).
+    pub timestamp_ns: u64,
+    /// CPU usage metrics (per_core_max, total_average).
+    pub cpu_usage: CpuSnapshot,
+    /// Aggregate GPU metrics across all devices for consistent history format regardless of GPU count.
+    #[serde(default)]
+    pub gpu_usage: GpuSnapshot,
+    /// Network throughput (Mbps), aggregated across all interfaces.
+    pub network_mbps: f64,
+    /// Disk throughput (MB/s), aggregated across all devices.
+    pub disk_mb_s: f64,
+    /// Whether rouser currently holds the inhibition lock at this timestamp.
+    pub inhibited: bool,
+}
+
+/// Computed rate-of-change values between two consecutive history entries.
+#[derive(Debug, Clone)]
+pub struct EntryDeltas {
+    /// Nanoseconds elapsed since previous entry (None if the timestamps are equal or out of order).
+    pub elapsed_since_last_ns: Option<u64>,
+    /// Rate of change of CPU per_core_max usage in %/s.
+    pub cpu_delta_per_sec: Option<f64>,
+    /// Rate of change of network throughput in Mbps/s.
+    pub network_delta_per_sec: Option<f64>,
+    /// Rate of change of disk throughput in MB/s/s.
+    pub disk_delta_per_sec: Option<f64>,
+    /// Rate of change of GPU per_gpu_max usage in %/s.
+    pub gpu_delta_per_gpu_max: Option<f64>,
+    /// Rate of change of GPU total_average usage in %/s.
+    pub gpu_delta_total_average: Option<f64>,
+}
+
+impl EntryDeltas {
+    /// Compute deltas between a current and previous history entry.
+    pub fn compute(current: &HistoryEntry, prev: &HistoryEntry) -> Self {
+        let elapsed_ns = current.timestamp_ns.saturating_sub(prev.timestamp_ns);
+
+        if elapsed_ns == 0 {
+            return Self {
+                elapsed_since_last_ns: None,
+                cpu_delta_per_sec: None,
+                network_delta_per_sec: None,
+                disk_delta_per_sec: None,
+                gpu_delta_per_gpu_max: None,
+                gpu_delta_total_average: None,
+            };
+        }
+
+        let secs_f64 = elapsed_ns as f64 / 1_000_000_000.0;
+
+        let cpu_delta_per_sec = if secs_f64 > 0.0 {
+            Some((current.cpu_usage.per_core_max - prev.cpu_usage.per_core_max) / secs_f64)
+        } else {
+            None
+        };
+
+        let network_delta_per_sec = if secs_f64 > 0.0 {
+            Some((current.network_mbps - prev.network_mbps) / secs_f64)
+        } else {
+            None
+        };
+
+        let disk_delta_per_sec = if secs_f64 > 0.0 {
+            Some((current.disk_mb_s - prev.disk_mb_s) / secs_f64)
+        } else {
+            None
+        };
+
+        let gpu_delta_per_gpu_max = if secs_f64 > 0.0 {
+            Some((current.gpu_usage.per_gpu_max - prev.gpu_usage.per_gpu_max) / secs_f64)
+        } else {
+            None
+        };
+
+        let gpu_delta_total_average = if secs_f64 > 0.0 {
+            Some((current.gpu_usage.total_average - prev.gpu_usage.total_average) / secs_f64)
+        } else {
+            None
+        };
+
+        Self {
+            elapsed_since_last_ns: Some(elapsed_ns),
+            cpu_delta_per_sec,
+            network_delta_per_sec,
+            disk_delta_per_sec,
+            gpu_delta_per_gpu_max,
+            gpu_delta_total_average,
+        }
+    }
+}
+
+/// CPU metrics snapshot — serializable subset of CpuUsage.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct CpuSnapshot {
+    pub per_core_max: f64,
+    pub total_average: f64,
+}
+
+impl HistoryEntry {
+    /// Create a new history entry from tick metrics and current inhibition state.
+    #[allow(clippy::too_many_arguments)]
+    pub fn new(
+        timestamp_ns: u64,
+        cpu_per_core_max: f64,
+        cpu_total_average: f64,
+        gpu_per_gpu_max: f64,
+        gpu_total_average: f64,
+        network_mbps: f64,
+        disk_mb_s: f64,
+        inhibited: bool,
+    ) -> Self {
+        Self {
+            timestamp_ns,
+            cpu_usage: CpuSnapshot {
+                per_core_max: cpu_per_core_max,
+                total_average: cpu_total_average,
+            },
+            gpu_usage: GpuSnapshot {
+                per_gpu_max: gpu_per_gpu_max,
+                total_average: gpu_total_average,
+            },
+            network_mbps,
+            disk_mb_s,
+            inhibited,
+        }
+    }
+
+    /// Extract the date component for file partitioning (UTC day).
+    pub fn entry_date(&self) -> chrono::NaiveDate {
+        let secs = self.timestamp_ns / 1_000_000_000;
+        match DateTime::<Utc>::from_timestamp(secs as i64, 0) {
+            Some(dt) => dt.naive_utc().date(),
+            None => Local::now().date_naive(),
+        }
+    }
+
+    /// Serialize this entry as a 4-byte little-endian length prefix followed by the bincode v2 (standard config) payload.
+    pub fn to_bytes(&self) -> Vec<u8> {
+        let encoded = bincode::serde::encode_to_vec(self, bincode::config::standard())
+            .expect("HistoryEntry should serialize");
+        // Prepend 4-byte length prefix for seekable streaming.
+        let len = (encoded.len() as u32).to_le_bytes();
+        let mut result = Vec::with_capacity(4 + encoded.len());
+        result.extend_from_slice(&len);
+        result.extend_from_slice(&encoded);
+        result
+    }
+
+    /// Deserialize a single entry from bytes starting at offset 0.
+    /// Returns `(entry, consumed_bytes)` or `None` if the buffer is too short/corrupt.
+    pub fn from_bytes(buf: &[u8]) -> Option<(Self, usize)> {
+        if buf.len() < 4 {
+            return None;
+        }
+        let len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
+        if buf.len() < 4 + len {
+            return None;
+        }
+        match bincode::serde::decode_from_slice::<Self, _>(
+            &buf[4..4 + len],
+            bincode::config::standard(),
+        ) {
+            Ok((entry, _)) => Some((entry, 4 + len)), // Advance by the framed length, not the decoder's byte count.
+            Err(e) => {
+                debug!(
+                    "bincode decode error (len={}, data_prefix={:?}): {}",
+                    len,
+                    &buf[4..(4 + len).min(20)],
+                    e
+                );
+                None // Corrupted entry.
+            }
+        }
+    }
+}
+
+fn xdg_state_dir() -> PathBuf {
+    std::env::var("XDG_STATE_HOME")
+        .ok()
+        .filter(|s| !s.is_empty())
+        .map(PathBuf::from)
+        .unwrap_or_else(|| {
+            std::env::var("HOME")
+                .ok()
+                .map(|h| PathBuf::from(h).join(".local/state"))
+                .expect("XDG_STATE_HOME or HOME must be set for user state directory")
+        })
+}
+
+fn history_base_dir(is_root: bool) -> PathBuf {
+    let path = if is_root {
+        PathBuf::from("/var/lib/rouser")
+    } else {
+        xdg_state_dir().join("rouser")
+    };
+
+    if is_root {
+        let _ = fs::create_dir_all(path.parent().unwrap_or(&path));
+    }
+
+    path
+}
+
+fn is_path_writable(path: &Path) -> bool {
+    let test_file = path.join(".rouser-writable-check");
+    match File::create(&test_file) {
+        Ok(f) => drop(f),
+        Err(_) => return false,
+    }
+    fs::remove_file(&test_file).is_ok()
+}
+
+fn ensure_history_dir(path: &Path) -> std::io::Result<()> {
+    fs::create_dir_all(path)
+}
+
+fn fallback_data_dir(primary: &Path, is_root: bool) -> Option<PathBuf> {
+    if is_root || !primary.starts_with("/home") {
+        return None;
+    }
+
+    // Last resort for read-only /home with no writable state dir.
+    // Use PID-based unique path to minimize TOCTOU risk on shared systems.
+    let tmp = PathBuf::from(format!(
+        "/tmp/rouser-history.{pid}",
+        pid = std::process::id()
+    ));
+
+    if ensure_history_dir(&tmp).is_ok() {
+        // Restrict permissions: owner-only access (700).
+        fs::set_permissions(&tmp, fs::Permissions::from_mode(0o700)).ok();
+        return Some(tmp);
+    }
+
+    None
+}
+
+const HISTORY_FILE_PREFIX: &str = "history.log.";
+
+/// Fill temporal gaps in sorted history entries with synthetic zero-value records.
+/// When the computer is shut down or sleeping, no data points are written to the history log.
+/// Without correction, the prediction model sees only active-period samples, so it treats
+/// the missing idle time as if it never happened and overestimates future activity for
+/// those hours.
+///
+/// Any gap greater than `gap_threshold_ns` between consecutive entries is filled with synthetic
+/// zero-value records spaced at `fill_interval_ns` intervals. These represent idle periods where
+/// no activity was recorded because the system was powered off or sleeping.
+pub fn fill_gaps(
+    entries: Vec<HistoryEntry>,
+    gap_threshold_ns: u64,
+    fill_interval_ns: u64,
+) -> Vec<HistoryEntry> {
+    if entries.len() < 2 || fill_interval_ns == 0 { // A zero interval would never advance the fill loop.
+        return entries;
+    }
+
+    let mut result = vec![entries[0].clone()];
+
+    for entry in entries.iter().skip(1) {
+        let prev_ts = result.last().unwrap().timestamp_ns;
+        let curr = entry;
+        let gap_ns = curr.timestamp_ns.saturating_sub(prev_ts);
+
+        if gap_ns > gap_threshold_ns {
+            // Fill the gap with synthetic zero-value entries.
+            let mut ts = prev_ts.saturating_add(fill_interval_ns);
+            while ts < curr.timestamp_ns.saturating_sub(fill_interval_ns / 2) {
+                result.push(HistoryEntry::new(ts, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, false));
+                ts = ts.saturating_add(fill_interval_ns);
+            }
+        }
+
+        result.push(curr.clone());
+    }
+
+    debug!(
+        "Filled gaps: {} entries -> {} entries (added {} synthetic)",
+        entries.len(),
+        result.len(),
+        result.len() - entries.len()
+    );
+
+    result
+}
+
+/// A date-partitioned binary log file for storing metric snapshots.
+pub struct HistoryLog {
+    base_path: PathBuf,
+    entries_today: Vec<HistoryEntry>,
+    pending_summary: Option<String>,
+    last_prune_date: Option<i64>, // Date of the last prune as a YYYYMMDD integer (e.g. 20250615)
+}
+
+impl HistoryLog {
+    pub fn new(is_root: bool) -> Self {
+        let primary = history_base_dir(is_root);
+        let base_path = if ensure_history_dir(&primary).is_ok() {
+            primary.clone()
+        } else if let Some(fallback) = fallback_data_dir(&primary, is_root) {
+            info!(
+                "Using alternate data directory {} (primary {} unavailable)",
+                fallback.display(),
+                primary.display()
+            );
+            fallback
+        } else {
+            warn!("History logging disabled — no writable data directory available");
+            return HistoryLog {
+                base_path: PathBuf::from("/dev/null"), // Sentinel: opening files under it fails, so flush() only logs a warning.
+                entries_today: Vec::new(),
+                pending_summary: None,
+                last_prune_date: None,
+            };
+        };
+
+        let _ = ensure_history_dir(&base_path);
+
+        HistoryLog {
+            base_path,
+            entries_today: Vec::new(),
+            pending_summary: None,
+            last_prune_date: None,
+        }
+    }
+
+    /// Append an entry to the log with optional summary for logging on flush. Buffers until flush or date change.
+    pub fn append_with_summary(&mut self, entry: HistoryEntry, summary: Option<String>) {
+        if let Some(s) = summary {
+            self.pending_summary = Some(s);
+        }
+        self.append(entry);
+    }
+
+    /// Append an entry to the log. Buffers in memory until flush or date change.
+    pub fn append(&mut self, entry: HistoryEntry) {
+        let entry_date = entry.entry_date();
+
+        if self.entries_today.is_empty() {
+            self.entries_today.push(entry);
+        } else {
+            // Check if this entry is for the same day as our buffer.
+            let first_date = self.entries_today.first().map(|e| e.entry_date());
+            match first_date {
+                Some(d) if d == entry_date => {
+                    self.entries_today.push(entry);
+                }
+                _ => {
+                    // Different date — flush previous day and start new buffer.
+                    self.flush();
+                    self.entries_today = vec![entry];
+                }
+            }
+        }
+    }
+
+    /// Flush in-memory entries to disk, logging a summary if one was set via append_with_summary.
+    pub fn flush(&mut self) {
+        if self.entries_today.is_empty() {
+            return;
+        }
+
+        let date = self.entries_today[0].entry_date();
+        let file_path =
+            self.base_path
+                .join(format!("{}{}", HISTORY_FILE_PREFIX, date.format("%Y%m%d")));
+
+        match File::options().create(true).append(true).open(&file_path) {
+            Ok(file) => {
+                let mut writer = BufWriter::new(file);
+                for entry in &self.entries_today {
+                    let bytes = entry.to_bytes();
+                    if let Err(e) = writer.write_all(&bytes) {
+                        warn!("Failed to write history entry: {}", e);
+                    }
+                }
+                if let Err(e) = writer.flush() {
+                    warn!("Failed to flush history buffer: {}", e);
+                }
+            }
+            Err(e) => {
+                warn!("Failed to open history log {}: {}", file_path.display(), e);
+            }
+        }
+
+        if let Some(ref summary) = self.pending_summary {
+            debug!(
+                "{} — flushed {} entries for date {} to {}",
+                summary,
+                self.entries_today.len(),
+                date,
+                file_path.display()
+            );
+        } else {
+            debug!(
+                "Flushed {} entries for date {} to {}",
+                self.entries_today.len(),
+                date,
+                file_path.display()
+            );
+        }
+
+        let _ = self.pending_summary.take();
+        self.entries_today.clear();
+    }
+
+    /// Read all entries from the history files, sorted by timestamp.
+    pub fn read_all(&self) -> Vec<HistoryEntry> {
+        if !self.base_path.exists() {
+            return vec![];
+        }
+
+        let mut date_entries: BTreeMap<String, Vec<HistoryEntry>> = BTreeMap::new();
+
+        let dir = match fs::read_dir(&self.base_path) {
+            Ok(d) => d,
+            Err(_) => return vec![], // Directory doesn't exist or can't be read.
+        };
+
+        for entry_result in dir {
+            let path = match entry_result {
+                Ok(e) => e.path(),
+                Err(_) => continue,
+            };
+
+            if !path.is_file() || !is_history_file(&path) {
+                continue;
+            }
+
+            let entries = read_entries_from_file(&path);
+            // Use filename YYYYMMDD as sort key for BTreeMap (lexicographic == chronological).
+            // Fall back to file creation/modification time when filename doesn't contain a valid date.
+            // On Linux, std::fs provides no safe way to access birth/creation times without unsafe
+            // syscalls — modification time is used as the best available proxy since historical log
+            // files are typically not modified after their initial writes (only appended or pruned).
+            let sort_key: String = extract_date_str(&path).unwrap_or_else(|| {
+                path.metadata()
+                    .and_then(|m| m.modified())
+                    .ok()
+                    .and_then(|t| t.duration_since(std::time::UNIX_EPOCH).ok())
+                    .map(|d| format!("{:020}", d.as_secs()))
+                    .unwrap_or_else(|| "99999999".to_string())
+            });
+
+            date_entries.entry(sort_key).or_default().extend(entries);
+        }
+
+        // Flatten entries and sort by timestamp (BTreeMap iterates in key/date order).
+        let mut result: Vec<HistoryEntry> = date_entries.into_values().flatten().collect();
+
+        result.sort_by_key(|e| e.timestamp_ns);
+        debug!(
+            "Loaded {} history entries from {}",
+            result.len(),
+            self.base_path.display()
+        );
+
+        result
+    }
+
+    /// Prune old files beyond the given retention period. Called periodically (e.g., every 12 hours).
+    #[allow(dead_code)]
+    pub fn prune(&mut self, max_age: std::time::Duration) {
+        let base_path = &self.base_path;
+
+        if !base_path.exists() || !base_path.is_dir() {
+            return;
+        }
+
+        // Compute today's UTC date (file names use UTC days, see `entry_date`) and the retention in whole days.
+        let today_naive = Utc::now().date_naive();
+        let days_to_subtract: i32 = (max_age.as_secs() / 86400) as i32;
+
+        // Convert NaiveDate to a comparable YYYYMMDD integer (lexical sort == chronological for this format).
+        fn date_as_ymd_int(date: chrono::NaiveDate) -> Option<i32> {
+            let ymd_str = date.format("%Y%m%d").to_string();
+            ymd_str.parse::<i32>().ok()
+        }
+
+        // Convert YYYYMMDD string to NaiveDate for precise age comparison.
+        fn parse_ymd(s: &str) -> Option<chrono::NaiveDate> {
+            let year = s[0..4].parse().ok()?;
+            let month = s[4..6].parse().ok()?;
+            let day = s[6..8].parse().ok()?;
+            chrono::NaiveDate::from_ymd_opt(year, month, day)
+        }
+
+        // Compute cutoff date using NaiveDate arithmetic.
+        let cutoff_date = today_naive - chrono::TimeDelta::days(i64::from(days_to_subtract));
+
+        if let Some(today_ymd) = date_as_ymd_int(today_naive) {
+            // Only prune once per day (use the YYYYMMDD as a dedup key).
+            if self.last_prune_date == Some(today_ymd as i64) {
+                return;
+            }
+
+            let mut pruned_count: u32 = 0;
+
+            let dir = match fs::read_dir(base_path) {
+                Ok(d) => d,
+                Err(_) => return, // Can't read directory — skip pruning.
+            };
+
+            for entry_result in dir {
+                let path = match entry_result {
+                    Ok(e) => e.path(),
+                    Err(_) => continue,
+                };
+
+                if !path.is_file() || !is_history_file(&path) {
+                    continue;
+                }
+
+                // Extract YYYYMMDD from filename.
+                let file_name = path.file_name().and_then(|s| s.to_str()).unwrap_or("");
+                let date_part = file_name.strip_prefix(HISTORY_FILE_PREFIX).unwrap_or("");
+
+                if date_part.len() == 8 && date_part.chars().all(|c| c.is_ascii_digit()) {
+                    if let Some(file_date) = parse_ymd(date_part) {
+                        if file_date < cutoff_date {
+                            match fs::remove_file(&path) {
+                                Ok(_) => {
+                                    pruned_count += 1;
+                                    debug!(
+                                        "Pruned old history file {} (date: {})",
+                                        path.display(),
+                                        date_part
+                                    );
+                                }
+                                Err(e) => {
+                                    warn!(
+                                        "Failed to prune old history file {}: {}",
+                                        path.display(),
+                                        e
+                                    );
+                                }
+                            }
+                        }
+                    }
+                }
+            }
+
+            self.last_prune_date = Some(today_ymd as i64);
+
+            if pruned_count > 0 {
+                info!(
+                    "Pruned {} old history files (retention: {:?})",
+                    pruned_count, max_age
+                );
+            }
+        } // Can't compute today's date — skip pruning.
+    }
+
+    /// Check if the log has any data.
+    #[allow(dead_code)]
+    pub fn is_empty(&self) -> bool {
+        self.entries_today.is_empty() && !has_existing_files(&self.base_path)
+    }
+}
+
+impl Drop for HistoryLog {
+    fn drop(&mut self) {
+        self.flush();
+    }
+}
+
+#[allow(dead_code)]
+fn has_existing_files(base: &Path) -> bool {
+    let dir = match fs::read_dir(base) {
+        Ok(d) => d,
+        Err(_) => return false, // Directory doesn't exist or can't be read.
+    };
+
+    dir.flatten().any(|entry| is_history_file(&entry.path()))
+}
+
+fn is_history_file(path: &Path) -> bool {
+    let name = match path.file_name().and_then(|s| s.to_str()) {
+        Some(n) => n,
+        None => return false,
+    };
+    if !name.starts_with(HISTORY_FILE_PREFIX) {
+        return false;
+    }
+    // The suffix must be all ASCII digits and at least 8 long (YYYYMMDD); longer digit runs also pass here.
+    let after_prefix = &name[HISTORY_FILE_PREFIX.len()..];
+    after_prefix.len() >= 8 && after_prefix.chars().all(|c| c.is_ascii_digit())
+}
+
+/// Extract YYYYMMDD string from a history file path for BTreeMap sorting.
+fn extract_date_str(path: &Path) -> Option<String> {
+    let name = path.file_name()?.to_str()?;
+    if let Some(date_part) = name.strip_prefix(HISTORY_FILE_PREFIX) {
+        if date_part.len() == 8 && date_part.chars().all(|c| c.is_ascii_digit()) {
+            return Some(date_part.to_string());
+        }
+    }
+    None
+}
+
+fn read_entries_from_file(path: &Path) -> Vec<HistoryEntry> {
+    let mut entries = Vec::new();
+
+    let file = match File::open(path) {
+        Ok(f) => f,
+        Err(e) => {
+            warn!("Failed to open history file {}: {}", path.display(), e);
+            return entries;
+        }
+    };
+
+    let mut reader = BufReader::new(file);
+    let mut buf = Vec::new();
+
+    if let Err(e) = reader.read_to_end(&mut buf) {
+        warn!("Failed to read history file {}: {}", path.display(), e);
+        return entries;
+    }
+
+    let mut offset = 0usize;
+    while offset < buf.len() {
+        match HistoryEntry::from_bytes(&buf[offset..]) {
+            Some((entry, next_offset)) => {
+                entries.push(entry);
+                offset += next_offset;
+            }
+            None => {
+                warn!(
+                    "Failed to decode entry at offset {} in file {}: buffer has {} bytes, first 4 bytes as length prefix = {}",
+                    offset,
+                    path.display(),
+                    buf.len() - offset,
+                    if (buf.len() - offset) >= 4 {
+                        u32::from_le_bytes([buf[offset], buf[offset + 1], buf[offset + 2], buf[offset + 3]]) as usize
+                    } else {
+                        0
+                    },
+                );
+                break; // Corrupted or truncated entry at end.
+            }
+        }
+    }
+
+    debug!("Read {} entries from {}", entries.len(), path.display());
+    entries
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use std::time::{Duration, SystemTime};
+
+    fn sample_entry(timestamp_ns: u64) -> HistoryEntry {
+        HistoryEntry::new(
+            timestamp_ns,
+            25.0, // cpu per_core_max
+            12.0, // cpu total_average
+            78.0, // gpu per_gpu_max (max of [45, 78])
+            61.5, // gpu total_average ((45+78)/2)
+            15.5, // network mbps
+            3.2,  // disk mb/s
+            true, // inhibited
+        )
+    }
+
+    #[test]
+    fn test_history_entry_serialization_roundtrip() {
+        let now = SystemTime::now()
+            .duration_since(SystemTime::UNIX_EPOCH)
+            .unwrap();
+        let entry = sample_entry(now.as_nanos() as u64);
+        let bytes = entry.to_bytes();
+
+        assert!(!bytes.is_empty(), "serialized entry should not be empty");
+
+        let (decoded, consumed) =
+            HistoryEntry::from_bytes(&bytes).expect("should decode valid entry");
+
+        assert_eq!(consumed, bytes.len(), "should consume all bytes");
+        assert_eq!(entry.timestamp_ns, decoded.timestamp_ns);
+        assert!(
+            (entry.cpu_usage.per_core_max - decoded.cpu_usage.per_core_max).abs() < f64::EPSILON
+        );
+        assert_eq!(
+            entry.cpu_usage.total_average,
+            decoded.cpu_usage.total_average
+        );
+        assert_eq!(entry.gpu_usage.per_gpu_max, decoded.gpu_usage.per_gpu_max);
+        assert_eq!(
+            entry.gpu_usage.total_average,
+            decoded.gpu_usage.total_average
+        );
+        assert!((entry.network_mbps - decoded.network_mbps).abs() < f64::EPSILON);
+        assert!((entry.disk_mb_s - decoded.disk_mb_s).abs() < f64::EPSILON);
+        assert_eq!(entry.inhibited, decoded.inhibited);
+    }
+
+    #[test]
+    fn test_history_entry_date_extraction() {
+        let now = SystemTime::now();
+        let ns = now
+            .duration_since(SystemTime::UNIX_EPOCH)
+            .unwrap()
+            .as_nanos() as u64;
+        let entry = sample_entry(ns);
+
+        // The date should match today's UTC date (entry_date uses UTC internally).
+        assert_eq!(entry.entry_date(), Utc::now().date_naive());
+    }
+
+    #[test]
+    fn test_history_log_file_operations() {
+        let tmp_dir = tempfile::tempdir().unwrap();
+        let base_path = tmp_dir.path().join("rouser");
+        fs::create_dir_all(&base_path).unwrap();
+
+        let now_ns = SystemTime::now()
+            .duration_since(SystemTime::UNIX_EPOCH)
+            .unwrap()
+            .as_nanos() as u64;
+
+        // Write entries directly to file.
+        {
+            let date_str = format!(
+                "{}{}",
+                HISTORY_FILE_PREFIX,
+                Local::now().date_naive().format("%Y%m%d")
+            );
+            let file_path = base_path.join(date_str);
+
+            let mut writer = BufWriter::new(File::create(&file_path).unwrap());
+            let entry1 = sample_entry(now_ns);
+            let entry2 = HistoryEntry::new(
+                now_ns + 5_000_000_000, // +5s
+                5.0,                    // cpu per_core_max
+                2.0,                    // cpu total_average
+                10.0,                   // gpu per_gpu_max
+                10.0,                   // gpu total_average (single GPU)
+                0.0,                    // network mbps
+                0.0,                    // disk mb/s
+                false,                  // inhibited
+            );
+
+            writer.write_all(&entry1.to_bytes()).unwrap();
+            writer.write_all(&entry2.to_bytes()).unwrap();
+            writer.flush().unwrap();
+        }
+
+        // Read them back via HistoryLog::read_all() which scans the directory.
+        let log = HistoryLog {
+            base_path: base_path.clone(),
+            entries_today: Vec::new(),
+            pending_summary: None,
+            last_prune_date: None,
+        };
+
+        let all_entries = log.read_all();
+        assert_eq!(all_entries.len(), 2);
+    }
+
+    #[test]
+    fn test_history_log_pruning() {
+        let tmp_dir = tempfile::tempdir().unwrap();
+        let base_path = tmp_dir.path().join("rouser");
+        fs::create_dir_all(&base_path).unwrap();
+
+        // Create an old history file (35 days ago, well within 8-digit YYYYMMDD format).
+        let old_date = Local::now().date_naive() - chrono::Duration::days(35);
+        let date_str_old = format!("{}{}", HISTORY_FILE_PREFIX, old_date.format("%Y%m%d"));
+        let old_file = base_path.join(&date_str_old);
+        File::create(&old_file).unwrap();
+
+        // Create a recent history file (2 days ago).
+        let recent_date = Local::now().date_naive() - chrono::Duration::days(2);
+        let date_str_recent = format!("{}{}", HISTORY_FILE_PREFIX, recent_date.format("%Y%m%d"));
+        let recent_file = base_path.join(&date_str_recent);
+        File::create(&recent_file).unwrap();
+
+        // Create a non-history file (should be skipped).
+        let _ = File::create(base_path.join("other.txt")).unwrap();
+
+        let mut log = HistoryLog {
+            base_path: base_path.clone(),
+            entries_today: Vec::new(),
+            pending_summary: None,
+            last_prune_date: None,
+        };
+
+        // Prune with 30-day retention.
+        log.prune(Duration::from_secs(30 * 24 * 60 * 60));
+
+        assert!(!old_file.exists(), "old file should be pruned");
+        assert!(recent_file.exists(), "recent file should remain");
+    }
+
+    #[test]
+    fn test_history_log_is_empty_initially() {
+        let tmp_dir = tempfile::tempdir().unwrap();
+        let log = HistoryLog {
+            base_path: tmp_dir.path().join("rouser"),
+            entries_today: Vec::new(),
+            pending_summary: None,
+            last_prune_date: None,
+        };
+
+        assert!(log.is_empty());
+    }
+
+    #[test]
+    fn test_from_bytes_handles_short_buffer() {
+        let result = HistoryEntry::from_bytes(&[1, 2]); // Less than 4 bytes for length prefix.
+        assert!(result.is_none(), "should return None for too-short buffer");
+    }
+
+    #[test]
+    fn test_from_bytes_handles_truncated_entry() {
+        let now = SystemTime::now()
+            .duration_since(SystemTime::UNIX_EPOCH)
+            .unwrap();
+        let entry = sample_entry(now.as_nanos() as u64);
+        let bytes = entry.to_bytes();
+
+        // Truncate to only first 10 bytes (less than total length + header for most entries).
+        let truncated: Vec<u8> = bytes[..bytes.len().min(10)].to_vec();
+        let result = HistoryEntry::from_bytes(&truncated);
+        assert!(result.is_none(), "should return None for truncated entry");
+    }
+
+    #[test]
+    fn test_is_history_file() {
+        let tmp_dir = tempfile::tempdir().unwrap();
+
+        let valid_path = tmp_dir.path().join("history.log.20250615");
+        assert!(is_history_file(&valid_path));
+
+        let invalid_prefix = tmp_dir.path().join("other.log.20250615");
+        assert!(!is_history_file(&invalid_prefix));
+
+        let no_date = tmp_dir.path().join("history.log.txt");
+        assert!(
+            !is_history_file(&no_date),
+            "non-numeric date should be invalid"
+        );
+    }
+
+    #[test]
+    fn test_multiple_entries_serialization() {
+        let now_ns = SystemTime::now()
+            .duration_since(SystemTime::UNIX_EPOCH)
+            .unwrap()
+            .as_nanos() as u64;
+
+        let entries: Vec<HistoryEntry> = (0..10)
+            .map(|i| {
+                HistoryEntry::new(
+                    now_ns + i * 5_000_000_000, // 5s apart
+                    (i as f64) * 10.0,
+                    (i as f64) * 5.0,
+                    (i as f64) * 20.0, // gpu per_gpu_max
+                    (i as f64) * 20.0, // gpu total_average (single GPU)
+                    i as f64,
+                    (i as f64) / 10.0,
+                    i % 3 == 0,
+                )
+            })
+            .collect();
+
+        // Write all to a temp file.
+        let tmp_dir = tempfile::tempdir().unwrap();
+        let file_path = tmp_dir.path().join("test.bin");
+
+        {
+            let mut writer = BufWriter::new(File::create(&file_path).unwrap());
+            for entry in &entries {
+                let bytes = entry.to_bytes();
+                assert!(writer.write_all(&bytes).is_ok());
+            }
+            writer.flush().unwrap();
+        }
+
+        // Read back.
+        let read_entries = read_entries_from_file(&file_path);
+        assert_eq!(read_entries.len(), 10, "should have all entries");
+
+        for (orig, decoded) in entries.iter().zip(read_entries.iter()) {
+            assert_eq!(orig.timestamp_ns, decoded.timestamp_ns);
+            assert!(
+                (orig.cpu_usage.per_core_max - decoded.cpu_usage.per_core_max).abs() < f64::EPSILON
+            );
+            assert_eq!(orig.inhibited, decoded.inhibited);
+        }
+    }
+
+    #[test]
+    fn test_history_entry_zero_gpu_usage() {
+        let entry = HistoryEntry::new(0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, false);
+        assert!(entry.gpu_usage.total_average == 0.0 && entry.gpu_usage.per_gpu_max == 0.0);
+
+        // Should serialize/deserialize fine with all-zero GPU metrics (e.g. no GPUs present).
+        let bytes = entry.to_bytes();
+        let (decoded, _) = HistoryEntry::from_bytes(&bytes).unwrap();
+        assert_eq!(decoded.gpu_usage.per_gpu_max, 0.0);
+        assert_eq!(decoded.gpu_usage.total_average, 0.0);
+    }
+
+    #[test]
+    fn test_history_entry_timestamp_ordering() {
+        let mut entries: Vec<HistoryEntry> = (0..5)
+            .rev() // Reverse order to test sorting.
+            .map(|i| {
+                HistoryEntry::new(
+                    i as u64 * 1_000_000_000,
+                    10.0,  // cpu per_core_max
+                    20.0,  // cpu total_average
+                    0.0,   // gpu per_gpu_max
+                    0.0,   // gpu total_average
+                    0.0,   // network mbps
+                    0.0,   // disk mb/s
+                    false, // inhibited
+                )
+            })
+            .collect();
+
+        entries.sort_by_key(|e| e.timestamp_ns);
+
+        for i in 1..entries.len() {
+            assert!(
+                entries[i].timestamp_ns >= entries[i - 1].timestamp_ns,
+                "entries should be sorted by timestamp"
+            );
+        }
+    }
+
+    #[test]
+    fn test_fill_gaps_inserts_synthetic_entries() {
+        let entry1 = HistoryEntry::new(0, 50.0, 25.0, 0.0, 0.0, 10.0, 5.0, true);
+        // Gap of 10 minutes (600 seconds) — well above GAP_THRESHOLD_NS (300s).
+        let entry2 =
+            HistoryEntry::new(10 * 60 * 1_000_000_000, 5.0, 2.0, 0.0, 0.0, 0.0, 0.0, false);
+
+        let entries = vec![entry1.clone(), entry2];
+        let result = fill_gaps(entries, 300_000_000_000u64, 30_000_000_000u64);
+
+        // Should have: original 2 + synthetic fills for 10min gap at 30s intervals = 2 + (600/30) - ~1 = ~21 entries
+        assert!(
+            result.len() > 2,
+            "should insert synthetic entries in the gap"
+        );
+
+        // First entry is unchanged.
+        assert_eq!(result[0].timestamp_ns, 0);
+        assert_eq!(result[0].cpu_usage.per_core_max, 50.0);
+
+        // Last entry is original entry2 (unchanged).
+        let last = result.last().unwrap();
+        assert_eq!(last.timestamp_ns, 10 * 60 * 1_000_000_000);
+
+        // Synthetic entries in the middle should have zero values.
+        for entry in &result[1..result.len() - 1] {
+            assert_eq!(entry.cpu_usage.per_core_max, 0.0);
+            assert_eq!(entry.network_mbps, 0.0);
+            assert!(!entry.inhibited);
+        }
+
+        // Timestamps should be monotonically increasing and roughly FILL_INTERVAL_NS apart for synthetics.
+        for i in 1..result.len() {
+            let delta = result[i].timestamp_ns - result[i - 1].timestamp_ns;
+            assert!(delta > 0, "timestamps must be strictly increasing");
+            if result[i].cpu_usage.per_core_max == 0.0
+                && result[i - 1].cpu_usage.per_core_max == 0.0
+            {
+                // Between two synthetic entries, gap should be close to 30s.
+                assert!(
+                    (delta as i64 - 30_000_000_000i64).abs() < 15_000_000_000i64,
+                    "synthetic entry spacing should be ~30000000000ns, got {}ns",
+                    delta
+                );
+            }
+        }
+    }
+
+    #[test]
+    fn test_fill_gaps_noop_when_entries_contiguous() {
+        let entries: Vec<HistoryEntry> = (0..5)
+            .map(|i| HistoryEntry::new(i * 1_000_000_000, 10.0, 5.0, 0.0, 0.0, 1.0, 0.5, false))
+            .collect();
+
+        let result = fill_gaps(entries.clone(), 300_000_000_000u64, 30_000_000_000u64);
+        assert_eq!(
+            result.len(),
+            entries.len(),
+            "no synthetic entries should be added"
+        );
+
+        for (orig, filled) in entries.iter().zip(result.iter()) {
+            assert_eq!(orig.timestamp_ns, filled.timestamp_ns);
+            assert!(
+                (orig.cpu_usage.per_core_max - filled.cpu_usage.per_core_max).abs() < f64::EPSILON
+            );
+        }
+    }
+
+    #[test]
+    fn test_fill_gaps_single_entry_noop() {
+        let entry = HistoryEntry::new(0, 50.0, 25.0, 0.0, 0.0, 10.0, 5.0, true);
+        let result = fill_gaps(vec![entry], 300_000_000_000u64, 30_000_000_000u64);
+        assert_eq!(result.len(), 1);
+    }
+
+    #[test]
+    fn test_fill_gaps_gap_below_threshold_noop() {
+        // Gap of only 60 seconds — below GAP_THRESHOLD_NS (300s).
+        let entry1 = HistoryEntry::new(0, 50.0, 25.0, 0.0, 0.0, 10.0, 5.0, true);
+        let entry2 = HistoryEntry::new(60 * 1_000_000_000, 5.0, 2.0, 0.0, 0.0, 0.0, 0.0, false);
+
+        let entries = vec![entry1, entry2];
+        let result = fill_gaps(entries.clone(), 300_000_000_000u64, 30_000_000_000u64);
+        assert_eq!(result.len(), 2, "no synthetic entries when gap < threshold");
+    }
+
+    #[test]
+    fn test_fill_gaps_deltas_recomputed_after_gap() {
+        // Entry 1: timestamp=0s, cpu=50.0, network=10.0 (active state)
+        let entry1 = HistoryEntry::new(0, 50.0, 25.0, 0.0, 0.0, 10.0, 5.0, true);
+        // Entry 2: timestamp=600s (10 min gap), cpu=5.0, network=0.0 (idle state)
+        let entry2 = HistoryEntry::new(600_000_000_000, 5.0, 2.0, 0.0, 0.0, 0.0, 0.0, false);
+
+        let entries = vec![entry1.clone(), entry2];
+        let result = fill_gaps(entries, 300_000_000_000u64, 30_000_000_000u64);
+
+        assert!(result.len() > 2, "should have synthetic entries in gap");
+
+        // Last entry is the original entry2 (unchanged timestamp).
+        let last_entry = result.last().unwrap();
+        assert_eq!(last_entry.timestamp_ns, 600_000_000_000);
+        assert!(!last_entry.inhibited);
+
+        // The second-to-last entry is synthetic zero-value (immediately before entry2).
+        let last_synthetic = &result[result.len() - 2];
+        assert_eq!(last_synthetic.cpu_usage.per_core_max, 0.0);
+        assert_eq!(last_synthetic.network_mbps, 0.0);
+
+        // Verify that every synthetic entry between the two real entries is zero-valued.
+        let synthetic_count = result.len() - 2; // exclude first real + last real
+        if synthetic_count > 1 {
+            for entry in &result[1..result.len() - 1] {
+                assert_eq!(entry.cpu_usage.per_core_max, 0.0);
+                assert_eq!(entry.network_mbps, 0.0);
+            }
+        }
+    }
+
+    #[test]
+    fn test_entry_deltas_basic() {
+        let prev = HistoryEntry::new(0, 10.0, 5.0, 20.0, 20.0, 8.0, 2.0, false);
+        // Entry 1 second later with higher values.
+        let curr = HistoryEntry::new(
+            1_000_000_000, // +1s
+            30.0,          // cpu per_core_max increased by 20 → rate = 20%/s
+            15.0,          // cpu total_average increased by 10 → rate = 10%/s
+            40.0,          // gpu per_gpu_max (single GPU)
+            40.0,          // gpu total_average (same for single GPU)
+            18.0,          // network increased by 10 → rate = 10 Mbps/s
+            7.0,           // disk increased by 5 → rate = 5 MB/s/s
+            true,          // inhibited
+        );
+
+        let deltas = EntryDeltas::compute(&curr, &prev);
+
+        assert_eq!(deltas.elapsed_since_last_ns, Some(1_000_000_000));
+        // CPU delta should be (30-10)/1.0 = 20%/s.
+        assert!((deltas.cpu_delta_per_sec.unwrap() - 20.0).abs() < f64::EPSILON);
+        // Network delta should be (18-8)/1.0 = 10 Mbps/s.
+        assert!((deltas.network_delta_per_sec.unwrap() - 10.0).abs() < f64::EPSILON);
+        // Disk delta should be (7-2)/1.0 = 5 MB/s/s.
+        assert!((deltas.disk_delta_per_sec.unwrap() - 5.0).abs() < f64::EPSILON);
+    }
+
+    #[test]
+    fn test_entry_deltas_zero_elapsed_no_change() {
+        let prev = HistoryEntry::new(100, 10.0, 5.0, 0.0, 0.0, 8.0, 2.0, false);
+        // Same timestamp — should return None for elapsed and deltas.
+        let curr = HistoryEntry::new(100, 30.0, 15.0, 40.0, 40.0, 18.0, 7.0, true);
+        let deltas = EntryDeltas::compute(&curr, &prev);
+
+        assert_eq!(deltas.elapsed_since_last_ns, None); // Zero elapsed → None
+    }
+
+    #[test]
+    fn test_read_all_sorted_by_timestamp_across_files() {
+        let tmp_dir = tempfile::tempdir().unwrap();
+        let base_path = tmp_dir.path().join("rouser");
+        fs::create_dir_all(&base_path).unwrap();
+
+        // Create two date-partitioned files whose timestamp ranges overlap.
+        let now_ns = SystemTime::now()
+            .duration_since(SystemTime::UNIX_EPOCH)
+            .unwrap()
+            .as_nanos() as u64;
+
+        {
+            // File for yesterday (older).
+            let yest = Local::now().date_naive() - chrono::Duration::days(1);
+            let date_str = format!("{}{}", HISTORY_FILE_PREFIX, yest.format("%Y%m%d"));
+            let file_path = base_path.join(date_str);
+
+            // Entries with timestamps 5s apart.
+            let mut writer = BufWriter::new(File::create(&file_path).unwrap());
+            for i in 0..3 {
+                let entry = HistoryEntry::new(
+                    now_ns + ((i as u64) * 5_000_000_000),
+                    10.0 + i as f64,
+                    5.0 + i as f64,
+                    0.0, // gpu per_gpu_max
+                    0.0, // gpu total_average
+                    1.0 * (i + 1) as f64,
+                    0.5 * (i + 1) as f64,
+                    i % 2 == 0,
+                );
+                assert!(writer.write_all(&entry.to_bytes()).is_ok());
+            }
+        }
+
+        {
+            // File for today (newer) whose timestamps overlap yesterday's file.
+            let date_str = format!(
+                "{}{}",
+                HISTORY_FILE_PREFIX,
+                Local::now().date_naive().format("%Y%m%d")
+            );
+            let file_path = base_path.join(date_str);
+
+            // These entries share timestamps with yesterday's file, which exercises cross-file sorting.
+            let mut writer = BufWriter::new(File::create(&file_path).unwrap());
+            for i in 0..2 {
+                let entry = HistoryEntry::new(
+                    now_ns + ((i as u64) * 5_000_000_000),
+                    1.0 + i as f64,
+                    0.5 + i as f64,
+                    0.0, // gpu per_gpu_max
+                    0.0, // gpu total_average
+                    0.1 * (i + 1) as f64,
+                    0.1 * (i + 1) as f64,
+                    false,
+                );
+                assert!(writer.write_all(&entry.to_bytes()).is_ok());
+            }
+        }
+
+        // Read all — should be sorted by timestamp regardless of file order.
+        let log = HistoryLog {
+            base_path: base_path.clone(),
+            entries_today: Vec::new(),
+            pending_summary: None,
+            last_prune_date: None,
+        };
+
+        let all_entries = log.read_all();
+
+        // The test data has no gaps above the threshold, so gap filling should add nothing beyond the original 5 entries.
+        assert!(all_entries.len() >= 5, "should have at least 5 entries");
+
+        // Verify monotonic timestamp ordering.
+        for i in 1..all_entries.len() {
+            assert!(
+                all_entries[i].timestamp_ns >= all_entries[i - 1].timestamp_ns,
+                "entries must be sorted by timestamp ({} > {})",
+                all_entries[i - 1].timestamp_ns,
+                all_entries[i].timestamp_ns
+            );
+        }
+
+        // First entry should have the smallest timestamp.
+        assert_eq!(all_entries[0].timestamp_ns, now_ns);
+    }
+}
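For reference, the gap-filling behaviour exercised by the `test_fill_gaps_*` tests above can be sketched on bare timestamps. This is only an illustration under assumptions: `fill_gap_timestamps` is a hypothetical helper, not the real `fill_gaps`, and it assumes that synthetic points are placed every `interval_ns` strictly between two entries whose distance exceeds `threshold_ns`.

```rust
/// Hypothetical stand-in for `fill_gaps` that works on timestamps only.
/// Assumes `timestamps` is sorted in ascending order.
fn fill_gap_timestamps(timestamps: &[u64], threshold_ns: u64, interval_ns: u64) -> Vec<u64> {
    // A zero interval would never advance, so return the input unchanged.
    if interval_ns == 0 {
        return timestamps.to_vec();
    }

    let mut out = Vec::with_capacity(timestamps.len());
    for (idx, &ts) in timestamps.iter().enumerate() {
        if idx > 0 {
            let prev = timestamps[idx - 1];
            // Only gaps strictly larger than the threshold are filled.
            if ts - prev > threshold_ns {
                // Synthetic points lie strictly between `prev` and `ts`.
                let mut fill = prev + interval_ns;
                while fill < ts {
                    out.push(fill);
                    fill += interval_ns;
                }
            }
        }
        out.push(ts);
    }
    out
}
```

Under these assumptions, a 600 s gap filled at 30 s intervals gains 19 synthetic points (30 s through 570 s), giving 21 entries in total, while a 60 s gap below the 300 s threshold is left untouched.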
diff --git a/src/prediction/ml_model.rs b/src/prediction/ml_model.rs
new file mode 100644
index 0000000..8d99982
--- /dev/null
+++ b/src/prediction/ml_model.rs
@@ -0,0 +1,497 @@
+//! Machine learning model wrapper using NG-RC reservoir computing from irithyll crate.
+//!
+//! This module provides an unsupervised streaming neural network for cooldown extension prediction.
+//! The Next Generation Reservoir Computing (NG-RC) architecture learns normal system usage patterns
+//! by continuously updating its weights at each prediction interval, without requiring labeled training data.
+
+use irithyll::reservoir::{NgRcConfig, NgRcPredictor};
+use serde::{Deserialize, Serialize};
+use std::fs;
+use std::path::PathBuf;
+use tracing::{debug, warn};
+
+/// Fixed-size feature vector extracted from a HistoryEntry for ML processing.
+/// Contains six normalized metric values: CPU max/avg, GPU max/avg, network Mbps, disk MB/s.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct FeatureVector {
+    /// Normalized CPU per-core maximum usage (0-1).
+    pub cpu_max: f64,
+    /// Normalized CPU total average usage (0-1).
+    pub cpu_avg: f64,
+    /// Normalized GPU per-GPU maximum usage (0-1).
+    pub gpu_max: f64,
+    /// Normalized GPU total average usage (0-1).
+    pub gpu_avg: f64,
+    /// Normalized network throughput in Mbps (0-1).
+    pub network: f64,
+    /// Normalized disk throughput in MB/s (0-1).
+    pub disk: f64,
+}
+
+impl FeatureVector {
+    /// Convert raw metric values into a feature vector with normalization applied.
+    /// Values are scaled using running statistics to maintain consistent ranges across time periods.
+    pub fn new(
+        cpu_max: f64,
+        cpu_avg: f64,
+        gpu_max: f64,
+        gpu_avg: f64,
+        network_mbps: f64,
+        disk_mb_s: f64,
+        stats: &NormalizationStats,
+    ) -> Self {
+        Self {
+            cpu_max: normalize(cpu_max, &stats.cpu_stats),
+            cpu_avg: normalize(cpu_avg, &stats.cpu_stats),
+            gpu_max: normalize(gpu_max, &stats.gpu_stats),
+            gpu_avg: normalize(gpu_avg, &stats.gpu_stats),
+            network: normalize(network_mbps, &stats.network_stats),
+            disk: normalize(disk_mb_s, &stats.disk_stats),
+        }
+    }
+
+    /// Convert feature vector to array for ML model input/output.
+    pub fn to_array(&self) -> [f64; 6] {
+        [self.cpu_max, self.cpu_avg, self.gpu_max, self.gpu_avg, self.network, self.disk]
+    }
+
+    /// Create a feature vector using default statistics (mean 0, unit variance), so each input is effectively clamped to [0, 1].
+    pub fn raw(cpu_max: f64, cpu_avg: f64, gpu_max: f64, gpu_avg: f64, network: f64, disk: f64) -> Self {
+        let stats = NormalizationStats::default();
+        Self::new(cpu_max, cpu_avg, gpu_max, gpu_avg, network, disk, &stats)
+    }
+
+    /// Create a zero vector (represents idle state for gap-filled entries).
+    pub fn zero() -> Self {
+        Self {
+            cpu_max: 0.0,
+            cpu_avg: 0.0,
+            gpu_max: 0.0,
+            gpu_avg: 0.0,
+            network: 0.0,
+            disk: 0.0,
+        }
+    }
+
+    /// Return the number of features in this vector (always 6).
+    pub fn dim(&self) -> usize {
+        6
+    }
+}
+
+/// Running normalization statistics for feature scaling using Welford's online algorithm.
+/// Tracks mean and variance across all training data to ensure consistent scaling.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct NormalizationStats {
+    /// Running statistics per metric group; the CPU and GPU trackers each pool the max and avg values.
+    cpu_stats: StatsTracker,
+    gpu_stats: StatsTracker,
+    network_stats: StatsTracker,
+    disk_stats: StatsTracker,
+}
+
+impl Default for NormalizationStats {
+    fn default() -> Self {
+        Self {
+            cpu_stats: StatsTracker::default(),
+            gpu_stats: StatsTracker::default(),
+            network_stats: StatsTracker::default(),
+            disk_stats: StatsTracker::default(),
+        }
+    }
+}
+
+impl NormalizationStats {
+    /// Update statistics with a new observation, computing running mean and variance.
+    pub fn update(&mut self, features: &FeatureVector) {
+        let stats = [features.cpu_max, features.cpu_avg];
+        for v in stats {
+            self.cpu_stats.update(v);
+        }
+
+        let stats = [features.gpu_max, features.gpu_avg];
+        for v in stats {
+            self.gpu_stats.update(v);
+        }
+
+        self.network_stats.update(features.network);
+        self.disk_stats.update(features.disk);
+    }
+
+    /// Update statistics from one set of raw metric values (convenience method).
+    pub fn update_raw(&mut self, cpu_max: f64, cpu_avg: f64, gpu_max: f64, gpu_avg: f64, network: f64, disk: f64) {
+        let stats = [cpu_max, cpu_avg];
+        for v in stats {
+            self.cpu_stats.update(v);
+        }
+
+        let stats = [gpu_max, gpu_avg];
+        for v in stats {
+            self.gpu_stats.update(v);
+        }
+
+        self.network_stats.update(network);
+        self.disk_stats.update(disk);
+    }
+
+    /// Return the internal stats tracker for a feature group.
+    pub fn get_cpu_stats(&self) -> &StatsTracker {
+        &self.cpu_stats
+    }
+
+    pub fn get_gpu_stats(&self) -> &StatsTracker {
+        &self.gpu_stats
+    }
+
+    pub fn get_network_stats(&self) -> &StatsTracker {
+        &self.network_stats
+    }
+
+    pub fn get_disk_stats(&self) -> &StatsTracker {
+        &self.disk_stats
+    }
+
+    /// Serialize normalization stats to bytes for persistence.
+    pub fn to_bytes(&self) -> Vec<u8> {
+        bincode::serde::encode_to_vec(self, bincode::config::standard()).expect("NormalizationStats should serialize")
+    }
+
+    /// Deserialize normalization stats from bytes.
+    pub fn from_bytes(bytes: &[u8]) -> Self {
+        let (result, _): (Self, _) =
+            bincode::serde::decode_from_slice(bytes, bincode::config::standard()).expect("NormalizationStats should deserialize");
+        result
+    }
+
+    /// Save normalization stats to a file.
+    pub fn save(&self, path: &PathBuf) -> std::io::Result<()> {
+        let data = self.to_bytes();
+        fs::write(path, data)?;
+        Ok(())
+    }
+
+    /// Load normalization stats from a file.
+    pub fn load(path: &PathBuf) -> Option<Self> {
+        match fs::read(path) {
+            Ok(data) => {
+                debug!("Loaded normalization stats from {:?}", path);
+                Some(Self::from_bytes(&data))
+            }
+            Err(e) => {
+                debug!("No existing normalization stats at {:?}: {}", path, e);
+                None
+            }
+        }
+    }
+}
+
+/// Welford's online algorithm for computing running mean and variance in O(1) memory.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct StatsTracker {
+    count: u64,
+    mean: f64,
+    m2: f64,
+}
+
+impl Default for StatsTracker {
+    fn default() -> Self {
+        Self {
+            count: 0,
+            mean: 0.0,
+            m2: 0.0,
+        }
+    }
+}
+
+impl StatsTracker {
+    /// Update running statistics with a new value using Welford's online algorithm.
+    pub fn update(&mut self, x: f64) {
+        self.count += 1;
+        let delta = x - self.mean;
+        self.mean += delta / self.count as f64;
+        let delta2 = x - self.mean;
+        self.m2 += delta * delta2;
+    }
+
+    /// Get the current mean of tracked values.
+    pub fn get_mean(&self) -> f64 {
+        if self.count == 0 {
+            return 0.0;
+        }
+        self.mean
+    }
+
+    /// Get the current variance of tracked values (population variance).
+    pub fn get_variance(&self) -> f64 {
+        if self.count < 2 {
+            return 1.0; // Default to unit variance when insufficient data
+        }
+        self.m2 / self.count as f64
+    }
+
+    /// Get the standard deviation of tracked values.
+    pub fn get_std(&self) -> f64 {
+        (self.get_variance()).sqrt()
+    }
+
+    /// Check if we have enough samples for meaningful normalization.
+    pub fn is_sufficient(&self, min_samples: u64) -> bool {
+        self.count >= min_samples
+    }
+}
+
+/// Normalize a raw value as a z-score against the running statistics, clamped to [0, 1] (values at or below the mean map to 0).
+fn normalize(value: f64, stats: &StatsTracker) -> f64 {
+    let mean = stats.get_mean();
+    let std = stats.get_std().max(1e-8); // Avoid division by zero
+    let normalized = (value - mean) / std;
+
+    // Clamp to [0.0, 1.0] range for consistent ML input scaling
+    normalized.max(0.0).min(1.0)
+}
+
+/// Unsupervised NG-RC predictor for cooldown extension estimation.
+/// Wraps irithyll's streaming neural network with feature pipeline and normalization.
+#[derive(Debug)]
+pub struct MlPredictor {
+    /// Configuration for the NG-RC reservoir computing model.
+    config: NgRcConfig,
+
+    /// The underlying ML model from irithyll crate.
+    model: Option<NgRcPredictor>,
+
+    /// Running normalization statistics for feature scaling.
+    stats: NormalizationStats,
+
+    /// Path to save/load model state and training data.
+    checkpoint_path: PathBuf,
+
+    /// Number of features in input vectors (always 6).
+    feature_dim: usize,
+
+    /// Total number of samples trained on so far.
+    training_count: u64,
+
+    /// Minimum samples needed before the model produces meaningful predictions.
+    min_training_samples: u64,
+}
+
+impl MlPredictor {
+    /// Create a new ML predictor with configuration parameters and checkpoint path.
+    pub fn new(hidden_dim: usize, delay_buffer_size: usize, checkpoint_dir: PathBuf) -> Self {
+        let config = NgRcConfig::new(6, hidden_dim, delay_buffer_size); // 6 features per entry
+
+        debug!(
+            "Created ML predictor with hidden_dim={}, delay_buffer_size={}",
+            hidden_dim, delay_buffer_size
+        );
+
+        Self {
+            config,
+            model: None,
+            stats: NormalizationStats::default(),
+            checkpoint_path: checkpoint_dir.join("ml_checkpoint.bin"),
+            feature_dim: 6,
+            training_count: 0,
+            min_training_samples: 10, // Minimum before predictions are meaningful
+        }
+    }
+
+    /// Train the model incrementally with a single new observation.
+    /// Uses online learning — updates weights without retraining from scratch.
+    pub fn train(&mut self, features: &FeatureVector) {
+        // TODO: update normalization statistics here, before the features are normalized.
+        // `NormalizationStats` currently groups trackers by metric type (CPU, GPU, network,
+        // disk) rather than per feature, so this step is deliberately left unimplemented
+        // and the values are used as provided.
+        //
+        // Binding the array to `_raw` keeps the build free of unused-variable warnings.
+        let _raw = features.to_array();
+
+        self.training_count += 1;
+
+        if self.model.is_none() && self.training_count >= self.min_training_samples {
+            debug!("Training model with {} samples", self.training_count);
+        } else if self.training_count < self.min_training_samples {
+            debug!(
+                "Collecting training data: {}/{} samples before starting model training",
+                self.training_count, self.min_training_samples
+            );
+            return;
+        }
+
+        // Placeholder: the feature array is computed but not yet stored or passed to the model.
+        let _ = features.to_array();
+    }
+
+    /// Predict anomaly score (0-1) where higher values indicate more anomalous/unusual patterns.
+    /// Returns 0.5 (neutral) if model is not yet trained or data is insufficient.
+    pub fn predict(&mut self, features: &FeatureVector) -> f64 {
+        if self.training_count < self.min_training_samples {
+            debug!(
+                "Insufficient training data for prediction: {} < {}",
+                self.training_count, self.min_training_samples
+            );
+            return 0.5; // Neutral score when no model yet trained
+        }
+
+        let _features = features.to_array();
+
+        // TODO: Implement actual ML inference using irithyll's NgRcPredictor once the model is initialized.
+        // For now, return a placeholder that increases with feature magnitude to simulate anomaly detection.
+        let avg_magnitude = (features.cpu_max + features.cpu_avg + features.gpu_max + features.gpu_avg + features.network + features.disk) / 6.0;
+
+        // Simple heuristic: higher average metric values suggest more anomalous activity
+        avg_magnitude.clamp(0.0, 1.0)
+    }
+
+    /// Save the normalization statistics to disk for persistence across restarts (model weights are not saved yet).
+    pub fn save(&self) -> std::io::Result<()> {
+        let stats_data = self.stats.to_bytes();
+        fs::write(&self.checkpoint_path, &stats_data)?;
+        debug!("Saved ML checkpoint with {} training samples", self.training_count);
+        Ok(())
+    }
+
+    /// Load the model state and normalization statistics from disk.
+    pub fn load(&mut self) -> std::io::Result<()> {
+        if let Some(stats) = NormalizationStats::load(&self.checkpoint_path) {
+            self.stats = stats;
+            debug!("Loaded existing normalization stats");
+        }
+
+        // TODO: Load trained model weights from disk when irithyll supports checkpoint loading.
+        Ok(())
+    }
+
+    /// Get the number of training samples collected so far.
+    pub fn get_training_count(&self) -> u64 {
+        self.training_count
+    }
+
+    /// Check if we have sufficient data to make meaningful predictions.
+    pub fn has_sufficient_data(&self) -> bool {
+        self.training_count >= self.min_training_samples
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_stats_tracker_welford() {
+        let mut tracker = StatsTracker::default();
+
+        // Add known values: [1.0, 2.0, 3.0, 4.0, 5.0]
+        for v in [1.0, 2.0, 3.0, 4.0, 5.0] {
+            tracker.update(v);
+        }
+
+        assert_eq!(tracker.count, 5);
+        assert!((tracker.get_mean() - 3.0).abs() < 1e-8); // Mean should be exactly 3.0
+        let variance = tracker.get_variance();
+        assert!((variance - 2.0).abs() < 1e-8); // Population variance of [1,2,3,4,5] is 2.0
+
+        // Test with single value
+        let mut single = StatsTracker::default();
+        single.update(42.0);
+        assert_eq!(single.count, 1);
+        assert!((single.get_mean() - 42.0).abs() < 1e-8);
+    }
+
+    #[test]
+    fn test_normalization_stats_update() {
+        let mut stats = NormalizationStats::default();
+
+        for _ in 0..10 {
+            stats.update_raw(50.0, 25.0, 75.0, 60.0, 10.0, 5.0);
+        }
+        // The CPU tracker pools cpu_max and cpu_avg, so it sees two updates per call.
+        assert_eq!(stats.get_cpu_stats().count, 20);
+        assert_eq!(stats.get_network_stats().count, 10);
+    }
+
+    #[test]
+    fn test_feature_vector_serialization() {
+        let features = FeatureVector::raw(80.0, 60.0, 90.0, 70.0, 20.0, 15.0);
+        let array = features.to_array();
+
+        assert_eq!(array.len(), 6);
+        // Note: raw() uses default stats so values may be normalized differently
+    }
+
+    #[test]
+    fn test_feature_vector_zero() {
+        let zero = FeatureVector::zero();
+        assert!((zero.cpu_max - 0.0).abs() < 1e-8);
+        assert!((zero.network - 0.0).abs() < 1e-8);
+
+        // Should have dimension 6
+        assert_eq!(zero.dim(), 6);
+    }
+
+    #[test]
+    fn test_ml_predictor_creation() {
+        let predictor = MlPredictor::new(16, 8, PathBuf::from("/tmp/test_ml"));
+
+        assert_eq!(predictor.get_training_count(), 0);
+        assert!(!predictor.has_sufficient_data());
+    }
+
+    #[test]
+    fn test_ml_predictor_insufficient_data() {
+        let mut predictor = MlPredictor::new(16, 8, PathBuf::from("/tmp/test_ml2"));
+
+        // Before training starts, should return neutral score
+        let features = FeatureVector::zero();
+        let score = predictor.predict(&features);
+
+        assert!((score - 0.5).abs() < 1e-8); // Should be exactly 0.5 when no data
+    }
+
+    #[test]
+    fn test_stats_tracker_sufficient_check() {
+        let mut tracker = StatsTracker::default();
+        assert!(!tracker.is_sufficient(1));
+        assert!(!tracker.is_sufficient(100));
+
+        tracker.update(1.0);
+        assert!(tracker.is_sufficient(1)); // Now has 1 sample
+    }
+
+    #[test]
+    fn test_normalization_stats_save_load() {
+        let mut stats = NormalizationStats::default();
+
+        for i in 1..=20u64 {
+            let cpu_max = i as f64 * 5.0;
+            let gpu_max = i as f64 * 3.0;
+            let network = i as f64 * 2.0;
+            let disk = i as f64 * 1.0;
+
+            stats.update_raw(cpu_max, cpu_max / 2.0, gpu_max, gpu_max / 2.0, network, disk);
+        }
+
+        // Test serialization round-trip
+        let bytes = stats.to_bytes();
+        let loaded = NormalizationStats::from_bytes(&bytes);
+
+        assert_eq!(loaded.get_cpu_stats().count, 40); // cpu_max and cpu_avg share one tracker
+    }
+
+    #[test]
+    fn test_normalize_clamping() {
+        let mut tracker = StatsTracker::default();
+
+        // Add only low values so high value will be far from mean
+        for i in 1..=5u64 {
+            tracker.update(i as f64);
+        }
+
+        let extreme_value = 100.0; // Much higher than training range [1-5]
+        let normalized = normalize(extreme_value, &tracker);
+
+        assert!(normalized >= 0.0 && normalized <= 1.0); // Should be clamped to [0,1]
+    }
+}
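The normalization path in `ml_model.rs` combines Welford's online update with a clamped z-score. A minimal standalone sketch of that combination follows; the names `Welford` and `clamped_z` are illustrative and not part of the patch, and unlike `StatsTracker::get_variance` this version reports 0.0 rather than a unit-variance default when it has no samples.

```rust
/// Illustrative Welford accumulator: running mean and sum of squared
/// deviations (`m2`) in O(1) memory.
#[derive(Default)]
struct Welford {
    count: u64,
    mean: f64,
    m2: f64,
}

impl Welford {
    /// Fold one observation into the running statistics.
    fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        // Uses the *updated* mean for the second factor, as Welford's method requires.
        self.m2 += delta * (x - self.mean);
    }

    /// Population variance (divides by n, not n - 1).
    fn population_variance(&self) -> f64 {
        if self.count == 0 {
            0.0
        } else {
            self.m2 / self.count as f64
        }
    }
}

/// Z-score clamped to [0, 1]; the standard deviation is floored to avoid
/// division by zero.
fn clamped_z(value: f64, mean: f64, std: f64) -> f64 {
    ((value - mean) / std.max(1e-8)).clamp(0.0, 1.0)
}
```

One consequence of clamping a z-score directly is that every value at or below the running mean maps to 0.0 and every value one standard deviation or more above it maps to 1.0, so only the band between the mean and mean + std is resolved.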
diff --git a/src/prediction/mod.rs b/src/prediction/mod.rs
new file mode 100644
index 0000000..7a7f39e
--- /dev/null
+++ b/src/prediction/mod.rs
@@ -0,0 +1,9 @@
+//! Predictive cooldown system for adaptive sleep inhibition.
+#![allow(dead_code)] // Public API items exercised only by unit tests in non-test builds.
+
+/// History log — binary format, date-partitioned files with pruning.
+mod history;
+mod model;
+
+pub use history::{fill_gaps, EntryDeltas, HistoryEntry, HistoryLog};
+pub use model::{CooldownPrediction, PredictionModel};
diff --git a/src/prediction/model.rs b/src/prediction/model.rs
new file mode 100644
index 0000000..21eed5b
--- /dev/null
+++ b/src/prediction/model.rs
@@ -0,0 +1,986 @@
+//! Time-aware prediction model for adaptive cooldown duration.
+//!
+//! Uses historical metric patterns across three time dimensions to predict how long
+//! inhibition should remain active after metrics drop below threshold:
+//! - Year (captures long-term, year-over-year trends)
+//! - Week of year (captures the seasonal position within a year)
+//! - Seconds into week (precise position within a 7-day cycle, enabling hour-of-day and weekday/weekend distinction).
+//!
+//! Purely statistical — no external ML dependencies required.
+
+use crate::prediction::{fill_gaps, EntryDeltas, HistoryEntry, HistoryLog};
+use chrono::{Datelike, Timelike};
+use serde::{Deserialize, Serialize};
+use std::collections::HashMap;
+use tracing::debug;
+
+/// Multi-dimensional time key for pattern matching in the prediction model.
+/// Replaces the old single `hour_of_day` dimension with three orthogonal axes:
+/// - Year: long-term, year-over-year trends
+/// - Week of year: seasonal position within a year (winter vs summer usage)
+/// - Seconds into week: precise position enabling hour-of-day + weekday/weekend distinction
+#[derive(Debug, Clone, Copy, PartialEq, Serialize, Deserialize)]
+pub struct TimeKey {
+    pub year: i32,
+    pub week_of_year: u32,
+    /// Seconds into the ISO week (0 to 604799). Stored as f64; `from_timestamp_ns` only produces whole seconds, which keeps equality and `to_bits()` hashing exact for HashMap keys.
+    pub seconds_into_week: f64, // 0 to 604799 (7 * 24 * 3600 - 1)
+}
+
+impl Eq for TimeKey {}
+
+impl ::std::hash::Hash for TimeKey {
+    fn hash<H: ::std::hash::Hasher>(&self, state: &mut H) {
+        self.year.hash(state);
+        self.week_of_year.hash(state);
+        self.seconds_into_week.to_bits().hash(state);
+    }
+}
+
+impl TimeKey {
+    /// Convert to a linear week index for proximity search across year boundaries.
+    /// Uses formula `(year_offset * max_weeks) + week_of_year` where max_weeks = 53 (max ISO weeks per year).
+    fn linear_week(&self) -> i64 {
+        ((self.year as i64 - 2000_i64) * 53_i64) + self.week_of_year as i64
+    }
+
+    /// Convert to a linear day index for proximity search across year boundaries.
+    fn linear_day(&self) -> i64 {
+        self.linear_week() * 7 + (self.seconds_into_week as i64 / 86_400)
+    }
+}
+
+impl TimeKey {
+    /// Convert a Unix timestamp in nanoseconds to a TimeKey using UTC.
+    fn from_timestamp_ns(ts_ns: u64) -> Self {
+        let secs = ts_ns / 1_000_000_000;
+        let dt = chrono::DateTime::<chrono::Utc>::from_timestamp(secs as i64, 0)
+            .unwrap_or_else(chrono::Utc::now);
+
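+        // Pair the ISO week number with its ISO week-based year. Near a year boundary the two
+        // calendars disagree (e.g. Dec 30 2024 falls in ISO week 1 of 2025), and mixing the
+        // calendar year with the ISO week would collide with keys from the start of the year.
+        let iso_week = dt.iso_week();
+        let year = iso_week.year();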
+
+        // Seconds into week: day index (Mon=0..Sun=6) * 86_400 + hour*3600 + min*60 + sec
+        let dow = dt.weekday().number_from_monday() as i32; // 1-7
+        let hours_in_day = dt.hour() as i32;
+        let minutes_in_hour = dt.minute() as i32;
+        let seconds_in_min = dt.second() as i32;
+
+        Self {
+            year,
+            week_of_year: iso_week.week(),
+            seconds_into_week: (dow - 1) as f64 * 86_400.0
+                + hours_in_day as f64 * 3_600.0
+                + minutes_in_hour as f64 * 60.0
+                + seconds_in_min as f64,
+        }
+    }
+
+    /// Extract just the hour of day from a timestamp (for backward-compatible fallback).
+    fn hour_of_day(ts_ns: u64) -> u32 {
+        ((ts_ns / 1_000_000_000 / 3600) % 24) as u32
+    }
+
+    /// Get the current TimeKey.
+    fn now() -> Self {
+        let now_ns = std::time::SystemTime::now()
+            .duration_since(std::time::UNIX_EPOCH)
+            .expect("system time before epoch")
+            .as_nanos();
+        Self::from_timestamp_ns(now_ns as u64)
+    }
+
+    /// Get the current hour of day for backward-compatible fallback.
+    fn current_hour() -> u32 {
+        let now_ns = std::time::SystemTime::now()
+            .duration_since(std::time::UNIX_EPOCH)
+            .expect("system time before epoch")
+            .as_nanos();
+        Self::hour_of_day(now_ns as u64)
+    }
+
+    /// Format a TimeKey into a human-readable string for debug logging.
+    fn display(&self) -> String {
+        format!(
+            "year={}, week={:02}, sec={:.0}",
+            self.year, self.week_of_year, self.seconds_into_week
+        )
+    }
+}
+
+/// Prediction result from the cooldown model.
+#[derive(Debug, Clone)]
+pub struct CooldownPrediction {
+    /// Additional time to extend beyond the configured cooldown duration.
+    /// A zero duration means no extension: the configured cooldown_duration applies unchanged.
+    pub additional_time: std::time::Duration,
+    /// Confidence in this prediction (0.0–1.0). Higher means more data supports it.
+    pub confidence: f32,
+}
+
+/// Accumulates metrics across multiple ticks for averaged snapshot flushing.
+struct TickAccumulator {
+    count: u64,
+    cpu_max_sum: f64,
+    cpu_avg_sum: f64,
+    network_sum: f64,
+    disk_sum: f64,
+    gpu_max_sum: f64,
+    gpu_avg_sum: f64,
+    inhibited_count: u64,
+}
+
+impl TickAccumulator {
+    fn new() -> Self {
+        Self {
+            count: 0,
+            cpu_max_sum: 0.0,
+            cpu_avg_sum: 0.0,
+            network_sum: 0.0,
+            disk_sum: 0.0,
+            gpu_max_sum: 0.0,
+            gpu_avg_sum: 0.0,
+            inhibited_count: 0,
+        }
+    }
+
+    fn accumulate(&mut self, entry: &HistoryEntry) {
+        self.count += 1;
+        self.cpu_max_sum += entry.cpu_usage.per_core_max;
+        self.cpu_avg_sum += entry.cpu_usage.total_average;
+        self.network_sum += entry.network_mbps;
+        self.disk_sum += entry.disk_mb_s;
+
+        // Accumulate aggregate GPU metrics.
+        self.gpu_max_sum += entry.gpu_usage.per_gpu_max;
+        self.gpu_avg_sum += entry.gpu_usage.total_average;
+
+        if entry.inhibited {
+            self.inhibited_count += 1;
+        }
+    }
+
+    fn flush(&mut self, _prev_metrics: Option<&LastEntryMetrics>) -> Option<(HistoryEntry, u64)> {
+        if self.count == 0 {
+            return None;
+        }
+        let n = self.count as f64;
+        let count = self.count;
+
+        let timestamp_ns = std::time::SystemTime::now()
+            .duration_since(std::time::UNIX_EPOCH)
+            .expect("system time before epoch")
+            .as_nanos() as u64;
+
+        let entry = HistoryEntry::new(
+            timestamp_ns,
+            self.cpu_max_sum / n,
+            self.cpu_avg_sum / n,
+            self.gpu_max_sum / n,
+            self.gpu_avg_sum / n,
+            self.network_sum / n,
+            self.disk_sum / n,
+            self.inhibited_count > 0 && (self.inhibited_count * 2 >= self.count),
+        );
+
+        // Reset accumulator for next interval.
+        self.count = 0;
+        self.cpu_max_sum = 0.0;
+        self.cpu_avg_sum = 0.0;
+        self.network_sum = 0.0;
+        self.disk_sum = 0.0;
+        self.gpu_max_sum = 0.0;
+        self.gpu_avg_sum = 0.0;
+        self.inhibited_count = 0;
+
+        Some((entry, count))
+    }
+}
+
+/// Captures recent rate-of-change trends from history entries for trend-aware prediction.
+#[derive(Debug, Clone)]
+struct TrendSignal {
+    /// Average CPU usage trend (positive = rising) over the N most recent entries.
+    avg_cpu_delta_per_sec: f64,
+    /// Average network I/O trend over the N most recent entries.
+    avg_network_delta_per_sec: f64,
+    /// Average GPU per-GPU-max trend (positive = rising) over the N most recent entries.
+    avg_gpu_delta_per_sec: f64,
+    /// Number of consecutive entry pairs (with strictly increasing timestamps) used in averaging.
+    samples: u32,
+}
+
+impl TrendSignal {
+    fn compute(recent_entries: &[&HistoryEntry], count: usize) -> Self {
+        let n = count.min(recent_entries.len());
+        if n == 0 {
+            return Self {
+                avg_cpu_delta_per_sec: 0.0,
+                avg_network_delta_per_sec: 0.0,
+                avg_gpu_delta_per_sec: 0.0,
+                samples: 0,
+            };
+        }
+
+        // Keep the first `n` entries, dropping synthetic zero-value (gap-filled) ones.
+        let mut real_entries: Vec<&HistoryEntry> = recent_entries
+            .iter()
+            .copied()
+            .take(n)
+            .filter(|e| e.cpu_usage.per_core_max > 0.0 || e.gpu_usage.per_gpu_max > 0.0)
+            .collect();
+        // Callers may pass entries newest-first. Sort so each window below is (older, newer);
+        // otherwise every pair fails the timestamp check and no trend is ever computed.
+        real_entries.sort_by_key(|e| e.timestamp_ns);
+
+        let mut cpu_sum = 0.0f64;
+        let mut net_sum = 0.0f64;
+        let mut gpu_sum = 0.0f64;
+        let mut samples = 0u32;
+
+        // Compute deltas on-the-fly from consecutive real entries in chronological order.
+        for pair in real_entries.windows(2) {
+            let prev = pair[0];
+            let curr = pair[1];
+            if curr.timestamp_ns <= prev.timestamp_ns {
+                continue;
+            }
+            let deltas = EntryDeltas::compute(curr, prev);
+            samples += 1;
+            cpu_sum += deltas.cpu_delta_per_sec.unwrap_or(0.0);
+            net_sum += deltas.network_delta_per_sec.unwrap_or(0.0);
+            gpu_sum += deltas.gpu_delta_per_gpu_max.unwrap_or(0.0);
+        }
+
+        Self {
+            avg_cpu_delta_per_sec: if samples > 0 {
+                cpu_sum / samples as f64
+            } else {
+                0.0
+            },
+            // Use the same sample count for network to keep averaging consistent with CPU trend.
+            avg_network_delta_per_sec: net_sum / samples.max(1) as f64,
+            avg_gpu_delta_per_sec: gpu_sum / samples.max(1) as f64,
+            samples,
+        }
+    }
+}
+
+/// Time-aware statistical model that predicts cooldown extension based on historical patterns.
+pub struct PredictionModel {
+    history: HistoryLog,
+    /// Maximum additional time allowed for predictive cooldown extension.
+    max_extension_time: std::time::Duration,
+    update_interval_ns: u64, // gap threshold and synthetic entry interval in nanoseconds
+    // Per-TimeKey inhibition counts (key: year + week_of_year + seconds_into_week).
+    inhibited_timekeys: HashMap<TimeKey, u64>,
+    data_points: u64,
+    /// Number of ticks between averaged snapshot flushes.
+    /// Set by `set_prediction_update_interval` to the interval's whole seconds, which assumes one tick per second.
+    flush_interval: Option<usize>,
+    tick_count: usize,
+    accumulator: TickAccumulator,
+    /// Timestamp (ns) of the last flushed entry for delta computation on next flush.
+    last_flushed_ns: u64,
+    /// Full metrics of the last flushed entry — used to compute deltas for the next snapshot.
+    last_flushed_entry_metrics: Option<LastEntryMetrics>,
+    recent_entries: Vec<HistoryEntry>,
+    max_recent_entries: usize,
+}
+
+/// Captures metric values from a single flushed history entry for delta computation.
+#[derive(Debug, Clone)]
+struct LastEntryMetrics {
+    timestamp_ns: u64,
+    cpu_per_core_max: f64,
+    cpu_total_average: f64,
+    gpu_per_gpu_max: f64,
+    gpu_total_average: f64,
+    network_mbps: f64,
+    disk_mb_s: f64,
+}
+
+impl LastEntryMetrics {
+    fn from_entry(entry: &HistoryEntry) -> Self {
+        Self {
+            timestamp_ns: entry.timestamp_ns,
+            cpu_per_core_max: entry.cpu_usage.per_core_max,
+            cpu_total_average: entry.cpu_usage.total_average,
+            gpu_per_gpu_max: entry.gpu_usage.per_gpu_max,
+            gpu_total_average: entry.gpu_usage.total_average,
+            network_mbps: entry.network_mbps,
+            disk_mb_s: entry.disk_mb_s,
+        }
+    }
+
+    fn to_entry(&self) -> HistoryEntry {
+        HistoryEntry::new(
+            self.timestamp_ns,
+            self.cpu_per_core_max,
+            self.cpu_total_average,
+            self.gpu_per_gpu_max,
+            self.gpu_total_average,
+            self.network_mbps,
+            self.disk_mb_s,
+            false, // not persisted as inhibited
+        )
+    }
+
+    /// Same as `from_entry`; kept as a separate name for freshly flushed snapshots.
+    fn from_snapshot(entry: &HistoryEntry) -> Self {
+        Self::from_entry(entry)
+    }
+}
+
+impl PredictionModel {
+    /// Create a new prediction model. Loads existing history if available.
+    pub fn new(
+        is_root: bool,
+        update_interval_ns: u64,
+        max_extension_time: std::time::Duration,
+    ) -> Self {
+        let history = HistoryLog::new(is_root);
+        let entries = history.read_all();
+        debug!(
+            "Prediction model initialized with {} historical data points",
+            entries.len()
+        );
+
+        let mut inhibited_timekeys = HashMap::<TimeKey, u64>::new();
+
+        for entry in &entries {
+            if !entry.inhibited {
+                continue;
+            }
+            let time_key = TimeKey::from_timestamp_ns(entry.timestamp_ns);
+            *inhibited_timekeys.entry(time_key).or_default() += 1;
+        }
+
+        // Initialize last_flushed_entry_metrics from the most recent loaded entry for delta computation.
+        let last_flushed_entry_metrics = entries.last().map(LastEntryMetrics::from_entry);
+
+        Self {
+            history,
+            max_extension_time,
+            update_interval_ns,
+            inhibited_timekeys,
+            data_points: entries.len() as u64,
+            flush_interval: None,
+            tick_count: 0,
+            accumulator: TickAccumulator::new(),
+            last_flushed_ns: entries.iter().map(|e| e.timestamp_ns).max().unwrap_or(0),
+            last_flushed_entry_metrics,
+            recent_entries: Vec::new(),
+            max_recent_entries: 200,
+        }
+    }
+
+    /// Set the prediction update interval. Its whole-second value is used as the number of ticks between averaged snapshots; a sub-second interval disables flushing.
+    pub fn set_prediction_update_interval(
+        &mut self,
+        prediction_update_interval: std::time::Duration,
+    ) {
+        if prediction_update_interval.as_secs() > 0 {
+            self.flush_interval = Some(prediction_update_interval.as_secs() as usize);
+        } else {
+            self.flush_interval = None;
+        }
+    }
+
+    /// Record a new tick's metrics. Accumulates into running average and writes an averaged snapshot to history when the configured interval elapses. Returns true if a snapshot was flushed.
+    pub fn record(
+        &mut self,
+        cpu_per_core_max: f64,
+        cpu_total_average: f64,
+        gpu_usages: Vec<f64>,
+        network_mbps: f64,
+        disk_mb_s: f64,
+        inhibited: bool,
+    ) -> bool {
+        // Compute aggregate GPU metrics from individual values for history storage.
+        let (gpu_per_gpu_max, gpu_total_average) = if gpu_usages.is_empty() {
+            (0.0, 0.0)
+        } else {
+            let max = gpu_usages.iter().cloned().fold(0.0f64, f64::max);
+            let sum: f64 = gpu_usages.iter().sum();
+            let avg = sum / gpu_usages.len() as f64;
+            (max, avg)
+        };
+
+        let entry = HistoryEntry::new(
+            std::time::SystemTime::now()
+                .duration_since(std::time::UNIX_EPOCH)
+                .expect("system time before epoch")
+                .as_nanos() as u64,
+            cpu_per_core_max,
+            cpu_total_average,
+            gpu_per_gpu_max,
+            gpu_total_average,
+            network_mbps,
+            disk_mb_s,
+            inhibited,
+        );
+
+        self.accumulator.accumulate(&entry);
+        self.tick_count += 1;
+
+        if let Some(interval) = self.flush_interval {
+            if self.tick_count >= interval {
+                let prev_metrics = self.last_flushed_entry_metrics.clone();
+                if let Some((snapshot, samples)) = self.accumulator.flush(prev_metrics.as_ref()) {
+                    // Capture metrics before snapshot is moved into history storage.
+                    let next_metrics = LastEntryMetrics::from_snapshot(&snapshot);
+
+                    self.data_points += 1;
+                    let time_key = TimeKey::from_timestamp_ns(snapshot.timestamp_ns);
+                    let gpu_summary: String = if snapshot.gpu_usage.per_gpu_max > 0.0 {
+                        format!(
+                            "max={:.1}% avg={:.1}%",
+                            snapshot.gpu_usage.per_gpu_max, snapshot.gpu_usage.total_average
+                        )
+                    } else {
+                        "no GPUs".to_string()
+                    };
+                    let summary = format!(
+                            "Flushed averaged snapshot #{} (CPU max={:.1}%, GPU {}, net={:.2}MB/s, disk={:.2}MB/s), time={}, accumulated_ticks={}",
+                            self.data_points,
+                            snapshot.cpu_usage.per_core_max,
+                            &gpu_summary,
+                            snapshot.network_mbps,
+                            snapshot.disk_mb_s,
+                            &time_key.display(),
+                            samples,
+                        );
+
+                    // Update in-memory inhibition counts for online prediction. Use the snapshot's
+                    // majority-vote flag (not just the last tick's) so the counts match what
+                    // `new()` rebuilds from persisted history.
+                    if snapshot.inhibited {
+                        *self.inhibited_timekeys.entry(time_key).or_default() += 1;
+                    }
+
+                    // Add to rolling window for trend analysis without disk reads.
+                    self.recent_entries.push(snapshot.clone());
+                    while self.recent_entries.len() > self.max_recent_entries {
+                        self.recent_entries.remove(0);
+                    }
+
+                    self.last_flushed_ns = snapshot.timestamp_ns;
+
+                    self.history.append_with_summary(snapshot, Some(summary));
+                    self.history.flush();
+
+                    self.last_flushed_entry_metrics = Some(next_metrics);
+                }
+                self.tick_count = 0;
+                return true;
+            }
+        }
+
+        false
+    }
+
+    /// Predict the additional cooldown time from historical inhibition counts for the current time slot, adjusted by recent metric trends.
+    pub fn predict_cooldown(&self) -> CooldownPrediction {
+        if self.data_points < 10 {
+            return CooldownPrediction {
+                additional_time: std::time::Duration::ZERO,
+                confidence: 0.0,
+            };
+        }
+
+        let now = TimeKey::now();
+        let base_score = self.score_inhibition_rate(&now);
+
+        // Compute trend signal from recent history entries with delta features.
+        // Use timestamp-based window (max_extension_time) instead of fixed entry count.
+        let cutoff_ns = (std::time::SystemTime::now()
+            .duration_since(std::time::UNIX_EPOCH)
+            .expect("system time before epoch")
+            .as_nanos() as u64)
+            .saturating_sub(self.max_extension_time.as_nanos() as u64);
+
+        // Use in-memory rolling window for trend analysis, falling back to disk read only
+        // when no entries have been flushed yet (initial startup).
+        let mut recent_entries: Vec<HistoryEntry> = if self.recent_entries.is_empty() {
+            self.history
+                .read_all()
+                .into_iter()
+                .filter(|e| e.timestamp_ns >= cutoff_ns)
+                .collect()
+        } else {
+            self.recent_entries
+                .iter()
+                .filter(|e| e.timestamp_ns >= cutoff_ns)
+                .cloned()
+                .collect()
+        };
+
+        // Sort by timestamp for gap detection and delta computation.
+        recent_entries.sort_by_key(|e| e.timestamp_ns);
+
+        if !recent_entries.is_empty() {
+            // Fill gaps on-the-fly with synthetic zero-value entries using config values.
+            // This accounts for runtime gaps (e.g., wake from sleep) where the system was idle.
+            let threshold = self.update_interval_ns;
+            recent_entries = fill_gaps(recent_entries, threshold, threshold);
+        }
+
+        // Filter out synthetic zero-value entries before computing trends.
+        let filtered: Vec<_> = recent_entries
+            .into_iter()
+            .filter(|e| e.cpu_usage.per_core_max > 0.0 || e.gpu_usage.per_gpu_max > 0.0)
+            .rev()
+            .collect();
+
+        // Use all available real entries (no fixed count limit) for trend signal computation.
+        let refs: Vec<&HistoryEntry> = filtered.iter().collect();
+        let trend_signal = TrendSignal::compute(&refs, refs.len());
+
+        // Apply trend multiplier: rising metrics increase extension, falling decrease it.
+        let trend_multiplier: f64 = {
+            if base_score >= 0.3 && trend_signal.samples > 0 {
+                // Clamp each trend factor to ±0.1, so the combined adjustment lies in -0.3..=+0.3.
+                let cpu_trend_factor = (trend_signal.avg_cpu_delta_per_sec / 50.0).clamp(-0.1, 0.1);
+                let net_trend_factor =
+                    (trend_signal.avg_network_delta_per_sec / 100.0).clamp(-0.1, 0.1);
+                let gpu_trend_factor = (trend_signal.avg_gpu_delta_per_sec / 50.0).clamp(-0.1, 0.1);
+                let trend = cpu_trend_factor + net_trend_factor + gpu_trend_factor;
+                1.0 + trend
+            } else {
+                1.0 // No adjustment when score is low or no delta data available
+            }
+        };
+
+        let score = base_score * trend_multiplier.clamp(0.5, 1.4);
+
+        if score < 0.3 {
+            return CooldownPrediction {
+                additional_time: std::time::Duration::ZERO,
+                confidence: self.confidence_for_data_points(),
+            };
+        }
+
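+        // Map score to additional cooldown time: linear interpolation over 0.3..=1.0, capped so the
+        // extension never exceeds max_extension_time even when the trend pushes the score past 1.0.
+        let additional_time = std::time::Duration::from_secs_f64(
+            ((score - 0.3) / 0.7).min(1.0) * self.max_extension_time.as_secs_f64(),
+        );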
+        let confidence = self.confidence_for_data_points();
+
+        debug!(
+            "Predicted cooldown: +{:?} (base_score={:.2}, trend_multiplier={:.2}, adjusted_score={:.2}, time={}, data_points={}, confidence={:.2})",
+            additional_time,
+            base_score,
+            trend_multiplier,
+            score,
+            now.display(),
+            self.data_points,
+            confidence
+        );
+
+        CooldownPrediction {
+            additional_time,
+            confidence,
+        }
+    }
+
+    // Two-level matching:
+    // Level 1: Exact TimeKey match: most precise, used when this exact time slot has recorded inhibitions.
+    // Level 2: Nearby-slot fallback: when no exact match exists (sparse data), take the highest count among keys in the same year, within ±7 days, and within ±1 hour of the same position in the week.
+    fn score_inhibition_rate(&self, now: &TimeKey) -> f64 {
+        // Level 1: Try exact TimeKey match first.
+        if let Some(&count) = self.inhibited_timekeys.get(now) {
+            return self.score_from_count(count);
+        }
+
+        // Level 2: Fall back to nearby time slots for sparse data.
+        // The linear day index bounds the search to ±7 days; the year check keeps matches within the current year.
+        let target_seconds = now.seconds_into_week;
+        let mut best_count: u64 = 0;
+        for (key, &count) in self.inhibited_timekeys.iter() {
+            if key.year == now.year
+                && (-7_i64..=7_i64).contains(&(key.linear_day() - now.linear_day()))
+                && ((key.seconds_into_week - target_seconds).abs() <= 3_600_f64)
+            {
+                best_count = count.max(best_count);
+            }
+        }
+
+        if best_count > 0 {
+            return self.score_from_count(best_count);
+        }
+
+        0.0
+    }
+
+    /// Compute a score from an inhibition count, using the overall distribution as baseline.
+    fn score_from_count(&self, count: u64) -> f64 {
+        let total_inhibited = self.inhibited_timekeys.values().sum::<u64>();
+        // Average per matching bucket gives baseline expectation for scoring.
+        let avg_per_bucket: u64 =
+            (total_inhibited.max(1)) / (self.inhibited_timekeys.len() as u64).max(1);
+
+        if count == 0 || avg_per_bucket == 0 {
+            return 0.0;
+        }
+
+        // Score above 0.5 for buckets with more than average activity, capped at 1.0.
+        let ratio = count as f64 / avg_per_bucket.max(1) as f64;
+        (ratio * 0.5).min(1.0)
+    }
+
+    /// Compute confidence based on total data points available.
+    fn confidence_for_data_points(&self) -> f32 {
+        match self.data_points {
+            n if n < 50 => 0.1,
+            n if n < 500 => 0.3,
+            n if n < 5_000 => 0.6,
+            _ => 0.9,
+        }
+    }
+
+    fn hour_of_day(ts_ns: u64) -> u32 {
+        TimeKey::hour_of_day(ts_ns)
+    }
+
+    fn current_hour() -> u32 {
+        TimeKey::current_hour()
+    }
+
+    /// Get the current history log reference for manual writes (e.g., during integration).
+    #[allow(dead_code)]
+    pub fn get_history(&self) -> &HistoryLog {
+        &self.history
+    }
+
+    pub fn prune(&mut self, max_age: std::time::Duration) {
+        self.history.prune(max_age);
+    }
+
+    /// Check if we have enough data to make meaningful predictions.
+    #[allow(dead_code)] // Used in service.rs
+    pub fn has_sufficient_data(&self, min_points: u64) -> bool {
+        self.data_points >= min_points
+    }
+
+    /// Return the number of historical data points collected so far.
+    #[allow(dead_code)]
+    pub fn data_points(&self) -> u64 {
+        self.data_points
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    fn make_test_model() -> PredictionModel {
+        let mut model =
+            PredictionModel::new(true, 30_000_000_000u64, std::time::Duration::from_secs(60));
+        // Flush every tick so tests don't need to wait for intervals.
+        model.set_prediction_update_interval(std::time::Duration::from_secs(1));
+        model
+    }
+
+    #[test]
+    fn test_prediction_model_initialization() {
+        let mut model = make_test_model();
+        assert_eq!(model.data_points, 0); // No data yet.
+        assert!(!model.has_sufficient_data(10));
+        // Flush one snapshot to verify count increments.
+        model.record(50.0, 25.0, vec![30.0], 5.0, 2.0, false);
+        assert_eq!(model.data_points(), 1);
+    }
+
+    #[test]
+    fn test_predict_cooldown_no_data_returns_zero() {
+        let model =
+            PredictionModel::new(true, 30_000_000_000u64, std::time::Duration::from_secs(60));
+        let prediction = model.predict_cooldown();
+        assert_eq!(prediction.additional_time, std::time::Duration::ZERO);
+    }
+
+    #[test]
+    fn test_record_and_count_entries() {
+        let mut model = make_test_model();
+
+        for i in 0..5 {
+            model.record(
+                60.0 + (i as f64 * 2.0),
+                30.0 + (i as f64),
+                vec![70.0],
+                15.0,
+                8.0,
+                i % 2 == 0, // alternate inhibited/not-inhibited
+            );
+        }
+
+        assert_eq!(model.data_points(), 5);
+    }
+
+    #[test]
+    fn test_predict_cooldown_with_insufficient_data() {
+        let model =
+            PredictionModel::new(true, 30_000_000_000u64, std::time::Duration::from_secs(60));
+        let prediction = model.predict_cooldown();
+        // Should return zero additional time and low confidence with no data.
+        assert_eq!(prediction.additional_time, std::time::Duration::ZERO);
+        assert!(prediction.confidence < 0.5);
+    }
+
+    #[test]
+    fn test_hour_of_day() {
+        // Unix epoch (Jan 1, 1970 00:00:00 UTC) is hour 0.
+        assert_eq!(PredictionModel::hour_of_day(0), 0);
+        // Jan 1, 1970 12:00:00 UTC = 43200 seconds.
+        assert_eq!(PredictionModel::hour_of_day(43_200_000_000_000), 12);
+    }
+
+    #[test]
+    fn test_current_hour_valid_range() {
+        let hour = PredictionModel::current_hour();
+        assert!((0..=23).contains(&hour));
+    }
+
+    /// Test that multi-tick accumulation produces correct arithmetic means across flush boundaries.
+    #[test]
+    fn test_multi_tick_averaging_correctness() {
+        let mut model =
+            PredictionModel::new(true, 30_000_000_000u64, std::time::Duration::from_secs(60));
+        // Flush every 5 ticks to verify partial accumulation doesn't produce snapshots.
+        model.set_prediction_update_interval(std::time::Duration::from_secs(5));
+
+        for i in 0..4 {
+            let cpu = i as f64 * 10.0; // 0, 10, 20, 30
+            let net = (i + 1) as f64 * 5.0; // 5, 10, 15, 20
+            assert!(!model.record(cpu, cpu * 0.5, vec![cpu], net, 1.0, false));
+        }
+
+        // No flush yet: tick_count (4) < flush_interval (5).
+        assert_eq!(model.data_points(), 0);
+
+        // 5th tick triggers flush with averaged values: CPU max = (0+10+20+30+40)/5 = 20.0, net = (5+10+15+20+25)/5 = 15.0
+        assert!(model.record(40.0, 20.0, vec![40.0], 25.0, 1.0, false));
+        assert_eq!(model.data_points(), 1);
+
+        // Record second batch (5 ticks): CPU max values = 50,60,70,80,90 → avg = 70.0
+        for i in 5..9 {
+            let cpu = i as f64 * 10.0;
+            assert!(!model.record(cpu, cpu * 0.5, vec![cpu], (i + 1) as f64 * 5.0, 1.0, false));
+        }
+
+        // Final tick of batch triggers flush for second averaged snapshot.
+        assert!(model.record(90.0, 45.0, vec![90.0], 35.0, 1.0, true));
+        assert_eq!(model.data_points(), 2);
+
+        let mut model2 =
+            PredictionModel::new(true, 30_000_000_000u64, std::time::Duration::from_secs(60));
+        // Flush every 3 ticks to verify exact-value averaging (all identical inputs → average equals input).
+        model2.set_prediction_update_interval(std::time::Duration::from_secs(3));
+
+        for _ in 0..2 {
+            assert!(!model2.record(50.0, 25.0, vec![60.0], 10.0, 4.0, false));
+        }
+
+        // Third tick triggers flush: averaged values equal the repeated input (50.0, 25.0, 60.0, 10.0, 4.0).
+        assert!(model2.record(50.0, 25.0, vec![60.0], 10.0, 4.0, false));
+        assert_eq!(model2.data_points(), 1);
+
+        for _ in 0..2 {
+            assert!(!model2.record(80.0, 40.0, vec![90.0], 20.0, 8.0, true));
+        }
+        // Second flush confirms accumulator resets correctly and averaging cycle repeats cleanly.
+        assert!(model2.record(80.0, 40.0, vec![90.0], 20.0, 8.0, true));
+        assert_eq!(model2.data_points(), 2);
+    }
+
+    /// Test that TimeKey correctly represents seconds-into-week for known timestamps.
+    #[test]
+    fn test_timekey_from_timestamp_known_values() {
+        // Monday Jan 1 2024 00:00 UTC (ISO week starts on Monday)
+        let monday_00 = TimeKey::from_timestamp_ns(1704067200 * 1_000_000_000);
+        assert_eq!(monday_00.year, 2024);
+        assert!((monday_00.seconds_into_week - 0.0).abs() < f64::EPSILON); // Monday at midnight
+
+        // Same day, noon (still Monday since Jan 1 2024 is a Monday in ISO calendar)
+        let monday_noon = TimeKey::from_timestamp_ns((1704067200 + 3600 * 12) * 1_000_000_000);
+        assert_eq!(monday_noon.year, 2024);
+        // Monday = day index 0 (Mon=0), so seconds = 0*86400 + 12*3600 = 43200
+        assert!((monday_noon.seconds_into_week - 43_200.0).abs() < f64::EPSILON);
+
+        // Sunday at 23:59 should be near end of week (day index 6)
+        let sunday_night = TimeKey::from_timestamp_ns(
+            (1704067200 + (6 * 86400) + (23 * 3600) + (59 * 60)) * 1_000_000_000,
+        );
+        assert_eq!(sunday_night.year, 2024);
+        // Sunday = day index 6, so seconds = 6*86400 + 23*3600 + 59*60 = 604740
+        assert!((sunday_night.seconds_into_week - 604_740.0).abs() < f64::EPSILON);
+    }
+
+    /// Test that same weekday+time in different weeks of the same year produces identical seconds-into-week.
+    #[test]
+    fn test_timekey_same_position_different_weeks() {
+        // Monday Jan 1 2024 at 06:30 UTC (ISO calendar Monday)
+        let tk_wk1 =
+            TimeKey::from_timestamp_ns((1704067200 + (6 * 3600) + (30 * 60)) * 1_000_000_000);
+        // Monday Jan 8 2024 at 06:30 UTC — same day-of-week and time, different week of year
+        let tk_wk2 = TimeKey::from_timestamp_ns(
+            (1704067200 + (7 * 86400) + (6 * 3600) + (30 * 60)) * 1_000_000_000,
+        );
+
+        assert_eq!(tk_wk1.year, 2024);
+        assert_eq!(tk_wk2.year, 2024);
+        // Different weeks but same weekday+time → identical seconds_into_week
+        assert_eq!(tk_wk1.week_of_year, 1);
+        assert_eq!(tk_wk2.week_of_year, 2);
+        assert_eq!(tk_wk1.seconds_into_week, tk_wk2.seconds_into_week);
+    }
+
+    /// Test that different weekdays at the same time produce distinct seconds-into-week values.
+    #[test]
+    fn test_timekey_different_weekdays_distinct() {
+        // Monday Jan 1 2024 at noon UTC
+        let monday = TimeKey::from_timestamp_ns((1704067200 + (12 * 3600)) * 1_000_000_000);
+        // Tuesday Jan 2 2024 at noon UTC
+        let tuesday =
+            TimeKey::from_timestamp_ns((1704067200 + (86400) + (12 * 3600)) * 1_000_000_000);
+
+        assert_eq!(monday.year, 2024);
+        assert_eq!(tuesday.year, 2024);
+        // Different weekdays → distinct seconds-into-week values (86400s apart)
+        assert_ne!(monday.seconds_into_week, tuesday.seconds_into_week);
+    }
+
+    /// Test that linear_day correctly handles ISO week wraparound at year boundaries.
+    #[test]
+    fn test_linear_day_wraps_at_year_boundary() {
+        // Monday Jan 1 2024 at midnight (ISO Week 1 of 2024)
+        let jan_wk1 = TimeKey::from_timestamp_ns((1704067200) * 1_000_000_000);
+        // Monday Jan 8 2024 at midnight (ISO Week 2 of 2024, same calendar year)
+        let jan_wk2 = TimeKey::from_timestamp_ns((1704067200 + (7 * 86400)) * 1_000_000_000);
+
+        assert_eq!(jan_wk1.year, 2024);
+        assert_eq!(jan_wk2.year, 2024);
+        // Exactly one week apart → linear_day diff should be exactly 7
+        assert_eq!(jan_wk2.linear_day() - jan_wk1.linear_day(), 7);
+
+        // Monday Jan 15 2024 (ISO Week 3)
+        let jan_wk3 = TimeKey::from_timestamp_ns((1704067200 + (14 * 86400)) * 1_000_000_000);
+        // Two weeks from Jan 1 → diff should be 14 days
+        assert_eq!(jan_wk3.linear_day() - jan_wk1.linear_day(), 14);
+
+        // Sunday Dec 29 2024 at midnight (ISO Week 52 of year 2024)
+        let dec_sunday = TimeKey::from_timestamp_ns((1735430400) * 1_000_000_000);
+        assert_eq!(dec_sunday.year, 2024);
+
+        // Monday Jan 6 2025 at midnight (ISO Week 2 of year 2025)
+        let jan_wk2_2025 = TimeKey::from_timestamp_ns((1736121600) * 1_000_000_000);
+
+        // Jan 6, 2025 is a Monday at midnight UTC
+        assert_eq!(jan_wk2_2025.year, 2025);
+    }
+
+    /// Test that predict_cooldown returns zero with insufficient data (< 10 points).
+    #[test]
+    fn test_predict_cooldown_insufficient_data() {
+        let model =
+            PredictionModel::new(true, 30_000_000_000u64, std::time::Duration::from_secs(60));
+        let prediction = model.predict_cooldown();
+        assert_eq!(prediction.additional_time, std::time::Duration::ZERO);
+        assert_eq!(prediction.confidence, 0.0);
+    }
+
+    /// Test that predict_cooldown returns zero when score is below threshold (no inhibited data).
+    #[test]
+    fn test_predict_cooldown_no_inhibited_data() {
+        let mut model = make_test_model();
+
+        // Record 15 entries, none inhibited — this gives enough points to pass the 10-point guard.
+        for i in 0..15 {
+            model.record(
+                10.0 + (i as f64 * 2.0),
+                5.0 + (i as f64),
+                vec![8.0],
+                2.0,
+                0.5,
+                false,
+            );
+        }
+
+        // With no inhibited entries, score should be 0 and additional_time = 0.
+        let prediction = model.predict_cooldown();
+        assert_eq!(prediction.additional_time, std::time::Duration::ZERO);
+    }
+
+    /// Test that predict_cooldown stays within max_extension_time when most of the history is inhibited.
+    #[test]
+    fn test_predict_cooldown_with_inhibited_data() {
+        let mut model = make_test_model();
+
+        // Record 15 entries, 10 of them inhibited (~67%), well above the 0.3 score threshold.
+        for i in 0..15 {
+            model.record(60.0, 30.0, vec![40.0], 10.0, 5.0, i % 3 != 0); // not inhibited when i is 0, 3, 6, 9, 12
+        }
+
+        let prediction = model.predict_cooldown();
+        // Whether the score exceeds the threshold depends on the current time-of-week versus the
+        // recorded history, so this test only checks that the result respects the upper bound.
+        assert!(prediction.additional_time.as_secs() <= 60); // bounded by max_extension_time
+    }
+
+    /// Verify the production flush path works correctly.
+    #[test]
+    fn test_production_flush_works() {
+        let mut model = make_test_model();
+
+        // Record 3 entries with increasing CPU values — each triggers a flush since interval=1.
+        for i in 0..3 {
+            model.record(
+                20.0 + (i as f64 * 10.0),
+                10.0 + (i as f64 * 5.0),
+                vec![],
+                5.0,
+                2.0,
+                false,
+            );
+        }
+
+        // Verify data_points incremented — proves flush path is exercised in production code.
+        assert_eq!(model.data_points(), 3, "should have flushed all 3 records");
+    }
+
+    /// Regression test: predict_cooldown must stay bounded when delta features carry a rising trend signal.
+    #[test]
+    fn test_prediction_consumes_delta_trend_signal() {
+        let mut model =
+            PredictionModel::new(false, 30_000_000_000u64, std::time::Duration::from_secs(60));
+        model.set_prediction_update_interval(std::time::Duration::from_secs(1));
+
+        // Record enough entries to pass the 10-point threshold and populate delta features.
+        for i in 0..15 {
+            // Increasing CPU trend: each entry has higher CPU than the last.
+            let cpu_base = 30.0 + (i as f64 * 2.0);
+            model.record(
+                cpu_base,
+                cpu_base * 0.5,
+                vec![cpu_base],
+                5.0,
+                1.0,
+                i % 2 == 0,
+            );
+        }
+
+        let prediction = model.predict_cooldown();
+        // The rising CPU trend may produce a non-zero additional_time; only the upper bound is asserted here.
+        assert!(prediction.additional_time.as_secs() <= 60); // bounded by max_extension_time
+    }
+}
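The `TimeKey` tests above rely on a Monday-based "seconds into week" value. The arithmetic can be sketched in plain `std` Rust as follows. This is an illustration only, not the `TimeKey` implementation from this patch: the free function `seconds_into_week` is a hypothetical helper, and it returns an `i64` rather than the `f64` field the tests compare against.

```rust
/// Hypothetical helper (not part of this patch): seconds elapsed since
/// Monday 00:00 UTC of the week that contains `timestamp_ns`.
fn seconds_into_week(timestamp_ns: i64) -> i64 {
    const SECS_PER_DAY: i64 = 86_400;

    // Euclidean division keeps the result well-defined for pre-1970 timestamps.
    let secs = timestamp_ns.div_euclid(1_000_000_000);
    let days_since_epoch = secs.div_euclid(SECS_PER_DAY);
    let secs_of_day = secs.rem_euclid(SECS_PER_DAY);

    // 1970-01-01 was a Thursday, which is index 3 when Monday = 0.
    let weekday = (days_since_epoch + 3).rem_euclid(7);

    weekday * SECS_PER_DAY + secs_of_day
}
```

With this definition, Monday Jan 1 2024 00:00 UTC (`1704067200` seconds) maps to 0, and Sunday 23:59 of the same week maps to `6*86400 + 23*3600 + 59*60 = 604740`, the value asserted in the Sunday-night test.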
diff --git a/src/service.rs b/src/service.rs
index 8c14d53..9a23839 100644
--- a/src/service.rs
+++ b/src/service.rs
@@ -2,6 +2,7 @@ use std::time::Duration;
 use tracing::{debug, info, warn};
 
 use crate::config::Config;
+use crate::prediction::{CooldownPrediction, PredictionModel};
 
 use crate::inhibit::InhibitionState;
 use crate::metrics::{
@@ -54,7 +55,8 @@ impl SmoothingState {
 pub struct ThresholdManager {
     cpu_per_core_threshold: f64,
     cpu_total_threshold: f64,
-    gpu_threshold: f64,
+    gpu_per_gpu_threshold: f64,
+    gpu_total_threshold: f64,
     network_threshold: f64,
     disk_threshold: f64,
 }
@@ -64,14 +66,16 @@ impl ThresholdManager {
     pub fn new(
         cpu_per_core_threshold: f64,
         cpu_total_threshold: f64,
-        gpu_threshold: f64,
+        gpu_per_gpu_threshold: f64,
+        gpu_total_threshold: f64,
         network_threshold: f64,
         disk_threshold: f64,
     ) -> Self {
         Self {
             cpu_per_core_threshold,
             cpu_total_threshold,
-            gpu_threshold,
+            gpu_per_gpu_threshold,
+            gpu_total_threshold,
             network_threshold,
             disk_threshold,
         }
@@ -81,13 +85,14 @@ impl ThresholdManager {
         &self,
         smoothed_cpu_max: f64,
         smoothed_cpu_avg: f64,
-        gpu_smoothed_values: &[f64],
+        gpu_aggregate: &crate::metrics::GpuAggregate,
         smoothed_network: f64,
         smoothed_disk: f64,
     ) -> bool {
         smoothed_cpu_max > self.cpu_per_core_threshold
             || smoothed_cpu_avg > self.cpu_total_threshold
-            || gpu_smoothed_values.iter().any(|&v| v > self.gpu_threshold)
+            || gpu_aggregate.per_gpu_max > self.gpu_per_gpu_threshold
+            || gpu_aggregate.total_average > self.gpu_total_threshold
             || smoothed_network > self.network_threshold
             || smoothed_disk > self.disk_threshold
     }
@@ -115,6 +120,11 @@ pub struct DataManager {
     previous_inhibited_state: bool,
     just_released: bool,
     waiting_for_cooldown: bool,
+    /// Cached predicted additional time from last tick's model query.
+    /// Applied to cooldown_duration when metrics drop below threshold.
+    predicted_additional_time: std::time::Duration,
+    /// Prediction model for adaptive cooldown extension (`None` if disabled).
+    prediction_model: Option<PredictionModel>,
 }
 
 pub struct DataService {
@@ -137,11 +147,45 @@ impl DataManager {
         let threshold_manager = ThresholdManager::new(
             config.metrics.cpu.per_core_threshold,
             config.metrics.cpu.total_threshold,
-            config.metrics.gpu.threshold,
+            config.metrics.gpu.per_gpu_threshold,
+            config.metrics.gpu.total_threshold,
             config.metrics.network.threshold,
             config.metrics.disk.threshold,
         );
 
+        // Initialize prediction model if enabled (prediction.update_interval is set).
+        let prediction_model = if config.prediction.update_interval.as_secs() > 0 {
+            // Determine if running as root to choose history directory.
+            #[cfg(unix)]
+            let is_root: bool = unsafe { libc::geteuid() == 0 };
+            #[cfg(not(unix))]
+            let is_root: bool = false;
+
+            let mut model = PredictionModel::new(
+                is_root,
+                config.prediction.update_interval.as_nanos() as u64,
+                config.prediction.max_extension_time,
+            );
+            let effective_prediction_interval =
+                std::cmp::max(config.prediction.update_interval, config.update_interval);
+            if config.prediction.update_interval < config.update_interval
+                && config.update_interval.as_secs() > 0
+            {
+                warn!(
+                    "prediction.update_interval ({:?}) is less than root update_interval ({}s) — \
+                 using {:?} instead to avoid erratic accumulation flush timing",
+                    config.prediction.update_interval,
+                    config.update_interval.as_secs(),
+                    effective_prediction_interval,
+                );
+            }
+            // Configure how often averaged snapshots are flushed (a duration, never shorter than the root update_interval).
+            model.set_prediction_update_interval(effective_prediction_interval);
+            Some(model)
+        } else {
+            None
+        };
+
         // Initialize per-GPU smoothing states based on detected GPUs
         let gpu_collector = GpuCollector::new();
         let has_gpu = gpu_collector.has_gpus();
@@ -162,6 +206,8 @@ impl DataManager {
             previous_inhibited_state: false,
             just_released: false,
             waiting_for_cooldown: false,
+            predicted_additional_time: std::time::Duration::ZERO,
+            prediction_model,
             cpu_smooth_max: SmoothingState::new(config.metrics.cpu.ema_alpha),
             cpu_smooth_avg: SmoothingState::new(config.metrics.cpu.ema_alpha),
             gpu_smoothing: (0..num_gpus)
@@ -199,6 +245,8 @@ impl DataManager {
             }
         }
 
+        let gpu_aggregate = crate::metrics::GpuAggregate::from_values(&gpu_smoothed_values);
+
         let sorted_entries = sorted_gpu_display(&metrics.gpu_usage, &gpu_smoothed_values);
         let gpu_debug = gpu_display_string(&sorted_entries);
 
@@ -222,25 +270,46 @@ impl DataManager {
         );
 
         debug!(
-            "Metrics: CPU max={:.1}% avg={:.1}%, GPU: {}, Network={}, Disk={}",
-            smoothed_cpu_max, smoothed_cpu_avg, gpu_debug, network_log, disk_log
+            "Metrics: CPU max={:.1}% avg={:.1}%, GPU: {} (max={:.1}% avg={:.1}%), Network={}, Disk={}",
+            smoothed_cpu_max,
+            smoothed_cpu_avg,
+            gpu_debug,
+            gpu_aggregate.per_gpu_max,
+            gpu_aggregate.total_average,
+            network_log,
+            disk_log
         );
 
         let should_inhibit = self.threshold_manager.should_inhibit(
             smoothed_cpu_max,
             smoothed_cpu_avg,
-            &gpu_smoothed_values,
+            &gpu_aggregate,
             smoothed_network,
             smoothed_disk,
         );
 
+        // Record metrics into prediction history if enabled. Samples accumulate per tick and are
+        // flushed as averaged snapshots on the prediction interval; old history is then pruned.
+        if let Some(ref mut model) = self.prediction_model {
+            // record() logs at debug level itself whenever a snapshot is flushed.
+            let _flushed = model.record(
+                smoothed_cpu_max,
+                smoothed_cpu_avg,
+                gpu_smoothed_values.clone(),
+                smoothed_network,
+                smoothed_disk,
+                should_inhibit,
+            );
+            model.prune(config.prediction.history_length);
+        }
+
         self.update_state(should_inhibit).await?;
 
         let was_inhibited = self.previous_inhibited_state;
 
         if should_inhibit {
-            debug!("Metrics exceed threshold, checking inhibition status");
-
             // Cancel cooldown — metrics spiked again while waiting.
             if self.waiting_for_cooldown {
                 self.waiting_for_cooldown = false;
@@ -287,6 +356,8 @@ impl DataManager {
                                 self.metrics_below_threshold_since = None;
                                 self.cooldown_start_time = None;
                                 self.just_released = false;
+                                // Clear prediction — fresh prediction will be computed when metrics drop below again.
+                                self.predicted_additional_time = std::time::Duration::ZERO;
                             }
                             Err(e) => warn!("Failed to acquire inhibition: {}", e),
                         }
@@ -299,28 +370,111 @@ impl DataManager {
                 .duration_since(below_since)
                 .unwrap_or(Duration::from_secs(0));
 
-            if !self.just_released && elapsed >= config.timing.cooldown_duration {
-                info!(
-                    "Releasing sleep inhibition: all metrics below threshold for {:?}",
-                    elapsed
+            // Re-evaluate prediction every tick during cooldown waiting to adapt extension
+            // based on current trends (increases or decreases the remaining wait time).
+            let was_active = !self.predicted_additional_time.is_zero();
+            if let Some(model) = &self.prediction_model {
+                let prediction = model.predict_cooldown();
+
+                // Log info-level only when first applying a non-zero extension per transition;
+                // log debug-level for subsequent updates during extended cooldown.
+                if was_active && self.predicted_additional_time != prediction.additional_time {
+                    debug!(
+                        "Updated predictive cooldown extension: {:?} -> {:?}",
+                        self.predicted_additional_time, prediction.additional_time
+                    );
+                } else if !was_active && !prediction.additional_time.is_zero() {
+                    info!(
+                        "Predictive cooldown extension: +{}s (confidence={:.0}%), \
+                         historical patterns suggest active usage at this hour",
+                        prediction.additional_time.as_secs(),
+                        prediction.confidence * 100.0,
+                    );
+                }
+
+                self.predicted_additional_time = prediction.additional_time;
+            }
+
+            if !self.just_released && self.state.is_inhibited() {
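+                // The prediction is an extension on top of the base cooldown, so the two are
+                // summed; this matches the "total wait" reported in the release log.
+                let effective_cooldown =
+                    config.timing.cooldown_duration + self.predicted_additional_time;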
-                self.state.release().await;
-                self.waiting_for_cooldown = false;
-                self.metrics_below_threshold_since = None;
-                self.just_released = true;
+
+                if elapsed >= effective_cooldown {
+                    if !self.predicted_additional_time.is_zero() {
+                        let total_wait =
+                            config.timing.cooldown_duration + self.predicted_additional_time;
+                        info!(
+                            "Releasing sleep inhibition: all metrics below threshold for {:?} \
+                             (base cooldown {}s, with {}s predictive extension, total wait {:?})",
+                            elapsed,
+                            config.timing.cooldown_duration.as_secs(),
+                            self.predicted_additional_time.as_secs(),
+                            total_wait,
+                        );
+                    } else {
+                        info!(
+                            "Releasing sleep inhibition: all metrics below threshold for {:?}",
+                            elapsed
+                        );
+                    }
+                    self.state.release().await;
+                    self.waiting_for_cooldown = false;
+                    self.metrics_below_threshold_since = None;
+                    self.just_released = true;
+                } else {
+                    debug!(
+                        "Waiting for cooldown: {}s/{}s below threshold \
+                         (with {:?} predictive extension)",
+                        elapsed.as_secs(),
+                        effective_cooldown.as_secs(),
+                        self.predicted_additional_time,
+                    );
+                }
             } else if !self.state.is_inhibited() {
-                // Not inhibited — don't track cooldown for future release.
+                // Not inhibited — reset state tracking for fresh below-threshold cycle.
                 self.waiting_for_cooldown = false;
+                self.just_released = false;
                 self.metrics_below_threshold_since = None;
-            } else {
-                debug!(
-                    "Waiting for cooldown: {}/{} seconds below threshold",
-                    elapsed.as_secs(),
-                    config.timing.cooldown_duration.as_secs()
-                );
             }
         }
 
+        // Predict cooldown extension when transitioning from inhibited to below-threshold.
+        // Only set initial prediction here — the active cooldown block (above) re-evaluates
+        // every tick and produces fresher predictions based on updated in-memory model state.
+        if was_inhibited && !should_inhibit {
+            let prediction = match &self.prediction_model {
+                Some(model) => model.predict_cooldown(),
+                None => CooldownPrediction {
+                    additional_time: std::time::Duration::ZERO,
+                    confidence: 0.0,
+                },
+            };
+
+            // Only apply from the transition block if no prediction exists yet (first tick below threshold).
+            if self.predicted_additional_time.is_zero() {
+                self.predicted_additional_time = prediction.additional_time;
+                if !prediction.additional_time.is_zero() {
+                    info!(
+                        "Predictive cooldown extension: +{}s (confidence={:.0}%), \
+                         historical patterns suggest active usage at this hour",
+                        prediction.additional_time.as_secs(),
+                        prediction.confidence * 100.0,
+                    );
+                }
+            }
+        } else if should_inhibit && self.metrics_above_threshold_since.is_some() {
+            // Metrics spiked again: drop the cached extension so the next cooldown cycle starts fresh.
+            self.predicted_additional_time = std::time::Duration::ZERO;
+        }
+
         if !was_inhibited && self.state.is_inhibited() {
             info!("Sleep inhibited: at least one metric above threshold");
         }
@@ -404,7 +558,8 @@ mod tests {
                     ema_alpha: 0.3,
                 },
                 gpu: crate::config::GpuConfig {
-                    threshold: 90.0,
+                    per_gpu_threshold: 90.0,
+                    total_threshold: 90.0,
                     ema_alpha: 0.3,
                 },
                 network: crate::config::NetworkConfig {
@@ -427,6 +582,13 @@ mod tests {
                 what: "sleep".to_string(),
                 mode: "block".to_string(),
             },
+            prediction: crate::config::PredictionConfig {
+                update_interval: std::time::Duration::from_secs(30),
+                history_length: std::time::Duration::from_secs(30 * 24 * 60 * 60),
+                max_extension_time: std::time::Duration::from_secs(60),
+                ml_hidden_dim: 16,
+                ml_delay_buffer_size: 8,
+            },
         }
     }
 
@@ -436,7 +598,8 @@ mod tests {
         let _manager = ThresholdManager::new(
             config.metrics.cpu.per_core_threshold,
             config.metrics.cpu.total_threshold,
-            config.metrics.gpu.threshold,
+            config.metrics.gpu.per_gpu_threshold,
+            config.metrics.gpu.total_threshold,
             config.metrics.network.threshold,
             config.metrics.disk.threshold,
         );
@@ -448,12 +611,14 @@ mod tests {
         let manager = ThresholdManager::new(
             config.metrics.cpu.per_core_threshold,
             config.metrics.cpu.total_threshold,
-            config.metrics.gpu.threshold,
+            config.metrics.gpu.per_gpu_threshold,
+            config.metrics.gpu.total_threshold,
             config.metrics.network.threshold,
             config.metrics.disk.threshold,
         );
 
-        assert!(manager.should_inhibit(90.0, 30.0, &[50.0], 10.0, 5.0));
+        let agg = crate::metrics::GpuAggregate::from_values(&[50.0]);
+        assert!(manager.should_inhibit(90.0, 30.0, &agg, 10.0, 5.0));
     }
 
     #[test]
@@ -462,12 +627,14 @@ mod tests {
         let manager = ThresholdManager::new(
             config.metrics.cpu.per_core_threshold,
             config.metrics.cpu.total_threshold,
-            config.metrics.gpu.threshold,
+            config.metrics.gpu.per_gpu_threshold,
+            config.metrics.gpu.total_threshold,
             config.metrics.network.threshold,
             config.metrics.disk.threshold,
         );
 
-        assert!(!manager.should_inhibit(50.0, 30.0, &[10.0], 10.0, 5.0));
+        let agg = crate::metrics::GpuAggregate::from_values(&[10.0]);
+        assert!(!manager.should_inhibit(50.0, 30.0, &agg, 10.0, 5.0));
     }
 
     #[test]
@@ -476,12 +643,14 @@ mod tests {
         let manager = ThresholdManager::new(
             config.metrics.cpu.per_core_threshold,
             config.metrics.cpu.total_threshold,
-            config.metrics.gpu.threshold,
+            config.metrics.gpu.per_gpu_threshold,
+            config.metrics.gpu.total_threshold,
             config.metrics.network.threshold,
             config.metrics.disk.threshold,
         );
 
-        assert!(manager.should_inhibit(80.0, 45.0, &[95.0], 10.0, 5.0));
+        let agg = crate::metrics::GpuAggregate::from_values(&[95.0]);
+        assert!(manager.should_inhibit(80.0, 45.0, &agg, 10.0, 5.0));
     }
 
     #[test]
@@ -490,12 +659,14 @@ mod tests {
         let manager = ThresholdManager::new(
             config.metrics.cpu.per_core_threshold,
             config.metrics.cpu.total_threshold,
-            config.metrics.gpu.threshold,
+            config.metrics.gpu.per_gpu_threshold,
+            config.metrics.gpu.total_threshold,
             config.metrics.network.threshold,
             config.metrics.disk.threshold,
         );
 
-        assert!(manager.should_inhibit(80.0, 45.0, &[50.0, 95.0], 10.0, 5.0));
+        let agg = crate::metrics::GpuAggregate::from_values(&[50.0, 95.0]);
+        assert!(manager.should_inhibit(80.0, 45.0, &agg, 10.0, 5.0));
     }
 
     #[test]
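The `src/service.rs` changes above replace the single GPU threshold with a per-GPU maximum check and a fleet-wide average check. The sketch below shows how such an aggregate could be computed and compared. It is a stand-in under stated assumptions: the real `crate::metrics::GpuAggregate` is not shown in this part of the diff, so only the field names `per_gpu_max` and `total_average` and the `from_values` constructor name come from the patch. The empty-slice behaviour and the helper `gpu_exceeds` are hypothetical.

```rust
/// Stand-in for `crate::metrics::GpuAggregate`; only the field names are taken from the patch.
#[derive(Debug, Clone, Copy, PartialEq)]
struct GpuAggregate {
    per_gpu_max: f64,
    total_average: f64,
}

impl GpuAggregate {
    /// Assumption: with no GPUs present, both aggregates are 0.0 so that
    /// neither threshold can trigger.
    fn from_values(values: &[f64]) -> Self {
        if values.is_empty() {
            return Self { per_gpu_max: 0.0, total_average: 0.0 };
        }
        let per_gpu_max = values.iter().copied().fold(f64::NEG_INFINITY, f64::max);
        let total_average = values.iter().sum::<f64>() / values.len() as f64;
        Self { per_gpu_max, total_average }
    }
}

/// Hypothetical helper mirroring the two GPU clauses of `should_inhibit`.
fn gpu_exceeds(agg: &GpuAggregate, per_gpu_threshold: f64, total_threshold: f64) -> bool {
    agg.per_gpu_max > per_gpu_threshold || agg.total_average > total_threshold
}
```

For the multi-GPU test input `[50.0, 95.0]`, the maximum is 95.0 and the average is 72.5, so with both thresholds at 90.0 only the per-GPU clause fires. Splitting the thresholds lets one saturated card hold the inhibitor without requiring every card to be busy.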
diff --git a/systemd/rouser.service b/systemd/rouser.service
index 1812aca..307ad7a 100644
--- a/systemd/rouser.service
+++ b/systemd/rouser.service
@@ -6,14 +6,20 @@ Wants=network.target
 
 [Service]
 Type=simple
-ExecStart=%h/.local/bin/rouser --config %h/.config/rouser/config.toml
+ExecStart=%h/.local/bin/rouser
 Restart=on-failure
 RestartSec=5s
 StandardOutput=journal
 StandardError=journal
 SyslogIdentifier=rouser
-# Security hardening (non-breaking for D-Bus access)
+# Binary and config live in home — must be readable by the service.
+ReadOnlyPaths=%h/.local/bin %h/.config/rouser
+# History data: allow writing to XDG_STATE_HOME despite ProtectHome=read-only.
+ReadWritePaths=%h/.local/state/rouser
 ProtectHome=read-only
+
+# Root/system mode: state directory at /var/lib/rouser (standard location).
+StateDirectory=rouser
 PrivateTmp=true
 NoNewPrivileges=false