Skip to content

richardoriginal/ha_db_hogs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ha_db_hogs.py

Analyzes Home Assistant database update frequency and attribute size to find main "database hogs".

  • Find high-update "chatty" entities
  • Suggest which to exclude from recorder (MariaDB/SQLite)
  • Suggest which to include in InfluxDB
  • Detect large JSON attributes that bloat the DB
  • Compare current results against previous runs

Features (configurable):

  • SQLite and MariaDB support
  • Yaml config file (thoroughly config.yaml)
  • Threshold-based categories: exclude from recorder or include in influxdb
  • Attribute size analysis

1. Installation

Requirements:

  • Python 3.10+
  • Modules: yaml, mysql.connector

Clone:

git clone https://github.com/richardoriginal/ha_db_hogs.git
cd ha_db_hogs
pip install -r requirements.txt

2. Configuration

Edit config.yaml to set:

  • DB type & connection info
  • Thresholds for influxdb / exclusion
  • History file path
  • Output limits

Example config.yaml:

db_type: sqlite
sqlite_path: data/home-assistant_v2.db
please_keep_reason: True
thresholds:
  exclude_updates_per_hour: 10
  influxdb.updates_per_hour: 100
poutput:
  show_reason: True
limit: 10000
compare_with_history: True
history_file: comp/db_hog_history.json

3. Usage

python ha_db_hogs.py

Optional: specify a different config:

python ha_db_hogs.py --config examples/sample_config.yaml

4. Output

Shows:

  • Entity ID
  • Total state changes
  • Updates per hour
  • Attribute size
  • Suggested category
  • Reason for classification (optional)

The script will print the top entities by update count & attribute size, plus suggested lists for recorder.default & influxdb yaml configs.

Example output:

Top Entities by Updates & Attribute Size:
Entity ID                                               Count     Upd/hr   AttrSize   Category            Reason
---------------------------------------------------------------------------------------------------------------
sensor.office_presence_epl_target_1_distance           592235    2539.32        138   influxdb            updates/hr > 200
sensor.car_data                                          1230       0.01         55   exclude             updates/hr < 0.01
sensor.memory_use                                       54124     224.47        128   influxdb            updates/hr > 200
sensor.front_door_battery                                2500       0.50         45   keep                default rule

Also outputs:

  • recorder.yaml exclude list
  • influxdb.yaml include list

5. (Optional) Automated Monthly Run

This "should" work but I haven't run it, so YMMV:

Home Assistant automation example:

automation:
  - alias: "Run DB Hog Analyzer Monthly"
    trigger:
      platform: time
      at: "03:00:00"
    condition:
      condition: template
      value_template: "{{ now().day == 1 }}"
    action:
      service: shell_command.run_db_hogs

shell_command:
  run_db_hogs: "python3 /config/ha_db_hogs.py --config /config/config.yaml"

General Users

  • Run with default thresholds first
  • Only change thresholds if output looks too large/small
  • Keep history to track changes over months

Power Users

  • Fine-tune thresholds per device class
  • Integrate output into HA config with !include
  • Use history diff to track long-term trends

About

Analyzes Home Assistant database update frequency and attribute size to find main "database hogs"

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages