Analyzes Home Assistant database update frequency and attribute size to find main "database hogs".
- Find high-update "chatty" entities
- Suggest which to exclude from recorder (MariaDB/SQLite)
- Suggest which to include in InfluxDB
- Detect large JSON attributes that bloat the DB
- Compare current results against previous runs
Features (configurable):
- SQLite and MariaDB support
- Yaml config file (thoroughly config.yaml)
- Threshold-based categories: exclude from recorder or include in influxdb
- Attribute size analysis
Requirements:
- Python 3.10+
- Modules:
yaml,mysql.connector
Clone:
git clone https://github.com/richardoriginal/ha_db_hogs.git
cd ha_db_hogs
pip install -r requirements.txtEdit config.yaml to set:
- DB type & connection info
- Thresholds for influxdb / exclusion
- History file path
- Output limits
Example config.yaml:
db_type: sqlite
sqlite_path: data/home-assistant_v2.db
please_keep_reason: True
thresholds:
exclude_updates_per_hour: 10
influxdb.updates_per_hour: 100
poutput:
show_reason: True
limit: 10000
compare_with_history: True
history_file: comp/db_hog_history.jsonpython ha_db_hogs.pyOptional: specify a different config:
python ha_db_hogs.py --config examples/sample_config.yamlShows:
- Entity ID
- Total state changes
- Updates per hour
- Attribute size
- Suggested category
- Reason for classification (optional)
The script will print the top entities by update count & attribute size, plus suggested lists for recorder.default & influxdb yaml configs.
Example output:
Top Entities by Updates & Attribute Size:
Entity ID Count Upd/hr AttrSize Category Reason
---------------------------------------------------------------------------------------------------------------
sensor.office_presence_epl_target_1_distance 592235 2539.32 138 influxdb updates/hr > 200
sensor.car_data 1230 0.01 55 exclude updates/hr < 0.01
sensor.memory_use 54124 224.47 128 influxdb updates/hr > 200
sensor.front_door_battery 2500 0.50 45 keep default rule
Also outputs:
- recorder.yaml exclude list
- influxdb.yaml include list
This "should" work but I haven't run it, so YMMV:
Home Assistant automation example:
automation:
- alias: "Run DB Hog Analyzer Monthly"
trigger:
platform: time
at: "03:00:00"
condition:
condition: template
value_template: "{{ now().day == 1 }}"
action:
service: shell_command.run_db_hogs
shell_command:
run_db_hogs: "python3 /config/ha_db_hogs.py --config /config/config.yaml"- Run with default thresholds first
- Only change thresholds if output looks too large/small
- Keep history to track changes over months
- Fine-tune thresholds per device class
- Integrate output into HA config with !include
- Use history diff to track long-term trends