feat(zigbee): add Zigbee Herdsman plugin for direct Zigbee device integration#512

Open
akadlec wants to merge 38 commits into main from
claude/implement-zigbee-herdsman-plugin-5PExy
Conversation

@akadlec
Contributor

@akadlec akadlec commented Mar 30, 2026

Summary

This PR introduces a comprehensive Zigbee Herdsman plugin that enables direct Zigbee device integration via the zigbee-herdsman library. The plugin provides self-contained Zigbee network management with device discovery, adoption, and real-time monitoring capabilities.

Key Changes

Backend Implementation

  • Core Services

    • ZigbeeHerdsmanAdapterService: Wraps zigbee-herdsman Controller with typed events and lifecycle management for serial communication with Zigbee coordinators
    • ZigbeeHerdsmanService: Main plugin service implementing IManagedPluginService interface for plugin lifecycle and state management
    • ZhDeviceAdoptionService: Handles device adoption workflow with automatic channel and property mapping
    • ZhMappingPreviewService: Generates device mapping previews before adoption with customizable expose overrides
    • ZhDeviceConnectivityService: Monitors device connectivity with configurable timeouts for mains and battery-powered devices
  • Configuration & Models

    • ZigbeeHerdsmanConfigModel: Complete configuration model with serial port, network, discovery, and database settings
    • ZigbeeHerdsmanDiscoveredDeviceModel: Response model for discovered devices with interview status and availability tracking
    • Comprehensive response models for mapping previews, coordinator info, and device adoption
  • Controllers & DTOs

    • ZigbeeHerdsmanDiscoveredDevicesController: REST API endpoints for device discovery, permit join, mapping preview, and device adoption
    • Request/response DTOs with full validation and Swagger documentation
  • Constants & Mappings

    • Device type mappings (Coordinator, Router, EndDevice)
    • Channel category mappings from Zigbee exposes (light, switch, thermostat, sensors, etc.)
    • Property bindings for common Zigbee properties (state, brightness, temperature, humidity, etc.)
    • Interview status tracking (not_started, in_progress, completed, failed)
  • Database Entities

    • ZigbeeHerdsmanDeviceEntity: Extends base DeviceEntity with IEEE address and device type
    • ZigbeeHerdsmanChannelEntity: Channel entity for Zigbee devices
    • ZigbeeHerdsmanChannelPropertyEntity: Channel property entity for device properties
  • Platform Integration

    • ZigbeeHerdsmanDevicePlatform: Implements IDevicePlatform interface for device command execution and property updates
  • Exception Handling

    • Custom exceptions for connection failures, validation errors, coordinator offline scenarios, and command execution failures

Admin UI Implementation

  • Configuration Form (zigbee-herdsman-config-form.vue)

    • Serial port configuration (path, baud rate, adapter type)
    • Network settings (channel, PAN ID, extended PAN ID, network key)
    • Discovery settings (permit join timeout)
    • Database configuration
  • Device Management Forms

    • Device add form with IEEE address, name, and category selection
    • Device edit form for updating device properties
    • Mapping preview component for visualizing device structure before adoption
  • Store & Schemas

    • Zod schemas for configuration, devices, channels, and channel properties
    • Store type definitions for type-safe state management
    • Locale support with English translations

Plugin Integration

  • Module registration in both backend (app.module.ts) and admin (app.main.ts)
  • Full OpenAPI documentation with extra models
  • Plugin lifecycle management with proper initialization and cleanup

Notable Implementation Details

  • Exposes Format Compatibility: Uses the same exposes format as zigbee2mqtt since both leverage zigbee-herdsman-converters
  • Device Interview Tracking: Monitors device interview status to ensure devices are properly configured before use
  • Flexible Mapping: Supports customizable property mappings with expose overrides during device adoption
  • Connectivity Monitoring: Implements separate timeout strategies for mains-powered and battery-powered devices
  • Serial Communication: Supports multiple adapter types (auto, zstack, ember, deconz, zigate) with configurable b

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom


Note

High Risk
High risk due to introducing a new backend plugin with new REST endpoints, device command execution path, and a database migration adding Zigbee-specific columns, plus new runtime dependencies (zigbee-herdsman, serialport).

Overview
Adds a new Zigbee Herdsman plugin end-to-end: the admin app registers the plugin and ships new config/device add/edit forms with schemas and English i18n, enabling configuration of coordinator serial/TCP settings, network channel, discovery timeouts, and database path.

On the backend, registers a new Nest plugin module and exposes new APIs for discovered devices (list, coordinator info, permit-join, mapping preview, adopt, remove), plus a device platform implementation to send commands via zigbee-herdsman-converters with retries/backoff and a connectivity monitor service with configurable timeouts. Includes a TypeORM migration adding Zigbee metadata columns to devices and channel properties, updates backend dependencies (adds zigbee-herdsman, zigbee-herdsman-converters, serialport), and makes Shelly reconnection logic tolerant of missing resetReconnectInterval.

Reviewed by Cursor Bugbot for commit c5c884a. Bugbot is set up for automated code reviews on this repo.

@akadlec akadlec self-assigned this Mar 30, 2026
@github-actions github-actions Bot added the backend (Backend app related), admin (Admin app related), and docs labels Mar 30, 2026
@akadlec akadlec changed the title feat: add Zigbee Herdsman plugin for direct Zigbee device integration feat(zigbee): add Zigbee Herdsman plugin for direct Zigbee device integration Mar 30, 2026
Comment thread apps/backend/src/plugins/devices-zigbee-herdsman/models/config.model.ts Outdated
Comment thread apps/admin/src/plugins/devices-zigbee-herdsman/store/keys.ts Outdated
Comment thread apps/backend/src/migrations/1743400000000-AddZigbeeHerdsmanColumns.ts Outdated
claude added 18 commits April 3, 2026 18:49
1. Remove unused ZhCoordinatorInfo interface: defined but never
   referenced. The adapter's getCoordinatorInfo() uses its own inline
   return type, and the controller builds ZhCoordinatorInfoModel
   directly. Removing eliminates a misleading contract that could
   drift from the implementation.

2. Add exponential backoff to command retries: convertSet retry loop
   now waits 250ms * 2^(attempt-1) between attempts (250ms, 500ms,
   1000ms). Immediate retries on a low-bandwidth Zigbee mesh often
   hit the same congestion, wasting coordinator bandwidth.

3. Validate permitJoin timeout DTO: added @IsInt, @Min(1), @Max(254)
   validators to ZhPermitJoinRequestDto.timeout. Also added Math.min
   cap of 254 in the adapter service as defense-in-depth. Previously
   any number was accepted, allowing very large timeouts that leave
   the Zigbee network open to joins indefinitely.

Issue 4 (data_type key in toInstance) does not exist: verified that
CreateDeviceChannelPropertyDto declares the field as `data_type`
with bare @Expose(), so the snake_case key maps correctly.
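
The backoff schedule from item 2 can be sketched as follows. Only the 250ms * 2^(attempt-1) formula comes from this commit; `backoffDelayMs`, `sendWithRetry`, and the injected `wait` function are hypothetical names, and the actual convertSet retry loop may be structured differently.

```typescript
// Hypothetical helper names; only the delay formula is taken from the commit message.
const BASE_DELAY_MS = 250;

/** Delay to wait after failed attempt number `attempt` (1-based): 250, 500, 1000, ... */
function backoffDelayMs(attempt: number, baseMs: number = BASE_DELAY_MS): number {
  return baseMs * 2 ** (attempt - 1);
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

/**
 * Runs `command` once and retries up to `maxRetries` times,
 * waiting with exponential backoff between attempts.
 */
async function sendWithRetry<T>(
  command: () => Promise<T>,
  maxRetries: number = 3,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxRetries + 1; attempt++) {
    try {
      return await command();
    } catch (err) {
      lastError = err;
      // Only wait if another attempt will follow.
      if (attempt <= maxRetries) {
        await wait(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

Injecting `wait` keeps the loop testable without real timers; whether the plugin does this is not stated in the commit.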

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
checkConnectivity() was calling findAll() inside
updateDeviceConnectionState() for each device that changed state,
resulting in N separate full-table queries per 60s check cycle.

Restructured to: collect all IEEE addresses needing updates first,
then issue a single findAll() query, build a Map<ieeeAddress, device>
lookup, and iterate the updates against it. This reduces N+1 queries
to exactly 1 (or 0 when no state transitions occurred).
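
A minimal sketch of the restructured flow, assuming simplified shapes for devices and state transitions. `AdoptedDevice`, `StateTransition`, `buildLookup`, and `resolveUpdates` are hypothetical names; only the collect-then-single-query-then-Map pattern comes from the description above.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface AdoptedDevice {
  id: string;
  ieeeAddress: string;
}

interface StateTransition {
  ieeeAddress: string;
  online: boolean;
}

/** Builds an ieeeAddress -> device lookup from the result of one findAll() call. */
function buildLookup(devices: AdoptedDevice[]): Map<string, AdoptedDevice> {
  return new Map(devices.map((d): [string, AdoptedDevice] => [d.ieeeAddress, d]));
}

/** Resolves pending transitions against the lookup; addresses with no adopted device are skipped. */
function resolveUpdates(
  transitions: StateTransition[],
  lookup: Map<string, AdoptedDevice>,
): Array<{ deviceId: string; online: boolean }> {
  const updates: Array<{ deviceId: string; online: boolean }> = [];
  for (const t of transitions) {
    const device = lookup.get(t.ieeeAddress);
    if (device) updates.push({ deviceId: device.id, online: t.online });
  }
  return updates;
}

/** One check cycle: zero queries when nothing changed, exactly one otherwise. */
async function checkConnectivity(
  transitions: StateTransition[],
  findAll: () => Promise<AdoptedDevice[]>,
): Promise<Array<{ deviceId: string; online: boolean }>> {
  if (transitions.length === 0) return [];
  return resolveUpdates(transitions, buildLookup(await findAll()));
}
```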

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
1. Fix stale discovered devices on leave: onDeviceLeave now calls
   discoveredDevices.delete() instead of only setting available=false.
   Devices that leave the network (e.g. factory reset) are fully
   removed from the discovered map, matching the removeDevice()
   behavior and preventing stale entries in the adoption UI.

2. Use crypto.randomBytes for network key material: replaced
   Math.random() (predictable, not cryptographically secure) with
   Node.js crypto.randomBytes for the 128-bit network encryption key
   and extended PAN ID, and crypto.randomInt for PAN ID. Math.random
   output can be reconstructed, which would allow decrypting all
   Zigbee network traffic.
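
For item 2, key generation with Node.js crypto might look like the sketch below. The function names are hypothetical, and the PAN ID range (excluding 0x0000 and 0xFFFF) is an assumption rather than something this commit states.

```typescript
import { randomBytes, randomInt } from "node:crypto";

/** 128-bit network key as an array of 16 byte values. */
function generateNetworkKey(): number[] {
  return Array.from(randomBytes(16));
}

/** 64-bit extended PAN ID as an array of 8 byte values. */
function generateExtendedPanId(): number[] {
  return Array.from(randomBytes(8));
}

/**
 * 16-bit PAN ID. randomInt's upper bound is exclusive, so this yields
 * 0x0001..0xFFFE (assumed valid range; not taken from the commit).
 */
function generatePanId(): number {
  return randomInt(0x0001, 0xffff);
}
```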

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
1. Fix permitJoin timeout: pass the timeout value to zigbee-herdsman's
   native controller.permitJoin(true, joinTimeout) API instead of only
   passing true. The controller manages its own internal timer and
   auto-disables join. The local setTimeout is kept as a safety
   fallback (joinTimeout + 1s) to update the permitJoinEnabled flag
   in case the controller's timer doesn't fire (e.g. adapter
   disconnection).

2. Reorder controller routes: moved static routes (coordinator-info,
   adopt, permit-join) before parameterized routes (:ieeeAddress/*).
   In NestJS, route registration order matters: if a @Get(':ieeeAddress')
   were added in the future, it would shadow @Get('coordinator-info')
   if declared first. Added section comments to make the ordering
   intent explicit.

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
1. Stop re-wrapping HttpException subclasses: the controller was
   catching DevicesZigbeeHerdsmanException/NotFoundException/
   ValidationException and re-wrapping via err.message into
   UnprocessableEntityException/NotFoundException. Since these are
   already HttpException subclasses with correct status codes,
   err.message contained the serialized {statusCode, message, error}
   object, producing garbled '[object Object]' responses. Removed
   the try/catch — NestJS handles HttpExceptions directly.

2. Fix permit-join response timeout: when disabling permit join
   (enabled=false), the response now returns timeout: 0 instead of
   the misleading 254-second default.

3. Detect unconsumed payload keys: after iterating all toZigbee
   converters, any payload keys not consumed by any converter now
   log a warning and set success=false. Previously, commands for
   properties with no matching converter were silently dropped while
   reporting success.
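
The check in item 3 can be illustrated roughly as below. `ToZigbeeConverterLike` is a hypothetical, reduced converter shape (only the list of keys a converter handles), and the real platform tracks consumption while it iterates converters rather than in a separate pass.

```typescript
// Hypothetical reduced shape: only the keys a converter can handle.
interface ToZigbeeConverterLike {
  key: string[];
}

/** Returns the payload keys that no converter declares it can handle. */
function findUnconsumedKeys(
  payload: Record<string, unknown>,
  converters: ToZigbeeConverterLike[],
): string[] {
  const handled = new Set<string>();
  for (const converter of converters) {
    for (const key of converter.key) handled.add(key);
  }
  return Object.keys(payload).filter((key) => !handled.has(key));
}

/** A command only counts as successful when every payload key was consumed. */
function evaluateCommand(
  payload: Record<string, unknown>,
  converters: ToZigbeeConverterLike[],
): { success: boolean; unconsumed: string[] } {
  const unconsumed = findUnconsumedKeys(payload, converters);
  return { success: unconsumed.length === 0, unconsumed };
}
```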

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
zigbee-herdsman natively supports TCP-to-serial bridges via
tcp://host:port paths (e.g. SLZB-06, SLZB-07, ser2net).

- Config validator: detect tcp:// paths and validate format (host:port
  with valid port range) instead of checking file accessibility via
  fs.accessSync which would fail for network paths. Actual TCP
  connectivity is verified when the adapter starts.

- Admin UI: updated serial port label to "Coordinator path", placeholder
  to show tcp:// example, and section description to mention network
  adapter support alongside USB dongles.
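
A format-only check for tcp:// coordinator paths could look like this sketch. The function names and the regular expression are assumptions; the plugin's validator may accept a different host syntax (this version does not handle bracketed IPv6 hosts), and it does not test connectivity.

```typescript
/** True when the coordinator path points at a TCP-to-serial bridge. */
function isTcpPath(path: string): boolean {
  return path.startsWith("tcp://");
}

/** Validates the tcp://host:port format and the port range, without connecting. */
function isValidTcpPath(path: string): boolean {
  const match = /^tcp:\/\/([^\s:\/]+):(\d{1,5})$/.exec(path);
  if (!match) return false;
  const port = Number(match[2]);
  return port >= 1 && port <= 65535;
}
```

A caller would branch on `isTcpPath` first and fall back to the file accessibility check for local serial device paths.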

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
This plugin is a standalone Zigbee integration with no dependency on
or relationship to the zigbee2mqtt plugin.

- Rename z2mProperty/z2m_property to zigbeeProperty/zigbee_property
  across the DTO, response model, mapping preview service, and
  adoption service

- Remove all Z2M/zigbee2mqtt mentions from comments, including
  constants (channel identifiers, access bits, property mappings,
  mapping functions) and interfaces (expose types)

- Rewrite plugin description and readme to describe the plugin as
  self-contained without referencing zigbee2mqtt as a comparison point

- Update readme to mention network-attached coordinators (SLZB-06/07)
  and TCP path support

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
…etworkKey transform

1. Fix device category detection for composite exposes: exposeTypes like
   'light' and 'switch' contain nested features arrays with property
   names (e.g. 'state', 'brightness'). The controller now extracts
   property names from both top-level and nested features, so
   mapZhCategoryToDeviceCategory correctly identifies devices like
   switches with power monitoring (OUTLET) instead of falling through
   to GENERIC.

2. Remove dead ZIGBEE_HERDSMAN_STORE_PREFIX: exported from keys.ts but
   never imported anywhere. Removed the unused constant.

3. Fix humidity dataType: changed from UCHAR (integer 0-255) to FLOAT.
   Zigbee humidity sensors report values with decimal precision
   (e.g. 65.32%), and UCHAR would truncate fractional readings.
   Matches temperature and pressure which already use FLOAT.

4. Remove @Transform redaction from networkKey: the toPlainOnly
   transform replaced number[] with the string '[REDACTED]', creating
   a type mismatch that could corrupt the key during serialization
   round-trips (plainToInstance → instanceToPlain → plainToInstance).
   The config API is admin-only and auth-required, so the raw value
   is acceptable. The field keeps @Expose for whitelist validation.

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
generatePreview only collected top-level expose property names,
missing nested features (e.g. state, brightness inside a light
expose). The controller's transformToDiscoveredDevice was already
fixed to iterate nested features. Applied the same pattern here
so mapZhCategoryToDeviceCategory produces consistent results in
both the preview API and the discovered-devices list.
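
The pattern of collecting property names from both levels can be sketched as follows. `ExposeLike` is a hypothetical, reduced version of the expose shape, and `collectPropertyNames` is an illustrative name, not the plugin's function.

```typescript
// Hypothetical reduced expose shape: composite exposes carry a features array.
interface ExposeLike {
  type: string;
  property?: string;
  name?: string;
  features?: ExposeLike[];
}

/** Collects property names from top-level exposes and their nested features. */
function collectPropertyNames(exposes: ExposeLike[]): string[] {
  const names: string[] = [];
  for (const expose of exposes) {
    if (expose.property) names.push(expose.property);
    for (const feature of expose.features ?? []) {
      const name = feature.property ?? feature.name;
      if (name) names.push(name);
    }
  }
  return names;
}
```

With only top-level names, a `light` expose would contribute nothing, which is why category detection could fall through to a generic result.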

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
1. Extract duplicated expose property extraction to shared
   extractExposeInfo() in constants. Both the controller's
   transformToDiscoveredDevice and the mapping preview service's
   generatePreview now call the same function, eliminating the
   risk of the logic drifting out of sync.

2. Fix adapter disconnect resource leak: onAdapterDisconnected set
   started=false but left this.controller populated. This caused:
   - stop() to return early (guard: !started || !controller) without
     cleaning up the controller
   - start() to skip the stop() call (guard: this.started) and
     overwrite this.controller, leaking the old instance's serial
     port handle, timers, and event listeners
   Now onAdapterDisconnected nulls out this.controller and cleans up
   the permit join timer, so both stop() and start() behave correctly
   after a disconnect.
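
The guard interaction from item 2 can be reproduced with a small state sketch. `AdapterSketch` and `ControllerLike` are hypothetical and synchronous for brevity; the real adapter service wraps an asynchronous controller.

```typescript
// Hypothetical stand-in for the wrapped controller.
interface ControllerLike {
  stop(): void;
}

class AdapterSketch {
  private started = false;
  private controller: ControllerLike | null = null;
  private permitJoinTimer: ReturnType<typeof setTimeout> | null = null;
  private readonly createController: () => ControllerLike;

  constructor(createController: () => ControllerLike) {
    this.createController = createController;
  }

  start(): void {
    if (this.started) this.stop();
    this.controller = this.createController();
    this.started = true;
  }

  stop(): void {
    if (!this.started || !this.controller) return;
    this.controller.stop();
    this.cleanup();
  }

  /** After the fix: drop the controller reference and timer, not just the flag. */
  onAdapterDisconnected(): void {
    this.cleanup();
  }

  hasController(): boolean {
    return this.controller !== null;
  }

  private cleanup(): void {
    if (this.permitJoinTimer) clearTimeout(this.permitJoinTimer);
    this.permitJoinTimer = null;
    this.controller = null;
    this.started = false;
  }
}
```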

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
Add TypeORM migration for zigbee-herdsman plugin entity columns:
- devices_module_devices: ieee_address, network_address,
  manufacturer_name, model_id, date_code, software_build_id,
  interview_completed
- devices_module_channels_properties: zigbee_cluster, zigbee_attribute

Update task checklist with completion status for all phases.

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
The previous down method used CREATE TABLE AS SELECT to recreate
tables without the new columns. This had multiple problems:
- Referenced wrong column names (data_type vs dataType)
- Omitted columns added by other plugins, silently dropping data
- Destroyed all indexes, foreign keys, unique constraints, and
  CHECK constraints on both tables

Replaced with ALTER TABLE DROP COLUMN (supported since SQLite 3.35.0).
All columns qualify: nullable, no constraints, not indexed, not PK.
Wrapped in try/catch for older SQLite versions where the columns
remain harmlessly (only used by zigbee-herdsman device type).

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
1. Include nested features in discovered device exposes: composite
   types (light, switch, climate) have no useful top-level property/
   name/access/unit — the actual capabilities live in nested features
   arrays. Now iterates features and adds them to the expose list so
   API consumers can see brightness, state, color_temp etc. from the
   discovery endpoint.

2. Add remoteFormSubmit default to withDefaults in all 3 Vue forms:
   the prop is optional (boolean | undefined) but the watcher declares
   (): boolean, so without a default it evaluates as undefined on first
   render. Added remoteFormSubmit: false to config form, device add
   form, and device edit form.

Issue 2 (mapZhTypeToDataType) was retracted — no bug exists.

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
1. Fix nested feature cast and undefined name: changed cast from
   ZhExposeInfoModel[] to an explicit inline type matching the actual
   zigbee-herdsman-converters feature shape. Added name fallback to
   feature.property when feature.name is undefined (common for features
   that only define property).

2. Add fractional value_step check to mapZhTypeToDataType: added
   optional valueStep parameter. When value_step is fractional (e.g.
   0.1 for temperature in tenths), returns FLOAT regardless of the
   min/max range. This prevents unknown numeric properties with
   decimal precision from being typed as UCHAR/USHORT/UINT. Both
   callers in mapping-preview now pass value_step.
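
A rough sketch of numeric type selection with the fractional step check from item 2. Only the rule that a fractional value_step yields FLOAT comes from this commit; the size thresholds, the handling of negative minimums, and the name `mapNumericDataType` are assumptions.

```typescript
type NumericDataType = "uchar" | "ushort" | "uint" | "float";

/** Picks a data type for a numeric expose from its optional range and step. */
function mapNumericDataType(
  valueMin?: number,
  valueMax?: number,
  valueStep?: number,
): NumericDataType {
  // From the commit: a fractional step means decimal precision, so use FLOAT.
  if (valueStep !== undefined && !Number.isInteger(valueStep)) return "float";
  // Assumption: negative ranges are not sized as unsigned types here.
  if (valueMin !== undefined && valueMin < 0) return "float";
  // Assumption: unknown ranges fall through to FLOAT.
  if (valueMax === undefined) return "float";
  // Assumed thresholds: 8-bit, 16-bit, then 32-bit unsigned.
  if (valueMax <= 255) return "uchar";
  if (valueMax <= 65535) return "ushort";
  return "uint";
}
```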

Issue 1 (config schema camelCase vs snake_case) does not exist:
transformConfigPluginResponse calls snakeToCamel() on the API response
before parsing with the schema, so camelCase fields are correct.

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
ZigbeeHerdsmanUpdatePluginConfigDto was missing the database_path
field. The config model exposes it, the admin form binds to it, and
the frontend update schema sends it — but the backend DTO silently
dropped it during validation because the property didn't exist.
Changes to the database path in the UI were never persisted.

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
zigbee-herdsman-converters' toZigbee state converter expects string
values "ON"/"OFF"/"TOGGLE" and throws if it receives anything else
(including boolean true/false). The Smart Panel stores state as
DataTypeType.BOOL, so the platform receives boolean values.

Added convertValueForZigbee() that maps:
- state: true → "ON", false → "OFF"
- state: "true"/"false" → "ON"/"OFF"
- state: "on"/"off"/"toggle" → uppercase
- All other properties: passed through unchanged
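
The mapping above, written out as a minimal sketch. The signature is an assumption; the commit names the function but does not show its parameters.

```typescript
/** Converts Smart Panel values to the strings the state converter expects. */
function convertValueForZigbee(key: string, value: unknown): unknown {
  if (key !== "state") return value;

  if (typeof value === "boolean") return value ? "ON" : "OFF";

  if (typeof value === "string") {
    const lower = value.toLowerCase();
    if (lower === "true") return "ON";
    if (lower === "false") return "OFF";
    if (lower === "on" || lower === "off" || lower === "toggle") {
      return lower.toUpperCase();
    }
  }

  // Anything else is passed through unchanged.
  return value;
}
```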

The other three issues were verified as invalid:
- endpoints is Endpoint[] (confirmed from zigbee-herdsman typings)
- mapZhTypeToDataType correctly falls through to FLOAT for unknowns
- Config schema camelCase is correct (snakeToCamel transform applied)

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
…pe sizing

1. Fix migration partial rollback: replaced single try-catch around
   9 DROP COLUMN statements with individual try-catch per column.
   If one fails (older SQLite), the others are still attempted, and
   each failure is isolated rather than leaving an inconsistent state.

2. Add convertSet guard: check typeof converter.convertSet === 'function'
   before iterating matching keys. Prevents runtime crash if a converter
   in the definition loses its function reference (e.g. corrupted entry).

3. Fix expose.access truthy check: changed from `expose.access` (falsy
   for 0) to `expose.access !== undefined`. Exposes with access=0 are
   now shown as "partial" in the preview instead of being silently
   dropped, since zigbee-herdsman-converters can legitimately use 0.

4. Fix mapZhTypeToDataType with only valueMax: separated the negative
   min check from the max sizing. Now when only valueMax is defined
   (e.g. value_max: 100 without value_min), the function returns UCHAR
   instead of falling through to FLOAT. The negative check runs
   independently of whether valueMax is defined.
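
The difference behind item 3 is small but easy to miss. In this sketch `ExposeAccessLike` and both function names are hypothetical; it only contrasts the truthy check with the explicit undefined check.

```typescript
// Hypothetical reduced shape: access is an optional bitmask.
interface ExposeAccessLike {
  property: string;
  access?: number;
}

/** Old behavior: an access value of 0 is treated as if it were missing. */
function hasAccessTruthy(expose: ExposeAccessLike): boolean {
  return Boolean(expose.access);
}

/** New behavior: only a genuinely absent access field is treated as missing. */
function hasAccessDefined(expose: ExposeAccessLike): boolean {
  return expose.access !== undefined;
}
```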

Config schema camelCase issue was already verified as invalid in
previous turns (snakeToCamel transform handles conversion).

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
…stSeen null

1. Add isCoordinatorOnline check to adoptDevice: adoption now fails
   early with a validation error if the coordinator is offline,
   preventing creation of Smart Panel devices from stale cached
   registry entries that are not currently reachable.

2. Add @IsDefined to ReqZhAdoptDeviceDto and ReqZhPermitJoinDto:
   sending {} or omitting the data field now returns a structured
   validation error instead of a 500 TypeError from dereferencing
   undefined body.data.

3. Remove unused ZigbeeHerdsmanDeviceAddSimpleFormSchema and its
   inferred type IZigbeeHerdsmanDeviceAddSimpleForm — neither was
   used outside their own definition files.

4. Fix devices with lastSeen=null staying online forever: the
   connectivity checker was skipping these devices entirely, leaving
   them reported as online indefinitely (registered with
   available=true but never entering timeout evaluation). Now treats
   lastSeen=null as offline.
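
Item 4 amounts to handling the null case before the timeout comparison, as in this sketch. The timeout values, the `PowerSource` type, and `isDeviceOnline` are hypothetical; the plugin's timeouts are configurable and are not given here.

```typescript
type PowerSource = "mains" | "battery";

// Hypothetical timeouts for illustration; the real values are configurable.
const TIMEOUT_MS: Record<PowerSource, number> = {
  mains: 10 * 60 * 1000,
  battery: 25 * 60 * 60 * 1000,
};

/** A device that has never been seen (lastSeen === null) is treated as offline. */
function isDeviceOnline(
  lastSeen: number | null,
  powerSource: PowerSource,
  now: number,
): boolean {
  if (lastSeen === null) return false;
  return now - lastSeen <= TIMEOUT_MS[powerSource];
}
```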

Config schema camelCase issue confirmed invalid (4th time) —
snakeToCamel transform in transformConfigPluginResponse handles it.

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
@akadlec akadlec force-pushed the claude/implement-zigbee-herdsman-plugin-5PExy branch from d402e1f to d0732fd Compare April 3, 2026 18:50
processIncomingMessage was receiving ZCL attribute data from device
messages but discarding everything except a debug log of link quality.
Adopted devices never reflected real-time state changes, making the
integration effectively write-only.

Now implements the full fromZigbee pipeline:
1. Iterates fromZigbee converters from the device definition, matching
   by cluster name (same pattern as zigbee-herdsman-converters expects)
2. Each converter's convert() translates raw ZCL data into a state
   object (e.g. { temperature: 22.5, humidity: 65.3 })
3. Adds linkquality to the state if present in the message
4. Fetches all channel properties for the adopted device
5. Matches state keys to property identifiers (set to zigbee expose
   names during adoption) and writes values via PropertyValueService
6. PropertyValueService handles persistence, change detection, and
   WebSocket broadcast
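
Steps 1 to 5 can be sketched with reduced types. `FromZigbeeConverterLike`, `PropertyLike`, `convertMessage`, and `matchStateToProperties` are hypothetical, and `convert` here takes only the raw data, which is simpler than the real converter call.

```typescript
// Hypothetical reduced shapes for illustration only.
type State = Record<string, unknown>;

interface FromZigbeeConverterLike {
  cluster: string;
  convert(data: State): State | undefined | Promise<State | undefined>;
}

interface PropertyLike {
  id: string;
  identifier: string;
}

/** Steps 1-3: run converters matching the cluster and merge their results. */
async function convertMessage(
  cluster: string,
  data: State,
  converters: FromZigbeeConverterLike[],
  linkquality?: number,
): Promise<State> {
  const state: State = {};
  for (const converter of converters) {
    if (converter.cluster !== cluster) continue;
    // Await so that async converters contribute values, not a Promise.
    Object.assign(state, (await converter.convert(data)) ?? {});
  }
  if (linkquality !== undefined) state.linkquality = linkquality;
  return state;
}

/** Steps 4-5: pair state keys with properties whose identifier matches. */
function matchStateToProperties(
  state: State,
  properties: PropertyLike[],
): Array<{ propertyId: string; value: unknown }> {
  return properties
    .filter((p) => Object.prototype.hasOwnProperty.call(state, p.identifier))
    .map((p) => ({ propertyId: p.id, value: state[p.identifier] }));
}
```

The matched pairs would then be handed to whatever service persists and broadcasts property values.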

The CI failure (devices-shelly-ng resetReconnectInterval) is a
pre-existing issue on main, not related to this plugin.

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
1. Fix shelly-ng CI blocker: resetReconnectInterval() doesn't exist on
   the RpcHandler type. Merged the method into the existing cast type
   and call via optional chaining. This was a pre-existing type error
   on main that blocked generate:openapi.

2. Fix fromZigbee converter call signature: convert() expects
   (model, msg, publish, options, meta) but was called with
   (herdsmanDevice, data, {}, meta, {}). Now passes:
   - model: discovered.definition (the device definition)
   - msg: full message object with data, cluster, type, device, endpoint
   - publish: no-op function (required callback, unused in our context)
   - options: empty object
   - meta: { state, logger, device, options }

3. Await converter.convert() result: fromZigbee converters may be async.
   Without await, the Promise object was spread into convertedState
   instead of the actual converted values, silently losing all data.

4. Add TOCTOU guard in platform processBatch: re-check isStarted()
   per device in the loop to detect asynchronous adapter disconnections
   between the initial check and actual command execution.

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).

claude added 2 commits April 3, 2026 21:11
…stence

1. Add missing @influxdata/influxdb-client dependency: the influx-v2
   plugin imports it but it wasn't in package.json, blocking
   generate:openapi on CI. Pre-existing issue on main.

2. Fix boolean conversion for all device types: convertValueForZigbee
   only converted booleans for the 'state' key. Other boolean-typed
   properties (lock_state, fan_state, etc.) also need conversion since
   toZigbee converters expect strings ("ON"/"OFF", "LOCK"/"UNLOCK").
   Now converts all boolean values regardless of key name.

3. Fix optimistic lastDbState update: was set before the DB write,
   so if setConnectionState threw or the device wasn't adopted yet,
   the state was marked as persisted and never retried. Now only
   updates lastDbState after successful setConnectionState call.
   Failed writes and unadopted devices will retry on the next cycle.

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
…rvice

PropertyValueService is a provider in DevicesModule but NOT in its
exports list, so it's unavailable for injection in other modules.
NestJS threw UnknownDependenciesException at startup.

Replaced with channelsPropertiesService.update() which IS exported
and matches the pattern used by the Z2M plugin for writing property
values. This also triggers the proper WebSocket events for real-time
UI updates.

https://claude.ai/code/session_014bjB9Cn1WKASNLBeCuSbom
@akadlec akadlec closed this Apr 13, 2026
@akadlec akadlec deleted the claude/implement-zigbee-herdsman-plugin-5PExy branch April 13, 2026 10:48
@akadlec akadlec restored the claude/implement-zigbee-herdsman-plugin-5PExy branch April 13, 2026 10:50
@akadlec akadlec reopened this Apr 13, 2026
Labels

admin Admin app related backend Backend app related docs

2 participants