Skip to content

DAB-LABS/smart-sniffer

Repository files navigation

SMART Sniffer

Proactive disk health monitoring for Home Assistant.
Sniff early. Sniff often. 🕵️‍♂️

Release Beta Build License


Uncorrectable errors. Pending failures quietly piling up. 🙀

SMART drives say "PASSED" right up until the end. 😵

Sniff early. Sniff often. 🕵️‍♂️

SMART Sniffer follows the trail, sniffing out the early warning signs your drives won't tell you about and reporting them to Home Assistant before it's too late. One agent per machine. Every drive gets a health score. Alerts fire automatically. No automations required.


Features

Early warning alerts — The Attention Needed sensor monitors leading indicators of failure across ATA, SATA, NVMe, and SAS drives. Four clear states: NO · MAYBE · YES · UNSUPPORTED.

Zero-config notifications — Persistent notifications fire automatically when drive health changes. No automations or blueprints to set up. Alerts escalate, de-escalate, and auto-dismiss.

Disk usage monitoring — Opt-in filesystem tracking during install. Monitor storage utilization on any mountpoint — the agent reports total, used, available bytes and percentage via a dedicated API endpoint.

Multi-machine monitoring — Install a lightweight Go agent on each machine. Each drive appears as its own HA device with full sensor entities and diagnostics.

Auto-discovery — Agents advertise themselves on the local network via mDNS/Zeroconf. Home Assistant discovers them automatically — no manual IP entry needed.

AI-ready drive analysis -- A service call returns full SMART data, health assessments, and storage metrics for any agent. Pipe it to Google AI, ChatGPT, Claude, or Ollama for automated drive health reports with zero parsing.

Secure by default — Optional bearer token authentication between agent and integration. SHA256-verified binary downloads.


How It Works

Architecture diagram

Each machine runs a lightweight smartha-agent binary that wraps smartctl and serves SMART data over HTTP. The HA integration polls each agent and creates devices, sensors, and health alerts for every drive it finds.

Entities per drive

Each drive gets its own HA device. Entities are created dynamically — if a drive doesn't report an attribute, the sensor is simply not created.

Disk Usage (per agent host, requires agent v0.5.0+ with filesystems configured):

Entity Description
Disk Usage — Root (/) Percentage used for the monitored mountpoint

Attributes: mountpoint, device, fstype, total_gb, used_gb, available_gb. One sensor per mountpoint configured during agent install. Grouped under a "Disk Usage ({hostname})" device alongside the drive devices.

Sensors:

Entity Description
Attention Needed Proactive health alert — NO / MAYBE / YES / UNSUPPORTED
Health SMART pass/fail — OK, Problem, or Unknown
Standby Whether the drive is currently spun down. When On, exposes a data_as_of attribute showing when the cached SMART readings were last refreshed.
Temperature Current drive temp (°C)
Power-On Hours Total hours powered on
SMART Status Raw SMART verdict (PASSED / FAILED)

Diagnostic sensors (conditional):

Entity Description
Reallocated Sector Count Bad sectors remapped to spares (ATA)
Reported Uncorrectable Errors Unrecoverable read/write errors (ATA)
Wear Leveling / Percentage Used SSD endurance indicator
Power Cycle Count Total power on/off cycles
Reallocated Event Count Individual reallocation events (ATA)
Spin Retry Count Motor spin-up retries — HDD only
Command Timeout Internal command timeouts
Available Spare NVMe reserve block pool (%)
Available Spare Threshold Manufacturer-set minimum spare (%)
Current Pending Sector Count Sectors waiting for reallocation (ATA)

Vendor-specific SMART attributes (ATA/SATA drives in smartctl database, disabled by default):

For drives recognized by smartctl's database (drivedb.h), every named SMART attribute not already covered by the sensors above is created as a diagnostic entity. These are disabled by default -- enable what you need from the device page. Examples: erase counts, host writes/reads, program/erase fail counts, available reserved space, remaining lifetime percentage. The exact attributes vary by drive manufacturer and model.

Entities per agent

Each agent host groups under its own HA device alongside the drive devices. These entities describe the agent itself rather than any one drive. Enabled-by-default entities show up as soon as you reload the integration; the rest are visible from the device page's "disabled" section and can be turned on from entity settings.

Primary (enabled by default):

Entity Type Description
Agent Status binary_sensor Connectivity. On when reachable, Off when disconnected. Stays available across outages so automations can trigger on it.
Agent Version sensor Semantic version reported by the agent. Drives the "Agent version outdated" HA repair notification.
OS sensor Agent host OS: linux, darwin, or windows. Reports unknown for agents older than v0.5.3.

Diagnostic (disabled by default):

Entity Type Description
Agent Last Seen sensor Timestamp of the most recent successful poll.
Agent IP sensor IP address recorded for the agent in the HA config entry.
Agent Port sensor Port the agent is serving on (default 9099).
HA Poll Interval sensor How often Home Assistant polls the agent, in seconds. Distinct from the agent's own scan_interval.
Auth Active binary_sensor On when a bearer token is configured for this agent.
Attention Needed — how it classifies drives

The Attention Needed sensor evaluates individual SMART attributes every poll cycle:

State Meaning Action
NO All monitored indicators clear None required
MAYBE Early degradation signals detected Monitor closely, plan replacement
YES Data integrity at risk Back up immediately
UNSUPPORTED No usable SMART data Common with USB enclosures

Critical triggers (→ YES): Reallocated sectors, pending sectors, uncorrectable errors, NVMe critical_warning, NVMe media_errors, spare depletion below threshold.

Warning triggers (→ MAYBE): Reallocated events, spin retry count, command timeouts, NVMe spare < 20%, NVMe percentage_used ≥ 90%.

When a drive's state changes, a persistent notification fires in HA automatically. Notifications escalate on worsening conditions and dismiss when resolved. See attention-severity-logic.md for the full specification.

Agent configuration

Create config.yaml in the working directory or /etc/smartha-agent/:

port: 9099
token: "your-secret-token"    # optional -- omit to disable auth
scan_interval: 60s
standby_mode: standby          # optional -- never, standby, sleep, or idle
advertise_interface: eth0      # optional -- restrict mDNS to this interface
exclude_devices:               # optional -- set by installer's drive picker
  - /dev/sdb
filesystems:                   # optional -- set by installer's disk usage picker
  - path: /
    uuid: a1b2c3d4-5678-90ab-cdef-1234567890ab
    device: /dev/sda1
    fstype: ext4

All options can also be set via CLI flags: --port, --token, --scan-interval, --interface, --config.

Exclude devices: Device paths listed in exclude_devices are skipped during every scan. The agent resolves symlinks at startup, so /dev/disk/by-id/... paths and their /dev/sdX equivalents both match. The installer's drive picker (Linux) shows all detected drives and lets you choose which ones to monitor -- useful for excluding iSCSI LUNs, USB backup drives, or anything you don't want polled. Drives with remote-storage transports (iSCSI, Fibre Channel) are flagged in yellow and excluded from the default selection. You can also add paths manually and restart the service. If a device appears in both exclude_devices and device_overrides, the exclusion wins and a warning is logged.

Scan interval: Uses Go duration syntax -- 30s, 5m, 1h, 24h are all valid. When standby_mode is set, the agent skips sleeping drives and serves cached data, so the interval does not cause unnecessary wake-ups. When standby_mode is never (the default), each poll wakes any drive that is spun down. This is the agent-side read cadence and is separate from the HA Poll Interval entity, which reflects how often Home Assistant pulls fresh data from the agent itself.

Standby mode: Controls whether the agent avoids waking sleeping drives during polling. Set to standby, sleep, or idle to match your drives' power management (these correspond to smartctl -n modes). When set, the agent passes -n <mode> to smartctl on each poll -- if a drive is in that power state, smartctl exits without waking it and the agent serves the last cached SMART data with an in_standby flag. The default is never, which wakes drives on every poll. On the very first poll after startup, the agent always wakes all drives regardless of this setting to collect a SMART baseline (serial number, model, attributes). This one-time wake ensures every drive is registered with a stable identity from the start. Subsequent polls honor the standby setting normally.

Network interface: The advertise_interface setting restricts mDNS to a single interface. The installer sets this during setup if you pick a specific interface. When not set, the agent auto-filters known virtual interfaces (Docker, ZeroTier, Tailscale, WireGuard, etc.) and advertises on all remaining physical interfaces. To change the interface after install, edit config.yaml and restart the service — no reinstall needed.

Authentication: When a token is set, every request to the agent must include an Authorization: Bearer <token> header — requests without it receive a 401 Unauthorized response. When adding the agent in Home Assistant, enter the same token in the integration's config flow. If no token is set, the agent serves data openly without auth.

Disk usage monitoring: The installer's Disk Usage Monitoring picker lets you select which mountpoints to track. Each configured mountpoint is polled on the same interval as SMART data and served via /api/filesystems. Mountpoints are identified by UUID (via blkid) for stable entity identity across reboots. If no filesystems are configured, the endpoint simply isn't registered and callers receive a 404. To add or change monitored mountpoints after install, re-run the installer or edit config.yaml directly and restart the service.

Auto-discovery: The agent advertises itself on the local network via mDNS (Zeroconf) by default. HA automatically detects running agents and prompts you to set them up — no manual IP entry needed. Disable with mdns: false in config or --no-mdns flag. Note: mDNS is link-local, so agents on different VLANs won't be discovered without an mDNS reflector.

Multi-homed hosts: Machines running Docker, VPNs (ZeroTier, Tailscale, WireGuard), or virtual bridges have multiple network interfaces. If the agent advertises on a VPN or container interface, HA may discover it at an unreachable IP and fail to connect. The installer's interface picker and the advertise_interface config option solve this — see Platform Install Paths for details.

API endpoints:

Endpoint Description
GET /api/health Agent status, version, available endpoints, drive/filesystem counts
GET /api/drives Summary list of all discovered drives
GET /api/drives/{id} Full SMART data for a single drive
GET /api/filesystems Disk usage for all configured mountpoints (only registered when filesystems are configured)

Service management:

Platform Status Logs Restart
Linux systemctl status smartha-agent journalctl -u smartha-agent -f systemctl restart smartha-agent
macOS launchctl list | grep smartha tail -f /var/log/smartha-agent.log sudo launchctl kickstart -k system/com.dablabs.smartha-agent
Windows Get-Service SmartHA-Agent Event Viewer Restart-Service SmartHA-Agent
Building from source

Requires Go 1.22+.

cd agent
make              # build for current platform
make all          # cross-compile all targets
make release      # build all + generate SHA256 checksums
make clean        # remove build artifacts

Binaries output to agent/build/.

Platform Architecture Binary Status
Linux amd64, arm64 smartha-agent-linux-amd64, -arm64 Tested
macOS amd64 (Intel), arm64 (Apple Silicon) smartha-agent-darwin-amd64, -arm64 Tested
Windows amd64 smartha-agent-windows-amd64.exe Tested (v0.5.1+)

Quick Start

Requires: smartmontools 7.0+ on each monitored machine (for JSON output support). The installer handles installation automatically (Homebrew on macOS, apt/dnf/yum on Linux), but some older distros ship smartctl 6.x which does not support the --json flag the agent relies on. Run smartctl --version to check. If you're on 6.x, install a newer version from the smartmontools releases page or from a backports repository.

Optional: btrfs-progs is recommended on systems with btrfs filesystems. The installer's disk-usage picker and the agent's /api/filesystems endpoint both fall back to btrfs filesystem usage --raw when statvfs returns zero on a btrfs mount (a known quirk on some multi-device or near-full configurations). Without btrfs-progs, btrfs entries display as (unknown size) in the picker and report zero-byte usage from the API. Most distros include btrfs-progs by default if any btrfs filesystems exist on the system.

1. Install the agent

Run on each machine you want to monitor:

curl -sSL https://raw.githubusercontent.com/DAB-LABS/smart-sniffer/main/install.sh | sudo bash
Windows (PowerShell as Admin)
irm https://raw.githubusercontent.com/DAB-LABS/smart-sniffer/main/install.ps1 | iex

Important: This must be run from PowerShell, not Command Prompt (CMD). If you see 'irm' is not recognized, you're in CMD -- open PowerShell as Administrator instead. Right-click the Start button and select Terminal (Admin) or search for "PowerShell" in the Start menu.

Note: Requires v0.5.1 or later. Earlier Windows builds had a service startup bug tracked as #13. Service events appear in Event Viewer under Windows Logs → Application with source SmartHA-Agent.

macOS: "unidentified developer" warning

The agent binary is not code-signed or notarized. macOS Gatekeeper will block it on first run with a "cannot be opened" or "unidentified developer" dialog.

If you installed via the install script, remove the quarantine flag manually:

sudo xattr -d com.apple.quarantine /usr/local/bin/smartha-agent

Then restart the service:

sudo launchctl kickstart -k system/com.dablabs.smartha-agent

Alternatively, go to System Settings > Privacy & Security, find the blocked app, and click Open Anyway.

Note: The right-click > Open workaround was removed in macOS Sequoia (15.0). The xattr method above works on all current macOS versions including Sequoia.

Pin a specific version
VERSION=0.1.0 curl -sSL https://raw.githubusercontent.com/DAB-LABS/smart-sniffer/main/install.sh | sudo bash

The installer detects your OS and architecture, downloads the correct binary from GitHub Releases, verifies the SHA256 checksum, installs smartmontools if missing, prompts for configuration (port, token, scan interval, disk usage monitoring, mDNS interface), and sets up a system service.

Disk usage monitoring is opt-in. During a fresh install or upgrade to v0.5.0+, the installer asks which mountpoints to monitor. Select the mountpoints you care about. The agent will report usage percentage, total/used/available bytes for each.

On machines with multiple network interfaces (Docker, ZeroTier, Tailscale, etc.), the installer presents an interface picker so mDNS discovery advertises on the right network. Pick your LAN interface — not "All interfaces" — if you run VPNs or containers, otherwise HA may discover the agent at an unreachable IP.

Agent install on Linux

Uninstall the agent
curl -sSL https://raw.githubusercontent.com/DAB-LABS/smart-sniffer/main/install.sh | sudo UNINSTALL=1 bash

Stops the service, removes the binary, config, and service files.

2. Add the integration to Home Assistant

Via HACS (recommended):

  1. Open HACS → three-dot menu → Custom repositories
  2. Add https://github.com/DAB-LABS/smart-sniffer · Category: Integration
  3. Download SMART Sniffer → Restart HA

Manual: Copy custom_components/smart_sniffer/ into your HA custom_components/ directory and restart.

3. Connect to the agent

Auto-discovery (recommended): The agent advertises itself via mDNS. After a few seconds Home Assistant will show a discovery notification — just click Add and you're done. If the agent has a bearer token, you'll be prompted for it.

Auto-discovery prompt

Manual: Settings → Devices & Services → Add Integration → SMART Sniffer — enter the agent's host, port, optional token, and polling interval.

Every drive on the machine appears as its own HA device.

NAS & RAID Setup

Most Linux, macOS and Windows machines work out of the box. NAS devices and RAID controllers sometimes need extra steps because their storage controllers present drives differently than standard SATA/NVMe.

Platform guides: For step-by-step walkthroughs on specific platforms, see the Platform Installation Guides -- covering Synology, QNAP, TrueNAS SCALE, Unraid, Proxmox, Docker, virtual machines, and hardware RAID controllers.

Diagnosing drive detection

If the agent starts but reports no drives (or fewer than expected), run:

smartha-agent --discover

This probes every drive the OS exposes, tests protocol detection, and tells you exactly what the agent will see at runtime. If any drives need manual configuration, it offers to write the config for you. See the Drive Discovery guide for full details and example output.

Paste the --discover output into a GitHub issue if you need help -- it gives us everything we need to diagnose remotely.

Raw SMART data

Need the full SMART dump for a drive? The integration includes everything smartctl returns in the diagnostics download. Go to Settings > Devices & Services > SMART Sniffer > three-dot menu > Download Diagnostics. The downloaded JSON file contains the complete raw SMART data for every drive on that agent, along with the attention evaluation and agent metadata. Useful for debugging unexpected sensor values, sharing in bug reports, or passing to an AI for deeper analysis.

The agent also exposes raw data via its REST API at http://<agent-ip>:9099/api/drives (summary) and http://<agent-ip>:9099/api/drives/{id} (full SMART JSON for a single drive).

NAS devices

NAS platforms have platform-specific quirks -- proprietary device paths (Synology), SCSI-to-ATA protocol mismatches (QNAP), outdated smartmontools versions, and LXC bridge interfaces that confuse mDNS. The agent handles most of this automatically since v0.5.5, but some platforms need device_overrides or a newer smartmontools. See the platform-specific guides: Synology, QNAP, TrueNAS SCALE, Unraid.

Hardware RAID controllers

Drives behind hardware RAID controllers (MegaRAID, HP SmartArray, 3ware, Areca) are hidden from the OS behind the RAID layer. smartctl needs a RAID-specific device type flag to reach each physical drive. --discover does not probe RAID controllers yet, so these require manual device_overrides. See the RAID Controllers guide for controller identification, drive enumeration, and config examples.

smartmontools version

The agent requires smartmontools 7.0+ for JSON output support. Most current Linux distros ship 7.x. Older or embedded systems (Synology DSM, QNAP QTS, some RHEL/CentOS 7 installs) may ship 6.x or older. Run smartctl --version to check. The agent logs a clear error on startup if the version is too old.


AI & Automation

SMART Sniffer includes a smart_sniffer.get_drive_data service call that returns the complete cached SMART dataset for any agent. The response includes raw SMART attributes, our attention evaluation (state, severity, and reasons), filesystem usage, and agent metadata -- everything an LLM needs to produce a meaningful health assessment.

What's in the response:

Key Content
agent Name, host, OS, uptime, version
drives Per-drive: model, serial, protocol, device path, attention state/severity/reasons, full SMART JSON
filesystems Per-mount: mountpoint, total bytes, used bytes, percentage

Example use case: automated weekly health report. Create an HA automation that runs every Monday morning, calls smart_sniffer.get_drive_data for each agent, passes the JSON to a conversation agent (Google AI, OpenAI, Claude, Ollama) with a prompt like "Analyze these SMART readings and flag anything concerning," then emails you the AI's response. The AI can catch patterns like rising reallocated sector counts, aging drives approaching end-of-life, or storage filling up -- things the raw sensor values alone don't always make obvious.

The service uses supports_response: only, so it also works from Developer Tools > Services for manual inspection of the full response payload.

For one-off analysis without automation, use Download Diagnostics (Settings > Devices & Services > SMART Sniffer > three-dot menu). The downloaded JSON contains the same raw SMART data and can be pasted directly into any AI chat interface. See Raw SMART data above.


Screenshots

NVMe SSD — Sensors NVMe SSD — Diagnostics
SATA SSD — Sensors SATA SSD — Diagnostics
Attention: YES (Critical) Trigger Reason in Diagnostics
Attention: MAYBE (Warning) Warning Reason in Diagnostics
🔌 What about USB drives?

External drives connected via USB enclosures typically block SMART passthrough. SMART Sniffer detects this and marks the drive as UNSUPPORTED with Health: Unknown.

This is the most common "why isn't my drive showing data?" scenario. It's a hardware limitation of the USB bridge chip, not a bug.


Documentation

Doc Description
Attention Severity Logic State machine, classification rules, notification lifecycle
Trigger → Entity Map Every attention trigger mapped to its sensor entity and icon
Early Warning Attributes Which SMART attributes predict failure and why
Attribute Name Variants Manufacturer-specific smartctl name mapping research
Drive Discovery (--discover) Probe drives, detect protocols, auto-generate config
Mock Agent Testing tool -- fake agent with controllable drives
Platform Install Paths Install locations, immutable rootfs support, network interface filtering
Agent Version Repair Design doc for agent version checking and HA repair notifications
Build Journal Design decisions, iteration history, known issues
Examples Community-contributed automations -- copy, paste, adapt

Platform Installation Guides

Step-by-step setup for NAS devices, hypervisors, and containerized environments. See the guides hub for the full index.

Platform What's covered
Proxmox + HA Agent on host, integration in VM, mDNS, firewall
Synology DSM SynoCli smartmontools, /dev/sataX paths, --discover
QNAP QTS SAT fallback, lxcbr0 exclusion, interface selection
TrueNAS SCALE ZFS context, btrfs-progs, filesystem monitoring
Unraid br0 bridge, Docker deployment (community -- in progress)
Docker Device passthrough, host networking (community -- in progress)
Virtual Machines ESXi, Hyper-V, VirtualBox -- why SMART needs the host
Hardware RAID Controllers MegaRAID, HP SmartArray, 3ware, Areca -- manual device_overrides

Community Deployments

Deployment Maintainer Description
Docker @fireinice Dockerfile + auto-generated docker-compose with per-drive capability scoping. Available on Docker Hub.

Note: disk usage monitoring (/api/filesystems) is not yet supported in Docker deployments. Container-aware path mapping is on the roadmap.

Testing

The integration has been tested against the included Mock Agent — a standalone Python tool that simulates a smartha-agent with fully controllable fake drives. It serves the same API as the real agent, with a web dashboard for changing SMART attributes in real time. Useful for validating attention state transitions, notification behavior, and new drive types without waiting for real hardware to degrade.

python3 tools/mock-agent.py --port 9100 --preload sata_hdd,nvme,usb_blocked

Point the HA integration at localhost:9100 and you're testing.

Contributing

Found a bug? Have a drive that isn't mapping correctly? See CONTRIBUTING.md for how to help.

Drive-specific smartctl -a --json output samples are especially welcome — they help us catch manufacturer name variants we haven't seen yet.

Roadmap

  • HAOS App -- SMART Sniffer App for Home Assistant OS
  • Integration icons for HA integrations page
  • Disk usage monitoring (agent-side) -- /api/filesystems endpoint with installer picker
  • Disk usage monitoring (integration-side) -- filesystem sensor entities in HA
  • Standby-aware polling (smartctl -n standby) -- shipped v0.5.3
  • Integration: agent connectivity sensor + diagnostic entities (version, last seen, IP, port, auth) -- shipped v0.5.3
  • Agent: smartctl minimum version check (fail early with clear message if < 7.0) -- shipped v0.5.3
  • Integration: dedicated Drive Standby binary sensor with data_as_of attribute -- shipped v0.5.4
  • Integration: Agent OS diagnostic sensor (linux / darwin / windows) -- shipped v0.5.4
  • NAS protocol detection (SAT fallback, --discover, device_overrides) -- shipped v0.5.5
  • Agent: smartctl path auto-resolution (finds newer smartctl when PATH version is outdated) -- shipped v0.5.5.1
  • Agent: broadened SAT fallback to all smartctl execution failure bits -- shipped v0.5.5.2
  • Agent: first-poll wake (collects SMART baseline from sleeping drives on startup) -- shipped v0.5.5.3
  • Agent: expanded mDNS interface filter (51 prefixes) + IP scoring improvement -- shipped v0.5.5.4
  • Installer: macOS quarantine removal + expanded interface picker labels -- shipped v0.5.5.5
  • Unified ATA/NVMe wear level scale + btrfs filesystem fallback + installer bind mount dedup -- shipped v0.5.6
  • Custom Lovelace card -- SMART Sniffer Card v1.0.19
  • Platform installation guides -- Proxmox, Synology, QNAP, TrueNAS SCALE, Unraid, Docker, VMs
  • Drive Discovery (--discover) documentation
  • MQTT agent mode
  • Configurable alert thresholds via options flow
  • Per-drive scan intervals
  • YAML-based SMART attribute definitions (vendor field mapping, transforms, units)
  • SAS/SCSI basic monitoring (health, temperature, power-on hours, power cycles) -- shipped v0.5.7
  • Vendor-specific SMART diagnostic entities (disabled-by-default, all named ATA attributes) -- shipped v0.5.7
  • SAS/SCSI full attribute parsing (grown defect list, error counters, endurance indicators)
  • Agent: container-aware filesystem reporting (MNT_PREFIX path mapping for Docker deployments)
  • Agent: runtime interface detection (replace static prefix list with OS-level physical NIC detection)
  • Integration: parent-agent device hierarchy + optional area-on-setup (drives nest under agent via via_device, area inherits via HA prompt)
  • Integration: split consolidated wear-leveling / uncorrectable / pending-sector sensors into separate diagnostic entities when a drive reports multiple variants -- shipped v0.5.14
  • Integration: smart_sniffer.get_drive_data service call for AI/automation access to full SMART data -- shipped v0.5.15, community PR by @nsleigh

MIT License · Built by DAB-LABS