Set up and use Prometheus, Grafana, Uptime Kuma, and Alertmanager for home lab observability.
Configure and operate the observability stack on a Raspberry Pi 5 home lab. Covers Prometheus scrape configuration, Grafana dashboard management, Uptime Kuma endpoint monitoring, Alertmanager routing, and ntfy push notifications for alerts.
Prerequisite: SSH connectivity to the Pi (checked with `homelab_sshTest`).
1. Validate connectivity with `homelab_sshTest`.
2. Call `homelab_serviceHealth` for the monitoring stack services: Prometheus (port 9090), Grafana (port 3000), Uptime Kuma (port 3001), and Alertmanager (port 9093).
3. If a service is unhealthy, call `homelab_serviceLogs` to retrieve recent logs and identify the problem.
4. For scrape configuration changes, edit `/opt/homelab/docker/monitoring/prometheus/prometheus.yml`.
5. Apply the change with `homelab_serviceRestart`.

Monitoring stack services and ports:

- Prometheus: http://pi:9090 -- metrics collection and alerting rules
- Grafana: http://pi:3000 -- visualization dashboards
- Uptime Kuma: http://pi:3001 -- uptime and endpoint monitoring
- Alertmanager: http://pi:9093 -- alert routing and silencing
- Node Exporter: http://pi:9100/metrics -- host-level metrics
- ntfy: http://pi:8080 -- push notification server

Config file locations:
/opt/homelab/docker/monitoring/
prometheus/prometheus.yml # Scrape configs and global settings
prometheus/rules/alerts.yml # Alerting rule definitions
alertmanager/alertmanager.yml # Receiver and route configuration
grafana/provisioning/ # Dashboard and datasource provisioning
Default scrape targets:
User: "Are all my services up?"
Assistant actions:
1. `homelab_sshTest`
2. `homelab_serviceHealth` for each known service

Response:
14 of 15 services are healthy. Stirling PDF is returning HTTP 502 -- likely a container crash. Checking logs... (calls `homelab_serviceLogs` for stirling-pdf) Stirling PDF was OOM-killed 12 minutes ago. Recommend restarting it with a higher memory limit.
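
The check-and-summarize flow in this example can also be approximated outside the homelab tools. The following is a minimal Python sketch, not the implementation behind `homelab_serviceHealth`. It assumes the `pi` hostname and ports listed in this document. `/-/healthy` (Prometheus, Alertmanager) and `/api/health` (Grafana) are those projects' standard health endpoints; probing the Uptime Kuma root page is an assumption. The names `probe` and `summarize` are hypothetical.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

# Hostname and ports come from this document. Probing the Uptime Kuma
# root page is an assumption, not a documented health endpoint.
SERVICES = {
    "prometheus": "http://pi:9090/-/healthy",
    "grafana": "http://pi:3000/api/health",
    "uptime-kuma": "http://pi:3001/",
    "alertmanager": "http://pi:9093/-/healthy",
}


def probe(url, timeout=5.0):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # The server answered, but with an error status such as 502.
        return err.code
    except (URLError, OSError):
        # DNS failure, refused connection, or timeout.
        return None


def summarize(statuses):
    """Summarize a {service: status_code_or_None} mapping as text."""
    unhealthy = {
        name: code
        for name, code in statuses.items()
        if code is None or not 200 <= code < 300
    }
    healthy = len(statuses) - len(unhealthy)
    lines = [f"{healthy} of {len(statuses)} services are healthy."]
    for name, code in sorted(unhealthy.items()):
        detail = "unreachable" if code is None else f"HTTP {code}"
        lines.append(f"{name}: {detail}")
    return "\n".join(lines)
```

Calling `summarize({name: probe(url) for name, url in SERVICES.items()})` gives a report in the style of the response above. Any service listed as unhealthy is a candidate for a `homelab_serviceLogs` call.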
User: "Add a Prometheus scrape target for my NAS at 192.168.1.50:9100"
Assistant actions:
1. Edit `prometheus.yml` (add a new job)
2. `homelab_serviceRestart` for Prometheus

| Tool | Purpose |
|---|---|
| `homelab_sshTest` | Validate connectivity before checks |
| `homelab_serviceHealth` | Check health of individual services |
| `homelab_serviceLogs` | Retrieve logs for unhealthy services |
| `homelab_serviceRestart` | Restart a monitoring service after config edit |
| `homelab_composePs` | List all containers in the monitoring stack |
| `homelab_composeUp` | Redeploy monitoring stack after changes |
| `homelab_networkInfo` | Verify network reachability of scrape targets |
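
For a scrape-target addition like the NAS example (192.168.1.50:9100), the stanza added under `scrape_configs` in `prometheus.yml` is a static job. The Python sketch below only renders that stanza as text, so the shape of the edit is visible. The job name `nas` and the helper names `validate_target` and `render_scrape_job` are hypothetical, and the sketch does not modify any file.

```python
def validate_target(target):
    """Check that target looks like 'host:port' and return it unchanged."""
    host, sep, port = target.rpartition(":")
    valid_port = port.isdecimal() and 0 < int(port) < 65536
    if not sep or not host or not valid_port:
        raise ValueError(f"expected host:port, got {target!r}")
    return target


def render_scrape_job(job_name, targets, indent="  "):
    """Render one static scrape job as YAML text for scrape_configs."""
    quoted = ", ".join(f'"{validate_target(t)}"' for t in targets)
    lines = [
        f'- job_name: "{job_name}"',
        "  static_configs:",
        f"    - targets: [{quoted}]",
    ]
    # Every line is indented so the job nests under the scrape_configs key.
    return "\n".join(indent + line for line in lines) + "\n"


# Job name "nas" is illustrative; the target comes from the example request.
print(render_scrape_job("nas", ["192.168.1.50:9100"]), end="")
#   - job_name: "nas"
#     static_configs:
#       - targets: ["192.168.1.50:9100"]
```

Rendering the text separately from writing it keeps the sketch free of side effects. The stanza still has to be placed under the existing `scrape_configs:` key and Prometheus restarted or reloaded before the target is scraped.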
- Prometheus accepts `POST /-/reload` for config changes; the endpoint is only served when Prometheus runs with `--web.enable-lifecycle`. A full container restart is heavier and causes a brief metrics gap. Prefer reload when possible.
- Alertmanager routing: the `service` label is listed in `group_by`.

Related:

- pi-system-management -- hardware metrics that feed into Prometheus
- docker-compose-stacks -- deploying and updating the monitoring stack
- network-configuration -- proxy setup for external access to Grafana
- backup-recovery -- backing up Prometheus data and Grafana dashboards
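
The reload-over-restart preference noted above can be sketched as a single HTTP call. This is a minimal Python example. It assumes Prometheus is reachable at `http://pi:9090` as listed in this document and was started with `--web.enable-lifecycle`; without that flag Prometheus rejects the request. The function names are hypothetical.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def build_reload_request(base_url="http://pi:9090"):
    """Build the POST request for the Prometheus /-/reload endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    return Request(url, data=b"", method="POST")


def reload_prometheus(base_url="http://pi:9090", timeout=10.0):
    """Ask Prometheus to re-read its config; return True on HTTP 200."""
    try:
        with urlopen(build_reload_request(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (HTTPError, URLError, OSError):
        # Lifecycle API disabled, invalid config, or server unreachable.
        return False
```

If `reload_prometheus()` returns False, restarting the container with `homelab_serviceRestart` remains the fallback, at the cost of the brief metrics gap.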