1 Commit

Author SHA1 Message Date
Greg Hendrickson
d310d6ebbe feat: add production-ready Prometheus & Alertmanager configs
- prometheus.yml: Service discovery, alerting, multi-job scraping
- alertmanager.yml: Routing tree, inhibition rules, multi-channel notifications
- node-exporter.yml: 30+ alert rules (CPU, memory, disk, network, system)
- File-based service discovery for dynamic host management
- Updated README with usage docs and alert catalog

Alert categories: availability, resource saturation, disk predictive,
I/O latency, network errors, clock sync, OOM detection, conntrack
2026-02-04 18:02:47 +00:00