Only reload prometheus/alertmanager if config changes

Previously, prometheus/alertmanager were unconditionally reloaded every 10
minutes. Reloading alertmanager when it is about to deliver an alert
notification seems to make it count successful notification deliveries as
failed, increasing alertmanager_notifications_failed_total. So reloading every
10 minutes caused spurious AlertmanagerNotifications alerts in our setup.

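Config updates are now also atomic regardless of whether /tmp is a tmpfs,
because the new file is written next to prometheus.yml and mv then renames
it within the same filesystem.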
Julian Rother 2024-08-19 01:28:29 +02:00
parent 6cb9d48140
commit 5ebe07a48e
GPG key ID: 8F9B6AE9BAAE4899

@@ -1,8 +1,6 @@
 #!/bin/bash
 set -euo pipefail
-tmpfile=`mktemp`
 (
 cat /etc/prometheus/conf.d/*.conf
@@ -13,9 +11,14 @@ echo " alertmanagers:"
 echo "scrape_configs:"
 cat /etc/prometheus/conf.d/scrape_configs/*.conf
-) > $tmpfile
+) > /etc/prometheus/prometheus.yml.new
-chmod 0644 $tmpfile
-mv $tmpfile /etc/prometheus/prometheus.yml
-/usr/bin/systemctl reload prometheus
-/usr/bin/systemctl reload prometheus-alertmanager || true
+chmod 0644 /etc/prometheus/prometheus.yml.new
+if ! diff -q /etc/prometheus/prometheus.yml.new /etc/prometheus/prometheus.yml > /dev/null; then
+	mv /etc/prometheus/prometheus.yml.new /etc/prometheus/prometheus.yml
+	/usr/bin/systemctl reload prometheus
+	/usr/bin/systemctl reload prometheus-alertmanager || true
+else
+	rm -f /etc/prometheus/prometheus.yml.new
+fi
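
The pattern in this change (write the candidate file next to its target, compare, and only then rename and reload) could be factored into a small helper. The following is a minimal sketch, not part of the commit: the function name install_if_changed and its argument order are hypothetical, and it uses cmp -s where the script above uses diff -q. It assumes the candidate file is in the same directory as the target, so that mv is a rename within one filesystem rather than a copy across filesystems.

```shell
#!/bin/bash
set -euo pipefail

# install_if_changed NEW TARGET [RELOAD_CMD...]
# Hypothetical helper (not part of the commit): replaces TARGET with NEW only
# when their contents differ, then runs RELOAD_CMD if one was given.
# NEW is assumed to be in the same directory as TARGET, so mv is a rename
# within one filesystem instead of a copy followed by an unlink.
install_if_changed() {
    local new=$1 target=$2
    shift 2
    if [ -e "$target" ] && cmp -s "$new" "$target"; then
        # Identical content: discard the candidate and skip the reload.
        rm -f "$new"
        return 0
    fi
    chmod 0644 "$new"
    mv -f "$new" "$target"
    # Content changed (or TARGET did not exist yet): run the reload command.
    if [ "$#" -gt 0 ]; then
        "$@"
    fi
}
```

With such a helper, the tail of the script might read: install_if_changed /etc/prometheus/prometheus.yml.new /etc/prometheus/prometheus.yml /usr/bin/systemctl reload prometheus. The alertmanager reload, whose failure the script tolerates with || true, would still need separate handling.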