TIL: Restarting systemd services on sustained CPU abuse

I kept finding avahi-daemon pegging the CPU in some of my LXC containers, and I wanted a service policy that behaves like a human would: cap it at 10% CPU, restart it right away if it is pegged at that cap, and restart it if it stays above 5% instead of calming down.

Well, turns out systemd already gives us 90% of this, but the documentation for that is squirrely, and after poking around a bit I found that the remaining 10% is just a tiny watchdog script and a timer.

Setup

First, contain the daemon with CPUQuota:

sudo systemctl edit avahi-daemon
[Service]
CPUAccounting=yes
CPUQuota=10%
Restart=on-failure
RestartSec=10s
KillSignal=SIGTERM
TimeoutStopSec=30s

Then create a generic watchdog script at /usr/local/sbin/cpu-watch.sh:

#!/bin/bash
set -euo pipefail

UNIT="$1"
INTERVAL=30

# Policy thresholds, in nanoseconds of CPU time per interval
PEGGED_NS=$((INTERVAL * 1000000000 * 9 / 100))    # 9% CPU, i.e. 90% of the 10% quota
SUSTAINED_NS=$((INTERVAL * 1000000000 * 5 / 100)) # 5% CPU

STATE="/run/cpu-watch-${UNIT}.state"

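current=$(systemctl show "$UNIT" -p CPUUsageNSec --value)
# Skip this run if the value is not numeric (e.g. "[not set]" for an inactive unit)
[[ "$current" =~ ^[0-9]+$ ]] || exit 0

previous=""
[[ -f "$STATE" ]] && previous=$(cat "$STATE")
echo "$current" > "$STATE"

# First run: no baseline yet, so just record the counter
[[ -n "$previous" ]] || exit 0

# A restart resets the counter, so the next delta is negative and triggers nothing
delta=$((current - previous))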

# Restart if pegged (hitting CPUQuota)
if (( delta >= PEGGED_NS )); then
  logger -t cpu-watch "CPU pegged for $UNIT (${delta}ns), restarting"
  systemctl restart "$UNIT"
  exit 0
fi

# Restart if consistently above 5%
if (( delta >= SUSTAINED_NS )); then
  logger -t cpu-watch "Sustained CPU abuse for $UNIT (${delta}ns), restarting"
  systemctl restart "$UNIT"
fi

It’s not ideal to have hard-coded thresholds or to hit storage frequently, but in most modern systems /run is a tmpfs or similar, so for a simple watchdog this is acceptable.
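
The threshold arithmetic is easier to sanity-check in isolation. The sketch below is only an illustration: threshold_ns and delta_to_percent are hypothetical helper names that are not part of cpu-watch.sh, and the percentages are treated as a share of a single core.

```shell
#!/bin/bash
# Hypothetical helpers (not part of cpu-watch.sh) that illustrate the unit conversion.

# CPU time in nanoseconds that equals PERCENT of one core over SECONDS.
threshold_ns() {
  local seconds="$1" percent="$2"
  echo $(( seconds * 1000000000 * percent / 100 ))
}

# Average CPU percentage (integer, rounded down) for DELTA_NS used over SECONDS.
delta_to_percent() {
  local delta_ns="$1" seconds="$2"
  echo $(( delta_ns * 100 / (seconds * 1000000000) ))
}

threshold_ns 30 5                 # 5% of a core over 30 s -> 1500000000
threshold_ns 30 9                 # 9% of a core over 30 s -> 2700000000
delta_to_percent 2000000000 30    # 2 s of CPU in a 30 s window -> 6
```

With a 10% quota the unit can burn at most about 3 s of CPU time in a 30 s window, so any threshold above 3000000000 ns could never be reached.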

The next step is to make it executable and wire it up with systemd template units:

sudo chmod +x /usr/local/sbin/cpu-watch.sh
# cat /etc/systemd/system/cpu-watch@.service
[Unit]
Description=CPU watchdog for %i
After=%i.service

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/cpu-watch.sh %i
# cat /etc/systemd/system/cpu-watch@.timer
[Unit]
Description=Periodic CPU watchdog for %i

[Timer]
OnBootSec=2min
OnUnitActiveSec=30s
AccuracySec=5s

[Install]
WantedBy=timers.target

The trick I learned today was how to enable it with the target service name:

sudo systemctl daemon-reload
sudo systemctl enable --now cpu-watch@avahi-daemon.timer

You can check it’s working with:

sudo systemctl list-timers | grep cpu-watch
# this should show the script restart messages, if any:
sudo journalctl -t cpu-watch -f
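
If you want to see what the watchdog would measure without waiting for a restart, you can sample the same counter by hand. This is a rough sketch under a few assumptions: read_usage_ns and average_percent are hypothetical names, the unit is running with CPU accounting enabled (otherwise the counter is not numeric and the arithmetic fails), and the result is an integer percentage of one core.

```shell
#!/bin/bash
# Hypothetical manual check, not part of the watchdog itself.

# Read the cumulative CPU time (ns) that systemd has accounted for a unit.
read_usage_ns() {
  systemctl show "$1" -p CPUUsageNSec --value
}

# Sample the counter twice, SECONDS apart, and print the average CPU percentage.
average_percent() {
  local unit="$1" seconds="$2" before after
  before=$(read_usage_ns "$unit")
  sleep "$seconds"
  after=$(read_usage_ns "$unit")
  echo $(( (after - before) * 100 / (seconds * 1000000000) ))
}
```

For example, average_percent avahi-daemon 30 blocks for 30 seconds and then prints the average over that window, which is the same quantity the script compares against its thresholds.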

Why This Works

The magic, according to Internet lore and a bit of LLM spelunking, is in using CPUUsageNSec deltas over a timer interval, which has a few nice properties:

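  • Short CPU spikes are ignored, since usage is averaged over each 30-second timer interval
  • Sustained abuse (an average above 5% of a core over an interval) triggers a restart
  • Running pegged at the quota (at least 90% of the 10% cap) triggers a restart at the next timer tick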
  • Runaway loops are contained by CPUQuota
  • Everything is systemd-native and auditable via journalctl
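
To make the first three points concrete, here is a small sketch of the decision logic with the systemctl calls stripped out. It assumes the 30-second interval and the thresholds from the list above (5% sustained, 90% of the 10% quota for pegged); classify_delta is a hypothetical name used only for this illustration.

```shell
#!/bin/bash
# Illustrative only: same comparison order as the watchdog, without touching systemd.
INTERVAL=30
PEGGED_NS=$(( INTERVAL * 1000000000 * 9 / 100 ))     # 90% of a 10% quota
SUSTAINED_NS=$(( INTERVAL * 1000000000 * 5 / 100 ))  # 5% of one core

# Map a CPU-time delta (ns over one interval) to the action it would trigger.
classify_delta() {
  local delta="$1"
  if (( delta < 0 )); then
    echo "reset"       # counter went backwards, e.g. the unit was restarted
  elif (( delta >= PEGGED_NS )); then
    echo "pegged"
  elif (( delta >= SUSTAINED_NS )); then
    echo "sustained"
  else
    echo "ok"
  fi
}

classify_delta 1000000000     # 10 s at the 10% cap, then idle: ok
classify_delta 1800000000     # steady 6% for the whole interval: sustained
classify_delta 3000000000     # at the cap for the whole interval: pegged
classify_delta -2500000000    # first sample after a restart: reset
```

A burst that sits at the cap for ten seconds only adds up to about 3.3% over the window, which is why it stays under the 5% line.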

It’s not perfect, but at least I got a reusable pattern/template out of this experiment, and I can adapt this to other services as needed.