Kubernetes v1.32 [beta] (enabled by default: true)
On Linux nodes, Kubernetes 1.33 supports integrating with systemd to allow the operating system supervisor to recover a failed kubelet. This integration is not active by default; it must be configured in the kubelet's systemd unit, as described below. It can be used as an alternative to periodically requesting the kubelet's /healthz endpoint for health checks. If the kubelet does not respond to the watchdog within the timeout period, the watchdog will kill the kubelet.
The systemd watchdog works by requiring the service to periodically send a keep-alive signal to the systemd process. If the signal is not received within a specified timeout period, the service is considered unresponsive and is terminated. The service can then be restarted according to the configuration.
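The keep-alive mechanism described above can be sketched in a few lines. This is a minimal illustration of the systemd notification protocol, not the kubelet's own implementation: systemd exposes a datagram socket through the NOTIFY_SOCKET environment variable, and a service sends the text WATCHDOG=1 to it. The function name `sd_notify` and its `socket_path` parameter are chosen here for illustration.

```python
import os
import socket
from typing import Optional


def sd_notify(message: str, socket_path: Optional[str] = None) -> bool:
    """Send one notification datagram to the service manager.

    Returns False when no notification socket is known, which is the usual
    case for a process that was not started by systemd.
    """
    # socket_path is an illustrative override; systemd itself only sets
    # the NOTIFY_SOCKET environment variable.
    path = socket_path or os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    address = path.encode("utf-8")
    # A leading "@" marks a Linux abstract-namespace socket, which is
    # addressed with a leading NUL byte instead.
    if address.startswith(b"@"):
        address = b"\0" + address[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), address)
    return True
```

Calling `sd_notify("WATCHDOG=1")` from inside a unit that sets WatchdogSec resets that unit's watchdog timer. Outside systemd, the function simply reports that no socket was available.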
Using the systemd watchdog requires configuring the WatchdogSec parameter in the [Service] section of the kubelet service unit file:

    [Service]
    WatchdogSec=30s

Setting WatchdogSec=30s indicates a service watchdog timeout of 30 seconds. Within the kubelet, the sd_notify() function is invoked at intervals of \( WatchdogSec \div 2 \) to send WATCHDOG=1 (a keep-alive message). If the watchdog is not fed within the timeout period, the kubelet will be killed. Setting Restart to "always", "on-failure", "on-watchdog", or "on-abnormal" will ensure that the service is automatically restarted.
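The timing rule above can be illustrated with a short sketch. When WatchdogSec is set, systemd passes the timeout to the service in microseconds through the WATCHDOG_USEC environment variable; halving it gives the feed interval. The function names `feed_interval_seconds` and `feed_watchdog`, and the `notify` callback, are hypothetical and do not reflect the kubelet's actual code.

```python
import threading
from typing import Callable


def feed_interval_seconds(watchdog_usec: str) -> float:
    """Convert a WATCHDOG_USEC value into a feed interval of half the timeout."""
    usec = int(watchdog_usec)
    if usec <= 0:
        raise ValueError("watchdog timeout must be positive")
    return usec / 1_000_000 / 2


def feed_watchdog(
    notify: Callable[[str], object],
    interval: float,
    stop: threading.Event,
) -> int:
    """Send WATCHDOG=1 every `interval` seconds until `stop` is set.

    Returns the number of keep-alive messages sent.
    """
    sent = 0
    while not stop.is_set():
        notify("WATCHDOG=1")
        sent += 1
        # wait() doubles as an interruptible sleep, so the loop exits
        # promptly once stop is set.
        stop.wait(interval)
    return sent
```

With WatchdogSec=30s, systemd sets WATCHDOG_USEC to 30000000, so `feed_interval_seconds` yields 15.0 seconds. Feeding at half the timeout leaves room for one delayed message before the service is considered unresponsive.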
Some details about the systemd configuration:
- If you set WatchdogSec to 0, or omit setting it, the systemd watchdog is not enabled for this unit.
- You can set WatchdogSec in a systemd unit definition to a period shorter than 1 second, but Kubernetes does not support any shorter interval. The timeout does not have to be a whole integer number of seconds.
- The Kubernetes project suggests setting WatchdogSec to approximately a 15s period. Periods longer than 10 minutes are supported but explicitly not recommended.

Example configuration:

    [Unit]
    Description=kubelet: The Kubernetes Node Agent
    Documentation=https://kubernetes.io/docs/home/
    Wants=network-online.target
    After=network-online.target

    [Service]
    ExecStart=/usr/bin/kubelet
    # Configures the watchdog timeout
    WatchdogSec=30s
    Restart=on-failure
    StartLimitInterval=0
    RestartSec=10

    [Install]
    WantedBy=multi-user.target

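The limits on WatchdogSec listed above can be summarized as a small check. This is a hypothetical helper for reasoning about a chosen value, assuming the period has already been converted to seconds; it does not parse systemd time-span syntax and is not part of Kubernetes or systemd.

```python
def classify_watchdog_period(seconds: float) -> str:
    """Classify a WatchdogSec value (in seconds) against the documented limits."""
    if seconds < 0:
        raise ValueError("watchdog period cannot be negative")
    if seconds == 0:
        # Zero, like an omitted setting, leaves the watchdog disabled.
        return "disabled"
    if seconds < 1.0:
        # systemd accepts such values, but Kubernetes does not support them.
        return "unsupported"
    if seconds > 600:
        # Longer than 10 minutes: supported, but explicitly not recommended.
        return "not recommended"
    return "supported"
```

For example, the 30s value from the sample unit file and the suggested 15s period both fall in the "supported" range, and fractional values such as 1.5 are accepted as well.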
For more details about systemd configuration, refer to the systemd documentation.