Debugging 90s hangs during shutdown on Ubuntu 20.04
Table of contents
I spent some time yesterday trying to understand why one of the machines here seemed to always take 90s before shutting down or rebooting. This article is a summary of the wild goose chase I embarked on to try and understand what was happening.
When systemd
machines hang during shutdown, it usually means that there's a service that is not stopping fast enough
or it could be that systemd
is waiting for the DefaultTimeoutStopSec
timeout before
killing the remaining processes after all services have been stopped.
In my case, what I was seeing is the following:
[ OK ] Finished Reboot.
[ OK ] Reached target Reboot.
[ 243.652848 ] systemd-shutdown[1]: Waiting for process: containerd-shim, containerd-shim, containerd-shim, fluent-bit
These processes are part or k3s
, they are the parent processes of containers or processes running
in the containers themselves. Clearly, it has something to do with k3s
.
Update: the systemd bugfix was backported to Ubuntu 20.04.
This service is a little bit special, when restarting it, the containers that it started should
continue to run unperturbed (Otherwise upgrades would be fun...).
To accomplish that, it uses KillMode=process
in its systemd
service definition.
This tells systemd
to only kill the main process that was spawned by ExecStart
and not to bother with
all the others.
This is a case where, during shutdown, systemd
will wait up to DefaultTimeoutStopSec
before killing
all the remaining processes.
This parameter is set to 5s
on my machines to avoid exactly that, but somehow on this particular
machine it didn't seem to be sufficient.
Systemd v245 Bug
I'll spare all the details, but I ended up having to hack up systemd-shutdown
to add some debugging
printfs in order to convince myself that the timeout was indeed the default 90s
and not the 5s
that
it should be.
Roughly:
apt-get source systemd
# add debug
meson build/
cp /lib/systemd/usr/lib/systemd/systemd-shutdown /lib/systemd/usr/lib/systemd/systemd-shutdown.ori
cp build/systemd-shutdown /lib/systemd/usr/lib/systemd/systemd-shutdown
# I guess I could have learned how to configure this properly, but "I was in a hurry"
chrpath -r /lib/systemd/usr/lib/systemd/systemd-shutdown
reboot # and pray
Once I was convinced systemd
was really using the default timeout value instead of the configured
one, I set out to (re-)read its code to figure out where it got confused and maybe fix it.
In the end, it was not too bad to spot:
finish:
/* snip snip snip */
reset_arguments();
/* snip snip snip */
if (shutdown_verb) {
r = become_shutdown(shutdown_verb, retval);
log_error_errno(r, "Failed to execute shutdown binary, %s: %m", getpid_cached() == 1 ? "freezing" : "quitting");
error_message = "Failed to execute shutdown binary";
}
Basically, all the parameters were reset to their default value by reset_arguments
right before calling become_shutdown
.
Just to make sure, I looked at the main
branch and sure enough, the reset_arguments
call isn't
there anymore.
The bug was already fixed: https://github.com/systemd/systemd/commit/7d9eea2bd3d4f83668c7a78754d201b22
That is all well and good, less work for me. But my machine is still slow to shutdown and there's zero
chance that Actually it was
eventually fixed!systemd
will get an update for that particular issue on Ubuntu 20.04...
Fast shutdown workaround
After much fighting with systemd, I ended up with the following workaround:
This instanced service unit will signal every process in the specified service's cgroup
to stop,
wait 5 seconds, then kill the survivors just before the filesystems are unmounted during shutdown.
To enable it for the k3s-agent
service:
sudo systemctl enable cgroup-kill-on-shutdown@k3s-agent.service.service
(replace k3s-agent.service
with whichever service needs cleaning up. Mind the double
.service.service
)
Debugging systemd shutdown process
Figuring how to set that up was an exercise in patience and bricolage...
I rebooted the machine way too many time and resorted to sleeps, prints and redirections to /dev/tty1
to be able to get any debug info out of the service.
Not recommended.
Pitfalls
One of the thing that bit pretty hard was that I was using this construct to kill the processes individually:
while read pid; do
echo killing $pid
kill -9 $pid
done </sys/fs/cgroup/unified/system.slice/k3s-agent.service/cgroup.procs
It kept leaving processes behind...
What happens is that the file being read is changing with every call to kill
and thus bash was skipping/seeking
over parts of it with every iteration.
Very annoying, also, it was pure luck that the pids were all the same length...
Using the weird pids=$(cat blah); echo $pids | ...
construct avoids that issue by reading the whole
file only once before killing everything.
Better tool for debugging the shutdown process
systemd
has a test mode that is quite handy to figure out what will actually get called when reaching
a target
.
I used it to understand why my instanced service was not being called, but a regular service with the same content was called.
/usr/lib/systemd/systemd --test --system --unit=shutdown.target
First, all the unit files are listed and then all the jobs. The last part is of interest here since it tells which unit will actually be stopped.
From this I could see that systemd
was not even trying to stop the cgroup-kill-on-shutdown@k3s-agent.service.service
instanced service.
By comparing the units description with a regular service and the faulty service, I realized that they
were not using the same Slice
.
Setting Slice=system.slice
in the instanced service fixed it.
Conclusion
That was quite a detour... I just wanted to reboot my machine.
At least I learned a few things:
- Always read the code you're actually running (and compare it to the
main
branch after) - Reading files exposed by the kernel can/will change while you read them
systemd
has a lot of moving parts, but its tooling seems good, once you figure out that there's actually an option to dump what you want.
Other people seemed to be having that problem too but without much in terms of solutions: https://github.com/k3s-io/k3s/issues/2400