NixOS Hetzner Boot Repair
I accidentally misconfigured one of my NixOS servers – a Hetzner vps – rather badly: too much of the disko configuration (which determines filesystem layouts) was shared between servers. When that configuration was applied to a server that didn’t have an external volume, the server refused to finish booting because it was unable to mount a disk that didn’t exist.
This is always a risk when configuring filesystems on computers. NixOS, unlike many other operating systems, has an easy way to let the user recover from that situation: the grub configuration of a NixOS system contains menu entries for multiple older generations. The idea is that when someone messes up like I did, they can reboot the machine, find an older version of the os configuration in the grub menu and boot that to recover.1 Other operating systems do this for e.g. kernel upgrades, but NixOS does it for the entire os configuration.
I’m sure this works great most of the time, but unfortunately the Hetzner web console did not pass my keypresses through to grub, so the server was stuck selecting the default grub entry, which was the most recent, broken generation.
Hetzner does support pxe booting into a rescue Debian system on their vps’s. This system can mount filesystems from the server. I struggled to figure out how to use this to repair the broken fstab that prevented boot, when I realised I was trying to solve the wrong problem. It might be easier to mount /boot in the rescue system, and shuffle around the menu entries such that grub defaults into booting an older generation of the NixOS configuration!
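I did the shuffling by hand in an editor, but the idea can be sketched as a small script. This is a minimal illustration, not what I actually ran: it assumes grub boots the first entry in the file (the default index is 0), that braces in grub.cfg are balanced, and that the older generation can be identified by a piece of its menu title. The function names promote_entry and block_end are made up for this example.

```python
def block_end(lines, start):
    """Return the index just past the line that closes the block opened at `start`."""
    depth = 0
    for i in range(start, len(lines)):
        # Track nesting by counting braces; assumes they are balanced.
        depth += lines[i].count("{") - lines[i].count("}")
        if depth <= 0:
            return i + 1
    raise ValueError("unbalanced braces in grub.cfg")


def promote_entry(cfg_text, title_substring):
    """Copy the first menuentry whose line contains `title_substring`
    to the top of the menu, so that it becomes entry 0."""
    lines = cfg_text.splitlines()

    def is_entry(line):
        return line.lstrip().startswith("menuentry ")

    def is_top(line):
        return line.lstrip().startswith(("menuentry ", "submenu "))

    first = next((i for i, l in enumerate(lines) if is_top(l)), None)
    target = next(
        (i for i, l in enumerate(lines) if is_entry(l) and title_substring in l),
        None,
    )
    if first is None or target is None:
        raise ValueError("no matching menu entry found")
    if target == first:
        return cfg_text  # already the first entry, nothing to do

    # Copy rather than move, so the original menu structure stays intact.
    block = lines[target:block_end(lines, target)]
    return "\n".join(lines[:first] + block + lines[first:]) + "\n"
```

In the rescue system this would be run against the grub.cfg found under the mounted /boot, after keeping a backup copy of the original file so the change is easy to revert later.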
That worked. Once in the server on an older generation, it is possible to revert the hacky changes to grub.cfg and then remotely deploy a new, working configuration with nixos-rebuild.