Ubuntu server continue boot after failure

posted on 2013-08-28

The grub package used in our Ubuntu Server ships with a default setting that keeps the server from booting after there has been a failure. Instead of booting, it gets stuck at the GRUB boot menu, waiting for user input. Of course this is not appropriate behavior for our server. We would rather have it try and try again.

People have been asking about this for different versions and we now hit the problem with Ubuntu Server 13.04.

The solution for newer versions of Ubuntu Server is simple. Open /etc/default/grub and add a GRUB_RECORDFAIL_TIMEOUT variable, for example:

GRUB_DEFAULT=0
GRUB_HIDDEN_TIMEOUT=0
GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=2
GRUB_RECORDFAIL_TIMEOUT=${GRUB_TIMEOUT}

Note that the new line comes after GRUB_TIMEOUT has been set, so it sets the failure timeout to the same value as the normal timeout.
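The ordering matters because /etc/default/grub is read as a shell script when the configuration is generated, so ${GRUB_TIMEOUT} is expanded at the point where the line appears. Here is a minimal sketch of that behavior, using a throwaway file rather than the real one. The helper name expand_recordfail is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: write the given settings to a temporary file, source it
# the way a shell script would, and print the resulting GRUB_RECORDFAIL_TIMEOUT.
expand_recordfail() {
    tmp=$(mktemp)
    printf '%s\n' "$1" > "$tmp"
    # Run in a subshell with both variables unset, so nothing leaks in or out.
    ( unset GRUB_TIMEOUT GRUB_RECORDFAIL_TIMEOUT
      . "$tmp"
      printf '%s\n' "$GRUB_RECORDFAIL_TIMEOUT" )
    rm -f "$tmp"
}

# Same order as in the example above: GRUB_TIMEOUT first.
right_order=$(expand_recordfail 'GRUB_TIMEOUT=2
GRUB_RECORDFAIL_TIMEOUT=${GRUB_TIMEOUT}')

# Reversed order: GRUB_TIMEOUT is still unset when the line is expanded.
wrong_order=$(expand_recordfail 'GRUB_RECORDFAIL_TIMEOUT=${GRUB_TIMEOUT}
GRUB_TIMEOUT=2')

echo "set after GRUB_TIMEOUT:  '$right_order'"
echo "set before GRUB_TIMEOUT: '$wrong_order'"
```

With the reversed order the variable ends up empty, which is why the new line belongs below GRUB_TIMEOUT.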

After editing the file, regenerate the grub configuration file using:

sudo update-grub
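To check that the new timeout made it into the generated configuration, you can look for the recordfail branch in it. This is only a sketch: the path /boot/grub/grub.cfg and the exact text of the generated "if" line are assumptions and may differ between releases, and the function name show_recordfail_timeout is made up here.

```shell
#!/bin/sh
# Hypothetical helper: print the recordfail branch of a generated GRUB config.
# $1: path to the config; defaults to /boot/grub/grub.cfg (assumed location).
show_recordfail_timeout() {
    cfg=${1:-/boot/grub/grub.cfg}
    # -n prints line numbers, -A 1 adds the line that follows each match,
    # which is where the timeout for the failure case is expected to be set.
    grep -n -A 1 '{recordfail}" = 1' "$cfg"
}

# Only look at the real file when it is present and readable.
if [ -r /boot/grub/grub.cfg ]; then
    show_recordfail_timeout || echo "no recordfail branch found"
fi
```

If everything went well, the line after the match should set the timeout to the value you configured.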

And you are done. If you have multiple servers managed with Salt, the following Salt state will help you out:

/etc/default/grub:
  file:
    - managed
    - mode: 644
    - source: salt://where_you_have_the_file/grub
    - user: root
    - group: root

run-update-grub:
  cmd.wait:
    - name: update-grub
    - cwd: /
    - watch:
      - file: /etc/default/grub

Happy hacking!