Wednesday 22 July 2009

Recovering Ubuntu

A few days ago I did something bad.  I did a distro upgrade to bleeding edge Ubuntu (9.10) before release, all for the want of the latest version of libvirt on my antiquated laptop.

I know pre-release software might be rough around the edges, and I know things might not work, but I didn't really care; after all, these things are usually fixed over time.  However, what I didn't bargain for was that the upgrade process was so totally borked that it sent dpkg into an infinite loop, unpacking the same packages over and over again until I stopped it.

Big mistake, but what else could I do?



Now my laptop has a nifty new security feature: it freezes on boot with a responsive kernel.  The magic alt-sysrq keys still work, allowing a safe reboot (alt-sysrq-s: sync drives, alt-sysrq-u: remount drives read-only, alt-sysrq-b: reboot NOW).  I believe the cause was that dpkg never got to do the "configuring packages" step, so everything got updated, but not configured.

Assessing the Problem

Ubuntu helpfully provides recovery kernels (same kernels, different boot options to be more verbose), none of which worked in my case, but they allow one to watch what is going on during boot.  In my case, the last message I saw was "Running /scripts/init-bottom... Done." and then it froze.  Google turned up a few hits of people with the same problem, but no solution, and no better description of the problem.

However, I know that the init scripts ARE running, so that means the kernel is successfully booting and passing control to the initrd image, which may or may not be finishing, and thus may or may not be passing control to the real init scripts on disk.  Those real init scripts are definitely not running, you know, the part that starts with: "* Reading files needed to boot.. [ OK ]".  So the problem is in the middle somewhere.  This further supports my working theory: the kernel got updated, but initramfs doesn't match the kernel, so it's becoming very unhappy.

Getting a Command Prompt


Given that the kernel IS booting, it's easy to bypass the initscripts and get a command prompt.  Edit the kernel command line in grub at boot time and append "init=/bin/bash" to it.  Presto, instead of running init, it just drops you to a prompt.
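For illustration, an edited kernel line might look roughly like the one below.  The kernel version and root UUID are placeholders and the other options will vary from machine to machine; the only part that matters is the init=/bin/bash on the end.  This sketch assumes GRUB Legacy, where you press e on the menu entry, e again on the kernel line, Enter when done, and b to boot.  Under GRUB 2 the line starts with linux instead of kernel, and Ctrl-X boots the edited entry.

```
kernel /boot/vmlinuz-<version> root=UUID=<uuid> ro quiet splash init=/bin/bash
```

The change only lasts for that one boot; nothing is written back to the grub configuration.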

If my initramfs theory is true, all I need to do is finish dpkg's configuration phase to fix things.  To do that, first remount the root filesystem read/write: mount / -o remount,rw  and then run dpkg's configure step: dpkg --configure -a
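Put together, the two steps above could be sketched as a small script.  The run helper and the DRY_RUN switch are inventions of this sketch, not part of dpkg or mount: by default the commands are only printed, and setting DRY_RUN=0 actually runs them.  At an init=/bin/bash prompt you are already root, so no sudo is involved.

```shell
#!/bin/sh
# Sketch of the recovery steps, meant for the init=/bin/bash prompt.
# DRY_RUN is a hypothetical safety switch: 1 (default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

finish_dpkg() {
    # The root filesystem is mounted read-only at this point; make it writable.
    run mount -o remount,rw /
    # Configure every package that was unpacked but never configured.
    run dpkg --configure -a
}

finish_dpkg
```

Typing the two commands by hand works just as well; the script form is only there to show the order.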

Indeed, that was definitely a problem: it spent quite a while configuring packages and regenerating things.
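One rough way to sanity-check the result is to look for kernels in /boot that have no initrd image next to them.  This is only a sketch: it assumes Ubuntu's usual vmlinuz-<version> and initrd.img-<version> naming, the function name is made up, and it only checks that the file exists, not that its contents are up to date.

```shell
#!/bin/sh
# Print the version of every kernel in a boot directory that lacks an initrd.
# Assumes the naming scheme vmlinuz-<version> / initrd.img-<version>.
missing_initrds() {
    boot_dir="${1:-/boot}"
    for kernel in "$boot_dir"/vmlinuz-*; do
        [ -e "$kernel" ] || continue              # the glob matched nothing
        version="${kernel##*/vmlinuz-}"           # strip the path and prefix
        [ -e "$boot_dir/initrd.img-$version" ] || echo "$version"
    done
}

# No output means every kernel found has a matching initrd image.
missing_initrds /boot
```

If a version does get listed, update-initramfs -c -k <version> should be able to create the missing image.  Running dpkg --audit is a useful companion check: it reports packages that are still only partly installed, and prints nothing when dpkg is satisfied.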

Additional Recovery, Just In Case

Just in case that wasn't the problem, it could be an init bug that has been fixed in a later release of some package, so while we're at it, let's do an upgrade too.  Bring up networking:  dhclient &   (the & is needed to throw it in the background; for some reason my CTRL keys weren't responding, so neither CTRL-Z to stop the current program nor CTRL-C to exit it worked).  Then do an update to pull down the package lists:  apt-get update.  Then do the install: apt-get upgrade.  This completed successfully.
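The same sequence as a sketch, again with a made-up run helper and DRY_RUN switch so that nothing happens unless DRY_RUN=0 is set.  It assumes a wired connection that dhclient can configure on its own.  In a script the & can be dropped, since dhclient normally puts itself in the background once it has a lease, and waiting for it means the network is up before apt-get starts.

```shell
#!/bin/sh
# Sketch: bring up networking, refresh package lists, upgrade packages.
# DRY_RUN is a hypothetical safety switch: 1 (default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

upgrade_packages() {
    # Each step only runs if the previous one succeeded.
    run dhclient &&
    run apt-get update &&
    run apt-get upgrade
}

upgrade_packages
```

apt-get upgrade still asks for confirmation before installing anything, which seems like the right behaviour on a system in this state.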

Reboot using the sysrq method.  Since init didn't run, nothing has hooked ctrl-alt-del, and the reboot command can't talk to init because it's not running, so it does nothing.
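If the keyboard combination is awkward, the same three sysrq actions can also be requested from the prompt by writing to /proc/sysrq-trigger.  This is an alternative to the key presses, not what I actually did, and it assumes /proc is mounted.  The function names and the SYSRQ_TRIGGER override are made up for this sketch; the override exists only so the functions can be tried against a scratch file.

```shell
#!/bin/sh
# Sketch: the alt-sysrq s, u, b sequence issued from a shell instead of the keyboard.
# SYSRQ_TRIGGER is a hypothetical override for experimenting safely.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

sysrq() {
    # Write one sysrq command letter to the trigger file.
    echo "$1" > "$SYSRQ_TRIGGER"
}

safe_reboot() {
    sysrq s     # sync drives
    sleep 1
    sysrq u     # remount drives read-only
    sleep 1
    sysrq b     # reboot NOW, without any further cleanup
}

# Nothing is run here on purpose; call safe_reboot when you mean it.
```

Running safe_reboot with the default trigger path reboots the machine immediately, so it is deliberately left uncalled in the sketch.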

That indeed fixed it; it boots right up to the graphical login, no problem.  Ta-Da!
