How to deal with an instance that no longer boots after a kernel update on Amazon Linux 2014.03
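Update: this turned out to be a problem with the yum.conf configuration, not with Amazon Linux 2014.03 itself.
I wrote up the details in
"A kernel update causes problems on Amazon Linux if installonlypkgs is not set correctly in yum.conf" - tkuchikiの日記.
The fix described here is still valid, so I am leaving this post up.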
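As a quick way to look at the setting mentioned in the update, here is a minimal sketch. It assumes the relevant condition is simply whether `kernel` appears in the `installonlypkgs` line of yum.conf; the linked post is the authority on what the correct value should be. The function name `check_installonlypkgs` is made up for this example, and continuation lines and glob patterns are not handled.

```shell
# Report whether a yum.conf lists "kernel" in installonlypkgs.
# Usage: check_installonlypkgs [path-to-yum.conf]
# (hypothetical helper; simplification: one-line values only, no globs)
check_installonlypkgs() {
    local conf="${1:-/etc/yum.conf}"
    local line value pkgs
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 2; }
    # Take the last uncommented installonlypkgs= line, if any
    line=$(grep -E '^[[:space:]]*installonlypkgs[[:space:]]*=' "$conf" | tail -n 1)
    if [ -z "$line" ]; then
        echo "not set (yum default applies)"
        return 0
    fi
    value=${line#*=}
    # Normalise commas and tabs to spaces, then look for the exact word "kernel"
    pkgs=" $(printf '%s' "$value" | tr ',\t' '  ') "
    case "$pkgs" in
        *" kernel "*) echo "ok"; return 0 ;;
    esac
    echo "kernel missing from installonlypkgs"
    return 1
}
```

Running `check_installonlypkgs` with no argument inspects /etc/yum.conf. A non-zero exit status means it is worth reviewing the setting before updating the kernel.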
The kernel version before the update was as follows.
[ec2-user@amzn-201403 ~]$ uname -a
Linux amzn-201403 3.10.53-56.140.amzn1.x86_64 #1 SMP Thu Aug 14 22:00:02 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
From this state, with releasever=2014.03 pinned in yum.conf, updating the kernel with yum gives the following.
[ec2-user@amzn-201403 ~]$ sudo yum upgrade kernel-3.10.57-57.140.amzn1
Loaded plugins: priorities, update-motd, upgrade-helper
66 packages excluded due to repository priority protections
Resolving Dependencies
--> Running transaction check
---> Package kernel.x86_64 0:3.10.42-52.145.amzn1 will be updated
---> Package kernel.x86_64 0:3.10.53-56.140.amzn1 will be updated
---> Package kernel.x86_64 0:3.10.57-57.140.amzn1 will be an update
--> Finished Dependency Resolution
Dependencies Resolved
=====================================================================================================================================================================
 Package          Arch          Version                        Repository          Size
=====================================================================================================================================================================
Updating:
 kernel           x86_64        3.10.57-57.140.amzn1           amzn-updates        12 M
Transaction Summary
=====================================================================================================================================================================
Upgrade  1 Package
Total download size: 12 M
Is this ok [y/d/N]: y
Downloading packages:
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
kernel-3.10.57-57.140.amzn1.x86_64.rpm | 12 MB 00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Updating   : kernel-3.10.57-57.140.amzn1.x86_64   1/3
  Cleanup    : kernel.x86_64   2/3
warning: file /lib/modules/3.10.53-56.140.amzn1.x86_64/modules.order: remove failed: No such file or directory
warning: file /lib/modules/3.10.53-56.140.amzn1.x86_64/modules.networking: remove failed: No such file or directory
warning: file /lib/modules/3.10.53-56.140.amzn1.x86_64/modules.modesetting: remove failed: No such file or directory
warning: file /lib/modules/3.10.53-56.140.amzn1.x86_64/modules.drm: remove failed: No such file or directory
warning: file /lib/modules/3.10.53-56.140.amzn1.x86_64/modules.builtin: remove failed: No such file or directory
warning: file /lib/modules/3.10.53-56.140.amzn1.x86_64/modules.block: remove failed: No such file or directory
grubby fatal error: unable to find a suitable template
grubby: doing this would leave no kernel entries. Not writing out new config.
  Cleanup    : kernel.x86_64   3/3
warning: file /lib/modules/3.10.42-52.145.amzn1.x86_64/modules.order: remove failed: No such file or directory
warning: file /lib/modules/3.10.42-52.145.amzn1.x86_64/modules.networking: remove failed: No such file or directory
warning: file /lib/modules/3.10.42-52.145.amzn1.x86_64/modules.modesetting: remove failed: No such file or directory
warning: file /lib/modules/3.10.42-52.145.amzn1.x86_64/modules.drm: remove failed: No such file or directory
warning: file /lib/modules/3.10.42-52.145.amzn1.x86_64/modules.builtin: remove failed: No such file or directory
warning: file /lib/modules/3.10.42-52.145.amzn1.x86_64/modules.block: remove failed: No such file or directory
grubby fatal error: unable to find a suitable template
  Verifying  : kernel-3.10.57-57.140.amzn1.x86_64   1/3
  Verifying  : kernel-3.10.42-52.145.amzn1.x86_64   2/3
  Verifying  : kernel-3.10.53-56.140.amzn1.x86_64   3/3
Updated:
  kernel.x86_64 0:3.10.57-57.140.amzn1
Complete!
Look closely and you will see some ominous messages like these:
warning: file xxx: remove failed: No such file or directory
grubby fatal error: unable to find a suitable template
grubby: doing this would leave no kernel entries. Not writing out new config.
If you ignore them and reboot,
the instance will no longer boot.
The line
grubby: doing this would leave no kernel entries. Not writing out new config.
looked suspicious, so
I took a look at /etc/grub.conf:
[ec2-user@amzn-201403 ~]$ cat /etc/grub.conf
# created by imagebuilder
default=0
timeout=3
hiddenmenu
title Amazon Linux 2014.03 (3.10.42-52.145.amzn1.x86_64)
root (hd0,0)
kernel /boot/vmlinuz-3.10.42-52.145.amzn1.x86_64 root=LABEL=/ console=ttyS0
initrd /boot/initramfs-3.10.42-52.145.amzn1.x86_64.img
The files specified for kernel and initrd belong to an old kernel...
So I checked /boot:
[ec2-user@amzn-201403 ~]$ ls -l /boot/
total 26784
-rw------- 1 root root  2203204 Oct 11 06:43 System.map-3.10.57-57.140.amzn1.x86_64
-rw-r--r-- 1 root root    80264 Oct 11 06:43 config-3.10.57-57.140.amzn1.x86_64
drwxr-xr-x 3 root root     4096 Jun 12 01:51 efi
drwxr-xr-x 2 root root     4096 Nov 26 15:23 grub
-rw------- 1 root root 21333115 Nov 26 15:16 initramfs-3.10.57-57.140.amzn1.x86_64.img
-rw-r--r-- 1 root root   128836 Oct 11 06:43 symvers-3.10.57-57.140.amzn1.x86_64.gz
-rwxr-xr-x 1 root root  3664144 Oct 11 06:43 vmlinuz-3.10.57-57.140.amzn1.x86_64
Sure enough, the files that grub.conf points to are gone...
So I rewrote the entries as follows.
# /etc/grub.conf
- kernel /boot/vmlinuz-3.10.42-52.145.amzn1.x86_64 root=LABEL=/ console=ttyS0
+ kernel /boot/vmlinuz-3.10.57-57.140.amzn1.x86_64 root=LABEL=/ console=ttyS0
- initrd /boot/initramfs-3.10.42-52.145.amzn1.x86_64.img
+ initrd /boot/initramfs-3.10.57-57.140.amzn1.x86_64.img
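If you would rather not edit the file by hand, the same change can be sketched as a small shell function. This is only an illustration: `rewrite_grub_kernel` is a name I made up, and it assumes version strings contain nothing that is special in a sed replacement (no `/` or `&`). It writes the result back through the existing path, because /etc/grub.conf may be a symlink on some systems.

```shell
# Point the kernel/initrd lines of a grub.conf at a different kernel version.
# Usage: rewrite_grub_kernel <grub.conf> <old-version> <new-version>
# (hypothetical helper; the title line is left unchanged, as in the manual edit)
rewrite_grub_kernel() {
    local conf="$1" old="$2" new="$3"
    local old_re tmp rc
    # Escape dots so the old version is matched literally by sed
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    # Keep a backup next to the original
    cp -p "$conf" "$conf.bak" || return 1
    tmp=$(mktemp) || return 1
    # Only touch lines whose first word is "kernel" or "initrd";
    # writing with cat keeps a symlinked grub.conf a symlink
    sed -E "/^[[:space:]]*(kernel|initrd)[[:space:]]/ s/$old_re/$new/" "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

For the case in this post, it would be run as root as `rewrite_grub_kernel /etc/grub.conf 3.10.42-52.145.amzn1 3.10.57-57.140.amzn1`, and the previous contents would remain in /etc/grub.conf.bak.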
With this change, the instance boots normally after a reboot.
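Before rebooting, it may be worth confirming that every file grub.conf refers to really exists. The following is a rough sketch of such a check. `check_grub_paths` is a hypothetical name, and it assumes, as on this instance, that /boot lives on the root filesystem so the paths in grub.conf can be tested as they are.

```shell
# Print every kernel/initrd path in a grub.conf that does not exist on disk.
# Usage: check_grub_paths [grub.conf] [root-prefix]
# (hypothetical helper; root-prefix is prepended to each path before testing)
check_grub_paths() {
    local conf="${1:-/etc/grub.conf}" root="${2:-}"
    local path missing=0
    # Field 1 is the directive, field 2 is the file path
    for path in $(awk '$1 == "kernel" || $1 == "initrd" { print $2 }' "$conf"); do
        if [ ! -e "$root$path" ]; then
            echo "missing: $path"
            missing=1
        fi
    done
    # Exit status 1 means at least one referenced file is absent
    return "$missing"
}
```

With the grub.conf shown earlier, this would have reported both the vmlinuz and the initramfs of 3.10.42-52.145.amzn1 as missing. After the edit it should print nothing and return 0.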
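Even if the update has already gone wrong and the instance no longer boots,
you can recover by attaching its EBS volume to another instance, mounting it there, editing /etc/grub.conf on the volume, then reattaching the volume to the original instance and starting it (steps omitted).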
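The editing part of that recovery can be sketched as follows. Treat it as an outline under assumptions rather than a tested procedure: the function name `fix_rescued_grub`, the device name /dev/xvdf1, and the mount point /mnt/rescue are all made up for illustration. Attaching and detaching the volume is done on the AWS side and is not shown.

```shell
# Repoint grub.conf on a broken root volume that is mounted at <mountpoint>.
# Usage: fix_rescued_grub <mountpoint> <old-version> <new-version>
# Versions include the arch suffix, e.g. 3.10.57-57.140.amzn1.x86_64
fix_rescued_grub() {
    local mnt="$1" old="$2" new="$3"
    local conf="$mnt/etc/grub.conf" f target old_re tmp rc

    # Stop unless the new kernel and initramfs really exist on the volume
    for f in "$mnt/boot/vmlinuz-$new" "$mnt/boot/initramfs-$new.img"; do
        [ -e "$f" ] || { echo "not found: $f" >&2; return 1; }
    done

    # An absolute symlink would resolve to the helper instance's own file,
    # so remap it to the same path inside the mounted volume
    if [ -L "$conf" ]; then
        target=$(readlink "$conf")
        case "$target" in
            /*) conf="$mnt$target" ;;
        esac
    fi

    # Escape dots so the old version is matched literally by sed
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    cp -p "$conf" "$conf.bak" || return 1
    tmp=$(mktemp) || return 1
    sed -E "/^[[:space:]]*(kernel|initrd)[[:space:]]/ s/$old_re/$new/" "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Hypothetical session on the helper instance, run as root
# (device name and mount point are assumptions):
#   mount /dev/xvdf1 /mnt/rescue
#   fix_rescued_grub /mnt/rescue 3.10.42-52.145.amzn1.x86_64 3.10.57-57.140.amzn1.x86_64
#   umount /mnt/rescue
```

The existence check at the top is there so that the rescue does not swap one dangling entry for another.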
Note that the same problem also occurs when updating from Amazon Linux 2014.03 to 2014.09, so be careful.
References
There was something that looked related on bugzilla, but I have not read through all of it.
For reference:
Bug 730357 – grubby fatal error: unable to find a suitable template