tkuchikiの日記

New blog: https://blog.tkuchiki.net

How to fix an instance that no longer boots after a kernel update on Amazon Linux 2014.03

Addendum: this turned out to be a problem with the yum.conf settings, not with Amazon Linux 2014.03 itself.

I wrote up the details in a follow-up post, "Kernel updates break on Amazon Linux unless installonlypkgs is set correctly in yum.conf" (tkuchikiの日記).

The fix described here is still valid, so I am leaving this post up.
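As a rough illustration of the addendum, here is a minimal sketch that checks whether a yum.conf lists kernel under installonlypkgs. It assumes that "set correctly" means the kernel package is named in that option; the follow-up post is the authority on the details. The function name kernel_is_installonly is made up for this example, and it only matches the exact name kernel.

```python
import configparser
import re
from typing import Optional


def kernel_is_installonly(yum_conf_text: str) -> Optional[bool]:
    """Report whether installonlypkgs in [main] lists 'kernel'.

    Returns None when the option is absent, in which case yum falls back
    to its built-in default list.
    """
    # yum.conf is INI-style; interpolation is disabled so '%' is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(yum_conf_text)
    if not parser.has_option("main", "installonlypkgs"):
        return None
    value = parser.get("main", "installonlypkgs")
    # Assumption: entries are separated by spaces and/or commas.
    names = [name for name in re.split(r"[,\s]+", value) if name]
    return "kernel" in names
```

A result of False would be the case worth looking into: the option is set, but kernel is not in it.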

The kernel version before the update was as follows.

[ec2-user@amzn-201403 ~]$ uname -a
Linux amzn-201403 3.10.53-56.140.amzn1.x86_64 #1 SMP Thu Aug 14 22:00:02 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

With releasever=2014.03 pinned in yum.conf, upgrading the kernel with yum produces the following output:

[ec2-user@amzn-201403 ~]$ sudo yum upgrade kernel-3.10.57-57.140.amzn1
Loaded plugins: priorities, update-motd, upgrade-helper
66 packages excluded due to repository priority protections
Resolving Dependencies
--> Running transaction check
---> Package kernel.x86_64 0:3.10.42-52.145.amzn1 will be updated
---> Package kernel.x86_64 0:3.10.53-56.140.amzn1 will be updated
---> Package kernel.x86_64 0:3.10.57-57.140.amzn1 will be an update
--> Finished Dependency Resolution

Dependencies Resolved

=====================================================================================================================================================================
 Package                           Arch                              Version                                           Repository                               Size
=====================================================================================================================================================================
Updating:
 kernel                            x86_64                            3.10.57-57.140.amzn1                              amzn-updates                             12 M

Transaction Summary
=====================================================================================================================================================================
Upgrade  1 Package

Total download size: 12 M
Is this ok [y/d/N]: y
Downloading packages:
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
kernel-3.10.57-57.140.amzn1.x86_64.rpm                                                                                                        |  12 MB     00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Updating   : kernel-3.10.57-57.140.amzn1.x86_64                                                                                                                1/3
  Cleanup    : kernel.x86_64                                                                                                                                     2/3
warning: file /lib/modules/3.10.53-56.140.amzn1.x86_64/modules.order: remove failed: No such file or directory
warning: file /lib/modules/3.10.53-56.140.amzn1.x86_64/modules.networking: remove failed: No such file or directory
warning: file /lib/modules/3.10.53-56.140.amzn1.x86_64/modules.modesetting: remove failed: No such file or directory
warning: file /lib/modules/3.10.53-56.140.amzn1.x86_64/modules.drm: remove failed: No such file or directory
warning: file /lib/modules/3.10.53-56.140.amzn1.x86_64/modules.builtin: remove failed: No such file or directory
warning: file /lib/modules/3.10.53-56.140.amzn1.x86_64/modules.block: remove failed: No such file or directory
grubby fatal error: unable to find a suitable template
grubby: doing this would leave no kernel entries. Not writing out new config.
  Cleanup    : kernel.x86_64                                                                                                                                     3/3
warning: file /lib/modules/3.10.42-52.145.amzn1.x86_64/modules.order: remove failed: No such file or directory
warning: file /lib/modules/3.10.42-52.145.amzn1.x86_64/modules.networking: remove failed: No such file or directory
warning: file /lib/modules/3.10.42-52.145.amzn1.x86_64/modules.modesetting: remove failed: No such file or directory
warning: file /lib/modules/3.10.42-52.145.amzn1.x86_64/modules.drm: remove failed: No such file or directory
warning: file /lib/modules/3.10.42-52.145.amzn1.x86_64/modules.builtin: remove failed: No such file or directory
warning: file /lib/modules/3.10.42-52.145.amzn1.x86_64/modules.block: remove failed: No such file or directory
grubby fatal error: unable to find a suitable template
  Verifying  : kernel-3.10.57-57.140.amzn1.x86_64                                                                                                                1/3
  Verifying  : kernel-3.10.42-52.145.amzn1.x86_64                                                                                                                2/3
  Verifying  : kernel-3.10.53-56.140.amzn1.x86_64                                                                                                                3/3

Updated:
  kernel.x86_64 0:3.10.57-57.140.amzn1

Complete!

Look closely and you will see some ominous messages like these:

warning: file xxx: remove failed: No such file or directory
grubby fatal error: unable to find a suitable template
grubby: doing this would leave no kernel entries. Not writing out new config.

If you ignore these messages and reboot, the instance will no longer boot.
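These messages are easy to miss in a long transaction log, so it may help to scan saved yum output for them before rebooting. The following is a minimal sketch: the patterns are taken from the messages quoted above and are not an exhaustive list, and the function name find_boot_warnings is made up for this example.

```python
import re
from typing import List

# Patterns based on the messages quoted above; illustrative, not exhaustive.
WARNING_PATTERNS = [
    re.compile(r"^grubby fatal error:"),
    re.compile(r"^grubby: doing this would leave no kernel entries"),
    re.compile(r"^warning: file .*: remove failed:"),
]


def find_boot_warnings(yum_output: str) -> List[str]:
    """Return the lines of yum output that match one of the warning patterns."""
    hits = []
    for line in yum_output.splitlines():
        stripped = line.strip()
        if any(pattern.search(stripped) for pattern in WARNING_PATTERNS):
            hits.append(stripped)
    return hits
```

If the returned list is not empty, it is worth checking /etc/grub.conf before rebooting.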

The following line in particular looked suspicious:

grubby: doing this would leave no kernel entries. Not writing out new config.

So I took a look at /etc/grub.conf:

[ec2-user@amzn-201403 ~]$ cat /etc/grub.conf
# created by imagebuilder
default=0
timeout=3
hiddenmenu

title Amazon Linux 2014.03 (3.10.42-52.145.amzn1.x86_64)
root (hd0,0)
kernel /boot/vmlinuz-3.10.42-52.145.amzn1.x86_64 root=LABEL=/ console=ttyS0
initrd /boot/initramfs-3.10.42-52.145.amzn1.x86_64.img

The files that kernel and initrd point to belong to an old kernel,
so I looked in /boot:

[ec2-user@amzn-201403 ~]$ ls -l /boot/
合計 26784
-rw------- 1 root root  2203204 10月 11 06:43 System.map-3.10.57-57.140.amzn1.x86_64
-rw-r--r-- 1 root root    80264 10月 11 06:43 config-3.10.57-57.140.amzn1.x86_64
drwxr-xr-x 3 root root     4096  6月 12 01:51 efi
drwxr-xr-x 2 root root     4096 11月 26 15:23 grub
-rw------- 1 root root 21333115 11月 26 15:16 initramfs-3.10.57-57.140.amzn1.x86_64.img
-rw-r--r-- 1 root root   128836 10月 11 06:43 symvers-3.10.57-57.140.amzn1.x86_64.gz
-rwxr-xr-x 1 root root  3664144 10月 11 06:43 vmlinuz-3.10.57-57.140.amzn1.x86_64

Sure enough, those files no longer exist, so I rewrote the entries to point at the new kernel:

# /etc/grub.conf

- kernel /boot/vmlinuz-3.10.42-52.145.amzn1.x86_64 root=LABEL=/ console=ttyS0
+ kernel /boot/vmlinuz-3.10.57-57.140.amzn1.x86_64 root=LABEL=/ console=ttyS0
- initrd /boot/initramfs-3.10.42-52.145.amzn1.x86_64.img
+ initrd /boot/initramfs-3.10.57-57.140.amzn1.x86_64.img
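The comparison done by hand above (grub.conf entries against the contents of /boot) can also be sketched in code, for example to confirm before rebooting that every referenced file exists. This is a minimal sketch that assumes the legacy GRUB format shown above, with kernel and initrd paths resolved against the root filesystem; the function names and the root parameter are made up for this example.

```python
import os
from typing import List


def referenced_boot_files(grub_conf_text: str) -> List[str]:
    """Collect the file paths named on kernel and initrd lines of a legacy GRUB config."""
    paths = []
    for line in grub_conf_text.splitlines():
        fields = line.split()
        # Only kernel/initrd lines name a file; blanks, comments and other keys are skipped.
        if len(fields) >= 2 and fields[0] in ("kernel", "initrd"):
            paths.append(fields[1])
    return paths


def missing_boot_files(grub_conf_text: str, root: str = "/") -> List[str]:
    """Return the referenced paths that do not exist under the given root directory."""
    missing = []
    for path in referenced_boot_files(grub_conf_text):
        if not os.path.exists(os.path.join(root, path.lstrip("/"))):
            missing.append(path)
    return missing
```

An empty result from missing_boot_files means every kernel and initrd entry points at a file that is actually there.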

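With this change, the instance boots normally after a reboot.
Even if the instance has already failed and will not boot,
you can recover it by attaching its EBS volume to another instance, mounting it, rewriting /etc/grub.conf, then reattaching the volume to the original instance and starting it (steps omitted).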
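The edit to /etc/grub.conf is a plain version substitution, so it could be scripted. The following is a minimal sketch, not a tested recovery tool: the function name retarget_grub_conf is made up for this example, and reading and writing the file is left to the caller.

```python
def retarget_grub_conf(grub_conf_text: str, old_version: str, new_version: str) -> str:
    """Replace old_version with new_version on kernel and initrd lines only."""
    out = []
    for line in grub_conf_text.splitlines(keepends=True):
        fields = line.split()
        # Other lines, including title, are passed through untouched.
        if fields and fields[0] in ("kernel", "initrd"):
            line = line.replace(old_version, new_version)
        out.append(line)
    return "".join(out)
```

Like the manual edit above, this leaves the title line alone, since it is only a label. On a rescue instance the same function could be applied to the grub.conf on the mounted volume, wherever you chose to mount it.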

The same problem also occurs when updating from Amazon Linux 2014.03 to 2014.09, so be careful.

References

I found something that looks related in Bugzilla, but I have not read through all of it.
It is listed here for reference only.

Bug 730357 – grubby fatal error: unable to find a suitable template