Dracut

问题

Linux启动过程中,kernel先被加载到内存中,然后再挂载根文件系统(硬盘),挂载根文件系统就需要用到该文件系统的驱动。

我们知道Linux采用模块化的设计,大部分模块都是被安装到根文件系统上的,当需要使用它们的时候才被加载到内存当中,这样可以避免kernel过于庞大。那么既然内存中的kernel中并不包含根文件系统的驱动(模块),如何挂载它呢?

Linux系统可以被安装到各种设备上,比如硬盘(不通厂家的硬盘驱动也不同),NFS,LVM,RAID volume等等。如果把所有根文件系统的驱动都放到kernel中,那么Kernel势必会很大。

Initramfs

initramfs就是一个临时的根文件系统,创建initramfs的时候,可以把当前主机的真实根文件系统的驱动(模块)放到这个文件中。

bootloader把kernel和initramfs加载到内存,然后把initramfs的地址通知kernel,由kernel解压initramfs并且把它当作根文件系统,然后执行initramfs里面的init脚本。

当kernel把这个initramfs当作自己的根文件系统之后,当kernel检测到真正的根文件系统之后,就可以借助于udev从这个临时的根文件系统中加载相应的模块。然后通过switch_root命令把根文件系统切换过去。

Dracut

  • dracut可以生成一个通用的initramfs。
  • dracut也可以生成一个只针对当前主机的initramfs。
  • 如果根文件系统在NFS中,dracut会:
    1. 启动主网络接口
    2. 从DHCP服务器获取NFS server地址(也可以静态配置,下面有介绍)
    3. mount NFS server
  • 如果根文件系统在logical volume中,那么LVM相关的helper程序会被启动去扫描和激活LVM。
  • dracut使用udev去监听硬件,当硬件上线后,udev会调用相应的helper程序去处理,比如加载相应的驱动。

udev(userspace device)是一个用户态的工具,可以监听硬件设备的状态,然后根据你设置的rules去执行一些相应的动作。参考: udev introudev rules

使用Dracut创建Initramfs

# dracut option [initramfs] [kernel-version]
# dracut my-initramfs.img 4.18.0-151.el8.x86_64  #指定生成的文件名、为哪个kernel生成initramfs
# dracut --host-only                             #生成只用于本机的initramfs,文件名和kernel都用默认值

#选项
-a,                                              #添加你想要的dracut modules到initramfs中,多个module以空格分隔
-m,                                              #只安装这些dracut modules到initramfs中,多个module以空格分隔
--drivers,                                       #只安装这些driver(driver指的是内核模块,和dracut module不同,后者是dracut自己的东西)
--add-drivers,                                   #除了安装默认driver,还安装这些drivers
--include                                        #通过--include选项注入自己的脚本
  #dracut --include cmdline-preset /etc/cmdline.d/mycmdline.conf initramfs-cmdline-pre.img 
  #上面的命令把当前目录下的cmdline-present拷贝到initramfs的/etc/cmdline.d/mycmdline.conf

#可以通过dracut --list-modules列出所有到dracut modules
#还有很多选项自己参考帮助文档吧

查看Initramfs的内容

# lsinitrd | less                                 #查看默认initramfs
# lsinitrd my-initramfs.img | less                #指定文件名

Dracut Kernel cmdline Options

#查看当前的kernel cdmline
# dracut --print-cmdline

#下面是给dracut用的一些kernel cmdline
#root
init=<path to real init>
	specify the path to the init program to be started after the initramfs has finished
root=<path to blockdevice>
	specify the block device to use as the root filesystem.
	Example:
	root=/dev/sda1
	root=/dev/disk/by-path/pci-0000:00:1f.1-scsi-0:0:1:0-part1
	root=/dev/disk/by-label/Root
	root=LABEL=Root
	root=/dev/disk/by-uuid/3f5ad593-4546-4a94-a374-bcfb68aa11f7
	root=UUID=3f5ad593-4546-4a94-a374-bcfb68aa11f7
	root=PARTUUID=3f5ad593-4546-4a94-a374-bcfb68aa11f7
rootfstype=<filesystem type>
	"auto" if not specified.
	Example:
	rootfstype=ext3
rootflags=<mount options>
	specify additional mount options for the root filesystem. If not set, /etc/fstab of the real root will be parsed for special mount options and mounted accordingly.

#drivers
rd.driver.blacklist=<drivername>[,<drivername>,…]
	do not load kernel module <drivername>. This parameter can be specified multiple times.
rd.driver.pre=<drivername>[,<drivername>,…]
	force loading kernel module <drivername>. This parameter can be specified multiple times.

#debug,log
rd.info
	print informational output though "quiet" is set
rd.shell
	allow dropping to a shell, if root mounting fails
rd.debug
	set -x for the dracut shell. 
rd.break
	drop to a shell at the end
rd.break={cmdline|pre-udev|pre-trigger|initqueue|pre-mount|mount|pre-pivot|cleanup}
	drop to a shell on defined breakpoint
rd.udev.info
	set udev to loglevel info
rd.udev.debug
	set udev to loglevel debug

#network
ip={dhcp|on|any|dhcp6|auto6|either6}
ip=<interface>:{dhcp|on|any|dhcp6|auto6}[:[<mtu>][:<macaddr>]]
ip=<client-IP>:[<peer>]:<gateway-IP>:<netmask>:<client_hostname>:<interface>:{none|off|dhcp|on|any|dhcp6|auto6|ibft}[:[<mtu>][:<macaddr>]]
ip=<client-IP>:[<peer>]:<gateway-IP>:<netmask>:<client_hostname>:<interface>:{none|off|dhcp|on|any|dhcp6|auto6|ibft}[:[<dns1>][:<dns2>]]
rd.route=<net>/<netmask>:<gateway>[:<interface>]
bootdev=<interface>
	specify network interface to use routing and netroot information from. Required if multiple ip= lines are used.
BOOTIF=<MAC>
	specify network interface to use routing and netroot information from.
rd.bootif=0
	Disable BOOTIF parsing, which is provided by PXE
nameserver=<IP> [nameserver=<IP> …]
	specify nameserver(s) to use

#NFS
root=[<server-ip>:]<root-dir>[:<nfs-options>]
	mount nfs share from <server-ip>:/<root-dir>, if no server-ip is given, use dhcp next_server. 
root=nfs:[<server-ip>:]<root-dir>[:<nfs-options>], root=nfs4:[<server-ip>:]<root-dir>[:<nfs-options>], root={dhcp|dhcp6}
root=dhcp alone directs initrd to look at the DHCP root-path where NFS options can be specified.

	Example:
    	root-path=<server-ip>:<root-dir>[,<nfs-options>]
    	root-path=nfs:<server-ip>:<root-dir>[,<nfs-options>]
    	root-path=nfs4:<server-ip>:<root-dir>[,<nfs-options>]

Dracut的执行

dracut在生成initramfs时,会把一个最基本的dracut module: /usr/lib/dracut/modules.d/99base 安装到initramfs中。 kernle解压initramfs后,就会执行99base模块下的init.sh 而init.sh按照下面的流程执行:

                                    systemd-journal.socket
                                               |
                                               v
                                    dracut-cmdline.service
                                               |
                                               v
                                    dracut-pre-udev.service
                                               |
                                               v
                                     systemd-udevd.service
                                               |
                                               v
local-fs-pre.target                dracut-pre-trigger.service
         |                                     |
         v                                     v
 (various mounts)  (various swap  systemd-udev-trigger.service
         |           devices...)               |             (various low-level   (various low-level
         |               |                     |             services: seed,       API VFS mounts:
         v               v                     v             tmpfiles, random     mqueue, configfs,
  local-fs.target   swap.target     dracut-initqueue.service    sysctl, ...)        debugfs, ...)
         |               |                     |                    |                    |
         \_______________|____________________ | ___________________|____________________/
                                              \|/
                                               v
                                        sysinit.target
                                               |
                             _________________/|\___________________
                            /                  |                    \
                            |                  |                    |
                            v                  |                    v
                        (various               |              rescue.service
                       sockets...)             |                    |
                            |                  |                    v
                            v                  |              rescue.target
                     sockets.target            |
                            |                  |
                            \_________________ |                                 emergency.service
                                              \|                                         |
                                               v                                         v
                                         basic.target                             emergency.target
                                               |
                        ______________________/|
                       /                       |
                       |                       v
                       |            dracut-pre-mount.service
                       |                       |
                       |                       v
                       |                  sysroot.mount
                       |                       |
                       |                       v
                       |             initrd-root-fs.target
           (custom initrd services)            |
                       |                       v
                       |             dracut-mount.service
                       |                       |
                       |                       v
                       |            initrd-parse-etc.service
                       |                       |
                       |                       v
                       |            (sysroot-usr.mount and
                       |             various mounts marked
                       |               with fstab option
                       |                x-initrd.mount)
                       |                       |
                       |                       v
                       |                initrd-fs.target
                       \______________________ |
                                              \|
                                               v
                                          initrd.target
                                               |
                                               v
                                    dracut-pre-pivot.service
                                               |
                                               v
                                     initrd-cleanup.service
                                          isolates to
                                    initrd-switch-root.target
                                               |
                                               v
                        ______________________/|
                       /                       |
                       |        initrd-udevadm-cleanup-db.service
                       |                       |
           (custom initrd services)            |
                       |                       |
                       \______________________ |
                                              \|
                                               v
                                   initrd-switch-root.target
                                               |
                                               v
                                   initrd-switch-root.service
                                               |
                                               v
                                          switch-root

从上面的流程图可以发现,dracut给我们提供了一些hook,让我们可以在一些特定的时间点做自己想做的事情。

  1. Hook: cmdline
    The cmdline hook is a place to insert scripts to parse the kernel command line and prepare the later actions, like setting up udev rules and configuration files. In this hook the most important environment variable is defined: root. The second one is rootok, which indicates, that a module claimed to be able to parse the root defined. So for example, root=iscsi:…. will be claimed by the iscsi dracut module, which then sets rootok.

  2. Hook: pre-udev
    This hook is executed right after the cmdline hook and a check if root and rootok were set. Here modules can take action with the final root, and before udev has been run.

  3. Start Udev
    Now udev is started and the logging for udev is setup.

  4. Hook: pre-trigger
    In this hook, you can set udev environment variables with udevadm control –property=KEY=value or control the further execution of udev with udevadm.

  5. Trigger Udev
    udev is triggered by calling udevadm trigger, which sends add events for all devices and subsystems.

  6. Main Loop
    In the main loop of dracut loops until udev has settled and all scripts in initqueue/finished returned true. In this loop there are three hooks, where scripts can be inserted by calling /sbin/initqueue.

  7. Initqueue
    This hook gets executed every time a script is inserted here, regardless of the udev state.

  8. Initqueue settled
    This hooks (initqueue/settled) gets executed every time udev has settled.

  9. Initqueue timeout
    This hooks (initqueue/timeout) gets executed, when the main loop counter becomes half of the rd.retry counter.

  10. Initqueue finished
    This hook (initqueue/finished) is called after udev has settled and if all scripts herein return 0 the main loop will be ended. Abritary scripts can be added here, to loop in the initqueue until something happens, which a dracut module wants to wait for.

  11. Hook: pre-mount
    Before the root device is mounted all scripts in the hook pre-mount are executed. In some cases (e.g. NFS) the real root device is already mounted, though.

  12. Hook: mount
    This hook is mainly to mount the real root device.

  13. Hook: pre-pivot
    This hook is called before cleanup hook, This is a good place for actions other than cleanups which need to be called before pivot.

  14. Hook: cleanup
    This hook is the last hook and is called before init finally switches root to the real root device. This is a good place to clean up and kill processes not needed anymore.

  15. Cleanup and switch_root
    Init (or systemd) kills all udev processes, cleans up the environment, sets up the arguments for the real init process and finally calls switch_root. switch_root removes the whole filesystem hierarchy of the initramfs, chroot()s to the real root device and calls /sbin/init with the specified arguments. To ensure all files in the initramfs hierarchy can be removed, all processes still running from the initramfs should not have any open file descriptors left.

编写自己的Dracut Modules(向hook点添加自己的代码)

#在/usr/lib/dracut/modules.d/目录下新建一个目录,比如96test-module
mkdir -p /usr/lib/dracut/modules.d/96test-module
#创建module-setup.sh
cd /usr/lib/dracut/modules.d/96test-module
touch module-setup.sh
#module-setup.sh中至少要存在如下几个方法
check()
{
	return 0   #永远安装这个模块
	return 1   #永远不安装这个模块
	return 255 #如果该模块被--include指定,或者被其他模块依赖,就安装
}

install() #这里面是具体的安装过程
{
	inst_hook cmdline 20 "$moddir/my-script1.sh"       #安装my-script1.sh到cmdline hook
	inst_simple "$moddir/my-cmd1.sh" /sbin/my-cmd1.sh  #安装my-cmd1.sh到initramfs的/sbin/my-cmd1.sh
	                                                   #要在/usr/lib/dracut/modules.d/96test-module中创建my-script1.sh,my-cmd1.sh
}

#然后当执行"dracut"去创建initramfs时,dracut会执行check()判断是否需要安装这个模块,如果需要,就执行install()进行安装

#下面是另外几个可能用到的方法
depends()
{
	#The function depends() should echo all other dracut module names the module depends on.
}
cmdline()
{
	#This function should print the kernel command line options needed to boot the current machine setup. It should start with a space and should not print a newline.
}
installkernel()
{
	#In installkernel() all kernel related files should be installed. You can use all of the functions mentioned in [creation] to install files.
}

参考

dracut文档udev介绍udev rules