1. Introduction:
Booting a Linux system is influenced by the design of the hardware involved in booting. The whole boot mechanism can be divided into several logical steps, regardless of whether you are booting a standard x86 desktop machine or an embedded system running Linux. This article explains the boot process from the initial power-on state to the start of the first user application. It also covers other boot-related topics such as boot loaders, kernel decompression, the initial RAM disk, and other elements of Linux boot.
2. Starts with hardware design:
The design of the hardware defines the initial steps to be performed when a system is powered on. How data is read from nonvolatile memory into RAM depends on the types of memory used for volatile and nonvolatile storage. Some devices employ a dedicated core to check the boot code and application for tampering before the actual boot, while in other cases a small piece of code initializes the basic hardware.
3. Boot Sequence:
The following figure shows the different steps involved in booting a typical Linux machine. Each step is described in detail in later sections.
System Start-up:
When a typical Linux desktop machine is powered on, the hardware needs time for the voltage to stabilize and for reset to be released. After reset, the processor executes code from a well-known location. In a desktop machine this location is in the BIOS (Basic Input/Output System), which is stored in flash memory on the motherboard. The purpose of the BIOS is to perform low-level initialization of components such as the cache, memory controller, and clocks. The CPU initially executes a portion of the BIOS directly from flash, which includes this low-level initialization. Once the chipset is initialized, the CPU loads the remaining portion of the BIOS into RAM and executes it. The BIOS searches the bootable devices in order of priority and looks for the first-level boot loader, i.e. the MBR (Master Boot Record). If an MBR is found, it is loaded into RAM. If no MBR is found, the BIOS reports an error such as "no bootable device, insert boot disk".
Embedded systems, by contrast, may have an independent hardware block that performs primary tasks such as initialization of essential hardware and security checks. Once these primary tasks complete successfully, the block resets the CPU. In some cases the reset vector can be configured through such a block, which provides the flexibility to boot from any memory location.
First Stage Boot Loader:
The first stage boot loader resides inside the MBR. The MBR is a 512-byte image containing the first stage boot loader, the partition table, and a magic number (see figure 2). The first 446 bytes of the MBR are the executable primary boot loader, and the next 64 bytes are the partition table, which holds four 16-byte entries, one per partition. The last two bytes are the magic number (0xAA55), used to validate the MBR. The primary boot loader finds and loads the secondary boot loader. It searches the partition table for an active partition. Once an active partition is found, its boot record is read from the device into RAM and executed.
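The MBR layout described above can be checked with a few lines of C. This is only a minimal sketch: the function and macro names are hypothetical, and the offsets inside a partition entry (status byte at offset 0, where 0x80 marks an active partition, and starting LBA at offsets 8 to 11) follow the conventional MBR layout rather than anything stated in this article.

```c
#include <stddef.h>
#include <stdint.h>

#define MBR_SIZE          512
#define MBR_TABLE_OFFSET  446   /* boot code occupies bytes 0..445 */
#define MBR_ENTRY_SIZE    16    /* 4 entries x 16 bytes = 64 bytes */
#define MBR_ENTRY_COUNT   4
#define MBR_SIG_OFFSET    510   /* last two bytes hold the magic number */
#define MBR_ACTIVE_FLAG   0x80  /* assumed: conventional "active" status byte */

/* Returns 1 if the sector ends with the 0xAA55 magic number, 0 otherwise. */
int mbr_is_valid(const uint8_t sector[MBR_SIZE])
{
    /* 0xAA55 is stored little-endian: 0x55 at byte 510, 0xAA at byte 511. */
    return sector[MBR_SIG_OFFSET] == 0x55 &&
           sector[MBR_SIG_OFFSET + 1] == 0xAA;
}

/* Returns the index (0..3) of the first active partition, or -1 if the
 * sector is not a valid MBR or no partition is marked active. */
int mbr_find_active(const uint8_t sector[MBR_SIZE])
{
    if (!mbr_is_valid(sector))
        return -1;

    for (int i = 0; i < MBR_ENTRY_COUNT; i++) {
        const uint8_t *entry = sector + MBR_TABLE_OFFSET + i * MBR_ENTRY_SIZE;
        if (entry[0] == MBR_ACTIVE_FLAG)
            return i;
    }
    return -1;
}

/* Returns the starting sector (LBA) of partition `index`, decoded from the
 * little-endian 32-bit field at bytes 8..11 of the entry. */
uint32_t mbr_partition_start_lba(const uint8_t sector[MBR_SIZE], int index)
{
    const uint8_t *entry = sector + MBR_TABLE_OFFSET + index * MBR_ENTRY_SIZE;
    return (uint32_t)entry[8] |
           (uint32_t)entry[9] << 8 |
           (uint32_t)entry[10] << 16 |
           (uint32_t)entry[11] << 24;
}
```

A caller would read the first 512 bytes of a disk into a buffer, pass it to mbr_find_active(), and use mbr_partition_start_lba() to locate the boot record of the active partition.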
Embedded systems use a bootstrap environment such as U-Boot or RedBoot. These programs are stored at a predefined location in the flash memory of the target platform. They download and decompress the kernel image into RAM and then execute it. They also perform some system tests and hardware initialization before downloading the kernel. They cover the functionality of both the first stage and second stage boot loaders.
Second Stage Boot Loader:
The second stage boot loader is also called the kernel loader. Its task is to load the kernel and an optional initial RAM disk. LILO (Linux Loader) and GRUB (Grand Unified Bootloader) are generally used in the x86 desktop environment; they include the functionality of both the first stage and second stage boot loaders. GRUB understands Linux file systems, so it can load a Linux kernel from an ext2 or ext3 file system. It breaks the two-stage process into three stages: stage 1, 1.5, and 2. In stage 1.5, it builds knowledge of the file system containing the Linux kernel. The stage 2 boot loader displays a list of kernels available for booting. Once a kernel is selected, the second stage loader consults the file system and loads the kernel image and the initial RAM disk into RAM. The kernel loaded into RAM is typically not a directly executable kernel but a compressed image. The head of the kernel image contains a utility routine that initializes some hardware and decompresses the kernel image. The second stage boot loader executes this routine.
Kernel:
When the bzImage (for an i386 image) is invoked by the second stage boot loader, execution begins at ./arch/i386/boot/head.S in the start assembly routine (see Figure 3 for the major flow). This routine does some basic hardware setup and invokes the startup_32 routine in ./arch/i386/boot/compressed/head.S. That routine sets up a basic environment (stack, etc.) and clears the Block Started by Symbol (BSS) section. The kernel is then decompressed through a call to a C function called decompress_kernel (located in ./arch/i386/boot/compressed/misc.c). Once the kernel is decompressed into memory, it is called. The entry point is yet another startup_32 function, but this one is in ./arch/i386/kernel/head.S.
In the new startup_32() function (also called the swapper or process 0; PID 0), the page tables are initialized and memory paging is enabled. The type of CPU is detected along with any optional floating-point unit (FPU). The start_kernel() function (init/main.c) is then invoked, which takes you to the non-architecture-specific Linux kernel. This is, in essence, the main function for the Linux kernel.
With the call to start_kernel(), a long list of initialization functions is called to set up interrupts, perform further memory configuration, and load the initial RAM disk. At the end, a call is made to kernel_thread (in arch/i386/kernel/process.c) to start the init function, which is the first user-space process. Finally, the idle task is started and the scheduler can take control (after the call to cpu_idle). With interrupts enabled, the pre-emptive scheduler periodically takes control to provide multitasking.
initrd:
During the boot of the kernel, the initial-RAM disk (initrd) that was loaded into memory by the stage 2 boot loader is copied into RAM and mounted. This initrd serves as a temporary root file system in RAM and allows the kernel to fully boot without having to mount any physical disks. Since the necessary modules needed to interface with peripherals can be part of the initrd, the kernel can be very small, but still support a large number of possible hardware configurations. After the kernel is booted, the root file system is pivoted (via pivot_root) where the initrd root file system is unmounted and the real root file system is mounted.
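The pivot step described above can be sketched in C. This is a hedged illustration, not the code any particular initrd uses: the function name switch_to_real_root is hypothetical, the sketch assumes the real root file system is already mounted at new_root and that the directory new_root/put_old exists, and it needs root privileges on Linux to do anything useful.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Makes new_root the root file system and detaches the old (initrd) root.
 * put_old is a directory name relative to new_root where the old root is
 * parked temporarily. Returns 0 on success, -1 on any failure.
 */
int switch_to_real_root(const char *new_root, const char *put_old)
{
    char old_path[256];

    /* Enter the new root so that "." refers to it. */
    if (chdir(new_root) != 0)
        return -1;

    /* pivot_root is invoked through syscall(2) here. */
    if (syscall(SYS_pivot_root, ".", put_old) != 0)
        return -1;

    /* The old root now lives at /<put_old>; move to the new "/". */
    if (chdir("/") != 0)
        return -1;

    /* Unmount the initrd root file system, which is no longer needed. */
    snprintf(old_path, sizeof old_path, "/%s", put_old);
    if (umount2(old_path, MNT_DETACH) != 0)
        return -1;

    return 0;
}
```

A call such as switch_to_real_root("/sysroot", "initrd") would park the old root at /initrd and then unmount it; both path names are assumptions chosen for illustration.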
The initrd function allows you to create a small Linux kernel with drivers compiled as loadable modules. These loadable modules give the kernel the means to access disks and the file systems on those disks, as well as drivers for other hardware assets. Because the root file system is a file system on a disk, the initrd function provides a means of bootstrapping to gain access to the disk and mount the real root file system. In an embedded target without a hard disk, the initrd can be the final root file system, or the final root file system can be mounted via the Network File System (NFS).
User Space:
After the kernel is booted and initialized, it starts the first user-space application, /sbin/init. This is the first program invoked that is compiled with the standard C library; prior to this point in the process, no standard C applications have been executed. Init starts the services set up in the system. It runs as the init process with process ID 1 and is the parent of all other processes.
The init process reads the file "/etc/inittab" and uses it to determine how to create processes. The init process is always running and can start or stop processes in response to various signals. The administrator can also make it change system processes and run levels by using the telinit program or by editing the "/etc/inittab" file. Unix and Linux use what are called "run levels". A run level is a software configuration of the system that allows only a selected group of processes to exist. Init can run the system in one of eight run levels. These run levels are 0-6 and S or s. The system runs in only one of these run levels at a time.
0 - halt
1 - Single user mode
2 - Multiuser, without NFS (The same as 3, if you don't have networking)
3 - Full multiuser mode
4 - unused
5 - X11
6 - Reboot
The /etc/inittab file instructs init which runlevel to start the system at and describes the processes to be run at each runlevel. An entry in the inittab file has the following format:
id:runlevels:action:process
Id: A unique sequence of 1-4 characters which identifies an entry in inittab.
Run levels: Lists the run levels for which the specified action should be taken. This field may contain multiple characters for different run levels allowing a particular process to run at multiple run levels.
For example, 123 specifies that the process should be started in run levels 1, 2, and 3.
Action:
Describes which action should be taken. Valid actions are listed below.
respawn:
The process will be restarted whenever it terminates.
wait:
The process will be started once when the specified run level is entered and init will wait for its termination.
once:
The process will be executed once when the specified run level is entered.
boot:
The process will be executed during system boot. The run levels field is ignored.
bootwait:
Same as "boot" above, but init waits for its termination.
off:
This does nothing.
ondemand:
This process will be executed whenever the specified ondemand runlevel is called.
initdefault:
Specifies the run level which should be entered after system boot. If none exists, init will ask for a run level on the console. The process field is ignored.
sysinit:
The process will be executed during system boot. It will be executed before any boot or bootwait entries. The run levels field is ignored.
powerwait:
The process will be executed when init receives the SIGPWR signal. Init will wait for the process to finish before continuing.
powerfail:
Same as powerwait but init does not wait for the process to complete.
powerokwait:
The process will be executed when init receives the SIGPWR signal provided there is a file called "/etc/powerstatus" containing the word "OK". This means that the power has come back again.
ctrlaltdel:
This process is executed when init receives the SIGINT signal. This means someone on the system console has pressed the "CTRL-ALT-DEL" key combination.
kbrequest:
The process will be executed when init receives a signal from the keyboard handler that a special key combination was pressed on the console keyboard.
Process:
Specifies the process to be executed. If the process starts with the '+' character, init will not do utmp accounting for that process. The utmp file allows one to discover information about who is currently using UNIX. The file is a sequence of entries with the following structure declared in the include file:
struct utmp {
    char ut_line[8]; /* tty name */
    char ut_name[8]; /* user id */
    long ut_time;    /* time on */
};
This structure gives the name of the special file associated with the user's terminal, the user's login name, and the time of the login in the form of time(2).
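On current systems the same information is usually read through the POSIX utmpx interface rather than by parsing the file directly. The following is a minimal sketch under that assumption: the function name list_logged_in_users is hypothetical, and the modern struct utmpx has more fields than the historical structure shown above (ut_user, ut_line and ut_tv play the roles of ut_name, ut_line and ut_time).

```c
#include <stdio.h>
#include <time.h>
#include <utmpx.h>

/*
 * Prints one line per logged-in session (user, terminal, login time) to
 * `out` and returns the number of sessions found.
 */
int list_logged_in_users(FILE *out)
{
    struct utmpx *entry;
    int count = 0;

    setutxent();                          /* rewind to the first record */
    while ((entry = getutxent()) != NULL) {
        if (entry->ut_type != USER_PROCESS)
            continue;                     /* skip boot records, dead sessions, etc. */

        time_t login_time = (time_t)entry->ut_tv.tv_sec;

        /* The name fields are fixed-width and may lack a terminating NUL,
         * so the precision limits how many bytes printf reads. */
        fprintf(out, "%-12.*s %-12.*s %s",
                (int)sizeof entry->ut_user, entry->ut_user,
                (int)sizeof entry->ut_line, entry->ut_line,
                ctime(&login_time));      /* ctime() output ends with '\n' */
        count++;
    }
    endutxent();                          /* close the database */

    return count;
}
```

Calling list_logged_in_users(stdout) prints roughly what the who command reports; on a system with no recorded logins it prints nothing and returns 0.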
The /etc/inittab file: (Example)
On line 1 you'll see "id:3:initdefault:". The id is "id", which stands for initdefault. Note that the id is unique among all the numbered lines. The run level is 3, which sets the default starting run level to 3.
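The four-field entry format can be split with a few lines of C. This is a simplified sketch, not the parser init itself uses: the struct, the function names and the field sizes are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* One parsed inittab entry; buffer sizes are illustrative assumptions. */
struct inittab_entry {
    char id[5];          /* 1-4 characters plus terminating NUL */
    char runlevels[16];
    char action[16];
    char process[128];
};

/* Copies `len` bytes into dst and NUL-terminates; -1 if it does not fit. */
static int copy_field(char *dst, size_t dst_size, const char *src, size_t len)
{
    if (len >= dst_size)
        return -1;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}

/*
 * Parses one "id:runlevels:action:process" line.
 * Returns 0 on success, -1 for comments, blank lines or malformed entries.
 */
int parse_inittab_line(const char *line, struct inittab_entry *out)
{
    const char *c1, *c2, *c3;

    if (line[0] == '#' || line[0] == '\0' || line[0] == '\n')
        return -1;

    /* Locate the three colons that separate the four fields. */
    if ((c1 = strchr(line, ':')) == NULL ||
        (c2 = strchr(c1 + 1, ':')) == NULL ||
        (c3 = strchr(c2 + 1, ':')) == NULL)
        return -1;

    /* The process field runs to the end of the line (newline excluded). */
    size_t process_len = strcspn(c3 + 1, "\n");

    if (copy_field(out->id, sizeof out->id, line, (size_t)(c1 - line)) ||
        copy_field(out->runlevels, sizeof out->runlevels, c1 + 1, (size_t)(c2 - c1 - 1)) ||
        copy_field(out->action, sizeof out->action, c2 + 1, (size_t)(c3 - c2 - 1)) ||
        copy_field(out->process, sizeof out->process, c3 + 1, process_len))
        return -1;

    return out->id[0] == '\0' ? -1 : 0;   /* the id must not be empty */
}

/* Returns 1 if the entry lists the given run level character, else 0. */
int entry_runs_at(const struct inittab_entry *entry, char runlevel)
{
    return runlevel != '\0' && strchr(entry->runlevels, runlevel) != NULL;
}
```

Parsing "id:3:initdefault:" yields the id "id", the run levels "3", the action "initdefault" and an empty process field, which matches the description of line 1 above.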
You will also find following entry in /etc/inittab:
“con:2345:respawn:/sbin/getty console”
It instructs the init process to start the getty application for run levels 2, 3, 4, and 5. getty opens a tty port, displays the login prompt, prompts for a login name, and invokes the /bin/login command. login performs various administrative tasks, such as setting the UID and GID of the tty, and sets the HOME, PATH, SHELL, TERM, MAIL, and LOGNAME environment variables. PATH defaults to /usr/local/bin:/bin:/usr/bin for normal users, and to /usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin for root.
After this, login starts the user's shell. If no shell is specified for the user in /etc/passwd, then /bin/sh is used. If no directory is specified in /etc/passwd, then / is used. A Bourne-style login shell first runs the system-wide /etc/profile script; the shell then reads the user's own profile file (.cshrc or .bashrc) from the home directory and displays a prompt for running commands.
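The respawn action used by the getty entry above can be illustrated with fork, exec and wait. This is a deliberately simplified sketch: the function name respawn and the max_runs limit are assumptions added so that the example terminates, whereas the real init keeps restarting the process and reacts to child termination asynchronously instead of blocking on a single child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Starts the program argv[0] and restarts it each time it terminates,
 * at most max_runs times. Returns the number of completed runs, or -1
 * if fork() or waitpid() fails.
 */
int respawn(char *const argv[], int max_runs)
{
    int runs = 0;

    while (runs < max_runs) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;

        if (pid == 0) {
            /* Child: replace this process image with the program. */
            execv(argv[0], argv);
            _exit(127);               /* reached only if execv() failed */
        }

        /* Parent: wait for the child to terminate, then loop to restart. */
        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        runs++;
    }
    return runs;
}
```

With argv set to { "/sbin/getty", "console", NULL }, each iteration corresponds to one login session on the console: when the session ends and the process exits, a fresh getty is started.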