Notes on the Linux kernel
In this article I wrote down a few notes to deepen and recap my knowledge of the Linux kernel. I tried to keep most of it in a simple Q&A format.
General knowledge
What is the Linux kernel? - The kernel is more or less just a program. It is started by the bootloader, which can also pass arguments to it (comparable to command line arguments). The kernel’s purpose is to manage (1) memory, (2) processes and (3) devices, and to provide an interface to user space processes through (4) system calls.
What kernel version does my current system use? - Use the uname -r command to find out. The resulting release string corresponds to a kernel image in the /boot directory, more specifically /boot/vmlinuz-<KERNEL-NUMBER>.
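As a small sketch of that lookup, the snippet below builds the expected image path from the release string and checks whether it exists. The helper name kernel_image_path is made up for this example, and the /boot/vmlinuz-<release> naming is only the convention described above; some distros name the image differently.

```shell
# Build the conventional image path for a kernel release string.
# kernel_image_path is a hypothetical helper name, not a standard tool.
kernel_image_path() {
    printf '/boot/vmlinuz-%s\n' "$1"
}

# Check whether the image for the running kernel exists.
image=$(kernel_image_path "$(uname -r)")
if [ -e "$image" ]; then
    echo "running kernel image: $image"
else
    echo "no image at $image (the distro may use another naming scheme)"
fi
```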
How much RAM is available to the kernel? - The command free -h gives the answer.
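The numbers shown by free come from /proc/meminfo, so they can also be read directly. The following is a minimal sketch; meminfo_kib is a hypothetical helper name, and it assumes the usual "Key: value kB" layout of that file.

```shell
# Print the value (in kiB) of one field from a meminfo-style file.
# meminfo_kib is a hypothetical helper; the file defaults to /proc/meminfo.
meminfo_kib() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Example calls:
#   meminfo_kib MemTotal
#   meminfo_kib MemAvailable
```

The optional second argument is only there so the function can be tried against a saved copy of the file.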
What is a syscall? - Syscalls provide an API between the kernel and user space. In Linux, syscalls are made available to programs by libraries such as glibc.
What syscalls are available? How many syscalls exist? - On my Ubuntu system, the file /usr/include/x86_64-linux-gnu/asm/unistd_64.h lists the syscall names and numbers. I counted the number of syscalls on my system with grep -c -e <pattern> <file> and got 347 syscalls. On other distros the path of the unistd.h file might vary.
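A sketch of how such a count could look. The pattern is an assumption on my side: it relies on every syscall being declared on a line that starts with "#define __NR_", which is how the x86_64 header is usually laid out. The name count_syscalls is hypothetical.

```shell
# Count syscall number definitions in a unistd header.
# Assumption: each syscall appears as a line starting with "#define __NR_".
count_syscalls() {
    grep -c -e '^#define __NR_' "$1"
}

# Example (path from the Ubuntu system described above):
#   count_syscalls /usr/include/x86_64-linux-gnu/asm/unistd_64.h
```

The result depends on the kernel headers installed, so it will likely differ from the 347 mentioned above.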
How to trace syscalls for a given program? - The strace command can be used to trace syscalls. Either attach to a running process with strace -p <pid> or execute and trace a program with strace <program> <args>. The -c option prints only a summary of the syscalls that were used and the number of invocations of each. Depending on the use case, strace must be run with sudo.
How to list the available hardware? - Linux provides a number of command line tools to show information about available hardware, for example:
- lshw lists available hardware
- lspci lists PCI devices
- lsusb lists USB devices
- lscpu lists information about the CPU architecture
- lsblk lists block devices (e.g. hard disks and partitions)
Name two important virtual file systems on a typical Linux system! What exactly is a virtual file system? - Examples of virtual file systems are the /proc and the /sys directories. The /proc directory provides information about running processes, the /sys directory provides information about devices. It’s also possible to modify the operating system or the hardware configuration by writing to certain files in /proc or /sys respectively. These directories are known as virtual file systems because their contents are not stored on disk; instead the contents are produced on demand by an associated kernel function.
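One way to see the "produced on demand" part is to compare the size a file reports with the number of bytes that can actually be read from it. This is only a sketch: size_vs_content is a hypothetical helper name, and it assumes a stat that understands -c (GNU coreutils or busybox).

```shell
# Print the size reported by stat and the number of bytes actually read.
# size_vs_content is a hypothetical helper name.
size_vs_content() {
    reported=$(stat -c %s "$1")
    actual=$(wc -c < "$1")
    # $((...)) normalizes any whitespace that wc may print around the number.
    echo "reported=$reported actual=$((actual))"
}

# Example: size_vs_content /proc/version
```

For a regular file both numbers match. For a file such as /proc/version the reported size is typically 0 although reading it returns text, because the kernel generates the content at read time.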
How do you find out what files are opened by a specific process? - Use the lsof command. More specifically, for a given PID execute lsof -p <pid>.
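The same information is also exposed under /proc/<pid>/fd, where every open file descriptor is a symlink to its target. A minimal sketch, assuming that layout; list_open_files and its optional second argument (an alternative proc root, handy for experiments) are made up for this example.

```shell
# List the file descriptors of a process as "<fd> -> <target>".
# list_open_files is a hypothetical helper; $2 optionally overrides /proc.
list_open_files() {
    proc_root=${2:-/proc}
    for fd in "$proc_root/$1"/fd/*; do
        # Entries are symlinks; this also skips the unexpanded glob
        # when the directory is empty or not readable.
        [ -L "$fd" ] || continue
        printf '%s -> %s\n' "${fd##*/}" "$(readlink "$fd")"
    done
}

# Example: list_open_files $$   (descriptors of the current shell)
```

Looking at processes of other users usually requires root, just like with lsof.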
What kind of device files exist in Linux? - Devices can be accessed in the /dev directory. There are 2 types of standard devices (the type can be determined with ls -l, for instance: the first character of the mode column is c or b):
- Character devices are accessed as a byte stream, similar to a file. Drivers usually provide read, write, open and close functions and are accessed through the file system. Unlike with regular files, it’s usually not possible to move back and forth. Instead all data has to be accessed sequentially. Examples of character devices are serial ports, parallel ports and sound cards.
- Block devices can host file systems. The most common examples are hard disks. It’s possible to read from and write to block devices just like character devices, but the way block devices are handled internally by the kernel’s driver interface differs. Block devices are also made available through nodes in the file system.
What’s the point of a device’s major and minor numbers? - When using the ls -l command on a device file you’ll see two numbers in the column that is normally used for the file size. These two numbers are the major device number and the minor device number of the particular device. The major number identifies the driver associated with the device. For example, both /dev/null and /dev/zero have major number 1, so they are managed by the same driver. The minor number is only used by the driver, not by the kernel itself. Therefore minor numbers can be used to differentiate between multiple devices controlled by the same driver.
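The two numbers can also be read without parsing ls -l output. The sketch below assumes a stat that supports -c with the %t and %T format specifiers (GNU coreutils, busybox); dev_numbers is a hypothetical helper name.

```shell
# Print "major:minor" (decimal) for a device node.
# stat prints %t (major) and %T (minor) in hexadecimal,
# so both are converted with shell arithmetic.
dev_numbers() {
    major_hex=$(stat -c %t "$1")
    minor_hex=$(stat -c %T "$1")
    echo "$((0x$major_hex)):$((0x$minor_hex))"
}

# Example: dev_numbers /dev/null; dev_numbers /dev/zero
```

On a standard Linux system this gives 1:3 for /dev/null and 1:5 for /dev/zero: same major number (same driver), different minor numbers.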
Where can you find information about block devices? - Either use the lsblk command or look into the virtual file system directory /sys/block.
How to retrieve information about block/char devices by their major/minor numbers? - By examining the directories /sys/dev/block/<major>:<minor> or /sys/dev/char/<major>:<minor>.
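A short sketch of such a lookup, assuming the sysfs layout in which /sys/dev/block and /sys/dev/char contain one symlink per device, named <major>:<minor>. The helper name sysfs_dev_path is made up for this example.

```shell
# Build the sysfs path for a device from its type (block or char)
# and its major/minor numbers. sysfs_dev_path is a hypothetical helper.
sysfs_dev_path() {
    printf '/sys/dev/%s/%s:%s\n' "$1" "$2" "$3"
}

# The entries are symlinks into /sys/devices; resolving one shows the device:
#   readlink -f "$(sysfs_dev_path char 1 3)"
```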
How to trace output from the kernel via printk()? - The kernel’s log output can be shown with the dmesg or journalctl -k commands.
Booting
What are the stages of the Linux boot process? - Here’s a brief overview of the 6 stages in the Linux boot process:
- Basic Input/Output System (BIOS): The BIOS performs basic hardware checks (power-on self-test), then searches for and executes a bootloader program found in the MBR.
- Master Boot Record (MBR): Special type of boot sector at the very beginning of a partitioned mass storage device. Contains a start program for the bootloader.
- Grand Unified Bootloader (GRUB): GRUB loads and executes the kernel as well as initrd images.
- Kernel: In this stage the kernel selected by GRUB mounts the root file system specified in /boot/grub/grub.cfg (see the root= parameter on the linux line). Then it executes the /sbin/init program with PID 1. On modern Linux systems /sbin/init is usually a symlink to /usr/lib/systemd/systemd. The kernel also establishes a temporary root file system using a RAM disk (initrd) until the real file system is mounted.
- Init: Executes systemd targets (which supersede the concept of runlevels). An overview of the available targets can be displayed with systemctl list-units --type=target --all. For desktop systems (with a window manager) the default is display-manager.target, graphical.target or something similar. It’s possible to change the default target with e.g. sudo systemctl set-default multi-user.target to start the system in text mode.
- Runlevel programs: TODO what files are executed for a systemd target.
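To make the relationship between the old runlevels and systemd targets a bit more concrete, here is a small lookup sketch. The mapping follows the runlevelN.target compatibility aliases that systemd ships; the function name runlevel_to_target is made up for this example.

```shell
# Map a classic SysV runlevel to the systemd target that replaces it.
# runlevel_to_target is a hypothetical helper name.
runlevel_to_target() {
    case "$1" in
        0)     echo poweroff.target ;;
        1)     echo rescue.target ;;
        2|3|4) echo multi-user.target ;;
        5)     echo graphical.target ;;
        6)     echo reboot.target ;;
        *)     echo "unknown runlevel: $1" >&2; return 1 ;;
    esac
}

# Example: runlevel_to_target 3
```

So the text mode mentioned above (multi-user.target) corresponds roughly to the old runlevel 3, and the graphical desktop to runlevel 5.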
How to show the boot log? - Use the journalctl -b command.
When talking about GRUB, which version is usually meant? - Nowadays people almost exclusively use GRUB 2. So when talking about GRUB people usually mean GRUB 2.
Why configure GRUB? - Some options for the kernel can only be configured at boot time.
Which files are used for configuring GRUB? - Default settings are defined in /etc/default/grub. Additionally the boot entries in the GRUB menu are configured by the scripts in the /etc/grub.d directory. Initially GRUB comes with a few standard scripts:
$ ls /etc/grub.d
00_header
05_debian_theme
10_linux
10_linux_zfs
20_linux_xen
20_memtest86+
...
The number in the prefix is used to determine the order of these scripts when generating the GRUB menu.
What options can be used for configuring GRUB? - Executing the info -f grub -n 'Simple configuration' command shows an overview of the variables that can be configured.
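For illustration, this is roughly what a few of those variables could look like in /etc/default/grub. The variable names are documented GRUB settings; the values are only examples and not taken from any particular system.

```shell
# Example fragment for /etc/default/grub (the file is read as shell code).
# The values are illustrative, not recommendations.
GRUB_DEFAULT=0                       # boot the first menu entry by default
GRUB_TIMEOUT=5                       # seconds to show the menu before booting
GRUB_CMDLINE_LINUX_DEFAULT="quiet"   # kernel parameters for normal boot entries only
GRUB_CMDLINE_LINUX=""                # kernel parameters for all entries, including recovery
```

Changes to this file only take effect after the GRUB config file has been regenerated.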
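How to actually apply the changes made in /etc/default/grub or /etc/grub.d? - Executing grub2-mkconfig -o /boot/grub2/grub.cfg generates the actual config file for GRUB, /boot/grub2/grub.cfg. Without the -o option the generated config is only printed to stdout. On Debian/Ubuntu the command is called grub-mkconfig and the file is /boot/grub/grub.cfg; update-grub is a wrapper that does exactly that.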
How to look at the kernel’s command-line arguments? - They are made available through the /proc/cmdline file.
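What happens with unrecognized kernel command-line arguments from GRUB? - The kernel itself ignores them. Arguments that don’t look like module options (no dot in the name) are passed on to init, either as arguments or, for key=value pairs, as environment variables. This can sometimes be useful, because user-space programs can inspect the kernel command-line arguments.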
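A user-space program could inspect the command line roughly like this. It is a minimal sketch: the helper names cmdline_value and cmdline_has_flag are made up, the optional file argument defaults to /proc/cmdline, and quoted values containing spaces are not handled.

```shell
# Print the value of a key=value parameter (last occurrence wins).
# cmdline_value is a hypothetical helper; $2 optionally overrides /proc/cmdline.
cmdline_value() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" |
        awk -v key="$1" '
            index($0, key "=") == 1 { value = substr($0, length(key) + 2); found = 1 }
            END { if (found) print value }'
}

# Succeed if a parameter without a value (e.g. quiet) is present.
cmdline_has_flag() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -q -x -F -e "$1"
}

# Example:
#   cmdline_value root
#   cmdline_has_flag quiet && echo "quiet boot requested"
```

A custom parameter such as myapp.mode=debug (a made-up name) could be read the same way with cmdline_value myapp.mode.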
Where are valid kernel parameters documented? - In the Linux source tree, they’re documented in the Documentation/admin-guide/kernel-parameters.txt file. Nowadays there are more than 500 possible arguments.
List a few important/interesting kernel parameters!
- root specifies the root file system, e.g. by its UUID.
- rdinit allows specifying a program that gets executed from the initial ramdisk instead of /init. It’s possible to run a shell instead of /init, for example. Of course no file system would be mounted etc., but this can be useful for debugging.
- rfkill.default_state allows starting the system in airplane mode if set to 0. That means all communication (wifi, bluetooth, gps, etc.) is blocked by default.
- ro mounts the root device read-only so that no other processes disturb the integrity checks (fsck) until it’s remounted.
- rw mounts the root device read-write. If neither ro nor rw is given, the kernel’s own default is read-only, although an initramfs may decide differently.
- quiet disables most log messages.
What program is responsible for starting system services? - As stated before, systemd is responsible for starting up system services/daemons, and /sbin/init is usually a symlink to systemd.
What concept preceded systemd? - Before systemd was established as the default way to set up services, runlevel scripts were used. These scripts were present in the /etc/rc.d directory.
Loadable kernel modules
TODO
Kernel source
TODO
Configuring and building a kernel
TODO