Debug kernel with KGDB

Debug kernel with KGDB

What is KGDB?

KGDB intend to be used as a source code level debugger on a running Linux kernel. It works with GDB and allows the user to inspect memory, variables, setup breakpoints, step lines and instructions. Pretty much the same that all application developers are used to, but for the kernel itself.

Allmost every embedded Linux system does have a serial port available, and that is all that you need to connect GDB to your kernel.

One thing to keep in mind, as with all debugging, is that everything that is called timing will be messed up. It will be pretty obvious when you actually pause a running kernel that keeps up the communicates with all hardware. Especially if you have any hardware watchdogs enabled...

Compile the kernel with support for KGDB

There are a few kernel options that you have to enable in order to use KGDB:

  • CONFIG_KGDB to enable remote debugging.
  • CONFIG_KGDB_SERIAL_CONSOLE let you share a serial console with GDB.
  • CONFIG_FRAME_POINTER is used to produce more reliable stack backtraces by inserting code to preserve the frame informations in registers or on the stack.
  • CONFIG_KALLSYMS_ALL to make sure that all symbols are loaded into the kernel image (i.e. symbols from all sections).
  • CONFIG_MAGIC_SYSRQ to be able to make sysrq. More about this below.

KGDBOC

KGDB over console, or kgdboc, let you use a console port as the debugging port. If we only have one serial port available, we could split the console and gdb communication using agent-proxy [2] .

Agent-proxy

To split a serial port into console and GDB communication we could use agent-proxy. Download and compile agent-proxy

git clone http://git.kernel.org/pub/scm/utils/kernel/kgdb/agent-proxy.git
cd agent-proxy
make

Launch agent-proxy

agent-proxy 4440^4441 0 /dev/ttyS0,115200

If your hardware does not support the line break sequence you have to add the -s003 option. You will find out pretty soon if it is needed - if your target continues to run after sending a break, then you should try to add it, in other words:

agent-proxy 4440^4441 0 /dev/ttyS0,115200 -s003

Where ttyS0 is the serial port on your host.

This will create two TCP sockets, one for serial console and with for GDB, which is listening to port 4440 and 4441 respectively.

Connect to the serial console with your favorite client (socat, netcat, telnet...)

telnet localhost 4440

Setup kgdboc with kernel arguments

kgdboc could be used early in the boot process if it is compiled into the kernel as a built-in (not as module), by providing the kgdboc arguments.

Add kgdboc=<tty-device>,[baud] to your command line argument, e.g.

kgdboc=ttyS0,11500

Where ttyS0 is the serial port on the target.

The kgdbwait argument stops the kernel execution and enter the kernel debugger as earliest as possible. This let you connect to the running kernel with GDB.

See kernel parameters [1] for more information.

Setup kgdboc with kernel module

If the kgdb is not compiled to be built-in but as a module, you provide the same arguments while loading the kernel

modprobe kgdboc=ttyS0,115200

Setup kgdboc in runtime using sysfs

It is also possible to enable kgdboc by echoing parameters into sysfs entries

echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc

Connect GDB to a running kernel

Stop execution and wait for debugger

We have to stop the execution of the kernel in order to connect with gdb.

If gdbwait is provided as boot argument, the kernel will stop its execution and wait.

Otherwise we have to trigger this manually by using SysRq-G. This requires that the CONFIG_MAGIC_SYSRQ is enabled in your kernel config.

Your favorite serial application does probably have some keyboard combination to send SYSRQ requests (GNU Screen has "CTRL+A b" for example), otherwise you can use procfs to send the trigger

echo g > /proc/sysrq-trigger

Connect with GDB

Start GDB and provide the vmlinux from your kernel root directory, remember that you have to use the GDB that came with your toolchain. I do always use the -tui flag to start with a nice terminal user interface

aarch64-linux-gnu-gdb -tui ./vmlinux

Now, if you have a separate serial port for gdb, you could connect to it directly

(gdb) set remotebaud 115200
(gdb) target remote /dev/ttyS0

If you are using agent-proxy, then we should connct to port 4441

(gdb) target remote localhost:4441

Now you are able to set breakpoints, watch variables and use gdb as you used to.

/media/kgdb.jpg

One tip is to set a breakpoint at ksys_sync

(gdb) b ksys_sync

This let you run the sync command as trig to go-to-debug-mode.

-ENOENT, but believe me, it's there

-ENOENT, but believe me, it's there

Almost every ELF-file in a Linux environment is dynamically linked, and the operating system has to locate all dynamic libraries in order to execute the file. To its help, it has the runtime dynamic linker, whose only job is to interpret the ELF file format, load the shared objects with unresolved references, and, at last, execute and pass over the control to the ELF file.

The expected path to the dynamic linker is hardcoded into the ELF itself, so if the linker is missing, it leads to an obvious but maybe confusing error.

If I want to run an executable that has its path to a dynamic linker that simple does not exists, we got this:

root@cgtqmx6:/# ./01_SimpleTriangle -sh: ./01_SimpleTriangle: No such file or directory

Hue? But the file does exist! OK, lets find out which system calls is made.:

root@cgtqmx6:/# strace ./01_SimpleTriangle

execve("./01_SimpleTriangle", ["./01_SimpleTriangle"], [/* 15 vars */]) = -1 ENOENT (No such file or directory)
write(2, "strace: exec: No such file or directory) = 40
exit_group(1) = ?+++ exited with 1 +++

Not much valuable information. Strace calls execve that says the file does not exist. We are talking about the file... but which file is it?

As told before, the runtime dynamic linker is always executed first to load libraries for unresolved symbols. So one qualified guess is that it is the dynamic linker that is missing.

Readelf is a handy tool to read out information from an ELF file such as sections, which architecture it is compiled for and so on. It also shows us the expected path to the dynamic linker.:

root@cgtqmx6:/# readelf -l 01_SimpleTriangle

Elf file type is EXEC (Executable file)
Entry point 0x9288There are 8 program headers, starting at offset 52
Program Headers:  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align  EXIDX          0x0022ec 0x0000a2ec 0x0000a2ec 0x00088 0x00088 R   0x4  PHDR           0x000034 0x00008034 0x00008034 0x00100 0x00100 R E 0x4  INTERP         0x000134 0x00008134 0x00008134 0x00013 0x00013 R   0x1    >>>>>>

[Requesting program interpreter: /lib/ld-linux.so.3] <<<<<<<  LOAD           0x000000 0x00008000 0x00008000 0x02378 0x02378 R E 0x8000         0x002378 0x00012378 0x00012378 0x002a0 0x00358 RW  0x8000  DYNAMIC        0x002384 0x00012384 0x00012384 0x00118 0x00118 RW  0x4  NOTE           0x000148 0x00008148 0x00008148 0x00020 0x00020 R   0x4  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

 Section to Segment mapping:  Segment Sections...   00     .ARM.exidx    01        02     .interp    03     .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .ARM.extab .ARM.exidx .eh_frame    04     .init_array .fini_array .jcr .dynamic .got .data .bss    05     .dynamic    06     .note.ABI-tag    07

Here we go! The expected path is /lib/ld-linux.so.3. Our system is currently using /lib/ld-linux-armhf.so.3, so lets create a symbolic link for now.:

ln -s /lib/ld-linux-armhf.so.3 /lib/ld-linux.so.3

And finally, the program is now executing!:

root@cgtqmx6:/# ./01_SimpleTriangle root@cgtqmx6:/#

Anyway, who decide which runtime dynamic linker to use? It is decided in the linker-stage, usually by ld or gcc and may vary between different toolchains,