[BL training-materials-updates] Kernel debugging lab improvement

Michael Opdenacker michael.opdenacker at bootlin.com
Mon Dec 3 21:24:00 CET 2018

Repository : git://git.free-electrons.com/training-materials.git
On branch  : master
Link       : http://git.free-electrons.com/training-materials/commit/?id=9a013bc72adfd9c1f8d1659478e8d22a74050992


commit 9a013bc72adfd9c1f8d1659478e8d22a74050992
Author: Michael Opdenacker <michael.opdenacker at bootlin.com>
Date:   Mon Dec 3 21:24:00 2018 +0100

    Kernel debugging lab improvement
    - Stop proposing to disassemble the whole kernel
      (very slow)
    - Add details about using the offset in the crash
      info to find out the exact instruction where the
      crash happened.
    Signed-off-by: Michael Opdenacker <michael.opdenacker at bootlin.com>


 labs/kernel-debugging/kernel-debugging.tex | 39 +++++++++++++++++++++---------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/labs/kernel-debugging/kernel-debugging.tex b/labs/kernel-debugging/kernel-debugging.tex
index 97751e3..5f6404b 100644
--- a/labs/kernel-debugging/kernel-debugging.tex
+++ b/labs/kernel-debugging/kernel-debugging.tex
@@ -103,24 +103,32 @@ Using Elixir or the kernel source code, have a look at the definition of this
 function. This, with a careful review of the driver source code should
 probably be enough to help you understand and fix the issue.
-\section{Further analysis of the problem}
+\section{Locating the exact line where the error happens}
-If the function source code is not enough, then you can look at the
-disassembled version of the function.
+Even if you have already found out which instruction caused the crash,
+the crash report contains further useful information.
+If you look at it again, the report tells you at what offset in the
+function the crash happens. Let's disassemble the code for this function
+to understand exactly where the issue happened.
 That's where we need a kernel compiled with \code{CONFIG_DEBUG_INFO}
 as we did at the beginning of this lab. This way, the kernel is
 compiled with \code{$(CROSS_COMPILE)gcc -g}, which adds debugging
 information mapping the instructions back to the source code.
-You can use either;
+You could disassemble the whole \code{vmlinux} file and work with
+the \code{PC} absolute address, but it is going to take a long time.
+Instead, using Elixir or \code{cscope}, find the \code{.c} source file where
+the function is implemented. In the kernel sources, you can then find
+and disassemble the corresponding \code{.o} file:
-cd ~/linux-kernel-labs/src/linux/
-arm-linux-gnueabi-objdump -S vmlinux > vmlinux.disasm
+arm-linux-gnueabi-objdump -S file.o > file.S
-or, using \code{gdb-multiarch}\footnote{gdb-multiarch is a new package
+For this purpose, you could also use \code{gdb-multiarch}\footnote{gdb-multiarch is a new package
 supporting multiple architectures at once. If you have a cross
 toolchain including gdb, you can also run arm-linux-gdb directly.}:
@@ -132,9 +140,16 @@ gdb-multiarch vmlinux
 (gdb) disassemble function_name
-Then find at which exact instruction the crash occurs. The offset is
-provided by the crash output, as well as a dump of the code around the
-crashing instruction.
+Then, in the disassembled code, find the start address of the
+function and, using a hexadecimal calculator, add the offset that
+was provided in the crash output. That's how you can find the
+exact assembly instruction where the crash occurred, together
+with the C code it was compiled from.
+A little understanding of assembly instructions on the architecture
+you are working on helps, but seeing the original C code should answer
+most questions.
-Of course, analyzing the disassembled version of the function requires
-some assembly skills on the architecture you are working on.
+Note that the same technique works if the error comes directly from
+the code of a module. Just disassemble the \code{.o} file from which
+the \code{.ko} file was generated.
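
The offset arithmetic described in the last hunk can be sketched in a few lines of Python instead of a hexadecimal calculator. This is only an illustration and not part of the commit: the helper names, the function name my_handler, the start address 0x1c8 and the offset 0x24 are all made up, and the sketch assumes the crash report gives the location in the usual "function+0xOFFSET/0xSIZE" form.

```python
import re

# Assumed format of the location in the crash report: "func+0xOFF/0xSIZE".
# The function name and the numbers used below are hypothetical.
CRASH_RE = re.compile(
    r"(?P<func>[A-Za-z_][\w.]*)\+(?P<off>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
)


def parse_crash_location(line):
    """Return (function name, offset, function size) from a crash report line."""
    match = CRASH_RE.search(line)
    if match is None:
        raise ValueError("no function+offset/size pattern found in: %r" % line)
    return (
        match.group("func"),
        int(match.group("off"), 16),
        int(match.group("size"), 16),
    )


def crash_address(start, offset, size):
    """Add the offset to the function start address found in the disassembly."""
    if offset >= size:
        raise ValueError("offset lies outside the function")
    return start + offset


if __name__ == "__main__":
    # Hypothetical crash report line and start address read from file.S.
    func, offset, size = parse_crash_location("PC is at my_handler+0x24/0x90")
    start = 0x1C8
    print("%s: look for address %x in the disassembly"
          % (func, crash_address(start, offset, size)))
```

The resulting address is the one to search for in the objdump or gdb output. In a .o file, function addresses are relative to the start of the section, which is why the start address has to be read from the disassembly rather than taken from the crash report.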
