:lang: en

= RT Faults

Finding line number information for faults in realtime components

1. Get a version of LinuxCNC which prints the faulting instruction
   address (that includes this version of LinuxCNC)

2. Include debugging info in your modules. +
   For built-in modules, below the definition of EXTRA_CFLAGS in
   Makefile, add:
+
  EXTRA_CFLAGS += -g
+
For standalone modules, add the same line just above the line:
+
  ifeq ($(BUILDSYS),kbuild)
+
and (re)build the component

3. Run hal until the fault occurs. +
   *DO NOT EXIT THE HAL SESSION YET*. +
   You must find the start of the module (step 5) first.

4. Note the ip (instruction pointer) address in dmesg, e.g.:
+
  RTAPI: Task 1[c2800000]: Fault with vec=14, signo=11 ip=c93dc01a.
                                                          ^^^^^^^^

5. Find the module which contains the offending IP.
+
  $ cat /proc/modules
  motmod 142230 0 - Live 0xc93df000
  fault 1626 1 motmod, Live 0xc93dc000
  hal_lib 30517 2 motmod,fault, Live 0xc93d5000
+
Now you can exit hal/emc2.

6. Subtract the start of the module from the faulting ip (in this case,
   0x1a) +
   Among other ways to do this, you can use the shell:
+
  $ printf "0x%x\n" $((0xc93dc01a-0xc93dc000))
  0x1a

7. Use addr2line to find out the source code line:
+
  $ addr2line -e emc2-dev/src/fault.ko 0x1a
  /usr/src/linux-headers-2.6.32-122-rtai/hal/components/fault.comp:9
+
Ignore how the directory name is wrong and see whether this has helped
you localize the problem:
+
  fault.comp:9          *(int*)0 = 0;
+
Yup! Looks like it has.
+
Note that even if you do not prefix the address argument to addr2line
with 0x, it is interpreted as a hex number, and you'll get the wrong
line-number information if passing a decimal number. Thus take care
to always use hex addresses with addr2line.

// vim: set syntax=asciidoc: