Debugging with GDB
#1
I'm trying to find the bug that causes the Google test to fail occasionally on the CI servers. Unfortunately, I'm not very familiar with Linux debugging, so I've hit a dead end.

I used a simple bash script to run the test 100 times, and it crashed once during the run, producing a core file. I then loaded the core file in gdb and printed the backtrace; this is what I got:
Code:
Core was generated by `./Google-exe'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f317a840bb9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
Traceback (most recent call last):
  File "/usr/share/gdb/auto-load/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19-gdb.py", line 63, in <module>
    from libstdcxx.v6.printers import register_libstdcxx_printers
ImportError: No module named 'libstdcxx'
(gdb) bt
#0  0x00007f317a840bb9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f317a843fc8 in __GI_abort () at abort.c:89
#2  0x00007f317b064535 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007f317b0626d6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007f317b062703 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f317b0b5ad5 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007f317abd8182 in start_thread (arg=0x7f317a503700) at pthread_create.c:312
#7  0x00007f317a904efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
I have no idea why there are unknown items in the stack trace or what I could do to get more information. It looks as if some debugging information (for libstdc++) is not loaded; what do I need to do to fill it in?
On Windows, missing symbols quite often cause the stack trace to be wrong after the first missing entry, so if the same is true on Linux, filling in the symbols could shed more light on the issue.
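One way to get more out of the core file is sketched below. It assumes the crash may have been recorded on a different thread than the one gdb selected, so it prints a backtrace for every thread, and it lists the loaded shared libraries so that any library without debugging information can be spotted (gdb marks those with an asterisk). The function name dump_all_threads and the core file name are made up for illustration. Filling in the `??` frames would additionally require the distribution's debug-symbol packages for libstdc++ and libc; their names vary by distribution and release, so none are given here.

```shell
#!/usr/bin/env bash
# Sketch only: "dump_all_threads" is a hypothetical helper name.

# Print the shared-library list and a backtrace for every thread in a core file.
# Usage: dump_all_threads <executable> <core-file>
dump_all_threads() {
    local exe=$1 core=$2
    gdb -batch \
        -ex 'set pagination off' \
        -ex 'info sharedlibrary' \
        -ex 'thread apply all bt' \
        "$exe" "$core"
}

# Example (not executed here):
#   dump_all_threads ./Google-exe core > core-report.txt
```

Running gdb with -batch keeps the whole thing non-interactive, so the output can be redirected to a file and attached to a bug report.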

I tried running the 100-repetition script under gdb: start gdb, load the executable, set a breakpoint on the __GI_abort function, then run. Unfortunately it doesn't work, because at the time the executable is loaded, the __GI_abort function is not loaded yet, so a breakpoint cannot be set on it. Here's my experimental script (doesn't work):
Code:
for ((n=0;n<100;n++))
do
    gdb ./Google-exe >> google.out <<HERE
break __GI_abort
run
HERE
done
Can anyone think of something to make this possible?
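A possible way around the "function not loaded yet" problem, sketched under some assumptions: gdb can keep a breakpoint pending until the shared library that defines the symbol is loaded (`set breakpoint pending on`), and breaking on the public name `abort` avoids depending on the internal alias `__GI_abort`. The helper names write_gdb_commands and run_under_gdb and the file name abort.gdb are invented for this example; only the executable and log names come from the script above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and abort.gdb are hypothetical.

# Write the gdb commands to a file so that every run uses the same setup.
write_gdb_commands() {
    cat > "$1" <<'EOF'
set pagination off
set breakpoint pending on
break abort
run
thread apply all bt
EOF
}

# Run the executable under gdb the given number of times,
# appending all gdb output to a single log file.
run_under_gdb() {
    local exe=$1 runs=$2 cmds=$3 log=$4 n
    for ((n = 0; n < runs; n++)); do
        echo "=== run $n ===" >> "$log"
        gdb -batch -x "$cmds" "$exe" >> "$log" 2>&1
    done
}

# Example (not executed here):
#   write_gdb_commands abort.gdb
#   run_under_gdb ./Google-exe 100 abort.gdb google.out
```

If a run reaches abort, gdb stops there and the backtraces of all threads end up in the log; in batch mode gdb then exits and the loop moves on to the next run.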
#2
Ah, did it. I added a custom constructor to cAssertFailure and set the breakpoint there. This let me use the 100-repetition script, and I finally found the problem: the cNetworkSingleton has a HasTerminated flag which was being set too soon, before all the network links had been closed. The links then tried to remove themselves from a cNetworkSingleton that was already marked as HasTerminated, which triggered an assert / throw.
#3
In terms of the actual debugging problem, it looks like either you're on the wrong thread and it wasn't doing anything, or the stack got hosed above the call to abort, so all you can see are libc internals.
#4
Yeah, the stack probably got hosed, but still I'd like to see the libc internals' symbols.