Table of Contents
This is going to be a 2-part article describing how one can develop a gdb script for examining tree-like structures in a core file. It also gives some insight on how to orient yourself in a core file / disassembly when you have no debugging symbols.
In the part 1, we will start by reproducing some random crash of Sun's JVM (taken from bugs.sun.com), exploring core file a little bit, and then proceed to indroducing a set of gdb script functions to semi-automate the process.
The target audience for this are gdb beginners, since explanations tend to be rather verbose. Also, as the title suggests, automating part describes pre-python gdb scripting (recently, it has become possible to write gdb scripts with Python, and it's NOT what is described here).
Let's start by picking the bug. This one looks good enough - it has a rather small reproducer, and fails on the pretty much latest 1.6.0_16 JVM release (exactly what I happen to have installed on my machine).
Example 1. CrashTest.java - the failing test
public class CrashTest { public static void main(String[] args) { for (int i = 0; i < 2; i++) NestedLoop(); } public static long NestedLoop() { final int n = 50; long startTime = System.currentTimeMillis(); int x = 0; for (int a = 0; a < n; a++) for (int b = 0; b < n; b++) for (int c = 0; c < n; c++) for (int d = 0; d < n; d++) for (int e = 0; e < n; e++) for (int f = 0; f < n; f++) x++; long stopTime = System.currentTimeMillis(); return stopTime - startTime; } }
Now, I run it and get the core file:
Example 2. Reproducing the failure
$ javac CrashTest.java $ ulimit -c unlimited $ java -server -Xcomp CrashTest # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0xb7062c2c, pid=9322, tid=2873416560 # # JRE version: 6.0_16-b01 # Java VM: Java HotSpot(TM) Server VM (14.2-b01 compiled mode linux-x86 ) # Problematic frame: # V [libjvm.so+0x4bfc2c] # # An error report file with more information is saved as: # .../hs_err_pid9322.log # # If you would like to submit a bug report, please visit: # http://java.sun.com/webapps/bugreport/crash.jsp # Aborted (core dumped) $
Ok, now we have a core file to attach the gdb to:
$ gdb java core
Let's see what the failing thead has been doing:
Example 3. Backtrace of the failing thread
(gdb) backtrace / #0 0xb7863424 in __kernel_vsyscall () | #1 0xb770b8e0 in raise () from /lib/i686/cmov/libc.so.6 error | #2 0xb770ee15 in abort () from /lib/i686/cmov/libc.so.6 repor- | #3 0xb70cfb9f in os::abort(bool) () from /usr/lib/jvm/java-6-sun-1.6.0.16/jre/lib/i386/server/libjvm.so ting | #4 0xb71f2b81 in VMError::report_and_die() () from ... | #5 0xb70d63ac in JVM_handle_linux_signal () from ... \ #6 0xb70d2624 in signalHandler(int, siginfo*, void*) () from ... #7 <signal handler called> / #8 0xb7062c2c in PhaseIdealLoop::get_early_ctrl(Node*) () from ... | #9 0xb70680ea in IdealLoopTree::counted_loop(PhaseIdealLoop*) () from ... | #10 0xb7067ed6 in IdealLoopTree::counted_loop(PhaseIdealLoop*) () from ... compi- | #11 0xb7067ed6 in IdealLoopTree::counted_loop(PhaseIdealLoop*) () from ... lation | #12 0xb7067ed6 in IdealLoopTree::counted_loop(PhaseIdealLoop*) () from ... of java | #13 0xb7067ed6 in IdealLoopTree::counted_loop(PhaseIdealLoop*) () from ... method | #14 0xb7067ed6 in IdealLoopTree::counted_loop(PhaseIdealLoop*) () from ... | #15 0xb706a31c in PhaseIdealLoop::PhaseIdealLoop(PhaseIterGVN&, PhaseIdealLoop const*, bool) () from ... | #16 0xb6dfe34b in Compile::Optimize() () from ... | #17 0xb6dfb5ff in Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool) () from ... | #18 0xb6d95c4f in C2Compiler::compile_method(ciEnv*, ciMethod*, int) () from ... \ #19 0xb6e03d0b in CompileBroker::invoke_compiler_on_method(CompileTask*) () from ... #20 0xb6e03629 in CompileBroker::compiler_thread_loop() () from ... #21 0xb71ba306 in compiler_thread_entry(JavaThread*, Thread*) () from ... #22 0xb71b397e in JavaThread::run() () from ... #23 0xb70d4cce in java_start(Thread*) () from ... #24 0xb783c585 in start_thread () from /lib/i686/cmov/libpthread.so.0 #25 0xb77b029e in clone () from /lib/i686/cmov/libc.so.6 (gdb)
So, JVM crashed while trying to compile some Java method. We can obtain the name of that method from
3rd parameter passed to Compile::Compile()
at frame #17 (and we will), but
for now, an easier way is to look it up in the error report hs_err_pid9322.log
(though
it's kind of obvious here which method causes the failure):
... Current CompileTask: C2:306 b CrashTest.NestedLoop()J (115 bytes) ...
Obviously, we don't have sources from which JVM 1.6.0_16 was compiled. But, luckily for us, OpenJDK project exists, which is an open-sourced version of Java. What's important here is that such fundamental data structures as classes representing compilation, java methods, and stuff like that are stable enough and rarely change with respect to fields layout and control flow in general.
One can check out the sources from OpenJDK using Mercurial (we will need hotspot subdirectory only):
$hg clone http://hg.openjdk.java.net/jdk6/jdk6/hotspot hotspot
For example, now we could look up sources corresponding to frame #16 from the backtrace we obtained:
Example 4. Compile::Optimize() call, hotspot/src/share/vm/opto/compile.cpp, line 667
... // Now optimize Optimize(); if (failing()) return; ...
First thing we need (to be able to examine object's fields), is to determine where the object itself resides
in memory. This is easy if we have some code calling member method of the object, like Compile::Optimize()
case discussed above - all we need to know is the value of this
pointer. According to the x86 calling conventions, this
pointer is pushed to stack last just before the call.
Let's look then at the disassembly of frame #17, where Optimize()
was called:
(gdb) set print asm-demangle on (gdb) frame 17 #17 0xb6dfb5ff in Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool) () from ... (gdb) info reg eip eip 0xb6dfb5ff 0xb6dfb5ff <Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool)+1967> (gdb) disassemble $eip-9 $eip Dump of assembler code from 0xb6dfb5f6 to 0xb6dfb5ff: 0xb6dfb5f6 <Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool)+1958>: mov 0x8(%ebp),%eax 0xb6dfb5f9 <Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool)+1961>: push %eax 0xb6dfb5fa <Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool)+1962>: call 0xb6dfdf80 <Compile::Optimize()> End of assembler dump. (gdb)
As you can see, the last thing pushed is a value taken from address $ebp+0x8
, and
it's the value of this
pointer. We can check that both [$ebp+0x8]
and [$esp]
are same at this point:
(gdb) print/x *(void**)($ebp+0x8) $7 = 0xab44c900 (gdb) print/x *(void**)($esp+0x0) $8 = 0xab44c900 (gdb)
Why it's taken precisely from
offset from
0x8
is not very important for now, but that's simply
because call instruction puts some stuff (like return address) on stack, and also in the called method prologue
they store old value of $ebp
bsp
just before storing esp
to
ebp
:
(gdb) disassemble 'Compile::Optimize()' Dump of assembler code for function _ZN7Compile8OptimizeEv: 0xb6dfdf80 <Compile::Optimize()+0>: push %ebp 0xb6dfdf81 <Compile::Optimize()+1>: mov %esp,%ebp ...
Let's double check we got this
right. From class Compile
and class Phase
sources we see that there is a field 'Compile *
C'
(hotspot/src/share/vm/opto/phase.hpp, line 105
) which must be located in
the beginning of any Compile
instance in memory, and Compile::Compile()
stores this
to the field in the very beginning of
the method. Indeed:
(gdb) print/x *0xab44c900@10 // see Compile class instance in memory $14 = {0x0, 0xab44c900, 0x132, 0x100, 0x937e288, 0xffffffff, 0x93d19d4, 0x9304798, 0x0, 0x0} (gdb) ^^^ here it is, Compile *C == this
Our ultimate goal is to examine 'InlineTree * _ilt'
field of a Compile
class instance. This is a tree-like structure holding information about what methods the
compiler decided to inline during the compilation. The class is inherently recursive (children are listed in
the _subtrees
field):
Example 5. class InlineTree source code (hotspot/src/share/vm/opto/parse.hpp, line 32)
class InlineTree : public ResourceObj { Compile* C; // cache JVMState* _caller_jvms; // state of caller ciMethod* _method; // method being called by the caller_jvms InlineTree* _caller_tree; uint _count_inline_bcs; // Accumulated count of inlined bytecodes // ... skipped ... GrowableArray<InlineTree*> _subtrees; ...
We will return to this InlineTree
structure in the part 2, but for now let's
concentrate on printing ciMethod
s, and subsequently, ciSymbol
, ciInstanceKlass
, and ciSignature
:
Example 6. ciMethod class sources (hotspot/src/share/vm/ci/ciMethod.hpp, line 42)
... private: // General method information. ciFlags _flags; ciSymbol* _name; ciInstanceKlass* _holder; ciSignature* _signature; ciMethodData* _method_data; ...
We will now obtain a pointer to some ciMethod
instance, figure out its layout in
memory, obtain pointers to method's signature, name, klass, and figure out how to print all of them. The we will
proceed to creating gsb script functions to print all of them.
If we look at our backtrace, frame #17, we can see that some
ciMethod*
was passed to Compile()
constructor as
3rd parameter. Knowing that 'this'
pointer is at $ebp+0x8
address in memory,
it's easy to tell where passed ciMethod*
resides on stack:
(gdb) frame 17 #17 0xb6dfb5ff in Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool) () from ... (gdb) print/x *(void**)($ebp+0x8)@7 $18 = {0xab44c900, 0xab44ce00, 0x9343d50, 0x937e288, 0xffffffff, 0x1, 0x0 } this ciEnv* C2Compiler* ciMethod* int bool bool (gdb)
So, 0x937e288
seems to be the address of some ciMethod
object. We can verify it by checking out its virtual table:
(gdb) p/x *0x937e288 // get virtual table address $31 = 0xb7297248 (gdb) p/x *0xb7297248@4 // get addresses of first 4 virtual methods $32 = {0xb6db99d0, 0xb6db9ab0, 0xb6da8d70, 0xb6da8d80} (gdb) x 0xb6db99d0 // check out a method at 1st address 0xb6db99d0 <ciMethod::print_impl(outputStream*)>: 0x57e58955 (gdb) x 0xb6db9ab0 // check out a method at 2nd address 0xb6db9ab0 <ciMethod::type_string()>: 0x53e58955 (gdb)
Now let's look at ciMethod
object layout in memory:
Example 7. ciMethod layout, hotspot/src/share/vm/ci/ciMethod.hpp
(gdb) p/x *0x937e288@30 $33 = {0xb7297248, // virtual table address // ================ ciObject fields begin 0x0938e3ec, // jobject _handle 0x0, // ciKlass* _klass 0x4af, // uint _ident // ================ ciObject fields end // ================ ciFlags fields begin 0xc1000009, // jint _flags // ================ ciFlags fields end 0x937e2e8, // ciSymbol* _name 0x937e2f8, // ciInstanceKlass* _holder 0x937e360, // ciSignature* _signature 0x937e3a8, // ciMethodData* _method_data 0x0, // BCEscapeAnalyzer* _bcea 0x937ef18, // ciMethodBlocks* _method_blocks 0x73, // int _code_size 0x4, // int _max_stack ...
Once again, we can check we deduced this correctly by examining virtual tables of 'ciSymbol*
_name'
and 'ciInstanceKlass* _holder'
:
(gdb) x *0x937e2e8 // ciSymbol* _name 0xb7297d68 <vtable for ciSymbol+8>: 0xb6dc1b00 (gdb) x *0x937e2f8 // ciInstanceKlass* _holder 0xb72970e8 <vtable for ciInstanceKlass+8>: 0xb6db0280 (gdb)
This is how a ciSymbol
is printed in C++ code:
void ciSymbol::print_symbol_on(outputStream *st) { GUARDED_VM_ENTRY(get_symbolOop()->print_symbol_on(st);) } symbolOop ciSymbol::get_symbolOop() { return (symbolOop)get_oop(); } oop ciObject::get_oop() const { assert(_handle != NULL, "null oop"); return JNIHandles::resolve_non_null(_handle); } inline oop JNIHandles::resolve_non_null(jobject handle) { // ... skipped ... oop result = *(oop*)handle; // ... skipped ... return result; };
And this is symbolOopDesc
sources:
Example 8. hotspot/src/share/vm/oops/symbolOops.hpp
class symbolOopDesc : public oopDesc { friend class VMStructs; private: unsigned short _length; // number of UTF8 characters in the symbol jbyte _body[1]; ...
So, if we could obtain symbolOopDesc
object address from our ciSymbol
, we will be able to print its _body
. As seen from the snippet
above, symbolOopDesc
is just a dereferenced ciSymbol::_handle
, so let's obtain symbolOopDesc
for known
ciSymbol*
(0x937e2e8
) and examine it:
(gdb) p/x *0x937e2e8@4 $40 = {0xb7297d68, 0x938e3f0, 0x0, 0x4a9} vtable _handle _klass _ident (gdb) p/x *0x938e3f0 // de-reference _handle, getting oop $41 = 0xafb8c538 (gdb) p/x *0xafb8c538@20 // examine symbolOopDesc in memory $42 = { // ================= oopDesc fields begin 0x1, // markOop _mark 0xaf9d0230, // _metadata union // ================= oopDesc fields end // this is where the actual symbol data starts 0x654e000a, // 000a is 'unsigned short _length' (offset 8 bytes from symbolOopDesc start) 0x64657473, // and jbyte _body offset is 10 bytes from the start 0x706f6f4c, ...
Knowing the offset to the first char, we can easily print the string with gdb:
(gdb) x/s (char*)0xafb8c538+10 0xafb8c542: "NestedLoop" (gdb)
So, finally, our first result - the method on which compilation failed, is "NestedLoop
". Not a huge achievement, since we knew it right from the beginning - it's right
there, in the generated
error file hs_err_pid9322.log
.
At this point we are ready to create first gdb script function, which will print a string for a given
symbolOop
. To do that, we create a file (let's call it ~/gdb_script
) with the following content:
Example 9. Gdb script function for printing symbolOop*
define dump_symbol set $symboloop_length_offset = 8 printf "%s", ($arg0)+$symboloop_length_offset+2 end document dump_symbol Prints symbolOop at address arg0 end
Once loaded into gdb session, dump_symbol function can be used to print out any symbolOop
automatically:
(gdb) source ~/gdb_script (gdb) help dump_symbol Prints symbolOop at address arg0 (gdb) dump_symbol 0xafb8c538 NestedLoop(gdb)
To be able to print a string having ciSymbol* as a starting point, we'll need a couple of extra functions:
Example 10. ciSymbol printing with gdb script
# offset of _handle field in class ciSymbol in bytes set $cisymbol_handle_offset = 4 define read_4 set $$arg0 = *($arg1) end document read_4 Reads 4 bytes from address $arg1 to a variable named by $arg0 end define dump_cisymbol read_4 tmp (($arg0)+$cisymbol_handle_offset) read_4 tmp $tmp dump_symbol $tmp end document dump_cisymbol Dumps ciSymbol* given as arg0 end
There is no concept of return variable in gdb script functions, so we need tricks like the one used in
read_4
function - initializing a global variable with a returned value. What
read_4
does is ultimately a de-referencing of some pointer:
(gdb) p/x *0x937e2e8 // de-reference by gdb $51 = 0xb7297d68 (gdb) read_4 testvar 0x937e2e8 // de-reference by read_4 (gdb) p/x $testvar $52 = 0xb7297d68 // check result (gdb)
Now we can print ciSymbol
, but for printing ciMethod
's full
name, we'd also need to know how to print ciInstanceKlass
(let's skip the signature
for now). I'll omit the details here, since ciInstanceKlass
has ciSymbol*
somewhere inside its body, so it's only a matter of getting correct offsets:
Example 11. ciMethod printing
set $cimethod_holder_offset = 24 set $ciklass_name_offset = 20 set $cimethod_name_offset = 20 define dump_cimethod read_4 klass (($arg0)+$cimethod_holder_offset) read_4 name ($klass+$ciklass_name_offset) dump_cisymbol $name printf "." read_4 name (($arg0)+$cimethod_name_offset) dump_cisymbol $name printf "\n" end document dump_cimethod Dumps ciMethod* given as arg0 end
Note
All offsets, of course, are subject to change depending on a platform (size of void*
, and other types), compiler (paddings) and JVM flavour (debug/product).
We have created several gdb script functions, which make digging through JVM's cores a little bit easier. Let's test:
(gdb) source ~/gdb_script (gdb) apropos dump_ dump_cimethod -- Dumps ciMethod* given as arg0 dump_cisymbol -- Dumps ciSymbol* given as arg0 dump_symbol -- Prints symbolOop at address arg0 (gdb) dump_cimethod 0x937e288 CrashTest.NestedLoop (gdb)
Stay tuned for part 2 of this, where we will introduce stack-aware gdb script functions, and automate walking
of tree-like structures such as InlineTree
.
-
Examining recursive structures with gdb - second part of this article.
-
Debugging with GDB - a short and easy introduction to gdb.
-
User-defined commands - documentation on gdb command files.
-
[Fastboot] Enhanced gdb-script to dump traces of all threads - huge gdb script to use as reference.
No comments:
Post a Comment