Wednesday, August 23, 2006
Diving into a kernel crash dump and banging my head on the bottom
I previously gave an example of diving into a kernel crash dump with mdb. In that case, I was lucky enough to have in a register the pointer to the data structure I wanted to look at. I'm looking at another crash dump, and I'm not so lucky this time. I have to go hunting for the pointer I'm interested in.
Here's the backtrace:
> $C
fffffe800017fc30 strcmp()
fffffe800017fc90 vfs_setmntopt_nolock+0x147()
fffffe800017fce0 vfs_parsemntopts+0x96()
fffffe800017fe10 domount+0xc87()
fffffe800017fe90 mount+0x105()
fffffe800017fed0 syscall_ap+0x97()
fffffe800017ff20 sys_syscall32+0xef()
00000000080b2b80 0xfe45a0cc()
>
The basic problem is that strcmp() is getting passed a NULL pointer. I won't go into the details of that here; what I'm interested in is determining which filesystem was being mounted. domount() is passed a pointer to a vnode, so I'm going to try looking there.
If this were a straight x86 box, I'd be happy. All arguments are passed on the stack, so things are very straightforward. I'd even have the arguments listed in the backtrace, so there'd be no more work than a cut and paste. But this is an x64 box, and arguments are passed in registers, so I have to manually track the value I want as it's moved from the register in which it was passed to the location where it was saved. It may have been saved on the stack, which makes life (relatively) easy, but it may have been saved into a non-volatile register, in which case I need to track it through succeeding stack frames until it gets pushed onto the stack. (Well, okay, this is just basic recursion, with "getting pushed onto the stack" as the base case.)
So here's what domount() looks like:
int
domount(char *fsname, struct mounta *uap, vnode_t *vp, struct cred *credp,
    struct vfs **vfspp)
{
Okay, so the vnode pointer I'm interested in is argument 3. And, as everyone knows (or at least can figure out after looking at a good reference), the third argument is passed in %rdx. What I need to do is track where domount() stores this:
> domount::dis
domount:        pushq  %rbp
domount+1:      movq   %rsp,%rbp
domount+4:      pushq  %r15
domount+6:      movq   %rdx,%r15
[ ... ]
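For anyone without the reference handy, the argument-to-register lookup can be sketched in a few lines of Python. This assumes the System V AMD64 calling convention for integer and pointer arguments; the names `SYSV_AMD64_INT_ARG_REGS` and `arg_register` are made up for illustration and have nothing to do with mdb.

```python
# Integer/pointer argument registers under the System V AMD64 calling
# convention, in argument order (assumed to be the convention in use here).
SYSV_AMD64_INT_ARG_REGS = ("%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9")


def arg_register(position):
    """Return the register carrying the 1-based integer/pointer argument.

    Returns None when the argument no longer fits in a register and is
    passed on the stack instead.
    """
    if position < 1:
        raise ValueError("argument positions are 1-based")
    if position > len(SYSV_AMD64_INT_ARG_REGS):
        return None
    return SYSV_AMD64_INT_ARG_REGS[position - 1]


# domount(fsname, uap, vp, credp, vfspp): vp is argument 3.
print(arg_register(3))  # %rdx
```

The sketch ignores floating-point arguments and structures passed by value, neither of which shows up in domount()'s signature.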
Okay, so domount() stores %rdx into %r15 (a non-volatile register). This means more work, as I have to go look at vfs_parsemntopts() to see where it stores %r15. But first, let me check that %r15 isn't overwritten anywhere else in domount() before the instruction of interest (domount+0xc87):
> domount::dis ! grep '%r15$'
domount+4:      pushq  %r15
domount+6:      movq   %rdx,%r15
domount+0x37f:  popq   %r15
domount+0x66e:  popq   %r15
>
Okay, so domount() is overwriting the register in a couple of places, and they're before the instruction of interest. So I check them out:
> domount+0x37f::dis
domount+0x35b:  movl   %eax,%r14d
domount+0x35e:  je     -0x29c
domount+0x364:  movl   $0x16,%eax
domount+0x369:  cmpl   $0x4e,%r14d
domount+0x36d:  cmovl.ne %r14d,%eax
domount+0x371:  addq   $0xf8,%rsp
domount+0x378:  popq   %rbx
domount+0x379:  popq   %r12
domount+0x37b:  popq   %r13
domount+0x37d:  popq   %r14
domount+0x37f:  popq   %r15
domount+0x381:  leave
domount+0x382:  ret
[ ... ]
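The grep-and-eyeball step can also be roughed out in Python. This is only a sketch: it assumes the disassembly is available as plain text in the `address: mnemonic operands` layout shown above, and `classify_register_hits` is a hypothetical helper, not an mdb feature. It keeps the lines whose last operand is the register (the same thing the `'%r15$'` pattern matches) and labels each one by mnemonic.

```python
def classify_register_hits(disassembly, register):
    """Keep instructions whose last operand is `register` and label them.

    Caveat: anything that is not a push or a pop is labelled "overwritten",
    which would mislabel compare-style instructions that only read the
    register. Good enough for a first pass, not a real decoder.
    """
    hits = []
    for line in disassembly.strip().splitlines():
        address, _, instruction = line.partition(":")
        fields = instruction.split()
        # Skip operand-less instructions and ones that end in another register.
        if len(fields) < 2 or not fields[-1].endswith(register):
            continue
        mnemonic = fields[0]
        if mnemonic.startswith("push"):
            kind = "saved"        # prologue: caller's value goes on the stack
        elif mnemonic.startswith("pop"):
            kind = "restored"     # epilogue: caller's value comes back
        else:
            kind = "overwritten"  # the register gets a new value
        hits.append((address.strip(), kind))
    return hits


SAMPLE = """
domount+4: pushq %r15
domount+6: movq %rdx,%r15
domount+0x371: addq $0xf8,%rsp
domount+0x37f: popq %r15
domount+0x381: leave
"""

for address, kind in classify_register_hits(SAMPLE, "%r15"):
    print(address, kind)
```

A "restored" hit that sits right before a leave/ret pair is an exit path, so it can't clobber the value on the path that reaches the call I care about.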
So this looks like it's just an early exit from the function, and the second instance is similar, so I probably don't have to worry about these two cases. So that leaves me looking at vfs_parsemntopts():
> vfs_parsemntopts::dis
vfs_parsemntopts:       pushq  %rbp
vfs_parsemntopts+1:     movq   %rsp,%rbp
vfs_parsemntopts+4:     pushq  %r15
vfs_parsemntopts+6:     movl   $0x1,%r15d
vfs_parsemntopts+0xc:   pushq  %r14
vfs_parsemntopts+0xe:   pushq  %r13
vfs_parsemntopts+0x10:  pushq  %r12
vfs_parsemntopts+0x12:  pushq  %rbx
vfs_parsemntopts+0x13:  movq   %rsi,%rbx
vfs_parsemntopts+0x16:  subq   $0x18,%rsp
[ ... ]
Woohoo! vfs_parsemntopts() pushes %r15 onto the stack, so I'm done looking for the vnode pointer. So, pull it off the stack, dereference it as a vnode_t, and I get the name of the filesystem that was being mounted (or at least a cached guess):
> fffffe800017fce0-8/J
0xfffffe800017fcd8:             ffffffff827afdc0
> ffffffff827afdc0::print -t vnode_t
[ ... ]
    char *v_path = 0xffffffff97b11e70 "/netapp/some/filesystem"
[ ... ]
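The whole hunt, including the "-8" in that last command, boils down to a small recursion, sketched here in Python. It assumes each callee sets up %rbp first and then pushes its callee-saved registers, so the n-th push after `movq %rsp,%rbp` lands at %rbp - 8*n. The function `find_saved_slot` and the frame tuples are hypothetical; the frame pointer and push order are copied from the mdb output above.

```python
def find_saved_slot(register, callee_frames):
    """Walk the callee frames until one of them pushes `register`.

    callee_frames: list of (name, frame_pointer, pushes), ordered from the
    immediate callee down toward the crashing function, where `pushes` is
    the register push order after the frame pointer is set up.
    Returns (name, slot_address), or None if no frame saved the register
    (in which case the value is still live in the register itself).
    """
    if not callee_frames:
        return None
    name, frame_pointer, pushes = callee_frames[0]
    if register in pushes:
        # Base case: the register was pushed; compute its stack slot.
        return name, frame_pointer - 8 * (pushes.index(register) + 1)
    # Recursive case: this frame left the register alone; try the next one.
    return find_saved_slot(register, callee_frames[1:])


frames = [
    ("vfs_parsemntopts", 0xfffffe800017fce0,
     ["%r15", "%r14", "%r13", "%r12", "%rbx"]),
]

name, slot = find_saved_slot("%r15", frames)
print(name, hex(slot))  # vfs_parsemntopts 0xfffffe800017fcd8
```

The computed slot matches the address mdb printed for `fffffe800017fce0-8/J`. The sketch would give wrong answers for a function that adjusts %rsp between its pushes or doesn't use a frame pointer at all, so the prologue still has to be read by hand.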