Thursday, May 04, 2006

First DTrace success

While we were evaluating the T2000 server, we ran into some performance problems in our application. The software couldn't scale up to fully utilize the hardware.

The traditional approach to debugging this would have been to increase the debugging level and then analyze the logs to find the problem. The trouble with doing so in this case was that the mutex around writes to the log file was itself one of the candidates for the lock contention. Had the analysis pointed elsewhere, we might have gotten useful information; had it pointed at the log-file mutex, it would have told us nothing, because we couldn't have known whether we had introduced that contention ourselves simply by logging more.

This was our first use of Solaris 10 in our environment, and I'd been waiting quite a while to use DTrace on a real problem, so I took the opportunity to do so. Because I had only limited experience with DTrace, finding the problem took longer than it should have, and the path I ended up taking probably wasn't optimal. (In retrospect, the approach I took was embarrassingly brute-force, but I'll detail it here anyway.)

I'll skip the dead-end fumbling I did and just present the final, successful path. I wanted to see where the application was spending all of its time, and I used this DTrace script to find out:

#!/usr/sbin/dtrace -s

/* On entry to any function in the target process ($1 is its pid),
   record a per-thread timestamp keyed by module and function. */
pid$1:::entry
{
    self->ts[probemod, probefunc] = timestamp;
}

/* On return, add the elapsed time to a per-function aggregation
   and clear the saved timestamp. */
pid$1:::return
/self->ts[probemod, probefunc] != 0/
{
    this->elapsed = timestamp - self->ts[probemod, probefunc];
    @functime[probemod, probefunc] = sum(this->elapsed);
    self->ts[probemod, probefunc] = 0;
}

(Note that the above was very expensive, as it enabled tens of thousands of probes. In retrospect, I might have tried a different approach to get this information.)
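One much cheaper alternative would have been to sample the process with the profile provider instead of enabling a probe at every function entry and return. A rough sketch of that approach (this isn't what I actually ran at the time):

#!/usr/sbin/dtrace -s

/* Sample all CPUs at 997 Hz; keep only samples taken in the target
   process ($1 is its pid) and aggregate them by user stack trace. */
profile-997
/pid == $1/
{
    @samples[ustack()] = count();
}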

The results of running the entry/return script told me that the application was spending ~50% of its time in ___lwp_mutex_timedlock(). Okay, so what was the code path leading to this function?

#!/usr/sbin/dtrace -s

/* Record when a thread in the target process ($1 is its pid)
   enters the lock-wait path. */
pid$1::___lwp_mutex_timedlock:entry
{
    self->ts = timestamp;
}

/* On return, aggregate the time spent blocked by user stack trace. */
pid$1::___lwp_mutex_timedlock:return
/self->ts != 0/
{
    this->elapsed = timestamp - self->ts;
    @stacktime[ustack()] = sum(this->elapsed);
    self->ts = 0;
}

Running the above script didn't immediately point to a smoking gun. No single code path accounted for most of the time spent there; it was fairly well distributed through the code. But the code paths did share a common theme, one that would have been immediately apparent had I limited the stack depth to 5. I would then have seen these top two culprits:

              libc.so.1`___lwp_mutex_timedlock+0x30
              libc.so.1`queue_lock+0x60
              libc.so.1`mutex_lock_queue+0xcc
              libc.so.1`lmutex_lock+0xe0
              libc.so.1`malloc+0x44

              libc.so.1`___lwp_mutex_timedlock+0x30
              libc.so.1`queue_lock+0x60
              libc.so.1`mutex_lock_queue+0xcc
              libc.so.1`lmutex_lock+0xe0
              libc.so.1`free+0x1c
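
Limiting the recorded depth is just a matter of passing a frame count to ustack(); the aggregation line in the script above would become:

    @stacktime[ustack(5)] = sum(this->elapsed);
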
So all of this lock contention was coming from libc's malloc() and free(). This actually turned out to be an ideal situation, as we could just set LD_PRELOAD to use libumem. There was no need to make code changes to the application itself. When we tried this, we saw our lock contention problems disappear.
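
For reference, the interposition is just an environment variable set when the application is started; something along these lines, where our_server is a placeholder for the actual binary:

    # Use libumem's malloc/free in place of libc's for this process only.
    LD_PRELOAD=libumem.so.1 ./our_server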

The key points I want to make about the experience are these:

- Using DTrace allowed us to avoid the traditional method of additional logging, which could have masked the real problem given that a mutex is involved in logging.
- DTrace let us discover the problem far more quickly than the traditional method likely would have. Even given my inexperience, it was maybe a few hours' worth of work. With some experience (and with some knowledge of plockstat), it might have taken ten minutes or less; a sample invocation is sketched after this list.
- DTrace pointed us to a problem that we likely wouldn't have considered, as it was outside of our code. And given that we hadn't considered this possibility before discovering the problem, we certainly didn't have debugging statements ready to be enabled that would have caught it.
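
For comparison, plockstat(1M) is built on DTrace and answers exactly this kind of question directly. An invocation along these lines (the flags here are from memory, and <pid> is a placeholder for the application's process id) would have reported the contended mutexes and the stacks acquiring them:

    # Watch both contention and hold events in the target process for
    # 30 seconds, recording user stacks 5 frames deep.
    plockstat -A -s 5 -e 30 -p <pid>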

