Thursday, January 25, 2007

 

Microstate accounting in Solaris 10 and CPU latency

Solaris 10 included some pretty major features such as ZFS, Zones and DTrace. There were also other features that weren't quite as major but that are pretty nifty in and of themselves. One of these features is that microstate accounting is turned on by default in Solaris 10. (It was optional in earlier releases of Solaris.) Eric Schrock blogged about microstate accounting here.

Using 'prstat -m' you can see some of this information. For example:

   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP
  9559 mynhier   27 3.9 0.0 0.0 0.0 0.0 0.0  69   0 186  3K   0 ube/1
  9562 mynhier   18 6.6 0.0 0.0 0.0 0.0 0.0  76   0  92  7K   0 ube/1
  9555 mynhier   21 2.7 0.0 0.0 0.0 0.0 0.0  76   0 143  2K   0 cc1/1
  9564 mynhier   14 8.8 0.0 0.0 0.0 0.0 0.0  77   0 124 13K   0 ube/1
  9551 mynhier   18 2.2 0.0 0.0 0.0 0.0 0.0  80   0 123  2K   0 cc1/1
  9549 mynhier   17 2.1 0.0 0.0 0.0 0.0 0.0  81   0 117  2K   0 cc1/1
  9560 mynhier   16 2.6 0.0 0.0 0.0 0.0 0.0  82   0 350  6K   0 iropt/1
  9507 mynhier  5.9 1.6 0.0 0.0 0.0 0.0  59  34  14  31  2K   0 dmake/1
  9508 mynhier  3.4 1.0 0.0 0.0 0.0 0.0  75  21  14  17  1K   0 dmake/1
  9489 mynhier  2.9 0.7 0.0 0.0 0.0 0.0  96 0.8  15  13  1K   0 dmake/1
  9490 mynhier  2.9 0.7 0.0 0.0 0.0 0.0  96 0.8  15  13  1K   0 dmake/1
   479 mynhier  0.7 0.0 0.0 0.0 0.0 0.0  98 1.7  28   0 113  13 Xorg/1
  9543 mynhier  0.1 0.4 0.0 0.0 0.0 0.0  67  32   7   1 215   0 cc/1
  9536 mynhier  0.1 0.4 0.0 0.0 0.0 0.0  79  20   6   3 207   0 cc/1
  9527 mynhier  0.1 0.4 0.0 0.0 0.0 0.0  95 4.4   6   5 203   0 cc/1
  9554 mynhier  0.1 0.2 0.0 0.0 0.0 0.0  86  14   2   0 221   0 gcc/1
  9550 mynhier  0.1 0.2 0.0 0.0 0.0 0.0  85  15   2   0 180   0 gcc/1
  9548 mynhier  0.1 0.2 0.0 0.0 0.0 0.0  84  16   1   3 163   0 gcc/1
  9546 mynhier  0.1 0.2 0.0 0.0 0.0 0.0  87  12   3   0 200   0 cc/1
  9544 mynhier  0.1 0.2 0.0 0.0 0.0 0.0  83  16   1   1 239   0 sh/1
Total: 111 processes, 238 lwps, load averages: 5.46, 5.79, 5.68
>

The additional columns (TRP through LAT) indicate what percentage of time the process spent handling traps, text page faults, data page faults, sleeping or waiting for CPU.

Given that this is only a two-processor machine, it's fairly obvious from the load average reported above that the machine is overloaded, but the load average doesn't give quite as detailed picture of what's going as does 'prstat -m' output. In particular, the LAT column (CPU latency) is pretty instructive. All of the processes in the above example are spending some percentage of their time fully able to run but waiting to be switched onto a CPU. You could have guessed this from the load average, but load average couldn't have told you that some of these processes were spending over 80% of their time sitting in a run queue.


Comments:
could u elaborate more on what exactly is 'microstate accounting'?
 
Post a Comment



<< Home

This page is powered by Blogger. Isn't yours?