Introducing DTrace

Bryan Cantrill (bmc at kiowa.eng.sun.com)
 
From: Bryan Cantrill (bmc at kiowa.eng.sun.com)
Subject: Introducing DTrace
Newsgroups: comp.unix.solaris
Date: 2003-11-14 22:02:13 PST
 
All,

The DTrace cat has been pawing its way out of the metaphorical bag for quite
some time now, with Gavin Maltby posting some especially enticing examples
here on comp.unix.solaris.  Now that Solaris Express 11/03 is publicly
available for download, we on the DTrace engineering team wanted to take a
moment to post some slightly more elaborate examples of this new technology.

Before we get started, note that there is quite a bit of publicly available
documentation on DTrace at BigAdmin:

  http://www.sun.com/bigadmin/content/dtrace

Also, note that this post isn't meant to be a comprehensive tutorial -- it's
really more of a quick demonstration of how DTrace might be used to solve
problems that you may come across in your day-to-day workings with Solaris.

On with the show:  let's say you're looking at mpstat(1) output on your 
multiuser server.  You might see something like this:

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
 12    1  27 3504   338  206  765   27  114   65    1   337    9  19  41  31
 13    1  19 5725    98   68  723   22  108  120    0   692    3  17  20  61
 14    0  57 3873   224  192  670   22   75   86    1   805    7  10  35  48
 15   36   7 1551    42   28  689    6   68   59    0   132    2   9  35  54
 16   14   7 7209   504  457 1031   37  125  244    0   459    6  30   4  60
 17    5   5 4960   150  108  817   37   98  154    0   375    6  26   6  62
 18    5   6 6085  1687 1661  741   60   76  248    0   434    3  33   0  64
 19    0  15 10037    72   41  876   23  100  291    1   454    2  19   9  71
 20   12   5 5890   746  711  992   32  122  216    2   960   10  33   4  53
 21   60   5 1567   729  713  467   15   80   59    0   376    2  35  10  53
 22    0   6 4378   315  291  751   17   84  142    1   312    3  16   1  80
 23    0   6 12119    33    3  874   20   82  384    1   513    4  24  11  62

And well you may wonder (as perhaps you often have) -- what the hell is
causing all of those cross calls, anyway?  (Cross calls appear in the "xcal"
column; see mpstat(1).)  

Using DTrace, investigating this is a snap:

# dtrace -n xcalls'{@[execname] = count()}'
dtrace: description 'xcalls' matched 4 probes
[ letting this run for a few seconds ]
^C

  mozilla-bin                                                       1
  lockd                                                             1
  in.mpathd                                                         2
  nsrmmd                                                            5
  grep                                                              6
  chmod                                                             6
  cat                                                               6
  nwadmin                                                          13
  ls                                                               24
  in.tftpd                                                         28
  nsrindexd                                                        34
  fsflush                                                          38
  cut                                                              42
  find                                                             42
  mkdir                                                            66
  rm                                                               76
  ipop3d                                                           78
  scp                                                              79
  inetd                                                            96
  dtrace                                                          111
  nawk                                                            118
  imapd-simmonmt                                                  126
  rmdir                                                           132
  sshd                                                            138
  rpc.rstatd                                                      159
  mv                                                              398
  save                                                           1292
  gzip                                                           1315
  get_all2                                                       1678
  sched                                                          1712
  nfsd                                                           3709
  tar                                                           27054

So the bulk of these are coming from tar(1) processes.  To explore the cross
calls coming from tar specifically, we could add a predicate to the above:

# dtrace -n xcalls'/execname == "tar"/{@[pid] = count()}'             
dtrace: description 'xcalls' matched 4 probes
[ again letting this run for a few seconds ]
^C

   938339              507
   938415              784
   938229             2438
   938390             3282
   938363             6464
   937067            13984

But now we notice a strange thing:  none of these processes seems to be
around by the time we're back on the shell, e.g.:

# pflags 937067
pflags: cannot examine 937067: no such process or core file

So is someone launching a bunch of short-lived tar processes?  To explore
this as a rough first cut, we could just print the pid whenever a tar
process is launched:

# dtrace -n syscall::exec\*:return'/execname == "tar"/{trace(pid)}'
CPU     ID                    FUNCTION:NAME
 18    112                     exece:return    954538
 21    112                     exece:return    954519
 17    112                     exece:return    954584
 21    112                     exece:return    954579
 21    112                     exece:return    954607
 22    112                     exece:return    954632
 21    112                     exece:return    954658
 20    112                     exece:return    954684
 20    112                     exece:return    954705
 15    112                     exece:return    954727
 14    112                     exece:return    954763
 16    112                     exece:return    954775
 12    112                     exece:return    954798
 18    112                     exece:return    954818
 22    112                     exece:return    954873
 22    112                     exece:return    954889
 23    112                     exece:return    954893
 14    112                     exece:return    954916
 14    112                     exece:return    954971
...

Seeing that this is happening with some frequency (the above has output
every second or so), we want to explore more generally what is being
exec(2)'d on the machine:

# dtrace -n syscall::exec\*:entry'{@[copyinstr(arg0)] = count()}'
[ letting this run for a bit ]
^C

  /usr/lib/mail.local                                               1
  /usr/bin/true                                                     1
  /usr/bin/pgrep                                                    1
  /usr/bin/wc                                                       1
  /usr/bin/touch                                                    1
  /usr/bin/sleep                                                    1
  /usr/sbin/modinfo                                                 4
  /bin/awk                                                          4
  /bin/sh                                                           4
  /usr/bin/chmod                                                   14
  /usr/bin/cat                                                     14
  /usr/bin/tar                                                     14
  ./do1                                                            14
  /usr/bin/cut                                                     15
  /usr/bin/grep                                                    15
  /usr/bin/find                                                    15
  /usr/bin/ls                                                      15
  /usr/bin/mv                                                      15
  /usr/bin/rm                                                      16
  /usr/bin/mkdir                                                   17
  /usr/bin/gzip                                                    29
  /usr/bin/nawk                                                    44
  /usr/bin/rmdir                                                   48
  /home/ellenp/rmdir                                               48
  /home1/netscape/rmdir                                            48
  /usr/dt/bin/rmdir                                                96
  /usr/openwin/bin/rmdir                                           96

There's a bunch that's interesting here -- but for now we're interested 
in those tar processes.  In particular, when someone is exec'ing a tar
process, we want to know who exactly they are.  To do this, we write
the following DTrace script, and call it "whotar.d":

syscall::exec*:entry
/copyinstr(arg0) == "/usr/bin/tar"/
{
 @[execname] = count();
}

Running this points us straight to the culprit:

# dtrace -s whotar.d
[ again letting this run for a bit ]
^C

  get_all2                                                         22

So get_all2 is the problem.  We can now use traditional tools to hone in
on this:

# ptree `pgrep get_all2`
100578 /usr/sbin/inetd -s
  900870 in.rlogind
    900872 -csh
      901382 /bin/sh get_all3
        901383 /bin/sh ./get_all2 80/80acc080 90/00fccc90
          959045 /bin/sh ./get_all2 80/80acc080 90/00fccc90
            959046 tar xvf - explorer.80c23d8e.ptgw02is-2003.10.27.09.35/README 
              959047 gzip -dc /net/pamer.central/proactive/rawdata/8e/80c23d8e
        901384 /bin/sh ./get_all2 90/80bdbb90 a0/1274eda0
          959565 /bin/sh ./get_all2 90/80bdbb90 a0/1274eda0
            959566 tar xvf - explorer.8089ed9d.curly-2002.05.06.10.00/README exp
              959567 gzip -dc /net/pamer.central/proactive/rawdata/9d/8089ed9d
        901385 /bin/sh ./get_all2 a0/80aabda0 b0/1d6c05b0
          959619 mv disks/diskinfo disks/format.out disks/mount.out etc/release 
        901386 /bin/sh ./get_all2 b0/80b264b0 c0/335419c0
          959612 /bin/sh ./get_all2 b0/80b264b0 c0/335419c0
            959613 tar xvf - explorer.82a86abd.seq22-2002.05.31.21.49/README exp
              959614 gzip -dc /net/pamer.central/proactive/rawdata/bd/82a86abd

So that explains where the tar processes (and thus the cross calls) are coming
from!  And just how short-lived _are_ these tar processes?  We can answer this
question with DTrace, natch.  To do this, we'll use an associative array and
the "timestamp" variable.  We are going to linearly quantize the result from 0
seconds to 60 seconds.  We'll also use a "tick-1sec" probe to automatically
stop after a minute.  The script (which we call "longtar.d"):

syscall::exec*:return
/execname == "tar"/
{
 start[pid] = timestamp;
}

fbt::exit:entry
/start[pid]/
{
 @["tar time, in seconds"] =
     lquantize((timestamp - start[pid]) / 1000000000, 0, 60);
 start[pid] = 0;
}

tick-1sec
/n++ >= 60/
{
 exit(0);
}

Running this:

# dtrace -s longtar.d
dtrace: script 'longtar.d' matched 4 probes
CPU     ID                    FUNCTION:NAME
 23  38110                       :tick-1sec 

  tar time, in seconds                              
           value  ------------- Distribution ------------- count    
             < 0 |                                         0        
               0 |@                                        1        
               1 |@@@@@                                    4        
               2 |@@@@@@@@@@@                              8        
               3 |@@@@@@@                                  5        
               4 |@@@@@@@                                  5        
               5 |@@                                       2        
               6 |                                         0        
               7 |@@@@                                     3        
               8 |                                         0        

This highlights just how difficult this problem would be to find using
traditional (process-centric) tools:  each tar process is only alive 
for (roughly) two seconds!  And yet, taken together, the tar processes
are accounting for the vast majority of cross calls on the system.

This is a good example of using DTrace because the problem here is _not_ a
single application -- it's an insurgent army of tar processes that are
slipping away before they can be noticed.  Indeed, running prstat(1) on this
machine doesn't show tar -- or _any_ of the get_all2 progeny processes -- as
top resource-consuming processes.  Using DTrace, however, one can quickly
root-cause aberrant systemic behavior of this nature.  

Of course, this is but a fraction of what DTrace can do.  We encourage you to
check out the AnswerBook at BigAdmin:

  http://www.sun.com/bigadmin/content/dtrace

Or better yet, go to:

  http://wwws.sun.com/software/solaris/solaris-express/get.html

There you can download Solaris Express 11/03 and try DTrace yourself!

 Enjoy,
 Bryan Cantrill, Mike Shapiro and Adam Leventhal
 DTrace Engineering Team

----------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.  bmc@eng.sun.com  (650) 786-3652

 

 

 


Last changes: Friday, February 09, 2007 03:39:23 PM,
:P 2004 filibeto.org, site statistics