Thursday, April 13, 2006

xjobs & DTrace

After having integrated all the features that I thought to be most important for xjobs, I considered giving it some tuning. Some things came right to my mind (object caching, more efficient token passing, and so on). But then I wanted to see, what DTrace could tell me.

Unfortunately, I forgot to log my tests directly, so I reran the tests against the previous release for this blog. The old release misses the object cache and token tunings that I included before considering DTrace, but the numbers should be roughly the same, as the D script only looks at syscall times.

I used the following, fairly simple script that sums up the times spent in syscalls:


#!/usr/sbin/dtrace -s
syscall:::entry
/ execname == $$1 /
{
self->ts = timestamp;
}
syscall:::return
/ self->ts /
{
@[probefunc] = sum(timestamp - self->ts);
self->ts = 0;
}




Then I did the following:


maierkom@aquila:~$ systime xjobs
dtrace: script '/home/maierkom/bin/sun4/systime' matched 460 probes
^Z
Angehalten
maierkom@aquila:~$ bg
[1] systime xjobs &
maierkom@aquila:~$ /usr/bin/ls -1 /usr/sbin | src/xjobs-20060412/xjobs -v2 echo >> & /tmp/12
maierkom@aquila:~$ fg
systime xjobs
^C

issetugid 8500
fstat64 11800
setcontext 14700
sysconfig 18400
ioctl 21400
read 30300
getcwd 43400
brk 62600
stat 95700
resolvepath 110900
memcntl 158400
schedctl 2193000
lwp_self 2504800
getrlimit 3253400
fstat 4133900
getpid 4888400
lwp_sigmask 4929800
setpgrp 5193200
unlink 6338700
fcntl 7303600
lstat64 10207400
mmap 12543500
munmap 12776900
write 15068100
open 19000800
exece 432717100
access 787848500
fork1 1721425000
close 87198287700
waitsys 103746109400




OK, waitsys is big. I guess that's OK. But WTF is the matter with close? xjobs has to close all open filedescriptors that are not needed by the jobs forked. So I changed the filedescriptor handling, and set all
descriptors to be closed on exec. Additionally, access uses a lot of
time. When the utility to execute is given on the command line of xjobs,
one only needs to search the PATH once. OK, then let's do this caching.


After adding these changes and some the ones mentioned above concerning
the user code, I get the following results:





dtrace: script '/home/maierkom/bin/sun4/systime' matched 460 probes
^Z
Angehalten
maierkom@aquila:~$ bg
[1] systime xjobs &
maierkom@aquila:~$ /usr/bin/ls -1 /usr/sbin | src/xjobs-20060413/xjobs -v2 echo > & /tmp/13
maierkom@aquila:~$ fg
systime xjobs
^C

getrlimit 7100
issetugid 8700
fstat64 13900
setcontext 17400
sysconfig 19800
ioctl 23700
read 36500
getcwd 57200
brk 73300
resolvepath 106500
stat 110700
memcntl 185800
schedctl 1784700
lwp_self 2381700
fstat 2850800
setpgrp 3608200
lwp_sigmask 4380400
getpid 5376800
unlink 5641200
access 5886200
munmap 6634300
mmap 7548900
lstat64 7577200
fcntl 12572000
write 16043800
open 17121400
close 19644800
waitsys 32769800
fork1 119443300
exece 368008400





That looks much better. Now, let's take a look at the overall numbers:


maierkom@aquila:~$ /usr/bin/ls -1 /usr/sbin | timex src/xjobs-20060412/xjobs -v2 echo > /tmp/12

real 6.86
user 17.18
sys 4.81

maierkom@aquila:~$ /usr/bin/ls -1 /usr/sbin | timex src/xjobs-20060413/xjobs -v2 echo > /tmp/13

real 0.68
user 0.13
sys 0.48




WOW! DTrace you made my day! Factor ten for real and system time. xjobs is now really fast.

1 Comments:

Blogger Amber said...

Impressive stuff :)

I really liked the convincing way you compare xjobs to xargs -P on your website.

It seems to me you have your goals set out pretty nicely, and that is a very good feature for a programmer.

Thanks for giving me the source code, too!

12:10 PM  

Post a Comment

<< Home