-{section: How to sample the stack with gdb}
+{section: How to sample the stack}
 This method is useful when you are called in to diagnose a condor daemon that is using lots of cpu (or blocking a lot on I/O) for an unknown reason. Although crude, I have found this method to lead to the source of trouble in numerous cases.
-Here’s an example gdb script for the schedd:
+If pstack or gstack are available, just use those. If not, you can use gdb.
+
+Here’s an example gdb script for the schedd:
 {verbatim}
 add-symbol-file /path/to/condor_schedd.dbg
@@ -19,7 +21,7 @@
 If the stack shows that the schedd is usually in a certain part of the code, this may lead you to the source of trouble.
-If you need to get the condor admin to do this for you, then it is nice to make it even simpler. Below is a simple script to do the sampling. (Note that it doesn’t load the .dbg file, just to keep things simple.)
+If you need to get the condor admin to do this for you, then it is nice to make it even simpler. Below is a simple script to do the sampling. (Note that it doesn’t load the .dbg file, just to keep things simple.)
 {verbatim}
 #!/bin/sh
@@ -51,17 +53,17 @@
 Callgrind is a wonderful profiling tool. The one big disadvantage of it is that it slows down the application considerably (~20 times in my experience).
-Here’s an example of how to run callgrind on the collector.
+Here’s an example of how to run callgrind on the collector.
 {verbatim}
 condor_off -collector
 # become the user who runs condor
 sudo su root
-valgrind --tool=callgrind /path/to/condor_collector -f -p 9618 >& /tmp/callgrind.log < /dev/null &
+valgrind --tool=callgrind /path/to/condor_collector -f -p 9618 >& /tmp/callgrind.log < /dev/null &
 PID=$!
 {endverbatim}
-You will see some files in the current working directory named “callgrind.out.$PID” and “callgrind.info.$PID”. Change the ownership of these from root to the condor user or callgrind will have trouble writing to them when the program exits.
+You will see some files in the current working directory named “callgrind.out.$PID” and “callgrind.info.$PID”. Change the ownership of these from root to the condor user or callgrind will have trouble writing to them when the program exits.
 {verbatim}
 chown condor:condor callgrind.*.$PID