Man Page collect.1




NAME

     collect - command used for performance data collection


SYNOPSIS

     collect collect-arguments target target-arguments


DESCRIPTION

     The collect command runs the target process and records per-
     formance  data,  using  profiling or tracing techniques, and
     global data for the process.  The data can be examined  with
     a   GUI   program,  analyzer,  or  a  command-line  program,
     er_print. The data collection software run  by  the  collect
     command is referred to here as the Collector.

     Note: You must have a license to run the GUI program.

     The data from a single run of the collect command is  called
     an  experiment.   It  is represented in the file system as a
     directory, with various files inside that directory.

     target is the path name of the executable or  Java[tm]  .jar
     or  .class  file  for  which you want to collect performance
     data.  (For more information about Java profiling, see  JAVA
     PROFILING,  below.)   Executables  that  are targets for the
     collect command can be compiled with any level of  optimiza-
     tion,  but must use dynamic linking.  If a program is stati-
     cally linked, the collect command prints an  error  message.
     In order to see annotated source using analyzer or er_print,
     targets should be compiled with the -g flag, and should  not
     be stripped.

     The collect command uses the following strategy to find  its
     target:
      o If there is a file with the name of the  target  that  is
      marked executable, it is verified as an ELF executable that
      can run on the target machine. If the file is  not  such  a
      valid ELF executable, the collect command fails.
      o If there is a file with the name of the target, and it is
      not executable, it is checked to see if it is a Java jar or
      class file. If it is, a target name for java  is  inserted,
      with  any  necessary  flags,  and  data is collected on the
      JVM[tm] machine.  (The terms  "Java  virtual  machine"  and
      "JVM" mean a virtual machine for the Java[tm] platform.)
      o If there is no file with the  name  of  the  target,  the
      user's  path  is  searched to find an executable; if one is
      found, it is verified as described above.
      o If no file of the current  name  is  found,  the  command
      looks  for  a  file  with  that  name and the string .class
      appended; if found, the target name of  java  is  inserted,
      with the appropriate flags, as above.
      o If none of these procedures can find the target, the com-
      mand fails.
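
     The lookup order above can be sketched as a shell function.
     This is an illustration only, not the code collect runs: the
     function name classify_target is made up, the jar and class
     check is approximated by the file suffix, and the PATH search
     is approximated with command -v.  The ELF verification itself
     is not performed.

```shell
# classify_target NAME: print which rule from the list above would apply.
# Hypothetical helper; the checks are approximations of the documented steps.
classify_target() {
    name=$1
    if [ -f "$name" ] && [ -x "$name" ]; then
        # Rule 1: an executable file is verified as an ELF executable.
        echo "executable file: verify as ELF"
    elif [ -f "$name" ]; then
        # Rule 2: a non-executable file is checked for jar or class.
        case $name in
            *.jar|*.class) echo "java target: insert java" ;;
            *) echo "unspecified: not executable, not a jar or class file" ;;
        esac
    elif command -v "$name" >/dev/null 2>&1; then
        # Rule 3: no such file, so the user's path is searched.
        echo "found on PATH: verify as ELF"
    elif [ -f "$name.class" ]; then
        # Rule 4: try the name with .class appended.
        echo "java target: insert java for $name.class"
    else
        # Rule 5: nothing matched.
        echo "fail: target not found"
    fi
}
```

     For example, classify_target Main reports a java target when
     only Main.class exists in the current directory.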


ARGUMENTS

     If it is invoked with no arguments, collect prints  a  usage
     message,  including the default experiment configuration and
     the names of any hardware counters available for profiling.

  Data Specifications
     -p option
          Collect clock-based profiles.  The  allowed  values  of
          option are:

          Value     Meaning

          off       turn off clock-based profiling

          on        turn  on  clock-based  profiling   with   the
                    default profiling interval of 10 milliseconds

          lo[w]     turn on clock-based profiling with  the  low-
                    resolution  profiling  interval  of  100 mil-
                    liseconds

          hi[gh]    turn on clock-based profiling with the  high-
                    resolution   profiling  interval  of  1  mil-
                    lisecond

                    NOTE: to  do  high-resolution  profiling  you
                    must enable the high-resolution system clock.
                    See Clock-Based Profiling, below.

          n         turn on clock-based profiling with a  profil-
                    ing interval of n milliseconds.

                    If the value is smaller than the system clock
                    resolution  it  is  set  to  the system clock
                    resolution; if it is larger than  the  system
                    clock  resolution  it  is rounded down to the
                    nearest multiple of the system clock  resolu-
                    tion.

          If no  explicit  -p  off  argument  is  given,  and  no
          hardware  counter  profiling  is specified, clock-based
          profiling is turned on.
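
          The adjustment of a custom interval n can be sketched
          as follows.  The function name effective_interval and
          the resolution values passed to it are assumptions made
          for illustration; the real system clock resolution
          depends on the machine.

```shell
# effective_interval N RES: print the interval, in milliseconds, that
# would be used for a requested interval N when the system clock
# resolution is RES milliseconds (hypothetical helper).
effective_interval() {
    n=$1 res=$2
    if [ "$n" -lt "$res" ]; then
        # Smaller than the resolution: set to the resolution.
        echo "$res"
    else
        # Otherwise round down to a multiple of the resolution.
        echo $(( n / res * res ))
    fi
}

effective_interval 25 10   # prints 20
effective_interval 3 10    # prints 10
effective_interval 7 1     # prints 7
```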

     -h counter[,value[,counter2[,value2]]]
          Collect  hardware  counter  overflow   profiles.    The
          counter  name can be either a standard counter name, or
          the internal name, as used by cputrack(1). To list both
          the  standard  and  the internal names, type collect at
          the command line. If the name is an internal name,  and
          it  is  trailed by a slash and a single digit (0 or 1),
          the event register specified by that digit is used;  if
          not,  whichever event register supports that counter is
          used.

          The overflow value is the number of events  counted  at
          which  the  counter overflows and the overflow event is
          recorded. It can be set to one of the following:

          Value     Meaning

          0 or a null string
                    Use the default overflow value

          hi[gh]    Use the high resolution overflow  value.  The
                    value h for high resolution is also supported
                    for compatibility.

          lo[w]     Use the low resolution overflow value

          n         Use an overflow value of n

          If you specify the optional second counter  and  second
          overflow  value,  the  overflow  value  for  the  first
          counter must be specified as one of the  values  listed
          above. The two named counters specified must be on dif-
          ferent event registers.

          An experiment can specify both hardware counter profil-
          ing  and  clock-based  profiling.   If hardware counter
          profiling is specified, but  clock-based  profiling  is
          not  explicitly  specified,  clock-based  profiling  is
          turned off.

     -s option
          Collect synchronization tracing data. The minimum delay
          threshold  for tracing events is set using option.  The
          allowed values of option are:

          Value     Meaning

          on        turn on synchronization delay tracing and set
                    the threshold value by calibration at runtime

          calibrate same as on

          off       turn off synchronization delay tracing

          n         turn on synchronization delay tracing with  a
                    threshold  value  of  n microseconds. If n is
                    zero, all events are traced.

          all       turn on  synchronization  delay  tracing  and
                    trace all synchronization events.

          Synchronization delay tracing is turned off by default.

          Synchronization events are not recorded for Java  moni-
          tors.

     -H option
          Collect heap trace data. The allowed values  of  option
          are:

          Value     Meaning

          on        turn on tracing of memory allocation requests

          off       turn  off  tracing   of   memory   allocation
                    requests

          Heap tracing is turned off by default.

          Heap tracing events are not recorded  for  Java  memory
          allocations.

     -m option
          Collect MPI tracing data. The allowed values of  option
          are:

          Value     Meaning

          on        turn on tracing of MPI calls

          off       turn off tracing of MPI calls

          MPI tracing is turned off by default.

     -S interval
          Collect periodic samples at the interval specified  (in
          seconds).  If interval is zero, no periodic samples are
          collected.  By default, periodic sampling at  1  second
          intervals is enabled.  The data recorded in the samples
          is data for the process, and includes a  timestamp  and
          execution  statistics  from  the  kernel,  among  other
          things.

     If no data specification arguments are supplied, clock-based
     profiling data, with the default resolution, is collected.

     If clock-based profiling is explicitly disabled, and neither
     hardware-counter  overflow profiling nor any kind of tracing
     is enabled, the collect command warns that no function-level
     data  is  being  collected. The target is still executed and
     global data is recorded.
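
     The way the -p and -h arguments interact to decide whether
     clock-based profiling is on can be summarized in a short
     sketch.  It restates the defaults described above and is not
     code from the Collector; the function name clock_profiling
     is made up for illustration.

```shell
# clock_profiling P H: print "on" or "off" (hypothetical helper).
#   P: the value given to -p, or "" if -p was not given
#   H: the value given to -h, or "" if -h was not given
clock_profiling() {
    p=$1 h=$2
    if [ -n "$p" ]; then
        # An explicit -p always decides the matter.
        if [ "$p" = off ]; then echo off; else echo on; fi
    elif [ -n "$h" ]; then
        # -h without an explicit -p turns clock-based profiling off.
        echo off
    else
        # No data specification at all: clock-based profiling is on.
        echo on
    fi
}

clock_profiling "" ""             # prints on
clock_profiling "" "cycles,hi"    # prints off
clock_profiling "hi" "cycles,hi"  # prints on
clock_profiling "off" ""          # prints off
```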


  Experiment Controls
     -L size
          Limit the amount of profiling and tracing data recorded
          to size megabytes.  The limit applies to the sum of all
          profiling data and tracing  data,  but  not  to  sample
          points.  The  limit  is  only  approximate,  and can be
          exceeded.  When the limit is reached, no more profiling
          or tracing data is recorded, but the experiment remains
          open and samples are recorded until the target  process
          terminates.

          The default limit on the amount  of  data  recorded  is
          2000 Mbytes.  To remove the limit, set size to zero.

     -F option
          Control whether or not descendant processes should have
          their data recorded.  The allowed values of option are:

          Value     Meaning

          on        record   experiments   on   all    descendant
                    processes

          off       do  not  record  experiments  on   descendant
                    processes

          Following descendant processes is off by default.   See
          the section "FOLLOWING DESCENDANT PROCESSES", below.

     -j option
          Control  Java  profiling  when  the  target  is  a  JVM
          machine. The allowed values of option are:

          Value     Meaning

          on        Record profiling data for  the  JVM  machine,
                    and  recognize  methods  compiled by the Java
                    HotSpot[tm] virtual machine.

          off       Do not record Java profiling data.

          See the section "JAVA PROFILING", below.

          You must use -j to obtain profiling data if the  target
          is  a JVM machine.  The version of the JVM machine must
          be no earlier than 1.4. Java profiling  does  not  work
          for earlier versions of the JVM machine.  The -j option
          is not needed if the target is a class or jar file.  If
          you are using a 64 bit JVM machine you must specify its
          path explicitly as the target.  Do  not  use  the  -d64
          option  for  a 32 bit JVM machine.  If the -j option is
          set, but the target is not a JVM machine,  no  data  is
          recorded for the target.  The collect command validates
          the version of the JVM machine specified for Java  pro-
          filing.

     -l signal
          Record a sample point  whenever  the  given  signal  is
          delivered to the process.

     -y signal[,r]
          Control recording of data with  signal.   Whenever  the
          given  signal  is  delivered  to  the  process,  switch
          between paused (no data is recorded) and resumed  (data
          is  recorded)  states.  The Collector is started in the
          resumed state if the optional ,r flag is given,  other-
          wise  it  is  started in the paused state.  This option
          does not affect the recording of sample points.
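
          The pause and resume behavior can be modeled as a
          simple toggle.  The sketch below only tracks the state
          after a number of signal deliveries; it does not send
          signals or run collect.  The function name
          recording_state and the signal spelling used in the
          calls are placeholders.

```shell
# recording_state ARG COUNT: print the Collector state after COUNT
# deliveries of the signal, given the -y argument ARG (hypothetical).
recording_state() {
    arg=$1 count=$2
    case $arg in
        *,r) state=resumed ;;   # ,r given: start in the resumed state
        *)   state=paused ;;    # otherwise start in the paused state
    esac
    i=0
    while [ "$i" -lt "$count" ]; do
        # Each delivery switches between paused and resumed.
        if [ "$state" = paused ]; then state=resumed; else state=paused; fi
        i=$(( i + 1 ))
    done
    echo "$state"
}

recording_state SIGUSR1 0     # prints paused
recording_state SIGUSR1 1     # prints resumed
recording_state SIGUSR1,r 2   # prints resumed
```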

  Output Controls
     -o experiment_name
          Use experiment_name as the name of the experiment to be
          recorded.   The  experiment_name string must end in the
          string .er; if not, an error message is printed, and no
          experiment is run.

          If -o is not specified, a name of the form stem.n.er is
          chosen,  where stem is a string, and n is a number.  If
          a group name has been specified with -g, stem is  taken
          from  the group name by removing the .erg suffix. If no
          group name has been  specified,  stem  is  set  to  the
          string test.

          If the collect command is launched from one of the com-
          mands used to run MPI jobs and -o is not specified, the
          value of n used in the name is taken from the  environ-
          ment  variable used to define the MPI rank of that pro-
          cess. Otherwise, n is set to the lowest integer not  in
          use.

          If the name is not specified in the form stem.n.er, and
          the given name is in use, an error message is printed,
          and no experiment is run.  If the name is of the form
          stem.n.er and the name is in use, the experiment is
          recorded under a name corresponding to the first
          available value of n that is not in use.  A warning is
          printed if the name is changed.
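
          The default naming rule for the non-MPI case can be
          sketched as follows.  The function name
          default_experiment_name is made up, the group name is
          assumed to be a plain file name, and numbering is
          assumed to start at 1.

```shell
# default_experiment_name DIR [GROUP]: print the name that would be
# chosen in directory DIR when -o is not given (hypothetical helper).
default_experiment_name() {
    dir=$1 group=${2:-}
    if [ -n "$group" ]; then
        stem=${group%.erg}      # stem comes from the group name
    else
        stem=test               # no group: stem is the string test
    fi
    n=1
    # Pick the lowest integer not already in use.
    while [ -e "$dir/$stem.$n.er" ]; do
        n=$(( n + 1 ))
    done
    echo "$stem.$n.er"
}

d=$(mktemp -d)
mkdir "$d/test.1.er" "$d/test.2.er"
default_experiment_name "$d"            # prints test.3.er
default_experiment_name "$d" run1.erg   # prints run1.1.er
rm -rf "$d"
```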

     -d directory_name
          Place the experiment in directory  directory_name.   If
          no  directory is given, the experiment is placed in the
          current working directory.

     -g group_name
          Add the experiment to the experiment group  group_name.
          The  group_name  string must end in the string .erg; if
          not, an error is reported, and no experiment is run.

  Other Arguments
     -n   Dry run: the target is not run, but all the details  of
          the experiment that would be run are printed.  Turns on
          -v.

     -R   Display the  text  version  of  the  performance  tools
          README  in  the  terminal  window. If the README is not
          found, a warning is printed.  No further arguments  are
          examined, and no further processing is done.

     -V   Print the current version.  No  further  arguments  are
          examined, and no further processing is done.

     -v   Print the current version and further detailed informa-
          tion about the experiment being run.

     -x   Leave the target process stopped on the exit  from  the
          exec  system  call,  in  order  to  allow a debugger to
          attach to it.

          To attach a debugger to the target once it  is  stopped
          by  collect, you must determine the PID of the process,
          start the debugger, configure it to ignore SIGPROF (and
          SIGEMT, if you chose to collect hardware counter data),
          and then attach to the process using the  PID.  As  the
          process  runs  under  the  control of the debugger, the
          Collector records an experiment.

  Obsolete Arguments
     -a   Address space data is no longer supported.   If  -a  is
          specified,  an error message is printed and the command
          exits.


FOLLOWING DESCENDANT PROCESSES

     Processes can create descendant processes by calling a  sys-
     tem  library  function.  The  Collector can collect data for
     descendant  processes  initiated  by   calls   to   fork(2),
     fork1(2),  fork(3F), vfork(2), and exec(2) and its variants.
     The call to vfork is replaced internally by a call to fork1.
     The  Collector  ignores  calls  to  system(3C),  system(3F),
     sh(3F), popen(3C), and similar functions, and their  associ-
     ated  descendant  processes.  If the -F on argument is used,
     the Collector opens a new  experiment  for  each  descendant
     process  inside the parent experiment. These new experiments
     are named with their lineage as follows.  To form the exper-
     iment  name  for a descendant process, an underscore, a code
     letter and a number are added to  its  creator's  experiment
     name. The code letter is "f" for a fork and "x" for an exec.
     The number is the index of the fork or  exec  (whether  suc-
     cessful or not). For example, if the experiment name for the
     initial process is "test.1.er", the experiment for the child
     process  created by its third fork is "test.1.er/_f3.er". If
     that child process execs  a  new  image,  the  corresponding
     experiment name is "test.1.er/_f3_x1.er".
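
     The lineage naming can be reproduced with a few lines of
     shell.  The function name descendant_experiment is made up
     for illustration; each step argument is a code letter
     followed by the index of the fork or exec.

```shell
# descendant_experiment FOUNDER STEP...: print the experiment name for
# a descendant reached through the given steps (hypothetical helper).
descendant_experiment() {
    founder=$1
    shift
    lineage=
    for step in "$@"; do
        # Each step adds an underscore, a code letter and a number.
        lineage="${lineage}_${step}"
    done
    echo "$founder/$lineage.er"
}

descendant_experiment test.1.er f3      # prints test.1.er/_f3.er
descendant_experiment test.1.er f3 x1   # prints test.1.er/_f3_x1.er
```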

     The Analyzer and er_print do not automatically read
     experiments on descendant processes.  You can refer to them
     explicitly by name, as described above, or you can build a
     group file that includes them.  To do so, first run collect
     -F on -g group.erg on the target.  If this is the first run,
     the original (founder) experiment is recorded as group.1.er,
     and a group file containing only the founder experiment is
     created.  To add all descendant process experiments to the
     group, type ls -1d group.1.er/*.er >> group.erg at the
     command prompt.  You can edit the group file to select a
     subset of descendant experiments.  The experiments in the
     group are loaded by the Analyzer or by er_print as for any
     other group file.


JAVA PROFILING

     In this release,  Java  profiling  consists  of  running  an
     experiment on the JVM machine as it runs the user's class or
     jar files. For all interpreted methods, data is reported for
     the  JVM  machine  itself, but for methods compiled with the
     Java HotSpot virtual machine, data  is  reported  for  named
     methods.  Data  is reported in the usual way for any C, C++,
     or  Fortran  code  called  by  a  Java  target.   Such  code
     corresponds to Java native methods.

     Clock-based profiling and hardware-counter profiling are
     supported.  Synchronization tracing and heap tracing collect
     data only on native calls for memory allocation and calls to
     the various synchronization primitives.  They do not record
     data for Java allocations or for Java monitor functions.

     When collect inserts a target name of java into the argument
     list,  it  examines  environment variables for a path to the
     java target, in the order JDK_1_4_HOME, then JDK_HOME,  then
     JAVA_PATH,  and  finally  the user's PATH.  For the first of
     these that is set, the resultant target is  verified  as  an
     ELF  executable.  If  it is not, collect fails with an error
     indicating which environment variable was used, and the full
     path name that was tried.
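
     The search order can be sketched as a shell function that
     reports which setting would be consulted first.  The function
     name java_source is made up, and the sketch does not build
     or verify the resulting path, because the exact path derived
     from each variable is not described here.

```shell
# java_source: print the name of the first setting, in the documented
# order, that is set in the environment (hypothetical helper).
java_source() {
    if [ -n "${JDK_1_4_HOME+set}" ]; then
        echo JDK_1_4_HOME
    elif [ -n "${JDK_HOME+set}" ]; then
        echo JDK_HOME
    elif [ -n "${JAVA_PATH+set}" ]; then
        echo JAVA_PATH
    else
        echo PATH
    fi
}

# Run in subshells so the caller's environment is left untouched.
( unset JDK_1_4_HOME JDK_HOME JAVA_PATH; java_source )             # prints PATH
( unset JDK_1_4_HOME; JDK_HOME=/opt/jdk; java_source )              # prints JDK_HOME
```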

     Java profiling is not supported for  versions  of  the  Java
     [tm]  2 SDK earlier than 1.4.  Attempts to profile Java code
     using earlier versions fail because the java executable  for
     these versions is a shell script, not an ELF executable, and
     the necessary functionality to support Java profiling is not
     available.


USING COLLECT WITH MPI

     collect can be used with MPI by simply prefacing the  target
     and its arguments with collect and its arguments in the com-
     mand line that starts the MPI job.  For example,
          % mprun -np 16 a.out 3 5
     can be replaced by
          % mprun -np 16 collect -m on -d /tmp/mydirectory -g run1.erg a.out 3 5
     to run an MPI tracing experiment  on  each  of  the  16  MPI
     processes,  collecting them all in a specific directory, and
     collecting them as a group.  The individual experiments  are
     named by the MPI rank, as described above.  The experiments,
     as specified  above,  contain  clock-based  profiling  data,
     which is turned on by default, and MPI tracing data.
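
     The experiment names produced by such a run can be listed
     with a short sketch.  It assumes that the number in each name
     is exactly the MPI rank and that ranks run from 0 to np-1;
     the function name mpi_experiment_names is made up for
     illustration.

```shell
# mpi_experiment_names GROUP NP: print one experiment name per MPI
# rank for a run recorded with -g GROUP (hypothetical helper).
mpi_experiment_names() {
    group=$1 np=$2
    stem=${group%.erg}          # stem comes from the group name
    rank=0
    while [ "$rank" -lt "$np" ]; do
        echo "$stem.$rank.er"   # assumption: n is the rank itself
        rank=$(( rank + 1 ))
    done
}

mpi_experiment_names run1.erg 16   # run1.0.er through run1.15.er
```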


DATA COLLECTED

     Three types of data are collected: profiling  data,  tracing
     data and sampling data. The data packets recorded in profil-
     ing and tracing include the callstack of each LWP, the  LWP,
     thread  and  CPU IDs, and some event-specific data. The data
     packets recorded in sampling contain  global  data  such  as
     execution  statistics,  but  no  program-specific  or event-
     specific data. All data packets include a timestamp.

     Clock-based Profiling
          The event-specific data recorded in clock-based profil-
          ing  is  an  array of counts for each accounting micro-
          state. The microstate array is incremented by the  sys-
          tem  at  a prescribed frequency, and is recorded by the
          Collector when a profiling signal is processed.

          Clock-based profiling can run at the normal interval
          (10 ms), the low-resolution interval (100 ms), the
          high-resolution interval (1 ms), or a custom interval,
          specified in milliseconds.  For high-resolution
          profiling, the operating system on the machine must be
          running with a high-resolution clock routine, which can
          be enabled by putting the line:

               set hires_tick=1

          in the file /etc/system and rebooting.  High-resolution
          profiles  record ten times as much data for a given run
          as normal profiles.  If you try to set  high-resolution
          profiling  on a machine whose operating system does not
          support it, the Collector prints a warning message  and
          uses  the  highest  resolution  supported. Similarly, a
          custom setting that is not a multiple of the resolution
          supported  by the system is rounded down to the nearest
          non-zero multiple of that  resolution,  and  a  warning
          message is printed.

          Clock-based profiling data is converted into the
          following metrics:

               User CPU Time
               Wall Time
               Total LWP Time
               System CPU Time
               Wait CPU Time
               User Lock Time
               Text Page Fault Time
               Data Page Fault Time
               Other Wait Time

          For multiprocessor experiments, all of  the  times  are
          summed across all LWPs in the process.  Total time adds
          up to the wall-clock time, multiplied  by  the  average
          number of LWPs in the process.

     Hardware Counter Overflow Profiling
          Hardware counter overflow profiling records the  number
          of  events  counted by the hardware counter at the time
          the overflow signal was processed.

          Hardware counter overflow profiling can be done on sys-
          tems  that  support overflow profiling and that include
          the hardware counter shared library, libcpc.so(3).  You
          must   use  a  version  of  the  Solaris[tm]  Operating
          Environment that is  no  earlier  than  the  Solaris  8
          Operating Environment. On UltraSPARC[tm] computers, you
          must use a version of the hardware no earlier than  the
          UltraSPARC III hardware.  On computers that do not sup-
          port overflow profiling, an attempt to select  hardware
          counter overflow profiling generates an error.

          The counters available depend on the specific CPU  chip
          and  operating environment. The list of counters can be
          determined by running the collect command with no argu-
          ments,  which  prints out a usage message that contains
          the names of the  counters.   The  counters  that  have
          aliases  are displayed first in the list, followed by a
          list of all counters.

          The lines of output for an aliased counter are  format-
          ted as follows:

               CPU Cycles (cycles = Cycle_cnt/0) 9999991 hi=1000003, lo=100000007

          In this line, the first field,  "CPU  Cycles",  is  the
          metric  name.  The  second  field,  "cycles", gives the
          alias name that can be used in the -h counter...  argu-
          ment.  The third field, "Cycle_cnt/0", gives the inter-
          nal name as used by cputrack(1) and the register number
          on  which  that  counter can be used. The next field is
          the default overflow interval, the following  field  is
          the  default high-resolution overflow interval, and the
          last  field  is  the  default  low-resolution  overflow
          interval.
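
          The fields of an aliased counter line can be pulled
          apart with sed.  The sketch below assumes the line is
          laid out exactly as in the sample above, with single
          spaces between fields; the function name
          parse_counter_line is made up for illustration.

```shell
# parse_counter_line LINE: print the six fields of an aliased counter
# line as name=value pairs (hypothetical helper).
parse_counter_line() {
    printf '%s\n' "$1" | sed -n \
        's/^\(.*\) (\([^ ]*\) = \([^)]*\)) \([0-9]*\) hi=\([0-9]*\), lo=\([0-9]*\)$/metric=\1 alias=\2 internal=\3 default=\4 hi=\5 lo=\6/p'
}

parse_counter_line 'CPU Cycles (cycles = Cycle_cnt/0) 9999991 hi=1000003, lo=100000007'
# prints: metric=CPU Cycles alias=cycles internal=Cycle_cnt/0 default=9999991 hi=1000003 lo=100000007
```

          Lines that do not match the assumed layout produce no
          output.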

          Lines of output for the non-aliased counters  are  for-
          matted as follows:

               Cycle_cnt/0 events 1000003 hi=100003, lo=9999991


          In this line, the first field, "Cycle_cnt/0", gives the
          internal  name  as used by cputrack(1) and the register
          number on which that counter can be  used.  The  string
          "Cycle_cnt/0  events"  is  the  metric  name  for  this
          counter. The next field is the default overflow  inter-
          val, the following field is the default high-resolution
          overflow interval, and the last field  is  the  default
          low-resolution overflow interval.

          For counters that count in cycles, the metrics reported
          are  converted  by  default  to inclusive and exclusive
          times, but can optionally be  shown  as  event  counts.
          For  counters  that do not count in cycles, the metrics
          reported are inclusive and exclusive event counts.

     Synchronization Delay Tracing
          Synchronization delay tracing records all calls to  the
          various   thread  synchronization  routines  where  the
          real-time delay in the call exceeds a specified  thres-
          hold. The data packet contains timestamps for entry and
          exit to the synchronization routines, the thread ID and
          the  LWP ID at the time the request is initiated. (Syn-
          chronization requests from a thread can be initiated on
          one LWP, but complete on another.)

          Synchronization delay tracing data  is  converted  into
          the following metrics:

               Synchronization Delay Events
               Synchronization Wait Time

     Heap Tracing
          Heap tracing records all calls to malloc,  free,  real-
          loc,   and   memalign,  with  the  size  of  the  block
          requested, its address, and for realloc,  the  previous
          address.

          Heap tracing  data  is  converted  into  the  following
          metrics:

               Leaks
               Bytes Leaked
               Allocations
               Bytes Allocated

          Leaks are defined as allocations that  are  not  freed.
          If  a  zero-length  block is allocated, it counts as an
          allocation with zero bytes allocated. If a  zero-length
          block is not freed, it counts as a leak with zero bytes
          leaked.
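
          The leak definition can be illustrated with a small awk
          script run from the shell.  The input format used here,
          one "malloc address size" or "free address" event per
          line, is invented for the example and is not the format
          recorded by the Collector; the function name
          heap_summary is also made up.

```shell
# heap_summary: read hypothetical heap events on standard input and
# print the four heap metrics derived from them.
heap_summary() {
    awk '
        $1 == "malloc" { size[$2] = $3; live[$2] = 1; allocs++; bytes += $3 }
        $1 == "free"   { delete live[$2] }
        END {
            leaks = 0; leaked = 0
            # Any block still live at the end was never freed.
            for (a in live) { leaks++; leaked += size[a] }
            printf "Allocations=%d Bytes Allocated=%d Leaks=%d Bytes Leaked=%d\n",
                   allocs, bytes, leaks, leaked
        }'
}

printf '%s\n' 'malloc 0x10 64' 'malloc 0x20 0' 'free 0x10' 'malloc 0x30 128' |
    heap_summary
# prints: Allocations=3 Bytes Allocated=192 Leaks=2 Bytes Leaked=128
```

          The zero-length block at 0x20 counts as an allocation
          and, because it is never freed, as a leak with zero
          bytes leaked.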

     MPI Tracing
          MPI tracing records calls to the MPI library for  func-
          tions  that  can  take  a significant amount of time to
          complete. The following functions from the MPI  library
          are traced:  MPI_Send, MPI_Bsend, MPI_Rsend, MPI_Ssend,
          MPI_Recv, MPI_Sendrecv, MPI_Sendrecv_replace, MPI_Wait,
          MPI_Waitall,  MPI_Waitany, MPI_Waitsome, MPI_Win_fence,
          MPI_Win_lock,      MPI_Allgather,       MPI_Allgatherv,
          MPI_Allreduce,       MPI_Alltoall,       MPI_Alltoallv,
          MPI_Barrier,   MPI_Bcast,   MPI_Gather,    MPI_Gatherv,
          MPI_Reduce,  MPI_Reduce_scatter, MPI_Scan, MPI_Scatter,
          MPI_Scatterv.

          MPI  tracing  data  is  converted  into  the  following
          metrics:

               MPI Time
               MPI Sends
               MPI Bytes Sent
               MPI Receives
               MPI Bytes Received
               Other MPI Calls

          The MPI Bytes Received metric uses the actual number of
          bytes  for blocking calls, but uses the buffer size for
          non-blocking calls. Metrics that are computed for  col-
          lective  operations  such as gather, scatter and reduce
          have the maximum possible values for these  operations.
          No reduction in the values is made due to optimization
          of the collective operations.

     Sampling and Global Data
          Sampling refers to the process  of  generating  markers
          along the time line of execution. At each sample point,
          execution statistics are  recorded.  All  of  the  data
          recorded at sample points is global to the program, and
          does not map to function-level metrics.

          Samples are always taken at the start of  the  process,
          and  at its termination. By default or if a non-zero -S
          argument is specified, samples are  taken  periodically
          at the specified interval.  In addition, samples can be
          taken by using the libcollector(3) API.

          The data recorded at  each  sample  point  consists  of
          microstate  accounting  information  from  the  kernel,
          along with various other statistics  maintained  within
          the kernel.


RESTRICTIONS

     The Collector interposes on some signal-handling routines to
     ensure  that its use of SIGPROF signals for clock-based pro-
     filing and SIGEMT for hardware-counter overflow profiling is
     not  disrupted by the target program.  The Collector library
     re-installs its own signal handler  if  the  target  program
     installs  a  signal  handler. The Collector's signal handler
     sets a flag that ensures that system calls  are  not  inter-
     rupted  to  deliver  signals.  This setting could change the
     behavior of the target program.

     The Collector interposes on setitimer(2) to ensure that  the
     profiling  timer  is  not available to the target program if
     clock-based profiling is enabled.

     The  Collector  interposes  on  functions  in  the  hardware
     counter  library,  libcpc.so,  so that an application cannot
     use hardware counters while the Collector is collecting per-
     formance  data.  The  interposed functions return a value of
     -1.

     Hardware counter profiling cannot be run on a  system  where
     cpustat  is  running,  because  cpustat takes control of the
     counters, and does not let a user process use them.

     Java profiling is not supported with versions of the Java  2
     SDK earlier than 1.4.

     Data is not  collected  on  descendant  processes  that  are
     created to use the setuid attribute.

     Applications that call vfork(2) have these calls replaced by
     a call to fork1(2).


SEE ALSO

     collector(1), dbx(1), er_archive(1), er_cp(1), er_export(1),
     er_mv(1),  er_print(1),  er_rm(1),  libcollector(3), and the
     manual
     Program Performance Analysis Tools