Using the perf tool centers around the following concepts:
- events - these are hardware interrupts, hardware counters, or software events that can be tracked
- targets - while these CPUs, processes, or threads are executing, events will be tracked (tracking is disabled otherwise)
- filters - these are predicates on event contexts used to control which events are tracked in which contexts
As I understand it, perf can operate in two main modes (as indicated by the perf stat and perf record subcommands):
- `perf stat` - count events that occur as the target executes and produce a set of global event counts, e.g., count the total number of CPU cycles that elapsed while a process was executing.

  This command introduces very low monitoring overhead, but does not track event contexts and thus does not support filters.
- `perf record` - count events that occur as the target executes in a particular execution context (if the context satisfies the filter) and produce a set of per-context event counts. Note that an execution context includes things like:
  - the name of the currently executing function (which will be unknown unless debugging symbols are present)
  - which object file contains the currently executing function

  Note that `perf record` cannot record and associate literally every event occurrence, as they happen too frequently. Instead, it samples events (at a user-specified frequency) and records only the sampled events.

  Also note that, by default, `perf record` tracks the event `cycles`, which occurs once every CPU cycle. The effect of this is that it estimates (by sampling) how many CPU cycles were spent executing each function that your target program/thread runs (or that runs on your target CPU).
For our purposes, we generally want perf record mode, as we want to know what context a particular event was associated with.
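To make the difference between the two modes concrete, here is a minimal shell sketch. The function names `count_cycles` and `sample_cycles` and the `PERF` override variable are hypothetical helpers introduced only for illustration; the `perf stat` and `perf record` invocations themselves use options described in this document.

```shell
# PERF is a hypothetical override so the commands can be dry-run
# (e.g. PERF=echo) on a machine where perf is not installed.
PERF="${PERF:-perf}"

# Counting mode: one global total of CPU cycles for the whole run.
count_cycles() {
    "$PERF" stat -e cycles -- "$@"
}

# Sampling mode: per-context samples of the same event, saved to perf.data.
sample_cycles() {
    "$PERF" record -e cycles -- "$@"
}

# Example usage (`ls` stands in for your own program):
# count_cycles ls
# sample_cycles ls
```

The first helper prints a single cycle count when the command exits; the second writes samples that still need to be inspected with `perf report`.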
Here is a brief explainer of the perf record command (there are many more options that are not covered here):
perf record [-e <event|{event...}>] [--filter=<filter>] \
[-F <freq>] [-c <count>] \
[-g] [--call-graph fp|lbr|dwarf] \
[-o|--output <path>] \
[-a] [-C|--cpu CPU...] [-p|--pid PID...] [-t|--tid TID...] [--] [command]
where:
- Line 1 options specify which events to track:
  - `-e` specifies the single `event` or brace-enclosed, comma-separated `{event...}` list to track

    If absent, the default event is `cycles`, i.e., this event occurs every CPU cycle.
  - `--filter <filter>` specifies the event/context filter predicate; if absent, all tracked events are counted
- Line 2 options specify how often to track events (these are mutually exclusive):
  - `-F` specifies a desired frequency to track events
  - `-c` specifies that 1 out of every `count` events will be tracked

  If `-F` and `-c` are absent, I believe the default is a fixed sampling frequency (older documentation says `-F 1000`, while recent versions of `perf` appear to use `-F 4000`), but I'm not sure about this.
- Line 3 options specify call graph tracking (this extends the execution context to include the stack trace and not just the currently executing function):
  - `-g` enables call graph tracking
  - `--call-graph <unwind-mode>` specifies how `perf` attempts to unwind call stacks:
    - `fp` - (default) efficient but based on frame pointers, which doesn't work on code compiled with option `-fomit-frame-pointer`
    - `lbr` - an accurate and efficient mode that is only supported by modern CPUs
    - `dwarf` - accurate but slow, based on DWARF call frame information
- Line 4 options specify where/how to record results:
  - `-o` - specifies the path to the output file where results are saved; if absent, the default is `perf.data`
- Line 5 options specify targets (in order of granularity):
  - `-a` - specifies all CPUs as targets (i.e., the entire system is profiled)
  - `-C` - specifies the named CPUs as targets
  - `-p` - specifies the named running processes as targets
  - `-t` - specifies the named threads as targets
  - `command` - executes `command` as a new process, specifies that process as a target, and automatically terminates when the `command` process terminates

  Note: if `command` is not specified, `perf` will not terminate until the user presses Ctrl+C.

  To make `perf` record data for a fixed duration of time for a non-command target (for example, the entire system), you can use the following pattern:

      perf record -a sleep 5

  Since `-a` was passed, the entire system will be profiled; however, since `<command>` was passed, the profiling will terminate when the `<command>` process terminates (which, for `sleep 5`, will occur after 5 seconds). This also works for `-C`, `-p`, and `-t`.

  If instead you did:

      perf record -a

  it will only terminate when Ctrl+C is pressed.
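Putting several of these options together, the following is a rough sketch of a fixed-duration, system-wide recording with call graphs. The helper name `record_system`, its argument order, the `PERF` override, and the 99 Hz default are all assumptions made for illustration, not recommendations from the `perf` documentation.

```shell
# PERF is a hypothetical override, e.g. PERF=echo for a dry run.
PERF="${PERF:-perf}"

# Sample the whole system at a given frequency for a fixed duration,
# recording call stacks by unwinding frame pointers.
#   $1 = duration in seconds (default 5)
#   $2 = sampling frequency in Hz (default 99, an arbitrary example value)
#   $3 = output file (default perf.data)
record_system() {
    "$PERF" record -a -F "${2:-99}" --call-graph fp -o "${3:-perf.data}" \
        -- sleep "${1:-5}"
}

# Example usage (system-wide profiling typically needs root):
# record_system 10 99 system.data
```

Swapping `fp` for `dwarf` or `lbr` changes only the unwinding mode; the targets and the termination behaviour stay the same.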
The `perf report` command consumes `perf record` data and visualizes it. The main options are listed below (see the man page or tutorial for more options):
perf report [-i|--input <path>] [--stdio|--tui] [-g]
- `-i` specifies an input file; if absent, the default is `perf.data`
- `--stdio` generates a report on stdout as text, while `--tui` presents the data using a terminal interface
- `-g` tells the tool to visualize call graph hierarchies
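As a small sketch of how these options combine, the helper below prints a plain-text report with call graphs. The name `text_report` and the `PERF` override are hypothetical; `-g` is deliberately placed last, matching the synopsis above.

```shell
# PERF is a hypothetical override, e.g. PERF=echo for a dry run.
PERF="${PERF:-perf}"

# Print a plain-text report, including call graphs, from a recorded data file.
#   $1 = input file (default perf.data)
text_report() {
    "$PERF" report -i "${1:-perf.data}" --stdio -g
}

# Example usage: save the report next to the data file.
# text_report perf.data > perf-report.txt
```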
- Since the `perf` command talks to low-level hardware counters (and can monitor the entire system), it typically must be run with administrator privileges, so use `sudo` if necessary.
- Note that `perf` is actually a part of the Linux kernel. This means that, typically, you must install a kernel-specific version. To do this on Ubuntu, one does:

      apt-get install linux-tools-common linux-tools-generic linux-tools-`uname -r`
- Data that is produced by `perf record` can go stale over time. I see two methods to deal with this:
  - Generate a report using `perf report` ASAP after running `perf record` --- this trick avoids the staleness issue and is easy to do.
  - However, there may be another possible approach that uses `perf archive` --- essentially, this tool scans your `perf.data` file, gathers debugging symbols for all of the libraries that it references, and dumps them into a compressed archive. This archive can then be unpacked in the `perf` build-id directory, which, by default, is `$HOME/.debug`.
- Running `perf record` in a Docker container requires running the container with the `--privileged` flag (as in `docker run --privileged`), which gives the container root-like permissions on the host system --- this means such containers should be run only when performing profiling.
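The `perf archive` approach mentioned above might look roughly like the sketch below. The helper names `archive_symbols` and `unpack_symbols` and the `PERF` override are hypothetical, and the sketch assumes that `perf archive` writes its output to `<data file>.tar.bz2`, which is what it does on the systems I am aware of.

```shell
# PERF is a hypothetical override, e.g. PERF=echo for a dry run.
PERF="${PERF:-perf}"

# Bundle the symbol files referenced by a data file into an archive.
#   $1 = data file (default perf.data)
archive_symbols() {
    "$PERF" archive "${1:-perf.data}"
}

# Unpack such an archive into the build-id directory.
#   $1 = archive (default perf.data.tar.bz2)
#   $2 = build-id directory (default $HOME/.debug)
unpack_symbols() {
    debug_dir="${2:-$HOME/.debug}"
    mkdir -p "$debug_dir" && tar xf "${1:-perf.data.tar.bz2}" -C "$debug_dir"
}

# Example usage:
# archive_symbols perf.data
# unpack_symbols perf.data.tar.bz2
```

Running `unpack_symbols` on the machine where `perf report` will be used should let the report resolve symbols even if the original binaries are no longer available.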
- The `perf` wiki and its tutorial
- Brendan Gregg's `perf` Examples
- https://github.com/bpradipt/perf-container - information about running `perf` in a container
Of these resources, so far, I found the tutorial most helpful --- but the examples page is a nice quick reference if you know what you're doing.