Using the `perf` tool centers around the following concepts:
- events - hardware interrupts, hardware counters, or software events that can be tracked
- targets - the CPUs, processes, or threads being monitored; events are tracked only while a target is executing (tracking is disabled otherwise)
- filters - predicates on event contexts that control which events are tracked in which contexts
As I understand it, `perf` can operate in two main modes (as indicated by the `perf stat` and `perf record` subcommands):
- `perf stat` - counts events that occur as the target executes and produces a set of global event counts, e.g., the total number of CPU cycles that elapsed while a process was executing. This command introduces very low monitoring overhead, but does not track event contexts and thus does not support filters.
- `perf record` - counts events that occur as the target executes in a particular execution context (if the context satisfies the filter) and produces a set of per-context event counts. Note that an execution context includes things like:
  - the name of the currently executing function (which will be unknown unless debugging symbols are present)
  - which object file contains the currently executing function

  Note that `perf record` cannot record and associate literally every event occurrence, as they happen too frequently. Instead, it samples events (at a user-specified frequency) and records only the sampled events.

  Also note that, by default, `perf record` tracks the `cycles` event, which occurs once every CPU cycle. The effect is that it estimates (by sampling) how many CPU cycles were spent executing each function that your target program/thread runs (or that runs on your target CPU).
For our purposes, we generally want `perf record` mode, as we want to know what context a particular event was associated with.
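To make the difference between the two modes concrete, here is a minimal sketch that runs one workload under each. The workload (`sleep 1`), the helper names `run_stat` and `run_record`, and the `RUN` dry-run variable are my own placeholders, not part of `perf`. By default the script only prints the commands; running it as `RUN= sh script.sh` should execute them for real.

```shell
#!/bin/sh
# Dry run by default: with RUN unset, each command is printed, not executed.
# Run with RUN set to the empty string (RUN= sh script.sh) to execute;
# real runs may need sudo.
RUN="${RUN-echo}"

# Mode 1: global counts only (no per-function breakdown).
run_stat() {
    $RUN perf stat -e cycles,instructions -- "$@"
}

# Mode 2: sampled, per-context counts written to ./perf.data.
run_record() {
    $RUN perf record -e cycles -- "$@"
}

run_stat sleep 1      # "sleep 1" is a hypothetical workload
run_record sleep 1
```

The first command should end with a summary of totals on the terminal, while the second leaves a `perf.data` file behind for later inspection.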
Here is a brief explainer of the `perf record` command (there are many more options that are not covered here):
perf record [-e <event|{event...}>] [--filter=<filter>] \
[-F <freq>] [-c <count>] \
[-g] [--call-graph fp|lbr|dwarf] \
[-o|--output <path>] \
[-a] [-C|--cpu CPU...] [-p|--pid PID...] [-t|--tid TID...] [--] [command]
where:
- Line 1 options specify which events to track:
  - `-e` specifies the single `event` or brace-enclosed, comma-separated `{event...}` list to track. If absent, the default event is `cycles`, i.e., the event that occurs every CPU cycle.
  - `--filter <filter>` specifies the event/context filter predicate; if absent, all tracked events are counted
- Line 2 options specify how often to track events (these are mutually exclusive):
  - `-F` specifies a desired frequency (samples per second) at which to track events
  - `-c` specifies that 1 out of every `count` events will be tracked

  If `-F` and `-c` are absent, I believe the default is `-F 1000`, but I'm not sure about this (recent `perf` versions appear to default to `-F 4000`).
- Line 3 options specify call graph tracking (this extends the execution context to include the stack trace and not just the currently executing function):
  - `-g` enables call graph tracking
  - `--call-graph <unwind-mode>` specifies how `perf` attempts to unwind call stacks:
    - `fp` - (default) efficient, but based on frame pointers, so it does not work on code compiled with the option `-fomit-frame-pointer`
    - `lbr` - an accurate and efficient mode that is only supported by modern CPUs
    - `dwarf` - accurate but slow, based on DWARF call frame information
- Line 4 options specify where/how to record results:
  - `-o` - specifies the path to the output file where results are saved; if absent, the default is `perf.data`
- Line 5 options specify targets (in order of granularity):
  - `-a` - specifies all CPUs as targets (i.e., the entire system is profiled)
  - `-C` - specifies the named CPUs as targets
  - `-p` - specifies the named running processes as targets
  - `-t` - specifies the named threads as targets
  - `command` - executes `command` as a new process, specifies that process as a target, and automatically terminates when the `command` process terminates
Note: if `command` is not specified, `perf` will not terminate until the user presses Ctrl+C.

To make `perf` record data for a fixed duration of time for a non-command target (for example, the entire system), you can use the following pattern:
perf record -a sleep 5
Since `-a` was passed, the entire system will be profiled; however, since `<command>` was passed, the profiling will terminate when the `<command>` process terminates (which, for `sleep 5`, will occur after 5 seconds). This also works for `-C`, `-p`, and `-t`.

If instead you did:
perf record -a
it will only terminate when Ctrl+C is pressed.
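The options above can be combined in many ways; the following is a rough sketch of a few combinations I would expect to be useful. The function names, the output file name `system.data`, the PID `1234`, and the `RUN` dry-run variable are all hypothetical placeholders. As written, the script only prints the commands; running it as `RUN= sh script.sh` should execute them.

```shell
#!/bin/sh
# Dry run by default: with RUN unset, each command is printed, not executed.
# Run with RUN set to the empty string to execute (may need sudo).
RUN="${RUN-echo}"

# Sample the whole system at 99 Hz with call graphs for 5 seconds,
# saving the results to a custom output file.
record_system() {
    $RUN perf record -a -F 99 -g -o system.data -- sleep 5
}

# Sample an already-running process (PID passed as $1) for 10 seconds,
# unwinding stacks with DWARF call frame information.
record_pid() {
    $RUN perf record -p "$1" --call-graph dwarf -- sleep 10
}

# Take one sample per 1,000,000 occurrences of the cycles event
# while running a new command.
record_command() {
    $RUN perf record -e cycles -c 1000000 -- "$@"
}

record_system
record_pid 1234           # 1234 is a placeholder PID
record_command ls -l
```

Note that each function uses either `-F` or `-c`, never both, since the two are mutually exclusive.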
The `perf report` tool consumes `perf record` data and visualizes it. The main options are listed below (see the man page or tutorial for more options):
perf report [-i|--input <path>] [--stdio|--tui] [-g]
- `-i` specifies an input file; if absent, the default is `perf.data`
- `--stdio` generates a report on stdout as text, while `--tui` presents data using a terminal interface
- `-g` tells the tool to visualize call graph hierarchies
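As a small illustration of these options, here is a sketch of two report invocations. The function names, the input file name `system.data`, and the `RUN` dry-run variable are hypothetical placeholders of mine. The script prints the commands by default; running it as `RUN= sh script.sh` should execute them.

```shell
#!/bin/sh
# Dry run by default: with RUN unset, each command is printed, not executed.
# Run with RUN set to the empty string to execute.
RUN="${RUN-echo}"

# Plain-text report of the default ./perf.data, convenient for
# redirecting to a file or piping through a pager.
report_text() {
    $RUN perf report --stdio
}

# Plain-text report of a specific data file ($1), including call graphs.
# This is only meaningful if the data was recorded with -g or --call-graph.
report_callgraph() {
    $RUN perf report -i "$1" --stdio -g
}

report_text
report_callgraph system.data   # hypothetical file name
```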
- Since the `perf` command talks to low-level hardware counters (and can monitor the entire system), it typically must be run with administrator privileges, so use `sudo` if necessary.
- Note that `perf` is actually a part of the Linux kernel. This means that, typically, you must install a kernel-specific version. To do this on Ubuntu, one does:

      apt-get install linux-tools-common linux-tools-generic linux-tools-`uname -r`
- Data that is produced by `perf record` can go stale over time. I see two methods to deal with this:
  - Generate a report using `perf report` ASAP after running `perf record` --- this trick avoids the staleness issue and is easy to do.
  - However, there may be another possible approach that uses `perf archive` --- essentially, this tool scans your `perf.data` file, gathers debugging symbols for all of the libraries that it references, and dumps them into a compressed archive. This archive can then be unpacked in the `perf` build-id directory, which, by default, is `$HOME/.debug`.
- Running `perf record` in a Docker container requires running the container with the `--privileged` flag, which gives the container root-like permissions on the host system --- this means such containers should be run only when performing profiling.
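For the `perf archive` approach mentioned above, the workflow might look roughly like the sketch below. I am assuming that `perf archive` writes its output to `perf.data.tar.bz2` next to the data file, which is what I have seen it report; the function names and the `RUN` dry-run variable are hypothetical. The script prints the commands by default; running it as `RUN= sh script.sh` should execute them.

```shell
#!/bin/sh
# Dry run by default: with RUN unset, each command is printed, not executed.
# Run with RUN set to the empty string to execute.
RUN="${RUN-echo}"

# On the machine where perf record ran: bundle the symbol files
# referenced by perf.data (assumed output: perf.data.tar.bz2).
make_archive() {
    $RUN perf archive perf.data
}

# On the machine where perf report will run: unpack the bundle
# into the default build-id directory.
unpack_archive() {
    $RUN mkdir -p "$HOME/.debug"
    $RUN tar xvf perf.data.tar.bz2 -C "$HOME/.debug"
}

make_archive
unpack_archive
```

After unpacking, `perf report` on that machine should be able to resolve symbols from the archived files rather than from whatever binaries happen to be installed.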
- The `perf` wiki and its tutorial
- Brendan Gregg's `perf` Examples
- https://github.com/bpradipt/perf-container - information about running `perf` in a container
Of these resources, so far, I found the tutorial most helpful --- but the examples page is a nice quick reference if you know what you're doing.