Compilation Requirements
* Debugging flags: -g -gdwarf-2 -g3 -fno-omit-frame-pointer
* Disable inlining: -fno-inline
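As a sketch of how these flags combine on one command line, assuming GCC as the compiler: `PROFILE_CFLAGS` and `profile_build_cmd` are hypothetical names, and the function only prints the command rather than running it.

```shell
# Flags from the list above: full debug info, frame pointers kept, no inlining.
PROFILE_CFLAGS="-g -gdwarf-2 -g3 -fno-omit-frame-pointer -fno-inline"

# Hypothetical helper: print (not run) a gcc command line using those flags.
# $1 = output binary, remaining arguments = source files (gcc is an assumption).
profile_build_cmd() {
    out=$1
    shift
    printf 'gcc %s -o %s %s\n' "$PROFILE_CFLAGS" "$out" "$*"
}

profile_build_cmd app main.c util.c
```

The printed line can be pasted into a shell or a Makefile rule once the file names match the real project.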
operf <operf_args> <application> <app_args>
With kernel profiling (the kernel image must be uncompressed and not stripped)
* On openSUSE, the image of the running kernel can simply be gunzip'ed
* If self-compiled, it can normally be found in /usr/src/linux/arch/x86/boot/compressed/vmlinux
operf --vmlinux=vmlinux <app...>
To include callgraph information, for analyzing what was calling the long-running symbols
Configure the daemon (kernel and other options)
opcontrol --vmlinux=vmlinux
Run and dump the results (results can be dumped while running)
opcontrol --reset; opcontrol --start; <run app>; opcontrol --stop
opcontrol --start-daemon
opcontrol --status - Show the daemon configuration
opreport - List all shared libraries and apps that consumed CPU cycles
opreport --symbols --debug-info - List the symbols that consumed CPU cycles (apps should be compiled with -g to get this info)
opreport --symbols --debug-info --details - Get line information
opreport --symbols -p /lib/modules/3.11.6-4-desktop
opreport ... </path/to/library_or_app> - Only report information related to the specified application
opreport --include-symbols <symbol> - Only report information about the specified symbol
opreport --threshold 4.0 - Only report entries with more than 4% usage
opreport event:CPU_CLK_UNHALTED cpu:1 ... - Report only specific events (see man opreport)
opreport --symbols --debug-info --callgraph - operf should be executed with --callgraph as well (costs a lot of performance)
The format is a bit strange: the output consists of multiple blocks. In each block one line starts at the
very beginning of the line and the others are indented with a few spaces. The non-indented line gives the
standard information about the symbol under consideration. The lines above it list all functions calling the
symbol, along with statistics on how often the symbol was sampled while being called from each function.
The lines below it list the functions called by the symbol. The line ending with "[self]"
tells how long the symbol itself was executing (without calling anything further).
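The block layout described above can be flattened into a caller/callee list. This is a minimal sketch with hypothetical names (`callgraph_edges`, `sample_callgraph`) and made-up sample numbers. It assumes that blocks are separated by lines of dashes, that the columns are samples, percentage, image name, symbol name (as without --debug-info), and that symbol names contain no spaces.

```shell
# Hypothetical helper: turn call-graph blocks into "caller -> callee samples" lines.
callgraph_edges() {
    awk '
        /^-+$/ { n = 0; primary = ""; next }      # dashed line: a new block starts
        /^[0-9]/ {                                # non-indented line: the symbol itself
            primary = $4
            for (i = 1; i <= n; i++) print callers[i] " -> " primary " " counts[i]
            n = 0
            next
        }
        /^ +[0-9]/ {                              # indented line: caller, callee or [self]
            if (primary == "") { n++; callers[n] = $4; counts[n] = $1 }
            else if ($NF == "[self]") print primary " [self] " $1
            else print primary " -> " $4 " " $1
        }
    '
}

# Made-up input shaped like the description above (not real profiler output).
sample_callgraph() {
    cat <<'EOF'
samples  %        image name               symbol name
-------------------------------------------------------------------------------
  12       60.0000  app                      main
  8        40.0000  app                      worker
20       25.0000  app                      compute
  15       75.0000  app                      compute [self]
  5        25.0000  libm.so                  sqrt
-------------------------------------------------------------------------------
EOF
}

sample_callgraph | callgraph_edges
```

On real data the same filter would be fed from the report itself, e.g. opreport --callgraph | callgraph_edges. C++ symbols with spaces in their signatures would need a smarter column split.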
opgprof <component_full_name> - Generates gprof output for a single component
gprof -p <component_full_name> - and other gprof commands...
- With: https://github.com/jrfonseca/gprof2dot
opreport --callgraph --debug-info --long-filenames | gprof2dot -f oprofile | dot -Tpng -o output.png
Annotate library source:
opannotate --source /home/csa/opc/libds/libds.so
opannotate --assembly /home/csa/opc/libds/libds.so
Storing and comparing profiles:
opreport --output-dir=<result_dir> - Use oprofile information from the specified directory
oparchive -o <arc_name> <app>
opreport --session-dir=`pwd`/oprofile_data/ -x1 <app> { archive: ./<arc_name> } { }
opreport --session-dir=`pwd`/oprofile_data/ -x1 <app> { archive: ./<arc_name> } { ./<arc2_name> }
Differential profiles:
opcontrol --list-events
CPU_CLK_UNHALTED - Clock cycles while the CPU is not halted (CPU load)
BR_MISSP_EXEC - Branch mispredictions
L2_LINES_IN - L2 cache line modifications
opcontrol --event=<event1_spec> --event=<event2_spec>
- This should be executed before the daemon is started
- The number of supported counters depends on the CPU (1-8)
- Default event counter (depends on CPU): CPU_CLK_UNHALTED:100000:0:1:1
- Multiple events are not working for me (Core Duo, only the first one is collected)
event_spec: <name:sample_rate:mask:kernel_mode:user_mode>
+ name and mask are obtained using 'ophelp'
+ sample_rate - a sample is taken once per the specified number of event occurrences (cycles, for CPU_CLK_UNHALTED)
+ if kernel_mode = 0, events occurring in kernel mode are ignored
+ if user_mode = 0, events occurring in user mode are ignored
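The event_spec fields above can be assembled mechanically. This is a minimal sketch with hypothetical helper names (`event_spec` and `samples_per_second`, neither of which is part of oprofile). The defaults for mask, kernel_mode and user_mode are taken from the default spec quoted earlier (CPU_CLK_UNHALTED:100000:0:1:1), and the 2 GHz clock is only an example value.

```shell
# Hypothetical helper (not an oprofile tool): join the five event_spec fields.
# Usage: event_spec <name> <sample_rate> [mask] [kernel_mode] [user_mode]
event_spec() {
    printf '%s:%s:%s:%s:%s\n' "$1" "$2" "${3:-0}" "${4:-1}" "${5:-1}"
}

# Rough estimate for a cycle-counting event: one sample every <sample_rate>
# cycles means clock_hz / sample_rate samples per second on a fully busy CPU.
# Usage: samples_per_second <clock_hz> <sample_rate>
samples_per_second() {
    echo $(( $1 / $2 ))
}

event_spec CPU_CLK_UNHALTED 100000          # prints CPU_CLK_UNHALTED:100000:0:1:1
event_spec CPU_CLK_UNHALTED 100000 0 0 1    # user mode only
samples_per_second 2000000000 100000        # 2 GHz example clock, prints 20000
```

The result can be passed straight to the command above, e.g. opcontrol --event="$(event_spec CPU_CLK_UNHALTED 100000 0 0 1)". A smaller sample_rate gives more samples per second but also more profiling overhead.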
opcontrol --separate=<mode>
+ none - do not separate the profiles (default)
+ library - generate per-application profiles for libraries
+ kernel - generate per-application profiles for the kernel
+ all - generate per-application profiles for libraries and
per-application profiles for the kernel and kernel
There is a limited number of counters that can monitor arbitrary events. AMD
has 4 and Intel has 2 (actually 5, but 3 are predefined and not used by
oprofile). To count more events, counter multiplexing can be used by the
modules. This is actually implemented for AMD (and probably the old Intel P4).
However, support for modern Intel CPUs is still missing.
Development is going on in the oprofile git branch by Robert Richter, and he is
promising to include a complete version in 2.6.30 or 2.6.31.
http://git.kernel.org/?p=linux/kernel/git/rric/oprofile.git;a=shortlog;h=multiplexing
arch/x86/oprofile/nmi_int.c:op_nmi_init selects the appropriate CPU model
op_model_ppro.c - is currently used for Core* processors
The patch against 2.6.28 certainly does not include support for Intel
There was also development to update the oprofile apps to work with the perfmon2
module, which already supports multiplexing, but it is only for ia64 at the moment
http://sourceforge.net/mailarchive/forum.php?thread_name=20090212222748.GA7205%40suse.de&forum_name=perfmon2-devel
operf --events=BR_MISS_PRED_RETIRED:500,BR_INST_RETIRED:500 <app>
operf --events=CPU_CLK_UNHALTED:2000003,mem_load_uops_retired:2000003:l1_hit ./mult_sse_debug
- oprof_start - Simple UI allowing selection of events from list