Function/Tree Profiler: codeanalyst(+), valgrind/kcachegrind(slow) [low-precision: sysprof, google-perftools]
Hardware Assisted Profiler: oprofile(+), perfmon2
Heap Profiler: valgrind/kcachegrind(slow), Google PerfTools (low precision)
Memory Fragmentation: TotalView MemoryScape
Memory Leak Detector: TotalView MemoryScape, valgrind (free), Purify (finds a lot)
Debugger: SlickEdit + gdb, TotalView + ReplayEngine (!?)
Tracer: dtrace (!), strace (syscalls), ltrace (library calls), btrace (I/O)
Extra Tools: PowerTop, LatencyTop

Standard GNU profiler (gprof, gcov)
-----------------------------------
The GNU profilers have problems with multithreaded applications and are not able
to track into dynamically loaded (dlopen) shared libraries. Moreover,
while gprof will just ignore such libraries, gcov will fail all calls

Open source memory leak detector and profiler. Very powerful, it is able
to profile CPU and heap, and to simulate cache and branching behaviour. There is
even an alpha version of an I/O profiler available. However, it's extremely
* kcachegrind - provides a graphical tree-style profiler (very convenient)
* alleyoop - Gnome GUI to valgrind memcheck and helgrind. Does not provide

Google Performance Tools
------------------------
Malloc debugger as well as heap and CPU profilers. Constructs a graph to
better represent call flow.
- By design, you only get the main time-consuming functions. The
  profiling of servers waking up for a short time to process requests is
- Crashes on amd64 due to libunwind 0.98.x (0.99 is expected to fix the problem

Plain profiler using hardware counters (oprofile vanilla kernel module).
Supports threads and processes. Annotates sources/assembler code with
+ There is something like qemu support
- The current version has problems handling multiple events at once (no multiplexing)
- Root privileges are required to start and stop monitoring
- Only flat profiling, no way to know who was calling the expensive functions

Perfmon2 [requires kernel patch]
--------
pfmon is an alternative to OProfile.
+ Works fine with standard user credentials
+ Works well with multiple events (multiplexing support)
+ Provides convenient constructs to start and stop monitoring in certain
  places or on access to certain data.
+ Able to monitor a single application only
+ gpfmon is a Python-based graphical frontend (able to operate over ssh)
- However, it is only able to provide symbol information, the sources are
- Flat profiling only: you will not know who called the free function
  in libc that takes 90% of the time.

The Performance API (PAPI) project specifies a standard API for accessing
hardware performance counters available on most modern microprocessors.
- Needs its own kernel patch, which is quite outdated (2.6.26-rc when 2.6.29
  is out), or optionally the perfmon2 module. However, the compilation with the latter is
- For these reasons, not tested

Very simple tree-based profiler with a minimalistic user interface.
* Provides a kernel module (sysprof-module, kernel patching is not needed)
* Operates in system-wide mode (no events, just CPU time)
* Reports where the most time was spent (low precision), organizes symbols in

- Console-based frontend to OProfile and PAPI providing the same functionality
  but requiring a little less typing.

Open source profiler based on OProfile. Unfortunately, it works only with
AMD processors. I patched it to run on Core2.
- However, event multiplexing is not yet implemented in the Intel driver, so
  only two events can be observed simultaneously (normal AMD
  configs have about 10). Multiplexing is expected in the next few kernel
  releases, and it's quite OK even with 2.
- I have not implemented the standard views (because of the aforesaid limitation and
  the need to map between AMD and Intel counters)
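
To make the multiplexing limitation concrete: with only two usable counters, a
profiler that wants more events has to rotate groups of events over time slices
and scale the raw counts up afterwards. The sketch below is only a toy model of
that idea, not how OProfile, CodeAnalyst or perfmon2 implement it. The event
names and the functions group_events and scale_count are hypothetical.

```python
def group_events(events, n_counters):
    """Split the requested events into groups that fit the hardware counters."""
    if n_counters < 1:
        raise ValueError("need at least one hardware counter")
    return [events[i:i + n_counters] for i in range(0, len(events), n_counters)]


def scale_count(raw_count, time_enabled, time_running):
    """Extrapolate a count measured during time_running to all of time_enabled."""
    if time_running <= 0:
        return 0.0  # the event was never scheduled, nothing to extrapolate
    return raw_count * time_enabled / time_running


if __name__ == "__main__":
    # Hypothetical event names, chosen only for illustration.
    events = ["cycles", "instructions", "cache_misses", "branch_misses", "tlb_misses"]
    # With 2 counters the 5 events need 3 groups that take turns on the hardware.
    print(group_events(events, n_counters=2))
    # A group that was on the counters for 3 s of a 9 s run gets its count tripled.
    print(scale_count(raw_count=1_000_000, time_enabled=9.0, time_running=3.0))
```

Without multiplexing only the first group can be measured in a single run, which
is the "two events at a time" situation described above.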

TotalView MemoryScape & Debugger (with ReplayEngine) [commercial]
--------------------------------
- Finds memory leaks and potential memory problems
- Debugger provides step-back functionality by recording states
- Supports parallel debugging (MPI, ...), not tested

Sources should be compiled with -g -gdwarf-2 -g3 -DDEBUG_GDB

IBM Rational PurifyPlus [commercial, very limited evaluation]
-----------------------
The Rational suite does not require kernel support and consists of 3 tools:
  purify - finds problems in code
  purecov - coverage testing
  quantify - performance
- IBM only supports RHEL and SuSE Enterprise. Other systems are unsupported.
+ However, it runs pretty well in a VM with Scientific Linux
- The evaluation license imposes some limitations. However, a test run on opc-server
  found a lot of errors. No precise details because of the evaluation, but
  it looks comparable to valgrind and much faster.

Intel VTune [commercial]
-----------
- Intel profiler. Depends on an ugly port of DCOM to Linux and for that
  reason works on a very limited number of platforms. I was able to set it up
  on Gentoo and some tools even ran (terribly slowly), however the profiling
  always complained and failed.
- As it depends on hardware counters, it does not work in a VM.
- Expected to be a counterpart of oprofile (plus some graphics)
- Basically console tools
- No Eclipse integration on 64-bit platforms

Insure++ [commercial, not available]
--------
Advanced tool to find memory and other problems in code

* cprof - no longer maintained
* sprof - cannot be found any more
* prospect - hardware debugger based on the oprofile module for 2.4 kernels, no
* HP Q-Tools - the hardware profiler is available only on the Linux/Itanium platform.
  But since version 0.5.1 qprof runs on common Intel hardware.
  However, it is quite slow, imprecise like the Google toolkit and does not provide
  anything special. It also has some problems with threads/shared libraries...
* HP Caliper - HP-UX and Itanium/Linux only

A profiler is included in Python:
  python -m cProfile -o profile.out <python.script>
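
The file written by -o can be inspected with the standard pstats module. The
sketch below is a minimal example under stated assumptions: inner and outer are
placeholder workload functions, and profile_to_file and report are hypothetical
helper names. It writes the same kind of stats file from inside Python, then
prints a listing sorted by cumulative time and the callers of one function,
which is the caller information a flat profile lacks.

```python
import cProfile
import io
import pstats


def inner(n):
    # Placeholder CPU-bound helper so that something shows up in the profile.
    return sum(i * i for i in range(n))


def outer():
    # Placeholder caller; calls inner() several times.
    results = []
    for _ in range(10):
        results.append(inner(5000))
    return results


def profile_to_file(path="profile.out"):
    """Run outer() under cProfile and dump the statistics to path."""
    profiler = cProfile.Profile()
    profiler.enable()
    outer()
    profiler.disable()
    profiler.dump_stats(path)
    return path


def report(path, limit=5):
    """Return a text report: top entries by cumulative time, then callers of inner."""
    buf = io.StringIO()
    stats = pstats.Stats(path, stream=buf)
    stats.strip_dirs().sort_stats("cumulative")
    stats.print_stats(limit)      # top `limit` entries by cumulative time
    stats.print_callers("inner")  # which functions called inner()
    return buf.getvalue()


if __name__ == "__main__":
    print(report(profile_to_file()))
```

The same report() call should also work on a profile.out produced by the
command line above, since both use the cProfile stats format.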

It can be used together with valgrind, using the following command:
  valgrind --tool=callgrind ${PYTHON} -m cProfile -o ${PACKAGE_HOME}/profile.out ${PACKAGE_HOME}/SOURCES/PyHST.py $*

* pmr - measures pipe bandwidth
* pintool - Pin is a tool for the dynamic instrumentation of programs: it may
  inject arbitrary code (written in C or C++) at arbitrary places in the
  executable. Pin adds the code dynamically while the executable is running.
  This also makes it possible to attach Pin to an already running process.

Extremely powerful tracer for OpenSolaris. The BrandZ framework, which extends
Solaris Zones, allows Linux applications to be executed inside a zone and hence
The current version of the lx brand emulates the 2.4.21 kernel and glibc 2.3

Displays on a per-application basis how much time applications spend
waiting on CPU (modules, symbols, system calls)

Measures process wakeups per second. Quite important for power consumption