1<?xml version="1.0"?> <!-- -*- sgml -*- -->
2<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
3          "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
4[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>
5
6
7<chapter id="ms-manual" xreflabel="Massif: a heap profiler">
8  <title>Massif: a heap profiler</title>
9
10<para>To use this tool, you must specify
11<option>--tool=massif</option> on the Valgrind
12command line.</para>
13
14<sect1 id="ms-manual.overview" xreflabel="Overview">
15<title>Overview</title>
16
17<para>Massif is a heap profiler.  It measures how much heap memory your
18program uses.  This includes both the useful space, and the extra bytes
19allocated for book-keeping and alignment purposes.  It can also
20measure the size of your program's stack(s), although it does not do so by
21default.</para>
22
23<para>Heap profiling can help you reduce the amount of memory your program
24uses.  On modern machines with virtual memory, this provides the following
25benefits:</para>
26
27<itemizedlist>
28  <listitem><para>It can speed up your program -- a smaller
29    program will interact better with your machine's caches and
30    avoid paging.</para></listitem>
31
32  <listitem><para>If your program uses lots of memory, it will
33    reduce the chance that it exhausts your machine's swap
34    space.</para></listitem>
35</itemizedlist>
36
37<para>Also, there are certain space leaks that aren't detected by
38traditional leak-checkers, such as Memcheck's.  That's because
39the memory isn't ever actually lost -- a pointer remains to it --
40but it's not in use.  Programs that have leaks like this can
41unnecessarily increase the amount of memory they are using over
42time.  Massif can help identify these leaks.</para>
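
<para>As a hedged illustration of such a leak, consider the sketch below.
The names <function>record_request</function>,
<function>release_log</function> and
<computeroutput>log_head</computeroutput> are hypothetical and are not taken
from any real program.  Every entry stays reachable and is freed before exit,
so a leak checker has nothing to report, yet heap usage grows with every
request for as long as the program runs.</para>

<screen><![CDATA[
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request log.  Every entry stays reachable through
   log_head, so no memory is ever "lost". */
struct entry {
    struct entry *next;
    char payload[256];
};

static struct entry *log_head = NULL;
static size_t log_len = 0;

/* Record one request.  Entries are never read again, so the list only
   grows.  Returns 0 on success, -1 if malloc fails. */
static int record_request(const char *text)
{
    struct entry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    strncpy(e->payload, text, sizeof e->payload - 1);
    e->payload[sizeof e->payload - 1] = '\0';
    e->next = log_head;
    log_head = e;
    log_len++;
    return 0;
}

/* Free the whole list and return the number of entries released. */
static size_t release_log(void)
{
    size_t freed = 0;
    while (log_head != NULL) {
        struct entry *next = log_head->next;
        free(log_head);
        log_head = next;
        freed++;
    }
    log_len = 0;
    return freed;
}

int main(void)
{
    for (int i = 0; i < 10000; i++) {
        if (record_request("GET /index.html") != 0)
            return 1;
    }
    printf("%zu entries still reachable\n", log_len);
    release_log();   /* everything is freed, but only at the very end */
    return 0;
}
]]></screen>

<para>Under Massif, one would expect the graph for a program like this to
climb steadily until the final clean-up.</para>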
43
<para>Importantly, Massif tells you not only how much heap memory your
program is using, but also which parts of your program are responsible for
allocating that memory.</para>
48
49</sect1>
50
51
52<sect1 id="ms-manual.using" xreflabel="Using Massif and ms_print">
53<title>Using Massif and ms_print</title>
54
55<para>First off, as for the other Valgrind tools, you should compile with
56debugging info (the <option>-g</option> option).  It shouldn't
57matter much what optimisation level you compile your program with, as this
58is unlikely to affect the heap memory usage.</para>
59
60<para>Then, you need to run Massif itself to gather the profiling
61information, and then run ms_print to present it in a readable way.</para>
62
63
64
65
66<sect2 id="ms-manual.anexample" xreflabel="An Example">
67<title>An Example Program</title>
68
69<para>An example will make things clear.  Consider the following C program
70(annotated with line numbers) which allocates a number of different blocks
71on the heap.</para>
72
73<screen><![CDATA[
 1      #include <stdlib.h>
 2
 3      void g(void)
 4      {
 5         malloc(4000);
 6      }
 7
 8      void f(void)
 9      {
10         malloc(2000);
11         g();
12      }
13
14      int main(void)
15      {
16         int i;
17         int* a[10];
18
19         for (i = 0; i < 10; i++) {
20            a[i] = malloc(1000);
21         }
22
23         f();
24
25         g();
26
27         for (i = 0; i < 10; i++) {
28            free(a[i]);
29         }
30
31         return 0;
32      }
106]]></screen>
107
108</sect2>
109
110
111<sect2 id="ms-manual.running-massif" xreflabel="Running Massif">
112<title>Running Massif</title>
113
114<para>To gather heap profiling information about the program
115<computeroutput>prog</computeroutput>, type:</para>
116<screen><![CDATA[
valgrind --tool=massif prog
118]]></screen>
119
120<para>The program will execute (slowly).  Upon completion, no summary
121statistics are printed to Valgrind's commentary;  all of Massif's profiling
122data is written to a file.  By default, this file is called
123<filename>massif.out.&lt;pid&gt;</filename>, where
124<filename>&lt;pid&gt;</filename> is the process ID, although this filename
125can be changed with the <option>--massif-out-file</option> option.</para>
126
127</sect2>
128
129
130<sect2 id="ms-manual.running-ms_print" xreflabel="Running ms_print">
131<title>Running ms_print</title>
132
133<para>To see the information gathered by Massif in an easy-to-read form, use
134ms_print.  If the output file's name is
135<filename>massif.out.12345</filename>, type:</para>
136<screen><![CDATA[
ms_print massif.out.12345
]]></screen>
138
139<para>ms_print will produce (a) a graph showing the memory consumption over
140the program's execution, and (b) detailed information about the responsible
141allocation sites at various points in the program, including the point of
142peak memory allocation.  The use of a separate script for presenting the
143results is deliberate:  it separates the data gathering from its
144presentation, and means that new methods of presenting the data can be added in
145the future.</para>
146
147</sect2>
148
149
150<sect2 id="ms-manual.theoutputpreamble" xreflabel="The Output Preamble">
151<title>The Output Preamble</title>
152
153<para>After running this program under Massif, the first part of ms_print's
154output contains a preamble which just states how the program, Massif and
155ms_print were each invoked:</para>
156
157<screen><![CDATA[
--------------------------------------------------------------------------------
Command:            example
Massif arguments:   (none)
ms_print arguments: massif.out.12797
--------------------------------------------------------------------------------
163]]></screen>
164
165</sect2>
166
167
168<sect2 id="ms-manual.theoutputgraph" xreflabel="The Output Graph">
169<title>The Output Graph</title>
170
171<para>The next part is the graph that shows how memory consumption occurred
172as the program executed:</para>
173
174<screen><![CDATA[
    KB
19.63^                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                      :#
     |                                                                      :#
     |                                                                      :#
   0 +----------------------------------------------------------------------->ki
     0                                                                   113.4


Number of snapshots: 25
 Detailed snapshots: [9, 14 (peak), 24]
201]]></screen>
202
<para>Why is most of the graph empty, with only a couple of bars at the very
end?  By default, Massif uses "instructions executed" as the unit of time.
For very short-run programs such as the example, most of the executed
instructions involve the loading and dynamic linking of the program.  The
execution of <computeroutput>main</computeroutput> (and thus the heap
allocations) only occurs at the very end.  For a short-running program like
this, we can use the <option>--time-unit=B</option> option
to specify that we want the time unit to instead be the number of bytes
allocated/deallocated on the heap and stack(s).</para>
212
213<para>If we re-run the program under Massif with this option, and then
214re-run ms_print, we get this more useful graph:</para>
215
216<screen><![CDATA[
19.63^                                               ###
     |                                               #
     |                                               #  ::
     |                                               #  : :::
     |                                      :::::::::#  : :  ::
     |                                      :        #  : :  : ::
     |                                      :        #  : :  : : :::
     |                                      :        #  : :  : : :  ::
     |                            :::::::::::        #  : :  : : :  : :::
     |                            :         :        #  : :  : : :  : :  ::
     |                        :::::         :        #  : :  : : :  : :  : ::
     |                     @@@:   :         :        #  : :  : : :  : :  : : @
     |                   ::@  :   :         :        #  : :  : : :  : :  : : @
     |                :::: @  :   :         :        #  : :  : : :  : :  : : @
     |              :::  : @  :   :         :        #  : :  : : :  : :  : : @
     |            ::: :  : @  :   :         :        #  : :  : : :  : :  : : @
     |         :::: : :  : @  :   :         :        #  : :  : : :  : :  : : @
     |       :::  : : :  : @  :   :         :        #  : :  : : :  : :  : : @
     |    :::: :  : : :  : @  :   :         :        #  : :  : : :  : :  : : @
     |  :::  : :  : : :  : @  :   :         :        #  : :  : : :  : :  : : @
   0 +----------------------------------------------------------------------->KB
     0                                                                   29.48

Number of snapshots: 25
 Detailed snapshots: [9, 14 (peak), 24]
241]]></screen>
242
<para>The size of the graph can be changed with ms_print's
<option>--x</option> and <option>--y</option> options.  Each vertical bar
represents a snapshot, i.e. a measurement of the memory usage at a certain
point in time.  If the next snapshot is more than one column away, a
horizontal line of characters is drawn from the top of the snapshot to just
before the next snapshot column.  The text at the bottom shows that 25
snapshots were taken for this program, which is one per heap
allocation/deallocation, plus a couple of extras.  Massif starts by taking
snapshots for every heap allocation/deallocation, but as a program runs for
longer, it takes snapshots less frequently.  It also discards older
snapshots as the program goes on;  when it reaches the maximum number of
snapshots (100 by default, although changeable with the
<option>--max-snapshots</option> option) half of them are
deleted.  This means that a reasonable number of snapshots is always
maintained.</para>
258
259<para>Most snapshots are <emphasis>normal</emphasis>, and only basic
260information is recorded for them.  Normal snapshots are represented in the
261graph by bars consisting of ':' characters.</para>
262
<para>Some snapshots are <emphasis>detailed</emphasis>.  Information about
where allocations happened is recorded for these snapshots, as we will see
shortly.  Detailed snapshots are represented in the graph by bars consisting
of '@' characters.  The text at the bottom shows that 3 detailed
snapshots were taken for this program (snapshots 9, 14 and 24).  By default,
every 10th snapshot is detailed, although this can be changed via the
<option>--detailed-freq</option> option.</para>
270
271<para>Finally, there is at most one <emphasis>peak</emphasis> snapshot.  The
272peak snapshot is a detailed snapshot, and records the point where memory
273consumption was greatest.  The peak snapshot is represented in the graph by
274a bar consisting of '#' characters.  The text at the bottom shows
275that snapshot 14 was the peak.</para>
276
277<para>Massif's determination of when the peak occurred can be wrong, for
278two reasons.</para>
279
280<itemizedlist>
281  <listitem><para>Peak snapshots are only ever taken after a deallocation
282  happens.  This avoids lots of unnecessary peak snapshot recordings
283  (imagine what happens if your program allocates a lot of heap blocks in
284  succession, hitting a new peak every time).  But it means that if your
285  program never deallocates any blocks, no peak will be recorded.  It also
286  means that if your program does deallocate blocks but later allocates to a
287  higher peak without subsequently deallocating, the reported peak will be
288  too low.
289  </para>
290  </listitem>
291
292  <listitem><para>Even with this behaviour, recording the peak accurately
293  is slow.  So by default Massif records a peak whose size is within 1% of
294  the size of the true peak.  This inaccuracy in the peak measurement can be
295  changed with the <option>--peak-inaccuracy</option> option.</para>
296  </listitem>
297</itemizedlist>
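
<para>The first limitation can be made concrete with a small sketch.  The
function name <function>grow_after_free</function> is hypothetical.  Going
only by the rule described above, one would expect Massif to record a peak at
the single deallocation (about 1,000 useful bytes) and to miss the later,
larger allocation, because no deallocation follows it.  This expectation has
not been checked against any particular Massif version.</para>

<screen><![CDATA[
#include <stdlib.h>

/* Allocate and free a small block, then allocate a larger block that
   is handed back to the caller without being freed here. */
static void *grow_after_free(void)
{
    void *small = malloc(1000);
    free(small);            /* the only deallocation in the program */
    return malloc(5000);    /* higher usage, with no later deallocation */
}

int main(void)
{
    void *big = grow_after_free();
    /* Deliberately not freed, to reproduce the case described above. */
    return big == NULL;
}
]]></screen>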
298
299<para>The following graph is from an execution of Konqueror, the KDE web
300browser.  It shows what graphs for larger programs look like.</para>
301<screen><![CDATA[
302    MB
3033.952^                                                                    #
304     |                                                                   @#:
305     |                                                                 :@@#:
306     |                                                            @@::::@@#:
307     |                                                            @ :: :@@#::
308     |                                                          @@@ :: :@@#::
309     |                                                       @@:@@@ :: :@@#::
310     |                                                    :::@ :@@@ :: :@@#::
311     |                                                    : :@ :@@@ :: :@@#::
312     |                                                  :@: :@ :@@@ :: :@@#::
313     |                                                @@:@: :@ :@@@ :: :@@#:::
314     |                           :       ::         ::@@:@: :@ :@@@ :: :@@#:::
315     |                        :@@:    ::::: ::::@@@:::@@:@: :@ :@@@ :: :@@#:::
316     |                     ::::@@:  ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
317     |                    @: ::@@:  ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
318     |                    @: ::@@:  ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
319     |                    @: ::@@:::::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
320     |                ::@@@: ::@@:: ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
321     |             :::::@ @: ::@@:: ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
322     |           @@:::::@ @: ::@@:: ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
323   0 +----------------------------------------------------------------------->Mi
324     0                                                                   626.4
325
326Number of snapshots: 63
327 Detailed snapshots: [3, 4, 10, 11, 15, 16, 29, 33, 34, 36, 39, 41,
328                      42, 43, 44, 49, 50, 51, 53, 55, 56, 57 (peak)]
329]]></screen>
330
331<para>Note that the larger size units are KB, MB, GB, etc.  As is typical
332for memory measurements, these are based on a multiplier of 1024, rather
333than the standard SI multiplier of 1000.  Strictly speaking, they should be
334written KiB, MiB, GiB, etc.</para>
335
336</sect2>
337
338
339<sect2 id="ms-manual.thesnapshotdetails" xreflabel="The Snapshot Details">
340<title>The Snapshot Details</title>
341
342<para>Returning to our example, the graph is followed by the detailed
343information for each snapshot.  The first nine snapshots are normal, so only
344a small amount of information is recorded for each one:</para>
345<screen><![CDATA[
346--------------------------------------------------------------------------------
347  n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
348--------------------------------------------------------------------------------
349  0              0                0                0             0            0
350  1          1,008            1,008            1,000             8            0
351  2          2,016            2,016            2,000            16            0
352  3          3,024            3,024            3,000            24            0
353  4          4,032            4,032            4,000            32            0
354  5          5,040            5,040            5,000            40            0
355  6          6,048            6,048            6,000            48            0
356  7          7,056            7,056            7,000            56            0
357  8          8,064            8,064            8,000            64            0
358]]></screen>
359
360<para>Each normal snapshot records several things.</para>
361
362<itemizedlist>
363  <listitem><para>Its number.</para></listitem>
364
365  <listitem><para>The time it was taken. In this case, the time unit is
366  bytes, due to the use of
367  <option>--time-unit=B</option>.</para></listitem>
368
369  <listitem><para>The total memory consumption at that point.</para></listitem>
370
371  <listitem><para>The number of useful heap bytes allocated at that point.
372  This reflects the number of bytes asked for by the
373  program.</para></listitem>
374
375  <listitem><para>The number of extra heap bytes allocated at that point.
376  This reflects the number of bytes allocated in excess of what the program
377  asked for.  There are two sources of extra heap bytes.</para>
378
379  <para>First, every heap block has administrative bytes associated with it.
380  The exact number of administrative bytes depends on the details of the
381  allocator.  By default Massif assumes 8 bytes per block, as can be seen
382  from the example, but this number can be changed via the
383  <option>--heap-admin</option> option.</para>
384
385  <para>Second, allocators often round up the number of bytes asked for to a
386  larger number, usually 8 or 16.  This is required to ensure that elements
387  within the block are suitably aligned.  If N bytes are asked for, Massif
388  rounds N up to the nearest multiple of the value specified by the
389  <option><xref linkend="opt.alignment"/></option> option.
390  </para></listitem>
391
392  <listitem><para>The size of the stack(s).  By default, stack profiling is
393  off as it slows Massif down greatly.  Therefore, the stack column is zero
394  in the example.  Stack profiling can be turned on with the
395  <option>--stacks=yes</option> option.
396
397  </para></listitem>
398</itemizedlist>
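
<para>The two sources of extra heap bytes can be sketched as a small
calculation.  The helper names <function>round_up</function> and
<function>extra_heap_bytes</function> are hypothetical, and this is only an
approximation of the accounting described above, not Massif's actual code.
An alignment of 8 is assumed here because it matches the example output; the
real value comes from the alignment option.</para>

<screen><![CDATA[
#include <stddef.h>
#include <stdio.h>

/* Round n up to the next multiple of align (align must be non-zero). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Extra bytes charged to one block: the administrative bytes plus the
   padding added when the request is rounded up. */
static size_t extra_heap_bytes(size_t requested, size_t admin, size_t align)
{
    return admin + (round_up(requested, align) - requested);
}

int main(void)
{
    /* 1000 is already a multiple of 8, so only the 8 admin bytes are
       extra, as in the example snapshots. */
    printf("%zu\n", extra_heap_bytes(1000, 8, 8));   /* prints 8 */

    /* 1001 is rounded up to 1008: 7 padding bytes plus 8 admin bytes. */
    printf("%zu\n", extra_heap_bytes(1001, 8, 8));   /* prints 15 */
    return 0;
}
]]></screen>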
399
400<para>The next snapshot is detailed.  As well as the basic counts, it gives
401an allocation tree which indicates exactly which pieces of code were
402responsible for allocating heap memory:</para>
403
404<screen><![CDATA[
405  9          9,072            9,072            9,000            72            0
40699.21% (9,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
407->99.21% (9,000B) 0x804841A: main (example.c:20)
408]]></screen>
409
<para>The allocation tree can be read from the top down.  The first line
indicates all heap allocation functions such as <function>malloc</function>
and C++ <function>new</function>.  All heap allocations go through these
functions, and so all 9,000 useful bytes (which is 99.21% of all allocated
bytes) go through them.  But how were <function>malloc</function> and
<function>new</function> called?  At this point, every allocation so far has
been due to line 20 inside <function>main</function>, hence the second line
in the tree.  The <option>-></option> indicates that
<function>main</function> (line 20) called
<function>malloc</function>.</para>
419
420<para>Let's see what the subsequent output shows happened next:</para>
421
422<screen><![CDATA[
423--------------------------------------------------------------------------------
424  n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
425--------------------------------------------------------------------------------
426 10         10,080           10,080           10,000            80            0
427 11         12,088           12,088           12,000            88            0
428 12         16,096           16,096           16,000            96            0
429 13         20,104           20,104           20,000           104            0
430 14         20,104           20,104           20,000           104            0
43199.48% (20,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
432->49.74% (10,000B) 0x804841A: main (example.c:20)
433|
434->39.79% (8,000B) 0x80483C2: g (example.c:5)
435| ->19.90% (4,000B) 0x80483E2: f (example.c:11)
436| | ->19.90% (4,000B) 0x8048431: main (example.c:23)
437| |
438| ->19.90% (4,000B) 0x8048436: main (example.c:25)
439|
440->09.95% (2,000B) 0x80483DA: f (example.c:10)
441  ->09.95% (2,000B) 0x8048431: main (example.c:23)
442]]></screen>
443
444<para>The first four snapshots are similar to the previous ones.  But then
445the global allocation peak is reached, and a detailed snapshot (number 14)
446is taken.  Its allocation tree shows that 20,000B of useful heap memory has
447been allocated, and the lines and arrows indicate that this is from three
448different code locations: line 20, which is responsible for 10,000B
449(49.74%);  line 5, which is responsible for 8,000B (39.79%); and line 10,
450which is responsible for 2,000B (9.95%).</para>
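
<para>The percentages in the peak snapshot can be reproduced from its byte
counts.  The sketch below assumes, as the example numbers suggest, that each
percentage is the entry's useful bytes divided by the snapshot's total bytes
(20,104 here).  The helper name <function>percent_of_total</function> is
hypothetical.</para>

<screen><![CDATA[
#include <stdio.h>

/* Share of the snapshot total taken by one tree entry, in percent. */
static double percent_of_total(unsigned long bytes, unsigned long total)
{
    return total == 0 ? 0.0 : 100.0 * (double)bytes / (double)total;
}

int main(void)
{
    const unsigned long total = 20104;   /* total(B) of snapshot 14 */

    printf("%.2f%%\n", percent_of_total(20000, total));  /* 99.48% */
    printf("%.2f%%\n", percent_of_total(10000, total));  /* 49.74% */
    printf("%.2f%%\n", percent_of_total(8000, total));   /* 39.79% */
    printf("%.2f%%\n", percent_of_total(2000, total));   /* 9.95% */
    return 0;
}
]]></screen>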
451
452<para>We can then drill down further in the allocation tree.  For example,
453of the 8,000B asked for by line 5, half of it was due to a call from line
45411, and half was due to a call from line 25.</para>
455
456<para>In short, Massif collates the stack trace of every single allocation
457point in the program into a single tree, which gives a complete picture at
458a particular point in time of how and why all heap memory was
459allocated.</para>
460
461<para>Note that the tree entries correspond not to functions, but to
462individual code locations.  For example, if function <function>A</function>
463calls <function>malloc</function>, and function <function>B</function> calls
464<function>A</function> twice, once on line 10 and once on line 11, then
465the two calls will result in two distinct stack traces in the tree.  In
466contrast, if <function>B</function> calls <function>A</function> repeatedly
467from line 15 (e.g. due to a loop), then each of those calls will be
468represented by the same stack trace in the tree.</para>
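
<para>A minimal sketch of that situation follows.  The function names
<function>A</function> and <function>B</function> mirror the description
above; the block size and the loop count are arbitrary.  With this code one
would expect three distinct stack traces below <function>A</function> in the
tree: one for each of the first two call sites, and one shared by all
iterations of the loop.  Compiler inlining may change the traces, so the
sketch assumes the functions are not inlined.</para>

<screen><![CDATA[
#include <stdlib.h>

/* The only function that calls malloc directly. */
static void *A(void)
{
    return malloc(100);
}

/* Calls A from three code locations and stores the five results. */
static void B(void **out)
{
    out[0] = A();           /* first call site */
    out[1] = A();           /* second call site: a distinct stack trace */
    for (int i = 2; i < 5; i++)
        out[i] = A();       /* one call site, so one shared stack trace */
}

int main(void)
{
    void *blocks[5];

    B(blocks);
    for (int i = 0; i < 5; i++)
        free(blocks[i]);
    return 0;
}
]]></screen>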
469
470<para>Note also that each tree entry with children in the example satisfies an
471invariant: the entry's size is equal to the sum of its children's sizes.
472For example, the first entry has size 20,000B, and its children have sizes
47310,000B, 8,000B, and 2,000B.  In general, this invariant almost always
474holds.  However, in rare circumstances stack traces can be malformed, in
475which case a stack trace can be a sub-trace of another stack trace.  This
476means that some entries in the tree may not satisfy the invariant -- the
477entry's size will be greater than the sum of its children's sizes.  This is
478not a big problem, but could make the results confusing.  Massif can
479sometimes detect when this happens;  if it does, it issues a warning:</para>
480
481<screen><![CDATA[
482Warning: Malformed stack trace detected.  In Massif's output,
483         the size of an entry's child entries may not sum up
484         to the entry's size as they normally do.
485]]></screen>
486
487<para>However, Massif does not detect and warn about every such occurrence.
488Fortunately, malformed stack traces are rare in practice.</para>
489
490<para>Returning now to ms_print's output, the final part is similar:</para>
491
492<screen><![CDATA[
493--------------------------------------------------------------------------------
494  n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
495--------------------------------------------------------------------------------
496 15         21,112           19,096           19,000            96            0
497 16         22,120           18,088           18,000            88            0
498 17         23,128           17,080           17,000            80            0
499 18         24,136           16,072           16,000            72            0
500 19         25,144           15,064           15,000            64            0
501 20         26,152           14,056           14,000            56            0
502 21         27,160           13,048           13,000            48            0
503 22         28,168           12,040           12,000            40            0
504 23         29,176           11,032           11,000            32            0
505 24         30,184           10,024           10,000            24            0
50699.76% (10,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
507->79.81% (8,000B) 0x80483C2: g (example.c:5)
508| ->39.90% (4,000B) 0x80483E2: f (example.c:11)
509| | ->39.90% (4,000B) 0x8048431: main (example.c:23)
510| |
511| ->39.90% (4,000B) 0x8048436: main (example.c:25)
512|
513->19.95% (2,000B) 0x80483DA: f (example.c:10)
514| ->19.95% (2,000B) 0x8048431: main (example.c:23)
515|
516->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
517]]></screen>
518
519<para>The final detailed snapshot shows how the heap looked at termination.
520The 00.00% entry represents the code locations for which memory was
521allocated and then freed (line 20 in this case, the memory for which was
522freed on line 28).  However, no code location details are given for this
523entry;  by default, Massif only records the details for code locations
524responsible for more than 1% of useful memory bytes, and ms_print likewise
525only prints the details for code locations responsible for more than 1%.
526The entries that do not meet this threshold are aggregated.  This avoids
527filling up the output with large numbers of unimportant entries.  The
528thresholds can be changed with the
529<option>--threshold</option> option that both Massif and
530ms_print support.</para>
531
532</sect2>
533
534
535<sect2 id="ms-manual.forkingprograms" xreflabel="Forking Programs">
536<title>Forking Programs</title>
537<para>If your program forks, the child will inherit all the profiling data that
538has been gathered for the parent.</para>
539
540<para>If the output file format string (controlled by
541<option>--massif-out-file</option>) does not contain <option>%p</option>, then
542the outputs from the parent and child will be intermingled in a single output
543file, which will almost certainly make it unreadable by ms_print.</para>
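
<para>A small forking program for experimenting with this behaviour might
look like the sketch below.  The function name
<function>run_parent_and_child</function> and the block sizes are
hypothetical.  The block allocated before <function>fork</function> is part
of the data the child inherits.</para>

<screen><![CDATA[
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork once; both processes allocate.  Returns the child's exit status
   in the parent, or -1 on error.
   Keeping %p in the output file name, for example
       --massif-out-file=massif.out.%p
   gives each process its own file. */
static int run_parent_and_child(void)
{
    void *before_fork = malloc(4000);   /* allocated before fork() */
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        free(before_fork);
        return -1;
    }
    if (pid == 0) {
        /* Child: allocates and frees a block of its own, then exits. */
        void *child_only = malloc(2000);
        free(child_only);
        free(before_fork);
        exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        free(before_fork);
        return -1;
    }
    free(before_fork);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

int main(void)
{
    return run_parent_and_child() == 0 ? 0 : 1;
}
]]></screen>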
544</sect2>
545
546
547<sect2 id="ms-manual.not-measured"
548       xreflabel="Measuring All Memory in a Process">
549<title>Measuring All Memory in a Process</title>
550<para>
551It is worth emphasising that by default Massif measures only heap memory, i.e.
552memory allocated with
553<function>malloc</function>,
554<function>calloc</function>,
555<function>realloc</function>,
556<function>memalign</function>,
557<function>new</function>,
558<function>new[]</function>,
559and a few other, similar functions.  (And it can optionally measure stack
560memory, of course.)  This means it does <emphasis>not</emphasis> directly
561measure memory allocated with lower-level system calls such as
562<function>mmap</function>,
563<function>mremap</function>, and
564<function>brk</function>.
565</para>
566
567<para>
568Heap allocation functions such as <function>malloc</function> are built on
569top of these system calls.  For example, when needed, an allocator will
570typically call <function>mmap</function> to allocate a large chunk of
571memory, and then hand over pieces of that memory chunk to the client program
572in response to calls to <function>malloc</function> et al.  Massif directly
573measures only these higher-level <function>malloc</function> et al calls,
574not the lower-level system calls.
575</para>
576
577<para>
578Furthermore, a client program may use these lower-level system calls
579directly to allocate memory.  By default, Massif does not measure these.  Nor
580does it measure the size of code, data and BSS segments.  Therefore, the
581numbers reported by Massif may be significantly smaller than those reported by
582tools such as <filename>top</filename> that measure a program's total size in
583memory.
584</para>
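
<para>The difference can be seen with a sketch that obtains the same amount
of memory in two ways.  The helper name <function>map_anonymous</function>
and the 1MB size are hypothetical choices.  Going by the description above,
the <function>malloc</function> block is measured by default, while the
block obtained directly from <function>mmap</function> is not.</para>

<screen><![CDATA[
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

#define CHUNK ((size_t)1 << 20)   /* 1MB, an arbitrary size */

/* Map len bytes of anonymous, private memory.  Returns NULL on failure. */
static void *map_anonymous(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

int main(void)
{
    void *heap_block = malloc(CHUNK);       /* measured by default */
    void *mapped = map_anonymous(CHUNK);    /* not measured by default */

    if (heap_block == NULL || mapped == NULL) {
        fprintf(stderr, "allocation failed\n");
        free(heap_block);
        if (mapped != NULL)
            munmap(mapped, CHUNK);
        return 1;
    }

    munmap(mapped, CHUNK);
    free(heap_block);
    return 0;
}
]]></screen>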
585
586<para>
However, if you wish to measure <emphasis>all</emphasis> the memory used by
your program, you can use the <option>--pages-as-heap=yes</option> option.
When this
589option is enabled, Massif's normal heap block profiling is replaced by
590lower-level page profiling.  Every page allocated via
591<function>mmap</function> and similar system calls is treated as a distinct
592block.  This means that code, data and BSS segments are all measured, as they
593are just memory pages.  Even the stack is measured, since it is ultimately
594allocated (and extended when necessary) via <function>mmap</function>;  for
595this reason <option>--stacks=yes</option> is not allowed in conjunction with
596<option>--pages-as-heap=yes</option>.
597</para>
598
599<para>
600After <option>--pages-as-heap=yes</option> is used, ms_print's output is
601mostly unchanged.  One difference is that the start of each detailed snapshot
602says:
603</para>
604
605<screen><![CDATA[
606(page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
607]]></screen>
608
<para>instead of the usual:</para>
610
611<screen><![CDATA[
612(heap allocation functions) malloc/new/new[], --alloc-fns, etc.
613]]></screen>
614
<para>
The stack traces in the output may be more difficult to read, and interpreting
them may require a detailed understanding of the lower levels of a program,
such as its memory allocators.  But for some programs, having the full
information about memory usage can be very useful.
</para>
621
622</sect2>
623
624
625<sect2 id="ms-manual.acting" xreflabel="Action on Massif's Information">
626<title>Acting on Massif's Information</title>
<para>Massif's information is usually easy to act upon.  The
obvious place to start looking is the peak snapshot.</para>
629
630<para>It can also be useful to look at the overall shape of the graph, to
631see if memory usage climbs and falls as you expect;  spikes in the graph
632might be worth investigating.</para>
633
634<para>The detailed snapshots can get quite large.  It is worth viewing them
635in a very wide window.   It's also a good idea to view them with a text
636editor.  That makes it easy to scroll up and down while keeping the cursor
637in a particular column, which makes following the allocation chains easier.
638</para>
639
640</sect2>
641
642</sect1>
643
644
645<sect1 id="ms-manual.options" xreflabel="Massif Command-line Options">
646<title>Massif Command-line Options</title>
647
648<para>Massif-specific command-line options are:</para>
649
650<!-- start of xi:include in the manpage -->
651<variablelist id="ms.opts.list">
652
653  <varlistentry id="opt.heap" xreflabel="--heap">
654    <term>
655      <option><![CDATA[--heap=<yes|no> [default: yes] ]]></option>
656    </term>
657    <listitem>
658      <para>Specifies whether heap profiling should be done.</para>
659    </listitem>
660  </varlistentry>
661
662  <varlistentry id="opt.heap-admin" xreflabel="--heap-admin">
663    <term>
664      <option><![CDATA[--heap-admin=<size> [default: 8] ]]></option>
665    </term>
666    <listitem>
667      <para>If heap profiling is enabled, gives the number of administrative
668      bytes per block to use.  This should be an estimate of the average,
669      since it may vary.  For example, the allocator used by
      glibc on Linux requires somewhere between 4 and
671      15 bytes per block, depending on various factors.  That allocator also
672      requires admin space for freed blocks, but Massif cannot
673      account for this.</para>
674    </listitem>
675  </varlistentry>
676
677  <varlistentry id="opt.stacks" xreflabel="--stacks">
678    <term>
679      <option><![CDATA[--stacks=<yes|no> [default: no] ]]></option>
680    </term>
681    <listitem>
      <para>Specifies whether stack profiling should be done.  This option
      slows Massif down greatly, and so is off by default.  Note that Massif
      assumes that the main stack has size zero at start-up.  This is not
      true, but measuring the real start-up size accurately is difficult.
      Furthermore, starting at zero better indicates the size of the part of
      the main stack that a user program actually has control over.</para>
688    </listitem>
689  </varlistentry>
690
691  <varlistentry id="opt.pages-as-heap" xreflabel="--pages-as-heap">
692    <term>
693      <option><![CDATA[--pages-as-heap=<yes|no> [default: no] ]]></option>
694    </term>
695    <listitem>
696      <para>Tells Massif to profile memory at the page level rather
697        than at the malloc'd block level.  See above for details.
698      </para>
699    </listitem>
700  </varlistentry>
701
702  <varlistentry id="opt.depth" xreflabel="--depth">
703    <term>
704      <option><![CDATA[--depth=<number> [default: 30] ]]></option>
705    </term>
706    <listitem>
707      <para>Maximum depth of the allocation trees recorded for detailed
708      snapshots.  Increasing it will make Massif run somewhat more slowly,
709      use more memory, and produce bigger output files.</para>
710    </listitem>
711  </varlistentry>
712
713  <varlistentry id="opt.alloc-fn" xreflabel="--alloc-fn">
714    <term>
715      <option><![CDATA[--alloc-fn=<name> ]]></option>
716    </term>
717    <listitem>
718      <para>Functions specified with this option will be treated as though
719      they were a heap allocation function such as
720      <function>malloc</function>.  This is useful for functions that are
721      wrappers to <function>malloc</function> or <function>new</function>,
722      which can fill up the allocation trees with uninteresting information.
723      This option can be specified multiple times on the command line, to
724      name multiple functions.</para>
725
726      <para>Note that the named function will only be treated this way if it is
727      the top entry in a stack trace, or just below another function treated
728      this way.  For example, if you have a function
729      <function>malloc1</function> that wraps <function>malloc</function>,
730      and <function>malloc2</function> that wraps
731      <function>malloc1</function>, just specifying
732      <option>--alloc-fn=malloc2</option> will have no effect.  You need to
733      specify <option>--alloc-fn=malloc1</option> as well.  This is a little
734      inconvenient, but the reason is that checking for allocation functions
735      is slow, and it saves a lot of time if Massif can stop looking through
736      the stack trace entries as soon as it finds one that doesn't match
737      rather than having to continue through all the entries.</para>
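      <para>As a rough illustration, the wrapper chain described above might
      look like the following C fragment.  The wrapper names are the ones
      used in the text;  <function>make_counters</function> is a
      hypothetical caller added for this sketch.  It assumes the wrappers
      are not inlined (for example, because the program is built without
      optimisation), so that each one appears as its own stack trace
      entry.</para>
<screen><![CDATA[
#include <stddef.h>
#include <stdlib.h>

/* malloc1 wraps malloc directly. */
void *malloc1(size_t n)
{
    return malloc(n);
}

/* malloc2 wraps malloc1, so it sits one entry further from malloc. */
void *malloc2(size_t n)
{
    return malloc1(n);
}

/* Hypothetical caller: allocates n zeroed counters through malloc2. */
int *make_counters(size_t n)
{
    if (n == 0 || n > (size_t)-1 / sizeof(int))
        return NULL;
    int *counters = malloc2(n * sizeof *counters);
    if (counters != NULL) {
        for (size_t i = 0; i < n; i++)
            counters[i] = 0;
    }
    return counters;
}
]]></screen>
      <para>If a program containing this fragment were profiled with both
      <option>--alloc-fn=malloc1</option> and
      <option>--alloc-fn=malloc2</option>, the allocation would be expected
      to be attributed to <function>make_counters</function> rather than to
      either wrapper.</para>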
738
739      <para>Note that C++ names are demangled.  Note also that overloaded
740      C++ names must be written in full.  Single quotes may be necessary to
741      prevent the shell from breaking them up.  For example:
742<screen><![CDATA[
743--alloc-fn='operator new(unsigned, std::nothrow_t const&)'
744]]></screen>
745      </para>
746      </listitem>
747  </varlistentry>
748
749  <varlistentry id="opt.ignore-fn" xreflabel="--ignore-fn">
750    <term>
751      <option><![CDATA[--ignore-fn=<name> ]]></option>
752    </term>
753    <listitem>
754      <para>Any direct heap allocation (i.e. a call to
      <function>malloc</function>, <function>new</function>, etc., or a call
756      to a function named by an <option>--alloc-fn</option>
757      option) that occurs in a function specified by this option will be
758      ignored.  This is mostly useful for testing purposes.  This option can
759      be specified multiple times on the command line, to name multiple
760      functions.
761      </para>
762
763      <para>Any <function>realloc</function> of an ignored block will
764      also be ignored, even if the <function>realloc</function> call does
765      not occur in an ignored function.  This avoids the possibility of
766      negative heap sizes if ignored blocks are shrunk with
767      <function>realloc</function>.
768      </para>
769
770      <para>The rules for writing C++ function names are the same as
771      for <option>--alloc-fn</option> above.
772      </para>
773      </listitem>
774  </varlistentry>
775
776  <varlistentry id="opt.threshold" xreflabel="--threshold">
777    <term>
778      <option><![CDATA[--threshold=<m.n> [default: 1.0] ]]></option>
779    </term>
780    <listitem>
781      <para>The significance threshold for heap allocations, as a
782      percentage of total memory size.  Allocation tree entries that account
783      for less than this will be aggregated.  Note that this should be
784      specified in tandem with ms_print's option of the same name.</para>
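      <para>To make the percentage concrete, the following C sketch folds
      together the entries that fall below the threshold.  It is only an
      illustration under simple assumptions: a flat array of entry sizes and
      a hypothetical function named
      <function>aggregate_below_threshold</function>.  It is not Massif's
      implementation, which works on allocation trees.</para>
<screen><![CDATA[
#include <stddef.h>

/* Sums the sizes of entries smaller than threshold_pct percent of
   total_bytes.  The number of entries folded together is stored in
   *n_aggregated. */
size_t aggregate_below_threshold(const size_t *sizes, size_t n,
                                 size_t total_bytes, double threshold_pct,
                                 size_t *n_aggregated)
{
    size_t aggregated_bytes = 0;
    *n_aggregated = 0;
    for (size_t i = 0; i < n; i++) {
        /* Compare size/total against threshold/100 without dividing. */
        if ((double)sizes[i] * 100.0 < threshold_pct * (double)total_bytes) {
            aggregated_bytes += sizes[i];
            (*n_aggregated)++;
        }
    }
    return aggregated_bytes;
}
]]></screen>
      <para>With entry sizes of 600, 390, 6 and 4 bytes out of a 1000-byte
      total and the default threshold of 1.0, the last two entries are below
      10 bytes each and would be folded into a single 10-byte
      aggregate.</para>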
785    </listitem>
786  </varlistentry>
787
788  <varlistentry id="opt.peak-inaccuracy" xreflabel="--peak-inaccuracy">
789    <term>
790      <option><![CDATA[--peak-inaccuracy=<m.n> [default: 1.0] ]]></option>
791    </term>
792    <listitem>
793      <para>Massif does not necessarily record the actual global memory
794      allocation peak;  by default it records a peak only when the global
795      memory allocation size exceeds the previous peak by at least 1.0%.
796      This is because there can be many local allocation peaks along the way,
797      and doing a detailed snapshot for every one would be expensive and
798      wasteful, as all but one of them will be later discarded.  This
799      inaccuracy can be changed (even to 0.0%) via this option, but Massif
800      will run drastically slower as the number approaches zero.</para>
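      <para>The rule can be sketched as a small C predicate.  This is an
      illustration of the description above, not Massif's source code;  the
      function name <function>is_significant_peak</function> and its
      parameters are hypothetical.</para>
<screen><![CDATA[
#include <stdbool.h>
#include <stddef.h>

/* Returns true if current_bytes exceeds prev_peak_bytes by at least
   inaccuracy_pct percent of prev_peak_bytes. */
bool is_significant_peak(size_t current_bytes, size_t prev_peak_bytes,
                         double inaccuracy_pct)
{
    if (current_bytes <= prev_peak_bytes)
        return false;
    double excess = (double)(current_bytes - prev_peak_bytes);
    return excess * 100.0 >= inaccuracy_pct * (double)prev_peak_bytes;
}
]]></screen>
      <para>Under this sketch, with the default value of 1.0 and a previous
      peak of 1000 bytes, a total of 1005 bytes would not count as a new
      peak, whereas a total of 1010 bytes would.</para>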
801    </listitem>
802  </varlistentry>
803
804  <varlistentry id="opt.time-unit" xreflabel="--time-unit">
805    <term>
806      <option><![CDATA[--time-unit=<i|ms|B> [default: i] ]]></option>
807    </term>
808    <listitem>
809      <para>The time unit used for the profiling.  There are three
810      possibilities: instructions executed (i), which is good for most
811      cases; real (wallclock) time (ms, i.e. milliseconds), which is
812      sometimes useful; and bytes allocated/deallocated on the heap and/or
813      stack (B), which is useful for very short-run programs, and for
814      testing purposes, because it is the most reproducible across different
      machines.</para>
    </listitem>
816  </varlistentry>
817
818  <varlistentry id="opt.detailed-freq" xreflabel="--detailed-freq">
819    <term>
820      <option><![CDATA[--detailed-freq=<n> [default: 10] ]]></option>
821    </term>
822    <listitem>
823      <para>Frequency of detailed snapshots.  With
824      <option>--detailed-freq=1</option>, every snapshot is
825      detailed.</para>
826    </listitem>
827  </varlistentry>
828
829  <varlistentry id="opt.max-snapshots" xreflabel="--max-snapshots">
830    <term>
831      <option><![CDATA[--max-snapshots=<n> [default: 100] ]]></option>
832    </term>
833    <listitem>
834      <para>The maximum number of snapshots recorded.  If set to N, for all
835      programs except very short-running ones, the final number of snapshots
836      will be between N/2 and N.</para>
837    </listitem>
838  </varlistentry>
839
840  <varlistentry id="opt.massif-out-file" xreflabel="--massif-out-file">
841    <term>
842      <option><![CDATA[--massif-out-file=<file> [default: massif.out.%p] ]]></option>
843    </term>
844    <listitem>
845      <para>Write the profile data to <computeroutput>file</computeroutput>
846      rather than to the default output file,
847      <computeroutput>massif.out.&lt;pid&gt;</computeroutput>.  The
848      <option>%p</option> and <option>%q</option> format specifiers can be
849      used to embed the process ID and/or the contents of an environment
850      variable in the name, as is the case for the core option
851      <option><xref linkend="opt.log-file"/></option>.
852      </para>
853    </listitem>
854  </varlistentry>
855
856</variablelist>
857<!-- end of xi:include in the manpage -->
858
859</sect1>
860
861<sect1 id="ms-manual.monitor-commands" xreflabel="Massif Monitor Commands">
862<title>Massif Monitor Commands</title>
863<para>The Massif tool provides monitor commands handled by the Valgrind
864gdbserver (see <xref linkend="manual-core-adv.gdbserver-commandhandling"/>).
865</para>
866
867<itemizedlist>
868  <listitem>
    <para><varname>snapshot [&lt;filename&gt;]</varname> takes a
    snapshot and saves it in the given &lt;filename&gt;
    (default massif.vgdb.out).
    </para>
  </listitem>
  <listitem>
    <para><varname>detailed_snapshot [&lt;filename&gt;]</varname>
    takes a detailed snapshot and saves it in the given
    &lt;filename&gt; (default massif.vgdb.out).
    </para>
  </listitem>
  <listitem>
    <para><varname>all_snapshots [&lt;filename&gt;]</varname>
    saves all snapshots captured so far in the given
    &lt;filename&gt; (default massif.vgdb.out).
    </para>
885  </listitem>
886</itemizedlist>
887</sect1>
888
889<sect1 id="ms-manual.clientreqs" xreflabel="Client requests">
890<title>Massif Client Requests</title>
891
892<para>Massif does not have a <filename>massif.h</filename> file, but it does
893implement two of the core client requests:
894<function>VALGRIND_MALLOCLIKE_BLOCK</function> and
895<function>VALGRIND_FREELIKE_BLOCK</function>;  they are described in
896<xref linkend="manual-core-adv.clientreq"/>.
897</para>
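
<para>As a hedged sketch of how these two requests might be used, the C
fragment below describes the blocks handed out by a small custom allocator, so
that Massif can count them as heap blocks.  The pool and the function names
<function>pool_alloc</function> and <function>pool_free</function> are
hypothetical.  The fragment assumes the macros come from
<filename>valgrind/valgrind.h</filename>;  the fallback no-op definitions are
an addition for this sketch, so that it still compiles where that header is
not installed.</para>

<screen><![CDATA[
#include <stddef.h>

/* Use the real macros when the Valgrind header is available; otherwise
   define no-ops so that the fragment still compiles. */
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif
#ifndef VALGRIND_MALLOCLIKE_BLOCK
#  define VALGRIND_MALLOCLIKE_BLOCK(addr, sizeB, rzB, is_zeroed) ((void)0)
#  define VALGRIND_FREELIKE_BLOCK(addr, rzB) ((void)0)
#endif

#define POOL_ALIGN _Alignof(max_align_t)

/* Hypothetical fixed-size pool that is carved up by a bump allocator. */
static _Alignas(max_align_t) unsigned char pool[4096];
static size_t pool_used = 0;

/* Hands out n bytes from the pool and reports the block to Valgrind. */
void *pool_alloc(size_t n)
{
    if (n > sizeof pool)
        return NULL;
    size_t rounded = (n + POOL_ALIGN - 1) / POOL_ALIGN * POOL_ALIGN;
    if (rounded > sizeof pool - pool_used)
        return NULL;
    void *p = pool + pool_used;
    pool_used += rounded;
    /* Arguments: address, size, red zone size, is the block zeroed. */
    VALGRIND_MALLOCLIKE_BLOCK(p, n, 0, 0);
    return p;
}

/* Reports the block as freed; this simple pool never reuses the space. */
void pool_free(void *p)
{
    if (p != NULL) {
        VALGRIND_FREELIKE_BLOCK(p, 0);
    }
}
]]></screen>

<para>The pool here is a static array rather than memory obtained from
<function>malloc</function>, which avoids counting the same bytes twice in
this sketch.</para>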
898
899</sect1>
900
901
902<sect1 id="ms-manual.ms_print-options" xreflabel="ms_print Command-line Options">
903<title>ms_print Command-line Options</title>
904
905<para>ms_print's options are:</para>
906
907<!-- start of xi:include in the manpage -->
908<variablelist id="ms_print.opts.list">
909
910  <varlistentry>
911    <term>
912      <option><![CDATA[-h --help ]]></option>
913    </term>
914    <listitem>
915      <para>Show the help message.</para>
916    </listitem>
917  </varlistentry>
918
919  <varlistentry>
920    <term>
921      <option><![CDATA[--version ]]></option>
922    </term>
923    <listitem>
924      <para>Show the version number.</para>
925    </listitem>
926  </varlistentry>
927
928  <varlistentry>
929    <term>
930      <option><![CDATA[--threshold=<m.n> [default: 1.0] ]]></option>
931    </term>
932    <listitem>
933      <para>Same as Massif's <option>--threshold</option> option, but
934      applied after profiling rather than during.</para>
935    </listitem>
936  </varlistentry>
937
938  <varlistentry>
939    <term>
      <option><![CDATA[--x=<4..1000> [default: 72] ]]></option>
941    </term>
942    <listitem>
943      <para>Width of the graph, in columns.</para>
944    </listitem>
945  </varlistentry>
946
947  <varlistentry>
948    <term>
949      <option><![CDATA[--y=<4..1000> [default: 20] ]]></option>
950    </term>
951    <listitem>
952      <para>Height of the graph, in rows.</para>
953    </listitem>
954  </varlistentry>
955
956</variablelist>
957
958</sect1>
959
960<sect1 id="ms-manual.fileformat" xreflabel="fileformat">
961<title>Massif's Output File Format</title>
962<para>Massif's file format is plain text (i.e. not binary) and deliberately
963easy to read for both humans and machines.  Nonetheless, the exact format
964is not described here.  This is because the format is currently very
965Massif-specific.  In the future we hope to make the format more general, and
966thus suitable for possible use with other tools.  Once this has been done,
967the format will be documented here.</para>
968
969</sect1>
970
971</chapter>
972