<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">


<chapter id="mc-manual" xreflabel="Memcheck: a memory error detector">
<title>Memcheck: a memory error detector</title>

<para>To use this tool, you may specify <option>--tool=memcheck</option>
on the Valgrind command line. You don't have to, though, since Memcheck
is the default tool.</para>


<sect1 id="mc-manual.overview" xreflabel="Overview">
<title>Overview</title>

<para>Memcheck is a memory error detector. It can detect the following
problems that are common in C and C++ programs.</para>

<itemizedlist>
  <listitem>
    <para>Accessing memory you shouldn't, e.g. overrunning and underrunning
    heap blocks, overrunning the top of the stack, and accessing memory after
    it has been freed.</para>
  </listitem>

  <listitem>
    <para>Using undefined values, i.e. values that have not been initialised,
    or that have been derived from other undefined values.</para>
  </listitem>

  <listitem>
    <para>Incorrect freeing of heap memory, such as double-freeing heap
    blocks, or mismatched use of
    <function>malloc</function>/<computeroutput>new</computeroutput>/<computeroutput>new[]</computeroutput>
    versus
    <function>free</function>/<computeroutput>delete</computeroutput>/<computeroutput>delete[]</computeroutput></para>
  </listitem>

  <listitem>
    <para>Overlapping <computeroutput>src</computeroutput> and
    <computeroutput>dst</computeroutput> pointers in
    <computeroutput>memcpy</computeroutput> and related
    functions.</para>
  </listitem>

  <listitem>
    <para>Passing a fishy (presumably negative) value to the
    <computeroutput>size</computeroutput> parameter of a memory
    allocation function.</para>
  </listitem>

  <listitem>
    <para>Memory leaks.</para>
  </listitem>
</itemizedlist>

<para>Problems like these can be
difficult to find by other means,
often remaining undetected for long periods, then causing occasional,
difficult-to-diagnose crashes.</para>

</sect1>



<sect1 id="mc-manual.errormsgs"
       xreflabel="Explanation of error messages from Memcheck">
<title>Explanation of error messages from Memcheck</title>

<para>Memcheck issues a range of error messages. This section presents a
quick summary of what error messages mean. The precise behaviour of the
error-checking machinery is described in <xref
linkend="mc-manual.machine"/>.</para>


<sect2 id="mc-manual.badrw"
       xreflabel="Illegal read / Illegal write errors">
<title>Illegal read / Illegal write errors</title>

<para>For example:</para>
<programlisting><![CDATA[
Invalid read of size 4
   at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9)
   by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9)
   by 0x40B07FF4: read_png_image(QImageIO *) (kernel/qpngio.cpp:326)
   by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
 Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd
]]></programlisting>

<para>This happens when your program reads or writes memory at a place
which Memcheck reckons it shouldn't. In this example, the program did a
4-byte read at address 0xBFFFF0E0, somewhere within the system-supplied
library libpng.so.2.1.0.9, which was called from somewhere else in the
same library, called from line 326 of <filename>qpngio.cpp</filename>,
and so on.</para>

<para>Memcheck tries to establish what the illegal address might relate
to, since that's often useful. So, if it points into a block of memory
which has already been freed, you'll be informed of this, and also where
the block was freed. Likewise, if it should turn out to be just off
the end of a heap block, a common result of off-by-one errors in
array subscripting, you'll be informed of this fact, and also where the
block was allocated.
If you use the <option><xref
linkend="opt.read-var-info"/></option> option, Memcheck will run more slowly
but may give a more detailed description of any illegal address.</para>

<para>In this example, Memcheck can't identify the address. Actually
the address is on the stack, but, for some reason, this is not a valid
stack address -- it is below the stack pointer and that isn't allowed.
In this particular case it's probably caused by GCC generating invalid
code, a known bug in some ancient versions of GCC.</para>

<para>Note that Memcheck only tells you that your program is about to
access memory at an illegal address. It can't stop the access from
happening. So, if your program makes an access which normally would
result in a segmentation fault, your program will still suffer the same
fate -- but you will get a message from Memcheck immediately prior to
this. In this particular example, reading junk on the stack is
non-fatal, and the program stays alive.</para>

</sect2>



<sect2 id="mc-manual.uninitvals"
       xreflabel="Use of uninitialised values">
<title>Use of uninitialised values</title>

<para>For example:</para>
<programlisting><![CDATA[
Conditional jump or move depends on uninitialised value(s)
   at 0x402DFA94: _IO_vfprintf (_itoa.h:49)
   by 0x402E8476: _IO_printf (printf.c:36)
   by 0x8048472: main (tests/manuel1.c:8)
]]></programlisting>

<para>An uninitialised-value use error is reported when your program
uses a value which hasn't been initialised -- in other words, is
undefined. Here, the undefined value is used somewhere inside the
<function>printf</function> machinery of the C library.
This error was
reported when running the following small program:</para>
<programlisting><![CDATA[
#include <stdio.h>

int main(void)
{
  int x;                  /* deliberately left uninitialised */
  printf ("x = %d\n", x);
  return 0;
}]]></programlisting>

<para>It is important to understand that your program can copy around
junk (uninitialised) data as much as it likes. Memcheck observes this
and keeps track of the data, but does not complain. A complaint is
issued only when your program attempts to make use of uninitialised
data in a way that might affect your program's externally-visible behaviour.
In this example, <varname>x</varname> is uninitialised. Memcheck observes
the value being passed to <function>_IO_printf</function> and thence to
<function>_IO_vfprintf</function>, but makes no comment. However,
<function>_IO_vfprintf</function> has to examine the value of
<varname>x</varname> so it can turn it into the corresponding ASCII string,
and it is at this point that Memcheck complains.</para>

<para>Sources of uninitialised data tend to be:</para>
<itemizedlist>
  <listitem>
    <para>Local variables in procedures which have not been initialised,
    as in the example above.</para>
  </listitem>
  <listitem>
    <para>The contents of heap blocks (allocated with
    <function>malloc</function>, <function>new</function>, or a similar
    function) before you (or a constructor) write something there.
    </para>
  </listitem>
</itemizedlist>

<para>To see information on the sources of uninitialised data in your
program, use the <option>--track-origins=yes</option> option.
This
makes Memcheck run more slowly, but can make it much easier to track down
the root causes of uninitialised value errors.</para>

</sect2>



<sect2 id="mc-manual.bad-syscall-args"
       xreflabel="Use of uninitialised or unaddressable values in system
       calls">
<title>Use of uninitialised or unaddressable values in system
       calls</title>

<para>Memcheck checks all parameters to system calls:
<itemizedlist>
  <listitem>
    <para>It checks all the direct parameters themselves, whether they are
    initialised.</para>
  </listitem>
  <listitem>
    <para>Also, if a system call needs to read from a buffer provided by
    your program, Memcheck checks that the entire buffer is addressable
    and its contents are initialised.</para>
  </listitem>
  <listitem>
    <para>Also, if the system call needs to write to a user-supplied
    buffer, Memcheck checks that the buffer is addressable.</para>
  </listitem>
</itemizedlist>
</para>

<para>After the system call, Memcheck updates its tracked information to
precisely reflect any changes in memory state caused by the system
call.</para>

<para>Here's an example of two system calls with invalid parameters:</para>
<programlisting><![CDATA[
  #include <stdlib.h>
  #include <unistd.h>
  int main( void )
  {
    char* arr  = malloc(10);
    int*  arr2 = malloc(sizeof(int));
    write( 1 /* stdout */, arr, 10 );
    exit(arr2[0]);
  }
]]></programlisting>

<para>You get these complaints ...</para>
<programlisting><![CDATA[
  Syscall param write(buf) points to uninitialised byte(s)
     at 0x25A48723: __write_nocancel (in /lib/tls/libc-2.3.3.so)
     by 0x259AFAD3: __libc_start_main (in /lib/tls/libc-2.3.3.so)
     by 0x8048348: (within /auto/homes/njn25/grind/head4/a.out)
   Address 0x25AB8028 is 0 bytes inside a block of size 10 alloc'd
     at 0x259852B0: malloc (vg_replace_malloc.c:130)
     by 0x80483F1: main (a.c:5)

  Syscall param exit(error_code) contains uninitialised byte(s)
     at 0x25A21B44: __GI__exit (in /lib/tls/libc-2.3.3.so)
     by 0x8048426: main (a.c:8)
]]></programlisting>

<para>... because the program has (a) written uninitialised junk
from the heap block to the standard output, and (b) passed an
uninitialised value to <function>exit</function>. Note that the first
error refers to the memory pointed to by
<computeroutput>buf</computeroutput> (not
<computeroutput>buf</computeroutput> itself), but the second error
refers directly to <computeroutput>exit</computeroutput>'s argument
<computeroutput>arr2[0]</computeroutput>.</para>

</sect2>


<sect2 id="mc-manual.badfrees" xreflabel="Illegal frees">
<title>Illegal frees</title>

<para>For example:</para>
<programlisting><![CDATA[
Invalid free()
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
 Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
]]></programlisting>

<para>Memcheck keeps track of the blocks allocated by your program
with <function>malloc</function>/<computeroutput>new</computeroutput>,
so it knows exactly whether the argument to
<function>free</function>/<computeroutput>delete</computeroutput> is
legitimate. Here, this test program has freed the same block
twice. As with the illegal read/write errors, Memcheck attempts to
make sense of the address freed. If, as here, the address is one
which has previously been freed, you will be told that -- making
duplicate frees of the same block easy to spot.
You will also get this
message if you try to free a pointer that doesn't point to the start of a
heap block.</para>

</sect2>


<sect2 id="mc-manual.rudefn"
       xreflabel="When a heap block is freed with an inappropriate deallocation
function">
<title>When a heap block is freed with an inappropriate deallocation
function</title>

<para>In the following example, a block allocated with
<function>new[]</function> has wrongly been deallocated with
<function>free</function>:</para>
<programlisting><![CDATA[
Mismatched free() / delete / delete []
   at 0x40043249: free (vg_clientfuncs.c:171)
   by 0x4102BB4E: QGArray::~QGArray(void) (tools/qgarray.cpp:149)
   by 0x4C261C41: PptDoc::~PptDoc(void) (include/qmemarray.h:60)
   by 0x4C261F0E: PptXml::~PptXml(void) (pptxml.cc:44)
 Address 0x4BB292A8 is 0 bytes inside a block of size 64 alloc'd
   at 0x4004318C: operator new[](unsigned int) (vg_clientfuncs.c:152)
   by 0x4C21BC15: KLaola::readSBStream(int) const (klaola.cc:314)
   by 0x4C21C155: KLaola::stream(KLaola::OLENode const *) (klaola.cc:416)
   by 0x4C21788F: OLEFilter::convert(QCString const &) (olefilter.cc:272)
]]></programlisting>

<para>In <literal>C++</literal> it's important to deallocate memory in a
way compatible with how it was allocated.
The deal is:</para>
<itemizedlist>
  <listitem>
    <para>If allocated with
    <function>malloc</function>,
    <function>calloc</function>,
    <function>realloc</function>,
    <function>valloc</function> or
    <function>memalign</function>, you must
    deallocate with <function>free</function>.</para>
  </listitem>
  <listitem>
    <para>If allocated with <function>new</function>, you must deallocate
    with <function>delete</function>.</para>
  </listitem>
  <listitem>
    <para>If allocated with <function>new[]</function>, you must
    deallocate with <function>delete[]</function>.</para>
  </listitem>
</itemizedlist>

<para>The worst thing is that on Linux apparently it doesn't matter if
you do mix these up, but the same program may then crash on a
different platform, Solaris for example. So it's best to fix it
properly. According to the KDE folks "it's amazing how many C++
programmers don't know this".</para>

<para>The reason behind the requirement is as follows. In some C++
implementations, <function>delete[]</function> must be used for
objects allocated by <function>new[]</function> because the compiler
stores the size of the array and the pointer-to-member to the
destructor of the array's content just before the pointer actually
returned. <function>delete</function> doesn't account for this and will get
confused, possibly corrupting the heap.</para>

</sect2>



<sect2 id="mc-manual.overlap"
       xreflabel="Overlapping source and destination blocks">
<title>Overlapping source and destination blocks</title>

<para>The following C library functions copy some data from one
memory block to another (or something similar):
<function>memcpy</function>,
<function>strcpy</function>,
<function>strncpy</function>,
<function>strcat</function>,
<function>strncat</function>.
The blocks pointed to by their <computeroutput>src</computeroutput> and
<computeroutput>dst</computeroutput> pointers aren't allowed to overlap.
The POSIX standards have wording along the lines of "If copying takes place
between objects that overlap, the behavior is undefined." Therefore,
Memcheck checks for this.
</para>

<para>For example:</para>
<programlisting><![CDATA[
==27492== Source and destination overlap in memcpy(0xbffff294, 0xbffff280, 21)
==27492==    at 0x40026CDC: memcpy (mc_replace_strmem.c:71)
==27492==    by 0x804865A: main (overlap.c:40)
]]></programlisting>

<para>You don't want the two blocks to overlap because one of them could
get partially overwritten by the copying.</para>

<para>You might think that Memcheck is being overly pedantic reporting
this in the case where <computeroutput>dst</computeroutput> is less than
<computeroutput>src</computeroutput>. For example, the obvious way to
implement <function>memcpy</function> is by copying from the first
byte to the last. However, the optimisation guides of some
architectures recommend copying from the last byte down to the first.
Also, some implementations of <function>memcpy</function> zero
<computeroutput>dst</computeroutput> before copying, because zeroing the
destination's cache line(s) can improve performance.</para>

<para>The moral of the story is: if you want to write truly portable
code, don't make any assumptions about the language
implementation.</para>

</sect2>


<sect2 id="mc-manual.fishyvalue"
       xreflabel="Fishy argument values">
<title>Fishy argument values</title>

<para>All memory allocation functions take an argument specifying the
size of the memory block that should be allocated. Clearly, the requested
size should be a non-negative value and is typically not excessively large.
For instance, it is extremely unlikely that the size of an allocation
request exceeds 2**63 bytes on a 64-bit machine. It is much more likely that
such a value is the result of an erroneous size calculation and is in effect
a negative value (that just happens to appear excessively large because
the bit pattern is interpreted as an unsigned integer).
Such a value is called a "fishy value".

The <varname>size</varname> argument of the following allocation functions
is checked for being fishy:
<function>malloc</function>,
<function>calloc</function>,
<function>realloc</function>,
<function>memalign</function>,
<function>new</function>,
<function>new []</function>,
<function>__builtin_new</function>,
<function>__builtin_vec_new</function>.
For <function>calloc</function> both arguments are checked.
</para>

<para>For example:</para>
<programlisting><![CDATA[
==32233== Argument 'size' of function malloc has a fishy (possibly negative) value: -3
==32233==    at 0x4C2CFA7: malloc (vg_replace_malloc.c:298)
==32233==    by 0x400555: foo (fishy.c:15)
==32233==    by 0x400583: main (fishy.c:23)
]]></programlisting>

<para>In earlier Valgrind versions those values were referred to
as "silly arguments" and no back-trace was included.
</para>

</sect2>


<sect2 id="mc-manual.leaks" xreflabel="Memory leak detection">
<title>Memory leak detection</title>

<para>Memcheck keeps track of all heap blocks issued in response to
calls to
<function>malloc</function>/<function>new</function> et al.
So when the program exits, it knows which blocks have not been freed.
</para>

<para>If <option>--leak-check</option> is set appropriately, for each
remaining block, Memcheck determines if the block is reachable from pointers
within the root-set.
The root-set consists of (a) general purpose registers
of all threads, and (b) initialised, aligned, pointer-sized data words in
accessible client memory, including stacks.</para>

<para>There are two ways a block can be reached. The first is with a
"start-pointer", i.e. a pointer to the start of the block. The second is with
an "interior-pointer", i.e. a pointer to the middle of the block. There are
several ways we know of that an interior-pointer can occur:</para>

<itemizedlist>
  <listitem>
    <para>The pointer might have originally been a start-pointer and have been
    moved along deliberately (or not deliberately) by the program. In
    particular, this can happen if your program uses tagged pointers, i.e.
    if it uses the bottom one, two or three bits of a pointer, which are
    normally always zero due to alignment, in order to store extra
    information.</para>
  </listitem>

  <listitem>
    <para>It might be a random junk value in memory, entirely unrelated, just
    a coincidence.</para>
  </listitem>

  <listitem>
    <para>It might be a pointer to the inner char array of a C++
    <computeroutput>std::string</computeroutput>. For example, some
    compilers add 3 words at the beginning of the std::string to
    store the length, the capacity and a reference count before the
    memory containing the array of characters. They return a pointer
    just after these 3 words, pointing at the char array.</para>
  </listitem>

  <listitem>
    <para>Some code might allocate a block of memory, and use the first 8
    bytes to store (block size - 8) as a 64-bit number.
    <computeroutput>sqlite3MemMalloc</computeroutput> does this.</para>
  </listitem>

  <listitem>
    <para>It might be a pointer to an array of C++ objects (which possess
    destructors) allocated with <computeroutput>new[]</computeroutput>.
    In
    this case, some compilers store a "magic cookie" containing the array
    length at the start of the allocated block, and return a pointer to just
    past that magic cookie, i.e. an interior-pointer.
    See <ulink url="http://theory.uwinnipeg.ca/gnu/gcc/gxxint_14.html">this
    page</ulink> for more information.</para>
  </listitem>

  <listitem>
    <para>It might be a pointer to an inner part of a C++ object using
    multiple inheritance. </para>
  </listitem>
</itemizedlist>

<para>You can optionally activate heuristics to use during the leak
search to detect the interior pointers corresponding to
the <computeroutput>stdstring</computeroutput>,
<computeroutput>length64</computeroutput>,
<computeroutput>newarray</computeroutput>
and <computeroutput>multipleinheritance</computeroutput> cases. If the
heuristic detects that an interior pointer corresponds to such a case,
the block will be considered as reachable by the interior
pointer.
In other words, the interior pointer will be treated
as if it were a start pointer.</para>


<para>With that in mind, consider the nine possible cases described by the
following figure.</para>

<programlisting><![CDATA[
     Pointer chain            AAA Leak Case   BBB Leak Case
     -------------            -------------   -------------
(1)  RRR ------------> BBB                    DR
(2)  RRR ---> AAA ---> BBB    DR              IR
(3)  RRR               BBB                    DL
(4)  RRR      AAA ---> BBB    DL              IL
(5)  RRR ------?-----> BBB                    (y)DR, (n)DL
(6)  RRR ---> AAA -?-> BBB    DR              (y)IR, (n)DL
(7)  RRR -?-> AAA ---> BBB    (y)DR, (n)DL    (y)IR, (n)IL
(8)  RRR -?-> AAA -?-> BBB    (y)DR, (n)DL    (y,y)IR, (n,y)IL, (_,n)DL
(9)  RRR      AAA -?-> BBB    DL              (y)IL, (n)DL

Pointer chain legend:
- RRR: a root set node or DR block
- AAA, BBB: heap blocks
- --->: a start-pointer
- -?->: an interior-pointer

Leak Case legend:
- DR: Directly reachable
- IR: Indirectly reachable
- DL: Directly lost
- IL: Indirectly lost
- (y)XY: it's XY if the interior-pointer is a real pointer
- (n)XY: it's XY if the interior-pointer is not a real pointer
- (_)XY: it's XY in either case
]]></programlisting>

<para>Every possible case can be reduced to one of the above nine. Memcheck
merges some of these cases in its output, resulting in the following four
leak kinds.</para>


<itemizedlist>

  <listitem>
    <para>"Still reachable". This covers cases 1 and 2 (for the BBB blocks)
    above. A start-pointer or chain of start-pointers to the block is
    found. Since the block is still pointed at, the programmer could, at
    least in principle, have freed it before program exit. "Still reachable"
    blocks are very common and arguably not a problem. So, by default,
    Memcheck won't report such blocks individually.</para>
  </listitem>

  <listitem>
    <para>"Definitely lost". This covers case 3 (for the BBB blocks) above.
    This means that no pointer to the block can be found.
    The block is
    classified as "lost", because the programmer could not possibly have
    freed it at program exit, since no pointer to it exists. This is likely
    a symptom of having lost the pointer at some earlier point in the
    program. Such cases should be fixed by the programmer.</para>
  </listitem>

  <listitem>
    <para>"Indirectly lost". This covers cases 4 and 9 (for the BBB blocks)
    above. This means that the block is lost, not because there are no
    pointers to it, but rather because all the blocks that point to it are
    themselves lost. For example, if you have a binary tree and the root
    node is lost, all its children nodes will be indirectly lost. Because
    the problem will disappear if the definitely lost block that caused the
    indirect leak is fixed, Memcheck won't report such blocks individually
    by default.</para>
  </listitem>

  <listitem>
    <para>"Possibly lost". This covers cases 5--8 (for the BBB blocks)
    above. This means that a chain of one or more pointers to the block has
    been found, but at least one of the pointers is an interior-pointer.
    This could just be a random value in memory that happens to point into a
    block, and so you shouldn't consider this ok unless you know you have
    interior-pointers.</para>
  </listitem>

</itemizedlist>

<para>(Note: This mapping of the nine possible cases onto four leak kinds is
not necessarily the best way that leaks could be reported; in particular,
interior-pointers are treated inconsistently. It is possible the
categorisation may be improved in the future.)</para>

<para>Furthermore, if a suppression exists for a block, it will be reported
as "suppressed" no matter which of the above four kinds it belongs
to.</para>


<para>The following is an example leak summary.</para>

<programlisting><![CDATA[
LEAK SUMMARY:
   definitely lost: 48 bytes in 3 blocks.
   indirectly lost: 32 bytes in 2 blocks.
     possibly lost: 96 bytes in 6 blocks.
   still reachable: 64 bytes in 4 blocks.
        suppressed: 0 bytes in 0 blocks.
]]></programlisting>

<para>If heuristics have been used to consider some blocks as
reachable, the leak summary details the heuristically reachable subset
of 'still reachable:' per heuristic. In the example below, of the 95
bytes still reachable, 87 bytes (56+7+8+16) have been considered
heuristically reachable.
</para>

<programlisting><![CDATA[
LEAK SUMMARY:
   definitely lost: 4 bytes in 1 blocks
   indirectly lost: 0 bytes in 0 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 95 bytes in 6 blocks
                      of which reachable via heuristic:
                        stdstring          : 56 bytes in 2 blocks
                        length64           : 16 bytes in 1 blocks
                        newarray           : 7 bytes in 1 blocks
                        multipleinheritance: 8 bytes in 1 blocks
        suppressed: 0 bytes in 0 blocks
]]></programlisting>

<para>If <option>--leak-check=full</option> is specified,
Memcheck will give details for each definitely lost or possibly lost block,
including where it was allocated. (Actually, it merges results for all
blocks that have the same leak kind and sufficiently similar stack traces
into a single "loss record". The
<option>--leak-resolution</option> option lets you control the
meaning of "sufficiently similar".) It cannot tell you when or how or why
the pointer to a leaked block was lost; you have to work that out for
yourself. In general, you should attempt to ensure your programs do not
have any definitely lost or possibly lost blocks at exit.</para>

<para>For example:</para>
<programlisting><![CDATA[
8 bytes in 1 blocks are definitely lost in loss record 1 of 14
   at 0x........: malloc (vg_replace_malloc.c:...)
   by 0x........: mk (leak-tree.c:11)
   by 0x........: main (leak-tree.c:39)

88 (8 direct, 80 indirect) bytes in 1 blocks are definitely lost in loss record 13 of 14
   at 0x........: malloc (vg_replace_malloc.c:...)
   by 0x........: mk (leak-tree.c:11)
   by 0x........: main (leak-tree.c:25)
]]></programlisting>

<para>The first message describes a simple case of a single 8 byte block
that has been definitely lost. The second case mentions another 8 byte
block that has been definitely lost; the difference is that a further 80
bytes in other blocks are indirectly lost because of this lost block.
The loss records are not presented in any notable order, so the loss record
numbers aren't particularly meaningful. The loss record numbers can be used
in the Valgrind gdbserver to list the addresses of the leaked blocks and/or give
more details about how a block is still reachable.</para>

<para>The option <option>--show-leak-kinds=&lt;set&gt;</option>
controls the set of leak kinds to show
when <option>--leak-check=full</option> is specified. </para>

<para>The <option>&lt;set&gt;</option> of leak kinds is specified
in one of the following ways:

<itemizedlist>
  <listitem><para>a comma-separated list of one or more of
    <option>definite indirect possible reachable</option>.</para>
  </listitem>

  <listitem><para><option>all</option> to specify the complete set (all leak kinds).</para>
  </listitem>

  <listitem><para><option>none</option> for the empty set.</para>
  </listitem>
</itemizedlist>

</para>

<para> The default value for the leak kinds to show is
  <option>--show-leak-kinds=definite,possible</option>.
</para>

<para>To also show the reachable and indirectly lost blocks in
addition to the definitely and possibly lost blocks, you can
use <option>--show-leak-kinds=all</option>.
To only show the
reachable and indirectly lost blocks, use
<option>--show-leak-kinds=indirect,reachable</option>. The reachable
and indirectly lost blocks will then be presented as shown in
the following two examples.</para>

<programlisting><![CDATA[
64 bytes in 4 blocks are still reachable in loss record 2 of 4
   at 0x........: malloc (vg_replace_malloc.c:177)
   by 0x........: mk (leak-cases.c:52)
   by 0x........: main (leak-cases.c:74)

32 bytes in 2 blocks are indirectly lost in loss record 1 of 4
   at 0x........: malloc (vg_replace_malloc.c:177)
   by 0x........: mk (leak-cases.c:52)
   by 0x........: main (leak-cases.c:80)
]]></programlisting>

<para>Because there are different kinds of leaks with different
severities, an interesting question is: which leaks should be
counted as true "errors" and which should not?
</para>

<para> The answer to this question affects the numbers printed in
the <computeroutput>ERROR SUMMARY</computeroutput> line, and also the
effect of the <option>--error-exitcode</option> option. First, a leak
is only counted as a true "error"
if <option>--leak-check=full</option> is specified. Then, the
option <option>--errors-for-leak-kinds=&lt;set&gt;</option> controls
the set of leak kinds to consider as errors. The default value
is <option>--errors-for-leak-kinds=definite,possible</option>.
</para>

</sect2>

</sect1>



<sect1 id="mc-manual.options"
       xreflabel="Memcheck Command-Line Options">
<title>Memcheck Command-Line Options</title>

<!-- start of xi:include in the manpage -->
<variablelist id="mc.opts.list">

  <varlistentry id="opt.leak-check" xreflabel="--leak-check">
    <term>
      <option><![CDATA[--leak-check=<no|summary|yes|full> [default: summary] ]]></option>
    </term>
    <listitem>
      <para>When enabled, search for memory leaks when the client
      program finishes.
      If set to <varname>summary</varname>, it says how
      many leaks occurred. If set to <varname>full</varname> or
      <varname>yes</varname>, each individual leak will be shown
      in detail and/or counted as an error, as specified by the options
      <option>--show-leak-kinds</option> and
      <option>--errors-for-leak-kinds</option>. </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.leak-resolution" xreflabel="--leak-resolution">
    <term>
      <option><![CDATA[--leak-resolution=<low|med|high> [default: high] ]]></option>
    </term>
    <listitem>
      <para>When doing leak checking, determines how willing
      Memcheck is to consider different backtraces to
      be the same for the purposes of merging multiple leaks into a single
      leak report. When set to <varname>low</varname>, only the first
      two entries need match. When <varname>med</varname>, four entries
      have to match. When <varname>high</varname>, all entries need to
      match.</para>

      <para>For hardcore leak debugging, you probably want to use
      <option>--leak-resolution=high</option> together with
      <option>--num-callers=40</option> or some such large number.
      </para>

      <para>Note that the <option>--leak-resolution</option> setting
      does not affect Memcheck's ability to find
      leaks. It only changes how the results are presented.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.show-leak-kinds" xreflabel="--show-leak-kinds">
    <term>
      <option><![CDATA[--show-leak-kinds=<set> [default: definite,possible] ]]></option>
    </term>
    <listitem>
      <para>Specifies the leak kinds to show in a <varname>full</varname>
      leak search, in one of the following ways: </para>

      <itemizedlist>
        <listitem><para>a comma separated list of one or more of
            <option>definite indirect possible reachable</option>.</para>
        </listitem>

        <listitem><para><option>all</option> to specify the complete set (all leak kinds).
        It is equivalent to
        <option>--show-leak-kinds=definite,indirect,possible,reachable</option>.</para>
        </listitem>

        <listitem><para><option>none</option> for the empty set.</para>
        </listitem>
      </itemizedlist>
    </listitem>
  </varlistentry>


  <varlistentry id="opt.errors-for-leak-kinds" xreflabel="--errors-for-leak-kinds">
    <term>
      <option><![CDATA[--errors-for-leak-kinds=<set> [default: definite,possible] ]]></option>
    </term>
    <listitem>
      <para>Specifies the leak kinds to count as errors in a
      <varname>full</varname> leak search. The
      <option><![CDATA[<set>]]></option> is specified in the same way as for
      <option>--show-leak-kinds</option>.
      </para>
    </listitem>
  </varlistentry>


  <varlistentry id="opt.leak-check-heuristics" xreflabel="--leak-check-heuristics">
    <term>
      <option><![CDATA[--leak-check-heuristics=<set> [default: none] ]]></option>
    </term>
    <listitem>
      <para>Specifies the set of leak check heuristics to be used
      during leak searches. The heuristics control which interior pointers
      to a block cause it to be considered as reachable.
      The heuristic set is specified in one of the following ways:</para>

      <itemizedlist>
        <listitem><para>a comma-separated list of one or more of
        <option>stdstring length64 newarray multipleinheritance</option>.</para>
        </listitem>

        <listitem><para><option>all</option> to activate the complete set of
        heuristics.
        It is equivalent to
        <option>--leak-check-heuristics=stdstring,length64,newarray,multipleinheritance</option>.</para>
        </listitem>

        <listitem><para><option>none</option> for the empty set.</para>
        </listitem>
      </itemizedlist>

      <para>Note that these heuristics are dependent on the layout of the objects
      produced by the C++ compiler. They have been tested with some gcc versions
      (e.g. 4.4 and 4.7). They might not work properly with other C++ compilers.
      </para>
    </listitem>
  </varlistentry>


  <varlistentry id="opt.show-reachable" xreflabel="--show-reachable">
    <term>
      <option><![CDATA[--show-reachable=<yes|no> ]]></option>
    </term>
    <term>
      <option><![CDATA[--show-possibly-lost=<yes|no> ]]></option>
    </term>
    <listitem>
      <para>These options provide an alternative way to specify the leak kinds to show:
      </para>
      <itemizedlist>
        <listitem>
          <para>
            <option>--show-reachable=no --show-possibly-lost=yes</option> is equivalent to
            <option>--show-leak-kinds=definite,possible</option>.
          </para>
        </listitem>
        <listitem>
          <para>
            <option>--show-reachable=no --show-possibly-lost=no</option> is equivalent to
            <option>--show-leak-kinds=definite</option>.
          </para>
        </listitem>
        <listitem>
          <para>
            <option>--show-reachable=yes</option> is equivalent to
            <option>--show-leak-kinds=all</option>.
          </para>
        </listitem>
      </itemizedlist>
      <para>Note that <option>--show-possibly-lost=no</option> has no effect
      if <option>--show-reachable=yes</option> is specified.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.undef-value-errors" xreflabel="--undef-value-errors">
    <term>
      <option><![CDATA[--undef-value-errors=<yes|no> [default: yes] ]]></option>
    </term>
    <listitem>
      <para>Controls whether Memcheck reports
      undefined value errors. Set this to
      <varname>no</varname> if you don't want to see undefined value
      errors. It also has the side effect of speeding up
      Memcheck somewhat.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.track-origins" xreflabel="--track-origins">
    <term>
      <option><![CDATA[--track-origins=<yes|no> [default: no] ]]></option>
    </term>
    <listitem>
      <para>Controls whether Memcheck tracks
      the origin of uninitialised values.
      By default, it does not,
      which means that although it can tell you that an
      uninitialised value is being used in a dangerous way, it
      cannot tell you where the uninitialised value came from. This
      often makes it difficult to track down the root problem.
      </para>
      <para>When set
      to <varname>yes</varname>, Memcheck keeps
      track of the origins of all uninitialised values. Then, when
      an uninitialised value error is
      reported, Memcheck will try to show the
      origin of the value. An origin can be one of the following
      four places: a heap block, a stack allocation, a client
      request, or miscellaneous other sources (e.g., a call
      to <varname>brk</varname>).
      </para>
      <para>For uninitialised values originating from a heap
      block, Memcheck shows where the block was
      allocated. For uninitialised values originating from a stack
      allocation, Memcheck can tell you which
      function allocated the value, but no more than that -- typically
      it shows you the source location of the opening brace of the
      function. So you should carefully check that all of the
      function's local variables are initialised properly.
      </para>
      <para>Performance overhead: origin tracking is expensive. It
      halves Memcheck's speed and increases
      memory use by a minimum of 100MB, and possibly more.
      Nevertheless it can drastically reduce the effort required to
      identify the root cause of uninitialised value errors, and so
      is often a programmer productivity win, despite running
      more slowly.
      </para>
      <para>Accuracy: Memcheck tracks origins
      quite accurately. To avoid very large space and time
      overheads, some approximations are made. It is possible,
      although unlikely, that Memcheck will report an incorrect origin, or
      not be able to identify any origin.
      </para>
      <para>Note that the combination
      <option>--track-origins=yes</option>
      and <option>--undef-value-errors=no</option> is
      nonsensical. Memcheck checks for and
      rejects this combination at startup.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.partial-loads-ok" xreflabel="--partial-loads-ok">
    <term>
      <option><![CDATA[--partial-loads-ok=<yes|no> [default: no] ]]></option>
    </term>
    <listitem>
      <para>Controls how Memcheck handles 32-, 64-, 128- and 256-bit
      naturally aligned loads from addresses for which some bytes are
      addressable and others are not. When <varname>yes</varname>, such
      loads do not produce an address error. Instead, loaded bytes
      originating from illegal addresses are marked as uninitialised, and
      those corresponding to legal addresses are handled in the normal
      way.</para>

      <para>When <varname>no</varname>, loads from partially invalid
      addresses are treated the same as loads from completely invalid
      addresses: an illegal-address error is issued, and the resulting
      bytes are marked as initialised.</para>

      <para>Note that code that behaves in this way is in violation of
      the ISO C/C++ standards, and should be considered broken. If
      at all possible, such code should be fixed. This option should be
      used only as a last resort.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.keep-stacktraces" xreflabel="--keep-stacktraces">
    <term>
      <option><![CDATA[--keep-stacktraces=alloc|free|alloc-and-free|alloc-then-free|none [default: alloc-then-free] ]]></option>
    </term>
    <listitem>
      <para>Controls which stack trace(s) to keep for malloc'd and/or
      free'd blocks.
      </para>

      <para>With <varname>alloc-then-free</varname>, a stack trace is
      recorded at allocation time, and is associated with the block.
      When the block is freed, a second stack trace is recorded, and
      this replaces the allocation stack trace. As a result, any "use
      after free" errors relating to this block can only show a stack
      trace for where the block was freed.
      </para>

      <para>With <varname>alloc-and-free</varname>, both the allocation
      and the deallocation stack traces for the block are stored.
      Hence a "use after free" error will
      show both, which may make the error easier to diagnose.
      Compared to <varname>alloc-then-free</varname>, this setting
      slightly increases Valgrind's memory use as the block contains two
      references instead of one.
      </para>

      <para>With <varname>alloc</varname>, only the allocation stack
      trace is recorded (and reported). With <varname>free</varname>,
      only the deallocation stack trace is recorded (and reported).
      These values somewhat decrease Valgrind's memory and CPU usage.
      They can be useful depending on the error types you are
      searching for and the level of detail you need to analyse
      them. For example, if you are only interested in memory leak
      errors, it is sufficient to record the allocation stack traces.
      </para>

      <para>With <varname>none</varname>, no stack traces are recorded
      for malloc and free operations. If your program allocates a lot
      of blocks and/or allocates/frees from many different stack
      traces, this can significantly decrease the CPU time and/or memory
      required. Of course, few details will be reported for errors
      related to heap blocks.
      </para>

      <para>Note that once a stack trace is recorded, Valgrind keeps
      the stack trace in memory even if it is not referenced by any
      block. Some programs (for example, recursive algorithms) can
      generate a huge number of stack traces.
      If Valgrind uses too
      much memory in such circumstances, you can reduce the memory
      required with the option <option>--keep-stacktraces</option>
      and/or by using a smaller value for the
      option <option>--num-callers</option>.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.freelist-vol" xreflabel="--freelist-vol">
    <term>
      <option><![CDATA[--freelist-vol=<number> [default: 20000000] ]]></option>
    </term>
    <listitem>
      <para>When the client program releases memory using
      <function>free</function> (in <literal>C</literal>) or
      <computeroutput>delete</computeroutput>
      (<literal>C++</literal>), that memory is not immediately made
      available for re-allocation. Instead, it is marked inaccessible
      and placed in a queue of freed blocks. The purpose is to defer as
      long as possible the point at which freed-up memory comes back
      into circulation. This increases the chance that
      Memcheck will be able to detect invalid
      accesses to blocks for some significant period of time after they
      have been freed.</para>

      <para>This option specifies the maximum total size, in bytes, of the
      blocks in the queue. The default value is twenty million bytes.
      Increasing this increases the total amount of memory used by
      Memcheck but may detect invalid uses of freed
      blocks which would otherwise go undetected.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.freelist-big-blocks" xreflabel="--freelist-big-blocks">
    <term>
      <option><![CDATA[--freelist-big-blocks=<number> [default: 1000000] ]]></option>
    </term>
    <listitem>
      <para>When making blocks from the queue of freed blocks available
      for re-allocation, Memcheck gives priority to re-circulating the blocks
      with a size greater than or equal to <option>--freelist-big-blocks</option>.
      This ensures that freeing big blocks (in particular freeing blocks bigger than
      <option>--freelist-vol</option>) does not immediately lead to a re-circulation
      of all (or a lot of) the small blocks in the free list. In other words,
      this option increases the likelihood of discovering dangling pointers
      for the "small" blocks, even when big blocks are freed.</para>
      <para>Setting a value of 0 means that all the blocks are re-circulated
      in FIFO order.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.workaround-gcc296-bugs" xreflabel="--workaround-gcc296-bugs">
    <term>
      <option><![CDATA[--workaround-gcc296-bugs=<yes|no> [default: no] ]]></option>
    </term>
    <listitem>
      <para>When enabled, Memcheck assumes that reads and writes some small
      distance below the stack pointer are due to bugs in GCC 2.96, and
      does not report them. The "small distance" is 256 bytes by
      default. Note that GCC 2.96 is the default compiler on some ancient
      Linux distributions (RedHat 7.X) and so you may need to use this
      option. Do not use it if you do not have to, as it can cause real
      errors to be overlooked. A better alternative is to use a more
      recent GCC in which this bug is fixed.</para>

      <para>You may also need to use this option when working with
      GCC 3.X or 4.X on 32-bit PowerPC Linux. This is because
      GCC generates code which occasionally accesses below the
      stack pointer, particularly for floating-point to/from integer
      conversions.
      This is in violation of the 32-bit PowerPC ELF
      specification, which makes no provision for locations below the
      stack pointer to be accessible.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.show-mismatched-frees"
                xreflabel="--show-mismatched-frees">
    <term>
      <option><![CDATA[--show-mismatched-frees=<yes|no> [default: yes] ]]></option>
    </term>
    <listitem>
      <para>When enabled, Memcheck checks that heap blocks are
      deallocated using a function that matches the allocating
      function. That is, it expects <varname>free</varname> to be
      used to deallocate blocks allocated
      by <varname>malloc</varname>, <varname>delete</varname> for
      blocks allocated by <varname>new</varname>,
      and <varname>delete[]</varname> for blocks allocated
      by <varname>new[]</varname>. If a mismatch is detected, an
      error is reported. This is in general important because in some
      environments, freeing with a non-matching function can cause
      crashes.</para>

      <para>There is however a scenario where such mismatches cannot
      be avoided. That is when the user provides implementations of
      <varname>new</varname>/<varname>new[]</varname> that
      call <varname>malloc</varname> and
      of <varname>delete</varname>/<varname>delete[]</varname> that
      call <varname>free</varname>, and these functions are
      asymmetrically inlined. For example, imagine
      that <varname>delete[]</varname> is inlined
      but <varname>new[]</varname> is not. The result is that
      Memcheck "sees" all <varname>delete[]</varname> calls as direct
      calls to <varname>free</varname>, even when the program source
      contains no mismatched calls.</para>

      <para>This causes a lot of confusing and irrelevant error
      reports. <option>--show-mismatched-frees=no</option> disables
      these checks.
      It is not generally advisable to disable them,
      though, because you may miss real errors as a result.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.ignore-ranges" xreflabel="--ignore-ranges">
    <term>
      <option><![CDATA[--ignore-ranges=0xPP-0xQQ[,0xRR-0xSS] ]]></option>
    </term>
    <listitem>
      <para>Any ranges listed in this option (and multiple ranges can be
      specified, separated by commas) will be ignored by Memcheck's
      addressability checking.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.malloc-fill" xreflabel="--malloc-fill">
    <term>
      <option><![CDATA[--malloc-fill=<hexnumber> ]]></option>
    </term>
    <listitem>
      <para>Fills blocks allocated
      by <computeroutput>malloc</computeroutput>,
         <computeroutput>new</computeroutput>, etc, but not
      by <computeroutput>calloc</computeroutput>, with the specified
      byte. This can be useful when trying to shake out obscure
      memory corruption problems. The allocated area is still
      regarded by Memcheck as undefined -- this option only affects its
      contents. Note that <option>--malloc-fill</option> does not
      affect a block of memory when it is used as argument
      to client requests VALGRIND_MEMPOOL_ALLOC or
      VALGRIND_MALLOCLIKE_BLOCK.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.free-fill" xreflabel="--free-fill">
    <term>
      <option><![CDATA[--free-fill=<hexnumber> ]]></option>
    </term>
    <listitem>
      <para>Fills blocks freed
      by <computeroutput>free</computeroutput>,
         <computeroutput>delete</computeroutput>, etc, with the
      specified byte value. This can be useful when trying to shake out
      obscure memory corruption problems. The freed area is still
      regarded by Memcheck as not valid for access -- this option only
      affects its contents.
      Note that <option>--free-fill</option> does not
      affect a block of memory when it is used as argument to
      client requests VALGRIND_MEMPOOL_FREE or VALGRIND_FREELIKE_BLOCK.
      </para>
    </listitem>
  </varlistentry>

</variablelist>
<!-- end of xi:include in the manpage -->

</sect1>


<sect1 id="mc-manual.suppfiles" xreflabel="Writing suppression files">
<title>Writing suppression files</title>

<para>The basic suppression format is described in
<xref linkend="manual-core.suppress"/>.</para>

<para>The suppression-type (second) line should have the form:</para>
<programlisting><![CDATA[
Memcheck:suppression_type]]></programlisting>

<para>The Memcheck suppression types are as follows:</para>

<itemizedlist>
  <listitem>
    <para><varname>Value1</varname>,
    <varname>Value2</varname>,
    <varname>Value4</varname>,
    <varname>Value8</varname>,
    <varname>Value16</varname>,
    meaning an uninitialised-value error when
    using a value of 1, 2, 4, 8 or 16 bytes respectively.</para>
  </listitem>

  <listitem>
    <para><varname>Cond</varname> (or its old
    name, <varname>Value0</varname>), meaning use
    of an uninitialised CPU condition code.</para>
  </listitem>

  <listitem>
    <para><varname>Addr1</varname>,
    <varname>Addr2</varname>,
    <varname>Addr4</varname>,
    <varname>Addr8</varname>,
    <varname>Addr16</varname>,
    meaning an invalid address during a
    memory access of 1, 2, 4, 8 or 16 bytes respectively.</para>
  </listitem>

  <listitem>
    <para><varname>Jump</varname>, meaning a
    jump to an unaddressable location.</para>
  </listitem>

  <listitem>
    <para><varname>Param</varname>, meaning an
    invalid system call parameter error.</para>
  </listitem>

  <listitem>
    <para><varname>Free</varname>, meaning an
    invalid or mismatching free.</para>
  </listitem>

  <listitem>
    <para><varname>Overlap</varname>, meaning a
    <computeroutput>src</computeroutput> /
    <computeroutput>dst</computeroutput> overlap in
    <function>memcpy</function> or a similar function.</para>
  </listitem>

  <listitem>
    <para><varname>Leak</varname>, meaning
    a memory leak.</para>
  </listitem>

</itemizedlist>

<para><computeroutput>Param</computeroutput> errors have a mandatory extra
information line at this point, which is the name of the offending
system call parameter.</para>

<para><computeroutput>Leak</computeroutput> errors have an optional
extra information line, with the following format:</para>
<programlisting><![CDATA[
match-leak-kinds:<set>]]></programlisting>
<para>where <computeroutput>&lt;set&gt;</computeroutput> specifies which
leak kinds are matched by this suppression entry.
<computeroutput>&lt;set&gt;</computeroutput> is specified in the
same way as with the option <option>--show-leak-kinds</option>, that is,
one of the following:</para>
<itemizedlist>
  <listitem><para>a comma-separated list of one or more of
  <option>definite indirect possible reachable</option>.</para>
  </listitem>

  <listitem><para><option>all</option> to specify the complete set (all leak kinds).</para>
  </listitem>

  <listitem><para><option>none</option> for the empty set.</para>
  </listitem>
</itemizedlist>
<para>If this optional extra line is not present, the suppression
entry will match all leak kinds.</para>

<para>Be aware that leak suppressions that are created using
<option>--gen-suppressions</option> will contain this optional extra
line, and therefore may match fewer leaks than you expect.
You may
want to remove the line before using the generated
suppressions.</para>

<para>The other Memcheck error kinds do not have extra lines.</para>

<para>
If you give the <option>-v</option> option, Valgrind will print
the list of used suppressions at the end of execution.
For a leak suppression, this output gives the number of different
loss records that match the suppression, and the number of bytes
and blocks suppressed by the suppression.
If the run contains multiple leak checks, the numbers of bytes and blocks
are reset to zero before each new leak check. Note that the number of different
loss records is not reset to zero.</para>
<para>In the example below, in the last leak search, 7 blocks and 96 bytes have
been suppressed by a suppression with the name
<option>some_leak_suppression</option>:</para>
<programlisting><![CDATA[
--21041-- used_suppression:     10 some_other_leak_suppression s.supp:14 suppressed: 12,400 bytes in 1 blocks
--21041-- used_suppression:     39 some_leak_suppression s.supp:2 suppressed: 96 bytes in 7 blocks
]]></programlisting>

<para>For <varname>ValueN</varname> and <varname>AddrN</varname>
errors, the first line of the calling context is either the name of
the function in which the error occurred, or, failing that, the full
path of the <filename>.so</filename> file or executable containing the
error location. For <varname>Free</varname> errors, the first line is
the name of the function doing the freeing (e.g.,
<function>free</function>, <function>__builtin_vec_delete</function>,
etc). For <varname>Overlap</varname> errors, the first line is the name of the
function with the overlapping arguments (e.g.,
<function>memcpy</function>, <function>strcpy</function>, etc).</para>

<para>The last part of any suppression specifies the rest of the
calling context that needs to be matched.</para>

</sect1>



<sect1 id="mc-manual.machine"
       xreflabel="Details of Memcheck's checking machinery">
<title>Details of Memcheck's checking machinery</title>

<para>Read this section if you want to know, in detail, exactly
what and how Memcheck is checking.</para>


<sect2 id="mc-manual.value" xreflabel="Valid-value (V) bit">
<title>Valid-value (V) bits</title>

<para>It is simplest to think of Memcheck implementing a synthetic CPU
which is identical to a real CPU, except for one crucial detail. Every
bit (literally) of data processed, stored and handled by the real CPU
has, in the synthetic CPU, an associated "valid-value" bit, which says
whether or not the accompanying bit has a legitimate value. In the
discussions which follow, this bit is referred to as the V (valid-value)
bit.</para>

<para>Each byte in the system therefore has 8 V bits which follow it
wherever it goes. For example, when the CPU loads a word-size item (4
bytes) from memory, it also loads the corresponding 32 V bits from a
bitmap which stores the V bits for the process' entire address space.
If the CPU should later write the whole or some part of that value to
memory at a different address, the relevant V bits will be stored back
in the V-bit bitmap.</para>

<para>In short, each bit in the system has (conceptually) an associated V
bit, which follows it around everywhere, even inside the CPU. Yes, all the
CPU's registers (integer, floating point, vector and condition registers)
have their own V bit vectors.
For this to work, Memcheck uses a great deal
of compression to represent the V bits compactly.</para>

<para>Copying values around does not cause Memcheck to check for, or
report on, errors. However, when a value is used in a way which might
conceivably affect your program's externally-visible behaviour,
the associated V bits are immediately checked. If any of these indicate
that the value is undefined (even partially), an error is reported.</para>

<para>Here's an (admittedly nonsensical) example:</para>
<programlisting><![CDATA[
int i, j;
int a[10], b[10];
for ( i = 0; i < 10; i++ ) {
  j = a[i];
  b[i] = j;
}]]></programlisting>

<para>Memcheck emits no complaints about this, since it merely copies
uninitialised values from <varname>a[]</varname> into
<varname>b[]</varname>, and doesn't use them in a way which could
affect the behaviour of the program. However, if
the loop is changed to:</para>
<programlisting><![CDATA[
for ( i = 0; i < 10; i++ ) {
  j += a[i];
}
if ( j == 77 )
  printf("hello there\n");
]]></programlisting>

<para>then Memcheck will complain, at the
<computeroutput>if</computeroutput>, that the condition depends on
uninitialised values. Note that it <command>doesn't</command> complain
at the <varname>j += a[i];</varname>, since at that point the
undefinedness is not "observable". It's only when a decision has to be
made as to whether or not to do the <function>printf</function> -- an
observable action of your program -- that Memcheck complains.</para>

<para>Most low level operations, such as adds, cause Memcheck to use the
V bits for the operands to calculate the V bits for the result.
Even if
the result is partially or wholly undefined, it does not
complain.</para>

<para>Checks on definedness only occur in three places: when a value is
used to generate a memory address, when a control flow decision needs to
be made, and when a system call is detected, in which case Memcheck checks
the definedness of its parameters as required.</para>

<para>If a check should detect undefinedness, an error message is
issued. The resulting value is subsequently regarded as well-defined.
To do otherwise would give long chains of error messages. In other
words, once Memcheck reports an undefined value error, it tries to
avoid reporting further errors derived from that same undefined
value.</para>

<para>This sounds overcomplicated. Why not just check all reads from
memory, and complain if an undefined value is loaded into a CPU
register? Well, that doesn't work well, because perfectly legitimate C
programs routinely copy uninitialised values around in memory, and we
don't want endless complaints about that. Here's the canonical example.
Consider a struct like this:</para>
<programlisting><![CDATA[
struct S { int x; char c; };
struct S s1, s2;
s1.x = 42;
s1.c = 'z';
s2 = s1;
]]></programlisting>

<para>The question to ask is: how large is <varname>struct S</varname>,
in bytes? An <varname>int</varname> is 4 bytes and a
<varname>char</varname> one byte, so perhaps a <varname>struct
S</varname> occupies 5 bytes? Wrong. All non-toy compilers we know
of will round the size of <varname>struct S</varname> up to a whole
number of words, in this case 8 bytes. Not doing this forces compilers
to generate truly appalling code for accessing arrays of
<varname>struct S</varname>'s on some architectures.</para>

<para>So <varname>s1</varname> occupies 8 bytes, yet only 5 of them will
be initialised.
For the assignment <varname>s2 = s1</varname>, GCC
generates code to copy all 8 bytes wholesale into <varname>s2</varname>
without regard for their meaning. If Memcheck simply checked values as
they came out of memory, it would yelp every time a structure assignment
like this happened. So the more complicated behaviour described above
is necessary. This allows GCC to copy
<varname>s1</varname> into <varname>s2</varname> any way it likes, and a
warning will only be emitted if the uninitialised values are later
used.</para>

</sect2>


<sect2 id="mc-manual.vaddress" xreflabel=" Valid-address (A) bits">
<title>Valid-address (A) bits</title>

<para>Notice that the previous subsection describes how the validity of
values is established and maintained without having to say whether the
program does or does not have the right to access any particular memory
location. We now consider the latter question.</para>

<para>As described above, every bit in memory or in the CPU has an
associated valid-value (V) bit. In addition, all bytes in memory, but
not in the CPU, have an associated valid-address (A) bit. This
indicates whether or not the program can legitimately read or write that
location. It does not give any indication of the validity of the data
at that location -- that's the job of the V bits -- only whether or not
the location may be accessed.</para>

<para>Every time your program reads or writes memory, Memcheck checks
the A bits associated with the address. If any of them indicate an
invalid address, an error is emitted. Note that the reads and writes
themselves do not change the A bits, only consult them.</para>

<para>So how do the A bits get set/cleared?
Like this:</para>

<itemizedlist>
  <listitem>
    <para>When the program starts, all the global data areas are
    marked as accessible.</para>
  </listitem>

  <listitem>
    <para>When the program does
    <function>malloc</function>/<computeroutput>new</computeroutput>,
    the A bits for exactly the area allocated, and not a byte more,
    are marked as accessible. Upon freeing the area the A bits are
    changed to indicate inaccessibility.</para>
  </listitem>

  <listitem>
    <para>When the stack pointer register (<literal>SP</literal>) moves
    up or down, A bits are set. The rule is that the area from
    <literal>SP</literal> up to the base of the stack is marked as
    accessible, and below <literal>SP</literal> is inaccessible. (If
    that sounds illogical, bear in mind that the stack grows down, not
    up, on almost all Unix systems, including GNU/Linux.) Tracking
    <literal>SP</literal> like this has the useful side-effect that the
    section of stack used by a function for local variables etc is
    automatically marked accessible on function entry and inaccessible
    on exit.</para>
  </listitem>

  <listitem>
    <para>When doing system calls, A bits are changed appropriately.
    For example, <literal>mmap</literal>
    magically makes files appear in the process'
    address space, so the A bits must be updated if <literal>mmap</literal>
    succeeds.</para>
  </listitem>

  <listitem>
    <para>Optionally, your program can tell Memcheck about such changes
    explicitly, using the client request mechanism described
    above.</para>
  </listitem>

</itemizedlist>

</sect2>


<sect2 id="mc-manual.together" xreflabel="Putting it all together">
<title>Putting it all together</title>

<para>Memcheck's checking machinery can be summarised as
follows:</para>

<itemizedlist>
  <listitem>
    <para>Each byte in memory has 8 associated V (valid-value) bits,
    saying whether or not the byte has a defined value, and a single A
    (valid-address) bit, saying whether or not the program currently has
    the right to read/write that address. As mentioned above, heavy
    use of compression means the overhead is typically around 25%.</para>
  </listitem>

  <listitem>
    <para>When memory is read or written, the relevant A bits are
    consulted. If they indicate an invalid address, Memcheck emits an
    Invalid read or Invalid write error.</para>
  </listitem>

  <listitem>
    <para>When memory is read into the CPU's registers, the relevant V
    bits are fetched from memory and stored in the simulated CPU.
They
    are not consulted.</para>
  </listitem>

  <listitem>
    <para>When a register is written out to memory, the V bits for that
    register are written back to memory too.</para>
  </listitem>

  <listitem>
    <para>When values in CPU registers are used to generate a memory
    address, or to determine the outcome of a conditional branch, the V
    bits for those values are checked, and an error is emitted if any of
    them are undefined.</para>
  </listitem>

  <listitem>
    <para>When values in CPU registers are used for any other purpose,
    Memcheck computes the V bits for the result, but does not check
    them.</para>
  </listitem>

  <listitem>
    <para>Once the V bits for a value in the CPU have been checked, they
    are then set to indicate validity.  This avoids long chains of
    errors.</para>
  </listitem>

  <listitem>
    <para>When values are loaded from memory, Memcheck checks the A bits
    for that location and issues an illegal-address warning if needed.
    In that case, the V bits loaded are forced to indicate Valid,
    despite the location being invalid.</para>

    <para>This apparently strange choice reduces the amount of confusing
    information presented to the user.  It avoids the unpleasant
    phenomenon in which memory is read from a place which is both
    unaddressable and contains invalid values, and, as a result, you get
    not only an invalid-address (read/write) error, but also a
    potentially large set of uninitialised-value errors, one for every
    time the value is used.</para>

    <para>There is a hazy boundary case to do with multi-byte loads from
    addresses which are partially valid and partially invalid.  See the
    description of the <option>--partial-loads-ok</option> option for details.
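    As a rough illustration of the propagate-then-check rules in this
    list, consider the sketch below.  The function names
    (<function>copy_value</function>, <function>is_positive</function>,
    <function>branch_on_uninitialised</function>) are hypothetical and
    exist only for this example, and the exact point at which Memcheck
    reports an error depends on the code the compiler generates.
<programlisting><![CDATA[
#include <stdlib.h>

/* Moves a value through a register and back.  Memcheck is expected to
   propagate the V bits along with the value, without checking them. */
int copy_value(const int *src)
{
    int tmp = *src;
    return tmp;
}

/* The outcome of a conditional branch depends on v, so this is a point
   at which the V bits of v are checked. */
int is_positive(int v)
{
    if (v > 0)
        return 1;
    return 0;
}

/* Deliberately buggy: the heap int is addressable but never written.
   Returns -1 if the allocation fails. */
int branch_on_uninitialised(void)
{
    int *p = malloc(sizeof *p);
    int v, r;

    if (p == NULL)
        return -1;
    v = copy_value(p);      /* copy only: no error expected here */
    r = is_positive(v);     /* branch on undefined value: error expected */
    free(p);
    return r;
}
]]></programlisting>
    When a program calling
    <function>branch_on_uninitialised</function> is built without
    optimisation and run under Memcheck, the copy would typically go
    unreported, while the comparison would typically produce a
    "Conditional jump or move depends on uninitialised value(s)" error.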
1587 </para> 1588 </listitem> 1589 1590</itemizedlist> 1591 1592 1593<para>Memcheck intercepts calls to <function>malloc</function>, 1594<function>calloc</function>, <function>realloc</function>, 1595<function>valloc</function>, <function>memalign</function>, 1596<function>free</function>, <computeroutput>new</computeroutput>, 1597<computeroutput>new[]</computeroutput>, 1598<computeroutput>delete</computeroutput> and 1599<computeroutput>delete[]</computeroutput>. The behaviour you get 1600is:</para> 1601 1602<itemizedlist> 1603 1604 <listitem> 1605 <para><function>malloc</function>/<function>new</function>/<computeroutput>new[]</computeroutput>: 1606 the returned memory is marked as addressable but not having valid 1607 values. This means you have to write to it before you can read 1608 it.</para> 1609 </listitem> 1610 1611 <listitem> 1612 <para><function>calloc</function>: returned memory is marked both 1613 addressable and valid, since <function>calloc</function> clears 1614 the area to zero.</para> 1615 </listitem> 1616 1617 <listitem> 1618 <para><function>realloc</function>: if the new size is larger than 1619 the old, the new section is addressable but invalid, as with 1620 <function>malloc</function>. If the new size is smaller, the 1621 dropped-off section is marked as unaddressable. You may only pass to 1622 <function>realloc</function> a pointer previously issued to you by 1623 <function>malloc</function>/<function>calloc</function>/<function>realloc</function>.</para> 1624 </listitem> 1625 1626 <listitem> 1627 <para><function>free</function>/<computeroutput>delete</computeroutput>/<computeroutput>delete[]</computeroutput>: 1628 you may only pass to these functions a pointer previously issued 1629 to you by the corresponding allocation function. Otherwise, 1630 Memcheck complains. If the pointer is indeed valid, Memcheck 1631 marks the entire area it points at as unaddressable, and places 1632 the block in the freed-blocks-queue. 
The aim is to defer
    reallocation of this block for as long as possible.  Until that
    happens, all attempts to access it will elicit an invalid-address
    error, as you would hope.</para>
  </listitem>

</itemizedlist>

</sect2>
</sect1>

<sect1 id="mc-manual.monitor-commands" xreflabel="Memcheck Monitor Commands">
<title>Memcheck Monitor Commands</title>
<para>The Memcheck tool provides monitor commands handled by Valgrind's
built-in gdbserver (see <xref linkend="manual-core-adv.gdbserver-commandhandling"/>).
</para>

<itemizedlist>
  <listitem>
    <para><varname>get_vbits &lt;addr&gt; [&lt;len&gt;]</varname>
    shows the definedness (V) bits for &lt;len&gt; (default 1) bytes
    starting at &lt;addr&gt;.  The definedness of each byte in the
    range is given using two hexadecimal digits.  These hexadecimal
    digits encode the validity of each bit of the corresponding byte,
    using 0 if the bit is defined and 1 if the bit is undefined.
    If a byte is not addressable, its validity bits are replaced
    by <varname>__</varname> (a double underscore).
    </para>
    <para>
    In the following example, <varname>string10</varname> is an array
    of 10 characters in which the even-numbered bytes are
    undefined and the byte corresponding
    to <varname>string10[5]</varname> is not addressable.
    </para>
<programlisting><![CDATA[
(gdb) p &string10
$4 = (char (*)[10]) 0x8049e28
(gdb) monitor get_vbits 0x8049e28 10
ff00ff00 ff__ff00 ff00
(gdb)
]]></programlisting>

    <para>The <varname>get_vbits</varname> command cannot be used with
    registers. To get the validity bits of a register, you must start
    Valgrind with the option
    <option>--vgdb-shadow-registers=yes</option>. The validity
    bits of a register can then be obtained by printing the
    corresponding 'shadow 1' register.
In the following x86 example, the register
    eax has all its bits undefined, while the register ebx is fully
    defined.
    </para>
<programlisting><![CDATA[
(gdb) p /x $eaxs1
$9 = 0xffffffff
(gdb) p /x $ebxs1
$10 = 0x0
(gdb)
]]></programlisting>

  </listitem>

  <listitem>
    <para><varname>make_memory
    [noaccess|undefined|defined|Definedifaddressable] &lt;addr&gt;
    [&lt;len&gt;]</varname> marks the range of &lt;len&gt; (default 1)
    bytes at &lt;addr&gt; as having the given status.  Parameter
    <varname>noaccess</varname> marks the range as non-accessible, so
    Memcheck will report an error on any access to it.
    <varname>undefined</varname> and <varname>defined</varname> mark
    the area as accessible, with Memcheck regarding the bytes in it
    as having undefined or defined values, respectively.
    <varname>Definedifaddressable</varname> marks as defined those bytes
    in the range which are already addressable, and makes no change to
    the status of bytes in the range which are not addressable. Note
    that the first letter of <varname>Definedifaddressable</varname>
    is an uppercase D to avoid confusion with <varname>defined</varname>.
    </para>

    <para>
    In the following example, the first byte of
    <varname>string10</varname> is marked as defined:
    </para>
<programlisting><![CDATA[
(gdb) monitor make_memory defined 0x8049e28 1
(gdb) monitor get_vbits 0x8049e28 10
0000ff00 ff00ff00 ff00
(gdb)
]]></programlisting>
  </listitem>

  <listitem>
    <para><varname>check_memory [addressable|defined] &lt;addr&gt;
    [&lt;len&gt;]</varname> checks that the range of &lt;len&gt;
    (default 1) bytes at &lt;addr&gt; has the specified accessibility.
    It then outputs a description of &lt;addr&gt;.
In the following
    example, a detailed description is available because the
    option <option>--read-var-info=yes</option> was given at Valgrind
    startup:
    </para>
<programlisting><![CDATA[
(gdb) monitor check_memory defined 0x8049e28 1
Address 0x8049E28 len 1 defined
==14698==  Location 0x8049e28 is 0 bytes inside string10[0],
==14698==  declared at prog.c:10, in frame #0 of thread 1
(gdb)
]]></programlisting>
  </listitem>

  <listitem>
    <para><varname>leak_check [full*|summary]
                              [kinds &lt;set&gt;|reachable|possibleleak*|definiteleak]
                              [heuristics heur1,heur2,...]
                              [increased*|changed|any]
                              [unlimited*|limited &lt;max_loss_records_output&gt;]
          </varname>
    performs a leak check. The <varname>*</varname> in the arguments
    indicates the default values. </para>

    <para>If the <varname>[full*|summary]</varname> argument is
    <varname>summary</varname>, only a summary of the leak search is given;
    otherwise a full leak report is produced.  A full leak report gives
    detailed information for each leak: the stack trace where the leaked blocks
    were allocated, the number of blocks leaked and their total size.  When a
    full report is requested, the next two arguments further specify what
    kind of leaks to report.  A leak's details are shown if they match
    both the second and third argument. A full leak report might
    output detailed information for many leaks.  The number of leaks for
    which information is output can be controlled using
    the <varname>limited</varname> argument followed by the maximum number
    of leak records to output.  If this maximum is reached, the leak
    search outputs the records with the largest number of bytes.
    </para>

    <para>The <varname>kinds</varname> argument controls what kind of blocks
    are shown for a <varname>full</varname> leak search.
The set of leak kinds
    to show can be specified using a <varname>&lt;set&gt;</varname> similarly
    to the command line option <option>--show-leak-kinds</option>.
    Alternatively, the value <varname>definiteleak</varname>
    is equivalent to <varname>kinds definite</varname>, and the
    value <varname>possibleleak</varname> is equivalent to
    <varname>kinds definite,possible</varname>: it will also show
    possibly leaked blocks, i.e. those for which only an interior
    pointer was found.  The value <varname>reachable</varname> will
    show all block categories (i.e. is equivalent to <varname>kinds
    all</varname>).
    </para>

    <para>The <varname>heuristics</varname> argument controls the heuristics
    used during the leak search. The set of heuristics to use can be specified
    using a <varname>&lt;set&gt;</varname> similarly
    to the command line option <option>--leak-check-heuristics</option>.
    The default value for the <varname>heuristics</varname> argument is
    <varname>heuristics none</varname>.
    </para>

    <para>The <varname>[increased*|changed|any]</varname> argument controls what
    kinds of changes are shown for a <varname>full</varname> leak search.  The
    value <varname>increased</varname> specifies that only block
    allocation stacks with an increased number of leaked bytes or
    blocks since the previous leak check should be shown.  The
    value <varname>changed</varname> specifies that allocation stacks
    with any change since the previous leak check should be shown.
    The value <varname>any</varname> specifies that all leak entries
    should be shown, regardless of any increase or decrease.
    If <varname>increased</varname> or <varname>changed</varname> is
    specified, the leak report entries will show the delta relative to
    the previous leak report.
1798 </para> 1799 1800 <para>The following example shows usage of the 1801 <varname>leak_check</varname> monitor command on 1802 the <varname>memcheck/tests/leak-cases.c</varname> regression 1803 test. The first command outputs one entry having an increase in 1804 the leaked bytes. The second command is the same as the first 1805 command, but uses the abbreviated forms accepted by GDB and the 1806 Valgrind gdbserver. It only outputs the summary information, as 1807 there was no increase since the previous leak search.</para> 1808<programlisting><![CDATA[ 1809(gdb) monitor leak_check full possibleleak increased 1810==19520== 16 (+16) bytes in 1 (+1) blocks are possibly lost in loss record 9 of 12 1811==19520== at 0x40070B4: malloc (vg_replace_malloc.c:263) 1812==19520== by 0x80484D5: mk (leak-cases.c:52) 1813==19520== by 0x804855F: f (leak-cases.c:81) 1814==19520== by 0x80488E0: main (leak-cases.c:107) 1815==19520== 1816==19520== LEAK SUMMARY: 1817==19520== definitely lost: 32 (+0) bytes in 2 (+0) blocks 1818==19520== indirectly lost: 16 (+0) bytes in 1 (+0) blocks 1819==19520== possibly lost: 32 (+16) bytes in 2 (+1) blocks 1820==19520== still reachable: 96 (+16) bytes in 6 (+1) blocks 1821==19520== suppressed: 0 (+0) bytes in 0 (+0) blocks 1822==19520== Reachable blocks (those to which a pointer was found) are not shown. 1823==19520== To see them, add 'reachable any' args to leak_check 1824==19520== 1825(gdb) mo l 1826==19520== LEAK SUMMARY: 1827==19520== definitely lost: 32 (+0) bytes in 2 (+0) blocks 1828==19520== indirectly lost: 16 (+0) bytes in 1 (+0) blocks 1829==19520== possibly lost: 32 (+0) bytes in 2 (+0) blocks 1830==19520== still reachable: 96 (+0) bytes in 6 (+0) blocks 1831==19520== suppressed: 0 (+0) bytes in 0 (+0) blocks 1832==19520== Reachable blocks (those to which a pointer was found) are not shown. 
==19520== To see them, add 'reachable any' args to leak_check
==19520==
(gdb)
]]></programlisting>
    <para>Note that when using Valgrind's gdbserver, it is not
    necessary to rerun
    with <option>--leak-check=full</option>
    <option>--show-reachable=yes</option> to see the reachable
    blocks. You can obtain the same information without rerunning by
    using the GDB command <computeroutput>monitor leak_check full
    reachable any</computeroutput> (or, in abbreviated
    form: <computeroutput>mo l f r a</computeroutput>).
    </para>
  </listitem>

  <listitem>
    <para><varname>block_list &lt;loss_record_nr&gt; </varname>
    shows the list of blocks belonging to &lt;loss_record_nr&gt;.
    </para>

    <para>A leak search merges the allocated blocks into loss records:
    a loss record groups all blocks having the same state (for
    example, Definitely Lost) and the same allocation backtrace.
    Each loss record is identified in the leak search result
    by a loss record number.
    The <varname>block_list</varname> command shows the loss record information
    followed by the addresses and sizes of the blocks which have been
    merged in the loss record.
    </para>

    <para>If a directly lost block causes some other blocks to be indirectly
    lost, the <varname>block_list</varname> command will also show these
    indirectly lost blocks.
    The indirectly lost blocks will be indented according to the level of indirection
    between the directly lost block and the indirectly lost block(s).
    Each indirectly lost block is followed by a reference to its loss record.
    </para>

    <para>The <varname>block_list</varname> command can be used on the results
    of a leak search as long as no block has been freed after this leak
    search: as soon as the program frees a block, a new leak search is
    needed before <varname>block_list</varname> can be used again.
    </para>

    <para>
    In the following example, the program leaks a tree structure by losing
    the pointer to block A (the top of the tree).
    Block A is therefore directly lost, causing an indirect
    loss of blocks B to G. The first block_list command shows the loss record of A
    (a definitely lost block with address 0x4028028, size 16). The addresses and sizes
    of the blocks indirectly lost because of block A are shown below block A.
    The second command shows the details of one of the indirect loss records output
    by the first command.
    </para>
<programlisting><![CDATA[
           A
         /   \
        B     C
       / \   / \
      D   E F   G
]]></programlisting>

<programlisting><![CDATA[
(gdb) bt
#0  main () at leak-tree.c:69
(gdb) monitor leak_check full any
==19552== 112 (16 direct, 96 indirect) bytes in 1 blocks are definitely lost in loss record 7 of 7
==19552==    at 0x40070B4: malloc (vg_replace_malloc.c:263)
==19552==    by 0x80484D5: mk (leak-tree.c:28)
==19552==    by 0x80484FC: f (leak-tree.c:41)
==19552==    by 0x8048856: main (leak-tree.c:63)
==19552==
==19552== LEAK SUMMARY:
==19552==    definitely lost: 16 bytes in 1 blocks
==19552==    indirectly lost: 96 bytes in 6 blocks
==19552==      possibly lost: 0 bytes in 0 blocks
==19552==    still reachable: 0 bytes in 0 blocks
==19552==         suppressed: 0 bytes in 0 blocks
==19552==
(gdb) monitor block_list 7
==19552== 112 (16 direct, 96 indirect) bytes in 1 blocks are definitely lost in loss record 7 of 7
==19552==    at 0x40070B4: malloc (vg_replace_malloc.c:263)
==19552==    by 0x80484D5: mk (leak-tree.c:28)
==19552==    by 0x80484FC: f (leak-tree.c:41)
==19552==    by 0x8048856: main (leak-tree.c:63)
==19552== 0x4028028[16]
==19552==   0x4028068[16] indirect loss record 1
==19552==      0x40280E8[16] indirect loss record 3
==19552==      0x4028128[16] indirect loss record 4
==19552==   0x40280A8[16] indirect loss record 2
==19552==      0x4028168[16] indirect loss record 5
==19552==      0x40281A8[16] indirect loss record 6
(gdb) mo b 2
==19552== 16 bytes in 1 blocks are indirectly lost in loss record 2 of 7
==19552==    at 0x40070B4: malloc (vg_replace_malloc.c:263)
==19552==    by 0x80484D5: mk (leak-tree.c:28)
==19552==    by 0x8048519: f (leak-tree.c:43)
==19552==    by 0x8048856: main (leak-tree.c:63)
==19552== 0x40280A8[16]
==19552==   0x4028168[16] indirect loss record 5
==19552==   0x40281A8[16] indirect loss record 6
(gdb)

]]></programlisting>

  </listitem>

  <listitem>
    <para><varname>who_points_at &lt;addr&gt; [&lt;len&gt;]</varname>
    shows all the locations where a pointer to addr is found.
    If len is equal to 1, the command only shows the locations pointing
    exactly at addr (i.e. the "start pointers" to addr).
    If len is greater than 1, "interior pointers" pointing at the first
    len bytes will also be shown.
    </para>

    <para>The locations searched are the same as those
    used in the leak search. So, <varname>who_points_at</varname> can,
    among other things, be used to show why the leak search can still
    reach a block, or to search for dangling pointers to a freed block.
    Each location pointing at addr (or pointing inside addr if interior pointers
    are being searched for) will be described.
    </para>

    <para>In the following example, the pointers to the 'tree block A' (see
    the example for the <varname>block_list</varname> command) are shown
    before the tree was leaked.
    The descriptions are detailed because the option
    <option>--read-var-info=yes</option>
    was given at Valgrind startup. The second call shows the pointers (start and interior
    pointers) to block G. Block G (0x40281A8) is reachable via block C (0x40280a8)
    and register ECX of tid 1 (tid is the Valgrind thread id).
    It is "interior reachable" via the register EBX.
1962 </para> 1963 1964<programlisting><![CDATA[ 1965(gdb) monitor who_points_at 0x4028028 1966==20852== Searching for pointers to 0x4028028 1967==20852== *0x8049e20 points at 0x4028028 1968==20852== Location 0x8049e20 is 0 bytes inside global var "t" 1969==20852== declared at leak-tree.c:35 1970(gdb) monitor who_points_at 0x40281A8 16 1971==20852== Searching for pointers pointing in 16 bytes from 0x40281a8 1972==20852== *0x40280ac points at 0x40281a8 1973==20852== Address 0x40280ac is 4 bytes inside a block of size 16 alloc'd 1974==20852== at 0x40070B4: malloc (vg_replace_malloc.c:263) 1975==20852== by 0x80484D5: mk (leak-tree.c:28) 1976==20852== by 0x8048519: f (leak-tree.c:43) 1977==20852== by 0x8048856: main (leak-tree.c:63) 1978==20852== tid 1 register ECX points at 0x40281a8 1979==20852== tid 1 register EBX interior points at 2 bytes inside 0x40281a8 1980(gdb) 1981]]></programlisting> 1982 1983 <para> When <varname>who_points_at</varname> finds an interior pointer, 1984 it will report the heuristic(s) with which this interior pointer 1985 will be considered as reachable. Note that this is done independently 1986 of the value of the option <option>--leak-check-heuristics</option>. 1987 In the below example, the loss record 6 indicates a possibly lost 1988 block. <varname>who_points_at</varname> reports that there is an interior 1989 pointer pointing in this block, and that the block can be considered 1990 reachable using the heuristic 1991 <computeroutput>multipleinheritance</computeroutput>. 
    </para>

<programlisting><![CDATA[
(gdb) monitor block_list 6
==3748== 8 bytes in 1 blocks are possibly lost in loss record 6 of 7
==3748==    at 0x4007D77: operator new(unsigned int) (vg_replace_malloc.c:313)
==3748==    by 0x8048954: main (leak_cpp_interior.cpp:43)
==3748== 0x402A0E0[8]
(gdb) monitor who_points_at 0x402A0E0 8
==3748== Searching for pointers pointing in 8 bytes from 0x402a0e0
==3748== *0xbe8ee078 interior points at 4 bytes inside 0x402a0e0
==3748==  Address 0xbe8ee078 is on thread 1's stack
==3748== block at 0x402a0e0 considered reachable by ptr 0x402a0e4 using multipleinheritance heuristic
(gdb)
]]></programlisting>

  </listitem>

</itemizedlist>

</sect1>

<sect1 id="mc-manual.clientreqs" xreflabel="Client requests">
<title>Client Requests</title>

<para>The following client requests are defined in
<filename>memcheck.h</filename>; see that file for exact details of
their arguments.</para>

<itemizedlist>

  <listitem>
    <para><varname>VALGRIND_MAKE_MEM_NOACCESS</varname>,
    <varname>VALGRIND_MAKE_MEM_UNDEFINED</varname> and
    <varname>VALGRIND_MAKE_MEM_DEFINED</varname>.
    These mark address ranges as completely inaccessible,
    accessible but containing undefined data, and accessible and
    containing defined data, respectively. They return -1 when
    run on Valgrind and 0 otherwise.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_MAKE_MEM_DEFINED_IF_ADDRESSABLE</varname>.
    This is just like <varname>VALGRIND_MAKE_MEM_DEFINED</varname> but only
    affects those bytes that are already addressable.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_CHECK_MEM_IS_ADDRESSABLE</varname> and
    <varname>VALGRIND_CHECK_MEM_IS_DEFINED</varname>: these check immediately
    whether or not the given address range has the relevant property,
    and if not, print an error message.  Also, for the convenience of
    the client, they return zero if the relevant property holds; otherwise,
    the returned value is the address of the first byte for which the
    property is not true.  They always return 0 when not run on
    Valgrind.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_CHECK_VALUE_IS_DEFINED</varname>: a quick and easy
    way to find out whether Valgrind thinks a particular value
    (lvalue, to be precise) is addressable and defined.  Prints an error
    message if not.  It has no return value.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_DO_LEAK_CHECK</varname>: does a full memory leak
    check (like <option>--leak-check=full</option>) right now.
    This is useful for incrementally checking for leaks between arbitrary
    places in the program's execution.  It has no return value.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_DO_ADDED_LEAK_CHECK</varname>: same as
    <varname>VALGRIND_DO_LEAK_CHECK</varname> but only shows the
    entries for which there was an increase in leaked bytes or in the
    number of leaked blocks since the previous leak search.  It has no return
    value.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_DO_CHANGED_LEAK_CHECK</varname>: same as
    <varname>VALGRIND_DO_LEAK_CHECK</varname> but only shows the
    entries for which there was an increase or decrease in leaked
    bytes or in the number of leaked blocks since the previous leak search.
It 2078 has no return value.</para> 2079 </listitem> 2080 2081 <listitem> 2082 <para><varname>VALGRIND_DO_QUICK_LEAK_CHECK</varname>: like 2083 <varname>VALGRIND_DO_LEAK_CHECK</varname>, except it produces only a leak 2084 summary (like <option>--leak-check=summary</option>). 2085 It has no return value.</para> 2086 </listitem> 2087 2088 <listitem> 2089 <para><varname>VALGRIND_COUNT_LEAKS</varname>: fills in the four 2090 arguments with the number of bytes of memory found by the previous 2091 leak check to be leaked (i.e. the sum of direct leaks and indirect leaks), 2092 dubious, reachable and suppressed. This is useful in test harness code, 2093 after calling <varname>VALGRIND_DO_LEAK_CHECK</varname> or 2094 <varname>VALGRIND_DO_QUICK_LEAK_CHECK</varname>.</para> 2095 </listitem> 2096 2097 <listitem> 2098 <para><varname>VALGRIND_COUNT_LEAK_BLOCKS</varname>: identical to 2099 <varname>VALGRIND_COUNT_LEAKS</varname> except that it returns the 2100 number of blocks rather than the number of bytes in each 2101 category.</para> 2102 </listitem> 2103 2104 <listitem> 2105 <para><varname>VALGRIND_GET_VBITS</varname> and 2106 <varname>VALGRIND_SET_VBITS</varname>: allow you to get and set the 2107 V (validity) bits for an address range. You should probably only 2108 set V bits that you have got with 2109 <varname>VALGRIND_GET_VBITS</varname>. Only for those who really 2110 know what they are doing.</para> 2111 </listitem> 2112 2113 <listitem> 2114 <para><varname>VALGRIND_CREATE_BLOCK</varname> and 2115 <varname>VALGRIND_DISCARD</varname>. <varname>VALGRIND_CREATE_BLOCK</varname> 2116 takes an address, a number of bytes and a character string. The 2117 specified address range is then associated with that string. When 2118 Memcheck reports an invalid access to an address in the range, it 2119 will describe it in terms of this block rather than in terms of 2120 any other block it knows about. 
Note that the use of this macro 2121 does not actually change the state of memory in any way -- it 2122 merely gives a name for the range. 2123 </para> 2124 2125 <para>At some point you may want Memcheck to stop reporting errors 2126 in terms of the block named 2127 by <varname>VALGRIND_CREATE_BLOCK</varname>. To make this 2128 possible, <varname>VALGRIND_CREATE_BLOCK</varname> returns a 2129 "block handle", which is a C <varname>int</varname> value. You 2130 can pass this block handle to <varname>VALGRIND_DISCARD</varname>. 2131 After doing so, Valgrind will no longer relate addressing errors 2132 in the specified range to the block. Passing invalid handles to 2133 <varname>VALGRIND_DISCARD</varname> is harmless. 2134 </para> 2135 </listitem> 2136 2137</itemizedlist> 2138 2139</sect1> 2140 2141 2142 2143 2144<sect1 id="mc-manual.mempools" xreflabel="Memory Pools"> 2145<title>Memory Pools: describing and working with custom allocators</title> 2146 2147<para>Some programs use custom memory allocators, often for performance 2148reasons. Left to itself, Memcheck is unable to understand the 2149behaviour of custom allocation schemes as well as it understands the 2150standard allocators, and so may miss errors and leaks in your program. What 2151this section describes is a way to give Memcheck enough of a description of 2152your custom allocator that it can make at least some sense of what is 2153happening.</para> 2154 2155<para>There are many different sorts of custom allocator, so Memcheck 2156attempts to reason about them using a loose, abstract model. We 2157use the following terminology when describing custom allocation 2158systems:</para> 2159 2160<itemizedlist> 2161 <listitem> 2162 <para>Custom allocation involves a set of independent "memory pools". 
    </para>
  </listitem>
  <listitem>
    <para>Memcheck's notion of a memory pool consists of a single "anchor
    address" and a set of non-overlapping "chunks" associated with the
    anchor address.</para>
  </listitem>
  <listitem>
    <para>Typically a pool's anchor address is the address of a
    book-keeping "header" structure.</para>
  </listitem>
  <listitem>
    <para>Typically the pool's chunks are drawn from a contiguous
    "superblock" acquired through the system
    <function>malloc</function> or
    <function>mmap</function>.</para>
  </listitem>

</itemizedlist>

<para>Keep in mind that the last two points above say "typically": the
Valgrind mempool client request API is intentionally vague about the
exact structure of a mempool. There is no specific mention made of
headers or superblocks. Nevertheless, the following picture may help
elucidate the intention of the terms in the API:</para>

<programlisting><![CDATA[
   "pool"
   (anchor address)
   |
   v
   +--------+---+
   | header | o |
   +--------+-|-+
              |
              v                  superblock
              +------+---+--------------+---+------------------+
              |      |rzB|  allocation  |rzB|                  |
              +------+---+--------------+---+------------------+
                         ^              ^
                         |              |
                       "addr"     "addr"+"size"
]]></programlisting>

<para>
Note that the header and the superblock may be contiguous or
discontiguous, and there may be multiple superblocks associated with a
single header; such variations are opaque to Memcheck.
The API 2211only requires that your allocation scheme can present sensible values 2212of "pool", "addr" and "size".</para> 2213 2214<para> 2215Typically, before making client requests related to mempools, a client 2216program will have allocated such a header and superblock for their 2217mempool, and marked the superblock NOACCESS using the 2218<varname>VALGRIND_MAKE_MEM_NOACCESS</varname> client request.</para> 2219 2220<para> 2221When dealing with mempools, the goal is to maintain a particular 2222invariant condition: that Memcheck believes the unallocated portions 2223of the pool's superblock (including redzones) are NOACCESS. To 2224maintain this invariant, the client program must ensure that the 2225superblock starts out in that state; Memcheck cannot make it so, since 2226Memcheck never explicitly learns about the superblock of a pool, only 2227the allocated chunks within the pool.</para> 2228 2229<para> 2230Once the header and superblock for a pool are established and properly 2231marked, there are a number of client requests programs can use to 2232inform Memcheck about changes to the state of a mempool:</para> 2233 2234<itemizedlist> 2235 2236 <listitem> 2237 <para> 2238 <varname>VALGRIND_CREATE_MEMPOOL(pool, rzB, is_zeroed)</varname>: 2239 This request registers the address <varname>pool</varname> as the anchor 2240 address for a memory pool. It also provides a size 2241 <varname>rzB</varname>, specifying how large the redzones placed around 2242 chunks allocated from the pool should be. Finally, it provides an 2243 <varname>is_zeroed</varname> argument that specifies whether the pool's 2244 chunks are zeroed (more precisely: defined) when allocated. 2245 </para> 2246 <para> 2247 Upon completion of this request, no chunks are associated with the 2248 pool. The request simply tells Memcheck that the pool exists, so that 2249 subsequent calls can refer to it as a pool. 
2250 </para> 2251 </listitem> 2252 2253 <listitem> 2254 <para><varname>VALGRIND_DESTROY_MEMPOOL(pool)</varname>: 2255 This request tells Memcheck that a pool is being torn down. Memcheck 2256 then removes all records of chunks associated with the pool, as well 2257 as its record of the pool's existence. While destroying its records of 2258 a mempool, Memcheck resets the redzones of any live chunks in the pool 2259 to NOACCESS. 2260 </para> 2261 </listitem> 2262 2263 <listitem> 2264 <para><varname>VALGRIND_MEMPOOL_ALLOC(pool, addr, size)</varname>: 2265 This request informs Memcheck that a <varname>size</varname>-byte chunk 2266 has been allocated at <varname>addr</varname>, and associates the chunk with the 2267 specified 2268 <varname>pool</varname>. If the pool was created with nonzero 2269 <varname>rzB</varname> redzones, Memcheck will mark the 2270 <varname>rzB</varname> bytes before and after the chunk as NOACCESS. If 2271 the pool was created with the <varname>is_zeroed</varname> argument set, 2272 Memcheck will mark the chunk as DEFINED, otherwise Memcheck will mark 2273 the chunk as UNDEFINED. 2274 </para> 2275 </listitem> 2276 2277 <listitem> 2278 <para><varname>VALGRIND_MEMPOOL_FREE(pool, addr)</varname>: 2279 This request informs Memcheck that the chunk at <varname>addr</varname> 2280 should no longer be considered allocated. Memcheck will mark the chunk 2281 associated with <varname>addr</varname> as NOACCESS, and delete its 2282 record of the chunk's existence. 2283 </para> 2284 </listitem> 2285 2286 <listitem> 2287 <para><varname>VALGRIND_MEMPOOL_TRIM(pool, addr, size)</varname>: 2288 This request trims the chunks associated with <varname>pool</varname>. 2289 The request only operates on chunks associated with 2290 <varname>pool</varname>. 
    Trimming is formally defined as:</para>
    <itemizedlist>
      <listitem>
        <para>All chunks entirely inside the range
        <varname>addr..(addr+size-1)</varname> are preserved.</para>
      </listitem>
      <listitem>
        <para>All chunks entirely outside the range
        <varname>addr..(addr+size-1)</varname> are discarded, as though
        <varname>VALGRIND_MEMPOOL_FREE</varname> was called on them.</para>
      </listitem>
      <listitem>
        <para>All other chunks must intersect with the range
        <varname>addr..(addr+size-1)</varname>; areas outside the
        intersection are marked as NOACCESS, as though they had been
        independently freed with
        <varname>VALGRIND_MEMPOOL_FREE</varname>.</para>
      </listitem>
    </itemizedlist>
    <para>This is a somewhat rare request, but can be useful in
    implementing the type of mass-free operations common in custom
    LIFO allocators.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_MOVE_MEMPOOL(poolA, poolB)</varname>: This
    request informs Memcheck that the pool previously anchored at
    address <varname>poolA</varname> has moved to anchor address
    <varname>poolB</varname>. This is a rare request, typically only needed
    if you <function>realloc</function> the header of a mempool.</para>
    <para>No memory-status bits are altered by this request.</para>
  </listitem>

  <listitem>
    <para>
    <varname>VALGRIND_MEMPOOL_CHANGE(pool, addrA, addrB,
    size)</varname>: This request informs Memcheck that the chunk
    previously allocated at address <varname>addrA</varname> within
    <varname>pool</varname> has been moved and/or resized, and should be
    changed to cover the region <varname>addrB..(addrB+size-1)</varname>. This
    is a rare request, typically only needed if you
    <function>realloc</function> a superblock or wish to extend a chunk
    without changing its memory-status bits.
    </para>
    <para>No memory-status bits are altered by this request.
    </para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_MEMPOOL_EXISTS(pool)</varname>:
    This request informs the caller whether or not Memcheck is currently
    tracking a mempool at anchor address <varname>pool</varname>. It
    evaluates to 1 when there is a mempool associated with that address, 0
    otherwise. This is a rare request, only useful in circumstances when
    client code might have lost track of the set of active mempools.
    </para>
  </listitem>

</itemizedlist>

</sect1>



<sect1 id="mc-manual.mpiwrap" xreflabel="MPI Wrappers">
<title>Debugging MPI Parallel Programs with Valgrind</title>

<para>Memcheck supports debugging of distributed-memory applications
which use the MPI message passing standard. This support consists of a
library of wrapper functions for the
<computeroutput>PMPI_*</computeroutput> interface. When incorporated
into the application's address space, either by direct linking or by
<computeroutput>LD_PRELOAD</computeroutput>, the wrappers intercept
calls to <computeroutput>PMPI_Send</computeroutput>,
<computeroutput>PMPI_Recv</computeroutput>, etc. They then
use client requests to inform Memcheck of memory state changes caused
by the function being wrapped.
This reduces the number of false
positives that Memcheck otherwise typically reports for MPI
applications.</para>

<para>The wrappers also take the opportunity to carefully check
size and definedness of buffers passed as arguments to MPI functions, hence
detecting errors such as passing undefined data to
<computeroutput>PMPI_Send</computeroutput>, or receiving data into a
buffer which is too small.</para>

<para>Unlike most of the rest of Valgrind, the wrapper library is subject to a
BSD-style license, so you can link it into any code base you like.
See the top of <computeroutput>mpi/libmpiwrap.c</computeroutput>
for license details.</para>


<sect2 id="mc-manual.mpiwrap.build" xreflabel="Building MPI Wrappers">
<title>Building and installing the wrappers</title>

<para> The wrapper library will be built automatically if possible.
Valgrind's configure script will look for a suitable
<computeroutput>mpicc</computeroutput> to build it with. This must be
the same <computeroutput>mpicc</computeroutput> you use to build the
MPI application you want to debug. By default, Valgrind tries
<computeroutput>mpicc</computeroutput>, but you can specify a
different one by using the configure-time option
<option>--with-mpicc</option>. Currently the
wrappers are only buildable with
<computeroutput>mpicc</computeroutput>s which are based on GNU
GCC or Intel's C++ Compiler.</para>

<para>Check that the configure script prints a line like this:</para>

<programlisting><![CDATA[
checking for usable MPI2-compliant mpicc and mpi.h... yes, mpicc
]]></programlisting>

<para>If it says <computeroutput>...
no</computeroutput>, your
<computeroutput>mpicc</computeroutput> has failed to compile and link
a test MPI2 program.</para>

<para>If the configure test succeeds, continue in the usual way with
<computeroutput>make</computeroutput> and <computeroutput>make
install</computeroutput>. The final install tree should then contain
<computeroutput>libmpiwrap-&lt;platform&gt;.so</computeroutput>.
</para>

<para>Compile a test MPI program (e.g. MPI hello-world) and try
this:</para>

<programlisting><![CDATA[
LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-<platform>.so   \
           mpirun [args] $prefix/bin/valgrind ./hello
]]></programlisting>

<para>You should see something similar to the following</para>

<programlisting><![CDATA[
valgrind MPI wrappers 31901: Active for pid 31901
valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options
]]></programlisting>

<para>repeated for every process in the group. If you do not see
these, there is a build/installation problem of some kind.</para>

<para> The MPI functions to be wrapped are assumed to be in an ELF
shared object with soname matching
<computeroutput>libmpi.so*</computeroutput>.
This is known to be
correct at least for Open MPI and Quadrics MPI, and can easily be
changed if required.</para>
</sect2>


<sect2 id="mc-manual.mpiwrap.gettingstarted"
       xreflabel="Getting started with MPI Wrappers">
<title>Getting started</title>

<para>Compile your MPI application as usual, taking care to link it
using the same <computeroutput>mpicc</computeroutput> that your
Valgrind build was configured with.</para>

<para>
Use the following basic scheme to run your application on Valgrind with
the wrappers engaged:</para>

<programlisting><![CDATA[
MPIWRAP_DEBUG=[wrapper-args]                                  \
   LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-<platform>.so   \
   mpirun [mpirun-args]                                       \
   $prefix/bin/valgrind [valgrind-args]                       \
   [application] [app-args]
]]></programlisting>

<para>As an alternative to
<computeroutput>LD_PRELOAD</computeroutput>ing
<computeroutput>libmpiwrap-&lt;platform&gt;.so</computeroutput>, you can
simply link it to your application if desired. This should not disturb
native behaviour of your application in any way.</para>
</sect2>


<sect2 id="mc-manual.mpiwrap.controlling"
       xreflabel="Controlling the MPI Wrappers">
<title>Controlling the wrapper library</title>

<para>Environment variable
<computeroutput>MPIWRAP_DEBUG</computeroutput> is consulted at
startup. The default behaviour is to print a starting banner</para>

<programlisting><![CDATA[
valgrind MPI wrappers 16386: Active for pid 16386
valgrind MPI wrappers 16386: Try MPIWRAP_DEBUG=help for possible options
]]></programlisting>

<para> and then be relatively quiet.</para>

<para>You can give a list of comma-separated options in
<computeroutput>MPIWRAP_DEBUG</computeroutput>.
These are:</para>

<itemizedlist>
  <listitem>
    <para><computeroutput>verbose</computeroutput>:
    show entries/exits of all wrappers. Also show extra
    debugging info, such as the status of outstanding
    <computeroutput>MPI_Request</computeroutput>s resulting
    from uncompleted <computeroutput>MPI_Irecv</computeroutput>s.</para>
  </listitem>
  <listitem>
    <para><computeroutput>quiet</computeroutput>:
    opposite of <computeroutput>verbose</computeroutput>, only print
    anything when the wrappers want
    to report a detected programming error, or in case of catastrophic
    failure of the wrappers.</para>
  </listitem>
  <listitem>
    <para><computeroutput>warn</computeroutput>:
    by default, functions which lack proper wrappers
    are not commented on, just silently
    ignored. With this option, a warning is printed for each unwrapped
    function used, up to a maximum of three warnings per function.</para>
  </listitem>
  <listitem>
    <para><computeroutput>strict</computeroutput>:
    print an error message and abort the program if
    a function lacking a wrapper is used.</para>
  </listitem>
</itemizedlist>

<para> If you want to use Valgrind's XML output facility
(<option>--xml=yes</option>), you should pass
<computeroutput>quiet</computeroutput> in
<computeroutput>MPIWRAP_DEBUG</computeroutput> so as to get rid of any
extraneous printing from the wrappers.</para>

</sect2>


<sect2 id="mc-manual.mpiwrap.limitations.functions"
       xreflabel="Functions: Abilities and Limitations">
<title>Functions</title>

<para>All MPI2 functions except
<computeroutput>MPI_Wtick</computeroutput>,
<computeroutput>MPI_Wtime</computeroutput> and
<computeroutput>MPI_Pcontrol</computeroutput> have wrappers.
The
first two are not wrapped because they return a
<computeroutput>double</computeroutput>, which Valgrind's
function-wrap mechanism cannot handle (but it could easily be
extended to do so). <computeroutput>MPI_Pcontrol</computeroutput> cannot be
wrapped as it has variable arity:
<computeroutput>int MPI_Pcontrol(const int level, ...)</computeroutput></para>

<para>Most functions are wrapped with a default wrapper which does
nothing except complain or abort if it is called, depending on
settings in <computeroutput>MPIWRAP_DEBUG</computeroutput> listed
above. The following functions have "real", do-something-useful
wrappers:</para>

<programlisting><![CDATA[
PMPI_Send PMPI_Bsend PMPI_Ssend PMPI_Rsend

PMPI_Recv PMPI_Get_count

PMPI_Isend PMPI_Ibsend PMPI_Issend PMPI_Irsend

PMPI_Irecv
PMPI_Wait PMPI_Waitall
PMPI_Test PMPI_Testall

PMPI_Iprobe PMPI_Probe

PMPI_Cancel

PMPI_Sendrecv

PMPI_Type_commit PMPI_Type_free

PMPI_Pack PMPI_Unpack

PMPI_Bcast PMPI_Gather PMPI_Scatter PMPI_Alltoall
PMPI_Reduce PMPI_Allreduce PMPI_Op_create

PMPI_Comm_create PMPI_Comm_dup PMPI_Comm_free PMPI_Comm_rank PMPI_Comm_size

PMPI_Error_string
PMPI_Init PMPI_Initialized PMPI_Finalize
]]></programlisting>

<para> A few functions such as
<computeroutput>PMPI_Address</computeroutput> are listed as
<computeroutput>HAS_NO_WRAPPER</computeroutput>. They have no wrapper
at all as there is nothing worth checking, and giving a no-op wrapper
would reduce performance for no reason.</para>

<para> Note that the wrapper library can itself generate large
numbers of calls to the MPI implementation, especially when walking
complex types.
The most common functions called are
<computeroutput>PMPI_Extent</computeroutput>,
<computeroutput>PMPI_Type_get_envelope</computeroutput>,
<computeroutput>PMPI_Type_get_contents</computeroutput>, and
<computeroutput>PMPI_Type_free</computeroutput>.</para>
</sect2>

<sect2 id="mc-manual.mpiwrap.limitations.types"
       xreflabel="Types: Abilities and Limitations">
<title>Types</title>

<para> MPI-1.1 structured types are supported, and walked exactly.
The currently supported combiners are
<computeroutput>MPI_COMBINER_NAMED</computeroutput>,
<computeroutput>MPI_COMBINER_CONTIGUOUS</computeroutput>,
<computeroutput>MPI_COMBINER_VECTOR</computeroutput>,
<computeroutput>MPI_COMBINER_HVECTOR</computeroutput>,
<computeroutput>MPI_COMBINER_INDEXED</computeroutput>,
<computeroutput>MPI_COMBINER_HINDEXED</computeroutput> and
<computeroutput>MPI_COMBINER_STRUCT</computeroutput>. This should
cover all MPI-1.1 types. The mechanism (function
<computeroutput>walk_type</computeroutput>) should extend easily to
cover MPI2 combiners.</para>

<para>MPI defines some named structured types
(<computeroutput>MPI_FLOAT_INT</computeroutput>,
<computeroutput>MPI_DOUBLE_INT</computeroutput>,
<computeroutput>MPI_LONG_INT</computeroutput>,
<computeroutput>MPI_2INT</computeroutput>,
<computeroutput>MPI_SHORT_INT</computeroutput>,
<computeroutput>MPI_LONG_DOUBLE_INT</computeroutput>) which are pairs
of some basic type and a C <computeroutput>int</computeroutput>.
Unfortunately the MPI specification makes it impossible to look inside
these types and see where the fields are. Therefore these wrappers
assume the types are laid out as <computeroutput>struct { float val;
int loc; }</computeroutput> (for
<computeroutput>MPI_FLOAT_INT</computeroutput>), etc, and act
accordingly.
This appears to be correct at least for Open MPI 1.0.2
and for Quadrics MPI.</para>

<para>If <computeroutput>strict</computeroutput> is an option specified
in <computeroutput>MPIWRAP_DEBUG</computeroutput>, the application
will abort if an unhandled type is encountered. Otherwise, the
application will print a warning message and continue.</para>

<para>Some effort is made to mark/check memory ranges corresponding to
arrays of values in a single pass. This is important for performance
since asking Valgrind to mark/check any range, no matter how small,
carries quite a large constant cost. This optimisation is applied to
arrays of primitive types (<computeroutput>double</computeroutput>,
<computeroutput>float</computeroutput>,
<computeroutput>int</computeroutput>,
<computeroutput>long</computeroutput>, <computeroutput>long
long</computeroutput>, <computeroutput>short</computeroutput>,
<computeroutput>char</computeroutput>, and <computeroutput>long
double</computeroutput> on platforms where <computeroutput>sizeof(long
double) == 8</computeroutput>). For arrays of all other types, the
wrappers handle each element individually and so there can be a very
large performance cost.</para>

</sect2>


<sect2 id="mc-manual.mpiwrap.writingwrappers"
       xreflabel="Writing new MPI Wrappers">
<title>Writing new wrappers</title>

<para>
For the most part the wrappers are straightforward. The only
significant complexity arises with nonblocking receives.</para>

<para>The issue is that <computeroutput>MPI_Irecv</computeroutput>
states the recv buffer and returns immediately, giving a handle
(<computeroutput>MPI_Request</computeroutput>) for the transaction.
Later the user will have to poll for completion with
<computeroutput>MPI_Wait</computeroutput> etc, and when the
transaction completes successfully, the wrappers have to paint the
recv buffer. But the recv buffer details are not presented to
<computeroutput>MPI_Wait</computeroutput> -- only the handle is. The
library therefore maintains a shadow table which associates
uncompleted <computeroutput>MPI_Request</computeroutput>s with the
corresponding buffer address/count/type. When an operation completes,
the table is searched for the associated address/count/type info, and
memory is marked accordingly.</para>

<para>Access to the table is guarded by a (POSIX pthreads) lock, so as
to make the library thread-safe.</para>

<para>The table is allocated with
<computeroutput>malloc</computeroutput> and never
<computeroutput>free</computeroutput>d, so it will show up in leak
checks.</para>

<para>Writing new wrappers should be fairly easy. The source file is
<computeroutput>mpi/libmpiwrap.c</computeroutput>. If possible,
find an existing wrapper for a function of similar behaviour to the
one you want to wrap, and use it as a starting point. The wrappers
are organised in sections in the same order as the MPI 1.1 spec, to
aid navigation. When adding a wrapper, remember to comment out the
definition of the default wrapper in the long list of defaults at the
bottom of the file (do not remove it, just comment it out).</para>
</sect2>

<sect2 id="mc-manual.mpiwrap.whattoexpect"
       xreflabel="What to expect with MPI Wrappers">
<title>What to expect when using the wrappers</title>

<para>The wrappers should reduce Memcheck's false-error rate on MPI
applications. Because the wrapping is done at the MPI interface,
there will still potentially be a large number of errors reported in
the MPI implementation below the interface.
The best you can do is
try to suppress them.</para>

<para>You may also find that the input-side (buffer
length/definedness) checks find errors in your MPI use, for example
passing too short a buffer to
<computeroutput>MPI_Recv</computeroutput>.</para>

<para>Functions which are not wrapped may increase the false
error rate. A possible approach is to run with
<computeroutput>MPIWRAP_DEBUG</computeroutput> containing
<computeroutput>warn</computeroutput>. This will show you functions
which lack proper wrappers but which are nevertheless used. You can
then write wrappers for them.
</para>

<para>A known source of potential false errors is the
<computeroutput>PMPI_Reduce</computeroutput> family of functions, when
using a custom (user-defined) reduction function. In a reduction
operation, each node notionally sends data to a "central point" which
uses the specified reduction function to merge the data items into a
single item. Hence, in general, data is passed between nodes and fed
to the reduction function, but the wrapper library cannot mark the
transferred data as initialised before it is handed to the reduction
function, because all that happens "inside" the
<computeroutput>PMPI_Reduce</computeroutput> call. As a result you
may see false positives reported in your reduction function.</para>

</sect2>

</sect1>



</chapter>