<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>

<chapter id="cl-format" xreflabel="Callgrind Format Specification">
<title>Callgrind Format Specification</title>

<para>This chapter describes the Callgrind Format, Version 1.</para>

<para>The format description is meant to help users understand the
file contents; more importantly, it is provided so that authors of measurement or
visualization tools can write and read this format.</para>

<sect1 id="cl-format.overview" xreflabel="Overview">
<title>Overview</title>

<para>The profile data format is ASCII based.
It is written by Callgrind, and it is upwards compatible
with the format used by Cachegrind (i.e. Cachegrind uses a subset). It can
be read by callgrind_annotate and KCachegrind.</para>

<para>This section gives an overview of format features and examples.
For detailed syntax, look at the format reference.</para>

<sect2 id="cl-format.overview.basics" xreflabel="Basic Structure">
<title>Basic Structure</title>

<para>To uniquely identify a file as a Callgrind profile, it
should have "# callgrind format" as its first line. This is optional but
recommended for easy format detection.</para>

<para>Each file has a header part consisting of an arbitrary number of lines of the
format "key: value". After the header, lines specifying profile costs
follow. Comments are allowed anywhere, on lines of their own starting with '#'.
The header lines with keys "positions" and "events" define
the meaning of cost lines in the second part of the file: the value of
"positions" is a list of subpositions, and the value of "events" is a list
of event type names.
Cost lines consist of subpositions followed by 64-bit
counters for the events, in the order specified by the "positions" and "events"
header lines.</para>

<para>The "events" header line is always required, in contrast to the optional
line for "positions", which defaults to "line", i.e. a line number of some
source file. In addition, the second part of the file contains position
specifications of the form "spec=name". "spec" can be e.g. "fn" for a
function name or "fl" for a file name. Cost lines always relate to
the function/file specifications given directly before them.</para>

</sect2>

<sect2 id="cl-format.overview.example1" xreflabel="Simple Example">
<title>Simple Example</title>

<para>The event names in the following example are quite arbitrary, and are not
related to event names used by Callgrind. In particular, cycle counts matching
real processors will probably never be generated by any Valgrind tool, as these
are bound to simulations of simple machine models for acceptable slowdown.
However, any profiling tool could use the format described in this chapter.</para>

<para>
<screen># callgrind format
events: Cycles Instructions Flops
fl=file.f
fn=main
15 90 14 2
16 20 12</screen></para>

<para>The above example gives profile information for event types "Cycles",
"Instructions", and "Flops". Thus, cost lines give the number of CPU cycles
passed, the number of executed instructions, and the number of floating point
operations executed while running code corresponding to some source
position. As there is no line specifying the value of "positions", it defaults
to "line", which means that the first number of a cost line is always a line
number.</para>

<para>Thus, the first cost line specifies that in line 15 of source file
<filename>file.f</filename> there is code belonging to function
<function>main</function>.
While running this code, 90 CPU cycles passed, and 2 of
the 14 instructions executed were floating point operations. Similarly, the
next line specifies that there were 12 instructions executed in the context
of function <function>main</function> which can be related to line 16 in
file <filename>file.f</filename>, taking 20 CPU cycles. If a cost line
specifies fewer event counts than given in the "events" line, the rest are
assumed to be zero. That is, no floating point instruction was executed
relating to line 16.</para>

<para>Note that regular cost lines always give the self (also called exclusive)
cost of code at a given position. If you specify multiple cost lines for the
same position, these will be summed up. On the other hand, in the example above
there is no specification of how many times function
<function>main</function> was actually
called: profile data only contains sums.</para>

</sect2>


<sect2 id="cl-format.overview.associations" xreflabel="Associations">
<title>Associations</title>

<para>The most important extension to the original format of Cachegrind is the
ability to specify call relationships among functions. More generally, you
specify associations among positions. For this, the second part of the
file can also contain association specifications. These look similar to
position specifications, but consist of two lines. For calls, the format
looks like
<screen>
 calls=(Call Count) (Target position)
 (Source position) (Inclusive cost of call)
</screen></para>

<para>The destination only specifies subpositions such as a line number.
Therefore,
to be able to specify a call to another function in another source file, you
have to precede the above lines with a "cfn=" specification for the name of the
called function, and optionally a "cfi=" specification if the function is in
another source file ("cfl=" is an alternative specification for "cfi=" for
historical reasons, and both should be supported by format readers).
The second line looks like a regular cost line, with the difference
that the inclusive cost spent inside the function call has to be specified.</para>

<para>Other associations are, for example, (conditional) jumps. See the
reference below for details.</para>

</sect2>


<sect2 id="cl-format.overview.example2" xreflabel="Extended Example">
<title>Extended Example</title>

<para>The following example shows 3 functions, <function>main</function>,
<function>func1</function>, and <function>func2</function>. Function
<function>main</function> calls <function>func1</function> once and
<function>func2</function> 3 times. <function>func1</function> calls
<function>func2</function> 2 times.

<screen># callgrind format
events: Instructions

fl=file1.c
fn=main
16 20
cfn=func1
calls=1 50
16 400
cfi=file2.c
cfn=func2
calls=3 20
16 400

fn=func1
51 100
cfi=file2.c
cfn=func2
calls=2 20
51 300

fl=file2.c
fn=func2
20 700</screen></para>

<para>One can see that in <function>main</function> only code at line 16
is executed, which is also where the other functions are called. The inclusive cost of
<function>main</function> is 820, which is the sum of the self cost 20 and the costs
spent in the calls: 400 for the single call to <function>func1</function>
and 400 as the sum for the three calls to <function>func2</function>.</para>

<para>Function <function>func1</function> is located in
<filename>file1.c</filename>, the same as <function>main</function>.
Therefore, a "cfi=" specification for the call to <function>func1</function>
is not needed. The function <function>func1</function> only consists of code
at line 51 of <filename>file1.c</filename>, where <function>func2</function>
is called.</para>

</sect2>


<sect2 id="cl-format.overview.compression1" xreflabel="Name Compression">
<title>Name Compression</title>

<para>With the introduction of association specifications like calls, it becomes
necessary to specify the same function or file name multiple times. As
absolute filenames or symbol names in C++ can be quite long, it is advantageous
to be able to specify integer IDs for position specifications.
Here, the term "position" corresponds to a file name (source or object file)
or function name.</para>

<para>To support name compression, a position specification can be not only of
the format "spec=name", but also "spec=(ID) name" to specify a mapping of an
integer ID to a name, and "spec=(ID)" to reference a previously defined ID
mapping. There is a separate ID mapping for each position specification,
i.e. you can use ID 1 for both a file name and a symbol name.</para>

<para>With name compression, the example from above looks like this:
<screen># callgrind format
events: Instructions

fl=(1) file1.c
fn=(1) main
16 20
cfn=(2) func1
calls=1 50
16 400
cfi=(2) file2.c
cfn=(3) func2
calls=3 20
16 400

fn=(2)
51 100
cfi=(2)
cfn=(3)
calls=2 20
51 300

fl=(2)
fn=(3)
20 700</screen></para>

<para>As position specifications carry no information themselves, but only change
the meaning of subsequent cost lines or associations, they can appear
anywhere in the file without any negative consequence. In particular, you can
define name compression mappings directly after the header, and before any cost
lines.
Thus, the above example can also be written as
<screen># callgrind format
events: Instructions

# define file ID mapping
fl=(1) file1.c
fl=(2) file2.c
# define function ID mapping
fn=(1) main
fn=(2) func1
fn=(3) func2

fl=(1)
fn=(1)
16 20
...</screen></para>

</sect2>


<sect2 id="cl-format.overview.compression2" xreflabel="Subposition Compression">
<title>Subposition Compression</title>

<para>If a Callgrind data file should hold costs for each assembler instruction
of a program, you specify subposition "instr" in the "positions:" header line,
and each cost line has to include the address of some instruction. Addresses
are allowed to have a size of 64 bits to support 64-bit architectures. Thus,
repeating similar, long addresses for almost every line in the data file can
enlarge the file size quite significantly, which
motivates subposition compression: instead of every cost line starting with
a 16 character long address, one is allowed to specify relative addresses.
This relative specification is not only allowed for instruction addresses, but
also for line numbers; both addresses and line numbers are called "subpositions".</para>

<para>A relative subposition is always based on the corresponding subposition
of the last cost line, and starts with a "+" to specify a positive difference,
a "-" to specify a negative difference, or consists of "*" to specify the same
subposition. Because absolute subpositions are always positive (i.e. never
prefixed by "-"), any relative specification is unambiguous; additionally,
absolute and relative subposition specifications can be mixed freely.
Assume the following example (subpositions can always be specified
as hexadecimal numbers, beginning with "0x"):
<screen># callgrind format
positions: instr line
events: ticks

fn=func
0x80001234 90 1
0x80001237 90 5
0x80001238 91 6</screen></para>

<para>With subposition compression, this looks like
<screen># callgrind format
positions: instr line
events: ticks

fn=func
0x80001234 90 1
+3 * 5
+1 +1 6</screen></para>

<para>Remark: For assembler annotation to work, instruction addresses have to
be corrected to correspond to addresses found in the original binary. That is, for
relocatable shared objects, often a load offset has to be subtracted.</para>

</sect2>


<sect2 id="cl-format.overview.misc" xreflabel="Miscellaneous">
<title>Miscellaneous</title>

<sect3 id="cl-format.overview.misc.summary" xreflabel="Cost Summary Information">
<title>Cost Summary Information</title>

<para>For the visualization to be able to show cost percentages, the sum of the
cost of the full run has to be known. Usually, it is assumed that this is the
sum of all cost lines in a file. But sometimes, this is not correct. Thus, you
can specify a "summary:" line in the header giving the full cost for the
profile run. An import filter may use this to show a progress bar
while loading a large data file.</para>

</sect3>

<sect3 id="cl-format.overview.misc.events" xreflabel="Long Names for Event Types and inherited Types">
<title>Long Names for Event Types and inherited Types</title>

<para>Event types for cost lines are specified in the "events:" line with an
abbreviated name. For visualization, it makes sense to be able to specify some
longer, more descriptive name.
For an event type "Ir" which means "Instruction
Fetches", this can be specified with the header line
<screen>event: Ir : Instruction Fetches
events: Ir Dr</screen></para>

<para>In this example, "Dr" itself has no long name associated. The order of
"event:" lines and the "events:" line is of no importance. Additionally,
inherited event types can be introduced for which no raw data is available, but
which are calculated from given types. Continuing the last example, you could add
<screen>event: Sum = Ir + Dr</screen>
to specify an additional event type "Sum", which is calculated by adding costs
for "Ir" and "Dr".</para>

</sect3>

</sect2>

</sect1>

<sect1 id="cl-format.reference" xreflabel="Reference">
<title>Reference</title>

<sect2 id="cl-format.reference.grammar" xreflabel="Grammar">
<title>Grammar</title>

<para>
<screen>ProfileDataFile := FormatSpec? FormatVersion? Creator? PartData*</screen>
<screen>FormatSpec := "# callgrind format\n"</screen>
<screen>FormatVersion := "version: 1\n"</screen>
<screen>Creator := "creator:" NoNewLineChar* "\n"</screen>
<screen>PartData := (HeaderLine "\n")+ (BodyLine "\n")+</screen>
<screen>HeaderLine := (empty line)
  | ('#' NoNewLineChar*)
  | PartDetail
  | Description
  | EventSpecification
  | CostLineDef</screen>
<screen>PartDetail := TargetCommand | TargetID</screen>
<screen>TargetCommand := "cmd:" Space* NoNewLineChar*</screen>
<screen>TargetID := ("pid"|"thread"|"part") ":" Space* Number</screen>
<screen>Description := "desc:" Space* Name Space* ":" NoNewLineChar*</screen>
<screen>EventSpecification := "event:" Space* Name InheritedDef? LongNameDef?</screen>
<screen>InheritedDef := "=" InheritedExpr</screen>
<screen>InheritedExpr := Name
  | Number Space* ("*" Space*)? Name
  | InheritedExpr Space* "+" Space* InheritedExpr</screen>
<screen>LongNameDef := ":" NoNewLineChar*</screen>
<screen>CostLineDef := "events:" Space* Name (Space+ Name)*
  | "positions:" "instr"? (Space+ "line")?</screen>
<screen>BodyLine := (empty line)
  | ('#' NoNewLineChar*)
  | CostLine
  | PositionSpec
  | CallSpec
  | UncondJumpSpec
  | CondJumpSpec</screen>
<screen>CostLine := SubPositionList Costs?</screen>
<screen>SubPositionList := (SubPosition+ Space+)+</screen>
<screen>SubPosition := Number | "+" Number | "-" Number | "*"</screen>
<screen>Costs := (Number Space+)+</screen>
<screen>PositionSpec := Position "=" Space* PositionName</screen>
<screen>Position := CostPosition | CalledPosition</screen>
<screen>CostPosition := "ob" | "fl" | "fi" | "fe" | "fn"</screen>
<screen>CalledPosition := "cob" | "cfi" | "cfl" | "cfn"</screen>
<screen>PositionName := ( "(" Number ")" )? (Space* NoNewLineChar* )?</screen>
<screen>CallSpec := CallLine "\n" CostLine</screen>
<screen>CallLine := "calls=" Space* Number Space+ SubPositionList</screen>
<screen>UncondJumpSpec := "jump=" Space* Number Space+ SubPositionList</screen>
<screen>CondJumpSpec := "jcnd=" Space* Number Space+ Number Space+ SubPositionList</screen>
<screen>Space := " " | "\t"</screen>
<screen>Number := HexNumber | (Digit)+</screen>
<screen>Digit := "0" | ... | "9"</screen>
<screen>HexNumber := "0x" (Digit | HexChar)+</screen>
<screen>HexChar := "a" | ... | "f" | "A" | ... | "F"</screen>
<screen>Name := Alpha (Digit | Alpha)*</screen>
<screen>Alpha := "a" | ... | "z" | "A" | ... | "Z"</screen>
<screen>NoNewLineChar := all characters without "\n"</screen>
</para>

<para>A profile data file ("ProfileDataFile") starts with basic information
  such as a format marker, the version and creator information, and then has a list of parts, where
  each part has its own header and body.
Parts are typically different threads
  and/or time spans/phases within a profiled application run.</para>

<para>Note that callgrind_annotate currently only supports profile data files with
  one part. Callgrind may produce multiple parts for one profile run, but defaults
  to one output file for each part.</para>

</sect2>

<sect2 id="cl-format.reference.header" xreflabel="Description of Header Lines">
<title>Description of Header Lines</title>

<para>Basic information in the first lines of a profile data file:</para>

<itemizedlist>
 <listitem>
  <para><computeroutput># callgrind format</computeroutput> [Callgrind]</para>
  <para>This line specifies that the file is a callgrind profile,
    and it has to be the first line. It was added late to the
    format (with Valgrind 3.13) and is optional, as all readers
    should also work with older callgrind profiles not including this line.
    However, generation of this line is recommended to allow desktop
    environments and file managers to uniquely detect the format.</para>
 </listitem>

 <listitem>
  <para><computeroutput>version: number</computeroutput> [Callgrind]</para>
  <para>This is used to distinguish future profile data formats. A
    major version of 0 or 1 is supposed to be upwards compatible with
    Cachegrind's format. It is optional; if it does not appear, version 1
    is assumed. Otherwise, it has to follow directly after the format
    specification (i.e. be the first line if the optional format
    specification is skipped).</para>
 </listitem>

 <listitem>
  <para><computeroutput>creator: string</computeroutput> [Callgrind]</para>
  <para>This is an arbitrary string to denote the creator of this file.
    Optional.</para>
 </listitem>

</itemizedlist>

<para>The header for each part has an arbitrary number of lines of the format
"key: value".
Possible <emphasis>key</emphasis> values for the header are:</para>

<itemizedlist>

 <listitem>
  <para><computeroutput>pid: process id</computeroutput> [Callgrind]</para>
  <para>Optional. This specifies the process ID of the supervised application
    for which this profile was generated.</para>
 </listitem>

 <listitem>
  <para><computeroutput>cmd: program name + args</computeroutput> [Cachegrind]</para>
  <para>Optional. This specifies the full command line of the supervised
    application for which this profile was generated.</para>
 </listitem>

 <listitem>
  <para><computeroutput>part: number</computeroutput> [Callgrind]</para>
  <para>Optional. This specifies a sequentially incremented number for each dump
    generated, starting at 1.</para>
 </listitem>

 <listitem>
  <para><computeroutput>desc: type: value</computeroutput> [Cachegrind]</para>
  <para>This specifies various information for this dump. For some
    types, the semantics are defined, but any description type is allowed.
    Unknown types should be ignored.</para>
  <para>There are the types "I1 cache", "D1 cache", "LL cache", which
    specify parameters used for the cache simulator. These are the only
    types originally used by Cachegrind. Additionally, Callgrind uses
    the following types: "Timerange" gives a rough range of the basic
    block counter, for which the cost of this dump was collected.
    Type "Trigger" states the reason why this trace was generated,
    e.g. program termination or a forced interactive dump.</para>
 </listitem>

 <listitem>
  <para><computeroutput>positions: [instr] [line]</computeroutput> [Callgrind]</para>
  <para>For cost lines, this defines the semantics of the first numbers.
    Any combination of "instr", "bb" and "line" is allowed, but has to be
    in this order, which corresponds to the position numbers at the start of
    the cost lines later in the file.</para>
  <para>If "instr" is specified, the position is the address of an
    instruction whose execution raised the events given later on the
    line. This address is relative to the offset of the binary/shared
    library file, so that relocation info does not have to be specified. For "line",
    the position is the line number of a source file, which is
    responsible for the events raised. Note that the mapping between "instr"
    and "line" positions is given by the debugging line information
    produced by the compiler.</para>
  <para>This header line is optional, defaulting to "positions:
    line" if not specified.</para>
 </listitem>

 <listitem>
  <para><computeroutput>events: event type abbreviations</computeroutput> [Cachegrind]</para>
  <para>A list of short names of the event types logged in cost
    lines in this part of the profile data file. Arbitrary short
    names are allowed. The order given specifies the required order
    in cost lines. Thus, the first event type is the second or third
    number in a cost line, depending on the value of "positions".
    Required to appear exactly once in each header part.</para>
 </listitem>

 <listitem>
  <para><computeroutput>summary: costs</computeroutput> [Callgrind]</para>
  <para>Optional. This header line specifies a summary cost, which should be
    equal to or larger than the total over all self costs. It may be larger as
    the cost lines may not represent all cost of the program run.</para>
 </listitem>

 <listitem>
  <para><computeroutput>totals: costs</computeroutput> [Cachegrind]</para>
  <para>Optional. Should appear at the end of the file (although
    looking like a header line).
Must give the total of all cost lines,
    to allow for a consistency check.</para>
 </listitem>

</itemizedlist>

</sect2>

<sect2 id="cl-format.reference.body" xreflabel="Description of Body Lines">
<title>Description of Body Lines</title>

<para>The regular body line is a cost line consisting of one or two
position numbers (depending on the "positions:" header line, see above)
and an array of cost numbers. A position number is either a
line number in a source file or an instruction address within binary
code, with source/binary file names specified as position names (see
below). The cost numbers get mapped to event types in the same order
as specified in the "events:" header line. If fewer numbers than event
types are given, the costs default to zero for the remaining event
types.</para>

<para>Further, there exist lines
<computeroutput>spec=position name</computeroutput>. A position name
is an arbitrary string. If it starts with "(" and a
digit, it's a string in compressed format. Otherwise it's the real
position string. This allows for file and symbol names as position
strings, as these never start with "(" + <emphasis>digit</emphasis>.
The compressed format is either "(" <emphasis>number</emphasis> ")"
<emphasis>space</emphasis> <emphasis>position</emphasis> or only
"(" <emphasis>number</emphasis> ")". The first relates
<emphasis>position</emphasis> to <emphasis>number</emphasis> in the
context of the given format specification from this line to the end of
the file; it makes the (<emphasis>number</emphasis>) an alias for
<emphasis>position</emphasis>.
Compressed format is always
optional.</para>

<para>Position specifications allowed:</para>
<itemizedlist>

 <listitem>
  <para><computeroutput>ob=</computeroutput> [Callgrind]</para>
  <para>The ELF object where the cost of the next cost lines happens.</para>
 </listitem>

 <listitem>
  <para><computeroutput>fl=</computeroutput> [Cachegrind]</para>
 </listitem>

 <listitem>
  <para><computeroutput>fi=</computeroutput> [Cachegrind]</para>
 </listitem>

 <listitem>
  <para><computeroutput>fe=</computeroutput> [Cachegrind]</para>
  <para>The source file including the code which is responsible for
    the cost of the next cost lines. "fi="/"fe=" is used when the source
    file changes inside of a function, i.e. for inlined code.</para>
 </listitem>

 <listitem>
  <para><computeroutput>fn=</computeroutput> [Cachegrind]</para>
  <para>The name of the function where the cost of the next cost lines
    happens.</para>
 </listitem>

 <listitem>
  <para><computeroutput>cob=</computeroutput> [Callgrind]</para>
  <para>The ELF object of the target of the next call cost lines.</para>
 </listitem>

 <listitem>
  <para><computeroutput>cfi=</computeroutput> [Callgrind]</para>
  <para>The source file including the code of the target of the
    next call cost lines.</para>
 </listitem>

 <listitem>
  <para><computeroutput>cfl=</computeroutput> [Callgrind]</para>
  <para>Alternative spelling for the <computeroutput>cfi=</computeroutput>
    specification (for historical reasons).</para>
 </listitem>

 <listitem>
  <para><computeroutput>cfn=</computeroutput> [Callgrind]</para>
  <para>The name of the target function of the next call cost
    lines.</para>
 </listitem>

</itemizedlist>

<para>The last type of body line provides specific costs that, unlike
regular cost lines, are not related to just one position.
It starts with specific
strings similar to position name specifications.</para>

<itemizedlist>

 <listitem>
  <para><computeroutput>calls=count target-position</computeroutput> [Callgrind]</para>
  <para>Call executed "count" times to "target-position".
    After a "calls=" line there MUST be a cost line. This provides the source position
    of the call and the cost spent in the called function in total.</para>
 </listitem>

 <listitem>
  <para><computeroutput>jump=count target-position</computeroutput> [Callgrind]</para>
  <para>Unconditional jump, executed "count" times, to "target-position".</para>
 </listitem>

 <listitem>
  <para><computeroutput>jcnd=exe-count jump-count target-position</computeroutput> [Callgrind]</para>
  <para>Conditional jump, executed "exe-count" times with "jump-count" jumps
    happening (rest is fall-through) to "target-position".</para>
 </listitem>

</itemizedlist>

</sect2>

</sect1>

</chapter>