1<?xml version="1.0"?> <!-- -*- sgml -*- -->
2<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
3  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
4[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>
5
6<chapter id="cl-format" xreflabel="Callgrind Format Specification">
7<title>Callgrind Format Specification</title>
8
9<para>This chapter describes the Callgrind Format, Version 1.</para>
10
<para>This description is meant to help users understand the file contents.
More importantly, it is intended for authors of measurement or visualization
tools, so that they can write and read this format.</para>
14
15<sect1 id="cl-format.overview" xreflabel="Overview">
16<title>Overview</title>
17
<para>The profile data format is ASCII based.
It is written by Callgrind, and it is upward compatible
with the format used by Cachegrind (i.e. Cachegrind uses a subset). It can
be read by callgrind_annotate and KCachegrind.</para>
22
<para>This chapter gives an overview of the format features, with examples.
For the detailed syntax, see the format reference.</para>
25
26<sect2 id="cl-format.overview.basics" xreflabel="Basic Structure">
27<title>Basic Structure</title>
28
<para>To uniquely identify a file as a Callgrind profile, it
should contain "# callgrind format" as its first line. This is optional but
recommended for easy format detection.</para>
32
<para>Each file has a header part consisting of an arbitrary number of lines of the
format "key: value". After the header, lines specifying profile costs
follow. Comments are allowed anywhere, on lines of their own starting with '#'.
The header lines with keys "positions" and "events" define
the meaning of cost lines in the second part of the file: the value of
"positions" is a list of subpositions, and the value of "events" is a list
of event type names. Cost lines consist of subpositions followed by 64-bit
counters for the events, in the order specified by the "positions" and "events"
header lines.</para>
42
43<para>The "events" header line is always required in contrast to the optional
44line for "positions", which defaults to "line", i.e. a line number of some
45source file. In addition, the second part of the file contains position
46specifications of the form "spec=name". "spec" can be e.g. "fn" for a
47function name or "fl" for a file name. Cost lines are always related to
48the function/file specifications given directly before.</para>
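<para>The split between header and body described above can be sketched in
Python as follows. This is only an illustration under simplifying assumptions,
not the reader used by any Valgrind tool: the function name
<computeroutput>split_header_body</computeroutput> is hypothetical, the file is
assumed to contain a single part, and a repeated header key keeps only its last
value.</para>
<programlisting><![CDATA[
```python
def split_header_body(text):
    """Split a single-part profile into header fields and body lines."""
    header = {"positions": "line"}  # "positions" defaults to "line"
    body = []
    in_header = True
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines carry no data
        if in_header:
            key, sep, value = line.partition(":")
            # Assumption: header keys are plain words such as "events".
            # The first line that does not look like "key: value" starts the body.
            if sep and key.isalpha():
                header[key] = value.strip()
                continue
            in_header = False
        body.append(line)
    if "events" not in header:
        raise ValueError('the "events" header line is required')
    return header, body


sample = """# callgrind format
events: Cycles Instructions Flops
fl=file.f
fn=main
15 90 14 2
16 20 12
"""
header, body = split_header_body(sample)
print(header["events"].split())
print(body)
```
]]></programlisting>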
49
50</sect2>
51
52<sect2 id="cl-format.overview.example1" xreflabel="Simple Example">
53<title>Simple Example</title>
54
<para>The event names in the following example are arbitrary and are not
related to the event names used by Callgrind. In particular, cycle counts matching
real processors will probably never be generated by any Valgrind tool, as these
tools are bound to simulations of simple machine models to keep the slowdown acceptable.
However, any profiling tool could use the format described in this chapter.</para>
60
61<para>
62<screen># callgrind format
63events: Cycles Instructions Flops
64fl=file.f
65fn=main
6615 90 14 2
6716 20 12</screen></para>
68
69<para>The above example gives profile information for event types "Cycles",
70"Instructions", and "Flops". Thus, cost lines give the number of CPU cycles
71passed by, number of executed instructions, and number of floating point
72operations executed while running code corresponding to some source
73position. As there is no line specifying the value of "positions", it defaults
74to "line", which means that the first number of a cost line is always a line
75number.</para>
76
<para>Thus, the first cost line specifies that in line 15 of source file
<filename>file.f</filename> there is code belonging to function
<function>main</function>. While running this code, 90 CPU cycles passed, and 2 of
the 14 instructions executed were floating point operations. Similarly, the
next line specifies that 12 instructions were executed in the context
of function <function>main</function> which can be related to line 16 in
file <filename>file.f</filename>, taking 20 CPU cycles. If a cost line
specifies fewer event counts than given in the "events" line, the rest are
assumed to be zero, i.e. no floating point operation was executed
for line 16.</para>
87
88<para>Note that regular cost lines always give self (also called exclusive)
89cost of code at a given position. If you specify multiple cost lines for the
90same position, these will be summed up. On the other hand, in the example above
91there is no specification of how many times function
92<function>main</function> actually was
93called: profile data only contains sums.</para>
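<para>The two rules just stated (missing event counts default to zero, and
several cost lines for the same position are summed) can be sketched as
follows. The helper name <computeroutput>sum_self_costs</computeroutput> is
hypothetical, and the sketch assumes the default "positions: line", absolute
line numbers, and only "fl=" and "fn=" position specifications.</para>
<programlisting><![CDATA[
```python
from collections import defaultdict


def sum_self_costs(events, body_lines):
    """Sum self cost per (file, function, line), padding missing counts with 0."""
    totals = defaultdict(lambda: [0] * len(events))
    current = {"fl": None, "fn": None}
    for line in body_lines:
        spec, sep, name = line.partition("=")
        if sep and spec in current:
            current[spec] = name.strip()  # position specification
            continue
        fields = line.split()
        lineno = int(fields[0])
        costs = [int(f) for f in fields[1:]]
        costs += [0] * (len(events) - len(costs))  # the rest is assumed to be zero
        key = (current["fl"], current["fn"], lineno)
        totals[key] = [a + b for a, b in zip(totals[key], costs)]
    return dict(totals)


events = ["Cycles", "Instructions", "Flops"]
# The last line is an extra cost line for line 16, added to show the summing.
lines = ["fl=file.f", "fn=main", "15 90 14 2", "16 20 12", "16 5 1"]
result = sum_self_costs(events, lines)
print(result)
```
]]></programlisting>
<para>In this sketch, line 16 ends up with the counts 25, 13 and 0: the two
cost lines are added, and the missing "Flops" count is treated as zero.</para>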
94
95</sect2>
96
97
98<sect2 id="cl-format.overview.associations" xreflabel="Associations">
99<title>Associations</title>
100
<para>The most important extension to the original format of Cachegrind is the
ability to specify call relationships among functions. More generally, you
specify associations among positions. For this, the second part of the
file can also contain association specifications. These look similar to
position specifications, but consist of two lines. For calls, the format
looks like
107<screen>
108 calls=(Call Count) (Target position)
109 (Source position) (Inclusive cost of call)
110</screen></para>
111
<para>The target position only specifies subpositions such as a line number. Therefore,
to specify a call to another function in another source file, you
have to precede the above lines with a "cfn=" specification for the name of the
called function, and optionally a "cfi=" specification if the function is in
another source file ("cfl=" is an alternative spelling of "cfi=" that exists for
historical reasons; format readers should support both).
The second line looks like a regular cost line, with the difference
that it gives the inclusive cost spent inside the function call.</para>
120
121<para>Other associations are for example (conditional) jumps. See the
122reference below for details.</para>
123
124</sect2>
125
126
127<sect2 id="cl-format.overview.example2" xreflabel="Extended Example">
128<title>Extended Example</title>
129
130<para>The following example shows 3 functions, <function>main</function>,
131<function>func1</function>, and <function>func2</function>. Function
132<function>main</function> calls <function>func1</function> once and
133<function>func2</function> 3 times. <function>func1</function> calls
134<function>func2</function> 2 times.
135
136<screen># callgrind format
137events: Instructions
138
139fl=file1.c
140fn=main
14116 20
142cfn=func1
143calls=1 50
14416 400
145cfi=file2.c
146cfn=func2
147calls=3 20
14816 400
149
150fn=func1
15151 100
152cfi=file2.c
153cfn=func2
154calls=2 20
15551 300
156
157fl=file2.c
158fn=func2
15920 700</screen></para>
160
<para>One can see that in <function>main</function> only code from line 16
is executed, which is also where the other functions are called. The inclusive cost of
<function>main</function> is 820, which is the sum of the self cost of 20 and the costs
spent in the calls: 400 for the single call to <function>func1</function>
and 400 in total for the three calls to <function>func2</function>.</para>
166
167<para>Function <function>func1</function> is located in
168<filename>file1.c</filename>, the same as <function>main</function>.
169Therefore, a "cfi=" specification for the call to <function>func1</function>
170is not needed. The function <function>func1</function> only consists of code
171at line 51 of <filename>file1.c</filename>, where <function>func2</function>
172is called.</para>
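<para>The arithmetic behind the inclusive cost can be reproduced with a short
Python sketch. The function name
<computeroutput>self_and_inclusive</computeroutput> is hypothetical, and the
sketch assumes a single event type, "positions: line", and uncompressed
names. The cost line following a "calls=" line is counted as call cost, every
other cost line as self cost.</para>
<programlisting><![CDATA[
```python
def self_and_inclusive(body_lines):
    """Return {function: (self cost, inclusive cost)} for a one-event profile."""
    self_cost, call_cost = {}, {}
    fn = None
    after_calls = False
    for raw in body_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("fn="):
            fn = line[3:].strip()
            self_cost.setdefault(fn, 0)
            call_cost.setdefault(fn, 0)
        elif line.startswith("calls="):
            after_calls = True  # the next cost line gives the cost of the call
        elif "=" in line:
            continue  # fl=, cfi=, cfn=: not needed for this sum
        else:
            cost = int(line.split()[1])
            if after_calls:
                call_cost[fn] += cost
            else:
                self_cost[fn] += cost
            after_calls = False
    return {f: (self_cost[f], self_cost[f] + call_cost[f]) for f in self_cost}


example = """fl=file1.c
fn=main
16 20
cfn=func1
calls=1 50
16 400
cfi=file2.c
cfn=func2
calls=3 20
16 400

fn=func1
51 100
cfi=file2.c
cfn=func2
calls=2 20
51 300

fl=file2.c
fn=func2
20 700""".splitlines()

costs = self_and_inclusive(example)
print(costs)
```
]]></programlisting>
<para>For the example data this yields a self cost of 20 and an inclusive
cost of 820 for <function>main</function>.</para>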
173
174</sect2>
175
176
177<sect2 id="cl-format.overview.compression1" xreflabel="Name Compression">
178<title>Name Compression</title>
179
<para>With the introduction of association specifications such as calls, it becomes
necessary to specify the same function or file name multiple times. As
absolute file names or symbol names in C++ can be quite long, it is advantageous
to be able to specify integer IDs for position specifications.
Here, the term "position" corresponds to a file name (source or object file)
or function name.</para>
186
187<para>To support name compression, a position specification can be not only of
188the format "spec=name", but also "spec=(ID) name" to specify a mapping of an
189integer ID to a name, and "spec=(ID)" to reference a previously defined ID
190mapping. There is a separate ID mapping for each position specification,
191i.e. you can use ID 1 for both a file name and a symbol name.</para>
192
<para>With name compression, the extended example from above looks like this:
194<screen># callgrind format
195events: Instructions
196
197fl=(1) file1.c
198fn=(1) main
19916 20
200cfn=(2) func1
201calls=1 50
20216 400
203cfi=(2) file2.c
204cfn=(3) func2
205calls=3 20
20616 400
207
208fn=(2)
20951 100
210cfi=(2)
211cfn=(3)
212calls=2 20
21351 300
214
215fl=(2)
216fn=(3)
21720 700</screen></para>
218
<para>As position specifications carry no information themselves, but only change
the meaning of subsequent cost lines or associations, they can appear
anywhere in the file without any negative consequence. In particular, you can
define name compression mappings directly after the header, and before any cost
lines. Thus, the above example can also be written as
224<screen># callgrind format
225events: Instructions
226
227# define file ID mapping
228fl=(1) file1.c
229fl=(2) file2.c
230# define function ID mapping
231fn=(1) main
232fn=(2) func1
233fn=(3) func2
234
235fl=(1)
236fn=(1)
23716 20
238...</screen></para>
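<para>A reader can resolve compressed names with a small lookup table, as
in the following sketch. The class name
<computeroutput>NameTable</computeroutput> and the grouping of specifications
into three ID spaces (object, file, function) are assumptions made for this
illustration. The grouping is inferred from the example above, where an ID
defined by "cfn=" is later referenced by "fn=".</para>
<programlisting><![CDATA[
```python
import re

# "(ID) name" defines a mapping, "(ID)" alone references an earlier one.
_COMPRESSED = re.compile(r"\((\d+)\)\s*(.*)")

# Assumption: specifications naming the same kind of thing share one ID space.
_FAMILY = {
    "ob": "object", "cob": "object",
    "fl": "file", "fi": "file", "fe": "file", "cfi": "file", "cfl": "file",
    "fn": "function", "cfn": "function",
}


class NameTable:
    """Resolve position names that may use name compression."""

    def __init__(self):
        self._tables = {}

    def resolve(self, spec, value):
        value = value.strip()
        match = _COMPRESSED.fullmatch(value)
        if match is None:
            return value  # uncompressed name
        ident, name = int(match.group(1)), match.group(2).strip()
        table = self._tables.setdefault(_FAMILY[spec], {})
        if name:
            table[ident] = name  # definition: remember the mapping
        return table[ident]  # raises KeyError for an undefined ID


names = NameTable()
print(names.resolve("fl", "(1) file1.c"))
print(names.resolve("cfn", "(2) func1"))
print(names.resolve("fn", "(2)"))
```
]]></programlisting>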
239
240</sect2>
241
242
243<sect2 id="cl-format.overview.compression2" xreflabel="Subposition Compression">
244<title>Subposition Compression</title>
245
<para>If a Callgrind data file should hold costs for each assembler instruction
of a program, you specify subposition "instr" in the "positions:" header line,
and each cost line has to include the address of some instruction. Addresses
are allowed to have a size of 64 bits to support 64-bit architectures. Thus,
repeating similar, long addresses for almost every line in the data file can
enlarge the file size significantly. This
motivates subposition compression: instead of starting every cost line with
a 16-character address, one is allowed to specify relative addresses.
Relative specification is allowed not only for instruction addresses, but
also for line numbers; both addresses and line numbers are called "subpositions".</para>
256
<para>A relative subposition is always based on the corresponding subposition
of the last cost line. It starts with a "+" to specify a positive difference or
a "-" to specify a negative difference, or consists of "*" to specify the same
subposition. Because absolute subpositions are always positive (i.e. never
prefixed by "-"), any relative specification is unambiguous; additionally,
absolute and relative subposition specifications can be mixed freely.
Consider the following example (subpositions can always be specified
as hexadecimal numbers, beginning with "0x"):
265<screen># callgrind format
266positions: instr line
267events: ticks
268
269fn=func
2700x80001234 90 1
2710x80001237 90 5
2720x80001238 91 6</screen></para>
273
274<para>With subposition compression, this looks like
275<screen># callgrind format
276positions: instr line
277events: ticks
278
279fn=func
2800x80001234 90 1
281+3 * 5
282+1 +1 6</screen></para>
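<para>Decoding relative subpositions only requires remembering the
subpositions of the previous cost line. The following Python sketch shows one
way to do this. The function names are hypothetical, and the sketch handles
plain cost lines only, not the target positions of association lines.</para>
<programlisting><![CDATA[
```python
def _number(text):
    """Parse a decimal or "0x"-prefixed hexadecimal number."""
    if text.lower().startswith("0x"):
        return int(text, 16)
    return int(text, 10)


def decode_cost_lines(lines, n_positions):
    """Expand relative subpositions; return (subpositions, costs) per line."""
    previous = [0] * n_positions
    decoded = []
    for line in lines:
        fields = line.split()
        current = []
        for token, last in zip(fields[:n_positions], previous):
            if token == "*":
                current.append(last)  # same subposition as before
            elif token[0] == "+":
                current.append(last + _number(token[1:]))
            elif token[0] == "-":
                current.append(last - _number(token[1:]))
            else:
                current.append(_number(token))  # absolute subposition
        costs = [_number(f) for f in fields[n_positions:]]
        decoded.append((current, costs))
        previous = current
    return decoded


compressed = ["0x80001234 90 1", "+3 * 5", "+1 +1 6"]
for subpositions, costs in decode_cost_lines(compressed, 2):
    print(hex(subpositions[0]), subpositions[1], costs)
```
]]></programlisting>
<para>Applied to the compressed example, this recovers the addresses
0x80001234, 0x80001237 and 0x80001238 with line numbers 90, 90 and 91.</para>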
283
<para>Remark: For assembler annotation to work, instruction addresses have to
be corrected to correspond to addresses found in the original binary, i.e. for
relocatable shared objects, a load offset often has to be subtracted.</para>
287
288</sect2>
289
290
291<sect2 id="cl-format.overview.misc" xreflabel="Miscellaneous">
292<title>Miscellaneous</title>
293
294<sect3 id="cl-format.overview.misc.summary" xreflabel="Cost Summary Information">
295<title>Cost Summary Information</title>
296
<para>For the visualization to be able to show cost percentages, the total
cost of the full run has to be known. Usually, it is assumed that this is the
sum of all cost lines in a file. But sometimes this is not correct. Thus, you
can specify a "summary:" line in the header giving the full cost for the
profile run. An import filter may also use this to show a progress bar
while loading a large data file.</para>
303
304</sect3>
305
306<sect3 id="cl-format.overview.misc.events" xreflabel="Long Names for Event Types and inherited Types">
307<title>Long Names for Event Types and inherited Types</title>
308
<para>Event types for cost lines are specified in the "events:" line with an
abbreviated name. For visualization, it makes sense to be able to specify a
longer, more descriptive name. For an event type "Ir", which means "Instruction
Fetches", this can be specified with the header line
313<screen>event: Ir : Instruction Fetches
314events: Ir Dr</screen></para>
315
<para>In this example, "Dr" itself has no long name associated. The order of
"event:" lines and the "events:" line is of no importance. Additionally,
inherited event types can be introduced for which no raw data is available, but
which are calculated from given types. Continuing the last example, you could add
<screen>event: Sum = Ir + Dr</screen>
to specify an additional event type "Sum", which is calculated by adding costs
for "Ir" and "Dr".</para>
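<para>Evaluating an inherited event type amounts to a weighted sum over the
raw event types. The sketch below parses the right-hand side of such a
definition and evaluates it. The function names are hypothetical, and only
decimal coefficients are handled, following the form "Number * Name" from the
grammar in the reference section.</para>
<programlisting><![CDATA[
```python
import re

# One term: an optional integer coefficient (with optional "*"), then a name.
_TERM = re.compile(r"(?:(\d+)\s*\*?\s*)?([A-Za-z][A-Za-z0-9]*)")


def parse_inherited(expr):
    """Parse e.g. "Ir + Dr" into a list of (coefficient, event name) pairs."""
    terms = []
    for part in expr.split("+"):
        match = _TERM.fullmatch(part.strip())
        if match is None:
            raise ValueError("cannot parse term: %r" % part)
        terms.append((int(match.group(1) or 1), match.group(2)))
    return terms


def evaluate(terms, costs):
    """Compute the inherited cost from a {event name: cost} mapping."""
    return sum(coeff * costs[name] for coeff, name in terms)


terms = parse_inherited("Ir + Dr")
# The cost values below are made up for illustration.
total = evaluate(terms, {"Ir": 100, "Dr": 40})
print(terms, total)
```
]]></programlisting>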
323
324</sect3>
325
326</sect2>
327
328</sect1>
329
330<sect1 id="cl-format.reference" xreflabel="Reference">
331<title>Reference</title>
332
333<sect2 id="cl-format.reference.grammar" xreflabel="Grammar">
334<title>Grammar</title>
335
336<para>
337<screen>ProfileDataFile := FormatSpec? FormatVersion? Creator? PartData*</screen>
338<screen>FormatSpec := "# callgrind format\n"</screen>
339<screen>FormatVersion := "version: 1\n"</screen>
340<screen>Creator := "creator:" NoNewLineChar* "\n"</screen>
341<screen>PartData := (HeaderLine "\n")+ (BodyLine "\n")+</screen>
342<screen>HeaderLine := (empty line)
343  | ('#' NoNewLineChar*)
344  | PartDetail
345  | Description
346  | EventSpecification
347  | CostLineDef</screen>
348<screen>PartDetail := TargetCommand | TargetID</screen>
349<screen>TargetCommand := "cmd:" Space* NoNewLineChar*</screen>
350<screen>TargetID := ("pid"|"thread"|"part") ":" Space* Number</screen>
351<screen>Description := "desc:" Space* Name Space* ":" NoNewLineChar*</screen>
352<screen>EventSpecification := "event:" Space* Name InheritedDef? LongNameDef?</screen>
353<screen>InheritedDef := "=" InheritedExpr</screen>
354<screen>InheritedExpr := Name
355  | Number Space* ("*" Space*)? Name
356  | InheritedExpr Space* "+" Space* InheritedExpr</screen>
357<screen>LongNameDef := ":" NoNewLineChar*</screen>
358<screen>CostLineDef := "events:" Space* Name (Space+ Name)*
359  | "positions:" "instr"? (Space+ "line")?</screen>
360<screen>BodyLine := (empty line)
361  | ('#' NoNewLineChar*)
362  | CostLine
363  | PositionSpec
364  | CallSpec
365  | UncondJumpSpec
366  | CondJumpSpec</screen>
367<screen>CostLine := SubPositionList Costs?</screen>
368<screen>SubPositionList := (SubPosition+ Space+)+</screen>
369<screen>SubPosition := Number | "+" Number | "-" Number | "*"</screen>
370<screen>Costs := (Number Space+)+</screen>
371<screen>PositionSpec := Position "=" Space* PositionName</screen>
372<screen>Position := CostPosition | CalledPosition</screen>
373<screen>CostPosition := "ob" | "fl" | "fi" | "fe" | "fn"</screen>
<screen>CalledPosition := "cob" | "cfi" | "cfl" | "cfn"</screen>
375<screen>PositionName := ( "(" Number ")" )? (Space* NoNewLineChar* )?</screen>
376<screen>CallSpec := CallLine "\n" CostLine</screen>
377<screen>CallLine := "calls=" Space* Number Space+ SubPositionList</screen>
378<screen>UncondJumpSpec := "jump=" Space* Number Space+ SubPositionList</screen>
379<screen>CondJumpSpec := "jcnd=" Space* Number Space+ Number Space+ SubPositionList</screen>
380<screen>Space := " " | "\t"</screen>
381<screen>Number := HexNumber | (Digit)+</screen>
382<screen>Digit := "0" | ... | "9"</screen>
383<screen>HexNumber := "0x" (Digit | HexChar)+</screen>
384<screen>HexChar := "a" | ... | "f" | "A" | ... | "F"</screen>
<screen>Name := Alpha (Digit | Alpha)*</screen>
<screen>Alpha := "a" | ... | "z" | "A" | ... | "Z"</screen>
387<screen>NoNewLineChar := all characters without "\n"</screen>
388</para>
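<para>The alternatives of "BodyLine" can be told apart by the start of the
line. The following Python sketch classifies body lines in this way. The
function name and the returned labels are hypothetical; the prefixes are the
ones listed in the grammar above.</para>
<programlisting><![CDATA[
```python
import re

_POSITION = re.compile(r"(ob|fl|fi|fe|fn|cob|cfi|cfl|cfn)=")
# A cost line starts with a subposition: a number, a signed number, or "*".
_SUBPOSITION = re.compile(r"(?:[+-]?(?:0x[0-9a-fA-F]+|\d+)|\*)(?:\s|$)")


def classify_body_line(line):
    """Return a label naming the BodyLine alternative that matches the line."""
    text = line.strip()
    if not text:
        return "empty"
    if text.startswith("#"):
        return "comment"
    if text.startswith("calls="):
        return "call"
    if text.startswith("jump="):
        return "jump"
    if text.startswith("jcnd="):
        return "cond-jump"
    if _POSITION.match(text):
        return "position"
    if _SUBPOSITION.match(text):
        return "cost"
    raise ValueError("not a valid body line: %r" % line)


for sample in ["fn=(1) main", "16 20", "cfn=(2) func1", "calls=1 50", "+3 * 5"]:
    print(sample, "->", classify_body_line(sample))
```
]]></programlisting>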
389
390<para>A profile data file ("ProfileDataFile") starts with basic information
391  such as a format marker, the version and creator information, and then has a list of parts, where
392  each part has its own header and body. Parts typically are different threads
393  and/or time spans/phases within a profiled application run.</para>
394
395<para>Note that callgrind_annotate currently only supports profile data files with
396  one part. Callgrind may produce multiple parts for one profile run, but defaults
397  to one output file for each part.</para>
398
399</sect2>
400
401<sect2 id="cl-format.reference.header" xreflabel="Description of Header Lines">
402<title>Description of Header Lines</title>
403
404<para>Basic information in the first lines of a profile data file:</para>
405
406<itemizedlist>
407  <listitem>
408    <para><computeroutput># callgrind format</computeroutput> [Callgrind]</para>
409    <para>This line specifies that the file is a callgrind profile,
410      and it has to be the first line. It was added late to the
411      format (with Valgrind 3.13) and is optional, as all readers also
412      should work with older callgrind profiles not including this line.
413      However, generation of this line is recommended to allow desktop
414      environments and file managers to uniquely detect the format.</para>
415  </listitem>
416
417  <listitem>
418    <para><computeroutput>version: number</computeroutput> [Callgrind]</para>
419    <para>This is used to distinguish future profile data formats.  A
420    major version of 0 or 1 is supposed to be upwards compatible with
421    Cachegrind's format.  It is optional; if not appearing, version 1
422    is assumed.  Otherwise, it has to follow directly after the format
423    specification (i.e. be the first line if the optional format
424    specification is skipped).</para>
425  </listitem>
426
427  <listitem>
428    <para><computeroutput>creator: string</computeroutput> [Callgrind]</para>
429    <para>This is an arbitrary string to denote the creator of this file.
430      Optional.</para>
431  </listitem>
432
433</itemizedlist>
434
435<para>The header for each part has an arbitrary number of lines of the format
436"key: value". Possible <emphasis>key</emphasis> values for the header are:</para>
437
438<itemizedlist>
439
440  <listitem>
441    <para><computeroutput>pid: process id</computeroutput> [Callgrind]</para>
442    <para>Optional. This specifies the process ID of the supervised application
443    for which this profile was generated.</para>
444  </listitem>
445
446  <listitem>
447    <para><computeroutput>cmd: program name + args</computeroutput> [Cachegrind]</para>
448    <para>Optional. This specifies the full command line of the supervised
449    application for which this profile was generated.</para>
450  </listitem>
451
452  <listitem>
453    <para><computeroutput>part: number</computeroutput> [Callgrind]</para>
454    <para>Optional. This specifies a sequentially incremented number for each dump
455    generated, starting at 1.</para>
456  </listitem>
457
458  <listitem>
459    <para><computeroutput>desc: type: value</computeroutput> [Cachegrind]</para>
    <para>This specifies various information for this dump.  For some
    types, the semantics are defined, but any description type is allowed.
    Unknown types should be ignored.</para>
    <para>The types "I1 cache", "D1 cache", and "LL cache"
    specify parameters used for the cache simulator.  These are the only
    types originally used by Cachegrind.  Additionally, Callgrind uses
    the following types:  "Timerange" gives a rough range of the basic
    block counter, for which the cost of this dump was collected.
    Type "Trigger" states why this trace was generated,
    e.g. program termination or forced interactive dump.</para>
470  </listitem>
471
472  <listitem>
473    <para><computeroutput>positions: [instr] [line]</computeroutput> [Callgrind]</para>
    <para>For cost lines, this defines the semantics of the first numbers.
    Any combination of "instr", "bb" and "line" is allowed, but has to be
    in this order, which corresponds to the position numbers at the start of
    the cost lines later in the file.</para>
    <para>If "instr" is specified, the position is the address of an
    instruction whose execution raised the events given later on the
    line.  This address is relative to the offset of the binary/shared
    library file, so that relocation info does not have to be specified.  For "line",
    the position is the line number of a source file, which is
    responsible for the events raised. Note that the mapping between "instr"
    and "line" positions is given by the debugging line information
    produced by the compiler.</para>
486    <para>This header line is optional, defaulting to "positions:
487    line" if not specified.</para>
488  </listitem>
489
490  <listitem>
491    <para><computeroutput>events: event type abbreviations</computeroutput> [Cachegrind]</para>
492    <para>A list of short names of the event types logged in cost
493      lines in this part of the profile data file. Arbitrary short
494      names are allowed.  The order given specifies the required order
495      in cost lines. Thus, the first event type is the second or third
496      number in a cost line, depending on the value of "positions".
497      Required to appear for each header part exactly once.</para>
498  </listitem>
499
500  <listitem>
501    <para><computeroutput>summary: costs</computeroutput> [Callgrind]</para>
    <para>Optional. This header line specifies a summary cost, which should be
    equal to or larger than the total over all self costs. It may be larger as
    the cost lines may not represent all cost of the program run.</para>
505  </listitem>
506
507  <listitem>
508    <para><computeroutput>totals: costs</computeroutput> [Cachegrind]</para>
509    <para>Optional. Should appear at the end of the file (although
510    looking like a header line). Must give the total of all cost lines,
511    to allow for a consistency check.</para>
512  </listitem>
513
514</itemizedlist>
515
516</sect2>
517
518<sect2 id="cl-format.reference.body" xreflabel="Description of Body Lines">
519<title>Description of Body Lines</title>
520
<para>The regular body line is a cost line consisting of one or two
position numbers (depending on the "positions:" header line, see above)
and an array of cost numbers. A position number is either a
line number in a source file or an instruction address within binary
code, with source/binary file names specified as position names (see
below). The cost numbers get mapped to event types in the same order
as specified in the "events:" header line. If fewer numbers than event
types are given, the costs default to zero for the remaining event
types.</para>
530
531<para>Further, there exist lines
532<computeroutput>spec=position name</computeroutput>.  A position name
533is an arbitrary string. If it starts with "(" and a
534digit, it's a string in compressed format.  Otherwise it's the real
535position string.  This allows for file and symbol names as position
536strings, as these never start with "(" + <emphasis>digit</emphasis>.
537The compressed format is either "(" <emphasis>number</emphasis> ")"
538<emphasis>space</emphasis> <emphasis>position</emphasis> or only
539"(" <emphasis>number</emphasis> ")".  The first relates
540<emphasis>position</emphasis> to <emphasis>number</emphasis> in the
541context of the given format specification from this line to the end of
542the file; it makes the (<emphasis>number</emphasis>) an alias for
543<emphasis>position</emphasis>.  Compressed format is always
544optional.</para>
545
546<para>Position specifications allowed:</para>
547<itemizedlist>
548
549  <listitem>
550    <para><computeroutput>ob=</computeroutput> [Callgrind]</para>
551    <para>The ELF object where the cost of next cost lines happens.</para>
552  </listitem>
553
554  <listitem>
555    <para><computeroutput>fl=</computeroutput> [Cachegrind]</para>
556  </listitem>
557
558  <listitem>
559    <para><computeroutput>fi=</computeroutput> [Cachegrind]</para>
560  </listitem>
561
562  <listitem>
563    <para><computeroutput>fe=</computeroutput> [Cachegrind]</para>
    <para>The source file containing the code which is responsible for
    the cost of the next cost lines. This description applies to "fl=",
    "fi=" and "fe=" alike; "fi="/"fe=" are used when the source
    file changes inside of a function, i.e. for inlined code.</para>
567  </listitem>
568
569  <listitem>
570    <para><computeroutput>fn=</computeroutput> [Cachegrind]</para>
571    <para>The name of the function where the cost of next cost lines
572    happens.</para>
573  </listitem>
574
575  <listitem>
576     <para><computeroutput>cob=</computeroutput> [Callgrind]</para>
577    <para>The ELF object of the target of the next call cost lines.</para>
578  </listitem>
579
580  <listitem>
581    <para><computeroutput>cfi=</computeroutput> [Callgrind]</para>
582    <para>The source file including the code of the target of the
583    next call cost lines.</para>
584  </listitem>
585
586  <listitem>
587    <para><computeroutput>cfl=</computeroutput> [Callgrind]</para>
588    <para>Alternative spelling for <computeroutput>cfi=</computeroutput>
589    specification (because of historical reasons).</para>
590  </listitem>
591
592  <listitem>
593    <para><computeroutput>cfn=</computeroutput> [Callgrind]</para>
594    <para>The name of the target function of the next call cost
595    lines.</para>
596  </listitem>
597
598</itemizedlist>
599
<para>The last type of body line provides specific costs that are not
related to just one position, as regular cost lines are. It starts with specific
strings similar to position name specifications.</para>
603
604<itemizedlist>
605
606  <listitem>
607    <para><computeroutput>calls=count target-position</computeroutput> [Callgrind]</para>
608    <para>Call executed "count" times to "target-position".
609    After a "calls=" line there MUST be a cost line. This provides the source position
610    of the call and the cost spent in the called function in total.</para>
611  </listitem>
612
613  <listitem>
614    <para><computeroutput>jump=count target-position</computeroutput> [Callgrind]</para>
615    <para>Unconditional jump, executed "count" times, to "target-position".</para>
616  </listitem>
617
618  <listitem>
619    <para><computeroutput>jcnd=exe-count jump-count target-position</computeroutput> [Callgrind]</para>
620    <para>Conditional jump, executed "exe-count" times with "jump-count" jumps
621    happening (rest is fall-through) to "target-position".</para>
622  </listitem>
623
624</itemizedlist>
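<para>The three association lines can be parsed with a few lines of Python,
as in the following sketch. It follows the whitespace-separated field layout
given in this chapter; the function name, the dictionary keys and the sample
values are assumptions made for the illustration. Target subpositions are
kept as raw tokens, since they may be relative.</para>
<programlisting><![CDATA[
```python
def _number(text):
    """Parse a decimal or "0x"-prefixed hexadecimal number."""
    if text.lower().startswith("0x"):
        return int(text, 16)
    return int(text, 10)


def parse_association(line):
    """Parse a "calls=", "jump=" or "jcnd=" line into a dictionary."""
    kind, sep, rest = line.strip().partition("=")
    fields = rest.split()
    if not sep or kind not in ("calls", "jump", "jcnd"):
        raise ValueError("not an association line: %r" % line)
    if kind == "jcnd":
        executed, jumped = _number(fields[0]), _number(fields[1])
        return {
            "kind": kind,
            "executed": executed,
            "jumped": jumped,
            "fall_through": executed - jumped,  # the rest is fall-through
            "target": fields[2:],
        }
    return {"kind": kind, "count": _number(fields[0]), "target": fields[1:]}


print(parse_association("calls=3 20"))
print(parse_association("jcnd=10 4 +2"))
```
]]></programlisting>
<para>A complete reader would additionally consume the cost line that must
follow each "calls=" line.</para>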
625
626</sect2>
627
628</sect1>
629
630</chapter>
631