1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2                      "http://www.w3.org/TR/html4/strict.dtd">
3<html>
4<head>
5  <title>Exception Handling in LLVM</title>
6  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
7  <meta name="description"
8        content="Exception Handling in LLVM.">
9  <link rel="stylesheet" href="llvm.css" type="text/css">
10</head>
11
12<body>
13
14<h1>Exception Handling in LLVM</h1>
15
16<table class="layout" style="width:100%">
17  <tr class="layout">
18    <td class="left">
19<ul>
20  <li><a href="#introduction">Introduction</a>
21  <ol>
22    <li><a href="#itanium">Itanium ABI Zero-cost Exception Handling</a></li>
23    <li><a href="#sjlj">Setjmp/Longjmp Exception Handling</a></li>
24    <li><a href="#overview">Overview</a></li>
25  </ol></li>
26  <li><a href="#codegen">LLVM Code Generation</a>
27  <ol>
28    <li><a href="#throw">Throw</a></li>
29    <li><a href="#try_catch">Try/Catch</a></li>
30    <li><a href="#cleanups">Cleanups</a></li>
31    <li><a href="#throw_filters">Throw Filters</a></li>
32    <li><a href="#restrictions">Restrictions</a></li>
33  </ol></li>
34  <li><a href="#format_common_intrinsics">Exception Handling Intrinsics</a>
35  <ol>
36  	<li><a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a></li>
37  	<li><a href="#llvm_eh_sjlj_setjmp"><tt>llvm.eh.sjlj.setjmp</tt></a></li>
38  	<li><a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a></li>
39  	<li><a href="#llvm_eh_sjlj_lsda"><tt>llvm.eh.sjlj.lsda</tt></a></li>
40  	<li><a href="#llvm_eh_sjlj_callsite"><tt>llvm.eh.sjlj.callsite</tt></a></li>
41  	<li><a href="#llvm_eh_sjlj_dispatchsetup"><tt>llvm.eh.sjlj.dispatchsetup</tt></a></li>
42  </ol></li>
43  <li><a href="#asm">Asm Table Formats</a>
44  <ol>
45    <li><a href="#unwind_tables">Exception Handling Frame</a></li>
46    <li><a href="#exception_tables">Exception Tables</a></li>
47  </ol></li>
48</ul>
49</td>
50</tr></table>
51
52<div class="doc_author">
53  <p>Written by <a href="mailto:jlaskey@mac.com">Jim Laskey</a></p>
54</div>
55
56
57<!-- *********************************************************************** -->
58<h2><a name="introduction">Introduction</a></h2>
59<!-- *********************************************************************** -->
60
61<div>
62
63<p>This document is the central repository for all information pertaining to
64   exception handling in LLVM.  It describes the format that LLVM exception
65   handling information takes, which is useful for those interested in creating
66   front-ends or dealing directly with the information.  Further, this document
67   provides specific examples of what exception handling information is used for
68   in C and C++.</p>
69
70<!-- ======================================================================= -->
71<h3>
72  <a name="itanium">Itanium ABI Zero-cost Exception Handling</a>
73</h3>
74
75<div>
76
77<p>Exception handling for most programming languages is designed to recover from
78   conditions that rarely occur during general use of an application.  To that
79   end, exception handling should not interfere with the main flow of an
80   application's algorithm by performing checkpointing tasks, such as saving the
81   current pc or register state.</p>
82
83<p>The Itanium ABI Exception Handling Specification defines a methodology for
84   providing outlying data in the form of exception tables without inlining
85   speculative exception handling code in the flow of an application's main
86   algorithm.  Thus, the specification is said to add "zero-cost" to the normal
87   execution of an application.</p>
88
<p>A more complete description of the Itanium ABI exception handling runtime
   support can be found at
91   <a href="http://www.codesourcery.com/cxx-abi/abi-eh.html">Itanium C++ ABI:
92   Exception Handling</a>. A description of the exception frame format can be
93   found at
94   <a href="http://refspecs.freestandards.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html">Exception
95   Frames</a>, with details of the DWARF 4 specification at
96   <a href="http://dwarfstd.org/Dwarf4Std.php">DWARF 4 Standard</a>.
97   A description for the C++ exception table formats can be found at
98   <a href="http://www.codesourcery.com/cxx-abi/exceptions.pdf">Exception Handling
99   Tables</a>.</p>
100
101</div>
102
103<!-- ======================================================================= -->
104<h3>
105  <a name="sjlj">Setjmp/Longjmp Exception Handling</a>
106</h3>
107
108<div>
109
110<p>Setjmp/Longjmp (SJLJ) based exception handling uses LLVM intrinsics
111   <a href="#llvm_eh_sjlj_setjmp"><tt>llvm.eh.sjlj.setjmp</tt></a> and
112   <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a> to
113   handle control flow for exception handling.</p>
114
115<p>For each function which does exception processing &mdash; be
116   it <tt>try</tt>/<tt>catch</tt> blocks or cleanups &mdash; that function
   registers itself on a global frame list. When an exception is unwinding, the
   runtime uses this list to identify which functions need processing.</p>
119
120<p>Landing pad selection is encoded in the call site entry of the function
121   context. The runtime returns to the function via
122   <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a>, where
123   a switch table transfers control to the appropriate landing pad based on
124   the index stored in the function context.</p>
125
126<p>In contrast to DWARF exception handling, which encodes exception regions
127   and frame information in out-of-line tables, SJLJ exception handling
128   builds and removes the unwind frame context at runtime. This results in
129   faster exception handling at the expense of slower execution when no
130   exceptions are thrown. As exceptions are, by their nature, intended for
131   uncommon code paths, DWARF exception handling is generally preferred to
132   SJLJ.</p>
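<p>The registration and dispatch steps described above can be sketched in
   portable C++ using the standard <tt>setjmp</tt>/<tt>longjmp</tt> facilities.
   This is only an illustrative model: the <tt>FunctionContext</tt> layout, the
   <tt>g_frame_list</tt> variable, and the <tt>raise_exception</tt> helper are
   hypothetical names, not the real runtime's data structures or the LLVM
   intrinsics themselves.</p>

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical model of an SJLJ function context. The field names are
// illustrative and do not match any real runtime's layout.
struct FunctionContext {
    FunctionContext* prev;    // previous entry on the global frame list
    volatile int call_site;   // index of the call currently in progress
    std::jmp_buf jbuf;        // where the "runtime" jumps back to
};

static FunctionContext* g_frame_list = nullptr;  // global frame list

// Stand-in for the unwinder: jump back into the innermost registered frame.
[[noreturn]] void raise_exception() {
    assert(g_frame_list != nullptr);
    std::longjmp(g_frame_list->jbuf, 1);
}

void may_throw(bool do_throw) {
    if (do_throw) raise_exception();
}

// Returns 0 on the normal path, or the number of the landing pad that ran.
int guarded(bool throw_first, bool throw_second) {
    FunctionContext ctx;
    ctx.prev = g_frame_list;
    ctx.call_site = 0;
    g_frame_list = &ctx;              // register on the global frame list

    int landing_pad = 0;
    if (setjmp(ctx.jbuf) == 0) {
        ctx.call_site = 1;            // record the call site before each call
        may_throw(throw_first);
        ctx.call_site = 2;
        may_throw(throw_second);
    } else {
        // Re-entered via longjmp: a switch on the stored call site index
        // selects the landing pad.
        switch (ctx.call_site) {
        case 1: landing_pad = 1; break;
        case 2: landing_pad = 2; break;
        }
    }
    g_frame_list = ctx.prev;          // unregister before returning
    return landing_pad;
}
```

<p>Note that the context is built and torn down on every call to
   <tt>guarded</tt>, even when nothing is raised, which is the runtime cost
   the paragraph above refers to.</p>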
133
134</div>
135
136<!-- ======================================================================= -->
137<h3>
138  <a name="overview">Overview</a>
139</h3>
140
141<div>
142
143<p>When an exception is thrown in LLVM code, the runtime does its best to find a
144   handler suited to processing the circumstance.</p>
145
146<p>The runtime first attempts to find an <i>exception frame</i> corresponding to
147   the function where the exception was thrown.  If the programming language
148   supports exception handling (e.g. C++), the exception frame contains a
149   reference to an exception table describing how to process the exception.  If
150   the language does not support exception handling (e.g. C), or if the
151   exception needs to be forwarded to a prior activation, the exception frame
152   contains information about how to unwind the current activation and restore
153   the state of the prior activation.  This process is repeated until the
154   exception is handled. If the exception is not handled and no activations
155   remain, then the application is terminated with an appropriate error
156   message.</p>
157
158<p>Because different programming languages have different behaviors when
159   handling exceptions, the exception handling ABI provides a mechanism for
160   supplying <i>personalities</i>. An exception handling personality is defined
161   by way of a <i>personality function</i> (e.g. <tt>__gxx_personality_v0</tt>
162   in C++), which receives the context of the exception, an <i>exception
163   structure</i> containing the exception object type and value, and a reference
164   to the exception table for the current function.  The personality function
165   for the current compile unit is specified in a <i>common exception
166   frame</i>.</p>
167
168<p>The organization of an exception table is language dependent. For C++, an
169   exception table is organized as a series of code ranges defining what to do
   if an exception occurs in that range. Typically, the information associated
   with a range defines which types of exception objects (using C++ <i>type
   info</i>) are handled in that range, and an associated action that
   should take place. Actions typically pass control to a <i>landing
   pad</i>.</p>
175
176<p>A landing pad corresponds roughly to the code found in the <tt>catch</tt>
177   portion of a <tt>try</tt>/<tt>catch</tt> sequence. When execution resumes at
178   a landing pad, it receives an <i>exception structure</i> and a
179   <i>selector value</i> corresponding to the <i>type</i> of exception
180   thrown. The selector is then used to determine which <i>catch</i> should
181   actually process the exception.</p>
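<p>As a rough mental model, the lookup that a C++ personality function performs
   against an exception table can be sketched as a search over code ranges.
   The <tt>CallSite</tt> and <tt>Action</tt> structures and the integer type
   ids below are assumptions made for illustration; real tables use a compact
   binary encoding and richer action records.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified exception table entry.
struct CallSite {
    std::uintptr_t start;        // first code offset covered by this range
    std::uintptr_t length;       // number of bytes covered
    std::uintptr_t landing_pad;  // code offset of the landing pad
    int type_id;                 // type handled in this range (illustrative)
};

// Result of the lookup: where to resume and which selector to pass.
struct Action {
    bool found;
    std::uintptr_t landing_pad;
    int selector;
};

// Finds the range containing pc that handles the thrown type, if any.
Action find_action(const std::vector<CallSite>& table,
                   std::uintptr_t pc, int thrown_type_id) {
    for (const CallSite& site : table) {
        assert(site.length > 0);
        bool in_range = pc >= site.start && pc - site.start < site.length;
        if (in_range && site.type_id == thrown_type_id)
            return Action{true, site.landing_pad, site.type_id};
    }
    // No handler in this function: the unwinder moves to the prior activation.
    return Action{false, 0, 0};
}
```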
182
183</div>
184
185</div>
186
187<!-- ======================================================================= -->
188<h2>
189  <a name="codegen">LLVM Code Generation</a>
190</h2>
191
192<div>
193
194<p>From a C++ developer's perspective, exceptions are defined in terms of the
195   <tt>throw</tt> and <tt>try</tt>/<tt>catch</tt> statements. In this section
196   we will describe the implementation of LLVM exception handling in terms of
197   C++ examples.</p>
198
199<!-- ======================================================================= -->
200<h3>
201  <a name="throw">Throw</a>
202</h3>
203
204<div>
205
206<p>Languages that support exception handling typically provide a <tt>throw</tt>
207   operation to initiate the exception process. Internally, a <tt>throw</tt>
208   operation breaks down into two steps.</p>
209
210<ol>
211  <li>A request is made to allocate exception space for an exception structure.
212      This structure needs to survive beyond the current activation. This
213      structure will contain the type and value of the object being thrown.</li>
214
215  <li>A call is made to the runtime to raise the exception, passing the
216      exception structure as an argument.</li>
217</ol>
218
219<p>In C++, the allocation of the exception structure is done by the
220   <tt>__cxa_allocate_exception</tt> runtime function. The exception raising is
221   handled by <tt>__cxa_throw</tt>. The type of the exception is represented
222   using a C++ RTTI structure.</p>
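<p>The two steps can be written out by hand on platforms that follow the
   Itanium C++ ABI. The sketch below assumes a toolchain whose
   <tt>&lt;cxxabi.h&gt;</tt> declares <tt>__cxa_allocate_exception</tt> and
   <tt>__cxa_throw</tt> in the <tt>abi</tt> namespace (as GCC and Clang
   typically do on Linux); the helper names <tt>throw_int_by_hand</tt> and
   <tt>catch_it</tt> are hypothetical. Ordinary code should simply use
   <tt>throw</tt>.</p>

```cpp
#include <cassert>
#include <cxxabi.h>
#include <new>
#include <typeinfo>

// Manually performs the two steps of `throw value;` for an int.
[[noreturn]] void throw_int_by_hand(int value) {
    // Step 1: allocate exception storage that outlives this activation
    // and construct the thrown object in it.
    void* storage = abi::__cxa_allocate_exception(sizeof(int));
    assert(storage != nullptr);
    new (storage) int(value);

    // Step 2: hand the object to the runtime together with its RTTI.
    // An int needs no destructor, so the last argument is null.
    abi::__cxa_throw(storage,
                     const_cast<std::type_info*>(&typeid(int)),
                     nullptr);
}

// Catches the hand-thrown exception with an ordinary handler.
int catch_it(int value) {
    try {
        throw_int_by_hand(value);
    } catch (int caught) {
        return caught;
    }
    return -1;  // not reached
}
```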
223
224</div>
225
226<!-- ======================================================================= -->
227<h3>
228  <a name="try_catch">Try/Catch</a>
229</h3>
230
231<div>
232
233<p>A call within the scope of a <i>try</i> statement can potentially raise an
234   exception. In those circumstances, the LLVM C++ front-end replaces the call
235   with an <tt>invoke</tt> instruction. Unlike a call, the <tt>invoke</tt> has
236   two potential continuation points:</p>
237
238<ol>
239  <li>where to continue when the call succeeds as per normal, and</li>
240
241  <li>where to continue if the call raises an exception, either by a throw or
242      the unwinding of a throw</li>
243</ol>
244
<p>The place where an <tt>invoke</tt> continues after an exception is raised
   is called a <i>landing pad</i>. LLVM landing pads are
247   conceptually alternative function entry points where an exception structure
248   reference and a type info index are passed in as arguments. The landing pad
249   saves the exception structure reference and then proceeds to select the catch
250   block that corresponds to the type info of the exception object.</p>
251
252<p>The LLVM <a href="LangRef.html#i_landingpad"><tt>landingpad</tt>
253   instruction</a> is used to convey information about the landing pad to the
254   back end. For C++, the <tt>landingpad</tt> instruction returns a pointer and
255   integer pair corresponding to the pointer to the <i>exception structure</i>
256   and the <i>selector value</i> respectively.</p>
257
258<p>The <tt>landingpad</tt> instruction takes a reference to the personality
259   function to be used for this <tt>try</tt>/<tt>catch</tt> sequence. The
260   remainder of the instruction is a list of <i>cleanup</i>, <i>catch</i>,
261   and <i>filter</i> clauses. The exception is tested against the clauses
262   sequentially from first to last. The selector value is a positive number if
263   the exception matched a type info, a negative number if it matched a filter,
264   and zero if it matched a cleanup. If nothing is matched, the behavior of
265   the program is <a href="#restrictions">undefined</a>. If a type info matched,
266   then the selector value is the index of the type info in the exception table,
267   which can be obtained using the
268   <a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a> intrinsic.</p>
269
270<p>Once the landing pad has the type info selector, the code branches to the
271   code for the first catch. The catch then checks the value of the type info
272   selector against the index of type info for that catch.  Since the type info
273   index is not known until all the type infos have been gathered in the
274   backend, the catch code must call the
275   <a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a> intrinsic to
276   determine the index for a given type info. If the catch fails to match the
277   selector then control is passed on to the next catch.</p>
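<p>The chain of comparisons performed by a landing pad can be modelled as
   follows. This is a simulation only: <tt>TypeTable</tt> is a hypothetical
   stand-in for the per-function type table assembled by the backend, and its
   <tt>typeid_for</tt> member merely imitates the role of the
   <tt>llvm.eh.typeid.for</tt> intrinsic.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the type table built by the backend.
struct TypeTable {
    std::vector<std::string> type_infos;

    // Imitates llvm.eh.typeid.for: returns the positive index of a type
    // info, adding it to the table the first time it is requested.
    int typeid_for(const std::string& name) {
        assert(!name.empty());
        for (std::size_t i = 0; i < type_infos.size(); ++i)
            if (type_infos[i] == name) return static_cast<int>(i) + 1;
        type_infos.push_back(name);
        return static_cast<int>(type_infos.size());
    }
};

// Models the landing pad for: try { ... } catch (A) { ... } catch (B) { ... }
std::string landing_pad(TypeTable& table, int selector) {
    // Each catch compares the selector with the index of its own type info
    // and passes control to the next catch on a mismatch.
    if (selector == table.typeid_for("A")) return "catch A";
    if (selector == table.typeid_for("B")) return "catch B";
    return "resume";  // nothing matched: keep unwinding
}
```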
278
279<p>Finally, the entry and exit of catch code is bracketed with calls to
280   <tt>__cxa_begin_catch</tt> and <tt>__cxa_end_catch</tt>.</p>
281
282<ul>
283  <li><tt>__cxa_begin_catch</tt> takes an exception structure reference as an
284      argument and returns the value of the exception object.</li>
285
286  <li><tt>__cxa_end_catch</tt> takes no arguments. This function:<br><br>
287    <ol>
288      <li>Locates the most recently caught exception and decrements its handler
289          count,</li>
290      <li>Removes the exception from the <i>caught</i> stack if the handler
291          count goes to zero, and</li>
292      <li>Destroys the exception if the handler count goes to zero and the
293          exception was not re-thrown by throw.</li>
294    </ol>
295    <p><b>Note:</b> a rethrow from within the catch may replace this call with
296       a <tt>__cxa_rethrow</tt>.</p></li>
297</ul>
298
299</div>
300
301<!-- ======================================================================= -->
302<h3>
303  <a name="cleanups">Cleanups</a>
304</h3>
305
306<div>
307
308<p>A cleanup is extra code which needs to be run as part of unwinding a scope.
309   C++ destructors are a typical example, but other languages and language
310   extensions provide a variety of different kinds of cleanups. In general, a
311   landing pad may need to run arbitrary amounts of cleanup code before actually
312   entering a catch block. To indicate the presence of cleanups, a
313   <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instruction</a>
314   should have a <i>cleanup</i> clause. Otherwise, the unwinder will not stop at
315   the landing pad if there are no catches or filters that require it to.</p>
316
317<p><b>Note:</b> Do not allow a new exception to propagate out of the execution
318   of a cleanup. This can corrupt the internal state of the unwinder.
319   Different languages describe different high-level semantics for these
320   situations: for example, C++ requires that the process be terminated, whereas
321   Ada cancels both exceptions and throws a third.</p>
322
<p>When all cleanups are finished, if the exception is not handled by the
   current function, resume unwinding by executing the
   <a href="LangRef.html#i_resume"><tt>resume</tt> instruction</a>, passing in
   the result of the <tt>landingpad</tt> instruction for the original landing
   pad.</p>
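<p>At the source level, C++ destructors are the most familiar cleanups. The
   small example below (with hypothetical names <tt>Guard</tt>,
   <tt>inner</tt>, and <tt>run_cleanups</tt>) records the order in which
   cleanups and the handler run when an exception unwinds through a scope.</p>

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

static std::string g_log;  // records the order of events

// The destructor is the cleanup; it must not let an exception escape.
struct Guard {
    char tag;
    explicit Guard(char t) : tag(t) {}
    ~Guard() { g_log += tag; }
};

void inner() {
    Guard a('a');
    Guard b('b');
    throw std::runtime_error("boom");  // unwinds through both guards
}

std::string run_cleanups() {
    g_log.clear();
    assert(g_log.empty());
    try {
        inner();
    } catch (const std::runtime_error&) {
        g_log += '!';  // the handler body runs after the cleanups
    }
    return g_log;
}
```

<p>The cleanups run in reverse order of construction, and both complete
   before the body of the <tt>catch</tt> is entered.</p>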
328
329</div>
330
331<!-- ======================================================================= -->
332<h3>
333  <a name="throw_filters">Throw Filters</a>
334</h3>
335
336<div>
337
338<p>C++ allows the specification of which exception types may be thrown from a
339   function. To represent this, a top level landing pad may exist to filter out
340   invalid types. To express this in LLVM code the
341   <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instruction</a> will
342   have a filter clause. The clause consists of an array of type infos.
   <tt>landingpad</tt> will return a negative value if the exception does not
   match any of the type infos. In that case, a call
   to <tt>__cxa_call_unexpected</tt> should be made; otherwise, unwinding
   should continue with a call to <tt>_Unwind_Resume</tt>.  Each of these
   functions requires a reference to the exception structure.  Note that the
   most general form of a
348   <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instruction</a> can
349   have any number of catch, cleanup, and filter clauses (though having more
350   than one cleanup is pointless). The LLVM C++ front-end can generate such
351   <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instructions</a> due
352   to inlining creating nested exception handling scopes.</p>
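<p>The decision made at a filtering landing pad can be modelled in a few lines.
   This is a simulation under assumed names (<tt>Filter</tt>,
   <tt>filter_selector</tt>, <tt>landing_pad_action</tt>), with types
   represented as strings; it does not call the real runtime functions, it
   only reports which one would be called.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a filter clause: the types a function may throw.
struct Filter {
    std::vector<std::string> allowed;
};

// Returns a negative selector if the thrown type is not in the filter,
// and zero if the exception is allowed to pass through.
int filter_selector(const Filter& filter, const std::string& thrown,
                    int filter_id) {
    assert(filter_id > 0);
    for (const std::string& type : filter.allowed)
        if (type == thrown) return 0;
    return -filter_id;
}

// Names the runtime function the landing pad would call for a selector.
std::string landing_pad_action(int selector) {
    return selector < 0 ? "__cxa_call_unexpected" : "_Unwind_Resume";
}
```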
353
354</div>
355
356<!-- ======================================================================= -->
357<h3>
358  <a name="restrictions">Restrictions</a>
359</h3>
360
361<div>
362
363<p>The unwinder delegates the decision of whether to stop in a call frame to
364   that call frame's language-specific personality function. Not all unwinders
365   guarantee that they will stop to perform cleanups. For example, the GNU C++
366   unwinder doesn't do so unless the exception is actually caught somewhere
367   further up the stack.</p>
368
369<p>In order for inlining to behave correctly, landing pads must be prepared to
370   handle selector results that they did not originally advertise. Suppose that
371   a function catches exceptions of type <tt>A</tt>, and it's inlined into a
372   function that catches exceptions of type <tt>B</tt>. The inliner will update
373   the <tt>landingpad</tt> instruction for the inlined landing pad to include
374   the fact that <tt>B</tt> is also caught. If that landing pad assumes that it
375   will only be entered to catch an <tt>A</tt>, it's in for a rude awakening.
376   Consequently, landing pads must test for the selector results they understand
377   and then resume exception propagation with the
378   <a href="LangRef.html#i_resume"><tt>resume</tt> instruction</a> if none of
379   the conditions match.</p>
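<p>The inlining scenario corresponds to source code like the following sketch,
   in which the function names are hypothetical. Once <tt>inner</tt> is inlined
   into <tt>outer</tt>, the landing pad originally written for <tt>A</tt> can
   also be entered for a <tt>B</tt>, so it has to pass the <tt>B</tt> along
   rather than assume it caught an <tt>A</tt>.</p>

```cpp
#include <cassert>

struct A {};
struct B {};

[[noreturn]] void thrower(bool throw_b) {
    assert(throw_b == true || throw_b == false);
    if (throw_b) throw B();
    throw A();
}

// Catches only A; a B must keep propagating to the caller.
int inner(bool throw_b) {
    try {
        thrower(throw_b);
    } catch (const A&) {
        return 1;
    }
    return 0;  // not reached
}

// Catches B. After inlining inner(), both handlers share one function.
int outer(bool throw_b) {
    try {
        return inner(throw_b);
    } catch (const B&) {
        return 2;
    }
}
```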
380
381</div>
382
383</div>
384
385<!-- ======================================================================= -->
386<h2>
387  <a name="format_common_intrinsics">Exception Handling Intrinsics</a>
388</h2>
389
390<div>
391
392<p>In addition to the
393   <a href="LangRef.html#i_landingpad"><tt>landingpad</tt></a> and
394   <a href="LangRef.html#i_resume"><tt>resume</tt></a> instructions, LLVM uses
395   several intrinsic functions (name prefixed with <i><tt>llvm.eh</tt></i>) to
396   provide exception handling information at various points in generated
397   code.</p>
398
399<!-- ======================================================================= -->
400<h4>
401  <a name="llvm_eh_typeid_for">llvm.eh.typeid.for</a>
402</h4>
403
404<div>
405
406<pre>
407  i32 @llvm.eh.typeid.for(i8* %type_info)
408</pre>
409
<p>This intrinsic returns the index of the given type info in the exception
   table of the current function.  This value can be compared against the
   selector returned by the
   <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instruction</a>.
   The single argument is a reference to a type info.</p>
414
415</div>
416
417<!-- ======================================================================= -->
418<h4>
419  <a name="llvm_eh_sjlj_setjmp">llvm.eh.sjlj.setjmp</a>
420</h4>
421
422<div>
423
424<pre>
425  i32 @llvm.eh.sjlj.setjmp(i8* %setjmp_buf)
426</pre>
427
428<p>For SJLJ based exception handling, this intrinsic forces register saving for
429   the current function and stores the address of the following instruction for
   use as a destination address
   by <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a>. The
   buffer format and the overall functioning of this intrinsic are compatible
   with the GCC <tt>__builtin_setjmp</tt> implementation, allowing code built
   with clang and GCC to interoperate.</p>
435
<p>The single parameter is a pointer to a five-word buffer in which the calling
437   context is saved. The front end places the frame pointer in the first word,
438   and the target implementation of this intrinsic should place the destination
439   address for a
440   <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a> in the
441   second word. The following three words are available for use in a
442   target-specific manner.</p>
443
444</div>
445
446<!-- ======================================================================= -->
447<h4>
448  <a name="llvm_eh_sjlj_longjmp">llvm.eh.sjlj.longjmp</a>
449</h4>
450
451<div>
452
453<pre>
454  void @llvm.eh.sjlj.longjmp(i8* %setjmp_buf)
455</pre>
456
457<p>For SJLJ based exception handling, the <tt>llvm.eh.sjlj.longjmp</tt>
458   intrinsic is used to implement <tt>__builtin_longjmp()</tt>. The single
459   parameter is a pointer to a buffer populated
460   by <a href="#llvm_eh_sjlj_setjmp"><tt>llvm.eh.sjlj.setjmp</tt></a>. The frame
461   pointer and stack pointer are restored from the buffer, then control is
462   transferred to the destination address.</p>
463
464</div>
465<!-- ======================================================================= -->
466<h4>
467  <a name="llvm_eh_sjlj_lsda">llvm.eh.sjlj.lsda</a>
468</h4>
469
470<div>
471
472<pre>
473  i8* @llvm.eh.sjlj.lsda()
474</pre>
475
476<p>For SJLJ based exception handling, the <tt>llvm.eh.sjlj.lsda</tt> intrinsic
477   returns the address of the Language Specific Data Area (LSDA) for the current
478   function. The SJLJ front-end code stores this address in the exception
479   handling function context for use by the runtime.</p>
480
481</div>
482
483<!-- ======================================================================= -->
484<h4>
485  <a name="llvm_eh_sjlj_callsite">llvm.eh.sjlj.callsite</a>
486</h4>
487
488<div>
489
490<pre>
491  void @llvm.eh.sjlj.callsite(i32 %call_site_num)
492</pre>
493
494<p>For SJLJ based exception handling, the <tt>llvm.eh.sjlj.callsite</tt>
495   intrinsic identifies the callsite value associated with the
496   following <tt>invoke</tt> instruction. This is used to ensure that landing
497   pad entries in the LSDA are generated in matching order.</p>
498
499</div>
500
501<!-- ======================================================================= -->
502<h4>
503  <a name="llvm_eh_sjlj_dispatchsetup">llvm.eh.sjlj.dispatchsetup</a>
504</h4>
505
506<div>
507
508<pre>
509  void @llvm.eh.sjlj.dispatchsetup(i32 %dispatch_value)
510</pre>
511
512<p>For SJLJ based exception handling, the <tt>llvm.eh.sjlj.dispatchsetup</tt>
513   intrinsic is used by targets to do any unwind edge setup they need. By
514   default, no action is taken.</p>
515
516</div>
517
518</div>
519
520<!-- ======================================================================= -->
521<h2>
522  <a name="asm">Asm Table Formats</a>
523</h2>
524
525<div>
526
527<p>There are two tables that are used by the exception handling runtime to
528   determine which actions should be taken when an exception is thrown.</p>
529
530<!-- ======================================================================= -->
531<h3>
532  <a name="unwind_tables">Exception Handling Frame</a>
533</h3>
534
535<div>
536
537<p>An exception handling frame <tt>eh_frame</tt> is very similar to the unwind
538   frame used by DWARF debug info. The frame contains all the information
539   necessary to tear down the current frame and restore the state of the prior
540   frame. There is an exception handling frame for each function in a compile
541   unit, plus a common exception handling frame that defines information common
542   to all functions in the unit.</p>
543
544<!-- Todo - Table details here. -->
545
546</div>
547
548<!-- ======================================================================= -->
549<h3>
550  <a name="exception_tables">Exception Tables</a>
551</h3>
552
553<div>
554
<p>An exception table contains information about what actions to take when an
   exception is thrown in a particular part of a function's code. There is one
   exception table per function, except for leaf functions and functions that
   call only non-throwing functions, which do not need an exception
   table.</p>
560
561<!-- Todo - Table details here. -->
562
563</div>
564
565</div>
566
567<!-- *********************************************************************** -->
568
569<hr>
570<address>
571  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
572  src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
573  <a href="http://validator.w3.org/check/referer"><img
574  src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>
575
576  <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
577  <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br>
578  Last modified: $Date: 2011-09-27 16:16:57 -0400 (Tue, 27 Sep 2011) $
579</address>
580
581</body>
582</html>
583