1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2                      "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
6  <title>Kaleidoscope: Extending the Language: User-defined Operators</title>
7  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8  <meta name="author" content="Chris Lattner">
9  <meta name="author" content="Erick Tryzelaar">
10  <link rel="stylesheet" href="../llvm.css" type="text/css">
11</head>
12
13<body>
14
15<h1>Kaleidoscope: Extending the Language: User-defined Operators</h1>
16
17<ul>
18<li><a href="index.html">Up to Tutorial Index</a></li>
19<li>Chapter 6
20  <ol>
21    <li><a href="#intro">Chapter 6 Introduction</a></li>
22    <li><a href="#idea">User-defined Operators: the Idea</a></li>
23    <li><a href="#binary">User-defined Binary Operators</a></li>
24    <li><a href="#unary">User-defined Unary Operators</a></li>
25    <li><a href="#example">Kicking the Tires</a></li>
26    <li><a href="#code">Full Code Listing</a></li>
27  </ol>
28</li>
29<li><a href="OCamlLangImpl7.html">Chapter 7</a>: Extending the Language: Mutable
30Variables / SSA Construction</li>
31</ul>
32
33<div class="doc_author">
34	<p>
35		Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a>
36		and <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a>
37	</p>
38</div>
39
40<!-- *********************************************************************** -->
41<h2><a name="intro">Chapter 6 Introduction</a></h2>
42<!-- *********************************************************************** -->
43
44<div>
45
46<p>Welcome to Chapter 6 of the "<a href="index.html">Implementing a language
47with LLVM</a>" tutorial.  At this point in our tutorial, we now have a fully
48functional language that is fairly minimal, but also useful.  There
49is still one big problem with it, however. Our language doesn't have many
50useful operators (like division, logical negation, or even any comparisons
51besides less-than).</p>
52
53<p>This chapter of the tutorial takes a wild digression into adding user-defined
54operators to the simple and beautiful Kaleidoscope language. This digression now
55gives us a simple and ugly language in some ways, but also a powerful one at the
56same time.  One of the great things about creating your own language is that you
57get to decide what is good or bad.  In this tutorial we'll assume that it is
58okay to use this as a way to show some interesting parsing techniques.</p>
59
60<p>At the end of this tutorial, we'll run through an example Kaleidoscope
61application that <a href="#example">renders the Mandelbrot set</a>.  This gives
62an example of what you can build with Kaleidoscope and its feature set.</p>
63
64</div>
65
66<!-- *********************************************************************** -->
67<h2><a name="idea">User-defined Operators: the Idea</a></h2>
68<!-- *********************************************************************** -->
69
70<div>
71
72<p>
73The "operator overloading" that we will add to Kaleidoscope is more general than
74languages like C++.  In C++, you are only allowed to redefine existing
75operators: you can't programatically change the grammar, introduce new
76operators, change precedence levels, etc.  In this chapter, we will add this
77capability to Kaleidoscope, which will let the user round out the set of
78operators that are supported.</p>
79
80<p>The point of going into user-defined operators in a tutorial like this is to
81show the power and flexibility of using a hand-written parser.  Thus far, the parser
82we have been implementing uses recursive descent for most parts of the grammar and
83operator precedence parsing for the expressions.  See <a
84href="OCamlLangImpl2.html">Chapter 2</a> for details.  Without using operator
85precedence parsing, it would be very difficult to allow the programmer to
86introduce new operators into the grammar: the grammar is dynamically extensible
87as the JIT runs.</p>
88
89<p>The two specific features we'll add are programmable unary operators (right
90now, Kaleidoscope has no unary operators at all) as well as binary operators.
91An example of this is:</p>
92
93<div class="doc_code">
94<pre>
95# Logical unary not.
96def unary!(v)
97  if v then
98    0
99  else
100    1;
101
102# Define &gt; with the same precedence as &lt;.
103def binary&gt; 10 (LHS RHS)
104  RHS &lt; LHS;
105
106# Binary "logical or", (note that it does not "short circuit")
107def binary| 5 (LHS RHS)
108  if LHS then
109    1
110  else if RHS then
111    1
112  else
113    0;
114
115# Define = with slightly lower precedence than relationals.
116def binary= 9 (LHS RHS)
117  !(LHS &lt; RHS | LHS &gt; RHS);
118</pre>
119</div>
120
121<p>Many languages aspire to being able to implement their standard runtime
122library in the language itself.  In Kaleidoscope, we can implement significant
123parts of the language in the library!</p>
124
125<p>We will break down implementation of these features into two parts:
126implementing support for user-defined binary operators and adding unary
127operators.</p>
128
129</div>
130
131<!-- *********************************************************************** -->
132<h2><a name="binary">User-defined Binary Operators</a></h2>
133<!-- *********************************************************************** -->
134
135<div>
136
137<p>Adding support for user-defined binary operators is pretty simple with our
138current framework.  We'll first add support for the unary/binary keywords:</p>
139
140<div class="doc_code">
141<pre>
142type token =
143  ...
144  <b>(* operators *)
145  | Binary | Unary</b>
146
147...
148
149and lex_ident buffer = parser
150  ...
151      | "for" -&gt; [&lt; 'Token.For; stream &gt;]
152      | "in" -&gt; [&lt; 'Token.In; stream &gt;]
153      <b>| "binary" -&gt; [&lt; 'Token.Binary; stream &gt;]
154      | "unary" -&gt; [&lt; 'Token.Unary; stream &gt;]</b>
155</pre>
156</div>
157
158<p>This just adds lexer support for the unary and binary keywords, like we
159did in <a href="OCamlLangImpl5.html#iflexer">previous chapters</a>.  One nice
160thing about our current AST, is that we represent binary operators with full
161generalisation by using their ASCII code as the opcode.  For our extended
162operators, we'll use this same representation, so we don't need any new AST or
163parser support.</p>
164
165<p>On the other hand, we have to be able to represent the definitions of these
166new operators, in the "def binary| 5" part of the function definition.  In our
167grammar so far, the "name" for the function definition is parsed as the
168"prototype" production and into the <tt>Ast.Prototype</tt> AST node.  To
169represent our new user-defined operators as prototypes, we have to extend
170the  <tt>Ast.Prototype</tt> AST node like this:</p>
171
172<div class="doc_code">
173<pre>
174(* proto - This type represents the "prototype" for a function, which captures
175 * its name, and its argument names (thus implicitly the number of arguments the
176 * function takes). *)
177type proto =
178  | Prototype of string * string array
179  <b>| BinOpPrototype of string * string array * int</b>
180</pre>
181</div>
182
183<p>Basically, in addition to knowing a name for the prototype, we now keep track
184of whether it was an operator, and if it was, what precedence level the operator
185is at.  The precedence is only used for binary operators (as you'll see below,
186it just doesn't apply for unary operators).  Now that we have a way to represent
187the prototype for a user-defined operator, we need to parse it:</p>
188
189<div class="doc_code">
190<pre>
191(* prototype
192 *   ::= id '(' id* ')'
193 <b>*   ::= binary LETTER number? (id, id)
194 *   ::= unary LETTER number? (id) *)</b>
195let parse_prototype =
196  let rec parse_args accumulator = parser
197    | [&lt; 'Token.Ident id; e=parse_args (id::accumulator) &gt;] -&gt; e
198    | [&lt; &gt;] -&gt; accumulator
199  in
200  let parse_operator = parser
201    | [&lt; 'Token.Unary &gt;] -&gt; "unary", 1
202    | [&lt; 'Token.Binary &gt;] -&gt; "binary", 2
203  in
204  let parse_binary_precedence = parser
205    | [&lt; 'Token.Number n &gt;] -&gt; int_of_float n
206    | [&lt; &gt;] -&gt; 30
207  in
208  parser
209  | [&lt; 'Token.Ident id;
210       'Token.Kwd '(' ?? "expected '(' in prototype";
211       args=parse_args [];
212       'Token.Kwd ')' ?? "expected ')' in prototype" &gt;] -&gt;
213      (* success. *)
214      Ast.Prototype (id, Array.of_list (List.rev args))
215  <b>| [&lt; (prefix, kind)=parse_operator;
216       'Token.Kwd op ?? "expected an operator";
217       (* Read the precedence if present. *)
218       binary_precedence=parse_binary_precedence;
219       'Token.Kwd '(' ?? "expected '(' in prototype";
220        args=parse_args [];
221       'Token.Kwd ')' ?? "expected ')' in prototype" &gt;] -&gt;
222      let name = prefix ^ (String.make 1 op) in
223      let args = Array.of_list (List.rev args) in
224
225      (* Verify right number of arguments for operator. *)
226      if Array.length args != kind
227      then raise (Stream.Error "invalid number of operands for operator")
228      else
229        if kind == 1 then
230          Ast.Prototype (name, args)
231        else
232          Ast.BinOpPrototype (name, args, binary_precedence)</b>
233  | [&lt; &gt;] -&gt;
234      raise (Stream.Error "expected function name in prototype")
235</pre>
236</div>
237
238<p>This is all fairly straightforward parsing code, and we have already seen
239a lot of similar code in the past.  One interesting part about the code above is
240the couple lines that set up <tt>name</tt> for binary operators.  This builds
241names like "binary@" for a newly defined "@" operator.  This then takes
242advantage of the fact that symbol names in the LLVM symbol table are allowed to
243have any character in them, including embedded nul characters.</p>
244
245<p>The next interesting thing to add, is codegen support for these binary
246operators.  Given our current structure, this is a simple addition of a default
247case for our existing binary operator node:</p>
248
249<div class="doc_code">
250<pre>
251let codegen_expr = function
252  ...
253  | Ast.Binary (op, lhs, rhs) -&gt;
254      let lhs_val = codegen_expr lhs in
255      let rhs_val = codegen_expr rhs in
256      begin
257        match op with
258        | '+' -&gt; build_add lhs_val rhs_val "addtmp" builder
259        | '-' -&gt; build_sub lhs_val rhs_val "subtmp" builder
260        | '*' -&gt; build_mul lhs_val rhs_val "multmp" builder
261        | '&lt;' -&gt;
262            (* Convert bool 0/1 to double 0.0 or 1.0 *)
263            let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
264            build_uitofp i double_type "booltmp" builder
265        <b>| _ -&gt;
266            (* If it wasn't a builtin binary operator, it must be a user defined
267             * one. Emit a call to it. *)
268            let callee = "binary" ^ (String.make 1 op) in
269            let callee =
270              match lookup_function callee the_module with
271              | Some callee -&gt; callee
272              | None -&gt; raise (Error "binary operator not found!")
273            in
274            build_call callee [|lhs_val; rhs_val|] "binop" builder</b>
275      end
276</pre>
277</div>
278
279<p>As you can see above, the new code is actually really simple.  It just does
280a lookup for the appropriate operator in the symbol table and generates a
281function call to it.  Since user-defined operators are just built as normal
282functions (because the "prototype" boils down to a function with the right
283name) everything falls into place.</p>
284
285<p>The final piece of code we are missing, is a bit of top level magic:</p>
286
287<div class="doc_code">
288<pre>
289let codegen_func the_fpm = function
290  | Ast.Function (proto, body) -&gt;
291      Hashtbl.clear named_values;
292      let the_function = codegen_proto proto in
293
294      <b>(* If this is an operator, install it. *)
295      begin match proto with
296      | Ast.BinOpPrototype (name, args, prec) -&gt;
297          let op = name.[String.length name - 1] in
298          Hashtbl.add Parser.binop_precedence op prec;
299      | _ -&gt; ()
300      end;</b>
301
302      (* Create a new basic block to start insertion into. *)
303      let bb = append_block context "entry" the_function in
304      position_at_end bb builder;
305      ...
306</pre>
307</div>
308
309<p>Basically, before codegening a function, if it is a user-defined operator, we
310register it in the precedence table.  This allows the binary operator parsing
311logic we already have in place to handle it.  Since we are working on a
312fully-general operator precedence parser, this is all we need to do to "extend
313the grammar".</p>
314
315<p>Now we have useful user-defined binary operators.  This builds a lot
316on the previous framework we built for other operators.  Adding unary operators
317is a bit more challenging, because we don't have any framework for it yet - lets
318see what it takes.</p>
319
320</div>
321
322<!-- *********************************************************************** -->
323<h2><a name="unary">User-defined Unary Operators</a></h2>
324<!-- *********************************************************************** -->
325
326<div>
327
328<p>Since we don't currently support unary operators in the Kaleidoscope
329language, we'll need to add everything to support them.  Above, we added simple
330support for the 'unary' keyword to the lexer.  In addition to that, we need an
331AST node:</p>
332
333<div class="doc_code">
334<pre>
335type expr =
336  ...
337  (* variant for a unary operator. *)
338  | Unary of char * expr
339  ...
340</pre>
341</div>
342
343<p>This AST node is very simple and obvious by now.  It directly mirrors the
344binary operator AST node, except that it only has one child.  With this, we
345need to add the parsing logic.  Parsing a unary operator is pretty simple: we'll
346add a new function to do it:</p>
347
348<div class="doc_code">
349<pre>
350(* unary
351 *   ::= primary
352 *   ::= '!' unary *)
353and parse_unary = parser
354  (* If this is a unary operator, read it. *)
355  | [&lt; 'Token.Kwd op when op != '(' &amp;&amp; op != ')'; operand=parse_expr &gt;] -&gt;
356      Ast.Unary (op, operand)
357
358  (* If the current token is not an operator, it must be a primary expr. *)
359  | [&lt; stream &gt;] -&gt; parse_primary stream
360</pre>
361</div>
362
363<p>The grammar we add is pretty straightforward here.  If we see a unary
364operator when parsing a primary operator, we eat the operator as a prefix and
365parse the remaining piece as another unary operator.  This allows us to handle
366multiple unary operators (e.g. "!!x").  Note that unary operators can't have
367ambiguous parses like binary operators can, so there is no need for precedence
368information.</p>
369
370<p>The problem with this function, is that we need to call ParseUnary from
371somewhere.  To do this, we change previous callers of ParsePrimary to call
372<tt>parse_unary</tt> instead:</p>
373
374<div class="doc_code">
375<pre>
376(* binoprhs
377 *   ::= ('+' primary)* *)
378and parse_bin_rhs expr_prec lhs stream =
379        ...
380        <b>(* Parse the unary expression after the binary operator. *)
381        let rhs = parse_unary stream in</b>
382        ...
383
384...
385
386(* expression
387 *   ::= primary binoprhs *)
388and parse_expr = parser
389  | [&lt; lhs=<b>parse_unary</b>; stream &gt;] -&gt; parse_bin_rhs 0 lhs stream
390</pre>
391</div>
392
393<p>With these two simple changes, we are now able to parse unary operators and build the
394AST for them.  Next up, we need to add parser support for prototypes, to parse
395the unary operator prototype.  We extend the binary operator code above
396with:</p>
397
398<div class="doc_code">
399<pre>
400(* prototype
401 *   ::= id '(' id* ')'
402 *   ::= binary LETTER number? (id, id)
403 <b>*   ::= unary LETTER number? (id)</b> *)
404let parse_prototype =
405  let rec parse_args accumulator = parser
406    | [&lt; 'Token.Ident id; e=parse_args (id::accumulator) &gt;] -&gt; e
407    | [&lt; &gt;] -&gt; accumulator
408  in
409  <b>let parse_operator = parser
410    | [&lt; 'Token.Unary &gt;] -&gt; "unary", 1
411    | [&lt; 'Token.Binary &gt;] -&gt; "binary", 2
412  in</b>
413  let parse_binary_precedence = parser
414    | [&lt; 'Token.Number n &gt;] -&gt; int_of_float n
415    | [&lt; &gt;] -&gt; 30
416  in
417  parser
418  | [&lt; 'Token.Ident id;
419       'Token.Kwd '(' ?? "expected '(' in prototype";
420       args=parse_args [];
421       'Token.Kwd ')' ?? "expected ')' in prototype" &gt;] -&gt;
422      (* success. *)
423      Ast.Prototype (id, Array.of_list (List.rev args))
424  <b>| [&lt; (prefix, kind)=parse_operator;
425       'Token.Kwd op ?? "expected an operator";
426       (* Read the precedence if present. *)
427       binary_precedence=parse_binary_precedence;
428       'Token.Kwd '(' ?? "expected '(' in prototype";
429        args=parse_args [];
430       'Token.Kwd ')' ?? "expected ')' in prototype" &gt;] -&gt;
431      let name = prefix ^ (String.make 1 op) in
432      let args = Array.of_list (List.rev args) in
433
434      (* Verify right number of arguments for operator. *)
435      if Array.length args != kind
436      then raise (Stream.Error "invalid number of operands for operator")
437      else
438        if kind == 1 then
439          Ast.Prototype (name, args)
440        else
441          Ast.BinOpPrototype (name, args, binary_precedence)</b>
442  | [&lt; &gt;] -&gt;
443      raise (Stream.Error "expected function name in prototype")
444</pre>
445</div>
446
447<p>As with binary operators, we name unary operators with a name that includes
448the operator character.  This assists us at code generation time.  Speaking of,
449the final piece we need to add is codegen support for unary operators.  It looks
450like this:</p>
451
452<div class="doc_code">
453<pre>
454let rec codegen_expr = function
455  ...
456  | Ast.Unary (op, operand) -&gt;
457      let operand = codegen_expr operand in
458      let callee = "unary" ^ (String.make 1 op) in
459      let callee =
460        match lookup_function callee the_module with
461        | Some callee -&gt; callee
462        | None -&gt; raise (Error "unknown unary operator")
463      in
464      build_call callee [|operand|] "unop" builder
465</pre>
466</div>
467
468<p>This code is similar to, but simpler than, the code for binary operators.  It
469is simpler primarily because it doesn't need to handle any predefined operators.
470</p>
471
472</div>
473
474<!-- *********************************************************************** -->
475<h2><a name="example">Kicking the Tires</a></h2>
476<!-- *********************************************************************** -->
477
478<div>
479
480<p>It is somewhat hard to believe, but with a few simple extensions we've
481covered in the last chapters, we have grown a real-ish language.  With this, we
482can do a lot of interesting things, including I/O, math, and a bunch of other
483things.  For example, we can now add a nice sequencing operator (printd is
484defined to print out the specified value and a newline):</p>
485
486<div class="doc_code">
487<pre>
488ready&gt; <b>extern printd(x);</b>
489Read extern: declare double @printd(double)
490ready&gt; <b>def binary : 1 (x y) 0;  # Low-precedence operator that ignores operands.</b>
491..
492ready&gt; <b>printd(123) : printd(456) : printd(789);</b>
493123.000000
494456.000000
495789.000000
496Evaluated to 0.000000
497</pre>
498</div>
499
500<p>We can also define a bunch of other "primitive" operations, such as:</p>
501
502<div class="doc_code">
503<pre>
504# Logical unary not.
505def unary!(v)
506  if v then
507    0
508  else
509    1;
510
511# Unary negate.
512def unary-(v)
513  0-v;
514
515# Define &gt; with the same precedence as &lt;.
516def binary&gt; 10 (LHS RHS)
517  RHS &lt; LHS;
518
519# Binary logical or, which does not short circuit.
520def binary| 5 (LHS RHS)
521  if LHS then
522    1
523  else if RHS then
524    1
525  else
526    0;
527
528# Binary logical and, which does not short circuit.
529def binary&amp; 6 (LHS RHS)
530  if !LHS then
531    0
532  else
533    !!RHS;
534
535# Define = with slightly lower precedence than relationals.
536def binary = 9 (LHS RHS)
537  !(LHS &lt; RHS | LHS &gt; RHS);
538
539</pre>
540</div>
541
542
543<p>Given the previous if/then/else support, we can also define interesting
544functions for I/O.  For example, the following prints out a character whose
545"density" reflects the value passed in: the lower the value, the denser the
546character:</p>
547
548<div class="doc_code">
549<pre>
550ready&gt;
551<b>
552extern putchard(char)
553def printdensity(d)
554  if d &gt; 8 then
555    putchard(32)  # ' '
556  else if d &gt; 4 then
557    putchard(46)  # '.'
558  else if d &gt; 2 then
559    putchard(43)  # '+'
560  else
561    putchard(42); # '*'</b>
562...
563ready&gt; <b>printdensity(1): printdensity(2): printdensity(3) :
564          printdensity(4): printdensity(5): printdensity(9): putchard(10);</b>
565*++..
566Evaluated to 0.000000
567</pre>
568</div>
569
570<p>Based on these simple primitive operations, we can start to define more
571interesting things.  For example, here's a little function that solves for the
572number of iterations it takes a function in the complex plane to
573converge:</p>
574
575<div class="doc_code">
576<pre>
577# determine whether the specific location diverges.
578# Solve for z = z^2 + c in the complex plane.
579def mandleconverger(real imag iters creal cimag)
580  if iters &gt; 255 | (real*real + imag*imag &gt; 4) then
581    iters
582  else
583    mandleconverger(real*real - imag*imag + creal,
584                    2*real*imag + cimag,
585                    iters+1, creal, cimag);
586
587# return the number of iterations required for the iteration to escape
588def mandleconverge(real imag)
589  mandleconverger(real, imag, 0, real, imag);
590</pre>
591</div>
592
593<p>This "z = z<sup>2</sup> + c" function is a beautiful little creature that is the basis
594for computation of the <a
595href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot Set</a>.  Our
596<tt>mandelconverge</tt> function returns the number of iterations that it takes
597for a complex orbit to escape, saturating to 255.  This is not a very useful
598function by itself, but if you plot its value over a two-dimensional plane,
599you can see the Mandelbrot set.  Given that we are limited to using putchard
600here, our amazing graphical output is limited, but we can whip together
601something using the density plotter above:</p>
602
603<div class="doc_code">
604<pre>
605# compute and plot the mandlebrot set with the specified 2 dimensional range
606# info.
607def mandelhelp(xmin xmax xstep   ymin ymax ystep)
608  for y = ymin, y &lt; ymax, ystep in (
609    (for x = xmin, x &lt; xmax, xstep in
610       printdensity(mandleconverge(x,y)))
611    : putchard(10)
612  )
613
614# mandel - This is a convenient helper function for ploting the mandelbrot set
615# from the specified position with the specified Magnification.
616def mandel(realstart imagstart realmag imagmag)
617  mandelhelp(realstart, realstart+realmag*78, realmag,
618             imagstart, imagstart+imagmag*40, imagmag);
619</pre>
620</div>
621
622<p>Given this, we can try plotting out the mandlebrot set!  Lets try it out:</p>
623
624<div class="doc_code">
625<pre>
626ready&gt; <b>mandel(-2.3, -1.3, 0.05, 0.07);</b>
627*******************************+++++++++++*************************************
628*************************+++++++++++++++++++++++*******************************
629**********************+++++++++++++++++++++++++++++****************************
630*******************+++++++++++++++++++++.. ...++++++++*************************
631*****************++++++++++++++++++++++.... ...+++++++++***********************
632***************+++++++++++++++++++++++.....   ...+++++++++*********************
633**************+++++++++++++++++++++++....     ....+++++++++********************
634*************++++++++++++++++++++++......      .....++++++++*******************
635************+++++++++++++++++++++.......       .......+++++++******************
636***********+++++++++++++++++++....                ... .+++++++*****************
637**********+++++++++++++++++.......                     .+++++++****************
638*********++++++++++++++...........                    ...+++++++***************
639********++++++++++++............                      ...++++++++**************
640********++++++++++... ..........                        .++++++++**************
641*******+++++++++.....                                   .+++++++++*************
642*******++++++++......                                  ..+++++++++*************
643*******++++++.......                                   ..+++++++++*************
644*******+++++......                                     ..+++++++++*************
645*******.... ....                                      ...+++++++++*************
646*******.... .                                         ...+++++++++*************
647*******+++++......                                    ...+++++++++*************
648*******++++++.......                                   ..+++++++++*************
649*******++++++++......                                   .+++++++++*************
650*******+++++++++.....                                  ..+++++++++*************
651********++++++++++... ..........                        .++++++++**************
652********++++++++++++............                      ...++++++++**************
653*********++++++++++++++..........                     ...+++++++***************
654**********++++++++++++++++........                     .+++++++****************
655**********++++++++++++++++++++....                ... ..+++++++****************
656***********++++++++++++++++++++++.......       .......++++++++*****************
657************+++++++++++++++++++++++......      ......++++++++******************
658**************+++++++++++++++++++++++....      ....++++++++********************
659***************+++++++++++++++++++++++.....   ...+++++++++*********************
660*****************++++++++++++++++++++++....  ...++++++++***********************
661*******************+++++++++++++++++++++......++++++++*************************
662*********************++++++++++++++++++++++.++++++++***************************
663*************************+++++++++++++++++++++++*******************************
664******************************+++++++++++++************************************
665*******************************************************************************
666*******************************************************************************
667*******************************************************************************
668Evaluated to 0.000000
669ready&gt; <b>mandel(-2, -1, 0.02, 0.04);</b>
670**************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
671***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
672*********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
673*******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
674*****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
675***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
676**************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
677************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
678***********++++++++++++++++++++++++++++++++++++++++++++++++++........        .
679**********++++++++++++++++++++++++++++++++++++++++++++++.............
680********+++++++++++++++++++++++++++++++++++++++++++..................
681*******+++++++++++++++++++++++++++++++++++++++.......................
682******+++++++++++++++++++++++++++++++++++...........................
683*****++++++++++++++++++++++++++++++++............................
684*****++++++++++++++++++++++++++++...............................
685****++++++++++++++++++++++++++......   .........................
686***++++++++++++++++++++++++.........     ......    ...........
687***++++++++++++++++++++++............
688**+++++++++++++++++++++..............
689**+++++++++++++++++++................
690*++++++++++++++++++.................
691*++++++++++++++++............ ...
692*++++++++++++++..............
693*+++....++++................
694*..........  ...........
695*
696*..........  ...........
697*+++....++++................
698*++++++++++++++..............
699*++++++++++++++++............ ...
700*++++++++++++++++++.................
701**+++++++++++++++++++................
702**+++++++++++++++++++++..............
703***++++++++++++++++++++++............
704***++++++++++++++++++++++++.........     ......    ...........
705****++++++++++++++++++++++++++......   .........................
706*****++++++++++++++++++++++++++++...............................
707*****++++++++++++++++++++++++++++++++............................
708******+++++++++++++++++++++++++++++++++++...........................
709*******+++++++++++++++++++++++++++++++++++++++.......................
710********+++++++++++++++++++++++++++++++++++++++++++..................
711Evaluated to 0.000000
712ready&gt; <b>mandel(-0.9, -1.4, 0.02, 0.03);</b>
713*******************************************************************************
714*******************************************************************************
715*******************************************************************************
716**********+++++++++++++++++++++************************************************
717*+++++++++++++++++++++++++++++++++++++++***************************************
718+++++++++++++++++++++++++++++++++++++++++++++**********************************
719++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
720++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
721+++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
722+++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
723+++++++++++++++++++++++++++++++....   ......+++++++++++++++++++****************
724+++++++++++++++++++++++++++++.......  ........+++++++++++++++++++**************
725++++++++++++++++++++++++++++........   ........++++++++++++++++++++************
726+++++++++++++++++++++++++++.........     ..  ...+++++++++++++++++++++**********
727++++++++++++++++++++++++++...........        ....++++++++++++++++++++++********
728++++++++++++++++++++++++.............       .......++++++++++++++++++++++******
729+++++++++++++++++++++++.............        ........+++++++++++++++++++++++****
730++++++++++++++++++++++...........           ..........++++++++++++++++++++++***
731++++++++++++++++++++...........                .........++++++++++++++++++++++*
732++++++++++++++++++............                  ...........++++++++++++++++++++
733++++++++++++++++...............                 .............++++++++++++++++++
734++++++++++++++.................                 ...............++++++++++++++++
735++++++++++++..................                  .................++++++++++++++
736+++++++++..................                      .................+++++++++++++
737++++++........        .                               .........  ..++++++++++++
738++............                                         ......    ....++++++++++
739..............                                                    ...++++++++++
740..............                                                    ....+++++++++
741..............                                                    .....++++++++
742.............                                                    ......++++++++
743...........                                                     .......++++++++
744.........                                                       ........+++++++
745.........                                                       ........+++++++
746.........                                                           ....+++++++
747........                                                             ...+++++++
748.......                                                              ...+++++++
749                                                                    ....+++++++
750                                                                   .....+++++++
751                                                                    ....+++++++
752                                                                    ....+++++++
753                                                                    ....+++++++
754Evaluated to 0.000000
755ready&gt; <b>^D</b>
756</pre>
757</div>
758
759<p>At this point, you may be starting to realize that Kaleidoscope is a real
760and powerful language.  It may not be self-similar :), but it can be used to
761plot things that are!</p>
762
763<p>With this, we conclude the "adding user-defined operators" chapter of the
764tutorial.  We have successfully augmented our language, adding the ability to
765extend the language in the library, and we have shown how this can be used to
766build a simple but interesting end-user application in Kaleidoscope.  At this
767point, Kaleidoscope can build a variety of applications that are functional and
768can call functions with side-effects, but it can't actually define and mutate a
769variable itself.</p>
770
771<p>Strikingly, variable mutation is an important feature of some
772languages, and it is not at all obvious how to <a href="OCamlLangImpl7.html">add
773support for mutable variables</a> without having to add an "SSA construction"
774phase to your front-end.  In the next chapter, we will describe how you can
775add variable mutation without building SSA in your front-end.</p>
776
777</div>
778
779
780<!-- *********************************************************************** -->
781<h2><a name="code">Full Code Listing</a></h2>
782<!-- *********************************************************************** -->
783
784<div>
785
786<p>
787Here is the complete code listing for our running example, enhanced with the
788if/then/else and for expressions..  To build this example, use:
789</p>
790
791<div class="doc_code">
792<pre>
793# Compile
794ocamlbuild toy.byte
795# Run
796./toy.byte
797</pre>
798</div>
799
800<p>Here is the code:</p>
801
802<dl>
803<dt>_tags:</dt>
804<dd class="doc_code">
805<pre>
806&lt;{lexer,parser}.ml&gt;: use_camlp4, pp(camlp4of)
807&lt;*.{byte,native}&gt;: g++, use_llvm, use_llvm_analysis
808&lt;*.{byte,native}&gt;: use_llvm_executionengine, use_llvm_target
809&lt;*.{byte,native}&gt;: use_llvm_scalar_opts, use_bindings
810</pre>
811</dd>
812
813<dt>myocamlbuild.ml:</dt>
814<dd class="doc_code">
815<pre>
816open Ocamlbuild_plugin;;
817
818ocaml_lib ~extern:true "llvm";;
819ocaml_lib ~extern:true "llvm_analysis";;
820ocaml_lib ~extern:true "llvm_executionengine";;
821ocaml_lib ~extern:true "llvm_target";;
822ocaml_lib ~extern:true "llvm_scalar_opts";;
823
824flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"; A"-cclib"; A"-rdynamic"]);;
825dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
826</pre>
827</dd>
828
829<dt>token.ml:</dt>
830<dd class="doc_code">
831<pre>
832(*===----------------------------------------------------------------------===
833 * Lexer Tokens
834 *===----------------------------------------------------------------------===*)
835
836(* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
837 * these others for known things. *)
838type token =
839  (* commands *)
840  | Def | Extern
841
842  (* primary *)
843  | Ident of string | Number of float
844
845  (* unknown *)
846  | Kwd of char
847
848  (* control *)
849  | If | Then | Else
850  | For | In
851
852  (* operators *)
853  | Binary | Unary
854</pre>
855</dd>
856
857<dt>lexer.ml:</dt>
858<dd class="doc_code">
859<pre>
860(*===----------------------------------------------------------------------===
861 * Lexer
862 *===----------------------------------------------------------------------===*)
863
864let rec lex = parser
865  (* Skip any whitespace. *)
866  | [&lt; ' (' ' | '\n' | '\r' | '\t'); stream &gt;] -&gt; lex stream
867
868  (* identifier: [a-zA-Z][a-zA-Z0-9] *)
869  | [&lt; ' ('A' .. 'Z' | 'a' .. 'z' as c); stream &gt;] -&gt;
870      let buffer = Buffer.create 1 in
871      Buffer.add_char buffer c;
872      lex_ident buffer stream
873
874  (* number: [0-9.]+ *)
875  | [&lt; ' ('0' .. '9' as c); stream &gt;] -&gt;
876      let buffer = Buffer.create 1 in
877      Buffer.add_char buffer c;
878      lex_number buffer stream
879
880  (* Comment until end of line. *)
881  | [&lt; ' ('#'); stream &gt;] -&gt;
882      lex_comment stream
883
884  (* Otherwise, just return the character as its ascii value. *)
885  | [&lt; 'c; stream &gt;] -&gt;
886      [&lt; 'Token.Kwd c; lex stream &gt;]
887
888  (* end of stream. *)
889  | [&lt; &gt;] -&gt; [&lt; &gt;]
890
891and lex_number buffer = parser
892  | [&lt; ' ('0' .. '9' | '.' as c); stream &gt;] -&gt;
893      Buffer.add_char buffer c;
894      lex_number buffer stream
895  | [&lt; stream=lex &gt;] -&gt;
896      [&lt; 'Token.Number (float_of_string (Buffer.contents buffer)); stream &gt;]
897
898and lex_ident buffer = parser
899  | [&lt; ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream &gt;] -&gt;
900      Buffer.add_char buffer c;
901      lex_ident buffer stream
902  | [&lt; stream=lex &gt;] -&gt;
903      match Buffer.contents buffer with
904      | "def" -&gt; [&lt; 'Token.Def; stream &gt;]
905      | "extern" -&gt; [&lt; 'Token.Extern; stream &gt;]
906      | "if" -&gt; [&lt; 'Token.If; stream &gt;]
907      | "then" -&gt; [&lt; 'Token.Then; stream &gt;]
908      | "else" -&gt; [&lt; 'Token.Else; stream &gt;]
909      | "for" -&gt; [&lt; 'Token.For; stream &gt;]
910      | "in" -&gt; [&lt; 'Token.In; stream &gt;]
911      | "binary" -&gt; [&lt; 'Token.Binary; stream &gt;]
912      | "unary" -&gt; [&lt; 'Token.Unary; stream &gt;]
913      | id -&gt; [&lt; 'Token.Ident id; stream &gt;]
914
915and lex_comment = parser
916  | [&lt; ' ('\n'); stream=lex &gt;] -&gt; stream
917  | [&lt; 'c; e=lex_comment &gt;] -&gt; e
918  | [&lt; &gt;] -&gt; [&lt; &gt;]
919</pre>
920</dd>
921
922<dt>ast.ml:</dt>
923<dd class="doc_code">
924<pre>
925(*===----------------------------------------------------------------------===
926 * Abstract Syntax Tree (aka Parse Tree)
927 *===----------------------------------------------------------------------===*)
928
929(* expr - Base type for all expression nodes. *)
930type expr =
931  (* variant for numeric literals like "1.0". *)
932  | Number of float
933
934  (* variant for referencing a variable, like "a". *)
935  | Variable of string
936
937  (* variant for a unary operator. *)
938  | Unary of char * expr
939
940  (* variant for a binary operator. *)
941  | Binary of char * expr * expr
942
943  (* variant for function calls. *)
944  | Call of string * expr array
945
946  (* variant for if/then/else. *)
947  | If of expr * expr * expr
948
949  (* variant for for/in. *)
950  | For of string * expr * expr * expr option * expr
951
952(* proto - This type represents the "prototype" for a function, which captures
953 * its name, and its argument names (thus implicitly the number of arguments the
954 * function takes). *)
955type proto =
956  | Prototype of string * string array
957  | BinOpPrototype of string * string array * int
958
959(* func - This type represents a function definition itself. *)
960type func = Function of proto * expr
961</pre>
962</dd>
963
964<dt>parser.ml:</dt>
965<dd class="doc_code">
966<pre>
967(*===---------------------------------------------------------------------===
968 * Parser
969 *===---------------------------------------------------------------------===*)
970
971(* binop_precedence - This holds the precedence for each binary operator that is
972 * defined *)
973let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
974
975(* precedence - Get the precedence of the pending binary operator token. *)
976let precedence c = try Hashtbl.find binop_precedence c with Not_found -&gt; -1
977
978(* primary
979 *   ::= identifier
980 *   ::= numberexpr
981 *   ::= parenexpr
982 *   ::= ifexpr
983 *   ::= forexpr *)
984let rec parse_primary = parser
985  (* numberexpr ::= number *)
986  | [&lt; 'Token.Number n &gt;] -&gt; Ast.Number n
987
988  (* parenexpr ::= '(' expression ')' *)
989  | [&lt; 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" &gt;] -&gt; e
990
991  (* identifierexpr
992   *   ::= identifier
993   *   ::= identifier '(' argumentexpr ')' *)
994  | [&lt; 'Token.Ident id; stream &gt;] -&gt;
995      let rec parse_args accumulator = parser
996        | [&lt; e=parse_expr; stream &gt;] -&gt;
997            begin parser
998              | [&lt; 'Token.Kwd ','; e=parse_args (e :: accumulator) &gt;] -&gt; e
999              | [&lt; &gt;] -&gt; e :: accumulator
1000            end stream
1001        | [&lt; &gt;] -&gt; accumulator
1002      in
1003      let rec parse_ident id = parser
1004        (* Call. *)
1005        | [&lt; 'Token.Kwd '(';
1006             args=parse_args [];
1007             'Token.Kwd ')' ?? "expected ')'"&gt;] -&gt;
1008            Ast.Call (id, Array.of_list (List.rev args))
1009
1010        (* Simple variable ref. *)
1011        | [&lt; &gt;] -&gt; Ast.Variable id
1012      in
1013      parse_ident id stream
1014
1015  (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
1016  | [&lt; 'Token.If; c=parse_expr;
1017       'Token.Then ?? "expected 'then'"; t=parse_expr;
1018       'Token.Else ?? "expected 'else'"; e=parse_expr &gt;] -&gt;
1019      Ast.If (c, t, e)
1020
1021  (* forexpr
1022        ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
1023  | [&lt; 'Token.For;
1024       'Token.Ident id ?? "expected identifier after for";
1025       'Token.Kwd '=' ?? "expected '=' after for";
1026       stream &gt;] -&gt;
1027      begin parser
1028        | [&lt;
1029             start=parse_expr;
1030             'Token.Kwd ',' ?? "expected ',' after for";
1031             end_=parse_expr;
1032             stream &gt;] -&gt;
1033            let step =
1034              begin parser
1035              | [&lt; 'Token.Kwd ','; step=parse_expr &gt;] -&gt; Some step
1036              | [&lt; &gt;] -&gt; None
1037              end stream
1038            in
1039            begin parser
1040            | [&lt; 'Token.In; body=parse_expr &gt;] -&gt;
1041                Ast.For (id, start, end_, step, body)
1042            | [&lt; &gt;] -&gt;
1043                raise (Stream.Error "expected 'in' after for")
1044            end stream
1045        | [&lt; &gt;] -&gt;
1046            raise (Stream.Error "expected '=' after for")
1047      end stream
1048
1049  | [&lt; &gt;] -&gt; raise (Stream.Error "unknown token when expecting an expression.")
1050
1051(* unary
1052 *   ::= primary
1053 *   ::= '!' unary *)
1054and parse_unary = parser
1055  (* If this is a unary operator, read it. *)
1056  | [&lt; 'Token.Kwd op when op != '(' &amp;&amp; op != ')'; operand=parse_expr &gt;] -&gt;
1057      Ast.Unary (op, operand)
1058
1059  (* If the current token is not an operator, it must be a primary expr. *)
1060  | [&lt; stream &gt;] -&gt; parse_primary stream
1061
1062(* binoprhs
1063 *   ::= ('+' primary)* *)
1064and parse_bin_rhs expr_prec lhs stream =
1065  match Stream.peek stream with
1066  (* If this is a binop, find its precedence. *)
1067  | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c -&gt;
1068      let token_prec = precedence c in
1069
1070      (* If this is a binop that binds at least as tightly as the current binop,
1071       * consume it, otherwise we are done. *)
1072      if token_prec &lt; expr_prec then lhs else begin
1073        (* Eat the binop. *)
1074        Stream.junk stream;
1075
1076        (* Parse the unary expression after the binary operator. *)
1077        let rhs = parse_unary stream in
1078
1079        (* Okay, we know this is a binop. *)
1080        let rhs =
1081          match Stream.peek stream with
1082          | Some (Token.Kwd c2) -&gt;
1083              (* If BinOp binds less tightly with rhs than the operator after
1084               * rhs, let the pending operator take rhs as its lhs. *)
1085              let next_prec = precedence c2 in
1086              if token_prec &lt; next_prec
1087              then parse_bin_rhs (token_prec + 1) rhs stream
1088              else rhs
1089          | _ -&gt; rhs
1090        in
1091
1092        (* Merge lhs/rhs. *)
1093        let lhs = Ast.Binary (c, lhs, rhs) in
1094        parse_bin_rhs expr_prec lhs stream
1095      end
1096  | _ -&gt; lhs
1097
1098(* expression
1099 *   ::= primary binoprhs *)
1100and parse_expr = parser
1101  | [&lt; lhs=parse_unary; stream &gt;] -&gt; parse_bin_rhs 0 lhs stream
1102
1103(* prototype
1104 *   ::= id '(' id* ')'
1105 *   ::= binary LETTER number? (id, id)
1106 *   ::= unary LETTER number? (id) *)
1107let parse_prototype =
1108  let rec parse_args accumulator = parser
1109    | [&lt; 'Token.Ident id; e=parse_args (id::accumulator) &gt;] -&gt; e
1110    | [&lt; &gt;] -&gt; accumulator
1111  in
1112  let parse_operator = parser
1113    | [&lt; 'Token.Unary &gt;] -&gt; "unary", 1
1114    | [&lt; 'Token.Binary &gt;] -&gt; "binary", 2
1115  in
1116  let parse_binary_precedence = parser
1117    | [&lt; 'Token.Number n &gt;] -&gt; int_of_float n
1118    | [&lt; &gt;] -&gt; 30
1119  in
1120  parser
1121  | [&lt; 'Token.Ident id;
1122       'Token.Kwd '(' ?? "expected '(' in prototype";
1123       args=parse_args [];
1124       'Token.Kwd ')' ?? "expected ')' in prototype" &gt;] -&gt;
1125      (* success. *)
1126      Ast.Prototype (id, Array.of_list (List.rev args))
1127  | [&lt; (prefix, kind)=parse_operator;
1128       'Token.Kwd op ?? "expected an operator";
1129       (* Read the precedence if present. *)
1130       binary_precedence=parse_binary_precedence;
1131       'Token.Kwd '(' ?? "expected '(' in prototype";
1132        args=parse_args [];
1133       'Token.Kwd ')' ?? "expected ')' in prototype" &gt;] -&gt;
1134      let name = prefix ^ (String.make 1 op) in
1135      let args = Array.of_list (List.rev args) in
1136
1137      (* Verify right number of arguments for operator. *)
1138      if Array.length args != kind
1139      then raise (Stream.Error "invalid number of operands for operator")
1140      else
1141        if kind == 1 then
1142          Ast.Prototype (name, args)
1143        else
1144          Ast.BinOpPrototype (name, args, binary_precedence)
1145  | [&lt; &gt;] -&gt;
1146      raise (Stream.Error "expected function name in prototype")
1147
1148(* definition ::= 'def' prototype expression *)
1149let parse_definition = parser
1150  | [&lt; 'Token.Def; p=parse_prototype; e=parse_expr &gt;] -&gt;
1151      Ast.Function (p, e)
1152
1153(* toplevelexpr ::= expression *)
1154let parse_toplevel = parser
1155  | [&lt; e=parse_expr &gt;] -&gt;
1156      (* Make an anonymous proto. *)
1157      Ast.Function (Ast.Prototype ("", [||]), e)
1158
1159(*  external ::= 'extern' prototype *)
1160let parse_extern = parser
1161  | [&lt; 'Token.Extern; e=parse_prototype &gt;] -&gt; e
1162</pre>
1163</dd>
1164
1165<dt>codegen.ml:</dt>
1166<dd class="doc_code">
1167<pre>
1168(*===----------------------------------------------------------------------===
1169 * Code Generation
1170 *===----------------------------------------------------------------------===*)
1171
1172open Llvm
1173
1174exception Error of string
1175
1176let context = global_context ()
1177let the_module = create_module context "my cool jit"
1178let builder = builder context
1179let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
1180let double_type = double_type context
1181
1182let rec codegen_expr = function
1183  | Ast.Number n -&gt; const_float double_type n
1184  | Ast.Variable name -&gt;
1185      (try Hashtbl.find named_values name with
1186        | Not_found -&gt; raise (Error "unknown variable name"))
1187  | Ast.Unary (op, operand) -&gt;
1188      let operand = codegen_expr operand in
1189      let callee = "unary" ^ (String.make 1 op) in
1190      let callee =
1191        match lookup_function callee the_module with
1192        | Some callee -&gt; callee
1193        | None -&gt; raise (Error "unknown unary operator")
1194      in
1195      build_call callee [|operand|] "unop" builder
1196  | Ast.Binary (op, lhs, rhs) -&gt;
1197      let lhs_val = codegen_expr lhs in
1198      let rhs_val = codegen_expr rhs in
1199      begin
1200        match op with
1201        | '+' -&gt; build_add lhs_val rhs_val "addtmp" builder
1202        | '-' -&gt; build_sub lhs_val rhs_val "subtmp" builder
1203        | '*' -&gt; build_mul lhs_val rhs_val "multmp" builder
1204        | '&lt;' -&gt;
1205            (* Convert bool 0/1 to double 0.0 or 1.0 *)
1206            let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
1207            build_uitofp i double_type "booltmp" builder
1208        | _ -&gt;
1209            (* If it wasn't a builtin binary operator, it must be a user defined
1210             * one. Emit a call to it. *)
1211            let callee = "binary" ^ (String.make 1 op) in
1212            let callee =
1213              match lookup_function callee the_module with
1214              | Some callee -&gt; callee
1215              | None -&gt; raise (Error "binary operator not found!")
1216            in
1217            build_call callee [|lhs_val; rhs_val|] "binop" builder
1218      end
1219  | Ast.Call (callee, args) -&gt;
1220      (* Look up the name in the module table. *)
1221      let callee =
1222        match lookup_function callee the_module with
1223        | Some callee -&gt; callee
1224        | None -&gt; raise (Error "unknown function referenced")
1225      in
1226      let params = params callee in
1227
1228      (* If argument mismatch error. *)
1229      if Array.length params == Array.length args then () else
1230        raise (Error "incorrect # arguments passed");
1231      let args = Array.map codegen_expr args in
1232      build_call callee args "calltmp" builder
1233  | Ast.If (cond, then_, else_) -&gt;
1234      let cond = codegen_expr cond in
1235
1236      (* Convert condition to a bool by comparing equal to 0.0 *)
1237      let zero = const_float double_type 0.0 in
1238      let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
1239
1240      (* Grab the first block so that we might later add the conditional branch
1241       * to it at the end of the function. *)
1242      let start_bb = insertion_block builder in
1243      let the_function = block_parent start_bb in
1244
1245      let then_bb = append_block context "then" the_function in
1246
1247      (* Emit 'then' value. *)
1248      position_at_end then_bb builder;
1249      let then_val = codegen_expr then_ in
1250
1251      (* Codegen of 'then' can change the current block, update then_bb for the
1252       * phi. We create a new name because one is used for the phi node, and the
1253       * other is used for the conditional branch. *)
1254      let new_then_bb = insertion_block builder in
1255
1256      (* Emit 'else' value. *)
1257      let else_bb = append_block context "else" the_function in
1258      position_at_end else_bb builder;
1259      let else_val = codegen_expr else_ in
1260
1261      (* Codegen of 'else' can change the current block, update else_bb for the
1262       * phi. *)
1263      let new_else_bb = insertion_block builder in
1264
1265      (* Emit merge block. *)
1266      let merge_bb = append_block context "ifcont" the_function in
1267      position_at_end merge_bb builder;
1268      let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
1269      let phi = build_phi incoming "iftmp" builder in
1270
1271      (* Return to the start block to add the conditional branch. *)
1272      position_at_end start_bb builder;
1273      ignore (build_cond_br cond_val then_bb else_bb builder);
1274
1275      (* Set a unconditional branch at the end of the 'then' block and the
1276       * 'else' block to the 'merge' block. *)
1277      position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
1278      position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
1279
1280      (* Finally, set the builder to the end of the merge block. *)
1281      position_at_end merge_bb builder;
1282
1283      phi
1284  | Ast.For (var_name, start, end_, step, body) -&gt;
1285      (* Emit the start code first, without 'variable' in scope. *)
1286      let start_val = codegen_expr start in
1287
1288      (* Make the new basic block for the loop header, inserting after current
1289       * block. *)
1290      let preheader_bb = insertion_block builder in
1291      let the_function = block_parent preheader_bb in
1292      let loop_bb = append_block context "loop" the_function in
1293
1294      (* Insert an explicit fall through from the current block to the
1295       * loop_bb. *)
1296      ignore (build_br loop_bb builder);
1297
1298      (* Start insertion in loop_bb. *)
1299      position_at_end loop_bb builder;
1300
1301      (* Start the PHI node with an entry for start. *)
1302      let variable = build_phi [(start_val, preheader_bb)] var_name builder in
1303
1304      (* Within the loop, the variable is defined equal to the PHI node. If it
1305       * shadows an existing variable, we have to restore it, so save it
1306       * now. *)
1307      let old_val =
1308        try Some (Hashtbl.find named_values var_name) with Not_found -&gt; None
1309      in
1310      Hashtbl.add named_values var_name variable;
1311
1312      (* Emit the body of the loop.  This, like any other expr, can change the
1313       * current BB.  Note that we ignore the value computed by the body, but
1314       * don't allow an error *)
1315      ignore (codegen_expr body);
1316
1317      (* Emit the step value. *)
1318      let step_val =
1319        match step with
1320        | Some step -&gt; codegen_expr step
1321        (* If not specified, use 1.0. *)
1322        | None -&gt; const_float double_type 1.0
1323      in
1324
1325      let next_var = build_add variable step_val "nextvar" builder in
1326
1327      (* Compute the end condition. *)
1328      let end_cond = codegen_expr end_ in
1329
1330      (* Convert condition to a bool by comparing equal to 0.0. *)
1331      let zero = const_float double_type 0.0 in
1332      let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
1333
1334      (* Create the "after loop" block and insert it. *)
1335      let loop_end_bb = insertion_block builder in
1336      let after_bb = append_block context "afterloop" the_function in
1337
1338      (* Insert the conditional branch into the end of loop_end_bb. *)
1339      ignore (build_cond_br end_cond loop_bb after_bb builder);
1340
1341      (* Any new code will be inserted in after_bb. *)
1342      position_at_end after_bb builder;
1343
1344      (* Add a new entry to the PHI node for the backedge. *)
1345      add_incoming (next_var, loop_end_bb) variable;
1346
1347      (* Restore the unshadowed variable. *)
1348      begin match old_val with
1349      | Some old_val -&gt; Hashtbl.add named_values var_name old_val
1350      | None -&gt; ()
1351      end;
1352
1353      (* for expr always returns 0.0. *)
1354      const_null double_type
1355
1356let codegen_proto = function
1357  | Ast.Prototype (name, args) | Ast.BinOpPrototype (name, args, _) -&gt;
1358      (* Make the function type: double(double,double) etc. *)
1359      let doubles = Array.make (Array.length args) double_type in
1360      let ft = function_type double_type doubles in
1361      let f =
1362        match lookup_function name the_module with
1363        | None -&gt; declare_function name ft the_module
1364
1365        (* If 'f' conflicted, there was already something named 'name'. If it
1366         * has a body, don't allow redefinition or reextern. *)
1367        | Some f -&gt;
1368            (* If 'f' already has a body, reject this. *)
1369            if block_begin f &lt;&gt; At_end f then
1370              raise (Error "redefinition of function");
1371
1372            (* If 'f' took a different number of arguments, reject. *)
1373            if element_type (type_of f) &lt;&gt; ft then
1374              raise (Error "redefinition of function with different # args");
1375            f
1376      in
1377
1378      (* Set names for all arguments. *)
1379      Array.iteri (fun i a -&gt;
1380        let n = args.(i) in
1381        set_value_name n a;
1382        Hashtbl.add named_values n a;
1383      ) (params f);
1384      f
1385
1386let codegen_func the_fpm = function
1387  | Ast.Function (proto, body) -&gt;
1388      Hashtbl.clear named_values;
1389      let the_function = codegen_proto proto in
1390
1391      (* If this is an operator, install it. *)
1392      begin match proto with
1393      | Ast.BinOpPrototype (name, args, prec) -&gt;
1394          let op = name.[String.length name - 1] in
1395          Hashtbl.add Parser.binop_precedence op prec;
1396      | _ -&gt; ()
1397      end;
1398
1399      (* Create a new basic block to start insertion into. *)
1400      let bb = append_block context "entry" the_function in
1401      position_at_end bb builder;
1402
1403      try
1404        let ret_val = codegen_expr body in
1405
1406        (* Finish off the function. *)
1407        let _ = build_ret ret_val builder in
1408
1409        (* Validate the generated code, checking for consistency. *)
1410        Llvm_analysis.assert_valid_function the_function;
1411
1412        (* Optimize the function. *)
1413        let _ = PassManager.run_function the_function the_fpm in
1414
1415        the_function
1416      with e -&gt;
1417        delete_function the_function;
1418        raise e
1419</pre>
1420</dd>
1421
1422<dt>toplevel.ml:</dt>
1423<dd class="doc_code">
1424<pre>
1425(*===----------------------------------------------------------------------===
1426 * Top-Level parsing and JIT Driver
1427 *===----------------------------------------------------------------------===*)
1428
1429open Llvm
1430open Llvm_executionengine
1431
1432(* top ::= definition | external | expression | ';' *)
1433let rec main_loop the_fpm the_execution_engine stream =
1434  match Stream.peek stream with
1435  | None -&gt; ()
1436
1437  (* ignore top-level semicolons. *)
1438  | Some (Token.Kwd ';') -&gt;
1439      Stream.junk stream;
1440      main_loop the_fpm the_execution_engine stream
1441
1442  | Some token -&gt;
1443      begin
1444        try match token with
1445        | Token.Def -&gt;
1446            let e = Parser.parse_definition stream in
1447            print_endline "parsed a function definition.";
1448            dump_value (Codegen.codegen_func the_fpm e);
1449        | Token.Extern -&gt;
1450            let e = Parser.parse_extern stream in
1451            print_endline "parsed an extern.";
1452            dump_value (Codegen.codegen_proto e);
1453        | _ -&gt;
1454            (* Evaluate a top-level expression into an anonymous function. *)
1455            let e = Parser.parse_toplevel stream in
1456            print_endline "parsed a top-level expr";
1457            let the_function = Codegen.codegen_func the_fpm e in
1458            dump_value the_function;
1459
1460            (* JIT the function, returning a function pointer. *)
1461            let result = ExecutionEngine.run_function the_function [||]
1462              the_execution_engine in
1463
1464            print_string "Evaluated to ";
1465            print_float (GenericValue.as_float Codegen.double_type result);
1466            print_newline ();
1467        with Stream.Error s | Codegen.Error s -&gt;
1468          (* Skip token for error recovery. *)
1469          Stream.junk stream;
1470          print_endline s;
1471      end;
1472      print_string "ready&gt; "; flush stdout;
1473      main_loop the_fpm the_execution_engine stream
1474</pre>
1475</dd>
1476
1477<dt>toy.ml:</dt>
1478<dd class="doc_code">
1479<pre>
1480(*===----------------------------------------------------------------------===
1481 * Main driver code.
1482 *===----------------------------------------------------------------------===*)
1483
1484open Llvm
1485open Llvm_executionengine
1486open Llvm_target
1487open Llvm_scalar_opts
1488
1489let main () =
1490  ignore (initialize_native_target ());
1491
1492  (* Install standard binary operators.
1493   * 1 is the lowest precedence. *)
1494  Hashtbl.add Parser.binop_precedence '&lt;' 10;
1495  Hashtbl.add Parser.binop_precedence '+' 20;
1496  Hashtbl.add Parser.binop_precedence '-' 20;
1497  Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
1498
1499  (* Prime the first token. *)
1500  print_string "ready&gt; "; flush stdout;
1501  let stream = Lexer.lex (Stream.of_channel stdin) in
1502
1503  (* Create the JIT. *)
1504  let the_execution_engine = ExecutionEngine.create Codegen.the_module in
1505  let the_fpm = PassManager.create_function Codegen.the_module in
1506
1507  (* Set up the optimizer pipeline.  Start with registering info about how the
1508   * target lays out data structures. *)
1509  TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
1510
1511  (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
1512  add_instruction_combination the_fpm;
1513
1514  (* reassociate expressions. *)
1515  add_reassociation the_fpm;
1516
1517  (* Eliminate Common SubExpressions. *)
1518  add_gvn the_fpm;
1519
1520  (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
1521  add_cfg_simplification the_fpm;
1522
1523  ignore (PassManager.initialize the_fpm);
1524
1525  (* Run the main "interpreter loop" now. *)
1526  Toplevel.main_loop the_fpm the_execution_engine stream;
1527
1528  (* Print out all the generated code. *)
1529  dump_module Codegen.the_module
1530;;
1531
1532main ()
1533</pre>
1534</dd>
1535
1536<dt>bindings.c</dt>
1537<dd class="doc_code">
1538<pre>
1539#include &lt;stdio.h&gt;
1540
1541/* putchard - putchar that takes a double and returns 0. */
1542extern double putchard(double X) {
1543  putchar((char)X);
1544  return 0;
1545}
1546
1547/* printd - printf that takes a double prints it as "%f\n", returning 0. */
1548extern double printd(double X) {
1549  printf("%f\n", X);
1550  return 0;
1551}
1552</pre>
1553</dd>
1554</dl>
1555
1556<a href="OCamlLangImpl7.html">Next: Extending the language: mutable variables /
1557SSA construction</a>
1558</div>
1559
1560<!-- *********************************************************************** -->
1561<hr>
1562<address>
1563  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1564  src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1565  <a href="http://validator.w3.org/check/referer"><img
1566  src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1567
1568  <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1569  <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a><br>
1570  <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br>
1571  Last modified: $Date: 2011-04-22 20:30:22 -0400 (Fri, 22 Apr 2011) $
1572</address>
1573</body>
1574</html>
1575