This is ../../../doc/bison.info, produced by makeinfo version 4.13 from
../../../doc/bison.texi.

This manual (9 December 2012) is for GNU Bison (version 2.7), the GNU
parser generator.

   Copyright (C) 1988-1993, 1995, 1998-2012 Free Software Foundation,
Inc.

     Permission is granted to copy, distribute and/or modify this
     document under the terms of the GNU Free Documentation License,
     Version 1.3 or any later version published by the Free Software
     Foundation; with no Invariant Sections, with the Front-Cover texts
     being "A GNU Manual," and with the Back-Cover Texts as in (a)
     below. A copy of the license is included in the section entitled
     "GNU Free Documentation License."

     (a) The FSF's Back-Cover Text is: "You have the freedom to copy and
     modify this GNU manual. Buying copies from the FSF supports it in
     developing GNU and promoting software freedom."

INFO-DIR-SECTION Software development
START-INFO-DIR-ENTRY
* bison: (bison).  GNU parser generator (Yacc replacement).
END-INFO-DIR-ENTRY


File: bison.info, Node: Top, Next: Introduction, Up: (dir)

Bison
*****

This manual (9 December 2012) is for GNU Bison (version 2.7), the GNU
parser generator.

* Menu:

* Introduction::
* Conditions::
* Copying::  The GNU General Public License says
        how you can copy and share Bison.

Tutorial sections:
* Concepts::  Basic concepts for understanding Bison.
* Examples::  Three simple explained examples of using Bison.

Reference sections:
* Grammar File::  Writing Bison declarations and rules.
* Interface::  C-language interface to the parser function `yyparse'.
* Algorithm::  How the Bison parser works at run-time.
* Error Recovery::  Writing rules for error recovery.
* Context Dependency::  What to do if your language syntax is too
        messy for Bison to handle straightforwardly.
* Debugging::  Understanding or debugging Bison parsers.
* Invocation::  How to run Bison (to produce the parser implementation).
* Other Languages::  Creating C++ and Java parsers.
* FAQ::  Frequently Asked Questions
* Table of Symbols::  All the keywords of the Bison language are explained.
* Glossary::  Basic concepts are explained.
* Copying This Manual::  License for copying this manual.
* Bibliography::  Publications cited in this manual.
* Index of Terms::  Cross-references to the text.

 --- The Detailed Node Listing ---

The Concepts of Bison

* Language and Grammar::  Languages and context-free grammars,
        as mathematical ideas.
* Grammar in Bison::  How we represent grammars for Bison's sake.
* Semantic Values::  Each token or syntactic grouping can have
        a semantic value (the value of an integer,
        the name of an identifier, etc.).
* Semantic Actions::  Each rule can have an action containing C code.
* GLR Parsers::  Writing parsers for general context-free languages.
* Locations::  Overview of location tracking.
* Bison Parser::  What are Bison's input and output,
        how is the output used?
* Stages::  Stages in writing and running Bison grammars.
* Grammar Layout::  Overall structure of a Bison grammar file.

Writing GLR Parsers

* Simple GLR Parsers::  Using GLR parsers on unambiguous grammars.
* Merging GLR Parses::  Using GLR parsers to resolve ambiguities.
* GLR Semantic Actions::  Deferred semantic actions have special concerns.
* Compiler Requirements::  GLR parsers require a modern C compiler.

Examples

* RPN Calc::  Reverse polish notation calculator;
        a first example with no operator precedence.
* Infix Calc::  Infix (algebraic) notation calculator.
        Operator precedence is introduced.
* Simple Error Recovery::  Continuing after syntax errors.
* Location Tracking Calc::  Demonstrating the use of @N and @$.
* Multi-function Calc::  Calculator with memory and trig functions.
        It uses multiple data-types for semantic values.
* Exercises::  Ideas for improving the multi-function calculator.

Reverse Polish Notation Calculator

* Rpcalc Declarations::  Prologue (declarations) for rpcalc.
* Rpcalc Rules::  Grammar Rules for rpcalc, with explanation.
* Rpcalc Lexer::  The lexical analyzer.
* Rpcalc Main::  The controlling function.
* Rpcalc Error::  The error reporting function.
* Rpcalc Generate::  Running Bison on the grammar file.
* Rpcalc Compile::  Run the C compiler on the output code.

Grammar Rules for `rpcalc'

* Rpcalc Input::
* Rpcalc Line::
* Rpcalc Expr::

Location Tracking Calculator: `ltcalc'

* Ltcalc Declarations::  Bison and C declarations for ltcalc.
* Ltcalc Rules::  Grammar rules for ltcalc, with explanations.
* Ltcalc Lexer::  The lexical analyzer.

Multi-Function Calculator: `mfcalc'

* Mfcalc Declarations::  Bison declarations for multi-function calculator.
* Mfcalc Rules::  Grammar rules for the calculator.
* Mfcalc Symbol Table::  Symbol table management subroutines.

Bison Grammar Files

* Grammar Outline::  Overall layout of the grammar file.
* Symbols::  Terminal and nonterminal symbols.
* Rules::  How to write grammar rules.
* Recursion::  Writing recursive rules.
* Semantics::  Semantic values and actions.
* Tracking Locations::  Locations and actions.
* Named References::  Using named references in actions.
* Declarations::  All kinds of Bison declarations are described here.
* Multiple Parsers::  Putting more than one Bison parser in one program.

Outline of a Bison Grammar

* Prologue::  Syntax and usage of the prologue.
* Prologue Alternatives::  Syntax and usage of alternatives to the prologue.
* Bison Declarations::  Syntax and usage of the Bison declarations section.
* Grammar Rules::  Syntax and usage of the grammar rules section.
* Epilogue::  Syntax and usage of the epilogue.

Defining Language Semantics

* Value Type::  Specifying one data type for all semantic values.
* Multiple Types::  Specifying several alternative data types.
* Actions::  An action is the semantic definition of a grammar rule.
* Action Types::  Specifying data types for actions to operate on.
* Mid-Rule Actions::  Most actions go at the end of a rule.
        This says when, why and how to use the exceptional
        action in the middle of a rule.

Actions in Mid-Rule

* Using Mid-Rule Actions::  Putting an action in the middle of a rule.
* Mid-Rule Action Translation::  How mid-rule actions are actually processed.
* Mid-Rule Conflicts::  Mid-rule actions can cause conflicts.

Tracking Locations

* Location Type::  Specifying a data type for locations.
* Actions and Locations::  Using locations in actions.
* Location Default Action::  Defining a general way to compute locations.

Bison Declarations

* Require Decl::  Requiring a Bison version.
* Token Decl::  Declaring terminal symbols.
* Precedence Decl::  Declaring terminals with precedence and associativity.
* Union Decl::  Declaring the set of all semantic value types.
* Type Decl::  Declaring the choice of type for a nonterminal symbol.
* Initial Action Decl::  Code run before parsing starts.
* Destructor Decl::  Declaring how symbols are freed.
* Printer Decl::  Declaring how symbol values are displayed.
* Expect Decl::  Suppressing warnings about parsing conflicts.
* Start Decl::  Specifying the start symbol.
* Pure Decl::  Requesting a reentrant parser.
* Push Decl::  Requesting a push parser.
* Decl Summary::  Table of all Bison declarations.
* %define Summary::  Defining variables to adjust Bison's behavior.
* %code Summary::  Inserting code into the parser source.

Parser C-Language Interface

* Parser Function::  How to call `yyparse' and what it returns.
* Push Parser Function::  How to call `yypush_parse' and what it returns.
* Pull Parser Function::  How to call `yypull_parse' and what it returns.
* Parser Create Function::  How to call `yypstate_new' and what it returns.
* Parser Delete Function::  How to call `yypstate_delete' and what it returns.
* Lexical::  You must supply a function `yylex'
        which reads tokens.
* Error Reporting::  You must supply a function `yyerror'.
* Action Features::  Special features for use in actions.
* Internationalization::  How to let the parser speak in the user's
        native language.

The Lexical Analyzer Function `yylex'

* Calling Convention::  How `yyparse' calls `yylex'.
* Token Values::  How `yylex' must return the semantic value
        of the token it has read.
* Token Locations::  How `yylex' must return the text location
        (line number, etc.) of the token, if the
        actions want that.
* Pure Calling::  How the calling convention differs in a pure parser
        (*note A Pure (Reentrant) Parser: Pure Decl.).

The Bison Parser Algorithm

* Lookahead::  Parser looks one token ahead when deciding what to do.
* Shift/Reduce::  Conflicts: when either shifting or reduction is valid.
* Precedence::  Operator precedence works by resolving conflicts.
* Contextual Precedence::  When an operator's precedence depends on context.
* Parser States::  The parser is a finite-state-machine with stack.
* Reduce/Reduce::  When two rules are applicable in the same situation.
* Mysterious Conflicts::  Conflicts that look unjustified.
* Tuning LR::  How to tune fundamental aspects of LR-based parsing.
* Generalized LR Parsing::  Parsing arbitrary context-free grammars.
* Memory Management::  What happens when memory is exhausted. How to avoid it.

Operator Precedence

* Why Precedence::  An example showing why precedence is needed.
* Using Precedence::  How to specify precedence in Bison grammars.
* Precedence Examples::  How these features are used in the previous example.
* How Precedence::  How they work.
* Non Operators::  Using precedence for general conflicts.

Tuning LR

* LR Table Construction::  Choose a different construction algorithm.
* Default Reductions::  Disable default reductions.
* LAC::  Correct lookahead sets in the parser states.
* Unreachable States::  Keep unreachable parser states for debugging.

Handling Context Dependencies

* Semantic Tokens::  Token parsing can depend on the semantic context.
* Lexical Tie-ins::  Token parsing can depend on the syntactic context.
* Tie-in Recovery::  Lexical tie-ins have implications for how
        error recovery rules must be written.

Debugging Your Parser

* Understanding::  Understanding the structure of your parser.
* Graphviz::  Getting a visual representation of the parser.
* Xml::  Getting a markup representation of the parser.
* Tracing::  Tracing the execution of your parser.

Tracing Your Parser

* Enabling Traces::  Activating run-time trace support
* Mfcalc Traces::  Extending `mfcalc' to support traces
* The YYPRINT Macro::  Obsolete interface for semantic value reports

Invoking Bison

* Bison Options::  All the options described in detail,
        in alphabetical order by short options.
* Option Cross Key::  Alphabetical list of long options.
* Yacc Library::  Yacc-compatible `yylex' and `main'.

Parsers Written In Other Languages

* C++ Parsers::  The interface to generate C++ parser classes
* Java Parsers::  The interface to generate Java parser classes

C++ Parsers

* C++ Bison Interface::  Asking for C++ parser generation
* C++ Semantic Values::  %union vs. C++
* C++ Location Values::  The position and location classes
* C++ Parser Interface::  Instantiating and running the parser
* C++ Scanner Interface::  Exchanges between yylex and parse
* A Complete C++ Example::  Demonstrating their use

C++ Location Values

* C++ position::  One point in the source file
* C++ location::  Two points in the source file
* User Defined Location Type::  Required interface for locations

A Complete C++ Example

* Calc++ --- C++ Calculator::  The specifications
* Calc++ Parsing Driver::  An active parsing context
* Calc++ Parser::  A parser class
* Calc++ Scanner::  A pure C++ Flex scanner
* Calc++ Top Level::  Conducting the band

Java Parsers

* Java Bison Interface::  Asking for Java parser generation
* Java Semantic Values::  %type and %token vs.
        Java
* Java Location Values::  The position and location classes
* Java Parser Interface::  Instantiating and running the parser
* Java Scanner Interface::  Specifying the scanner for the parser
* Java Action Features::  Special features for use in actions
* Java Differences::  Differences between C/C++ and Java Grammars
* Java Declarations Summary::  List of Bison declarations used with Java

Frequently Asked Questions

* Memory Exhausted::  Breaking the Stack Limits
* How Can I Reset the Parser::  `yyparse' Keeps some State
* Strings are Destroyed::  `yylval' Loses Track of Strings
* Implementing Gotos/Loops::  Control Flow in the Calculator
* Multiple start-symbols::  Factoring closely related grammars
* Secure? Conform?::  Is Bison POSIX safe?
* I can't build Bison::  Troubleshooting
* Where can I find help?::  Troubleshooting
* Bug Reports::  Troublereporting
* More Languages::  Parsers in C++, Java, and so on
* Beta Testing::  Experimenting with development versions
* Mailing Lists::  Meeting other Bison users

Copying This Manual

* Copying This Manual::  License for copying this manual.


File: bison.info, Node: Introduction, Next: Conditions, Prev: Top, Up: Top

Introduction
************

"Bison" is a general-purpose parser generator that converts an
annotated context-free grammar into a deterministic LR or generalized
LR (GLR) parser employing LALR(1) parser tables. As an experimental
feature, Bison can also generate IELR(1) or canonical LR(1) parser
tables. Once you are proficient with Bison, you can use it to develop
a wide range of language parsers, from those used in simple desk
calculators to complex programming languages.

   Bison is upward compatible with Yacc: all properly-written Yacc
grammars ought to work with Bison with no change. Anyone familiar with
Yacc should be able to use Bison with little trouble.
You need to be
fluent in C or C++ programming in order to use Bison or to understand
this manual. Java is also supported as an experimental feature.

   We begin with tutorial chapters that explain the basic concepts of
using Bison and show three explained examples, each building on the
last. If you don't know Bison or Yacc, start by reading these
chapters. Reference chapters follow, which describe specific aspects
of Bison in detail.

   Bison was written originally by Robert Corbett. Richard Stallman
made it Yacc-compatible. Wilfred Hansen of Carnegie Mellon University
added multi-character string literals and other features. Since then,
Bison has grown more robust and evolved many other new features thanks
to the hard work of a long list of volunteers. For details, see the
`THANKS' and `ChangeLog' files included in the Bison distribution.

   This edition corresponds to version 2.7 of Bison.


File: bison.info, Node: Conditions, Next: Copying, Prev: Introduction, Up: Top

Conditions for Using Bison
**************************

The distribution terms for Bison-generated parsers permit using the
parsers in nonfree programs. Before Bison version 2.2, these extra
permissions applied only when Bison was generating LALR(1) parsers in
C. And before Bison version 1.24, Bison-generated parsers could be
used only in programs that were free software.

   The other GNU programming tools, such as the GNU C compiler, have
never had such a requirement. They could always be used for nonfree
software. The reason Bison was different was not due to a special
policy decision; it resulted from applying the usual General Public
License to all of the Bison source code.

   The main output of the Bison utility--the Bison parser implementation
file--contains a verbatim copy of a sizable piece of Bison, which is
the code for the parser's implementation.
(The actions from your
grammar are inserted into this implementation at one point, but most of
the rest of the implementation is not changed.) When we applied the
GPL terms to the skeleton code for the parser's implementation, the
effect was to restrict the use of Bison output to free software.

   We didn't change the terms because of sympathy for people who want to
make software proprietary. *Software should be free.* But we
concluded that limiting Bison's use to free software was doing little to
encourage people to make other software free. So we decided to make the
practical conditions for using Bison match the practical conditions for
using the other GNU tools.

   This exception applies when Bison is generating code for a parser.
You can tell whether the exception applies to a Bison output file by
inspecting the file for text beginning with "As a special
exception...". The text spells out the exact terms of the exception.


File: bison.info, Node: Copying, Next: Concepts, Prev: Conditions, Up: Top

GNU GENERAL PUBLIC LICENSE
**************************

     Version 3, 29 June 2007

     Copyright (C) 2007 Free Software Foundation, Inc. `http://fsf.org/'

     Everyone is permitted to copy and distribute verbatim copies of this
     license document, but changing it is not allowed.

Preamble
========

The GNU General Public License is a free, copyleft license for software
and other kinds of works.

   The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains
free software for all its users.
We, the Free Software Foundation, use
the GNU General Public License for most of our software; it applies
also to any other work released this way by its authors. You can apply
it to your programs, too.

   When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.

   To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you
have certain responsibilities if you distribute copies of the software,
or if you modify it: responsibilities to respect the freedom of others.

   For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.

   Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.

   For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.

   Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the
manufacturer can do so.
This is fundamentally incompatible with the
aim of protecting users' freedom to change the software. The
systematic pattern of such abuse occurs in the area of products for
individuals to use, which is precisely where it is most unacceptable.
Therefore, we have designed this version of the GPL to prohibit the
practice for those products. If such problems arise substantially in
other domains, we stand ready to extend this provision to those domains
in future versions of the GPL, as needed to protect the freedom of
users.

   Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.

   The precise terms and conditions for copying, distribution and
modification follow.

TERMS AND CONDITIONS
====================

  0. Definitions.

     "This License" refers to version 3 of the GNU General Public
     License.

     "Copyright" also means copyright-like laws that apply to other
     kinds of works, such as semiconductor masks.

     "The Program" refers to any copyrightable work licensed under this
     License. Each licensee is addressed as "you". "Licensees" and
     "recipients" may be individuals or organizations.

     To "modify" a work means to copy from or adapt all or part of the
     work in a fashion requiring copyright permission, other than the
     making of an exact copy. The resulting work is called a "modified
     version" of the earlier work or a work "based on" the earlier work.

     A "covered work" means either the unmodified Program or a work
     based on the Program.

     To "propagate" a work means to do anything with it that, without
     permission, would make you directly or secondarily liable for
     infringement under applicable copyright law, except executing it
     on a computer or modifying a private copy. Propagation includes
     copying, distribution (with or without modification), making
     available to the public, and in some countries other activities as
     well.

     To "convey" a work means any kind of propagation that enables other
     parties to make or receive copies. Mere interaction with a user
     through a computer network, with no transfer of a copy, is not
     conveying.

     An interactive user interface displays "Appropriate Legal Notices"
     to the extent that it includes a convenient and prominently visible
     feature that (1) displays an appropriate copyright notice, and (2)
     tells the user that there is no warranty for the work (except to
     the extent that warranties are provided), that licensees may
     convey the work under this License, and how to view a copy of this
     License. If the interface presents a list of user commands or
     options, such as a menu, a prominent item in the list meets this
     criterion.

  1. Source Code.

     The "source code" for a work means the preferred form of the work
     for making modifications to it. "Object code" means any
     non-source form of a work.

     A "Standard Interface" means an interface that either is an
     official standard defined by a recognized standards body, or, in
     the case of interfaces specified for a particular programming
     language, one that is widely used among developers working in that
     language.

     The "System Libraries" of an executable work include anything,
     other than the work as a whole, that (a) is included in the normal
     form of packaging a Major Component, but which is not part of that
     Major Component, and (b) serves only to enable use of the work
     with that Major Component, or to implement a Standard Interface
     for which an implementation is available to the public in source
     code form. A "Major Component", in this context, means a major
     essential component (kernel, window system, and so on) of the
     specific operating system (if any) on which the executable work
     runs, or a compiler used to produce the work, or an object code
     interpreter used to run it.

     The "Corresponding Source" for a work in object code form means all
     the source code needed to generate, install, and (for an executable
     work) run the object code and to modify the work, including
     scripts to control those activities. However, it does not include
     the work's System Libraries, or general-purpose tools or generally
     available free programs which are used unmodified in performing
     those activities but which are not part of the work. For example,
     Corresponding Source includes interface definition files
     associated with source files for the work, and the source code for
     shared libraries and dynamically linked subprograms that the work
     is specifically designed to require, such as by intimate data
     communication or control flow between those subprograms and other
     parts of the work.

     The Corresponding Source need not include anything that users can
     regenerate automatically from other parts of the Corresponding
     Source.

     The Corresponding Source for a work in source code form is that
     same work.

  2. Basic Permissions.

     All rights granted under this License are granted for the term of
     copyright on the Program, and are irrevocable provided the stated
     conditions are met. This License explicitly affirms your unlimited
     permission to run the unmodified Program. The output from running
     a covered work is covered by this License only if the output,
     given its content, constitutes a covered work. This License
     acknowledges your rights of fair use or other equivalent, as
     provided by copyright law.

     You may make, run and propagate covered works that you do not
     convey, without conditions so long as your license otherwise
     remains in force. You may convey covered works to others for the
     sole purpose of having them make modifications exclusively for
     you, or provide you with facilities for running those works,
     provided that you comply with the terms of this License in
     conveying all material for which you do not control copyright.
     Those thus making or running the covered works for you must do so
     exclusively on your behalf, under your direction and control, on
     terms that prohibit them from making any copies of your
     copyrighted material outside their relationship with you.

     Conveying under any other circumstances is permitted solely under
     the conditions stated below. Sublicensing is not allowed; section
     10 makes it unnecessary.

  3. Protecting Users' Legal Rights From Anti-Circumvention Law.

     No covered work shall be deemed part of an effective technological
     measure under any applicable law fulfilling obligations under
     article 11 of the WIPO copyright treaty adopted on 20 December
     1996, or similar laws prohibiting or restricting circumvention of
     such measures.

     When you convey a covered work, you waive any legal power to forbid
     circumvention of technological measures to the extent such
     circumvention is effected by exercising rights under this License
     with respect to the covered work, and you disclaim any intention
     to limit operation or modification of the work as a means of
     enforcing, against the work's users, your or third parties' legal
     rights to forbid circumvention of technological measures.

  4. Conveying Verbatim Copies.

     You may convey verbatim copies of the Program's source code as you
     receive it, in any medium, provided that you conspicuously and
     appropriately publish on each copy an appropriate copyright notice;
     keep intact all notices stating that this License and any
     non-permissive terms added in accord with section 7 apply to the
     code; keep intact all notices of the absence of any warranty; and
     give all recipients a copy of this License along with the Program.

     You may charge any price or no price for each copy that you convey,
     and you may offer support or warranty protection for a fee.

  5. Conveying Modified Source Versions.

     You may convey a work based on the Program, or the modifications to
     produce it from the Program, in the form of source code under the
     terms of section 4, provided that you also meet all of these
     conditions:

       a. The work must carry prominent notices stating that you
          modified it, and giving a relevant date.

       b. The work must carry prominent notices stating that it is
          released under this License and any conditions added under
          section 7. This requirement modifies the requirement in
          section 4 to "keep intact all notices".

       c. You must license the entire work, as a whole, under this
          License to anyone who comes into possession of a copy.
          This
          License will therefore apply, along with any applicable
          section 7 additional terms, to the whole of the work, and all
          its parts, regardless of how they are packaged. This License
          gives no permission to license the work in any other way, but
          it does not invalidate such permission if you have separately
          received it.

       d. If the work has interactive user interfaces, each must display
          Appropriate Legal Notices; however, if the Program has
          interactive interfaces that do not display Appropriate Legal
          Notices, your work need not make them do so.

     A compilation of a covered work with other separate and independent
     works, which are not by their nature extensions of the covered
     work, and which are not combined with it such as to form a larger
     program, in or on a volume of a storage or distribution medium, is
     called an "aggregate" if the compilation and its resulting
     copyright are not used to limit the access or legal rights of the
     compilation's users beyond what the individual works permit.
     Inclusion of a covered work in an aggregate does not cause this
     License to apply to the other parts of the aggregate.

  6. Conveying Non-Source Forms.

     You may convey a covered work in object code form under the terms
     of sections 4 and 5, provided that you also convey the
     machine-readable Corresponding Source under the terms of this
     License, in one of these ways:

       a. Convey the object code in, or embodied in, a physical product
          (including a physical distribution medium), accompanied by the
          Corresponding Source fixed on a durable physical medium
          customarily used for software interchange.

       b. Convey the object code in, or embodied in, a physical product
          (including a physical distribution medium), accompanied by a
          written offer, valid for at least three years and valid for
          as long as you offer spare parts or customer support for that
          product model, to give anyone who possesses the object code
          either (1) a copy of the Corresponding Source for all the
          software in the product that is covered by this License, on a
          durable physical medium customarily used for software
          interchange, for a price no more than your reasonable cost of
          physically performing this conveying of source, or (2) access
          to copy the Corresponding Source from a network server at no
          charge.

       c. Convey individual copies of the object code with a copy of
          the written offer to provide the Corresponding Source. This
          alternative is allowed only occasionally and noncommercially,
          and only if you received the object code with such an offer,
          in accord with subsection 6b.

       d. Convey the object code by offering access from a designated
          place (gratis or for a charge), and offer equivalent access
          to the Corresponding Source in the same way through the same
          place at no further charge. You need not require recipients
          to copy the Corresponding Source along with the object code.
          If the place to copy the object code is a network server, the
          Corresponding Source may be on a different server (operated
          by you or a third party) that supports equivalent copying
          facilities, provided you maintain clear directions next to
          the object code saying where to find the Corresponding Source.
          Regardless of what server hosts the Corresponding Source, you
          remain obligated to ensure that it is available for as long
          as needed to satisfy these requirements.

       e. Convey the object code using peer-to-peer transmission,
          provided you inform other peers where the object code and
          Corresponding Source of the work are being offered to the
          general public at no charge under subsection 6d.


     A separable portion of the object code, whose source code is
     excluded from the Corresponding Source as a System Library, need
     not be included in conveying the object code work.

     A "User Product" is either (1) a "consumer product", which means
     any tangible personal property which is normally used for personal,
     family, or household purposes, or (2) anything designed or sold for
     incorporation into a dwelling. In determining whether a product
     is a consumer product, doubtful cases shall be resolved in favor of
     coverage. For a particular product received by a particular user,
     "normally used" refers to a typical or common use of that class of
     product, regardless of the status of the particular user or of the
     way in which the particular user actually uses, or expects or is
     expected to use, the product. A product is a consumer product
     regardless of whether the product has substantial commercial,
     industrial or non-consumer uses, unless such uses represent the
     only significant mode of use of the product.

     "Installation Information" for a User Product means any methods,
     procedures, authorization keys, or other information required to
     install and execute modified versions of a covered work in that
     User Product from a modified version of its Corresponding Source.
     The information must suffice to ensure that the continued
     functioning of the modified object code is in no case prevented or
     interfered with solely because modification has been made.
755 756 If you convey an object code work under this section in, or with, 757 or specifically for use in, a User Product, and the conveying 758 occurs as part of a transaction in which the right of possession 759 and use of the User Product is transferred to the recipient in 760 perpetuity or for a fixed term (regardless of how the transaction 761 is characterized), the Corresponding Source conveyed under this 762 section must be accompanied by the Installation Information. But 763 this requirement does not apply if neither you nor any third party 764 retains the ability to install modified object code on the User 765 Product (for example, the work has been installed in ROM). 766 767 The requirement to provide Installation Information does not 768 include a requirement to continue to provide support service, 769 warranty, or updates for a work that has been modified or 770 installed by the recipient, or for the User Product in which it 771 has been modified or installed. Access to a network may be denied 772 when the modification itself materially and adversely affects the 773 operation of the network or violates the rules and protocols for 774 communication across the network. 775 776 Corresponding Source conveyed, and Installation Information 777 provided, in accord with this section must be in a format that is 778 publicly documented (and with an implementation available to the 779 public in source code form), and must require no special password 780 or key for unpacking, reading or copying. 781 782 7. Additional Terms. 783 784 "Additional permissions" are terms that supplement the terms of 785 this License by making exceptions from one or more of its 786 conditions. Additional permissions that are applicable to the 787 entire Program shall be treated as though they were included in 788 this License, to the extent that they are valid under applicable 789 law. 
If additional permissions apply only to part of the Program, 790 that part may be used separately under those permissions, but the 791 entire Program remains governed by this License without regard to 792 the additional permissions. 793 794 When you convey a copy of a covered work, you may at your option 795 remove any additional permissions from that copy, or from any part 796 of it. (Additional permissions may be written to require their own 797 removal in certain cases when you modify the work.) You may place 798 additional permissions on material, added by you to a covered work, 799 for which you have or can give appropriate copyright permission. 800 801 Notwithstanding any other provision of this License, for material 802 you add to a covered work, you may (if authorized by the copyright 803 holders of that material) supplement the terms of this License 804 with terms: 805 806 a. Disclaiming warranty or limiting liability differently from 807 the terms of sections 15 and 16 of this License; or 808 809 b. Requiring preservation of specified reasonable legal notices 810 or author attributions in that material or in the Appropriate 811 Legal Notices displayed by works containing it; or 812 813 c. Prohibiting misrepresentation of the origin of that material, 814 or requiring that modified versions of such material be 815 marked in reasonable ways as different from the original 816 version; or 817 818 d. Limiting the use for publicity purposes of names of licensors 819 or authors of the material; or 820 821 e. Declining to grant rights under trademark law for use of some 822 trade names, trademarks, or service marks; or 823 824 f. Requiring indemnification of licensors and authors of that 825 material by anyone who conveys the material (or modified 826 versions of it) with contractual assumptions of liability to 827 the recipient, for any liability that these contractual 828 assumptions directly impose on those licensors and authors. 
829 830 All other non-permissive additional terms are considered "further 831 restrictions" within the meaning of section 10. If the Program as 832 you received it, or any part of it, contains a notice stating that 833 it is governed by this License along with a term that is a further 834 restriction, you may remove that term. If a license document 835 contains a further restriction but permits relicensing or 836 conveying under this License, you may add to a covered work 837 material governed by the terms of that license document, provided 838 that the further restriction does not survive such relicensing or 839 conveying. 840 841 If you add terms to a covered work in accord with this section, you 842 must place, in the relevant source files, a statement of the 843 additional terms that apply to those files, or a notice indicating 844 where to find the applicable terms. 845 846 Additional terms, permissive or non-permissive, may be stated in 847 the form of a separately written license, or stated as exceptions; 848 the above requirements apply either way. 849 850 8. Termination. 851 852 You may not propagate or modify a covered work except as expressly 853 provided under this License. Any attempt otherwise to propagate or 854 modify it is void, and will automatically terminate your rights 855 under this License (including any patent licenses granted under 856 the third paragraph of section 11). 857 858 However, if you cease all violation of this License, then your 859 license from a particular copyright holder is reinstated (a) 860 provisionally, unless and until the copyright holder explicitly 861 and finally terminates your license, and (b) permanently, if the 862 copyright holder fails to notify you of the violation by some 863 reasonable means prior to 60 days after the cessation. 
864 865 Moreover, your license from a particular copyright holder is 866 reinstated permanently if the copyright holder notifies you of the 867 violation by some reasonable means, this is the first time you have 868 received notice of violation of this License (for any work) from 869 that copyright holder, and you cure the violation prior to 30 days 870 after your receipt of the notice. 871 872 Termination of your rights under this section does not terminate 873 the licenses of parties who have received copies or rights from 874 you under this License. If your rights have been terminated and 875 not permanently reinstated, you do not qualify to receive new 876 licenses for the same material under section 10. 877 878 9. Acceptance Not Required for Having Copies. 879 880 You are not required to accept this License in order to receive or 881 run a copy of the Program. Ancillary propagation of a covered work 882 occurring solely as a consequence of using peer-to-peer 883 transmission to receive a copy likewise does not require 884 acceptance. However, nothing other than this License grants you 885 permission to propagate or modify any covered work. These actions 886 infringe copyright if you do not accept this License. Therefore, 887 by modifying or propagating a covered work, you indicate your 888 acceptance of this License to do so. 889 890 10. Automatic Licensing of Downstream Recipients. 891 892 Each time you convey a covered work, the recipient automatically 893 receives a license from the original licensors, to run, modify and 894 propagate that work, subject to this License. You are not 895 responsible for enforcing compliance by third parties with this 896 License. 897 898 An "entity transaction" is a transaction transferring control of an 899 organization, or substantially all assets of one, or subdividing an 900 organization, or merging organizations. 
If propagation of a 901 covered work results from an entity transaction, each party to that 902 transaction who receives a copy of the work also receives whatever 903 licenses to the work the party's predecessor in interest had or 904 could give under the previous paragraph, plus a right to 905 possession of the Corresponding Source of the work from the 906 predecessor in interest, if the predecessor has it or can get it 907 with reasonable efforts. 908 909 You may not impose any further restrictions on the exercise of the 910 rights granted or affirmed under this License. For example, you 911 may not impose a license fee, royalty, or other charge for 912 exercise of rights granted under this License, and you may not 913 initiate litigation (including a cross-claim or counterclaim in a 914 lawsuit) alleging that any patent claim is infringed by making, 915 using, selling, offering for sale, or importing the Program or any 916 portion of it. 917 918 11. Patents. 919 920 A "contributor" is a copyright holder who authorizes use under this 921 License of the Program or a work on which the Program is based. 922 The work thus licensed is called the contributor's "contributor 923 version". 924 925 A contributor's "essential patent claims" are all patent claims 926 owned or controlled by the contributor, whether already acquired or 927 hereafter acquired, that would be infringed by some manner, 928 permitted by this License, of making, using, or selling its 929 contributor version, but do not include claims that would be 930 infringed only as a consequence of further modification of the 931 contributor version. For purposes of this definition, "control" 932 includes the right to grant patent sublicenses in a manner 933 consistent with the requirements of this License. 
934 935 Each contributor grants you a non-exclusive, worldwide, 936 royalty-free patent license under the contributor's essential 937 patent claims, to make, use, sell, offer for sale, import and 938 otherwise run, modify and propagate the contents of its 939 contributor version. 940 941 In the following three paragraphs, a "patent license" is any 942 express agreement or commitment, however denominated, not to 943 enforce a patent (such as an express permission to practice a 944 patent or covenant not to sue for patent infringement). To 945 "grant" such a patent license to a party means to make such an 946 agreement or commitment not to enforce a patent against the party. 947 948 If you convey a covered work, knowingly relying on a patent 949 license, and the Corresponding Source of the work is not available 950 for anyone to copy, free of charge and under the terms of this 951 License, through a publicly available network server or other 952 readily accessible means, then you must either (1) cause the 953 Corresponding Source to be so available, or (2) arrange to deprive 954 yourself of the benefit of the patent license for this particular 955 work, or (3) arrange, in a manner consistent with the requirements 956 of this License, to extend the patent license to downstream 957 recipients. "Knowingly relying" means you have actual knowledge 958 that, but for the patent license, your conveying the covered work 959 in a country, or your recipient's use of the covered work in a 960 country, would infringe one or more identifiable patents in that 961 country that you have reason to believe are valid. 
962 963 If, pursuant to or in connection with a single transaction or 964 arrangement, you convey, or propagate by procuring conveyance of, a 965 covered work, and grant a patent license to some of the parties 966 receiving the covered work authorizing them to use, propagate, 967 modify or convey a specific copy of the covered work, then the 968 patent license you grant is automatically extended to all 969 recipients of the covered work and works based on it. 970 971 A patent license is "discriminatory" if it does not include within 972 the scope of its coverage, prohibits the exercise of, or is 973 conditioned on the non-exercise of one or more of the rights that 974 are specifically granted under this License. You may not convey a 975 covered work if you are a party to an arrangement with a third 976 party that is in the business of distributing software, under 977 which you make payment to the third party based on the extent of 978 your activity of conveying the work, and under which the third 979 party grants, to any of the parties who would receive the covered 980 work from you, a discriminatory patent license (a) in connection 981 with copies of the covered work conveyed by you (or copies made 982 from those copies), or (b) primarily for and in connection with 983 specific products or compilations that contain the covered work, 984 unless you entered into that arrangement, or that patent license 985 was granted, prior to 28 March 2007. 986 987 Nothing in this License shall be construed as excluding or limiting 988 any implied license or other defenses to infringement that may 989 otherwise be available to you under applicable patent law. 990 991 12. No Surrender of Others' Freedom. 992 993 If conditions are imposed on you (whether by court order, 994 agreement or otherwise) that contradict the conditions of this 995 License, they do not excuse you from the conditions of this 996 License. 
If you cannot convey a covered work so as to satisfy 997 simultaneously your obligations under this License and any other 998 pertinent obligations, then as a consequence you may not convey it 999 at all. For example, if you agree to terms that obligate you to 1000 collect a royalty for further conveying from those to whom you 1001 convey the Program, the only way you could satisfy both those 1002 terms and this License would be to refrain entirely from conveying 1003 the Program. 1004 1005 13. Use with the GNU Affero General Public License. 1006 1007 Notwithstanding any other provision of this License, you have 1008 permission to link or combine any covered work with a work licensed 1009 under version 3 of the GNU Affero General Public License into a 1010 single combined work, and to convey the resulting work. The terms 1011 of this License will continue to apply to the part which is the 1012 covered work, but the special requirements of the GNU Affero 1013 General Public License, section 13, concerning interaction through 1014 a network will apply to the combination as such. 1015 1016 14. Revised Versions of this License. 1017 1018 The Free Software Foundation may publish revised and/or new 1019 versions of the GNU General Public License from time to time. 1020 Such new versions will be similar in spirit to the present 1021 version, but may differ in detail to address new problems or 1022 concerns. 1023 1024 Each version is given a distinguishing version number. If the 1025 Program specifies that a certain numbered version of the GNU 1026 General Public License "or any later version" applies to it, you 1027 have the option of following the terms and conditions either of 1028 that numbered version or of any later version published by the 1029 Free Software Foundation. If the Program does not specify a 1030 version number of the GNU General Public License, you may choose 1031 any version ever published by the Free Software Foundation. 
1032 1033 If the Program specifies that a proxy can decide which future 1034 versions of the GNU General Public License can be used, that 1035 proxy's public statement of acceptance of a version permanently 1036 authorizes you to choose that version for the Program. 1037 1038 Later license versions may give you additional or different 1039 permissions. However, no additional obligations are imposed on any 1040 author or copyright holder as a result of your choosing to follow a 1041 later version. 1042 1043 15. Disclaimer of Warranty. 1044 1045 THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY 1046 APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE 1047 COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" 1048 WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, 1049 INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 1050 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE 1051 RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. 1052 SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL 1053 NECESSARY SERVICING, REPAIR OR CORRECTION. 1054 1055 16. Limitation of Liability. 1056 1057 IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN 1058 WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES 1059 AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU 1060 FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR 1061 CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE 1062 THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA 1063 BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD 1064 PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER 1065 PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF 1066 THE POSSIBILITY OF SUCH DAMAGES. 1067 1068 17. Interpretation of Sections 15 and 16. 
1069 1070 If the disclaimer of warranty and limitation of liability provided 1071 above cannot be given local legal effect according to their terms, 1072 reviewing courts shall apply local law that most closely 1073 approximates an absolute waiver of all civil liability in 1074 connection with the Program, unless a warranty or assumption of 1075 liability accompanies a copy of the Program in return for a fee. 1076 1077 1078END OF TERMS AND CONDITIONS 1079=========================== 1080 1081How to Apply These Terms to Your New Programs 1082============================================= 1083 1084If you develop a new program, and you want it to be of the greatest 1085possible use to the public, the best way to achieve this is to make it 1086free software which everyone can redistribute and change under these 1087terms. 1088 1089 To do so, attach the following notices to the program. It is safest 1090to attach them to the start of each source file to most effectively 1091state the exclusion of warranty; and each file should have at least the 1092"copyright" line and a pointer to where the full notice is found. 1093 1094 ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES. 1095 Copyright (C) YEAR NAME OF AUTHOR 1096 1097 This program is free software: you can redistribute it and/or modify 1098 it under the terms of the GNU General Public License as published by 1099 the Free Software Foundation, either version 3 of the License, or (at 1100 your option) any later version. 1101 1102 This program is distributed in the hope that it will be useful, but 1103 WITHOUT ANY WARRANTY; without even the implied warranty of 1104 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU 1105 General Public License for more details. 1106 1107 You should have received a copy of the GNU General Public License 1108 along with this program. If not, see `http://www.gnu.org/licenses/'. 

   Also add information on how to contact you by electronic and paper
mail.

   If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:

     PROGRAM Copyright (C) YEAR NAME OF AUTHOR
     This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
     This is free software, and you are welcome to redistribute it
     under certain conditions; type `show c' for details.

   The hypothetical commands `show w' and `show c' should show the
appropriate parts of the General Public License. Of course, your
program's commands might be different; for a GUI interface, you would
use an "about box".

   You should also get your employer (if you work as a programmer) or
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. For more information on this, and how to apply and follow
the GNU GPL, see `http://www.gnu.org/licenses/'.

   The GNU General Public License does not permit incorporating your
program into proprietary programs. If your program is a subroutine
library, you may consider it more useful to permit linking proprietary
applications with the library. If this is what you want to do, use the
GNU Lesser General Public License instead of this License. But first,
please read `http://www.gnu.org/philosophy/why-not-lgpl.html'.


File: bison.info, Node: Concepts, Next: Examples, Prev: Copying, Up: Top

1 The Concepts of Bison
***********************

This chapter introduces many of the basic concepts without which the
details of Bison will not make sense. If you do not already know how to
use Bison or Yacc, we suggest you start by reading this chapter
carefully.

* Menu:

* Language and Grammar:: Languages and context-free grammars,
                           as mathematical ideas.
* Grammar in Bison::     How we represent grammars for Bison's sake.
* Semantic Values::      Each token or syntactic grouping can have
                           a semantic value (the value of an integer,
                           the name of an identifier, etc.).
* Semantic Actions::     Each rule can have an action containing C code.
* GLR Parsers::          Writing parsers for general context-free languages.
* Locations::            Overview of location tracking.
* Bison Parser::         What are Bison's input and output,
                           how is the output used?
* Stages::               Stages in writing and running Bison grammars.
* Grammar Layout::       Overall structure of a Bison grammar file.


File: bison.info, Node: Language and Grammar, Next: Grammar in Bison, Up: Concepts

1.1 Languages and Context-Free Grammars
=======================================

In order for Bison to parse a language, it must be described by a
"context-free grammar". This means that you specify one or more
"syntactic groupings" and give rules for constructing them from their
parts. For example, in the C language, one kind of grouping is called
an `expression'. One rule for making an expression might be, "An
expression can be made of a minus sign and another expression".
Another would be, "An expression can be an integer". As you can see,
rules are often recursive, but there must be at least one rule which
leads out of the recursion.

   The most common formal system for presenting such rules for humans
to read is "Backus-Naur Form" or "BNF", which was developed in order to
specify the language Algol 60. Any grammar expressed in BNF is a
context-free grammar. The input to Bison is essentially
machine-readable BNF.

   There are various important subclasses of context-free grammars.
Although it can handle almost all context-free grammars, Bison is
optimized for what are called LR(1) grammars.
In brief, in these
grammars, it must be possible to tell how to parse any portion of an
input string with just a single token of lookahead. For historical
reasons, Bison by default is limited by the additional restrictions of
LALR(1), which is hard to explain simply. *Note Mysterious
Conflicts::, for more information on this. As an experimental feature,
you can escape these additional restrictions by requesting IELR(1) or
canonical LR(1) parser tables. *Note LR Table Construction::, to learn
how.

   Parsers for LR(1) grammars are "deterministic", meaning roughly that
the next grammar rule to apply at any point in the input is uniquely
determined by the preceding input and a fixed, finite portion (called a
"lookahead") of the remaining input. A context-free grammar can be
"ambiguous", meaning that there are multiple ways to apply the grammar
rules to get the same inputs. Even unambiguous grammars can be
"nondeterministic", meaning that no fixed lookahead always suffices to
determine the next grammar rule to apply. With the proper
declarations, Bison is also able to parse these more general
context-free grammars, using a technique known as GLR parsing (for
Generalized LR). Bison's GLR parsers are able to handle any
context-free grammar for which the number of possible parses of any
given string is finite.

   In the formal grammatical rules for a language, each kind of
syntactic unit or grouping is named by a "symbol". Those which are
built by grouping smaller constructs according to grammatical rules are
called "nonterminal symbols"; those which can't be subdivided are called
"terminal symbols" or "token types". We call a piece of input
corresponding to a single terminal symbol a "token", and a piece
corresponding to a single nonterminal symbol a "grouping".
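
   As a rough illustration of these ideas, the two informal rules
quoted earlier ("an expression can be made of a minus sign and another
expression" and "an expression can be an integer") can be recognized by
a few lines of hand-written C. This is only a sketch: it uses
recursive descent, not the LR algorithm that a Bison-generated parser
uses, and every name in it (`token_type', `next_token',
`parse_expression', `parse_input') is hypothetical and chosen for this
example. It shows two terminal symbols becoming tokens, one
nonterminal becoming a grouping, and a single token of lookahead
deciding which rule applies.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Terminal symbols (token types) of the toy grammar
     expression: '-' expression
               | INTEGER
   The names below are illustrative, not part of Bison.  */
enum token_type { TOK_MINUS, TOK_INTEGER, TOK_END, TOK_INVALID };

struct token
{
  enum token_type type;
  int value;                    /* meaningful only for TOK_INTEGER */
};

/* Read one token from *input and advance *input past it.  */
static struct token
next_token (const char **input)
{
  struct token tok = { TOK_END, 0 };

  while (isspace ((unsigned char) **input))
    ++*input;
  if (**input == '\0')
    return tok;
  if (**input == '-')
    {
      ++*input;
      tok.type = TOK_MINUS;
      return tok;
    }
  if (isdigit ((unsigned char) **input))
    {
      tok.type = TOK_INTEGER;
      while (isdigit ((unsigned char) **input))
        {
          tok.value = tok.value * 10 + (**input - '0');
          ++*input;
        }
      return tok;
    }
  tok.type = TOK_INVALID;
  return tok;
}

/* Recognize one `expression' grouping.  The single lookahead token
   selects the rule.  Returns 1 and stores the value in *result on
   success, 0 on a syntax error.  */
static int
parse_expression (const char **input, int *result)
{
  struct token lookahead = next_token (input);

  switch (lookahead.type)
    {
    case TOK_MINUS:             /* expression: '-' expression */
      if (!parse_expression (input, result))
        return 0;
      *result = -*result;
      return 1;
    case TOK_INTEGER:           /* expression: INTEGER */
      *result = lookahead.value;
      return 1;
    default:
      return 0;
    }
}

/* Accept TEXT only if it is exactly one expression.  */
static int
parse_input (const char *text, int *result)
{
  const char *p = text;

  assert (text != NULL && result != NULL);
  if (!parse_expression (&p, result))
    return 0;
  return next_token (&p).type == TOK_END;
}
```

   Calling `parse_input ("- - 42", &v)' succeeds and leaves 42 in `v',
while `parse_input ("-", &v)' fails: the recursive rule was chosen but
no integer followed to lead out of the recursion.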

   We can use the C language as an example of what symbols, terminal and
nonterminal, mean. The tokens of C are identifiers, constants (numeric
and string), and the various keywords, arithmetic operators and
punctuation marks. So the terminal symbols of a grammar for C include
`identifier', `number', `string', plus one symbol for each keyword,
operator or punctuation mark: `if', `return', `const', `static', `int',
`char', `plus-sign', `open-brace', `close-brace', `comma' and many more.
(These tokens can be subdivided into characters, but that is a matter of
lexicography, not grammar.)

   Here is a simple C function subdivided into tokens:

     int             /* keyword `int' */
     square (int x)  /* identifier, open-paren, keyword `int',
                        identifier, close-paren */
     {               /* open-brace */
       return x * x; /* keyword `return', identifier, asterisk,
                        identifier, semicolon */
     }               /* close-brace */

   The syntactic groupings of C include the expression, the statement,
the declaration, and the function definition. These are represented in
the grammar of C by nonterminal symbols `expression', `statement',
`declaration' and `function definition'. The full grammar uses dozens
of additional language constructs, each with its own nonterminal
symbol, in order to express the meanings of these four. The example
above is a function definition; it contains one declaration, and one
statement. In the statement, each `x' is an expression and so is `x *
x'.

   Each nonterminal symbol must have grammatical rules showing how it
is made out of simpler constructs. For example, one kind of C
statement is the `return' statement; this would be described with a
grammar rule which reads informally as follows:

     A `statement' can be made of a `return' keyword, an `expression'
     and a `semicolon'.

There would be many other rules for `statement', one for each kind of
statement in C.

   One nonterminal symbol must be distinguished as the special one which
defines a complete utterance in the language. It is called the "start
symbol". In a compiler, this means a complete input program. In the C
language, the nonterminal symbol `sequence of definitions and
declarations' plays this role.

   For example, `1 + 2' is a valid C expression--a valid part of a C
program--but it is not valid as an _entire_ C program. In the
context-free grammar of C, this follows from the fact that `expression'
is not the start symbol.

   The Bison parser reads a sequence of tokens as its input, and groups
the tokens using the grammar rules. If the input is valid, the end
result is that the entire token sequence reduces to a single grouping
whose symbol is the grammar's start symbol. If we use a grammar for C,
the entire input must be a `sequence of definitions and declarations'.
If not, the parser reports a syntax error.


File: bison.info, Node: Grammar in Bison, Next: Semantic Values, Prev: Language and Grammar, Up: Concepts

1.2 From Formal Rules to Bison Input
====================================

A formal grammar is a mathematical construct. To define the language
for Bison, you must write a file expressing the grammar in Bison syntax:
a "Bison grammar" file. *Note Bison Grammar Files: Grammar File.

   A nonterminal symbol in the formal grammar is represented in Bison
input as an identifier, like an identifier in C. By convention, it
should be in lower case, such as `expr', `stmt' or `declaration'.

   The Bison representation for a terminal symbol is also called a
"token type". Token types as well can be represented as C-like
identifiers.
By convention, these identifiers should be upper case to
distinguish them from nonterminals: for example, `INTEGER',
`IDENTIFIER', `IF' or `RETURN'. A terminal symbol that stands for a
particular keyword in the language should be named after that keyword
converted to upper case. The terminal symbol `error' is reserved for
error recovery. *Note Symbols::.

   A terminal symbol can also be represented as a character literal,
just like a C character constant. You should do this whenever a token
is just a single character (parenthesis, plus-sign, etc.): use that
same character in a literal as the terminal symbol for that token.

   A third way to represent a terminal symbol is with a C string
constant containing several characters. *Note Symbols::, for more
information.

   The grammar rules also have an expression in Bison syntax. For
example, here is the Bison rule for a C `return' statement. The
semicolon in quotes is a literal character token, representing part of
the C syntax for the statement; the naked semicolon, and the colon, are
Bison punctuation used in every rule.

     stmt: RETURN expr ';' ;

*Note Syntax of Grammar Rules: Rules.


File: bison.info, Node: Semantic Values, Next: Semantic Actions, Prev: Grammar in Bison, Up: Concepts

1.3 Semantic Values
===================

A formal grammar selects tokens only by their classifications: for
example, if a rule mentions the terminal symbol `integer constant', it
means that _any_ integer constant is grammatically valid in that
position. The precise value of the constant is irrelevant to how to
parse the input: if `x+4' is grammatical then `x+1' or `x+3989' is
equally grammatical.

   But the precise value is very important for what the input means
once it is parsed.
A compiler is useless if it fails to distinguish
between 4, 1 and 3989 as constants in the program! Therefore, each
token in a Bison grammar has both a token type and a "semantic value".
*Note Defining Language Semantics: Semantics, for details.

   The token type is a terminal symbol defined in the grammar, such as
`INTEGER', `IDENTIFIER' or `',''. It tells everything you need to know
to decide where the token may validly appear and how to group it with
other tokens. The grammar rules know nothing about tokens except their
types.

   The semantic value has all the rest of the information about the
meaning of the token, such as the value of an integer, or the name of an
identifier. (A token such as `','' which is just punctuation doesn't
need to have any semantic value.)

   For example, an input token might be classified as token type
`INTEGER' and have the semantic value 4. Another input token might
have the same token type `INTEGER' but value 3989. When a grammar rule
says that `INTEGER' is allowed, either of these tokens is acceptable
because each is an `INTEGER'. When the parser accepts the token, it
keeps track of the token's semantic value.

   Each grouping can also have a semantic value as well as its
nonterminal symbol. For example, in a calculator, an expression
typically has a semantic value that is a number. In a compiler for a
programming language, an expression typically has a semantic value that
is a tree structure describing the meaning of the expression.


File: bison.info, Node: Semantic Actions, Next: GLR Parsers, Prev: Semantic Values, Up: Concepts

1.4 Semantic Actions
====================

In order to be useful, a program must do more than parse input; it must
also produce some output based on the input. In a Bison grammar, a
grammar rule can have an "action" made up of C statements.
Each time
the parser recognizes a match for that rule, the action is executed.
*Note Actions::.

   Most of the time, the purpose of an action is to compute the
semantic value of the whole construct from the semantic values of its
parts. For example, suppose we have a rule which says an expression
can be the sum of two expressions. When the parser recognizes such a
sum, each of the subexpressions has a semantic value which describes
how it was built up. The action for this rule should create a similar
sort of value for the newly recognized larger expression.

   For example, here is a rule that says an expression can be the sum of
two subexpressions:

     expr: expr '+' expr   { $$ = $1 + $3; } ;

The action says how to produce the semantic value of the sum expression
from the values of the two subexpressions.


File: bison.info, Node: GLR Parsers, Next: Locations, Prev: Semantic Actions, Up: Concepts

1.5 Writing GLR Parsers
=======================

In some grammars, Bison's deterministic LR(1) parsing algorithm cannot
decide whether to apply a certain grammar rule at a given point. That
is, it may not be able to decide (on the basis of the input read so
far) which of two possible reductions (applications of a grammar rule)
applies, or whether to apply a reduction or read more of the input and
apply a reduction later in the input. These are known respectively as
"reduce/reduce" conflicts (*note Reduce/Reduce::), and "shift/reduce"
conflicts (*note Shift/Reduce::).

   To use a grammar that is not easily modified to be LR(1), a more
general parsing algorithm is sometimes necessary. If you include
`%glr-parser' among the Bison declarations in your file (*note Grammar
Outline::), the result is a Generalized LR (GLR) parser.
These parsers
handle Bison grammars that contain no unresolved conflicts (i.e., after
applying precedence declarations) identically to deterministic parsers.
However, when faced with unresolved shift/reduce and reduce/reduce
conflicts, GLR parsers use the simple expedient of doing both,
effectively cloning the parser to follow both possibilities. Each of
the resulting parsers can again split, so that at any given time, there
can be any number of possible parses being explored. The parsers
proceed in lockstep; that is, all of them consume (shift) a given input
symbol before any of them proceed to the next. Each of the cloned
parsers eventually meets one of two possible fates: either it runs into
a parsing error, in which case it simply vanishes, or it merges with
another parser, because the two of them have reduced the input to an
identical set of symbols.

   During the time that there are multiple parsers, semantic actions are
recorded, but not performed. When a parser disappears, its recorded
semantic actions disappear as well, and are never performed. When a
reduction makes two parsers identical, causing them to merge, Bison
records both sets of semantic actions. Whenever the last two parsers
merge, reverting to the single-parser case, Bison resolves all the
outstanding actions either by precedences given to the grammar rules
involved, or by performing both actions, and then calling a designated
user-defined function on the resulting values to produce an arbitrary
merged result.

* Menu:

* Simple GLR Parsers::     Using GLR parsers on unambiguous grammars.
* Merging GLR Parses::     Using GLR parsers to resolve ambiguities.
* GLR Semantic Actions::   Deferred semantic actions have special concerns.
* Compiler Requirements::  GLR parsers require a modern C compiler.


File: bison.info, Node: Simple GLR Parsers, Next: Merging GLR Parses, Up: GLR Parsers

1.5.1 Using GLR on Unambiguous Grammars
---------------------------------------

In the simplest cases, you can use the GLR algorithm to parse grammars
that are unambiguous but fail to be LR(1). Such grammars typically
require more than one symbol of lookahead.

   Consider a problem that arises in the declaration of enumerated and
subrange types in the programming language Pascal. Here are some
examples:

     type subrange = lo .. hi;
     type enum = (a, b, c);

The original language standard allows only numeric literals and
constant identifiers for the subrange bounds (`lo' and `hi'), but
Extended Pascal (ISO/IEC 10206) and many other Pascal implementations
allow arbitrary expressions there. This gives rise to the following
situation, containing a superfluous pair of parentheses:

     type subrange = (a) .. b;

Compare this to the following declaration of an enumerated type with
only one value:

     type enum = (a);

(These declarations are contrived, but they are syntactically valid,
and more-complicated cases can come up in practical programs.)

   These two declarations look identical until the `..' token. With
normal LR(1) one-token lookahead it is not possible to decide between
the two forms when the identifier `a' is parsed. It is, however,
desirable for a parser to decide this, since in the latter case `a'
must become a new identifier to represent the enumeration value, while
in the former case `a' must be evaluated with its current meaning,
which may be a constant or even a function call.
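
   The shared prefix can be checked mechanically. The following is a
minimal sketch in plain C and is not part of Bison: the token codes and
the helper names `next_token' and `first_difference' are hypothetical,
and the scanner handles only the few tokens used in this example. It
reports the index of the first token at which two declarations diverge.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical token codes for this sketch only; a real Bison parser
   would obtain them from %token declarations.  */
enum token { T_END, T_TYPE, T_ID, T_DOTDOT, T_CHAR };

/* Scan one token starting at *P and advance *P past it.  For a
   single-character token, *CH receives the character; otherwise 0.  */
static enum token
next_token (const char **p, char *ch)
{
  const char *s = *p;
  while (*s == ' ')
    s++;
  *ch = 0;
  if (*s == '\0')
    {
      *p = s;
      return T_END;
    }
  if (isalpha ((unsigned char) *s))
    {
      const char *start = s;
      while (isalnum ((unsigned char) *s))
        s++;
      *p = s;
      /* The keyword `type' is the only reserved word we know.  */
      return (s - start == 4 && strncmp (start, "type", 4) == 0)
             ? T_TYPE : T_ID;
    }
  if (s[0] == '.' && s[1] == '.')
    {
      *p = s + 2;
      return T_DOTDOT;
    }
  *ch = *s;
  *p = s + 1;
  return T_CHAR;
}

/* Return the 0-based index of the first token at which A and B
   differ, or -1 if the token streams are identical.  Identifier
   spellings are ignored, just as the grammar ignores them.  */
int
first_difference (const char *a, const char *b)
{
  for (int i = 0; ; i++)
    {
      char ca, cb;
      enum token ta = next_token (&a, &ca);
      enum token tb = next_token (&b, &cb);
      if (ta != tb || ca != cb)
        return i;
      if (ta == T_END)
        return -1;
    }
}
```

   For the two declarations above, `first_difference' returns 6: tokens
0 through 5 (`type', the type name, `=', `(', `a', `)') are the same in
both. The parser has to decide what to do with `a' (token 4) while
seeing only `)' (token 5) as lookahead, one token short of the point
where the inputs diverge.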

   You could parse `(a)' as an "unspecified identifier in parentheses",
to be resolved later, but this typically requires substantial
contortions in both semantic actions and large parts of the grammar,
where the parentheses are nested in the recursive rules for expressions.

   You might think of using the lexer to distinguish between the two
forms by returning different tokens for currently defined and undefined
identifiers. But if these declarations occur in a local scope, and `a'
is defined in an outer scope, then both forms are possible--either
locally redefining `a', or using the value of `a' from the outer scope.
So this approach cannot work.

   A simple solution to this problem is to declare the parser to use
the GLR algorithm. When the GLR parser reaches the critical state, it
merely splits into two branches and pursues both syntax rules
simultaneously. Sooner or later, one of them runs into a parsing
error. If there is a `..' token before the next `;', the rule for
enumerated types fails since it cannot accept `..' anywhere; otherwise,
the subrange type rule fails since it requires a `..' token. So one of
the branches fails silently, and the other one continues normally,
performing all the intermediate actions that were postponed during the
split.

   If the input is syntactically incorrect, both branches fail and the
parser reports a syntax error as usual.

   The effect of all this is that the parser seems to "guess" the
correct branch to take, or in other words, it seems to use more
lookahead than the underlying LR(1) algorithm actually allows for. In
this example, LR(2) would suffice, but also some cases that are not
LR(k) for any k can be handled this way.

   In general, a GLR parser can take quadratic or cubic worst-case time,
and the current Bison parser even takes exponential time and space for
some grammars.
In practice, this rarely happens, and for many grammars
it is possible to prove that it cannot happen. The present example
contains only one conflict between two rules, and the type-declaration
context containing the conflict cannot be nested. So the number of
branches that can exist at any time is limited by the constant 2, and
the parsing time is still linear.

   Here is a Bison grammar corresponding to the example above. It
parses a vastly simplified form of Pascal type declarations.

     %token TYPE DOTDOT ID

     %left '+' '-'
     %left '*' '/'

     %%

     type_decl: TYPE ID '=' type ';' ;

     type:
       '(' id_list ')'
     | expr DOTDOT expr
     ;

     id_list:
       ID
     | id_list ',' ID
     ;

     expr:
       '(' expr ')'
     | expr '+' expr
     | expr '-' expr
     | expr '*' expr
     | expr '/' expr
     | ID
     ;

   When used as a normal LR(1) grammar, Bison correctly complains about
one reduce/reduce conflict. In the conflicting situation the parser
chooses one of the alternatives, arbitrarily the one declared first.
Therefore the following correct input is not recognized:

     type t = (a) .. b;

   The parser can be turned into a GLR parser, while also telling Bison
to be silent about the one known reduce/reduce conflict, by adding
these two declarations to the Bison grammar file (before the first
`%%'):

     %glr-parser
     %expect-rr 1

No change in the grammar itself is required. Now the parser recognizes
all valid declarations, according to the limited syntax above,
transparently. In fact, the user does not even notice when the parser
splits.

   So here we have a case where we can use the benefits of GLR, almost
without disadvantages. Even in simple cases like this, however, there
are at least two potential problems to beware.
First, always analyze
the conflicts reported by Bison to make sure that GLR splitting is only
done where it is intended. A GLR parser splitting inadvertently may
cause problems less obvious than an LR parser statically choosing the
wrong alternative in a conflict. Second, consider interactions with
the lexer (*note Semantic Tokens::) with great care. Since a split
parser consumes tokens without performing any actions during the split,
the lexer cannot obtain information via parser actions. Some cases of
lexer interactions can be eliminated by using GLR to shift the
complications from the lexer to the parser. You must check the
remaining cases for correctness.

   In our example, it would be safe for the lexer to return tokens
based on their current meanings in some symbol table, because no new
symbols are defined in the middle of a type declaration. Though it is
possible for a parser to define the enumeration constants as they are
parsed, before the type declaration is completed, it actually makes no
difference since they cannot be used within the same enumerated type
declaration.


File: bison.info, Node: Merging GLR Parses, Next: GLR Semantic Actions, Prev: Simple GLR Parsers, Up: GLR Parsers

1.5.2 Using GLR to Resolve Ambiguities
--------------------------------------

Let's consider an example, vastly simplified from a C++ grammar.

     %{
       #include <stdio.h>
       #define YYSTYPE char const *
       int yylex (void);
       void yyerror (char const *);
     %}

     %token TYPENAME ID

     %right '='
     %left '+'

     %glr-parser

     %%

     prog:
       /* Nothing.
       */
     | prog stmt   { printf ("\n"); }
     ;

     stmt:
       expr ';'  %dprec 1
     | decl      %dprec 2
     ;

     expr:
       ID               { printf ("%s ", $$); }
     | TYPENAME '(' expr ')'
                        { printf ("%s <cast> ", $1); }
     | expr '+' expr    { printf ("+ "); }
     | expr '=' expr    { printf ("= "); }
     ;

     decl:
       TYPENAME declarator ';'
                        { printf ("%s <declare> ", $1); }
     | TYPENAME declarator '=' expr ';'
                        { printf ("%s <init-declare> ", $1); }
     ;

     declarator:
       ID               { printf ("\"%s\" ", $1); }
     | '(' declarator ')'
     ;

This models a problematic part of the C++ grammar--the ambiguity between
certain declarations and statements. For example,

     T (x) = y+z;

parses as either an `expr' or a `decl' (assuming that `T' is recognized
as a `TYPENAME' and `x' as an `ID'). Bison detects this as a
reduce/reduce conflict between the rules `expr : ID' and `declarator :
ID', which it cannot resolve at the time it encounters `x' in the
example above. Since this is a GLR parser, it therefore splits the
problem into two parses, one for each choice of resolving the
reduce/reduce conflict. Unlike the example from the previous section
(*note Simple GLR Parsers::), however, neither of these parses "dies,"
because the grammar as it stands is ambiguous. One of the parsers
eventually reduces `stmt : expr ';'' and the other reduces `stmt :
decl', after which both parsers are in an identical state: they've seen
`prog stmt' and have the same unprocessed input remaining. We say that
these parses have "merged."

   At this point, the GLR parser requires a specification in the
grammar of how to choose between the competing parses. In the example
above, the two `%dprec' declarations specify that Bison is to give
precedence to the parse that interprets the example as a `decl', which
implies that `x' is a declarator.
The parser therefore prints

     "x" y z + T <init-declare>

   The `%dprec' declarations only come into play when more than one
parse survives. Consider a different input string for this parser:

     T (x) + y;

This is another example of using GLR to parse an unambiguous construct,
as shown in the previous section (*note Simple GLR Parsers::). Here,
there is no ambiguity (this cannot be parsed as a declaration).
However, at the time the Bison parser encounters `x', it does not have
enough information to resolve the reduce/reduce conflict (again,
between `x' as an `expr' or a `declarator'). In this case, no
precedence declaration is used. Again, the parser splits into two, one
assuming that `x' is an `expr', and the other assuming `x' is a
`declarator'. The second of these parsers then vanishes when it sees
`+', and the parser prints

     x T <cast> y +

   Suppose that instead of resolving the ambiguity, you wanted to see
all the possibilities. For this purpose, you must merge the semantic
actions of the two possible parsers, rather than choosing one over the
other.
To do so, you could change the declaration of `stmt' as follows:

     stmt:
       expr ';'  %merge <stmtMerge>
     | decl      %merge <stmtMerge>
     ;

and define the `stmtMerge' function as:

     static YYSTYPE
     stmtMerge (YYSTYPE x0, YYSTYPE x1)
     {
       printf ("<OR> ");
       return "";
     }

with an accompanying forward declaration in the C declarations at the
beginning of the file:

     %{
       #define YYSTYPE char const *
       static YYSTYPE stmtMerge (YYSTYPE x0, YYSTYPE x1);
     %}

With these declarations, the resulting parser parses the first example
as both an `expr' and a `decl', and prints

     "x" y z + T <init-declare> x T <cast> y z + = <OR>

   Bison requires that all of the productions that participate in any
particular merge have identical `%merge' clauses. Otherwise, the
ambiguity would be unresolvable, and the parser will report an error
during any parse that results in the offending merge.


File: bison.info, Node: GLR Semantic Actions, Next: Compiler Requirements, Prev: Merging GLR Parses, Up: GLR Parsers

1.5.3 GLR Semantic Actions
--------------------------

By definition, a deferred semantic action is not performed at the same
time as the associated reduction. This raises caveats for several
Bison features you might use in a semantic action in a GLR parser.

   In any semantic action, you can examine `yychar' to determine the
type of the lookahead token present at the time of the associated
reduction. After checking that `yychar' is not set to `YYEMPTY' or
`YYEOF', you can then examine `yylval' and `yylloc' to determine the
lookahead token's semantic value and location, if any. In a
nondeferred semantic action, you can also modify any of these variables
to influence syntax analysis. *Note Lookahead Tokens: Lookahead.

   In a deferred semantic action, it's too late to influence syntax
analysis. In this case, `yychar', `yylval', and `yylloc' are set to
shallow copies of the values they had at the time of the associated
reduction. For this reason alone, modifying them is dangerous.
Moreover, the result of modifying them is undefined and subject to
change with future versions of Bison. For example, if a semantic
action might be deferred, you should never write it to invoke
`yyclearin' (*note Action Features::) or to attempt to free memory
referenced by `yylval'.

   Another Bison feature requiring special consideration is `YYERROR'
(*note Action Features::), which you can invoke in a semantic action to
initiate error recovery. During deterministic GLR operation, the
effect of `YYERROR' is the same as its effect in a deterministic parser.
In a deferred semantic action, its effect is undefined.

   Also, see *note Default Action for Locations: Location Default
Action, which describes a special usage of `YYLLOC_DEFAULT' in GLR
parsers.


File: bison.info, Node: Compiler Requirements, Prev: GLR Semantic Actions, Up: GLR Parsers

1.5.4 Considerations when Compiling GLR Parsers
-----------------------------------------------

The GLR parsers require a compiler for ISO C89 or later. In addition,
they use the `inline' keyword, which is not C89, but is C99 and is a
common extension in pre-C99 compilers. It is up to the user of these
parsers to handle portability issues. For instance, if using Autoconf
and the Autoconf macro `AC_C_INLINE', a mere

     %{
       #include <config.h>
     %}

will suffice. Otherwise, we suggest

     %{
       #if (__STDC_VERSION__ < 199901 && ! defined __GNUC__ \
            && ! \
            defined inline)
       # define inline
       #endif
     %}


File: bison.info, Node: Locations, Next: Bison Parser, Prev: GLR Parsers, Up: Concepts

1.6 Locations
=============

Many applications, like interpreters or compilers, have to produce
verbose and useful error messages. To achieve this, one must be able
to keep track of the "textual location", or "location", of each
syntactic construct. Bison provides a mechanism for handling these
locations.

   Each token has a semantic value. In a similar fashion, each token
has an associated location, but the type of locations is the same for
all tokens and groupings. Moreover, the output parser is equipped with
a default data structure for storing locations (*note Tracking
Locations::, for more details).

   Like semantic values, locations can be reached in actions using a
dedicated set of constructs. In the addition rule shown earlier (*note
Semantic Actions::), the location of the whole grouping is `@$', while
the locations of the subexpressions are `@1' and `@3'.

   When a rule is matched, a default action is used to compute the
semantic value of its left hand side (*note Actions::). In the same
way, another default action is used for locations. However, the action
for locations is general enough for most cases, meaning there is
usually no need to describe for each rule how `@$' should be formed.
When building a new location for a given grouping, the default behavior
of the output parser is to take the beginning of the first symbol, and
the end of the last symbol.


File: bison.info, Node: Bison Parser, Next: Stages, Prev: Locations, Up: Concepts

1.7 Bison Output: the Parser Implementation File
================================================

When you run Bison, you give it a Bison grammar file as input.
The
most important output is a C source file that implements a parser for
the language described by the grammar. This parser is called a "Bison
parser", and this file is called a "Bison parser implementation file".
Keep in mind that the Bison utility and the Bison parser are two
distinct programs: the Bison utility is a program whose output is the
Bison parser implementation file that becomes part of your program.

   The job of the Bison parser is to group tokens into groupings
according to the grammar rules--for example, to build identifiers and
operators into expressions. As it does this, it runs the actions for
the grammar rules it uses.

   The tokens come from a function called the "lexical analyzer" that
you must supply in some fashion (such as by writing it in C). The Bison
parser calls the lexical analyzer each time it wants a new token. It
doesn't know what is "inside" the tokens (though their semantic values
may reflect this). Typically the lexical analyzer makes the tokens by
parsing characters of text, but Bison does not depend on this. *Note
The Lexical Analyzer Function `yylex': Lexical.

   The Bison parser implementation file is C code which defines a
function named `yyparse' which implements that grammar. This function
does not make a complete C program: you must supply some additional
functions. One is the lexical analyzer. Another is an error-reporting
function which the parser calls to report an error. In addition, a
complete C program must start with a function called `main'; you have
to provide this, and arrange for it to call `yyparse' or the parser
will never run. *Note Parser C-Language Interface: Interface.

   Aside from the token type names and the symbols in the actions you
write, all symbols defined in the Bison parser implementation file
itself begin with `yy' or `YY'.
This includes interface functions such
as the lexical analyzer function `yylex', the error reporting function
`yyerror' and the parser function `yyparse' itself. This also includes
numerous identifiers used for internal purposes. Therefore, you should
avoid using C identifiers starting with `yy' or `YY' in the Bison
grammar file except for the ones defined in this manual. Also, you
should avoid using the C identifiers `malloc' and `free' for anything
other than their usual meanings.

   In some cases the Bison parser implementation file includes system
headers, and in those cases your code should respect the identifiers
reserved by those headers. On some non-GNU hosts, `<alloca.h>',
`<malloc.h>', `<stddef.h>', and `<stdlib.h>' are included as needed to
declare memory allocators and related types. `<libintl.h>' is included
if message translation is in use (*note Internationalization::). Other
system headers may be included if you define `YYDEBUG' to a nonzero
value (*note Tracing Your Parser: Tracing.).


File: bison.info, Node: Stages, Next: Grammar Layout, Prev: Bison Parser, Up: Concepts

1.8 Stages in Using Bison
=========================

The actual language-design process using Bison, from grammar
specification to a working compiler or interpreter, has these parts:

  1. Formally specify the grammar in a form recognized by Bison (*note
     Bison Grammar Files: Grammar File.). For each grammatical rule in
     the language, describe the action that is to be taken when an
     instance of that rule is recognized. The action is described by a
     sequence of C statements.

  2. Write a lexical analyzer to process input and pass tokens to the
     parser. The lexical analyzer may be written by hand in C (*note
     The Lexical Analyzer Function `yylex': Lexical.).
     It could also
     be produced using Lex, but the use of Lex is not discussed in this
     manual.

  3. Write a controlling function that calls the Bison-produced parser.

  4. Write error-reporting routines.

   To turn this source code as written into a runnable program, you
must follow these steps:

  1. Run Bison on the grammar to produce the parser.

  2. Compile the code output by Bison, as well as any other source
     files.

  3. Link the object files to produce the finished product.


File: bison.info, Node: Grammar Layout, Prev: Stages, Up: Concepts

1.9 The Overall Layout of a Bison Grammar
=========================================

The input file for the Bison utility is a "Bison grammar file". The
general form of a Bison grammar file is as follows:

     %{
       PROLOGUE
     %}

     BISON DECLARATIONS

     %%
     GRAMMAR RULES
     %%
     EPILOGUE

The `%%', `%{' and `%}' are punctuation that appears in every Bison
grammar file to separate the sections.

   The prologue may define types and variables used in the actions.
You can also use preprocessor commands to define macros used there, and
use `#include' to include header files that do any of these things.
You need to declare the lexical analyzer `yylex' and the error printer
`yyerror' here, along with any other global identifiers used by the
actions in the grammar rules.

   The Bison declarations declare the names of the terminal and
nonterminal symbols, and may also describe operator precedence and the
data types of semantic values of various symbols.

   The grammar rules define how to construct each nonterminal symbol
from its parts.

   The epilogue can contain any code you want to use. Often the
definitions of functions declared in the prologue go here. In a simple
program, all the rest of the program can go here.
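
   The split between prologue and epilogue follows ordinary C practice:
declare first, define later. The sketch below is plain C, not Bison
output, and rests on stated assumptions: `toy_parse' is a hypothetical
stand-in for the generated `yyparse', and the variables `input' and
`error_count' exist only for this illustration. It shows why `yylex'
and `yyerror' must be declared before the parser code that calls them,
even though their definitions come last.

```c
#include <stdio.h>

/* --- What the prologue would hold: declarations only.  --- */
int yylex (void);
void yyerror (char const *);

/* --- Stand-in for the parser Bison would generate.  --- */
/* Hypothetical: accepts any sequence of 'a' tokens.  Like yyparse,
   it returns 0 on success and 1 on a syntax error.  */
int
toy_parse (void)
{
  int tok;
  while ((tok = yylex ()) != 0)
    if (tok != 'a')
      {
        yyerror ("syntax error");
        return 1;
      }
  return 0;
}

/* --- What the epilogue would hold: the definitions.  --- */
static char const *input = "aaa";   /* illustration-only input */
static int error_count = 0;         /* illustration-only counter */

/* Return the next character as its own token code; 0 means
   end of input.  */
int
yylex (void)
{
  return *input ? *input++ : 0;
}

/* Report an error and count it.  */
void
yyerror (char const *s)
{
  error_count++;
  fprintf (stderr, "%s\n", s);
}
```

   In a real grammar file the middle part is written by Bison, not by
you; only the declarations and the definitions are yours to supply.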


File: bison.info, Node: Examples, Next: Grammar File, Prev: Concepts, Up: Top

2 Examples
**********

Now we show and explain several sample programs written using Bison: a
reverse polish notation calculator, an algebraic (infix) notation
calculator -- later extended to track "locations" -- and a
multi-function calculator. All produce usable, though limited,
interactive desk-top calculators.

   These examples are simple, but Bison grammars for real programming
languages are written the same way. You can copy these examples into a
source file to try them.

* Menu:

* RPN Calc::               Reverse polish notation calculator;
                           a first example with no operator precedence.
* Infix Calc::             Infix (algebraic) notation calculator.
                           Operator precedence is introduced.
* Simple Error Recovery::  Continuing after syntax errors.
* Location Tracking Calc:: Demonstrating the use of @N and @$.
* Multi-function Calc::    Calculator with memory and trig functions.
                           It uses multiple data-types for semantic values.
* Exercises::              Ideas for improving the multi-function calculator.


File: bison.info, Node: RPN Calc, Next: Infix Calc, Up: Examples

2.1 Reverse Polish Notation Calculator
======================================

The first example is that of a simple double-precision "reverse polish
notation" calculator (a calculator using postfix operators). This
example provides a good starting point, since operator precedence is
not an issue. The second example will illustrate how operator
precedence is handled.

   The source code for this calculator is named `rpcalc.y'. The `.y'
extension is a convention used for Bison grammar files.

* Menu:

* Rpcalc Declarations::    Prologue (declarations) for rpcalc.
* Rpcalc Rules::           Grammar Rules for rpcalc, with explanation.
* Rpcalc Lexer::           The lexical analyzer.
* Rpcalc Main::            The controlling function.
* Rpcalc Error::           The error reporting function.
* Rpcalc Generate::        Running Bison on the grammar file.
* Rpcalc Compile::         Run the C compiler on the output code.


File: bison.info, Node: Rpcalc Declarations, Next: Rpcalc Rules, Up: RPN Calc

2.1.1 Declarations for `rpcalc'
-------------------------------

Here are the C and Bison declarations for the reverse polish notation
calculator. As in C, comments are placed between `/*...*/'.

     /* Reverse polish notation calculator. */

     %{
       #define YYSTYPE double
       #include <stdio.h>
       #include <math.h>
       int yylex (void);
       void yyerror (char const *);
     %}

     %token NUM

     %% /* Grammar rules and actions follow. */

   The declarations section (*note The prologue: Prologue.) contains
three preprocessor directives and two forward declarations.

   The `#define' directive defines the macro `YYSTYPE', thus specifying
the C data type for semantic values of both tokens and groupings (*note
Data Types of Semantic Values: Value Type.). The Bison parser will use
whatever type `YYSTYPE' is defined as; if you don't define it, `int' is
the default. Because we specify `double', each token and each
expression has an associated value, which is a floating point number.

   The `#include' directives declare `printf', which the actions call,
and the exponentiation function `pow'.

   The forward declarations for `yylex' and `yyerror' are needed
because the C language requires that functions be declared before they
are used. These functions will be defined in the epilogue, but the
parser calls them so they must be declared in the prologue.

   The second section, Bison declarations, provides information to Bison
about the token types (*note The Bison Declarations Section: Bison
Declarations.).
Each terminal symbol that is not a single-character
literal must be declared here. (Single-character literals normally
don't need to be declared.) In this example, all the arithmetic
operators are designated by single-character literals, so the only
terminal symbol that needs to be declared is `NUM', the token type for
numeric constants.


File: bison.info, Node: Rpcalc Rules, Next: Rpcalc Lexer, Prev: Rpcalc Declarations, Up: RPN Calc

2.1.2 Grammar Rules for `rpcalc'
--------------------------------

Here are the grammar rules for the reverse polish notation calculator.

     input:
       /* empty */
     | input line
     ;

     line:
       '\n'
     | exp '\n'      { printf ("%.10g\n", $1); }
     ;

     exp:
       NUM           { $$ = $1;           }
     | exp exp '+'   { $$ = $1 + $2;      }
     | exp exp '-'   { $$ = $1 - $2;      }
     | exp exp '*'   { $$ = $1 * $2;      }
     | exp exp '/'   { $$ = $1 / $2;      }
     | exp exp '^'   { $$ = pow ($1, $2); }  /* Exponentiation */
     | exp 'n'       { $$ = -$1;          }  /* Unary minus    */
     ;
     %%

   The groupings of the rpcalc "language" defined here are the
expression (given the name `exp'), the line of input (`line'), and the
complete input transcript (`input'). Each of these nonterminal symbols
has several alternate rules, joined by the vertical bar `|' which is
read as "or". The following sections explain what these rules mean.

   The semantics of the language is determined by the actions taken
when a grouping is recognized. The actions are the C code that appears
inside braces. *Note Actions::.

   You must specify these actions in C, but Bison provides the means for
passing semantic values between the rules. In each action, the
pseudo-variable `$$' stands for the semantic value for the grouping
that the rule is going to construct. Assigning a value to `$$' is the
main job of most actions.
The semantic values of the components of the
rule are referred to as `$1', `$2', and so on.

* Menu:

* Rpcalc Input::
* Rpcalc Line::
* Rpcalc Expr::


File: bison.info, Node: Rpcalc Input, Next: Rpcalc Line, Up: Rpcalc Rules

2.1.2.1 Explanation of `input'
..............................

Consider the definition of `input':

     input:
       /* empty */
     | input line
     ;

   This definition reads as follows: "A complete input is either an
empty string, or a complete input followed by an input line". Notice
that "complete input" is defined in terms of itself. This definition
is said to be "left recursive" since `input' appears always as the
leftmost symbol in the sequence. *Note Recursive Rules: Recursion.

   The first alternative is empty because there are no symbols between
the colon and the first `|'; this means that `input' can match an empty
string of input (no tokens). We write the rules this way because it is
legitimate to type `Ctrl-d' right after you start the calculator. It's
conventional to put an empty alternative first and write the comment
`/* empty */' in it.

   The second alternate rule (`input line') handles all nontrivial
input. It means, "After reading any number of lines, read one more
line if possible." The left recursion makes this rule into a loop.
Since the first alternative matches empty input, the loop can be
executed zero or more times.

   The parser function `yyparse' continues to process input until a
grammatical error is seen or the lexical analyzer says there are no more
input tokens; we will arrange for the latter to happen at end-of-input.


File: bison.info, Node: Rpcalc Line, Next: Rpcalc Expr, Prev: Rpcalc Input, Up: Rpcalc Rules

2.1.2.2 Explanation of `line'
.............................

Now consider the definition of `line':

     line:
       '\n'
     | exp '\n'  { printf ("%.10g\n", $1); }
     ;

   The first alternative is a token which is a newline character; this
means that rpcalc accepts a blank line (and ignores it, since there is
no action).  The second alternative is an expression followed by a
newline.  This is the alternative that makes rpcalc useful.  The
semantic value of the `exp' grouping is the value of `$1' because the
`exp' in question is the first symbol in the alternative.  The action
prints this value, which is the result of the computation the user
asked for.

   This action is unusual because it does not assign a value to `$$'.
As a consequence, the semantic value associated with the `line' is
uninitialized (its value will be unpredictable).  This would be a bug if
that value were ever used, but we don't use it: once rpcalc has printed
the value of the user's input line, that value is no longer needed.


File: bison.info,  Node: Rpcalc Expr,  Prev: Rpcalc Line,  Up: Rpcalc Rules

2.1.2.3 Explanation of `exp'
............................

The `exp' grouping has several rules, one for each kind of expression.
The first rule handles the simplest expressions: those that are just
numbers.  The second handles an addition-expression, which looks like
two expressions followed by a plus-sign.  The third handles
subtraction, and so on.

     exp:
       NUM
     | exp exp '+'     { $$ = $1 + $2;    }
     | exp exp '-'     { $$ = $1 - $2;    }
     ...
     ;

   We have used `|' to join all the rules for `exp', but we could
equally well have written them separately:

     exp: NUM ;
     exp: exp exp '+'     { $$ = $1 + $2; };
     exp: exp exp '-'     { $$ = $1 - $2; };
     ...

   Most of the rules have actions that compute the value of the
expression in terms of the value of its parts.
For example, in the
rule for addition, `$1' refers to the first component `exp' and `$2'
refers to the second one.  The third component, `'+'', has no meaningful
associated semantic value, but if it had one you could refer to it as
`$3'.  When `yyparse' recognizes a sum expression using this rule, the
sum of the two subexpressions' values is produced as the value of the
entire expression.  *Note Actions::.

   You don't have to give an action for every rule.  When a rule has no
action, Bison by default copies the value of `$1' into `$$'.  This is
what happens in the first rule (the one that uses `NUM').

   The formatting shown here is the recommended convention, but Bison
does not require it.  You can add or change white space as much as you
wish.  For example, this:

     exp: NUM | exp exp '+' {$$ = $1 + $2; } | ... ;

means the same thing as this:

     exp:
       NUM
     | exp exp '+'    { $$ = $1 + $2; }
     | ...
     ;

The latter, however, is much more readable.


File: bison.info,  Node: Rpcalc Lexer,  Next: Rpcalc Main,  Prev: Rpcalc Rules,  Up: RPN Calc

2.1.3 The `rpcalc' Lexical Analyzer
-----------------------------------

The lexical analyzer's job is low-level parsing: converting characters
or sequences of characters into tokens.  The Bison parser gets its
tokens by calling the lexical analyzer.  *Note The Lexical Analyzer
Function `yylex': Lexical.

   Only a simple lexical analyzer is needed for the RPN calculator.
This lexical analyzer skips blanks and tabs, then reads in numbers as
`double' and returns them as `NUM' tokens.  Any other character that
isn't part of a number is a separate token.  Note that the token-code
for such a single-character token is the character itself.

   The return value of the lexical analyzer function is a numeric code
which represents a token type.
The same text used in Bison rules to
stand for this token type is also a C expression for the numeric code
for the type.  This works in two ways.  If the token type is a
character literal, then its numeric code is that of the character; you
can use the same character literal in the lexical analyzer to express
the number.  If the token type is an identifier, that identifier is
defined by Bison as a C macro whose definition is the appropriate
number.  In this example, therefore, `NUM' becomes a macro for `yylex'
to use.

   The semantic value of the token (if it has one) is stored into the
global variable `yylval', which is where the Bison parser will look for
it.  (The C data type of `yylval' is `YYSTYPE', which was defined at
the beginning of the grammar; *note Declarations for `rpcalc': Rpcalc
Declarations.)

   A token type code of zero is returned if the end-of-input is
encountered.  (Bison recognizes any nonpositive value as indicating
end-of-input.)

   Here is the code for the lexical analyzer:

     /* The lexical analyzer returns a double floating point
        number on the stack and the token NUM, or the numeric code
        of the character read if not a number.  It skips all blanks
        and tabs, and returns 0 for end-of-input.  */

     #include <ctype.h>

     int
     yylex (void)
     {
       int c;

       /* Skip white space.  */
       while ((c = getchar ()) == ' ' || c == '\t')
         continue;
       /* Process numbers.  */
       if (c == '.' || isdigit (c))
         {
           ungetc (c, stdin);
           scanf ("%lf", &yylval);
           return NUM;
         }
       /* Return end-of-input.  */
       if (c == EOF)
         return 0;
       /* Return a single char.
          */
       return c;
     }


File: bison.info,  Node: Rpcalc Main,  Next: Rpcalc Error,  Prev: Rpcalc Lexer,  Up: RPN Calc

2.1.4 The Controlling Function
------------------------------

In keeping with the spirit of this example, the controlling function is
kept to the bare minimum.  The only requirement is that it call
`yyparse' to start the process of parsing.

     int
     main (void)
     {
       return yyparse ();
     }


File: bison.info,  Node: Rpcalc Error,  Next: Rpcalc Generate,  Prev: Rpcalc Main,  Up: RPN Calc

2.1.5 The Error Reporting Routine
---------------------------------

When `yyparse' detects a syntax error, it calls the error reporting
function `yyerror' to print an error message (usually but not always
`"syntax error"').  It is up to the programmer to supply `yyerror'
(*note Parser C-Language Interface: Interface.), so here is the
definition we will use:

     #include <stdio.h>

     /* Called by yyparse on error.  */
     void
     yyerror (char const *s)
     {
       fprintf (stderr, "%s\n", s);
     }

   After `yyerror' returns, the Bison parser may recover from the error
and continue parsing if the grammar contains a suitable error rule
(*note Error Recovery::).  Otherwise, `yyparse' returns nonzero.  We
have not written any error rules in this example, so any invalid input
will cause the calculator program to exit.  This is not clean behavior
for a real calculator, but it is adequate for the first example.


File: bison.info,  Node: Rpcalc Generate,  Next: Rpcalc Compile,  Prev: Rpcalc Error,  Up: RPN Calc

2.1.6 Running Bison to Make the Parser
--------------------------------------

Before running Bison to produce a parser, we need to decide how to
arrange all the source code in one or more source files.
For such a
simple example, the easiest thing is to put everything in one file, the
grammar file.  The definitions of `yylex', `yyerror' and `main' go at
the end, in the epilogue of the grammar file (*note The Overall Layout
of a Bison Grammar: Grammar Layout.).

   For a large project, you would probably have several source files,
and use `make' to arrange to recompile them.

   With all the source in the grammar file, you use the following
command to convert it into a parser implementation file:

     bison FILE.y

In this example, the grammar file is called `rpcalc.y' (for "Reverse
Polish CALCulator").  Bison produces a parser implementation file named
`FILE.tab.c', removing the `.y' from the grammar file name.  The parser
implementation file contains the source code for `yyparse'.  The
additional functions in the grammar file (`yylex', `yyerror' and
`main') are copied verbatim to the parser implementation file.


File: bison.info,  Node: Rpcalc Compile,  Prev: Rpcalc Generate,  Up: RPN Calc

2.1.7 Compiling the Parser Implementation File
----------------------------------------------

Here is how to compile and run the parser implementation file:

     # List files in current directory.
     $ ls
     rpcalc.tab.c  rpcalc.y

     # Compile the Bison parser.
     # `-lm' tells compiler to search math library for `pow'.
     # It follows the source file because many linkers resolve
     # libraries in command-line order.
     $ cc rpcalc.tab.c -lm -o rpcalc

     # List files again.
     $ ls
     rpcalc  rpcalc.tab.c  rpcalc.y

   The file `rpcalc' now contains the executable code.  Here is an
example session using `rpcalc'.

     $ rpcalc
     4 9 +
     13
     3 7 + 3 4 5 *+-
     -13
     3 7 + 3 4 5 * + - n              Note the unary minus, `n'
     13
     5 6 / 4 n +
     -3.166666667
     3 4 ^                            Exponentiation
     81
     ^D                               End-of-file indicator
     $


File: bison.info,  Node: Infix Calc,  Next: Simple Error Recovery,  Prev: RPN Calc,  Up: Examples

2.2 Infix Notation Calculator: `calc'
=====================================

We now modify rpcalc to handle infix operators instead of postfix.
Infix notation involves the concept of operator precedence and the need
for parentheses nested to arbitrary depth.  Here is the Bison code for
`calc.y', an infix desk-top calculator.

     /* Infix notation calculator.  */

     %{
       #define YYSTYPE double
       #include <math.h>
       #include <stdio.h>
       int yylex (void);
       void yyerror (char const *);
     %}

     /* Bison declarations.  */
     %token NUM
     %left '-' '+'
     %left '*' '/'
     %left NEG     /* negation--unary minus */
     %right '^'    /* exponentiation */

     %% /* The grammar follows.  */
     input:
       /* empty */
     | input line
     ;

     line:
       '\n'
     | exp '\n'  { printf ("\t%.10g\n", $1); }
     ;

     exp:
       NUM                { $$ = $1;           }
     | exp '+' exp        { $$ = $1 + $3;      }
     | exp '-' exp        { $$ = $1 - $3;      }
     | exp '*' exp        { $$ = $1 * $3;      }
     | exp '/' exp        { $$ = $1 / $3;      }
     | '-' exp  %prec NEG { $$ = -$2;          }
     | exp '^' exp        { $$ = pow ($1, $3); }
     | '(' exp ')'        { $$ = $2;           }
     ;
     %%

The functions `yylex', `yyerror' and `main' can be the same as before.

   There are two important new features shown in this code.

   In the second section (Bison declarations), `%left' declares token
types and says they are left-associative operators.  The declarations
`%left' and `%right' (right associativity) take the place of `%token'
which is used to declare a token type name without associativity.
(These tokens are single-character literals, which ordinarily don't
need to be declared.  We declare them here to specify the
associativity.)

   Operator precedence is determined by the line ordering of the
declarations; the higher the line number of the declaration (lower on
the page or screen), the higher the precedence.  Hence, exponentiation
has the highest precedence, unary minus (`NEG') is next, followed by
`*' and `/', and so on.  *Note Operator Precedence: Precedence.

   The other important new feature is the `%prec' in the grammar
section for the unary minus operator.  The `%prec' simply instructs
Bison that the rule `| '-' exp' has the same precedence as `NEG'--in
this case the next-to-highest.  *Note Context-Dependent Precedence:
Contextual Precedence.

   Here is a sample run of `calc.y':

     $ calc
     4 + 4.5 - (34/(8*3+-3))
     6.880952381
     -56 + 2
     -54
     3 ^ 2
     9


File: bison.info,  Node: Simple Error Recovery,  Next: Location Tracking Calc,  Prev: Infix Calc,  Up: Examples

2.3 Simple Error Recovery
=========================

Up to this point, this manual has not addressed the issue of "error
recovery"--how to continue parsing after the parser detects a syntax
error.  All we have handled is error reporting with `yyerror'.  Recall
that by default `yyparse' returns after calling `yyerror'.  This means
that an erroneous input line causes the calculator program to exit.
Now we show how to rectify this deficiency.

   The Bison language itself includes the reserved word `error', which
may be included in the grammar rules.  In the example below it has been
added to one of the alternatives for `line':

     line:
       '\n'
     | exp '\n'   { printf ("\t%.10g\n", $1); }
     | error '\n' { yyerrok;                  }
     ;

   This addition to the grammar allows for simple error recovery in the
event of a syntax error.
If an expression that cannot be evaluated is
read, the error will be recognized by the third rule for `line', and
parsing will continue.  (The `yyerror' function is still called upon to
print its message as well.)  The action executes the statement
`yyerrok', a macro defined automatically by Bison; its meaning is that
error recovery is complete (*note Error Recovery::).  Note the
difference between `yyerrok' and `yyerror'; neither one is a misprint.

   This form of error recovery deals with syntax errors.  There are
other kinds of errors; for example, division by zero, which raises an
exception signal that is normally fatal.  A real calculator program
must handle this signal and use `longjmp' to return to `main' and
resume parsing input lines; it would also have to discard the rest of
the current line of input.  We won't discuss this issue further because
it is not specific to Bison programs.


File: bison.info,  Node: Location Tracking Calc,  Next: Multi-function Calc,  Prev: Simple Error Recovery,  Up: Examples

2.4 Location Tracking Calculator: `ltcalc'
==========================================

This example extends the infix notation calculator with location
tracking.  This feature will be used to improve the error messages.  For
the sake of clarity, this example is a simple integer calculator, since
most of the work needed to use locations will be done in the lexical
analyzer.

* Menu:

* Ltcalc Declarations::    Bison and C declarations for ltcalc.
* Ltcalc Rules::           Grammar rules for ltcalc, with explanations.
* Ltcalc Lexer::           The lexical analyzer.


File: bison.info,  Node: Ltcalc Declarations,  Next: Ltcalc Rules,  Up: Location Tracking Calc

2.4.1 Declarations for `ltcalc'
-------------------------------

The C and Bison declarations for the location tracking calculator are
the same as the declarations for the infix notation calculator.

     /* Location tracking calculator.  */

     %{
       #define YYSTYPE int
       #include <math.h>
       int yylex (void);
       void yyerror (char const *);
     %}

     /* Bison declarations.  */
     %token NUM

     %left '-' '+'
     %left '*' '/'
     %left NEG
     %right '^'

     %% /* The grammar follows.  */

Note there are no declarations specific to locations.  Defining a data
type for storing locations is not needed: we will use the type provided
by default (*note Data Types of Locations: Location Type.), which is a
four member structure with the following integer fields: `first_line',
`first_column', `last_line' and `last_column'.  By convention, and in
accordance with the GNU Coding Standards and common practice, the line
and column count both start at 1.


File: bison.info,  Node: Ltcalc Rules,  Next: Ltcalc Lexer,  Prev: Ltcalc Declarations,  Up: Location Tracking Calc

2.4.2 Grammar Rules for `ltcalc'
--------------------------------

Whether or not you handle locations has no effect on the syntax of your
language.  Therefore, grammar rules for this example will be very close
to those of the previous example: we will only modify them to benefit
from the new information.

   Here, we will use locations to report divisions by zero, and locate
the wrong expressions or subexpressions.
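
   Before turning to the rules, here is a minimal standalone sketch of
that reporting scheme in plain C, outside of any Bison grammar.  It is
only an illustration: `loc_t', `format_loc_error' and `checked_div' are
hypothetical names, and `loc_t' merely mirrors the four integer fields
of the default location type rather than being the type Bison
generates.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the default location type; only the four
   field names are taken from the manual.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} loc_t;

/* Format "L.C-L.C: MSG" into BUF, the layout used for location-aware
   messages.  Returns what snprintf returns.  */
static int
format_loc_error (char *buf, size_t size, loc_t loc, char const *msg)
{
  assert (buf != NULL && size > 0);
  return snprintf (buf, size, "%d.%d-%d.%d: %s",
                   loc.first_line, loc.first_column,
                   loc.last_line, loc.last_column, msg);
}

/* Integer division that reports the location of a zero divisor on
   stderr and yields 1 as a placeholder result.  */
static int
checked_div (int num, int den, loc_t den_loc)
{
  char buf[128];

  if (den)
    return num / den;
  format_loc_error (buf, sizeof buf, den_loc, "division by zero");
  fprintf (stderr, "%s\n", buf);
  return 1;
}
```

   In a grammar action the same idea is expressed with the location
pseudo-variables instead of an explicit `loc_t' argument.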

     input:
       /* empty */
     | input line
     ;

     line:
       '\n'
     | exp '\n' { printf ("%d\n", $1); }
     ;

     exp:
       NUM           { $$ = $1; }
     | exp '+' exp   { $$ = $1 + $3; }
     | exp '-' exp   { $$ = $1 - $3; }
     | exp '*' exp   { $$ = $1 * $3; }
     | exp '/' exp
         {
           if ($3)
             $$ = $1 / $3;
           else
             {
               $$ = 1;
               fprintf (stderr, "%d.%d-%d.%d: division by zero\n",
                        @3.first_line, @3.first_column,
                        @3.last_line, @3.last_column);
             }
         }
     | '-' exp %prec NEG     { $$ = -$2; }
     | exp '^' exp           { $$ = pow ($1, $3); }
     | '(' exp ')'           { $$ = $2; }
     ;

   This code shows how to reach locations inside of semantic actions, by
using the pseudo-variables `@N' for rule components, and the
pseudo-variable `@$' for groupings.

   We don't need to assign a value to `@$': the output parser does it
automatically.  By default, before executing the C code of each action,
`@$' is set to range from the beginning of `@1' to the end of `@N', for
a rule with N components.  This behavior can be redefined (*note
Default Action for Locations: Location Default Action.), and for very
specific rules, `@$' can be computed by hand.


File: bison.info,  Node: Ltcalc Lexer,  Prev: Ltcalc Rules,  Up: Location Tracking Calc

2.4.3 The `ltcalc' Lexical Analyzer.
------------------------------------

Until now, we relied on Bison's defaults to enable location tracking.
The next step is to rewrite the lexical analyzer, and make it able to
feed the parser with the token locations, as it already does for
semantic values.

   To this end, we must take into account every single character of the
input text, to keep the computed locations from being fuzzy or wrong:

     int
     yylex (void)
     {
       int c;

       /* Skip white space.  */
       while ((c = getchar ()) == ' ' || c == '\t')
         ++yylloc.last_column;

       /* Step.
          */
       yylloc.first_line = yylloc.last_line;
       yylloc.first_column = yylloc.last_column;

       /* Process numbers.  */
       if (isdigit (c))
         {
           yylval = c - '0';
           ++yylloc.last_column;
           while (isdigit (c = getchar ()))
             {
               ++yylloc.last_column;
               yylval = yylval * 10 + c - '0';
             }
           ungetc (c, stdin);
           return NUM;
         }

       /* Return end-of-input.  */
       if (c == EOF)
         return 0;

       /* Return a single char, and update location.  */
       if (c == '\n')
         {
           ++yylloc.last_line;
           yylloc.last_column = 0;
         }
       else
         ++yylloc.last_column;
       return c;
     }

   Basically, the lexical analyzer performs the same processing as
before: it skips blanks and tabs, and reads numbers or single-character
tokens.  In addition, it updates `yylloc', the global variable (of type
`YYLTYPE') containing the token's location.

   Now, each time this function returns a token, the parser has its
number as well as its semantic value, and its location in the text.
The last needed change is to initialize `yylloc', for example in the
controlling function:

     int
     main (void)
     {
       yylloc.first_line = yylloc.last_line = 1;
       yylloc.first_column = yylloc.last_column = 0;
       return yyparse ();
     }

   Remember that computing locations is not a matter of syntax.  Every
character must be associated with a location update, whether it is in
valid input, in comments, in literal strings, and so on.


File: bison.info,  Node: Multi-function Calc,  Next: Exercises,  Prev: Location Tracking Calc,  Up: Examples

2.5 Multi-Function Calculator: `mfcalc'
=======================================

Now that the basics of Bison have been discussed, it is time to move on
to a more advanced problem.  The above calculators provided only five
functions, `+', `-', `*', `/' and `^'.
It would be nice to have a
calculator that provides other mathematical functions such as `sin',
`cos', etc.

   It is easy to add new operators to the infix calculator as long as
they are only single-character literals.  The lexical analyzer `yylex'
passes back all nonnumeric characters as tokens, so new grammar rules
suffice for adding a new operator.  But we want something more
flexible: built-in functions whose syntax has this form:

     FUNCTION_NAME (ARGUMENT)

At the same time, we will add memory to the calculator, by allowing you
to create named variables, store values in them, and use them later.
Here is a sample session with the multi-function calculator:

     $ mfcalc
     pi = 3.141592653589
     3.1415926536
     sin(pi)
     0.0000000000
     alpha = beta1 = 2.3
     2.3000000000
     alpha
     2.3000000000
     ln(alpha)
     0.8329091229
     exp(ln(beta1))
     2.3000000000
     $

   Note that multiple assignment and nested function calls are
permitted.

* Menu:

* Mfcalc Declarations::    Bison declarations for multi-function calculator.
* Mfcalc Rules::           Grammar rules for the calculator.
* Mfcalc Symbol Table::    Symbol table management subroutines.


File: bison.info,  Node: Mfcalc Declarations,  Next: Mfcalc Rules,  Up: Multi-function Calc

2.5.1 Declarations for `mfcalc'
-------------------------------

Here are the C and Bison declarations for the multi-function calculator.

     %{
       #include <math.h>  /* For math functions, cos(), sin(), etc.  */
       #include "calc.h"  /* Contains definition of `symrec'.  */
       int yylex (void);
       void yyerror (char const *);
     %}

     %union {
       double    val;   /* For returning numbers.  */
       symrec  *tptr;   /* For returning symbol-table pointers.  */
     }
     %token <val>  NUM        /* Simple double precision number.  */
     %token <tptr> VAR FNCT   /* Variable and function.
                                 */
     %type  <val>  exp

     %right '='
     %left '-' '+'
     %left '*' '/'
     %left NEG     /* negation--unary minus */
     %right '^'    /* exponentiation */

   The above grammar introduces only two new features of the Bison
language.  These features allow semantic values to have various data
types (*note More Than One Value Type: Multiple Types.).

   The `%union' declaration specifies the entire list of possible types;
this is instead of defining `YYSTYPE'.  The allowable types are now
double-floats (for `exp' and `NUM') and pointers to entries in the
symbol table.  *Note The Collection of Value Types: Union Decl.

   Since values can now have various types, it is necessary to
associate a type with each grammar symbol whose semantic value is used.
These symbols are `NUM', `VAR', `FNCT', and `exp'.  Their declarations
are augmented with information about their data type (placed between
angle brackets).

   The Bison construct `%type' is used for declaring nonterminal
symbols, just as `%token' is used for declaring token types.  We have
not used `%type' before because nonterminal symbols are normally
declared implicitly by the rules that define them.  But `exp' must be
declared explicitly so we can specify its value type.  *Note
Nonterminal Symbols: Type Decl.


File: bison.info,  Node: Mfcalc Rules,  Next: Mfcalc Symbol Table,  Prev: Mfcalc Declarations,  Up: Multi-function Calc

2.5.2 Grammar Rules for `mfcalc'
--------------------------------

Here are the grammar rules for the multi-function calculator.  Most of
them are copied directly from `calc'; three rules, those which mention
`VAR' or `FNCT', are new.

     %% /* The grammar follows.
          */
     input:
       /* empty */
     | input line
     ;

     line:
       '\n'
     | exp '\n'   { printf ("%.10g\n", $1); }
     | error '\n' { yyerrok;                }
     ;

     exp:
       NUM                { $$ = $1;                         }
     | VAR                { $$ = $1->value.var;              }
     | VAR '=' exp        { $$ = $3; $1->value.var = $3;     }
     | FNCT '(' exp ')'   { $$ = (*($1->value.fnctptr))($3); }
     | exp '+' exp        { $$ = $1 + $3;                    }
     | exp '-' exp        { $$ = $1 - $3;                    }
     | exp '*' exp        { $$ = $1 * $3;                    }
     | exp '/' exp        { $$ = $1 / $3;                    }
     | '-' exp  %prec NEG { $$ = -$2;                        }
     | exp '^' exp        { $$ = pow ($1, $3);               }
     | '(' exp ')'        { $$ = $2;                         }
     ;
     /* End of grammar.  */
     %%


File: bison.info,  Node: Mfcalc Symbol Table,  Prev: Mfcalc Rules,  Up: Multi-function Calc

2.5.3 The `mfcalc' Symbol Table
-------------------------------

The multi-function calculator requires a symbol table to keep track of
the names and meanings of variables and functions.  This doesn't affect
the grammar rules (except for the actions) or the Bison declarations,
but it requires some additional C functions for support.

   The symbol table itself consists of a linked list of records.  Its
definition, which is kept in the header `calc.h', is as follows.  It
provides for either functions or variables to be placed in the table.

     /* Function type.  */
     typedef double (*func_t) (double);

     /* Data type for links in the chain of symbols.  */
     struct symrec
     {
       char *name;  /* name of symbol */
       int type;    /* type of symbol: either VAR or FNCT */
       union
       {
         double var;      /* value of a VAR */
         func_t fnctptr;  /* value of a FNCT */
       } value;
       struct symrec *next;  /* link field */
     };

     typedef struct symrec symrec;

     /* The symbol table: a chain of `struct symrec'.
          */
     extern symrec *sym_table;

     symrec *putsym (char const *, int);
     symrec *getsym (char const *);

   The new version of `main' includes a call to `init_table', a
function that initializes the symbol table.  Here it is, and
`init_table' as well:

     #include <stdio.h>

     /* Called by yyparse on error.  */
     void
     yyerror (char const *s)
     {
       fprintf (stderr, "%s\n", s);
     }

     struct init
     {
       char const *fname;
       double (*fnct) (double);
     };

     struct init const arith_fncts[] =
     {
       "sin",  sin,
       "cos",  cos,
       "atan", atan,
       "ln",   log,
       "exp",  exp,
       "sqrt", sqrt,
       0, 0
     };

     /* The symbol table: a chain of `struct symrec'.  */
     symrec *sym_table;

     /* Put arithmetic functions in table.  */
     void
     init_table (void)
     {
       int i;
       for (i = 0; arith_fncts[i].fname != 0; i++)
         {
           symrec *ptr = putsym (arith_fncts[i].fname, FNCT);
           ptr->value.fnctptr = arith_fncts[i].fnct;
         }
     }

     int
     main (void)
     {
       init_table ();
       return yyparse ();
     }

   By simply editing the initialization list and adding the necessary
include files, you can add additional functions to the calculator.

   Two important functions allow look-up and installation of symbols in
the symbol table.  The function `putsym' is passed a name and the type
(`VAR' or `FNCT') of the object to be installed.  The object is linked
to the front of the list, and a pointer to the object is returned.  The
function `getsym' is passed the name of the symbol to look up.  If
found, a pointer to that symbol is returned; otherwise zero is returned.

     #include <stdlib.h> /* malloc.  */
     #include <string.h> /* strlen.
                            */

     symrec *
     putsym (char const *sym_name, int sym_type)
     {
       symrec *ptr = (symrec *) malloc (sizeof (symrec));
       ptr->name = (char *) malloc (strlen (sym_name) + 1);
       strcpy (ptr->name,sym_name);
       ptr->type = sym_type;
       ptr->value.var = 0; /* Set value to 0 even if fctn.  */
       ptr->next = (struct symrec *)sym_table;
       sym_table = ptr;
       return ptr;
     }

     symrec *
     getsym (char const *sym_name)
     {
       symrec *ptr;
       for (ptr = sym_table; ptr != (symrec *) 0;
            ptr = (symrec *)ptr->next)
         if (strcmp (ptr->name,sym_name) == 0)
           return ptr;
       return 0;
     }

   The function `yylex' must now recognize variables, numeric values,
and the single-character arithmetic operators.  Strings of alphanumeric
characters with a leading letter are recognized as either variables or
functions depending on what the symbol table says about them.

   The string is passed to `getsym' for look up in the symbol table.  If
the name appears in the table, a pointer to its location and its type
(`VAR' or `FNCT') is returned to `yyparse'.  If it is not already in
the table, then it is installed as a `VAR' using `putsym'.  Again, a
pointer and its type (which must be `VAR') is returned to `yyparse'.

   No change is needed in the handling of numeric values and arithmetic
operators in `yylex'.

     #include <ctype.h>

     int
     yylex (void)
     {
       int c;

       /* Ignore white space, get first nonwhite character.  */
       while ((c = getchar ()) == ' ' || c == '\t')
         continue;

       if (c == EOF)
         return 0;

       /* Char starts a number => parse the number.         */
       if (c == '.' || isdigit (c))
         {
           ungetc (c, stdin);
           scanf ("%lf", &yylval.val);
           return NUM;
         }

       /* Char starts an identifier => read the name.
          */
       if (isalpha (c))
         {
           /* Initially make the buffer long enough
              for a 40-character symbol name.  */
           static size_t length = 40;
           static char *symbuf = 0;
           symrec *s;
           int i;

           if (!symbuf)
             symbuf = (char *) malloc (length + 1);

           i = 0;
           do
             {
               /* If buffer is full, make it bigger.        */
               if (i == length)
                 {
                   length *= 2;
                   symbuf = (char *) realloc (symbuf, length + 1);
                 }
               /* Add this character to the buffer.         */
               symbuf[i++] = c;
               /* Get another character.                    */
               c = getchar ();
             }
           while (isalnum (c));

           ungetc (c, stdin);
           symbuf[i] = '\0';

           s = getsym (symbuf);
           if (s == 0)
             s = putsym (symbuf, VAR);
           yylval.tptr = s;
           return s->type;
         }

       /* Any other character is a token by itself.        */
       return c;
     }

   The error reporting function is unchanged, and the new version of
`main' includes a call to `init_table' and sets the `yydebug' on user
demand (*Note Tracing Your Parser: Tracing, for details):

     /* Called by yyparse on error.  */
     void
     yyerror (char const *s)
     {
       fprintf (stderr, "%s\n", s);
     }

     int
     main (int argc, char const* argv[])
     {
       int i;
       /* Enable parse traces on option -p.  */
       for (i = 1; i < argc; ++i)
         if (!strcmp(argv[i], "-p"))
           yydebug = 1;
       init_table ();
       return yyparse ();
     }

   This program is both powerful and flexible.  You may easily add new
functions, and it is a simple job to modify this code to install
predefined variables such as `pi' or `e' as well.


File: bison.info,  Node: Exercises,  Prev: Multi-function Calc,  Up: Examples

2.6 Exercises
=============

  1. Add some new functions from `math.h' to the initialization list.

  2. Add another array that contains constants and their values.
     Then
     modify `init_table' to add these constants to the symbol table.
     It will be easiest to give the constants type `VAR'.

  3. Make the program report an error if the user refers to an
     uninitialized variable in any way except to store a value in it.


File: bison.info,  Node: Grammar File,  Next: Interface,  Prev: Examples,  Up: Top

3 Bison Grammar Files
*********************

Bison takes as input a context-free grammar specification and produces a
C-language function that recognizes correct instances of the grammar.

   The Bison grammar file conventionally has a name ending in `.y'.
*Note Invoking Bison: Invocation.

* Menu:

* Grammar Outline::    Overall layout of the grammar file.
* Symbols::            Terminal and nonterminal symbols.
* Rules::              How to write grammar rules.
* Recursion::          Writing recursive rules.
* Semantics::          Semantic values and actions.
* Tracking Locations:: Locations and actions.
* Named References::   Using named references in actions.
* Declarations::       All kinds of Bison declarations are described here.
* Multiple Parsers::   Putting more than one Bison parser in one program.


File: bison.info,  Node: Grammar Outline,  Next: Symbols,  Up: Grammar File

3.1 Outline of a Bison Grammar
==============================

A Bison grammar file has four main sections, shown here with the
appropriate delimiters:

     %{
       PROLOGUE
     %}

     BISON DECLARATIONS

     %%
     GRAMMAR RULES
     %%

     EPILOGUE

   Comments enclosed in `/* ... */' may appear in any of the sections.
As a GNU extension, `//' introduces a comment that continues until end
of line.

* Menu:

* Prologue::              Syntax and usage of the prologue.
* Prologue Alternatives:: Syntax and usage of alternatives to the prologue.
* Bison Declarations::    Syntax and usage of the Bison declarations section.
* Grammar Rules::         Syntax and usage of the grammar rules section.
* Epilogue::              Syntax and usage of the epilogue.


File: bison.info, Node: Prologue, Next: Prologue Alternatives, Up: Grammar Outline

3.1.1 The prologue
------------------

The PROLOGUE section contains macro definitions and declarations of
functions and variables that are used in the actions in the grammar
rules. These are copied to the beginning of the parser implementation
file so that they precede the definition of `yyparse'. You can use
`#include' to get the declarations from a header file. If you don't
need any C declarations, you may omit the `%{' and `%}' delimiters that
bracket this section.

   The PROLOGUE section is terminated by the first occurrence of `%}'
that is outside a comment, a string literal, or a character constant.

   You may have more than one PROLOGUE section, intermixed with the
BISON DECLARATIONS. This allows you to have C and Bison declarations
that refer to each other. For example, the `%union' declaration may
use types defined in a header file, and you may wish to prototype
functions that take arguments of type `YYSTYPE'. This can be done with
two PROLOGUE blocks, one before and one after the `%union' declaration.

     %{
       #define _GNU_SOURCE
       #include <stdio.h>
       #include "ptypes.h"
     %}

     %union {
       long int n;
       tree t;  /* `tree' is defined in `ptypes.h'. */
     }

     %{
       static void print_token_value (FILE *, int, YYSTYPE);
       #define YYPRINT(F, N, L) print_token_value (F, N, L)
     %}

     ...

   When in doubt, it is usually safer to put prologue code before all
Bison declarations, rather than after.
For example, any definitions of
feature test macros like `_GNU_SOURCE' or `_POSIX_C_SOURCE' should
appear before all Bison declarations, as feature test macros can affect
the behavior of Bison-generated `#include' directives.


File: bison.info, Node: Prologue Alternatives, Next: Bison Declarations, Prev: Prologue, Up: Grammar Outline

3.1.2 Prologue Alternatives
---------------------------

The functionality of PROLOGUE sections can often be subtle and
inflexible. As an alternative, Bison provides a `%code' directive with
an explicit qualifier field, which identifies the purpose of the code
and thus the location(s) where Bison should generate it. For C/C++,
the qualifier can be omitted for the default location, or it can be one
of `requires', `provides', `top'. *Note %code Summary::.

   Look again at the example of the previous section:

     %{
       #define _GNU_SOURCE
       #include <stdio.h>
       #include "ptypes.h"
     %}

     %union {
       long int n;
       tree t;  /* `tree' is defined in `ptypes.h'. */
     }

     %{
       static void print_token_value (FILE *, int, YYSTYPE);
       #define YYPRINT(F, N, L) print_token_value (F, N, L)
     %}

     ...

Notice that there are two PROLOGUE sections here, but there's a subtle
distinction between their functionality. For example, if you decide to
override Bison's default definition for `YYLTYPE', in which PROLOGUE
section should you write your new definition? You should write it in
the first since Bison will insert that code into the parser
implementation file _before_ the default `YYLTYPE' definition. In
which PROLOGUE section should you prototype an internal function,
`trace_token', that accepts `YYLTYPE' and `yytokentype' as arguments?
You should prototype it in the second since Bison will insert that code
_after_ the `YYLTYPE' and `yytokentype' definitions.
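
   As a sketch of the two answers just given, the old-style layout
might look like the following fragment. The `YYLTYPE' structure
(including its `filename' member) is only an illustration borrowed from
the `%code' examples in this section, and `trace_token' is the
hypothetical internal function mentioned above. This is a fragment, not
a complete grammar file.

```yacc
%{
  #define _GNU_SOURCE
  #include <stdio.h>
  #include "ptypes.h"

  /* First PROLOGUE: copied before Bison's default YYLTYPE definition,
     so an overriding definition belongs here.  */
  #define YYLTYPE YYLTYPE
  typedef struct YYLTYPE
  {
    int first_line;
    int first_column;
    int last_line;
    int last_column;
    char *filename;   /* illustrative extra member */
  } YYLTYPE;
%}

%union {
  long int n;
  tree t;  /* `tree' is defined in `ptypes.h'. */
}

%{
  /* Second PROLOGUE: copied after the YYLTYPE and yytokentype
     definitions, so prototypes that use those types belong here.  */
  static void print_token_value (FILE *, int, YYSTYPE);
  #define YYPRINT(F, N, L) print_token_value (F, N, L)
  static void trace_token (enum yytokentype token, YYLTYPE loc);
%}
```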

   This distinction in functionality between the two PROLOGUE sections
is established by the appearance of the `%union' between them. This
behavior raises a few questions. First, why should the position of a
`%union' affect definitions related to `YYLTYPE' and `yytokentype'?
Second, what if there is no `%union'? In that case, the second kind of
PROLOGUE section is not available. This behavior is not intuitive.

   To avoid this subtle `%union' dependency, rewrite the example using a
`%code top' and an unqualified `%code'. Let's go ahead and add the new
`YYLTYPE' definition and the `trace_token' prototype at the same time:

     %code top {
       #define _GNU_SOURCE
       #include <stdio.h>

       /* WARNING: The following code really belongs
        * in a `%code requires'; see below. */

       #include "ptypes.h"
       #define YYLTYPE YYLTYPE
       typedef struct YYLTYPE
       {
         int first_line;
         int first_column;
         int last_line;
         int last_column;
         char *filename;
       } YYLTYPE;
     }

     %union {
       long int n;
       tree t;  /* `tree' is defined in `ptypes.h'. */
     }

     %code {
       static void print_token_value (FILE *, int, YYSTYPE);
       #define YYPRINT(F, N, L) print_token_value (F, N, L)
       static void trace_token (enum yytokentype token, YYLTYPE loc);
     }

     ...

In this way, `%code top' and the unqualified `%code' achieve the same
functionality as the two kinds of PROLOGUE sections, but it's always
explicit which kind you intend. Moreover, both kinds are always
available even in the absence of `%union'.

   The `%code top' block above logically contains two parts. The first
two lines before the warning need to appear near the top of the parser
implementation file. The first line after the warning is required by
`YYSTYPE' and thus also needs to appear in the parser implementation
file.
However, if you've instructed Bison to generate a parser header
file (*note %defines: Decl Summary.), you probably want that line to
appear before the `YYSTYPE' definition in that header file as well.
The `YYLTYPE' definition should also appear in the parser header file
to override the default `YYLTYPE' definition there.

   In other words, in the `%code top' block above, all but the first two
lines are dependency code required by the `YYSTYPE' and `YYLTYPE'
definitions. Thus, they belong in one or more `%code requires':

     %code top {
       #define _GNU_SOURCE
       #include <stdio.h>
     }

     %code requires {
       #include "ptypes.h"
     }
     %union {
       long int n;
       tree t;  /* `tree' is defined in `ptypes.h'. */
     }

     %code requires {
       #define YYLTYPE YYLTYPE
       typedef struct YYLTYPE
       {
         int first_line;
         int first_column;
         int last_line;
         int last_column;
         char *filename;
       } YYLTYPE;
     }

     %code {
       static void print_token_value (FILE *, int, YYSTYPE);
       #define YYPRINT(F, N, L) print_token_value (F, N, L)
       static void trace_token (enum yytokentype token, YYLTYPE loc);
     }

     ...

Now Bison will insert `#include "ptypes.h"' and the new `YYLTYPE'
definition before the Bison-generated `YYSTYPE' and `YYLTYPE'
definitions in both the parser implementation file and the parser
header file. (By the same reasoning, `%code requires' would also be
the appropriate place to write your own definition for `YYSTYPE'.)

   When you are writing dependency code for `YYSTYPE' and `YYLTYPE',
you should prefer `%code requires' over `%code top' regardless of
whether you instruct Bison to generate a parser header file.
When you
are writing code that you need Bison to insert only into the parser
implementation file and that has no special need to appear at the top
of that file, you should prefer the unqualified `%code' over `%code
top'. These practices will make the purpose of each block of your code
explicit to Bison and to other developers reading your grammar file.
Following these practices, we expect the unqualified `%code' and `%code
requires' to be the most important of the four PROLOGUE alternatives.

   At some point while developing your parser, you might decide to
provide `trace_token' to modules that are external to your parser.
Thus, you might wish for Bison to insert the prototype into both the
parser header file and the parser implementation file. Since this
function is not a dependency required by `YYSTYPE' or `YYLTYPE', it
doesn't make sense to move its prototype to a `%code requires'. More
importantly, since it depends upon `YYLTYPE' and `yytokentype', `%code
requires' is not sufficient. Instead, move its prototype from the
unqualified `%code' to a `%code provides':

     %code top {
       #define _GNU_SOURCE
       #include <stdio.h>
     }

     %code requires {
       #include "ptypes.h"
     }
     %union {
       long int n;
       tree t;  /* `tree' is defined in `ptypes.h'. */
     }

     %code requires {
       #define YYLTYPE YYLTYPE
       typedef struct YYLTYPE
       {
         int first_line;
         int first_column;
         int last_line;
         int last_column;
         char *filename;
       } YYLTYPE;
     }

     %code provides {
       void trace_token (enum yytokentype token, YYLTYPE loc);
     }

     %code {
       static void print_token_value (FILE *, int, YYSTYPE);
       #define YYPRINT(F, N, L) print_token_value (F, N, L)
     }

     ...

Bison will insert the `trace_token' prototype into both the parser
header file and the parser implementation file after the definitions
for `yytokentype', `YYLTYPE', and `YYSTYPE'.

   The above examples are careful to write directives in an order that
reflects the layout of the generated parser implementation and header
files: `%code top', `%code requires', `%code provides', and then
`%code'. While your grammar files may generally be easier to read if
you also follow this order, Bison does not require it. Instead, Bison
lets you choose an organization that makes sense to you.

   You may declare any of these directives multiple times in the
grammar file. In that case, Bison concatenates the contained code in
declaration order. This is the only way in which the position of one
of these directives within the grammar file affects its functionality.

   The result of the previous two properties is greater flexibility in
how you may organize your grammar file. For example, you may organize
semantic-type-related directives by semantic type:

     %code requires { #include "type1.h" }
     %union { type1 field1; }
     %destructor { type1_free ($$); } <field1>
     %printer { type1_print (yyoutput, $$); } <field1>

     %code requires { #include "type2.h" }
     %union { type2 field2; }
     %destructor { type2_free ($$); } <field2>
     %printer { type2_print (yyoutput, $$); } <field2>

You could even place each of the above directive groups in the rules
section of the grammar file next to the set of rules that uses the
associated semantic type. (In the rules section, you must terminate
each of those directives with a semicolon.) And you don't have to
worry that some directive (like a `%union') in the definitions section
is going to adversely affect their functionality in some
counter-intuitive manner just because it comes first.
Such an
organization is not possible using PROLOGUE sections.

   This section has been concerned with explaining the advantages of
the four PROLOGUE alternatives over the original Yacc PROLOGUE.
However, in most cases when using these directives, you shouldn't need
to think about all the low-level ordering issues discussed here.
Instead, you should simply use these directives to label each block of
your code according to its purpose and let Bison handle the ordering.
`%code' is the most generic label. Move code to `%code requires',
`%code provides', or `%code top' as needed.


File: bison.info, Node: Bison Declarations, Next: Grammar Rules, Prev: Prologue Alternatives, Up: Grammar Outline

3.1.3 The Bison Declarations Section
------------------------------------

The BISON DECLARATIONS section contains declarations that define
terminal and nonterminal symbols, specify precedence, and so on. In
some simple grammars you may not need any declarations. *Note Bison
Declarations: Declarations.


File: bison.info, Node: Grammar Rules, Next: Epilogue, Prev: Bison Declarations, Up: Grammar Outline

3.1.4 The Grammar Rules Section
-------------------------------

The "grammar rules" section contains one or more Bison grammar rules,
and nothing else. *Note Syntax of Grammar Rules: Rules.

   There must always be at least one grammar rule, and the first `%%'
(which precedes the grammar rules) may never be omitted even if it is
the first thing in the file.


File: bison.info, Node: Epilogue, Prev: Grammar Rules, Up: Grammar Outline

3.1.5 The epilogue
------------------

The EPILOGUE is copied verbatim to the end of the parser implementation
file, just as the PROLOGUE is copied to the beginning.
This is the
most convenient place to put anything that you want to have in the
parser implementation file but which need not come before the
definition of `yyparse'. For example, the definitions of `yylex' and
`yyerror' often go here. Because C requires functions to be declared
before being used, you often need to declare functions like `yylex' and
`yyerror' in the Prologue, even if you define them in the Epilogue.
*Note Parser C-Language Interface: Interface.

   If the last section is empty, you may omit the `%%' that separates it
from the grammar rules.

   The Bison parser itself contains many macros and identifiers whose
names start with `yy' or `YY', so it is a good idea to avoid using any
such names (except those documented in this manual) in the epilogue of
the grammar file.


File: bison.info, Node: Symbols, Next: Rules, Prev: Grammar Outline, Up: Grammar File

3.2 Symbols, Terminal and Nonterminal
=====================================

"Symbols" in Bison grammars represent the grammatical classifications
of the language.

   A "terminal symbol" (also known as a "token type") represents a
class of syntactically equivalent tokens. You use the symbol in grammar
rules to mean that a token in that class is allowed. The symbol is
represented in the Bison parser by a numeric code, and the `yylex'
function returns a token type code to indicate what kind of token has
been read. You don't need to know what the code value is; you can use
the symbol to stand for it.

   A "nonterminal symbol" stands for a class of syntactically
equivalent groupings. The symbol name is used in writing grammar rules.
By convention, it should be all lower case.

   Symbol names can contain letters, underscores, periods, and
non-initial digits and dashes. Dashes in symbol names are a GNU
extension, incompatible with POSIX Yacc.
Periods and dashes make
symbol names less convenient to use with named references, which
require brackets around such names (*note Named References::).
Terminal symbols that contain periods or dashes make little sense:
since they are not valid symbols (in most programming languages) they
are not exported as token names.

   There are three ways of writing terminal symbols in the grammar:

   * A "named token type" is written with an identifier, like an
     identifier in C. By convention, it should be all upper case. Each
     such name must be defined with a Bison declaration such as
     `%token'. *Note Token Type Names: Token Decl.

   * A "character token type" (or "literal character token") is written
     in the grammar using the same syntax used in C for character
     constants; for example, `'+'' is a character token type. A
     character token type doesn't need to be declared unless you need to
     specify its semantic value data type (*note Data Types of Semantic
     Values: Value Type.), associativity, or precedence (*note Operator
     Precedence: Precedence.).

     By convention, a character token type is used only to represent a
     token that consists of that particular character. Thus, the token
     type `'+'' is used to represent the character `+' as a token.
     Nothing enforces this convention, but if you depart from it, your
     program will confuse other readers.

     All the usual escape sequences used in character literals in C can
     be used in Bison as well, but you must not use the null character
     as a character literal because its numeric code, zero, signifies
     end-of-input (*note Calling Convention for `yylex': Calling
     Convention.). Also, unlike standard C, trigraphs have no special
     meaning in Bison character literals, nor is backslash-newline
     allowed.

   * A "literal string token" is written like a C string constant; for
     example, `"<="' is a literal string token. A literal string token
     doesn't need to be declared unless you need to specify its semantic
     value data type (*note Value Type::), associativity, or precedence
     (*note Precedence::).

     You can associate the literal string token with a symbolic name as
     an alias, using the `%token' declaration (*note Token
     Declarations: Token Decl.). If you don't do that, the lexical
     analyzer has to retrieve the token number for the literal string
     token from the `yytname' table (*note Calling Convention::).

     *Warning*: literal string tokens do not work in Yacc.

     By convention, a literal string token is used only to represent a
     token that consists of that particular string. Thus, you should
     use the token type `"<="' to represent the string `<=' as a token.
     Bison does not enforce this convention, but if you depart from it,
     people who read your program will be confused.

     All the escape sequences used in string literals in C can be used
     in Bison as well, except that you must not use a null character
     within a string literal. Also, unlike Standard C, trigraphs have
     no special meaning in Bison string literals, nor is
     backslash-newline allowed. A literal string token must contain
     two or more characters; for a token containing just one character,
     use a character token (see above).

   How you choose to write a terminal symbol has no effect on its
grammatical meaning. That depends only on where it appears in rules and
on when the parser function returns that symbol.

   The value returned by `yylex' is always one of the terminal symbols,
except that a zero or negative value signifies end-of-input. Whichever
way you write the token type in the grammar rules, you write it the
same way in the definition of `yylex'.
The numeric code for a
character token type is simply the positive numeric code of the
character, so `yylex' can use the identical value to generate the
requisite code, though you may need to convert it to `unsigned char' to
avoid sign-extension on hosts where `char' is signed. Each named token
type becomes a C macro in the parser implementation file, so `yylex'
can use the name to stand for the code. (This is why periods don't
make sense in terminal symbols.) *Note Calling Convention for `yylex':
Calling Convention.

   If `yylex' is defined in a separate file, you need to arrange for the
token-type macro definitions to be available there. Use the `-d'
option when you run Bison, so that it will write these macro definitions
into a separate header file `NAME.tab.h' which you can include in the
other source files that need it. *Note Invoking Bison: Invocation.

   If you want to write a grammar that is portable to any Standard C
host, you must use only nonnull character tokens taken from the basic
execution character set of Standard C. This set consists of the ten
digits, the 52 lower- and upper-case English letters, and the
characters in the following C-language string:

     "\a\b\t\n\v\f\r !\"#%&'()*+,-./:;<=>?[\\]^_{|}~"

   The `yylex' function and Bison must use a consistent character set
and encoding for character tokens. For example, if you run Bison in an
ASCII environment, but then compile and run the resulting program in an
environment that uses an incompatible character set like EBCDIC, the
resulting program may not work because the tables generated by Bison
will assume ASCII numeric values for character tokens.
It is standard
practice for software distributions to contain C source files that were
generated by Bison in an ASCII environment, so installers on platforms
that are incompatible with ASCII must rebuild those files before
compiling them.

   The symbol `error' is a terminal symbol reserved for error recovery
(*note Error Recovery::); you shouldn't use it for any other purpose.
In particular, `yylex' should never return this value. The default
value of the error token is 256, unless you explicitly assigned 256 to
one of your tokens with a `%token' declaration.


File: bison.info, Node: Rules, Next: Recursion, Prev: Symbols, Up: Grammar File

3.3 Syntax of Grammar Rules
===========================

A Bison grammar rule has the following general form:

     RESULT: COMPONENTS...;

where RESULT is the nonterminal symbol that this rule describes, and
COMPONENTS are various terminal and nonterminal symbols that are put
together by this rule (*note Symbols::).

   For example,

     exp: exp '+' exp;

says that two groupings of type `exp', with a `+' token in between, can
be combined into a larger grouping of type `exp'.

   White space in rules is significant only to separate symbols. You
can add extra white space as you wish.

   Scattered among the components can be ACTIONS that determine the
semantics of the rule. An action looks like this:

     {C STATEMENTS}

This is an example of "braced code", that is, C code surrounded by
braces, much like a compound statement in C. Braced code can contain
any sequence of C tokens, so long as its braces are balanced. Bison
does not check the braced code for correctness directly; it merely
copies the code to the parser implementation file, where the C compiler
can check it.
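
   To make the general form concrete, here is a minimal sketch of a
complete grammar file whose rules carry actions written as braced code.
The token name `NUM', the single-digit lexical analyzer, and the `main'
function are assumptions made purely for illustration; a real grammar
would normally be larger and would use a proper scanner.

```yacc
%{
  #include <stdio.h>
  int yylex (void);
  void yyerror (char const *);
%}

%token NUM   /* hypothetical token: a single decimal digit */

%%

input:
  /* empty */
| input exp '\n'   { printf ("%d\n", $2); }
;

exp:
  NUM              /* no action given: the default `$$ = $1' applies */
| exp '+' NUM      { $$ = $1 + $3; }
;

%%

/* Illustrative lexer: skips blanks, returns NUM for one digit,
   0 at end of input, and any other character as its own code.  */
int
yylex (void)
{
  int c = getchar ();
  while (c == ' ')
    c = getchar ();
  if (c >= '0' && c <= '9')
    {
      yylval = c - '0';
      return NUM;
    }
  return c == EOF ? 0 : c;
}

void
yyerror (char const *s)
{
  fprintf (stderr, "%s\n", s);
}

int
main (void)
{
  return yyparse ();
}
```

   Assuming the sketch is saved under the hypothetical name `sum.y',
running `bison sum.y' should write the parser implementation file
`sum.tab.c', which the C compiler then checks, braced code included.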

   Within braced code, the balanced-brace count is not affected by
braces within comments, string literals, or character constants, but it
is affected by the C digraphs `<%' and `%>' that represent braces. At
the top level braced code must be terminated by `}' and not by a
digraph. Bison does not look for trigraphs, so if braced code uses
trigraphs you should ensure that they do not affect the nesting of
braces or the boundaries of comments, string literals, or character
constants.

   Usually there is only one action and it follows the components.
*Note Actions::.

   Multiple rules for the same RESULT can be written separately or can
be joined with the vertical-bar character `|' as follows:

     RESULT:
       RULE1-COMPONENTS...
     | RULE2-COMPONENTS...
     ...
     ;

They are still considered distinct rules even when joined in this way.

   If COMPONENTS in a rule is empty, it means that RESULT can match the
empty string. For example, here is how to define a comma-separated
sequence of zero or more `exp' groupings:

     expseq:
       /* empty */
     | expseq1
     ;

     expseq1:
       exp
     | expseq1 ',' exp
     ;

It is customary to write a comment `/* empty */' in each rule with no
components.


File: bison.info, Node: Recursion, Next: Semantics, Prev: Rules, Up: Grammar File

3.4 Recursive Rules
===================

A rule is called "recursive" when its RESULT nonterminal appears also
on its right hand side. Nearly all Bison grammars need to use
recursion, because that is the only way to define a sequence of any
number of a particular thing.
Consider this recursive definition of a
comma-separated sequence of one or more expressions:

     expseq1:
       exp
     | expseq1 ',' exp
     ;

Since the recursive use of `expseq1' is the leftmost symbol in the
right hand side, we call this "left recursion". By contrast, here the
same construct is defined using "right recursion":

     expseq1:
       exp
     | exp ',' expseq1
     ;

Any kind of sequence can be defined using either left recursion or right
recursion, but you should always use left recursion, because it can
parse a sequence of any number of elements with bounded stack space.
Right recursion uses up space on the Bison stack in proportion to the
number of elements in the sequence, because all the elements must be
shifted onto the stack before the rule can be applied even once. *Note
The Bison Parser Algorithm: Algorithm, for further explanation of this.

   "Indirect" or "mutual" recursion occurs when the result of the rule
does not appear directly on its right hand side, but does appear in
rules for other nonterminals which do appear on its right hand side.

   For example:

     expr:
       primary
     | primary '+' primary
     ;

     primary:
       constant
     | '(' expr ')'
     ;

defines two mutually-recursive nonterminals, since each refers to the
other.


File: bison.info, Node: Semantics, Next: Tracking Locations, Prev: Recursion, Up: Grammar File

3.5 Defining Language Semantics
===============================

The grammar rules for a language determine only the syntax. The
semantics are determined by the semantic values associated with various
tokens and groupings, and by the actions taken when various groupings
are recognized.

   For example, the calculator calculates properly because the value
associated with each expression is the proper number; it adds properly
because the action for the grouping `X + Y' is to add the numbers
associated with X and Y.

* Menu:

* Value Type::        Specifying one data type for all semantic values.
* Multiple Types::    Specifying several alternative data types.
* Actions::           An action is the semantic definition of a grammar rule.
* Action Types::      Specifying data types for actions to operate on.
* Mid-Rule Actions::  Most actions go at the end of a rule.
                      This says when, why and how to use the exceptional
                        action in the middle of a rule.


File: bison.info, Node: Value Type, Next: Multiple Types, Up: Semantics

3.5.1 Data Types of Semantic Values
-----------------------------------

In a simple program it may be sufficient to use the same data type for
the semantic values of all language constructs. This was true in the
RPN and infix calculator examples (*note Reverse Polish Notation
Calculator: RPN Calc.).

   Bison normally uses the type `int' for semantic values if your
program uses the same data type for all language constructs. To
specify some other type, define `YYSTYPE' as a macro, like this:

     #define YYSTYPE double

`YYSTYPE''s replacement list should be a type name that does not
contain parentheses or square brackets. This macro definition must go
in the prologue of the grammar file (*note Outline of a Bison Grammar:
Grammar Outline.).


File: bison.info, Node: Multiple Types, Next: Actions, Prev: Value Type, Up: Semantics

3.5.2 More Than One Value Type
------------------------------

In most programs, you will need different data types for different kinds
of tokens and groupings.
For example, a numeric constant may need type
`int' or `long int', while a string constant needs type `char *', and
an identifier might need a pointer to an entry in the symbol table.

   To use more than one data type for semantic values in one parser,
Bison requires you to do two things:

   * Specify the entire collection of possible data types, either by
     using the `%union' Bison declaration (*note The Collection of
     Value Types: Union Decl.), or by using a `typedef' or a `#define'
     to define `YYSTYPE' to be a union type whose member names are the
     type tags.

   * Choose one of those types for each symbol (terminal or
     nonterminal) for which semantic values are used. This is done for
     tokens with the `%token' Bison declaration (*note Token Type
     Names: Token Decl.) and for groupings with the `%type' Bison
     declaration (*note Nonterminal Symbols: Type Decl.).


File: bison.info, Node: Actions, Next: Action Types, Prev: Multiple Types, Up: Semantics

3.5.3 Actions
-------------

An action accompanies a syntactic rule and contains C code to be
executed each time an instance of that rule is recognized. The task of
most actions is to compute a semantic value for the grouping built by
the rule from the semantic values associated with tokens or smaller
groupings.

   An action consists of braced code containing C statements, and can be
placed at any position in the rule; it is executed at that position.
Most rules have just one action at the end of the rule, following all
the components. Actions in the middle of a rule are tricky and used
only for special purposes (*note Actions in Mid-Rule: Mid-Rule
Actions.).

   The C code in an action can refer to the semantic values of the
components matched by the rule with the construct `$N', which stands
for the value of the Nth component.
The semantic value for the
grouping being constructed is `$$'. In addition, the semantic values
of symbols can be accessed with the named references construct `$NAME'
or `$[NAME]'. Bison translates both of these constructs into
expressions of the appropriate type when it copies the actions into the
parser implementation file. `$$' (or `$NAME', when it stands for the
current grouping) is translated to a modifiable lvalue, so it can be
assigned to.

   Here is a typical example:

     exp:
       ...
     | exp '+' exp     { $$ = $1 + $3; }

   Or, in terms of named references:

     exp[result]:
       ...
     | exp[left] '+' exp[right]  { $result = $left + $right; }

This rule constructs an `exp' from two smaller `exp' groupings
connected by a plus-sign token. In the action, `$1' and `$3' (`$left'
and `$right') refer to the semantic values of the two component `exp'
groupings, which are the first and third symbols on the right hand side
of the rule. The sum is stored into `$$' (`$result') so that it
becomes the semantic value of the addition-expression just recognized
by the rule. If there were a useful semantic value associated with the
`+' token, it could be referred to as `$2'.

   *Note Named References::, for more information about using the named
references construct.

   Note that the vertical-bar character `|' is really a rule separator,
and actions are attached to a single rule. This differs from
tools like Flex, for which `|' stands for either "or", or "the same
action as that of the next rule". In the following example, the action
is triggered only when `b' is found:

     a-or-b: 'a'|'b'   { a_or_b_found = 1; };

   If you don't specify an action for a rule, Bison supplies a default:
`$$ = $1'. Thus, the value of the first symbol in the rule becomes the
value of the whole rule.
Of course, the default action is valid only
if the two data types match. There is no meaningful default action for
an empty rule; every empty rule must have an explicit action unless the
rule's value does not matter.

   `$N' with N zero or negative is allowed for reference to tokens and
groupings on the stack _before_ those that match the current rule.
This is a very risky practice, and to use it reliably you must be
certain of the context in which the rule is applied. Here is a case in
which you can use this reliably:

     foo:
       expr bar '+' expr { ... }
     | expr bar '-' expr { ... }
     ;

     bar:
       /* empty */ { previous_expr = $0; }
     ;

   As long as `bar' is used only in the fashion shown here, `$0' always
refers to the `expr' which precedes `bar' in the definition of `foo'.

   It is also possible to access the semantic value of the lookahead
token, if any, from a semantic action. This semantic value is stored
in `yylval'. *Note Special Features for Use in Actions: Action
Features.


File: bison.info, Node: Action Types, Next: Mid-Rule Actions, Prev: Actions, Up: Semantics

3.5.4 Data Types of Values in Actions
-------------------------------------

If you have chosen a single data type for semantic values, the `$$' and
`$N' constructs always have that data type.

   If you have used `%union' to specify a variety of data types, then
you must declare a choice among these types for each terminal or
nonterminal symbol that can have a semantic value. Then each time you
use `$$' or `$N', its data type is determined by which symbol it refers
to in the rule. In this example,

     exp:
       ...
     | exp '+' exp { $$ = $1 + $3; }

`$1' and `$3' refer to instances of `exp', so they all have the data
type declared for the nonterminal symbol `exp'.
If `$2' were used, it
would have the data type declared for the terminal symbol `'+'',
whatever that might be.

   Alternatively, you can specify the data type when you refer to the
value, by inserting `<TYPE>' after the `$' at the beginning of the
reference. For example, if you have defined types as shown here:

     %union {
       int itype;
       double dtype;
     }

then you can write `$<itype>1' to refer to the first subunit of the
rule as an integer, or `$<dtype>1' to refer to it as a double.


File: bison.info, Node: Mid-Rule Actions, Prev: Action Types, Up: Semantics

3.5.5 Actions in Mid-Rule
-------------------------

Occasionally it is useful to put an action in the middle of a rule.
These actions are written just like usual end-of-rule actions, but they
are executed before the parser even recognizes the following components.

* Menu:

* Using Mid-Rule Actions::       Putting an action in the middle of a rule.
* Mid-Rule Action Translation::  How mid-rule actions are actually processed.
* Mid-Rule Conflicts::           Mid-rule actions can cause conflicts.


File: bison.info, Node: Using Mid-Rule Actions, Next: Mid-Rule Action Translation, Up: Mid-Rule Actions

3.5.5.1 Using Mid-Rule Actions
..............................

A mid-rule action may refer to the components preceding it using `$N',
but it may not refer to subsequent components because it is run before
they are parsed.

   The mid-rule action itself counts as one of the components of the
rule. This makes a difference when there is another action later in
the same rule (and usually there is another at the end): you have to
count the actions along with the symbols when working out which number
N to use in `$N'.

   The mid-rule action can also have a semantic value.
The action can
set its value with an assignment to `$$', and actions later in the rule
can refer to the value using `$N'. Since there is no symbol to name
the action, there is no way to declare a data type for the value in
advance, so you must use the `$<...>N' construct to specify a data type
each time you refer to this value.

   There is no way to set the value of the entire rule with a mid-rule
action, because assignments to `$$' do not have that effect. The only
way to set the value for the entire rule is with an ordinary action at
the end of the rule.

   Here is an example from a hypothetical compiler, handling a `let'
statement that looks like `let (VARIABLE) STATEMENT' and serves to
create a variable named VARIABLE temporarily for the duration of
STATEMENT. To parse this construct, we must put VARIABLE into the
symbol table while STATEMENT is parsed, then remove it afterward. Here
is how it is done:

     stmt:
       "let" '(' var ')'
         {
           $<context>$ = push_context ();
           declare_variable ($3);
         }
       stmt
         {
           $$ = $6;
           pop_context ($<context>5);
         }

As soon as `let (VARIABLE)' has been recognized, the first action is
run. It saves a copy of the current semantic context (the list of
accessible variables) as its semantic value, using alternative
`context' in the data-type union. Then it calls `declare_variable' to
add the new variable to that list. Once the first action is finished,
the embedded statement `stmt' can be parsed.

   Note that the mid-rule action is component number 5, so the `stmt' is
component number 6.
Named references can be used to improve the
readability and maintainability (*note Named References::):

     stmt:
       "let" '(' var ')'
         {
           $<context>let = push_context ();
           declare_variable ($3);
         }[let]
       stmt
         {
           $$ = $6;
           pop_context ($<context>let);
         }

   After the embedded statement is parsed, its semantic value becomes
the value of the entire `let'-statement. Then the semantic value from
the earlier action is used to restore the prior list of variables. This
removes the temporary `let'-variable from the list so that it won't
appear to exist while the rest of the program is parsed.

   In the above example, if the parser initiates error recovery (*note
Error Recovery::) while parsing the tokens in the embedded statement
`stmt', it might discard the previous semantic context `$<context>5'
without restoring it. Thus, `$<context>5' needs a destructor (*note
Freeing Discarded Symbols: Destructor Decl.). However, Bison currently
provides no means to declare a destructor specific to a particular
mid-rule action's semantic value.

   One solution is to bury the mid-rule action inside a nonterminal
symbol and to declare a destructor for that symbol:

     %type <context> let
     %destructor { pop_context ($$); } let

     %%

     stmt:
       let stmt
         {
           $$ = $2;
           pop_context ($let);
         };

     let:
       "let" '(' var ')'
         {
           $let = push_context ();
           declare_variable ($3);
         };

Note that the action is now at the end of its rule. Any mid-rule
action can be converted to an end-of-rule action in this way, and this
is what Bison actually does to implement mid-rule actions.


File: bison.info, Node: Mid-Rule Action Translation, Next: Mid-Rule Conflicts, Prev: Using Mid-Rule Actions, Up: Mid-Rule Actions

3.5.5.2 Mid-Rule Action Translation
...................................

As hinted earlier, mid-rule actions are actually transformed into
regular rules and actions. The various reports generated by Bison
(textual, graphical, etc., see *note Understanding Your Parser:
Understanding.) reveal this translation, best explained by means of an
example. The following rule:

     exp: { a(); } "b" { c(); } { d(); } "e" { f(); };

is translated into:

     $@1: /* empty */ { a(); };
     $@2: /* empty */ { c(); };
     $@3: /* empty */ { d(); };
     exp: $@1 "b" $@2 $@3 "e" { f(); };

with new nonterminal symbols `$@N', where N is a number.

   A mid-rule action is expected to generate a value if it uses `$$', or
the (final) action uses `$N' where N denotes the mid-rule action. In
that case its nonterminal is instead named `@N':

     exp: { a(); } "b" { $$ = c(); } { d(); } "e" { f = $1; };

is translated into

     @1: /* empty */ { a(); };
     @2: /* empty */ { $$ = c(); };
     $@3: /* empty */ { d(); };
     exp: @1 "b" @2 $@3 "e" { f = $1; }

   There are probably two errors in the above example: the first
mid-rule action does not generate a value (it does not use `$$'
although the final action uses it), and the value of the second one is
not used (the final action does not use `$3'). Bison reports these
errors when the `midrule-value' warnings are enabled (*note Invoking
Bison: Invocation.):

     $ bison -fcaret -Wmidrule-value mid.y
     mid.y:2.6-13: warning: unset value: $$
      exp: { a(); } "b" { $$ = c(); } { d(); } "e" { f = $1; };
           ^^^^^^^^
     mid.y:2.19-31: warning: unused value: $3
      exp: { a(); } "b" { $$ = c(); } { d(); } "e" { f = $1; };
                        ^^^^^^^^^^^^^


File: bison.info, Node: Mid-Rule Conflicts, Prev: Mid-Rule Action Translation, Up: Mid-Rule Actions

3.5.5.3 Conflicts due to Mid-Rule Actions
.........................................

Taking action before a rule is completely recognized often leads to
conflicts since the parser must commit to a parse in order to execute
the action. For example, the following two rules, without mid-rule
actions, can coexist in a working parser because the parser can shift
the open-brace token and look at what follows before deciding whether
there is a declaration or not:

     compound:
       '{' declarations statements '}'
     | '{' statements '}'
     ;

But when we add a mid-rule action as follows, the rules become
nonfunctional:

     compound:
       { prepare_for_local_variables (); }
         '{' declarations statements '}'
     | '{' statements '}'
     ;

Now the parser is forced to decide whether to run the mid-rule action
when it has read no farther than the open-brace. In other words, it
must commit to using one rule or the other, without sufficient
information to do it correctly. (The open-brace token is what is called
the "lookahead" token at this time, since the parser is still deciding
what to do about it. *Note Lookahead Tokens: Lookahead.)

   You might think that you could correct the problem by putting
identical actions into the two rules, like this:

     compound:
       { prepare_for_local_variables (); }
         '{' declarations statements '}'
     | { prepare_for_local_variables (); }
         '{' statements '}'
     ;

But this does not help, because Bison does not realize that the two
actions are identical. (Bison never tries to understand the C code in
an action.)

   If the grammar is such that a declaration can be distinguished from a
statement by the first token (which is true in C), then one solution
which does work is to put the action after the open-brace, like this:

     compound:
       '{' { prepare_for_local_variables (); }
         declarations statements '}'
     | '{' statements '}'
     ;

Now the first token of the following declaration or statement, which
would in any case tell Bison which rule to use, can still do so.

   Another solution is to bury the action inside a nonterminal symbol
which serves as a subroutine:

     subroutine:
       /* empty */ { prepare_for_local_variables (); }
     ;

     compound:
       subroutine '{' declarations statements '}'
     | subroutine '{' statements '}'
     ;

Now Bison can execute the action in the rule for `subroutine' without
deciding which rule for `compound' it will eventually use.


File: bison.info, Node: Tracking Locations, Next: Named References, Prev: Semantics, Up: Grammar File

3.6 Tracking Locations
======================

Though grammar rules and semantic actions are enough to write a fully
functional parser, it can be useful to process some additional
information, especially symbol locations.

   The way locations are handled is defined by providing a data type,
and actions to take when rules are matched.

* Menu:

* Location Type::               Specifying a data type for locations.
* Actions and Locations::       Using locations in actions.
* Location Default Action::     Defining a general way to compute locations.


File: bison.info, Node: Location Type, Next: Actions and Locations, Up: Tracking Locations

3.6.1 Data Type of Locations
----------------------------

Defining a data type for locations is much simpler than for semantic
values, since all tokens and groupings always use the same type.

   You can specify the type of locations by defining a macro called
`YYLTYPE', just as you can specify the semantic value type by defining
a `YYSTYPE' macro (*note Value Type::). When `YYLTYPE' is not defined,
Bison uses a default structure type with four members:

     typedef struct YYLTYPE
     {
       int first_line;
       int first_column;
       int last_line;
       int last_column;
     } YYLTYPE;

   When `YYLTYPE' is not defined, at the beginning of the parsing, Bison
initializes all these fields to 1 for `yylloc'. To initialize `yylloc'
with a custom location type (or to choose a different initialization),
use the `%initial-action' directive. *Note Performing Actions before
Parsing: Initial Action Decl.


File: bison.info, Node: Actions and Locations, Next: Location Default Action, Prev: Location Type, Up: Tracking Locations

3.6.2 Actions and Locations
---------------------------

Actions are not only useful for defining language semantics, but also
for describing the behavior of the output parser with locations.

   The most obvious way to build locations of syntactic groupings
is very similar to the way semantic values are computed. In a given
rule, several constructs can be used to access the locations of the
elements being matched. The location of the Nth component of the right
hand side is `@N', while the location of the left hand side grouping is
`@$'.

   In addition, the named references construct `@NAME' and `@[NAME]'
may also be used to address the symbol locations. *Note Named
References::, for more information about using the named references
construct.

   Here is a basic example using the default data type for locations:

     exp:
       ...
     | exp '/' exp
         {
           @$.first_column = @1.first_column;
           @$.first_line = @1.first_line;
           @$.last_column = @3.last_column;
           @$.last_line = @3.last_line;
           if ($3)
             $$ = $1 / $3;
           else
             {
               $$ = 1;
               fprintf (stderr,
                        "Division by zero, l%d,c%d-l%d,c%d",
                        @3.first_line, @3.first_column,
                        @3.last_line, @3.last_column);
             }
         }

   As for semantic values, there is a default action for locations that
is run each time a rule is matched. It sets the beginning of `@$' to
the beginning of the first symbol, and the end of `@$' to the end of the
last symbol.

   With this default action, the location tracking can be fully
automatic. The example above can then be written simply as:

     exp:
       ...
     | exp '/' exp
         {
           if ($3)
             $$ = $1 / $3;
           else
             {
               $$ = 1;
               fprintf (stderr,
                        "Division by zero, l%d,c%d-l%d,c%d",
                        @3.first_line, @3.first_column,
                        @3.last_line, @3.last_column);
             }
         }

   It is also possible to access the location of the lookahead token,
if any, from a semantic action. This location is stored in `yylloc'.
*Note Special Features for Use in Actions: Action Features.


File: bison.info, Node: Location Default Action, Prev: Actions and Locations, Up: Tracking Locations

3.6.3 Default Action for Locations
----------------------------------

Actually, actions are not the best place to compute locations. Since
locations are much more general than semantic values, there is room in
the output parser to redefine the default action to take for each rule.
The `YYLLOC_DEFAULT' macro is invoked each time a rule is matched,
before the associated action is run. It is also invoked while
processing a syntax error, to compute the error's location.
Before
reporting an unresolvable syntactic ambiguity, a GLR parser invokes
`YYLLOC_DEFAULT' recursively to compute the location of that ambiguity.

   Most of the time, this macro is general enough to eliminate
location-specific code from semantic actions.

   The `YYLLOC_DEFAULT' macro takes three parameters. The first one is
the location of the grouping (the result of the computation). When a
rule is matched, the second parameter identifies locations of all right
hand side elements of the rule being matched, and the third parameter
is the size of the rule's right hand side. When a GLR parser reports
an ambiguity, which of multiple candidate right hand sides it passes to
`YYLLOC_DEFAULT' is undefined. When processing a syntax error, the
second parameter identifies locations of the symbols that were
discarded during error processing, and the third parameter is the
number of discarded symbols.

   By default, `YYLLOC_DEFAULT' is defined this way:

     # define YYLLOC_DEFAULT(Cur, Rhs, N)                      \
     do                                                        \
       if (N)                                                  \
         {                                                     \
           (Cur).first_line   = YYRHSLOC(Rhs, 1).first_line;   \
           (Cur).first_column = YYRHSLOC(Rhs, 1).first_column; \
           (Cur).last_line    = YYRHSLOC(Rhs, N).last_line;    \
           (Cur).last_column  = YYRHSLOC(Rhs, N).last_column;  \
         }                                                     \
       else                                                    \
         {                                                     \
           (Cur).first_line   = (Cur).last_line   =            \
             YYRHSLOC(Rhs, 0).last_line;                       \
           (Cur).first_column = (Cur).last_column =            \
             YYRHSLOC(Rhs, 0).last_column;                     \
         }                                                     \
     while (0)

where `YYRHSLOC (rhs, k)' is the location of the Kth symbol in RHS when
K is positive, and the location of the symbol just before the reduction
when K and N are both zero.

   When defining `YYLLOC_DEFAULT', you should consider that:

   * All arguments are free of side-effects. However, only the first
     one (the result) should be modified by `YYLLOC_DEFAULT'.

   * For consistency with semantic actions, valid indexes within the
     right hand side range from 1 to N. When N is zero, only 0 is a
     valid index, and it refers to the symbol just before the reduction.
     During error processing N is always positive.

   * Your macro should parenthesize its arguments, if need be, since the
     actual arguments may not be surrounded by parentheses. Also, your
     macro should expand to something that can be used as a single
     statement when it is followed by a semicolon.


File: bison.info, Node: Named References, Next: Declarations, Prev: Tracking Locations, Up: Grammar File

3.7 Named References
====================

As described in the preceding sections, the traditional way to refer to
any semantic value or location is a "positional reference", which takes
the form `$N', `$$', `@N', and `@$'. However, such a reference is not
very descriptive. Moreover, if you later decide to insert or remove
symbols in the right-hand side of a grammar rule, the need to renumber
such references can be tedious and error-prone.

   To avoid these issues, you can also refer to a semantic value or
location using a "named reference". First of all, original symbol
names may be used as named references. For example:

     invocation: op '(' args ')'
       { $invocation = new_invocation ($op, $args, @invocation); }

Positional and named references can be mixed arbitrarily. For example:

     invocation: op '(' args ')'
       { $$ = new_invocation ($op, $args, @$); }

However, sometimes regular symbol names are not sufficient due to
ambiguities:

     exp: exp '/' exp
       { $exp = $exp / $exp; } // $exp is ambiguous.

     exp: exp '/' exp
       { $$ = $1 / $exp; } // One usage is ambiguous.

     exp: exp '/' exp
       { $$ = $1 / $3; } // No error.

When ambiguity occurs, explicitly declared names may be used for values
and locations. Explicit names are declared as a bracketed name after a
symbol appearance in rule definitions. For example:

     exp[result]: exp[left] '/' exp[right]
       { $result = $left / $right; }

In order to access a semantic value generated by a mid-rule action, an
explicit name may also be declared by putting a bracketed name after the
closing brace of the mid-rule action code:

     exp[res]: exp[x] '+' {$left = $x;}[left] exp[right]
       { $res = $left + $right; }

In references, in order to specify names containing dots and dashes, an
explicit bracketed syntax `$[name]' and `@[name]' must be used:

     if-stmt: "if" '(' expr ')' "then" then.stmt ';'
       { $[if-stmt] = new_if_stmt ($expr, $[then.stmt]); }

   It often happens that named references are followed by a dot, dash
or other C punctuation marks and operators. By default, Bison will read
`$name.suffix' as a reference to symbol value `$name' followed by
`.suffix', i.e., an access to the `suffix' field of the semantic value.
In order to force Bison to recognize `name.suffix' in its entirety as
the name of a semantic value, the bracketed syntax `$[name.suffix]'
must be used.

   The named references feature is experimental. More user feedback
will help to stabilize it.


File: bison.info, Node: Declarations, Next: Multiple Parsers, Prev: Named References, Up: Grammar File

3.8 Bison Declarations
======================

The "Bison declarations" section of a Bison grammar defines the symbols
used in formulating the grammar and the data types of semantic values.
*Note Symbols::.

   All token type names (but not single-character literal tokens such as
`'+'' and `'*'') must be declared.
Nonterminal symbols must be
declared if you need to specify which data type to use for the semantic
value (*note More Than One Value Type: Multiple Types.).

   The first rule in the grammar file also specifies the start symbol,
by default. If you want some other symbol to be the start symbol, you
must declare it explicitly (*note Languages and Context-Free Grammars:
Language and Grammar.).

* Menu:

* Require Decl::      Requiring a Bison version.
* Token Decl::        Declaring terminal symbols.
* Precedence Decl::   Declaring terminals with precedence and associativity.
* Union Decl::        Declaring the set of all semantic value types.
* Type Decl::         Declaring the choice of type for a nonterminal symbol.
* Initial Action Decl::  Code run before parsing starts.
* Destructor Decl::   Declaring how symbols are freed.
* Printer Decl::      Declaring how symbol values are displayed.
* Expect Decl::       Suppressing warnings about parsing conflicts.
* Start Decl::        Specifying the start symbol.
* Pure Decl::         Requesting a reentrant parser.
* Push Decl::         Requesting a push parser.
* Decl Summary::      Table of all Bison declarations.
* %define Summary::   Defining variables to adjust Bison's behavior.
* %code Summary::     Inserting code into the parser source.


File: bison.info, Node: Require Decl, Next: Token Decl, Up: Declarations

3.8.1 Require a Version of Bison
--------------------------------

You may require a minimum version of Bison to process the grammar. If
the requirement is not met, `bison' exits with an error (exit status
63).

     %require "VERSION"


File: bison.info, Node: Token Decl, Next: Precedence Decl, Prev: Require Decl, Up: Declarations

3.8.2 Token Type Names
----------------------

The basic way to declare a token type name (terminal symbol) is as
follows:

     %token NAME

   Bison will convert this into a `#define' directive in the parser, so
that the function `yylex' (if it is in this file) can use the name NAME
to stand for this token type's code.

   Alternatively, you can use `%left', `%right', or `%nonassoc' instead
of `%token', if you wish to specify associativity and precedence.
*Note Operator Precedence: Precedence Decl.

   You can explicitly specify the numeric code for a token type by
appending a nonnegative decimal or hexadecimal integer value in the
field immediately following the token name:

     %token NUM 300
     %token XNUM 0x12d // a GNU extension

It is generally best, however, to let Bison choose the numeric codes for
all token types. Bison will automatically select codes that don't
conflict with each other or with normal characters.

   In the event that the stack type is a union, you must augment the
`%token' or other token declaration to include the data type
alternative delimited by angle-brackets (*note More Than One Value
Type: Multiple Types.).

   For example:

     %union {              /* define stack type */
       double val;
       symrec *tptr;
     }
     %token <val> NUM      /* define token NUM and its type */

   You can associate a literal string token with a token type name by
writing the literal string at the end of a `%token' declaration which
declares the name.
For example:

     %token arrow "=>"

As a further example, a grammar for the C language might specify these
names with equivalent literal string tokens:

     %token <operator> OR "||"
     %token <operator> LE 134 "<="
     %left OR "<="

Once you equate the literal string and the token name, you can use them
interchangeably in further declarations or the grammar rules. The
`yylex' function can use the token name or the literal string to obtain
the token type code number (*note Calling Convention::). Syntax error
messages passed to `yyerror' from the parser will reference the literal
string instead of the token name.

   The token numbered as 0 corresponds to end of file; the following
line allows for nicer error messages referring to "end of file" instead
of "$end":

     %token END 0 "end of file"


File: bison.info, Node: Precedence Decl, Next: Union Decl, Prev: Token Decl, Up: Declarations

3.8.3 Operator Precedence
-------------------------

Use the `%left', `%right' or `%nonassoc' declaration to declare a token
and specify its precedence and associativity, all at once. These are
called "precedence declarations". *Note Operator Precedence:
Precedence, for general information on operator precedence.

   The syntax of a precedence declaration is nearly the same as that of
`%token': either

     %left SYMBOLS...

or

     %left <TYPE> SYMBOLS...

   And indeed any of these declarations serves the purposes of `%token'.
But in addition, they specify the associativity and relative precedence
for all the SYMBOLS:

   * The associativity of an operator OP determines how repeated uses
     of the operator nest: whether `X OP Y OP Z' is parsed by grouping
     X with Y first or by grouping Y with Z first.
     `%left' specifies
     left-associativity (grouping X with Y first) and `%right'
     specifies right-associativity (grouping Y with Z first).
     `%nonassoc' specifies no associativity, which means that `X OP Y
     OP Z' is considered a syntax error.

   * The precedence of an operator determines how it nests with other
     operators. All the tokens declared in a single precedence
     declaration have equal precedence and nest together according to
     their associativity. When two tokens declared in different
     precedence declarations associate, the one declared later has the
     higher precedence and is grouped first.

   For backward compatibility, there is a confusing difference between
the argument lists of `%token' and precedence declarations. Only a
`%token' can associate a literal string with a token type name. A
precedence declaration always interprets a literal string as a
reference to a separate token. For example:

     %left OR "<="         // Does not declare an alias.
     %left OR 134 "<=" 135 // Declares 134 for OR and 135 for "<=".


File: bison.info, Node: Union Decl, Next: Type Decl, Prev: Precedence Decl, Up: Declarations

3.8.4 The Collection of Value Types
-----------------------------------

The `%union' declaration specifies the entire collection of possible
data types for semantic values. The keyword `%union' is followed by
braced code containing the same thing that goes inside a `union' in C.

   For example:

     %union {
       double val;
       symrec *tptr;
     }

This says that the two alternative types are `double' and `symrec *'.
They are given names `val' and `tptr'; these names are used in the
`%token' and `%type' declarations to pick one of the types for a
terminal or nonterminal symbol (*note Nonterminal Symbols: Type Decl.).

   As an extension to POSIX, a tag is allowed after the `union'.
For
example:

     %union value {
       double val;
       symrec *tptr;
     }

specifies the union tag `value', so the corresponding C type is `union
value'. If you do not specify a tag, it defaults to `YYSTYPE'.

   As another extension to POSIX, you may specify multiple `%union'
declarations; their contents are concatenated. However, only the first
`%union' declaration can specify a tag.

   Note that, unlike making a `union' declaration in C, you need not
write a semicolon after the closing brace.

   Instead of `%union', you can define and use your own union type
`YYSTYPE' if your grammar contains at least one `<TYPE>' tag. For
example, you can put the following into a header file `parser.h':

     union YYSTYPE {
       double val;
       symrec *tptr;
     };
     typedef union YYSTYPE YYSTYPE;

and then your grammar can use the following instead of `%union':

     %{
     #include "parser.h"
     %}
     %type <val> expr
     %token <tptr> ID


File: bison.info, Node: Type Decl, Next: Initial Action Decl, Prev: Union Decl, Up: Declarations

3.8.5 Nonterminal Symbols
-------------------------

When you use `%union' to specify multiple value types, you must declare
the value type of each nonterminal symbol for which values are used.
This is done with a `%type' declaration, like this:

     %type <TYPE> NONTERMINAL...

Here NONTERMINAL is the name of a nonterminal symbol, and TYPE is the
name given in the `%union' to the alternative that you want (*note The
Collection of Value Types: Union Decl.). You can give any number of
nonterminal symbols in the same `%type' declaration, if they have the
same value type. Use spaces to separate the symbol names.

   You can also declare the value type of a terminal symbol. To do
this, use the same `<TYPE>' construction in a declaration for the
terminal symbol.
All kinds of token declarations allow `<TYPE>'.


File: bison.info, Node: Initial Action Decl, Next: Destructor Decl, Prev: Type Decl, Up: Declarations

3.8.6 Performing Actions before Parsing
---------------------------------------

Sometimes your parser needs to perform some initializations before
parsing. The `%initial-action' directive allows for such arbitrary
code.

 -- Directive: %initial-action { CODE }
     Declare that the braced CODE must be invoked before parsing each
     time `yyparse' is called. The CODE may use `$$' (or `$<TAG>$')
     and `@$' -- initial value and location of the lookahead -- and the
     `%parse-param'.

   For instance, if your locations use a file name, you may use

     %parse-param { char const *file_name };
     %initial-action
     {
       @$.initialize (file_name);
     };


File: bison.info, Node: Destructor Decl, Next: Printer Decl, Prev: Initial Action Decl, Up: Declarations

3.8.7 Freeing Discarded Symbols
-------------------------------

During error recovery (*note Error Recovery::), symbols already pushed
on the stack and tokens coming from the rest of the file are discarded
until the parser recovers. If the parser runs out of memory,
or if it returns via `YYABORT' or `YYACCEPT', all the symbols on the
stack must be discarded. Even if the parser succeeds, it must discard
the start symbol.

   When discarded symbols convey heap-based information, this memory is
lost. While this behavior can be tolerable for batch parsers, such as
in traditional compilers, it is unacceptable for programs like shells or
protocol implementations that may parse and execute indefinitely.

   The `%destructor' directive defines code that is called when a
symbol is automatically discarded.
4777 4778 -- Directive: %destructor { CODE } SYMBOLS 4779 Invoke the braced CODE whenever the parser discards one of the 4780 SYMBOLS. Within CODE, `$$' (or `$<TAG>$') designates the semantic 4781 value associated with the discarded symbol, and `@$' designates 4782 its location. The additional parser parameters are also available 4783 (*note The Parser Function `yyparse': Parser Function.). 4784 4785 When a symbol is listed among SYMBOLS, its `%destructor' is called 4786 a per-symbol `%destructor'. You may also define a per-type 4787 `%destructor' by listing a semantic type tag among SYMBOLS. In 4788 that case, the parser will invoke this CODE whenever it discards 4789 any grammar symbol that has that semantic type tag unless that 4790 symbol has its own per-symbol `%destructor'. 4791 4792 Finally, you can define two different kinds of default 4793 `%destructor's. (These default forms are experimental. More user 4794 feedback will help to determine whether they should become 4795 permanent features.) You can place each of `<*>' and `<>' in the 4796 SYMBOLS list of exactly one `%destructor' declaration in your 4797 grammar file. The parser will invoke the CODE associated with one 4798 of these whenever it discards any user-defined grammar symbol that 4799 has no per-symbol and no per-type `%destructor'. The parser uses 4800 the CODE for `<*>' in the case of such a grammar symbol for which 4801 you have formally declared a semantic type tag (`%type' counts as 4802 such a declaration, but `$<tag>$' does not). The parser uses the 4803 CODE for `<>' in the case of such a grammar symbol that has no 4804 declared semantic type tag. 
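
   The precedence among the four kinds of `%destructor' described above
can be summarized in a short C sketch.  It is only a model of the stated
rules, not code that Bison generates: `symbol_info', `grammar_info', and
`select_destructor' are hypothetical names introduced for illustration.
The sketch also assumes that a per-symbol or per-type `%destructor' is
checked before the user-defined test, since the text restricts only the
default forms to user-defined symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the lookup order described above; none of
   these names exist in a Bison-generated parser.  */
enum destructor_kind
{
  DESTRUCTOR_NONE,            /* nothing applies */
  DESTRUCTOR_PER_SYMBOL,      /* %destructor { ... } STRING1 */
  DESTRUCTOR_PER_TYPE,        /* %destructor { ... } <string> */
  DESTRUCTOR_TYPED_DEFAULT,   /* %destructor { ... } <*> */
  DESTRUCTOR_UNTYPED_DEFAULT  /* %destructor { ... } <> */
};

struct symbol_info
{
  bool user_defined;    /* false for Bison-defined symbols */
  bool has_type_tag;    /* tag declared with %token <TAG> or %type <TAG> */
  bool has_per_symbol;  /* the symbol itself is listed in a %destructor */
  bool has_per_type;    /* the symbol's tag is listed in a %destructor */
};

struct grammar_info
{
  bool has_typed_default;    /* some %destructor lists <*> */
  bool has_untyped_default;  /* some %destructor lists <> */
};

/* Return the kind of %destructor that applies when SYM is discarded.  */
enum destructor_kind
select_destructor (const struct symbol_info *sym,
                   const struct grammar_info *grammar)
{
  assert (sym != NULL && grammar != NULL);

  /* A per-symbol %destructor overrides everything else.  */
  if (sym->has_per_symbol)
    return DESTRUCTOR_PER_SYMBOL;

  /* Next comes a %destructor declared for the symbol's type tag.  */
  if (sym->has_type_tag && sym->has_per_type)
    return DESTRUCTOR_PER_TYPE;

  /* The default forms apply only to user-defined symbols.  */
  if (!sym->user_defined)
    return DESTRUCTOR_NONE;

  /* <*> covers tagged symbols; <> covers untagged ones.  */
  if (sym->has_type_tag)
    return (grammar->has_typed_default
            ? DESTRUCTOR_TYPED_DEFAULT : DESTRUCTOR_NONE);
  return (grammar->has_untyped_default
          ? DESTRUCTOR_UNTYPED_DEFAULT : DESTRUCTOR_NONE);
}
```

   In this model, a symbol with both a per-symbol `%destructor' and a
matching `<*>' default selects only the per-symbol one, so the two are
never both run for the same discarded symbol.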

For example:

     %union { char *string; }
     %token <string> STRING1
     %token <string> STRING2
     %type  <string> string1
     %type  <string> string2
     %union { char character; }
     %token <character> CHR
     %type  <character> chr
     %token TAGLESS

     %destructor { } <character>
     %destructor { free ($$); } <*>
     %destructor { free ($$); printf ("%d", @$.first_line); } STRING1 string1
     %destructor { printf ("Discarding tagless symbol.\n"); } <>

guarantees that, when the parser discards any user-defined symbol that
has a semantic type tag other than `<character>', it passes its
semantic value to `free' by default.  However, when the parser discards
a `STRING1' or a `string1', it also prints its line number to `stdout'.
It performs only the per-symbol `%destructor' in this case, so it
invokes `free' only once.  Finally, the parser merely prints a message
whenever it discards any symbol, such as `TAGLESS', that has no semantic
type tag.

   A Bison-generated parser invokes the default `%destructor's only for
user-defined as opposed to Bison-defined symbols.  For example, the
parser will not invoke either kind of default `%destructor' for the
special Bison-defined symbols `$accept', `$undefined', or `$end' (*note
Bison Symbols: Table of Symbols.), none of which you can reference in
your grammar.  It also will not invoke either for the `error' token
(*note error: Table of Symbols.), which is always defined by Bison
regardless of whether you reference it in your grammar.  However, it
may invoke one of them for the end token (token 0) if you redefine it
from `$end' to, for example, `END':

     %token END 0

   Finally, Bison will never invoke a `%destructor' for an unreferenced
mid-rule semantic value (*note Actions in Mid-Rule: Mid-Rule Actions.).
That is, Bison does not consider a mid-rule to have a semantic value if
you do not reference `$$' in the mid-rule's action or `$N' (where N is
the right-hand side symbol position of the mid-rule) in any later
action in that rule.  However, if you do reference either, the
Bison-generated parser will invoke the `<>' `%destructor' whenever it
discards the mid-rule symbol.


   "Discarded symbols" are the following:

   * stacked symbols popped during the first phase of error recovery,

   * incoming terminals during the second phase of error recovery,

   * the current lookahead and the entire stack (except the current
     right-hand side symbols) when the parser returns immediately,

   * the current lookahead and the entire stack (including the current
     right-hand side symbols) when the C++ parser (`lalr1.cc') catches
     an exception in `parse', and

   * the start symbol, when the parser succeeds.

   The parser can "return immediately" because of an explicit call to
`YYABORT' or `YYACCEPT', or failed error recovery, or memory exhaustion.

   Right-hand side symbols of a rule that explicitly triggers a syntax
error via `YYERROR' are not discarded automatically.  As a rule of
thumb, destructors are invoked only when user actions cannot manage the
memory.


File: bison.info, Node: Printer Decl, Next: Expect Decl, Prev: Destructor Decl, Up: Declarations

3.8.8 Printing Semantic Values
------------------------------

When run-time traces are enabled (*note Tracing Your Parser: Tracing.),
the parser reports its actions, such as reductions.  When a symbol
involved in an action is reported, only its kind is displayed, as the
parser cannot know how semantic values should be formatted.

   The `%printer' directive defines code that is called when a symbol is
reported.
Its syntax is the same as `%destructor' (*note Freeing
Discarded Symbols: Destructor Decl.).

 -- Directive: %printer { CODE } SYMBOLS
     Invoke the braced CODE whenever the parser displays one of the
     SYMBOLS.  Within CODE, `yyoutput' denotes the output stream (a
     `FILE*' in C, and an `std::ostream&' in C++), `$$' (or `$<TAG>$')
     designates the semantic value associated with the symbol, and `@$'
     its location.  The additional parser parameters are also available
     (*note The Parser Function `yyparse': Parser Function.).

     The SYMBOLS are defined as for `%destructor' (*note Freeing
     Discarded Symbols: Destructor Decl.): they can be per-type (e.g.,
     `<ival>'), per-symbol (e.g., `exp', `NUM', `"float"'), typed
     per-default (i.e., `<*>'), or untyped per-default (i.e., `<>').

For example:

     %union { char *string; }
     %token <string> STRING1
     %token <string> STRING2
     %type  <string> string1
     %type  <string> string2
     %union { char character; }
     %token <character> CHR
     %type  <character> chr
     %token TAGLESS

     %printer { fprintf (yyoutput, "'%c'", $$); } <character>
     %printer { fprintf (yyoutput, "&%p", $$); } <*>
     %printer { fprintf (yyoutput, "\"%s\"", $$); } STRING1 string1
     %printer { fprintf (yyoutput, "<>"); } <>

guarantees that, when the parser prints any symbol that has a semantic
type tag other than `<character>', it displays the address of the
semantic value by default.  However, when the parser displays a
`STRING1' or a `string1', it formats it as a string in double quotes.
It performs only the per-symbol `%printer' in this case, so it prints
only once.  Finally, the parser prints `<>' for any symbol, such as
`TAGLESS', that has no semantic type tag.
See also 4930 4931 4932File: bison.info, Node: Expect Decl, Next: Start Decl, Prev: Printer Decl, Up: Declarations 4933 49343.8.9 Suppressing Conflict Warnings 4935----------------------------------- 4936 4937Bison normally warns if there are any conflicts in the grammar (*note 4938Shift/Reduce Conflicts: Shift/Reduce.), but most real grammars have 4939harmless shift/reduce conflicts which are resolved in a predictable way 4940and would be difficult to eliminate. It is desirable to suppress the 4941warning about these conflicts unless the number of conflicts changes. 4942You can do this with the `%expect' declaration. 4943 4944 The declaration looks like this: 4945 4946 %expect N 4947 4948 Here N is a decimal integer. The declaration says there should be N 4949shift/reduce conflicts and no reduce/reduce conflicts. Bison reports 4950an error if the number of shift/reduce conflicts differs from N, or if 4951there are any reduce/reduce conflicts. 4952 4953 For deterministic parsers, reduce/reduce conflicts are more serious, 4954and should be eliminated entirely. Bison will always report 4955reduce/reduce conflicts for these parsers. With GLR parsers, however, 4956both kinds of conflicts are routine; otherwise, there would be no need 4957to use GLR parsing. Therefore, it is also possible to specify an 4958expected number of reduce/reduce conflicts in GLR parsers, using the 4959declaration: 4960 4961 %expect-rr N 4962 4963 In general, using `%expect' involves these steps: 4964 4965 * Compile your grammar without `%expect'. Use the `-v' option to 4966 get a verbose list of where the conflicts occur. Bison will also 4967 print the number of conflicts. 4968 4969 * Check each of the conflicts to make sure that Bison's default 4970 resolution is what you really want. If not, rewrite the grammar 4971 and go back to the beginning. 4972 4973 * Add an `%expect' declaration, copying the number N from the number 4974 which Bison printed. 
With GLR parsers, add an `%expect-rr' 4975 declaration as well. 4976 4977 Now Bison will report an error if you introduce an unexpected 4978conflict, but will keep silent otherwise. 4979 4980 4981File: bison.info, Node: Start Decl, Next: Pure Decl, Prev: Expect Decl, Up: Declarations 4982 49833.8.10 The Start-Symbol 4984----------------------- 4985 4986Bison assumes by default that the start symbol for the grammar is the 4987first nonterminal specified in the grammar specification section. The 4988programmer may override this restriction with the `%start' declaration 4989as follows: 4990 4991 %start SYMBOL 4992 4993 4994File: bison.info, Node: Pure Decl, Next: Push Decl, Prev: Start Decl, Up: Declarations 4995 49963.8.11 A Pure (Reentrant) Parser 4997-------------------------------- 4998 4999A "reentrant" program is one which does not alter in the course of 5000execution; in other words, it consists entirely of "pure" (read-only) 5001code. Reentrancy is important whenever asynchronous execution is 5002possible; for example, a nonreentrant program may not be safe to call 5003from a signal handler. In systems with multiple threads of control, a 5004nonreentrant program must be called only within interlocks. 5005 5006 Normally, Bison generates a parser which is not reentrant. This is 5007suitable for most uses, and it permits compatibility with Yacc. (The 5008standard Yacc interfaces are inherently nonreentrant, because they use 5009statically allocated variables for communication with `yylex', 5010including `yylval' and `yylloc'.) 5011 5012 Alternatively, you can generate a pure, reentrant parser. The Bison 5013declaration `%define api.pure' says that you want the parser to be 5014reentrant. It looks like this: 5015 5016 %define api.pure full 5017 5018 The result is that the communication variables `yylval' and `yylloc' 5019become local variables in `yyparse', and a different calling convention 5020is used for the lexical analyzer function `yylex'. 
*Note Calling 5021Conventions for Pure Parsers: Pure Calling, for the details of this. 5022The variable `yynerrs' becomes local in `yyparse' in pull mode but it 5023becomes a member of yypstate in push mode. (*note The Error Reporting 5024Function `yyerror': Error Reporting.). The convention for calling 5025`yyparse' itself is unchanged. 5026 5027 Whether the parser is pure has nothing to do with the grammar rules. 5028You can generate either a pure parser or a nonreentrant parser from any 5029valid grammar. 5030 5031 5032File: bison.info, Node: Push Decl, Next: Decl Summary, Prev: Pure Decl, Up: Declarations 5033 50343.8.12 A Push Parser 5035-------------------- 5036 5037(The current push parsing interface is experimental and may evolve. 5038More user feedback will help to stabilize it.) 5039 5040 A pull parser is called once and it takes control until all its input 5041is completely parsed. A push parser, on the other hand, is called each 5042time a new token is made available. 5043 5044 A push parser is typically useful when the parser is part of a main 5045event loop in the client's application. This is typically a 5046requirement of a GUI, when the main event loop needs to be triggered 5047within a certain time period. 5048 5049 Normally, Bison generates a pull parser. The following Bison 5050declaration says that you want the parser to be a push parser (*note 5051api.push-pull: %define Summary.): 5052 5053 %define api.push-pull push 5054 5055 In almost all cases, you want to ensure that your push parser is also 5056a pure parser (*note A Pure (Reentrant) Parser: Pure Decl.). The only 5057time you should create an impure push parser is to have backwards 5058compatibility with the impure Yacc pull mode interface. 
Unless you know 5059what you are doing, your declarations should look like this: 5060 5061 %define api.pure full 5062 %define api.push-pull push 5063 5064 There is a major notable functional difference between the pure push 5065parser and the impure push parser. It is acceptable for a pure push 5066parser to have many parser instances, of the same type of parser, in 5067memory at the same time. An impure push parser should only use one 5068parser at a time. 5069 5070 When a push parser is selected, Bison will generate some new symbols 5071in the generated parser. `yypstate' is a structure that the generated 5072parser uses to store the parser's state. `yypstate_new' is the 5073function that will create a new parser instance. `yypstate_delete' 5074will free the resources associated with the corresponding parser 5075instance. Finally, `yypush_parse' is the function that should be 5076called whenever a token is available to provide the parser. A trivial 5077example of using a pure push parser would look like this: 5078 5079 int status; 5080 yypstate *ps = yypstate_new (); 5081 do { 5082 status = yypush_parse (ps, yylex (), NULL); 5083 } while (status == YYPUSH_MORE); 5084 yypstate_delete (ps); 5085 5086 If the user decided to use an impure push parser, a few things about 5087the generated parser will change. The `yychar' variable becomes a 5088global variable instead of a variable in the `yypush_parse' function. 5089For this reason, the signature of the `yypush_parse' function is 5090changed to remove the token as a parameter. A nonreentrant push parser 5091example would thus look like this: 5092 5093 extern int yychar; 5094 int status; 5095 yypstate *ps = yypstate_new (); 5096 do { 5097 yychar = yylex (); 5098 status = yypush_parse (ps); 5099 } while (status == YYPUSH_MORE); 5100 yypstate_delete (ps); 5101 5102 That's it. Notice the next token is put into the global variable 5103`yychar' for use by the next invocation of the `yypush_parse' function. 
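
   The inversion of control behind a push parser, and the reason a pure
push parser can have several instances alive at once, can be sketched
with a small hand-written stand-in.  The code below is not generated by
Bison and does not use its API: `toy_pstate', `toy_push_parse', and the
`TOY_' constants are hypothetical names that only mirror the shape of
`yypstate' and `yypush_parse'.  The toy parser accepts the token
sequence NUM ('+' NUM)* END and keeps a running sum as its semantic
value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical token kinds and return codes for the toy parser.  */
enum toy_token { TOY_END, TOY_NUM, TOY_PLUS };
enum toy_status { TOY_ACCEPT, TOY_ERROR, TOY_PUSH_MORE };

/* All parser state lives in this structure, so every instance is
   independent of the others (no global variables are used).  */
struct toy_pstate
{
  int expect_num;  /* nonzero when the next token must be TOY_NUM */
  long sum;        /* running semantic value */
};

/* Create a parser instance; returns NULL if memory is exhausted.  */
struct toy_pstate *
toy_pstate_new (void)
{
  struct toy_pstate *ps = malloc (sizeof *ps);
  if (ps != NULL)
    {
      ps->expect_num = 1;
      ps->sum = 0;
    }
  return ps;
}

/* Release a parser instance.  */
void
toy_pstate_delete (struct toy_pstate *ps)
{
  free (ps);
}

/* Consume exactly one token and return to the caller, which keeps
   control of the main loop.  VAL is used only for TOY_NUM.  */
enum toy_status
toy_push_parse (struct toy_pstate *ps, enum toy_token tok, long val)
{
  assert (ps != NULL);

  if (ps->expect_num)
    {
      if (tok != TOY_NUM)
        return TOY_ERROR;
      ps->sum += val;
      ps->expect_num = 0;
      return TOY_PUSH_MORE;
    }
  if (tok == TOY_PLUS)
    {
      ps->expect_num = 1;
      return TOY_PUSH_MORE;
    }
  return tok == TOY_END ? TOY_ACCEPT : TOY_ERROR;
}
```

   Because each call handles one token and returns, a caller may
interleave tokens for two instances created with `toy_pstate_new', for
example from two event sources, without either instance disturbing the
other.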

   Bison also supports both the push parser interface and the pull
parser interface in the same generated parser.  In order to get this
functionality, you should replace the `%define api.push-pull push'
declaration with the `%define api.push-pull both' declaration.  Doing
this will create all of the symbols mentioned earlier along with the
two extra symbols, `yyparse' and `yypull_parse'.  `yyparse' can be used
exactly as it normally would be used.  However, the user should note
that it is implemented in the generated parser by calling
`yypull_parse'.  This makes the `yyparse' function that is generated
with the `%define api.push-pull both' declaration slower than the normal
`yyparse' function.  If the user calls the `yypull_parse' function it
will parse the rest of the input stream.  It is possible to
`yypush_parse' tokens to select a subgrammar and then `yypull_parse'
the rest of the input stream.  If you would like to switch back and
forth between parsing styles, you would have to write your own
`yypull_parse' function that knows when to quit looking for input.  An
example of using the `yypull_parse' function would look like this:

     yypstate *ps = yypstate_new ();
     yypull_parse (ps); /* Will call the lexer */
     yypstate_delete (ps);

   Adding the `%define api.pure full' declaration does exactly the same
thing to the generated parser with `%define api.push-pull both' as it
did for `%define api.push-pull push'.


File: bison.info, Node: Decl Summary, Next: %define Summary, Prev: Push Decl, Up: Declarations

3.8.13 Bison Declaration Summary
--------------------------------

Here is a summary of the declarations used to define a grammar:

 -- Directive: %union
     Declare the collection of data types that semantic values may have
     (*note The Collection of Value Types: Union Decl.).
5142 5143 -- Directive: %token 5144 Declare a terminal symbol (token type name) with no precedence or 5145 associativity specified (*note Token Type Names: Token Decl.). 5146 5147 -- Directive: %right 5148 Declare a terminal symbol (token type name) that is 5149 right-associative (*note Operator Precedence: Precedence Decl.). 5150 5151 -- Directive: %left 5152 Declare a terminal symbol (token type name) that is 5153 left-associative (*note Operator Precedence: Precedence Decl.). 5154 5155 -- Directive: %nonassoc 5156 Declare a terminal symbol (token type name) that is nonassociative 5157 (*note Operator Precedence: Precedence Decl.). Using it in a way 5158 that would be associative is a syntax error. 5159 5160 -- Directive: %type 5161 Declare the type of semantic values for a nonterminal symbol 5162 (*note Nonterminal Symbols: Type Decl.). 5163 5164 -- Directive: %start 5165 Specify the grammar's start symbol (*note The Start-Symbol: Start 5166 Decl.). 5167 5168 -- Directive: %expect 5169 Declare the expected number of shift-reduce conflicts (*note 5170 Suppressing Conflict Warnings: Expect Decl.). 5171 5172 5173In order to change the behavior of `bison', use the following 5174directives: 5175 5176 -- Directive: %code {CODE} 5177 -- Directive: %code QUALIFIER {CODE} 5178 Insert CODE verbatim into the output parser source at the default 5179 location or at the location specified by QUALIFIER. *Note %code 5180 Summary::. 5181 5182 -- Directive: %debug 5183 In the parser implementation file, define the macro `YYDEBUG' (or 5184 `PREFIXDEBUG' with `%define api.prefix PREFIX', see *note Multiple 5185 Parsers in the Same Program: Multiple Parsers.) to 1 if it is not 5186 already defined, so that the debugging facilities are compiled. 5187 *Note Tracing Your Parser: Tracing. 5188 5189 -- Directive: %define VARIABLE 5190 -- Directive: %define VARIABLE VALUE 5191 -- Directive: %define VARIABLE "VALUE" 5192 Define a variable to adjust Bison's behavior. 
*Note %define 5193 Summary::. 5194 5195 -- Directive: %defines 5196 Write a parser header file containing macro definitions for the 5197 token type names defined in the grammar as well as a few other 5198 declarations. If the parser implementation file is named `NAME.c' 5199 then the parser header file is named `NAME.h'. 5200 5201 For C parsers, the parser header file declares `YYSTYPE' unless 5202 `YYSTYPE' is already defined as a macro or you have used a 5203 `<TYPE>' tag without using `%union'. Therefore, if you are using 5204 a `%union' (*note More Than One Value Type: Multiple Types.) with 5205 components that require other definitions, or if you have defined 5206 a `YYSTYPE' macro or type definition (*note Data Types of Semantic 5207 Values: Value Type.), you need to arrange for these definitions to 5208 be propagated to all modules, e.g., by putting them in a 5209 prerequisite header that is included both by your parser and by any 5210 other module that needs `YYSTYPE'. 5211 5212 Unless your parser is pure, the parser header file declares 5213 `yylval' as an external variable. *Note A Pure (Reentrant) 5214 Parser: Pure Decl. 5215 5216 If you have also used locations, the parser header file declares 5217 `YYLTYPE' and `yylloc' using a protocol similar to that of the 5218 `YYSTYPE' macro and `yylval'. *Note Tracking Locations::. 5219 5220 This parser header file is normally essential if you wish to put 5221 the definition of `yylex' in a separate source file, because 5222 `yylex' typically needs to be able to refer to the above-mentioned 5223 declarations and to the token type codes. *Note Semantic Values 5224 of Tokens: Token Values. 5225 5226 If you have declared `%code requires' or `%code provides', the 5227 output header also contains their code. *Note %code Summary::. 
5228 5229 The generated header is protected against multiple inclusions with 5230 a C preprocessor guard: `YY_PREFIX_FILE_INCLUDED', where PREFIX 5231 and FILE are the prefix (*note Multiple Parsers in the Same 5232 Program: Multiple Parsers.) and generated file name turned 5233 uppercase, with each series of non alphanumerical characters 5234 converted to a single underscore. 5235 5236 For instance with `%define api.prefix "calc"' and `%defines 5237 "lib/parse.h"', the header will be guarded as follows. 5238 #ifndef YY_CALC_LIB_PARSE_H_INCLUDED 5239 # define YY_CALC_LIB_PARSE_H_INCLUDED 5240 ... 5241 #endif /* ! YY_CALC_LIB_PARSE_H_INCLUDED */ 5242 5243 -- Directive: %defines DEFINES-FILE 5244 Same as above, but save in the file DEFINES-FILE. 5245 5246 -- Directive: %destructor 5247 Specify how the parser should reclaim the memory associated to 5248 discarded symbols. *Note Freeing Discarded Symbols: Destructor 5249 Decl. 5250 5251 -- Directive: %file-prefix "PREFIX" 5252 Specify a prefix to use for all Bison output file names. The names 5253 are chosen as if the grammar file were named `PREFIX.y'. 5254 5255 -- Directive: %language "LANGUAGE" 5256 Specify the programming language for the generated parser. 5257 Currently supported languages include C, C++, and Java. LANGUAGE 5258 is case-insensitive. 5259 5260 5261 -- Directive: %locations 5262 Generate the code processing the locations (*note Special Features 5263 for Use in Actions: Action Features.). This mode is enabled as 5264 soon as the grammar uses the special `@N' tokens, but if your 5265 grammar does not use it, using `%locations' allows for more 5266 accurate syntax error messages. 5267 5268 -- Directive: %no-lines 5269 Don't generate any `#line' preprocessor commands in the parser 5270 implementation file. 
Ordinarily Bison writes these commands in the 5271 parser implementation file so that the C compiler and debuggers 5272 will associate errors and object code with your source file (the 5273 grammar file). This directive causes them to associate errors 5274 with the parser implementation file, treating it as an independent 5275 source file in its own right. 5276 5277 -- Directive: %output "FILE" 5278 Specify FILE for the parser implementation file. 5279 5280 -- Directive: %pure-parser 5281 Deprecated version of `%define api.pure' (*note api.pure: %define 5282 Summary.), for which Bison is more careful to warn about 5283 unreasonable usage. 5284 5285 -- Directive: %require "VERSION" 5286 Require version VERSION or higher of Bison. *Note Require a 5287 Version of Bison: Require Decl. 5288 5289 -- Directive: %skeleton "FILE" 5290 Specify the skeleton to use. 5291 5292 If FILE does not contain a `/', FILE is the name of a skeleton 5293 file in the Bison installation directory. If it does, FILE is an 5294 absolute file name or a file name relative to the directory of the 5295 grammar file. This is similar to how most shells resolve commands. 5296 5297 -- Directive: %token-table 5298 Generate an array of token names in the parser implementation file. 5299 The name of the array is `yytname'; `yytname[I]' is the name of 5300 the token whose internal Bison token code number is I. The first 5301 three elements of `yytname' correspond to the predefined tokens 5302 `"$end"', `"error"', and `"$undefined"'; after these come the 5303 symbols defined in the grammar file. 5304 5305 The name in the table includes all the characters needed to 5306 represent the token in Bison. For single-character literals and 5307 literal strings, this includes the surrounding quoting characters 5308 and any escape sequences. 
     For example, the Bison single-character
     literal `'+'' corresponds to a three-character name, represented
     in C as `"'+'"'; and the Bison two-character literal string `"\\/"'
     corresponds to a five-character name, represented in C as
     `"\"\\\\/\""'.

     When you specify `%token-table', Bison also generates macro
     definitions for macros `YYNTOKENS', `YYNNTS', `YYNRULES', and
     `YYNSTATES':

    `YYNTOKENS'
          The highest token number, plus one.

    `YYNNTS'
          The number of nonterminal symbols.

    `YYNRULES'
          The number of grammar rules.

    `YYNSTATES'
          The number of parser states (*note Parser States::).

 -- Directive: %verbose
     Write an extra output file containing verbose descriptions of the
     parser states and what is done for each type of lookahead token in
     that state.  *Note Understanding Your Parser: Understanding, for
     more information.

 -- Directive: %yacc
     Pretend the option `--yacc' was given, i.e., imitate Yacc,
     including its naming conventions.  *Note Bison Options::, for more.


File: bison.info, Node: %define Summary, Next: %code Summary, Prev: Decl Summary, Up: Declarations

3.8.14 %define Summary
----------------------

There are many features of Bison's behavior that can be controlled by
assigning the feature a single value.  For historical reasons, some
such features are assigned values by dedicated directives, such as
`%start', which assigns the start symbol.  However, newer such features
are associated with variables, which are assigned by the `%define'
directive:

 -- Directive: %define VARIABLE
 -- Directive: %define VARIABLE VALUE
 -- Directive: %define VARIABLE "VALUE"
     Define VARIABLE to VALUE.

     VALUE must be placed in quotation marks if it contains any
     character other than a letter, underscore, period, or non-initial
     dash or digit.
Omitting `"VALUE"' entirely is always equivalent 5361 to specifying `""'. 5362 5363 It is an error if a VARIABLE is defined by `%define' multiple 5364 times, but see *note -D NAME[=VALUE]: Bison Options. 5365 5366 The rest of this section summarizes variables and values that 5367`%define' accepts. 5368 5369 Some VARIABLEs take Boolean values. In this case, Bison will 5370complain if the variable definition does not meet one of the following 5371four conditions: 5372 5373 1. `VALUE' is `true' 5374 5375 2. `VALUE' is omitted (or `""' is specified). This is equivalent to 5376 `true'. 5377 5378 3. `VALUE' is `false'. 5379 5380 4. VARIABLE is never defined. In this case, Bison selects a default 5381 value. 5382 5383 What VARIABLEs are accepted, as well as their meanings and default 5384values, depend on the selected target language and/or the parser 5385skeleton (*note %language: Decl Summary, *note %skeleton: Decl 5386Summary.). Unaccepted VARIABLEs produce an error. Some of the 5387accepted VARIABLEs are: 5388 5389 * `api.location.type' 5390 5391 * Language(s): C++, Java 5392 5393 * Purpose: Define the location type. *Note User Defined 5394 Location Type::. 5395 5396 * Accepted Values: String 5397 5398 * Default Value: none 5399 5400 * History: introduced in Bison 2.7 5401 5402 * `api.prefix' 5403 5404 * Language(s): All 5405 5406 * Purpose: Rename exported symbols. *Note Multiple Parsers in 5407 the Same Program: Multiple Parsers. 5408 5409 * Accepted Values: String 5410 5411 * Default Value: `yy' 5412 5413 * History: introduced in Bison 2.6 5414 5415 * `api.pure' 5416 5417 * Language(s): C 5418 5419 * Purpose: Request a pure (reentrant) parser program. *Note A 5420 Pure (Reentrant) Parser: Pure Decl. 5421 5422 * Accepted Values: `true', `false', `full' 5423 5424 The value may be omitted: this is equivalent to specifying 5425 `true', as is the case for Boolean values. 5426 5427 When `%define api.pure full' is used, the parser is made 5428 reentrant. 
          This changes the signature for `yylex' (*note Pure
          Calling::), and also that of `yyerror' when the tracking of
          locations has been activated, as shown below.

          The `true' value is very similar to the `full' value; the
          only difference is in the signature of `yyerror' on Yacc
          parsers without `%parse-param', for historical reasons.

          I.e., if `%locations %define api.pure' is passed, then the
          prototypes for `yyerror' are:

               void yyerror (char const *msg);                 // Yacc parsers.
               void yyerror (YYLTYPE *locp, char const *msg);  // GLR parsers.

          But if `%locations %define api.pure %parse-param {int
          *nastiness}' is used, then both parsers have the same
          signature:

               void yyerror (YYLTYPE *llocp, int *nastiness, char const *msg);

          (*note The Error Reporting Function `yyerror': Error
          Reporting.)

        * Default Value: `false'

        * History: the `full' value was introduced in Bison 2.7

   * `api.push-pull'

        * Language(s): C (deterministic parsers only)

        * Purpose: Request a pull parser, a push parser, or both.
          *Note A Push Parser: Push Decl.  (The current push parsing
          interface is experimental and may evolve.  More user feedback
          will help to stabilize it.)

        * Accepted Values: `pull', `push', `both'

        * Default Value: `pull'

   * `lr.default-reductions'

        * Language(s): all

        * Purpose: Specify the kind of states that are permitted to
          contain default reductions.  *Note Default Reductions::.
          (The ability to specify where default reductions should be
          used is experimental.  More user feedback will help to
          stabilize it.)

        * Accepted Values: `most', `consistent', `accepting'

        * Default Value:
             * `accepting' if `lr.type' is `canonical-lr'.

             * `most' otherwise.

   * `lr.keep-unreachable-states'

        * Language(s): all

        * Purpose: Request that Bison allow unreachable parser states to
          remain in the parser tables.  *Note Unreachable States::.

        * Accepted Values: Boolean

        * Default Value: `false'

   * `lr.type'

        * Language(s): all

        * Purpose: Specify the type of parser tables within the LR(1)
          family.  *Note LR Table Construction::.  (This feature is
          experimental.  More user feedback will help to stabilize it.)

        * Accepted Values: `lalr', `ielr', `canonical-lr'

        * Default Value: `lalr'

   * `namespace'

        * Language(s): C++

        * Purpose: Specify the namespace for the parser class.  For
          example, if you specify:

               %define namespace "foo::bar"

          Bison uses `foo::bar' verbatim in references such as:

               foo::bar::parser::semantic_type

          However, to open a namespace, Bison removes any leading `::'
          and then splits on any remaining occurrences:

               namespace foo { namespace bar {
                 class position;
                 class location;
               } }

        * Accepted Values: Any absolute or relative C++ namespace
          reference without a trailing `"::"'.  For example, `"foo"' or
          `"::foo::bar"'.

        * Default Value: The value specified by `%name-prefix', which
          defaults to `yy'.  This usage of `%name-prefix' is for
          backward compatibility and can be confusing since
          `%name-prefix' also specifies the textual prefix for the
          lexical analyzer function.  Thus, if you specify
          `%name-prefix', it is best to also specify `%define
          namespace' so that `%name-prefix' _only_ affects the lexical
          analyzer function.  For example, if you specify:

               %define namespace "foo"
               %name-prefix "bar::"

          The parser namespace is `foo' and `yylex' is referenced as
          `bar::lex'.

   * `parse.lac'

        * Language(s): C (deterministic parsers only)

        * Purpose: Enable LAC (lookahead correction) to improve syntax
          error handling.  *Note LAC::.

        * Accepted Values: `none', `full'

        * Default Value: `none'


File: bison.info,  Node: %code Summary,  Prev: %define Summary,  Up: Declarations

3.8.15 %code Summary
--------------------

The `%code' directive inserts code verbatim into the output parser
source at any of a predefined set of locations.  It thus serves as a
flexible and user-friendly alternative to the traditional Yacc
prologue, `%{CODE%}'.  This section summarizes the functionality of
`%code' for the various target languages supported by Bison.  For a
detailed discussion of how to use `%code' in place of `%{CODE%}' for
C/C++ and why it is advantageous to do so, *note Prologue
Alternatives::.

 -- Directive: %code {CODE}
     This is the unqualified form of the `%code' directive.  It inserts
     CODE verbatim at a language-dependent default location in the
     parser implementation.

     For C/C++, the default location is the parser implementation file
     after the usual contents of the parser header file.  Thus, the
     unqualified form replaces `%{CODE%}' for most purposes.

     For Java, the default location is inside the parser class.

 -- Directive: %code QUALIFIER {CODE}
     This is the qualified form of the `%code' directive.  QUALIFIER
     identifies the purpose of CODE and thus the location(s) where
     Bison should insert it.  That is, if you need to specify
     location-sensitive CODE that does not belong at the default
     location selected by the unqualified `%code' form, use this form
     instead.

   For any particular qualifier or for the unqualified form, if there
are multiple occurrences of the `%code' directive, Bison concatenates
the specified code in the order in which it appears in the grammar file.

   Not all qualifiers are accepted for all target languages.  Unaccepted
qualifiers produce an error.  Some of the accepted qualifiers are:

   * requires

        * Language(s): C, C++

        * Purpose: This is the best place to write dependency code
          required for `YYSTYPE' and `YYLTYPE'.  In other words, it's
          the best place to define types referenced in `%union'
          directives, and it's the best place to override Bison's
          default `YYSTYPE' and `YYLTYPE' definitions.

        * Location(s): The parser header file and the parser
          implementation file before the Bison-generated `YYSTYPE' and
          `YYLTYPE' definitions.

   * provides

        * Language(s): C, C++

        * Purpose: This is the best place to write additional
          definitions and declarations that should be provided to other
          modules.

        * Location(s): The parser header file and the parser
          implementation file after the Bison-generated `YYSTYPE',
          `YYLTYPE', and token definitions.

   * top

        * Language(s): C, C++

        * Purpose: The unqualified `%code' or `%code requires' should
          usually be more appropriate than `%code top'.  However,
          occasionally it is necessary to insert code much nearer the
          top of the parser implementation file.  For example:

               %code top {
                 #define _GNU_SOURCE
                 #include <stdio.h>
               }

        * Location(s): Near the top of the parser implementation file.

   * imports

        * Language(s): Java

        * Purpose: This is the best place to write Java import
          directives.

        * Location(s): The parser Java file after any Java package
          directive and before any class definitions.

   Though we say the insertion locations are language-dependent, they
are technically skeleton-dependent.  Writers of non-standard skeletons
however should choose their locations consistently with the behavior of
the standard Bison skeletons.


File: bison.info,  Node: Multiple Parsers,  Prev: Declarations,  Up: Grammar File

3.9 Multiple Parsers in the Same Program
========================================

Most programs that use Bison parse only one language and therefore
contain only one Bison parser.  But what if you want to parse more than
one language with the same program?  Then you need to avoid name
conflicts between different definitions of functions and variables such
as `yyparse' and `yylval'.  To use different parsers from the same
compilation unit, you also need to avoid conflicts on types and macros
(e.g., `YYSTYPE') exported in the generated header.

   The easy way to do this is to define the `%define' variable
`api.prefix'.  With different `api.prefix's it is guaranteed that
headers do not conflict when included together, and that compiled
objects can be linked together too.  Specifying `%define api.prefix
PREFIX' (or passing the option `-Dapi.prefix=PREFIX', see *note
Invoking Bison: Invocation.) renames the interface functions and
variables of the Bison parser to start with PREFIX instead of `yy', and
all the macros to start with PREFIX (i.e., PREFIX upper-cased) instead
of `YY'.

   The renamed symbols include `yyparse', `yylex', `yyerror',
`yynerrs', `yylval', `yylloc', `yychar' and `yydebug'.  If you use a
push parser, `yypush_parse', `yypull_parse', `yypstate', `yypstate_new'
and `yypstate_delete' will also be renamed.  The renamed macros include
`YYSTYPE', `YYLTYPE', and `YYDEBUG', which is treated specifically --
more about this below.

   For example, if you use `%define api.prefix c', the names become
`cparse', `clex', ..., `CSTYPE', `CLTYPE', and so on.

   The `%define' variable `api.prefix' works in two different ways.  In
the implementation file, it works by adding macro definitions to the
beginning of the parser implementation file, defining `yyparse' as
`PREFIXparse', and so on:

     #define YYSTYPE CSTYPE
     #define yyparse cparse
     #define yylval  clval
     ...
     YYSTYPE yylval;
     int yyparse (void);

   This effectively substitutes one name for the other in the entire
parser implementation file, thus the "original" names (`yylex',
`YYSTYPE', ...) are also usable in the parser implementation file.

   However, in the parser header file, the symbols are declared under
their renamed names, for instance:

     extern CSTYPE clval;
     int cparse (void);

   The macro `YYDEBUG' is commonly used to enable the tracing support in
parsers.  To comply with this tradition, when `api.prefix' is used,
`YYDEBUG' (not renamed) is used as a default value:

     /* Enabling traces.  */
     #ifndef CDEBUG
     # if defined YYDEBUG
     #  if YYDEBUG
     #   define CDEBUG 1
     #  else
     #   define CDEBUG 0
     #  endif
     # else
     #  define CDEBUG 0
     # endif
     #endif
     #if CDEBUG
     extern int cdebug;
     #endif



   Prior to Bison 2.6, a feature similar to `api.prefix' was provided by
the obsolete directive `%name-prefix' (*note Bison Symbols: Table of
Symbols.) and the option `--name-prefix' (*note Bison Options::).


File: bison.info,  Node: Interface,  Next: Algorithm,  Prev: Grammar File,  Up: Top

4 Parser C-Language Interface
*****************************

The Bison parser is actually a C function named `yyparse'.  Here we
describe the interface conventions of `yyparse' and the other functions
that it needs to use.

   Keep in mind that the parser uses many C identifiers starting with
`yy' and `YY' for internal purposes.  If you use such an identifier
(aside from those in this manual) in an action or in the epilogue of
the grammar file, you are likely to run into trouble.

* Menu:

* Parser Function::         How to call `yyparse' and what it returns.
* Push Parser Function::    How to call `yypush_parse' and what it returns.
* Pull Parser Function::    How to call `yypull_parse' and what it returns.
* Parser Create Function::  How to call `yypstate_new' and what it returns.
* Parser Delete Function::  How to call `yypstate_delete' and what it returns.
* Lexical::                 You must supply a function `yylex'
                              which reads tokens.
* Error Reporting::         You must supply a function `yyerror'.
* Action Features::         Special features for use in actions.
* Internationalization::    How to let the parser speak in the user's
                              native language.


File: bison.info,  Node: Parser Function,  Next: Push Parser Function,  Up: Interface

4.1 The Parser Function `yyparse'
=================================

You call the function `yyparse' to cause parsing to occur.  This
function reads tokens, executes actions, and ultimately returns when it
encounters end-of-input or an unrecoverable syntax error.  You can also
write an action which directs `yyparse' to return immediately without
reading further.

 -- Function: int yyparse (void)
     The value returned by `yyparse' is 0 if parsing was successful
     (return is due to end-of-input).

     The value is 1 if parsing failed because of invalid input, i.e.,
     input that contains a syntax error or that causes `YYABORT' to be
     invoked.

     The value is 2 if parsing failed due to memory exhaustion.
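
   The three return values of `yyparse' can be told apart by the
caller.  The following is only a sketch: the helper names
`parse_status_message' and `report_parse_status' are hypothetical and
are not provided by Bison; only the meaning of the values 0, 1 and 2
comes from the description above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of Bison): describe a value
   returned by yyparse, following the three documented cases.  */
const char *
parse_status_message (int status)
{
  switch (status)
    {
    case 0:
      return "success";            /* End-of-input was reached.  */
    case 1:
      return "invalid input";      /* Syntax error or YYABORT.  */
    case 2:
      return "memory exhausted";
    default:
      return "unexpected status";  /* Not a documented value.  */
    }
}

/* Hypothetical wrapper: print a diagnostic on STREAM when STATUS
   denotes a failure, and convert STATUS to an exit status
   (0 on success, 1 on any failure).  */
int
report_parse_status (int status, FILE *stream)
{
  assert (stream != NULL);
  if (status != 0)
    fprintf (stream, "parse failed: %s\n", parse_status_message (status));
  return status == 0 ? 0 : 1;
}
```

   With these helpers, a `main' function could end with `return
report_parse_status (yyparse (), stderr);'.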

   In an action, you can cause immediate return from `yyparse' by using
these macros:

 -- Macro: YYACCEPT
     Return immediately with value 0 (to report success).

 -- Macro: YYABORT
     Return immediately with value 1 (to report failure).

   If you use a reentrant parser, you can optionally pass additional
parameter information to it in a reentrant way.  To do so, use the
declaration `%parse-param':

 -- Directive: %parse-param {ARGUMENT-DECLARATION}
     Declare that an argument declared by the braced-code
     ARGUMENT-DECLARATION is an additional `yyparse' argument.  The
     ARGUMENT-DECLARATION is used when declaring functions or
     prototypes.  The last identifier in ARGUMENT-DECLARATION must be
     the argument name.

   Here's an example.  Write this in the parser:

     %parse-param {int *nastiness}
     %parse-param {int *randomness}

Then call the parser like this:

     {
       int nastiness, randomness;
       ...  /* Store proper data in `nastiness' and `randomness'.  */
       value = yyparse (&nastiness, &randomness);
       ...
     }

In the grammar actions, use expressions like this to refer to the data:

     exp: ...    { ...; *randomness += 1; ... }

Using the following:
     %parse-param {int *randomness}

   Results in these signatures:
     void yyerror (int *randomness, const char *msg);
     int  yyparse (int *randomness);

Or, if both `%define api.pure full' (or just `%define api.pure') and
`%locations' are used:

     void yyerror (YYLTYPE *llocp, int *randomness, const char *msg);
     int  yyparse (int *randomness);


File: bison.info,  Node: Push Parser Function,  Next: Pull Parser Function,  Prev: Parser Function,  Up: Interface

4.2 The Push Parser Function `yypush_parse'
===========================================

(The current push parsing interface is experimental and may evolve.
More user feedback will help to stabilize it.)

   You call the function `yypush_parse' to parse a single token.  This
function is available if either the `%define api.push-pull push' or
`%define api.push-pull both' declaration is used.  *Note A Push Parser:
Push Decl.

 -- Function: int yypush_parse (yypstate *yyps)
     The value returned by `yypush_parse' is the same as for `yyparse'
     with the following exception: it returns `YYPUSH_MORE' if more
     input is required to finish parsing the grammar.


File: bison.info,  Node: Pull Parser Function,  Next: Parser Create Function,  Prev: Push Parser Function,  Up: Interface

4.3 The Pull Parser Function `yypull_parse'
===========================================

(The current push parsing interface is experimental and may evolve.
More user feedback will help to stabilize it.)

   You call the function `yypull_parse' to parse the rest of the input
stream.  This function is available if the `%define api.push-pull both'
declaration is used.  *Note A Push Parser: Push Decl.

 -- Function: int yypull_parse (yypstate *yyps)
     The value returned by `yypull_parse' is the same as for `yyparse'.


File: bison.info,  Node: Parser Create Function,  Next: Parser Delete Function,  Prev: Pull Parser Function,  Up: Interface

4.4 The Parser Create Function `yypstate_new'
=============================================

(The current push parsing interface is experimental and may evolve.
More user feedback will help to stabilize it.)

   You call the function `yypstate_new' to create a new parser instance.
This function is available if either the `%define api.push-pull push' or
`%define api.push-pull both' declaration is used.  *Note A Push Parser:
Push Decl.

 -- Function: yypstate* yypstate_new (void)
     The function will return a valid parser instance if there was
     memory available or 0 if no memory was available.  In impure mode,
     it will also return 0 if a parser instance is currently allocated.


File: bison.info,  Node: Parser Delete Function,  Next: Lexical,  Prev: Parser Create Function,  Up: Interface

4.5 The Parser Delete Function `yypstate_delete'
================================================

(The current push parsing interface is experimental and may evolve.
More user feedback will help to stabilize it.)

   You call the function `yypstate_delete' to delete a parser instance.
This function is available if either the `%define api.push-pull push' or
`%define api.push-pull both' declaration is used.  *Note A Push Parser:
Push Decl.

 -- Function: void yypstate_delete (yypstate *yyps)
     This function will reclaim the memory associated with a parser
     instance.  After this call, you should no longer attempt to use
     the parser instance.


File: bison.info,  Node: Lexical,  Next: Error Reporting,  Prev: Parser Delete Function,  Up: Interface

4.6 The Lexical Analyzer Function `yylex'
=========================================

The "lexical analyzer" function, `yylex', recognizes tokens from the
input stream and returns them to the parser.  Bison does not create
this function automatically; you must write it so that `yyparse' can
call it.  The function is sometimes referred to as a lexical scanner.

   In simple programs, `yylex' is often defined at the end of the Bison
grammar file.  If `yylex' is defined in a separate source file, you
need to arrange for the token-type macro definitions to be available
there.
To do this, use the `-d' option when you run Bison, so that it
will write these macro definitions into the separate parser header
file, `NAME.tab.h', which you can include in the other source files
that need it.  *Note Invoking Bison: Invocation.

* Menu:

* Calling Convention::  How `yyparse' calls `yylex'.
* Token Values::        How `yylex' must return the semantic value
                          of the token it has read.
* Token Locations::     How `yylex' must return the text location
                          (line number, etc.) of the token, if the
                          actions want that.
* Pure Calling::        How the calling convention differs in a pure parser
                          (*note A Pure (Reentrant) Parser: Pure Decl.).


File: bison.info,  Node: Calling Convention,  Next: Token Values,  Up: Lexical

4.6.1 Calling Convention for `yylex'
------------------------------------

The value that `yylex' returns must be the positive numeric code for
the type of token it has just found; a zero or negative value signifies
end-of-input.

   When a token is referred to in the grammar rules by a name, that name
in the parser implementation file becomes a C macro whose definition is
the proper numeric code for that token type.  So `yylex' can use the
name to indicate that type.  *Note Symbols::.

   When a token is referred to in the grammar rules by a character
literal, the numeric code for that character is also the code for the
token type.  So `yylex' can simply return that character code, possibly
converted to `unsigned char' to avoid sign-extension.  The null
character must not be used this way, because its code is zero and that
signifies end-of-input.

   Here is an example showing these things:

     int
     yylex (void)
     {
       ...
       if (c == EOF)    /* Detect end-of-input.  */
         return 0;
       ...
       if (c == '+' || c == '-')
         return c;      /* Assume token type for `+' is '+'.  */
       ...
       return INT;      /* Return the type of the token.  */
       ...
     }

This interface has been designed so that the output from the `lex'
utility can be used without change as the definition of `yylex'.

   If the grammar uses literal string tokens, there are two ways that
`yylex' can determine the token type codes for them:

   * If the grammar defines symbolic token names as aliases for the
     literal string tokens, `yylex' can use these symbolic names like
     all others.  In this case, the use of the literal string tokens in
     the grammar file has no effect on `yylex'.

   * `yylex' can find the multicharacter token in the `yytname' table.
     The index of the token in the table is the token type's code.  The
     name of a multicharacter token is recorded in `yytname' with a
     double-quote, the token's characters, and another double-quote.
     The token's characters are escaped as necessary to be suitable as
     input to Bison.

     Here's code for looking up a multicharacter token in `yytname',
     assuming that the characters of the token are stored in
     `token_buffer', and assuming that the token does not contain any
     characters like `"' that require escaping.

          for (i = 0; i < YYNTOKENS; i++)
            {
              if (yytname[i] != 0
                  && yytname[i][0] == '"'
                  && ! strncmp (yytname[i] + 1, token_buffer,
                                strlen (token_buffer))
                  && yytname[i][strlen (token_buffer) + 1] == '"'
                  && yytname[i][strlen (token_buffer) + 2] == 0)
                break;
            }

     The `yytname' table is generated only if you use the
     `%token-table' declaration.  *Note Decl Summary::.


File: bison.info,  Node: Token Values,  Next: Token Locations,  Prev: Calling Convention,  Up: Lexical

4.6.2 Semantic Values of Tokens
-------------------------------

In an ordinary (nonreentrant) parser, the semantic value of the token
must be stored into the global variable `yylval'.
When you are using
just one data type for semantic values, `yylval' has that type.  Thus,
if the type is `int' (the default), you might write this in `yylex':

       ...
       yylval = value;  /* Put value onto Bison stack.  */
       return INT;      /* Return the type of the token.  */
       ...

   When you are using multiple data types, `yylval''s type is a union
made from the `%union' declaration (*note The Collection of Value
Types: Union Decl.).  So when you store a token's value, you must use
the proper member of the union.  If the `%union' declaration looks like
this:

     %union {
       int intval;
       double val;
       symrec *tptr;
     }

then the code in `yylex' might look like this:

       ...
       yylval.intval = value; /* Put value onto Bison stack.  */
       return INT;            /* Return the type of the token.  */
       ...


File: bison.info,  Node: Token Locations,  Next: Pure Calling,  Prev: Token Values,  Up: Lexical

4.6.3 Textual Locations of Tokens
---------------------------------

If you are using the `@N'-feature (*note Tracking Locations::) in
actions to keep track of the textual locations of tokens and groupings,
then you must provide this information in `yylex'.  The function
`yyparse' expects to find the textual location of a token just parsed
in the global variable `yylloc'.  So `yylex' must store the proper data
in that variable.

   By default, the value of `yylloc' is a structure and you need only
initialize the members that are going to be used by the actions.  The
four members are called `first_line', `first_column', `last_line' and
`last_column'.  Note that the use of this feature makes the parser
noticeably slower.

   The data type of `yylloc' has the name `YYLTYPE'.
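
   As a minimal sketch of how a lexer might keep these four members up
to date, the code below computes the location of a token from the
location of the previous one.  The type `location' is a stand-in for
the default `YYLTYPE', and the functions `location_initial' and
`location_after' are hypothetical; they are not provided by Bison.  The
sketch also assumes that lines and columns are counted from 1 and that
`last_column' designates the column just after the token, which is one
possible convention among several.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the default YYLTYPE, with the four members named in
   the text.  A real parser would use the generated YYLTYPE.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} location;

/* Hypothetical helper: the empty location at the start of input
   (assumption: lines and columns are counted from 1).  */
location
location_initial (void)
{
  location loc = { 1, 1, 1, 1 };
  return loc;
}

/* Hypothetical helper: return the location of the token TEXT, which
   starts where the previous token PREV ended.  */
location
location_after (location prev, const char *text)
{
  location loc;
  const char *p;

  assert (text != NULL);

  /* The new token begins at the end of the previous one.  */
  loc.first_line = loc.last_line = prev.last_line;
  loc.first_column = loc.last_column = prev.last_column;

  /* Advance the end position over each character of the token.  */
  for (p = text; *p != '\0'; p++)
    {
      if (*p == '\n')
        {
          loc.last_line++;
          loc.last_column = 1;
        }
      else
        loc.last_column++;
    }
  return loc;
}
```

   In an ordinary parser, `yylex' would apply the same update directly
to the global variable `yylloc' before returning each token.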


File: bison.info,  Node: Pure Calling,  Prev: Token Locations,  Up: Lexical

4.6.4 Calling Conventions for Pure Parsers
------------------------------------------

When you use the Bison declaration `%define api.pure full' to request a
pure, reentrant parser, the global communication variables `yylval' and
`yylloc' cannot be used.  (*Note A Pure (Reentrant) Parser: Pure Decl.)
In such parsers the two global variables are replaced by pointers
passed as arguments to `yylex'.  You must declare them as shown here,
and pass the information back by storing it through those pointers.

     int
     yylex (YYSTYPE *lvalp, YYLTYPE *llocp)
     {
       ...
       *lvalp = value;  /* Put value onto Bison stack.  */
       return INT;      /* Return the type of the token.  */
       ...
     }

   If the grammar file does not use the `@' constructs to refer to
textual locations, then the type `YYLTYPE' will not be defined.  In
this case, omit the second argument; `yylex' will be called with only
one argument.

   If you wish to pass the additional parameter data to `yylex', use
`%lex-param' just like `%parse-param' (*note Parser Function::).

 -- Directive: %lex-param {ARGUMENT-DECLARATION}
     Declare that the braced-code ARGUMENT-DECLARATION is an additional
     `yylex' argument declaration.

For instance:

     %lex-param {int *nastiness}

results in the following signature:

     int yylex (int *nastiness);

If `%define api.pure full' (or just `%define api.pure') is added:

     int yylex (YYSTYPE *lvalp, int *nastiness);


File: bison.info,  Node: Error Reporting,  Next: Action Features,  Prev: Lexical,  Up: Interface

4.7 The Error Reporting Function `yyerror'
==========================================

The Bison parser detects a "syntax error" or "parse error" whenever it
reads a token which cannot satisfy any syntax rule.
An action in the
grammar can also explicitly proclaim an error, using the macro
`YYERROR' (*note Special Features for Use in Actions: Action Features.).

   The Bison parser expects to report the error by calling an error
reporting function named `yyerror', which you must supply.  It is
called by `yyparse' whenever a syntax error is found, and it receives
one argument.  For a syntax error, the string is normally
`"syntax error"'.

   If you invoke the directive `%error-verbose' in the Bison
declarations section (*note The Bison Declarations Section: Bison
Declarations.), then Bison provides a more verbose and specific error
message string instead of just plain `"syntax error"'.  However, that
message sometimes contains incorrect information if LAC is not enabled
(*note LAC::).

   The parser can detect one other kind of error: memory exhaustion.
This can happen when the input contains constructions that are very
deeply nested.  It isn't likely you will encounter this, since the Bison
parser normally extends its stack automatically up to a very large
limit.  But if memory is exhausted, `yyparse' calls `yyerror' in the
usual fashion, except that the argument string is `"memory exhausted"'.

   In some cases diagnostics like `"syntax error"' are translated
automatically from English to some other language before they are
passed to `yyerror'.  *Note Internationalization::.

   The following definition suffices in simple programs:

     void
     yyerror (char const *s)
     {
       fprintf (stderr, "%s\n", s);
     }

   After `yyerror' returns to `yyparse', the latter will attempt error
recovery if you have written suitable error recovery grammar rules
(*note Error Recovery::).  If recovery is impossible, `yyparse' will
immediately return 1.

   In location-tracking pure parsers, `yyerror' should have access to
the current location.  With `%define api.pure', this is indeed the case
for GLR parsers, but not for the Yacc parser, for historical reasons;
this is why `%define api.pure full' should be preferred over `%define
api.pure'.

   When `%locations %define api.pure full' is used, `yyerror' has the
following signature:

     void yyerror (YYLTYPE *locp, char const *msg);

The prototypes are only indications of how the code produced by Bison
uses `yyerror'.  Bison-generated code always ignores the returned
value, so `yyerror' can return any type, including `void'.  Also,
`yyerror' can be a variadic function; that is why the message is always
passed last.

   Traditionally `yyerror' returns an `int' that is always ignored, but
this is purely for historical reasons, and `void' is preferable since
it more accurately describes the return type for `yyerror'.

   The variable `yynerrs' contains the number of syntax errors reported
so far.  Normally this variable is global; but if you request a pure
parser (*note A Pure (Reentrant) Parser: Pure Decl.) then it is a
local variable which only the actions can access.


File: bison.info,  Node: Action Features,  Next: Internationalization,  Prev: Error Reporting,  Up: Interface

4.8 Special Features for Use in Actions
=======================================

Here is a table of Bison constructs, variables and macros that are
useful in actions.

 -- Variable: $$
     Acts like a variable that contains the semantic value for the
     grouping made by the current rule.  *Note Actions::.

 -- Variable: $N
     Acts like a variable that contains the semantic value for the Nth
     component of the current rule.  *Note Actions::.

 -- Variable: $<TYPEALT>$
     Like `$$' but specifies alternative TYPEALT in the union specified
     by the `%union' declaration.  *Note Data Types of Values in
     Actions: Action Types.

 -- Variable: $<TYPEALT>N
     Like `$N' but specifies alternative TYPEALT in the union specified
     by the `%union' declaration.  *Note Data Types of Values in
     Actions: Action Types.

 -- Macro: YYABORT `;'
     Return immediately from `yyparse', indicating failure.  *Note The
     Parser Function `yyparse': Parser Function.

 -- Macro: YYACCEPT `;'
     Return immediately from `yyparse', indicating success.  *Note The
     Parser Function `yyparse': Parser Function.

 -- Macro: YYBACKUP (TOKEN, VALUE)`;'
     Unshift a token.  This macro is allowed only for rules that reduce
     a single value, and only when there is no lookahead token.  It is
     also disallowed in GLR parsers.  It installs a lookahead token
     with token type TOKEN and semantic value VALUE; then it discards
     the value that was going to be reduced by this rule.

     If the macro is used when it is not valid, such as when there is a
     lookahead token already, then it reports a syntax error with a
     message `cannot back up' and performs ordinary error recovery.

     In either case, the rest of the action is not executed.

 -- Macro: YYEMPTY
     Value stored in `yychar' when there is no lookahead token.

 -- Macro: YYEOF
     Value stored in `yychar' when the lookahead is the end of the input
     stream.

 -- Macro: YYERROR `;'
     Cause an immediate syntax error.  This statement initiates error
     recovery just as if the parser itself had detected an error;
     however, it does not call `yyerror', and does not print any
     message.  If you want to print an error message, call `yyerror'
     explicitly before the `YYERROR;' statement.  *Note Error
     Recovery::.

 -- Macro: YYRECOVERING
     The expression `YYRECOVERING ()' yields 1 when the parser is
     recovering from a syntax error, and 0 otherwise.  *Note Error
     Recovery::.

 -- Variable: yychar
     Variable containing either the lookahead token, or `YYEOF' when the
     lookahead is the end of the input stream, or `YYEMPTY' when no
     lookahead has been performed so the next token is not yet known.
     Do not modify `yychar' in a deferred semantic action (*note GLR
     Semantic Actions::).  *Note Lookahead Tokens: Lookahead.

 -- Macro: yyclearin `;'
     Discard the current lookahead token.  This is useful primarily in
     error rules.  Do not invoke `yyclearin' in a deferred semantic
     action (*note GLR Semantic Actions::).  *Note Error Recovery::.

 -- Macro: yyerrok `;'
     Resume generating error messages immediately for subsequent syntax
     errors.  This is useful primarily in error rules.  *Note Error
     Recovery::.

 -- Variable: yylloc
     Variable containing the lookahead token location when `yychar' is
     not set to `YYEMPTY' or `YYEOF'.  Do not modify `yylloc' in a
     deferred semantic action (*note GLR Semantic Actions::).  *Note
     Actions and Locations: Actions and Locations.

 -- Variable: yylval
     Variable containing the lookahead token semantic value when
     `yychar' is not set to `YYEMPTY' or `YYEOF'.  Do not modify
     `yylval' in a deferred semantic action (*note GLR Semantic
     Actions::).  *Note Actions: Actions.

 -- Value: @$
     Acts like a structure variable containing information on the
     textual location of the grouping made by the current rule.  *Note
     Tracking Locations::.


 -- Value: @N
     Acts like a structure variable containing information on the
     textual location of the Nth component of the current rule.  *Note
     Tracking Locations::.


File: bison.info,  Node: Internationalization,  Prev: Action Features,  Up: Interface

4.9 Parser Internationalization
===============================

A Bison-generated parser can print diagnostics, including error and
tracing messages.  By default, they appear in English.  However, Bison
also supports outputting diagnostics in the user's native language.  To
make this work, the user should set the usual environment variables.
*Note The User's View: (gettext)Users.  For example, the shell command
`export LC_ALL=fr_CA.UTF-8' might set the user's locale to French
Canadian using the UTF-8 encoding.  The exact set of available locales
depends on the user's installation.

   The maintainer of a package that uses a Bison-generated parser
enables the internationalization of the parser's output through the
following steps.  Here we assume a package that uses GNU Autoconf and
GNU Automake.

  1. Into the directory containing the GNU Autoconf macros used by the
     package --often called `m4'-- copy the `bison-i18n.m4' file
     installed by Bison under `share/aclocal/bison-i18n.m4' in Bison's
     installation directory.  For example:

          cp /usr/local/share/aclocal/bison-i18n.m4 m4/bison-i18n.m4

  2. In the top-level `configure.ac', after the `AM_GNU_GETTEXT'
     invocation, add an invocation of `BISON_I18N'.  This macro is
     defined in the file `bison-i18n.m4' that you copied earlier.  It
     causes `configure' to find the value of the `BISON_LOCALEDIR'
     variable, and it defines the source-language symbol `YYENABLE_NLS'
     to enable translations in the Bison-generated parser.

  3. In the `main' function of your program, designate the directory
     containing Bison's runtime message catalog, through a call to
     `bindtextdomain' with domain name `bison-runtime'.
     For example:

          bindtextdomain ("bison-runtime", BISON_LOCALEDIR);

     Typically this appears after any other call `bindtextdomain
     (PACKAGE, LOCALEDIR)' that your package already has.  Here we rely
     on `BISON_LOCALEDIR' to be defined as a string through the
     `Makefile'.

  4. In the `Makefile.am' that controls the compilation of the `main'
     function, make `BISON_LOCALEDIR' available as a C preprocessor
     macro, either in `DEFS' or in `AM_CPPFLAGS'.  For example:

          DEFS = @DEFS@ -DBISON_LOCALEDIR='"$(BISON_LOCALEDIR)"'

     or:

          AM_CPPFLAGS = -DBISON_LOCALEDIR='"$(BISON_LOCALEDIR)"'

  5. Finally, invoke the command `autoreconf' to generate the build
     infrastructure.


File: bison.info,  Node: Algorithm,  Next: Error Recovery,  Prev: Interface,  Up: Top

5 The Bison Parser Algorithm
****************************

As Bison reads tokens, it pushes them onto a stack along with their
semantic values.  The stack is called the "parser stack".  Pushing a
token is traditionally called "shifting".

   For example, suppose the infix calculator has read `1 + 5 *', with a
`3' to come.  The stack will have four elements, one for each token
that was shifted.

   But the stack does not always have an element for each token read.
When the last N tokens and groupings shifted match the components of a
grammar rule, they can be combined according to that rule.  This is
called "reduction".  Those tokens and groupings are replaced on the
stack by a single grouping whose symbol is the result (left hand side)
of that rule.  Running the rule's action is part of the process of
reduction, because this is what computes the semantic value of the
resulting grouping.
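
   The shifting and reduction just described can be sketched in C as
follows.  This is only an illustration, not the code Bison generates.
The names `element', `parser_stack', `shift' and `reduce_binary' are
hypothetical, the only rules modeled are of the form `expr: expr OP
expr' with OP being `+' or `*', and the caller decides when to reduce.

```c
#include <assert.h>

enum { STACK_MAX = 32 };

/* One stack element: either an operator token, or a value (a number
   token or an already reduced `expr' grouping).  */
typedef struct
{
  char op;    /* '+' or '*' for an operator token, 0 for a value */
  int value;  /* semantic value, meaningful only when op == 0 */
} element;

typedef struct
{
  element items[STACK_MAX];
  int depth;
} parser_stack;

/* Shift: push one token and its semantic value.  */
void
shift (parser_stack *s, char op, int value)
{
  assert (s->depth < STACK_MAX);
  s->items[s->depth].op = op;
  s->items[s->depth].value = value;
  s->depth++;
}

/* Reduce by a rule of the form expr: expr OP expr.  The top three
   elements are replaced by one grouping; computing its semantic value
   plays the part of the rule's action.  */
void
reduce_binary (parser_stack *s)
{
  element *e;
  int result;

  assert (s->depth >= 3);
  e = &s->items[s->depth - 3];
  assert (e[0].op == 0 && e[1].op != 0 && e[2].op == 0);
  result = (e[1].op == '*'
            ? e[0].value * e[2].value
            : e[0].value + e[2].value);
  e[0].op = 0;
  e[0].value = result;
  s->depth -= 2;
}
```

   Shifting `1', `+', `5' and `*' leaves four elements on the stack.
Shifting `3' and calling `reduce_binary' twice leaves first three
elements and then a single element.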

   For example, if the infix calculator's parser stack contains this:

     1 + 5 * 3

and the next input token is a newline character, then the last three
elements can be reduced to 15 via the rule:

     expr: expr '*' expr;

Then the stack contains just these three elements:

     1 + 15

At this point, another reduction can be made, resulting in the single
value 16.  Then the newline token can be shifted.

   The parser tries, by shifts and reductions, to reduce the entire
input down to a single grouping whose symbol is the grammar's
start-symbol (*note Languages and Context-Free Grammars: Language and
Grammar.).

   This kind of parser is known in the literature as a bottom-up parser.

* Menu:

* Lookahead::                 Parser looks one token ahead when deciding what to do.
* Shift/Reduce::              Conflicts: when either shifting or reduction is valid.
* Precedence::                Operator precedence works by resolving conflicts.
* Contextual Precedence::     When an operator's precedence depends on context.
* Parser States::             The parser is a finite-state-machine with stack.
* Reduce/Reduce::             When two rules are applicable in the same situation.
* Mysterious Conflicts::      Conflicts that look unjustified.
* Tuning LR::                 How to tune fundamental aspects of LR-based parsing.
* Generalized LR Parsing::    Parsing arbitrary context-free grammars.
* Memory Management::         What happens when memory is exhausted.  How to avoid it.


File: bison.info,  Node: Lookahead,  Next: Shift/Reduce,  Up: Algorithm

5.1 Lookahead Tokens
====================

The Bison parser does _not_ always reduce immediately as soon as the
last N tokens and groupings match a rule.  This is because such a
simple strategy is inadequate to handle most languages.
Instead, when a reduction is possible, the parser sometimes "looks
ahead" at the next token in order to decide what to do.

   When a token is read, it is not immediately shifted; first it
becomes the "lookahead token", which is not on the stack.  Now the
parser can perform one or more reductions of tokens and groupings on
the stack, while the lookahead token remains off to the side.  When no
more reductions should take place, the lookahead token is shifted onto
the stack.  This does not mean that all possible reductions have been
done; depending on the token type of the lookahead token, some rules
may choose to delay their application.

   Here is a simple case where lookahead is needed.  These rules
define expressions which contain binary addition operators and postfix
unary factorial operators (`!'), and allow parentheses for grouping.

     expr:
       term '+' expr
     | term
     ;

     term:
       '(' expr ')'
     | term '!'
     | "number"
     ;

   Suppose that the tokens `1 + 2' have been read and shifted; what
should be done?  If the following token is `)', then the first three
tokens must be reduced to form an `expr'.  This is the only valid
course, because shifting the `)' would produce a sequence of symbols
`term ')'', and no rule allows this.

   If the following token is `!', then it must be shifted immediately so
that `2 !' can be reduced to make a `term'.  If instead the parser were
to reduce before shifting, `1 + 2' would become an `expr'.  It would
then be impossible to shift the `!' because doing so would produce on
the stack the sequence of symbols `expr '!''.  No rule allows that
sequence.

   The lookahead token is stored in the variable `yychar'.  Its
semantic value and location, if any, are stored in the variables
`yylval' and `yylloc'.  *Note Special Features for Use in Actions:
Action Features.
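
   The decision discussed above can be written down as a small C
function.  This is a hand-written sketch for the example grammar only,
not a table produced by Bison.  The names `action', `action_after_term'
and `check_lookahead_cases' are hypothetical, tokens are represented as
single characters with `\0' standing for the end of input, and the
function ignores the rest of the stack (a real parser also takes the
current state into account).

```c
#include <assert.h>

typedef enum { ACTION_SHIFT, ACTION_REDUCE, ACTION_ERROR } action;

/* Decide what to do when a `term' is on top of the stack, given the
   lookahead token, for the grammar

     expr: term '+' expr | term ;
     term: '(' expr ')' | term '!' | "number" ;  */
action
action_after_term (char lookahead)
{
  switch (lookahead)
    {
    case '!':   /* `term '!'' can still be formed */
    case '+':   /* `term '+' expr' can still be formed */
      return ACTION_SHIFT;
    case ')':   /* no rule allows `term ')'', so reduce first */
    case '\0':  /* end of input: the term must become an `expr' */
      return ACTION_REDUCE;
    default:    /* e.g. a number or '(' directly after a term */
      return ACTION_ERROR;
    }
}

/* The two cases discussed in the text, after `1 + 2' has been read.  */
void
check_lookahead_cases (void)
{
  assert (action_after_term (')') == ACTION_REDUCE);
  assert (action_after_term ('!') == ACTION_SHIFT);
}
```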


File: bison.info,  Node: Shift/Reduce,  Next: Precedence,  Prev: Lookahead,  Up: Algorithm

5.2 Shift/Reduce Conflicts
==========================

Suppose we are parsing a language which has if-then and if-then-else
statements, with a pair of rules like this:

     if_stmt:
       "if" expr "then" stmt
     | "if" expr "then" stmt "else" stmt
     ;

Here `"if"', `"then"' and `"else"' are terminal symbols for specific
keyword tokens.

   When the `"else"' token is read and becomes the lookahead token, the
contents of the stack (assuming the input is valid) are just right for
reduction by the first rule.  But it is also legitimate to shift the
`"else"', because that would lead to eventual reduction by the second
rule.

   This situation, where either a shift or a reduction would be valid,
is called a "shift/reduce conflict".  Bison is designed to resolve
these conflicts by choosing to shift, unless otherwise directed by
operator precedence declarations.  To see the reason for this, let's
contrast it with the other alternative.

   Since the parser prefers to shift the `"else"', the result is to
attach the else-clause to the innermost if-statement, making these two
inputs equivalent:

     if x then if y then win; else lose;

     if x then do; if y then win; else lose; end;

   But if the parser chose to reduce when possible rather than shift,
the result would be to attach the else-clause to the outermost
if-statement, making these two inputs equivalent:

     if x then if y then win; else lose;

     if x then do; if y then win; end; else lose;

   The conflict exists because the grammar as written is ambiguous:
either parsing of the simple nested if-statement is legitimate.
The established convention is that these ambiguities are resolved by
attaching the else-clause to the innermost if-statement; this is what
Bison accomplishes by choosing to shift rather than reduce.  (It would
ideally be cleaner to write an unambiguous grammar, but that is very
hard to do in this case.)  This particular ambiguity was first
encountered in the specifications of Algol 60 and is called the
"dangling `else'" ambiguity.

   To avoid warnings from Bison about predictable, legitimate
shift/reduce conflicts, you can use the `%expect N' declaration.  There
will be no warning as long as the number of shift/reduce conflicts is
exactly N, and Bison will report an error if there is a different
number.  *Note Suppressing Conflict Warnings: Expect Decl.  However, we
don't recommend the use of `%expect' (except `%expect 0'!), as an equal
number of conflicts does not mean that they are the _same_.  When
possible, you should rather use precedence directives to _fix_ the
conflicts explicitly (*note Using Precedence For Non Operators: Non
Operators.).

   The definition of `if_stmt' above is solely to blame for the
conflict, but the conflict does not actually appear without additional
rules.  Here is a complete Bison grammar file that actually manifests
the conflict:

     %%
     stmt:
       expr
     | if_stmt
     ;

     if_stmt:
       "if" expr "then" stmt
     | "if" expr "then" stmt "else" stmt
     ;

     expr:
       "identifier"
     ;


File: bison.info,  Node: Precedence,  Next: Contextual Precedence,  Prev: Shift/Reduce,  Up: Algorithm

5.3 Operator Precedence
=======================

Another situation where shift/reduce conflicts appear is in arithmetic
expressions.  Here shifting is not always the preferred resolution; the
Bison declarations for operator precedence allow you to specify when to
shift and when to reduce.
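
   As a rough preview of the comparison that these declarations drive
(explained in *note How Precedence Works: How Precedence.), here is a
minimal C sketch.  It is an illustration, not Bison's implementation.
The names `assoc', `action', `precedence' and `resolve' are
hypothetical, and a level of 0 is an assumption of this sketch meaning
"no precedence declared".

```c
#include <assert.h>

typedef enum { ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONE } assoc;
typedef enum { ACTION_SHIFT, ACTION_REDUCE, ACTION_ERROR } action;

/* Precedence of a rule or of a lookahead token.  A higher level means
   a higher precedence; level 0 means no precedence at all.  */
typedef struct
{
  int level;
  assoc associativity;
} precedence;

/* Resolve a shift/reduce conflict between reducing by a rule with
   precedence RULE and shifting a lookahead token with precedence
   LOOKAHEAD.  */
action
resolve (precedence rule, precedence lookahead)
{
  assert (rule.level >= 0 && lookahead.level >= 0);
  if (rule.level == 0 || lookahead.level == 0)
    return ACTION_SHIFT;        /* default when precedence is missing */
  if (lookahead.level > rule.level)
    return ACTION_SHIFT;
  if (rule.level > lookahead.level)
    return ACTION_REDUCE;
  /* Equal precedence: associativity of that level decides.  */
  switch (lookahead.associativity)
    {
    case ASSOC_LEFT:
      return ACTION_REDUCE;
    case ASSOC_RIGHT:
      return ACTION_SHIFT;
    default:
      return ACTION_ERROR;      /* non-associative: syntax error */
    }
}
```

   With `<' at level 1, `-' at level 2 and `*' at level 3, all
left-associative, a stack holding `1 - 2' leads to a shift when the
lookahead is `*' and to a reduction when it is `<' or `-'.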

* Menu:

* Why Precedence::            An example showing why precedence is needed.
* Using Precedence::          How to specify precedence in Bison grammars.
* Precedence Examples::       How these features are used in the previous example.
* How Precedence::            How they work.
* Non Operators::             Using precedence for general conflicts.


File: bison.info,  Node: Why Precedence,  Next: Using Precedence,  Up: Precedence

5.3.1 When Precedence is Needed
-------------------------------

Consider the following ambiguous grammar fragment (ambiguous because the
input `1 - 2 * 3' can be parsed in two different ways):

     expr:
       expr '-' expr
     | expr '*' expr
     | expr '<' expr
     | '(' expr ')'
     ...
     ;

Suppose the parser has seen the tokens `1', `-' and `2'; should it
reduce them via the rule for the subtraction operator?  It depends on
the next token.  Of course, if the next token is `)', we must reduce;
shifting is invalid because no single rule can reduce the token
sequence `- 2 )' or anything starting with that.  But if the next token
is `*' or `<', we have a choice: either shifting or reduction would
allow the parse to complete, but with different results.

   To decide which one Bison should do, we must consider the results.
If the next operator token OP is shifted, then it must be reduced first
in order to permit another opportunity to reduce the difference.  The
result is (in effect) `1 - (2 OP 3)'.  On the other hand, if the
subtraction is reduced before shifting OP, the result is
`(1 - 2) OP 3'.  Clearly, then, the choice of shift or reduce should
depend on the relative precedence of the operators `-' and OP: `*'
should be shifted first, but not `<'.

   What about input such as `1 - 2 - 5'; should this be `(1 - 2) - 5'
or should it be `1 - (2 - 5)'?  For most operators we prefer the
former, which is called "left association".
The latter alternative, "right association", is desirable for
assignment operators.  The choice of left or right association is a
matter of whether the parser chooses to shift or reduce when the stack
contains `1 - 2' and the lookahead token is `-': shifting makes
right-associativity.


File: bison.info,  Node: Using Precedence,  Next: Precedence Examples,  Prev: Why Precedence,  Up: Precedence

5.3.2 Specifying Operator Precedence
------------------------------------

Bison allows you to specify these choices with the operator precedence
declarations `%left' and `%right'.  Each such declaration contains a
list of tokens, which are operators whose precedence and associativity
are being declared.  The `%left' declaration makes all those operators
left-associative and the `%right' declaration makes them
right-associative.  A third alternative is `%nonassoc', which declares
that it is a syntax error to find the same operator twice "in a row".

   The relative precedence of different operators is controlled by the
order in which they are declared.  The first `%left' or `%right'
declaration in the file declares the operators whose precedence is
lowest, the next such declaration declares the operators whose
precedence is a little higher, and so on.


File: bison.info,  Node: Precedence Examples,  Next: How Precedence,  Prev: Using Precedence,  Up: Precedence

5.3.3 Precedence Examples
-------------------------

In our example, we would want the following declarations:

     %left '<'
     %left '-'
     %left '*'

   In a more complete example, which supports other operators as well,
we would declare them in groups of equal precedence.
For example, `'+'' is declared with `'-'':

     %left '<' '>' '=' "!=" "<=" ">="
     %left '+' '-'
     %left '*' '/'


File: bison.info,  Node: How Precedence,  Next: Non Operators,  Prev: Precedence Examples,  Up: Precedence

5.3.4 How Precedence Works
--------------------------

The first effect of the precedence declarations is to assign precedence
levels to the terminal symbols declared.  The second effect is to assign
precedence levels to certain rules: each rule gets its precedence from
the last terminal symbol mentioned in the components.  (You can also
specify explicitly the precedence of a rule.  *Note Context-Dependent
Precedence: Contextual Precedence.)

   Finally, the resolution of conflicts works by comparing the
precedence of the rule being considered with that of the lookahead
token.  If the token's precedence is higher, the choice is to shift.
If the rule's precedence is higher, the choice is to reduce.  If they
have equal precedence, the choice is made based on the associativity of
that precedence level.  The verbose output file made by `-v' (*note
Invoking Bison: Invocation.) says how each conflict was resolved.

   Not all rules and not all tokens have precedence.  If either the
rule or the lookahead token has no precedence, then the default is to
shift.


File: bison.info,  Node: Non Operators,  Prev: How Precedence,  Up: Precedence

5.3.5 Using Precedence For Non Operators
----------------------------------------

Proper use of precedence and associativity directives can help fix
shift/reduce conflicts that do not involve arithmetic-like operators.
For instance, the "dangling `else'" problem (*note Shift/Reduce
Conflicts: Shift/Reduce.) can be solved elegantly in two different ways.

   In the present case, the conflict is between the token `"else"'
willing to be shifted, and the rule `if_stmt: "if" expr "then" stmt',
asking for reduction.  By default, the precedence of a rule is that of
its last token, here `"then"', so the conflict will be solved
appropriately by giving `"else"' a precedence higher than that of
`"then"', for instance as follows:

     %nonassoc "then"
     %nonassoc "else"

   Alternatively, you may give both tokens the same precedence, in
which case associativity is used to solve the conflict.  To preserve
the shift action, use right associativity:

     %right "then" "else"

   Neither solution is perfect, however.  Since Bison does not provide,
so far, support for "scoped" precedence, both force you to declare the
precedence of these keywords with respect to the other operators of
your grammar.  Therefore, instead of being warned about new conflicts
that you would otherwise be unaware of (e.g., a shift/reduce conflict
due to `if test then 1 else 2 + 3' being ambiguous: `if test then 1
else (2 + 3)' or `(if test then 1 else 2) + 3'?), you will find that
the conflict has already been silently "fixed".


File: bison.info,  Node: Contextual Precedence,  Next: Parser States,  Prev: Precedence,  Up: Algorithm

5.4 Context-Dependent Precedence
================================

Often the precedence of an operator depends on the context.  This sounds
outlandish at first, but it is really very common.  For example, a minus
sign typically has a very high precedence as a unary operator, and a
somewhat lower precedence (lower than multiplication) as a binary
operator.

   The Bison precedence declarations, `%left', `%right' and
`%nonassoc', can only be used once for a given token; so a token has
only one precedence declared in this way.  For context-dependent
precedence, you need to use an additional mechanism: the `%prec'
modifier for rules.

   The `%prec' modifier declares the precedence of a particular rule by
specifying a terminal symbol whose precedence should be used for that
rule.  It's not necessary for that symbol to appear otherwise in the
rule.  The modifier's syntax is:

     %prec TERMINAL-SYMBOL

and it is written after the components of the rule.  Its effect is to
assign the rule the precedence of TERMINAL-SYMBOL, overriding the
precedence that would be deduced for it in the ordinary way.  The
altered rule precedence then affects how conflicts involving that rule
are resolved (*note Operator Precedence: Precedence.).

   Here is how `%prec' solves the problem of unary minus.  First,
declare a precedence for a fictitious terminal symbol named `UMINUS'.
There are no tokens of this type, but the symbol serves to stand for its
precedence:

     ...
     %left '+' '-'
     %left '*'
     %left UMINUS

   Now the precedence of `UMINUS' can be used in specific rules:

     exp:
       ...
     | exp '-' exp
       ...
     | '-' exp %prec UMINUS


File: bison.info,  Node: Parser States,  Next: Reduce/Reduce,  Prev: Contextual Precedence,  Up: Algorithm

5.5 Parser States
=================

The function `yyparse' is implemented using a finite-state machine.
The values pushed on the parser stack are not simply token type codes;
they represent the entire sequence of terminal and nonterminal symbols
at or near the top of the stack.  The current state collects all the
information about previous input which is relevant to deciding what to
do next.

   Each time a lookahead token is read, the current parser state
together with the type of lookahead token are looked up in a table.
This table entry can say, "Shift the lookahead token."  In this case,
it also specifies the new parser state, which is pushed onto the top of
the parser stack.
Or it can say, "Reduce using rule number N."  This means that a certain
number of tokens or groupings are taken off the top of the stack, and
replaced by one grouping.  In other words, that number of states are
popped from the stack, and one new state is pushed.

   There is one other alternative: the table can say that the lookahead
token is erroneous in the current state.  This causes error processing
to begin (*note Error Recovery::).


File: bison.info,  Node: Reduce/Reduce,  Next: Mysterious Conflicts,  Prev: Parser States,  Up: Algorithm

5.6 Reduce/Reduce Conflicts
===========================

A reduce/reduce conflict occurs if there are two or more rules that
apply to the same sequence of input.  This usually indicates a serious
error in the grammar.

   For example, here is an erroneous attempt to define a sequence of
zero or more `word' groupings.

     sequence:
       /* empty */    { printf ("empty sequence\n"); }
     | maybeword
     | sequence word  { printf ("added word %s\n", $2); }
     ;

     maybeword:
       /* empty */  { printf ("empty maybeword\n"); }
     | word         { printf ("single word %s\n", $1); }
     ;

The error is an ambiguity: there is more than one way to parse a single
`word' into a `sequence'.  It could be reduced to a `maybeword' and
then into a `sequence' via the second rule.  Alternatively,
nothing-at-all could be reduced into a `sequence' via the first rule,
and this could be combined with the `word' using the third rule for
`sequence'.

   There is also more than one way to reduce nothing-at-all into a
`sequence'.  This can be done directly via the first rule, or
indirectly via `maybeword' and then the second rule.

   You might think that this is a distinction without a difference,
because it does not change whether any particular input is valid or
not.  But it does affect which actions are run.
One parsing order runs the second rule's action; the other runs the
first rule's action and the third rule's action.  In this example, the
output of the program changes.

   Bison resolves a reduce/reduce conflict by choosing to use the rule
that appears first in the grammar, but it is very risky to rely on
this.  Every reduce/reduce conflict must be studied and usually
eliminated.  Here is the proper way to define `sequence':

     sequence:
       /* empty */    { printf ("empty sequence\n"); }
     | sequence word  { printf ("added word %s\n", $2); }
     ;

   Here is another common error that yields a reduce/reduce conflict:

     sequence:
       /* empty */
     | sequence words
     | sequence redirects
     ;

     words:
       /* empty */
     | words word
     ;

     redirects:
       /* empty */
     | redirects redirect
     ;

The intention here is to define a sequence which can contain either
`word' or `redirect' groupings.  The individual definitions of
`sequence', `words' and `redirects' are error-free, but the three
together make a subtle ambiguity: even an empty input can be parsed in
infinitely many ways!

   Consider: nothing-at-all could be a `words'.  Or it could be two
`words' in a row, or three, or any number.  It could equally well be a
`redirects', or two, or any number.  Or it could be a `words' followed
by three `redirects' and another `words'.  And so on.

   Here are two ways to correct these rules.
First, to make it a single level of sequence:

     sequence:
       /* empty */
     | sequence word
     | sequence redirect
     ;

   Second, to prevent either a `words' or a `redirects' from being
empty:

     sequence:
       /* empty */
     | sequence words
     | sequence redirects
     ;

     words:
       word
     | words word
     ;

     redirects:
       redirect
     | redirects redirect
     ;

   Yet this proposal introduces another kind of ambiguity!  The input
`word word' can be parsed as a single `words' composed of two `word's,
or as two one-`word' `words' (and likewise for `redirect'/`redirects').
However, this ambiguity is now a shift/reduce conflict, and therefore it
can now be addressed with precedence directives.

   To simplify the matter, we will proceed with `word' and `redirect'
being tokens: `"word"' and `"redirect"'.

   To prefer the longest `words', the conflict between the token
`"word"' and the rule `sequence: sequence words' must be resolved as a
shift.  To this end, we use the same techniques as presented above; see
*note Using Precedence For Non Operators: Non Operators.
One solution relies on precedences: use `%prec' to give a lower
precedence to the rule:

     %nonassoc "word"
     %nonassoc "sequence"
     %%
     sequence:
       /* empty */
     | sequence word      %prec "sequence"
     | sequence redirect  %prec "sequence"
     ;

     words:
       word
     | words "word"
     ;

   Another solution relies on associativity: provide both the token and
the rule with the same precedence, but make them right-associative:

     %right "word" "redirect"
     %%
     sequence:
       /* empty */
     | sequence word      %prec "word"
     | sequence redirect  %prec "redirect"
     ;


File: bison.info,  Node: Mysterious Conflicts,  Next: Tuning LR,  Prev: Reduce/Reduce,  Up: Algorithm

5.7 Mysterious Conflicts
========================

Sometimes reduce/reduce conflicts can occur that don't look warranted.
Here is an example:

     %%
     def: param_spec return_spec ',';
     param_spec:
       type
     | name_list ':' type
     ;
     return_spec:
       type
     | name ':' type
     ;
     type: "id";
     name: "id";
     name_list:
       name
     | name ',' name_list
     ;

   It would seem that this grammar can be parsed with only a single
token of lookahead: when a `param_spec' is being read, an `"id"' is a
`name' if a comma or colon follows, or a `type' if another `"id"'
follows.  In other words, this grammar is LR(1).

   However, for historical reasons, Bison cannot by default handle all
LR(1) grammars.  In this grammar, two contexts, that after an `"id"' at
the beginning of a `param_spec' and likewise at the beginning of a
`return_spec', are similar enough that Bison assumes they are the same.
They appear similar because the same set of rules would be active--the
rule for reducing to a `name' and that for reducing to a `type'.
Bison is unable to determine at that stage of processing that the rules
would require different lookahead tokens in the two contexts, so it
makes a single parser state for them both.  Combining the two contexts
causes a conflict later.  In parser terminology, this occurrence means
that the grammar is not LALR(1).

   For many practical grammars (specifically those that fall into the
non-LR(1) class), the limitations of LALR(1) result in difficulties
beyond just mysterious reduce/reduce conflicts.  The best way to fix
all these problems is to select a different parser table construction
algorithm.  Either IELR(1) or canonical LR(1) would suffice, but the
former is more efficient and easier to debug during development.  *Note
LR Table Construction::, for details.  (Bison's IELR(1) and canonical
LR(1) implementations are experimental.  More user feedback will help
to stabilize them.)

   If you instead wish to work around LALR(1)'s limitations, you can
often fix a mysterious conflict by identifying the two parser states
that are being confused, and adding something to make them look
distinct.  In the above example, adding one rule to `return_spec' as
follows makes the problem go away:

     ...
     return_spec:
       type
     | name ':' type
     | "id" "bogus" /* This rule is never used.  */
     ;

   This corrects the problem because it introduces the possibility of an
additional active rule in the context after the `"id"' at the beginning
of `return_spec'.  This rule is not active in the corresponding context
in a `param_spec', so the two contexts receive distinct parser states.
As long as the token `"bogus"' is never generated by `yylex', the added
rule cannot alter the way actual input is parsed.

   In this particular example, there is another way to solve the
problem: rewrite the rule for `return_spec' to use `"id"' directly
instead of via `name'.
This also causes the two confusing contexts to have different sets of
active rules, because the one for `return_spec' activates the altered
rule for `return_spec' rather than the one for `name'.

     param_spec:
       type
     | name_list ':' type
     ;
     return_spec:
       type
     | "id" ':' type
     ;

   For a more detailed exposition of LALR(1) parsers and parser
generators, *note DeRemer 1982: Bibliography.


File: bison.info,  Node: Tuning LR,  Next: Generalized LR Parsing,  Prev: Mysterious Conflicts,  Up: Algorithm

5.8 Tuning LR
=============

The default behavior of Bison's LR-based parsers is chosen mostly for
historical reasons, but that behavior is often not robust.  For
example, in the previous section, we discussed the mysterious conflicts
that can be produced by LALR(1), Bison's default parser table
construction algorithm.  Another example is Bison's `%error-verbose'
directive, which instructs the generated parser to produce verbose
syntax error messages, which can sometimes contain incorrect
information.

   In this section, we explore several modern features of Bison that
allow you to tune fundamental aspects of the generated LR-based
parsers.  Some of these features easily eliminate shortcomings like
those mentioned above.  Others can be helpful purely for understanding
your parser.

   Most of the features discussed in this section are still
experimental.  More user feedback will help to stabilize them.

* Menu:

* LR Table Construction::     Choose a different construction algorithm.
* Default Reductions::        Disable default reductions.
* LAC::                       Correct lookahead sets in the parser states.
* Unreachable States::        Keep unreachable parser states for debugging.


File: bison.info,  Node: LR Table Construction,  Next: Default Reductions,  Up: Tuning LR

5.8.1 LR Table Construction
---------------------------

For historical reasons, Bison constructs LALR(1) parser tables by
default.  However, LALR does not possess the full language-recognition
power of LR.  As a result, the behavior of parsers employing LALR
parser tables is often mysterious.  We presented a simple example of
this effect in *note Mysterious Conflicts::.

   As we also demonstrated in that example, the traditional approach to
eliminating such mysterious behavior is to restructure the grammar.
Unfortunately, doing so correctly is often difficult.  Moreover, merely
discovering that LALR causes mysterious behavior in your parser can be
difficult as well.

   Fortunately, Bison provides an easy way to eliminate the possibility
of such mysterious behavior altogether.  You simply need to activate a
more powerful parser table construction algorithm by using the `%define
lr.type' directive.

 -- Directive: %define lr.type TYPE
     Specify the type of parser tables within the LR(1) family.  The
     accepted values for TYPE are:

        * `lalr' (default)

        * `ielr'

        * `canonical-lr'

     (This feature is experimental.  More user feedback will help to
     stabilize it.)

   For example, to activate IELR, you might add the following directive
to your grammar file:

     %define lr.type ielr

For the example in *note Mysterious Conflicts::, the mysterious
conflict is then eliminated, so there is no need to invest time in
comprehending the conflict or restructuring the grammar to fix it.  If,
during future development, the grammar evolves such that all mysterious
behavior would have disappeared using just LALR, you need not fear that
continuing to use IELR will result in unnecessarily large parser tables.
That is, IELR generates LALR tables when LALR (using a deterministic
parsing algorithm) is sufficient to support the full
language-recognition power of LR.  Thus, by enabling IELR at the start
of grammar development, you can safely and completely eliminate the
need to consider LALR's shortcomings.

   While IELR is almost always preferable, there are circumstances
where LALR or the canonical LR parser tables described by Knuth (*note
Knuth 1965: Bibliography.) can be useful.  Here we summarize the
relative advantages of each parser table construction algorithm within
Bison:

   * LALR

     There are at least two scenarios where LALR can be worthwhile:

        * GLR without static conflict resolution.

          When employing GLR parsers (*note GLR Parsers::), if you do
          not resolve any conflicts statically (for example, with
          `%left' or `%prec'), then the parser explores all potential
          parses of any given input.  In this case, the choice of
          parser table construction algorithm is guaranteed not to alter
          the language accepted by the parser.  LALR parser tables are
          the smallest parser tables Bison can currently construct, so
          they may then be preferable.  Nevertheless, once you begin to
          resolve conflicts statically, GLR behaves more like a
          deterministic parser in the syntactic contexts where those
          conflicts appear, and so either IELR or canonical LR can then
          be helpful to avoid LALR's mysterious behavior.

        * Malformed grammars.

          Occasionally during development, an especially malformed
          grammar with a major recurring flaw may severely impede the
          IELR or canonical LR parser table construction algorithm.
          LALR can be a quick way to construct parser tables in order
          to investigate such problems while ignoring the more subtle
          differences from IELR and canonical LR.

   * IELR

     IELR (Inadequacy Elimination LR) is a minimal LR algorithm.  That
     is, given any grammar (LR or non-LR), parsers using IELR or
     canonical LR parser tables always accept exactly the same set of
     sentences.  However, like LALR, IELR merges parser states during
     parser table construction so that the number of parser states is
     often an order of magnitude less than for canonical LR.  More
     importantly, because canonical LR's extra parser states may contain
     duplicate conflicts in the case of non-LR grammars, the number of
     conflicts for IELR is often an order of magnitude less as well.
     This effect can significantly reduce the complexity of developing
     a grammar.

   * Canonical LR

     While inefficient, canonical LR parser tables can be an
     interesting means to explore a grammar because they possess a
     property that IELR and LALR tables do not.  That is, if
     `%nonassoc' is not used and default reductions are left disabled
     (*note Default Reductions::), then, for every left context of
     every canonical LR state, the set of tokens accepted by that state
     is guaranteed to be the exact set of tokens that is syntactically
     acceptable in that left context.  It might then seem that an
     advantage of canonical LR parsers in production is that, under the
     above constraints, they are guaranteed to detect a syntax error as
     soon as possible without performing any unnecessary reductions.
     However, IELR parsers that use LAC are also able to achieve this
     behavior without sacrificing `%nonassoc' or default reductions.
     For details and a few caveats of LAC, *note LAC::.

   For a more detailed exposition of the mysterious behavior in LALR
parsers and the benefits of IELR, *note Denny 2008 March: Bibliography,
and *note Denny 2010 November: Bibliography.


File: bison.info,  Node: Default Reductions,  Next: LAC,  Prev: LR Table Construction,  Up: Tuning LR

5.8.2 Default Reductions
------------------------

After parser table construction, Bison identifies the reduction with the
largest lookahead set in each parser state.  To reduce the size of the
parser state, traditional Bison behavior is to remove that lookahead
set and to assign that reduction to be the default parser action.  Such
a reduction is known as a "default reduction".

   Default reductions affect more than the size of the parser tables.
They also affect the behavior of the parser:

   * Delayed `yylex' invocations.

     A "consistent state" is a state that has only one possible parser
     action.  If that action is a reduction and is encoded as a default
     reduction, then that consistent state is called a "defaulted
     state".  Upon reaching a defaulted state, a Bison-generated parser
     does not bother to invoke `yylex' to fetch the next token before
     performing the reduction.  In other words, whether default
     reductions are enabled in consistent states determines how soon a
     Bison-generated parser invokes `yylex' for a token: immediately
     when it _reaches_ that token in the input or when it eventually
     _needs_ that token as a lookahead to determine the next parser
     action.  Traditionally, default reductions are enabled, and so the
     parser exhibits the latter behavior.

     The presence of defaulted states is an important consideration when
     designing `yylex' and the grammar file.  That is, if the behavior
     of `yylex' can influence or be influenced by the semantic actions
     associated with the reductions in defaulted states, then the delay
     of the next `yylex' invocation until after those reductions is
     significant.  For example, the semantic actions might pop a scope
     stack that `yylex' uses to determine what token to return.
     Thus, the delay might be necessary to ensure that `yylex' does not
     look up the next token in a scope that should already be
     considered closed.

   * Delayed syntax error detection.

     When the parser fetches a new token by invoking `yylex', it checks
     whether there is an action for that token in the current parser
     state.  The parser detects a syntax error if and only if either
     (1) there is no action for that token or (2) the action for that
     token is the error action (due to the use of `%nonassoc').
     However, if there is a default reduction in that state (which
     might or might not be a defaulted state), then it is impossible
     for condition 1 to exist.  That is, all tokens have an action.
     Thus, the parser sometimes fails to detect the syntax error until
     it reaches a later state.

     While default reductions never cause the parser to accept
     syntactically incorrect sentences, the delay of syntax error
     detection can have unexpected effects on the behavior of the
     parser.  However, the delay can be caused anyway by parser state
     merging and the use of `%nonassoc', and it can be fixed by another
     Bison feature, LAC.  We discuss the effects of delayed syntax
     error detection and LAC more in the next section (*note LAC::).

   For canonical LR, the only default reduction that Bison enables by
default is the accept action, which appears only in the accepting
state, which has no other action and is thus a defaulted state.
However, the default accept action does not delay any `yylex'
invocation or syntax error detection because the accept action ends the
parse.

   For LALR and IELR, Bison enables default reductions in nearly all
states by default.  There are only two exceptions.
First, states that have a shift action on the `error' token do not have
default reductions because delayed syntax error detection could then
prevent the `error' token from ever being shifted in that state.
However, parser state merging can cause the same effect anyway, and LAC
fixes it in both cases, so future versions of Bison might drop this
exception when LAC is activated.  Second, GLR parsers do not record the
default reduction as the action on a lookahead token for which there is
a conflict.  The correct action in this case is to split the parse
instead.

   To adjust which states have default reductions enabled, use the
`%define lr.default-reductions' directive.

 -- Directive: %define lr.default-reductions WHERE
     Specify the kind of states that are permitted to contain default
     reductions.  The accepted values of WHERE are:

        * `most' (default for LALR and IELR)

        * `consistent'

        * `accepting' (default for canonical LR)

     (The ability to specify where default reductions are permitted is
     experimental.  More user feedback will help to stabilize it.)


File: bison.info,  Node: LAC,  Next: Unreachable States,  Prev: Default Reductions,  Up: Tuning LR

5.8.3 LAC
---------

Canonical LR, IELR, and LALR can suffer from a couple of problems upon
encountering a syntax error.  First, the parser might perform additional
parser stack reductions before discovering the syntax error.  Such
reductions can perform user semantic actions that are unexpected because
they are based on an invalid token, and they cause error recovery to
begin in a different syntactic context than the one in which the
invalid token was encountered.  Second, when verbose error messages are
enabled (*note Error Reporting::), the expected token list in the
syntax error message can both contain invalid tokens and omit valid
tokens.

   The culprits for the above problems are `%nonassoc', default
reductions in inconsistent states (*note Default Reductions::), and
parser state merging.  Because IELR and LALR merge parser states, they
suffer the most.  Canonical LR can suffer only if `%nonassoc' is used
or if default reductions are enabled for inconsistent states.

   LAC (Lookahead Correction) is a new mechanism within the parsing
algorithm that solves these problems for canonical LR, IELR, and LALR
without sacrificing `%nonassoc', default reductions, or state merging.
You can enable LAC with the `%define parse.lac' directive.

 -- Directive: %define parse.lac VALUE
     Enable LAC to improve syntax error handling.  The accepted values
     of VALUE are:

        * `none' (default)

        * `full'

     (This feature is experimental.  More user feedback will help to
     stabilize it.  Moreover, it is currently only available for
     deterministic parsers in C.)

   Conceptually, the LAC mechanism is straightforward.  Whenever the
parser fetches a new token from the scanner so that it can determine
the next parser action, it immediately suspends normal parsing and
performs an exploratory parse using a temporary copy of the normal
parser state stack.  During this exploratory parse, the parser does not
perform user semantic actions.  If the exploratory parse reaches a
shift action, normal parsing then resumes on the normal parser stacks.
If the exploratory parse reaches an error instead, the parser reports a
syntax error.  If verbose syntax error messages are enabled, the parser
must then discover the list of expected tokens, so it performs a
separate exploratory parse for each token in the grammar.

   There is one subtlety about the use of LAC.
That is, when in a consistent parser state with a default reduction,
the parser will not attempt to fetch a token from the scanner because
no lookahead is needed to determine the next parser action.  Thus,
whether default reductions are enabled in consistent states (*note
Default Reductions::) affects how soon the parser detects a syntax
error: immediately when it _reaches_ an erroneous token or when it
eventually _needs_ that token as a lookahead to determine the next
parser action.  The latter behavior is probably more intuitive, so
Bison currently provides no way to achieve the former behavior while
default reductions are enabled in consistent states.

   Thus, when LAC is in use, for some fixed decision of whether to
enable default reductions in consistent states, canonical LR and IELR
behave almost exactly the same for both syntactically acceptable and
syntactically unacceptable input.  While LALR still does not support
the full language-recognition power of canonical LR and IELR, LAC at
least enables LALR's syntax error handling to correctly reflect LALR's
language-recognition power.

   There are a few caveats to consider when using LAC:

   * Infinite parsing loops.

     IELR plus LAC does have one shortcoming relative to canonical LR.
     Some parsers generated by Bison can loop infinitely.  LAC does not
     fix infinite parsing loops that occur between encountering a
     syntax error and detecting it, but enabling canonical LR or
     disabling default reductions sometimes does.

   * Verbose error message limitations.

     Because of internationalization considerations, Bison-generated
     parsers limit the size of the expected token list they are willing
     to report in a verbose syntax error message.  If the number of
     expected tokens exceeds that limit, the list is simply dropped
     from the message.
     Enabling LAC can increase the size of the list and thus cause the
     parser to drop it.  Of course, dropping the list is better than
     reporting an incorrect list.

   * Performance.

     Because LAC requires many parse actions to be performed twice, it
     can have a performance penalty.  However, not all parse actions
     must be performed twice.  Specifically, during a series of default
     reductions in consistent states and shift actions, the parser
     never has to initiate an exploratory parse.  Moreover, the most
     time-consuming tasks in a parse are often the file I/O, the
     lexical analysis performed by the scanner, and the user's semantic
     actions, but none of these are performed during the exploratory
     parse.  Finally, the base of the temporary stack used during an
     exploratory parse is a pointer into the normal parser state stack
     so that the stack is never physically copied.  In our experience,
     the performance penalty of LAC has proved insignificant for
     practical grammars.

   While the LAC algorithm shares techniques that have been recognized
in the parser community for years, for the publication that introduces
LAC, *note Denny 2010 May: Bibliography.


File: bison.info,  Node: Unreachable States,  Prev: LAC,  Up: Tuning LR

5.8.4 Unreachable States
------------------------

If there exists no sequence of transitions from the parser's start
state to some state S, then Bison considers S to be an "unreachable
state".  A state can become unreachable during conflict resolution if
Bison disables a shift action leading to it from a predecessor state.

   By default, Bison removes unreachable states from the parser after
conflict resolution because they are useless in the generated parser.
However, keeping unreachable states is sometimes useful when trying to
understand the relationship between the parser and the grammar.
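
   The definition of an unreachable state given above amounts to a
reachability search over the parser's state graph.  The following C
code is only a minimal sketch of that idea under stated assumptions; it
is not Bison's implementation.  The names `transitions',
`mark_reachable', `NSTATES', `MAX_OUT' and `NO_TARGET' are
hypothetical, and the five-state graph is invented for illustration:
state 2 is assumed to have had a shift to state 4 that conflict
resolution disabled, which leaves state 4 without any path from the
start state.

```c
#include <assert.h>
#include <string.h>

#define NSTATES   5      /* hypothetical number of parser states */
#define MAX_OUT   2      /* hypothetical maximum successors per state */
#define NO_TARGET (-1)   /* marks an absent (or disabled) transition */

/* Hypothetical transition table: transitions[s][k] is the k-th
   successor of state s.  The shift from state 2 to state 4 is assumed
   to have been disabled during conflict resolution, so it is recorded
   as NO_TARGET.  */
static const int transitions[NSTATES][MAX_OUT] = {
  { 1, 2 },                  /* state 0 (start) */
  { 3, NO_TARGET },          /* state 1 */
  { NO_TARGET, NO_TARGET },  /* state 2: shift to state 4 disabled */
  { NO_TARGET, NO_TARGET },  /* state 3 */
  { 3, NO_TARGET },          /* state 4: has a successor, no predecessor */
};

/* Set reachable[s] to 1 for every state s that can be reached from
   START by following transitions, and to 0 for every other state.  */
void
mark_reachable (int start, int reachable[NSTATES])
{
  int worklist[NSTATES];   /* each state is pushed at most once */
  int top = 0;

  assert (start >= 0 && start < NSTATES);
  memset (reachable, 0, NSTATES * sizeof reachable[0]);
  reachable[start] = 1;
  worklist[top++] = start;

  while (top > 0)
    {
      int s = worklist[--top];
      for (int k = 0; k < MAX_OUT; k++)
        {
          int t = transitions[s][k];
          if (t != NO_TARGET && !reachable[t])
            {
              reachable[t] = 1;
              worklist[top++] = t;
            }
        }
    }
}
```

   Calling `mark_reachable (0, r)' on this table marks states 0 through
3 and leaves state 4 unmarked; in the terms used above, state 4 is the
one that would be considered unreachable.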

 -- Directive: %define lr.keep-unreachable-states VALUE
     Request that Bison allow unreachable states to remain in the
     parser tables.  VALUE must be a Boolean.  The default is `false'.

   There are a few caveats to consider:

   * Missing or extraneous warnings.

     Unreachable states may contain conflicts and may use rules not
     used in any other state.  Thus, keeping unreachable states may
     induce warnings that are irrelevant to your parser's behavior, and
     it may eliminate warnings that are relevant.  Of course, the
     change in warnings may actually be relevant to a parser table
     analysis that wants to keep unreachable states, so this behavior
     will likely remain in future Bison releases.

   * Other useless states.

     While Bison is able to remove unreachable states, it is not
     guaranteed to remove other kinds of useless states.  Specifically,
     when Bison disables reduce actions during conflict resolution,
     some goto actions may become useless, and thus some additional
     states may become useless.  If Bison were to compute which goto
     actions were useless and then disable those actions, it could
     identify such states as unreachable and then remove those states.
     However, Bison does not compute which goto actions are useless.


File: bison.info,  Node: Generalized LR Parsing,  Next: Memory Management,  Prev: Tuning LR,  Up: Algorithm

5.9 Generalized LR (GLR) Parsing
================================

Bison produces _deterministic_ parsers that choose uniquely when to
reduce and which reduction to apply based on a summary of the preceding
input and on one extra token of lookahead.  As a result, normal Bison
handles a proper subset of the family of context-free languages.
Ambiguous grammars, since they have strings with more than one possible
sequence of reductions, cannot have deterministic parsers in this sense.
The same is true of languages that require more than one symbol of
lookahead, since the parser lacks the information necessary to make a
decision at the point it must be made in a shift-reduce parser.
Finally, as previously mentioned (*note Mysterious Conflicts::), there
are languages where Bison's default choice of how to summarize the
input seen so far loses necessary information.

   When you use the `%glr-parser' declaration in your grammar file,
Bison generates a parser that uses a different algorithm, called
Generalized LR (or GLR).  A Bison GLR parser uses the same basic
algorithm for parsing as an ordinary Bison parser, but behaves
differently in cases where there is a shift-reduce conflict that has not
been resolved by precedence rules (*note Precedence::) or a
reduce-reduce conflict.  When a GLR parser encounters such a situation,
it effectively _splits_ into several parsers, one for each possible
shift or reduction.  These parsers then proceed as usual, consuming
tokens in lock-step.  Some of the stacks may encounter other conflicts
and split further, with the result that instead of a sequence of states,
a Bison GLR parsing stack is in effect a tree of states.

   In effect, each stack represents a guess as to what the proper parse
is.  Additional input may indicate that a guess was wrong, in which case
the appropriate stack silently disappears.  Otherwise, the semantic
actions generated in each stack are saved, rather than being executed
immediately.  When a stack disappears, its saved semantic actions never
get executed.  When a reduction causes two stacks to become equivalent,
their sets of semantic actions are both saved with the state that
results from the reduction.
We say that two stacks are equivalent when they both represent the same
sequence of states, and each pair of corresponding states represents a
grammar symbol that produces the same segment of the input token
stream.

   Whenever the parser makes a transition from having multiple states
to having one, it reverts to the normal deterministic parsing
algorithm, after resolving and executing the saved-up actions.  At this
transition, some of the states on the stack will have semantic values
that are sets (actually multisets) of possible actions.  The parser
tries to pick one of the actions by first finding one whose rule has
the highest dynamic precedence, as set by the `%dprec' declaration.
Otherwise, if the alternative actions are not ordered by precedence,
but the same merging function is declared for both rules by the
`%merge' declaration, Bison resolves and evaluates both and then calls
the merge function on the result.  Otherwise, it reports an ambiguity.

   It is possible to use a data structure for the GLR parsing tree that
permits the processing of any LR(1) grammar in linear time (in the size
of the input), any unambiguous (not necessarily LR(1)) grammar in
quadratic worst-case time, and any general (possibly ambiguous)
context-free grammar in cubic worst-case time.  However, Bison currently
uses a simpler data structure that requires time proportional to the
length of the input times the maximum number of stacks required for any
prefix of the input.  Thus, really ambiguous or nondeterministic
grammars can require exponential time and space to process.  Such badly
behaving examples, however, are not generally of practical interest.
Usually, nondeterminism in a grammar is local--the parser is "in doubt"
only for a few tokens at a time.  Therefore, the current data structure
should generally be adequate.
On LR(1) portions of a grammar, in particular, it is only slightly
slower than with the deterministic LR(1) Bison parser.

   For a more detailed exposition of GLR parsers, *note Scott 2000:
Bibliography.


File: bison.info,  Node: Memory Management,  Prev: Generalized LR Parsing,  Up: Algorithm

5.10 Memory Management, and How to Avoid Memory Exhaustion
==========================================================

The Bison parser stack can run out of memory if too many tokens are
shifted and not reduced.  When this happens, the parser function
`yyparse' calls `yyerror' and then returns 2.

   Because Bison parsers have growing stacks, hitting the upper limit
usually results from using a right recursion instead of a left
recursion, see *note Recursive Rules: Recursion.

   By defining the macro `YYMAXDEPTH', you can control how deep the
parser stack can become before memory is exhausted.  Define the macro
with a value that is an integer.  This value is the maximum number of
tokens that can be shifted (and not reduced) before overflow.

   The stack space allowed is not necessarily allocated.  If you
specify a large value for `YYMAXDEPTH', the parser normally allocates a
small stack at first, and then makes it bigger by stages as needed.
This increasing allocation happens automatically and silently.
Therefore, you do not need to make `YYMAXDEPTH' painfully small merely
to save space for ordinary inputs that do not need much stack.

   However, do not allow `YYMAXDEPTH' to be a value so large that
arithmetic overflow could occur when calculating the size of the stack
space.  Also, do not allow `YYMAXDEPTH' to be less than `YYINITDEPTH'.

   The default value of `YYMAXDEPTH', if you do not define it, is 10000.

   You can control how much stack is allocated initially by defining the
macro `YYINITDEPTH' to a positive integer.
For the deterministic parser in C, this value must be a compile-time
constant unless you are assuming C99 or some other target language or
compiler that allows variable-length arrays.  The default is 200.

   Do not allow `YYINITDEPTH' to be greater than `YYMAXDEPTH'.

   Because of semantic differences between C and C++, the deterministic
parsers in C produced by Bison cannot grow when compiled by C++
compilers.  In this precise case (compiling a C parser as C++), we
suggest that you increase `YYINITDEPTH'.  The Bison maintainers hope to
fix this deficiency in a future release.


File: bison.info,  Node: Error Recovery,  Next: Context Dependency,  Prev: Algorithm,  Up: Top

6 Error Recovery
****************

It is not usually acceptable to have a program terminate on a syntax
error.  For example, a compiler should recover sufficiently to parse the
rest of the input file and check it for errors; a calculator should
accept another expression.

   In a simple interactive command parser where each input is one line,
it may be sufficient to allow `yyparse' to return 1 on error and have
the caller ignore the rest of the input line when that happens (and
then call `yyparse' again).  But this is inadequate for a compiler,
because it forgets all the syntactic context leading up to the error.
A syntax error deep within a function in the compiler input should not
cause the compiler to treat the following line like the beginning of a
source file.

   You can define how to recover from a syntax error by writing rules to
recognize the special token `error'.  This is a terminal symbol that is
always defined (you need not declare it) and reserved for error
handling.  The Bison parser generates an `error' token whenever a
syntax error happens; if you have provided a rule to recognize this
token in the current context, the parse can continue.

   For example:

     stmts:
       /* empty string */
     | stmts '\n'
     | stmts exp '\n'
     | stmts error '\n'

   The fourth rule in this example says that an error followed by a
newline makes a valid addition to any `stmts'.

   What happens if a syntax error occurs in the middle of an `exp'?  The
error recovery rule, interpreted strictly, applies to the precise
sequence of a `stmts', an `error' and a newline.  If an error occurs in
the middle of an `exp', there will probably be some additional tokens
and subexpressions on the stack after the last `stmts', and there will
be tokens to read before the next newline.  So the rule is not
applicable in the ordinary way.

   But Bison can force the situation to fit the rule, by discarding
part of the semantic context and part of the input.  First it discards
states and objects from the stack until it gets back to a state in
which the `error' token is acceptable.  (This means that the
subexpressions already parsed are discarded, back to the last complete
`stmts'.)  At this point the `error' token can be shifted.  Then, if
the old lookahead token is not acceptable to be shifted next, the
parser reads tokens and discards them until it finds a token which is
acceptable.  In this example, Bison reads and discards input until the
next newline so that the fourth rule can apply.  Note that discarded
symbols are possible sources of memory leaks, see *note Freeing
Discarded Symbols: Destructor Decl, for a means to reclaim this memory.

   The choice of error rules in the grammar is a choice of strategies
for error recovery.  A simple and useful strategy is simply to skip the
rest of the current input line or current statement if an error is
detected:

     stmt: error ';'  /* On error, skip until ';' is read.
                         */

   It is also useful to recover to the matching close-delimiter of an
opening-delimiter that has already been parsed.  Otherwise the
close-delimiter will probably appear to be unmatched, and generate
another, spurious error message:

     primary:
       '(' expr ')'
     | '(' error ')'
     ...
     ;

   Error recovery strategies are necessarily guesses.  When they guess
wrong, one syntax error often leads to another.  In the above example,
the error recovery rule guesses that an error is due to bad input
within one `stmt'.  Suppose that instead a spurious semicolon is
inserted in the middle of a valid `stmt'.  After the error recovery
rule recovers from the first error, another syntax error will be found
straightaway, since the text following the spurious semicolon is also
an invalid `stmt'.

   To prevent an outpouring of error messages, the parser will output
no error message for another syntax error that happens shortly after
the first; only after three consecutive input tokens have been
successfully shifted will error messages resume.

   Note that rules which accept the `error' token may have actions, just
as any other rules can.

   You can make error messages resume immediately by using the macro
`yyerrok' in an action.  If you do this in the error rule's action, no
error messages will be suppressed.  This macro requires no arguments;
`yyerrok;' is a valid C statement.

   The previous lookahead token is reanalyzed immediately after an
error.  If this is unacceptable, then the macro `yyclearin' may be used
to clear this token.  Write the statement `yyclearin;' in the error
rule's action.  *Note Special Features for Use in Actions: Action
Features.

   For example, suppose that on a syntax error, an error handling
routine is called that advances the input stream to some point where
parsing should once again commence.
The next symbol returned by the lexical scanner is probably correct.
The previous lookahead token ought to be discarded with `yyclearin;'.

   The expression `YYRECOVERING ()' yields 1 when the parser is
recovering from a syntax error, and 0 otherwise.  Syntax error
diagnostics are suppressed while recovering from a syntax error.


File: bison.info,  Node: Context Dependency,  Next: Debugging,  Prev: Error Recovery,  Up: Top

7 Handling Context Dependencies
*******************************

The Bison paradigm is to parse tokens first, then group them into larger
syntactic units.  In many languages, the meaning of a token is affected
by its context.  Although this violates the Bison paradigm, certain
techniques (known as "kludges") may enable you to write Bison parsers
for such languages.

* Menu:

* Semantic Tokens::   Token parsing can depend on the semantic context.
* Lexical Tie-ins::   Token parsing can depend on the syntactic context.
* Tie-in Recovery::   Lexical tie-ins have implications for how
                        error recovery rules must be written.

   (Actually, "kludge" means any technique that gets its job done but is
neither clean nor robust.)


File: bison.info,  Node: Semantic Tokens,  Next: Lexical Tie-ins,  Up: Context Dependency

7.1 Semantic Info in Token Types
================================

The C language has a context dependency: the way an identifier is used
depends on what its current meaning is.  For example, consider this:

     foo (x);

   This looks like a function call statement, but if `foo' is a typedef
name, then this is actually a declaration of `x'.  How can a Bison
parser for C decide how to parse this input?

   The method used in GNU C is to have two different token types,
`IDENTIFIER' and `TYPENAME'.
When `yylex' finds an identifier, it looks up the current declaration
of the identifier in order to decide which token type to return:
`TYPENAME' if the identifier is declared as a typedef, `IDENTIFIER'
otherwise.

   The grammar rules can then express the context dependency by the
choice of token type to recognize.  `IDENTIFIER' is accepted as an
expression, but `TYPENAME' is not.  `TYPENAME' can start a declaration,
but `IDENTIFIER' cannot.  In contexts where the meaning of the
identifier is _not_ significant, such as in declarations that can
shadow a typedef name, either `TYPENAME' or `IDENTIFIER' is
accepted--there is one rule for each of the two token types.

   This technique is simple to use if the decision of which kinds of
identifiers to allow is made at a place close to where the identifier is
parsed.  But in C this is not always so: C allows a declaration to
redeclare a typedef name provided an explicit type has been specified
earlier:

     typedef int foo, bar;
     int baz (void)
     {
       static bar (bar);      /* redeclare `bar' as static variable */
       extern foo foo (foo);  /* redeclare `foo' as function */
       return foo (bar);
     }

   Unfortunately, the name being declared is separated from the
declaration construct itself by a complicated syntactic structure--the
"declarator".

   As a result, part of the Bison parser for C needs to be duplicated,
with all the nonterminal names changed: once for parsing a declaration
in which a typedef name can be redefined, and once for parsing a
declaration in which that can't be done.
Here is a part of the duplication, with actions omitted for brevity:

     initdcl:
       declarator maybeasm '=' init
     | declarator maybeasm
     ;

     notype_initdcl:
       notype_declarator maybeasm '=' init
     | notype_declarator maybeasm
     ;

Here `initdcl' can redeclare a typedef name, but `notype_initdcl'
cannot.  The distinction between `declarator' and `notype_declarator'
is the same sort of thing.

   There is some similarity between this technique and a lexical tie-in
(described next), in that information which alters the lexical analysis
is changed during parsing by other parts of the program.  The
difference is that here the information is global, and is used for
other purposes in the program.  A true lexical tie-in has a
special-purpose flag controlled by the syntactic context.


File: bison.info,  Node: Lexical Tie-ins,  Next: Tie-in Recovery,  Prev: Semantic Tokens,  Up: Context Dependency

7.2 Lexical Tie-ins
===================

One way to handle context-dependency is the "lexical tie-in": a flag
which is set by Bison actions, whose purpose is to alter the way tokens
are parsed.

   For example, suppose we have a language vaguely like C, but with a
special construct `hex (HEX-EXPR)'.  After the keyword `hex' comes an
expression in parentheses in which all integers are hexadecimal.  In
particular, the token `a1b' must be treated as an integer rather than
as an identifier if it appears in that context.  Here is how you can do
it:

     %{
       int hexflag;
       int yylex (void);
       void yyerror (char const *);
     %}
     %%
     ...
     expr:
       IDENTIFIER
     | constant
     | HEX '('        { hexflag = 1; }
         expr ')'     { hexflag = 0; $$ = $4; }
     | expr '+' expr  { $$ = make_sum ($1, $3); }
     ...
     ;

     constant:
       INTEGER
     | STRING
     ;

Here we assume that `yylex' looks at the value of `hexflag'; when it is
nonzero, all integers are parsed in hexadecimal, and tokens starting
with letters are parsed as integers if possible.

   The declaration of `hexflag' shown in the prologue of the grammar
file is needed to make it accessible to the actions (*note The
Prologue: Prologue.). You must also write the code in `yylex' to obey
the flag.


File: bison.info, Node: Tie-in Recovery, Prev: Lexical Tie-ins, Up: Context Dependency

7.3 Lexical Tie-ins and Error Recovery
======================================

Lexical tie-ins make strict demands on any error recovery rules you
have. *Note Error Recovery::.

   The reason for this is that the purpose of an error recovery rule is
to abort the parsing of one construct and resume in some larger
construct. For example, in C-like languages, a typical error recovery
rule is to skip tokens until the next semicolon, and then start a new
statement, like this:

     stmt:
       expr ';'
     | IF '(' expr ')' stmt { ... }
     ...
     | error ';'  { hexflag = 0; }
     ;

   If there is a syntax error in the middle of a `hex (EXPR)'
construct, this error rule will apply, and then the action for the
completed `hex (EXPR)' will never run. So `hexflag' would remain set
for the entire rest of the input, or until the next `hex' keyword,
causing identifiers to be misinterpreted as integers.

   To avoid this problem the error recovery rule itself clears
`hexflag'.

   There may also be an error recovery rule that works within
expressions. For example, there could be a rule which applies within
parentheses and skips to the close-parenthesis:

     expr:
       ...
     | '(' expr ')'  { $$ = $2; }
     | '(' error ')'
     ...

   If this rule acts within the `hex' construct, it is not going to
abort that construct (since it applies to an inner level of parentheses
within the construct). Therefore, it should not clear the flag: the
rest of the `hex' construct should be parsed with the flag still in
effect.

   What if there is an error recovery rule which might abort out of the
`hex' construct or might not, depending on circumstances? There is no
way you can write the action to determine whether a `hex' construct is
being aborted or not. So if you are using a lexical tie-in, you had
better make sure your error recovery rules are not of this kind. Each
rule must be such that you can be sure that it always will, or always
won't, have to clear the flag.


File: bison.info, Node: Debugging, Next: Invocation, Prev: Context Dependency, Up: Top

8 Debugging Your Parser
***********************

Developing a parser can be a challenge, especially if you don't
understand the algorithm (*note The Bison Parser Algorithm:
Algorithm.). This chapter explains how to understand and debug a
parser.

   The first sections focus on the static part of the parser: its
structure. They explain how to generate and read the detailed
description of the automaton. There are several formats available:
   - as text, see *note Understanding Your Parser: Understanding.;

   - as a graph, see *note Visualizing Your Parser: Graphviz.;

   - or as a markup report that can be turned, for instance, into HTML,
     see *note Visualizing your parser in multiple formats: Xml.

   The last section focuses on the dynamic part of the parser: how to
enable and understand the parser run-time traces (*note Tracing Your
Parser: Tracing.).

* Menu:

* Understanding::     Understanding the structure of your parser.
* Graphviz::          Getting a visual representation of the parser.
* Xml::               Getting a markup representation of the parser.
* Tracing::           Tracing the execution of your parser.


File: bison.info, Node: Understanding, Next: Graphviz, Up: Debugging

8.1 Understanding Your Parser
=============================

As documented elsewhere (*note The Bison Parser Algorithm: Algorithm.)
Bison parsers are "shift/reduce automata". In some cases (much more
frequent than one would hope), looking at this automaton is required to
tune or simply fix a parser.

   The textual file is generated when the options `--report' or
`--verbose' are specified, see *note Invoking Bison: Invocation. Its
name is made by removing `.tab.c' or `.c' from the parser
implementation file name, and adding `.output' instead. Therefore, if
the grammar file is `foo.y', then the parser implementation file is
called `foo.tab.c' by default. As a consequence, the verbose output
file is called `foo.output'.

   The following grammar file, `calc.y', will be used in the sequel:

     %token NUM STR
     %left '+' '-'
     %left '*'
     %%
     exp:
       exp '+' exp
     | exp '-' exp
     | exp '*' exp
     | exp '/' exp
     | NUM
     ;
     useless: STR;
     %%

   `bison' reports:

     calc.y: warning: 1 nonterminal useless in grammar
     calc.y: warning: 1 rule useless in grammar
     calc.y:12.1-7: warning: nonterminal useless in grammar: useless
     calc.y:12.10-12: warning: rule useless in grammar: useless: STR
     calc.y: conflicts: 7 shift/reduce

   When given `--report=state', in addition to `calc.tab.c', it creates
a file `calc.output' with contents detailed below. The order of the
output and the exact presentation might vary, but the interpretation is
the same.

The first section reports useless tokens, nonterminals and rules.
Useless nonterminals and rules are removed in order to produce a
smaller parser, but useless tokens are preserved, since they might be
used by the scanner (note the difference between "useless" and "unused"
below):

     Nonterminals useless in grammar
        useless

     Terminals unused in grammar
        STR

     Rules useless in grammar
         6 useless: STR

The next section lists states that still have conflicts.

     State 8 conflicts: 1 shift/reduce
     State 9 conflicts: 1 shift/reduce
     State 10 conflicts: 1 shift/reduce
     State 11 conflicts: 4 shift/reduce

Then Bison reproduces the exact grammar it used:

     Grammar

         0 $accept: exp $end

         1 exp: exp '+' exp
         2    | exp '-' exp
         3    | exp '*' exp
         4    | exp '/' exp
         5    | NUM

and reports the uses of the symbols:

     Terminals, with rules where they appear

     $end (0) 0
     '*' (42) 3
     '+' (43) 1
     '-' (45) 2
     '/' (47) 4
     error (256)
     NUM (258) 5
     STR (259)

     Nonterminals, with rules where they appear

     $accept (9)
         on left: 0
     exp (10)
         on left: 1 2 3 4 5, on right: 0 1 2 3 4

Bison then proceeds onto the automaton itself, describing each state
with its set of "items", also known as "pointed rules". Each item is a
production rule together with a point (`.') marking the location of the
input cursor.

     State 0

         0 $accept: . exp $end

         NUM  shift, and go to state 1

         exp  go to state 2

   This reads as follows: "state 0 corresponds to being at the very
beginning of the parsing, in the initial rule, right before the start
symbol (here, `exp'). When the parser returns to this state right
after having reduced a rule that produced an `exp', the control flow
jumps to state 2.
If there is no such transition on a nonterminal
symbol, and the lookahead is a `NUM', then this token is shifted onto
the parse stack, and the control flow jumps to state 1. Any other
lookahead triggers a syntax error."

   Even though the only active rule in state 0 seems to be rule 0, the
report lists `NUM' as a lookahead token because `NUM' can be at the
beginning of any rule deriving an `exp'. By default Bison reports the
so-called "core" or "kernel" of the item set, but if you want to see
more detail you can invoke `bison' with `--report=itemset' to list the
derived items as well:

     State 0

         0 $accept: . exp $end
         1 exp: . exp '+' exp
         2    | . exp '-' exp
         3    | . exp '*' exp
         4    | . exp '/' exp
         5    | . NUM

         NUM  shift, and go to state 1

         exp  go to state 2

In the state 1...

     State 1

         5 exp: NUM .

         $default  reduce using rule 5 (exp)

the rule 5, `exp: NUM;', is completed. Whatever the lookahead token
(`$default'), the parser will reduce it. If it was coming from State
0, then, after this reduction it will return to state 0, and will jump
to state 2 (`exp: go to state 2').

     State 2

         0 $accept: exp . $end
         1 exp: exp . '+' exp
         2    | exp . '-' exp
         3    | exp . '*' exp
         4    | exp . '/' exp

         $end  shift, and go to state 3
         '+'   shift, and go to state 4
         '-'   shift, and go to state 5
         '*'   shift, and go to state 6
         '/'   shift, and go to state 7

In state 2, the automaton can only shift a symbol. For instance,
because of the item `exp: exp . '+' exp', if the lookahead is `+' it is
shifted onto the parse stack, and the automaton jumps to state 4,
corresponding to the item `exp: exp '+' . exp'. Since there is no
default action, any lookahead not listed triggers a syntax error.
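
   The shift-only behavior of state 2 can be pictured as a table
lookup. The following is only a minimal sketch in C, not the code that
Bison generates: the names `state2_action' and `SYNTAX_ERROR', and the
use of 0 to stand for `$end', are assumptions made for illustration.
The target states are the ones listed for state 2 in the report above.

```c
/* Hypothetical sentinel: no action is listed for this lookahead.  */
#define SYNTAX_ERROR (-1)

/* Hypothetical helper modelling state 2 of `calc.output': return the
   state entered after shifting LOOKAHEAD, or SYNTAX_ERROR when the
   report lists no action for it.  In this sketch 0 stands for $end.  */
int
state2_action (int lookahead)
{
  switch (lookahead)
    {
    case 0:   return 3;   /* $end  shift, and go to state 3 */
    case '+': return 4;   /* '+'   shift, and go to state 4 */
    case '-': return 5;   /* '-'   shift, and go to state 5 */
    case '*': return 6;   /* '*'   shift, and go to state 6 */
    case '/': return 7;   /* '/'   shift, and go to state 7 */
    default:              /* state 2 has no $default action */
      return SYNTAX_ERROR;
    }
}
```

   With this sketch, `state2_action ('+')' yields 4, while a lookahead
that is not listed, such as `'('', yields `SYNTAX_ERROR'. A state with
a `$default' reduction, such as state 1, would instead return that
reduction from the `default' branch.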

   The state 3 is named the "final state", or the "accepting state":

     State 3

         0 $accept: exp $end .

         $default  accept

the initial rule is completed (the start symbol and the end-of-input
were read), the parsing exits successfully.

   The interpretation of states 4 to 7 is straightforward, and is left
to the reader.

     State 4

         1 exp: exp '+' . exp

         NUM  shift, and go to state 1

         exp  go to state 8


     State 5

         2 exp: exp '-' . exp

         NUM  shift, and go to state 1

         exp  go to state 9


     State 6

         3 exp: exp '*' . exp

         NUM  shift, and go to state 1

         exp  go to state 10


     State 7

         4 exp: exp '/' . exp

         NUM  shift, and go to state 1

         exp  go to state 11

   As was announced at the beginning of the report, `State 8
conflicts: 1 shift/reduce':

     State 8

         1 exp: exp . '+' exp
         1    | exp '+' exp .
         2    | exp . '-' exp
         3    | exp . '*' exp
         4    | exp . '/' exp

         '*'  shift, and go to state 6
         '/'  shift, and go to state 7

         '/'       [reduce using rule 1 (exp)]
         $default  reduce using rule 1 (exp)

   Indeed, there are two actions associated to the lookahead `/':
either shifting (and going to state 7), or reducing rule 1. The
conflict means that either the grammar is ambiguous, or the parser lacks
information to make the right decision. Indeed the grammar is
ambiguous, as, since we did not specify the precedence of `/', the
sentence `NUM + NUM / NUM' can be parsed as `NUM + (NUM / NUM)', which
corresponds to shifting `/', or as `(NUM + NUM) / NUM', which
corresponds to reducing rule 1.

   Because in deterministic parsing a single decision can be made, Bison
arbitrarily chose to disable the reduction, see *note Shift/Reduce
Conflicts: Shift/Reduce.
Discarded actions are reported between square
brackets.

   Note that all the previous states had a single possible action:
either shifting the next token and going to the corresponding state, or
reducing a single rule. In the other cases, i.e., when shifting _and_
reducing is possible or when _several_ reductions are possible, the
lookahead is required to select the action. State 8 is one such state:
if the lookahead is `*' or `/' then the action is shifting, otherwise
the action is reducing rule 1. In other words, the first two items,
corresponding to rule 1, are not eligible when the lookahead token is
`*', since we specified that `*' has higher precedence than `+'. More
generally, some items are eligible only with some set of possible
lookahead tokens. When run with `--report=lookahead', Bison specifies
these lookahead tokens:

     State 8

         1 exp: exp . '+' exp
         1    | exp '+' exp .  [$end, '+', '-', '/']
         2    | exp . '-' exp
         3    | exp . '*' exp
         4    | exp . '/' exp

         '*'  shift, and go to state 6
         '/'  shift, and go to state 7

         '/'       [reduce using rule 1 (exp)]
         $default  reduce using rule 1 (exp)

   Note however that while `NUM + NUM / NUM' is ambiguous (which
results in the conflicts on `/'), `NUM + NUM * NUM' is not: the
conflict was solved thanks to associativity and precedence directives.
If invoked with `--report=solved', Bison includes information about the
solved conflicts in the report:

     Conflict between rule 1 and token '+' resolved as reduce (%left '+').
     Conflict between rule 1 and token '-' resolved as reduce (%left '-').
     Conflict between rule 1 and token '*' resolved as shift ('+' < '*').

   The remaining states are similar:

     State 9

         1 exp: exp . '+' exp
         2    | exp . '-' exp
         2    | exp '-' exp .
         3    | exp . '*' exp
         4    | exp .
'/' exp

         '*'  shift, and go to state 6
         '/'  shift, and go to state 7

         '/'       [reduce using rule 2 (exp)]
         $default  reduce using rule 2 (exp)

     State 10

         1 exp: exp . '+' exp
         2    | exp . '-' exp
         3    | exp . '*' exp
         3    | exp '*' exp .
         4    | exp . '/' exp

         '/'  shift, and go to state 7

         '/'       [reduce using rule 3 (exp)]
         $default  reduce using rule 3 (exp)

     State 11

         1 exp: exp . '+' exp
         2    | exp . '-' exp
         3    | exp . '*' exp
         4    | exp . '/' exp
         4    | exp '/' exp .

         '+'  shift, and go to state 4
         '-'  shift, and go to state 5
         '*'  shift, and go to state 6
         '/'  shift, and go to state 7

         '+'       [reduce using rule 4 (exp)]
         '-'       [reduce using rule 4 (exp)]
         '*'       [reduce using rule 4 (exp)]
         '/'       [reduce using rule 4 (exp)]
         $default  reduce using rule 4 (exp)

Observe that state 11 contains conflicts not only due to the lack of
precedence of `/' with respect to `+', `-', and `*', but also because
the associativity of `/' is not specified.

   Bison may also produce an HTML version of this output, via an XML
file and XSLT processing (*note Visualizing your parser in multiple
formats: Xml.).


File: bison.info, Node: Graphviz, Next: Xml, Prev: Understanding, Up: Debugging

8.2 Visualizing Your Parser
===========================

As another means to gain better understanding of the shift/reduce
automaton corresponding to the Bison parser, a DOT file can be
generated. Note that debugging a real grammar with this is tedious at
best, and impractical most of the time, because the generated files
are huge (the generation of a PDF or PNG file from it will take very
long, and more often than not it will fail due to memory exhaustion).
This option was rather designed for beginners, to help them understand
LR parsers.

   This file is generated when the `--graph' option is specified (*note
Invoking Bison: Invocation.). Its name is made by removing `.tab.c' or
`.c' from the parser implementation file name, and adding `.dot'
instead. If the grammar file is `foo.y', the Graphviz output file is
called `foo.dot'. A DOT file may also be produced via an XML file and
XSLT processing (*note Visualizing your parser in multiple formats:
Xml.).

   The following grammar file, `rr.y', will be used in the sequel:

     %%
     exp: a ";" | b ".";
     a: "0";
     b: "0";

   The graphical output is very similar to the textual one, and as such
it is more easily understood by making direct comparisons between them.
*Note Debugging Your Parser: Debugging, for a detailed analysis of the
textual report.

Graphical Representation of States
----------------------------------

The items (pointed rules) for each state are grouped together in graph
nodes. Their numbering is the same as in the verbose file. See the
following points, about transitions, for examples.

   When invoked with `--report=lookaheads', the lookahead tokens, when
needed, are shown next to the relevant rule between square brackets as a
comma separated list. This is the case in the figure for the
representation of reductions, below.


   The transitions are represented as directed edges between the
current and the target states.

Graphical Representation of Shifts
----------------------------------

Shifts are shown as solid arrows, labelled with the lookahead token for
that shift. The following describes a shift in the `rr.output' file:

     State 3

         1 exp: a . ";"

         ";"  shift, and go to state 6

   A Graphviz rendering of this portion of the graph could be:

[image src="figs/example-shift.png" text=".----------------.
|    State 3     |
| 1 exp: a .
\";\" |
`----------------'
        |
        | \";\"
        |
        v
.----------------.
|    State 6     |
| 1 exp: a \";\" . |
`----------------'
"]

Graphical Representation of Reductions
--------------------------------------

Reductions are shown as solid arrows, leading to a diamond-shaped node
bearing the number of the reduction rule. The arrow is labelled with the
appropriate comma separated lookahead tokens. If the reduction is the
default action for the given state, there is no such label.

   This is how reductions are represented in the verbose file
`rr.output':
     State 1

         3 a: "0" .  [";"]
         4 b: "0" .  ["."]

         "."       reduce using rule 4 (b)
         $default  reduce using rule 3 (a)

   A Graphviz rendering of this portion of the graph could be:

[image src="figs/example-reduce.png" text="       .------------------.
       |     State 1      |
       | 3 a: \"0\" . [\";\"] |
       | 4 b: \"0\" . [\".\"] |
       `------------------'
             /     \\
            /       \\ [\".\"]
           /         \\
          v           v
         / \\         / \\
        / R \\       / R \\
(green) \\ 3 /       \\ 4 / (green)
         \\ /         \\ /
"]

When unresolved conflicts are present, because in deterministic parsing
a single decision can be made, Bison can arbitrarily choose to disable a
reduction, see *note Shift/Reduce Conflicts: Shift/Reduce. Discarded
actions are distinguished by a red filling color on these nodes, just
like how they are reported between square brackets in the verbose file.

   The reduction corresponding to the rule number 0 is the accepting
state. It is shown as a blue diamond, labelled "Acc".

Graphical Representation of Go Tos
----------------------------------

The `go to' jump transitions are represented as dotted lines bearing
the name of the rule being jumped to.


File: bison.info, Node: Xml, Next: Tracing, Prev: Graphviz, Up: Debugging

8.3 Visualizing your parser in multiple formats
===============================================

Bison supports two major report formats: textual output (*note
Understanding Your Parser: Understanding.) when invoked with option
`--verbose', and DOT (*note Visualizing Your Parser: Graphviz.) when
invoked with option `--graph'. However, another alternative is to
output an XML file that may then be, with `xsltproc', rendered as
either a raw text format equivalent to the verbose file, or as an HTML
version of the same file, with clickable transitions, or even as a DOT
file. The `.output' and DOT files obtained via XSLT are identical to
those obtained by invoking `bison' with options `--verbose' or
`--graph'.

   The XML file is generated when the options `-x' or `--xml[=FILE]'
are specified, see *note Invoking Bison: Invocation. If not specified,
its name is made by removing `.tab.c' or `.c' from the parser
implementation file name, and adding `.xml' instead. For instance, if
the grammar file is `foo.y', the default XML output file is `foo.xml'.

   Bison ships with a `data/xslt' directory, containing XSL
Transformation files to apply to the XML file. Their names are
non-ambiguous:

`xml2dot.xsl'
     Used to output a copy of the DOT visualization of the automaton.

`xml2text.xsl'
     Used to output a copy of the `.output' file.

`xml2xhtml.xsl'
     Used to output an XHTML enhancement of the `.output' file.

   Sample usage (requires `xsltproc'):
     $ bison -x gr.y
     $ bison --print-datadir
     /usr/local/share/bison
     $ xsltproc /usr/local/share/bison/xslt/xml2xhtml.xsl gr.xml >gr.html


File: bison.info, Node: Tracing, Prev: Xml, Up: Debugging

8.4 Tracing Your Parser
=======================

When a Bison grammar compiles properly but parses "incorrectly", the
`yydebug' parser-trace feature helps figure out why.

* Menu:

* Enabling Traces::    Activating run-time trace support
* Mfcalc Traces::      Extending `mfcalc' to support traces
* The YYPRINT Macro::  Obsolete interface for semantic value reports


File: bison.info, Node: Enabling Traces, Next: Mfcalc Traces, Up: Tracing

8.4.1 Enabling Traces
---------------------

There are several means to enable compilation of trace facilities:

the macro `YYDEBUG'
     Define the macro `YYDEBUG' to a nonzero value when you compile the
     parser. This is compliant with POSIX Yacc. You could use
     `-DYYDEBUG=1' as a compiler option or you could put `#define
     YYDEBUG 1' in the prologue of the grammar file (*note The
     Prologue: Prologue.).

     If the `%define' variable `api.prefix' is used (*note Multiple
     Parsers in the Same Program: Multiple Parsers.), for instance
     `%define api.prefix c', then if `CDEBUG' is defined, its value
     controls the tracing feature (enabled if and only if nonzero);
     otherwise tracing is enabled if and only if `YYDEBUG' is nonzero.

the option `-t' (POSIX Yacc compliant)
the option `--debug' (Bison extension)
     Use the `-t' option when you run Bison (*note Invoking Bison:
     Invocation.). With `%define api.prefix c', it defines `CDEBUG' to
     1, otherwise it defines `YYDEBUG' to 1.

the directive `%debug'
     Add the `%debug' directive (*note Bison Declaration Summary: Decl
     Summary.).
     This is a Bison extension, especially useful for
     languages that don't use a preprocessor. Unless POSIX and Yacc
     portability matter to you, this is the preferred solution.

   We suggest that you always enable the debug option so that debugging
is always possible.

   The trace facility outputs messages with macro calls of the form
`YYFPRINTF (stderr, FORMAT, ARGS)' where FORMAT and ARGS are the usual
`printf' format and variadic arguments. If you define `YYDEBUG' to a
nonzero value but do not define `YYFPRINTF', `<stdio.h>' is
automatically included and `YYFPRINTF' is defined to `fprintf'.

   Once you have compiled the program with trace facilities, the way to
request a trace is to store a nonzero value in the variable `yydebug'.
You can do this by making the C code do it (in `main', perhaps), or you
can alter the value with a C debugger.

   Each step taken by the parser when `yydebug' is nonzero produces a
line or two of trace information, written on `stderr'. The trace
messages tell you these things:

   * Each time the parser calls `yylex', what kind of token was read.

   * Each time a token is shifted, the depth and complete contents of
     the state stack (*note Parser States::).

   * Each time a rule is reduced, which rule it is, and the complete
     contents of the state stack afterward.

   To make sense of this information, it helps to refer to the automaton
description file (*note Understanding Your Parser: Understanding.).
This file shows the meaning of each state in terms of positions in
various rules, and also what each state will do with each possible
input token. As you read the successive trace messages, you can see
that the parser is functioning according to its specification in the
listing file.
Eventually you will arrive at the place where something
undesirable happens, and you will see which parts of the grammar are to
blame.

   The parser implementation file is a C/C++/Java program and you can
use debuggers on it, but it's not easy to interpret what it is doing.
The parser function is a finite-state machine interpreter, and aside
from the actions it executes the same code over and over. Only the
values of variables show where in the grammar it is working.


File: bison.info, Node: Mfcalc Traces, Next: The YYPRINT Macro, Prev: Enabling Traces, Up: Tracing

8.4.2 Enabling Debug Traces for `mfcalc'
----------------------------------------

The debugging information normally gives the token type of each token
read, but not its semantic value. The `%printer' directive allows
specifying how semantic values are reported, see *note Printing
Semantic Values: Printer Decl. For backward compatibility, Yacc-like
C parsers may also use the `YYPRINT' macro (*note The `YYPRINT' Macro:
The YYPRINT Macro.), but its use is discouraged.

   As a demonstration of `%printer', consider the multi-function
calculator, `mfcalc' (*note Multi-function Calc::). To enable run-time
traces, and semantic value reports, insert the following directives in
its prologue:

     /* Generate the parser description file. */
     %verbose
     /* Enable run-time traces (yydebug). */
     %define parse.trace

     /* Formatting semantic values. */
     %printer { fprintf (yyoutput, "%s", $$->name); } VAR;
     %printer { fprintf (yyoutput, "%s()", $$->name); } FNCT;
     %printer { fprintf (yyoutput, "%g", $$); } <val>;

   The `%define' directive instructs Bison to generate run-time trace
support. Then, activation of these traces is controlled at run-time by
the `yydebug' variable, which is disabled by default.
Because these
traces will refer to the "states" of the parser, it is helpful to ask
for the creation of a description of that parser; this is the purpose
of the (admittedly ill-named) `%verbose' directive.

   The set of `%printer' directives demonstrates how to format the
semantic value in the traces. Note that the specification can be done
either on the symbol type (e.g., `VAR' or `FNCT'), or on the type tag:
since `<val>' is the type for both `NUM' and `exp', this printer will
be used for them.

   Here is a sample of the information provided by run-time traces.
The traces are sent onto standard error.

     $ echo 'sin(1-1)' | ./mfcalc -p
     Starting parse
     Entering state 0
     Reducing stack by rule 1 (line 34):
     -> $$ = nterm input ()
     Stack now 0
     Entering state 1

This first batch shows a specific feature of this grammar: the first
rule (which is in line 34 of `mfcalc.y') can be reduced without even
having to look for the first token. The resulting left-hand symbol
(`$$') is a valueless (`()') `input' nonterminal (`nterm').

   Then the parser calls the scanner.
     Reading a token: Next token is token FNCT (sin())
     Shifting token FNCT (sin())
     Entering state 6

That token (`token') is a function (`FNCT') whose value is `sin' as
formatted per our `%printer' specification: `sin()'. The parser stores
(`Shifting') that token, and others, until it can do something about it.

     Reading a token: Next token is token '(' ()
     Shifting token '(' ()
     Entering state 14
     Reading a token: Next token is token NUM (1.000000)
     Shifting token NUM (1.000000)
     Entering state 4
     Reducing stack by rule 6 (line 44):
        $1 = token NUM (1.000000)
     -> $$ = nterm exp (1.000000)
     Stack now 0 1 6 14
     Entering state 24

The previous reduction demonstrates the `%printer' directive for
`<val>': both the token `NUM' and the resulting nonterminal `exp' have
`1' as value.

     Reading a token: Next token is token '-' ()
     Shifting token '-' ()
     Entering state 17
     Reading a token: Next token is token NUM (1.000000)
     Shifting token NUM (1.000000)
     Entering state 4
     Reducing stack by rule 6 (line 44):
        $1 = token NUM (1.000000)
     -> $$ = nterm exp (1.000000)
     Stack now 0 1 6 14 24 17
     Entering state 26
     Reading a token: Next token is token ')' ()
     Reducing stack by rule 11 (line 49):
        $1 = nterm exp (1.000000)
        $2 = token '-' ()
        $3 = nterm exp (1.000000)
     -> $$ = nterm exp (0.000000)
     Stack now 0 1 6 14
     Entering state 24

The rule for the subtraction was just reduced. The parser is about to
discover the end of the call to `sin'.

     Next token is token ')' ()
     Shifting token ')' ()
     Entering state 31
     Reducing stack by rule 9 (line 47):
        $1 = token FNCT (sin())
        $2 = token '(' ()
        $3 = nterm exp (0.000000)
        $4 = token ')' ()
     -> $$ = nterm exp (0.000000)
     Stack now 0 1
     Entering state 11

Finally, the end-of-line allows the parser to complete the computation,
and display its result.

     Reading a token: Next token is token '\n' ()
     Shifting token '\n' ()
     Entering state 22
     Reducing stack by rule 4 (line 40):
        $1 = nterm exp (0.000000)
        $2 = token '\n' ()
     => 0
     -> $$ = nterm line ()
     Stack now 0 1
     Entering state 10
     Reducing stack by rule 2 (line 35):
        $1 = nterm input ()
        $2 = nterm line ()
     -> $$ = nterm input ()
     Stack now 0
     Entering state 1

   The parser has returned into state 1, in which it is waiting for the
next expression to evaluate, or for the end-of-file token, which causes
the completion of the parsing.

     Reading a token: Now at end of input.
     Shifting token $end ()
     Entering state 2
     Stack now 0 1 2
     Cleanup: popping token $end ()
     Cleanup: popping nterm input ()


File: bison.info, Node: The YYPRINT Macro, Prev: Mfcalc Traces, Up: Tracing

8.4.3 The `YYPRINT' Macro
-------------------------

Before `%printer' support, semantic values could be displayed using the
`YYPRINT' macro, which works only for terminal symbols and only with
the `yacc.c' skeleton.

 -- Macro: YYPRINT (STREAM, TOKEN, VALUE);
     If you define `YYPRINT', it should take three arguments. The
     parser will pass a standard I/O stream, the numeric code for the
     token type, and the token value (from `yylval').

     For `yacc.c' only. Obsoleted by `%printer'.

   Here is an example of `YYPRINT' suitable for the multi-function
calculator (*note Declarations for `mfcalc': Mfcalc Declarations.):

     %{
       static void print_token_value (FILE *, int, YYSTYPE);
       #define YYPRINT(File, Type, Value) \
         print_token_value (File, Type, Value)
     %}

     ... %% ... %% ...

     static void
     print_token_value (FILE *file, int type, YYSTYPE value)
     {
       if (type == VAR)
         fprintf (file, "%s", value.tptr->name);
       else if (type == NUM)
         fprintf (file, "%g", value.val);  /* `val' is a double */
     }


File: bison.info, Node: Invocation, Next: Other Languages, Prev: Debugging, Up: Top

9 Invoking Bison
****************

The usual way to invoke Bison is as follows:

     bison INFILE

   Here INFILE is the grammar file name, which usually ends in `.y'.
The parser implementation file's name is made by replacing the `.y'
with `.tab.c' and removing any leading directory. Thus, `bison foo.y'
yields `foo.tab.c', and `bison hack/foo.y' also yields `foo.tab.c'.
It's also possible, in case you are writing
C++ code instead of C in your grammar file, to name it `foo.ypp' or
`foo.y++'. Then, the output files will take an extension like the
given one as input (respectively `foo.tab.cpp' and `foo.tab.c++'). This
feature takes effect with all options that manipulate file names like
`-o' or `-d'.

   For example:

     bison -d INFILE.YXX
   will produce `infile.tab.cxx' and `infile.tab.hxx', and

     bison -d -o OUTPUT.C++ INFILE.Y
   will produce `output.c++' and `output.h++'.

   For compatibility with POSIX, the standard Bison distribution also
contains a shell script called `yacc' that invokes Bison with the `-y'
option.

* Menu:

* Bison Options::     All the options described in detail,
                        in alphabetical order by short options.
* Option Cross Key::  Alphabetical list of long options.
* Yacc Library::      Yacc-compatible `yylex' and `main'.


File: bison.info, Node: Bison Options, Next: Option Cross Key, Up: Invocation

9.1 Bison Options
=================

Bison supports both traditional single-letter options and mnemonic long
option names.
Long option names are indicated with `--' instead of
`-'.  Abbreviations for option names are allowed as long as they are
unique.  When a long option takes an argument, like `--file-prefix',
connect the option name and the argument with `='.

   Here is a list of options that can be used with Bison, alphabetized
by short option.  It is followed by a cross key alphabetized by long
option.

Operation modes:
`-h'
`--help'
     Print a summary of the command-line options to Bison and exit.

`-V'
`--version'
     Print the version number of Bison and exit.

`--print-localedir'
     Print the name of the directory containing locale-dependent data.

`--print-datadir'
     Print the name of the directory containing skeletons and XSLT.

`-y'
`--yacc'
     Act more like the traditional Yacc command.  This can cause
     different diagnostics to be generated, and may change behavior in
     other minor ways.  Most importantly, imitate Yacc's output file
     name conventions, so that the parser implementation file is called
     `y.tab.c', and the other outputs are called `y.output' and
     `y.tab.h'.  Also, if generating a deterministic parser in C,
     generate `#define' statements in addition to an `enum' to associate
     token numbers with token names.  Thus, the following shell script
     can substitute for Yacc, and the Bison distribution contains such
     a script for compatibility with POSIX:

          #! /bin/sh
          bison -y "$@"

     The `-y'/`--yacc' option is intended for use with traditional Yacc
     grammars.  If your grammar uses a Bison extension like
     `%glr-parser', Bison might not be Yacc-compatible even if this
     option is specified.

`-W [CATEGORY]'
`--warnings[=CATEGORY]'
     Output warnings falling in CATEGORY.
     CATEGORY can be one of:
    `midrule-values'
          Warn about mid-rule values that are set but not used within
          any of the actions of the parent rule.  For example, warn
          about unused `$2' in:

               exp: '1' { $$ = 1; } '+' exp { $$ = $1 + $4; };

          Also warn about mid-rule values that are used but not set.
          For example, warn about unset `$$' in the mid-rule action in:

               exp: '1' { $1 = 1; } '+' exp { $$ = $2 + $4; };

          These warnings are not enabled by default since they
          sometimes prove to be false alarms in existing grammars
          employing the Yacc constructs `$0' or `$-N' (where N is some
          positive integer).

    `yacc'
          Incompatibilities with POSIX Yacc.

    `conflicts-sr'
    `conflicts-rr'
          Shift/reduce (S/R) and reduce/reduce (R/R) conflicts.  These
          warnings are enabled by default.  However, if the `%expect'
          or `%expect-rr' directive is specified, an unexpected number
          of conflicts is an error, and an expected number of conflicts
          is not reported, so `-W' and `--warnings' then have no effect
          on the conflict report.

    `other'
          All warnings not categorized above.  These warnings are
          enabled by default.

          This category is provided merely for the sake of
          completeness.  Future releases of Bison may move warnings
          from this category to new, more specific categories.

    `all'
          All the warnings.

    `none'
          Turn off all the warnings.

    `error'
          Treat warnings as errors.

     A category can be turned off by prefixing its name with `no-'.  For
     instance, `-Wno-yacc' will hide the warnings about POSIX Yacc
     incompatibilities.

`-f [FEATURE]'
`--feature[=FEATURE]'
     Activate miscellaneous FEATURE.  FEATURE can be one of:
    `caret'
    `diagnostics-show-caret'
          Show caret errors, in a manner similar to GCC's
          `-fdiagnostics-show-caret', or Clang's `-fcaret-diagnostics'.
          The location provided with the message is used to quote the
          corresponding line of the source file, underlining the
          important part of it with carets (^).  Here is an example,
          using the following file `in.y':

               %type <ival> exp
               %%
               exp: exp '+' exp { $exp = $1 + $2; };

          When invoked with `-fcaret', Bison will report:

               in.y:3.20-23: error: ambiguous reference: '$exp'
                exp: exp '+' exp { $exp = $1 + $2; };
                                   ^^^^
               in.y:3.1-3:       refers to: $exp at $$
                exp: exp '+' exp { $exp = $1 + $2; };
                ^^^
               in.y:3.6-8:       refers to: $exp at $1
                exp: exp '+' exp { $exp = $1 + $2; };
                     ^^^
               in.y:3.14-16:     refers to: $exp at $3
                exp: exp '+' exp { $exp = $1 + $2; };
                             ^^^
               in.y:3.32-33: error: $2 of 'exp' has no declared type
                exp: exp '+' exp { $exp = $1 + $2; };
                                               ^^


Tuning the parser:

`-t'
`--debug'
     In the parser implementation file, define the macro `YYDEBUG' to 1
     if it is not already defined, so that the debugging facilities are
     compiled.  *Note Tracing Your Parser: Tracing.

`-D NAME[=VALUE]'
`--define=NAME[=VALUE]'
`-F NAME[=VALUE]'
`--force-define=NAME[=VALUE]'
     Each of these is equivalent to `%define NAME "VALUE"' (*note
     %define Summary::) except that Bison processes multiple
     definitions for the same NAME as follows:

        * Bison quietly ignores all command-line definitions for NAME
          except the last.

        * If that command-line definition is specified by a `-D' or
          `--define', Bison reports an error for any `%define'
          definition for NAME.

        * If that command-line definition is specified by a `-F' or
          `--force-define' instead, Bison quietly ignores all `%define'
          definitions for NAME.

        * Otherwise, Bison reports an error if there are multiple
          `%define' definitions for NAME.
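
     The precedence rules above can be modeled with a short C++ sketch.
     It is only an illustration of the documented behavior, not Bison's
     own code: the names `CliDefine' and `resolve_define' are
     hypothetical, and an exception stands in for the error that Bison
     would report.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model of one command-line definition of NAME.
// `forced' is true for -F/--force-define, false for -D/--define.
struct CliDefine
{
  bool forced;
  std::string value;
};

// Decide the effective value of one NAME, given its command-line
// definitions (in order) and its %define definitions in the grammar.
// Returns false if NAME is never defined; otherwise stores the winner
// in VALUE.  Throws where the rules above say Bison reports an error.
bool
resolve_define (const std::vector<CliDefine>& cli,
                const std::vector<std::string>& grammar,
                std::string& value)
{
  if (!cli.empty ())
    {
      // All command-line definitions except the last are ignored.
      const CliDefine& last = cli.back ();
      // -D: any %define in the grammar is an error.
      // -F: %define definitions in the grammar are quietly ignored.
      if (!last.forced && !grammar.empty ())
        throw std::runtime_error ("%define conflicts with -D");
      value = last.value;
      return true;
    }
  // No command-line definition: at most one %define is allowed.
  if (grammar.size () > 1)
    throw std::runtime_error ("multiple %define definitions");
  if (grammar.empty ())
    return false;
  value = grammar.front ();
  return true;
}
```

     In this model, a trailing `-F' always wins silently, which is why
     it can hide a `%define' added to the grammar later.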

     You should avoid using `-F' and `--force-define' in your makefiles
     unless you are confident that it is safe to quietly ignore any
     conflicting `%define' that may be added to the grammar file.

`-L LANGUAGE'
`--language=LANGUAGE'
     Specify the programming language for the generated parser, as if
     `%language' was specified (*note Bison Declaration Summary: Decl
     Summary.).  Currently supported languages include C, C++, and Java.
     LANGUAGE is case-insensitive.

`--locations'
     Pretend that `%locations' was specified.  *Note Decl Summary::.

`-p PREFIX'
`--name-prefix=PREFIX'
     Pretend that `%name-prefix "PREFIX"' was specified (*note Decl
     Summary::).  Obsoleted by `-Dapi.prefix=PREFIX'.  *Note Multiple
     Parsers in the Same Program: Multiple Parsers.

`-l'
`--no-lines'
     Don't put any `#line' preprocessor commands in the parser
     implementation file.  Ordinarily Bison puts them in the parser
     implementation file so that the C compiler and debuggers will
     associate errors with your source file, the grammar file.  This
     option causes them to associate errors with the parser
     implementation file, treating it as an independent source file in
     its own right.

`-S FILE'
`--skeleton=FILE'
     Specify the skeleton to use, similar to `%skeleton' (*note Bison
     Declaration Summary: Decl Summary.).

     If FILE does not contain a `/', FILE is the name of a skeleton
     file in the Bison installation directory.  If it does, FILE is an
     absolute file name or a file name relative to the current working
     directory.  This is similar to how most shells resolve commands.

`-k'
`--token-table'
     Pretend that `%token-table' was specified.  *Note Decl Summary::.

Adjust the output:

`--defines[=FILE]'
     Pretend that `%defines' was specified, i.e., write an extra output
     file containing macro definitions for the token type names defined
     in the grammar, as well as a few other declarations.  *Note Decl
     Summary::.

`-d'
     This is the same as `--defines' except `-d' does not accept a FILE
     argument since POSIX Yacc requires that `-d' can be bundled with
     other short options.

`-b FILE-PREFIX'
`--file-prefix=PREFIX'
     Pretend that `%file-prefix' was specified, i.e., specify prefix to
     use for all Bison output file names.  *Note Decl Summary::.

`-r THINGS'
`--report=THINGS'
     Write an extra output file containing verbose description of the
     comma separated list of THINGS among:

    `state'
          Description of the grammar, conflicts (resolved and
          unresolved), and parser's automaton.

    `itemset'
          Implies `state' and augments the description of the automaton
          with the full set of items for each state, instead of its
          core only.

    `lookahead'
          Implies `state' and augments the description of the automaton
          with each rule's lookahead set.

    `solved'
          Implies `state'.  Explain how conflicts were solved thanks to
          precedence and associativity directives.

    `all'
          Enable all the items.

    `none'
          Do not generate the report.

`--report-file=FILE'
     Specify the FILE for the verbose description.

`-v'
`--verbose'
     Pretend that `%verbose' was specified, i.e., write an extra output
     file containing verbose descriptions of the grammar and parser.
     *Note Decl Summary::.

`-o FILE'
`--output=FILE'
     Specify the FILE for the parser implementation file.

     The other output files' names are constructed from FILE as
     described under the `-v' and `-d' options.

`-g [FILE]'
`--graph[=FILE]'
     Output a graphical representation of the parser's automaton
     computed by Bison, in Graphviz (http://www.graphviz.org/) DOT
     (http://www.graphviz.org/doc/info/lang.html) format.  `FILE' is
     optional.  If omitted and the grammar file is `foo.y', the output
     file will be `foo.dot'.

`-x [FILE]'
`--xml[=FILE]'
     Output an XML report of the parser's automaton computed by Bison.
     `FILE' is optional.  If omitted and the grammar file is `foo.y',
     the output file will be `foo.xml'.  (The current XML schema is
     experimental and may evolve.  More user feedback will help to
     stabilize it.)


File: bison.info,  Node: Option Cross Key,  Next: Yacc Library,  Prev: Bison Options,  Up: Invocation

9.2 Option Cross Key
====================

Here is a list of options, alphabetized by long option, to help you find
the corresponding short option and directive.

Long Option                     Short Option        Bison Directive
---------------------------------------------------------------------------------
`--debug'                       `-t'                `%debug'
`--define=NAME[=VALUE]'         `-D NAME[=VALUE]'   `%define NAME ["VALUE"]'
`--defines[=FILE]'              `-d'                `%defines ["FILE"]'
`--feature[=FEATURE]'           `-f [FEATURE]'
`--file-prefix=PREFIX'          `-b PREFIX'         `%file-prefix "PREFIX"'
`--force-define=NAME[=VALUE]'   `-F NAME[=VALUE]'   `%define NAME ["VALUE"]'
`--graph[=FILE]'                `-g [FILE]'
`--help'                        `-h'
`--language=LANGUAGE'           `-L LANGUAGE'       `%language "LANGUAGE"'
`--locations'                                       `%locations'
`--name-prefix=PREFIX'          `-p PREFIX'         `%name-prefix "PREFIX"'
`--no-lines'                    `-l'                `%no-lines'
`--output=FILE'                 `-o FILE'           `%output "FILE"'
`--print-datadir'
`--print-localedir'
`--report-file=FILE'
`--report=THINGS'               `-r THINGS'
`--skeleton=FILE'               `-S FILE'           `%skeleton "FILE"'
`--token-table'                 `-k'                `%token-table'
`--verbose'                     `-v'                `%verbose'
`--version'                     `-V'
`--warnings[=CATEGORY]'         `-W [CATEGORY]'
`--xml[=FILE]'                  `-x [FILE]'
`--yacc'                        `-y'                `%yacc'


File: bison.info,  Node: Yacc Library,  Prev: Option Cross Key,  Up: Invocation

9.3 Yacc Library
================

The Yacc library contains default implementations of the `yyerror' and
`main' functions.  These default implementations are normally not
useful, but POSIX requires them.  To use the Yacc library, link your
program with the `-ly' option.  Note that Bison's implementation of the
Yacc library is distributed under the terms of the GNU General Public
License (*note Copying::).

   If you use the Yacc library's `yyerror' function, you should declare
`yyerror' as follows:

     int yyerror (char const *);

   Bison ignores the `int' value returned by this `yyerror'.  If you
use the Yacc library's `main' function, your `yyparse' function should
have the following type signature:

     int yyparse (void);


File: bison.info,  Node: Other Languages,  Next: FAQ,  Prev: Invocation,  Up: Top

10 Parsers Written In Other Languages
*************************************

* Menu:

* C++ Parsers::                 The interface to generate C++ parser classes
* Java Parsers::                The interface to generate Java parser classes


File: bison.info,  Node: C++ Parsers,  Next: Java Parsers,  Up: Other Languages

10.1 C++ Parsers
================

* Menu:

* C++ Bison Interface::         Asking for C++ parser generation
* C++ Semantic Values::         %union vs.
                                C++
* C++ Location Values::         The position and location classes
* C++ Parser Interface::        Instantiating and running the parser
* C++ Scanner Interface::       Exchanges between yylex and parse
* A Complete C++ Example::      Demonstrating their use


File: bison.info,  Node: C++ Bison Interface,  Next: C++ Semantic Values,  Up: C++ Parsers

10.1.1 C++ Bison Interface
--------------------------

The C++ deterministic parser is selected using the skeleton directive,
`%skeleton "lalr1.cc"', or the synonymous command-line option
`--skeleton=lalr1.cc'.  *Note Decl Summary::.

   When run, `bison' will create several entities in the `yy' namespace.
Use the `%define namespace' directive to change the namespace name, see
*note namespace: %define Summary.  The various classes are generated in
the following files:

`position.hh'
`location.hh'
     The definition of the classes `position' and `location', used for
     location tracking.  These files are not generated if the `%define'
     variable `api.location.type' is defined.  *Note C++ Location
     Values::.

`stack.hh'
     An auxiliary class `stack' used by the parser.

`FILE.hh'
`FILE.cc'
     (Assuming the extension of the grammar file was `.yy'.)  The
     declaration and implementation of the C++ parser class.  The
     basename and extension of these two files follow the same rules as
     with regular C parsers (*note Invocation::).

     The header is _mandatory_; you must either pass `-d'/`--defines'
     to `bison', or use the `%defines' directive.

   All these files are documented using Doxygen; run `doxygen' for
complete and accurate documentation.


File: bison.info,  Node: C++ Semantic Values,  Next: C++ Location Values,  Prev: C++ Bison Interface,  Up: C++ Parsers

10.1.2 C++ Semantic Values
--------------------------

The `%union' directive works as for C, see *note The Collection of
Value Types: Union Decl.  In particular it produces a genuine
`union'(1), which has a few specific features in C++.
   - The type `YYSTYPE' is defined but its use is discouraged: rather
     you should refer to the parser's encapsulated type
     `yy::parser::semantic_type'.

   - Non POD (Plain Old Data) types cannot be used.  C++ forbids any
     instance of classes with constructors in unions: only _pointers_
     to such objects are allowed.

   Because objects have to be stored via pointers, memory is not
reclaimed automatically: using the `%destructor' directive is the only
means to avoid leaks.  *Note Freeing Discarded Symbols: Destructor Decl.

   ---------- Footnotes ----------

   (1) In the future techniques to allow complex types within
pseudo-unions (similar to Boost variants) might be implemented to
alleviate these issues.


File: bison.info,  Node: C++ Location Values,  Next: C++ Parser Interface,  Prev: C++ Semantic Values,  Up: C++ Parsers

10.1.3 C++ Location Values
--------------------------

When the directive `%locations' is used, the C++ parser supports
location tracking, see *note Tracking Locations::.

   By default, two auxiliary classes define a `position', a single point
in a file, and a `location', a range composed of a pair of `position's
(possibly spanning several files).  But if the `%define' variable
`api.location.type' is defined, then these classes will not be
generated, and the user defined type will be used.

   In this section `uint' is an abbreviation for `unsigned int': in
genuine code only the latter is used.
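
   The relationship between a point and a range can be pictured with the
following self-contained sketch.  It is a simplified model, not the
generated code: the namespace `sketch' is hypothetical, the methods
return `void' for brevity, and only the members needed to show how a
scanner advances a range are included.

```cpp
#include <string>

namespace sketch  // Hypothetical; the generated classes live in `yy'.
{
  // A single point in a file.
  struct position
  {
    std::string* file;    // Borrowed pointer, never freed here.
    unsigned int line;    // Starts at 1.
    unsigned int column;  // Starts at 1.

    position () : file (0), line (1), column (1) {}

    // Advance by HEIGHT lines, resetting the column number.
    void lines (int height = 1) { line += height; column = 1; }
    // Advance by WIDTH columns, without changing the line number.
    void columns (int width = 1) { column += width; }
  };

  // A range: `begin' is inclusive, `end' is the first point beyond.
  struct location
  {
    position begin;
    position end;

    // Both of these advance only the `end' position.
    void columns (int width = 1) { end.columns (width); }
    void lines (int height = 1) { end.lines (height); }
    // Move `begin' onto `end', leaving an empty range.
    void step () { begin = end; }
  };
}
```

   With this model, a scanner that has just matched a two-character
token calls `step ()' and then `columns (2)', so that the range covers
exactly that token; after a newline it calls `lines (1)' and `step ()'
to restart at column 1 of the next line.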

* Menu:

* C++ position::                One point in the source file
* C++ location::                Two points in the source file
* User Defined Location Type::  Required interface for locations


File: bison.info,  Node: C++ position,  Next: C++ location,  Up: C++ Location Values

10.1.3.1 C++ `position'
.......................

 -- Constructor on position:  position (std::string* FILE = 0, uint
          LINE = 1, uint COL = 1)
     Create a `position' denoting a given point.  Note that `file' is
     not reclaimed when the `position' is destroyed: memory management
     must be handled elsewhere.

 -- Method on position: void initialize (std::string* FILE = 0, uint
          LINE = 1, uint COL = 1)
     Reset the position to the given values.

 -- Instance Variable of position: std::string* file
     The name of the file.  It will always be handled as a pointer, the
     parser will never duplicate nor deallocate it.  As an experimental
     feature you may change it to `TYPE*' using `%define filename_type
     "TYPE"'.

 -- Instance Variable of position: uint line
     The line, starting at 1.

 -- Method on position: uint lines (int HEIGHT = 1)
     Advance by HEIGHT lines, resetting the column number.

 -- Instance Variable of position: uint column
     The column, starting at 1.

 -- Method on position: uint columns (int WIDTH = 1)
     Advance by WIDTH columns, without changing the line number.

 -- Method on position: position& operator+= (int WIDTH)
 -- Method on position: position operator+ (int WIDTH)
 -- Method on position: position& operator-= (int WIDTH)
 -- Method on position: position operator- (int WIDTH)
     Various forms of syntactic sugar for `columns'.

 -- Method on position: bool operator== (const position& THAT)
 -- Method on position: bool operator!= (const position& THAT)
     Whether `*this' and `that' denote equal/different positions.

 -- Function: std::ostream& operator<< (std::ostream& O, const
          position& P)
     Report P on O like this: `FILE:LINE.COLUMN', or `LINE.COLUMN' if
     FILE is null.


File: bison.info,  Node: C++ location,  Next: User Defined Location Type,  Prev: C++ position,  Up: C++ Location Values

10.1.3.2 C++ `location'
.......................

 -- Constructor on location:  location (const position& BEGIN, const
          position& END)
     Create a `location' from the endpoints of the range.

 -- Constructor on location:  location (const position& POS =
          position())
 -- Constructor on location:  location (std::string* FILE, uint LINE,
          uint COL)
     Create a `location' denoting an empty range located at a given
     point.

 -- Method on location: void initialize (std::string* FILE = 0, uint
          LINE = 1, uint COL = 1)
     Reset the location to an empty range at the given values.

 -- Instance Variable of location: position begin
 -- Instance Variable of location: position end
     The first, inclusive, position of the range, and the first beyond.

 -- Method on location: uint columns (int WIDTH = 1)
 -- Method on location: uint lines (int HEIGHT = 1)
     Advance the `end' position.

 -- Method on location: location operator+ (const location& END)
 -- Method on location: location operator+ (int WIDTH)
 -- Method on location: location operator+= (int WIDTH)
     Various forms of syntactic sugar.

 -- Method on location: void step ()
     Move `begin' onto `end'.

 -- Method on location: bool operator== (const location& THAT)
 -- Method on location: bool operator!= (const location& THAT)
     Whether `*this' and `that' denote equal/different ranges of
     positions.

 -- Function: std::ostream& operator<< (std::ostream& O, const
          location& P)
     Report P on O, taking care of special cases such as: no `filename'
     defined, or equal filename/line or column.


File: bison.info,  Node: User Defined Location Type,  Prev: C++ location,  Up: C++ Location Values

10.1.3.3 User Defined Location Type
...................................

Instead of using the built-in types you may use the `%define' variable
`api.location.type' to specify your own type:

     %define api.location.type LOCATIONTYPE

   The requirements on your LOCATIONTYPE are:
   * it must be copyable;

   * in order to compute the (default) value of `@$' in a reduction, the
     parser basically runs
          @$.begin = @$1.begin;
          @$.end   = @$N.end; // The location of last right-hand side symbol.
     so there must be copyable `begin' and `end' members;

   * alternatively you may redefine the computation of the default
     location, in which case these members are not required (*note
     Location Default Action::);

   * if traces are enabled, then there must exist an `std::ostream&
     operator<< (std::ostream& o, const LOCATIONTYPE& s)' function.


   In programs with several C++ parsers, you may also use the `%define'
variable `api.location.type' to share a common set of built-in
definitions for `position' and `location'.
For instance, one parser
`master/parser.yy' might use:

     %defines
     %locations
     %define namespace "master::"

to generate the `master/position.hh' and `master/location.hh' files,
reused by other parsers as follows:

     %define api.location.type "master::location"
     %code requires { #include <master/location.hh> }


File: bison.info,  Node: C++ Parser Interface,  Next: C++ Scanner Interface,  Prev: C++ Location Values,  Up: C++ Parsers

10.1.4 C++ Parser Interface
---------------------------

The output files `OUTPUT.hh' and `OUTPUT.cc' declare and define the
parser class in the namespace `yy'.  The class name defaults to
`parser', but may be changed using `%define parser_class_name "NAME"'.
The interface of this class is detailed below.  It can be extended
using the `%parse-param' feature: its semantics is slightly changed
since it describes an additional member of the parser class, and an
additional argument for its constructor.

 -- Type of parser: semantic_type
 -- Type of parser: location_type
     The types for semantic values and locations.

 -- Type of parser: token
     A structure that contains (only) the `yytokentype' enumeration,
     which defines the tokens.  To refer to the token `FOO', use
     `yy::parser::token::FOO'.  The scanner can use `typedef
     yy::parser::token token;' to "import" the token enumeration (*note
     Calc++ Scanner::).

 -- Method on parser:  parser (TYPE1 ARG1, ...)
     Build a new parser object.  There are no arguments by default,
     unless `%parse-param {TYPE1 ARG1}' was used.

 -- Method on parser: int parse ()
     Run the syntactic analysis, and return 0 on success, 1 otherwise.

     The whole function is wrapped in a `try'/`catch' block, so that
     when an exception is thrown, the `%destructor's are called to
     release the lookahead symbol, and the symbols pushed on the stack.
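
     The cleanup performed on an exception can be illustrated by the
     following self-contained sketch.  It does not reproduce the
     generated parser: `parse_sketch', `make_value', `destroy_value' and
     the counter `live_values' are hypothetical names, `destroy_value'
     plays the role of a `%destructor', and rethrowing the exception
     after cleanup is an assumption of this sketch.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical semantic value: objects are stored via pointers.
typedef std::string* semantic_value;

static int live_values = 0;  // Allocated values not yet released.

semantic_value
make_value (const char* text)
{
  ++live_values;
  return new std::string (text);
}

// Stands in for a %destructor: release one discarded symbol.
void
destroy_value (semantic_value v)
{
  --live_values;
  delete v;
}

// Release every symbol still on the stack.
void
clear_stack (std::vector<semantic_value>& stack)
{
  while (!stack.empty ())
    {
      destroy_value (stack.back ());
      stack.pop_back ();
    }
}

// Push two symbols, then optionally fail as a user action might.
// Returns 0 on success.
int
parse_sketch (bool action_throws)
{
  std::vector<semantic_value> stack;
  try
    {
      stack.push_back (make_value ("one"));
      stack.push_back (make_value ("two"));
      if (action_throws)
        throw std::runtime_error ("user action failed");
    }
  catch (...)
    {
      // Reclaim the pending symbols before the exception propagates.
      clear_stack (stack);
      throw;
    }
  clear_stack (stack);
  return 0;
}
```

     Without the `catch' block, the two strings would leak whenever the
     exception escaped, since the vector only owns the pointers.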

 -- Method on parser: std::ostream& debug_stream ()
 -- Method on parser: void set_debug_stream (std::ostream& O)
     Get or set the stream used for tracing the parsing.  It defaults to
     `std::cerr'.

 -- Method on parser: debug_level_type debug_level ()
 -- Method on parser: void set_debug_level (debug_level_type L)
     Get or set the tracing level.  Currently its value is either 0, no
     trace, or nonzero, full tracing.

 -- Method on parser: void error (const location_type& L, const
          std::string& M)
     The definition for this member function must be supplied by the
     user: the parser uses it to report a parser error occurring at L,
     described by M.


File: bison.info,  Node: C++ Scanner Interface,  Next: A Complete C++ Example,  Prev: C++ Parser Interface,  Up: C++ Parsers

10.1.5 C++ Scanner Interface
----------------------------

The parser invokes the scanner by calling `yylex'.  Contrary to C
parsers, C++ parsers are always pure: there is no point in using the
`%define api.pure full' directive.  Therefore the interface is as
follows.

 -- Method on parser: int yylex (semantic_type* YYLVAL, location_type*
          YYLLOC, TYPE1 ARG1, ...)
     Return the next token.  Its type is the return value, its semantic
     value and location being YYLVAL and YYLLOC.  Invocations of
     `%lex-param {TYPE1 ARG1}' yield additional arguments.


File: bison.info,  Node: A Complete C++ Example,  Prev: C++ Scanner Interface,  Up: C++ Parsers

10.1.6 A Complete C++ Example
-----------------------------

This section demonstrates the use of a C++ parser with a simple but
complete example.  This example should be available on your system,
ready to compile, in the directory "../bison/examples/calc++".  It
focuses on the use of Bison, therefore the design of the various C++
classes is very naive: no accessors, no encapsulation of members etc.
We will use a Lex scanner, and more precisely, a Flex scanner, to
demonstrate the various interactions.  A hand-written scanner is
actually easier to interface with.

* Menu:

* Calc++ --- C++ Calculator::   The specifications
* Calc++ Parsing Driver::       An active parsing context
* Calc++ Parser::               A parser class
* Calc++ Scanner::              A pure C++ Flex scanner
* Calc++ Top Level::            Conducting the band


File: bison.info,  Node: Calc++ --- C++ Calculator,  Next: Calc++ Parsing Driver,  Up: A Complete C++ Example

10.1.6.1 Calc++ -- C++ Calculator
.................................

The grammar is dedicated to arithmetic: a single expression, possibly
preceded by variable assignments.  An environment containing possibly
predefined variables, such as `one' and `two', is exchanged with the
parser.  An example of valid input follows.

     three := 3
     seven := one + two * three
     seven * seven


File: bison.info,  Node: Calc++ Parsing Driver,  Next: Calc++ Parser,  Prev: Calc++ --- C++ Calculator,  Up: A Complete C++ Example

10.1.6.2 Calc++ Parsing Driver
..............................

To support a pure interface with the parser (and the scanner) the
technique of the "parsing context" is convenient: a structure
containing all the data to exchange.  Since, in addition to simply
launching the parsing, there are several auxiliary tasks to execute
(open the file for parsing, instantiate the parser etc.), we recommend
transforming the simple parsing context structure into a fully blown
"parsing driver" class.

   The declaration of this driver class, `calc++-driver.hh', is as
follows.  The first part includes the CPP guard and imports the
required standard library components, and the declaration of the parser
class.

     #ifndef CALCXX_DRIVER_HH
     # define CALCXX_DRIVER_HH
     # include <string>
     # include <map>
     # include "calc++-parser.hh"

Then comes the declaration of the scanning function.  Flex expects the
signature of `yylex' to be defined in the macro `YY_DECL', and the C++
parser expects it to be declared.  We can factor both as follows.

     // Tell Flex the lexer's prototype ...
     # define YY_DECL                                        \
       yy::calcxx_parser::token_type                         \
       yylex (yy::calcxx_parser::semantic_type* yylval,      \
              yy::calcxx_parser::location_type* yylloc,      \
              calcxx_driver& driver)
     // ... and declare it for the parser's sake.
     YY_DECL;

The `calcxx_driver' class is then declared with its most obvious
members.

     // Conducting the whole scanning and parsing of Calc++.
     class calcxx_driver
     {
     public:
       calcxx_driver ();
       virtual ~calcxx_driver ();

       std::map<std::string, int> variables;

       int result;

To encapsulate the coordination with the Flex scanner, it is useful to
have two member functions to open and close the scanning phase.

       // Handling the scanner.
       void scan_begin ();
       void scan_end ();
       bool trace_scanning;

Similarly for the parser itself.

       // Run the parser.  Return 0 on success.
       int parse (const std::string& f);
       std::string file;
       bool trace_parsing;

To demonstrate pure handling of parse errors, instead of simply dumping
them on the standard error output, we will pass them to the compiler
driver using the following two member functions.  Finally, we close the
class declaration and CPP guard.

       // Error handling.
       void error (const yy::location& l, const std::string& m);
       void error (const std::string& m);
     };
     #endif // ! CALCXX_DRIVER_HH

   The implementation of the driver is straightforward.  The `parse'
member function deserves some attention.
The `error' functions are
simple stubs; they should actually register the located error messages
and set the error state.

     #include "calc++-driver.hh"
     #include "calc++-parser.hh"

     calcxx_driver::calcxx_driver ()
       : trace_scanning (false), trace_parsing (false)
     {
       variables["one"] = 1;
       variables["two"] = 2;
     }

     calcxx_driver::~calcxx_driver ()
     {
     }

     int
     calcxx_driver::parse (const std::string &f)
     {
       file = f;
       scan_begin ();
       yy::calcxx_parser parser (*this);
       parser.set_debug_level (trace_parsing);
       int res = parser.parse ();
       scan_end ();
       return res;
     }

     void
     calcxx_driver::error (const yy::location& l, const std::string& m)
     {
       std::cerr << l << ": " << m << std::endl;
     }

     void
     calcxx_driver::error (const std::string& m)
     {
       std::cerr << m << std::endl;
     }


File: bison.info,  Node: Calc++ Parser,  Next: Calc++ Scanner,  Prev: Calc++ Parsing Driver,  Up: A Complete C++ Example

10.1.6.3 Calc++ Parser
......................

The grammar file `calc++-parser.yy' starts by asking for the C++
deterministic parser skeleton, the creation of the parser header file,
and specifies the name of the parser class.  Because the C++ skeleton
changed several times, it is safer to require the version you designed
the grammar for.

     %skeleton "lalr1.cc" /* -*- C++ -*- */
     %require "2.7"
     %defines
     %define parser_class_name "calcxx_parser"

Then come the declarations/inclusions needed to define the `%union'.
Because the parser uses the parsing driver and vice versa, the two
headers cannot include each other.  Because the driver's header needs
detailed knowledge about the parser class (in particular its inner
types), it is the parser's header which will simply use a forward
declaration of the driver.  *Note %code Summary::.

     %code requires {
     # include <string>
     class calcxx_driver;
     }

The driver is passed by reference to the parser and to the scanner.
This provides a simple but effective pure interface, not relying on
global variables.

     // The parsing context.
     %parse-param { calcxx_driver& driver }
     %lex-param   { calcxx_driver& driver }

Then we request the location tracking feature, and initialize the first
location's file name.  Afterward new locations are computed relatively
to the previous locations: the file name will be automatically
propagated.

     %locations
     %initial-action
     {
       // Initialize the initial location.
       @$.begin.filename = @$.end.filename = &driver.file;
     };

Use the two following directives to enable parser tracing and verbose
error messages.  However, verbose error messages can contain incorrect
information (*note LAC::).

     %debug
     %error-verbose

Semantic values cannot use "real" objects, but only pointers to them.

     // Symbols.
     %union
     {
       int          ival;
       std::string *sval;
     };

The code between `%code {' and `}' is output in the `*.cc' file; it
needs detailed knowledge about the driver.

     %code {
     # include "calc++-driver.hh"
     }

The token numbered as 0 corresponds to end of file; the following line
allows for nicer error messages referring to "end of file" instead of
"$end".  Similarly, user-friendly names are provided for each symbol.
Note that the token names are prefixed by `TOKEN_' to avoid name
clashes.

     %token        END      0 "end of file"
     %token        ASSIGN     ":="
     %token <sval> IDENTIFIER "identifier"
     %token <ival> NUMBER     "number"
     %type  <ival> exp

To enable memory deallocation during error recovery, use `%destructor'.

     %printer    { yyoutput << *$$; } "identifier"
     %destructor { delete $$; } "identifier"

     %printer    { yyoutput << $$; } <ival>

The grammar itself is straightforward.

     %%
     %start unit;
     unit: assignments exp  { driver.result = $2; };

     assignments:
       /* Nothing. */         {}
     | assignments assignment {};

     assignment:
          "identifier" ":=" exp
            { driver.variables[*$1] = $3; delete $1; };

     %left '+' '-';
     %left '*' '/';
     exp: exp '+' exp   { $$ = $1 + $3; }
        | exp '-' exp   { $$ = $1 - $3; }
        | exp '*' exp   { $$ = $1 * $3; }
        | exp '/' exp   { $$ = $1 / $3; }
        | "identifier"  { $$ = driver.variables[*$1]; delete $1; }
        | "number"      { $$ = $1; };
     %%

Finally, the `error' member function forwards errors to the driver.

     void
     yy::calcxx_parser::error (const yy::calcxx_parser::location_type& l,
                               const std::string& m)
     {
       driver.error (l, m);
     }


File: bison.info, Node: Calc++ Scanner, Next: Calc++ Top Level, Prev: Calc++ Parser, Up: A Complete C++ Example

10.1.6.4 Calc++ Scanner
.......................

The Flex scanner first includes the driver declaration, then the
parser's, to get the set of defined tokens.

     %{                                            /* -*- C++ -*- */
     # include <cstdlib>
     # include <cerrno>
     # include <climits>
     # include <cstring>   /* For strerror, used in scan_begin. */
     # include <string>
     # include "calc++-driver.hh"
     # include "calc++-parser.hh"

     /* Work around an incompatibility in flex (at least versions
        2.5.31 through 2.5.33): it generates code that does
        not conform to C89. See Debian bug 333231
        <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=333231>. */
     # undef yywrap
     # define yywrap() 1

     /* By default yylex returns int, we use token_type.
        Unfortunately yyterminate by default returns 0, which is
        not of token_type.
        */
     #define yyterminate() return token::END
     %}

Because there is no `#include'-like feature, we don't need `yywrap'.
We don't need `unput' either, and because we parse an actual file
rather than run an interactive session with the user, we ask for a
`batch' scanner. Finally we enable the scanner tracing features.

     %option noyywrap nounput batch debug

Abbreviations allow for more readable rules.

     id    [a-zA-Z][a-zA-Z_0-9]*
     int   [0-9]+
     blank [ \t]

The following paragraph suffices to track locations accurately. Each
time `yylex' is invoked, the begin position is moved onto the end
position. Then when a pattern is matched, the end position is advanced
by its width. In case it matched ends of lines, the end cursor is
adjusted, and each time blanks are matched, the begin cursor is moved
onto the end cursor to effectively ignore the blanks preceding tokens.
Comments would be treated the same way.

     %{
     # define YY_USER_ACTION  yylloc->columns (yyleng);
     %}
     %%
     %{
       yylloc->step ();
     %}
     {blank}+   yylloc->step ();
     [\n]+      yylloc->lines (yyleng); yylloc->step ();

The rules are simple; just note the use of the driver to report errors.
It is convenient to use a typedef to shorten, for instance,
`yy::calcxx_parser::token::IDENTIFIER' into `token::IDENTIFIER'.

     %{
       typedef yy::calcxx_parser::token token;
     %}
                /* Convert ints to the actual type of tokens. */
     [-+*/]     return yy::calcxx_parser::token_type (yytext[0]);

     ":="       return token::ASSIGN;

     {int}      {
       errno = 0;
       long n = strtol (yytext, NULL, 10);
       if (! (INT_MIN <= n && n <= INT_MAX && errno != ERANGE))
         driver.error (*yylloc, "integer is out of range");
       yylval->ival = n;
       return token::NUMBER;
     }

     {id}       {
       yylval->sval = new std::string (yytext);
       return token::IDENTIFIER;
     }

     .          driver.error (*yylloc, "invalid character");
     %%

Finally, because the scanner-related member functions of the driver
depend on the scanner's data, it is simpler to implement them in this
file.

     void
     calcxx_driver::scan_begin ()
     {
       yy_flex_debug = trace_scanning;
       if (file.empty () || file == "-")
         yyin = stdin;
       else if (!(yyin = fopen (file.c_str (), "r")))
         {
           error ("cannot open " + file + ": " + strerror(errno));
           exit (EXIT_FAILURE);
         }
     }

     void
     calcxx_driver::scan_end ()
     {
       fclose (yyin);
     }


File: bison.info, Node: Calc++ Top Level, Prev: Calc++ Scanner, Up: A Complete C++ Example

10.1.6.5 Calc++ Top Level
.........................

The top level file, `calc++.cc', poses no problem.

     #include <iostream>
     #include "calc++-driver.hh"

     int
     main (int argc, char *argv[])
     {
       calcxx_driver driver;
       for (int i = 1; i < argc; ++i)
         if (argv[i] == std::string ("-p"))
           driver.trace_parsing = true;
         else if (argv[i] == std::string ("-s"))
           driver.trace_scanning = true;
         else if (!driver.parse (argv[i]))
           std::cout << driver.result << std::endl;
     }


File: bison.info, Node: Java Parsers, Prev: C++ Parsers, Up: Other Languages

10.2 Java Parsers
=================

* Menu:

* Java Bison Interface::      Asking for Java parser generation
* Java Semantic Values::      %type and %token vs. Java
* Java Location Values::      The position and location classes
* Java Parser Interface::     Instantiating and running the parser
* Java Scanner Interface::    Specifying the scanner for the parser
* Java Action Features::      Special features for use in actions
* Java Differences::          Differences between C/C++ and Java Grammars
* Java Declarations Summary:: List of Bison declarations used with Java


File: bison.info, Node: Java Bison Interface, Next: Java Semantic Values, Up: Java Parsers

10.2.1 Java Bison Interface
---------------------------

(The current Java interface is experimental and may evolve. More user
feedback will help to stabilize it.)

   The Java parser skeletons are selected using the `%language "Java"'
directive or the `-L java'/`--language=java' option.

   When generating a Java parser, `bison BASENAME.y' will create a
single Java source file named `BASENAME.java' containing the parser
implementation. Using a grammar file without a `.y' suffix is
currently broken. The basename of the parser implementation file can
be changed by the `%file-prefix' directive or the `-b'/`--file-prefix'
option. The entire parser implementation file name can be changed by
the `%output' directive or the `-o'/`--output' option. The parser
implementation file contains a single class for the parser.

   You can create documentation for generated parsers using Javadoc.

   Contrary to C parsers, Java parsers do not use global variables; the
state of the parser is always local to an instance of the parser class.
Therefore, all Java parsers are "pure", and the `%pure-parser' and
`%define api.pure full' directives do not do anything when used in
Java.

   Push parsers are currently unsupported in Java and `%define
api.push-pull' has no effect.

   GLR parsers are currently unsupported in Java. Do not use the
`%glr-parser' directive.

   No header file can be generated for Java parsers. Do not use the
`%defines' directive or the `-d'/`--defines' options.

   Currently, support for debugging and verbose errors is always
compiled in. Thus the `%debug' and `%token-table' directives and the
`-t'/`--debug' and `-k'/`--token-table' options have no effect. This
may change in the future to eliminate unused code in the generated
parser, so use `%debug' and `%error-verbose' explicitly if needed.
Also, in the future the `%token-table' directive might enable a public
interface to access the token names and codes.


File: bison.info, Node: Java Semantic Values, Next: Java Location Values, Prev: Java Bison Interface, Up: Java Parsers

10.2.2 Java Semantic Values
---------------------------

There is no `%union' directive in Java parsers. Instead, the semantic
values' types (class names) should be specified in the `%type' or
`%token' directive:

     %type <Expression> expr assignment_expr term factor
     %type <Integer> number

   By default, the semantic stack is declared to have `Object' members,
which means that the class types you specify can be of any class. To
improve the type safety of the parser, you can declare the common
superclass of all the semantic values using the `%define stype'
directive. For example, after the following declaration:

     %define stype "ASTNode"

any `%type' or `%token' specifying a semantic type which is not a
subclass of `ASTNode' will cause a compile-time error.

   Types used in the directives may be qualified with a package name.
Primitive data types are accepted for Java version 1.5 or later. Note
that in this case the autoboxing feature of Java 1.5 will be used.
Generic types may not be used; this is due to a limitation in the
implementation of Bison, and may change in future releases.
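
   As a rough illustration of the paragraphs above, the following
self-contained Java sketch mimics a semantic stack declared with a
common base class. It is not code generated by Bison: `ASTNode',
`Num', `Add' and `SemanticStackSketch' are hypothetical names, and the
sketch assumes that reading a typed value such as `$1' behaves like a
cast from the base type to the declared subtype.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical base class, playing the role of the class named by
// %define stype "ASTNode".
abstract class ASTNode {
    abstract int eval();
}

// Hypothetical subclasses, playing the role of types given to %type/%token.
final class Num extends ASTNode {
    private final int value;
    Num(int value) { this.value = value; }
    @Override int eval() { return value; }
}

final class Add extends ASTNode {
    private final ASTNode left, right;
    Add(ASTNode left, ASTNode right) { this.left = left; this.right = right; }
    @Override int eval() { return left.eval() + right.eval(); }
}

final class SemanticStackSketch {
    // The stack only knows the base type, as with %define stype.
    private final Deque<ASTNode> stack = new ArrayDeque<>();

    void push(ASTNode value) { stack.push(value); }

    // Assumed equivalent of reading a typed $N: a checked cast from the
    // base type to the declared subtype.
    <T extends ASTNode> T popAs(Class<T> type) { return type.cast(stack.pop()); }

    // Simulates reducing a rule such as "exp: exp '+' exp" over two numbers.
    static int demo(int a, int b) {
        SemanticStackSketch s = new SemanticStackSketch();
        s.push(new Num(a));
        s.push(new Num(b));
        Num right = s.popAs(Num.class);   // like $3 declared as <Num>
        Num left = s.popAs(Num.class);    // like $1 declared as <Num>
        s.push(new Add(left, right));     // like assigning to $$
        return s.popAs(ASTNode.class).eval();
    }

    public static void main(String[] args) {
        System.out.println(demo(2, 3));
    }
}
```

   In this sketch only subclasses of `ASTNode' can be pushed at all,
which is the kind of check `%define stype' is meant to provide; the
subtype itself is only verified when the value is read back.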

   Java parsers do not support `%destructor', since the language uses
garbage collection. The parser will try to hold references to semantic
values for as little time as needed.

   Java parsers do not support `%printer', as `toString()' can be used
to print the semantic values. This however may change (in a
backwards-compatible way) in future versions of Bison.


File: bison.info, Node: Java Location Values, Next: Java Parser Interface, Prev: Java Semantic Values, Up: Java Parsers

10.2.3 Java Location Values
---------------------------

When the directive `%locations' is used, the Java parser supports
location tracking, see *note Tracking Locations::. An auxiliary
user-defined class defines a "position", a single point in a file;
Bison itself defines a class representing a "location", a range
composed of a pair of positions (possibly spanning several files). The
location class is an inner class of the parser; the name is `Location'
by default, and may also be renamed using `%define api.location.type
"CLASS-NAME"'.

   The location class treats the position as a completely opaque value.
By default, the class name is `Position', but this can be changed with
`%define api.position.type "CLASS-NAME"'. This class must be supplied
by the user.

 -- Instance Variable of Location: Position begin
 -- Instance Variable of Location: Position end
     The first, inclusive, position of the range, and the first
     position beyond it.

 -- Constructor on Location: Location (Position LOC)
     Create a `Location' denoting an empty range located at a given
     point.

 -- Constructor on Location: Location (Position BEGIN, Position END)
     Create a `Location' from the endpoints of the range.

 -- Method on Location: String toString ()
     Return a string describing the range represented by the location.
     For this to work properly, the position class should override the
     `equals' and `toString' methods appropriately.


File: bison.info, Node: Java Parser Interface, Next: Java Scanner Interface, Prev: Java Location Values, Up: Java Parsers

10.2.4 Java Parser Interface
----------------------------

The name of the generated parser class defaults to `YYParser'. The
`YY' prefix may be changed using the `%name-prefix' directive or the
`-p'/`--name-prefix' option. Alternatively, use `%define
parser_class_name "NAME"' to give a custom name to the class. The
interface of this class is detailed below.

   By default, the parser class has package visibility. A declaration
`%define public' changes it to public visibility. Remember that,
according to the Java language specification, the name of the `.java'
file should match the name of the class in this case. Similarly, you
can use `abstract', `final' and `strictfp' with the `%define'
declaration to add other modifiers to the parser class.

   The Java package name of the parser class can be specified using the
`%define package' directive. The superclass and the implemented
interfaces of the parser class can be specified with the `%define
extends' and `%define implements' directives.

   The parser class defines an inner class, `Location', that is used
for location tracking (see *note Java Location Values::), and an inner
interface, `Lexer' (see *note Java Scanner Interface::). Other than
this inner class and interface, and the members described in the
interface below, all the other members and fields are preceded with a
`yy' or `YY' prefix to avoid clashes with user code.

   The parser class can be extended using the `%parse-param' directive.
Each occurrence of the directive will add a `protected final' field to
the parser class, and an argument to its constructor, which initializes
the field automatically.

   Token names defined by `%token' and the predefined `EOF' token name
are added as constant fields to the parser class.

 -- Constructor on YYParser: YYParser (LEX_PARAM, ..., PARSE_PARAM,
          ...)
     Build a new parser object with embedded `%code lexer'. There are
     no parameters, unless `%parse-param's and/or `%lex-param's are
     used.

 -- Constructor on YYParser: YYParser (Lexer LEXER, PARSE_PARAM, ...)
     Build a new parser object using the specified scanner. There are
     no additional parameters unless `%parse-param's are used.

     If the scanner is defined by `%code lexer', this constructor is
     declared `protected' and is called automatically with a scanner
     created with the correct `%lex-param's.

 -- Method on YYParser: boolean parse ()
     Run the syntactic analysis, and return `true' on success, `false'
     otherwise.

 -- Method on YYParser: boolean recovering ()
     During the syntactic analysis, return `true' if recovering from a
     syntax error. *Note Error Recovery::.

 -- Method on YYParser: java.io.PrintStream getDebugStream ()
 -- Method on YYParser: void setDebugStream (java.io.PrintStream O)
     Get or set the stream used for tracing the parsing. It defaults to
     `System.err'.

 -- Method on YYParser: int getDebugLevel ()
 -- Method on YYParser: void setDebugLevel (int L)
     Get or set the tracing level. Currently its value is either 0, no
     trace, or nonzero, full tracing.
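
   To make the `Lexer' inner interface and the token constants more
concrete, here is a minimal, self-contained sketch of a hand-written
scanner of the kind that could be handed to the `YYParser (Lexer
LEXER, ...)' constructor. Everything in it is an assumption made for
illustration: the local `Lexer' interface merely stands in for the
generated `YYParser.Lexer' (without location tracking, so `yyerror'
takes no location), and `Tokens', `StringLexer', `dump' and the
numeric token codes are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Local stand-in for the generated inner interface YYParser.Lexer
// (location tracking assumed to be off).
interface Lexer {
    int yylex() throws IOException;   // type of the next token
    Object getLVal();                 // semantic value of the last token
    void yyerror(String msg);         // report an error
}

// Hypothetical token codes; a generated parser exposes such constants
// itself, with values chosen by Bison.
final class Tokens {
    static final int EOF = 0;
    static final int NUMBER = 258;
    static final int PLUS = 259;
    private Tokens() {}
}

// Hand-written scanner for inputs such as "12 + 30".
final class StringLexer implements Lexer {
    private final String input;
    private int pos = 0;
    private Object lval = null;
    int errors = 0;

    StringLexer(String input) { this.input = input; }

    @Override
    public int yylex() throws IOException {
        // Skip blanks preceding the token.
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos)))
            pos++;
        lval = null;
        if (pos >= input.length())
            return Tokens.EOF;
        char c = input.charAt(pos);
        if (Character.isDigit(c)) {
            int start = pos;
            while (pos < input.length() && Character.isDigit(input.charAt(pos)))
                pos++;
            lval = Integer.valueOf(input.substring(start, pos));
            return Tokens.NUMBER;
        }
        pos++;
        if (c == '+')
            return Tokens.PLUS;
        yyerror("invalid character: " + c);
        return yylex();               // skip the offending character
    }

    @Override
    public Object getLVal() { return lval; }

    @Override
    public void yyerror(String msg) {
        errors++;
        System.err.println(msg);
    }

    // Helper for experimenting: list the tokens found in INPUT.
    static String dump(String input) {
        StringLexer lexer = new StringLexer(input);
        StringBuilder out = new StringBuilder();
        try {
            for (;;) {
                int tok = lexer.yylex();
                if (out.length() > 0)
                    out.append(' ');
                if (tok == Tokens.EOF) {
                    out.append("EOF");
                    break;
                } else if (tok == Tokens.NUMBER) {
                    out.append("NUMBER(").append(lexer.getLVal()).append(')');
                } else {
                    out.append("PLUS");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump("12 + 30"));
    }
}
```

   With a real generated parser, the scanner would implement the
parser's own `Lexer' interface and return the parser's token
constants instead of the stand-ins used here.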


File: bison.info, Node: Java Scanner Interface, Next: Java Action Features, Prev: Java Parser Interface, Up: Java Parsers

10.2.5 Java Scanner Interface
-----------------------------

There are two possible ways to interface a Bison-generated Java parser
with a scanner: the scanner may be defined by `%code lexer', or defined
elsewhere. In either case, the scanner has to implement the `Lexer'
inner interface of the parser class.

   In the first case, the body of the scanner class is placed in `%code
lexer' blocks. If you want to pass parameters from the parser
constructor to the scanner constructor, specify them with `%lex-param';
they are passed before `%parse-param's to the constructor.

   In the second case, the scanner has to implement the `Lexer'
interface, which is defined within the parser class (e.g.,
`YYParser.Lexer'). The constructor of the parser object will then
accept an object implementing the interface; `%lex-param' is not used
in this case.

   In both cases, the scanner has to implement the following methods.

 -- Method on Lexer: void yyerror (Location LOC, String MSG)
     This method is defined by the user to emit an error message. The
     first parameter is omitted if location tracking is not active.
     Its type can be changed using `%define api.location.type
     "CLASS-NAME".'

 -- Method on Lexer: int yylex ()
     Return the next token. The return value is the token's type; its
     semantic value and location are saved and made available through
     the other methods of the interface.

     Use `%define lex_throws' to specify any uncaught exceptions.
     Default is `java.io.IOException'.

 -- Method on Lexer: Position getStartPos ()
 -- Method on Lexer: Position getEndPos ()
     Return respectively the first position of the last token that
     `yylex' returned, and the first position beyond it. These methods
     are not needed unless location tracking is active.

     The return type can be changed using `%define api.position.type
     "CLASS-NAME".'

 -- Method on Lexer: Object getLVal ()
     Return the semantic value of the last token that `yylex' returned.

     The return type can be changed using `%define stype "CLASS-NAME".'


File: bison.info, Node: Java Action Features, Next: Java Differences, Prev: Java Scanner Interface, Up: Java Parsers

10.2.6 Special Features for Use in Java Actions
-----------------------------------------------

The following special constructs can be used in Java actions. Other
analogous C action features are currently unavailable for Java.

   Use `%define throws' to specify any uncaught exceptions from parser
actions, and initial actions specified by `%initial-action'.

 -- Variable: $N
     The semantic value for the Nth component of the current rule.
     This may not be assigned to. *Note Java Semantic Values::.

 -- Variable: $<TYPEALT>N
     Like `$N' but specifies an alternative type TYPEALT. *Note Java
     Semantic Values::.

 -- Variable: $$
     The semantic value for the grouping made by the current rule. As a
     value, this is in the base type (`Object' or as specified by
     `%define stype'); it is not cast to the declared subtype because
     casts are not allowed on the left-hand side of Java assignments.
     Use an explicit Java cast if the correct subtype is needed. *Note
     Java Semantic Values::.

 -- Variable: $<TYPEALT>$
     Same as `$$' since Java always allows assigning to the base type.
     Perhaps we should use this and `$<>$' for the value and `$$' for
     setting the value but there is currently no easy way to distinguish
     these constructs. *Note Java Semantic Values::.

 -- Variable: @N
     The location information of the Nth component of the current rule.
     This may not be assigned to. *Note Java Location Values::.

 -- Variable: @$
     The location information of the grouping made by the current rule.
     *Note Java Location Values::.

 -- Statement: return YYABORT `;'
     Return immediately from the parser, indicating failure. *Note
     Java Parser Interface::.

 -- Statement: return YYACCEPT `;'
     Return immediately from the parser, indicating success. *Note
     Java Parser Interface::.

 -- Statement: return YYERROR `;'
     Start error recovery (without printing an error message). *Note
     Error Recovery::.

 -- Function: boolean recovering ()
     Return whether error recovery is being done. In this state, the
     parser reads tokens until it reaches a known state, and then
     restarts normal operation. *Note Error Recovery::.

 -- Function: protected void yyerror (String msg)
 -- Function: protected void yyerror (Position pos, String msg)
 -- Function: protected void yyerror (Location loc, String msg)
     Print an error message using the `yyerror' method of the scanner
     instance in use.


File: bison.info, Node: Java Differences, Next: Java Declarations Summary, Prev: Java Action Features, Up: Java Parsers

10.2.7 Differences between C/C++ and Java Grammars
--------------------------------------------------

The different structure of the Java language forces several differences
between C/C++ grammars and grammars designed for Java parsers. This
section summarizes these differences.

   * Java lacks a preprocessor, so the `YYERROR', `YYACCEPT', `YYABORT'
     symbols (*note Table of Symbols::) obviously cannot be macros.
     Instead, they should be preceded by `return' when they appear in
     an action. The actual definition of these symbols is opaque to
     the Bison grammar, and it might change in the future. The only
     meaningful thing you can do with them is return them. *Note
     Java Action Features::.

     Note that of these three symbols, only `YYACCEPT' and `YYABORT'
     will cause a return from the `parse' method, the Java counterpart
     of `yyparse'(1).

   * Java lacks unions, so `%union' has no effect. Instead, semantic
     values have a common base type: `Object' or as specified by
     `%define stype'. Angle brackets on `%token', `%type', `$N' and
     `$$' specify subtypes rather than fields of a union. The type of
     `$$', even with angle brackets, is the base type since Java casts
     are not allowed on the left-hand side of assignments. Also, `$N'
     and `@N' are not allowed on the left-hand side of assignments.
     *Note Java Semantic Values::, and *note Java Action Features::.

   * The prologue declarations have a different meaning than in C/C++
     code.
     `%code imports'
          blocks are placed at the beginning of the Java source code.
          They may include copyright notices. For `package'
          declarations, it is suggested to use `%define package'
          instead.

     unqualified `%code'
          blocks are placed inside the parser class.

     `%code lexer'
          blocks, if specified, should include the implementation of the
          scanner. If there is no such block, the scanner can be any
          class that implements the appropriate interface (*note Java
          Scanner Interface::).

     Other `%code' blocks are not supported in Java parsers. In
     particular, `%{ ... %}' blocks should not be used and may give an
     error in future versions of Bison.

     The epilogue has the same meaning as in C/C++ code and it can be
     used to define other classes used by the parser _outside_ the
     parser class.

   ---------- Footnotes ----------

   (1) Java parsers include the actions in a method separate from
`parse' (the Java counterpart of `yyparse') in order to have an
intuitive syntax that corresponds to these C macros.


File: bison.info, Node: Java Declarations Summary, Prev: Java Differences, Up: Java Parsers

10.2.8 Java Declarations Summary
--------------------------------

This summary only includes declarations that are specific to Java or
that have a special meaning when used in a Java parser.

 -- Directive: %language "Java"
     Generate a Java class for the parser.

 -- Directive: %lex-param {TYPE NAME}
     A parameter for the lexer class defined by `%code lexer' _only_,
     added as parameters to the lexer constructor and the parser
     constructor that _creates_ a lexer. Default is none. *Note Java
     Scanner Interface::.

 -- Directive: %name-prefix "PREFIX"
     The prefix of the parser class name `PREFIXParser' if `%define
     parser_class_name' is not used. Default is `YY'. *Note Java
     Bison Interface::.

 -- Directive: %parse-param {TYPE NAME}
     A parameter for the parser class added as parameters to
     constructor(s) and as fields initialized by the constructor(s).
     Default is none. *Note Java Parser Interface::.

 -- Directive: %token <TYPE> TOKEN ...
     Declare tokens. Note that the angle brackets enclose a Java
     _type_. *Note Java Semantic Values::.

 -- Directive: %type <TYPE> NONTERMINAL ...
     Declare the type of nonterminals. Note that the angle brackets
     enclose a Java _type_. *Note Java Semantic Values::.

 -- Directive: %code { CODE ... }
     Code appended to the inside of the parser class. *Note Java
     Differences::.

 -- Directive: %code imports { CODE ... }
     Code inserted just after the `package' declaration. *Note Java
     Differences::.

 -- Directive: %code lexer { CODE ... }
     Code added to the body of an inner lexer class within the parser
     class. *Note Java Scanner Interface::.

 -- Directive: %% CODE ...
     Code (after the second `%%') appended to the end of the file,
     _outside_ the parser class. *Note Java Differences::.

 -- Directive: %{ CODE ... %}
     Not supported. Use `%code imports' instead. *Note Java
     Differences::.

 -- Directive: %define abstract
     Whether the parser class is declared `abstract'. Default is false.
     *Note Java Bison Interface::.

 -- Directive: %define extends "SUPERCLASS"
     The superclass of the parser class. Default is none. *Note Java
     Bison Interface::.

 -- Directive: %define final
     Whether the parser class is declared `final'. Default is false.
     *Note Java Bison Interface::.

 -- Directive: %define implements "INTERFACES"
     The implemented interfaces of the parser class, a comma-separated
     list. Default is none. *Note Java Bison Interface::.

 -- Directive: %define lex_throws "EXCEPTIONS"
     The exceptions thrown by the `yylex' method of the lexer, a
     comma-separated list. Default is `java.io.IOException'. *Note
     Java Scanner Interface::.

 -- Directive: %define api.location.type "CLASS"
     The name of the class used for locations (a range between two
     positions). This class is generated as an inner class of the
     parser class by `bison'. Default is `Location'. Formerly named
     `location_type'. *Note Java Location Values::.

 -- Directive: %define package "PACKAGE"
     The package to put the parser class in. Default is none. *Note
     Java Bison Interface::.

 -- Directive: %define parser_class_name "NAME"
     The name of the parser class. Default is `YYParser' or
     `NAME-PREFIXParser'. *Note Java Bison Interface::.

 -- Directive: %define api.position.type "CLASS"
     The name of the class used for positions. This class must be
     supplied by the user. Default is `Position'. Formerly named
     `position_type'. *Note Java Location Values::.

 -- Directive: %define public
     Whether the parser class is declared `public'. Default is false.
     *Note Java Bison Interface::.

 -- Directive: %define stype "CLASS"
     The base type of semantic values. Default is `Object'. *Note
     Java Semantic Values::.

 -- Directive: %define strictfp
     Whether the parser class is declared `strictfp'. Default is false.
     *Note Java Bison Interface::.

 -- Directive: %define throws "EXCEPTIONS"
     The exceptions thrown by user-supplied parser actions and
     `%initial-action', a comma-separated list. Default is none.
     *Note Java Parser Interface::.


File: bison.info, Node: FAQ, Next: Table of Symbols, Prev: Other Languages, Up: Top

11 Frequently Asked Questions
*****************************

Several questions about Bison come up occasionally. Here some of them
are addressed.

* Menu:

* Memory Exhausted::            Breaking the Stack Limits
* How Can I Reset the Parser::  `yyparse' Keeps some State
* Strings are Destroyed::       `yylval' Loses Track of Strings
* Implementing Gotos/Loops::    Control Flow in the Calculator
* Multiple start-symbols::      Factoring closely related grammars
* Secure? Conform?::            Is Bison POSIX safe?
* I can't build Bison::         Troubleshooting
* Where can I find help?::      Troubleshouting
* Bug Reports::                 Troublereporting
* More Languages::              Parsers in C++, Java, and so on
* Beta Testing::                Experimenting with development versions
* Mailing Lists::               Meeting other Bison users


File: bison.info, Node: Memory Exhausted, Next: How Can I Reset the Parser, Up: FAQ

11.1 Memory Exhausted
=====================

     My parser returns with error with a `memory exhausted' message.
     What can I do?

   This question is already addressed elsewhere, see *note Recursive
Rules: Recursion.


File: bison.info, Node: How Can I Reset the Parser, Next: Strings are Destroyed, Prev: Memory Exhausted, Up: FAQ

11.2 How Can I Reset the Parser
===============================

The following phenomenon has several symptoms, resulting in the
following typical questions:

     I invoke `yyparse' several times, and on correct input it works
     properly; but when a parse error is found, all the other calls fail
     too. How can I reset the error flag of `yyparse'?

or

     My parser includes support for an `#include'-like feature, in
     which case I run `yyparse' from `yyparse'. This fails although I
     did specify `%define api.pure full'.

   These problems typically come not from Bison itself, but from
Lex-generated scanners. Because these scanners use large buffers for
speed, they might not notice a change of input file. As a
demonstration, consider the following source file, `first-line.l':

     %{
     #include <stdio.h>
     #include <stdlib.h>
     %}
     %%
     .*\n    ECHO; return 1;
     %%
     int
     yyparse (char const *file)
     {
       yyin = fopen (file, "r");
       if (!yyin)
         {
           perror ("fopen");
           exit (EXIT_FAILURE);
         }
       /* One token only. */
       yylex ();
       if (fclose (yyin) != 0)
         {
           perror ("fclose");
           exit (EXIT_FAILURE);
         }
       return 0;
     }

     int
     main (void)
     {
       yyparse ("input");
       yyparse ("input");
       return 0;
     }

If the file `input' contains

     input:1: Hello,
     input:2: World!

then instead of getting the first line twice, you get:

     $ flex -ofirst-line.c first-line.l
     $ gcc  -ofirst-line   first-line.c -ll
     $ ./first-line
     input:1: Hello,
     input:2: World!

   Therefore, whenever you change `yyin', you must tell the
Lex-generated scanner to discard its current buffer and switch to the
new one. This depends upon your implementation of Lex; see its
documentation for more. For Flex, it suffices to call
`YY_FLUSH_BUFFER' after each change to `yyin'. If your Flex-generated
scanner needs to read from several input streams to handle features
like include files, you might consider using Flex functions like
`yy_switch_to_buffer' that manipulate multiple input buffers.

   If your Flex-generated scanner uses start conditions (*note Start
conditions: (flex)Start conditions.), you might also want to reset the
scanner's state, i.e., go back to the initial start condition, through
a call to `BEGIN (0)'.


File: bison.info, Node: Strings are Destroyed, Next: Implementing Gotos/Loops, Prev: How Can I Reset the Parser, Up: FAQ

11.3 Strings are Destroyed
==========================

     My parser seems to destroy old strings, or maybe it loses track of
     them. Instead of reporting `"foo", "bar"', it reports `"bar",
     "bar"', or even `"foo\nbar", "bar"'.

   This error is probably the single most frequent "bug report" sent to
Bison lists, but it really stems from a misunderstanding of the role
of the scanner.
Consider the following Lex code:

     %{
     #include <stdio.h>
     char *yylval = NULL;
     %}
     %%
     .*    yylval = yytext; return 1;
     \n    /* IGNORE */
     %%
     int
     main ()
     {
       /* Similar to using $1, $2 in a Bison action.  */
       char *fst = (yylex (), yylval);
       char *snd = (yylex (), yylval);
       printf ("\"%s\", \"%s\"\n", fst, snd);
       return 0;
     }

   If you compile and run this code, you get:

     $ flex -osplit-lines.c split-lines.l
     $ gcc -osplit-lines split-lines.c -ll
     $ printf 'one\ntwo\n' | ./split-lines
     "one
     two", "two"

This is because `yytext' is a buffer provided for _reading_ in the
action; if you want to keep its contents, you have to duplicate them
(e.g., using `strdup').  Note that the output may depend on how your
implementation of Lex handles `yytext'.  For instance, when given the
Lex compatibility option `-l' (which triggers the option `%array') Flex
behaves differently:

     $ flex -l -osplit-lines.c split-lines.l
     $ gcc -osplit-lines split-lines.c -ll
     $ printf 'one\ntwo\n' | ./split-lines
     "two", "two"


File: bison.info,  Node: Implementing Gotos/Loops,  Next: Multiple start-symbols,  Prev: Strings are Destroyed,  Up: FAQ

11.4 Implementing Gotos/Loops
=============================

     My simple calculator supports variables, assignments, and
     functions, but how can I implement gotos, or loops?

   Although very pedagogical, the examples included in this manual blur
the distinction between the parser--whose job is to recover the
structure of a text and to transmit it to subsequent modules of the
program--and the processing (such as the execution) of this structure.
This works well with so-called straight-line programs, i.e., precisely
those that have a straightforward execution model: execute simple
instructions one after the other.

   If you want a richer model, you will probably need to use the parser
to construct a tree that does represent the structure it has recovered;
this tree is usually called the "abstract syntax tree", or "AST" for
short.  Then, walking through this tree, traversing it in various ways,
will enable treatments such as its execution or its translation, which
will result in an interpreter or a compiler.

   This topic is way beyond the scope of this manual, and the reader is
invited to consult the dedicated literature.


File: bison.info,  Node: Multiple start-symbols,  Next: Secure? Conform?,  Prev: Implementing Gotos/Loops,  Up: FAQ

11.5 Multiple start-symbols
===========================

     I have several closely related grammars, and I would like to share
     their implementations.  In fact, I could use a single grammar but
     with multiple entry points.

   Bison does not support multiple start-symbols, but there is a very
simple means to simulate them.  If `foo' and `bar' are the two pseudo
start-symbols, then introduce two new tokens, say `START_FOO' and
`START_BAR', and use them as switches from the real start-symbol:

     %token START_FOO START_BAR;
     %start start;
     start:
       START_FOO foo
     | START_BAR bar;

   These tokens prevent the introduction of new conflicts.  As far as
the parser goes, that is all that is needed.

   Now the difficult part is ensuring that the scanner will send these
tokens first.  If your scanner is hand-written, that should be
straightforward.  If your scanner is generated by Lex, then there is a
simple means to do it: recall that anything between `%{ ... %}' after
the first `%%' is copied verbatim to the top of the generated `yylex'
function.  Make sure a variable `start_token' is available in the
scanner (e.g., a global variable or using `%lex-param' etc.), and use
the following:

       /* Prologue.  */
     %%
     %{
       if (start_token)
         {
           int t = start_token;
           start_token = 0;
           return t;
         }
     %}
       /* The rules.  */


File: bison.info,  Node: Secure? Conform?,  Next: I can't build Bison,  Prev: Multiple start-symbols,  Up: FAQ

11.6 Secure?  Conform?
======================

     Is Bison secure?  Does it conform to POSIX?

   If you're looking for a guarantee or certification, we don't provide
it.  However, Bison is intended to be a reliable program that conforms
to the POSIX specification for Yacc.  If you run into problems, please
send us a bug report.


File: bison.info,  Node: I can't build Bison,  Next: Where can I find help?,  Prev: Secure? Conform?,  Up: FAQ

11.7 I can't build Bison
========================

     I can't build Bison because `make' complains that `msgfmt' is not
     found.  What should I do?

   As with most GNU packages with internationalization support, that
feature is turned on by default.  If you have problems building in the
`po' subdirectory, it indicates that your system's internationalization
support is lacking.  You can re-configure Bison with `--disable-nls' to
turn off this support, or you can install GNU gettext from
`ftp://ftp.gnu.org/gnu/gettext/' and re-configure Bison.  See the file
`ABOUT-NLS' for more information.


File: bison.info,  Node: Where can I find help?,  Next: Bug Reports,  Prev: I can't build Bison,  Up: FAQ

11.8 Where can I find help?
===========================

     I'm having trouble using Bison.  Where can I find help?

   First, read this fine manual.  Beyond that, you can send mail to
<help-bison@gnu.org>.  This mailing list is intended to be populated
with people who are willing to answer questions about using and
installing Bison.  Please keep in mind that (most of) the people on the
list have aspects of their lives which are not related to Bison (!), so
you may not receive an answer to your question right away.  This can be
frustrating, but please try not to honk them off; remember that any
help they provide is purely voluntary and out of the kindness of their
hearts.


File: bison.info,  Node: Bug Reports,  Next: More Languages,  Prev: Where can I find help?,  Up: FAQ

11.9 Bug Reports
================

     I found a bug.  What should I include in the bug report?

   Before you send a bug report, make sure you are using the latest
version.  Check `ftp://ftp.gnu.org/pub/gnu/bison/' or one of its
mirrors.  Be sure to include the version number in your bug report.  If
the bug is present in the latest version but not in a previous version,
try to determine the most recent version which did not contain the bug.

   If the bug is parser-related, you should include the smallest grammar
you can which demonstrates the bug.  The grammar file should also be
complete (i.e., I should be able to run it through Bison without having
to edit or add anything).  The smaller and simpler the grammar, the
easier it will be to fix the bug.

   Include information about your compilation environment, including
your operating system's name and version and your compiler's name and
version.  If you have trouble compiling, you should also include a
transcript of the build session, starting with the invocation of
`configure'.
Depending on the nature of the bug, you may be asked to
send additional files as well (such as `config.h' or `config.cache').

   Patches are most welcome, but not required.  That is, do not
hesitate to send a bug report even if you cannot provide a fix.

   Send bug reports to <bug-bison@gnu.org>.


File: bison.info,  Node: More Languages,  Next: Beta Testing,  Prev: Bug Reports,  Up: FAQ

11.10 More Languages
====================

     Will Bison ever have C++ and Java support?  How about INSERT YOUR
     FAVORITE LANGUAGE HERE?

   C++ and Java support is there now, and is documented.  We'd love to
add other languages; contributions are welcome.


File: bison.info,  Node: Beta Testing,  Next: Mailing Lists,  Prev: More Languages,  Up: FAQ

11.11 Beta Testing
==================

     What is involved in being a beta tester?

   It's not terribly involved.  Basically, you would download a test
release, compile it, and use it to build and run a parser or two.  After
that, you would submit either a bug report or a message saying that
everything is okay.  It is important to report successes as well as
failures because test releases eventually become mainstream releases,
but only if they are adequately tested.  If no one tests, development is
essentially halted.

   Beta testers are particularly needed for operating systems to which
the developers do not have easy access.  They currently have easy
access to recent GNU/Linux and Solaris versions.  Reports about other
operating systems are especially welcome.


File: bison.info,  Node: Mailing Lists,  Prev: Beta Testing,  Up: FAQ

11.12 Mailing Lists
===================

     How do I join the help-bison and bug-bison mailing lists?

   See `http://lists.gnu.org/'.


File: bison.info,  Node: Table of Symbols,  Next: Glossary,  Prev: FAQ,  Up: Top

Appendix A Bison Symbols
************************

 -- Variable: @$
     In an action, the location of the left-hand side of the rule.
     *Note Tracking Locations::.

 -- Variable: @N
 -- Symbol: @N
     In an action, the location of the N-th symbol of the right-hand
     side of the rule.  *Note Tracking Locations::.

     In a grammar, the Bison-generated nonterminal symbol for a
     mid-rule action with a semantic value.  *Note Mid-Rule Action
     Translation::.

 -- Variable: @NAME
 -- Variable: @[NAME]
     In an action, the location of a symbol addressed by NAME.  *Note
     Tracking Locations::.

 -- Symbol: $@N
     In a grammar, the Bison-generated nonterminal symbol for a
     mid-rule action with no semantic value.  *Note Mid-Rule Action
     Translation::.

 -- Variable: $$
     In an action, the semantic value of the left-hand side of the rule.
     *Note Actions::.

 -- Variable: $N
     In an action, the semantic value of the N-th symbol of the
     right-hand side of the rule.  *Note Actions::.

 -- Variable: $NAME
 -- Variable: $[NAME]
     In an action, the semantic value of a symbol addressed by NAME.
     *Note Actions::.

 -- Delimiter: %%
     Delimiter used to separate the grammar rule section from the Bison
     declarations section or the epilogue.  *Note The Overall Layout of
     a Bison Grammar: Grammar Layout.

 -- Delimiter: %{CODE%}
     All code listed between `%{' and `%}' is copied verbatim to the
     parser implementation file.  Such code forms the prologue of the
     grammar file.  *Note Outline of a Bison Grammar: Grammar Outline.

 -- Construct: /* ... */
 -- Construct: // ...
     Comments, as in C/C++.

 -- Delimiter: :
     Separates a rule's result from its components.  *Note Syntax of
     Grammar Rules: Rules.

 -- Delimiter: ;
     Terminates a rule.  *Note Syntax of Grammar Rules: Rules.

 -- Delimiter: |
     Separates alternate rules for the same result nonterminal.  *Note
     Syntax of Grammar Rules: Rules.

 -- Directive: <*>
     Used to define a default tagged `%destructor' or default tagged
     `%printer'.

     This feature is experimental.  More user feedback will help to
     determine whether it should become a permanent feature.

     *Note Freeing Discarded Symbols: Destructor Decl.

 -- Directive: <>
     Used to define a default tagless `%destructor' or default tagless
     `%printer'.

     This feature is experimental.  More user feedback will help to
     determine whether it should become a permanent feature.

     *Note Freeing Discarded Symbols: Destructor Decl.

 -- Symbol: $accept
     The predefined nonterminal whose only rule is `$accept: START
     $end', where START is the start symbol.  *Note The Start-Symbol:
     Start Decl.  It cannot be used in the grammar.

 -- Directive: %code {CODE}
 -- Directive: %code QUALIFIER {CODE}
     Insert CODE verbatim into the output parser source at the default
     location or at the location specified by QUALIFIER.  *Note %code
     Summary::.

 -- Directive: %debug
     Equip the parser for debugging.  *Note Decl Summary::.

 -- Directive: %define VARIABLE
 -- Directive: %define VARIABLE VALUE
 -- Directive: %define VARIABLE "VALUE"
     Define a variable to adjust Bison's behavior.  *Note %define
     Summary::.

 -- Directive: %defines
     Bison declaration to create a parser header file, which is usually
     meant for the scanner.  *Note Decl Summary::.

 -- Directive: %defines DEFINES-FILE
     Same as above, but save in the file DEFINES-FILE.  *Note Decl
     Summary::.

 -- Directive: %destructor
     Specify how the parser should reclaim the memory associated with
     discarded symbols.  *Note Freeing Discarded Symbols: Destructor
     Decl.

 -- Directive: %dprec
     Bison declaration to assign a precedence to a rule that is used at
     parse time to resolve reduce/reduce conflicts.  *Note Writing GLR
     Parsers: GLR Parsers.

 -- Symbol: $end
     The predefined token marking the end of the token stream.  It
     cannot be used in the grammar.

 -- Symbol: error
     A token name reserved for error recovery.  This token may be used
     in grammar rules so as to allow the Bison parser to recognize an
     error in the grammar without halting the process.  In effect, a
     sentence containing an error may be recognized as valid.  On a
     syntax error, the token `error' becomes the current lookahead
     token.  Actions corresponding to `error' are then executed, and
     the lookahead token is reset to the token that originally caused
     the violation.  *Note Error Recovery::.

 -- Directive: %error-verbose
     Bison declaration to request verbose, specific error message
     strings when `yyerror' is called.  *Note Error Reporting::.

 -- Directive: %file-prefix "PREFIX"
     Bison declaration to set the prefix of the output files.  *Note
     Decl Summary::.

 -- Directive: %glr-parser
     Bison declaration to produce a GLR parser.  *Note Writing GLR
     Parsers: GLR Parsers.

 -- Directive: %initial-action
     Run user code before parsing.  *Note Performing Actions before
     Parsing: Initial Action Decl.

 -- Directive: %language
     Specify the programming language for the generated parser.  *Note
     Decl Summary::.

 -- Directive: %left
     Bison declaration to assign left associativity to token(s).  *Note
     Operator Precedence: Precedence Decl.

 -- Directive: %lex-param {ARGUMENT-DECLARATION}
     Bison declaration to specify an additional parameter that
     `yylex' should accept.  *Note Calling Conventions for Pure
     Parsers: Pure Calling.

 -- Directive: %merge
     Bison declaration to assign a merging function to a rule.  If
     there is a reduce/reduce conflict with a rule having the same
     merging function, the function is applied to the two semantic
     values to get a single result.  *Note Writing GLR Parsers: GLR
     Parsers.

 -- Directive: %name-prefix "PREFIX"
     Obsoleted by the `%define' variable `api.prefix' (*note Multiple
     Parsers in the Same Program: Multiple Parsers.).

     Rename the external symbols (variables and functions) used in the
     parser so that they start with PREFIX instead of `yy'.  Contrary to
     `api.prefix', it does not rename types and macros.

     The precise list of symbols renamed in C parsers is `yyparse',
     `yylex', `yyerror', `yynerrs', `yylval', `yychar', `yydebug', and
     (if locations are used) `yylloc'.  If you use a push parser,
     `yypush_parse', `yypull_parse', `yypstate', `yypstate_new' and
     `yypstate_delete' will also be renamed.  For example, if you use
     `%name-prefix "c_"', the names become `c_parse', `c_lex', and so
     on.  For C++ parsers, see the `%define namespace' documentation in
     this section.

 -- Directive: %no-lines
     Bison declaration to avoid generating `#line' directives in the
     parser implementation file.  *Note Decl Summary::.

 -- Directive: %nonassoc
     Bison declaration to assign nonassociativity to token(s).  *Note
     Operator Precedence: Precedence Decl.

 -- Directive: %output "FILE"
     Bison declaration to set the name of the parser implementation
     file.  *Note Decl Summary::.

 -- Directive: %parse-param {ARGUMENT-DECLARATION}
     Bison declaration to specify an additional parameter that
     `yyparse' should accept.  *Note The Parser Function `yyparse':
     Parser Function.

 -- Directive: %prec
     Bison declaration to assign a precedence to a specific rule.
     *Note Context-Dependent Precedence: Contextual Precedence.

 -- Directive: %pure-parser
     Deprecated version of `%define api.pure' (*note api.pure: %define
     Summary.), for which Bison is more careful to warn about
     unreasonable usage.

 -- Directive: %require "VERSION"
     Require version VERSION or higher of Bison.  *Note Require a
     Version of Bison: Require Decl.

 -- Directive: %right
     Bison declaration to assign right associativity to token(s).
     *Note Operator Precedence: Precedence Decl.

 -- Directive: %skeleton
     Specify the skeleton to use; usually for development.  *Note Decl
     Summary::.

 -- Directive: %start
     Bison declaration to specify the start symbol.  *Note The
     Start-Symbol: Start Decl.

 -- Directive: %token
     Bison declaration to declare token(s) without specifying
     precedence.  *Note Token Type Names: Token Decl.

 -- Directive: %token-table
     Bison declaration to include a token name table in the parser
     implementation file.  *Note Decl Summary::.

 -- Directive: %type
     Bison declaration to declare nonterminals.  *Note Nonterminal
     Symbols: Type Decl.

 -- Symbol: $undefined
     The predefined token onto which all undefined values returned by
     `yylex' are mapped.  It cannot be used in the grammar; use
     `error' instead.

 -- Directive: %union
     Bison declaration to specify several possible data types for
     semantic values.  *Note The Collection of Value Types: Union Decl.

 -- Macro: YYABORT
     Macro to pretend that an unrecoverable syntax error has occurred,
     by making `yyparse' return 1 immediately.  The error reporting
     function `yyerror' is not called.  *Note The Parser Function
     `yyparse': Parser Function.

     For Java parsers, this functionality is invoked using `return
     YYABORT;' instead.

 -- Macro: YYACCEPT
     Macro to pretend that a complete utterance of the language has been
     read, by making `yyparse' return 0 immediately.  *Note The Parser
     Function `yyparse': Parser Function.

     For Java parsers, this functionality is invoked using `return
     YYACCEPT;' instead.

 -- Macro: YYBACKUP
     Macro to discard a value from the parser stack and fake a lookahead
     token.  *Note Special Features for Use in Actions: Action Features.

 -- Variable: yychar
     External integer variable that contains the integer value of the
     lookahead token.  (In a pure parser, it is a local variable within
     `yyparse'.)  Error-recovery rule actions may examine this variable.
     *Note Special Features for Use in Actions: Action Features.

 -- Macro: yyclearin
     Macro used in error-recovery rule actions.  It clears the previous
     lookahead token.  *Note Error Recovery::.

 -- Macro: YYDEBUG
     Macro to define to equip the parser with tracing code.  *Note
     Tracing Your Parser: Tracing.

 -- Variable: yydebug
     External integer variable set to zero by default.  If `yydebug' is
     given a nonzero value, the parser will output information on input
     symbols and parser actions.  *Note Tracing Your Parser: Tracing.

 -- Macro: yyerrok
     Macro to cause the parser to recover immediately to its normal mode
     after a syntax error.  *Note Error Recovery::.

 -- Macro: YYERROR
     Cause an immediate syntax error.  This statement initiates error
     recovery just as if the parser itself had detected an error;
     however, it does not call `yyerror', and does not print any
     message.  If you want to print an error message, call `yyerror'
     explicitly before the `YYERROR;' statement.  *Note Error
     Recovery::.

     For Java parsers, this functionality is invoked using `return
     YYERROR;' instead.

 -- Function: yyerror
     User-supplied function to be called by `yyparse' on error.  *Note
     The Error Reporting Function `yyerror': Error Reporting.

 -- Macro: YYERROR_VERBOSE
     An obsolete macro that you define with `#define' in the prologue
     to request verbose, specific error message strings when `yyerror'
     is called.  It doesn't matter what definition you use for
     `YYERROR_VERBOSE', just whether you define it.  Supported by the C
     skeletons only; using `%error-verbose' is preferred.  *Note Error
     Reporting::.

 -- Macro: YYFPRINTF
     Macro used to output run-time traces.  *Note Enabling Traces::.

 -- Macro: YYINITDEPTH
     Macro for specifying the initial size of the parser stack.  *Note
     Memory Management::.

 -- Function: yylex
     User-supplied lexical analyzer function, called with no arguments
     to get the next token.  *Note The Lexical Analyzer Function
     `yylex': Lexical.

 -- Macro: YYLEX_PARAM
     An obsolete macro for specifying an extra argument (or list of
     extra arguments) for `yyparse' to pass to `yylex'.  The use of this
     macro is deprecated, and is supported only for Yacc-like parsers.
     *Note Calling Conventions for Pure Parsers: Pure Calling.

 -- Variable: yylloc
     External variable in which `yylex' should place the line and column
     numbers associated with a token.  (In a pure parser, it is a local
     variable within `yyparse', and its address is passed to `yylex'.)
     You can ignore this variable if you don't use the `@' feature in
     the grammar actions.  *Note Textual Locations of Tokens: Token
     Locations.  In semantic actions, it stores the location of the
     lookahead token.  *Note Actions and Locations: Actions and
     Locations.

 -- Type: YYLTYPE
     Data type of `yylloc'; by default, a structure with four members.
     *Note Data Types of Locations: Location Type.

 -- Variable: yylval
     External variable in which `yylex' should place the semantic value
     associated with a token.  (In a pure parser, it is a local
     variable within `yyparse', and its address is passed to `yylex'.)
     *Note Semantic Values of Tokens: Token Values.  In semantic
     actions, it stores the semantic value of the lookahead token.
     *Note Actions: Actions.

 -- Macro: YYMAXDEPTH
     Macro for specifying the maximum size of the parser stack.  *Note
     Memory Management::.

 -- Variable: yynerrs
     Global variable which Bison increments each time it reports a
     syntax error.  (In a pure parser, it is a local variable within
     `yyparse'.  In a pure push parser, it is a member of `yypstate'.)
     *Note The Error Reporting Function `yyerror': Error Reporting.

 -- Function: yyparse
     The parser function produced by Bison; call this function to start
     parsing.  *Note The Parser Function `yyparse': Parser Function.

 -- Macro: YYPRINT
     Macro used to output token semantic values.  For `yacc.c' only.
     Obsoleted by `%printer'.  *Note The `YYPRINT' Macro: The YYPRINT
     Macro.

 -- Function: yypstate_delete
     The function to delete a parser instance, produced by Bison in
     push mode; call this function to delete the memory associated with
     a parser.  *Note The Parser Delete Function `yypstate_delete':
     Parser Delete Function.  (The current push parsing interface is
     experimental and may evolve.  More user feedback will help to
     stabilize it.)

 -- Function: yypstate_new
     The function to create a parser instance, produced by Bison in
     push mode; call this function to create a new parser.  *Note The
     Parser Create Function `yypstate_new': Parser Create Function.
     (The current push parsing interface is experimental and may evolve.
     More user feedback will help to stabilize it.)

 -- Function: yypull_parse
     The parser function produced by Bison in push mode; call this
     function to parse the rest of the input stream.  *Note The Pull
     Parser Function `yypull_parse': Pull Parser Function.  (The
     current push parsing interface is experimental and may evolve.
     More user feedback will help to stabilize it.)

 -- Function: yypush_parse
     The parser function produced by Bison in push mode; call this
     function to parse a single token.  *Note The Push Parser Function
     `yypush_parse': Push Parser Function.  (The current push parsing
     interface is experimental and may evolve.  More user feedback will
     help to stabilize it.)

 -- Macro: YYPARSE_PARAM
     An obsolete macro for specifying the name of a parameter that
     `yyparse' should accept.  The use of this macro is deprecated, and
     is supported only for Yacc-like parsers.  *Note Calling
     Conventions for Pure Parsers: Pure Calling.

 -- Macro: YYRECOVERING
     The expression `YYRECOVERING ()' yields 1 when the parser is
     recovering from a syntax error, and 0 otherwise.
     *Note Special Features for Use in Actions: Action Features.

 -- Macro: YYSTACK_USE_ALLOCA
     Macro used to control the use of `alloca' when the deterministic
     parser in C needs to extend its stacks.  If defined to 0, the
     parser will use `malloc' to extend its stacks.  If defined to 1,
     the parser will use `alloca'.  Values other than 0 and 1 are
     reserved for future Bison extensions.  If not defined,
     `YYSTACK_USE_ALLOCA' defaults to 0.

     In the all-too-common case where your code may run on a host with a
     limited stack and with unreliable stack-overflow checking, you
     should set `YYMAXDEPTH' to a value that cannot possibly result in
     unchecked stack overflow on any of your target hosts when `alloca'
     is called.  You can inspect the code that Bison generates in order
     to determine the proper numeric values.  This will require some
     expertise in low-level implementation details.

 -- Type: YYSTYPE
     Data type of semantic values; `int' by default.  *Note Data Types
     of Semantic Values: Value Type.


File: bison.info,  Node: Glossary,  Next: Copying This Manual,  Prev: Table of Symbols,  Up: Top

Appendix B Glossary
*******************

Accepting state
     A state whose only action is the accept action.  The accepting
     state is thus a consistent state.  *Note Understanding Your
     Parser: Understanding.

Backus-Naur Form (BNF; also called "Backus Normal Form")
     Formal method of specifying context-free grammars originally
     proposed by John Backus, and slightly improved by Peter Naur in
     his 1960-01-02 committee document contributing to what became the
     Algol 60 report.  *Note Languages and Context-Free Grammars:
     Language and Grammar.

Consistent state
     A state containing only one possible action.  *Note Default
     Reductions::.

Context-free grammars
     Grammars specified as rules that can be applied regardless of
     context.  Thus, if there is a rule which says that an integer can
     be used as an expression, integers are allowed _anywhere_ an
     expression is permitted.  *Note Languages and Context-Free
     Grammars: Language and Grammar.

Default reduction
     The reduction that a parser should perform if the current parser
     state contains no other action for the lookahead token.  In
     permitted parser states, Bison declares the reduction with the
     largest lookahead set to be the default reduction and removes that
     lookahead set.  *Note Default Reductions::.

Defaulted state
     A consistent state with a default reduction.  *Note Default
     Reductions::.

Dynamic allocation
     Allocation of memory that occurs during execution, rather than at
     compile time or on entry to a function.

Empty string
     Analogous to the empty set in set theory, the empty string is a
     character string of length zero.

Finite-state stack machine
     A "machine" that has discrete states in which it is said to exist
     at each instant in time.  As input to the machine is processed, the
     machine moves from state to state as specified by the logic of the
     machine.  In the case of the parser, the input is the language
     being parsed, and the states correspond to various stages in the
     grammar rules.  *Note The Bison Parser Algorithm: Algorithm.

Generalized LR (GLR)
     A parsing algorithm that can handle all context-free grammars,
     including those that are not LR(1).  It resolves situations that
     Bison's deterministic parsing algorithm cannot by effectively
     splitting off multiple parsers, trying all possible parsers, and
     discarding those that fail in the light of additional right
     context.  *Note Generalized LR Parsing: Generalized LR Parsing.

Grouping
     A language construct that is (in general) grammatically divisible;
     for example, `expression' or `declaration' in C.  *Note Languages
     and Context-Free Grammars: Language and Grammar.

IELR(1) (Inadequacy Elimination LR(1))
     A minimal LR(1) parser table construction algorithm.  That is,
     given any context-free grammar, IELR(1) generates parser tables
     with the full language-recognition power of canonical LR(1) but
     with nearly the same number of parser states as LALR(1).  This
     reduction in parser states is often an order of magnitude.  More
     importantly, because canonical LR(1)'s extra parser states may
     contain duplicate conflicts in the case of non-LR(1) grammars, the
     number of conflicts for IELR(1) is often an order of magnitude
     less as well.  This can significantly reduce the complexity of
     developing a grammar.  *Note LR Table Construction::.

Infix operator
     An arithmetic operator that is placed between the operands on
     which it performs some operation.

Input stream
     A continuous flow of data between devices or programs.

LAC (Lookahead Correction)
     A parsing mechanism that fixes the problem of delayed syntax error
     detection, which is caused by LR state merging, default
     reductions, and the use of `%nonassoc'.  Delayed syntax error
     detection results in unexpected semantic actions, initiation of
     error recovery in the wrong syntactic context, and an incorrect
     list of expected tokens in a verbose syntax error message.  *Note
     LAC::.

Language construct
     One of the typical usage schemas of the language.  For example,
     one of the constructs of the C language is the `if' statement.
     *Note Languages and Context-Free Grammars: Language and Grammar.

Left associativity
     Operators having left associativity are analyzed from left to
     right: `a+b+c' first computes `a+b' and then combines with `c'.
     *Note Operator Precedence: Precedence.

Left recursion
     A rule whose result symbol is also its first component symbol; for
     example, `expseq1 : expseq1 ',' exp;'. *Note Recursive Rules:
     Recursion.

Left-to-right parsing
     Parsing a sentence of a language by analyzing it token by token
     from left to right. *Note The Bison Parser Algorithm: Algorithm.

Lexical analyzer (scanner)
     A function that reads an input stream and returns tokens one by
     one. *Note The Lexical Analyzer Function `yylex': Lexical.

Lexical tie-in
     A flag, set by actions in the grammar rules, which alters the way
     tokens are parsed. *Note Lexical Tie-ins::.

Literal string token
     A token which consists of two or more fixed characters. *Note
     Symbols::.

Lookahead token
     A token already read but not yet shifted. *Note Lookahead Tokens:
     Lookahead.

LALR(1)
     The class of context-free grammars that Bison (like most other
     parser generators) can handle by default; a subset of LR(1).
     *Note Mysterious Conflicts::.

LR(1)
     The class of context-free grammars in which at most one token of
     lookahead is needed to disambiguate the parsing of any piece of
     input.

Nonterminal symbol
     A grammar symbol standing for a grammatical construct that can be
     expressed through rules in terms of smaller constructs; in other
     words, a construct that is not a token. *Note Symbols::.

Parser
     A function that recognizes valid sentences of a language by
     analyzing the syntax structure of a set of tokens passed to it
     from a lexical analyzer.
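
     As a rough illustration of the "Lexical analyzer" and "Left
     associativity" entries above, the following hand-written C sketch
     reads tokens one by one and combines them from left to right. It
     is not Bison output and does not use `yylex'; the names
     `next_token', `eval_left' and the `TOK_...' token kinds are
     hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch; Bison would define its own.  */
enum token_kind { TOK_END, TOK_NUM, TOK_MINUS, TOK_INVALID };

struct token
{
  enum token_kind kind;
  int value;                    /* Meaningful only for TOK_NUM.  */
};

/* Toy lexical analyzer: return the next token of *INPUT and advance
   *INPUT past it.  */
struct token
next_token (const char **input)
{
  struct token tok = { TOK_END, 0 };
  const char *p = *input;

  while (*p == ' ')
    ++p;
  if (isdigit ((unsigned char) *p))
    {
      tok.kind = TOK_NUM;
      while (isdigit ((unsigned char) *p))
        tok.value = tok.value * 10 + (*p++ - '0');
    }
  else if (*p == '-')
    {
      tok.kind = TOK_MINUS;
      ++p;
    }
  else if (*p != '\0')
    {
      tok.kind = TOK_INVALID;
      ++p;
    }
  *input = p;
  return tok;
}

/* Evaluate NUM ('-' NUM)* from left to right, so that "8-3-2" is
   computed as (8-3)-2.  Store the result in *RESULT and return 1 on
   success, 0 on a syntax error.  */
int
eval_left (const char *input, int *result)
{
  struct token tok = next_token (&input);
  int acc;

  assert (result != NULL);
  if (tok.kind != TOK_NUM)
    return 0;
  acc = tok.value;
  for (tok = next_token (&input); tok.kind == TOK_MINUS;
       tok = next_token (&input))
    {
      struct token rhs = next_token (&input);
      if (rhs.kind != TOK_NUM)
        return 0;
      acc -= rhs.value;         /* Combine with the value so far.  */
    }
  if (tok.kind != TOK_END)
    return 0;
  *result = acc;
  return 1;
}
```

     With left associativity `8-3-2' is grouped as `(8-3)-2'; a
     right-associative reading would instead group it as `8-(3-2)'.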

Postfix operator
     An arithmetic operator that is placed after the operands upon
     which it performs some operation.

Reduction
     Replacing a string of nonterminals and/or terminals with a single
     nonterminal, according to a grammar rule. *Note The Bison Parser
     Algorithm: Algorithm.

Reentrant
     A reentrant subprogram is a subprogram which can be invoked any
     number of times in parallel, without interference between the
     various invocations. *Note A Pure (Reentrant) Parser: Pure Decl.

Reverse Polish notation
     A language in which all operators are postfix operators.

Right recursion
     A rule whose result symbol is also its last component symbol; for
     example, `expseq1: exp ',' expseq1;'. *Note Recursive Rules:
     Recursion.

Semantics
     In computer languages, the semantics are specified by the actions
     taken for each instance of the language, i.e., the meaning of each
     statement. *Note Defining Language Semantics: Semantics.

Shift
     A parser is said to shift when it makes the choice of analyzing
     further input from the stream rather than reducing immediately some
     already-recognized rule. *Note The Bison Parser Algorithm:
     Algorithm.

Single-character literal
     A single character that is recognized and interpreted as is.
     *Note From Formal Rules to Bison Input: Grammar in Bison.

Start symbol
     The nonterminal symbol that stands for a complete valid utterance
     in the language being parsed. The start symbol is usually listed
     as the first nonterminal symbol in a language specification.
     *Note The Start-Symbol: Start Decl.

Symbol table
     A data structure where symbol names and associated data are stored
     during parsing to allow for recognition and use of existing
     information in repeated uses of a symbol.
     *Note Multi-function Calc::.

Syntax error
     An error encountered during parsing of an input stream due to
     invalid syntax. *Note Error Recovery::.

Token
     A basic, grammatically indivisible unit of a language. The symbol
     that describes a token in the grammar is a terminal symbol. The
     input of the Bison parser is a stream of tokens which comes from
     the lexical analyzer. *Note Symbols::.

Terminal symbol
     A grammar symbol that has no rules in the grammar and therefore is
     grammatically indivisible. The piece of text it represents is a
     token. *Note Languages and Context-Free Grammars: Language and
     Grammar.

Unreachable state
     A parser state to which there does not exist a sequence of
     transitions from the parser's start state. A state can become
     unreachable during conflict resolution. *Note Unreachable
     States::.


File: bison.info, Node: Copying This Manual, Next: Bibliography, Prev: Glossary, Up: Top

Appendix C Copying This Manual
******************************

                     Version 1.3, 3 November 2008

     Copyright (C) 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
     `http://fsf.org/'

     Everyone is permitted to copy and distribute verbatim copies
     of this license document, but changing it is not allowed.

  0. PREAMBLE

     The purpose of this License is to make a manual, textbook, or other
     functional and useful document "free" in the sense of freedom: to
     assure everyone the effective freedom to copy and redistribute it,
     with or without modifying it, either commercially or
     noncommercially. Secondarily, this License preserves for the
     author and publisher a way to get credit for their work, while not
     being considered responsible for modifications made by others.

     This License is a kind of "copyleft", which means that derivative
     works of the document must themselves be free in the same sense.
     It complements the GNU General Public License, which is a copyleft
     license designed for free software.

     We have designed this License in order to use it for manuals for
     free software, because free software needs free documentation: a
     free program should come with manuals providing the same freedoms
     that the software does. But this License is not limited to
     software manuals; it can be used for any textual work, regardless
     of subject matter or whether it is published as a printed book.
     We recommend this License principally for works whose purpose is
     instruction or reference.

  1. APPLICABILITY AND DEFINITIONS

     This License applies to any manual or other work, in any medium,
     that contains a notice placed by the copyright holder saying it
     can be distributed under the terms of this License. Such a notice
     grants a world-wide, royalty-free license, unlimited in duration,
     to use that work under the conditions stated herein. The
     "Document", below, refers to any such manual or work. Any member
     of the public is a licensee, and is addressed as "you". You
     accept the license if you copy, modify or distribute the work in a
     way requiring permission under copyright law.

     A "Modified Version" of the Document means any work containing the
     Document or a portion of it, either copied verbatim, or with
     modifications and/or translated into another language.

     A "Secondary Section" is a named appendix or a front-matter section
     of the Document that deals exclusively with the relationship of the
     publishers or authors of the Document to the Document's overall
     subject (or to related matters) and contains nothing that could
     fall directly within that overall subject. (Thus, if the Document
     is in part a textbook of mathematics, a Secondary Section may not
     explain any mathematics.) The relationship could be a matter of
     historical connection with the subject or with related matters, or
     of legal, commercial, philosophical, ethical or political position
     regarding them.

     The "Invariant Sections" are certain Secondary Sections whose
     titles are designated, as being those of Invariant Sections, in
     the notice that says that the Document is released under this
     License. If a section does not fit the above definition of
     Secondary then it is not allowed to be designated as Invariant.
     The Document may contain zero Invariant Sections. If the Document
     does not identify any Invariant Sections then there are none.

     The "Cover Texts" are certain short passages of text that are
     listed, as Front-Cover Texts or Back-Cover Texts, in the notice
     that says that the Document is released under this License. A
     Front-Cover Text may be at most 5 words, and a Back-Cover Text may
     be at most 25 words.

     A "Transparent" copy of the Document means a machine-readable copy,
     represented in a format whose specification is available to the
     general public, that is suitable for revising the document
     straightforwardly with generic text editors or (for images
     composed of pixels) generic paint programs or (for drawings) some
     widely available drawing editor, and that is suitable for input to
     text formatters or for automatic translation to a variety of
     formats suitable for input to text formatters. A copy made in an
     otherwise Transparent file format whose markup, or absence of
     markup, has been arranged to thwart or discourage subsequent
     modification by readers is not Transparent. An image format is
     not Transparent if used for any substantial amount of text. A
     copy that is not "Transparent" is called "Opaque".

     Examples of suitable formats for Transparent copies include plain
     ASCII without markup, Texinfo input format, LaTeX input format,
     SGML or XML using a publicly available DTD, and
     standard-conforming simple HTML, PostScript or PDF designed for
     human modification. Examples of transparent image formats include
     PNG, XCF and JPG. Opaque formats include proprietary formats that
     can be read and edited only by proprietary word processors, SGML or
     XML for which the DTD and/or processing tools are not generally
     available, and the machine-generated HTML, PostScript or PDF
     produced by some word processors for output purposes only.

     The "Title Page" means, for a printed book, the title page itself,
     plus such following pages as are needed to hold, legibly, the
     material this License requires to appear in the title page.
     For works in formats which do not have any title page as such,
     "Title Page" means the text near the most prominent appearance of
     the work's title, preceding the beginning of the body of the text.

     The "publisher" means any person or entity that distributes copies
     of the Document to the public.

     A section "Entitled XYZ" means a named subunit of the Document
     whose title either is precisely XYZ or contains XYZ in parentheses
     following text that translates XYZ in another language. (Here XYZ
     stands for a specific section name mentioned below, such as
     "Acknowledgements", "Dedications", "Endorsements", or "History".)
     To "Preserve the Title" of such a section when you modify the
     Document means that it remains a section "Entitled XYZ" according
     to this definition.

     The Document may include Warranty Disclaimers next to the notice
     which states that this License applies to the Document. These
     Warranty Disclaimers are considered to be included by reference in
     this License, but only as regards disclaiming warranties: any other
     implication that these Warranty Disclaimers may have is void and
     has no effect on the meaning of this License.

  2. VERBATIM COPYING

     You may copy and distribute the Document in any medium, either
     commercially or noncommercially, provided that this License, the
     copyright notices, and the license notice saying this License
     applies to the Document are reproduced in all copies, and that you
     add no other conditions whatsoever to those of this License. You
     may not use technical measures to obstruct or control the reading
     or further copying of the copies you make or distribute. However,
     you may accept compensation in exchange for copies. If you
     distribute a large enough number of copies you must also follow
     the conditions in section 3.

     You may also lend copies, under the same conditions stated above,
     and you may publicly display copies.

  3. COPYING IN QUANTITY

     If you publish printed copies (or copies in media that commonly
     have printed covers) of the Document, numbering more than 100, and
     the Document's license notice requires Cover Texts, you must
     enclose the copies in covers that carry, clearly and legibly, all
     these Cover Texts: Front-Cover Texts on the front cover, and
     Back-Cover Texts on the back cover. Both covers must also clearly
     and legibly identify you as the publisher of these copies. The
     front cover must present the full title with all words of the
     title equally prominent and visible. You may add other material
     on the covers in addition. Copying with changes limited to the
     covers, as long as they preserve the title of the Document and
     satisfy these conditions, can be treated as verbatim copying in
     other respects.

     If the required texts for either cover are too voluminous to fit
     legibly, you should put the first ones listed (as many as fit
     reasonably) on the actual cover, and continue the rest onto
     adjacent pages.

     If you publish or distribute Opaque copies of the Document
     numbering more than 100, you must either include a
     machine-readable Transparent copy along with each Opaque copy, or
     state in or with each Opaque copy a computer-network location from
     which the general network-using public has access to download
     using public-standard network protocols a complete Transparent
     copy of the Document, free of added material.
     If you use the latter option, you must take reasonably prudent
     steps, when you begin distribution of Opaque copies in quantity,
     to ensure that this Transparent copy will remain thus accessible
     at the stated location until at least one year after the last
     time you distribute an Opaque copy (directly or through your
     agents or retailers) of that edition to the public.

     It is requested, but not required, that you contact the authors of
     the Document well before redistributing any large number of
     copies, to give them a chance to provide you with an updated
     version of the Document.

  4. MODIFICATIONS

     You may copy and distribute a Modified Version of the Document
     under the conditions of sections 2 and 3 above, provided that you
     release the Modified Version under precisely this License, with
     the Modified Version filling the role of the Document, thus
     licensing distribution and modification of the Modified Version to
     whoever possesses a copy of it. In addition, you must do these
     things in the Modified Version:

       A. Use in the Title Page (and on the covers, if any) a title
          distinct from that of the Document, and from those of
          previous versions (which should, if there were any, be listed
          in the History section of the Document). You may use the
          same title as a previous version if the original publisher of
          that version gives permission.

       B. List on the Title Page, as authors, one or more persons or
          entities responsible for authorship of the modifications in
          the Modified Version, together with at least five of the
          principal authors of the Document (all of its principal
          authors, if it has fewer than five), unless they release you
          from this requirement.

       C. State on the Title page the name of the publisher of the
          Modified Version, as the publisher.

       D. Preserve all the copyright notices of the Document.

       E. Add an appropriate copyright notice for your modifications
          adjacent to the other copyright notices.

       F. Include, immediately after the copyright notices, a license
          notice giving the public permission to use the Modified
          Version under the terms of this License, in the form shown in
          the Addendum below.

       G. Preserve in that license notice the full lists of Invariant
          Sections and required Cover Texts given in the Document's
          license notice.

       H. Include an unaltered copy of this License.

       I. Preserve the section Entitled "History", Preserve its Title,
          and add to it an item stating at least the title, year, new
          authors, and publisher of the Modified Version as given on
          the Title Page. If there is no section Entitled "History" in
          the Document, create one stating the title, year, authors,
          and publisher of the Document as given on its Title Page,
          then add an item describing the Modified Version as stated in
          the previous sentence.

       J. Preserve the network location, if any, given in the Document
          for public access to a Transparent copy of the Document, and
          likewise the network locations given in the Document for
          previous versions it was based on. These may be placed in
          the "History" section. You may omit a network location for a
          work that was published at least four years before the
          Document itself, or if the original publisher of the version
          it refers to gives permission.

       K. For any section Entitled "Acknowledgements" or "Dedications",
          Preserve the Title of the section, and preserve in the
          section all the substance and tone of each of the contributor
          acknowledgements and/or dedications given therein.

       L. Preserve all the Invariant Sections of the Document,
          unaltered in their text and in their titles. Section numbers
          or the equivalent are not considered part of the section
          titles.

       M. Delete any section Entitled "Endorsements". Such a section
          may not be included in the Modified Version.

       N. Do not retitle any existing section to be Entitled
          "Endorsements" or to conflict in title with any Invariant
          Section.

       O. Preserve any Warranty Disclaimers.

     If the Modified Version includes new front-matter sections or
     appendices that qualify as Secondary Sections and contain no
     material copied from the Document, you may at your option
     designate some or all of these sections as invariant. To do this,
     add their titles to the list of Invariant Sections in the Modified
     Version's license notice. These titles must be distinct from any
     other section titles.

     You may add a section Entitled "Endorsements", provided it contains
     nothing but endorsements of your Modified Version by various
     parties--for example, statements of peer review or that the text
     has been approved by an organization as the authoritative
     definition of a standard.

     You may add a passage of up to five words as a Front-Cover Text,
     and a passage of up to 25 words as a Back-Cover Text, to the end
     of the list of Cover Texts in the Modified Version. Only one
     passage of Front-Cover Text and one of Back-Cover Text may be
     added by (or through arrangements made by) any one entity. If the
     Document already includes a cover text for the same cover,
     previously added by you or by arrangement made by the same entity
     you are acting on behalf of, you may not add another; but you may
     replace the old one, on explicit permission from the previous
     publisher that added the old one.

     The author(s) and publisher(s) of the Document do not by this
     License give permission to use their names for publicity for or to
     assert or imply endorsement of any Modified Version.

  5. COMBINING DOCUMENTS

     You may combine the Document with other documents released under
     this License, under the terms defined in section 4 above for
     modified versions, provided that you include in the combination
     all of the Invariant Sections of all of the original documents,
     unmodified, and list them all as Invariant Sections of your
     combined work in its license notice, and that you preserve all
     their Warranty Disclaimers.

     The combined work need only contain one copy of this License, and
     multiple identical Invariant Sections may be replaced with a single
     copy. If there are multiple Invariant Sections with the same name
     but different contents, make the title of each such section unique
     by adding at the end of it, in parentheses, the name of the
     original author or publisher of that section if known, or else a
     unique number. Make the same adjustment to the section titles in
     the list of Invariant Sections in the license notice of the
     combined work.

     In the combination, you must combine any sections Entitled
     "History" in the various original documents, forming one section
     Entitled "History"; likewise combine any sections Entitled
     "Acknowledgements", and any sections Entitled "Dedications". You
     must delete all sections Entitled "Endorsements."

  6. COLLECTIONS OF DOCUMENTS

     You may make a collection consisting of the Document and other
     documents released under this License, and replace the individual
     copies of this License in the various documents with a single copy
     that is included in the collection, provided that you follow the
     rules of this License for verbatim copying of each of the
     documents in all other respects.

     You may extract a single document from such a collection, and
     distribute it individually under this License, provided you insert
     a copy of this License into the extracted document, and follow
     this License in all other respects regarding verbatim copying of
     that document.

  7. AGGREGATION WITH INDEPENDENT WORKS

     A compilation of the Document or its derivatives with other
     separate and independent documents or works, in or on a volume of
     a storage or distribution medium, is called an "aggregate" if the
     copyright resulting from the compilation is not used to limit the
     legal rights of the compilation's users beyond what the individual
     works permit. When the Document is included in an aggregate, this
     License does not apply to the other works in the aggregate which
     are not themselves derivative works of the Document.

     If the Cover Text requirement of section 3 is applicable to these
     copies of the Document, then if the Document is less than one half
     of the entire aggregate, the Document's Cover Texts may be placed
     on covers that bracket the Document within the aggregate, or the
     electronic equivalent of covers if the Document is in electronic
     form. Otherwise they must appear on printed covers that bracket
     the whole aggregate.

  8. TRANSLATION

     Translation is considered a kind of modification, so you may
     distribute translations of the Document under the terms of
     section 4.
     Replacing Invariant Sections with translations requires special
     permission from their copyright holders, but you may include
     translations of some or all Invariant Sections in addition to the
     original versions of these Invariant Sections. You may include a
     translation of this License, and all the license notices in the
     Document, and any Warranty Disclaimers, provided that you also
     include the original English version of this License and the
     original versions of those notices and disclaimers. In case of a
     disagreement between the translation and the original version of
     this License or a notice or disclaimer, the original version will
     prevail.

     If a section in the Document is Entitled "Acknowledgements",
     "Dedications", or "History", the requirement (section 4) to
     Preserve its Title (section 1) will typically require changing the
     actual title.

  9. TERMINATION

     You may not copy, modify, sublicense, or distribute the Document
     except as expressly provided under this License. Any attempt
     otherwise to copy, modify, sublicense, or distribute it is void,
     and will automatically terminate your rights under this License.

     However, if you cease all violation of this License, then your
     license from a particular copyright holder is reinstated (a)
     provisionally, unless and until the copyright holder explicitly
     and finally terminates your license, and (b) permanently, if the
     copyright holder fails to notify you of the violation by some
     reasonable means prior to 60 days after the cessation.

     Moreover, your license from a particular copyright holder is
     reinstated permanently if the copyright holder notifies you of the
     violation by some reasonable means, this is the first time you have
     received notice of violation of this License (for any work) from
     that copyright holder, and you cure the violation prior to 30 days
     after your receipt of the notice.

     Termination of your rights under this section does not terminate
     the licenses of parties who have received copies or rights from
     you under this License. If your rights have been terminated and
     not permanently reinstated, receipt of a copy of some or all of
     the same material does not give you any rights to use it.

 10. FUTURE REVISIONS OF THIS LICENSE

     The Free Software Foundation may publish new, revised versions of
     the GNU Free Documentation License from time to time. Such new
     versions will be similar in spirit to the present version, but may
     differ in detail to address new problems or concerns. See
     `http://www.gnu.org/copyleft/'.

     Each version of the License is given a distinguishing version
     number. If the Document specifies that a particular numbered
     version of this License "or any later version" applies to it, you
     have the option of following the terms and conditions either of
     that specified version or of any later version that has been
     published (not as a draft) by the Free Software Foundation. If
     the Document does not specify a version number of this License,
     you may choose any version ever published (not as a draft) by the
     Free Software Foundation. If the Document specifies that a proxy
     can decide which future versions of this License can be used, that
     proxy's public statement of acceptance of a version permanently
     authorizes you to choose that version for the Document.

 11. RELICENSING

     "Massive Multiauthor Collaboration Site" (or "MMC Site") means any
     World Wide Web server that publishes copyrightable works and also
     provides prominent facilities for anybody to edit those works. A
     public wiki that anybody can edit is an example of such a server.
     A "Massive Multiauthor Collaboration" (or "MMC") contained in the
     site means any set of copyrightable works thus published on the MMC
     site.

     "CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0
     license published by Creative Commons Corporation, a not-for-profit
     corporation with a principal place of business in San Francisco,
     California, as well as future copyleft versions of that license
     published by that same organization.

     "Incorporate" means to publish or republish a Document, in whole or
     in part, as part of another Document.

     An MMC is "eligible for relicensing" if it is licensed under this
     License, and if all works that were first published under this
     License somewhere other than this MMC, and subsequently
     incorporated in whole or in part into the MMC, (1) had no cover
     texts or invariant sections, and (2) were thus incorporated prior
     to November 1, 2008.

     The operator of an MMC Site may republish an MMC contained in the
     site under CC-BY-SA on the same site at any time before August 1,
     2009, provided the MMC is eligible for relicensing.


ADDENDUM: How to use this License for your documents
====================================================

To use this License in a document you have written, include a copy of
the License in the document and put the following copyright and license
notices just after the title page:

       Copyright (C) YEAR YOUR NAME.
       Permission is granted to copy, distribute and/or modify this document
       under the terms of the GNU Free Documentation License, Version 1.3
       or any later version published by the Free Software Foundation;
       with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
       Texts. A copy of the license is included in the section entitled ``GNU
       Free Documentation License''.

   If you have Invariant Sections, Front-Cover Texts and Back-Cover
Texts, replace the "with...Texts." line with this:

         with the Invariant Sections being LIST THEIR TITLES, with
         the Front-Cover Texts being LIST, and with the Back-Cover Texts
         being LIST.

   If you have Invariant Sections without Cover Texts, or some other
combination of the three, merge those two alternatives to suit the
situation.

   If your document contains nontrivial examples of program code, we
recommend releasing these examples in parallel under your choice of
free software license, such as the GNU General Public License, to
permit their use in free software.


File: bison.info, Node: Bibliography, Next: Index of Terms, Prev: Copying This Manual, Up: Top

Bibliography
************

[Denny 2008]
     Joel E. Denny and Brian A. Malloy, IELR(1): Practical LR(1) Parser
     Tables for Non-LR(1) Grammars with Conflict Resolution, in
     `Proceedings of the 2008 ACM Symposium on Applied Computing'
     (SAC'08), ACM, New York, NY, USA, pp. 240-245.
     `http://dx.doi.org/10.1145/1363686.1363747'

[Denny 2010 May]
     Joel E. Denny, PSLR(1): Pseudo-Scannerless Minimal LR(1) for the
     Deterministic Parsing of Composite Languages, Ph.D. Dissertation,
     Clemson University, Clemson, SC, USA (May 2010).
     `http://proquest.umi.com/pqdlink?did=2041473591&Fmt=7&clientId=79356&RQT=309&VName=PQD'

[Denny 2010 November]
     Joel E. Denny and Brian A.
     Malloy, The IELR(1) Algorithm for
     Generating Minimal LR(1) Parser Tables for Non-LR(1) Grammars with
     Conflict Resolution, in `Science of Computer Programming', Vol.
     75, Issue 11 (November 2010), pp. 943-979.
     `http://dx.doi.org/10.1016/j.scico.2009.08.001'

[DeRemer 1982]
     Frank DeRemer and Thomas Pennello, Efficient Computation of LALR(1)
     Look-Ahead Sets, in `ACM Transactions on Programming Languages and
     Systems', Vol. 4, No. 4 (October 1982), pp. 615-649.
     `http://dx.doi.org/10.1145/69622.357187'

[Knuth 1965]
     Donald E. Knuth, On the Translation of Languages from Left to
     Right, in `Information and Control', Vol. 8, Issue 6 (December
     1965), pp. 607-639.
     `http://dx.doi.org/10.1016/S0019-9958(65)90426-2'

[Scott 2000]
     Elizabeth Scott, Adrian Johnstone, and Shamsa Sadaf Hussain,
     `Tomita-Style Generalised LR Parsers', Royal Holloway, University
     of London, Department of Computer Science, TR-00-12 (December
     2000).
     `http://www.cs.rhul.ac.uk/research/languages/publications/tomita_style_1.ps'


File: bison.info, Node: Index of Terms, Prev: Bibliography, Up: Top

Index of Terms
**************

[index]
* Menu:

* $ <1>: Table of Symbols.  (line 25)
* $ <2>: Java Action Features.  (line 13)
* $ <3>: Table of Symbols.  (line 39)
* $ <4>: Action Features.  (line 14)
* $: Table of Symbols.  (line 34)
* $$ <1>: Java Action Features.  (line 21)
* $$ <2>: Action Features.  (line 10)
* $$ <3>: Table of Symbols.  (line 30)
* $$: Actions.  (line 6)
* $< <1>: Java Action Features.  (line 17)
* $< <2>: Action Features.  (line 18)
* $<: Java Action Features.  (line 29)
* $@N: Mid-Rule Action Translation.  (line 6)
* $[NAME]: Actions.  (line 6)
* $accept: Table of Symbols.  (line 86)
* $end: Table of Symbols.
        (line 124)
* $N: Actions. (line 6)
* $NAME: Actions. (line 6)
* $undefined: Table of Symbols. (line 245)
* % <1>: Java Declarations Summary.
        (line 53)
* %: Table of Symbols. (line 48)
* %% <1>: Table of Symbols. (line 43)
* %%: Java Declarations Summary.
        (line 49)
* %code <1>: Table of Symbols. (line 92)
* %code <2>: %code Summary. (line 6)
* %code <3>: Calc++ Parser. (line 65)
* %code <4>: Prologue Alternatives.
        (line 6)
* %code <5>: Decl Summary. (line 46)
* %code <6>: Java Declarations Summary.
        (line 37)
* %code <7>: Table of Symbols. (line 91)
* %code: Decl Summary. (line 47)
* %code imports <1>: %code Summary. (line 83)
* %code imports: Java Declarations Summary.
        (line 41)
* %code lexer: Java Declarations Summary.
        (line 45)
* %code provides <1>: %code Summary. (line 55)
* %code provides <2>: Prologue Alternatives.
        (line 6)
* %code provides: Decl Summary. (line 95)
* %code requires <1>: Decl Summary. (line 95)
* %code requires <2>: %code Summary. (line 41)
* %code requires <3>: Calc++ Parser. (line 17)
* %code requires: Prologue Alternatives.
        (line 6)
* %code top <1>: Prologue Alternatives.
        (line 6)
* %code top: %code Summary. (line 67)
* %debug <1>: Decl Summary. (line 52)
* %debug <2>: Table of Symbols. (line 97)
* %debug: Enabling Traces. (line 28)
* %define <1>: %define Summary. (line 16)
* %define <2>: Decl Summary. (line 59)
* %define <3>: Table of Symbols. (line 100)
* %define <4>: Decl Summary. (line 60)
* %define: %define Summary. (line 14)
* %define abstract: Java Declarations Summary.
        (line 57)
* %define api.location.type <1>: %define Summary. (line 49)
* %define api.location.type <2>: Java Declarations Summary.
        (line 78)
* %define api.location.type: User Defined Location Type.
        (line 6)
* %define api.position.type: Java Declarations Summary.
        (line 92)
* %define api.prefix: %define Summary. (line 62)
* %define api.pure <1>: Pure Decl. (line 6)
* %define api.pure: %define Summary. (line 75)
* %define api.push-pull <1>: Push Decl. (line 6)
* %define api.push-pull: %define Summary. (line 115)
* %define extends: Java Declarations Summary.
        (line 61)
* %define final: Java Declarations Summary.
        (line 65)
* %define implements: Java Declarations Summary.
        (line 69)
* %define lex_throws: Java Declarations Summary.
        (line 73)
* %define lr.default-reductions <1>: %define Summary. (line 128)
* %define lr.default-reductions: Default Reductions. (line 6)
* %define lr.keep-unreachable-states <1>: Unreachable States. (line 6)
* %define lr.keep-unreachable-states <2>: %define Summary. (line 145)
* %define lr.keep-unreachable-states: Unreachable States. (line 17)
* %define lr.type <1>: LR Table Construction.
        (line 6)
* %define lr.type <2>: %define Summary. (line 156)
* %define lr.type: LR Table Construction.
        (line 24)
* %define namespace <1>: %define Summary. (line 168)
* %define namespace: C++ Bison Interface. (line 10)
* %define package: Java Declarations Summary.
        (line 84)
* %define parse.lac <1>: %define Summary. (line 208)
* %define parse.lac: LAC. (line 29)
* %define parser_class_name: Java Declarations Summary.
        (line 88)
* %define public: Java Declarations Summary.
        (line 97)
* %define strictfp: Java Declarations Summary.
        (line 105)
* %define stype: Java Declarations Summary.
        (line 101)
* %define throws: Java Declarations Summary.
        (line 109)
* %defines <1>: Decl Summary. (line 113)
* %defines: Table of Symbols. (line 110)
* %destructor <1>: Table of Symbols. (line 114)
* %destructor <2>: Destructor Decl.
        (line 22)
* %destructor <3>: Using Mid-Rule Actions.
        (line 76)
* %destructor <4>: Destructor Decl. (line 6)
* %destructor <5>: Decl Summary. (line 116)
* %destructor: Destructor Decl. (line 22)
* %dprec <1>: Table of Symbols. (line 119)
* %dprec: Merging GLR Parses. (line 6)
* %error-verbose <1>: Table of Symbols. (line 138)
* %error-verbose: Error Reporting. (line 17)
* %expect <1>: Expect Decl. (line 6)
* %expect: Decl Summary. (line 38)
* %expect-rr <1>: Simple GLR Parsers. (line 6)
* %expect-rr: Expect Decl. (line 6)
* %file-prefix <1>: Decl Summary. (line 121)
* %file-prefix: Table of Symbols. (line 142)
* %glr-parser <1>: Simple GLR Parsers. (line 6)
* %glr-parser <2>: GLR Parsers. (line 6)
* %glr-parser: Table of Symbols. (line 146)
* %initial-action <1>: Initial Action Decl. (line 11)
* %initial-action <2>: Table of Symbols. (line 150)
* %initial-action: Initial Action Decl. (line 6)
* %language <1>: Decl Summary. (line 125)
* %language: Table of Symbols. (line 154)
* %language "Java": Java Declarations Summary.
        (line 10)
* %left <1>: Table of Symbols. (line 158)
* %left <2>: Decl Summary. (line 21)
* %left: Using Precedence. (line 6)
* %lex-param <1>: Pure Calling. (line 31)
* %lex-param <2>: Table of Symbols. (line 162)
* %lex-param: Java Declarations Summary.
        (line 13)
* %locations: Decl Summary. (line 131)
* %merge <1>: Merging GLR Parses. (line 6)
* %merge: Table of Symbols. (line 167)
* %name-prefix <1>: Table of Symbols. (line 174)
* %name-prefix: Java Declarations Summary.
        (line 19)
* %no-lines <1>: Decl Summary. (line 138)
* %no-lines: Table of Symbols. (line 191)
* %nonassoc <1>: Decl Summary. (line 25)
* %nonassoc <2>: Table of Symbols. (line 195)
* %nonassoc <3>: LR Table Construction.
        (line 103)
* %nonassoc <4>: Using Precedence. (line 6)
* %nonassoc: Default Reductions. (line 6)
* %output <1>: Decl Summary. (line 147)
* %output: Table of Symbols. (line 199)
* %parse-param <1>: Table of Symbols. (line 203)
* %parse-param <2>: Parser Function. (line 36)
* %parse-param <3>: Java Declarations Summary.
        (line 24)
* %parse-param: Parser Function. (line 36)
* %prec <1>: Contextual Precedence.
        (line 6)
* %prec: Table of Symbols. (line 208)
* %printer: Printer Decl. (line 6)
* %pure-parser <1>: Table of Symbols. (line 212)
* %pure-parser: Decl Summary. (line 150)
* %require <1>: Decl Summary. (line 155)
* %require <2>: Table of Symbols. (line 217)
* %require: Require Decl. (line 6)
* %right <1>: Decl Summary. (line 17)
* %right <2>: Using Precedence. (line 6)
* %right: Table of Symbols. (line 221)
* %skeleton <1>: Decl Summary. (line 159)
* %skeleton: Table of Symbols. (line 225)
* %start <1>: Start Decl. (line 6)
* %start <2>: Table of Symbols. (line 229)
* %start: Decl Summary. (line 34)
* %token <1>: Table of Symbols. (line 233)
* %token <2>: Java Declarations Summary.
        (line 29)
* %token <3>: Token Decl. (line 6)
* %token: Decl Summary. (line 13)
* %token-table <1>: Decl Summary. (line 167)
* %token-table: Table of Symbols. (line 237)
* %type <1>: Type Decl. (line 6)
* %type <2>: Table of Symbols. (line 241)
* %type <3>: Java Declarations Summary.
        (line 33)
* %type: Decl Summary. (line 30)
* %union <1>: Table of Symbols. (line 250)
* %union <2>: Decl Summary. (line 9)
* %union: Union Decl. (line 6)
* %verbose: Decl Summary. (line 200)
* %yacc: Decl Summary. (line 206)
* /*: Table of Symbols. (line 53)
* /* ... */: Grammar Outline. (line 6)
* //: Table of Symbols. (line 54)
* // ...: Grammar Outline.
        (line 6)
* :: Table of Symbols. (line 57)
* ;: Table of Symbols. (line 61)
* <*> <1>: Printer Decl. (line 6)
* <*> <2>: Destructor Decl. (line 6)
* <*>: Table of Symbols. (line 68)
* <> <1>: Table of Symbols. (line 77)
* <> <2>: Printer Decl. (line 6)
* <>: Destructor Decl. (line 6)
* @$ <1>: Action Features. (line 98)
* @$ <2>: Table of Symbols. (line 7)
* @$ <3>: Actions and Locations.
        (line 6)
* @$: Java Action Features.
        (line 39)
* @[: Table of Symbols. (line 21)
* @[NAME]: Actions and Locations.
        (line 6)
* @N <1>: Action Features. (line 104)
* @N <2>: Table of Symbols. (line 12)
* @N <3>: Action Features. (line 104)
* @N <4>: Actions and Locations.
        (line 6)
* @N <5>: Mid-Rule Action Translation.
        (line 6)
* @N: Java Action Features.
        (line 35)
* @NAME <1>: Actions and Locations.
        (line 6)
* @NAME: Table of Symbols. (line 20)
* abstract syntax tree: Implementing Gotos/Loops.
        (line 17)
* accepting state: Understanding. (line 177)
* action: Actions. (line 6)
* action data types: Action Types. (line 6)
* action features summary: Action Features. (line 6)
* actions in mid-rule <1>: Mid-Rule Actions. (line 6)
* actions in mid-rule: Destructor Decl. (line 88)
* actions, location: Actions and Locations.
        (line 6)
* actions, semantic: Semantic Actions. (line 6)
* additional C code section: Epilogue. (line 6)
* algorithm of parser: Algorithm. (line 6)
* ambiguous grammars <1>: Generalized LR Parsing.
        (line 6)
* ambiguous grammars: Language and Grammar.
        (line 34)
* associativity: Why Precedence. (line 34)
* AST: Implementing Gotos/Loops.
        (line 17)
* Backus-Naur form: Language and Grammar.
        (line 16)
* begin of Location: Java Location Values.
        (line 21)
* begin of location: C++ location.
        (line 22)
* Bison declaration summary: Decl Summary. (line 6)
* Bison declarations: Declarations. (line 6)
* Bison declarations (introduction): Bison Declarations. (line 6)
* Bison grammar: Grammar in Bison. (line 6)
* Bison invocation: Invocation. (line 6)
* Bison parser: Bison Parser. (line 6)
* Bison parser algorithm: Algorithm. (line 6)
* Bison symbols, table of: Table of Symbols. (line 6)
* Bison utility: Bison Parser. (line 6)
* bison-i18n.m4: Internationalization.
        (line 20)
* bison-po: Internationalization.
        (line 6)
* BISON_I18N: Internationalization.
        (line 27)
* BISON_LOCALEDIR: Internationalization.
        (line 27)
* BNF: Language and Grammar.
        (line 16)
* braced code: Rules. (line 29)
* C code, section for additional: Epilogue. (line 6)
* C-language interface: Interface. (line 6)
* calc: Infix Calc. (line 6)
* calculator, infix notation: Infix Calc. (line 6)
* calculator, location tracking: Location Tracking Calc.
        (line 6)
* calculator, multi-function: Multi-function Calc. (line 6)
* calculator, simple: RPN Calc. (line 6)
* canonical LR <1>: Mysterious Conflicts.
        (line 43)
* canonical LR: LR Table Construction.
        (line 6)
* character token: Symbols. (line 37)
* column of position: C++ position. (line 29)
* columns on location: C++ location. (line 26)
* columns on position: C++ position. (line 32)
* comment: Grammar Outline. (line 6)
* compiling the parser: Rpcalc Compile. (line 6)
* conflicts <1>: Merging GLR Parses. (line 6)
* conflicts <2>: Shift/Reduce. (line 6)
* conflicts <3>: GLR Parsers. (line 6)
* conflicts: Simple GLR Parsers. (line 6)
* conflicts, reduce/reduce: Reduce/Reduce. (line 6)
* conflicts, suppressing warnings of: Expect Decl. (line 6)
* consistent states: Default Reductions.
        (line 17)
* context-dependent precedence: Contextual Precedence.
        (line 6)
* context-free grammar: Language and Grammar.
        (line 6)
* controlling function: Rpcalc Main. (line 6)
* core, item set: Understanding. (line 124)
* dangling else: Shift/Reduce. (line 6)
* data type of locations: Location Type. (line 6)
* data types in actions: Action Types. (line 6)
* data types of semantic values: Value Type. (line 6)
* debug_level on parser: C++ Parser Interface.
        (line 42)
* debug_stream on parser: C++ Parser Interface.
        (line 37)
* debugging: Tracing. (line 6)
* declaration summary: Decl Summary. (line 6)
* declarations: Prologue. (line 6)
* declarations section: Prologue. (line 6)
* declarations, Bison: Declarations. (line 6)
* declarations, Bison (introduction): Bison Declarations. (line 6)
* declaring literal string tokens: Token Decl. (line 6)
* declaring operator precedence: Precedence Decl. (line 6)
* declaring the start symbol: Start Decl. (line 6)
* declaring token type names: Token Decl. (line 6)
* declaring value types: Union Decl. (line 6)
* declaring value types, nonterminals: Type Decl. (line 6)
* default action: Actions. (line 62)
* default data type: Value Type. (line 6)
* default location type: Location Type. (line 6)
* default reductions: Default Reductions. (line 6)
* default stack limit: Memory Management. (line 30)
* default start symbol: Start Decl. (line 6)
* defaulted states: Default Reductions. (line 17)
* deferred semantic actions: GLR Semantic Actions.
        (line 6)
* defining language semantics: Semantics. (line 6)
* delayed syntax error detection <1>: Default Reductions. (line 43)
* delayed syntax error detection: LR Table Construction.
        (line 103)
* delayed yylex invocations: Default Reductions. (line 17)
* discarded symbols: Destructor Decl.
        (line 98)
* discarded symbols, mid-rule actions: Using Mid-Rule Actions.
        (line 76)
* dot: Graphviz. (line 6)
* else, dangling: Shift/Reduce. (line 6)
* end of Location: Java Location Values.
        (line 22)
* end of location: C++ location. (line 23)
* epilogue: Epilogue. (line 6)
* error <1>: Table of Symbols. (line 128)
* error: Error Recovery. (line 20)
* error on parser: C++ Parser Interface.
        (line 48)
* error recovery: Error Recovery. (line 6)
* error recovery, mid-rule actions: Using Mid-Rule Actions.
        (line 76)
* error recovery, simple: Simple Error Recovery.
        (line 6)
* error reporting function: Error Reporting. (line 6)
* error reporting routine: Rpcalc Error. (line 6)
* examples, simple: Examples. (line 6)
* exceptions: C++ Parser Interface.
        (line 32)
* exercises: Exercises. (line 6)
* file format: Grammar Layout. (line 6)
* file of position: C++ position. (line 17)
* finite-state machine: Parser States. (line 6)
* formal grammar: Grammar in Bison. (line 6)
* format of grammar file: Grammar Layout. (line 6)
* freeing discarded symbols: Destructor Decl. (line 6)
* frequently asked questions: FAQ. (line 6)
* generalized LR (GLR) parsing <1>: Language and Grammar.
        (line 34)
* generalized LR (GLR) parsing <2>: Generalized LR Parsing.
        (line 6)
* generalized LR (GLR) parsing: GLR Parsers. (line 6)
* generalized LR (GLR) parsing, ambiguous grammars: Merging GLR Parses.
        (line 6)
* generalized LR (GLR) parsing, unambiguous grammars: Simple GLR Parsers.
        (line 6)
* getDebugLevel on YYParser: Java Parser Interface.
        (line 67)
* getDebugStream on YYParser: Java Parser Interface.
        (line 62)
* getEndPos on Lexer: Java Scanner Interface.
        (line 40)
* getLVal on Lexer: Java Scanner Interface.
        (line 48)
* getStartPos on Lexer: Java Scanner Interface.
        (line 39)
* gettext: Internationalization.
        (line 6)
* glossary: Glossary. (line 6)
* GLR parsers and inline: Compiler Requirements.
        (line 6)
* GLR parsers and yychar: GLR Semantic Actions.
        (line 10)
* GLR parsers and yyclearin: GLR Semantic Actions.
        (line 18)
* GLR parsers and YYERROR: GLR Semantic Actions.
        (line 28)
* GLR parsers and yylloc: GLR Semantic Actions.
        (line 10)
* GLR parsers and YYLLOC_DEFAULT: Location Default Action.
        (line 6)
* GLR parsers and yylval: GLR Semantic Actions.
        (line 10)
* GLR parsing <1>: Generalized LR Parsing.
        (line 6)
* GLR parsing <2>: GLR Parsers. (line 6)
* GLR parsing: Language and Grammar.
        (line 34)
* GLR parsing, ambiguous grammars: Merging GLR Parses. (line 6)
* GLR parsing, unambiguous grammars: Simple GLR Parsers. (line 6)
* GLR with LALR: LR Table Construction.
        (line 65)
* grammar file: Grammar Layout. (line 6)
* grammar rule syntax: Rules. (line 6)
* grammar rules section: Grammar Rules. (line 6)
* grammar, Bison: Grammar in Bison. (line 6)
* grammar, context-free: Language and Grammar.
        (line 6)
* grouping, syntactic: Language and Grammar.
        (line 48)
* Header guard: Decl Summary. (line 98)
* i18n: Internationalization.
        (line 6)
* IELR <1>: LR Table Construction.
        (line 6)
* IELR: Mysterious Conflicts.
        (line 43)
* IELR grammars: Language and Grammar.
        (line 22)
* infix notation calculator: Infix Calc. (line 6)
* initialize on location: C++ location. (line 19)
* initialize on position: C++ position. (line 14)
* inline: Compiler Requirements.
        (line 6)
* interface: Interface. (line 6)
* internationalization: Internationalization.
        (line 6)
* introduction: Introduction.
        (line 6)
* invoking Bison: Invocation. (line 6)
* item: Understanding. (line 102)
* item set core: Understanding. (line 124)
* kernel, item set: Understanding. (line 124)
* LAC <1>: LR Table Construction.
        (line 103)
* LAC <2>: LAC. (line 6)
* LAC: Default Reductions. (line 54)
* LALR <1>: LR Table Construction.
        (line 6)
* LALR: Mysterious Conflicts.
        (line 31)
* LALR grammars: Language and Grammar.
        (line 22)
* language semantics, defining: Semantics. (line 6)
* layout of Bison grammar: Grammar Layout. (line 6)
* left recursion: Recursion. (line 17)
* lex-param: Pure Calling. (line 31)
* lexical analyzer: Lexical. (line 6)
* lexical analyzer, purpose: Bison Parser. (line 6)
* lexical analyzer, writing: Rpcalc Lexer. (line 6)
* lexical tie-in: Lexical Tie-ins. (line 6)
* line of position: C++ position. (line 23)
* lines on location: C++ location. (line 27)
* lines on position: C++ position. (line 26)
* literal string token: Symbols. (line 59)
* literal token: Symbols. (line 37)
* location <1>: Locations. (line 6)
* location: Tracking Locations. (line 6)
* location actions: Actions and Locations.
        (line 6)
* location on location: C++ location. (line 8)
* Location on Location: Java Location Values.
        (line 25)
* location on location: C++ location. (line 12)
* location tracking calculator: Location Tracking Calc.
        (line 6)
* location, textual <1>: Tracking Locations. (line 6)
* location, textual: Locations. (line 6)
* location_type: C++ Parser Interface.
        (line 16)
* lookahead correction: LAC. (line 6)
* lookahead token: Lookahead. (line 6)
* LR: Mysterious Conflicts.
        (line 31)
* LR grammars: Language and Grammar.
        (line 22)
* ltcalc: Location Tracking Calc.
        (line 6)
* main function in simple example: Rpcalc Main.
        (line 6)
* memory exhaustion: Memory Management. (line 6)
* memory management: Memory Management. (line 6)
* mfcalc: Multi-function Calc. (line 6)
* mid-rule actions <1>: Destructor Decl. (line 88)
* mid-rule actions: Mid-Rule Actions. (line 6)
* multi-function calculator: Multi-function Calc. (line 6)
* multicharacter literal: Symbols. (line 59)
* mutual recursion: Recursion. (line 34)
* Mysterious Conflict: LR Table Construction.
        (line 6)
* Mysterious Conflicts: Mysterious Conflicts.
        (line 6)
* named references: Named References. (line 6)
* NLS: Internationalization.
        (line 6)
* nondeterministic parsing <1>: Language and Grammar.
        (line 34)
* nondeterministic parsing: Generalized LR Parsing.
        (line 6)
* nonterminal symbol: Symbols. (line 6)
* nonterminal, useless: Understanding. (line 48)
* operator precedence: Precedence. (line 6)
* operator precedence, declaring: Precedence Decl. (line 6)
* operator!= on location: C++ location. (line 39)
* operator!= on position: C++ position. (line 42)
* operator+ on location: C++ location. (line 31)
* operator+ on position: C++ position. (line 36)
* operator+= on location: C++ location. (line 32)
* operator+= on position: C++ position. (line 35)
* operator- on position: C++ position. (line 38)
* operator-= on position: C++ position. (line 37)
* operator<< <1>: C++ position. (line 46)
* operator<<: C++ location. (line 44)
* operator== on location: C++ location. (line 38)
* operator== on position: C++ position. (line 41)
* options for invoking Bison: Invocation. (line 6)
* overflow of parser stack: Memory Management. (line 6)
* parse error: Error Reporting. (line 6)
* parse on parser: C++ Parser Interface.
        (line 30)
* parse on YYParser: Java Parser Interface.
        (line 54)
* parser: Bison Parser.
        (line 6)
* parser on parser: C++ Parser Interface.
        (line 26)
* parser stack: Algorithm. (line 6)
* parser stack overflow: Memory Management. (line 6)
* parser state: Parser States. (line 6)
* pointed rule: Understanding. (line 102)
* polish notation calculator: RPN Calc. (line 6)
* position on position: C++ position. (line 8)
* precedence declarations: Precedence Decl. (line 6)
* precedence of operators: Precedence. (line 6)
* precedence, context-dependent: Contextual Precedence.
        (line 6)
* precedence, unary operator: Contextual Precedence.
        (line 6)
* preventing warnings about conflicts: Expect Decl. (line 6)
* printing semantic values: Printer Decl. (line 6)
* Prologue <1>: %code Summary. (line 6)
* Prologue: Prologue. (line 6)
* Prologue Alternatives: Prologue Alternatives.
        (line 6)
* pure parser: Pure Decl. (line 6)
* push parser: Push Decl. (line 6)
* questions: FAQ. (line 6)
* recovering: Java Action Features.
        (line 55)
* recovering on YYParser: Java Parser Interface.
        (line 58)
* recovery from errors: Error Recovery. (line 6)
* recursive rule: Recursion. (line 6)
* reduce/reduce conflict: Reduce/Reduce. (line 6)
* reduce/reduce conflicts <1>: Simple GLR Parsers. (line 6)
* reduce/reduce conflicts <2>: GLR Parsers. (line 6)
* reduce/reduce conflicts: Merging GLR Parses. (line 6)
* reduction: Algorithm. (line 6)
* reentrant parser: Pure Decl. (line 6)
* requiring a version of Bison: Require Decl. (line 6)
* reverse polish notation: RPN Calc. (line 6)
* right recursion: Recursion. (line 17)
* rpcalc: RPN Calc. (line 6)
* rule syntax: Rules. (line 6)
* rule, pointed: Understanding. (line 102)
* rule, useless: Understanding. (line 48)
* rules section for grammar: Grammar Rules. (line 6)
* running Bison (introduction): Rpcalc Generate.
        (line 6)
* semantic actions: Semantic Actions. (line 6)
* semantic value: Semantic Values. (line 6)
* semantic value type: Value Type. (line 6)
* semantic_type: C++ Parser Interface.
        (line 15)
* set_debug_level on parser: C++ Parser Interface.
        (line 43)
* set_debug_stream on parser: C++ Parser Interface.
        (line 38)
* setDebugLevel on YYParser: Java Parser Interface.
        (line 68)
* setDebugStream on YYParser: Java Parser Interface.
        (line 63)
* shift/reduce conflicts <1>: Simple GLR Parsers. (line 6)
* shift/reduce conflicts <2>: GLR Parsers. (line 6)
* shift/reduce conflicts: Shift/Reduce. (line 6)
* shifting: Algorithm. (line 6)
* simple examples: Examples. (line 6)
* single-character literal: Symbols. (line 37)
* stack overflow: Memory Management. (line 6)
* stack, parser: Algorithm. (line 6)
* stages in using Bison: Stages. (line 6)
* start symbol: Language and Grammar.
        (line 97)
* start symbol, declaring: Start Decl. (line 6)
* state (of parser): Parser States. (line 6)
* step on location: C++ location. (line 35)
* string token: Symbols. (line 59)
* summary, action features: Action Features. (line 6)
* summary, Bison declaration: Decl Summary. (line 6)
* suppressing conflict warnings: Expect Decl. (line 6)
* symbol: Symbols. (line 6)
* symbol table example: Mfcalc Symbol Table. (line 6)
* symbols (abstract): Language and Grammar.
        (line 48)
* symbols in Bison, table of: Table of Symbols. (line 6)
* syntactic grouping: Language and Grammar.
        (line 48)
* syntax error: Error Reporting. (line 6)
* syntax of grammar rules: Rules. (line 6)
* terminal symbol: Symbols. (line 6)
* textual location <1>: Tracking Locations. (line 6)
* textual location: Locations. (line 6)
* token <1>: Language and Grammar.
        (line 48)
* token: C++ Parser Interface.
        (line 19)
* token type: Symbols. (line 6)
* token type names, declaring: Token Decl. (line 6)
* token, useless: Understanding. (line 48)
* toString on Location: Java Location Values.
        (line 32)
* tracing the parser: Tracing. (line 6)
* uint: C++ Location Values. (line 15)
* unary operator precedence: Contextual Precedence.
        (line 6)
* unreachable states: Unreachable States. (line 6)
* useless nonterminal: Understanding. (line 48)
* useless rule: Understanding. (line 48)
* useless token: Understanding. (line 48)
* using Bison: Stages. (line 6)
* value type, semantic: Value Type. (line 6)
* value types, declaring: Union Decl. (line 6)
* value types, nonterminals, declaring: Type Decl. (line 6)
* value, semantic: Semantic Values. (line 6)
* version requirement: Require Decl. (line 6)
* warnings, preventing: Expect Decl. (line 6)
* writing a lexical analyzer: Rpcalc Lexer. (line 6)
* xml: Xml. (line 6)
* YYABORT <1>: Parser Function. (line 29)
* YYABORT <2>: Table of Symbols. (line 254)
* YYABORT <3>: Action Features. (line 28)
* YYABORT: Java Action Features.
        (line 43)
* YYACCEPT <1>: Table of Symbols. (line 263)
* YYACCEPT <2>: Parser Function. (line 26)
* YYACCEPT <3>: Java Action Features.
        (line 47)
* YYACCEPT <4>: Parser Function. (line 26)
* YYACCEPT: Action Features. (line 32)
* YYBACKUP <1>: Table of Symbols. (line 271)
* YYBACKUP: Action Features. (line 36)
* yychar <1>: Action Features. (line 69)
* yychar <2>: GLR Semantic Actions.
        (line 10)
* yychar <3>: Lookahead. (line 49)
* yychar: Table of Symbols. (line 275)
* yyclearin <1>: GLR Semantic Actions.
        (line 18)
* yyclearin <2>: Error Recovery. (line 99)
* yyclearin <3>: Action Features. (line 76)
* yyclearin: Table of Symbols.
        (line 281)
* YYDEBUG: Enabling Traces. (line 9)
* yydebug: Tracing. (line 6)
* YYDEBUG: Table of Symbols. (line 285)
* yydebug: Table of Symbols. (line 289)
* YYEMPTY: Action Features. (line 49)
* YYENABLE_NLS: Internationalization.
        (line 27)
* YYEOF: Action Features. (line 52)
* yyerrok <1>: Action Features. (line 81)
* yyerrok <2>: Table of Symbols. (line 294)
* yyerrok: Error Recovery. (line 94)
* yyerror: Java Action Features.
        (line 60)
* YYERROR <1>: Java Action Features.
        (line 51)
* YYERROR: Action Features. (line 56)
* yyerror <1>: Error Reporting. (line 6)
* yyerror: Table of Symbols. (line 309)
* YYERROR <1>: Table of Symbols. (line 298)
* YYERROR: GLR Semantic Actions.
        (line 28)
* yyerror: Java Action Features.
        (line 61)
* yyerror on Lexer: Java Scanner Interface.
        (line 25)
* YYERROR_VERBOSE: Table of Symbols. (line 313)
* YYFPRINTF <1>: Enabling Traces. (line 36)
* YYFPRINTF: Table of Symbols. (line 321)
* YYINITDEPTH <1>: Memory Management. (line 32)
* YYINITDEPTH: Table of Symbols. (line 324)
* yylex <1>: Lexical. (line 6)
* yylex: Table of Symbols. (line 328)
* yylex on Lexer: Java Scanner Interface.
        (line 31)
* yylex on parser: C++ Scanner Interface.
        (line 13)
* YYLEX_PARAM: Table of Symbols. (line 333)
* yylloc <1>: GLR Semantic Actions.
        (line 10)
* yylloc <2>: Actions and Locations.
        (line 67)
* yylloc <3>: Lookahead. (line 49)
* yylloc <4>: Action Features. (line 86)
* yylloc <5>: Table of Symbols. (line 339)
* yylloc: Token Locations. (line 6)
* YYLLOC_DEFAULT: Location Default Action.
        (line 6)
* YYLTYPE <1>: Table of Symbols. (line 349)
* YYLTYPE: Token Locations. (line 19)
* yylval <1>: Token Values. (line 6)
* yylval <2>: Lookahead. (line 49)
* yylval <3>: Table of Symbols.
        (line 353)
* yylval <4>: Actions. (line 87)
* yylval <5>: GLR Semantic Actions.
        (line 10)
* yylval: Action Features. (line 92)
* YYMAXDEPTH <1>: Memory Management. (line 14)
* YYMAXDEPTH: Table of Symbols. (line 361)
* yynerrs <1>: Table of Symbols. (line 365)
* yynerrs: Error Reporting. (line 69)
* yyoutput: Printer Decl. (line 16)
* yyparse <1>: Parser Function. (line 13)
* yyparse: Table of Symbols. (line 371)
* YYPARSE_PARAM: Table of Symbols. (line 409)
* YYParser on YYParser: Java Parser Interface.
        (line 41)
* YYPRINT <1>: The YYPRINT Macro. (line 6)
* YYPRINT <2>: Table of Symbols. (line 375)
* YYPRINT: The YYPRINT Macro. (line 11)
* yypstate_delete <1>: Table of Symbols. (line 380)
* yypstate_delete: Parser Delete Function.
        (line 15)
* yypstate_new <1>: Parser Create Function.
        (line 15)
* yypstate_new <2>: Table of Symbols. (line 388)
* yypstate_new: Parser Create Function.
        (line 6)
* yypull_parse <1>: Table of Symbols. (line 395)
* yypull_parse: Pull Parser Function.
        (line 14)
* yypush_parse <1>: Table of Symbols. (line 402)
* yypush_parse: Push Parser Function.
        (line 6)
* YYRECOVERING <1>: Error Recovery. (line 111)
* YYRECOVERING <2>: Action Features. (line 64)
* YYRECOVERING: Table of Symbols. (line 415)
* YYSTACK_USE_ALLOCA: Table of Symbols. (line 420)
* YYSTYPE: Table of Symbols. (line 436)
* | <1>: Rules. (line 48)
* |: Table of Symbols.
        (line 64)