1ANTLR v3.0.1 C Runtime
2ANTLR 3.0.1
3January 1, 2008
4
At the moment, the C runtime engine for the parser is not really suitable for
inexperienced C programmers, mainly because documentation on its use is
lacking; this will be corrected shortly. The C runtime code itself, however,
is well documented with doxygen-style comments, and a reasonably experienced
C programmer should be able to piece it together. You can visit the
documentation at: http://www.antlr.org/api/C/index.html
11
The general makeup is that everything is implemented as a pseudo class/object
initialized with pointers to its 'member' functions and data. All objects are
(usually) created by factories, which automatically manage memory allocation
and release and generally make life easier. If you remember this rule,
everything should fall into place.
17
18Jim Idle - Portland Oregon, Jan 2008
19jimi     idle ws
20
21===============================================================================
22
23Terence Parr, parrt at cs usfca edu
24ANTLR project lead and supreme dictator for life
25University of San Francisco
26
27INTRODUCTION
28
29Welcome to ANTLR v3!  I've been working on this for nearly 4 years and it's
30almost ready!  I plan no feature additions between this beta and first
313.0 release.  I have lots of features to add later, but this will be
32the first set.  Ultimately, I need to rewrite ANTLR v3 in itself (it's
33written in 2.7.7 at the moment and also needs StringTemplate 3.0 or
34later).
35
36You should use v3 in conjunction with ANTLRWorks:
37
38    http://www.antlr.org/works/index.html
39
40WARNING: We have bits of documentation started, but nothing super-complete
41yet.  The book will be printed May 2007:
42
43http://www.pragmaticprogrammer.com/titles/tpantlr/index.html
44
45but we should have a beta PDF available on that page in Feb 2007.
46
47You also have the examples plus the source to guide you.
48
49See the new wiki FAQ:
50
51    http://www.antlr.org/wiki/display/ANTLR3/ANTLR+v3+FAQ
52
53and general doc root:
54
55    http://www.antlr.org/wiki/display/ANTLR3/ANTLR+3+Wiki+Home
56
57Please help add/update FAQ entries.
58
59I have made very little effort at this point to deal well with
60erroneous input (e.g., bad syntax might make ANTLR crash).  I will clean
61this up after I've rewritten v3 in v3.
62
63Per the license in LICENSE.txt, this software is not guaranteed to
64work and might even destroy all life on this planet:
65
66THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
67IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
68WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
69DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
70INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
71(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
72SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
73HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
74STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
75IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
76POSSIBILITY OF SUCH DAMAGE.
77
78EXAMPLES
79
80ANTLR v3 sample grammars:
81
82    http://www.antlr.org/download/examples-v3.tar.gz
83
84contains the following examples: LL-star, cminus, dynamic-scope,
85fuzzy, hoistedPredicates, island-grammar, java, python, scopes,
86simplecTreeParser, treeparser, tweak, xmlLexer.
87
88Also check out Mantra Programming Language for a prototype (work in
89progress) using v3:
90
91    http://www.linguamantra.org/
92
93----------------------------------------------------------------------
94
95What is ANTLR?
96
97ANTLR stands for (AN)other (T)ool for (L)anguage (R)ecognition and was
98originally known as PCCTS.  ANTLR is a language tool that provides a
99framework for constructing recognizers, compilers, and translators
100from grammatical descriptions containing actions.  Target language list:
101
102http://www.antlr.org/wiki/display/ANTLR3/Code+Generation+Targets
103
104----------------------------------------------------------------------
105
106How is ANTLR v3 different than ANTLR v2?
107
108See migration guide:
109    http://www.antlr.org/wiki/display/ANTLR3/Migrating+from+ANTLR+2+to+ANTLR+3
110
ANTLR v3 has a far superior parsing algorithm called LL(*) that
handles many more grammars than v2 does.  In practice, it means you
can throw almost any grammar at ANTLR that is non-left-recursive and
unambiguous (a grammar is ambiguous when the same input can be matched
by multiple rules); the cost is perhaps a tiny bit of backtracking, but
with a DFA not a full parser.  You can manually set the max lookahead k
as an option for any decision though.  The LL(*) algorithm ramps up to
use more lookahead when it needs to and is much more efficient than
normal LL backtracking. There is support for syntactic predicates (full
LL backtracking) when LL(*) fails.
121
122Lexers are much easier due to the LL(*) algorithm as well.  Previously
123these two lexer rules would cause trouble because ANTLR couldn't
124distinguish between them with finite lookahead to see the decimal
125point:
126
127INT : ('0'..'9')+ ;
128FLOAT : INT '.' INT ;
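
To see why fixed lookahead fails for these two rules, note that the
decimal point can sit arbitrarily far to the right of the first digit.
The following is a minimal Java sketch of the decision, assuming a
hand-written scanner; the class and method names are hypothetical and
this is not the code ANTLR generates.  It loops over digits the way a
cyclic lookahead DFA would and only then decides:

```java
/** Illustrative only: mimics the INT-versus-FLOAT lookahead decision. */
final class IntOrFloat {
    /** Returns "FLOAT", "INT", or "NONE" for the token starting at offset 0. */
    static String classify(String input) {
        int i = 0;
        // Loop state of the lookahead: skip any number of digits.
        while (i < input.length() && Character.isDigit(input.charAt(i))) {
            i++;
        }
        if (i == 0) {
            return "NONE"; // neither rule can start here
        }
        // Only now, after unbounded lookahead, is the deciding character visible.
        boolean dotThenDigit = i + 1 < input.length()
                && input.charAt(i) == '.'
                && Character.isDigit(input.charAt(i + 1));
        return dotThenDigit ? "FLOAT" : "INT";
    }

    public static void main(String[] args) {
        System.out.println(classify("1234567890.5"));
        System.out.println(classify("42;"));
    }
}
```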
129
130The syntax is almost identical for features in common, but you should
131note that labels are always '=' not ':'.  So do id=ID not id:ID.
132
You can do combined lexer/parser grammars again (a la PCCTS): both lexer
and parser rules are defined in the same file.  See the examples.
135Really nice.  You can reference strings and characters in the grammar
136and ANTLR will generate the lexer for you.
137
138The attribute structure has been enhanced.  Rules may have multiple
139return values, for example.  Further, there are dynamically scoped
140attributes whereby a rule may define a value usable by any rule it
141invokes directly or indirectly w/o having to pass a parameter all the
142way down.
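
The idea behind dynamically scoped attributes can be sketched in plain
Java as a stack that is pushed on rule entry and popped on exit.  This
is only an illustration under that assumption; the class, the method
names, and the single "functionName" attribute are hypothetical, not
what the code generator emits:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative only: a rule-level scope visible to every rule it invokes. */
final class DynamicScopeDemo {
    /** The top entry belongs to the innermost active invocation of method(). */
    private static final Deque<String> functionName = new ArrayDeque<>();

    static String method(String name) {
        functionName.push(name);      // scope is pushed on rule entry
        try {
            return block();           // note: no parameter is passed down
        } finally {
            functionName.pop();       // and popped on exit, even after an error
        }
    }

    static String block() {
        return statement();
    }

    static String statement() {
        // Reads the value defined two invocations up the call chain.
        return "statement inside " + functionName.peek();
    }

    static int depth() {
        return functionName.size();
    }

    public static void main(String[] args) {
        System.out.println(method("foo"));
    }
}
```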
143
144ANTLR v3 tree construction is far superior--it provides tree rewrite
145rules where the right hand side is simply the tree grammar fragment
146describing the tree you want to build:
147
148formalArgs
149	:	typename declarator (',' typename declarator )*
150		-> ^(ARG typename declarator)+
151	;
152
153That builds tree sequences like:
154
155^(ARG int v1) ^(ARG int v2)
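
As a rough picture of what the (...)+ in the rewrite does, the sketch
below emits one ^(ARG ...) tree per typename/declarator pair.  It works
on plain strings and uses hypothetical names; it is not how the
generated parser builds trees:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: one ARG tree per (typename, declarator) pair. */
final class ArgRewriteDemo {
    /** pairs = typename, declarator, typename, declarator, ... */
    static String rewrite(String... pairs) {
        List<String> trees = new ArrayList<>();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            trees.add("^(ARG " + pairs[i] + " " + pairs[i + 1] + ")");
        }
        return String.join(" ", trees);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("int", "v1", "int", "v2"));
    }
}
```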
156
157ANTLR v3 also incorporates StringTemplate:
158
159      http://www.stringtemplate.org
160
161just like AST support.  It is useful for generating output.  For
162example this rule creates a template called 'import' for each import
163definition found in the input stream:
164
165grammar Java;
166options {
167  output=template;
168}
169...
170importDefinition
171    :   'import' identifierStar SEMI
172        -> import(name={$identifierStar.st},
173                begin={$identifierStar.start},
174                end={$identifierStar.stop})
175    ;
176
177The attributes are set via assignments in the argument list.  The
178arguments are actions with arbitrary expressions in the target
179language.  The .st label property is the result template from a rule
180reference.  There is a nice shorthand in actions too:
181
182    %foo(a={},b={},...) ctor
183    %({name-expr})(a={},...) indirect template ctor reference
184    %{string-expr} anonymous template from string expr
    %{expr}.y = z; set template attribute y of StringTemplate-typed expr to z
186    %x.y = z; set template attribute y of x (always set never get attr)
187              to z [languages like python without ';' must still use the
188              ';' which the code generator is free to remove during code gen]
189              Same as '(x).setAttribute("y", z);'
190
For ANTLR v3 I decided to make the most common tasks easy by default.
This means that some of the basic objects are heavier weight
than some speed demons would like, but they are free to pare them down,
leaving most programmers the luxury of having it "just work."  For
example, reading in some input, tweaking it, and writing it back out
while preserving whitespace is easy in v3.
197
198The ANTLR source code is much prettier.  You'll also note that the
199run-time classes are conveniently encapsulated in the
200org.antlr.runtime package.
201
202----------------------------------------------------------------------
203
204How do I install this damn thing?
205
206Just untar and you'll get:
207
208antlr-3.0b6/README.txt (this file)
209antlr-3.0b6/LICENSE.txt
210antlr-3.0b6/src/org/antlr/...
211antlr-3.0b6/lib/stringtemplate-3.0.jar (3.0b6 needs 3.0)
212antlr-3.0b6/lib/antlr-2.7.7.jar
213antlr-3.0b6/lib/antlr-3.0b6.jar
214
215Then you need to add all the jars in lib to your CLASSPATH.
216
217----------------------------------------------------------------------
218
219How do I use ANTLR v3?
220
221[I am assuming you are only using the command-line (and not the
222ANTLRWorks GUI)].
223
224Running ANTLR with no parameters shows you:
225
226ANTLR Parser Generator   Early Access Version 3.0b6 (Jan 31, 2007) 1989-2007
227usage: java org.antlr.Tool [args] file.g [file2.g file3.g ...]
228  -o outputDir          specify output directory where all output is generated
229  -lib dir              specify location of token files
230  -report               print out a report about the grammar(s) processed
231  -print                print out the grammar without actions
232  -debug                generate a parser that emits debugging events
233  -profile              generate a parser that computes profiling information
234  -nfa                  generate an NFA for each rule
235  -dfa                  generate a DFA for each decision point
236  -message-format name  specify output style for messages
237  -X                    display extended argument list
238
For example, here is how to build the LL-star example from the examples
tarball (http://www.antlr.org/download/examples-v3.tar.gz):
241
242$ cd examples/java/LL-star
243$ java org.antlr.Tool simplec.g
244$ jikes *.java
245
246For input:
247
248char c;
249int x;
250void bar(int x);
251int foo(int y, char d) {
252  int i;
253  for (i=0; i<3; i=i+1) {
254    x=3;
255    y=5;
256  }
257}
258
259you will see output as follows:
260
261$ java Main input
262bar is a declaration
263foo is a definition
264
265What if I want to test my parser without generating code?  Easy.  Just
266run ANTLR in interpreter mode.  It can't execute your actions, but it
267can create a parse tree from your input to show you how it would be
268matched.  Use the org.antlr.tool.Interp main class.  In the following,
269I interpret simplec.g on t.c, which contains "int x;"
270
271$ java org.antlr.tool.Interp simplec.g WS program t.c
272( <grammar SimpleC>
273  ( program
274    ( declaration
275      ( variable
276        ( type [@0,0:2='int',<14>,1:0] )
277        ( declarator [@2,4:4='x',<2>,1:4] )
278        [@3,5:5=';',<5>,1:5]
279      )
280    )
281  )
282)
283
284where I have formatted the output to make it more readable.  I have
285told it to ignore all WS tokens.
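
The bracketed entries in the tree appear to follow the layout
[@tokenIndex,startChar:stopChar='text',<tokenType>,line:column].  That
reading is inferred from the output above (note that index 1, the WS
token, is missing), and the helper below is a hypothetical sketch that
reproduces the layout rather than the runtime's own code:

```java
/** Illustrative only: rebuilds the token display seen in the Interp output. */
final class TokenDisplay {
    static String format(int index, int start, int stop, String text,
                         int type, int line, int column) {
        return "[@" + index + "," + start + ":" + stop + "='" + text
                + "',<" + type + ">," + line + ":" + column + "]";
    }

    public static void main(String[] args) {
        // The 'int' keyword of "int x;": token 0, characters 0..2, line 1, column 0.
        System.out.println(format(0, 0, 2, "int", 14, 1, 0));
    }
}
```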
286
287----------------------------------------------------------------------
288
289How do I rebuild ANTLR v3?
290
Make sure the following jars are in your CLASSPATH:
292
293antlr-3.0b6/lib/stringtemplate-3.0.jar
294antlr-3.0b6/lib/antlr-2.7.7.jar
295junit.jar [if you want to build the test directories]
296
Then jump into the antlr-3.0b6/src directory and type:
298
299$ javac -d . org/antlr/Tool.java org/antlr/*/*.java org/antlr/*/*/*.java
300
Takes 9 seconds on my 1GHz laptop or 4 seconds with jikes.  Later I'll
have a real build mechanism, though I must admit the one-liner appeals
to me.  I use IntelliJ, so I never actually type anything to build.

There is also an Ant build.xml file, contributed by others, but I know
nothing of Ant (I'm opposed to any tool with an XML interface for humans).
307
308-----------------------------------------------------------------------
309C# Target Notes
310
3111. Auto-generated lexers do not inherit parent parser's @namespace
312   {...} value.  Use @lexer::namespace{...}.
313
314-----------------------------------------------------------------------
315
316CHANGES
317
318March 17, 2007
319
320* Jonathan DeKlotz updated C# templates to be 3.0b6 current
321
322March 14, 2007
323
324* Manually-specified (...)=> force backtracking eval of that predicate.
325  backtracking=true mode does not however.  Added unit test.
326
327March 14, 2007
328
329* Fixed bug in lexer where ~T didn't compute the set from rule T.
330
* Added -Xnoinlinedfa to make all DFA use tables; no inline prediction with IFs
332
333* Fixed http://www.antlr.org:8888/browse/ANTLR-80.
334  Sem pred states didn't define lookahead vars.
335
336* Fixed http://www.antlr.org:8888/browse/ANTLR-91.
337  When forcing some acyclic DFA to be state tables, they broke.
338  Forcing all DFA to be state tables should give same results.
339
340March 12, 2007
341
342* setTokenSource in CommonTokenStream didn't clear tokens list.
343  setCharStream calls reset in Lexer.
344
345* Altered -depend.  No longer printing grammar files for multiple input
346  files with -depend.  Doesn't show T__.g temp file anymore. Added
347  TLexer.tokens.  Added .h files if defined.
348
349February 11, 2007
350
* Added -depend command-line option that, instead of processing files,
  shows you what files the input grammar(s) depend on and what files
  they generate. For combined grammar T.g:
354
355  $ java org.antlr.Tool -depend T.g
356
357  You get:
358
359  TParser.java : T.g
360  T.tokens : T.g
361  T__.g : T.g
362
  Now, assuming U.g is a tree grammar that refs T's tokens:
364
365  $ java org.antlr.Tool -depend T.g U.g
366
367  TParser.java : T.g
368  T.tokens : T.g
369  T__.g : T.g
370  U.g: T.tokens
371  U.java : U.g
372  U.tokens : U.g
373
374  Handles spaces by escaping them.  Pays attention to -o, -fo and -lib.
375  Dir 'x y' is a valid dir in current dir.
376
377  $ java org.antlr.Tool -depend -lib /usr/local/lib -o 'x y' T.g U.g
378  x\ y/TParser.java : T.g
379  x\ y/T.tokens : T.g
380  x\ y/T__.g : T.g
381  U.g: /usr/local/lib/T.tokens
382  x\ y/U.java : U.g
383  x\ y/U.tokens : U.g
384
385  You have API access via org.antlr.tool.BuildDependencyGenerator class:
386  getGeneratedFileList(), getDependenciesFileList().  You can also access
387  the output template: getDependencies().  The file
388  org/antlr/tool/templates/depend.stg contains the template.  You can
389  modify as you want.  File objects go in so you can play with path etc...
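
  The "target : dependency" lines and the space escaping shown above
  can be imitated with a few lines of Java.  This sketch does not call
  BuildDependencyGenerator; the class and method names are hypothetical
  and only the output format is taken from the examples:

```java
/** Illustrative only: formats one make-style dependency line. */
final class DependLine {
    /** Escapes spaces the way the -depend output shows them (x\ y). */
    static String escapeSpaces(String path) {
        return path.replace(" ", "\\ ");
    }

    static String line(String outputDir, String target, String dependency) {
        String fullTarget = outputDir.isEmpty() ? target : outputDir + "/" + target;
        return escapeSpaces(fullTarget) + " : " + escapeSpaces(dependency);
    }

    public static void main(String[] args) {
        System.out.println(line("x y", "TParser.java", "T.g"));
        System.out.println(line("", "T.tokens", "T.g"));
    }
}
```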
390
391February 10, 2007
392
393* no more .gl files generated.  All .g all the time.
394
395* changed @finally to be @after and added a finally clause to the
396  exception stuff.  I also removed the superfluous "exception"
397  keyword.  Here's what the new syntax looks like:
398
399  a
400  @after { System.out.println("ick"); }
401    : 'a'
402    ;
403    catch[RecognitionException e] { System.out.println("foo"); }
404    catch[IOException e] { System.out.println("io"); }
405    finally { System.out.println("foobar"); }
406
407  @after executes after bookkeeping to set $rule.stop, $rule.tree but
408  before scopes pop and any memoization happens.  Dynamic scopes and
409  memoization are still in generated finally block because they must
410  exec even if error in rule.  The @after action and tree setting
411  stuff can technically be skipped upon syntax error in rule.  [Later
412  we might add something to finally to stick an ERROR token in the
413  tree and set the return value.]  Sequence goes: set $stop, $tree (if
414  any), @after (if any), pop scopes (if any), memoize (if needed),
415  grammar finally clause.  Last 3 are in generated code's finally
416  clause.
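
  The ordering described above can be traced with an ordinary Java
  try/catch/finally.  This is a sketch of the sequence only, with
  hypothetical names and an IllegalStateException standing in for a
  syntax error; the real generated rule method is more involved:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: records the order of rule-exit steps. */
final class RuleExitOrder {
    static List<String> run(boolean syntaxError) {
        List<String> log = new ArrayList<>();
        try {
            if (syntaxError) {
                throw new IllegalStateException("mismatched input");
            }
            log.add("set $stop, $tree"); // skipped when the rule fails
            log.add("@after");           // skipped when the rule fails
        } catch (IllegalStateException e) {
            log.add("catch");
        } finally {
            // These three always run, error or not.
            log.add("pop scopes");
            log.add("memoize");
            log.add("grammar finally clause");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```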
417
4183.0b6 - January 31, 2007
419
420January 30, 2007
421
422* Fixed bug in IntervalSet.and: it returned the same empty set all the time
423  rather than new empty set.  Code altered the same empty set.
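
  This class of bug is easy to reproduce.  The sketch below uses
  java.util sets rather than IntervalSet, and the names are
  hypothetical; it only shows why handing out one shared mutable empty
  set goes wrong as soon as a caller modifies the result:

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: a shared mutable "empty" result versus a fresh one. */
final class SharedEmptySetBug {
    private static final Set<Integer> SHARED_EMPTY = new HashSet<>();

    private static Set<Integer> intersect(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    /** Buggy: every empty intersection returns the same object. */
    static Set<Integer> andBuggy(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = intersect(a, b);
        return result.isEmpty() ? SHARED_EMPTY : result;
    }

    /** Fixed: each call returns its own set. */
    static Set<Integer> andFixed(Set<Integer> a, Set<Integer> b) {
        return intersect(a, b);
    }

    public static void main(String[] args) {
        Set<Integer> one = new HashSet<>();
        one.add(1);
        Set<Integer> two = new HashSet<>();
        two.add(2);
        andBuggy(one, two).add(99);               // caller alters the "empty" set
        System.out.println(andBuggy(one, two));   // no longer empty
        System.out.println(andFixed(one, two));   // still empty
    }
}
```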
424
425* Made analysis terminate faster upon a decision that takes too long;
426  it seemed to keep doing work for a while.  Refactored some names
427  and updated comments.  Also made it terminate when it realizes it's
428  non-LL(*) due to recursion.  just added terminate conditions to loop
429  in convert().
430
431* Sometimes fatal non-LL(*) messages didn't appear; instead you got
432  "antlr couldn't analyze", which is actually untrue.  I had the
433  order of some prints wrong in the DecisionProbe.
434
435* The code generator incorrectly detected when it could use a fixed,
436  acyclic inline DFA (i.e., using an IF).  Upon non-LL(*) decisions
437  with predicates, analysis made cyclic DFA.  But this stops
438  the computation detecting whether they are cyclic.  I just added
439  a protection in front of the acyclic DFA generator to avoid if
440  non-LL(*).  Updated comments.
441
442January 23, 2007
443
444* Made tree node streams use adaptor to create navigation nodes.
445  Thanks to Emond Papegaaij.
446
447January 22, 2007
448
449* Added lexer rule properties: start, stop
450
451January 1, 2007
452
453* analysis failsafe is back on; if a decision takes too long, it bails out
454  and uses k=1
455
456January 1, 2007
457
458* += labels for rules only work for output option; previously elements
459  of list were the return value structs, but are now either the tree or
460  StringTemplate return value.  You can label different rules now
461  x+=a x+=b.
462
463December 30, 2006
464
465* Allow \" to work correctly in "..." template.
466
467December 28, 2006
468
469* errors that are now warnings: missing AST label type in trees.
470  Also "no start rule detected" is warning.
471
472* tree grammars also can do rewrite=true for output=template.
473  Only works for alts with single node or tree as alt elements.
474  If you are going to use $text in a tree grammar or do rewrite=true
475  for templates, you must use in your main:
476
477  nodes.setTokenStream(tokens);
478
479* You get a warning for tree grammars that do rewrite=true and
480  output=template and have -> for alts that are not simple nodes
481  or simple trees.  new unit tests in TestRewriteTemplates at end.
482
483December 27, 2006
484
485* Error message appears when you use -> in tree grammar with
486  output=template and rewrite=true for alt that is not simple
487  node or tree ref.
488
489* no more $stop attribute for tree parsers; meaningless/useless.
490  Removed from TreeRuleReturnScope also.
491
492* rule text attribute in tree parser must pull from token buffer.
493  Makes no sense otherwise.  added getTokenStream to TreeNodeStream
494  so rule $text attr works.  CommonTreeNodeStream etc... now let
495  you set the token stream so you can access later from tree parser.
496  $text is not well-defined for rules like
497
498     slist : stat+ ;
499
500  because stat is not a single node nor rooted with a single node.
501  $slist.text will get only first stat.  I need to add a warning about
502  this...
503
504* Fixed http://www.antlr.org:8888/browse/ANTLR-76 for Java.
505  Enhanced TokenRewriteStream so it accepts any object; converts
506  to string at last second.  Allows you to rewrite with StringTemplate
507  templates now :)
508
509* added rewrite option that makes -> template rewrites do replace ops for
510  TokenRewriteStream input stream.  In output=template and rewrite=true mode
511  same as before 'cept that the parser does
512
513    ((TokenRewriteStream)input).replace(
514	      ((Token)retval.start).getTokenIndex(),
515	      input.LT(-1).getTokenIndex(),
516	      retval.st);
517
518  after each rewrite so that the input stream is altered.  Later refs to
519  $text will have rewrites.  Here's a sample test program for grammar Rew.
520
521        FileReader groupFileR = new FileReader("Rew.stg");
522        StringTemplateGroup templates = new StringTemplateGroup(groupFileR);
523        ANTLRInputStream input = new ANTLRInputStream(System.in);
524        RewLexer lexer = new RewLexer(input);
525        TokenRewriteStream tokens = new TokenRewriteStream(lexer);
526        RewParser parser = new RewParser(tokens);
527        parser.setTemplateLib(templates);
528        parser.program();
529        System.out.println(tokens.toString());
530        groupFileR.close();
531
532December 26, 2006
533
534* BaseTree.dupTree didn't dup recursively.
535
536December 24, 2006
537
538* Cleaned up some comments and removed field treeNode
539  from MismatchedTreeNodeException class.  It is "node" in
540  RecognitionException.
541
542* Changed type from Object to BitSet for expecting fields in
543  MismatchedSetException and MismatchedNotSetException
544
545* Cleaned up error printing in lexers and the messages that it creates.
546
547* Added this to TreeAdaptor:
	/** Return the token object from which this node was created.
	 *  Currently used only for printing an error message.
	 *  The error display routine in BaseRecognizer needs to
	 *  display where in the input the error occurred. If your
	 *  tree implementation does not store information that can
	 *  lead you to the token, you can create a token filled with
	 *  the appropriate information and pass that back.  See
	 *  BaseRecognizer.getErrorMessage().
	 */
	public Token getToken(Object t);
558
559December 23, 2006
560
561* made BaseRecognizer.displayRecognitionError nonstatic so people can
562  override it. Not sure why it was static before.
563
564* Removed state/decision message that comes out of no
565  viable alternative exceptions, as that was too much.
566  removed the decision number from the early exit exception
567  also.  During development, you can simply override
568  displayRecognitionError from BaseRecognizer to add the stuff
569  back in if you want.
570
571* made output go to an output method you can override: emitErrorMessage()
572
573* general cleanup of the error emitting code in BaseRecognizer.  Lots
574  more stuff you can override: getErrorHeader, getTokenErrorDisplay,
575  emitErrorMessage, getErrorMessage.
576
577December 22, 2006
578
* Altered TreeParser.matchAny() so that it skips an entire tree if the
  node has children; otherwise it skips one node.  Now this works to
  skip the entire body of a function if it is a single-rooted subtree:
582  ^(FUNC name=ID arg=ID .)
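
  Here is a sketch of that skipping logic over a flattened node stream,
  assuming the stream marks child lists with DOWN and UP entries.
  Plain strings stand in for nodes and the names are hypothetical; it
  is not the runtime's implementation:

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: wildcard match over a flattened tree node stream. */
final class MatchAnyDemo {
    /** Returns the stream index just past the node (or whole subtree) at i. */
    static int matchAny(List<String> nodes, int i) {
        i++; // consume the node itself
        if (i < nodes.size() && nodes.get(i).equals("DOWN")) {
            int depth = 0;
            do { // consume the balanced DOWN ... UP run holding the children
                String n = nodes.get(i++);
                if (n.equals("DOWN")) {
                    depth++;
                } else if (n.equals("UP")) {
                    depth--;
                }
            } while (depth > 0 && i < nodes.size());
        }
        return i;
    }

    public static void main(String[] args) {
        // ^(FUNC f x ^(BLOCK a b)) flattened
        List<String> nodes = Arrays.asList(
                "FUNC", "DOWN", "f", "x", "BLOCK", "DOWN", "a", "b", "UP", "UP");
        System.out.println(matchAny(nodes, 4)); // skips the whole BLOCK subtree
        System.out.println(matchAny(nodes, 2)); // skips the single node f
    }
}
```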
583
584* Added "reverse index" from node to stream index.  Override
585  fillReverseIndex() in CommonTreeNodeStream if you want to change.
586  Use getNodeIndex(node) to find stream index for a specific tree node.
587  See getNodeIndex(), reverseIndex(Set tokenTypes),
588  reverseIndex(int tokenType), fillReverseIndex().  The indexing
589  costs time and memory to fill, but pulling stuff out will be lots
590  faster as it can jump from a node ptr straight to a stream index.
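
  The time and memory trade-off can be pictured with an identity-keyed
  map filled once over the buffered nodes.  The method names below echo
  the ones mentioned above, but the signatures and the class are
  assumptions made for this sketch, not the CommonTreeNodeStream API:

```java
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: node -> stream index lookup keyed by object identity. */
final class ReverseIndexDemo {
    /** One pass over the stream; costs time and memory up front. */
    static Map<Object, Integer> fillReverseIndex(List<?> nodes) {
        Map<Object, Integer> index = new IdentityHashMap<>();
        for (int i = 0; i < nodes.size(); i++) {
            index.put(nodes.get(i), i);
        }
        return index;
    }

    /** Constant-time jump from a node reference to its stream index, or -1. */
    static int getNodeIndex(Map<Object, Integer> index, Object node) {
        Integer i = index.get(node);
        return i == null ? -1 : i;
    }

    public static void main(String[] args) {
        String first = new String("ID");
        String second = new String("ID"); // equal text, different node
        Map<Object, Integer> index = fillReverseIndex(Arrays.asList(first, second));
        System.out.println(getNodeIndex(index, second));
    }
}
```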
591
592* Added TreeNodeStream.get(index) to make it easier for interpreters to
593  jump around in tree node stream.
594
595* New CommonTreeNodeStream buffers all nodes in stream for fast jumping
596  around.  It now has push/pop methods to invoke other locations in
597  the stream for building interpreters.
598
599* Moved CommonTreeNodeStream to UnBufferedTreeNodeStream and removed
600  Iterator implementation.  moved toNodesOnlyString() to TestTreeNodeStream
601
602* [BREAKS ANY TREE IMPLEMENTATION]
603  made CommonTreeNodeStream work with any tree node type.  TreeAdaptor
604  now implements isNil so must add; trivial, but does break back
605  compatibility.
606
607December 17, 2006
608
609* Added traceIn/Out methods to recognizers so that you can override them;
610  previously they were in-line print statements. The message has also
611  been slightly improved.
612
613* Factored BuildParseTree into debug package; cleaned stuff up. Fixed
614  unit tests.
615
616December 15, 2006
617
618* [BREAKS ANY TREE IMPLEMENTATION]
619  org.antlr.runtime.tree.Tree; needed to add get/set for token start/stop
620  index so CommonTreeAdaptor can assume Tree interface not CommonTree
621  implementation.  Otherwise, no way to create your own nodes that satisfy
622  Tree because CommonTreeAdaptor was doing
623
624	public int getTokenStartIndex(Object t) {
625		return ((CommonTree)t).startIndex;
626	}
627
628  Added to Tree:
629
630	/**  What is the smallest token index (indexing from 0) for this node
631	 *   and its children?
632	 */
633	int getTokenStartIndex();
634
635	void setTokenStartIndex(int index);
636
637	/**  What is the largest token index (indexing from 0) for this node
638	 *   and its children?
639	 */
640	int getTokenStopIndex();
641
642	void setTokenStopIndex(int index);
643
644December 13, 2006
645
646* Added org.antlr.runtime.tree.DOTTreeGenerator so you can generate DOT
647  diagrams easily from trees.
648
649	CharStream input = new ANTLRInputStream(System.in);
650	TLexer lex = new TLexer(input);
651	CommonTokenStream tokens = new CommonTokenStream(lex);
652	TParser parser = new TParser(tokens);
653	TParser.e_return r = parser.e();
654	Tree t = (Tree)r.tree;
655	System.out.println(t.toStringTree());
656	DOTTreeGenerator gen = new DOTTreeGenerator();
657	StringTemplate st = gen.toDOT(t);
658	System.out.println(st);
659
* Changed the way mark()/rewind() work in CommonTreeNodeStream to mirror
661  more flexible solution in ANTLRStringStream.  Forgot to set lastMarker
662  anyway.  Now you can rewind to non-most-recent marker.
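
  Rewinding to a marker that is not the most recent one needs a list of
  saved positions rather than a single lastMarker field.  A minimal
  sketch over a character buffer, with hypothetical names (the runtime
  classes keep more state than a bare position):

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a stream that can rewind to any outstanding marker. */
final class MarkRewindDemo {
    private final String data;
    private int p = 0;
    private final List<Integer> markers = new ArrayList<>();

    MarkRewindDemo(String data) {
        this.data = data;
    }

    void consume() {
        if (p < data.length()) {
            p++;
        }
    }

    int index() {
        return p;
    }

    /** Saves the current position and returns a 1-based marker id. */
    int mark() {
        markers.add(p);
        return markers.size();
    }

    /** Restores the position saved by any earlier mark() and releases
     *  that marker together with all markers created after it. */
    void rewind(int marker) {
        p = markers.get(marker - 1);
        markers.subList(marker - 1, markers.size()).clear();
    }

    public static void main(String[] args) {
        MarkRewindDemo s = new MarkRewindDemo("abcdef");
        s.consume();
        int m1 = s.mark();
        s.consume();
        s.consume();
        s.mark();            // a more recent marker we do not rewind to
        s.consume();
        s.rewind(m1);        // jump back past the newer marker
        System.out.println(s.index());
    }
}
```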
663
664December 12, 2006
665
* Temp lexer now ends in .gl (T__.gl, for example)
667
668* TreeParser suffix no longer generated for tree grammars
669
670* Defined reset for lexer, parser, tree parser; rewinds the input stream also
671
672December 10, 2006
673
674* Made Grammar.abortNFAToDFAConversion() abort in middle of a DFA.
675
676December 9, 2006
677
678* fixed bug in OrderedHashSet.add().  It didn't track elements correctly.
679
680December 6, 2006
681
682* updated build.xml for future Ant compatibility, thanks to Matt Benson.
683
684* various tests in TestRewriteTemplate and TestSyntacticPredicateEvaluation
685  were using the old 'channel' vs. new '$channel' notation.
686  TestInterpretedParsing didn't pick up an earlier change to CommonToken.
687  Reported by Matt Benson.
688
689* fixed platform dependent test failures in TestTemplates, supplied by Matt
690  Benson.
691
692November 29, 2006
693
694*  optimized semantic predicate evaluation so that p||!p yields true.
695
696November 22, 2006
697
698* fixed bug that prevented var = $rule.some_retval from working in anything
699  but the first alternative of a rule or subrule.
700
701* attribute names containing digits were not allowed, this is now fixed,
702  allowing attributes like 'name1' but not '1name1'.
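
  As a quick check of the rule, the pattern below accepts 'name1' and
  rejects '1name1'.  The exact character set is an assumption made for
  this sketch (a letter or underscore first, then letters, digits, or
  underscores); the real rule lives in the action translator grammar:

```java
import java.util.regex.Pattern;

/** Illustrative only: digits allowed in attribute names, but not first. */
final class AttributeNames {
    // Assumed shape of a legal name; not copied from ANTLR's grammar.
    private static final Pattern NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    static boolean isValid(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("name1"));
        System.out.println(isValid("1name1"));
    }
}
```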
703
704November 19, 2006
705
706* Removed LeftRecursionMessage and apparatus because it seems that I check
707  for left recursion upfront before analysis and everything gets specified as
708  recursion cycles at this point.
709
710November 16, 2006
711
712* TokenRewriteStream.replace was not passing programName to next method.
713
714November 15, 2006
715
716* updated DOT files for DFA generation to make smaller circles.
717
718* made epsilon edges italics in the NFA diagrams.
719
7203.0b5 - November 15, 2006
721
722The biggest thing is that your grammar file names must match the grammar name
723inside (your generated class names will also be different) and we use
724$channel=HIDDEN now instead of channel=99 inside lexer actions.
725Should be compatible other than that.   Please look at complete list of
726changes.
727
728November 14, 2006
729
* Force token index to be -1 for CommonToken in case not set.
731
732November 11, 2006
733
734* getUniqueID for TreeAdaptor now uses identityHashCode instead of hashCode.
735
736November 10, 2006
737
738* No grammar nondeterminism warning now when wildcard '.' is final alt.
739  Examples:
740
741	a : A | B | . ;
742
743	A : 'a'
744	  | .
745	  ;
746
747	SL_COMMENT
748	    : '//' (options {greedy=false;} : .)* '\r'? '\n'
749	    ;
750
751	SL_COMMENT2
752	    : '//' (options {greedy=false;} : 'x'|.)* '\r'? '\n'
753	    ;
754
755
756November 8, 2006
757
* Syntactic predicates did not get hoisted properly upon non-LL(*)
  decision.  Other hoisting issues fixed.  Cleaned up code.

* Removed failsafe that checked to see if I'm spending too much time on a
  single DFA; I don't think we need it anymore.
761
762November 3, 2006
763
764* $text, $line, etc... were not working in assignments. Fixed and added
765  test case.
766
767* $label.text translated to label.getText in lexer even if label was on a char
768
769November 2, 2006
770
771* Added error if you don't specify what the AST type is; actions in tree
772  grammar won't work without it.
773
774  $ cat x.g
775  tree grammar x;
776  a : ID {String s = $ID.text;} ;
777
778  ANTLR Parser Generator   Early Access Version 3.0b5 (??, 2006)  1989-2006
779  error: x.g:0:0: (152) tree grammar x has no ASTLabelType option
780
781November 1, 2006
782
783* $text, $line, etc... were not working properly within lexer rule.
784
785October 32, 2006
786
* Finally actions now execute before dynamic scopes are popped in the
  rule. Previously it was not possible to access the rule's scoped
  variables in a finally action.
790
791October 29, 2006
792
793* Altered ActionTranslator to emit errors on setting read-only attributes
794  such as $start, $stop, $text in a rule. Also forbid setting any attributes
795  in rules/tokens referenced by a label or name.
  Setting dynamic scopes' attributes and your own parameter attributes
  is legal.
798
799October 27, 2006
800
801* Altered how ANTLR figures out what decision is associated with which
802  block of grammar.  Makes ANTLRWorks correctly find DFA for a block.
803
804October 26, 2006
805
806* Fixed bug where EOT transitions led to no NFA configs in a DFA state,
807  yielding an error in DFA table generation.
808
809* renamed action.g to ActionTranslator.g
810  the ActionTranslator class is now called ActionTranslatorLexer, as ANTLR
811  generates this classname now. Fixed rest of codebase accordingly.
812
813* added rules recognizing setting of scopes' attributes to ActionTranslator.g
814  the Objective C target needed access to the right-hand side of the assignment
815  in order to generate correct code
816
817* changed ANTLRCore.sti to reflect the new mandatory templates to support the above
818  namely: scopeSetAttributeRef, returnSetAttributeRef and the ruleSetPropertyRef_*
819  templates, with the exception of ruleSetPropertyRef_text. we cannot set this attribute
820
821October 19, 2006
822
823* Fixed 2 bugs in DFA conversion that caused exceptions.
824  altered functionality of getMinElement so it ignores elements<0.
825
826October 18, 2006
827
828* moved resetStateNumbersToBeContiguous() to after issuing of warnings;
829  an internal error in that routine should make more sense as issues
830  with decision will appear first.
831
* fixed cut/paste bug I introduced when I fixed the EOF in min/max
  bug. Prevented C grammar from working briefly.
834
835October 17, 2006
836
* Removed a failsafe that seems to be unnecessary that ensured DFA didn't
  get too big.  It was resulting in some failures in code generation that
  led me on quite a strange debugging trip.
840
841October 16, 2006
842
843* Use channel=HIDDEN not channel=99 to put tokens on hidden channel.
844
845October 12, 2006
846
847* ANTLR now has a customizable message format for errors and warnings,
848  to make it easier to fulfill requirements by IDEs and such.
849  The format to be used can be specified via the '-message-format name'
850  command line switch. The default for name is 'antlr', also available
851  at the moment is 'gnu'. This is done via StringTemplate, for details
852  on the requirements look in org/antlr/tool/templates/messages/formats/
853
854* line numbers for lexers in combined grammars are now reported correctly.
855
856September 29, 2006
857
858* ANTLRReaderStream improperly checked for end of input.
859
860September 28, 2006
861
862* For ANTLRStringStream, LA(-1) was off by one...gave you LA(-2).
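
  To make the intended lookahead semantics concrete, here is a small
  self-contained sketch (not the runtime class itself): LA(1) is the next
  char to be consumed and LA(-1) is the char just consumed.  The class
  name and the EOF constant are assumptions of the sketch.

```java
final class LookaheadSketch {
    static final int EOF = -1;

    private final char[] data;
    private int p = 0; // index of the next char to consume

    LookaheadSketch(String input) {
        data = input.toCharArray();
    }

    void consume() {
        if (p < data.length) {
            p++;
        }
    }

    /** LA(1) is data[p]; LA(-1) is data[p-1]; LA(0) is undefined. */
    int LA(int i) {
        if (i == 0) {
            return 0; // undefined in this sketch
        }
        if (i < 0) {
            i++; // shift so that LA(-1) maps to data[p-1], not data[p-2]
            if (p + i - 1 < 0) {
                return EOF; // looking back before the start of input
            }
        }
        if (p + i - 1 >= data.length) {
            return EOF;
        }
        return data[p + i - 1];
    }
}
```

  After consuming two chars of "abc", LA(-1) is 'b' and LA(-2) is 'a'.
  Forgetting the i++ shift gives exactly the off-by-one described above.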
863
8643.0b4 - August 24, 2006
865
866* error when no rules in grammar.  doesn't crash now.
867
868* Token is now an interface.
869
870* remove dependence on non runtime classes in runtime package.
871
872* filename and grammar name must be same Foo in Foo.g.  Generates FooParser,
873  FooLexer, ...  Combined grammar Foo generates Foo$Lexer.g which generates
874  FooLexer.java.  tree grammars generate FooTreeParser.java
875
876August 24, 2006
877
878* added C# target to lib, codegen, templates
879
880August 11, 2006
881
882* added tree arg to navigation methods in treeadaptor
883
884August 07, 2006
885
886* fixed bug related to (a|)+ on end of lexer rules.  crashed instead
887  of warning.
888
889* added warning that interpreter doesn't do synpreds yet
890
891* allow different source of classloader:
    ClassLoader cl = Thread.currentThread().getContextClassLoader();
    if ( cl==null ) {
        cl = this.getClass().getClassLoader();
    }
896
897
898July 26, 2006
899
900* compressed DFA edge tables significantly.  All edge tables are
901  unique. The transition table can reuse arrays.  Look like this now:
902
     public static readonly short[] DFA30_transition0 =
         new short[] { 46, 46, -1, 46, 46, -1, -1, -1, -1, -1, -1, -1,...};
     public static readonly short[] DFA30_transition1 =
         new short[] { 21 };
     public static readonly short[][] DFA30_transition = {
         DFA30_transition0,
         DFA30_transition0,
         DFA30_transition1,
         ...
     };
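
  The row sharing can be sketched in plain Java as follows.  This is only
  an illustration of the idea (identical edge rows collapse to one array
  that the transition table references several times); the class and
  method names are made up and this is not the code generator's logic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class EdgeTableSketch {
    /** Return a table whose equal rows all point at one shared array. */
    static short[][] shareRows(short[][] table) {
        Map<List<Short>, short[]> unique = new HashMap<>();
        short[][] shared = new short[table.length][];
        for (int state = 0; state < table.length; state++) {
            // Use the row contents as the lookup key.
            List<Short> key = new ArrayList<>();
            for (short v : table[state]) {
                key.add(v);
            }
            short[] canonical = unique.get(key);
            if (canonical == null) {
                canonical = table[state]; // first time we see this row
                unique.put(key, canonical);
            }
            shared[state] = canonical;
        }
        return shared;
    }
}
```

  For rows {46,46,-1}, {46,46,-1}, {21} the first two entries of the
  result are the same array object, which is the reuse shown above.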
913
914* If you defined both a label like EQ and '=', sometimes the '=' was
915  used instead of the EQ label.
916
917* made headerFile template have same arg list as outputFile for consistency
918
919* outputFile, lexer, genericParser, parser, treeParser templates
920  reference cyclicDFAs attribute which was no longer used after I
921  started the new table-based DFA.  I made cyclicDFADescriptors
922  argument to outputFile and headerFile (only).  I think this is
923  correct as only OO languages will want the DFA in the recognizer.
924  At the top level, C and friends can use it.  Changed name to use
925  cyclicDFAs again as it's a better name probably.  Removed parameter
926  from the lexer, ...  For example, my parser template says this now:
927
928    <cyclicDFAs:cyclicDFA()> <! dump tables for all DFA !>
929
930* made all token ref token types go thru code gen's
931  getTokenTypeAsTargetLabel()
932
933* no more computing DFA transition tables for acyclic DFA.
934
935July 25, 2006
936
937* fixed a place where I was adding syn predicates into rewrite stuff.
938
939* turned off invalid token index warning in AW support; had a problem.
940
941* bad location event generated with -debug for synpreds in autobacktrack mode.
942
943July 24, 2006
944
945* changed runtime.DFA so that it treats all chars and token types as
946  char (unsigned 16 bit int).  -1 becomes '\uFFFF' then or 65535.
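
  The conversion is just Java's narrowing cast from int to char.  A tiny
  sketch (class and method names are hypothetical):

```java
final class CharTypeSketch {
    /** Narrow a char value or token type to an unsigned 16-bit Java char. */
    static char asChar(int type) {
        return (char) type; // keeps the low 16 bits, so -1 becomes 0xFFFF
    }
}
```

  Here asChar(-1) equals '\uFFFF', which is 65535 when widened back to
  an int.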
947
948* changed MAX_STATE_TRANSITIONS_FOR_TABLE to be 65534 by default
949  now. This means that all states can use a table to do transitions.
950
951* was not making synpreds on (C)* type loops with backtrack=true
952
953* was copying tree stuff and actions into synpreds with backtrack=true
954
955* was making synpreds on even single alt rules / blocks with backtrack=true
956
9573.0b3 - July 21, 2006
958
959* ANTLR fails to analyze complex decisions much less frequently.  It
960  turns out that the set of decisions for which ANTLR fails (times
  out) is the same set (so far) of non-LL(*) decisions.  Moreover, I'm
962  able to detect this situation quickly and report rather than timing
963  out. Errors look like:
964
965  java.g:468:23: [fatal] rule concreteDimensions has non-LL(*)
966    decision due to recursive rule invocations in alts 1,2.  Resolve
967    by left-factoring or using syntactic predicates with fixed k
968    lookahead or use backtrack=true option.
969
970  This message only appears when k=*.
971
972* Shortened no viable alt messages to not include decision
973  description:
974
975[compilationUnit, declaration]: line 8:8 decision=<<67:1: declaration
976: ( ( fieldDeclaration )=> fieldDeclaration | ( methodDeclaration )=>
977methodDeclaration | ( constructorDeclaration )=>
978constructorDeclaration | ( classDeclaration )=> classDeclaration | (
979interfaceDeclaration )=> interfaceDeclaration | ( blockDeclaration )=>
980blockDeclaration | emptyDeclaration );>> state 3 (decision=14) no
981viable alt; token=[@1,184:187='java',<122>,8:8]
982
983  too long and hard to read.
984
985July 19, 2006
986
987* Code gen bug: states with no emanating edges were ignored by ST.
988  Now an empty list is used.
989
990* Added grammar parameter to recognizer templates so they can access
991  properties like getName(), ...
992
993July 10, 2006
994
995* Fixed the gated pred merged state bug.  Added unit test.
996
997* added new method to Target: getTokenTypeAsTargetLabel()
998
999July 7, 2006
1000
1001* I was doing an AND instead of OR in the gated predicate stuff.
1002  Thanks to Stephen Kou!
1003
1004* Reduce op for combining predicates was insanely slow sometimes and
1005  didn't actually work well.  Now it's fast and works.
1006
1007* There is a bug in merging of DFA stop states related to gated
1008  preds...turned it off for now.
1009
10103.0b2 - July 5, 2006
1011
1012July 5, 2006
1013
1014* token emission not properly protected in lexer filter mode.
1015
1016* EOT, EOT DFA state transition tables should be init'd to -1 (only
1017  was doing this for compressed tables).  Fixed.
1018
1019* in trace mode, exit method not shown for memoized rules
1020
1021* added -Xmaxdfaedges to allow you to increase number of edges allowed
1022  for a single DFA state before it becomes "special" and can't fit in
1023  a simple table.
1024
1025* Bug in tables.  Short are signed so min/max tables for DFA are now
1026  char[].  Bizarre.
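
  A sketch of why signed shorts go wrong here, assuming the min/max
  tables are used for a simple range check per state (an assumption; the
  names below are made up).  A max of 0xFFFF stored in a short reads back
  as -1, so no char is ever in range:

```java
final class MinMaxSketch {
    /** Range check against short tables: breaks for values above 0x7FFF. */
    static boolean inRangeShort(short[] min, short[] max, int state, char c) {
        return c >= min[state] && c <= max[state];
    }

    /** Same check against char tables: values are unsigned 16 bit. */
    static boolean inRangeChar(char[] min, char[] max, int state, char c) {
        return c >= min[state] && c <= max[state];
    }
}
```

  With min 0 and max 0xFFFF, the short version rejects 'a' while the
  char version accepts it.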
1027
1028July 3, 2006
1029
1030* Added a method to reset the tool error state for current thread.
1031  See ErrorManager.java
1032
* [Got this working properly today] backtrack mode that lets you type
1034  in any old crap and ANTLR will backtrack if it can't figure out what
1035  you meant.  No errors are reported by antlr during analysis.  It
1036  implicitly adds a syn pred in front of every production, using them
1037  only if static grammar LL(*) analysis fails.  Syn pred code is not
1038  generated if the pred is not used in a decision.
1039
1040  This is essentially a rapid prototyping mode.
1041
1042* Added backtracking report to the -report option
1043
1044* Added NFA->DFA conversion early termination report to the -report option
1045
1046* Added grammar level k and backtrack options to -report
1047
1048* Added a dozen unit tests to test autobacktrack NFA construction.
1049
1050* If you are using filter mode, you must manually use option
1051  memoize=true now.
1052
1053July 2, 2006
1054
1055* Added k=* option so you can set k=2, for example, on whole grammar,
1056  but an individual decision can be LL(*).
1057
1058* memoize option for grammars, rules, blocks.  Remove -nomemo cmd-line option
1059
* bug in DOT generator for DFA; fixed.
1061
1062* runtime.DFA reported errors even when backtracking
1063
1064July 1, 2006
1065
1066* Added -X option list to help
1067
1068* Syn preds were being hoisted into other rules, causing lots of extra
1069  backtracking.
1070
1071June 29, 2006
1072
1073* unnecessary files removed during build.
1074
1075* Matt Benson updated build.xml
1076
1077* Detecting use of synpreds in analysis now instead of codegen.  In
1078  this way, I can avoid analyzing decisions in synpreds for synpreds
1079  not used in a DFA for a real rule.  This is used to optimize things
1080  for backtrack option.
1081
1082* Code gen must add _fragment or whatever to end of pred name in
1083  template synpredRule to avoid having ANTLR know anything about
1084  method names.
1085
1086* Added -IdbgST option to emit ST delimiters at start/stop of all
1087  templates spit out.
1088
1089June 28, 2006
1090
1091* Tweaked message when ANTLR cannot handle analysis.
1092
10933.0b1 - June 27, 2006
1094
1095June 24, 2006
1096
1097* syn preds no longer generate little static classes; they also don't
1098  generate a whole bunch of extra crap in the rules built to test syn
1099  preds.  Removed GrammarFragmentPointer class from runtime.
1100
1101June 23-24, 2006
1102
1103* added output option to -report output.
1104
1105* added profiling info:
1106  Number of rule invocations in "guessing" mode
1107  number of rule memoization cache hits
1108  number of rule memoization cache misses
1109
1110* made DFA DOT diagrams go left to right not top to bottom
1111
* I try to resolve recursion overflow states now with
  semantic/syntactic predicates if they exist.  The DFA is then
  deterministic rather than simply resolving by choosing first
  nondeterministic alt.  I used to generate errors:
1116
1117~/tmp $ java org.antlr.Tool -dfa t.g
1118ANTLR Parser Generator   Early Access Version 3.0b2 (July 5, 2006)  1989-2006
1119t.g:2:5: Alternative 1: after matching input such as A A A A A decision cannot predict what comes next due to recursion overflow to b from b
1120t.g:2:5: Alternative 2: after matching input such as A A A A A decision cannot predict what comes next due to recursion overflow to b from b
1121
  Now, I use predicates if available and emit no warnings.
1123
* made sem preds share accept states.  Previously, multiple preds in a
  decision forked new accepts each time for each nondet state.
1126
1127June 19, 2006
1128
1129* Need parens around the prediction expressions in templates.
1130
1131* Referencing $ID.text in an action forced bad code gen in lexer rule ID.
1132
1133* Fixed a bug in how predicates are collected.  The definition of
1134  "last predicated alternative" was incorrect in the analysis.  Further,
1135  gated predicates incorrectly missed a case where an edge should become
1136  true (a tautology).
1137
1138* Removed an unnecessary input.consume() reference in the runtime/DFA class.
1139
1140June 14, 2006
1141
1142* -> ($rulelabel)? didn't generate proper code for ASTs.
1143
1144* bug in code gen (did not compile)
1145a : ID -> ID
1146  | ID -> ID
1147  ;
1148Problem is repeated ref to ID from left side.  Juergen pointed this out.
1149
1150* use of tokenVocab with missing file yielded exception
1151
1152* (A|B)=> foo yielded an exception as (A|B) is a set not a block. Fixed.
1153
1154* Didn't set ID1= and INT1= for this alt:
  | ^(ID INT+ {System.out.print("^("+$ID+" "+$INT+")");})
1156
1157* Fixed so repeated dangling state errors only occur once like:
1158t.g:4:17: the decision cannot distinguish between alternative(s) 2,1 for at least one input sequence
1159
1160* tracking of rule elements was on (making list defs at start of
1161  method) with templates instead of just with ASTs.  Turned off.
1162
1163* Doesn't crash when you give it a missing file now.
1164
1165* -report: add output info: how many LL(1) decisions.
1166
1167June 13, 2006
1168
1169* ^(ROOT ID?) Didn't work; nor did any other nullable child list such as
1170  ^(ROOT ID* INT?).  Now, I check to see if child list is nullable using
1171  Grammar.LOOK() and, if so, I generate an "IF lookahead is DOWN" gate
1172  around the child list so the whole thing is optional.
1173
1174* Fixed a bug in LOOK that made it not look through nullable rules.
1175
1176* Using AST suffixes or -> rewrite syntax now gives an error w/o a grammar
1177  output option.  Used to crash ;)
1178
1179* References to EOF ended up with improper -1 refs instead of EOF in output.
1180
1181* didn't warn of ambig ref to $expr in rewrite; fixed.
1182list
1183     :	'[' expr 'for' type ID 'in' expr ']'
1184	-> comprehension(expr={$expr.st},type={},list={},i={})
1185	;
1186
1187June 12, 2006
1188
1189* EOF works in the parser as a token name.
1190
1191* Rule b:(A B?)*; didn't display properly in AW due to the way ANTLR
1192  generated NFA.
1193
1194* "scope x;" in a rule for unknown x gives no error.  Fixed.  Added unit test.
1195
1196* Label type for refs to start/stop in tree parser and other parsers were
1197  not used.  Lots of casting.  Ick. Fixed.
1198
1199* couldn't refer to $tokenlabel in isolation; but need so we can test if
1200  something was matched.  Fixed.
1201
1202* Lots of little bugs fixed in $x.y, %... translation due to new
1203  action translator.
1204
1205* Improperly tracking block nesting level; result was that you couldn't
1206  see $ID in action of rule "a : A+ | ID {Token t = $ID;} | C ;"
1207
1208* a : ID ID {$ID.text;} ; did not get a warning about ambiguous $ID ref.
1209
1210* No error was found on $COMMENT.text:
1211
1212COMMENT
1213    :   '/*' (options {greedy=false;} : . )* '*/'
1214        {System.out.println("found method "+$COMMENT.text);}
1215    ;
1216
1217  $enclosinglexerrule scope does not exist.  Use text or setText() here.
1218
1219June 11, 2006
1220
1221* Single return values are initialized now to default or to your spec.
1222
1223* cleaned up input stream stuff.  Added ANTLRReaderStream, ANTLRInputStream
1224  and refactored.  You can specify encodings now on ANTLRFileStream (and
1225  ANTLRInputStream) now.
1226
1227* You can set text local var now in a lexer rule and token gets that text.
1228  start/stop indexes are still set for the token.
1229
1230* Changed lexer slightly.  Calling a nonfragment rule from a
1231  nonfragment rule does not set the overall token.
1232
1233June 10, 2006
1234
1235* Fixed bug where unnecessary escapes yield char==0 like '\{'.
1236
1237* Fixed analysis bug.  This grammar didn't report a recursion warning:
1238x   : y X
1239    | y Y
1240    ;
1241y   : L y R
1242    | B
1243    ;
1244  The DFAState.equals() method was messed up.
1245
1246* Added @synpredgate {...} action so you can tell ANTLR how to gate actions
1247  in/out during syntactic predicate evaluation.
1248
1249* Fuzzy parsing should be more efficient.  It should backtrack over a rule
1250  and then rewind and do it again "with feeling" to exec actions.  It was
1251  actually doing it 3x not 2x.
1252
1253June 9, 2006
1254
1255* Gutted and rebuilt the action translator for $x.y, $x::y, ...
1256  Uses ANTLR v3 now for the first time inside v3 source. :)
1257  ActionTranslator.java
1258
1259* Fixed a bug where referencing a return value on a rule didn't work
1260  because later a ref to that rule's predefined properties didn't
1261  properly force a return value struct to be built.  Added unit test.
1262
1263June 6, 2006
1264
1265* New DFA mechanisms.  Cyclic DFA are implemented as state tables,
1266  encoded via strings as java cannot handle large static arrays :(
1267  States with edges emanating that have predicates are specially
1268  treated.  A method is generated to do these states.  The DFA
1269  simulation routine uses the "special" array to figure out if the
1270  state is special.  See March 25, 2006 entry for description:
1271  http://www.antlr.org/blog/antlr3/codegen.tml.  analysis.DFA now has
1272  all the state tables generated for code gen.  CyclicCodeGenerator.java
1273  disappeared as it's unneeded code. :)
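
  One way to pack a state table into a string is as (count, value) char
  pairs that get expanded once at class-load time.  The sketch below
  assumes that run-length layout for illustration; the entry above only
  says the tables are encoded via strings, and the names here are made up.

```java
final class EncodedTableSketch {
    /**
     * Expand a string of (count, value) char pairs into a short[].
     * Assumes an even-length string; value 0xFFFF decodes to -1.
     */
    static short[] unpack(String encoded) {
        int size = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            size += encoded.charAt(i); // sum of all run lengths
        }
        short[] data = new short[size];
        int di = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            char count = encoded.charAt(i);
            char value = encoded.charAt(i + 1);
            for (int j = 0; j < count; j++) {
                data[di++] = (short) value;
            }
        }
        return data;
    }
}
```

  For example, the pairs (3,5) and (1,0xFFFF) expand to {5, 5, 5, -1}.
  A string constant sidesteps the large static array initializers that
  Java handles badly.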
1274
1275* Internal general clean up of the DFA.states vs uniqueStates thing.
1276  Fixed lookahead decisions no longer fill uniqueStates.  Waste of
1277  time.  Also noted that when adding sem pred edges, I didn't check
1278  for state reuse.  Fixed.
1279
1280June 4, 2006
1281
1282* When resolving ambig DFA states predicates, I did not add the new states
1283  to the list of unique DFA states.  No observable effect on output except
1284  that DFA state numbers were not always contiguous for predicated decisions.
1285  I needed this fix for new DFA tables.
1286
12873.0ea10 - June 2, 2006
1288
1289June 2, 2006
1290
1291* Improved grammar stats and added syntactic pred tracking.
1292
1293June 1, 2006
1294
1295* Due to a type mismatch, the DebugParser.recoverFromMismatchedToken()
1296  method was not called.  Debug events for mismatched token error
1297  notification were not sent to ANTLRWorks probably
1298
1299* Added getBacktrackingLevel() for any recognizer; needed for profiler.
1300
1301* Only writes profiling data for antlr grammar analysis with -profile set
1302
1303* Major update and bug fix to (runtime) Profiler.
1304
1305May 27, 2006
1306
1307* Added Lexer.skip() to force lexer to ignore current token and look for
1308  another; no token is created for current rule and is not passed on to
1309  parser (or other consumer of the lexer).
1310
1311* Parsers are much faster now.  I removed use of java.util.Stack for pushing
1312  follow sets and use a hardcoded array stack instead.  Dropped from
1313  5900ms to 3900ms for parse+lex time parsing entire java 1.4.2 source.  Lex
1314  time alone was about 1500ms.  Just looking at parse time, we get about 2x
1315  speed improvement. :)
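
  The array stack idea, sketched in isolation.  Field names, the initial
  size of 100, and the doubling policy are assumptions of this sketch,
  not a copy of the runtime code.

```java
final class FollowStackSketch {
    private Object[] following = new Object[100]; // assumed initial size
    private int fsp = -1;                         // stack pointer; -1 means empty

    void push(Object followSet) {
        if (fsp + 1 >= following.length) {
            // Grow by doubling; no per-push synchronization or boxing.
            Object[] bigger = new Object[following.length * 2];
            System.arraycopy(following, 0, bigger, 0, following.length);
            following = bigger;
        }
        following[++fsp] = followSet;
    }

    Object pop() {
        if (fsp < 0) {
            throw new IllegalStateException("empty follow stack");
        }
        Object top = following[fsp];
        following[fsp--] = null; // drop the reference so it can be collected
        return top;
    }

    int depth() {
        return fsp + 1;
    }
}
```

  java.util.Stack extends Vector, so every push/pop is synchronized; a
  bare array avoids that cost on each rule invocation.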
1316
1317May 26, 2006
1318
1319* Fixed NFA construction so it generates NFA for (A*)* such that ANTLRWorks
1320  can display it properly.
1321
1322May 25, 2006
1323
1324* added abort method to Grammar so AW can terminate the conversion if it's
1325  taking too long.
1326
1327May 24, 2006
1328
1329* added method to get left recursive rules from grammar without doing full
1330  grammar analysis.
1331
1332* analysis, code gen not attempted if serious error (like
1333  left-recursion or missing rule definition) occurred while reading
1334  the grammar in and defining symbols.
1335
1336* added amazing optimization; reduces analysis time by 90% for java
1337  grammar; simple IF statement addition!
1338
13393.0ea9 - May 20, 2006
1340
1341* added global k value for grammar to limit lookahead for all decisions unless
1342overridden in a particular decision.
1343
1344* added failsafe so that any decision taking longer than 2 seconds to create
1345the DFA will fall back on k=1.  Use -ImaxtimeforDFA n (in ms) to set the time.
1346
1347* added an option (turned off for now) to use multiple threads to
1348perform grammar analysis.  Not much help on a 2-CPU computer as
1349garbage collection seems to peg the 2nd CPU already. :( Gotta wait for
1350a 4 CPU box ;)
1351
1352* switched from #src to // $ANTLR src directive.
1353
1354* CommonTokenStream.getTokens() looked past end of buffer sometimes. fixed.
1355
1356* unicode literals didn't really work in DOT output and generated code. fixed.
1357
1358* fixed the unit test rig so it compiles nicely with Java 1.5
1359
1360* Added ant build.xml file (reads build.properties file)
1361
1362* predicates sometimes failed to compile/eval properly due to missing (...)
1363  in IF expressions.  Forced (..)
1364
1365* (...)? with only one alt were not optimized.  Was:
1366
1367        // t.g:4:7: ( B )?
1368        int alt1=2;
1369        int LA1_0 = input.LA(1);
1370        if ( LA1_0==B ) {
1371            alt1=1;
1372        }
1373        else if ( LA1_0==-1 ) {
1374            alt1=2;
1375        }
1376        else {
1377            NoViableAltException nvae =
1378                new NoViableAltException("4:7: ( B )?", 1, 0, input);
1379            throw nvae;
1380        }
1381
1382is now:
1383
1384        // t.g:4:7: ( B )?
1385        int alt1=2;
1386        int LA1_0 = input.LA(1);
1387        if ( LA1_0==B ) {
1388            alt1=1;
1389        }
1390
1391  Smaller, faster and more readable.
1392
1393* Allow manual init of return values now:
1394  functionHeader returns [int x=3*4, char (*f)()=null] : ... ;
1395
1396* Added optimization for DFAs that fixed a codegen bug with rules in lexer:
1397   EQ			 : '=' ;
1398   ASSIGNOP		 : '=' | '+=' ;
  EQ is a subset of other rule.  It did not give an error, which is
  correct, but generated bad code.
1401
1402* ANTLR was sending column not char position to ANTLRWorks.
1403
1404* Bug fix: location 0, 0 emitted for synpreds and empty alts.
1405
* debugging event handshake now sends grammar file name.  Added
  getGrammarFileName() to recognizers.  Java.stg generates it:
1407
1408    public String getGrammarFileName() { return "<fileName>"; }
1409
1410* tree parsers can do arbitrary lookahead now including backtracking.  I
1411  updated CommonTreeNodeStream.
1412
1413* added events for debugging tree parsers:
1414
1415	/** Input for a tree parser is an AST, but we know nothing for sure
1416	 *  about a node except its type and text (obtained from the adaptor).
1417	 *  This is the analog of the consumeToken method.  Again, the ID is
1418	 *  the hashCode usually of the node so it only works if hashCode is
1419	 *  not implemented.
1420	 */
1421	public void consumeNode(int ID, String text, int type);
1422
1423	/** The tree parser looked ahead */
1424	public void LT(int i, int ID, String text, int type);
1425
1426	/** The tree parser has popped back up from the child list to the
1427	 *  root node.
1428	 */
1429	public void goUp();
1430
	/** The tree parser has descended to the first child of the current
1432	 *  root node.
1433	 */
1434	public void goDown();
1435
1436* Added DebugTreeNodeStream and DebugTreeParser classes
1437
* Added ctor because the debug tree node stream will need to ask questions
  about nodes and, since nodes are just Object, it needs an adaptor to decode
  the nodes and get text/type info for the debugger.
1439
1440public CommonTreeNodeStream(TreeAdaptor adaptor, Tree tree);
1441
1442* added getter to TreeNodeStream:
1443	public TreeAdaptor getTreeAdaptor();
1444
1445* Implemented getText/getType in CommonTreeAdaptor.
1446
1447* Added TraceDebugEventListener that can dump all events to stdout.
1448
* I broke down and made Tree implement getText
1450
1451* tree rewrites now gen location debug events.
1452
1453* added AST debug events to listener; added blank listener for convenience
1454
1455* updated debug events to send begin/end backtrack events for debugging
1456
1457* with a : (b->b) ('+' b -> ^(PLUS $a b))* ; you get b[0] each time as
1458  there is no loop in rewrite rule itself.  Need to know context that
1459  the -> is inside the rule and hence b means last value of b not all
1460  values.
1461
1462* Bug in TokenRewriteStream; ops at indexes < start index blocked proper op.
1463
1464* Actions in ST rewrites "-> ({$op})()" were not translated
1465
1466* Added new action name:
1467
1468@rulecatch {
1469catch (RecognitionException re) {
1470    reportError(re);
1471    recover(input,re);
1472}
1473catch (Throwable t) {
1474    System.err.println(t);
1475}
1476}
1477Overrides rule catch stuff.
1478
1479* Isolated $ refs caused exception
1480
14813.0ea8 - March 11, 2006
1482
1483* added @finally {...} action like @init for rules.  Executes in
1484  finally block (java target) after all other stuff like rule memoization.
  No code changes needed; ST just refs a new action:
1486      <ruleDescriptor.actions.finally>
1487
1488* hideous bug fixed: PLUS='+' didn't result in '+' rule in lexer
1489
1490* TokenRewriteStream didn't do toString() right when no rewrites had been done.
1491
1492* lexer errors in interpreter were not printed properly
1493
1494* bitsets are dumped in hex not decimal now for FOLLOW sets
1495
1496* /* epsilon */ is not printed now when printing out grammars with empty alts
1497
1498* Fixed another bug in tree rewrite stuff where it was checking that elements
1499  had at least one element.  Strange...commented out for now to see if I can remember what's up.
1500
1501* Tree rewrites had problems when you didn't have x+=FOO variables.  Rules
1502  like this work now:
1503
1504  a : (x=ID)? y=ID -> ($x $y)?;
1505
1506* filter=true for lexers turns on k=1 and backtracking for every token
1507  alternative.  Put the rules in priority order.
1508
1509* added getLine() etc... to Tree to support better error reporting for
1510  trees.  Added MismatchedTreeNodeException.
1511
1512* $templates::foo() is gone.  added % as special template symbol.
1513  %foo(a={},b={},...) ctor (even shorter than $templates::foo(...))
1514  %({name-expr})(a={},...) indirect template ctor reference
1515
1516  The above are parsed by antlr.g and translated by codegen.g
1517  The following are parsed manually here:
1518
1519  %{string-expr} anonymous template from string expr
  %{expr}.y = z; set template attribute y of StringTemplate-typed expr to z
1521  %x.y = z; set template attribute y of x (always set never get attr)
1522            to z [languages like python without ';' must still use the
1523            ';' which the code generator is free to remove during code gen]
1524
1525* -> ({expr})(a={},...) notation for indirect template rewrite.
1526  expr is the name of the template.
1527
* $x[i]::y and $x[-i]::y notation for accessing absolute scope stack
1529  indexes and relative negative scopes.  $x[-1]::y is the y attribute
1530  of the previous scope (stack top - 1).
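
  The index arithmetic can be sketched like this, assuming the scope
  stack is a list with the oldest scope at index 0.  The helper name is
  hypothetical and this is not the generated code.

```java
import java.util.List;

final class ScopeIndexSketch {
    /**
     * i >= 0 is an absolute index from the bottom of the stack;
     * i < 0 is relative to the top, so -1 is the scope just below the top.
     */
    static <T> T scopeAt(List<T> stack, int i) {
        int index = (i >= 0) ? i : stack.size() - 1 + i;
        return stack.get(index);
    }
}
```

  For a stack [outer, middle, inner], scopeAt(stack, 0) is outer and
  scopeAt(stack, -1) is middle (stack top - 1).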
1531
1532* filter=true mode for lexers; can do this now...upon mismatch, just
1533  consumes a char and tries again:
1534lexer grammar FuzzyJava;
1535options {filter=true;}
1536
1537FIELD
1538    :   TYPE WS? name=ID WS? (';'|'=')
1539        {System.out.println("found var "+$name.text);}
1540    ;
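
  The "consume a char and try again" loop can be imitated outside ANTLR
  to see the effect.  The sketch below uses java.util.regex as a stand-in
  for the FIELD rule; the pattern (only int and String as types) and all
  names are assumptions for illustration, not what the lexer generates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FuzzyScanSketch {
    // Hypothetical stand-in for: TYPE WS name=ID WS? (';'|'=')
    private static final Pattern FIELD =
        Pattern.compile("(?:int|String)\\s+([A-Za-z_]\\w*)\\s*[;=]");

    /** Collect field names, skipping over anything that does not match. */
    static List<String> fieldNames(String input) {
        List<String> names = new ArrayList<>();
        Matcher m = FIELD.matcher(input);
        int p = 0;
        while (p < input.length()) {
            m.region(p, input.length());
            if (m.lookingAt()) {
                names.add(m.group(1)); // matched a field at position p
                p = m.end();
            } else {
                p++; // mismatch: consume one char and try again
            }
        }
        return names;
    }
}
```

  Given "/* junk */ int count = 0; foo(); String name;" the sketch finds
  count and name and silently skips the rest.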
1541
1542* refactored char streams so ANTLRFileStream is now a subclass of
1543  ANTLRStringStream.
1544
* char streams for lexer now allow nested backtracking in lexer.
1546
1547* added TokenLabelType for lexer/parser for all token labels
1548
1549* line numbers for error messages were not updated properly in antlr.g
1550  for strings, char literals and <<...>>
1551
1552* init action in lexer rules was before the type,start,line,... decls.
1553
* Tree grammars can now specify output; I've only tested output=template
1555  though.
1556
1557* You can reference EOF now in the parser and lexer.  It's just token type
1558  or char value -1.
1559
1560* Bug fix: $ID refs in the *lexer* were all messed up.  Cleaned up the
1561  set of properties available...
1562
1563* Bug fix: .st not found in rule ref when rule has scope:
1564field
1565scope {
1566	StringTemplate funcDef;
1567}
1568    :   ...
1569	{$field::funcDef = $field.st;}
1570    ;
1571it gets field_stack.st instead
1572
1573* return in backtracking must return retval or null if return value.
1574
1575* $property within a rule now works like $text, $st, ...
1576
1577* AST/Template Rewrites were not gated by backtracking==0 so they
1578  executed even when guessing.  Auto AST construction is now gated also.
1579
1580* CommonTokenStream was somehow returning tokens not text in toString()
1581
1582* added useful methods to runtime.BitSet and also to CommonToken so you can
1583  update the text.  Added nice Token stream method:
1584
1585  /** Given a start and stop index, return a List of all tokens in
1586   *  the token type BitSet.  Return null if no tokens were found.  This
1587   *  method looks at both on and off channel tokens.
1588   */
1589  public List getTokens(int start, int stop, BitSet types);
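
  The filtering contract in that comment, sketched on a plain int[] of
  token types.  Note this uses java.util.BitSet and returns token indexes
  rather than Token objects; both are simplifications of this sketch, as
  are the names.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class TokenFilterSketch {
    /**
     * Return the indexes in [start, stop] whose token type is in types,
     * or null if there are none.  Channels are ignored, so on- and
     * off-channel tokens are treated alike.
     */
    static List<Integer> indexesOfTypes(int[] tokenTypes, int start, int stop,
                                        BitSet types) {
        List<Integer> hits = new ArrayList<>();
        for (int i = Math.max(start, 0); i <= stop && i < tokenTypes.length; i++) {
            int type = tokenTypes[i];
            if (type >= 0 && types.get(type)) { // negative types (EOF) never match
                hits.add(i);
            }
        }
        return hits.isEmpty() ? null : hits;
    }
}
```

  For token types {4, 5, 4, 6} and a set containing only 4, the result
  over the range 0..3 is the indexes 0 and 2.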
1590
1591* literals are now passed in the .tokens files so you can ref them in
1592  tree parses, for example.
1593
1594* added basic exception handling; no labels, just general catches:
1595
1596a : {;}A | B ;
1597        exception
1598                catch[RecognitionException re] {
1599                        System.out.println("recog error");
1600                }
1601                catch[Exception e] {
1602                        System.out.println("error");
1603                }
1604
1605* Added method to TokenStream:
1606  public String toString(Token start, Token stop);
1607
1608* antlr generates #src lines in lexer grammars generated from combined grammars
1609  so error messages refer to original file.
1610
* lexers generated from combined grammars now use original formatting.
1612
1613* predicates have $x.y stuff translated now.  Warning: predicates might be
1614  hoisted out of context.
1615
1616* return values in return val structs are now public.
1617
* output=template with return values on rules was broken.  I assume return
  values with ASTs were broken too.  Fixed.
1619
16203.0ea7 - December 14, 2005
1621
1622* Added -print option to print out grammar w/o actions
1623
1624* Renamed BaseParser to be BaseRecognizer and even made Lexer derive from
1625  this; nice as it now shares backtracking support code.
1626
1627* Added syntactic predicates (...)=>.  See December 4, 2005 entry:
1628
1629  http://www.antlr.org/blog/antlr3/lookahead.tml
1630
1631  Note that we have a new option for turning off rule memoization during
1632  backtracking:
1633
1634  -nomemo        when backtracking don't generate memoization code
1635
1636* Predicates are now tested in order that you specify the alts.  If you
1637  leave the last alt "naked" (w/o pred), it will assume a true pred rather
1638  than union of other preds.
1639
1640* Added gated predicates "{p}?=>" that literally turn off a production whereas
1641disambiguating predicates are only hoisted into the predictor when syntax alone
1642is not sufficient to uniquely predict alternatives.
1643
1644A : {p}?  => "a" ;
1645B : {!p}? => ("a"|"b")+ ;
1646
1647* bug fixed related to predicates in predictor
1648lexer grammar w;
1649A : {p}? "a" ;
1650B : {!p}? ("a"|"b")+ ;
1651DFA is correct.  A state splits for input "a" on the pred.
1652Generated code though was hosed.  No pred tests in prediction code!
1653I added testLexerPreds() and others in TestSemanticPredicateEvaluation.java
1654
1655* added execAction template in case we want to do something in front of
1656  each action execution or something.
1657
1658* left-recursive cycles from rules w/o decisions were not detected.
1659
1660* undefined lexer rules were not announced! fixed.
1661
1662* unreachable messages for Tokens rule now indicate rule name not alt. E.g.,
1663
1664  Ruby.lexer.g:24:1: The following token definitions are unreachable: IVAR
1665
1666* nondeterminism warnings improved for Tokens rule:
1667
1668Ruby.lexer.g:10:1: Multiple token rules can match input such as ""0".."9"": INT, FLOAT
1669As a result, tokens(s) FLOAT were disabled for that input
1670
1671
1672* DOT diagrams didn't show escaped char properly.
1673
1674* Char/string literals are now all 'abc' not "abc".
1675
1676* action syntax changed "@scope::actionname {action}" where scope defaults
1677  to "parser" if parser grammar or combined grammar, "lexer" if lexer grammar,
1678  and "treeparser" if tree grammar.  The code generation targets decide
1679  what scopes are available.  Each "scope" yields a hashtable for use in
1680  the output templates.  The scopes full of actions are sent to all output
1681  file templates (currently headerFile and outputFile) as attribute actions.
1682  Then you can reference <actions.scope> to get the map of actions associated
1683  with scope and <actions.parser.header> to get the parser's header action
1684  for example.  This should be very flexible.  The target should only have
1685  to define which scopes are valid, but the action names should be variable
1686  so we don't have to recompile ANTLR to add actions to code gen templates.
1687
1688  grammar T;
1689  options {language=Java;}
1690  @header { package foo; }
1691  @parser::stuff { int i; } // names within scope not checked; target dependent
1692  @members { int i; }
1693  @lexer::header {head}
1694  @lexer::members { int j; }
1695  @headerfile::blort {...} // error: this target doesn't have headerfile
1696  @treeparser::members {...} // error: this is not a tree parser
1697  a
1698  @init {int i;}
1699    : ID
1700    ;
1701  ID : 'a'..'z';
1702
1703  For now, the Java target uses members and header as a valid name.  Within a
1704  rule, the init action name is valid.
1705
1706* changed $dynamicscope.value to $dynamicscope::value even if value is defined
1707  in same rule such as $function::name where rule function defines name.
1708
1709* $dynamicscope gets you the stack
1710
1711* rule scopes go like this now:
1712
1713  rule
1714  scope {...}
1715  scope slist,Symbols;
    : ...
    ;
1718
1719* Created RuleReturnScope as a generic rule return value.  Makes it easier
1720  to do this:
1721    RuleReturnScope r = parser.program();
1722    System.out.println(r.getTemplate().toString());
1723
1724* $template, $tree, $start, etc...
1725
1726* $r.x in current rule.  $r is ignored as fully-qualified name. $r.start works too
1727
1728* added warning about $r referring to both return value of rule and dynamic scope of rule
1729
1730* integrated StringTemplate in a very simple manner
1731
1732Syntax:
1733-> template(arglist) "..."
1734-> template(arglist) <<...>>
1735-> namedTemplate(arglist)
1736-> {free expression}
1737-> // empty
1738
1739Predicate syntax:
1740a : A B -> {p1}? foo(a={$A.text})
1741        -> {p2}? foo(a={$B.text})
1742        -> // return nothing
1743
1744An arg list is just a list of template attribute assignments to actions in curlies.
1745
1746There is a setTemplateLib() method for you to use with named template rewrites.
1747
1748Use a new option:
1749
1750grammar t;
1751options {output=template;}
1752...
1753
1754This all should work for tree grammars too, but I'm still testing.
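
As a rough sketch of how the predicated rewrites above could be chosen,
assuming the predicates are tried in the order written and the first true
one wins (the class and method names here are hypothetical, not generated
code):

```java
final class PredicatedRewriteSketch {
    /**
     * Returns the template guarded by the first true predicate, or null when
     * no predicate holds (the "return nothing" alternative).
     */
    static String choose(boolean[] predicates, String[] templates) {
        for (int i = 0; i < predicates.length; i++) {
            if (predicates[i]) {
                return templates[i];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] templates = { "foo(a=$A.text)", "foo(a=$B.text)" };
        // p1 false, p2 true: the second rewrite is selected
        System.out.println(choose(new boolean[] { false, true }, templates));
        // neither predicate holds: nothing is returned
        System.out.println(choose(new boolean[] { false, false }, templates));
    }
}
```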
1755
* fixed bugs where strings were improperly escaped in exceptions, comments,
  etc.  For example, newlines came out as newlines, not the escaped version.
1757
17583.0ea6 - November 13, 2005
1759
1760* turned off -debug/-profile, which was on by default
1761
1762* completely refactored the output templates; added some missing templates.
1763
* dramatically improved infinite recursion error messages (actually,
  left recursion was never even reported before).
1766
1767* wasn't printing dangling state messages when it reanalyzes with k=1.
1768
1769* fixed a nasty bug in the analysis engine dealing with infinite recursion.
1770  Spent all day thinking about it and cleaned up the code dramatically.
1771  Bug fixed and software is more powerful and I understand it better! :)
1772
1773* improved verbose DFA nodes; organized by alt
1774
1775* got much better random phrase generation.  For example:
1776
1777 $ java org.antlr.tool.RandomPhrase simple.g program
1778 int Ktcdn ';' method wh '(' ')' '{' return 5 ';' '}'
1779
1780* empty rules like "a : ;" generated code that didn't compile due to
1781  try/catch for RecognitionException.  Generated code couldn't possibly
1782  throw that exception.
1783
1784* when printing out a grammar, such as in comments in generated code,
1785  ANTLR didn't print ast suffix stuff back out for literals.
1786
1787* This never exited loop:
1788  DATA : (options {greedy=false;}: .* '\n' )* '\n' '.' ;
1789  and now it works due to new default nongreedy .*  Also this works:
1790  DATA : (options {greedy=false;}: .* '\n' )* '.' ;
1791
* Dot star ".*" syntax didn't work; in the lexer it is nongreedy by
  default.  In the parser it is nongreedy but also k=1 by default.  Added
  unit tests.  Added blog entry to describe.
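
For readers unfamiliar with the greedy/nongreedy distinction, here is an
analogy using java.util.regex rather than ANTLR itself.  It only shows the
general idea (a greedy loop consumes as much as it can, a nongreedy one stops
as soon as what follows can match); the class name is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyVsNongreedy {
    /** Returns the first match of regex in input, or null if there is none. */
    static String firstMatch(String regex, String input) {
        // DOTALL lets '.' match newlines too, like a wildcard in a lexer rule
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "x\ny\n.";
        // Greedy: .* runs to the end, then backs up to the LAST newline.
        System.out.println(firstMatch(".*\n", input).length());  // 4
        // Nongreedy (reluctant): .*? stops at the FIRST newline.
        System.out.println(firstMatch(".*?\n", input).length()); // 2
    }
}
```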
1795
1796* ~T where T is the only token yielded an empty set but no error
1797
1798* Used to generate unreachable message here:
1799
1800  parser grammar t;
1801  a : ID a
1802    | ID
1803    ;
1804
1805  z.g:3:11: The following alternatives are unreachable: 2
1806
1807  In fact it should really be an error; now it generates:
1808
1809  no start rule in grammar t (no rule can obviously be followed by EOF)
1810
1811  Per next change item, ANTLR cannot know that EOF follows rule 'a'.
1812
1813* added error message indicating that ANTLR can't figure out what your
1814  start rule is.  Required to properly generate code in some cases.
1815
* validating semantic predicates now work (if they are false, they
  throw a new FailedPredicateException)
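
A minimal sketch of what a validating predicate check amounts to.  The
exception class below is a hypothetical stand-in defined locally so the
example is self-contained; the real runtime class and its constructor
arguments may differ.

```java
final class ValidatingPredicateSketch {
    /** Hypothetical stand-in for the runtime's FailedPredicateException. */
    static final class FailedPredicateException extends Exception {
        final String ruleName;
        final String predicateText;

        FailedPredicateException(String ruleName, String predicateText) {
            super("rule " + ruleName + " failed predicate: {" + predicateText + "}?");
            this.ruleName = ruleName;
            this.predicateText = predicateText;
        }
    }

    /** Throws when a validating predicate evaluates to false; otherwise does nothing. */
    static void checkPredicate(boolean result, String ruleName, String predicateText)
            throws FailedPredicateException {
        if (!result) {
            throw new FailedPredicateException(ruleName, predicateText);
        }
    }

    public static void main(String[] args) {
        try {
            checkPredicate(1 > 2, "a", "1 > 2");
        } catch (FailedPredicateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```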
1818
1819* two hideous bug fixes in the IntervalSet, which made analysis go wrong
1820  in a few cases.  Thanks to Oliver Zeigermann for finding lots of bugs
1821  and making suggested fixes (including the next two items)!
1822
1823* cyclic DFAs are now nonstatic and hence can access instance variables
1824
1825* labels are now allowed on lexical elements (in the lexer)
1826
1827* added some internal debugging options
1828
1829* ~'a'* and ~('a')* were not working properly; refactored antlr.g grammar
1830
18313.0ea5 - July 5, 2005
1832
* Using '\n' in a parser grammar resulted in a nonescaped version of '\n' in
  the token names table, making compilation fail.  I fixed this by
  reorganizing/cleaning up the portion of ANTLR that deals with literals.  See
  the comment in org.antlr.codegen.Target.
1834
1835* Target.getMaxCharValue() did not use the appropriate max value constant.
1836
* ALLCHAR was a constant when it should use the Target max value def.  Set
  complement for wildcard also didn't use the Target def.  Generally cleaned
  up the max char value stuff.
1838
1839* Code gen didn't deal with ASTLabelType properly...I think even the 3.0ea7 example tree parser was broken! :(
1840
1841* Added a few more unit tests dealing with escaped literals
1842
18433.0ea4 - June 29, 2005
1844
1845* tree parsers work; added CommonTreeNodeStream.  See simplecTreeParser
1846  example in examples-v3 tarball.
1847
1848* added superClass and ASTLabelType options
1849
1850* refactored Parser to have a BaseParser and added TreeParser
1851
1852* bug fix: actions being dumped in description strings; compile errors
1853  resulted
1854
18553.0ea3 - June 23, 2005
1856
1857Enhancements
1858
1859* Automatic tree construction operators are in: ! ^ ^^
1860
1861* Tree construction rewrite rules are in
1862	-> {pred1}? rewrite1
1863	-> {pred2}? rewrite2
1864	...
1865	-> rewriteN
1866
  The rewrite rules may be elements like ID, expr, $label, {node expr}
  and trees ^( <root> <children> ).  You can have (...)?, (...)*, (...)+
  subrules as well.
1870
1871  You may have rewrites in subrules not just at outer level of rule, but
1872  any -> rewrite forces auto AST construction off for that alternative
1873  of that rule.
1874
1875  To avoid cycles, copy semantics are used:
1876
1877  r : INT -> INT INT ;
1878
1879  means make two new nodes from the same INT token.
1880
  Repeated references to a rule element imply a copy for at least one
  tree:
1883
1884  a : atom -> ^(atom atom) ; // NOT CYCLE! (dup atom tree)
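
To see why the copy matters, here is a small sketch with a hypothetical Node
class (not the runtime's tree classes).  Building ^(atom atom) from a single
matched tree would make the node its own child unless one reference is
duplicated first.

```java
import java.util.ArrayList;
import java.util.List;

final class CopySemanticsSketch {
    /** Hypothetical tree node: some text plus an ordered list of children. */
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text) {
            this.text = text;
        }

        /** Deep copy of this node and everything below it. */
        Node dupTree() {
            Node copy = new Node(text);
            for (Node child : children) {
                copy.children.add(child.dupTree());
            }
            return copy;
        }
    }

    /** Builds ^(atom atom): the original becomes the root, a duplicate the child. */
    static Node rootWithDuplicateChild(Node atom) {
        Node child = atom.dupTree(); // copy first, so the child is a separate tree
        atom.children.add(child);    // adding atom itself here would create a cycle
        return atom;
    }

    public static void main(String[] args) {
        Node root = rootWithDuplicateChild(new Node("x"));
        System.out.println(root.children.get(0) != root); // true
    }
}
```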
1885
1886* $ruleLabel.tree refers to tree created by matching the labeled element.
1887
1888* A description of the blocks/alts is generated as a comment in output code
1889
1890* A timestamp / signature is put at top of each generated code file
1891
18923.0ea2 - June 12, 2005
1893
1894Bug fixes
1895
1896* Some error messages were missing the stackTrace parameter
1897
1898* Removed the file locking mechanism as it's not cross platform
1899
1900* Some absolute vs relative path name problems with writing output
1901  files.  Rules are now more concrete.  -o option takes precedence
1902  // -o /tmp /var/lib/t.g => /tmp/T.java
1903  // -o subdir/output /usr/lib/t.g => subdir/output/T.java
1904  // -o . /usr/lib/t.g => ./T.java
  // -o /tmp subdir/t.g => /tmp/subdir/T.java
  // If no -o dir is specified, just write to the location
  // where the grammar is, absolute or relative
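
The -o rules above can be sketched as a small function.  This is not the
Tool's actual code; the class and method names are hypothetical, and it
assumes Unix-style '/' paths.  It returns only the output directory, with
null standing for "no -o given".

```java
final class OutputDirRules {
    /** Returns the directory for generated files; outDir is null when -o was not given. */
    static String outputDir(String outDir, String grammarPath) {
        int slash = grammarPath.lastIndexOf('/');
        String grammarDir;
        if (slash < 0) {
            grammarDir = ".";                      // bare file name such as t.g
        } else if (slash == 0) {
            grammarDir = "/";                      // file directly under the root
        } else {
            grammarDir = grammarPath.substring(0, slash);
        }
        if (outDir == null) {
            return grammarDir;                     // no -o: write next to the grammar
        }
        if (grammarPath.startsWith("/") || slash < 0) {
            return outDir;                         // absolute grammar path: -o wins outright
        }
        return outDir + "/" + grammarDir;          // relative path: keep the subdirectory
    }

    public static void main(String[] args) {
        System.out.println(outputDir("/tmp", "/var/lib/t.g"));          // /tmp
        System.out.println(outputDir("subdir/output", "/usr/lib/t.g")); // subdir/output
        System.out.println(outputDir("/tmp", "subdir/t.g"));            // /tmp/subdir
        System.out.println(outputDir(null, "/usr/lib/t.g"));            // /usr/lib
    }
}
```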
1908
1909* does error checking on unknown option names now
1910
1911* Using just language code not locale name for error message file.  I.e.,
1912  the default (and for any English speaking locale) is en.stg not en_US.stg
1913  anymore.
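
A minimal sketch of picking the message file from the language code alone,
using the standard java.util.Locale API.  The MessageFileName class is a
made-up name for illustration; only the en.stg naming comes from the note
above.

```java
import java.util.Locale;

final class MessageFileName {
    /** Builds the message file name from the language code only, ignoring the country. */
    static String forLocale(Locale locale) {
        return locale.getLanguage() + ".stg";
    }

    public static void main(String[] args) {
        System.out.println(forLocale(Locale.US)); // en.stg
        System.out.println(forLocale(Locale.UK)); // en.stg
    }
}
```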
1914
1915* The error manager now asks the Tool to panic rather than simply doing
1916  a System.exit().
1917
1918* Lots of refactoring concerning grammar, rule, subrule options.  Now
1919  detects invalid options.
1920
19213.0ea1 - June 1, 2005
1922
1923Initial early access release
1924
1925