1 .\" $Id: LLgen.n,v 1.16 1994/12/20 12:40:21 ceriel Exp $
2 .\" Run this paper off with
3 .\" refer [options] -p LLgen.refs LLgen.doc | [n]eqn | tbl | (nt)roff -ms
18 LLgen, an extended LL(1) parser generator
22 Dept. of Mathematics and Computer Science
24 Amsterdam, The Netherlands
26 \fILLgen\fR provides a
27 tool for generating an efficient recursive descent parser
28 with no backtrack from
29 an Extended Context Free syntax.
31 user specifies the syntax, together with code
32 describing actions associated with the parsing process.
34 turns this specification into a number of subroutines that handle the
37 The grammar may be ambiguous.
38 \fILLgen\fR contains both static and dynamic facilities
39 to resolve these ambiguities.
41 The specification can be split into several files, for each of
42 which \fILLgen\fR generates an output file containing the
43 corresponding part of the parser.
44 Furthermore, only output files that differ from their previous
46 Other output files are not affected in any
48 This allows the user to recompile only those output files that have
51 The subroutine produced by \fILLgen\fR calls a user supplied routine
52 that must return the next token. This way, the input to the
53 parser can be split into single characters or higher level
56 An error recovery mechanism is generated almost completely
58 It is based on so called \fBdefault choices\fR, which are
59 implicitly or explicitly specified by the user.
61 \fILLgen\fR has succesfully been used to create recognizers for
62 Pascal, C, and Modula-2.
68 provides a tool for generating an efficient recursive
69 descent parser with no backtrack from an Extended Context Free
75 for the rest of this document.
76 It is assumed that the reader has some knowledge of LL(1) grammars and
77 recursive descent parsers.
78 For a survey on the subject, see reference
83 Extended LL(1) parsers are an extension of LL(1) parsers. They are
84 derived from an Extended Context-Free (ECF) syntax instead of a Context-Free
86 ECF syntax is described in section 2.
87 Section 3 provides an outline of a
88 specification as accepted by
89 \fILLgen\fR and also discusses the lexical conventions of
90 grammar specification files.
91 Section 4 provides a description of the way the
94 actions with the syntax. These actions must be written in the programming
99 which also is the target language of \fILLgen\fR.
100 The error recovery technique is discussed in section 5.
101 This section also discusses what the user can do about it.
103 the facilities \fILLgen\fR offers
104 to resolve ambiguities and conflicts.
105 \fILLgen\fR offers facilities to resolve them both at parser
106 generation time and during the execution of \fILLparse\fR.
107 Section 7 discusses the
110 It also discusses the lexical analyzer that must be supplied by the
112 This lexical analyzer must read the input stream and break it
113 up into basic input items, called \fBtokens\fR for the rest of
115 Appendix A gives a summary of the
118 Appendix B gives an example.
119 It is very instructive to compare this example with the one
124 It demonstrates the struggle \fILLparse\fR and other LL(1)
125 parsers have with expressions.
126 Appendix C gives an example of the \fILLgen\fR features
127 allowing the user to recompile only those output files that
128 have changed, using the \fImake\fR program.
133 The Extended Context-Free Syntax
135 The extensions of an ECF syntax with respect to an ordinary CF syntax are:
137 An ECF syntax contains the repetition operator: "N" (N represents a positive
140 An ECF syntax contains the closure set operator without and with
141 upperbound: "*" and "*N".
143 An ECF syntax contains the positive closure set operator without and with
144 upperbound: "+" and "+N".
146 An ECF syntax contains the optional operator: "?", which is a
149 An ECF syntax contains parentheses "[" and "]" which can be
152 We can describe the syntax of an ECF syntax with an ECF syntax :
159 This grammar rule states that a grammar consists of one or more
163 rule : nonterminal ':' productionrule ';'
167 A rule consists of a left hand side, the nonterminal,
169 the \fBproduce symbol\fR, followed by a production rule, followed by a
170 ";", in\%di\%ca\%ting the end of the rule.
173 productionrule : production [ '|' production ]*
177 A production rule consists of one or
178 more alternative productions separated by "|". This symbol is called the
179 \fBalternation symbol\fR.
186 A production consists of a possibly empty list of terms.
187 So, empty productions are allowed.
190 term : element repeats
194 A term is an element, possibly with a repeat specification.
199 | '[' productionrule ']'
203 An element can be a LITERAL, which basically is a single character
204 between apostrophes, it can be an IDENTIFIER, which is either a
205 nonterminal or a token, and it can be a production rule
206 between square parentheses.
210 | [ '*' | '+' ] NUMBER ?
215 These are the repeat specifications discussed above. Notice that
216 this specification may be empty.
218 The class of ECF languages
219 is identical with the class of CF languages. However, in many
220 cases recursive definitions of language features can now be
221 replaced by iterative ones. This tends to reduce the number of
222 nonterminals and gives rise to very efficient recursive descent
225 Grammar Specifications
229 grammar specification consists of an
230 ECF syntax specification.
231 Names in this syntax specification refer to either tokens or nonterminal
234 requires token names to be declared as such. This way it
235 can be avoided that a typing error in a nonterminal name causes it to
236 be accepted as a token name. The token declarations will be
238 A name will be regarded as a nonterminal symbol, unless it is declared
240 If there is no production rule for a nonterminal symbol, \fILLgen\fR
243 A grammar specification may also include some C routines,
244 for instance the lexical analyzer and an error reporting
246 Thus, a grammar specification file can contain declarations,
247 grammar rules and C-code.
249 Blanks, tabs and newlines are ignored, but may not appear in names or
251 Comments may appear wherever a name is legal (which is almost
254 /* ... */, as in C. Comments do not nest.
256 Names may be of arbitrary length, and can be made up of letters, underscore
257 "\_" and non-initial digits. Upper and lower case letters are distinct.
258 Only the first 50 characters are significant.
259 Notice however, that the names for the tokens will be used by the
261 The number of significant characters therefore depends on the
262 underlying C-implementation.
263 A safe rule is to make the identifiers distinct in the first six
264 characters, case ignored.
266 There are two kinds of tokens:
267 those that are declared and are denoted by a name,
270 A literal consists of a character enclosed in apostrophes "'".
271 The "\e" is an escape character within literals. The following escapes
278 \&'\e'' apostrophe "'"
279 \&'\e\e' backslash "\e"
283 \&'\exxx' "xxx" in octal
286 Names representing tokens must be declared before they are used.
287 This can be done using the "\fB%token\fR" keyword,
292 %token name1, name2, . . . ;
296 \fILLparse\fR is designed to recognize special nonterminal
297 symbols called \fBstart symbols\fR.
298 \fILLgen\fR allows for more than one start symbol.
299 Thus, grammars with more than one entry point are accepted.
300 The start symbols must be declared explicitly using the
301 "\fB%start\fR" keyword. It can be used whenever a declaration is
306 %start LLparse, specification ;
310 declares "specification" as a start symbol and associates the
311 identifier "LLparse" with it.
312 "LLparse" will now be the name of the C-function that must be
313 called to recognize "specification".
318 allows arbitrary insertions of actions within the right hand side
319 of a production rule in the ECF syntax. An action consists of a number of C
320 statements, enclosed in the brackets "{" and "}".
323 generates a parsing routine for each rule in the grammar. The actions
324 supplied by the user are just inserted in the proper place.
325 There may also be declarations before the statements in the
327 the "{" and "}" are copied into the target code along with the
328 action. The scope of these declarations terminates with the
329 closing bracket "}" of the action.
331 In addition to actions, it is also possible to declare local variables
332 in the parsing routine, which can then be used in the actions.
333 Such a declaration consists of a number of C variable declarations,
334 enclosed in the brackets "{" and "}". It must be placed
335 right in front of the ":" in the grammar rule.
336 The scope of these local variables consists of the complete
339 In order to facilitate communication between the actions and
341 the parsing routines can be given C-like parameters.
342 Each parameter must be declared separately, and each of these declarations must
343 end with a semicolon.
344 For the last parameter, the semicolon is optional.
350 expr(int *pval;) { int fact; } :
352 * Rule with one parameter, a pointer to an int.
353 * Parameter specifications are ordinary C declarations.
354 * One local variable, of type int.
356 factor (&fact) { *pval = fact; }
358 * factor is another nonterminal symbol.
359 * One actual parameter is supplied.
360 * Notice that the parameter passing mechanism is that
363 [ '+' factor (&fact) { *pval += fact; } ]*
365 * remember the '*' means zero or more times
371 is a rule to recognize a number of factors, separated by "+", and
372 to compute their sum.
375 generates C code, so the parameter passing mechanism is that of
376 C, as is shown in the example above.
378 Actions often manipulate attributes of the token just read.
379 For instance, when an identifier is read, its name must be
380 looked up in a symbol table.
381 Therefore, \fILLgen\fR generates code
382 such that at a number of places in the grammar rule
383 it is defined which token has last been read.
384 After a token, the last token read is this token.
385 After a "[" or a "|", the last token read is the next token to
386 be accepted by \fILLparse\fR.
387 At all other places, it is undefined which token has last been
389 The last token read is available in the global integer variable
392 The user may also specify C-code wherever a \fILLgen\fR-declaration is
394 Again, this code must be enclosed in the brackets "{" and "}".
395 This way, the user can define global declarations and
397 To avoid name-conflicts with identifiers generated by
398 \fILLgen\fR, \fILLparse\fR only uses names beginning with
399 "LL"; the user should avoid such names.
403 The error recovery technique used by \fILLgen\fR is a
404 modification of the one presented in reference
406 automatic construction error correcting
408 It is based on \fBdefault choices\fR, which just are
409 what the word says, default choices at
410 every point in the grammar where there is a
412 Thus, in an alternation, one of the productions is marked as a
413 default choice, and in a term with a non-fixed repetition
414 specification there will also be a default choice (between
415 doing the term (once more) and continuing with the rest of the
416 production in which the term appears).
418 When \fILLparse\fR detects an error after having parsed the
419 string @s@, the default choices enable it to compute one
420 syntactically correct continuation,
421 consisting of the tokens @t sub 1~...~t sub n@,
422 such that @s~t sub 1~...~t sub n@ is a string of tokens that
423 is a member of the language defined by the grammar.
424 Notice, that the computation of this continuation must
425 terminate, which implies that the default choices may not
426 invoke recursive rules.
428 At each point in this continuation, a certain number of other
429 tokens could also be syntactically correct, f.i. the token
430 @t@ is syntactically correct at point @t sub i@ in this
431 continuation, if the string @s~t sub 1~...~t sub i~t~s sub 1@
432 is a string of the language defined by the grammar for some
433 string @s sub 1@ and i >= 0.
436 containing all these tokens (including @t sub 1 ,~...,~t sub n@) is computed.
437 Next, \fILLparse\fR discards zero
438 or more tokens from its input, until a token
439 @t@ \(mo @T@ is found.
440 The error is then corrected by inserting i (i >= 0) tokens
441 @t sub 1~...~t sub i@, such that the string
442 @s~t sub 1~...~t sub i~t~s sub 1@ is a string of the language
443 defined by the grammar, for some @s sub 1@.
444 Then, normal parsing is resumed.
446 The above is difficult to implement in a recursive decent
447 parser, and is not the way \fILLparse\fR does it, but the
448 effect is the same. In fact, \fILLparse\fR maintains a list
449 of tokens that may not be discarded, which is adjusted as
450 \fILLparse\fR proceeds. This list is just a representation
451 of the set @T@ mentioned
452 above. When an error occurs, \fILLparse\fR discards tokens until
453 a token @t@ that is a member of this list is found.
454 Then, it continues parsing, following the default choices,
455 inserting tokens along the way, until this token @t@ is legal.
457 the default choices must guarantee that this will always
460 The default choices are explicitly or implicitly
461 specified by the user.
462 By default, the default choice in an alternation is the
463 alternative with the shortest possible terminal production.
464 The user can select one of the other productions in the
465 alternation as the default choice by putting the keyword
466 "\fB%default\fR" in front of it.
468 By default, for terms with a repetition count containing "*" or
469 "?" the default choice is to continue with the rest of the rule
470 in which the term appears, and
486 It is also clear, that it can never be the default choice to do
487 the term (once more), because this could cause the parser to
488 loop, inserting tokens forever.
489 However, when the user does not want the parser to skip
490 tokens that would not have been skipped if the term
491 would have been the default choice,
492 the skipping of such a term can be prevented by
493 using the keyword "\fB%persistent\fR".
494 For instance, the rule
498 commandlist : command* ;
506 commandlist : [ %persistent command ]* ;
510 The effects of this in case of a syntax error are twofold:
511 The set @T@ mentioned above will be extended as if "command" were
512 in the default production, so that fewer tokens will be
514 Also, if the first token that is not skipped is a member of the
515 subset of @T@ arising from the grammar rule for "command",
516 \fILLparse\fR will enter that rule.
517 So, in fact the default choice
518 is determined dynamically (by \fILLparse\fR).
519 Again, \fILLgen\fR checks (statically)
520 that \fILLparse\fR will always terminate, and if not,
521 \fILLgen\fR will complain.
523 An important property of this error recovery method is that,
524 once a rule is started, it will be finished.
525 This means that all actions in the rule will be executed
526 normally, so that the user can be sure that there will be no
527 inconsistencies in his data structures because of syntax
529 Also, as the method is in fact error correcting, the
530 actions in a rule only have to deal with syntactically correct
533 Ambiguities and conflicts
535 As \fILLgen\fR generates a recursive descent parser with no backtrack,
536 it must at all times be able to determine what to do,
537 based on the current input symbol.
538 Unfortunately, this cannot be done for all grammars.
539 Two kinds of conflicts can arise :
541 the grammar rule is of the form "production1 | production2",
542 and \fILLparse\fR cannot decide which production to chose.
543 This we call an \fBalternation conflict\fR.
545 the grammar rule is of the form "[ productionrule ]...",
546 where ... specifies a non-fixed repetition count,
547 and \fILLparse\fR cannot decide whether to
548 choose "productionrule" once more, or to continue.
549 This we call a \fBrepetition conflict\fR.
551 There can be several causes for conflicts: the grammar may be
552 ambiguous, or the grammar may require a more complex parser
553 than \fILLgen\fR can construct.
554 The conflicts can be examined by inspecting the verbose
555 (-\fBv\fR) option output file.
556 The conflicts can be resolved by rewriting the grammar
557 or by using \fBconflict resolvers\fR.
558 The mechanism described here is based on the attributed parsing
564 An alternation conflict can be resolved by putting an \fBif condition\fR
565 in front of the first conflicting production.
566 It consists of a "\fB%if\fR" followed by a
567 C-expression between parentheses.
568 \fILLparse\fR will then evaluate this expression whenever a
569 token is met at this point on which there is a conflict, so
570 the conflict will be resolved dynamically.
571 If the expression evaluates to
572 non-zero, the first conflicting production is chosen,
573 otherwise one of the remaining ones is chosen.
575 An alternation conflict can also be resolved using the keywords
576 "\fB%prefer\fR" or "\fB%avoid\fR". "\fB%prefer\fR"
577 is equivalent in behaviour to
578 "\fB%if\fR (1)". "\fB%avoid\fR" is equivalent to "\fB%if\fR (0)".
579 In these cases however, "\fB%prefer\fR" and "\fB%avoid\fR" should be used,
580 as they resolve the conflict statically and thus
581 give rise to better C-code.
583 A repetition conflict can be resolved by putting a \fBwhile condition\fR
584 right after the opening parentheses. This while condition
585 consists of a "\fB%while\fR" followed by a C-expression between
586 parentheses. Again, \fILLparse\fR will then
587 evaluate this expression whenever a token is met
588 at this point on which there is a conflict.
589 If the expression evaluates to non-zero, the
590 repeating part is chosen, otherwise the parser continues with
591 the rest of the rule.
592 Appendix B will give an example of these features.
594 A useful aid in writing conflict resolvers is the "\fB%first\fR" keyword.
595 It is used to declare a C-macro that forms an expression
596 returning 1 if the parameter supplied can start a specified
601 %first fmac, nonterm ;
605 declares "fmac" as a macro with one parameter, whose value
606 is a token number. If the parameter
607 X can start the nonterminal "nonterm", "fmac(X)" is true,
608 otherwise it is false.
610 The LLgen working environment
612 \fILLgen\fR generates a number of files: one for each input
613 file, and two other files: \fILpars.c\fR and \fILpars.h\fR.
614 \fILpars.h\fR contains "#-define"s for the tokennames.
615 \fILpars.c\fR contains the error recovery routines and tables.
616 Only those output files that differ from their previous version
617 are updated. See appendix C for a possible application of this
620 The names of the output files are constructed as
622 in the input file name, the suffix after the last point is
623 replaced by a "c". If no point is present in the input file
624 name, ".c" is appended to it. \fILLgen\fR checks that the
625 filename constructed this way in fact represents a previous
626 version, or does not exist already.
628 The user must provide some environment to obtain a complete
630 Routines called \fImain\fR and \fILLmessage\fR must be defined.
631 Also, a lexical analyzer must be provided.
633 The routine \fImain\fR must be defined, as it must be in every
634 C-program. It should eventually call one of the startsymbol
637 The routine \fILLmessage\fR must accept one
638 parameter, whose value is a token number, zero or -1.
640 A zero parameter indicates that the current token (the one in
641 the external variable \fILLsymb\fR) is deleted.
643 A -1 parameter indicates that the parser expected end of file, but didn't get
645 The parser will then skip tokens until end of file is detected.
647 A parameter that is a token number (a positive parameter)
649 token is to be inserted in front of the token currently in
651 The user can give the token the proper attributes.
652 Also, the user must take care, that the token currently in
653 \fILLsymb\fR is again returned by the \fBnext\fR call to the
654 lexical analyzer, with the proper attributes.
655 So, the lexical analyzer must have a facility to push back one
658 The user may also supply his own error recovery routines, or handle
659 errors differently. For this purpose, the name of a routine to be called
660 when an error occurs may be declared using the keyword \fB%onerror\fR.
661 This routine takes two parameters.
662 The first one is either the token number of the
663 token expected, or 0. In the last case, the error occurred at a choice.
664 In both cases, the routine must ensure that the next call to the lexical
665 analyser returns the token that replaces the current one. Of course,
666 that could well be the current one, in which case
668 recovers from the error.
669 The second parameter contains a list of tokens that are not skipped at the
670 error point. The list is in the form of a null-terminated array of integers,
671 whose address is passed.
673 The user must supply a lexical analyzer to read the input stream and
674 break it up into tokens, which are passed to
676 It should be an integer valued function, returning the token number.
677 The name of this function can be declared using the
678 "\fB%lexical\fR" keyword.
679 This keyword can be used wherever a declaration is legal and may appear
680 only once in the grammar specification, f.i.:
688 declares "scanner" as the name of the lexical analyzer.
689 The default name for the lexical analyzer is "yylex".
690 The reason for this funny name is that a useful tool for constructing
691 lexical analyzers is the
697 which generates a routine of that name.
699 The token numbers are chosen by \fILLgen\fR.
700 The token number for a literal
701 is the numerical value of the character in the local character set.
702 If the tokens have a name,
703 the "#\ define" mechanism of C is used to give them a value and
704 to allow the lexical analyzer to return their token numbers symbolically.
705 These "#\ define"s are collected in the file \fILpars.h\fR which
706 can be "#\ include"d in any file that needs the token-names.
707 The maximum token number chosen is defined in the macro \fILL_MAXTOKNO\fP.
709 The lexical analyzer must signal the end
710 of input to \fILLparse\fR
711 by returning a number less than or equal to zero.
713 Programs with more than one parser
715 \fILLgen\fR offers a simple facility for having more than one parser in
716 a program: in this case, the user can change the names of global procedures,
717 variables, etc, by giving a different prefix, like this:
725 The effect of this is that all global names start with XX instead of LL, for
726 the parser that has this prefix. This holds for the variables \fILLsymb\fP,
727 which now is called \fIXXsymb\fP, for the routine \fILLmessage\fP,
728 which must now be called \fIXXmessage\fP, and for the macro \fILL_MAXTOKNO\fP,
729 which is now called \fIXX_MAXTOKNO\fP.
730 \fILL.output\fP is now \fIXX.output\fP, and \fILpars.c\fP and \fILpars.h\fP
731 are now called \fIXXpars.c\fP and \fIXXpars.h\fP.
740 Appendix A : LLgen Input Syntax
742 This appendix has a description of the \fILLgen\fR input syntax,
743 as a \fILLgen\fR specification. As a matter of fact, the current
744 version of \fILLgen\fR is written with \fILLgen\fR.
749 * First the declarations of the terminals
750 * The order is not important
753 %token IDENTIFIER; /* terminal or nonterminal name */
761 %token TOKEN; /* %token */
762 %token START; /* %start */
763 %token PERSISTENT; /* %persistent */
765 %token WHILE; /* %while */
766 %token AVOID; /* %avoid */
767 %token PREFER; /* %prefer */
768 %token DEFAULT; /* %default */
769 %token LEXICAL; /* %lexical */
770 %token PREFIX; /* %prefix */
771 %token ONERROR; /* %onerror */
772 %token FIRST; /* %first */
775 * Declare LLparse to be a C-routine that recognizes "specification"
778 %start LLparse, specification;
786 IDENTIFIER ',' IDENTIFIER
789 /* Read C-declaration here */
796 IDENTIFIER ',' IDENTIFIER
810 rule : IDENTIFIER parameters? ldecl?
816 /* Read C-declaration here */
822 [ '|' simpleproduction ]*
827 [ IF '(' /* Read C-expression here */ ')'
835 /* Read action here */
837 | '[' [ WHILE '(' /* Read C-expression here */ ')' ]?
842 | IDENTIFIER parameters?
846 : '(' /* Read C-parameters here */ ')'
849 repeats : /* empty */
850 | [ '*' | '+' ] NUMBER?
859 Appendix B : An example
861 This example gives the complete \fILLgen\fR specification of a simple
862 desk calculator. It has 26 registers, labeled "a" through "z",
863 and accepts arithmetic expressions made up of the C operators
864 +, -, *, /, %, &, and |, with their usual priorities.
865 The value of the expression is
866 printed. As in C, an integer that begins with 0 is assumed to
867 be octal; otherwise it is assumed to be decimal.
869 Although the example is short and not very complicated, it
870 demonstrates the use of if and while conditions. In
871 the example they are in fact used to reduce the number of
872 nonterminals, and to reduce the overhead due to the recursion
873 that would be involved in parsing an expression with an
874 ordinary recursive descent parser. In an ordinary LL(1)
875 grammar there would be one nonterminal for each operator
876 priority. The example shows how we can do it all with one
877 nonterminal, no matter how many priority levels there are.
885 #define prio(op) (ptab[op])
888 int t_tokno; /* token number */
889 int t_tval; /* Its attribute */
890 } stok = { 0,0 }, tok;
893 int regs[26]; /* Space for the registers */
894 int ptab[128]; /* Attribute table */
897 nexttok() { /* Read next token and return it */
901 while ((c = getchar()) == ' ' || c == '\et') { /* nothing */ }
902 if (isdigit(c)) new.t_tokno = DIGIT;
903 else if (islower(c)) new.t_tokno = IDENT;
904 else new.t_tokno = c;
905 if (c >= 0) new.t_tval = ptab[c];
914 stat { int ident, val; } :
915 %if (stok = nexttok(),
917 /* The conflict is resolved by looking one further
918 * token ahead. The grammar is LL(2)
921 { ident = tok.t_tval; }
922 '=' expr(1,&val) '\en'
923 { if (!nerrors) regs[ident] = val; }
925 { if (!nerrors) printf("%d\en",val); }
929 expr(int level; int *val;) { int expr; } :
931 [ %while (prio(tok.t_tokno) >= level)
932 /* Swallow operators as long as their priority is
933 * larger than or equal to the level of this invocation
935 '+' expr(prio('+')+1,&expr)
937 /* This states that '+' groups left to right. If it
938 * should group right to left, the rule should read:
939 * '+' expr(prio('+'),&expr)
941 | '-' expr(prio('-')+1,&expr)
943 | '*' expr(prio('*')+1,&expr)
945 | '/' expr(prio('/')+1,&expr)
947 | '%' expr(prio('%')+1,&expr)
949 | '&' expr(prio('&')+1,&expr)
951 | '|' expr(prio('|')+1,&expr)
954 /* Notice the "*" here. It is important.
960 | '-' expr(MAXPRIO+1,val)
964 { *val = regs[tok.t_tval]; }
967 number(int *val;) { int base; }
969 { base = (*val=tok.t_tval)==0?8:10; }
971 { *val = base * *val + tok.t_tval; }
977 if (stok.t_tokno) { /* a token has been inserted or read ahead */
982 if (nerrors && tok.t_tokno == '\en') {
990 LLmessage(insertedtok) {
992 if (insertedtok) { /* token inserted, save old token */
995 if (insertedtok < 128) tok.t_tval = ptab[insertedtok];
1002 for (p = ptab; p < &ptab[128]; p++) *p = 0;
1003 /* for letters, their attribute is their index in the regs array */
1004 for (p = &ptab['a']; p <= &ptab['z']; p++) *p = p - &ptab['a'];
1005 /* for digits, their attribute is their value */
1006 for (p = &ptab['0']; p <= &ptab['9']; p++) *p = p - &ptab['0'];
1007 /* for operators, their attribute is their priority */
1022 Appendix C. How to use \fILLgen\fR.
1024 This appendix demonstrates how \fILLgen\fR can be used in
1025 combination with the \fImake\fR program, to make effective use
1026 of the \fILLgen\fR-feature that it only changes output files
1027 when neccessary. \fIMake\fR uses a "makefile", which
1028 is a file containing dependencies and associated commands.
1029 A dependency usually indicates that some files depend on other
1030 files. When a file depends on another file and is older than
1031 that other file, the commands associated with the dependency
1034 So, \fImake\fR seems just the program that we always wanted.
1036 is not very good in handling programs that generate more than
1038 As usual, there is a way around this problem.
1039 A sample makefile follows:
1043 # The grammar exists of the files decl.g, stat.g and expr.g.
1044 # The ".o"-files are the result of a C-compilation.
1046 GFILES = decl.g stat.g expr.g
1047 OFILES = decl.o stat.o expr.o Lpars.o
1050 # As make does'nt handle programs that generate more than one
1051 # file well, we just don't tell make about it.
1052 # We just create a dummy file, and touch it whenever LLgen is
1053 # executed. This way, the dummy in fact depends on the grammar
1055 # Then, we execute make again, to do the C-compilations and
1062 LLgen $(LLOPT) $(GFILES)
1066 $(CC) -o parser $(LDFLAGS) $(OFILES)
1068 # Some dependencies without actions :
1069 # make already knows what to do about them