doc/top/top.n

   1 .ND
   2 .tr ~
   3 .ds <.
   4 .ds <,
   5 .ds >. .
   6 .ds >, ,
   7 .ds [. [
   8 .ds .] ]
   9 .TL
  10 The ACK Target Optimizer
  11 .AU
  12 H.E. Bal
  13 .AI
  14 Vrije Universiteit
  15 Wiskundig Seminarium, Amsterdam
  16 .AB
  17 The Target Optimizer is one of several optimizers that are part of
  18 the Amsterdam Compiler Kit.
  19 It operates directly on assembly code,
  20 rather than on a higher level intermediate code,
  21 as the Peephole Optimizer and Global Optimizer do.
  22 Consequently, the Target Optimizer can do optimizations
  23 that are highly machine-dependent.
  24 .PP
  25 Each target machine has its own Target Optimizer.
  26 New optimizers are generated by the Target Optimizer Generator,
  27 which uses a machine-dependent table as input.
  28 This document contains full information on how to
  29 write such a table for a new machine.
  30 It also discusses the implementation of the
  31 Target Optimizer and its generator.
  32 .AE
  33 .NH 1
  34 Introduction
  35 .PP
  36 .FS
  37 This work was supported by the
  38 Stichting Technische Wetenschappen (STW)
  39 under grant VWI03.0001.
  40 .FE
  41 This document describes the target optimizer component
  42 of the Amsterdam Compiler Kit (ACK) .
  43 .[
  44 tanenbaum staveren amsterdam toolkit
  45 .]
  46 .[
  47 tanenbaum staveren cacm
  48 .]
  49 .[
  50 tanenbaum staveren toronto
  51 .]
  52 Optimization takes place in several parts of ACK compilers,
  53 most notably in the Peephole Optimizer
  54 .[
  55 staveren peephole toplas
  56 .]
  57 and
  58 the Global Optimizer,
  59 .[
  60 bal tanenbaum global optimization
  61 .]
  62 .[
  63 bal implementation global optimizer
  64 .]
  65 which are both language- and machine-independent,
  66 and in the machine-specific code generators.
  67 .[
  68 documentation amsterdam compiler kit
  69 .]
  70 The target optimizer is the finishing touch in this sequence of
  71 optimizers.
  72 It can be used to capture those optimizations that are hard
  73 to express in the other parts of ACK.
  74 These optimizations will typically be very machine-specific.
  75 .PP
  76 The target optimizer operates on the assembly code of some target machine.
  77 Hence there is one target optimizer per machine.
  78 However, just as for the ACK code generators and assemblers,
  79 a framework has been build that allows easy generation of
  80 target optimizers out of machine-independent parts and a
  81 machine-dependent description table (see figure 1.).
  82 So the major part of the code of a target optimizer is
  83 shared among all target optimizers.
  84 .DS
  85 .ft CW
  86
  87
  88                                        |-------------------------|
  89                                        | machine-independent     |
  90                                        | code                    |
  91                                        |                         |
  92           |-----------------|          |-------------------------|
  93 descrip-  |target optimizer |          | machine-dependent code  |
  94  tion --> |generator        | ---->    | + tables                |
  95 table     |                 |          |                         |
  96           |-----------------|          |-------------------------|
  97
  98                                               target optimizer
  99 .ft R
 100
 101     Figure 1: Generation of a target optimizer.
 102
 103 .DE
 104 .PP
 105 This document focusses on the description of the machine-dependent table.
 106 In chapter 2 we give an informal introduction to the optimization
 107 algorithm and to the definition of the table format.
 108 Chapters 3 and 4 discuss the implementation of the target optimizer
 109 and the target optimizer generator.
 110 Appendix A gives full information for writing a description table.
 111 .NH 1
 112 Global structure of the target optimizer
 113 .PP
 114 The target optimizer is based on the well understood model
 115 of a \fIpeephole optimizer\fR.
 116 .[
 117 aho ullman compiler
 118 .]
 119 It contains a machine-dependent table
 120 of (pattern,replacement) pairs.
 121 Each pattern describes
 122 a sequence of one or more assembler instructions
 123 that can be replaced by zero or more equivalent, yet cheaper,
 124 instructions (the 'replacement').
 125 The optimizer maintains a \fIwindow\fR that moves over the input.
 126 At any moment, the window contains some contiguous part of the input.
 127 If the instructions in the current window match some pattern
 128 in the table,
 129 they are replaced by the corresponding replacement;
 130 else, the window moves one instruction to the right.
 131 .PP
 132 In the remainder of this section we will give an informal
 133 description of the machine-dependent table.
 134 A more precise definition is given in appendix A.
 135 We will first discuss the restrictions put on the
 136 format of the assembly code.
 137 .NH 2
 138 Assumptions about the assembly code format
 139 .PP
 140 We assume that a line of assembly code begins with an
 141 instruction \fImnemonic\fR (opcode),
 142 followed by zero or more \fIoperands\fR.
 143 The mnemonic and the first operand must be separated by a special
 144 character (e.g. a space or a tab).
 145 Likewise, the operands must be separated by a special
 146 character (e.g. a comma).
 147 These separators need not be the same for all machines.
 148 .NH 2
 149 Informal description of the machine-dependent tables
 150 .PP
 151 The major part of the table consists of (pattern,replacement) pairs
 152 called \fIentries\fR.
 153 .PP
 154 A pattern is a list of instruction descriptions.
 155 Each instruction description describes the instruction mnemonic and
 156 the operands.
 157 .PP
 158 A mnemonic is described either by a string constant or by the
 159 keyword ANY.
 160 As all entities dealt with by the target optimizer are strings,
 161 string constants do not contain quotes.
 162 A string constant matches only itself.
 163 ANY matches every instruction mnemonic.
 164 .nf
 165
 166 Examples of mnemonic descriptions:
 167 .ft CW
 168
 169         add
 170         sub.l
 171         mulw3
 172         ANY
 173 .ft R
 174 .fi
 175 .PP
 176 An operand can also be described by a string constant.
 177 .nf
 178
 179 Examples:
 180 .ft CW
 181
 182        (sp)+
 183        r5
 184        -4(r6)
 185
 186 .ft R
 187 .fi
 188 Alternatively, it can be described by means of a \fIvariable name\fR.
 189 Variables have values which are strings.
 190 They have to be declared in the table before the patterns.
 191 Each such declaration defines the name of a variable and
 192 a \fIrestriction\fR to which its value is subjected.
 193 .nf
 194 Example of variable declarations:
 195 .ft CW
 196
 197       CONST       { VAL[0] == '$' };
 198       REG         { VAL[0] == 'r' && VAL[1] >= '0' && VAL[1] <= '3' &&
 199                     VAL[2] == '\\0' };
 200       X           { TRUE };
 201
 202 .ft R
 203 .fi
 204 The keyword VAL denotes the value of the variable, which is
 205 a null-terminated string.
 206 An operand description given via a variable name matches an
 207 actual operand if the actual operand obeys the associated restriction.
 208 .nf
 209 .ft CW
 210
 211      CONST  matches   $1, $-5, $foo etc.
 212      REG    matches   r0, r1, r2 and r3
 213      X      matches   anything
 214 .ft R
 215
 216 .fi
 217 The restriction (between curly braces) may be any legal "C"
 218 .[
 219 kernighan ritchie c programming
 220 .]
 221 expression.
 222 It may also contain calls to user-defined procedures.
 223 These procedures must be added to the table after the patterns.
 224 .nf
 225
 226 Example:
 227 .ft CW
 228
 229      FERMAT_NUMBER    { VAL[0] == '$' && is_fermat_number(&VAL[1]) };
 230
 231 .ft R
 232 .fi
 233 An operand can also be described by a mixture of a string constant
 234 and a variable name.
 235 The most general form allowed is:
 236 .nf
 237
 238        string_constant1 variable_name string_constant2
 239
 240 Example:
 241 .ft CW
 242
 243        (REG)+  matches  (r0)+, (r1)+, (r2)+ and (r3)+
 244
 245 .ft R
 246 .fi
 247 Any of the three components may be omitted,
 248 so the first two forms are just special cases of the general form.
 249 The name of a variable can not be used as a string constant.
 250 In the above context, it is impossible to define an operand that
 251 matches the string "REG".
 252 This limitation is of little consequence,
 253 as the table writer is free to choose the names of variables.
 254 This approach, however, avoids the need for awkward escape sequences.
 255 .PP
 256 A pattern consists of one or more instruction descriptions
 257 (separated by a colon)
 258 followed by an optional constraint.
 259 A pattern "P1 : P2 : .. : Pn C" matches the sequence of
 260 instructions "I1 I2 .. In" if:
 261 .IP (i) 7
 262 for each i, 1 <= i <= n, Pi matches Ii, as described above;
 263 .IP (ii)
 264 multiple occurrences of the same variable name or of
 265 the keyword ANY stand for the same values throughout the pattern;
 266 .IP (iii)
 267 the optional constraint C is satisfied, i.e. it evaluates to TRUE.
 268 .LP
 269 .nf
 270 The pattern:
 271 .ft CW
 272
 273       dec REG : move.b CONST,(REG)
 274
 275 .ft R
 276 matches:
 277 .ft CW
 278
 279       dec r0 : move.b $4,(r0)
 280
 281 .ft R
 282 but not:
 283 .ft CW
 284
 285       dec r0 : move.b $4,(r1)
 286
 287 .ft R
 288 (as the variable REG matches two different strings).
 289 .fi
 290 If a pattern containing different registers must be described,
 291 extra names for a register should be declared, all sharing
 292 the same restriction.
 293 .nf
 294 Example:
 295 .ft CW
 296
 297      REG1,REG2  { VAL[0] == 'r' &&  .....  };
 298
 299      addl3 REG1,REG1,REG2 : subl2 REG2,REG1
 300 .ft R
 301 .fi
 302 .PP
 303 The optional constraint is an auxiliary "C" expression (just like
 304 the parameter restrictions).
 305 The expression may refer to the variables and to ANY.
 306 .nf
 307 Example:
 308 .ft CW
 309
 310     move REG1,REG2    { REG1[1] == REG2[1] + 1 }
 311
 312 .ft R
 313 matches
 314 .ft CW
 315
 316     move r1,r0
 317     move r2,r1
 318     move r3,r2
 319 .ft R
 320 .fi
 321 .PP
 322 The replacement part of a (pattern,replacement) table entry
 323 has the same structure as a pattern, except that:
 324 .IP (i)
 325 it may not contain an additional constraint;
 326 .IP (ii)
 327 it may be empty.
 328 .LP
 329 A replacement may also refer to the values of variables and ANY.
 330 .NH 2
 331 Examples
 332 .PP
 333 This section contains some realistic examples for
 334 optimization on PDP-11 and Vax assembly code.
 335 .NH 3
 336 Vax examples
 337 .PP
 338 Suppose the table contains the following declarations:
 339 .nf
 340
 341 .ft CW
 342          X, LOG        { TRUE };
 343          LAB           { VAL[0] == 'L' };   /* e.g. L0017 */
 344          A             { no_side_effects(VAL) };
 345          NUM           { is_number(VAL) };
 346 .ft R
 347
 348 .fi
 349 The procedure "no_side_effects" checks if its argument
 350 contains any side effects, i.e. auto increment or auto decrement.
 351 The procedure "is_number" checks if its argument contains only digits.
 352 These procedures must be supplied by the table-writer and must be
 353 included in the table.
 354 .PP
 355 .nf
 356 .ft CW
 357 \fIentry:\fP  addl3 X,A,A    -> addl2 X,A;
 358 .ft R
 359
 360 .fi
 361 This entry changes a 3-operand instruction into a cheaper  2-operand
 362 instruction.
 363 An optimization like:
 364 .nf
 365 .ft CW
 366
 367         addl3 r0,(r2)+,(r2)+   -> addl2 r0,(r2)+
 368
 369 .ft R
 370 .fi
 371 is illegal, as r2 should be incremented twice.
 372 Hence the second argument is required to
 373 be side-effect free.
 374 .PP
 375 .nf
 376 .ft CW
 377 \fIentry:\fP  addw2 $-NUM,X  -> subw2 $NUM,X;
 378 .ft R
 379
 380 .fi
 381 An instruction like "subw2 $5,r0" is cheaper
 382 than "addw2 $-5,r0",
 383 because constants in the range 0 to 63 are represented
 384 very efficiently on the Vax.
 385 .PP
 386 .nf
 387 .ft CW
 388 \fIentry:\fP  bitw $NUM,A : jneq LAB
 389                 { is_poweroftwo(NUM,LOG) }  -> jbs $LOG,A,LAB;
 390
 391 .ft R
 392 .fi
 393 A "bitw x,y" sets the condition codes to the bitwise "and" of
 394 x and y.
 395 A "jbs n,x,l" branches to l if bit n of x is set.
 396 So, for example, the following transformation is possible:
 397 .nf
 398 .ft CW
 399
 400       bitw $32,r0 : jneq L0017 ->  jbs $5,r0,L0017
 401
 402 .ft R
 403 .fi
 404 The user-defined  procedure "is_poweroftwo" checks if its first argument is
 405 a power of 2 and, if so, sets its second argument to the logarithm
 406 of the first argument. (Both arguments are strings).
 407 Note that the variable LOG is not used in the pattern itself.
 408 It is assigned a (string) value by "is_poweroftwo" and is used
 409 in the replacement.
 410 .NH 3
 411 PDP-11 examples
 412 .PP
 413 Suppose we have the following declarations:
 414 .nf
 415
 416 .ft CW
 417          X             { TRUE };
 418          A             { no_side_effects(VAL) };
 419          L1, L2        { VAL[0] == 'I' };
 420          REG           { VAL[0] == 'r' && VAL[1] >= '0' && VAL[1] <= '5' &&
 421                          VAL[2] == '\\0' };
 422
 423 .ft P
 424 .fi
 425 The implementation of "no_side_effects" may of course
 426 differ for the PDP-11 and the Vax.
 427 .PP
 428 .nf
 429 .ft CW
 430 \fIentry:\fP  mov REG,A : ANY A,X  ->  mov REG,A : ANY REG,X ;
 431 .ft R
 432
 433 .fi
 434 This entry implements register subsumption.
 435 If A and REG hold the same value (which is true after "mov REG,A")
 436 and A is used as source (first) operand, it is cheaper to use REG instead.
 437 .PP
 438 .nf
 439 .ft CW
 440 \fIentry:\fP  jeq L1 : jbr L2 : labdef L1  ->  jne L2 : labdef L1;
 441 .ft R
 442
 443 .fi
 444 The "jeq L1" is a "skip over an unconditional jump". "labdef L1"
 445 denotes the definition (i.e. defining occurrence) of label L1.
 446 As the target optimizer has to know how such a definition
 447 looks like, this must be expressed in the table (see Appendix A).
 448 .PP
 449 .nf
 450 .ft CW
 451 \fIentry:\fP  add $01,X { carry_dead(REST) }  -> inc X;
 452 .ft R
 453
 454 .fi
 455 On the PDP-11, an add-one is not equivalent to an increment.
 456 The latter does not set the carry-bit of the condition codes,
 457 while the former does.
 458 So a look-ahead is needed to see if the rest of the input uses
 459 the carry-bit before changing the condition codes.
 460 A look-ahead of one instruction is provided by
 461 the target optimizer.
 462 This will normally be sufficient for compiler-generated code.
 463 The keyword REST contains the mnemonic of the first instruction of
 464 the rest of the input.
 465 If this instruction uses the carry-bit (e.g. an adc, subc, bhis)
 466 the transformation is not allowed.
 467 .NH 1
 468 Implementation of the target optimizer
 469 .PP
 470 The target optimizer reads one input file of assembler instructions,
 471 processes it, and writes the optimized code
 472 to the output file.
 473 So it performs one pass over the input.
 474 .NH 2
 475 The window mechanism
 476 .PP
 477 The optimizer uses a \fIwindow\fR that moves over the input.
 478 It repeatedly tries to match the instructions in the window
 479 with the patterns in the table.
 480 If no match is possible, the window moves
 481 one instruction forwards (to the right).
 482 After a successful match the matched instructions are
 483 removed from the window and are replaced by the
 484 replacement part of the table entry.
 485 Furthermore, the window is moved a few instructions
 486 backwards,
 487 as it is possible that instructions that were rejected earlier now do match.
 488 For example, consider the following patterns:
 489 .DS
 490 .ft CW
 491 cmp $0, X           -> tst X ;
 492 mov REG,X : tst X   -> move REG.X ;   /* redundant test */
 493 .ft R
 494 .DE
 495 If the input is:
 496 .DS
 497 .ft CW
 498 mov r0,foo : cmp $0,foo
 499 .ft R
 500 .DE
 501 then the first instruction is initially rejected.
 502 However, after the transformation
 503 .DS
 504 .ft CW
 505 cmp $0,foo   ->  tst foo
 506 .ft R
 507 .DE
 508 the following optimization is possible:
 509 .DS
 510 .ft CW
 511 mov r0,foo : tst foo  ->  mov r0,foo
 512 .ft R
 513 .DE
 514 .PP
 515 The window is implemented as a \fIqueue\fR.
 516 Matching takes place at the head of the queue.
 517 New instructions are added at the tail.
 518 If the window is moved forwards, the instruction at the head
 519 is not yet written to the output,
 520 as it may be needed later on.
 521 Instead it is added to a second queue,
 522 the \fIbackup queue\fR.
 523 After a successful match, the entire backup queue is
 524 inserted at the front of the window queue,
 525 which effectively implements the shift backwards.
 526 .PP
 527 Both queues have the length of the longest pattern in the table.
 528 If, as a result of a forward window move,
 529 the backup queue gets full,
 530 the instruction at its head is outputted and removed.
 531 Instructions are read from the input whenever the
 532 window queue contains fewer elements than the length
 533 of the longest pattern.
 534 .NH 2
 535 Pattern matching
 536 .PP
 537 Pattern matching is done in three steps:
 538 .IP (i) 7
 539 find patterns in the table whose instruction mnemonics
 540 match the mnemonics of the instructions in the
 541 current window;
 542 .IP (ii)
 543 check if the operands of the pattern match the operands of the
 544 instructions in the current window;
 545 .IP (iii)
 546 check if the optional constraint is satisfied.
 547 .LP
 548 For step (i) hashing is used.
 549 The mnemonic of the first instruction of the window
 550 is used to determine a list of possible patterns.
 551 Patterns starting with ANY are always tried.
 552 .PP
 553 Matching of operand descriptions against actual operands
 554 takes place as follows.
 555 The general form of an operand description is:
 556 .DS
 557 string_constant1 variable_name string_constant2
 558 .DE
 559 The actual operand should begin with string_constant1 and end
 560 on string_constant2.
 561 If so, these strings are stripped from it and the remaining string is
 562 matched against the variable.
 563 Matching a string against a variable is
 564 defined as follows:
 565 .IP 1.
 566 initially (before the entire pattern match)
 567 all variables are uninstantiated;
 568 .IP 2.
 569 matching a string against an uninstantiated variable
 570 succeeds if the restriction associated with the variable is
 571 satisfied.
 572 As a side effect, it causes the variable to be instantiated to
 573 the string;
 574 .IP 3.
 575 matching a string against an instantiated variable succeeds
 576 only if the variable was instantiated to the same string.
 577 .LP
 578 Matching an actual mnemonic against the keyword ANY is defined likewise.
 579 .PP
 580 The matching scheme implements the requirement that multiple occurrences
 581 of the same variable name or of the keyword ANY should
 582 stand for the same values throughout the entire pattern
 583 (see section 2.).
 584 .PP
 585 Both the parameter restriction of 2. and the constraint of step (iii)
 586 are checked by executing the "C" expression.
 587 .NH 2
 588 Data structures
 589 .PP
 590 The most important data structure is the representation
 591 of the input instructions.
 592 For every instruction we use two representations:
 593 .IP (i)
 594 the textual representation,
 595 i.e. the exact code as it appeared in the input;
 596 .IP (ii)
 597 a structural representation,
 598 containing the opcode and the operands.
 599 .LP
 600 The opcode of an instruction is determined as soon as it is read.
 601 If the line contains a label definition, the opcode is set
 602 to "labdef", so a label definition is treated like a normal
 603 instruction.
 604 .PP
 605 The operands of an instruction are not determined until
 606 they are needed, i.e. until step (i) of the pattern matching
 607 process has succeeded.
 608 For every instruction we keep track of a \fIstate\fR.
 609 After the opcode has successfully been determined,
 610 the state is OPC_ONLY.
 611 Once the operands have been recognized, the state is set to DONE.
 612 If the opcode or operands can not be determined,
 613 or if the instruction cannot be optimized for any other
 614 reason (see Appendix A), the state is set to JUNK
 615 and any attempt to match it will fail.
 616 .PP
 617 For each table entry we record the following information:
 618 .IP (i) 7
 619 the length of the pattern (i.e. the number of instruction descriptions)
 620 .IP (ii)
 621 a description of the instructions of the pattern
 622 .IP (iii)
 623 the length of the replacement
 624 .IP (iv)
 625 a description of the instructions of the replacement.
 626 .LP
 627 The description of an instruction consists of:
 628 .IP (i)
 629 the opcode
 630 .IP (ii)
 631 for each operand, a description of the operand.
 632 .LP
 633 The description of an operand of the form:
 634 .DS
 635 string_constant1 variable_name string_constant2
 636 .DE
 637 contains:
 638 .IP (i)
 639 both string constants
 640 .IP (ii)
 641 the number of the variable.
 642 .LP
 643 Each declared variable is assigned a unique number.
 644 For every variable we maintain:
 645 .IP (i)
 646 its state (instantiated or not instantiated)
 647 .IP (ii)
 648 its current value (a string).
 649 .LP
 650 The restrictions on variables and the constraints are stored
 651 in a switch-statement,
 652 indexed by variable number and entry number respectively.
 653 .NH 1
 654 Implementation of the target optimizer generator
 655 .PP
 656 The target optimizer generator (\fItopgen\fR)
 657 reads a target machine description table and produces
 658 two files:
 659 .IP gen.h: 9
 660 contains macro definitions for
 661 machine parameters that were changed
 662 in the parameter section of the table (see appendix A)
 663 and for some attributes derived from the table
 664 (longest pattern, number of patterns, number
 665 of variables).
 666 .IP gen.c:
 667 contains the entry description tables,
 668 code for checking the parameter restrictions and constraints
 669 (switch statements)
 670 and the user-defined procedures.
 671 .LP
 672 These two files are compiled together with some machine-independent
 673 files to produce a target optimizer.
 674 .PP
 675 Topgen is implemented using
 676 the LL(1) parser generator system LLgen ,
 677 .[
 678 jacobs topics parser generation
 679 .]
 680 a powerful tool of the Amsterdam Compiler Kit.
 681 This system provides a flexible way of describing the syntax of the tables.
 682 The syntactical description of the table format included
 683 in Appendix A was derived from the LLgen syntax rules.
 684 .PP
 685 The parser uses a simple, hand-written, lexical analyzer (scanner).
 686 The scanner returns a single character in most cases.
 687 The recognition of identifiers is left to the parser, as
 688 this eases the analysis of operand descriptions.
 689 Comments are removed from the input by the scanner,
 690 but white space is passed to the parser,
 691 as it is meaningful in some contexts (it separates the
 692 opcode description from the description of the first operand).
 693 .PP
 694 Topgen maintains two symbol tables, one for variable names and one
 695 for tunable parameters.
 696 The symbol tables are organized as binary trees.
 697 .bp
 698 .NH 1
 699 References
 700 .[
 701 $LIST$
 702 .]
 703 .bp
 704 .SH
 705 Appendix A
 706 .PP
 707 In this appendix we present a complete definition of the target
 708 optimizer description table format.
 709 This appendix is intended for table-writers.
 710 We use syntax rules for the description of the table format.
 711 The following notation is used:
 712 .TS
 713 center;
 714 l l.
 715 { a }   zero or more of a
 716 [ a ]   zero or one of a
 717 a b     a followed by b
 718 a | b   a or b
 719 .TE
 720 Terminals are given in quotes, as in ';'.
 721 .PP
 722 The table may contain white space and comment at all reasonable places.
 723 Comments are as in "C", so they begin with /* and end on */.
 724 Identifiers are sequences of letters, digits and the underscore ('_'),
 725 beginning with a letter.
 726 .PP
 727 .DS
 728 .ft CW
 729 table   ->   {parameter_line} '%%;' {variable_declaration} '%%;'
 730              {entry} '%%;' user_routines.
 731 .ft R
 732 .DE
 733 A table consists of four sections, containing machine-dependent
 734 constants, variable declarations, pattern rules and
 735 user-supplied subroutines.
 736 .PP
 737 .DS
 738 .ft CW
 739 parameter_line ->  identifier value ';' .
 740 .ft R
 741 .DE
 742 A parameter line defines some attributes of the target machines
 743 assembly code.
 744 For unspecified parameters default values apply.
 745 The names of the parameters and the corresponding defaults
 746 are shown in table 1.
 747 .TS
 748 center;
 749 l l.
 750 OPC_TERMINATOR  ' '
 751 OP_SEPARATOR    ','
 752 LABEL_STARTER   'I'
 753 LABEL_TERMINATOR        ':'
 754 MAXOP   2
 755 MAXOPLEN        25
 756 MAX_OPC_LEN     10
 757 MAXVARLEN       25
 758 MAXLINELEN      100
 759 PAREN_OPEN      not defined
 760 PAREN_CLOSE     not defined
 761 .TE
 762 .ce 1
 763 table 1: parameter names and defaults
 764 .DE
 765 The OPC_TERMINATOR is the character that separates the instruction
 766 mnemonic from the first operand (if any).
 767 The OP_SEPARATOR separates adjacent operands.
 768 A LABEL_STARTER is the first character of an instruction label.
 769 (Instruction labels are assumed to start with the same character).
 770 The LABEL_TERMINATOR is the last character of a label definition.
 771 It is assumed that this character is not used in an applied
 772 occurrence of the label identifier.
 773 For example, the defining occurrence may be "I0017:"
 774 and the applied occurrence may be "I0017"
 775 as in "jmp I0017".
 776 MAXOP defines the maximum number of operands an instruction can have.
 777 MAXOPLEN is the maximum length (in characters) of an operand.
 778 MAX_OPC_LEN is the maximum length of an instruction opcode.
 779 MAXVARLEN is the maximum length of a declared string variable.
 780 As variables may be set by user routines (see "bitw" example for
 781 the Vax) the table-writer must have access to this length and
 782 must be able to change it.
 783 MAXLINELEN denotes the maximum length of a line of assembly code.
 784 PAREN_OPEN and PAREN_CLOSE must be used when the operand separator can also
 785 occur within operands, between parentheses of some kind. In this case,
 786 PAREN_OPEN must be set to a string containing the opening parentheses, and
 787 PAREN_CLOSE must be set to a string containing the closing parentheses.
 788 .PP
 789 If a line of assembly code violates any of the assumptions or
 790 exceeds some limit,
 791 the line is not optimized.
 792 Optimization does, however, proceed with the rest of the input.
 793 .PP
 794 .DS
 795 .ft CW
 796 variable_declaration  -> identifier {',' identifier} restriction ';' .
 797
 798 restriction           ->  '{' anything '}' .
 799 .ft R
 800 .DE
 801 A variable declaration declares one or more string variables
 802 that may be used in the patterns and in the replacements.
 803 If a variable is used as part of an operand description in
 804 a pattern, the entire pattern can only match if the
 805 restriction evaluates to TRUE.
 806 If the pattern does match, the variable is assigned the matching
 807 part of the actual operand.
 808 Variables that are not used in a pattern are initialized to
 809 null-strings and may be assigned a value in the constraint-part of
 810 the pattern.
 811 .PP
 812 The restriction must be a legal "C" expression.
 813 It may not contain a closing bracket ('}').
 814 Inside the expression, the name VAL stands for the part of the actual
 815 (matching) operand.
 816 The expression may contain calls to procedures that are defined in the
 817 user-routines section.
 818 .DS
 819 .ft CW
 820 entry             ->  pattern '->' replacement ';' .
 821
 822 pattern           ->  instruction_descr
 823                       { ':' instruction_descr }
 824                       constraint .
 825
 826 replacement       ->  [ instruction_descr { ':' instruction_descr } ] .
 827
 828 instruction_descr -> opcode
 829                      white
 830                      [ operand_descr { ',' operand_descr } ] .
 831
 832 constraint        -> '{' anything '}' .
 833
 834 operand_descr     -> [ string_constant ]
 835                      [ variable_name ]
 836                      [ string_constant ] .
 837
 838 variable_name     -> identifier .
 839
 840 opcode            -> anything .
 841 .ft R
 842 .DE
 843 The symbol 'white' stands for white space (space or tab).
 844 An opcode can be any string not containing the special
 845 symbols ';', '{', '}', ':', ',', '->' or white space.
 846 To be recognized, it must begin with a letter.
 847 The opcode should either be a mnemonic of a target machine
 848 instruction or it should be one of the keywords ANY and labdef.
 849 ANY matches any actual opcode. labdef matches only label definitions.
 850 .PP
 851 If an operand description contains an identifier (as defined earlier),
 852 it is checked if the identifier is the name of a declared variable.
 853 This effects the semantics of the matching rules for the operand,
 854 as described in section 2.
 855 An operand may contain at most one such variable name.
 856 .PP
 857 The constraint must be a legal "C" expression, just as the operand restriction.
 858 It may call user-defined procedures and use or change the value of
 859 declared variables.
 860 It may also use the string variable REST,
 861 which contains the mnemonic of the first instruction of the
 862 rest of the input. (REST is a null-string if this mnemonic can
 863 not be determined).
 864 .DS
 865 .ft CW
 866 user_routines -> anything .
 867 .ft R
 868 .DE
 869 The remainder of the table consists of user-defined subroutines.