1 .\" $Id: ncg.doc,v 1.11 1994/06/24 10:02:07 ceriel Exp $
5 The table driven code generator
11 Second Revised Edition
15 Dept. of Mathematics and Computer Science
17 Amsterdam, The Netherlands
19 The Amsterdam Compiler Kit is a collection of tools
20 designed to help automate the process of compiler building.
21 Part of it is a table driven code generator,
24 and a program to check and translate machine description
27 This document provides a description of the internal workings of
29 and a description of syntax and semantics of the driving table.
30 This is required reading for those wishing to write a new table.
35 Part of the Amsterdam Compiler Kit is a code generator system consisting
36 of a code generator generator (\fIcgg\fP for short) and some machine
39 reads a machine description table and creates two files,
40 tables.h and tables.c.
41 These are then used together with other C code to produce
42 a code generator for the machine at hand.
44 This in turn reads compact EM code and produces
46 The remainder of this document will first broadly describe
47 the working of the code generator,
48 then the machine table will be described after which
49 some light is shed onto
50 the internal workings of the code generator.
52 The reader is assumed to have at least a vague notion about the
53 semantics of the intermediary EM code.
54 Someone wishing to write a table for a new machine
55 should be thoroughly acquainted with EM code
56 and the assembly code of the machine at hand.
58 What has changed since version 1 ?
60 This section can be skipped by anyone not familiar with the first version.
61 It is not needed to understand the current version.
63 This paper describes the second version of the code generator system.
64 Although the code generator itself is for the main part unchanged,
65 the table format has been drastically redesigned and the opportunities
66 to make faulty tables are reduced.
67 The format is now aesthaticly more pleasing (according to \fIme\fP that is),
68 mainly because the previous version was designed for one line code rules,
69 which did not work out that way.
71 The `SCRATCH' property is now automatically generated by
76 calls and their ilk are now no longer needed
77 (read: can no longer be forgotten)
78 and all this because the table now
80 what the machine instructions look like and what arguments they
83 Checks are now made for register types, so it is no longer possible
84 to generate a `regind2' token with a floating point register as a base.
85 In general, if the instructions of the machine are correctly defined,
86 it is no longer possible to generate code that does not assemble,
87 which of course does not mean that it is not possible to generate
88 assembly code that does not do what was intended!
90 Checks are made now for missing moves, tests, coercions, etc.
91 There is a form of procedure call now to reduce table size:
92 it is no longer necessary to write the code for conditional
93 instructions six times.
95 The inreg() pseudo-function returns other results!!
97 Global overview of the workings of the code generator.
101 tries to generate good code by simulating the stack
102 of the compiled program and delaying emission of code as long
104 It also keeps track of register contents, which enables it to
105 eliminate redundant moves, and tries to eliminate redundant tests
106 by keeping information about condition code status,
107 if applicable for the machine.
110 maintains a `fake stack' containing `tokens' that are built
111 by executing the pseudo code contained in the code rules given
113 One can think of the fake stack as a logical extension of the real
114 stack the compiled program will have when run.
115 Alternatively one can think of the real stack as an infinite extension
116 at the bottom of the fake stack.
117 Both ways, the concatenation of the real stack and the fake stack
118 will be the stack as it would have been on a real EM machine (see figure).
121 cw(3.5c) cw(3c) cw(3.5c)
122 cw(3.5c) cw(3c) cw(3.5c)
123 |cw(3.5c)| cw(3c) |cw(3.5c)| .
124 EM machine target machine
145 Relation between EM stack, real stack and fake stack.
147 During code generation tokens will be kept on the fake stack as long
148 as possible but when they are moved to the real stack,
149 by generating code for the push,
150 all tokens above\v'-.25m'\(dg\v'.25m'
152 \(dg in this document the stack is assumed to grow downwards,
153 although the top of the stack will mean the first element that will
156 the pushed tokens will be pushed also,
157 so the fake stack will not contain holes.
159 The information about the machine that
161 needs has to be given in a machine description table,
162 with as a major part a list of code rules telling
164 what to do when certain EM-instructions occur
165 with certain tokens on the fake stack.
166 Not all possible fake stack possibilities have to be given of course,
167 there is a possibility for providing rewriting rules, or
169 as they are called in this document.
175 find a pattern of EM instructions starting at the current one to
177 This pattern will usually be of length one but longer patterns can be used.
178 Process any pseudo-instructions found.
180 Select one of the possibly many stack patterns that go with this
181 EM pattern on the basis of heuristics, look ahead or both.
182 The cost fields provided in the token definitions and
183 instruction definitions are used
184 to compute costs during look ahead.
186 Force the current fake stack contents to match the pattern.
188 copying tokens to registers, making dummy transformations, e.g. to
189 transform a `local' into an `indexed from register' or might even
190 cause the move of the complete fake stack contents to the real stack
191 and then back into registers if no suitable coercions
192 were provided by the table writer.
194 Execute the pseudocode associated with the code rule just selected,
195 this may cause registers to be allocated,
196 code to be emitted etc..
198 Put tokens onto the fake stack to reflect the result of the operation.
200 Insert some EM instructions into the stream;
201 this is possible but not common.
203 Account for the cost.
204 The cost is kept in a (space, time) vector and look ahead decisions
205 are based on a linear combination of these.
206 The code generator calls on itself recursively during look ahead,
207 and the recursive incarnations return the costs they made.
208 The costs the top-level code generator makes is of course irrelevant.
210 The table that drives
212 is not read in every time,
213 but instead is used at compile time
216 to set parameters and to load pseudocode tables.
219 reads the table and produces large lists of numbers that are
220 compiled together with machine independent code to produce
221 a code generator for the machine at hand.
223 Part of the information needed is not easily expressed in this table
224 format and must be supplied in two separate files,
226 Their contents are described later in this document.
230 If the machine has more than enough registers to generate code with,
231 it is possible to reserve some of them for use as register variables.
232 If it has not, this section may be skipped and any references
233 to register variables in the rest of this document may be ignored.
235 The front ends generate messages to the back ends telling them which
236 local variables could go into registers.
237 The information given is the offset of the local, its size and type
238 and a scoring number, roughly the number of times it occurs.
240 The decision which variable to put in which register is taken by the
241 machine independent part of
243 with the help of a scoring function provided by the table writer in mach.c.
244 The types of variables known are
246 Just a variable of some integer type.
247 Nothing special known about it.
249 A floating point variable.
251 A loop control variable.
254 Usually they are better candidates to put in registers.
256 If register variables are used,
257 more functions must be supplied in mach.c.
258 These functions are explained later.
260 Description of the machine table
262 The machine description table consists of the
263 concatenation of the following sections:
275 Instruction definitions
287 This is the order in the table
288 but the descriptions in this document will use a slightly different
290 All sections except the first start with an uppercase header word.
291 Examples may be given in early stages that use knowledge that is explained
293 If something is not clear the first time, please read on.
294 All will clear up in a couple of pages.
296 Input is in free format, white space and newlines may be used
297 at will to improve legibility.
298 Identifiers used in the table have the same syntax as C identifiers,
299 upper and lower case considered different, all characters significant.
300 Here is a list of reserved words; all of these are unavailable as identifiers.
304 ADDR STACKINGRULES gen proc test
305 COERCIONS TESTS highw reg_any to
306 INSTRUCTIONS TIMEFACTOR inreg reg_float topeltsize
307 INT TOKENS is_rom reg_loop ufit
308 MOVES call kills reg_pointer uses
309 PATTERNS cost lab regvar with
310 PROPERTIES defined labeldef return yields
311 REGISTERS exact leaving reusing
312 SETS example loww rom
313 SIZEFACTOR fallthrough move samesign
316 C style comments are accepted.
318 /* this is a comment */
320 If the standard constant facility is not enough the C-preprocessor can
321 be used to enhance the table format.
323 Integers in the table have the normal C-style syntax.
324 Decimal by default, octal when preceded by a 0
325 and hexadecimal when preceded by 0x.
329 In the first part of the table some constants can be defined,
334 value being an integer or string.
335 Three constants must be defined here:
337 Number of bytes in a machine word.
338 This is the number of bytes
339 a \fBloc\fP instruction will put on the stack.
341 Number of bytes in a pointer.
342 This is the number of bytes
343 a \fBlal\fP instruction will put on the stack.
345 Number of bytes in the hole between AB and LB.
346 If the calling sequence just saves PC and LB this
347 size will be twice the pointersize.
349 EM_WSIZE and EM_PSIZE are checked when a program is compiled
350 with the resulting code generator.
353 to add to the offset of instructions dealing with locals
354 having positive offsets,
357 Other constants can be defined here to be used as mnemonics
360 Optional is the definition of a printformat for integers in the code file.
365 The string must be a valid printf(III) format,
366 and defaults to "%ld".
367 For example on the PDP-11 one can use
371 to satisfy the old UNIX assembler that reads octal unless followed by
372 a period, and the ACK assembler that follows C conventions.
374 Tables under control of source code control systems like
378 can put their id-string here, for example
382 These strings, like all strings in the table, will eventually
383 end up in the binary code generator produced.
385 Optionally one can give the factors with which the size and time
386 parts of the cost vector have to be multiplied to ensure they have the
387 same order of magnitude.
390 SIZEFACTOR = C\d3\u/C\d4\u
392 TIMEFACTOR = C\d1\u/C\d2\u
394 Above numbers must be read as rational numbers.
395 Defaults are 1/1 for both of them.
396 These constants set the default size/time tradeoff in the code generator,
397 so if TIMEFACTOR and SIZEFACTOR are both 1 the code generator will choose
398 at random between two code sequences where one has
399 cost (10,4) and the other has cost (8,6).
400 See also the description of the cost field below.
404 This part of the table defines the list of properties that can be used
405 to differentiate between register classes.
406 It consists of a list of user-defined
407 identifiers optionally followed by the size
408 of the property in parentheses, default EM_WSIZE.
409 Example for the PDP-11:
412 PROPERTIES /* The header word for this section */
414 GENREG /* All PDP registers */
415 REG /* Normal registers (allocatable) */
416 ODDREG /* All odd registers (allocatable) */
417 REGPAIR(4) /* Register pairs for division */
418 FLTREG(4) /* Floating point registers */
419 DBLREG(8) /* Same, double precision */
420 GENFREG(4) /* generic floating point */
421 GENDREG(8) /* Same, double precision */
422 FLTREGPAIR(8) /* register pair for modf */
423 DBLREGPAIR(16) /* Same, double precision */
424 LOCALBASE /* Guess what */
428 Registers are allocated by asking for a property,
429 so if for some reason in later parts of the table
430 one particular register must be allocated it
431 has to have a unique property.
435 The next part of the tables describes the various registers of the
436 machine and defines identifiers
437 to be used in later parts of the tables.
440 <register definitions> : REGISTERS <list of definitions>
441 <definition> : <registerlist> ':' <propertylist> <optional regvar> '.'
442 <register> : ident [ '(' string ')' ] [ '=' ident [ '+' ident ] ]
444 Example for the PDP-11:
449 r0,r2,r4 : GENREG,REG.
450 r1,r3 : GENREG,REG,ODDREG.
451 r01("r0")=r0+r1 : REGPAIR.
452 fr0("r0"),fr1("r1"),fr2("r2"),fr3("r3") : GENFREG,FLTREG.
453 dr0("r0")=fr0,dr1("r1")=fr1,
454 dr2("r2")=fr2,dr3("r3")=fr3 : GENDREG,DBLREG.
455 fr01("r0")=fr0+fr1,fr23("r2")=fr2+fr3 : FLTREGPAIR.
456 dr01("r0")=dr0+dr1,dr23("r2")=dr2+dr3 : DBLREGPAIR.
457 lb("r5") : GENREG,LOCALBASE.
458 sp : GENREG,STACKPOINTER.
459 pc : GENREG,PROGRAMCOUNTER.
462 The names in the left hand lists are names of registers as used
464 They can optionally be followed by a string in parentheses,
465 their name as far as the assembler is concerned.
466 The default assembler name is the same as the table name.
467 A name can also be followed by
473 = othername + othername
475 which says that the register is composed of the parts
477 The identifiers at the right hand side of the lists are
479 The end of each register definition is a period.
481 It might seem wise to list every property of a register,
482 so one might give r0 the extra property MFPTREG named after the not
483 too well known MFPT instruction on newer PDP-11 types,
484 but this is not a good idea,
485 especially since no use can be made of that instruction anyway.
486 Every extra property means the register set is more unorthogonal
489 execution time is influenced by that,
490 because it has to take into account a larger set of registers
491 that are not equivalent.
492 So try to keep the number of different register classes to a minimum.
493 When faced with the choice between two possible code rules
494 for a nonfrequent EM sequence,
495 one being elegant but requiring an extra property,
496 and the other less elegant,
497 elegance should probably loose.
499 Tables that implement register variables must mark registers to be used
500 for variable storage here by following the list of properties by one
503 regvar \fIor\fP regvar(reg_any)
508 meaning they are candidates for that type of variable.
509 All register variables of one type must be of the same size,
510 and they may have no subregisters.
511 Such registers are not available for normal code generation.
513 Stack token definition
515 The next part describes all possible tokens that can reside on
516 the fake stack during code generation.
517 Attributes of a token are described as a C struct declaration;
518 this is followed by the size of the token in bytes,
519 optionally followed by the cost of the token when used as an addressing mode
520 and the format to be used on output.
522 In general, when writing a table, it is not wise to try
523 to think of all necessary tokens in advance.
524 While writing the necessity or advisability for some token
525 will be seen and it can then be added together with the
526 stacking rules and coercions needed.
528 Tokens should usually be declared for every addressing mode
529 of the machine at hand and for every size directly usable in
530 a machine instruction.
531 Example for the PDP-11 (incomplete):
536 const2 = { INT num; } 2 cost(2,300) "$" num .
537 addr_local = { INT ind; } 2 .
538 addr_external = { ADDR off; } 2 "$" off.
540 regdef2 = { GENREG reg; } 2 "*" reg.
541 regind2 = { GENREG reg; ADDR off; } 2 off "(" reg ")" .
542 reginddef2 = { GENREG reg; ADDR off; } 2 "*" off "(" reg ")" .
543 regconst2 = { GENREG reg; ADDR off; } 2 .
544 relative2 = { ADDR off; } 2 off .
545 reldef2 = { ADDR off; } 2 "*" off.
548 Types allowed in the struct are ADDR, INT and all register properties.
549 The type ADDR means a string and an integer,
550 which is output as string+integer,
551 and arithmetic on mixed ADDR and INT is possible.
552 This is the right mode for anything that can be an
553 assembler address expression.
554 The type of the register in the token is strict.
555 At any assignment of an expression of type register to a token attribute
558 will check if the set of possible results from the expression is a subset
559 of the set of permissible values for the token attribute.
561 The cost-field is made up by the word
563 followed by two numbers in parentheses, the size and timecosts
564 of this token when output in the code file.
565 If omitted, zero cost is assumed.
566 While generating code,
568 keeps track of a linear combination of these costs together
569 with the costs of the instructions itself which we will see later.
570 The coefficients of this linear combination are influenced
573 The SIZEFACTOR and TIMEFACTOR constants,
578 that can adjust the time/space tradeoff to all positions
579 from 100% time to 100% space.
581 By supplying different code rules in certain situations
582 it is possible to get a code generator that can adjust its
583 code to the need of the moment.
584 This is probably most useful with small machines,
585 experience has shown that on the larger micro's and mini's
586 the difference between time-optimal and space-optimal code
589 The printformat consists of a list of strings intermixed with
590 attributes from the token.
591 Strings are output literally, attributes are printed according
592 to their type and value.
593 Tokens without a printformat should never be output,
598 Notice that tokens need not correspond to addressing modes;
599 the regconst2 token listed above,
600 meaning the sum of the contents of the register and the constant,
601 has no corresponding addressing mode on the PDP-11,
602 but is included so that a sequence of add constant, load indirect,
603 can be handled efficiently.
604 This regconst2 token is needed as part of the path
606 REG -> regconst2 -> regind2
608 of which the first and the last "exist" and the middle is needed
609 only as an intermediate step.
611 Tokens with name `LOCAL' or `DLOCAL' are a special case when
612 register variables are used, this is explained further in the
613 section on token descriptions.
617 Usually machines have certain collections of addressing modes that
618 can be used with certain instructions.
619 The stack patterns in the table are lists of these collections
620 and since it is cumbersome to write out these long lists
621 every time, there is a section here to give names to these
623 Please note that it is not forbidden to write out a set
624 in the remainder of the table,
625 but for clarity it is usually better not to.
627 Example for the PDP-11 (incomplete):
632 src2 = GENREG + regdef2 + regind2 + reginddef2 + relative2 +
633 \h'\w'= 'u'reldef2 + addr_external + const2 + LOCAL + ILOCAL +
634 \h'\w'= 'u'autodec + autoinc .
635 dst2 = src2 - ( const2 + addr_external ) .
636 xsrc2 = src2 + ftoint .
637 src1 = regdef1 + regind1 + reginddef1 + relative1 + reldef1 .
639 src1or2 = src1 + src2 .
640 src4 = relative4 + regdef4 + DLOCAL + regind4 .
643 Permissible in the set construction are all the usual set operators, i.e.
651 Normal operator priorities apply, and parentheses can be
653 Every token identifier is also a set identifier
654 denoting the singleton collection of tokens containing
656 Every register property as defined above is also a set
657 matching all registers with that property.
658 The standard set identifier ALL denotes the collection of
661 Instruction definitions
663 In the next part of the table the instructions for the machine
664 are declared together with information about their operands.
665 Example for the PDP-11(very incomplete):
667 .ta 8 16 24 32 40 48 56 64
673 /* Normal instructions */
676 add src2:ro,dst2:rw:cc cost(2,450).
677 ash src2:ro,REG:rw:cc .
678 ashc src2:ro,REGPAIR+ODDREG:rw .
683 /* floating point instructions */
685 movf "ldf" fsrc,freg .
686 movf "stf" freg,fdst .
688 As the examples show an instruction definition consists of the name
690 optionally followed by an assembler mnemonic in
691 quotes-default is the name itself-and then
693 optionally followed by the cost and then a period.
694 If the cost is omitted the cost just after the word
695 INSTRUCTIONS is assumed,
696 if that is also omitted the cost is zero.
697 The cost must be known by
699 of course if it has multiple
700 code generation paths to choose from.
702 For each operand we have the set of possible token values,
703 followed by a qualifier that can be
705 signifies that this operand is read only,
706 so it can be replaced by a register with the same contents
709 signifies that the operand is read-write
711 signifies that the operand is write only.
713 says that after the instruction is finished, the condition codes
714 are set to this operand.
715 If none of the operands have the :cc qualifier set,
717 will assume that condition codes were unaffected
720 The first three qualifiers are of course mutually exclusive.
721 The :ro qualifier does not cause any special action in the current
722 implementation, and the :wo and :rw qualifiers are treated equal.
723 It must be recommended however to be precise in the specifications,
724 since later enhancements to the code generator might use them.
726 As the last examples show it is not necessary to give one definition
728 There are machines that have very unorthogonal instruction sets,
729 in fact most of them do,
730 and it is possible to declare each possible combination
734 program will check all uses of the instruction to find out which
737 Although not in the PDP-11 example above there is a possibility
738 to describe instructions that have side effects to registers not
740 The only thing possible is to say that the instruction is destructive
741 to some registers or the condition codes, by following the operand list
744 and a list of the things destroyed.
745 Example for some hypothetic accumulator machine:
747 add source2:ro kills ACCU :cc .
750 The cost fields in the definitions for tokens and instructions
751 are added together when generating code.
752 It depends on the machine at hand whether the costs are orthogonal
753 enough to make use of both these costs,
754 in extreme cases every combination of instructions and operands
755 can be given in this section,
756 all with their own costs.
760 Throughout the rest of the table expressions can be used in some
762 This section will give the syntax and semantics of expressions.
763 There are four types of expressions: integer, address, register and undefined.
764 Really the type register is nonexistent as such,
765 for each register expression
767 keeps a set of possible values,
768 and this set can be seen as the real type.
770 Type checking is performed by
772 An operator with at least one undefined operand returns undefined except
773 for the defined() function mentioned below.
774 An undefined expression is interpreted as FALSE when it is needed
776 It is the responsibility of the table writer to ensure no undefined
777 expressions are ever used as initialisers for token attributes.
778 This is unfortunately almost impossible to check for
782 Basic terms in an expression are
784 A number is a constant of type integer.
785 Also usable is an identifier defined to a number in the constant
788 A string within double quotes is a constant of type address.
789 All the normal C style escapes may be used within the string.
790 Also usable is an identifier defined to a string in the constant
793 This must be read as a grep-pattern.
794 It evaluates to a string that is the label name for the
795 temporary label meant.
796 More about this in the section on code rules.
798 The name of a register is a constant of type register.
800 A dollarsign followed by a number is the representation of the argument
801 of EM instruction \fI\fP.
802 The type of the operand is dependent on the instruction,
803 sometimes it is integer,
804 sometimes it is address.
805 It is undefined when the instruction has no operand.
806 Instructions with type-letter w can occur without an operand.
807 This can be checked in the code rule with the defined() pseudo function.
809 If it is unimaginable for the operand of the instruction ever to be
810 something different from a plain integer, the type is integer,
811 otherwise it is address.
813 Those who want to know it exactly, the integer instruction types
814 are the instructions marked with the
815 type-letters c,f,l,n,o,s,r,w,z in the EM manual.
818 makes all necessary conversions,
819 like adding EM_BSIZE to positive arguments of instructions
821 prepending underlines to global names,
822 converting code labels into a unique representation etc.
823 Details about this can be found in the section about
824 machine dependent C code.
826 This in general means the token mentioned first in the
828 When used inside an expression the token must be a simple register.
829 Type of this is register.
831 This means attribute "off" of the first stack pattern token.
832 Type is the same as that of attribute "off".
833 To use this expression implies a check that all tokens
834 in the set used have the same attribute in the same place.
836 This means attribute "off" in the `current' token.
837 This can only be used when no confusion is possible about which token
838 was meant, eg. in the optional boolean expressions following token sets
839 in the move and test rules, in coercions or in the kills section inside
843 This is the first subregister of the first token.
844 Previous comments apply.
846 A percent sign followed by a lowercase letter
847 stands for an allocated register.
848 This is the second allocated register.
850 The second subregister of the first allocated register.
852 All normal C operators apply to integers,
853 the + operator on addresses behaves as one would expect
854 and the only operators allowed on register expressions
856 Furthermore there are some special `functions':
858 Returns 1 if expression
860 is defined, 0 otherwise.
862 Returns 1 if integer expression
868 Returns 1 if integer expression
870 fits as a signed integer
875 Same as above but now for unsigned
878 Integer expression giving word
880 from the \fBrom\fP descriptor
881 pointed at by EM instruction
885 Undefined if that descriptor does not exist.
887 Integer expression indicating whether EM instruction number
889 in the EM-pattern refers to ROM. This may be useful for generating
890 position-independent code with the ROM in read-only memory.
892 enables one to see the difference between ROM references and other data
895 Returns the lower half of the argument of EM instruction number
897 This is used to split the arguments of a \fBldc\fP instruction.
901 The next two `functions' are only needed in a table that
902 implements register variables.
904 Returns the status of the local variable with offset
908 negative if the local was not allowed as a register
910 zero if it was allowed but not assigned to a register,
911 and the type of the register if it was assigned to a register.
912 This makes it possible to write
914 inreg($1)==reg_pointer
918 Type of this is register.
919 It returns the register the local with offset
922 The table writer guarantees the register is one of type
926 one of reg_any, reg_loop, reg_pointer or reg_float.
929 is omitted reg_any is assumed.
930 Undefined if inreg(\fIe\fP)<=0 .
932 The next two `functions' are only needed in a table that
933 uses the top element size information.
934 .IP topeltsize($a) 16
935 Returns the size of the element on top of the EM-stack at the label
936 identified by $a. This can be used to put the top of the stack in a
937 register at the moment of an unconditional jump. At an unconditional jump,
938 the size of the top-element will always look 0.
940 Returns 1 if the label identified by $a can be reached via fallthrough, 0
945 Throughout the rest of the table tokens must be described,
946 be it as operands of instructions or as stack-replacements.
947 In all those cases we will speak about a token description.
948 The possibilities for these will be described here.
950 All expressions of type register are token descriptions.
951 The construct %1 means the token matched first in the stack pattern.
952 All other token descriptions are those that are built on the spot.
955 { <tokenname> , <list of token attribute initializing expressions> }
957 All expressions are type-checked by
959 and the number of initializers is also checked.
961 A special case of the last token descriptions occurs when
962 the token name is `LOCAL' or `DLOCAL' and the table uses register
963 variables. The first token attribute then must be of type integer and
964 the token description is automagically replaced by the register chosen
965 if the LOCAL (wordsize) or DLOCAL (twice the wordsize) was assigned
970 The largest section of the tables consists of the code generation rules.
971 They specify EM patterns, stack patterns, code to be generated etc.
972 Broadly the syntax is
974 code rule : EM-part code-part
975 EM-part : EM-pattern | procedure-heading
976 code-part : code-description | procedure-call
977 code-description : stackpattern kills allocates generates yields leaving
979 Ignoring the "procedure"-part for now, the description for the EM-pattern
980 and the code-description follows.
981 Almost everything here is optional, the minimum code rule
986 that will simply throw away
992 The EM pattern consists of a list of EM mnemonics
995 optionally followed by a boolean expression.
1000 will match a single \fBloe\fP instruction,
1002 pat \fBloc\fP \fBloc\fP \fBcif\fP $1==2 && $2==8
1004 is a pattern that will match
1012 pat \fBlol\fP \fBinc\fP \fBstl\fP $1==$3
1014 will match for example
1016 .ta 10m 20m 30m 40m 50m 60m
1017 \fBlol\fP 6 \fBlol\fP -2 \fBlol\fP 4
1018 \fBinc\fP \fBinc\fP but \fInot\fP \fBinc\fP
1019 \fBstl\fP 6 \fBstl\fP -2 \fBstl\fP -4
1021 A missing boolean expression evaluates to TRUE.
1023 The code generator will match the longest EM pattern on every occasion,
1024 if two patterns of the same length match the first in the table will be chosen,
1025 while all patterns of length greater than or equal to three are considered
1026 to be of the same length.
1027 This rule of three is an unfortunate implementation dependent restriction,
1028 but patterns longer than three EM instructions are luckily not needed
1031 The EM mnemonic may also be the pseudo-instruction \fBlab\fP, which matches
1032 a label. Its argument can be used in testing on topeltsize and
1033 fallthrough. When this pattern is specified, the label should be defined
1038 Following the EM-pattern there may be more than one code
1041 will choose using heuristics and the cost
1042 information provided with the instruction and token
1044 Owing to parsing reasons of the table, the word
1047 is mandatory when there are more code rules attached to one
1049 The stack pattern may be empty however.
1053 The optional stack pattern is a list of token sets preceded by the word
1055 The token sets are usually represented by set identifiers for clarity.
1056 No boolean expression is allowed here.
1057 The first expression is the one that matches the top of the stack.
1059 If the pattern is followed by the word STACK
1060 it only matches if there is nothing
1061 else on the fake stack,
1062 and the code generator will stack everything not matched at the start
1065 The pattern can be preceded with the word
1069 that tells the code generator not to try to coerce to the pattern
1070 but only to use it when it is already present on the fake stack.
1071 There are two reasons for this construction,
1072 correctness and speed.
1073 It is needed for correctness when the pattern contains a register
1074 that is not transparent when data is moved through it.
1076 Example: on the PDP-11 the shortest code for
1088 if the floating point processor is in double
1089 precision mode and fr0 is free.
1090 Unfortunately this is not correct since a trap can occur on certain
1092 This could happen if there was a stack pattern for \fBsti\fP\ 8
1097 The code generator would then find that coercing the 8-byte global _a
1098 to a floating point register and then storing it to _b was the cheapest,
1099 if the space/time knob was turned far enough to space.
1100 This can be prevented by changing the stack pattern to
1104 It is unfortunate that the type information is no longer present,
1105 since if _a really is a floating point number the move could be
1108 The second reason for the
1111 When the code generator has a long list of possible stack patterns
1112 for one EM pattern it can waste much time trying to find coercions
1113 to all of them, while the mere presence of such a long list
1114 indicates that the table writer has given many special cases.
1115 Prepending all the special cases by
1117 will stop the code generator from trying to find things
1118 that either cannot be done,
1119 or are too expensive anyway.
1121 So in general it is wise to prepend all stack patterns that
1122 cannot be made by coercions with
1127 and STACK in the stack pattern has the effect that the rule will
1128 only be taken if there is nothing else on the fake stack.
1132 The optional kills part describes certain tokens
1133 that should neither remain on
1134 the fake stack, nor remembered as contents of registers.
1135 This is usually only required with store operations.
1136 The entire fake stack, except for the part matched in the stack pattern,
1137 is searched for tokens matching the expression and they are copied
1139 Every register that contains the token is marked as empty.
1143 kills <list of things to kill separated by commas>
1144 thing to kill : token set optionally followed by boolean expression
1148 kills regind2 %reg != lb || %off == $1
1150 is a kills part used for example in the \fBinl\fP or \fBstl\fP code rule.
1151 It removes all register offsetted tokens where the register is not the
1152 localbase plus the local in which the store is done.
1153 The necessity for this can be seen from the following example:
1159 Without a proper kills part in the rule for \fBinl\fP code would
1160 be generated as here
1165 so local 6 would be given the new value of local 4 instead of the old
1166 as the EM code prescribed.
1168 When generating code for an EM-instruction like
1170 it is necessary to write a line in the table like
1172 kills all_except_constant_or_register
1174 where the long identifier is a set containing all tokens
1175 that can be the destination of some random indirect store.
1176 These indirect stores are the main reason to prevent this
1178 line to be deduced automatically by
1181 When generating something like a branch instruction it
1182 might be needed to empty the fake stack completely.
1183 This can of course be done with
1187 or by ending the stack pattern with the word STACK,
1188 if the stack pattern does not start with
1190 The latter does not erase the contents of registers.
1192 It is unfortunate that this part is still present in the table
1193 but it is too much for now to let the
1195 program discover what rules ruin what kind of tokens.
1196 Maybe some day .....
1200 The optional register allocation part describes the registers needed.
1203 uses <list of use elements separated by commas>
1205 where itemlist is a list of three kinds of things:
1208 < a token description >, for example %1.
1210 This will instruct the code generator that all registers
1211 contained in this token can be reused if they are not used
1212 in another token on the fakestack,
1213 so that they are available for allocation in this
1216 if they were only used in that token.
1219 a register property.
1221 This will allocate a register with that property,
1222 that is marked as empty at this point.
1223 Look ahead can be performed if there is more than one register available.
1225 a register property with initialization.
1227 This will allocate the register as in 2) but will also
1229 This eases the task of the code generator because it can
1230 find a register already filled with the right value
1237 will allocate an odd register, while
1239 uses REG={regind2,lb,$1}
1241 will allocate a register while simultaneously filling it with
1244 Inside the coercion from xsrc2 to REG in the PDP-11 table
1245 the following line can be found.
1247 uses reusing %1, REG=%1
1249 This tells the code generator that registers contained in %1 can be used
1250 again and asks to fill the register allocated with %1.
1251 So if %1={regind2,r3,"4"} and r3 is not in use elsewhere on the fake stack
1252 the following code might be generated.
1256 In the rest of the line the registers allocated can be named by
1257 %a and %b.1,%b.2, i.e. with lower case letters
1258 in order of allocation.
1262 Code to be generated, also optionally, is specified as
1265 followed by a list of items of the following kind:
1267 An instruction name followed by a comma-separated
1268 list of token descriptions.
1270 will search the instruction definitions for the machine to find a suitable
1272 At code generation time the assembler name of the
1273 instruction will be output followed by a space,
1274 followed by a comma separated list of tokens.
1276 In the table an instruction without operands must be
1277 followed by a period.
1282 to accept his syntax without it.
1288 This has the following syntax:
1290 move <token description>,<token description>
1292 Moves are handled specially since that enables the code generator
1293 to keep track of register contents.
1296 move r3,{regind2,lb,$1}
1298 will generate code to move r3 to $1(r5) except when
1299 r3 already was a copy of $1(r5).
1300 Then the code will be omitted.
1301 The rules describing how to move things to each other
1302 can be found in the move definitions section described below.
1304 For machines that have condition codes,
1305 which alas most of them do,
1306 there are provisions to remember condition code settings
1307 and prevent needless testing.
1308 To set the condition code to a token put in the code the following call:
1310 test <token description>
1312 This will generate a test if the condition codes
1313 were not already set to that token.
1314 The rules describing how to test things
1315 can be found in the test definitions section described below.
1316 See also the :cc qualifier that can be used at instruction
1322 Only used when register variables are in use.
1323 This statement causes a call to the machine dependent
1326 Explanation of this must wait for the description of the
1331 statement. Its only argument should be that of the
1333 pseudo-instruction. This is needed to generate local labels when the
1334 top element size information is used. It takes the form
1339 A temporary label of the form <digit>: may be placed here.
1340 Expressions of the form [0-9][bf] in this code rule
1341 generate the same string as is used for this label.
1342 The code generator system could probably easily be changed
1343 to make this work for assemblers that do not support this
1344 type of label by generating unique labels itself.
1345 Implementation of this is not contemplated at the moment.
1349 The optional stack replacement is a possibly empty list
1350 of tokens to be pushed onto the fake stack.
1351 It start with the word
1353 and is followed by a list of token descriptions.
1355 All tokens matched by the stack pattern at the beginning of the code rule
1356 are first removed and their registers deallocated.
1357 Items are pushed in the order of appearance.
1358 This means that the last item will be on the top of the
1359 stack after the push.
1360 So if the stack pattern contained two sets
1361 and they must be pushed back unchanged,
1362 they have to be specified as stack replacement
1366 and not the other way around.
1367 This is known to cause errors in tables so watch out for
1372 In exceptional cases it might be useful to leave part of an EM-pattern
1374 For example, a \fBsdl\fP instruction might
1375 be split into two \fBstl\fP instructions
1376 when there is no 4-byte quantity on the stack.
1377 The EM replacement part allows
1378 one to express this.
1379 It is activated by the word
1384 leaving \fBstl\fP $1 \fBstl\fP $1+2
1386 The instructions are inserted in the stream so that they can match
1387 the first part of a pattern in the next step.
1388 Note that since the code generator traverses the EM instructions in a strict
1390 it is impossible to let the EM replacement match later parts of a pattern.
1391 So if there is a pattern
1393 \fBloc\fP \fBstl\fP $1==0
1397 \fBloc\fP 0 \fBsdl\fP 4
1399 the \fBloc\fP\ 0 will be processed first,
1400 then the \fBsdl\fP might be split into two \fBstl\fP's but the pattern
1405 A list of examples for the PDP-11 is given here.
1406 Far from being complete it gives examples of most kinds
1410 pat loc yields {const2, $1}
1412 pat ldc yields {const2, loww($1)} {const2, highw($1)}
1414 These simple patterns just push one or more tokens onto the fake stack.
1418 with REG yields {regind2,%1,$1}
1419 with exact regconst2 yields {regind2,%1.reg,$1+%1.off}
1420 with exact addr_external yields {relative2,$1+%1.off}
1421 with exact addr_local yields {LOCAL, %1.ind + $1,2}
1423 This pattern shows the possibility to do different things
1424 depending on the fake stack contents,
1425 there are some rules for some specific cases plus a general rule,
1428 that can always be taken after a coercion,
1433 uses REG={LOCAL, SL, 2}, REG={const2,$1-1}
1435 move {regind2,%a, SL},%a
1436 sob %b,{label,1b} yields %a
1438 This rule shows register allocation with initialisation,
1439 and the use of a temporary label.
1440 The constant SL used here is defined to be the offset from lb
1442 that is pushed by the Pascal compiler as the last argument of
1447 with regconst2 xsrc2
1449 gen move %2,{regind2,%1.reg,$1+%1.off}
1450 with addr_external xsrc2
1452 gen move %2,{relative2,$1+%1.off}
1454 This rule shows the use of a
1456 part in a store instruction.
1457 The set allexeptcon contains all tokens that can be the destination
1458 of an indirect store.
1464 gen move %1,{relative4,$1}
1468 movfi %1.reg,{relative4,$1}
1472 gen move %1, {relative2, $1 }
1473 move %2, {relative2, $1+2}
1477 shows the use of the
1479 clause in both qualities,
1480 the first is for correctness,
1481 the second for efficiency.
1482 The third rule is taken by default,
1483 resulting in two separate stores,
1484 nothing better exists on the PDP-11.
1489 gen sub %1,%2 yields %2
1490 with exact REG src2-REG
1496 has a normal first part,
1497 and a hand optimized special case as its second part.
1502 gen mul %2,%1 yields %1
1504 gen mul %1,%2 yields %2
1506 This shows the general property for rules with commutative
1508 heuristics or look ahead will have to decide which rule is the best.
1511 pat loc sli $1==1 && $2==2
1513 gen asl %1 yields %1
1515 A simple rule involving a longer EM-pattern,
1516 to make use of a specialized instruction available.
1519 pat loc loc cii $1==1 && $2==2
1522 gen movb %1,%a yields %a
1524 A somewhat more complicated example of the same.
1530 pat loc loc loc cii $1>=0 && $2==2 && $3==4
1531 leaving loc $1 loc 0
1533 Shows a trivial example of EM-replacement.
1534 This is a rule that could be done by the
1536 if word order in longs was defined in EM.
1537 On a `big-endian' machine the two replacement
1538 instructions would be the other way around.
1543 gen bic {const2,~%1.num},%2 yields %2
1545 gen bic {const2,~%2.num},%1 yields %1
1550 Shows the way to handle the absence
1552 .I and -instruction.
1558 gen ash %1,%a yields %a
1560 Shows the building of a word-size set.
1563 pat lae aar $2==2 && rom($1,3)==1 && rom($1,1)==0
1566 pat lae aar $2==2 && rom($1,3)==1 && rom($1,1)!=0
1567 leaving adi 2 adp 0-rom($1,1)
1569 Two rules showing the use of the rom pseudo function,
1570 and some array optimalisation.
1578 The stack pattern guarantees that everything will be stacked
1579 before the jump is taken.
1581 pat lab topeltsize($1)==2 && !fallthrough($1)
1582 gen labeldef $1 yields r0
1584 pat lab topeltsize($1)==2 && fallthrough($1)
1587 labeldef $1 yields r0
1589 pat lab topeltsize($1)!=2
1594 pat bra topeltsize($1)==2
1599 pat bra topeltsize($1)!=2
1603 The combination of these patterns make sure that the top of the EM-stack will
1604 be in register r0 whenever necessary. The top element size mechanism will
1605 also show a size of 0 whenever a conditional branch to a label
1606 occurs. This saves a lot of patterns and hardly decreases performance.
1607 When the same register is used to return function results, this can save
1608 many moves to and from the stack.
1613 gen jsr pc,{label, $1}
1616 Same comments as previous rule.
1619 pat lfr $1==2 yields r0
1620 pat lfr $1==4 yields r1 r0
1622 Shows the return area conventions of the PDP-11 table.
1623 At this point a reminder:
1626 instruction, and some other instructions must leave
1627 the function return area intact.
1628 See the defining document for EM for exact information.
1636 This shows a rule for
1638 in a table not using register variables.
1639 In a table with register variables the
1641 part would just contain
1647 uses REG={const2,$1/2}
1649 mov {autoinc,%2},{autoinc,%1}
1654 already uses three registers of the same type.
1656 contains code to check all rules
1657 to see if they can be applied from an empty fakestack.
1658 It uses the marriage thesis from Hall,
1659 a thesis from combinatorial mathematics,
1664 with src2 src2 yields %1 %2
1666 This rule shows the exchanging of two elements on the fake stack.
1668 Code rules using procedures
1670 To start this section it must be admitted at once that the
1671 word procedure is chosen here mainly for its advertising
1673 It more resembles a glorified goto but this of course can
1674 not be admitted in the glossy brochures.
1675 This document will continue to use the word
1678 The need for procedures was felt after the first version of
1679 the code generator system was made,
1680 mainly because of conditional instructions.
1681 Often the code sequences for
1689 were identical apart from one opcode in the code rule.
1690 The code sequence had to be written out six times however.
1691 Not only did this increase the table size and bore the
1692 table writer, it also led to errors when changing the table
1693 since it happened now and then that five out of six
1696 In general the procedures in this table format are used to
1697 keep one copy instead of six of the code rules for all
1698 sorts of conditionals and one out of two for things like
1699 increment/decrement.
1701 And now the syntax, first the procedure definition,
1702 which must indeed be defined before the call because
1705 The procedure heading replaces the EM-pattern in a code rule
1706 and looks like this:
1708 proc <identifier> <optional example>
1710 The identifier is used in later calls and the example must
1711 be used if expressions like $1 are used in the code rule.
1713 <optional example> : example <list of EM-instructions>
1715 so an example looks just like an EM-pattern, but without
1716 the optional boolean expression.
1717 The example is needed to know the types of $1 expressions.
1718 The current version of
1720 does not check correctness of the example, so be careful.
1722 A procedure is called with string-parameters,
1723 that are assembler opcodes.
1724 They can be accessed by appending the string `[<number>]'
1725 to a table opcode, where <number> is the parameter number.
1726 The string `*' can be used as an equivalent for `[1]'.
1727 Just in case this is not clear, here is an example for
1728 a procedure to increment/decrement a register.
1731 incop REG:rw:cc . /* in the INSTRUCTIONS part of course */
1735 gen incop* %1 yields %1
1737 The procedure is called with parameter "inc" or "dec".
1739 The procedure call is given instead of the code-part of the
1740 code rule and looks like this
1742 call <identifier> '(' <comma-separated list of strings> ')'
1744 which leads to the following large example:
1747 proc bxx example beq
1748 with src2 src2 STACK
1752 pat blt call bxx("jlt")
1753 pat ble call bxx("jle")
1754 pat beq call bxx("jeq")
1755 pat bne call bxx("jne")
1756 pat bgt call bxx("jgt")
1757 pat bge call bxx("jge")
1762 We now jump back to near the beginning of the table
1763 where the move definitions are found.
1764 The move definitions directly follow the instruction
1767 In certain cases a move is called for,
1768 either explicitly when a
1770 instruction is used in a code rule,
1771 or implicitly in a register initialization.
1772 The different code rules possible to move data from one
1773 spot to another are described here.
1774 Example for the PDP-11:
1776 .ta 8 16 24 32 40 48 56 64
1779 from const2 %num==0 to dst2
1785 from FLTREG to longf4-FLTREG
1788 from longf4-FLTREG to FLTREG
1791 The example shows that the syntax is just
1793 from <source> to <destination> gen <list of instructions>
1795 Source and destination are a token set, optionally followed by
1796 a boolean expression.
1797 The code generator will take the first move that matches,
1798 whenever a move is necessary.
1800 checks whether all moves called for in the table are present.
1804 This part describes the instructions necessary to set the condition codes
1806 These rules are needed when the
1808 instruction is used in code rules.
1809 Example for the PDP-11:
1811 .ta 8 16 24 32 40 48 56 64
1819 to test <source> gen <instruction list>
1821 Source is the same thing as in the move definition.
1823 checks whether all tests called for in the table are present.
1825 Some explanation about the rules behind coercions
1827 A central part in code generation is taken by the
1829 It is the responsibility of the table writer to provide
1830 all necessary coercions so that code generation can continue.
1831 The minimal set of coercions are
1832 the coercions to unstack every token expression,
1833 in combination with the rules to stack every token.
1834 It should not be possible to smuggle a table through
1836 without these basic set available.
1838 If these are present the code generator can always make the necessary
1839 transformations by stacking and unstacking.
1840 Of course for code quality it is usually best to provide extra coercions
1841 to prevent this stacking to take place.
1843 discriminates three types of coercions:
1845 Unstacking coercions.
1846 This category can use the
1850 Splitting coercions, these are the coercions that split
1851 larger tokens into smaller ones.
1853 Transforming coercions, these are the coercions that transform
1854 a token into another of the same size.
1855 This category can use the
1859 When a stack configuration does not match the stack pattern
1861 are searched for in the following order:
1863 First tokens are split if necessary to get their sizes right.
1865 Then transforming coercions are found that will make the pattern match.
1867 Finally if the stack pattern is longer than the fake stack contents
1868 unstacking coercions will be used to fill up the pattern.
1870 At any point, when coercions are missing so code generation could not
1871 continue, the offending tokens are stacked.
1875 The next part of the table defines the stacking rules for the machine.
1876 Each token that may reside on the fake stack must have a rule attached
1877 to put it on the real stack.
1878 Example for the PDP-11:
1880 .ta 8 16 24 32 40 48 56 64
1883 from const2 %num==0 to STACK
1884 gen clr {autodec,sp}
1887 gen mov %1,{autodec,sp}
1889 from regconst2 to STACK
1890 gen mov %1.reg,{autodec,sp}
1891 add {addr_external, %1.off},{regdef2,sp}
1893 from DBLREG to STACK
1894 gen movf %1,{autodec,sp}
1896 from FLTREG to STACK
1897 gen movfo %1,{autodec,sp}
1899 from regind8 to STACK
1902 add {addr_external, 8+%1.off},%a
1903 mov {autodec, %a},{autodec,sp}
1904 mov {autodec, %a},{autodec,sp}
1905 mov {autodec, %a},{autodec,sp}
1906 mov {autodec, %a},{autodec,sp}
1909 These examples should be self-explanatory, except maybe for the last one.
1910 It is possible inside a stacking-rule to use a register.
1911 Since however the stacking might also take place at a moment
1912 when no registers are free, it is mandatory that for each token
1913 there is one stackingrule that does not use a register.
1914 The code generator uses the first rule possible.
1918 The next part of the table defines the coercions that are possible
1919 on the defined tokens.
1920 Example for the PDP-11:
1927 gen mov {autoinc,sp},%a yields %a
1931 gen movf {autoinc,sp},%a yields %a
1935 gen mov {autoinc,sp},%a.1
1936 mov {autoinc,sp},%a.2 yields %a
1938 These three coercions just deliver a certain type
1939 of register by popping it from the real stack.
1942 from LOCAL yields {regind2,lb,%1.ind}
1944 from DLOCAL yields {regind4,lb,%1.ind}
1946 from REG yields {regconst2, %1, 0}
1948 These three are zero-cost rewriting rules.
1951 from regconst2 %1.off==1
1952 uses reusing %1,REG=%1.reg
1953 gen inc %a yields %a
1956 uses reusing %1,REG=%1.reg
1957 gen add {addr_external, %1.off},%a yields %a
1962 add {const2, %1.ind},%a yields %a
1964 The last three are three different cases of the coercion
1965 register+constant to register.
1966 Only in the last case is it always necessary to allocate
1968 since arithmetic on the localbase is unthinkable.
1972 uses reusing %1, REG=%1 yields %a
1975 uses FLTREG=%1 yields %a
1978 uses DBLREG=%1 yields %a
1982 gen bisb %1,%a yields %a
1984 These examples show the coercion of different
1985 tokens to a register of the needed type.
1986 The last one shows the trouble needed on a PDP-11 to
1987 ensure bytes are not sign-extended.
1988 In EM it is defined that the result of a \fBloi\fP\ 1
1989 instruction is an integer in the range 0..255.
1992 from REGPAIR yields %1.2 %1.1
1994 from regind4 yields {regind2,%1.reg,2+%1.off}
1995 {regind2,%1.reg,%1.off}
1997 from relative4 yields {relative2,2+%1.off}
2000 The last examples are splitting rules.
2002 The examples show that
2003 all coercions change one token on the fake stack into one or more others,
2004 possibly generating code.
2005 The STACK token is supposed to be on the fake stack when it is
2006 really empty, and can only be changed into one other token.
2008 The files mach.h and mach.c
2010 The table writer must also supply two files containing
2011 machine dependent declarations and C code.
2012 These files are mach.h and mach.c.
2014 Types in the code generator
2016 Three different types of integer coexist in the code generator
2017 and their range depends on the machine at hand.
2018 They are defined depending on the Target EM_WSIZE, or TEM_WSIZE,
2020 The type 'int' is used for things like counters that won't require
2021 more than 16 bits precision.
2022 The type 'word' is used among others to assemble datawords and
2024 The type 'full' is used for addresses and is of type 'long' if
2025 TEM_WSIZE>2 or TEM_PSIZE>2.
2027 In macro and function definitions in later paragraphs implicit typing
2028 will be used for parameters, that is parameters starting with an 's'
2029 will be of type string, and the letters 'i','w','f' will stand for
2030 int, word and full respectively.
2032 Global variables to work with
2034 Some global variables are present in the code generator
2035 that can be manipulated by the routines in mach.h and mach.c.
2037 The declarations are:
2040 FILE *codefile; /* code is emitted on this stream */
2041 word part_word; /* words to be output are put together here */
2042 int part_size; /* number of bytes already put in part_word */
2043 char str[]; /* Last string read in */
2044 long argval; /* Last int read and kept */
2049 In the file mach.h a collection of macros is defined that have
2050 to do with formatting of assembly code for the machine at hand.
2051 Some of these macros can of course be left undefined in which case the
2052 macro calls are left in the source and will be treated as
2054 These functions can then be defined in \fImach.c\fR.
2056 The macros to be defined are:
2058 Must print the magic incantations that will mark the symbol \fI\fR
2059 to be exported to other modules.
2060 This is the translation of the EM \fBexa\fP and \fBexp\fP instructions.
2062 Same to import the symbol.
2063 Translation of \fBina\fP and \fBinp\fP.
2065 Must print the definition of procedure label \fIs\fR.
2066 If left undefined the newilb() macro is used instead.
2068 Must print the definition of instruction label \fIs\fR.
2070 Must print the definition of data label \fIs\fR.
2072 Must define data label
2077 Must declare a piece of memory initialized to BSS_INIT(see below)
2083 Format to be used when converting constant arguments of
2084 EM instructions to string.
2085 Argument to be formatted will be 'full'.
2087 Format to be used for integer part of label+constant,
2088 argument will be 'full'.
2089 .IP fmt_ilb(ip,il,s)
2090 Must use the numbers
2094 that are a procedure number
2095 and a label number respectively and copy a string to
2097 that must be unique for that combination.
2098 This procedure is optional, if it is not given ilb_fmt
2099 must be defined as below.
2101 Format to be used for creation of unique instruction labels.
2102 Arguments will be a unique procedure number (int) and the label
2105 Format to be used for printing numeric data labels.
2106 Argument will be 'int'.
2108 Format to be used for generation of labels for
2109 space generated by a
2112 Argument will be 'int'.
2114 Format to be used for printing of the address of an element in
2117 Arguments will be the offset in the
2119 block (word) and the number of the
2123 Must generate output that will assemble into one machine word.
2125 Must generate output that will put the address of the instruction label
2126 into the datastream.
2128 Must generate output that will put the address of the data label
2129 into the datastream.
2131 Must take the string in
2133 that is a nonnumeric global label, and transform it into a copy made to
2135 that will not collide with reserved assembler words and system labels.
2136 This procedure is optional, if it is not given the id_first macro is used
2139 Must be a character.
2140 This is prepended to all nonnumeric global labels if their length
2141 is shorter than the maximum allowed(currently 8) or if they already
2142 start with that character.
2143 This is to avoid conflicts of user labels with system labels.
2146 This is the value filled in all the words not initialized explicitly.
2147 This is loader and system dependent.
2148 If omitted no initialization is assumed.
2150 Example mach.h for the PDP-11
2153 #define ex_ap(y) fprintf(codefile,"\et.globl %s\en",y)
2154 #define in_ap(y) /* nothing */
2156 #define newplb(x) fprintf(codefile,"%s:\en",x)
2157 #define newilb(x) fprintf(codefile,"%s:\en",x)
2158 #define newdlb(x) fprintf(codefile,"%s:\en",x)
2159 #define dlbdlb(x,y) fprintf(codefile,"%s=%s\en",x,y)
2160 #define newlbss(l,x) fprintf(codefile,"%s:.=.+%d.\en",l,x);
2162 #define cst_fmt "$%d."
2163 #define off_fmt "%d."
2164 #define ilb_fmt "I%x_%x"
2165 #define dlb_fmt "_%d"
2166 #define hol_fmt "hol%d"
2168 #define hol_off "%ld.+hol%d"
2170 #define con_cst(x) fprintf(codefile,"%ld.\en",x)
2171 #define con_ilb(x) fprintf(codefile,"%s\en",x)
2172 #define con_dlb(x) fprintf(codefile,"%s\en",x)
2174 #define id_first '_'
2180 In mach.c some functions must be supplied,
2181 mostly manipulating data resulting from pseudoinstructions.
2182 The specifications are given here,
2183 implicit typing of parameters as above.
2187 This function must manipulate the globals
2188 part_word and part_size to append the isz bytes
2189 contained in word to the output stream.
2190 If part_word is full, i.e. part_size==TEM_WSIZE
2191 the function part_flush() may be called to empty the buffer.
2192 This is the function that must go through the trouble of
2193 doing byte order in words correct.
2197 This function must take the string str[] and create an integer
2198 from the string of size w_size and generate code to assemble global
2199 data for that integer.
2200 Only the sizes for which arithmetic is implemented need be
2202 so if 200-byte integer division is not implemented,
2203 200-byte integer global data don't have to be implemented.
2204 Here one must take care of word order in long integers.
2208 This function must generate code to assemble a floating
2209 point number of which the size is contained in argval
2210 and the ASCII representation in str[].
2214 This function is called at the start of every procedure.
2215 Function prolog code must be generated,
2216 and room made for local variables for a total of f_nlocals bytes.
2220 This function is called when a
2222 pseudo is seen that is not handled by the machine independent part.
2223 The example below shows all one probably have to know about that.
2227 This is not a function,
2228 but an array of four strings.
2229 These strings are put out whenever the code generator
2231 Segments are SEGTXT, SEGCON, SEGROM and SEGBSS in that order.
2233 If register variables are used in a table, the program
2235 will define the word REGVARS during compilation of the sources.
2236 So the following functions described here should be bracketed
2237 by #ifdef REGVARS and #endif.
2239 regscore(off,size,typ,freq,totyp) long off;
2241 This function should assign a score to a register variable,
2242 the score should preferably be the estimated number of bytes
2243 gained when it is put in a register.
2244 Off and size are the offset and size of the variable,
2245 typ is the type, that is reg_any, reg_pointer, reg_loop or reg_float.
2246 Freq is the count of static occurrences, and totyp
2247 is the type of the register it is planned to go into.
2249 Keep in mind that the gain should be net, that is the cost for
2250 register save/restore sequences and the cost of initialisation
2251 in the case of parameters should already be included.
2255 This function is called at the start of a procedure, just before
2256 register saves are done.
2257 It can be used to initialise some variables if needed.
2261 This function is called at end of the register save sequence.
2262 It can be used to do the real saving if multiple register move
2263 instructions are available.
2265 regsave(regstr,off,size) char *regstr; long off;
2267 Should either do the real saving or set up a table to have
2268 it done by f_regsave.
2269 Note that initialisation of parameters should also be done,
2274 Should restore saved registers and return.
2275 The function result is already in the function return area by now.
2277 Example mach.c for the PDP-11
2279 As an example of the sort of code expected,
2280 the mach.c for the PDP-11 is presented here.
2282 .ta 0.5i 1i 1.5i 2i 2.5i 3i 3.5i 4i 4.5i
2284 * machine dependent back end routines for the PDP-11
2287 con_part(sz,w) register sz; word w; {
2289 while (part_size % sz)
2305 con_mult(sz) word sz; {
2309 fatal("bad icon/ucon size");
2311 fprintf(codefile,"\et%o;%o\en",(int)(l>>16),(int)l);
2316 register short *p,i;
2319 * This code is correct only when the code generator is
2320 * run on a PDP-11 or VAX-11 since it assumes native
2321 * floating point format is PDP-11 format.
2324 if (argval != 4 && argval != 8)
2325 fatal("bad fcon size");
2330 fprintf(codefile,"\et%o;%o;",i,*p++);
2333 fprintf(codefile,"\et%o;%o\en",i,*p++);
2346 regscore(off,size,typ,score,totyp) long off; {
2349 * This function is full of magic constants.
2350 * They are a result of experimentation.
2355 score -= 1; /* allow for save/restore */
2358 if (typ==reg_pointer)
2360 else if (typ==reg_loop)
2361 score = 10*score+50; /* Guestimate */
2364 return(score); /* 10 * estimated # of words of profit */
2376 if (n_regvars==0 || lbytes==0) {
2377 fprintf(codefile,"mov r5,-(sp)\enmov sp,r5\en");
2379 fprintf(codefile,"tst -(sp)\en");
2381 fprintf(codefile,"sub $0%o,sp\en",lbytes);
2382 for (i=0;i<n_regvars;i++)
2383 fprintf(codefile,"mov %s,-(sp)\en",regadm[i].ra_str);
2386 fprintf(codefile,"mov $0%o,r0\en",lbytes);
2387 fprintf(codefile,"jsr r5,PR%s\en",Rstring);
2389 fprintf(codefile,"jsr r5,PR%d%s\en",lbytes,Rstring);
2392 for (i=0;i<n_regvars;i++)
2393 if (regadm[i].ra_off>=0)
2394 fprintf(codefile,"mov 0%lo(r5),%s\en",regadm[i].ra_off,
2398 regsave(regstr,off,size) char *regstr; long off; {
2400 fprintf(codefile,"/ Local %ld into %s\en",off,regstr);
2401 strcat(Rstring,regstr);
2402 regadm[n_regvars].ra_str = regstr;
2403 regadm[n_regvars].ra_off = off;
2409 fprintf(codefile,"jmp RT%s\en",Rstring);
2414 prolog(nlocals) full nlocals; {
2417 fprintf(codefile,"mov r5,-(sp)\enmov sp,r5\en");
2421 fprintf(codefile,"tst -(sp)\en");
2423 fprintf(codefile,"sub $0%o,sp\en",nlocals);
2429 mes(type) word type; {
2432 switch ( (int)type ) {
2435 switch ( argt=getarg(
2436 ptyp(sp_cend)|ptyp(sp_pnam)|sym_ptyp) ) {
2441 fprintf(codefile,".globl %s\en",argstr) ;
2446 while ( getarg(any_ptyp) != sp_cend ) ;
2452 ".text", /* SEGTXT */
2453 ".data", /* SEGCON */
2454 ".data", /* SEGROM */
2459 Internal workings of the code generator.
2461 Description of tables.c and tables.h contents
2463 In this section the intermediate files will be described
2464 that are produced by
2466 and compiled with machine independent code to produce a code generator.
2470 Tables.c contains a large number of initialized array's of all sorts.
2471 Description of each follows:
2477 Pseudo code interpreted by the code generator.
2478 Always starts with some opcode followed by operands depending
2480 Some of the opcodes have an argument encoded in the upper three
2481 bits of the opcode byte.
2482 Integers in this table are between 0 and 32767 and have a one byte
2483 encoding if between 0 and 127.
2487 The format used for output of words.
2491 Number of computed static register class per register.
2492 Two registers are in the same class if they have the same properties
2493 and don't share a common subregister.
2495 struct reginfo machregs[]
2498 Initialized with representation string, size,
2499 members of the register and set of registers affected when this
2501 Also contains room for run time information,
2502 like contents and reference count.
2506 Information per tokentype.
2507 Initialized with size, cost, type of operands and formatstring.
2511 List of triples representing expressions for the code generator.
2513 string codestrings[]
2516 All strings are put in a list and checked for duplication,
2517 so only one copy per string will reside here.
2521 List of token expression sets.
2522 Bit 0 of the set is used for the SCRATCH property of registers,
2523 bit 1 upto NREG are for the corresponding registers
2524 and bit NREG+1 upto the end are for corresponding tokens.
2526 inst_t tokeninstances[]
2528 List of descriptions for building tokens.
2529 Contains type of rule for building one,
2530 plus operands depending on the type.
2535 Contains token expressions for source and destination
2536 plus index for code rule.
2541 Contains token expressions for source
2542 plus index for code rule.
2547 This is structured internally as chains of patterns,
2548 each chain pointed at by pathash[].
2549 After each pattern the list of possible code rules is given.
2553 Indices into pattern[] for all patterns with a certain low order
2554 byte of the hashing function.
2558 List of rules to stack tokens.
2559 Contains token expressions,
2566 List of splitting coercions.
2574 List of one to one coercions.
2580 struct reginfo **reglist[]
2582 List of lists of pointers to register information.
2583 For every property the list is here
2584 to find the registers corresponding to it.
2589 In tables.h various derived constants for the tables are
2591 They are then used to determine array sizes in the actual code generator,
2592 plus loop termination in some cases.
2594 Other important data structures
2596 During code generation some other data structures are used
2597 and here is a short description of some of the important ones.
2599 Tokens are kept in the code generator as a struct consisting of
2602 which is -1 if the token is a register,
2603 and the number of the token otherwise,
2608 of which the first is the register number in case of a register.
2610 The fakestack is an array of these tokens,
2611 there is a global variable
2614 The results of expressions are kept in a struct
2618 giving the type of the expression:
2625 which contains the real result.
2627 A tour through the sources
2631 The file codegen.c contains one large function consisting
2632 of one giant switch statement.
2633 It is the interpreter for the code generator pseudo code
2634 as contained in code rules[].
2635 This function can call itself recursively when doing look ahead.
2638 Pointer into code rules, pseudo program counter.
2640 Number of EM pattern look ahead allowed.
2642 Boolean telling whether this is the toplevel codegen() or
2643 a deeper incarnation.
2645 A cutoff value to limit searches.
2646 If the cost crosses costlimit the incarnation can terminate.
2648 A register number if nonzero.
2649 This is used inside coercions to force the allocate() call to allocate
2650 a register determined by earlier look ahead.
2652 The instructions inplemented in the switch:
2656 Prints debugging information if the code generator runs in debug mode.
2657 This information is only generated if
2659 was called with the -d flag.
2663 Matches the next EM pattern and does look ahead if necessary to find the best
2664 code rule associated with this pattern.
2665 Heuristics are used to determine best code rule when possible.
2666 This is done by calling the distance() function.
2667 It can also handle the procedure mechanism.
2671 This sets the code generator in the state to do a from stack coercion.
2675 This is done when a match no longer has to be checked.
2676 Used when the nocoercions: trick is used in the table.
2680 This is the big one inside this function.
2681 It has the task to transform the contents of the current
2682 fake stack to match the pattern given after it.
2684 Since the code generator does not know combining coercions,
2685 i.e. there is no way to make a big token out of two smaller ones,
2686 the first thing done is to stack every token that is too small.
2687 After that all tokens too big are split if possible to the right size.
2689 Next the coercions are sought that would transform tokens in place to
2690 the right one, plus the coercions that would pop tokens of the stack.
2691 Each of those might need a register, so a list of registers is generated
2692 and at the end of looking for coercions the function
2694 is called to generate the list of all possible \fIn\fP-tuples,
2697 equals the number of registers needed.
2699 Look ahead is now performed if the number of tuples is greater than one.
2700 If no possibility is found within the costlimit,
2701 the fake stack is made smaller by pushing the bottom token,
2702 and this process is repeated until either a way is found or
2703 the fake stack is completely empty and there is still no way
2706 If there is a way the corresponding coercions are executed
2707 and the code is finished.
2711 Here the kills clause is executed, all tokens matched by the
2712 token expression plus boolean expression are pushed.
2713 In the current implementation there is no attempt to move those
2714 tokens to registers, but that is a possible future extension.
2718 This one temporarily decrements by one the reference count of all registers
2719 contained in the token given as argument.
2723 Here all temporary deallocates are made undone.
2727 This is the part that allocates a register and decides which one to use.
2730 argument was given its task is simple,
2731 otherwise some work must be done.
2732 First the list of possible registers is scanned,
2733 all free registers noted and it is noted whether any of those
2734 registers is already
2735 containing the initialization.
2736 If no registers are available some fakestack token is stacked and the
2737 process is repeated.
2739 After that if an exact match was found,
2740 the list of registers is reduced to one register matching exactly
2741 out of every register class.
2742 Now look ahead is performed if necessary and the register chosen.
2743 If an initialization was given the corresponding move is performed,
2744 otherwise the register is marked empty.
2748 This prints an instruction and its operands.
2749 Only done on toplevel.
2753 Calls the move() function in the code generator to implement the move
2754 instruction in the table.
2758 Calls the test() function in the code generator to implement the test
2759 instruction in the table.
2763 Marks the register that is its argument as empty.
2767 This is the token replacement part.
2768 It is also called if there is no token replacement because it has
2769 some other functions as well.
2771 First the tokens that will be pushed on the fake stack are computed
2772 and stored in a temporary array.
2773 Then the tokens that were matched in this rule are popped
2774 and their embedded registers have their reference count
2776 After that the replacement tokens are pushed.
2778 Finally all registers allocated in this rule have their reference count
2780 If they were not pushed on the fake stack they will be available again
2781 in the next code rule.
2785 Places replacement EM instructions back into the instruction stream.
2789 Accounts for cost as given in the code rule.
2793 Returns from this level of codegen().
2794 Is used at the end of coercions,
2799 This prints a label when the top element size mechanism is used. Only done on
2804 This module computes the various expressions as given
2805 in the enodes[] array.
2806 Nothing very special happens here,
2807 it is just a recursive function computing leaves
2808 of expressions and applying the operator.
2812 In this module the tuples() function is implemented.
2813 It is given the number of registers needed and
2814 a list of register lists and it constructs a list of tuples
2815 where the \fIn\fP'th register comes from the \fIn\fP'th list.
2816 Before the list is constructed however
2817 the dynamic register classes are computed.
2818 Two registers are in the same dynamic class if they are in the
2819 same static class and their contents is the same.
2821 After that the permute() recursive function is called to
2822 generate the list of tuples.
2823 After construction a generated tuple is added to the list
2824 if it is not already pairwise in the same class
2825 or if the register relations are not the same,
2826 i.e. if the first and second register share a common
2827 subregister in one tuple and not in the other they are considered different.
2831 This is the routine that does the reading of EM instructions
2832 and the handling of pseudos.
2833 The mach.c module provided by the table writer is included
2834 at the end of this module.
2835 The routine fillemlines() is called by nextem() at toplevel
2836 to make sure there are enough instruction to match.
2837 It fills the EM instruction buffer up to 5 places from the end to
2838 keep room for EM replacement instructions,
2841 The dopseudo() function performs the function of the pseudo last
2845 the corresponding label is saved with the contents of the
2847 to be available to the code generator later.
2848 The rest of the routines are small service routines for either
2849 input or data output.
2853 This module contains routines called by codegen() to generate the real
2854 code to the codefile.
2855 The function genstr() gets a string as argument and copies it to codefile.
2856 The prtoken() function interprets the tokenformat as given in
2861 This module maintains a list of global symbols that have a
2864 There are functions to enter a symbol and to find a symbol.
2868 This module contains routines to handle the top element size messages.
2872 Main routine of the code generator.
2873 Processes arguments and flags.
2874 Flags available are:
2876 Sets debug mode if the code generator was not compiled with
2877 the NDEBUG macro defined.
2878 The flag can be followed by a digit specifying the amount of debugging
2880 and by @labelname giving the start of debugging.
2881 Debug mode gives very long output on stderr indicating
2882 all steps of the code generation process including nesting
2883 of the codegen() function.
2885 Sets the look ahead depth to
2890 a well known word in chess playing programs.
2892 Sets the weight percentage for size in the cost function to
2895 Uses Euclides algorithm to simplify rationals.
2899 Function to implement the move instruction in the tables,
2900 register initialization and the test instruction and associated bookkeeping.
2901 First tests are made to try to prevent the move from really happening.
2902 After that, if there is an after that,
2903 the move rule is found and the code executed.
2907 The entry point of this module is nextem().
2908 It hashes the next three EM instructions,
2909 and uses the low order byte of the hash
2910 as an index into the array pathash[],
2911 to find a chain of patterns in the array
2913 that are all tried for a match.
2915 The function trypat() does most of the work
2917 When a pattern is found to match all instructions
2918 the operands of the instruction are placed into the dollar[] array.
2919 Then the boolean expression is tried.
2920 If it matches the function can return,
2921 leaving the operands still in the dollar[] array,
2922 so later in the code rule they can still be used.
2926 Collection of routines to handle registers.
2927 Reference count routines are here,
2928 chrefcount() and getrefcount(),
2929 plus routines to erase a single register or all of them,
2930 erasereg() and cleanregs().
2932 If NDEBUG hasn't been defined, here is also the routine that checks
2933 if the reference count kept with the register information is in
2934 agreement with the number of times it occurs on the fake stack.
2938 Module for string allocation and garbage collection.
2939 Contains entry points myalloc(),
2940 a routine calling malloc() and checking whether room is left,
2941 myfree(), just free(),
2942 popstr() a function called from state.c to free all strings
2943 made since the last saved status.
2944 Furthermore there is salloc() which has the size of the string as parameter
2945 and returns a pointer to the allocated space,
2946 while keeping a copy of the pointer for garbage allocation purposes.
2948 The function garbage_collect is called from codegen() at toplevel
2950 and checks all places where strings may reside to mark strings
2952 Strings not in use are returned to the pool of free space.
2956 Set of routines called to save current status and
2957 restore a previous saved state.
2961 Random set of leftover routines.
2965 Computes whether a certain token matches a certain token expression.
2966 Just computes a bitnumber according to the algorithm explained with
2968 and tests the bit and the boolean expression if it is there.
2972 These two functions compute a token from a description.
2973 They differ very slight, cinstance() is used to compute
2974 the result of a coercion in a certain context
2975 and therefore has more arguments, which it uses instead of
2976 the global information instance() works on.
2980 eqtoken computes whether two tokens can be considered identical.
2981 Used to check register contents during moves mainly.
2985 This is the heuristic function that computes a distance from
2986 the current fake stack contents to the token pattern in the table.
2987 It likes exact matches most, then matches where at least the sizes are correct
2988 and if the sizes are not correct it likes too large sizes more than too
2989 small, since splitting a token is easier than combining one.
2993 This function tries to find a splitting coercion
2994 and executes it immediately when found.
2995 The fake stack is shuffled thoroughly when this happens,
2996 so pieces below the token that must be split are saved first.
3000 This function executes a coercion that was found.
3001 The same shuffling is done, so the top of the stack is again saved.
3005 This function gets a pointer into the fake stack and must stack
3006 every token including the one pointed at up to the bottom of the fake stack.
3007 The first stacking rule possible is used,
3008 so rules using registers must come first.
3012 Looks for a one to one coercion, if found it returns a pointer
3013 to it and leaves a list of possible registers to use in the global
3014 variable curreglist.
3015 This is used by codegen().
3019 Global variables used by more than one module.
3020 External definitions are in extern.h.