4 Like most phases, SR deals with one procedure
6 Within a procedure, SR works on one loop at a time.
7 Loops are processed in textual order.
8 If loops are nested inside each other,
9 SR starts with the outermost loop and proceeds in the
12 because it enables the optimization
13 of multi-dimensional array address computations,
14 if the elements are accessed in the usual way
15 (i.e. row after row, rather than column after column).
16 For every loop, SR first detects all induction variables
17 and then tries to recognize
18 expressions that can be optimized.
20 Finding induction variables
22 The process of finding induction variables
23 can conveniently be split up
25 First, the EM text of the loop is scanned to find
26 all \fIcandidate\fR induction variables,
27 which are word-sized local variables
28 that are assigned precisely once
29 in the loop, within a firm block.
30 Second, for every candidate, the single assignment
31 is inspected, to see if it has the form
32 required by the definition of an induction variable.
34 Candidates are found by scanning the EM code of the loop.
35 During this scan, two sets are maintained.
36 The set "cand" contains all variables that were
37 assigned exactly once so far, within a firm block.
38 The set "dismiss" contains all variables that
39 should not be made a candidate.
40 Initially, both sets are empty.
41 If a variable is assigned to, it is put
42 in the cand set, if three conditions are met:
44 the variable was not in cand or dismiss already
46 the assignment takes place in a firm block
48 the assignment is not a ZRL instruction (assignment by zero)
49 or a SDL instruction (store double local).
51 If any condition fails, the variable is dismissed from cand
52 (if it was there already) and put in dismiss
53 (if it was not there already).
55 All variables for which no register message was generated (i.e. those
56 variables that may be accessed indirectly) are assumed
57 to be changed in the loop.
59 All variables that remain in cand are candidate induction variables.
61 From the set of candidates, the induction variables can
62 be determined, by inspecting the single assignment.
63 The assignment must match one of the EM patterns below.
64 ('x' is the candidate. 'ws' is the word size of the target machine.
69 \fIpattern\fR \fIstep size\fR
72 LOL x ; (INC | DEC) ; STL x | +1 | -1
73 LOL x ; LOC n ; (ADI ws | SBI ws) ; STL x | +n | -n
74 LOC n ; LOL x ; ADI ws ; STL x +n
77 From the patterns the step size of the induction variable
78 can also be determined.
79 These step sizes are displayed on the right hand side.
81 For every induction variable we maintain the following information:
83 the offset of the variable in the stackframe of its procedure
85 a pointer to the EM text of the assignment statement
90 Optimizing expressions
92 If any induction variables of the loop were found,
93 the EM text of the loop is scanned again,
94 to detect expressions that can be optimized.
95 SR scans for multiplication and array instructions.
96 Whenever it finds such an instruction, it analyses the
98 If an expression is to be optimized, it must
99 be generated by the following syntax rules.
106 address iv_expr address array_instr;
118 An 'address' is an EM instruction that loads an
119 address on the stack.
120 An instruction like LOL may be an 'address', if
121 the size of an address (pointer size, =ps) is
122 the same as the word size.
123 If the pointer size is twice the word size,
124 instructions like LDL are an 'address'.
125 (The addresses in the third grammar rule
126 denote resp. the array address and the
127 array descriptor address).
141 The notion of an iv-expression was introduced earlier.
147 iv_expr iv_expr binary_op |
161 LOL x if x is not changed in loop ;
163 LOL x if x is an induction variable ;
166 An iv_expression must satisfy one additional constraint:
167 it must use exactly one operand that is an induction
169 A simple, hand written, top-down parser is used
170 to recognize an iv-expression.
171 It scans the EM code from right to left
172 (recall that EM is essentially postfix).
173 It uses semantic attributes (inherited as well as
174 derived) to check the additional constraint.
176 All information assembled during the recognition
177 process is put in a 'code_info' structure.
178 This structure contains the following information:
180 the optimizable code itself
182 the loop and basic block the code is part of
184 the induction variable
188 the sign of the induction variable in the
191 the offset and size of the temporary local variable
193 the expensive operator (MLI, LAR etc.)
195 the instruction that loads the constant
196 (for multiplication) or the array descriptor
199 The entire transformation process is driven
201 As the EM text is represented internally
202 as a list, this process consists
203 mainly of straightforward list manipulations.
205 The initialization code must be put
206 immediately before the loop entry.
207 For this purpose a \fIheader block\fR is
208 created that has the loop entry block as
209 its only successor and that dominates the
211 The CFG and all relations (SUCC,PRED, IDOM, LOOPS etc.)
214 An EM instruction that will
215 replace the optimizable code
216 is created and put at the place of the old code.
217 The list representing the old optimizable code
218 is used to create a list for the initializing code,
220 Only two modifications are required:
222 if the expensive operator is a LAR or SAR,
223 it must be replaced by an AAR, as the initial value
224 of TMP is the \fIaddress\fR of the first
225 array element that is accessed.
227 code must be appended to store the result of the
230 Finally, code to increment TMP is created and put after
231 the code of the single assignment to the
233 The generated code uses either an integer addition
234 (ADI) or an integer-to-pointer addition (ADS)
237 SR maintains a set of all expressions that have already
238 been recognized in the present loop.
239 Such expressions are said to be \fIavailable\fR.
240 If an expression is recognized that is
242 no new temporary local variable is allocated for it,
243 and the code to initialize and increment the local