doc/ego/il/il5

   1 .NH 2
   2 Implementation
   3 .PP
   4 A major factor in the implementation
   5 of Inline Substitution is the requirement
   6 not to use an excessive amount of memory.
   7 IL essentially analyzes the entire program;
   8 it makes decisions based on which procedure calls
   9 appear in the whole program.
  10 Yet, because of the memory restriction, it is
  11 not feasible to read the entire program
  12 in main memory.
  13 To solve this problem, the IL phase has been
  14 split up into three subphases that are executed sequentially:
  15 .IP 1.
  16 analyze every procedure; see how it accesses its parameters;
  17 simultaneously collect all calls
  18 appearing in the whole program an put them
  19 in a \fIcall-list\fR.
  20 .IP 2.
  21 use the call-list and decide which calls will be substituted
  22 in line.
  23 .IP 3.
  24 take the decisions of subphase 2 and modify the
  25 program accordingly.
  26 .LP
  27 Subphases 1 and 3 scan the input program; only
  28 subphase 3 modifies it.
  29 It is essential that the decisions can be made
  30 in subphase 2
  31 without using the input program,
  32 provided that subphase 1 puts enough information
  33 in the call-list.
  34 Subphase 2 keeps the entire call-list in main memory
  35 and repeatedly scans it, to
  36 find the next best candidate for expansion.
  37 .PP
  38 We will specify the
  39 data structures used by IL before
  40 describing the subphases.
  41 .NH 3
  42 Data structures
  43 .NH 4
  44 The procedure table
  45 .PP
  46 In subphase 1 information is gathered about every procedure
  47 and added to the procedure table.
  48 This information is used by the heuristic rules.
  49 A proctable entry for procedure p has
  50 the following extra information:
  51 .IP -
  52 is it allowed to substitute an invocation of p in line?
  53 .IP -
  54 is it allowed to put any parameter of such a call in line?
  55 .IP -
  56 the size of p (number of EM instructions)
  57 .IP -
  58 does p 'fall through'?
  59 .IP -
  60 a description of the formal parameters that p accesses; this information
  61 is obtained by looking at the code of p. For every parameter f,
  62 we record:
  63 .RS
  64 .IP -
  65 the offset of f
  66 .IP -
  67 the type of f (word, double word, pointer)
  68 .IP -
  69 may the corresponding actual parameter be put in line?
  70 .IP -
  71 is f ever accessed indirectly?
  72 .IP -
  73 if f used: never, once or more than once?
  74 .RE
  75 .IP -
  76 the number of times p is called (see below)
  77 .IP -
  78 the file address of its call-count information (see below).
  79 .LP
  80 .NH 4
  81 Call-count information
  82 .PP
  83 As a result of Inline Substitution, some procedures may
  84 become useless, because all their invocations have been
  85 substituted in line.
  86 One of the tasks of IL is to keep track which
  87 procedures are no longer called.
  88 Note that IL is especially keen on procedures that are
  89 called only once
  90 (possibly as a result of expanding all other calls to it).
  91 So we want to know how many times a procedure
  92 is called \fIduring\fR Inline Substitution.
  93 It is not good enough to compute this
  94 information afterwards.
  95 The task is rather complex, because
  96 the number of times a procedure is called
  97 varies during the entire process:
  98 .IP 1.
  99 If a call to p is substituted in line,
 100 the number of calls to p gets decremented by 1.
 101 .IP 2.
 102 If a call to p is substituted in line,
 103 and p contains n calls to q, then the number of calls to q
 104 gets incremented by n.
 105 .IP 3.
 106 If a procedure p is removed (because it is no
 107 longer called) and p contains n calls to q,
 108 then the number of calls to q gets decremented by n.
 109 .LP
 110 (Note that p may be the same as q, if p is recursive).
 111 .sp 0
 112 So we actually want to have the following information:
 113 .DS
 114 NRCALL(p,q) = number of call to q appearing in p,
 115
 116 for all procedures p and q that may be put in line.
 117 .DE
 118 This information, called \fIcall-count information\fR is
 119 computed by the first subphase.
 120 It is stored in a file.
 121 It is represented as a number of lists, rather than as
 122 a (very sparse) matrix.
 123 Every procedure has a list of (proc,count) pairs,
 124 telling which procedures it calls, and how many times.
 125 The file address of its call-count list is stored
 126 in its proctable entry.
 127 Whenever this information is needed, it is fetched from
 128 the file, using direct access.
 129 The proctable entry also contains the number of times
 130 a procedure is called, at any moment.
 131 .NH 4
 132 The call-list
 133 .PP
 134 The call-list is the major data structure use by IL.
 135 Every item of the list describes one procedure call.
 136 It contains the following attributes:
 137 .IP -
 138 the calling procedure (caller)
 139 .IP -
 140 the called procedure (callee)
 141 .IP -
 142 identification of the CAL instruction (sequence number)
 143 .IP -
 144 the loop nesting level; our heuristic rules appreciate
 145 calls inside a loop (or even inside a loop nested inside
 146 another loop, etc.) more than other calls
 147 .IP -
 148 the actual parameter expressions involved in the call;
 149 for every actual, we record:
 150 .RS
 151 .IP -
 152 the EM code of the expression
 153 .IP -
 154 the number of bytes of its result (size)
 155 .IP -
 156 an indication if the actual may be put in line
 157 .RE
 158 .LP
 159 The structure of the call-list is rather complex.
 160 Whenever a call is expanded in line, new calls
 161 will suddenly appear in the program,
 162 that were not contained in the original body
 163 of the calling subroutine.
 164 These calls are inherited from the called procedure.
 165 We will refer to these invocations as \fInested calls\fR
 166 (see Fig. 5.1).
 167 .DS
 168 .TS
 169 lw(2.5i) l.
 170 procedure p is
 171 begin   .
 172      a();       .
 173      b();       .
 174 end;
 175 .TE
 176
 177 .TS
 178 lw(2.5i) l.
 179 procedure r is  procedure r is
 180 begin   begin
 181      x();           x();
 182      p();  -- in line       a();  -- nested call
 183      y();           b();  -- nested call
 184 end;        y();
 185         end;
 186 .TE
 187
 188 Fig. 5.1 Example of nested procedure calls
 189 .DE
 190 Nested calls may subsequently be put in line too
 191 (probably resulting in a yet deeper nesting level, etc.).
 192 So the call-list does not always reflect the source program,
 193 but changes dynamically, as decisions are made.
 194 If a call to p is expanded, all calls appearing in p
 195 will be added to the call-list.
 196 .sp 0
 197 A convenient and elegant way to represent
 198 the call-list is to use a LISP-like list.
 199 .[
 200 poel lisp trac
 201 .]
 202 Calls that appear at the same level
 203 are linked in the CDR direction. If a call C
 204 to a procedure p is expanded,
 205 all calls appearing in p are put in a sub-list
 206 of C, i.e. in its CAR.
 207 In the example above, before the decision
 208 to expand the call to p is made, the
 209 call-list of procedure r looks like:
 210 .DS
 211 (call-to-x, call-to-p, call-to-y)
 212 .DE
 213 After the decision, it looks like:
 214 .DS
 215 (call-to-x, (call-to-p*, call-to-a, call-to-b), call-to-y)
 216 .DE
 217 The call to p is marked, because it has been
 218 substituted.
 219 Whenever IL wants to traverse the call-list of some procedure,
 220 it uses the well-known LISP technique of
 221 recursion in the CAR direction and
 222 iteration in the CDR direction
 223 (see page 1.19-2 of
 224 .[
 225 poel lisp trac
 226 .]
 227 ).
 228 All list traversals look like:
 229 .DS
 230 traverse(list)
 231 {
 232     for (c = first(list); c != 0; c = CDR(c)) {
 233         if (c is marked) {
 234             traverse(CAR(c));
 235         } else {
 236             do something with c
 237         }
 238     }
 239 }
 240 .DE
 241 The entire call-list consists of a number of LISP-like lists,
 242 one for every procedure.
 243 The proctable entry of a procedure contains a pointer
 244 to the beginning of the list.
 245 .NH 3
 246 The first subphase: procedure analysis
 247 .PP
 248 The tasks of the first subphase are to determine
 249 several attributes of every procedure
 250 and to construct the basic call-list,
 251 i.e. without nested calls.
 252 The size of a procedure is determined
 253 by simply counting its EM instructions.
 254 Pseudo instructions are skipped.
 255 A procedure does not 'fall through' if its CFG
 256 contains a basic block
 257 that is not the last block of the CFG and
 258 that ends on a RET instruction.
 259 The formal parameters of a procedure are determined
 260 by inspection of
 261 its code.
 262 .PP
 263 The call-list in constructed by looking at all CAL instructions
 264 appearing in the program.
 265 The call-list should only contain calls to procedures
 266 that may be put in line.
 267 This fact is only known if the procedure was
 268 analyzed earlier.
 269 If a call to a procedure p appears in the program
 270 before the body of p,
 271 the call will always be put in the call-list.
 272 If p is later found to be unsuitable,
 273 the call will be removed from the list by the
 274 second subphase.
 275 .PP
 276 An important issue is the recognition
 277 of the actual parameter expressions of the call.
 278 The front ends produces messages telling how many
 279 bytes of formal parameters every procedure accesses.
 280 (If there is no such message for a procedure, it
 281 cannot be put in line).
 282 The actual parameters together must account for
 283 the same number of bytes.A recursive descent parser is used
 284 to parse side-effect free EM expressions.
 285 It uses a table and some
 286 auxiliary routines to determine
 287 how many bytes every EM instruction pops from the stack
 288 and how many bytes it pushes onto the stack.
 289 These numbers depend on the EM instruction, its argument,
 290 and the wordsize and pointersize of the target machine.
 291 Initially, the parser has to recognize the
 292 number of bytes specified in the formals-message,
 293 say N.
 294 Assume the first instruction before the CAL pops S bytes
 295 and pushes R bytes.
 296 If R > N, too many bytes are recognized
 297 and the parser fails.
 298 Else, it calls itself recursively to recognize the
 299 S bytes used as operand of the instruction.
 300 If it succeeds in doing so, it continues with the next instruction,
 301 i.e. the first instruction before the code recognized by
 302 the recursive call, to recognize N-R more bytes.
 303 The result is a number of EM instructions that collectively push N bytes.
 304 If an instruction is come across that has side-effects
 305 (e.g. a store or a procedure call) or of which R and S cannot
 306 be computed statically (e.g. a LOS), it fails.
 307 .sp 0
 308 Note that the parser traverses the code backwards.
 309 As EM code is essentially postfix code, the parser works top down.
 310 .PP
 311 If the parser fails to recognize the parameters, the call will not
 312 be substituted in line.
 313 If the parameters can be determined, they still have to
 314 match the formal parameters of the called procedure.
 315 This check is performed by the second subphase; it cannot be
 316 done here, because it is possible that the called
 317 procedure has not been analyzed yet.
 318 .PP
 319 The entire call-list is written to a file,
 320 to be processed by the second subphase.
 321 .NH 3
 322 The second subphase: making decisions
 323 .PP
 324 The task of the second subphase is quite easy
 325 to understand.
 326 It reads the call-list file,
 327 builds an incore call-list and deletes every
 328 call that may not be expanded in line (either because the called
 329 procedure may not be put in line, or because the actual parameters
 330 of the call do not match the formal parameters of the called procedure).
 331 It assigns a \fIpay-off\fR to every call,
 332 indicating how desirable it is to expand it.
 333 .PP
 334 The subphase repeatedly scans the call-list and takes
 335 the call with the highest ratio.
 336 The chosen one gets marked,
 337 and the call-list is extended with the nested calls,
 338 as described above.
 339 These nested calls are also assigned a ratio,
 340 and will be considered too during the next scans.
 341 .sp 0
 342 After every decision the number of times
 343 every procedure is called is updated, using
 344 the call-count information.
 345 Meanwhile, the subphase keeps track of the amount of space left
 346 available.
 347 If all space is used, or if there are no more calls left to
 348 be expanded, it exits this loop.
 349 Finally, calls to procedures that are called only
 350 once are also chosen.
 351 .PP
 352 The actual parameters of a call are only needed by
 353 this subphase to assign a ratio to a call.
 354 To save some space, these actuals are not kept in main memory.
 355 They are removed after the call has been read and a ratio
 356 has been assigned to it.
 357 So this subphase works with \fIabstracts\fR of calls.
 358 After all work has been done,
 359 the actual parameters of the chosen calls are retrieved
 360 from a file,
 361 as they are needed by the transformation subphase.
 362 .NH 3
 363 The third subphase: doing transformations
 364 .PP
 365 The third subphase makes the actual modifications to
 366 the EM text.
 367 It is directed by the decisions made in the previous subphase,
 368 as expressed via the call-list.
 369 The call-list read by this subphase contains
 370 only calls that were selected for expansion.
 371 The list is ordered in the same way as the EM text,
 372 i.e. if a call C1 appears before a call C2 in the call-list,
 373 C1 also appears before C2 in the EM text.
 374 So the EM text is traversed linearly,
 375 the calls that have to be substituted are determined
 376 and the modifications are made.
 377 If a procedure is come across that is no longer needed,
 378 it is simply not written to the output EM file.
 379 The substitution of a call takes place in distinct steps:
 380 .IP "change the calling sequence" 7
 381 .sp 0
 382 The actual parameter expressions are changed.
 383 Parameters that are put in line are removed.
 384 All remaining ones must store their result in a
 385 temporary local variable, rather than
 386 push it on the stack.
 387 The CAL instruction and any ASP (to pop actual parameters)
 388 or LFR (to fetch the result of a function)
 389 are deleted.
 390 .IP "fetch the text of the called procedure"
 391 .sp 0
 392 Direct disk access is used to to read the text of the
 393 called procedure.
 394 The file offset is obtained from the proctable entry.
 395 .IP "allocate bytes for locals and temporaries"
 396 .sp 0
 397 The local variables of the called procedure will be put in the
 398 stack frame of the calling procedure.
 399 The same applies to any temporary variables
 400 that hold the result of parameters
 401 that were not put in line.
 402 The proctable entry of the caller is updated.
 403 .IP "put a label after the CAL"
 404 .sp 0
 405 If the called procedure contains a RET (return) instruction
 406 somewhere in the middle of its text (i.e. it does
 407 not fall through), the RET must be changed into
 408 a BRA (branch), to jump over the
 409 remainder of the text.
 410 This label is not needed if the called
 411 procedure falls through.
 412 .IP "copy the text of the called procedure and modify it"
 413 .sp 0
 414 References to local variables of the called routine
 415 and to parameters that are not put in line
 416 are changed to refer to the
 417 new local of the caller.
 418 References to in line parameters are replaced
 419 by the actual parameter expression.
 420 Returns (RETs) are either deleted or
 421 replaced by a BRA.
 422 Messages containing information about local
 423 variables or parameters are changed.
 424 Global data declarations and the PRO and END pseudos
 425 are removed.
 426 Instruction labels and references to them are
 427 changed to make sure they do not have the
 428 same identifying number as
 429 labels in the calling procedure.
 430 .IP "insert the modified text"
 431 .sp 0
 432 The pseudos of the called procedure are put after the pseudos
 433 of the calling procedure.
 434 The real text of the callee is put at
 435 the place where the CAL was.
 436 .IP "take care of nested substitutions"
 437 .sp 0
 438 The expanded procedure may contain calls that
 439 have to be expanded too (nested calls).
 440 If the descriptor of this call contains actual
 441 parameter expressions,
 442 the code of the expressions has to be changed
 443 the same way as the code of the callee was changed.
 444 Next, the entire process of finding CALs and doing
 445 the substitutions is repeated recursively.
 446 .LP