doc/ego/il/il4

   1 .NH 2
   2 Heuristic rules
   3 .PP
   4 Using the information described
   5 in the previous section,
   6 we can find all calls that can
   7 be expanded in line, and for which
   8 this expansion is desirable.
   9 In general, we cannot expand all these calls,
  10 so we have to choose the 'best' ones.
  11 With every CAL instruction
  12 that may be expanded, we associate
  13 a \fIpay off\fR,
  14 which expresses how desirable it is
  15 to expand this specific CAL.
  16 .sp
  17 Let Tc denote the portion of EM text involved
  18 in a specific call, i.e. the pushing of the actual
  19 parameter expressions, the CAL itself,
  20 the popping of the parameters and the
  21 pushing of the result (if any, via an LFR).
  22 Let Te denote the EM text that would be obtained
  23 by expanding the call in line.
  24 Let Pc be the original program and Pe the program
  25 with Te substituted for Tc.
  26 The pay off of the CAL depends on two factors:
  27 .IP -
  28 T = execution_time(Pe) - execution_time(Pc)
  29 .IP -
  30 S = code_size(Pe) - code_size(Pc)
  31 .LP
  32 The change in execution time (T) depends on:
  33 .IP -
  34 T1 = execution_time(Te) - execution_time(Tc)
  35 .IP -
  36 N = number of times Te or Tc get executed.
  37 .LP
  38 We assume that T1 will be the same every
  39 time the code gets executed.
  40 This is a reasonable assumption.
  41 (Note that we are talking about one CAL,
  42 not about different calls to the same procedure).
  43 Hence
  44 .DS
  45 T = N * T1
  46 .DE
  47 T1 can be estimated by a careful analysis
  48 of the transformations that are performed.
  49 Below, we list everything that will be
  50 different when a call is expanded in line:
  51 .IP -
  52 The CAL instruction is not executed.
  53 This saves a subroutine jump.
  54 .IP -
  55 The instructions in the procedure prolog
  56 are not executed.
  57 These instructions, generated from the PRO pseudo,
  58 save some machine registers
  59 (including the old LB), set the new LB and allocate space
  60 for the locals of the called routine.
  61 The savings may be less if there are no
  62 locals to allocate.
  63 .IP -
  64 In line parameters are not evaluated before the call
  65 and are not pushed on the stack.
  66 .IP -
  67 All remaining parameters are stored in local variables,
  68 instead of being pushed on the stack.
  69 .IP -
  70 If the number of parameters is nonzero,
  71 the ASP instruction after the CAL is not executed.
  72 .IP -
  73 Every reference to an in line parameter is
  74 substituted by the parameter expression.
  75 .IP -
  76 RET (return) instructions are replaced by
  77 BRA (branch) instructions.
  78 If the called procedure 'falls through'
  79 (i.e. it has only one RET, at the end of its code),
  80 even the BRA is not needed.
  81 .IP -
  82 The LFR (fetch function result) is not executed
  83 .PP
  84 Besides these changes, which are caused directly by IL,
  85 other changes may occur as IL influences other optimization
  86 techniques, such as Register Allocation and Constant Propagation.
  87 Our heuristic rules do not take into account the quite
  88 inpredictable effects on Register Allocation.
  89 It does, however, favour calls that have numeric \fIconstants\fR
  90 as parameter; especially the constant "0" as an inline
  91 parameter gets high scores,
  92 as further optimizations may often be possible.
  93 .PP
  94 It cannot be determined statically how often a CAL instruction gets
  95 executed.
  96 We will use \fIloop nesting\fR information here.
  97 The nesting level of the loop in which
  98 the CAL appears (if any) will be used as an
  99 indication for the number of times it gets executed.
 100 .PP
 101 Based on all these facts,
 102 the pay off of a call will be computed.
 103 The following model was developed empirically.
 104 Assume procedure P calls procedure Q.
 105 The call takes place in basic block B.
 106 .DS
 107 .TS
 108 l l l.
 109 ZP      \&=     # zero parameters
 110 CP      \&=     # constant parameters - ZP
 111 LN      \&=     Loop Nesting level (0 if outside any loop)
 112 F       \&=     \fIif\fR # formal parameters of Q > 0 \fIthen\fR 1 \fIelse\fR 0
 113 FT      \&=     \fIif\fR Q falls through \fIthen\fR 1 \fIelse\fR 0
 114 S       \&=     size(Q) - 1 - # inline_parameters - F
 115 L       \&=     \fIif\fR # local variables of P > 0 \fIthen\fR 0 \fIelse\fR -1
 116 A       \&=     CP + 2 * ZP
 117 N       \&=     \fIif\fR LN=0 and P is never called from a loop \fIthen\fR 0 \fIelse\fR (LN+1)**2
 118 FM      \&=     \fIif\fR B is a firm block \fIthen\fR 2 \fIelse\fR 1
 119
 120 pay_off \&=     (100/S + FT + F + L + A) * N * FM
 121 .TE
 122 .DE
 123 S stands for the size increase of the program,
 124 which is slightly less than the size of Q.
 125 The size of a procedure is taken to be its number
 126 of (non-pseudo) EM instructions.
 127 The terms "loop nesting level" and "firm" were defined
 128 in the chapter on the Intermediate Code (section "loop tables").
 129 If a call is not inside a loop and the calling procedure
 130 is itself never called from a loop (transitively),
 131 then the call will probably be executed at most once.
 132 Such a call is never expanded in line (its pay off is zero).
 133 If the calling procedure doesn't have local variables, a penalty (L)
 134 is introduced, as it will most likely get local variables if the
 135 call gets expanded.