.\" $Id: appB,v 2.3 1994/06/24 10:05:16 ceriel Exp $
How to use the interpreter
The interpreter is not normally used for the debugging of programs under
construction. Its primary application is as a verification tool for almost
completed programs. Although the proper operation of the interpreter is
obviously a black art, this chapter tries to provide some guidelines.
For the sake of the argument, the source language is assumed to be C, but most
hints apply equally well to other languages supported by ACK.
Start with a test case of trivial size; to be on the safe side, reckon with a
time dilation factor of about 500, i.e., a second grows into more than 8 minutes.
(The interpreter takes 0.5 msec to do one EM instruction on a Sun 3/50.)
Fortunately many trivial test cases are much shorter than one second.
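The arithmetic behind such an estimate can be sketched in C. The cost of 0.5 msec per EM instruction and the factor of about 500 are taken from the text; the function names are hypothetical and serve only as illustration.

```c
/* Cost of one EM instruction in the interpreter, in microseconds.
 * Taken from the text: 0.5 msec on a Sun 3/50. */
#define USEC_PER_EM_INSTRUCTION 500L

/* Estimated interpretation time, in microseconds, for a run that
 * executes `instructions` EM instructions. */
long interpreted_usec(long instructions)
{
    return instructions * USEC_PER_EM_INSTRUCTION;
}

/* Estimated interpreted run time, in seconds, of a program that takes
 * `native_seconds` when run natively, given a slowdown factor. */
long dilated_seconds(long native_seconds, long factor)
{
    return native_seconds * factor;
}
```

For example, a run of 4124 EM instructions is estimated at 2062000 microseconds, i.e. about two seconds, and one native second at a factor of 500 becomes 500 seconds.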
Compile the program into an \fIe.out\fP, the EM machine version of an
\fIa.out\fP, by calling \fIem22\fP (for 2-byte integers and 2-byte pointers),
\fIem24\fP (for 2 and 4) or \fIem44\fP (for 4 and 4) as seems appropriate;
if in doubt, use \fIem44\fP. These compilers can be found in the ACK
\fIbin\fP directory, and should be used instead of \fIacc\fP (or normal
\fIcc\fP). Alternatively, \fIacc \-memNN\fP can be used instead of
\fIemNN\fP.
If a C program consists of more than one file, as it usually does, there is
a small problem. The \fIacc\fP and \fIcc\fP compilers generate .o files,
whereas the \fIemNN\fP compilers generate .m files as object files.
A simple technique to avoid the problem is to call
if possible. If not, the following hack on the \fIMakefile\fP generally works.
Make sure the \fIMakefile\fP is reasonably clean and complete: all calls to
the compiler are through \fI$(CC)\fP, \fICFLAGS\fP is used properly and all
dependencies are specified.
Add the following lines to the \fIMakefile\fP (possibly permanently):
\&	$(CC) \-c $(CFLAGS) $<
Set CC to \fIem44 \-.c\fP (for example). Make sure CFLAGS includes
the \-O option; this yields a speed-up of about 15%.
Change all .o to .m (or .k if the \-O option is not used).
If necessary, change \fIa.out\fP to \fIe.out\fP.
With these changes, \fImake\fP will produce an EM object;
\fIesize\fP can be used to verify that it is indeed an EM object and obtain some
statistics. Then call the interpreter:
int <EM-object-file> [ parameters ]
where the parameters are the normal parameters of the program. This should
work exactly like the original program, though slower. It reads from the
terminal if the original does, it opens and closes files like the original and
it accepts interrupts.
.I "Interpreting the results"
Now there are several possibilities.
It does all this. Great! This means the program
does not do very uncouth things. Now
read the file \fIint.mess\fP to see if any messages were generated. If there
are none, the program did not really run (perhaps the original cc \fIa.out\fP
got called instead?). Normally there is at least a termination message like
(Message): program exits with status 0 at "awa.p", line 64, INR = 4124
This says that the program terminated through an exit(0) on line 64 of the
file \fIawa.p\fP after 4124 EM instructions.
If this is the only message, it is time to move to a bigger test case.
On the other hand, the program may come to a grinding halt with an error
message.
All messages (errors and warnings) have a format in which the sequence
"<file name>", line <ln#>
occurs, which is the same sequence many compilers produce for their error
messages. Consequently, the \fIint.mess\fP file can be processed as any
compiler message output.
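As an illustration of such processing, the following C sketch picks the file name, line number and instruction count out of one message line. It assumes the tail of the line has the shape seen in the messages quoted in this chapter (at "<file name>", line <ln#>, INR = <n>); the structure and function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Position information carried by one line of int.mess. */
struct mess_pos {
    char file[256];   /* source file name, without the quotes */
    int  line;        /* source line number */
    long inr;         /* EM instruction count */
};

/* Fill *pos from a message line.  Returns 1 on success, 0 if the line
 * does not end in:  at "<file name>", line <ln#>, INR = <n>  */
int parse_mess_line(const char *msg, struct mess_pos *pos)
{
    const char *p, *last = NULL;

    /* Take the last occurrence of  at "  because the message text
     * itself may contain quoted strings. */
    for (p = strstr(msg, " at \""); p != NULL; p = strstr(p + 1, " at \""))
        last = p;
    if (last == NULL)
        return 0;
    return sscanf(last, " at \"%255[^\"]\", line %d, INR = %ld",
                  pos->file, &pos->line, &pos->inr) == 3;
}
```

Applied to each line of \fIint.mess\fP in turn, this yields the same (file, line) pairs that an editor or error browser would extract from compiler output.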
One such message can be
(Fatal error) a.em: trap "Addressing non existent memory" not caught at "a.c", line 2, INR = 16
produced by the abysmal program
Often the effects are more subtle, however. The program
produces the following five warnings (in far less than a second):
(Warning 47, #1): Local data pointer expected at "t.c", line 4, INR = 17
(Warning 61, cont.): Actual memory is undefined at "t.c", line 4, INR = 17
(Warning 102, #1): Returned function result too small at "<unknown>", line 0, INR = 21
(Warning 43, #1): Local integer expected at "exit.c", line 11, INR = 34
(Warning 61, cont.): Actual memory is undefined at "exit.c", line 11, INR = 34
The one about the function result looks the most frightening,
but is the most easily solved:
\fImain\fP is a function returning an int, so the start-up routine expects a
(four-byte) integer but gets an empty (zero-byte) return area.
\fINote\fP: The experts are divided about this. The traditional school holds
that \fImain\fP is an int function and its result is the return code; this
leaves them with two ways of supplying a return code: one as the parameter
of \fIexit()\fP and one as the result
of \fImain\fP. The modern school (Berkeley 4.2 etc.) claims that
return codes are supplied exclusively
by \fIexit()\fP, and they have an \fIexit(0)\fP in
the start-up routine, just after the call to \fImain()\fP; leaving \fImain()\fP
through the bottom implies successful termination.
We shall satisfy both groups by
(Warning 47, #1): Local data pointer expected at "t.c", line 4, INR = 17
(Warning 61, cont.): Actual memory is undefined at "t.c", line 4, INR = 17
(Message): program exits with status 0 at "exit.c", line 11, INR = 33
which is pretty clear as it stands.
.I "Using stack dumps"
Let's, for the sake of argument
and to avoid the fierce realism of 10000-line programs, assume that the above
still does not give enough information.
Since the error occurred in EM instruction number 17, we should like to see
more information around that moment. Call the interpreter again, now with the
interpreter variable AT set to 17:
(The interpreter has a number of internal variables that can be set by
assignments on the command line, as with \fImake\fP.)
This gives a file called \fIint.log\fP containing the
stack dump of 150 lines presented at the end of this chapter.
Since dumping is a subfacility of logging in the interpreter, the formats of
dump lines and log lines are
the same. If a line starts with an @, it will contain a file-name/line-number
indication; the next two characters are the subject and the log
level. Then comes the information, preceded by a space. The text contains
three stack dumps, one before the offending instruction, one at it, and one
after it; then the interpreter stops. All kinds of other dumps can be
obtained, but this is the default.
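The line layout just described can be taken apart with a few lines of C. This is a minimal sketch, assuming that the subject is a single character and the log level a single digit, as in the log reproduced at the end of this chapter; the structure and function names are hypothetical.

```c
#include <ctype.h>

/* The leading tag of one line of int.log. */
struct log_tag {
    int  has_position;  /* 1 if the line started with '@' */
    char subject;       /* subject character, e.g. 'x', 'w', 'L', 'd' */
    int  level;         /* log level, 0..9 */
    const char *info;   /* the information after the separating space */
};

/* Split a log line into its tag and its information.
 * Returns 1 on success, 0 if the line does not have the expected shape. */
int parse_log_tag(const char *line, struct log_tag *tag)
{
    tag->has_position = (line[0] == '@');
    if (tag->has_position)
        line++;                         /* skip the '@' */
    if (line[0] == '\0' || !isdigit((unsigned char)line[1]) || line[2] != ' ')
        return 0;
    tag->subject = line[0];
    tag->level = line[1] - '0';
    tag->info = line + 3;               /* points into the caller's string */
    return 1;
}
```

When has_position is set, the information starts with the file-name/line-number indication, so it can be handed on to whatever code already deals with the "<file name>", line <ln#> sequence.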
For each instruction we have, in order:
an @x9 line, giving the position in the program,
the messages, warnings and errors from the instruction as it is being executed,
dump(s), as requested.
The first two lines mean that at line 4 in file \fIt.c\fP the interpreter
performed its 16th instruction, with the Program Counter at 30 pointing at
opcode 180 in the text segment; the instruction was an LOL (LOad Local)
with the operand \-4 derived from the opcode. It copies the local at offset
\-4 to the top of the stack. The effect can be seen from the subsequent stack
dump, where the undefined word at addresses 2147483568 to ...571 (the variable
\fIa\fP) has been copied to the top of the stack at 2147483560 (copying
undefined values does not generate a warning).
Since we used the \fIem44\fP compiler, all pointers and ints in our dump are
4 bytes long.
So a variable at address X in reality extends from address X to X+3.
Note that this is not the offending instruction; this stack dump represents
the situation just before the error.
The stack consists of a sequence of frames, each containing data followed by
a Return Status Block resulting from a call; the last frame ends in
top-of-stack. The first frame represents the stack when the program starts,
through a call to the start-up routine. This routine prepares the second
stack frame with the actual parameters to \fImain()\fP:
\fIargc\fP at 2147483596, \fIargv\fP at 2147483600 and \fIenviron\fP at
2147483604.
The RSB line shows that the call to \fImain()\fP was made from procedure 0,
which has 0 locals, with PC at
16, an LB of 2147483608 and file name and line number still unknown.
The \fIcode\fP in the RSB tells how this RSB was made; possible values are STP
(start-up), CAL (call), RTT (returnable trap) and NRT (non-returnable trap).
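For readers who want to process RSB lines mechanically, here is a minimal C sketch that extracts the code, PC and LB fields from a line of the form shown in the dump (for instance "d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, ..."). The type and function names are hypothetical, and the PI, LIN and FIL fields are deliberately ignored.

```c
#include <stdio.h>
#include <string.h>

/* How an RSB was made; the four values are those listed in the text. */
enum rsb_code { RSB_STP, RSB_CAL, RSB_RTT, RSB_NRT, RSB_UNKNOWN };

struct rsb {
    enum rsb_code code;
    long pc;    /* the PC field of the RSB line */
    long lb;    /* the LB field of the RSB line */
};

static enum rsb_code rsb_code_from_name(const char *name)
{
    static const char *names[] = { "STP", "CAL", "RTT", "NRT" };
    int i;

    for (i = 0; i < 4; i++)
        if (strcmp(name, names[i]) == 0)
            return (enum rsb_code)i;
    return RSB_UNKNOWN;
}

/* Parse one RSB line.  Returns 1 on success, 0 if the line is not an
 * RSB line or lacks the PC or LB field. */
int parse_rsb(const char *line, struct rsb *out)
{
    char name[4];
    const char *p = strstr(line, "RSB: code = ");

    if (p == NULL || sscanf(p, "RSB: code = %3[A-Z]", name) != 1)
        return 0;
    out->code = rsb_code_from_name(name);
    if ((p = strstr(p, "PC = ")) == NULL || sscanf(p, "PC = %ld", &out->pc) != 1)
        return 0;
    if ((p = strstr(p, "LB = ")) == NULL || sscanf(p, "LB = %ld", &out->lb) != 1)
        return 0;
    return 1;
}
```

The closing line of a dump (the one starting with AB =) is not an RSB line, so parse_rsb() rejects it.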
The next frame shows the local variable(s) of \fImain()\fP; there are two of
them, the pointer \fIa\fP at 2147483568, which is undefined, and variable
\fIb\fP at 2147483564, which has the value 777. Then comes a copy of \fIa\fP,
just made by the LOL instruction, at 2147483560. The following line shows that
the Function Return Area (which does not reside at the end of the stack, but
just happens to be printed here) has size 0 and is presently undefined.
The stack dump ends
by showing that the Actuals Base is at 2147483596 (pointing at \fIargc\fP), the
Locals Base at 2147483572 (pointing just above the local \fIa\fP), the Stack
Pointer at 2147483560 (pointing at the undefined pointer), the line count is 4
and the file name is "t.c".
(Notice that there is one more stack frame than one would probably expect, the
one above the start-up routine.)
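The addresses in this frame follow from simple arithmetic on the Locals Base, which the following C sketch spells out. The numbers used in the example are the ones from the dump; the function names are hypothetical.

```c
/* Address of a local variable, given the Locals Base and the (negative)
 * offset used as operand by LOL and STL. */
long local_address(long lb, long offset)
{
    return lb + offset;
}

/* Address of the last byte of an object of `size` bytes at `addr`. */
long last_byte(long addr, long size)
{
    return addr + size - 1;
}
```

With LB = 2147483572, offset \-4 gives 2147483568 (the pointer \fIa\fP) and offset \-8 gives 2147483564 (the variable \fIb\fP); with 4-byte words, \fIa\fP occupies 2147483568 up to and including 2147483571.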
The Function Return Area
could have a size larger than 0 and still be undefined, for
example when an instruction that does not preserve the contents of the FRA has
just been executed; likewise the FRA could have size 0 and be defined
nevertheless, for example just after a RET 0 instruction.
All this has set the scene for the disaster which is about to strike in the
next instruction. This is indeed an LOI (LOad Indirect) of size 4, opcode 169;
it causes the message
warning: Local data pointer expected [stack.c: 242]
warning cont.: Actual memory is undefined
(detected in the interpreter file \fIstack.c\fP at line 242; this can be
useful for sorting out dubious semantics). We see that the effect, as shown in
the third frame of this stack dump (at instruction number 17), is somewhat
unexpected: the LOI has fetched the value 4 and stacked it. The reason is
that, unfortunately, undefinedness is not transitive in the interpreter. When
an undefined value is used in an operation (other than copying) a warning is
given, but thereafter the value is treated as if it were zero. So, after the
warning a normal null pointer remains, which is then used to pick up the value
at location 0. This is the place where the EM machine stores its current line
number, which is presently 4.
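The behaviour described here can be mimicked by a toy model in C. This is only a sketch of the idea (a shadow flag per word, a warning on use, and zero substituted afterwards); it is not the interpreter's actual implementation, it addresses memory by word index rather than by byte, and all names in it are hypothetical.

```c
#include <string.h>

#define MEM_WORDS 16

/* A tiny memory in which every word has a shadow flag saying whether it
 * has been defined. */
struct shadow_mem {
    long value[MEM_WORDS];
    unsigned char defined[MEM_WORDS];   /* 1 = defined, 0 = undefined */
    int warnings;                       /* number of warnings given */
};

void mem_init(struct shadow_mem *m)
{
    memset(m, 0, sizeof *m);            /* everything undefined */
}

void mem_store(struct shadow_mem *m, int addr, long v)
{
    m->value[addr] = v;
    m->defined[addr] = 1;
}

/* Copying moves the shadow flag along and never warns. */
void mem_copy(struct shadow_mem *m, int dst, int src)
{
    m->value[dst] = m->value[src];
    m->defined[dst] = m->defined[src];
}

/* Using a word as an operand: warn if it is undefined, then carry on as
 * if it were zero.  The undefinedness is not propagated. */
long mem_use(struct shadow_mem *m, int addr)
{
    if (!m->defined[addr]) {
        m->warnings++;
        return 0;
    }
    return m->value[addr];
}

/* Load indirect: use the word at ptr_addr as a pointer and copy the
 * word it points at to dst.  Returns 0 if the pointer is out of range. */
int mem_load_indirect(struct shadow_mem *m, int ptr_addr, int dst)
{
    long target = mem_use(m, ptr_addr);

    if (target < 0 || target >= MEM_WORDS)
        return 0;
    mem_copy(m, dst, (int)target);
    return 1;
}
```

If word 0 holds the current line number 4 and an undefined word is first copied and then used as a pointer, the copy gives no warning, the load gives one warning, and the result is a perfectly defined 4, just as in the dump.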
The third stack dump shows the final effect: the value 4 has been unstacked
and copied to variable \fIb\fP at 2147483564 through an STL (STore Local)
instruction.
Since this form of logging dumps the stack only, the log file is relatively
short.
Nevertheless, a useful excerpt can be obtained with the command
This extracts the Return Status Block lines from the log, thus producing three
traces of calls, one for each instruction in the log:
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, LIN = 4, FIL = "t.c"
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, LIN = 4, FIL = "t.c"
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483564, HP = 848, LIN = 4, FIL = "t.c"
Theoretically, the pertinent trace is the middle one, but in practice all three
show the same calls. In the present case there isn't much to trace, but in real programs
the trace can be useful.
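All lines of the excerpt carry the tag d1, so a filter on that tag is enough to produce it. The C sketch below is one possible way of doing this; it is an illustration with hypothetical function names, it assumes log lines shorter than 512 characters, and it makes no claim about which command was originally used.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `line` is a level-1 dump line ("d1 ..."); in the log
 * shown in this chapter these are the RSB lines and the closing
 * AB/LB/SP line of each stack dump. */
int is_trace_line(const char *line)
{
    return strncmp(line, "d1 ", 3) == 0;
}

/* Copy all trace lines from `in` to `out`; returns the number copied. */
long extract_trace(FILE *in, FILE *out)
{
    char buf[512];
    long count = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        if (is_trace_line(buf)) {
            fputs(buf, out);
            count++;
        }
    }
    return count;
}
```

Calling extract_trace() with \fIint.log\fP opened for reading and the standard output as destination prints one trace of calls per logged instruction.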
.I "Errors in libraries"
Since libraries are generally compiled with suppression of line number and
file name information, the line number and file name in the interpreter will
not be updated when it enters a library routine. Consequently, all messages
generated by interpreting library routines will seem to originate from the
line of the call. This is especially true for the routine malloc(), which,
from the nature of its business, often contains dubious code.
(Warning 43, #1): Local integer expected at "buff.c", line 18, INR = 266
(Warning 64, cont.): Actual memory contains a data pointer at "buff.c", line 18, INR = 266
and indeed at line 18 of the file buff.c we find:
buff = malloc(buff_size = BFSIZE);
This problem can be avoided by using a specially compiled version of the
library that contains the correct LIN and FIL instructions, or, less
elegantly, by including the source code of the library routines in the
program; in the latter case, one has to be sure to have them all.
.I "Unavoidable messages"
Some messages produced by the logging are almost unavoidable; sometimes the
writer of a library routine is forced to take liberties with the semantics of
Examples from C include the memory allocation routines.
For efficiency reasons, one bit of a pointer in the administration is used as
a flag; setting, clearing and reading this bit requires bitwise operations on
pointers, which gives the above messages.
Realloc causes a problem in that it may have to copy the originally allocated
area to a different place; this area may contain uninitialised bytes.
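The kind of pointer manipulation meant here can be illustrated as follows. This is a generic C sketch of keeping a flag in the lowest bit of a suitably aligned pointer; it is not taken from the ACK library sources, the names are hypothetical, and it relies on the implementation-defined conversion between pointers and uintptr_t behaving in the usual way.

```c
#include <stdint.h>

/* The flag lives in the lowest bit of the pointer, which is always 0
 * for pointers to objects aligned on an even address. */
#define BUSY_BIT ((uintptr_t)1)

/* Return p with the flag set. */
void *mark_busy(void *p)
{
    return (void *)((uintptr_t)p | BUSY_BIT);
}

/* Return p with the flag cleared, i.e. the real pointer. */
void *clear_busy(void *p)
{
    return (void *)((uintptr_t)p & ~BUSY_BIT);
}

/* Return 1 if the flag is set in p, 0 otherwise. */
int is_busy(const void *p)
{
    return (int)((uintptr_t)p & BUSY_BIT);
}
```

Each of these three functions performs an integer operation on something the interpreter knows to be a data pointer, which is exactly the situation the warnings report.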
@x9 "t.c", line 4, INR = 16, PC = 30 OPCODE = 180
@L6 "t.c", line 4, INR = 16, DoLOLm(-4)
d2 . . STACK_DUMP[4/4] . . INR = 16 . . STACK_DUMP . .
d2 ----------------------------------------------------------------
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483640 40 [ 40] (Dp)
d2 2147483636 64 [ 832] (Dp)
d2 2147483632 1 [ 1] (In)
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483604 40 [ 40] (Dp)
d2 2147483600 64 [ 832] (Dp)
d2 2147483596 1 [ 1] (In)
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483568 undef (1 word)
d2 2147483564 9 [ 777] (In)
d2 2147483560 undef (1 word)
d2 FRA: size = 0, undefined
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, \e
		LIN = 4, FIL = "t.c"
d2 ----------------------------------------------------------------
@x9 "t.c", line 4, INR = 17, PC = 31 OPCODE = 169
@w1 "t.c", line 4, INR = 17, warning: Local data pointer expected [stack.c: 242]
@w1 "t.c", line 4, INR = 17, warning cont.: Actual memory is undefined
@L6 "t.c", line 4, INR = 17, DoLOIm(4)
d2 . . STACK_DUMP[4/4] . . INR = 17 . . STACK_DUMP . .
d2 ----------------------------------------------------------------
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483640 40 [ 40] (Dp)
d2 2147483636 64 [ 832] (Dp)
d2 2147483632 1 [ 1] (In)
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483604 40 [ 40] (Dp)
d2 2147483600 64 [ 832] (Dp)
d2 2147483596 1 [ 1] (In)
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483568 undef (1 word)
d2 2147483564 9 [ 777] (In)
d2 2147483560 4 [ 4] (In)
d2 FRA: size = 0, undefined
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, \e
		LIN = 4, FIL = "t.c"
d2 ----------------------------------------------------------------
@x9 "t.c", line 4, INR = 18, PC = 32 OPCODE = 229
@S6 "t.c", line 4, INR = 18, DoSTLm(-8)
d2 . . STACK_DUMP[4/4] . . INR = 18 . . STACK_DUMP . .
d2 ----------------------------------------------------------------
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483640 40 [ 40] (Dp)
d2 2147483636 64 [ 832] (Dp)
d2 2147483632 1 [ 1] (In)
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483604 40 [ 40] (Dp)
d2 2147483600 64 [ 832] (Dp)
d2 2147483596 1 [ 1] (In)
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483568 undef (1 word)
d2 2147483564 4 [ 4] (In)
d2 FRA: size = 0, undefined
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483564, HP = 848, \e
		LIN = 4, FIL = "t.c"
d2 ----------------------------------------------------------------