1 \ This is freeware, copyright Gordon Charlton, 12th of September 1994.
\r
2 \ Copy and distribute it. Use it. Don't mess with this file. Acknowledge
\r
3 \ its use. I make no guarentees as to its fitness for any purpose. Tell
\r
4 \ me about any bugs. Tell me how much you like it.
\r
8 \ This is an implementation of the ANS Forth Memory-Allocation Word Set.
\r
9 \ This is an ANS standard program that has the following environmental
\r
10 \ dependency - two's complement arithmetic. It requires four words
\r
11 \ from the core extension: 0> NIP TUCK \
\r
13 \ (If you go to the trouble of checking these claims, please e-mail me
\r
14 \ with your findings; gordon@charlton.demon.co.uk)
\r
16 \ There are five broad areas that the program covers;
\r
18 \ 1, General purpose extensions to the Forth system.
\r
20 \ 2, Creation of the heap and associated use of the data space.
\r
22 \ 3, Allocation of space from the heap.
\r
24 \ 4, Releasing space back to the heap.
\r
26 \ 5, Altering the size of allocated heap space.
\r
29 \ The ANS word set consists of three words, ALLOCATE, FREE, and RESIZE
\r
30 \ which give the minimum functionality required to use the heap. These are
\r
31 \ given in areas 3, 4 and 5 respectively.
\r
33 \ The heap is maintained as a doubly linked ordered circular list of nodes
\r
34 \ with an additional field noting the size of each node and whether it is in
\r
35 \ use. The size of the heap is specified by the constant HEAPSIZE. The
\r
36 \ constant HYSTERESIS controls the amount of spare space that is added to
\r
37 \ an allocation, to reduce the need for block moves during resizing.
\r
39 \ Initially there is only one node, the size of the heap. Aditional nodes
\r
40 \ are created by dividing an existing node into two parts. Nodes are removed
\r
41 \ by marking as free, and merging with adjoining free nodes. Nodes are
\r
42 \ altered in size by merging with a following free node, if possible, and a
\r
43 \ node being created above the new size of the node, if needed, or by
\r
44 \ allocating a new node and block moving the data field if necessary.
\r
46 \ Finding an available node is done by sequential search and comparison. The
\r
47 \ first node to be found that is large enough is used for allocation. Each
\r
48 \ search starts from the node most recently allocated, making this a
\r
49 \ "nextfit" algorithm. The redundancy in the head fields is required to
\r
50 \ optimise the search loop, as is the use of a sentinel to terminate the
\r
51 \ search once every node has been looked at, by always succeeding. A final
\r
52 \ refinement is the use of the sign bit of the size field to mark "in-use"
\r
53 \ nodes so that they are disregarded without a separate test.
\r
56 \ **1** General Purpose Extensions
\r
58 : unique ( ) VARIABLE ;
\r
60 \ Defining word. Each child returns a different non-zero number. The
\r
61 \ standard introduces the need for unique identifiers in the form of IORs
\r
62 \ and THROW codes, but provides no means for generating them. This does
\r
65 : k ( n--n) 1024 * ;
\r
67 \ A convenient way of referring to large numbers. Multiplies a number by
\r
70 0 1 2 UM/MOD NIP 1- CONSTANT maxpos
\r
72 \ The largest positive single length integer.
\r
75 \ **2** Heap Creation
\r
77 \ ANSI Heap --- Constants
\r
79 16 k CELLS CONSTANT heapsize
\r
81 \ Number of address units of data space that the heap occupies.
\r
83 4 CELLS 1- CONSTANT hysteresis
\r
85 \ Node lengths are rounded up according to the value of HYSTERESIS to
\r
86 \ reduce the number of block moves during RESIZE operations. The value of
\r
87 \ this constant must be one less than a power of two and at least equal to
\r
88 \ one less than the size of a cell.
\r
90 unique allocationerror
\r
92 \ Indicates there is less contiguous heap space available than required.
\r
94 3 CELLS CONSTANT headsize
\r
96 \ A node on the heap consists of a three cell head followed by a variable
\r
97 \ length data space. The first cell in the head points to the next node in
\r
98 \ the heap. The second cell indicates the size of the node, and the third
\r
99 \ points to the previous node. The second cell is negated to indicate the
\r
100 \ node is in use. The heap consists of a doubly linked circular list. There
\r
101 \ is no special notation to indicate an empty list, as this situation
\r
104 : adjustsize ( n--n) headsize + hysteresis OR 1+ ;
\r
106 \ The amount of space that is requested for a node needs adjusting to
\r
107 \ include the length of the head, and to incorporate the hysteresis.
\r
109 0 adjustsize CONSTANT overhead
\r
111 \ The size of the smallest possible node.
\r
114 \ ANSI Heap --- Structure
\r
116 CREATE sentinel HERE CELL+ , maxpos , 0 , 0 ,
\r
118 \ A dummy node used to speed up searching the heap. The search, which is
\r
119 \ for a node larger than or equal to the specified size will always succeed.
\r
120 \ The cell that points to the next node is set up so that the there is a zero
\r
121 \ three cells ahead of where it points, where the pointer to the previous
\r
122 \ node (ie the sentinel) should be. This is a special value that indicates the
\r
123 \ search has failed.
\r
125 CREATE heap heapsize ALLOT
\r
127 \ The heap is as described in HEADSIZE.
\r
131 \ Searching is done using a "nextfit" algorithm. NEXTNODE points to the
\r
132 \ most recently allocated node to indicate where the next search is to
\r
135 : >size ( addr--addr) CELL+ ;
\r
137 \ Move from the "next" cell in the node head to the "size" cell. Within the
\r
138 \ word set nodes are referred to by the address of the "next" cell.
\r
139 \ Externally they are referred to by the address of the start of the data
\r
142 : >prev ( addr--addr) 2 CELLS + ;
\r
144 \ Move from the "next" cell to the "previous" cell.
\r
146 : init-heap ( ) heap DUP nextnode !
\r
148 DUP heapsize OVER >size !
\r
151 \ Initially the heap contains only one node, which is the same size as the
\r
152 \ heap. Both the "next" cell and the "previous" cell point to the "next"
\r
153 \ cell, as does NEXTNODE.
\r
157 \ **3** Heap Allocation
\r
159 \ ANSI Heap --- List Searching
\r
161 : attach ( addr) >prev @
\r
165 \ The sentinel is joined into the nodelist. The "next" field of the node
\r
166 \ preceding the one specified (addr) is set to point to the sentinel, and
\r
167 \ the "prev" field of the sentinel to point to the node that points to the
\r
170 : search ( addr size--addr|0)
\r
171 >R BEGIN 2@ SWAP R@ < INVERT UNTIL
\r
174 \ Search the nodelist, starting at the node specified (addr), for a free
\r
175 \ node larger than or equal to the specified size. Return the address of the
\r
176 \ first node that matches, or zero for no match. The heap structure is set up
\r
177 \ to make this a near optimal search loop. The "size" field is next to the "next"
\r
178 \ field so that both can be collected in a single operation (2@). Nodes in
\r
179 \ use have negated sizes so they never match the search. The "previous"
\r
180 \ field is included to allow the search to overshoot the match by one node
\r
181 \ and then link back outside the loop, rather than remembering the address
\r
182 \ of the node just examined. The sentinel removes the need for a separate
\r
183 \ test for failure. SEARCH assumes the sentinel is in place.
\r
185 : detach ( addr) DUP >prev @ ! ;
\r
187 \ Remake the link from the node prior to the one specified to the one
\r
188 \ specified. This will remove the sentinel if it is attached here. (It will
\r
191 : findspace ( size--addr|0) nextnode @
\r
196 \ Search the nodelist for a node larger or equal to that specified. Return
\r
197 \ the address of a suitable node, or zero if none found. The search starts at
\r
198 \ the node pointed to by NEXTNODE, the sentinal temporarily attached, the
\r
199 \ search proceeded with and the sentinel detached.
\r
202 \ ANSI Heap --- Head Creation
\r
204 : fits ( size addr--flag) >size @ SWAP - overhead < ;
\r
206 \ Returns TRUE if the size of the node specified is the same as the
\r
207 \ specified size, or larger than it by less than the size of the smallest
\r
208 \ possible node. Returns FALSE otherwise.
\r
210 : togglesize ( addr) >size DUP @ NEGATE SWAP ! ;
\r
212 \ Negate the contents of the "size" field of the specified node. If the
\r
213 \ node was available it is marked as in use, and vice versa.
\r
215 : next! ( addr) nextnode ! ;
\r
217 \ Make the specified node the starting node for future searches of the node
\r
220 : sizes! ( size addr--addr) 2DUP + >R
\r
221 >size 2DUP @ SWAP -
\r
223 SWAP NEGATE SWAP ! R> ;
\r
225 \ Given a free node (addr), reduce its size to that specified and mark it
\r
226 \ as in use. Start to construct a new node within the specified node beyond
\r
227 \ its new length, by storing the length of the remainder of the node in the
\r
228 \ size field of the new node. Return the address of the partially
\r
229 \ constructed node.
\r
231 : links! ( addr1 addr2) 2DUP SWAP @ 2DUP SWAP ! >prev !
\r
232 2DUP >prev ! SWAP ! ;
\r
235 \ Addr1 is an existing node. Addr2 is the address of a new node just above
\r
236 \ the existing node. Break the links from the existing node to the next
\r
237 \ node and from the next node to the existing node and join the new node to
\r
241 \ ANSI heap --- Node Construction ALLOCATE
\r
243 : newnode ( size addr) TUCK sizes! links! ;
\r
245 \ Given a free node at addr split it into an in-use node of the specified
\r
246 \ size and a new free node above the in-use node.
\r
248 : makenode ( size addr) 2DUP fits IF togglesize DROP
\r
252 \ Given a free node at addr make an in-use node of the specified size
\r
253 \ and free the remainder, if there is any usable space left.
\r
255 : ALLOCATE ( u--addr ior)
\r
256 DUP 0< IF allocationerror
\r
262 ELSE DROP allocationerror
\r
266 \ Make an in-use node with a data field at least u address units long.
\r
267 \ Return the address of the data field and an ior of 0 to indicate success.
\r
268 \ If the space is not available return any old number and an ior equal to the
\r
269 \ constant ALLOCATIONERROR. The standard specifies that the argument to
\r
270 \ ALLOCATE is unsigned. As the implementation uses the sign bit of the size
\r
271 \ field for its own purposes any request for an amount of space greater
\r
272 \ than MAXPOS must fail. As this would be a request for half the
\r
273 \ addressable memory or more this is not unreasonable.
\r
275 \ **4** Releasing Space
\r
277 \ ANSI heap --- Head Destruction
\r
279 : mergesizes ( addr addr)
\r
280 >size @ SWAP >size +! ;
\r
282 \ Make the size field of the node at addr1 equal to the sum of the sizes of
\r
283 \ the two specified nodes. In usage the node at addr2 will be the one
\r
284 \ immediately above addr1.
\r
286 : mergelinks ( addr addr)
\r
290 \ The node at addr2 is removed from the node list. As with MERGESIZES the
\r
291 \ node at addr2 will be immediately above that at addr1. Destroy the link
\r
292 \ from node1 to node2 and relink node1 to the node above node2. Destroy the
\r
293 \ backward link from the node above node2 and relink it to node1.
\r
296 nextnode @ @ >prev @ next! ;
\r
298 \ There is a possibility when a node is removed from the node list that
\r
299 \ NEXTNODE may point to it. This is cured by making it point to the node
\r
300 \ prior to the one removed. We do not want to alter the pointer if it does
\r
301 \ not point to the removed node as that could be detrimental to the
\r
302 \ efficiency of the nextfit search algorithm. Rather than testing for this
\r
303 \ condition we jiggle the pointer about a bit to settle it into a linked
\r
304 \ node. This is done for reasons of programmer amusement. Specifically
\r
305 \ NEXTNODE is set to point to the node pointed to by the "previous" field
\r
306 \ of the node pointed to in the "next" field of the node pointed to by
\r
307 \ NEXTNODE. Ordinarily this is a no-op (ie I am my father's son) but when
\r
308 \ the node has had its links merged it sets NEXTNODE to point to the node
\r
309 \ prior to the node it pointed to (ie when I died my father adopted my son,
\r
310 \ so now my son is my father's son).
\r
313 DUP @ 2DUP mergesizes
\r
314 mergelinks jiggle ;
\r
316 \ Combine the node specified with the node above it. Merge the sizes, merge
\r
317 \ the lengths and jiggle.
\r
320 \ ANSI Heap --- Node Removal FREE
\r
322 : ?merge ( addr1 addr2) >size @
\r
328 \ Merge the node at addr1 with the one above it on two conditions, firstly
\r
329 \ that the node at addr2 is free, and secondly that the node pointed to by
\r
330 \ the next field in addr1 is actually above addr1 (ie that it does not wrap
\r
331 \ around because it is the topmost node). In usage addr2 will be either
\r
332 \ addr1 or the node above it. In each instance the other affected node
\r
333 \ (either the node above addr1 or addr1) is known to be free, so no test is
\r
336 : ?mergenext ( addr) DUP @ ?merge ;
\r
338 \ Merge the node following the specified node with the specified node, if
\r
339 \ following node is free.
\r
341 : ?mergeprev ( addr) >prev @ DUP ?merge ;
\r
343 \ Merge the specified node with the one preceding it, if the preceding node
\r
346 : FREE ( addr--ior) headsize -
\r
351 \ Mark the specified in-use word as free, and merge with any adjacent free
\r
352 \ space. As this is a standard word addr is the address of the data field
\r
353 \ rather than the "next" field. As there is no compelling reason for this
\r
354 \ to fail the ior is zero.
\r
357 \ **5** Resizing Allocated Space
\r
359 \ ANSI Heap --- Node Repairing
\r
363 \ The RESIZE algorithm is simplified and made faster by assuming that it
\r
364 \ will always succeed. STASH holds the minimum information required to make
\r
365 \ good when it fails.
\r
367 : savelink ( addr) @ stash ! ;
\r
369 \ Saves the contents of the >NEXT field of the node being RESIZEd in STASH
\r
372 : restorelink ( addr) stash @ SWAP ! ;
\r
374 \ Converse operation to SAVELINK (above).
\r
376 : fixprev ( addr) DUP >prev @ ! ;
\r
378 \ The >NEXT field of the node prior to the node being RESIZEd should point
\r
379 \ to the node being RESIZEd. It may very well do already, but this makes
\r
382 : fixnext ( addr) DUP @ >prev ! ;
\r
384 \ The >PREV field of the node after the node resized may need correcting.
\r
385 \ This corrects it whether it needs it or not. (Its quicker just to do it
\r
386 \ than to check first.)
\r
388 : fixlinks ( addr) DUP fixprev DUP fixnext @ fixnext ;
\r
390 \ RESIZE may very well merge its argument node with the previous one. It
\r
391 \ may very well merge that with the next one. This means we need to fix the
\r
392 \ previous one, the next one and the one after next. To extend the metaphor
\r
393 \ started in the description of JIGGLE (above), not only did I die, but my
\r
394 \ father did too. This brings my grandfather into the picture as guardian
\r
395 \ of my son. Now to confound things we have all come back to life. I still
\r
396 \ remember who my son is, and my father remembers who his father is. Once I
\r
397 \ know who my father is I can tell my son that I am his father, I can tell
\r
398 \ my father that I am his son and my grandfather who his son is. Thankfully
\r
399 \ we are only concerned about the male lineage here! (In fact nodes
\r
400 \ reproduce by division, like amoebae, which is where the metaphor breaks
\r
401 \ down -- (1) they are sexless and (2) which half is parent and which
\r
404 : fixsize ( addr) DUP >size @ 0>
\r
406 IF OVER - SWAP >size !
\r
412 \ Reconstruct the size field of a node from the address of the head and the
\r
413 \ contents of the >NEXT field provided that the node is free and it is not
\r
414 \ the topmost node in the heap (ie there is no wraparound). Both these
\r
415 \ conditions need to be true for the node to have been merged with its
\r
418 : fixsizes ( addr) DUP fixsize >prev @ fixsize ;
\r
420 \ The two nodes whose size fields may need repairing are the one passed as
\r
421 \ an argument to RESIZE (damaged by ?MERGENEXT) and its predecessor
\r
422 \ (damaged by ?MERGEPREV).
\r
424 : repair ( addr) DUP restorelink
\r
425 DUP fixlinks DUP fixsizes
\r
428 \ Make good the damage done by RESIZE. Restore the >next field, fix the
\r
429 \ links, fix the size fields and mark the node as in-use. Note that this
\r
430 \ may not restore the system to exactly how it was. In particular the pointer
\r
431 \ NEXTNODE may have moved back one or two nodes by virtue of having been
\r
432 \ JIGGLEd about if it happened to be pointing to the wrong node. This is not
\r
433 \ serious, so I have chosen to ignore it.
\r
436 \ ANSI Heap --- Node Movement
\r
438 : toobig? ( addr size--flag)
\r
441 \ Flag is true if the node at addr is smaller than the specified size.
\r
443 : copynode ( addr1 addr2)
\r
444 OVER >size @ headsize -
\r
445 ROT headsize + ROT ROT MOVE ;
\r
447 \ Move the contents of the data field of the node at addr1 to the data
\r
448 \ field at addr2. Assumes addr2 is large enough. It will be.
\r
450 : enlarge ( addr1 size--addr2 ior)
\r
457 \ Make a new node of the size specified. Copy the data field of addr1 to
\r
458 \ the new node. Merge the node at addr1 with the one preceding it, if
\r
459 \ possible. This last behaviour is to finish off removing the node at
\r
460 \ addr1. The word ADJUST (below) starts removing the node. The node is
\r
461 \ removed before allocation to increase the probability of ALLOCATE
\r
462 \ succeeding. The address returned by ENLARGE is that returned by ALLOCATE,
\r
463 \ which is that of the data field, not the head. If the allocation fails
\r
464 \ repair the damage done by removing the node at addr1.
\r
467 \ ANSI Heap --- Node Restructuring RESIZE
\r
469 : adjust ( addr1 size1--addr2 size2) adjustsize >R
\r
473 DUP ?mergenext R> ;
\r
475 \ Addr1 points to the data field of a node, not the "next" field. This
\r
476 \ needs correcting. Size1 also needs adjusting as per ADJUSTSIZE. In
\r
477 \ addition it is easier to work with free nodes than live ones as the size
\r
478 \ field is correct, and, as we intend to change the nodes size we will
\r
479 \ inevitably want to muck about with the next node, if its free, so lets
\r
480 \ merge with it straight away. Sufficient information is first saved to put
\r
481 \ the heap back as it was, if necessary. Now we are ready to get down to
\r
484 : RESIZE ( addr1 u--addr2 ior)
\r
485 DUP 0< IF DROP allocationerror
\r
493 \ Resize the node at addr1 to the specified size. Return the address of the
\r
494 \ resized node (addr2) along with an ior of zero if successful and
\r
495 \ ALLOCATIONERROR if not. Addr2 may be the same as, or different to, addr1.
\r
496 \ If ior is non-zero then addr2 is not meaningful. Being a standard word
\r
497 \ the arguments need adjusting to the internal representation on entry, and
\r
498 \ back again on exit. If after the first merge the requested size is still
\r
499 \ too large to reuse the specified node then it is moved to a larger node
\r
500 \ and the specified node released. If, on the other hand the request is not
\r
501 \ too big for the node, then we remake the node at the right length, and
\r
502 \ free any space at the top using MAKENODE, which has just the right
\r
503 \ functionality. In this case the ior is zero. As this is a standard word it
\r
504 \ takes an unsigned size argument, but excessive requests fail
\r
505 \ automatically, as with ALLOCATE.
\r