memory.fth

   1 \ This is freeware, copyright Gordon Charlton, 12th of September 1994.\r
   2 \ Copy and distribute it. Use it. Don't mess with this file. Acknowledge\r
   3 \ its use. I make no guarentees as to its fitness for any purpose. Tell\r
   4 \ me about any bugs. Tell me how much you like it.\r
   5 \r
   6 \                               An ANS Heap\r
   7 \r
   8 \ This is an implementation of the ANS Forth Memory-Allocation Word Set.\r
   9 \ This is an ANS standard program that has the following environmental\r
  10 \ dependency - two's complement arithmetic.  It requires four words\r
  11 \ from the core extension:   0> NIP TUCK \\r
  12 \r
  13 \ (If you go to the trouble of checking these claims, please e-mail me\r
  14 \ with your findings; gordon@charlton.demon.co.uk)\r
  15 \r
  16 \ There are five broad areas that the program covers;\r
  17 \r
  18 \      1, General purpose extensions to the Forth system.\r
  19 \r
  20 \      2, Creation of the heap and associated use of the data space.\r
  21 \r
  22 \      3, Allocation of space from the heap.\r
  23 \r
  24 \      4, Releasing space back to the heap.\r
  25 \r
  26 \      5, Altering the size of allocated heap space.\r
  27 \r
  28 \r
  29 \ The ANS word set consists of three words, ALLOCATE, FREE, and RESIZE\r
  30 \ which give the minimum functionality required to use the heap. These are\r
  31 \ given in areas 3, 4 and 5 respectively.\r
  32 \r
  33 \ The heap is maintained as a doubly linked ordered circular list of nodes\r
  34 \ with an additional field noting the size of each node and whether it is in\r
  35 \ use. The size of the heap is specified by the constant HEAPSIZE. The\r
  36 \ constant HYSTERESIS controls the amount of spare space that is added to\r
  37 \ an allocation, to reduce the need for block moves during resizing.\r
  38 \r
  39 \ Initially there is only one node, the size of the heap. Aditional nodes\r
  40 \ are created by dividing an existing node into two parts. Nodes are removed\r
  41 \ by marking as free, and merging with adjoining free nodes. Nodes are\r
  42 \ altered in size by merging with a following free node, if possible, and a\r
  43 \ node being created above the new size of the node, if needed, or by\r
  44 \ allocating a new node and block moving the data field if necessary.\r
  45 \r
  46 \ Finding an available node is done by sequential search and comparison. The\r
  47 \ first node to be found that is large enough is used for allocation. Each\r
  48 \ search starts from the node most recently allocated, making this a\r
  49 \ "nextfit" algorithm. The redundancy in the head fields is required to\r
  50 \ optimise the search loop, as is the use of a sentinel to terminate the\r
  51 \ search once every node has been looked at, by always succeeding. A final\r
  52 \ refinement is the use of the sign bit of the size field to mark "in-use"\r
  53 \ nodes so that they are disregarded without a separate test.\r
  54 \r
  55 \r
  56 \ **1** General Purpose Extensions\r
  57 \r
  58 : unique (  )  VARIABLE ;\r
  59 \\r
  60 \ Defining word. Each child returns a different non-zero number. The\r
  61 \ standard introduces the need for unique identifiers in the form of IORs\r
  62 \ and THROW codes, but provides no means for generating them. This does\r
  63 \ the trick.\r
  64 \r
  65 : k ( n--n)  1024 * ;\r
  66 \\r
  67 \ A convenient way of referring to large numbers. Multiplies a number by\r
  68 \ 1024.\r
  69 \r
  70 0 1 2 UM/MOD NIP 1- CONSTANT maxpos\r
  71 \\r
  72 \ The largest positive single length integer.\r
  73 \r
  74 \r
  75 \ **2** Heap Creation\r
  76 \r
  77 \ ANSI Heap  ---  Constants\r
  78 \r
  79 16 k CELLS CONSTANT heapsize\r
  80 \\r
  81 \ Number of address units of data space that the heap occupies.\r
  82 \r
  83 4 CELLS 1- CONSTANT hysteresis\r
  84 \\r
  85 \ Node lengths are rounded up according to the value of HYSTERESIS to\r
  86 \ reduce the number of block moves during RESIZE operations. The value of\r
  87 \ this constant must be one less than a power of two and at least equal to\r
  88 \ one less than the size of a cell.\r
  89 \r
  90 unique allocationerror\r
  91 \\r
  92 \ Indicates there is less contiguous heap space available than required.\r
  93 \r
  94 3 CELLS CONSTANT headsize\r
  95 \\r
  96 \ A node on the heap consists of a three cell head followed by a variable\r
  97 \ length data space. The first cell in the head points to the next node in\r
  98 \ the heap. The second cell indicates the size of the node, and the third\r
  99 \ points to the previous node. The second cell is negated to indicate the\r
 100 \ node is in use. The heap consists of a doubly linked circular list. There\r
 101 \ is no special notation to indicate an empty list, as this situation\r
 102 \ cannot occur.\r
 103 \r
 104 : adjustsize ( n--n)  headsize +  hysteresis OR  1+ ;\r
 105 \\r
 106 \ The amount of space that is requested for a node needs adjusting to\r
 107 \ include the length of the head, and to incorporate the hysteresis.\r
 108 \r
 109 0 adjustsize CONSTANT overhead\r
 110 \\r
 111 \ The size of the smallest possible node.\r
 112 \r
 113 \r
 114 \ ANSI Heap  ---  Structure\r
 115 \r
 116 CREATE sentinel  HERE CELL+ ,   maxpos ,  0 ,  0 ,\r
 117 \\r
 118 \ A dummy node used to speed up searching the heap. The search, which is\r
 119 \ for a node larger than or equal to the specified size will always succeed.\r
 120 \ The cell that points to the next node is set up so that the there is a zero\r
 121 \ three cells ahead of where it points, where the pointer to the previous\r
 122 \ node (ie the sentinel) should be. This is a special value that indicates the\r
 123 \ search has failed.\r
 124 \r
 125 CREATE heap  heapsize ALLOT\r
 126 \\r
 127 \ The heap is as described in HEADSIZE.\r
 128 \r
 129 VARIABLE nextnode\r
 130 \\r
 131 \ Searching is done using a "nextfit" algorithm. NEXTNODE points to the\r
 132 \ most recently allocated node to indicate where the next search is to\r
 133 \ start from.\r
 134 \r
 135 : >size ( addr--addr)  CELL+ ;\r
 136 \\r
 137 \ Move from the "next" cell in the node head to the "size" cell. Within the\r
 138 \ word set nodes are referred to by the address of the "next" cell.\r
 139 \ Externally they are referred to by the address of the start of the data\r
 140 \ field.\r
 141 \r
 142 : >prev ( addr--addr)  2 CELLS + ;\r
 143 \\r
 144 \ Move from the "next" cell to the "previous" cell.\r
 145 \r
 146 : init-heap (  )  heap DUP nextnode !\r
 147                   DUP DUP !\r
 148                   DUP heapsize  OVER >size !\r
 149                   >prev ! ;\r
 150 \\r
 151 \ Initially the heap contains only one node, which is the same size as the\r
 152 \ heap. Both the "next" cell and the "previous" cell point to the "next"\r
 153 \ cell, as does NEXTNODE.\r
 154 \r
 155 init-heap\r
 156 \r
 157 \ **3** Heap Allocation\r
 158 \r
 159 \ ANSI Heap  ---  List Searching\r
 160 \r
 161 : attach ( addr)  >prev @\r
 162                   DUP sentinel ROT !\r
 163                   sentinel >prev ! ;\r
 164 \\r
 165 \ The sentinel is joined into the nodelist. The "next" field of the node\r
 166 \ preceding the one specified (addr) is set to point to the sentinel, and\r
 167 \ the "prev" field of the sentinel to point to the node that points to the\r
 168 \ sentinel.\r
 169 \r
 170 : search  ( addr size--addr|0)\r
 171           >R BEGIN 2@ SWAP R@ < INVERT UNTIL\r
 172           R> DROP  >prev @ ;\r
 173 \\r
 174 \ Search the nodelist, starting at the node specified (addr), for a free\r
 175 \ node larger than or equal to the specified size. Return the address of the\r
 176 \ first node that matches, or zero for no match. The heap structure is set up\r
 177 \ to make this a near optimal search loop. The "size" field is next to the "next"\r
 178 \ field so that both can be collected in a single operation (2@). Nodes in\r
 179 \ use have negated sizes so they never match the search. The "previous"\r
 180 \ field is included to allow the search to overshoot the match by one node\r
 181 \ and then link back outside the loop, rather than remembering the address\r
 182 \ of the node just examined. The sentinel removes the need for a separate\r
 183 \ test for failure. SEARCH assumes the sentinel is in place.\r
 184 \r
 185 : detach ( addr)  DUP >prev @ ! ;\r
 186 \\r
 187 \ Remake the link from the node prior to the one specified to the one\r
 188 \ specified. This will remove the sentinel if it is attached here. (It will\r
 189 \ be.)\r
 190 \r
 191 : findspace ( size--addr|0)  nextnode @\r
 192                              DUP      attach\r
 193                              DUP ROT  search\r
 194                              SWAP     detach ;\r
 195 \\r
 196 \ Search the nodelist for a node larger or equal to that specified. Return\r
 197 \ the address of a suitable node, or zero if none found. The search starts at\r
 198 \ the node pointed to by NEXTNODE, the sentinal temporarily attached, the\r
 199 \ search proceeded with and the sentinel detached.\r
 200 \r
 201 \r
 202 \ ANSI Heap  ---  Head Creation\r
 203 \r
 204 : fits ( size addr--flag)  >size @ SWAP -  overhead  < ;\r
 205 \\r
 206 \ Returns TRUE if the size of the node specified is the same as the\r
 207 \ specified size, or larger than it by less than the size of the smallest\r
 208 \ possible node. Returns FALSE otherwise.\r
 209 \r
 210 : togglesize ( addr)  >size DUP @  NEGATE SWAP ! ;\r
 211 \\r
 212 \ Negate the contents of the "size" field of the specified node. If the\r
 213 \ node was available it is marked as in use, and vice versa.\r
 214 \r
 215 : next! ( addr)  nextnode ! ;\r
 216 \\r
 217 \ Make the specified node the starting node for future searches of the node\r
 218 \ list.\r
 219 \r
 220 : sizes! ( size addr--addr)  2DUP + >R\r
 221                              >size 2DUP @ SWAP -\r
 222                              R@ >size !\r
 223                              SWAP NEGATE SWAP !  R> ;\r
 224 \\r
 225 \ Given a free node (addr), reduce its size to that specified and mark it\r
 226 \ as in use. Start to construct a new node within the specified node beyond\r
 227 \ its new length, by storing the length of the remainder of the node in the\r
 228 \ size field of the new node. Return the address of the partially\r
 229 \ constructed node.\r
 230 \r
 231 : links! ( addr1 addr2)  2DUP SWAP @  2DUP  SWAP !  >prev !\r
 232                                       2DUP >prev !   SWAP ! ;\r
 233 \r
 234 \\r
 235 \ Addr1 is an existing node. Addr2 is the address of a new node just above\r
 236 \ the existing node. Break the links from the existing node to the next\r
 237 \ node and from the next node to the existing node and join the new node to\r
 238 \ them.\r
 239 \r
 240 \r
 241 \ ANSI heap  ---  Node Construction     ALLOCATE\r
 242 \r
 243 : newnode ( size addr)  TUCK sizes!  links! ;\r
 244 \\r
 245 \ Given a free node at addr split it into an in-use node of the specified\r
 246 \ size and a new free node above the in-use node.\r
 247 \r
 248 : makenode ( size addr)  2DUP fits IF  togglesize DROP\r
 249                                  ELSE  newnode\r
 250                                  THEN ;\r
 251 \\r
 252 \ Given a free node at addr make an in-use node of the specified size\r
 253 \ and free the remainder, if there is any usable space left.\r
 254 \r
 255 : ALLOCATE ( u--addr ior)\r
 256           DUP 0< IF  allocationerror\r
 257                ELSE  adjustsize\r
 258                      DUP findspace\r
 259                      DUP IF  DUP next!\r
 260                              TUCK makenode\r
 261                              headsize +  0\r
 262                        ELSE  DROP allocationerror\r
 263                        THEN\r
 264                THEN ;\r
 265 \\r
 266 \ Make an in-use node with a data field at least u address units long.\r
 267 \ Return the address of the data field and an ior of 0 to indicate success.\r
 268 \ If the space is not available return any old number and an ior equal to the\r
 269 \ constant ALLOCATIONERROR. The standard specifies that the argument to\r
 270 \ ALLOCATE is unsigned. As the implementation uses the sign bit of the size\r
 271 \ field for its own purposes any request for an amount of space greater\r
 272 \ than MAXPOS must fail. As this would be a request for half the\r
 273 \ addressable memory or more this is not unreasonable.\r
 274 \r
 275 \ **4** Releasing Space\r
 276 \r
 277 \ ANSI heap  ---  Head Destruction\r
 278 \r
 279 : mergesizes ( addr addr)\r
 280              >size @ SWAP >size +! ;\r
 281 \\r
 282 \ Make the size field of the node at addr1 equal to the sum of the sizes of\r
 283 \ the two specified nodes. In usage the node at addr2 will be the one\r
 284 \ immediately above addr1.\r
 285 \r
 286 : mergelinks ( addr addr)\r
 287              @ 2DUP SWAP !\r
 288                    >prev ! ;\r
 289 \\r
 290 \ The node at addr2 is removed from the node list. As with MERGESIZES the\r
 291 \ node at addr2 will be immediately above that at addr1. Destroy the link\r
 292 \ from node1 to node2 and relink node1 to the node above node2. Destroy the\r
 293 \ backward link from the node above node2 and relink it to node1.\r
 294 \r
 295 : jiggle (  )\r
 296          nextnode @ @  >prev @  next! ;\r
 297 \\r
 298 \ There is a possibility when a node is removed from the node list that\r
 299 \ NEXTNODE may point to it. This is cured by making it point to the node\r
 300 \ prior to the one removed. We do not want to alter the pointer if it does\r
 301 \ not point to the removed node as that could be detrimental to the\r
 302 \ efficiency of the nextfit search algorithm. Rather than testing for this\r
 303 \ condition we jiggle the pointer about a bit to settle it into a linked\r
 304 \ node. This is done for reasons of programmer amusement. Specifically\r
 305 \ NEXTNODE is set to point to the node pointed to by the "previous" field\r
 306 \ of the node pointed to in the "next" field of the node pointed to by\r
 307 \ NEXTNODE. Ordinarily this is a no-op (ie I am my father's son) but when\r
 308 \ the node has had its links merged it sets NEXTNODE to point to the node\r
 309 \ prior to the node it pointed to (ie when I died my father adopted my son,\r
 310 \ so now my son is my father's son).\r
 311 \r
 312 : merge ( addr)\r
 313         DUP @ 2DUP mergesizes\r
 314                    mergelinks  jiggle ;\r
 315 \\r
 316 \ Combine the node specified with the node above it. Merge the sizes, merge\r
 317 \ the lengths and jiggle.\r
 318 \r
 319 \r
 320 \ ANSI Heap  ---  Node Removal          FREE\r
 321 \r
 322 : ?merge ( addr1 addr2)  >size @\r
 323                          0> IF  DUP DUP @\r
 324                                 U< IF  DUP merge\r
 325                                    THEN\r
 326                             THEN  DROP ;\r
 327 \\r
 328 \ Merge the node at addr1 with the one above it on two conditions, firstly\r
 329 \ that the node at addr2 is free, and secondly that the node pointed to by\r
 330 \ the next field in addr1 is actually above addr1 (ie that it does not wrap\r
 331 \ around because it is the topmost node). In usage addr2 will be either\r
 332 \ addr1 or the node above it. In each instance the other affected node\r
 333 \ (either the node above addr1 or addr1) is known to be free, so no test is\r
 334 \ needed for this.\r
 335 \r
 336 : ?mergenext ( addr)  DUP @ ?merge ;\r
 337 \\r
 338 \ Merge the node following the specified node with the specified node, if\r
 339 \ following node is free.\r
 340 \r
 341 : ?mergeprev ( addr)  >prev @ DUP ?merge ;\r
 342 \\r
 343 \ Merge the specified node with the one preceding it, if the preceding node\r
 344 \ is free.\r
 345 \r
 346 : FREE ( addr--ior)  headsize -\r
 347                      DUP togglesize\r
 348                      DUP ?mergenext\r
 349                      ?mergeprev  0 ;\r
 350 \\r
 351 \ Mark the specified in-use word as free, and merge with any adjacent free\r
 352 \ space. As this is a standard word addr is the address of the data field\r
 353 \ rather than the "next" field. As there is no compelling reason for this\r
 354 \ to fail the ior is zero.\r
 355 \r
 356 \r
 357 \ **5** Resizing Allocated Space\r
 358 \r
 359 \ ANSI Heap  ---  Node Repairing\r
 360 \r
 361 VARIABLE stash\r
 362 \\r
 363 \ The RESIZE algorithm is simplified and made faster by assuming that it\r
 364 \ will always succeed. STASH holds the minimum information required to make\r
 365 \ good when it fails.\r
 366 \r
 367 : savelink ( addr)  @ stash ! ;\r
 368 \\r
 369 \ Saves the contents of the >NEXT field of the node being RESIZEd in STASH\r
 370 \ (above).\r
 371 \r
 372 : restorelink ( addr)  stash @  SWAP ! ;\r
 373 \\r
 374 \ Converse operation to SAVELINK (above).\r
 375 \r
 376 : fixprev ( addr)  DUP >prev @ ! ;\r
 377 \\r
 378 \ The >NEXT field of the node prior to the node being RESIZEd should point\r
 379 \ to the node being RESIZEd. It may very well do already, but this makes\r
 380 \ sure.\r
 381 \r
 382 : fixnext ( addr)  DUP @ >prev ! ;\r
 383 \\r
 384 \ The >PREV field of the node after the node resized may need correcting.\r
 385 \ This corrects it whether it needs it or not. (Its quicker just to do it\r
 386 \ than to check first.)\r
 387 \r
 388 : fixlinks ( addr)  DUP fixprev  DUP fixnext  @ fixnext ;\r
 389 \\r
 390 \ RESIZE may very well merge its argument node with the previous one. It\r
 391 \ may very well merge that with the next one. This means we need to fix the\r
 392 \ previous one, the next one and the one after next. To extend the metaphor\r
 393 \ started in the description of JIGGLE (above), not only did I die, but my\r
 394 \ father did too. This brings my grandfather into the picture as guardian\r
 395 \ of my son. Now to confound things we have all come back to life. I still\r
 396 \ remember who my son is, and my father remembers who his father is. Once I\r
 397 \ know who my father is I can tell my son that I am his father, I can tell\r
 398 \ my father that I am his son and my grandfather who his son is. Thankfully\r
 399 \ we are only concerned about the male lineage here! (In fact nodes\r
 400 \ reproduce by division, like amoebae, which is where the metaphor breaks\r
 401 \ down -- (1) they are sexless and (2) which half is parent and which\r
 402 \ child?)\r
 403 \r
 404 : fixsize ( addr)  DUP >size @ 0>\r
 405                    IF  DUP @  2DUP <\r
 406                        IF  OVER - SWAP >size !\r
 407                      ELSE 2DROP\r
 408                      THEN\r
 409                  ELSE  DROP\r
 410                  THEN ;\r
 411 \\r
 412 \ Reconstruct the size field of a node from the address of the head and the\r
 413 \ contents of the >NEXT field provided that the node is free and it is not\r
 414 \ the topmost node in the heap (ie there is no wraparound). Both these\r
 415 \ conditions need to be true for the node to have been merged with its\r
 416 \ successor.\r
 417 \r
 418 : fixsizes ( addr)  DUP fixsize  >prev @ fixsize ;\r
 419 \\r
 420 \ The two nodes whose size fields may need repairing are the one passed as\r
 421 \ an argument to RESIZE (damaged by ?MERGENEXT) and its predecessor\r
 422 \ (damaged by ?MERGEPREV).\r
 423 \r
 424 : repair ( addr)  DUP restorelink\r
 425                   DUP fixlinks  DUP fixsizes\r
 426                   togglesize ;\r
 427 \\r
 428 \ Make good the damage done by RESIZE. Restore the >next field, fix the\r
 429 \ links, fix the size fields and mark the node as in-use. Note that this\r
 430 \ may not restore the system to exactly how it was. In particular the pointer\r
 431 \ NEXTNODE may have moved back one or two nodes by virtue of having been\r
 432 \ JIGGLEd about if it happened to be pointing to the wrong node. This is not\r
 433 \ serious, so I have chosen to ignore it.\r
 434 \r
 435 \r
 436 \ ANSI Heap  ---  Node Movement\r
 437 \r
 438 : toobig? ( addr size--flag)\r
 439           SWAP  >size @  > ;\r
 440 \\r
 441 \ Flag is true if the node at addr is smaller than the specified size.\r
 442 \r
 443 : copynode ( addr1 addr2)\r
 444        OVER >size @  headsize -\r
 445        ROT  headsize + ROT ROT MOVE ;\r
 446 \\r
 447 \ Move the contents of the data field of the node at addr1 to the data\r
 448 \ field at addr2. Assumes addr2 is large enough. It will be.\r
 449 \r
 450 : enlarge ( addr1 size--addr2 ior)\r
 451           OVER  ?mergeprev\r
 452           ALLOCATE DUP >R\r
 453           IF  SWAP repair\r
 454         ELSE  TUCK copynode\r
 455         THEN R> ;\r
 456 \\r
 457 \ Make a new node of the size specified. Copy the data field of addr1 to\r
 458 \ the new node. Merge the node at addr1 with the one preceding it, if\r
 459 \ possible. This last behaviour is to finish off removing the node at\r
 460 \ addr1. The word ADJUST (below) starts removing the node. The node is\r
 461 \ removed before allocation to increase the probability of ALLOCATE\r
 462 \ succeeding. The address returned by ENLARGE is that returned by ALLOCATE,\r
 463 \ which is that of the data field, not the head. If the allocation fails\r
 464 \ repair the damage done by removing the node at addr1.\r
 465 \r
 466 \r
 467 \ ANSI Heap  ---  Node Restructuring    RESIZE\r
 468 \r
 469 : adjust ( addr1 size1--addr2 size2)  adjustsize >R\r
 470                                       headsize -\r
 471                                       DUP savelink\r
 472                                       DUP togglesize\r
 473                                       DUP ?mergenext R> ;\r
 474 \\r
 475 \ Addr1 points to the data field of a node, not the "next" field. This\r
 476 \ needs correcting. Size1 also needs adjusting as per ADJUSTSIZE. In\r
 477 \ addition it is easier to work with free nodes than live ones as the size\r
 478 \ field is correct, and, as we intend to change the nodes size we will\r
 479 \ inevitably want to muck about with the next node, if its free, so lets\r
 480 \ merge with it straight away. Sufficient information is first saved to put\r
 481 \ the heap back as it was, if necessary. Now we are ready to get down to\r
 482 \ business.\r
 483 \r
 484 : RESIZE ( addr1 u--addr2 ior)\r
 485          DUP 0< IF  DROP allocationerror\r
 486               ELSE  adjust  2DUP\r
 487                     toobig?  IF  enlarge\r
 488                            ELSE  OVER makenode\r
 489                                  headsize +  0\r
 490                            THEN\r
 491               THEN ;\r
 492 \\r
 493 \ Resize the node at addr1 to the specified size. Return the address of the\r
 494 \ resized node (addr2) along with an ior of zero if successful and\r
 495 \ ALLOCATIONERROR if not. Addr2 may be the same as, or different to, addr1.\r
 496 \ If ior is non-zero then addr2 is not meaningful. Being a standard word\r
 497 \ the arguments need adjusting to the internal representation on entry, and\r
 498 \ back again on exit. If after the first merge the requested size is still\r
 499 \ too large to reuse the specified node then it is moved to a larger node\r
 500 \ and the specified node released. If, on the other hand the request is not\r
 501 \ too big for the node, then we remake the node at the right length, and\r
 502 \ free any space at the top using MAKENODE, which has just the right\r
 503 \ functionality. In this case the ior is zero. As this is a standard word it\r
 504 \ takes an unsigned size argument, but excessive requests fail\r
 505 \ automatically, as with ALLOCATE.\r