Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!swrinde!cs.utexas.edu!uunet!mcsun!ukc!acorn!john
From: john@acorn.co.uk (John Bowler)
Newsgroups: comp.sys.acorn
Subject: Re: Memory handling
Message-ID: <6079@acorn.co.uk>
Date: 27 Mar 91 13:16:11 GMT
References: <1991Mar20.143227.247@ecc.tased.oz.au> <1991Mar22.171119.11147@ifi.uio.no>
Organization: Acorn Computers Ltd, Cambridge, UK
Lines: 253

In article <1991Mar22.171119.11147@ifi.uio.no> oystein@ifi.uio.no (ystein Carl Emil Hebnes) writes:
>
>In article <1991Mar20.143227.247@ecc.tased.oz.au>, ecc_jim@ecc.tased.oz.au writes: 
>> We are developing a package using C 1.3B that makes extensive use of tree
>> structures for handling indexes into data files.
[[description of problem deleted]]
>> ANSI C release 3 has an additional memory management tool called a flex
>> pool.
>>[...]
>> There is, however, a restriction on a flex memory allocation (from page 225
>> of the ANSI C release 3 guide):
>> 
>>    "...you cannot have flex pointers within blocks of memory allocated by
>> flex."
>> 
>> This means that you cannot have linked lists or binary trees - the main
>> reason that memory is being used by our application.
>
[summary deleted]
>
>Let's assume all the nodes have the same size (nsize = sizeof (struct foo)).
>We can't keep pointers to the nodes themselves, so we have to be a bit more
>devious. The first thing to do is to allocate an array of void pointers.
>Call this the anchor block. It will contain anchors, as required by the flex
>functions. It is vital that the anchor block is never moved around, so
>malloc() it, or flex_alloc() it before anything else (the first call to
>flex_alloc() is guaranteed to return a pointer to an area which will never
>be moved around.)
>
>Now, you're ready to allocate your first node. The thing to do, is to
>allocate a block of several nodes, say 100. This is the other type of block,
>a node block. The first anchor in the anchor block should point to this block
>(you pass that pointer's address as a parameter to flex_alloc().)
>As you allocate more nodes, you'll fill up the first node block, and
>eventually, you'll allocate another node block in the same way, and it'll
>be pointed to by the second anchor, and so on. Just make sure that
>no_of_anchors * nodes_per_block gives you enough nodes. An overestimate
>does no harm, anchor pointers are cheap.
>
>So, what about the internal node pointers in the structure? The left/right,
>next or parent pointers?
>They won't be ordinary pointers. They must contain 2 elements. First the
>node block number, and then the node number in the node block. That's the
>trick. To access e.g. the next node in a list, then, you first read the
>node block number, then aquire the actual node block address by following
>the right pointer in the anchor block, and then calculate the node address
>by a formula similar to node_block_start + (node_no - 1) * nsize.
>
>As you see, following pointers requires a bit more work than usual, but
>you gain the ability to dynamically manage memory.

To (probably mis)quote someone famous ``any problem can be solved
by an extra level of indirection''.  This is effectively what ystein
Carl Emil Hebnes's algorithm does.  A simpler implementation is to
use the ``normal'' malloc algorithm with flex_alloc but store
pointers to the flex anchors instead of the flex anchor values in
the allocated data sturctures (ie instead of the pointer to the object
store a pointer to the pointer).  Ie:-

struct btree {
	struct btree	*left;
	struct btree	*right
} *btree_ptr;

becomes:-

struct flex_btree {
	struct flex_btree	**left;
	struct flex_btree	**right;
} *flex_btree_ptr;

And:-

	btree_ptr->left

becomes

	*(flex_btree_ptr->left);	/* Brackets for clarity */

To use this, you must allocate two objects per node; one is an
anchor which must be allocated via malloc (or allocated statically,
as flex_btree_ptr above), the other is the actual object:

/* Caller must allocate (*anchor) */
void allocate_node(struct flex_btree **anchor) {
	flex_alloc(anchor, sizeof **anchor);
	/* (((The following may look expensive, but CSE will speed it
	 * up as long as no functions are called))) */
	(*anchor)->left = 0;
	(*anchor)->right = 0;
	/* +Other initialisation */
}

So this might be used:-

	...
	/* New node needed */
	anchor = malloc(sizeof *anchor);    /* struct flex_btree **anchor */
	allocate_node(anchor);
	(*anchor_of_parent)->left = anchor;
	...

There is a problem:-

Allocating individual anchors using malloc() may result in (very)
bad internal fragmentation of the malloc() heap and is inefficient
because of the one word overhead per mallocated block in the RISC
OS implementation.  (Ie you only get 50% storage utilisation at
best for anchors).

The answer to this is to write your own anchor allocator (and this
allocator *can* be used for all anchors - not just flex_btree ones);
use the standard free list approach.  (NB - I have *only* compiled
the following code; no testing!  There should be some #ifdef DEBUG
consistency checking in here...)

#include <stdlib.h>

extern void reclaim_memory(void);

static void **anchor_free_list = 0;

/* Following allocates anchors 128 at a time. */

#define HANDLE_BLOCK_SIZE 128

static void make_anchors(void) {
	void **new = malloc(HANDLE_BLOCK_SIZE * sizeof *new);
	if (new) {
		int i;

		for (i=0; i<(HANDLE_BLOCK_SIZE-1); ++i)
			new[i] = new+i+1;   /* No cast, target is (void*) */
		/* Paranoia... routine may be called when we still
		 * have anchors? */
		new[i] = anchor_free_list;
		anchor_free_list = new;
		return;
	}

	/* Out of memory - reclaim_memory() tries to get mallocated
	 * space back from the rest of the system, causing the app
	 * to fail *gracefully* if it cannot.
	 */
	reclaim_memory();	/* External to module */
	make_anchors();
}

void **new_anchor(void) {
	void **anchor;

	if (anchor_free_list == 0) make_anchors();
	anchor = anchor_free_list;
	/* Isn't (void *) wonderful - no casting required! */
	anchor_free_list = *anchor;	/* Unlink head of list */)
	*anchor = 0;			/* Paranoia */
	return anchor;
}

void free_anchor(void **anchor) {
#ifdef FREEING_ANCHOR_FREES_FLEX_OBJECT
	/* Following code is slightly dubious; the caller must
	 * still be very careful to remove pointers to the anchor
	 * before freeing it!  Alternatively put in a paranoid
	 * check here and abort the program if *anchor is != 0.
	 */
	if (*anchor) flex_free(anchor);
#endif
	*anchor = anchor_free_list;
	anchor_free_list = anchor;
}

Even using this approach anchors are malloc()ed.  If the node size
in the original program is very small (so the total number of
allocated nodes was actually very large) the overhead of the anchor
allocations may still be unacceptable.  You can't get round this
with the current flex implementation - you really need to rearrange
the anchors themselves and free up unused memory in the anchor free
list.

Notice that, despite the words in the manual, the first allocated
flex block *CAN* move in 3.1B; this can arise if malloc() needs to
extend the heap (effectively the malloc heap is the first flex
block in 3.1B...).  In the original posters problem it is very
likely that the first flex block will move as a result of normal
malloc()ing in the program (the program uses a lot of heap!)
Because the stack extension mechanism uses the heap to obtain extra
stack space such a movement may occur on a function call; see the
example on page 226 of the manual; never pass (*anchor) as a function
argument, or assign it to a variable.

Future versions of the library will make it possible to avoid this
potential problem, in particular the default behaviour will be that
a program which uses flex will not be permitted to extend the heap
(after the call to flex_init()).  In *this* environment it
is possible to place the anchors in the first flex block and to
allocate new anchors using flex_extend() of that block (the rest
of flex space will be shunted up to make the space).  This is a
general and useful paradigm; either:-

1) After flex_init() allocate the initial anchor array using
flex_alloc() (make an initial call to make_anchors(), which would
then need to be external).  In make_anchors() use flex_extend() to
increase the size of the anchor array.  Every now and then go
through the anchor_free_list stripping off anchor slots which are
at the end of the array, then use flex_extend to contract the array
- a call to a function to do this should go in reclaim_memory().
*DO NOT* use heap_init/heap_alloc in the program.

Or:-

2) Do the same thing, but use heap_init/heap_alloc to allocate
anchor buffers in make_anchors (use heap_alloc exactly as malloc()
is used in make_anchors() above).  Use heap_free() when an array
of anchors is emptied.  You could allocate anchors individually,
but there *is* a storage overhead in heap_alloc() so this is not
really sensible.

RISC iX users should note that the RISC iX storage manager (malloc/free
- no flex in RISC iX) does *not* have this storage overhead problem;
it is sensible to use malloc() to allocate individual anchors in
RISC iX as it implements the block allocation algorithm internally.
On the other hand writers of portable UNIX programs should certainly
not rely on this.

General points:

i) Use RISC OS flex plus an extra level of indirection to allow
the OS to reclaim program data space.

ii) flex anchors must be allocated in static data space or in the
malloc heap.  (The stack is in the heap and is static, so you can
allocate automatically, but you *MUST* flex_free the anchor before
the function returns).  After 3.1B you will be able to allocate in
the first flex block (or the heap_alloc heap), but don't do this
at the moment.

iii) Use free lists to make efficient use of heap space for small
extensively used data structures.  With most malloc() implementations
free lists also have an enormous speed benefit (sometimes several
orders of magnitude, but this does *not* apply to the RISC iX
malloc).  They also have the normal code correctness benefits gained
from object specific interfaces; the compiler can do more type
checking in the allocate and free function calls and the code can
do better initialisation.

Free lists can be used for flex objects, but the list must be
maintained as a list of handles (meaning two objects per list entry)
and the algorithm defeats the object of using flex (that it be
possible to return memory to the OS when an object is no longer
used).

John Bowler (jbowler@acorn.co.uk)