Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cornell!uw-beaver!mit-eddie!mit-amt!mit-caf!vlcek
From: vlcek@mit-caf.MIT.EDU (Jim Vlcek)
Newsgroups: comp.lang.c
Subject: Indeterminate Array Sizing
Message-ID: <2297@mit-caf.MIT.EDU>
Date: 6 May 89 02:05:11 GMT
References: <10946@bloom-beacon.MIT.EDU> <12546@ut-emx.UUCP> <8758@csli.Stanford.EDU> <11021@bloom-beacon.MIT.EDU> <29978@apple.Apple.COM>
Reply-To: vlcek@mit-caf.UUCP (Jim Vlcek)
Organization: Microsystems Technology Laboratories, MIT
Lines: 36

In article <29978@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes:
Steve Summit, in <11021@bloom-beacon.MIT.EDU>, suggests:

``Getting back to data files, it's not necessary to know how big
  they are while reading them.  Just use code like the following:
	[ while data
	    if (not room)
		allocate another N bytes]''

And then Peter Desnoyers, in <29978@apple.Apple.COM>, rejoins:

``No, No. This algorithm takes O(n^^2) copies in the worst case. If
  you change your code to:

	while data
	  if not room
	    current_size *= 2
	    ptr = realloc( ptr, curr_size)

``(i.e. increase multiplicatively instead of additively) then you will
  only need O(n) copies. There is a direct tradeoff between wasted
  memory and cycles using this algorithm - if the multiplicative
  factor is N, then :

   	copies = 1 + 1/N
	mean wasted memory (fraction) = (N-1)/2N''

I've got a couple of questions here.  One, does the first algorithm
typically actually require O(n*n) copies?  If no other memory
allocation is taking place concurrently, does it not stand a chance of
needing no copying at all?  And, second, should one use the
multiplicative algorithm, is it useful to conserve memory by
performing a final realloc, after the last data element has been read
in, to the size of the array actually used?

Jim Vlcek (vlcek@caf.mit.edu  uunet!mit-eddie!mit-caf!vlcek)