Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!purdue!haven!uvaarpa!mcnc!ecsvax!dukeac!bet
From: bet@dukeac.UUCP (Bennett Todd)
Newsgroups: comp.lang.c
Subject: Re: binary data files
Message-ID: <1387@dukeac.UUCP>
Date: 3 May 89 18:38:58 GMT
References: <10946@bloom-beacon.MIT.EDU> <12546@ut-emx.UUCP> <8758@csli.Stanford.EDU> <11021@bloom-beacon.MIT.EDU>
Reply-To: bet@dukeac.UUCP (Bennett Todd)
Organization: Radiology, Duke Med. Center, Durham, NC
Lines: 37

In article <11021@bloom-beacon.MIT.EDU> scs@adam.pika.mit.edu (Steve Summit)
writes that assuming you can stat a file for its size breaks down on non-UNIX
systems, and recommends reading into a dynamically grown buffer, which he
grows linearly.

I have often done similar things. The getline() routine in my libbent does a
conceptually similar job for reading arbitrarily long text lines (where you
don't know in advance how many bytes to allocate to last you until the next
newline on input). Also, in an image I/O and manipulation library I wrote, I
wanted to be able to read an image from a pipe. I disbelieve in header
parsing, and deduce image dimensions from the file length, so I had to do
roughly the same thing.

However, I am not sure I like the linear reallocation strategy. I would tend
to assume, in general, that realloc would usually be implemented as a
malloc/memcpy/free sequence, and thus I try to avoid working it too hard. I
found a binary growth algorithm easy to code, however; basically it looks
just like Steve's linear algorithm except instead of

	nallocated += 10;

I use

	nallocated *= 2;

Also, I start with somewhat larger allocations; for getline() I started with
128, and with the image reading facility I start with 65536. Finally, where
you are reading until EOF, by all means issue one big read for each realloc,
rather than reading along one or two bytes at a time.
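
A minimal sketch of the doubling strategy described above might look like the
following. It is an illustration only, not the code from libbent or the image
library: the name read_all(), its parameters, and the overflow check are all
assumptions made for the example. It reads a stdio stream to EOF, doubles the
buffer each time it fills, and issues one fread() per realloc().

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from libbent): read everything from fp into a
 * malloc'd buffer that doubles in size whenever it fills up.
 * "initial" is the starting allocation; 0 selects 65536, the figure the
 * post mentions for image reading.
 * On success returns the buffer and stores the byte count in *len_out;
 * returns NULL on allocation failure or read error. Caller frees.
 */
char *read_all(FILE *fp, size_t initial, size_t *len_out)
{
    size_t nallocated = initial ? initial : 65536;
    size_t nused = 0;
    size_t want, got;
    char *buf, *tmp;

    buf = malloc(nallocated);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* One big read that tries to fill all the space currently free. */
        want = nallocated - nused;
        got = fread(buf + nused, 1, want, fp);
        nused += got;
        if (got < want)
            break;              /* short count: EOF or read error */

        /* Buffer is full: grow by doubling, not by nallocated += 10. */
        if (nallocated > (size_t)-1 / 2) {
            free(buf);          /* doubling would overflow size_t */
            return NULL;
        }
        nallocated *= 2;
        tmp = realloc(buf, nallocated);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    *len_out = nused;
    return buf;
}
```

A caller would use it as buf = read_all(stdin, 0, &len) and free(buf) when
done. If realloc really does copy the old block on every call, as assumed
above, growing by a constant 10 copies on the order of n*n/20 bytes in total
to reach n bytes, while doubling copies fewer than 2n bytes once n exceeds
the initial allocation.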

I actually haven't done any performance measurements to determine whether I
am buying any speed with this strategy; however, it isn't much harder to
code, and I am sure on some (if not most) machines the vendor doesn't take
enough care to optimize performance of realloc().

-Bennett
bet@orion.mc.duke.edu