Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!know!zaphod.mps.ohio-state.edu!wuarchive!uunet!mcsun!ukc!edcastle!refson
From: refson@castle.ed.ac.uk (Keith Refson)
Newsgroups: comp.unix.cray
Subject: Re: Malloc() under UNICOS
Message-ID: <5998@castle.ed.ac.uk>
Date: 29 Aug 90 13:43:04 GMT
References: <5931@castle.ed.ac.uk>
Reply-To: keith@earth.ox.ac.uk
Organization: Oxford University Earth Sciences Department
Lines: 78

I would like to thank the many people who responded with helpful
suggestions to my query about speeding up memory allocation.
Unfortunately, since I did not give details of my program, the replies
did not exactly address the problem. However, with some of the advice I
received I have made considerable progress, which I thought it might be
useful to pass on.

My program is a molecular dynamics simulation code - a repetitive
procedure which runs for several tens of thousands of iterations, each
of which may take seconds of CPU and each of which has an identical
call sequence. It uses dynamic memory as run-time dimensioned arrays in
a stack-like manner; that is, memory is allocated on function entry and
freed upon exit. The upshot of this is that the 100000th iteration
should use no more memory than the first, and the heap does not need to
grow as the run proceeds. Furthermore, it is allocating a few big chunks
rather than many small ones. In my test case, a run of 10 iterations
which took 1.3s of CPU, calloc() was called only 1154 times and yet was
using about 10% of the CPU.

Now comes the curious part. To see what was actually going on I
replaced the calloc() call with malloc() and memset() (in my interface
function, which handles all memory allocation). The time spent in
memory allocation (malloc()+memset()) decreased by a factor of 3
compared with calloc()! Now surely calloc() should be entirely
equivalent to calling malloc()+memset(). Is there some gross
inefficiency in Cray's implementation of calloc()?
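
A minimal sketch of the kind of interface function described above, in
standard C rather than anything Cray-specific. The name zeroed_alloc and
its signature are hypothetical; the original routine was not posted. It
simply pairs malloc() with memset(), which is the substitution that gave
the factor of 3 reported here.

```c
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* malloc */
#include <string.h>   /* memset */

/*
 * zeroed_alloc: hypothetical stand-in for the program's single
 * allocation routine. Returns n*size bytes set to zero, or NULL on
 * overflow or allocation failure. Release the result with free().
 */
void *zeroed_alloc(size_t n, size_t size)
{
    void *p;

    /* Refuse requests whose byte count would overflow size_t. */
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;

    p = malloc(n * size);
    if (p != NULL)
        memset(p, 0, n * size);   /* the zeroing that calloc() would do */
    return p;
}
```

A function would call zeroed_alloc() for its work arrays on entry and
free() them on exit, so the heap need not grow from one iteration to
the next. Whether this beats calloc() on another system or library
version is something to measure rather than assume.
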
I am using C version 5.0, but I don't know the library version.

Working with the malloc()+memset() version, I then tried out some of
the suggestions to see if I could get a further improvement. By far the
most useful was that of Steve Larson of CRI, which was to use the loader
directives "HEAP=50000+50000;STACK=10000+10000" to set the heap and
stack's initial size and increment to values more appropriate to my
code. This not only reduced the amount of time spent in malloc() by a
further factor of three but also reduced the stack management overhead
(I presume this is the time charged to the subroutines $STKOFEN,
$STKUFEX and $STKCR%) from 6% of total execution time to nothing.

I still don't understand, though, why a code which performs so few
malloc() calls should be affected by this. Using the procstat and
procrpt commands (thank you again, Steve) showed that with the default
heap size there were 18 calls to the memory processor irrespective of
run length (10 or 100 iterations). Nevertheless, the memory processor
used 0.85s for the hundred-step run compared with 0.007s for the
10-step run (out of 10s and 1s respectively). Setting the heap and stack
as above reduced the number of calls to 1 and the time spent to
effectively zero. Does anybody out there know why? Steve?

Kent Koeniger suggested that doing one enormous malloc() call at the
beginning and immediately freeing it would help by reducing the
system-call overhead. I think that the loader directives achieve the
same effect more elegantly.

Others (Rod Meyer, Ethan Miller, Ted Stockwell) gave suggestions which
amounted to re-implementing malloc(), tailored to my application. Some
(please forgive me if I slander you unduly) appeared to think that
malloc() was an expensive system call to be avoided at all costs,
rather than a library function. I thought that malloc() was an
interface to the real system call sbrk(), designed precisely to avoid
excessive system overhead. Surely an efficient malloc() should be part
of every C library.

To summarize:

There is a curious inefficiency in calloc(). Use malloc() and memset()
instead.

Set the heap and stack initial sizes and increments to suitable values.
I imagine there are cases where there could be a much bigger gain than
mine.

Thanks to all those who responded.

----------------------------------------------------------------------------
| Keith Refson                            |ROYAL MAIL:                     |
|-----------------------------------------| Department of Earth Sciences   |
| JANET   : keith@uk.ac.ox.earth          | Parks Road                     |
| INTERNET: keith@earth.ox.ac.uk          | Oxford OX1 3PR                 |
| BITNET  : keith%uk.ac.ox.earth@ukacrl   | UK                             |
| UUCP    : K.Refson@ed.uucp              |PHONE : +44 865 272026/272016   |
|         : keith%uk.ac.ox.earth@ukc.uucp |FAX   : +44 865 272072          |
----------------------------------------------------------------------------