Xref: utzoo comp.editors:1199 gnu.emacs:2054 comp.unix.wizards:19951 Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!bu.edu!bu-cs!bloom-beacon!eru!luth!sunic!tut!santra!mcsun!ukc!edcastle!dcl-cs!aber-cs!pcg From: pcg@aber-cs.UUCP (Piercarlo Grandi) Newsgroups: comp.editors,gnu.emacs,comp.unix.wizards Subject: GNU Emacs, memory usage, releasing Summary: GNU Emacs *does* release memory, but should not use gap. Keywords: GNU emacs malloc memory working set gap editor Message-ID: <1558@aber-cs.UUCP> Date: 29 Dec 89 01:04:26 GMT Reply-To: pcg@cs.aber.ac.uk (Piercarlo Grandi) Organization: Dept of CS, UCW Aberystwyth (Disclaimer: my statements are purely personal) Lines: 130 There has been some discussion recently about the memory usage of GNU Emacs. I had stated that, using my sbrk-lowering malloc package, I could get micrognu etc... to release memory to the OS, but not GNU emacs. I speculated that this was because when allocating a buffer something is allocated after it that is not released when the buffer is deallocated. Well, I was wrong, but there is also an interesting story. I looked at the code in buffer.c and insdel.c, which are the low level editing engine parts of GNU emacs. What happens is that when a buffer is created, a buffer header is allocated, and then memory for the buffer contents is reserved (just a little initially, 20 bytes). When it is deleted, the contents and some other data structures are released, but *not* the buffer header and some dependent data structures. These are only released at the next garbage collection. Indeed, if after killing a buffer I collect, the virtual address space size of my GNU emacs process shrinks, as it should (using the proper malloc library. Don't ask for it, by the way -- it has been posted to alt.sources, and I will shortly contribute it to comp.sources, and/or to the FSF). This is not the end of the story. It also appears that the buffer contents are kept using the "gap" method, in which the entire file is represented as a single huge string, and at the (insertion) point there is a "gap". When the gap is filled by too many insertions, the gap is recreated by shifting the tail of the buffer upwards. Unfortunately as it is implemented in GNU emacs the "gap" method is a disgrace for performance because: 1) every time you move the insertion point by N characters, N characters are *copied* from one extreme of the gap to the other to shift it. 2) thank goodness the gap is moved only when the insertion point changes, not when the point changes as well. Too bad that it takes three seconds on a SUN 3/50 to move the gap by 750K... something like 250KBytes per second bandwidth, which is more like a poorly optimized disc or an Ethernet. 3) The copy is effected by the classic C loop, in segments of 32000 bytes: while (--i >= 0) *--to = *--from; while instead of course if i is greater than a few dozen bytes memcpy(3) or bcopy(3) should be used, hoping that these do cache line sized and aligned transfers to avoid catastrophic performance on byte writes. 4) You may argue that since the gap is created at about 2000 bytes, and usually many insertions are done at the same point, the cost of moving the gap is spread across many insertions. Too bad that the cost is still so terribly high. 5) the idea of using the gap method is not too convincing in itself, mostly because of its poor virtual memory profile. Most of the going around the huge line that is the buffer contents will wreack havock with most paging algorithms, because it is strictly sequential, and this is a known bad pattern of reference for most LRU based paging algorithms. When the gap is moved hell ensues. Programs with sequential access characteristics to memory have been reported to markedly improve their performance when it was possible to inform the pager of their peculiarity By the way, this is why, all other advantages notwithstanding, memory mapped files often perform less well than traditional io; the latter usually is heavvily tuned to expect sequential access patterns, both as to read ahead, and to forget behind. Larger page sizes help a bit with the former, but with some loss elsewhere. 6) Having a linked list of lines is a fairly cheap technology, and inasmush it can make the working set smaller (no need to move the gap) it will often actually *save* memory, even if it has an overhead for the links (often smaller however for small files than the cost of keeping a gap). 7) Most interestingly when the gap closes, because we have inserted that much worth of data, the *entire* buffer contents is realloc()ated. If the buffer is not the top one in virtual memory, or it is but you have a a realloc() that will not attempt just extending a block, but simply allocates a new block and copies the old into the new, you are in for extra overheads. They are no longer proportional to the amount of work you do, but to the total size of the file as well, which may be bad news. 8) most interestingly, realloc()ating the buffer will leave large holes in your virtual address space, that will expand forever, taking up if anything swap space. The holes are also likely to get fragmented as your session progresses (remember, GNU emacs has such high overheads that you are supposed to start it up once as a server and then use emacsclient as the "real" editor), and therefore the virtual memory profile of your session will worsen steadily. 9) GNU emacs also has some other interesting overheads. The screen display procedure is no speed daemon either... What are the solutions? Well, essentially GNU Emacs is not built for paged virtual memory machines; it it designed for core based segment swapping systems. It is not by chance that the gap method was very popular on base+limit relocation machines, and (I think) comes from the editor on the MIT CTSS. There are palliatives of course. The default size of the gap could be increased; this would make realloc()ations less frequent, and would not greatly increase the working set size, as the pages making up the gap are never referenced. A gap of say 4 pages per buffer may do, and actually save more address space than more frequent realloc()ations. Use also a nice realloc() that will try to avoid copying blocks to be extended as much as possible. Improving the locality of the lisp area with a compacting collector may help, as a faster, simpler redisplay. Profling GNU emacs can be great fun I suppose. The better solution, made relatively easy by the reasonably modular and layered structure of GNU emacs, would be to accept The Emacs Cookbook recommendation (adopted by Jove and the MicroEmacs/Gnu family of editors) and implement a linked list system. I would suggest just reading (or map, on the operating systems that allow it) the file to be edited as a large lump of virtual address space, then building a linked list of pointers to lines therein, and then to allocate modified lines from ageneral purpose arena. Informing the OS of the peculiar access patterns would also help, if possible. To keep the line descriptor size small (and thus its overhead) it would be perfectly feasible I think to have for example a byte sized line length field, and then use hash linking for storing the lengths of the (expectedly few) lines longer than 254 characters. Many other tricks could be used. I am very busy currently doing other work for the greater glory of the FSF, so I hope somebody else will enjoy the honour to do something about this. If Toshiba, Matsushita, Samsung and the other DRAM suppliers don't get him/her first :->. -- Piercarlo "Peter" Grandi | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk Dept of CS, UCW Aberystwyth | UUCP: ...!mcvax!ukc!aber-cs!pcg Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk