Xref: utzoo comp.editors:1200 gnu.emacs:2056 comp.unix.wizards:19955
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!zaphod.mps.ohio-state.edu!think!think.com
From: rlk@think.com (Robert Krawitz)
Newsgroups: comp.editors,gnu.emacs,comp.unix.wizards
Subject: Re: GNU Emacs, memory usage, releasing
Keywords: GNU emacs malloc memory working set gap editor
Message-ID: <32534@news.Think.COM>
Date: 31 Dec 89 18:53:27 GMT
References: <1558@aber-cs.UUCP>
Sender: news@Think.COM
Reply-To: rlk@think.com (Robert Krawitz)
Followup-To: comp.editors
Organization: Thinking Machines Corp., Cambridge MA
Lines: 53
cc: rlk
In-reply-to: pcg@aber-cs.UUCP (Piercarlo Grandi)

Very interesting note. It explains a large number of observations I
have made over the years (some of them I was aware of long before
reading your note, as I did a lot of work on rmail around 1985, but
this ties a lot of stuff together).

1) Rmail is very slow when getting new mail from an inbox. I was
aware of this very early, and I understood why (the gap). Rmail
normally has to convert each message to babyl format by making a few
small edits on each message. When I worked with pmd (personal mail
daemon), I put in code to permit mailboxes to be written in rmail
format, thereby not requiring any conversion to be done. This speeds
up emacs substantially.

However, certain operations (such as computing a summary buffer) are
still slow. This is in part because rmail writes the summary line
into the message header (to cache it for future use). I was never in
favor of this, but I never thought too hard about the fact that it
edits in the same pattern.

BTW, a favorite benchmark of mine involves the following: converting
a large number of messages (1000, say) to babyl format, and deleting
and expunging these same 1000 messages. The messages are deliberately
kept small (a very small header and one line of body) to minimize
paging effects.
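To make "the gap" concrete, here is a minimal sketch of a gap buffer in Python. It is a toy model, not Emacs's actual buffer code: the class name `GapBuffer`, the `chars_moved` counter and the default gap size are all assumptions made for illustration. The only point it shows is that an insertion is cheap when the gap is already at the edit position, and that every edit somewhere else first has to copy all the text between the old and new gap positions.

```python
class GapBuffer:
    """Toy gap buffer: storage is [text before gap][gap][text after gap]."""

    def __init__(self, text, gap_size=16):
        self.buf = list(text) + [None] * gap_size
        self.gap_start = len(text)      # first free slot
        self.gap_end = len(self.buf)    # one past the last free slot
        self.chars_moved = 0            # characters copied so far (illustrative counter)

    def _move_gap(self, pos):
        # Copy characters across the gap, one at a time, until it starts at pos.
        while self.gap_start > pos:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
            self.chars_moved += 1
        while self.gap_start < pos:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1
            self.chars_moved += 1

    def _grow(self, min_gap):
        # Gap too small: rebuild the storage with a larger gap.
        old_gap = self.gap_end - self.gap_start
        extra = max(min_gap - old_gap, len(self.buf))
        head = self.buf[:self.gap_start]
        tail = self.buf[self.gap_end:]
        self.buf = head + [None] * (old_gap + extra) + tail
        self.gap_end = self.gap_start + old_gap + extra
        self.chars_moved += len(tail)   # the tail had to be copied

    def insert(self, pos, text):
        self._move_gap(pos)
        if self.gap_end - self.gap_start < len(text):
            self._grow(len(text))
        for ch in text:
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


def edit_cost(positions, size=1000):
    """Insert one character at each position; return characters copied."""
    gb = GapBuffer("a" * size)
    for pos in positions:
        gb.insert(pos, "x")
    return gb.chars_moved


# Ten edits next to each other versus ten edits alternating between
# the two ends of the buffer.
local_cost = edit_cost(range(500, 510))
scattered_cost = edit_cost([0, 1000] * 5)
```

In this model the ten neighbouring edits pay for one gap move, while the alternating edits pay for a move across nearly the whole buffer each time.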
My experience was that the early IBM RT (which was otherwise a real
dog) could keep up with a Microvax II on this test, and that in
general RISC machines do extremely well on this test (they run emacs
very well in general, as it happens).

2) Emacs dies very quickly after its virtual size exceeds 16 Mbytes,
due to the 24 bit pointers used (the top 8 bits are used as tag bits
for the Lisp interpreter). I have frequently noticed that killing off
old buffers does not permit me to prolong the life of my emacs
session, and that an emacs with a Lisp buffer (which grows rapidly
but erratically) tends to run out of room quickly. This I assume is
due to the constant realloc'ing going on.

I don't necessarily agree that the issue is design for virtual memory
vs. swapping, by the way. There is a general problem in emacs with a
lot of things being scaled poorly, or otherwise underdesigned. For
example, the 24 bit limit on integers (23 bit signed integers in
lisp, 24 bit pointers internally), the inexplicable and seemingly
gratuitous divergences from common lisp, etc. The 24 bit
integer/pointer problem worried me even in 1985, but RMS wasn't too
interested in hearing about it. The problem is only really showing up
now (for example, my Sparcstation has 16 MB of physical memory and
100 MB swap, and I run big emacs processes).

Judging by your comments, the memory management scheme was similarly
unplanned. I don't think it was designed with swapping systems in
mind; I simply don't think it was designed to any great degree. A
real pity, since no other Unix editor shows any more design. I wish
it had been done right in the first place. It's not clear to me that
any of this will ever be fixed.
-- 
ames >>>>>>>>>  |	Robert Krawitz 245 First St.
bloom-beacon >  |think!rlk (postmaster) Cambridge, MA 02142
harvard >>>>>>  .	Thinking Machines Corp. (617)876-1111
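
The 16 Mbyte ceiling in point 2 follows directly from the arithmetic of the tagging scheme described there. The sketch below assumes a 32-bit word with the top 8 bits as a type tag and the low 24 bits as the address, which is how the post describes it; the function names and the tag value used are hypothetical and not taken from the Emacs sources.

```python
TAG_BITS = 8                           # top bits: type tag (per the post)
VALUE_BITS = 24                        # low bits: address
VALUE_MASK = (1 << VALUE_BITS) - 1     # 0x00FFFFFF

ADDRESSABLE_BYTES = 1 << VALUE_BITS    # 2**24 bytes = 16 Mbytes


def make_object(tag, address):
    """Pack a tag and an address into one 32-bit word (illustrative layout)."""
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag does not fit in 8 bits")
    if not 0 <= address <= VALUE_MASK:
        # Anything at or beyond 16 Mbytes cannot be represented at all.
        raise OverflowError("address does not fit in 24 bits")
    return (tag << VALUE_BITS) | address


def object_tag(word):
    return word >> VALUE_BITS


def object_address(word):
    return word & VALUE_MASK


HYPOTHETICAL_STRING_TAG = 3            # made-up tag value for the example
word = make_object(HYPOTHETICAL_STRING_TAG, 0x00ABCDEF)
```

Under this layout no tagged word can refer to memory at or above 2**24 bytes, so a process whose heap grows past that point has objects it cannot name.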