Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!agate!labrea!russell!pereira
From: pereira@russell.STANFORD.EDU (Fernando Pereira)
Newsgroups: comp.lang.prolog
Subject: Re: Inline expansion versus threaded code
Summary: Paging kills
Keywords: paging threaded code
Message-ID: <8489@russell.STANFORD.EDU>
Date: 11 Apr 89 18:42:26 GMT
References: <1635@kulcs.kulcs.uucp>
Sender: pereira@russell.Stanford.EDU (Fernando Pereira)
Reply-To: pereira@russell.UUCP (Fernando Pereira)
Organization: Center for the Study of Language and Information, Stanford U.
Lines: 39

Our experience with large Prolog and C programs for natural-language
analysis, grammar analysis, simulation and speech understanding is that
soon after your program starts paging (even on an empty 32MB Sun-4/280
with local SMD disk), you may as well forget about it. The problem is
that all of those applications create or use very large relations over
complex terms, leading to essentially random access to most of program
memory (for Prolog) or the data area (for C). For such programs, compact
encoding is essential because it lets us run a problem n times larger
without paging (n varying from 4 to 10 depending on the size ratio
between native compiled code and threaded code). It doesn't matter much
to us that native code might be between 1.5 and 3 times faster than
threaded code on small or well-behaved benchmarks, since native code
would be too bulky to run at all (not to mention the cost of the disk
space for enormous swap partitions)!

Incidentally, one of the complaints most often heard about Lisp on
general-purpose machines is the enormous size of compiled code and what
that does to paging. This is particularly bad if the compiled code is
optimized for speed; in our experience, Sun/Lucid Common Lisp cannot be
used effectively on a Sun with less than 12MB of real memory.
In contrast, a Symbolics machine with 1M words (approx. 4.5MB) has
adequate performance, and I believe this is in good part because the
machine directly executes high-level Lisp instructions, leading to much
less bulky code.

One obvious reason why the conventional wisdom on threaded versus native
code may not apply to Prolog or Lisp is that most work on language
compilation and interpretation has been done for relatively low-level
languages like C, in which the average number of machine instructions
per basic language construct is substantially lower than for Prolog or
Lisp. Actually, this tradeoff is much less clear for more sophisticated
imperative languages like Algol 68 or Simula 67, in which native-compiled
code uses calls to out-of-line routines to do a lot of the work (e.g.
access to nonlocal variables via displays). It was a very instructive
experience to look at compiled code for the DEC-10 Simula 67 compiler...

-- 
Fernando Pereira
pereira@ai.sri.com