Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!agate!labrea!russell!pereira
From: pereira@russell.STANFORD.EDU (Fernando Pereira)
Newsgroups: comp.lang.prolog
Subject: Re: Inline expansion versus threaded code
Summary: Paging kills
Keywords: paging threaded code
Message-ID: <8489@russell.STANFORD.EDU>
Date: 11 Apr 89 18:42:26 GMT
References: <1635@kulcs.kulcs.uucp>
Sender: pereira@russell.Stanford.EDU (Fernando Pereira)
Reply-To: pereira@russell.UUCP (Fernando Pereira)
Organization: Center for the Study of Language and Information, Stanford U.
Lines: 39

Our experience with large Prolog and C programs for natural-language
analysis, grammar analysis, simulation and speech understanding is
that soon after your program starts paging (even on an empty 32MB
Sun-4/280 with local SMD disk), you may as well forget about it. The
problem is that all of those applications create or use very large
relations over complex terms, leading to essentially random access to
most of program memory for Prolog or data area for C. For such
programs, compact encoding is essential in that it allows us to run an
n times larger problem without paging (n varying from 4 to 10
depending on the size ratio between native compiled code and threaded
code). It doesn't matter much to us that native code might be
between 1.5 and 3 times faster than threaded code on small or
well-behaved benchmarks, since native code would be too bulky to run
at all (not to mention the cost of the disk space for enormous swap
partitions).
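
To make the arithmetic concrete, here is a rough sketch (Python, purely
illustrative). It assumes the memory footprint is dominated by the
encoded program and grows linearly with problem size; the figure of
1.0 MB per problem unit for threaded code is a made-up number, and OS
overhead is ignored. Only the 32MB of memory and the 4 to 10 size
ratio come from the text above.

```python
RAM_MB = 32.0  # physical memory of the Sun-4/280 mentioned above


def largest_problem(ram_mb, mb_per_unit):
    """Largest problem size (arbitrary units) whose encoding fits in RAM."""
    return ram_mb / mb_per_unit


def compare(threaded_mb_per_unit, expansion, ram_mb=RAM_MB):
    """Return (threaded_limit, native_limit) in problem units.

    expansion is the native/threaded code size ratio (4 to 10 above).
    threaded_mb_per_unit is a hypothetical figure, not a measurement.
    """
    native_mb_per_unit = threaded_mb_per_unit * expansion
    return (largest_problem(ram_mb, threaded_mb_per_unit),
            largest_problem(ram_mb, native_mb_per_unit))


if __name__ == "__main__":
    for expansion in (4, 10):
        threaded, native = compare(1.0, expansion)
        print(f"size ratio {expansion}: threaded fits {threaded:.1f} units, "
              f"native fits {native:.1f} units before paging")
```

Under these assumptions the threaded encoding always handles a problem
"expansion" times larger before paging starts, and a 1.5 to 3 times
speed advantage for native code only counts below the smaller limit.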

Incidentally, one of the complaints most often heard about Lisp on
general-purpose machines is the enormous size of compiled code and
what that does to paging. This is particularly bad if the compiled
code is optimized for speed; in our experience, Sun/Lucid Common Lisp
cannot be used effectively on a Sun with less than 12MB of real
memory. In contrast, a Symbolics machine with 1M words (approx 4.5MB)
has adequate performance, and I believe this is in good part due to
the fact that the machine directly executes high-level Lisp
instructions, leading to much less bulky code.

One obvious reason why the conventional wisdom on threaded versus
native code may not apply to Prolog or Lisp is that most work on
language compilation and interpretation has been done for relatively
low-level languages like C, in which the average number of machine
instructions per basic language construct is substantially lower than
for Prolog or Lisp. Actually, this tradeoff is much less clear for
more sophisticated imperative languages like Algol68 or Simula67, in
which native-compiled code uses calls to out-of-line routines to do a
lot of the work (e.g. access to nonlocal variables via displays). It
was a very instructive experience to look at the code produced by the
DEC-10 Simula 67 compiler...

-- Fernando Pereira
pereira@ai.sri.com