Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!lll-winken!uunet!munnari!murtoa.cs.mu.oz.au!munnari.oz!lee
From: lee@munnari.oz (Lee Naish)
Newsgroups: comp.lang.prolog
Subject: Re: Inline expansion versus threaded code
Keywords: paging threaded code
Message-ID: <1399@murtoa.cs.mu.oz.au>
Date: 14 Apr 89 07:24:15 GMT
References: <1635@kulcs.kulcs.uucp> <8489@russell.STANFORD.EDU>
Sender: news@cs.mu.oz.au
Reply-To: lee@munmurra.UUCP (Lee Naish)
Organization: University of Melbourne, Comp Sci Dept
Lines: 40

pereira@russell.UUCP (Fernando Pereira) writes:
>soon after your program starts paging
>you may as well forget about it
>compact encoding is essential in that it allows us to run an
>n times larger problem without paging (n varying from 4 to 10
>depending on the size ratio between native compiled code and threaded
>code)

When I was visiting SICS over the northern winter I ran the MTS natural
language system, written by Xiuming Huang, under Sicstus Prolog using
the bytecode emulator and the (new) native code system. I have to
agree that paging really kills the system. However, the code size
factor was not as great as Fernando suggests in this system. The
native code version was between 14 and 15 Mb; the emulated system
between 6 and 7 Mb (if I recall correctly). Sicstus emulated code is
not particularly compact (instructions are halfword or word aligned to
speed up loading of instructions and operands) but I think that is true
for many Prolog systems (eg, Quintus uses 16 bit opcodes, I've heard).

About half the code in MTS is complex dcg rules and about half is the
lexicon (large sets of facts). I'm not sure how the size and speed
ratios compare with these two rather different types of code. It may
be important for optimization (eg, it might be best to compile one with
native code and emulate the other).

Locality is another important issue. The working set of the lexicon
code is related to the number of different words in the sentence/text
being processed. This is probably reasonably small, even though the
fine grain locality is poor. For the grammar rules, fine grain
locality is probably better, but the quantity of code used overall is
larger.

In the longer term, parallelism may have a significant role to play.
Having lots of memory attached to a single processor means the memory
is not used efficiently. Shared memory multiprocessors make better use
of memory and parallelism can also absorb some memory latency due to
paging (while one bit of the computation is waiting for a page, another
bit can be done). In other words, the processor utilisation is
increased also. You have to be careful to avoid thrashing of course.

	lee