Xref: utzoo comp.lang.prolog:1163 comp.lang.misc:1750
Path: utzoo!attcan!uunet!husc6!uwvax!umn-d-ub!rutgers!mit-eddie!uw-beaver!teknowledge-vaxc!sri-unix!quintus!ok
From: ok@quintus
Newsgroups: comp.lang.prolog,comp.lang.misc
Subject: Is ICON higher level than Prolog?
Message-ID: <260@quintus.UUCP>
Date: 8 Aug 88 03:23:37 GMT
Sender: news@quintus.UUCP
Reply-To: ok@quintus ()
Organization: Quintus Computer Systems, Inc.
Lines: 98

ICON (described in "The ICON Programming Language", Griswold & Griswold,
Prentice-Hall, 1983) is a descendant of SNOBOL.  It has a fairly
conventional syntax, but adds backtracking, strings, hash tables,
and pattern-matching (patterns are not a data type as in SNOBOL, but
you can do most of the same sorts of things).

Version 7 of ICON (the current one) is implemented (in C and YACC) as
a compiler which produces abstract instructions which are then interpreted
by another C program.  This is rather like the Berkeley Pascal "pi"
compiler and "px" interpreter.

Version 5 of ICON (the one described in the book) had two implementations
sharing a common front end: an "abstract instruction" version, and one
which produced native code.  However, the book says that "a difference of
5% to 10%" in speed between "compiled" and "interpreted" versions was
typical.  So the ICON group decided that it wasn't worth maintaining the
native-code version any longer.

What I'd like to understand is why the difference was so slight.
Let's put it in context (bigger numbers mean faster, each language's
C-coded interpreter set to 1):

 1   Pascal, Berkeley "pi" + "px"
36   Pascal, Berkeley "pc"		-- my measurement on a SUN-3

 1   Prolog, WAM emulated in C
 2   Prolog, WAM emulated in assembly code
 4   Prolog, WAM macro-expanded and peephole-optimised to native code

 1.0 ICON, Version 5 "icont" + "iconx"
 1.1 ICON, Version 5 "iconc"

It's a long time since I was sure of my facts about Berkeley Pascal, but
I don't recall the abstract instructions as being designed for particularly
fast emulation, or the compiler as doing much optimisation.  I suspect that
a ratio of 4 to 10 for "simple-minded native code" -vs- "good C emulator"
should be more typical of Pascal.

So why is the ICON ratio so low?

It could be that the emulator was unusually good, or the compiler unusually
bad, or it could be something about the language.  To put _that_ in
context, I decided to take a Prolog program and recode it in ICON.  Since
I happened to have it handy, I used the "darlington-to-workington" example
on page 163 of the 3rd edition of Clocksin & Mellish (section 7.9).
The total times to find all shortest paths from each town were
    original Prolog version	4.8 seconds
    ICON version		6.2 seconds
    comparable Prolog version   0.5 seconds

ohe original Prolog version is the one which appears in Clocksin & Mellish.
It uses "findall", which is a fairly expensive operation, and uses a
quadratic algorithm for finding the shortest element of a list.
The ICON version, not being able to backtrack over a Prolog-like data base,
backtracks over a vector of triples, but is otherwise determinate.
The second Prolog version is much closer to the ICON version in structure.

The ICON version suffers from not having lists (in the sense of Prolog and
Lisp).  Instead it has extensible vectors (and calls them lists).  This is
a neat idea, but the overheads of the ICON implementation are high.  So I
declared a "pair" record type (pair(Cost,Path) in ICON corresponding to
Cost-Path in Prolog) and a "cons" record type (cons(Head,Tail) in ICON
corresponding to [Head|Tail]) in Prolog.  Allow a factor of 3 for the
size of the data structures (ICON data structures taking up a lot of
memory), and a factor of 2 to allow for the fact that the ICON emulator
is coded in C and the Quintus Prolog emulator isn't, and we're looking
at a ratio of about 2 to 1 between ICON and Prolog [on _this_ task, as
coded by me, running on a Sun-3/50, & other context].

The bottom line of this comparison is that the ICON interpreter doesn't seem
to be especially good or especially bad.  If ICON were like Prolog, we
would expect about a factor of 4 between the speed of interpreted code and
the speed of native code.  That fact that this was _not_ observed suggests
that either the compiler or the language was the problem.

It's worth drawing a distinction between polymorphic languages like ML
and typed Prolog, and languages with overloading, like PL/I and ICON.
An example of a polymorphic operation is:
	-- len: * list -> int
	++ len nil = 0
	++ len H.T = 1 + len T
which computes the length of a list, be the type of the elements what it may.
An example of an overloaded operation is:
	*X
which yields: the cardinality of a character set, the cardinality of a
(general) set, the length of a string, the length of a vector, the
number of fields in a record, &c, each of which is a different machine-
level operation.  Similarly (courtesy of automatic run-time coercion),
X||Y (string concatenation) does not correspond to a single operation either.

So I _think_ the low ratio for ICON wasn't do to the compiler either, but 
to the fact that there is so little that the compiler can do.

Would anyone who understands (I have the ICON Implementation book, but have
not finished it yet and still can't pretend to understand) the ICON
implementation care to comment on this?

[I'm quite happy with ICON's speed: it's quite a bit faster than AWK....]