Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!uwm.edu!zaphod.mps.ohio-state.edu!tut.cis.ohio-state.edu!ucbvax!ENG.SUN.COM!wmb
From: wmb@ENG.SUN.COM
Newsgroups: comp.lang.forth
Subject: Re: Why is Postscript not Forth?
Message-ID: <9002161509.AA19458@jade.berkeley.edu>
Date: 15 Feb 90 19:54:00 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Reply-To: wmb@ENG.SUN.COM
Organization: The Internet
Lines: 105

> I'm not yet willing to concede [ that PostScript's one closure vs.
> Forth's many closures is a fundamental difference ].  If it is fairly
> simple to create the other language's closures then I don't see this as
> fundamental.

It is not easy to create PostScript's closure on top of a Forth closure,
at least not in a portable way.  It is not so hard to do if you know
the implementation details of a particular Forth (John Wavrik posted
a particular solution a while back), but I tried to come up with a
general solution and failed.  It is certainly possible to implement the
PostScript closure "manually" in Forth (using an explicit array, with
a high-level word to interpret its contents), but it's hard to do it
portably "on top of" an existing Forth closure.
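
As a rough illustration of the "manual" approach (an explicit array plus
a high-level word that interprets its contents), here is a toy sketch in
Python.  It is not taken from any real Forth or PostScript system; the
names execute, words, square and fourth are made up for the example.

```python
# Toy model: a "procedure" is just an array (a Python list) of objects,
# and one high-level routine interprets its contents.

def execute(proc, stack, words):
    """Interpret the array `proc` element by element."""
    for item in proc:
        if isinstance(item, str):        # a name: look it up and run it
            target = words[item]
            if callable(target):         # primitive operation
                target(stack)
            else:                        # another array-procedure
                execute(target, stack, words)
        else:                            # anything else is a literal
            stack.append(item)

words = {
    "dup": lambda s: s.append(s[-1]),
    "mul": lambda s: s.append(s.pop() * s.pop()),
}
words["square"] = ["dup", "mul"]         # a definition is ordinary data
words["fourth"] = ["square", "square"]

stack = [3]
execute(["fourth"], stack, words)
print(stack)
```

Because every definition is the same kind of array, there is only one
kind of closure to interpret, which is the property under discussion.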

On the other hand, I had no trouble implementing Forth's many closures on
top of the PostScript one.

> [ Forth's direct access to hardware data types, PostScript's lack of
> such access ]

In PostScript, you can't look at the bits in a stack cell, and you also
can't pretend that a number is an address and try to "@" that location.
In nearly all Forth implementations, both of these are possible.

In PostScript, stack objects are typed, and if you try to use e.g. a logical
operator on a real-number object, you get an error signal (typecheck) and
the operation aborts.
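
A toy sketch in Python of what tagged stack objects imply for an
operator.  The names TypeCheck and op_and are invented for the example,
and the set of operand types this toy accepts is an assumption of the
sketch rather than a statement about any particular interpreter.

```python
class TypeCheck(Exception):
    """Stands in for the interpreter's error signal."""

def op_and(stack):
    """Toy 'and': accepts two booleans or two integers, nothing else."""
    b, a = stack.pop(), stack.pop()
    if type(a) is not type(b) or type(a) not in (bool, int):
        stack.extend([a, b])     # put the operands back; nothing happened
        raise TypeCheck("and")
    stack.append(a & b)

ok = [True, False]
op_and(ok)                       # operands had matching, accepted types

bad = [1.5, True]                # a real where a boolean was expected
try:
    op_and(bad)
    aborted = False
except TypeCheck:
    aborted = True               # the operation aborted
```

The operator never gets to treat the bits of the real as if they were a
boolean or an address; the tag is checked first.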

> >    way while executing a definition (specifically, you cannot, for
> >    instance, easily write a Forth word which creates a word called
> >    "foo", unless you can arrange for the name "foo" to appear in
> >    the input stream at the time your word is executed).
> I take it from this that by 'easily' you mean: 'without knowing how the
> dictionary structure of the Forth you are using works,' or perhaps, 'in a
> portable manner, across more than one type of Forth.'?

Specifically, I mean that I have tried several times to do this and have
never come up with a good portable way of doing it.  Every approach I
found resorted to extremely gruesome mucking about with the mechanics of
the input stream.  That isn't very portable, because most of the Forth
systems that I care about have extended those input stream mechanisms to
cope with text files, and the input stream hacks aren't portable across
those systems.
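
The shape of the problem can be shown with a toy model in Python.  This
is only a sketch: ToyForth, create and create_named are hypothetical
names, and create_named is exactly the kind of non-parsing definer that
the portable systems under discussion do not offer.

```python
class ToyForth:
    def __init__(self, source):
        self.input = source.split()  # remaining input stream, as tokens
        self.words = {}              # the dictionary

    def create(self):
        """Parsing definer: the new name is the next input token."""
        name = self.input.pop(0)
        self.words[name] = []
        return name

    def create_named(self, name):
        """Hypothetical non-parsing definer: the name arrives as data."""
        self.words[name] = []
        return name

def make_foo_parsing(forth):
    # Intends to create "foo", but can only take what the input holds.
    return forth.create()

def make_foo_by_name(forth):
    return forth.create_named("foo")

forth = ToyForth("bar baz")
first = make_foo_parsing(forth)      # consumes "bar" from the input
second = make_foo_by_name(forth)     # leaves the input stream alone
print(first, second, forth.input)
```

With only the parsing definer available, the sole way to get "foo" is to
arrange for "foo" to be the next thing in the input stream, which is
where the non-portable hacks come in.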

There may be some good ANSI news on this input mechanism front, however.
At the last meeting, I led a group that worked out a specification for
dealing with text input files, and the results are pretty encouraging.
The basic proposals of the scheme were passed on the last day, and
several more proposals, completing the scheme, are pending.

> ... Forth's confusion about strings, and the possibility of defining
> your own strings package ...

The argument that "you can define your own" is often cited in defense
of Forth's lack of particular features.  However, many people do not
wish to have to "roll their own" this and that.  Perhaps they do not
have the skill.  Perhaps they do not have the time.  Perhaps they would
rather concentrate on their application without having to build up
the tool base by themselves.  Having well-debugged, optimized, supported
tool packages (e.g. strings) can save application developers time and
money.  Given the choice between "rolling your own" and buying, the
"buy" decision is often economically sound.

> > 1) "adr len"  Address and length of array of bytes.  This is the
> >    best representation.  Example operator: TYPE
> Hmmm, lets not pretend the string issue exists in a vacuum.
> There are those of use weened on 'C' that would claim NULL terminated
> strings are best.  I'd rather avoid religious rhetoric and stick to
> verifiable/demonstrable claims about real situations.

This isn't rhetoric.  "Adr len" strings are objectively best, in that
a) Any character can appear in a string (null-terminated and tagged
   strings are weak in this respect).
b) Many types of string manipulation can be performed on "adr len" strings
   without copying, without allocation of extra memory, and without
   concerns about "read-only" storage.  (counted strings are weak in
   these respects).
c) An "adr len" string can be arbitrarily long.
d) Any region of memory can be described as an "adr len" string without
   requiring preallocation of space for either a count byte or a delimiter
   byte.
The one weakness of "adr len" strings is an issue of convenience.  There
are 2 things on the stack instead of 1.

I believe that the above claims are both verifiable and demonstrable.
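
A small demonstration of points (a), (b) and (d), sketched in Python
with a bytearray standing in for memory and an (adr, len) pair of
integers standing in for the two stack items.  The helper names view,
skip and nul_terminated_len are invented for the example.

```python
buf = bytearray(b"hello\x00world, and more")

def view(adr, length):
    """Describe a region of buf as (adr, len); no bytes are copied."""
    return memoryview(buf)[adr:adr + length]

def skip(adr, length, n):
    """Drop the first n characters by adjusting the pair only."""
    return adr + n, length - n

def nul_terminated_len(adr):
    """Length as a NUL-terminated reader would see it."""
    return buf.index(0, adr) - adr

text = (0, 11)                     # "hello\0world", NUL included
seen_by_c = nul_terminated_len(0)  # stops at the embedded NUL
tail = skip(*text, 6)              # "world", in the same storage
tail_bytes = view(*tail).tobytes()
view(*tail)[0] = ord("W")          # writes through to buf: not a copy
```

The substring is produced by arithmetic on the pair alone; no count byte
or delimiter has to be stored next to the characters, so the region
after it ("," and onward) is undisturbed.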

The ANSI committee has settled upon the "adr len" representation for
all new functions with string arguments.


> [ virtual machine vs. high level assembler ]
> I answer 'maybe'.  Perhaps what is needed is to avoid having to specify
> 'bit-level details of how the dictionary is implemented' and to start to
> address the question of how to provide a set of words that will achieve
> certain semantic-effect-manipulations of the dictionary.  The
> definitions of those dictionary-smashing words will mostly likely be
> non-portable, but code that uses them would be portable.  And depending on
> how you define those words, they might even be immediate words that 'comma
> in' to the word being defined fast code for direct manipulation, thus
> avoiding one level of nesting and yet still not sacrifice portability.

I agree entirely.  I published a paper in one of the FORML proceedings
proposing such a set of "dictionary abstraction" words.  Furthermore,
one such word (COMPILE-TOKEN) is currently on the table at ANSI.  It was
the subject of much fierce debate at the last meeting.

Mitch