Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!snorkelwacker!apple!agate!ucbvax!MITCH.ENG.SUN.COM!wmb
From: wmb@MITCH.ENG.SUN.COM
Newsgroups: comp.lang.forth
Subject: Re: Data Structures
Message-ID: <9009142026.AA19345@ucbvax.Berkeley.EDU>
Date: 14 Sep 90 16:34:04 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Reply-To: wmb%MITCH.ENG.SUN.COM@SCFVM.GSFC.NASA.GOV
Organization: The Internet
Lines: 31

>   2.) Why not have counted strings that are optionally null-terminated?

In my system, counted strings are *always* null-terminated.  The null
is not included in the count.  This makes it easy to pass strings to C.
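
For illustration only, here is a sketch of what such a layout could look
like: count byte, then the characters, then a null that the count does not
cover.  The word name "place-cstr" is hypothetical (it is not taken from my
system), and the sketch assumes the string is at most 255 characters, that
source and destination do not overlap, and that the destination has room
for len+2 bytes.

```forth
\ Hypothetical word: store "adr len" at dest as a counted string
\ followed by a null byte that is NOT included in the count.
: place-cstr  ( adr len dest -- )
   2dup c!                 \ count byte at dest
   2dup + 1+  0 swap c!    \ null byte just past the last character
   1+ swap move ;          \ copy the characters to dest+1
```

After this, "count" on dest gives back "adr len", and dest+1 can be handed
to C as a null-terminated string.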

HOWEVER, I do not recommend this!  I recommend that counted strings be
*eliminated entirely*.  Forth should only have one visible string
representation (currently it has 4), and that representation should be
"adr len" on the stack.  The "adr len" representation is by far the
most flexible, and does not suffer from the fundamental problems of
counted strings or null-terminated strings.

Those problems are:

1) Counted strings have an inherent length limitation (255 characters
   with a one-byte count).
2) You can't just point to some memory and say it's a counted string or
   a null-terminated string; instead you have to copy it somewhere so you
   can insert either the count byte or the null byte.
3) Null-terminated strings cannot represent byte arrays containing the
   null byte.
4) Extraction of substrings of counted strings and null-terminated strings
   requires copying.
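
Point 4 can be sketched as follows.  With "adr len", taking a substring is
just arithmetic on the two stack items; nothing is copied.  The name
"substring" is hypothetical, and the sketch assumes a system that provides
s" and /string (as in the draft ANS string words) and that start and count
are non-negative.

```forth
\ Hypothetical word: return the part of a string beginning at offset
\ start, at most count characters long, without copying anything.
: substring  ( adr len start count -- adr' len' )
   >r            \ set count aside
   over min      \ clamp start to the string length
   /string       \ step past the first start characters
   r> min ;      \ keep at most count characters

: demo  ( -- )  s" counted strings" 8 3 substring type ;   \ prints: str
```

The result points into the original memory, so it stays valid only as long
as the original string does.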

The only downside of "adr len" strings is the stack manipulation required,
and that turns out not to be too bad once you get used to using "2dup"
and "2swap".
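
As a rough illustration of that style, the words below treat each string
as a two-cell unit.  The names "bracketed", "echo" and "show-both" are made
up for this example.

```forth
\ Hypothetical helper words for illustration.
: bracketed  ( adr len -- )  ." [" type ." ]" ;

\ 2dup keeps a copy of the whole string for the caller.
: echo  ( adr len -- adr len )  2dup bracketed ;

\ 2swap brings the first string to the top so that the two strings
\ are printed in the order they were pushed.
: show-both  ( adr1 len1 adr2 len2 -- )
   2swap bracketed space bracketed ;
```

The rule of thumb is to reach for the "2" version of a stack word
whenever the thing being moved is a string.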

Over time, I am weeding out all externally-visible uses of counted strings
in my system.

Mitch Bradley, wmb@Eng.Sun.COM