Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!ames!ll-xn!mit-eddie!uw-beaver!cornell!rochester!PT.CS.CMU.EDU!SPICE.CS.CMU.EDU!skef
From: skef@SPICE.CS.CMU.EDU (Skef Wholey)
Newsgroups: comp.lang.misc
Subject: Re: The Joy of Zero-based Arrays
Message-ID: <1026@PT.CS.CMU.EDU>
Date: 3 Mar 88 20:51:55 GMT
References: <1012@PT.CS.CMU.EDU> <17419@think.UUCP>
Sender: netnews@PT.CS.CMU.EDU
Organization: Carnegie-Mellon University, CS/RI
Lines: 85

>From: barmar@think.COM (Barry Margolin)
>[...]
>I've used both PL/I (which allows arbitrary array indices, but
>defaults to 1-based, and only has 1-based strings) and Lisp, which
>only has 0-based arrays.  It hardly needs to be said that the best
>thing is to allow the programmer to specify the index base.  I've had
>my share of fencepost errors in both languages, but I think I prefer
>1-based arrays.

I can't contest your preference.  The articles I was responding to,
though, asserted that zero-based arrays had the "obvious" disadvantage
that things were "off by one a lot."  I hope I've shown that this isn't
really the case.

You (Barmar) bring up the good point that if you've got the right set of
iteration constructs, one-based arrays can be dealt with as cleanly as
zero-based arrays.  I agree with this, and almost said as much in my
previous post.  In Lisp the user can define these things as macros if
they aren't provided (as by Zetalisp's LOOP).  PL/I has these things,
but then, what doesn't it have?

This zero-based/one-based discussion arose from the description of the
Oberon programming language.  Oberon, like Pascal and Modula-2 and even
C, provides a few iteration constructs and no practical, general way of
defining more (the C preprocessor leaves much to be desired as a
macro-writing language).  So, given these restrictions, I would say that
if you're going to have a limited number of iteration constructs AND
fixed-based arrays, the iteration constructs should be chosen to reflect
the base (be it zero or one) of your arrays.  (To put it another way: If
your language is going to put you in a straitjacket, it should at
least be a comfortable one.)

>>  (do ((i start (1+ i)))
>>      ((= i (1+ end)))
>>    (frob (oref array i))) ; "Oref" for "One-based Aref"
>
>actually, we generally used (> i end) as our end test.

Yeah, the "system" code I write usually does that -- you can't trust
lusers not to pass you bad indices.  But my personal taste is that =
looks much nicer than >.  No biggie.
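
A minimal sketch of the difference between the two end tests, in Python
rather than Lisp.  The function names and the `limit` safety valve are
made up for this illustration; they are not from anyone's system code.

```python
def visit_with_eq_test(start, end, limit=1000):
    """Visit start..end inclusive, stopping when i == end + 1 (the "=" test)."""
    visited = []
    i = start
    while i != end + 1:
        if len(visited) >= limit:
            # Demo-only safety valve: the "=" test never fires when
            # start > end + 1, so the loop would otherwise run away.
            return None
        visited.append(i)
        i += 1
    return visited


def visit_with_gt_test(start, end):
    """Visit start..end inclusive, stopping when i > end (the ">" test)."""
    visited = []
    i = start
    while not i > end:
        visited.append(i)
        i += 1
    return visited


# Well-formed bounds: both tests visit the same indices.
print(visit_with_eq_test(1, 3))   # [1, 2, 3]
print(visit_with_gt_test(1, 3))   # [1, 2, 3]

# Bad bounds (start past end + 1): ">" stops at once, "=" runs away.
print(visit_with_gt_test(5, 3))   # []
print(visit_with_eq_test(5, 3))   # None (hit the demo limit)
```

With sane indices the two are interchangeable, which is why the choice
comes down to taste unless the bounds come from someone else.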

On moving on to the next subsequence:
>... in PL/I, at the end of a "do i = start to end" loop, the
>variable i is left with the value end+1.  So, it is common to continue
>with:
>
>	do i = i to new-end;...

Yes, it's nice when the value of a loop index variable is defined after
the loop.  Unfortunately, not all languages do that, presumably to give
the implementor more leeway in the name of "efficiency."  (If the final
value is defined and a compiler unrolls the loop, for example, it still
has to assign the right value to the iteration variable afterwards.)
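
The "continue from where the last loop stopped" idiom can be sketched as
follows, again in Python.  This is only an emulation of the behavior
described in the quoted text: `run_segments` is a hypothetical helper,
and an explicit `while` loop stands in for the PL/I `do`, so that the
index really is left at end+1.

```python
def run_segments(ends, start=1):
    """Walk consecutive end-inclusive segments, reusing the loop index.

    `ends` lists the inclusive upper bound of each segment.  Returns the
    indices visited per segment and the final value of the index.
    """
    segments = []
    i = start
    for end in ends:
        visited = []
        while i <= end:
            visited.append(i)
            i += 1
        # i is now end + 1 (for a non-empty segment), so the next
        # segment starts exactly where this one left off.
        segments.append(visited)
    return segments, i


segments, final_i = run_segments([3, 5])
print(segments)  # [[1, 2, 3], [4, 5]]
print(final_i)   # 6
```

Python's own `for i in range(...)` would not do here: it leaves `i` at
the last value visited rather than one past it, which is one more
example of languages disagreeing on this point.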

>About the only time in PL/I where you have to add 1 is when going from
>loops like the above to calls to the SUBSTR builtin, which extracts a
>portion of a string.  This function takes start and length rather than
>start and end.

Is that the only builtin that uses lengths?  That seems like a kind of
odd wart on the language, if so.  If not, I'd argue as before that the
end-exclusive property Length = End - Start is something worth trying to
preserve.
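
To make the arithmetic concrete, here is a small Python sketch.  The
`substr` below is a stand-in modelled only on the description quoted
above (1-based start, plus a length); it is not a claim about the real
builtin's full behavior, and the helper names are invented.

```python
def substr(s, start, length):
    """Stand-in for a start-and-length substring with a 1-based start."""
    return s[start - 1 : start - 1 + length]


def length_end_inclusive(start, end):
    """Number of elements in start..end when `end` is included."""
    return end - start + 1


def length_end_exclusive(start, end):
    """Number of elements in start..end when `end` is excluded."""
    return end - start


s = "fencepost"

# 1-based, end-inclusive positions 3..6: the "+ 1" shows up when
# converting the pair (start, end) into (start, length).
print(substr(s, 3, length_end_inclusive(3, 6)))  # ncep

# 0-based, end-exclusive positions 2..6 name the same characters,
# and the length falls out as plain End - Start.
print(s[2:6])                                    # ncep
print(length_end_exclusive(2, 6) == len(s[2:6])) # True
```

The characters are the same either way; the difference is only whether
the length conversion needs a correction term.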

>As I pointed out above, there's no real reason why end-inclusive
>subranges and 1-based arrays must be related.

Yeah, I agree, given that your iteration constructs are flexible enough,
or that you can roll your own.  I'll make the vague and unsupported
claim that people usually manipulate 1-based arrays using an
end-inclusive model, though.  The articles which prompted me to post
here seemed to be written by people thinking along those lines.

>... we humans tend to think 1-based.

Hmm.  I think I think zero-based.  I don't know if that comes from
having taken too many math courses, or from dealing with icky innards of
low-level software (computer memory is, after all, zero-based).  I guess
you could argue that it took thousands of years of counting things
before people discovered zero, so it can't be all THAT natural.  Maybe
one-based counting IS "natural," but that doesn't necessarily make it
"right."

--Skef
  (Preferred mail address: Wholey@C.CS.CMU.EDU)