Path: utzoo!mnetor!uunet!lll-winken!lll-tis!ames!necntc!ima!think!barmar From: barmar@think.COM (Barry Margolin) Newsgroups: comp.lang.misc Subject: Re: The Joy of Zero-based Arrays Message-ID: <17419@think.UUCP> Date: 3 Mar 88 04:18:41 GMT References: <1012@PT.CS.CMU.EDU> Sender: usenet@think.UUCP Reply-To: barmar@fafnir.think.com.UUCP (Barry Margolin) Organization: Thinking Machines Corporation, Cambridge, MA Lines: 106 In article <1012@PT.CS.CMU.EDU> skef@SPICE.CS.CMU.EDU (Skef Wholey) writes: >In a couple of recent articles, people have asserted that zero-based arrays >are somehow more burdensome and less natural to use than one-based arrays: [references deleted] >Here I argue the opposite: that zero-based arrays are God's Gift To >Programmers, and that one-based arrays are Satan's Way Of Complicating Code. I've used both PL/I (which allows arbitrary array indices, but defaults to 1-based, and only has 1-based strings, and Lisp, which only has 0-based arrays. It hardly needs to be said that the best thing is to allow the programmer to specify the index base. I've had my share of fencepost errors in both languages, but I think I prefer 1-based arrays. >Now, one can of course use the end-exclusive model with one-based arrays, but >then the subsequence containing the whole array does *not* have an end that is >the highest index of the array; it is one more than that index. That's true for 0-based arrays, too. For a 10-element 0-based array, the end-exclusive subsequence is [0,10), but the highest index of the array is 9. If it is 1-based, the end-exclusive subsequence is [1,11), and the highest index is 10. What's the difference? > So when using >one-based arrays people usually use an end-INclusive model, which also leads >to nastiness. The Length of a subsequence is now End - Start + 1. To >represent a zero-length sequence the end index must be one LESS than the start >index. Yes, I admit that arrays declared (1 .. 0) seem strange, and I think some languages don't even allow it. > > (do ((i start (1+ i))) > ((= i (1+ end))) > (frob (oref array i))) ; "Oref" for "One-based Aref" actually, we generally used (> i end) as our end test. Or we use traditional FROM-TO loops, as in the PL/I do i = start to end; frob (array(i)); end; In your posting you made mention of something like this when you trandlated your Lisp DO to pseudo-Pascal: > For I from Start (inclusive) to End (exclusive) Do > Frob(Array[I]); Most languages don't have an FOR-loop that is exclusive of the End (Zetalisp LOOP is an exception, with its FOR FROM BELOW option); the End is usually inclusive, which works perfectly when you use the end-inclusive model of array subranges. For example, to loop over an entire array in PL/I you simply do: do i = lbound(array,1) to hbound(array,1); frob (array(i)); end; This code is actually independent of the base of the array, because it uses the LBOUND (Lower BOUND) and HBOUND (Higher BOUND) builtin functions to determine the actual index range (the second argument is the dimension whose bound should be returned). >And a following subsequence starts at one MORE then the end of the current >subsequence. > > (do ((i (1+ old-end) (1+ i)) > ((< i new-end)) > ...) > >So the +1's suddenly start creeping into the code, into the initialization or >termination clauses. Well, in PL/I, at the end of a "do i = start to end" loop, the variable i is left with the value end+1. So, it is common to continue with: do i = i to new-end;... >I've written tens of thousands of lines of code in Common Lisp and C, and >after I understood the beauty and just plain right-ness of the end-exclusive >model, zero-based arrays became a joy to use. I seldom, if ever, need to >adjust starts, ends, lengths, or indices by one. But I had to do that kind of >stuff frequently when using one-based arrays. The more random incrementing >and decrementing you do, the more fence-post errors your code will have. About the only time in PL/I where you have to add 1 is when going from loops like the above to calls to the SUBSTR builtin, which extracts a portion of a string. This function takes start and length rather than start and end. As I pointed out above, there's no real reason why end-inclusive subranges and 1-based arrays must be related. By the way, Lisp isn't even consistent about whether it is 1-based or 0-based. (FIRST ) is equivalent to (NTH 0 ), (SECOND ) is (NTH 1 ), etc. Mind you, I'm not complaining; the ordinal functions are intended for those cases where the programmer knows the position a priori, and we humans tend to think 1-based. Barry Margolin Thinking Machines Corp. barmar@think.com uunet!think!barmar