Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!cbosgd!ihnp4!homxb!houxm!mhuxt!mhuxm!mhuxo!ulysses!allegra!mit-eddie!gatech!udel!rochester!pt!ius2.cs.cmu.edu!edw
From: edw@ius2.cs.cmu.edu.UUCP
Newsgroups: comp.lang.c
Subject: Re: type-indexed arrays (was: enum - enum ?)
Message-ID: <1197@ius2.cs.cmu.edu>
Date: Sun, 14-Jun-87 15:17:36 EDT
Article-I.D.: ius2.1197
Posted: Sun Jun 14 15:17:36 1987
Date-Received: Wed, 17-Jun-87 02:01:51 EDT
References: <139@starfire.UUCP> <516@haddock.UUCP> <20540@sun.uucp> <1226@crash.CTS.COM>
Organization: Carnegie-Mellon University, CS/RI
Lines: 109

In article <1226@crash.CTS.COM>, ford@crash.CTS.COM (Michael Ditto) writes:
> 
> > [...]  Say I declare an array
> >
> >	int	demo[ 21 : -10 ];
> >
> >which I define to mean 'an array of ints, called demo, 21 ints long,
> >first element is demo[ -10 ]'.
> 
> This is not a bad idea, but I would prefer something like the pascal syntax:
> 
> 	int	demo[ -10 .. 10 ];
> 
> >Now, what is the value of the token 'demo' ?
> 
> I think it should be the address of the FIRST element in demo (i.e.
> demo[-10]).  This allows such idiomatic constructs as
> 
> 	write(fd, demo, sizeof demo);
> 
> which would become very non-intuitive if it were necessary to specify the
> starting or ending indices.

	We have a problem here.  If demo is actually the address of
the first element of the array, then what code gets generated for
a = demo[-10], presuming that you want to index over the subrange?
Does the compiler have to keep the subrange values around, so that the
code generated for a = demo[index] is something like this?

	a = *(demo + index - range_base)
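
	To make the question concrete, here is a minimal sketch in
ordinary C of what that generated code would amount to.  The names
RANGE_BASE, RANGE_LEN, demo_store and demo_at are made up for this
illustration; nothing here is a proposed syntax, only the arithmetic
the compiler would have to do behind demo[index].

```c
#include <assert.h>

/* Hypothetical names, chosen only for this illustration. */
#define RANGE_BASE (-10)   /* lowest legal subscript             */
#define RANGE_LEN  21      /* number of elements, i.e. -10 .. 10 */

/* "demo" is the address of the FIRST element, the one at subscript -10. */
static int demo_store[RANGE_LEN];

/* What the compiler would have to emit for demo[index]:
 * base address + index - range_base. */
static int *demo_at(int index)
{
    assert(index >= RANGE_BASE && index < RANGE_BASE + RANGE_LEN);
    return demo_store + (index - RANGE_BASE);
}
```

With this layout *demo_at(-10) touches demo_store[0], and
sizeof demo_store is still the size of the whole array, so the
write(fd, demo, sizeof demo) idiom keeps working.  The cost is the
extra subtraction on every subscript.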
> 
> >                                    And further, when I then code up
> >something like 'frobozz( 2, "hi there", demo );' and frobozz() looks
> >like
> >
> >	double frobozz( a, b, c )
> >		int a;
> >		char * b;
> >		int c[];
> >	{ /* and so on... */
> >
> >what is the allowable range of indices of c[] ? -10 .. +10 ?  0 ..
> >20 ?
> >
> 
> The problem here is that the declaration "int c[]" is misleading: it implies
> that an array is passed to this function, when only a pointer is passed.  It
> makes sense to declare c as "int *", and then you can pass the address of
> whichever element of demo[] you like, with "demo" alone referring to the
> address of the FIRST (not necessarily the zero-th) element.

	Is this how you propose to solve the problem of indexing
over subranges in C?  Why bother putting subranges into the language in
the first place if you can't subscript over them in the most natural way,
i.e.

		for (i = -10; i <= 10; i++)
			sum += c[i];

  The better solution would be to require the subrange declaration
for c, i.e.
		int c[-10..10];
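
	For comparison, the closest you can get to that loop in C as
it stands is to have the caller pass a pointer to the element that
should answer to subscript 0.  This is only a sketch; LO, HI, storage,
sum_range and sum_storage are names I made up for it.

```c
#include <assert.h>

/* Hypothetical bounds for this illustration. */
#define LO (-10)
#define HI 10

static int storage[HI - LO + 1];   /* 21 ints */

/* c must point at the element that plays the role of subscript 0,
 * so that c[LO] .. c[HI] all fall inside the caller's array. */
static int sum_range(const int *c)
{
    int i, sum = 0;

    for (i = LO; i <= HI; i++)
        sum += c[i];
    return sum;
}

/* The caller biases the pointer once, instead of on every subscript. */
static int sum_storage(void)
{
    /* The biased pointer must itself lie inside the array. */
    assert(LO <= 0 && 0 <= HI);
    return sum_range(&storage[-LO]);   /* &storage[10] */
}
```

	The loop reads naturally, but nothing in the type of c tells
sum_range what the legal range is; the bounds live in the programmer's
head (here, in two macros).  That is exactly the information a
declaration like int c[-10..10] would carry.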

> 
> I see no real reason to have negative (or even specifiable) ranges of indices
> for arrays.  This is another "educational" feature of pascal that detracts
> from C's philosophy of being a 'high-level assembly language', i.e., C is
> meant to provide a portable representation for operations present in most
> machines' instruction sets.  But negative subscripts to POINTERS can be very
> useful, almost always has a direct mapping to machine instructions, and is
> already used in a lot of code that I have seen, including most C compilers'
> C libraries (in things like strcat, strtok, et al.).
> 
> Michael "Ford" Ditto				-=] Ford [=-
> P.O. Box 1721					ford@crash.CTS.COM
> Bonita, CA 92002				ford%oz@prep.mit.ai.edu

   Points to be made.

	If subranges are added to the language, then

	1) the indexing code for arrays would be less efficient

		array base address + index - subrange base
		(constant folding can remove some of the inefficiency
		 but not all of it - if the array is an automatic variable,
		 its address isn't known until the function is called)

	2) the code generator would have to be a lot smarter about
	   array indexing

		it can no longer assume that the base index is the
		integer value 0 - the subrange information for each
		array would have to be kept around.

	3) the language would become more strongly typed

		To properly handle the frobozz example, the subrange
		information has to be present.  Hence, one could
		say that int ??[-10..10] is a new type and, to
		ensure proper performance, only variables of the
		same type should be given as input parameters.
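
	Points 2 and 3 can be sketched in C as it stands by making the
subrange information travel with the array.  This is a run-time
descriptor, not the compile-time type a real language extension would
use, and every name in it (struct int_subrange, sr_at, sr_sum,
demo_cells, demo_range) is made up for the illustration.

```c
#include <assert.h>

/* Hypothetical descriptor: the bounds travel with the data. */
struct int_subrange {
    int  lo;      /* lowest legal subscript  */
    int  hi;      /* highest legal subscript */
    int *first;   /* address of element lo   */
};

/* Element access: first + index - lo, checked against the bounds. */
static int *sr_at(struct int_subrange r, int index)
{
    assert(index >= r.lo && index <= r.hi);
    return r.first + (index - r.lo);
}

/* A frobozz-like callee that can recover the legal range of c. */
static int sr_sum(struct int_subrange c)
{
    int i, sum = 0;

    for (i = c.lo; i <= c.hi; i++)
        sum += *sr_at(c, i);
    return sum;
}

/* The equivalent of "int demo[-10..10]". */
static int demo_cells[21];
static struct int_subrange demo_range = { -10, 10, demo_cells };
```

	A call such as sr_sum(demo_range) now answers the frobozz
question: the callee sees -10 .. 10 because the caller handed the
bounds over.  The subtraction in sr_at is the extra indexing cost from
point 1.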

-- 
					Eddie Wyatt

e-mail: edw@ius2.cs.cmu.edu