Path: utzoo!attcan!uunet!microsoft!bobatk
From: bobatk@microsoft.UUCP (Bob ATKINSON)
Newsgroups: comp.lang.c++
Subject: Re: Assignments to reference variables [ and operator.() ]
Message-ID: <57714@microsoft.UUCP>
Date: 25 Sep 90 18:02:44 GMT
References: <8445@jarthur.Claremont.EDU> <57570@microsoft.UUCP> <1677@lupine.NCD.COM> <57684@microsoft.UUCP>
Reply-To: bobatk@microsoft.UUCP (Bob ATKINSON)
Organization: Microsoft Corp., Redmond WA
Lines: 208

In article <57684@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>In article <1677@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>|Look folks, just because a particular C++ token contains some non-
>|alphanumeric characters does not make it an operator!
>
>Right.  For that to happen the author[s] of the language has to say: "This 
>particular token is an operator."  Said comment has been made for "->",
>has not been made for "."  I jump the gun in referring to "op.", because
>I am pursuing an analogy to op-> -- an analogy that Ron seems to buy into.

But see E&S, 13.4, pg 330: 
"The following *operators* cannot be overloaded: 
	
	.  .*  :: ?:

nor can the preprocessing symbols # and ## (Sect 16)."
[Emphasis mine]


>Likewise foo[n] is [was] "syntactic sugar" for *(foo+n).  When operator
>overloading was first allowed in C++, the choice was made that the
>decision to keep or not keep the historical equivalences from C in the
>overloaded operators was up to the class programmer's discretion.


The reason that operators exist *at all* in C or C++ is for notational
convenience. Clearly, the language could have used, say, a purely
functional syntax, but this was deemed (correctly, in my opinion) to 
have been far too cumbersome.  Therefore, convenient notation was 
invented to manipulate *the data types that were present in the language*.
Of course, there are some relationships between the different notations.
Examples of these include the relationships between -> and (*). and 
[] vs *(+).
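
A minimal sketch of these relationships, written in present-day C++.
The names Point, OneBased, and the two functions are hypothetical and
exist only for illustration.  The first function checks the built-in
equivalences; the second shows a class whose operator[] does not follow
the *(foo+n) rule.

```cpp
#include <cassert>

struct Point { int x; int y; };

// For built-in pointers and arrays the C equivalences hold.
bool builtin_equivalences_hold() {
    Point pt = {3, 4};
    Point* p = &pt;
    int a[4] = {10, 20, 30, 40};
    int n = 2;
    return p->x == (*p).x       // -> is shorthand for (*).
        && a[n] == *(a + n);    // [] is shorthand for *( + )
}

// Hypothetical class: operator[] indexes from 1, so it has no
// connection to pointer arithmetic at all.
class OneBased {
    int data_[3];
public:
    OneBased() { data_[0] = 10; data_[1] = 20; data_[2] = 30; }
    int operator[](int i) const { return data_[i - 1]; }
};

bool overload_ignores_equivalence() {
    OneBased v;
    return v[1] == 10;          // *(v + 1) would not even compile here
}
```

Nothing forces OneBased to define operator+ or unary operator*, so the
[] vs *(+) relationship exists for this class only if its author
chooses to provide it.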

An important change occurred in C++. The programmer is now building *new*
data types that he would like to manipulate with as much ease and simplicity
as he can manipulate the built-in types. If good notations were important
for the ease of use of C's types, then there is every reason to believe
that they are important to the ease of use of programmer-defined types.  

Here is an exercise:

	Imagine you have an object which represents a range on a 
	spreadsheet.  Three very important operations that exist in
	today's macro languages are range construction (return a 
	rectangular range from its upper left and lower right corners),
	range union, and range intersection. Today, Microsoft Excel
	uses ':', ',', and ' ' (space) respectively for these operations.
	Because of their frequency, it is *absolutely essential* that 
	range construction and union be done in C++ with an operator.
	Functional notation (our only other alternative) is just 
	too cumbersome.  The desire for an operator for 
	intersection lies in its symmetry with union.

	You have some constraints:

		precedence construction 
			> precedence intersect 
			> precedence union

		ranges (cells) also respond to arithmetic and 
			comparison operators.

	Question: what C++ operators would you choose for these operations?	
		  (I believe there are two, maybe three, appropriate choices.)


The point of this exercise is twofold:

	1) to illustrate a user-defined data type for which notational
	   convenience is very important, and

	2) to illustrate the difficulty of providing that convenience
	   given the limited choice of operators and their fixed 
	   precedences.
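
One candidate answer to the exercise, offered only as a sketch and not
necessarily one of the choices alluded to above: >> for construction,
& for intersection, and | for union.  The built-in precedences of these
three operators happen to satisfy the stated constraint.  Cell, Rect,
Range, and sample_range are hypothetical names, and a range is modelled
naively as a list of rectangles.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Cell { int row, col; };

struct Rect {
    int top, left, bottom, right;            // inclusive bounds
    bool contains(int r, int c) const {
        return top <= r && r <= bottom && left <= c && c <= right;
    }
};

class Range {
    std::vector<Rect> rects_;
public:
    Range() {}
    explicit Range(const Rect& r) { rects_.push_back(r); }
    bool contains(int r, int c) const {
        for (std::size_t i = 0; i < rects_.size(); ++i)
            if (rects_[i].contains(r, c)) return true;
        return false;
    }
    friend Range operator|(const Range& a, const Range& b);
    friend Range operator&(const Range& a, const Range& b);
};

// Construction: upper-left >> lower-right (highest precedence of the three).
Range operator>>(const Cell& ul, const Cell& lr) {
    Rect r = { ul.row, ul.col, lr.row, lr.col };
    return Range(r);
}

// Union: keep the rectangles of both operands (lowest precedence).
Range operator|(const Range& a, const Range& b) {
    Range out = a;
    out.rects_.insert(out.rects_.end(), b.rects_.begin(), b.rects_.end());
    return out;
}

// Intersection: intersect every pair of rectangles, dropping empty ones.
Range operator&(const Range& a, const Range& b) {
    Range out;
    for (std::size_t i = 0; i < a.rects_.size(); ++i) {
        for (std::size_t j = 0; j < b.rects_.size(); ++j) {
            const Rect& x = a.rects_[i];
            const Rect& y = b.rects_[j];
            Rect r = { std::max(x.top, y.top), std::max(x.left, y.left),
                       std::min(x.bottom, y.bottom), std::min(x.right, y.right) };
            if (r.top <= r.bottom && r.left <= r.right) out.rects_.push_back(r);
        }
    }
    return out;
}

// Parses as ((a >> b) & (c >> d)) | (e >> f) with no parentheses needed.
Range sample_range() {
    Cell a = {1, 1}, b = {3, 3}, c = {2, 2}, d = {5, 5};
    Cell e = {10, 10}, f = {10, 12};
    return a >> b & c >> d | e >> f;
}
```

The compromise shows up as soon as the other constraint is considered.
Arithmetic + binds tighter than >>, and == binds tighter than &, so an
expression such as r1 & r2 == r3 parses as r1 & (r2 == r3), which is
unlikely to be what a spreadsheet user expects.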


There is very little reason that I can see for believing that the notation
appropriate and efficient for manipulating the built-in data types of C
will be appropriate and efficient for manipulating user-defined data types.

There is even less reason to believe that the relationships between 
the existing notations will be appropriate for user-defined types.
Sure, these relationships are a starting point for learning about
operators, but when a programmer encounters an interface to a new data
type that involves operators, he has to *realize* that unless the data
type is just a number-like thing (such as Complex, Fraction, or Matrix
might be), then the choice of operator notation for that class is 
a delicate exercise in compromise. *Many* of his preconceived notions
about the semantics of operators will simply not apply.


I personally believe that in the long term, this need for better notation 
for manipulating user-defined types will lead C++ to allow user-defined
operators.  In the meantime, given that we have to work with a fixed set
of operators at fixed precedences and associativities, I believe that 
we'll have to make do as best we can.  The more flexibility, the better
job we can do of providing appropriate efficient notation.


>|So the selector is *not* an operator.  Period.  Allowing overloading for
>|it would make about as much sense as allowing overloading for `{' or `}'
>|or `::' or `"'.
>
>"." is not an operator until it can be overloaded.  It is then an operator.
>Like op->, it can make sense to turn "." into a unary operator, because it
>certainly has an object on the lhs to bind to.

I disagree. I don't see the necessity of something being overloadable
before being labeled an operator.  See below.


>Other combinations of tokens with an object on one side, or the other, or
>both sides, could also be candidates for similar promotion to "operator"
>status, but I leave it to someone more flame resistant than myself to
>make such proposals, if they feel so motivated.
>
>In particular, it might be interesting to allow an extension to the 
>language of:
>
>object1 -> object2  and
>object1 . object2
>
>[IE allow binary overloading of op-> and op. where both the lhs, and the rhs
> are unambiguously objects, not field selectors]

A neat idea. A particularly useful form for such a RHS happens in the case
where object2 is of a particular enum type, say type IEnum. This provides
the programmer with the ability to implement what look syntactically like 
data members but are implemented with a function for access.  This *might*
look something like:

	class C {
	public:
	enum IEnum { foo, bar, baz };
	int operator.(IEnum);
	};

	int C::operator.(IEnum anIEnum) {
		// return some integer
		}

	someClient {
		//...
		int anInt = someC.foo + someC.bar;
		//..
		}

Just a thought...
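
Since "." is on the E&S list of operators that cannot be overloaded,
the fragment above is speculative syntax and will not compile.  A rough
approximation that does compile, assuming one is willing to write
someC[C::foo] instead of someC.foo, uses operator[] with the enum as its
argument.  The stored values and the member names foo_, bar_, and baz_
are hypothetical.

```cpp
#include <cassert>

class C {
public:
    enum IEnum { foo, bar, baz };

    C() : foo_(1), bar_(2), baz_(3) {}

    // Looks like member access to the client, but runs a function.
    int operator[](IEnum which) const {
        switch (which) {
            case foo: return foo_;
            case bar: return bar_;
            default:  return baz_;
        }
    }

private:
    int foo_, bar_, baz_;   // hypothetical backing storage
};

int someClient(const C& someC) {
    int anInt = someC[C::foo] + someC[C::bar];
    return anInt;
}
```

The client still cannot tell whether the value is stored or computed,
which is the property the binary operator. idea was after; only the
notation is heavier.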


>|If they really think that this is such a swell idea, then I challenge them
>|to tell me (and everyone) why they are not suggesting allowing `operator{'
>|to be overloaded.
>
>1) Unlike op. , no one has presented a reasonable proposal why it might be
>   good to allow doing so.
>
>2) Unlike op. , one cannot argue that either the lhs or the rhs of "operator{"
>   is an object such that the appropriate overloaded "operator{" can be 
>   selected.

I think there is a simpler answer. All the tokens currently described
as operators can be found in strings derivable from "expression" as 
defined on pg 388 of E&S. { cannot be found in strings derived from
expression.  This is a fundamental distinction.


>|If there are any rules for what should be called an operator and what should
>|not, and what should be overloadable and what should not, I'd like to see
>|them!  If these rules are at all consistent, if they make any sense
>|whatsoever, and if they still would seem to permit -> to be overloaded,
>|I'll eat my hat.
>
>My proposal for such a set of rules is simply:
>
>1) One or both sides of the proposed "operator" needs to be certainly an
>   object, so that a member function can be unambiguously selected based on
>   the type[s] of those object[s].
>
>2) Someone needs to make a strong argument for how allowing such to be
>   an overloaded operator is going to solve real-world programming problems.
>
>3) It better not cause wide-spread changes throughout the language.


A cursory examination of 17.2 of E&S leads me to *conjecture* the 
following precise definition for the general notion of "operator."

	An operator is a token which is not an identifier but 
	which can be found in _some_ string legally derivable
	from the non-terminal "expression" in the grammar.

(This definition must be modified to accommodate "parenthetical" operators.
I shall not do that here. It merely pairs into one operator tokens which
by this naive definition would be considered separate operators.)

My brief examination indicates that all such tokens derivable from
expression are today considered operators, and that conversely, all 
tokens considered operators today can be found in such strings.

For this purpose, I will not consider a reserved word to be an identifier.
This definition therefore encompasses the definition of "new", "delete", 
and "sizeof" as operators.
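
As a small illustration of the reserved-word case, here is a sketch in
which "new", "sizeof", and "delete" each appear inside an expression.
The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

std::size_t reserved_word_operators() {
    int* p = new int(7);                      // "new" inside an expression
    std::size_t n = sizeof *p + sizeof(int);  // "sizeof" with both operand forms
    delete p;                                 // "delete p" is itself an expression
    return n;
}
```

By the conjectured definition all three tokens qualify as operators,
even though none of them is built from punctuation.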

	
	Bob Atkinson
	Microsoft