Path: utzoo!attcan!uunet!microsoft!bobatk
From: bobatk@microsoft.UUCP (Bob ATKINSON)
Newsgroups: comp.lang.c++
Subject: Re: Assignments to reference variables [ and operator.() ]
Message-ID: <57714@microsoft.UUCP>
Date: 25 Sep 90 18:02:44 GMT
References: <8445@jarthur.Claremont.EDU> <57570@microsoft.UUCP> <1677@lupine.NCD.COM> <57684@microsoft.UUCP>
Reply-To: bobatk@microsoft.UUCP (Bob ATKINSON)
Organization: Microsoft Corp., Redmond WA
Lines: 208

In article <57684@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>In article <1677@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>|Look folks, just because a particular C++ token contains some non-
>|alphanumeric characters does not make it an operator!
>
>Right.  For that to happen the author[s] of the language has to say: "This
>particular token is an operator."  Said comment has been made for "->",
>has not been made for "."  I jump the gun in referring to "op.", because
>I am persuing an analogy to op-> -- an analogy that Ron seems to buy into.

But see E&S, 13.4, pg 330:

    "The following *operators* cannot be overloaded:
         .  .*  ::  ?:
     nor can the preprocessing symbols # and ## (Sect 16)."
    [Emphasis mine]

>Likewise foo[n] is [was] "syntactic sugar" for *(foo+n).  When operator
>overloading was first allowed in C++, the choice was made that the
>decision to keep or not keep the historical equivalences from C in the
>overloaded operators was up to the class programmer's discretion.

The reason that operators exist *at all* in C or C++ is for notational
convenience.  Clearly, the language could have used, say, a purely
functional syntax, but this was deemed (correctly, in my opinion) to
have been far too cumbersome.  Therefore, convenient notation was
invented to manipulate *the data types that were present in the
language*.  Of course, there are some relationships between the
different notations.  Examples of these include the relationships
between -> and (*). and [] vs *(+).

An important change occurred in C++.
The programmer is now building *new* data types that he would like to
manipulate with as much ease and simplicity as he can manipulate the
built-in types.  If good notations were important for the ease of use
of C's types, then there is every reason to believe that they are
important to the ease of use of programmer-defined types.

Here is an exercise:

    Imagine you have an object which represents a range on a
    spreadsheet.  Three very important operations that exist in
    today's macro languages are range construction (return a
    rectangular range from its upper left and lower right corners),
    range union, and range intersection.  Today, Microsoft Excel uses
    ':', ',', and ' ' (space) respectively for these operations.

    Because of their frequency, it is *absolutely essential* that
    range construction and union be done in C++ with an operator.
    Functional notation (our only other alternative) is just too
    cumbersome.  The desire for an operator for intersection lies in
    its symmetry with union.

    You have some constraints:

        precedence construction > precedence intersect > precedence union

        ranges (cells) also respond to arithmetic and comparison
        operators.

    Question: what C++ operators would you choose for these
    operations?  (I believe there are two, maybe three, appropriate
    choices.)

The point of this exercise is twofold: 1) to illustrate a user-defined
data type for which notational convenience is very important, and
2) to illustrate the difficulty of providing that convenience given
the limited choice of operators and their fixed precedences.

There is very little reason that I can see for believing that the
notation appropriate and efficient for manipulating the built-in data
types of C will be appropriate and efficient for manipulating
user-defined data types.  There is even less reason to believe that
the relationships between the existing notations will be appropriate
for user-defined types.
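
To make the range exercise and this last point concrete, here is a
minimal sketch of one possible answer.  It is not necessarily one of
the choices the exercise has in mind.  The Range class, its set-of-cells
representation, the example() function, and the particular operators
(>> for construction, & for intersection, | for union) are all
assumptions made for illustration.  The standard library's std::set is
used only for brevity.

```cpp
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical Range type: simply a set of (row, column) cells.
class Range {
public:
    typedef std::pair<int, int> Cell;
    typedef std::set<Cell>::const_iterator Iter;

    Range() {}
    Range(int row, int col) { cells_.insert(Cell(row, col)); }

    // Union: every cell that is in either operand.
    friend Range operator|(const Range& a, const Range& b) {
        Range out(a);
        out.cells_.insert(b.cells_.begin(), b.cells_.end());
        return out;
    }

    // Intersection: every cell that is in both operands.
    friend Range operator&(const Range& a, const Range& b) {
        Range out;
        for (Iter it = a.cells_.begin(); it != a.cells_.end(); ++it)
            if (b.cells_.count(*it) != 0) out.cells_.insert(*it);
        return out;
    }

    // Construction: the smallest rectangle covering both operands.  For
    // two single cells this is the rectangle with those cells as corners.
    friend Range operator>>(const Range& a, const Range& b) {
        Range all = a | b;
        Range out;
        if (all.cells_.empty()) return out;
        int r0 = all.cells_.begin()->first,  r1 = r0;
        int c0 = all.cells_.begin()->second, c1 = c0;
        for (Iter it = all.cells_.begin(); it != all.cells_.end(); ++it) {
            if (it->first  < r0) r0 = it->first;
            if (it->first  > r1) r1 = it->first;
            if (it->second < c0) c0 = it->second;
            if (it->second > c1) c1 = it->second;
        }
        for (int r = r0; r <= r1; ++r)
            for (int c = c0; c <= c1; ++c)
                out.cells_.insert(Cell(r, c));
        return out;
    }

    std::size_t size() const { return cells_.size(); }
    bool contains(int row, int col) const {
        return cells_.count(Cell(row, col)) != 0;
    }

private:
    std::set<Cell> cells_;
};

// No parentheses needed: >> binds tighter than &, which binds tighter
// than |, so this parses as ((A >> B) & (C >> D)) | E.
Range example() {
    return Range(1, 1) >> Range(3, 3) & Range(2, 2) >> Range(5, 5)
         | Range(9, 9);
}
```

The three operators keep only their precedence here; none of their
built-in meanings (shifting, bitwise and/or) carries over.  The choice
is also a compromise: arithmetic operators bind tighter than >>, and
the comparison operators bind tighter than & and |, so an expression
such as r & s == t parses as r & (s == t).
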
Sure, these relationships are a starting point for learning about
operators, but when a programmer encounters an interface to a new data
type that involves operators, he has to *realize* that unless the data
type is just a number-like thing (such as Complex, Fraction, or Matrix
might be), the choice of operator notation for that class is a
delicate exercise in compromise.  *Many* of his preconceived notions
about the semantics of operators will simply not apply.

I personally believe that in the long term, this need for better
notation for manipulating user-defined types will lead C++ to allow
user-defined operators.  In the meantime, given that we have to work
with a fixed set of operators at fixed precedences and associativities,
I believe that we'll have to make do as best we can.  The more
flexibility we have, the better job we can do of providing appropriate,
efficient notation.

>|So the selector is *not* an operator.  Period.  Allowing overloading for
>|it would make about as much sense as allowing overloading for `{' or `}'
>|or `::' or `"'.
>
>"." is not an operator until it can be overloaded.  It is then an operator.
>Like op->, it can make sense to turn "." into a unary operator, because it
>certainly has an object on the lhs to bind to.

I disagree.  I don't see the necessity of something being overloadable
before being labeled an operator.  See below.

>Other combinations of tokens with an object on one side, or the other, or
>both sides, could also be candidates for similar promotion to "operator"
>status, but I leave it to someone more flame resistant than myself to
>make such proposals, if they feel so motivated.
>
>In particular, it might be interesting to allow an extension to the
>language of:
>
>object1 -> object2 and
>object1 . object2
>
>[IE allow binary overloading of op-> and op. where both the lhs, and the rhs
> are unambiguously objects, not field selectors]

A neat idea.
A particularly useful form for such a RHS happens in the case where
object2 is of a particular enum type, say type IEnum.  This provides
the programmer with the ability to implement what look syntactically
like data members but are implemented with a function for access.
This *might* look something like:

    // Hypothetical syntax: operator. cannot be overloaded today.
    class C {
    public:
        enum IEnum { foo, bar, baz };
        int operator.(IEnum);
    };

    int C::operator.(IEnum anIEnum)
    {
        // return some integer
    }

    void someClient(C& someC)
    {
        //...
        int anInt = someC.foo + someC.bar;
        //...
    }

Just a thought...

>|If they really think that this is such a swell idea, then I challenge them
>|to tell me (and everyone) why they are not suggesting allowing `operator{'
>|to be overloaded.
>
>1) Unlike op. , no one has presented a reasonable proposal why it might be
>   good to allow doing so.
>
>2) Unlike op. , one cannot argue that either the lhs or the rhs of "operator{"
>   is an object such that the appropriate overloaded "operator{" can be
>   selected.

I think there is a simpler answer.  All the tokens currently described
as operators can be found in strings derivable from "expression" as
defined on pg 388 of E&S.  { cannot be found in strings derived from
expression.  This is a fundamental distinction.

>|If there are any rules for what should be called an operator and what should
>|not, and what should be overloadable and what should not, I'd like to see
>|them!  If these rules are at all consistant, if they make any sense
>|whatsoever, and if they still would seem to permit -> to be overloaded,
>|I'll eat my hat.
>
>My proposal for such a set of rules is simply:
>
>1) One or both sides of the proposed "operator" needs to be certainly an
>   object, so that a member function can be unambiguously selected based on
>   the type[s] of those object[s].
>
>2) Someone needs to make a strong argument for how allowing such to be
>   an overloaded operator is going to solve real-world programming problems.
>
>3) It better not cause wide-spread changes throughout the language.

A cursory examination of 17.2 of E&S leads me to *conjecture* the
following precise definition for the general notion of "operator."

    An operator is a token which is not an identifier but which can be
    found in _some_ string legally derivable from the non-terminal
    "expression" in the grammar.

(This definition must be modified to accommodate "parenthetical"
operators.  I shall not do that here.  The modification merely pairs
into one operator those tokens which this naive definition would
consider separate operators.)

My brief examination indicates that all such tokens derivable from
expression are today considered operators, and that conversely, all
tokens considered operators today can be found in such strings.

For this purpose, I will not consider a reserved word to be an
identifier.  The definition therefore covers "new", "delete", and
"sizeof" as operators.

	Bob Atkinson
	Microsoft