Path: utzoo!mnetor!uunet!mcvax!ukc!its63b!db
From: db@its63b.ed.ac.uk (D Berry)
Newsgroups: comp.lang.c++
Subject: User defined operators
Message-ID: <1206@its63b.ed.ac.uk>
Date: 25 Apr 88 17:47:15 GMT
References: <4444@ihlpf.ATT.COM>
Reply-To: db@itspna.ed.ac.uk (Dave Berry)
Organization: University of Edinburgh
Lines: 75

In article <4444@ihlpf.ATT.COM>  writes:
>>In article <8804140925.AA13150@klaus.olsen.uucp> Info-Modula2 Distribution List <INFO-M2%UCF1VM.bitnet@jade.berkeley.edu> writes:
>>
>>Does C++ allow infix procedures other than the standard set?
>
>No.  Doing this tends to lead to unreadable code.  For example:  If I
>overload the word 'or' as an infix operator, this sentence no longer has
>the same meaning that I intended (this is because 'word' becomes 'w or d'.

Does this mean that "newton" isn't a legal C++ identifier because it will
be parsed as "new" "ton"?  I doubt it.
Most languages that allow user defined infix operators let an operator be
any lexically distinct token (or any token from some subrange of them).  Often
alphanumeric and symbolic tokens form disjoint sets, so that expressions such
as "w+d" are parsed correctly, while the spaces in "w or d" are required to
distinguish it from "word".
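
A rough sketch of such a lexer, assuming just two character classes (letters,
digits and underscore on one side, every other non-blank character on the
other).  The names tokenize and is_alnum_char are made up for illustration and
don't come from any real compiler:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true for characters that may appear in an
// alphanumeric token (letters, digits, underscore).
static bool is_alnum_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Split src into tokens.  A token is a maximal run of alphanumeric
// characters, or a maximal run of non-blank, non-alphanumeric characters.
// Blanks separate tokens and are otherwise dropped.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (std::isspace(static_cast<unsigned char>(src[i]))) {
            ++i;
            continue;
        }
        const bool alnum = is_alnum_char(src[i]);
        std::size_t j = i;
        // Extend the token while the character class stays the same.
        while (j < src.size() &&
               !std::isspace(static_cast<unsigned char>(src[j])) &&
               is_alnum_char(src[j]) == alnum) {
            ++j;
        }
        assert(j > i);  // every pass consumes at least one character
        out.push_back(src.substr(i, j - i));
        i = j;
    }
    return out;
}
```

With this sketch, tokenize("w+d") yields "w" "+" "d", tokenize("w or d")
yields "w" "or" "d", and both "word" and "newton" stay single tokens.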

>It also leads to nightmares for the parser (is '/+' an error or an overload
>operator, etc.).

The easiest way to handle this is to take the rule for distinguishing between
alphanumeric identifiers -- read the longest -- and use it for symbolic
identifiers as well.  So Nevin's example would be a single identifier "/+".
If this were done to C++ (I'm not suggesting it should be done), its
parsing of some expressions would differ from C's.  E.g.

	Expression	C parse				(C++)++ parse 

	a+++++b   	"a" "++" "++" "+" "b"		"a" "+++++" "b"
	*++p		"*" "++" "p"			"*++" "p"

However, C++ isn't source code compatible with C anyway, and this scheme
would make the existing treatment of "/*" and "*/" examples of a general rule
rather than a specific case.  It would probably also make cases like the above
easier to read, as they would have to be broken up:

	a++ + ++b		*(++p)

(Really basic symbols such as brackets and quotes shouldn't be allowed to
appear in symbolic identifiers, or things get out of hand.)
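
To make the two lexing rules concrete, here is a minimal sketch.  It assumes
a tiny fixed operator table for the C-style case and a made-up set of symbol
characters for the open case; lex_fixed_ops, lex_symbol_runs and
is_symbol_char are hypothetical names, not anything a real compiler uses.
Brackets and quotes are deliberately left out of the symbol set:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

static bool is_word_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Assumed symbol alphabet; brackets and quotes are excluded on purpose.
static bool is_symbol_char(char c) {
    return std::string("+-*/%<>=!&|^~").find(c) != std::string::npos;
}

// Shared driver: words are read longest-first; symbol_len decides how many
// characters a token starting with a non-word character takes.
template <typename SymbolLen>
static Tokens lex(const std::string& src, SymbolLen symbol_len) {
    Tokens out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (std::isspace(static_cast<unsigned char>(src[i]))) { ++i; continue; }
        std::size_t len = 1;
        if (is_word_char(src[i])) {
            while (i + len < src.size() && is_word_char(src[i + len])) ++len;
        } else {
            len = symbol_len(src, i);
        }
        assert(len >= 1);  // always make progress
        out.push_back(src.substr(i, len));
        i += len;
    }
    return out;
}

// C-style rule: longest match against a fixed table (a small subset here).
Tokens lex_fixed_ops(const std::string& src) {
    return lex(src, [](const std::string& s, std::size_t i) {
        static const std::vector<std::string> ops = {"++", "--", "+", "-", "*", "/"};
        std::size_t best = 1;  // unknown characters become one-character tokens
        for (const std::string& op : ops)
            if (op.size() > best && s.compare(i, op.size(), op) == 0)
                best = op.size();
        return best;
    });
}

// Open rule: any maximal run of symbol characters is one identifier.
Tokens lex_symbol_runs(const std::string& src) {
    return lex(src, [](const std::string& s, std::size_t i) {
        std::size_t len = 1;
        if (is_symbol_char(s[i]))
            while (i + len < s.size() && is_symbol_char(s[i + len])) ++len;
        return len;
    });
}
```

Under these assumptions lex_fixed_ops("a+++++b") gives "a" "++" "++" "+" "b",
while lex_symbol_runs gives "a" "+++++" "b".  Because "(" is not a symbol
character, lex_symbol_runs("*(++p)") still splits into "*" "(" "++" "p" ")".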

Defining your own operators is subject to the same cautions as overloading
existing ones.  It can make code easier to read, but you can also use it to
make a real dog's breakfast.

It also requires scope rules for the infix nature of the token.  One person
might define "or" to be an infix operator in class A while someone else defines
"or" as a function in class B.  How should the compiler parse "or" in a program
that uses both classes?

The rule used in Standard ML would translate to C++ as follows: an infix
operator is infix in the class in which it's declared and in all subclasses
and member functions.  From outside the class, it's treated as a prefix
function of two arguments (e.g. A::or (x, y);   B::** (x, n);).  A keyword can
make all infix tokens of a class be parsed as infix in the current file
(e.g. acceptinfix A;  x or y; acceptinfix B; x ** n;).

An alternative would be to let "A::or" be used infix all the time (e.g.
x A::or y;  x B::** n;).  This would probably be better for C++.
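
The prefix form is already expressible in C++ as it stands, which may help
to picture the scoping.  A minimal sketch, with made-up classes A and B:
since "or" is reserved as an alternative token in today's C++ and "**" is
not a legal name, the hypothetical members or_ and power stand in for them.

```cpp
#include <cassert>

// Hypothetical class A: or_ stands in for a user defined operator "or".
struct A {
    static bool or_(bool x, bool y) { return x || y; }
};

// Hypothetical class B: the same name with an unrelated meaning, plus
// power standing in for a user defined operator "**".
struct B {
    static int or_(int x, int y) { return x | y; }

    // Integer power by repeated multiplication; assumes n >= 0.
    static int power(int x, int n) {
        assert(n >= 0);
        int result = 1;
        for (int i = 0; i < n; ++i) result *= x;
        return result;
    }
};
```

Qualifying the name removes the ambiguity: A::or_(true, false) and
B::or_(4, 1) can appear in the same function.  The infix forms discussed
above would only add syntax on top of this lookup.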

Presumably, to be orthogonal, this scheme would have to allow user defined
prefix operators as well.  Functions and prefix operators would have to
be distinguished by the same rule as functions and infix operators.
C++ can already distinguish between prefix and infix uses of an operator.
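
For instance, existing C++ lets "-" be overloaded separately as a prefix and
as an infix operator, and picks between them by the number of operands.  A
small sketch, where Vec2 is a made-up type used only for illustration:

```cpp
#include <cassert>

// Hypothetical two-component vector.
struct Vec2 {
    int x;
    int y;
};

// Prefix (unary) minus: one operand.
Vec2 operator-(Vec2 v) { return Vec2{-v.x, -v.y}; }

// Infix (binary) minus: two operands.
Vec2 operator-(Vec2 a, Vec2 b) { return Vec2{a.x - b.x, a.y - b.y}; }
```

Given Vec2 a{3, 4} and Vec2 b{1, 1}, the expression -a calls the first
overload and a - b calls the second.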

I'm not proposing that user defined operators should be added to C++; I'm just
attempting to show that it could be done and to point out some of the problems
that would need to be resolved.  Please follow up if I've missed anything.
 
>
> _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194

-- 
"The answer is simple, they could do it with ease;
 stop attacking the patients, and attack the disease."	-- Tom Robinson.