Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!princeton!caip!topaz!harvard!seismo!umcp-cs!chris
From: chris@umcp-cs.UUCP (Chris Torek)
Newsgroups: net.lang.c,net.micro.pc,net.unix
Subject: Re: C'mon, guys! (Really, pointer pedagogy)
Message-ID: <2107@umcp-cs.UUCP>
Date: Fri, 20-Jun-86 18:35:25 EDT
Article-I.D.: umcp-cs.2107
Posted: Fri Jun 20 18:35:25 1986
Date-Received: Sun, 22-Jun-86 04:01:51 EDT
References: <487@cubsvax.UUCP> <748@eneevax.UUCP>
Reply-To: chris@maryland.UUCP (Chris Torek)
Organization: University of Maryland, Dept. of Computer Sci.
Lines: 82
Xref: watmath net.lang.c:9528 net.micro.pc:8780 net.unix:8312

[Warning: this is not an article about `C', but rather an article
about `about C'.  Nothing truly technical is contained herein.]

In article <748@eneevax.UUCP> phaedrus@eneevax.UUCP (Praveen Kumar) writes:
>I believe that a lot of the notation in C is derived from PDP assembly
>language.  I think (it has been long time since I mucked around with
>PDPs) that the increment, "++", and the dereferencing, "*" operators are
>straight out of PDP assembly.

This is not really for me to say, for I was not in on the creation
of the C language, yet I feel I should answer this.  (If I do a
good enough job, perhaps I can even provoke DMR into a few minor
corrections. :-) )

Was the C notation derived from PDP-11 assembly?  I think the answer
here is both no and yes.  Much C notation was certainly influenced
by '11 assembly; but I think `derived' is too strong.

DEC PDP-11 assemblers use `@', not `*', but let us assume that Ken
Thompson had been using `*' with whatever assembler he was using.
(The 4BSD Vax assembler uses `*', so it is reasonable to guess that
this was handed down from an earlier era.)  First contrast

	mov	*(r4)+,-(r5)

with

	*--p = **q++;

(if I have not botched the '11 assembly; I have never used an '11).
Close?  Well, somewhat: I can see a resemblance, at any rate.

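
For anyone who wants the C half of that comparison unpacked, here is
a minimal sketch that runs the compact statement next to a version
done one step at a time and checks that the two agree.  The function
name forms_agree() and all of the variable names are made up for
illustration; none of them come from the article or from any real
program, and the sketch says nothing about what a compiler would
actually emit.

```c
#include <assert.h>

/*
 * Illustration only: forms_agree() and every variable name here are
 * hypothetical.  Returns 1 if `*--p = **q++;' and the step-by-step
 * version store the same value and leave the pointers in the same
 * places.
 */
int forms_agree(void)
{
    int a = 10, b = 20;
    int *src[2] = { &a, &b };   /* q walks upward through this table */
    int dst1[2] = { 0, 0 };     /* p walks downward, like a stack */
    int dst2[2] = { 0, 0 };

    /* Compact form, as written in the article. */
    int **q = src;
    int *p = dst1 + 2;          /* one past the end: an empty stack */
    *--p = **q++;

    /* The same work, one step at a time. */
    int **q2 = src;
    int *p2 = dst2 + 2;
    p2 = p2 - 1;                /* `--p': subtract one, before the `*' */
    *p2 = **q2;                 /* two indirections on q, one on p */
    q2 = q2 + 1;                /* `q++': add one, after the `*' */

    /* Both forms stored the value of a in the last slot. */
    assert(dst1[1] == 10 && dst2[1] == 10);

    /* Both forms left p and q at the same offsets. */
    return (p - dst1) == (p2 - dst2) && (q - src) == (q2 - src);
}
```

Calling forms_agree() from a small main() of your own should give 1:
the decrement happens before the store through p, and the increment
happens after the fetch through q.
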
Now step back a bit and consider the notation in and of itself.
We have here three basic operations: `--p', `q++', and `*'.  From
early mathematics notation we can take `-' as `subtract' and `+'
as `add'.  `*' is an aberration; it looks more like one of the
generic binary operation symbols used in group theory than anything
else (though this may depend on your terminal's font).  As for why
there are two each of `+' and `-', I think we can put that down to
the exigencies of parsing.

Now we have `-p' and `q+'---but what might these mean?  Well, if
`-' is subtract and `+' is add, then we have `subtracted p' and `q
added'.  There is nothing explicitly being subtracted or added, so
it is perhaps reasonable to assume one of the classical computer
science numbers, namely `zero', `one', and `many'.  Adding and
subtracting zero is useless, and adding and subtracting many is
ambiguous, so we will add and subtract one.  I think it is also a
small step to say that the `-' is `before' `p', and the `+' is
`after' `q', so we should do the subtraction `before' and the
addition `after'.  Before and after what?  Here I resort to fiat
and say `before and after *, which we define to mean indirection'.

Of course, all this does is demonstrate that the PDP-11 assembly
notation was in some respects `reasonable', and not that the notation
appears in C for that particular reason.  In order to refute the
quoted statement above, I must find `a lot of C notation' that does
not seem to be `derived from PDP assembly'.  So let us consider
some more C notation, in particular in expressions.

1. Arithmetic.  C arithmetic seems to be quite conventional for
post-FORTRAN languages.  `a + b * (c - d)' does not look much like
a series of `sub', `mul', and `add' instructions to me.

2. Structures.  Structure member access via `.' is again very
conventional; it looks like PL/I, among others.  Pointer member
access is a little different.
`p->member' can indeed be done with a single '11 instruction in
many cases, yet the `->' notation itself does not appear in '11
assembly.

3. Logical operations.  `&&' and `||' have no direct counterpart
in '11 assembly, and must be implemented with rather complex series
of tests and branches.

No doubt more examples can be found by those cleverer than I; but
I think this much is sufficient.  I think I will close by saying
that the notation used in C is simply a well-coordinated set of
notations borrowed from other places and languages, including but
not limited to PDP-11 assembly, and modified as appropriate to
obtain that coordination.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu