Path: utzoo!attcan!uunet!aplcen!uakari.primate.wisc.edu!sdd.hp.com!usc!ucsd!ucbvax!AI.MIT.EDU!bson
From: bson@AI.MIT.EDU (Jan Brittenson)
Newsgroups: comp.society.futures
Subject: Re: C's sins of commission (was: (pssst...fortran?))
Message-ID: <9009220848.AA00539@wheat-chex>
Date: 22 Sep 90 08:48:56 GMT
Sender: usenet@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 152
X-Unparsable-Date: Sat, 22 Sep T 04:48:12 EDT

This message is quite long. I apologize if you think I'm filling up
your mailbox with junk flamage.

Jim Giles:

>> 1. Pointer range check (to see if a buffer crosses page
>> boundaries, for instance).

> Well, without pointers, why do you need a pointer range check? Computing
> the range of something that doesn't exist seems a little silly.

Pointers _are_ addresses, and nothing else, regardless of whether they
include segment information or other information relevant only to
non-state-of-the-art architectures. The "address" idiom covers all
information relevant to locating the addressee. Pointers may be
interpreted differently depending on the datum, though. On a PDP-10,
a character pointer needs not only a word address but also a character
index within that word.

> I think you had in mind casting the pointer to an int and looking at
> the raw address - the ANSI standard leaves this process undefined.

You're right, that was my intent with the buffer example. But unless
_somehow_ a means of retrieving the address of the buffer - a pointer
to it - is provided, the page boundary check cannot be done _at all_,
defined or undefined, portable or not. To me the simple C-style casting
is preferable to some obscure union declared miles away, since
pointer-to-int casting at least tells me what is going on. Besides, in
almost any implementation, casting a pointer to an int of sufficient
size and later back again will yield the original pointer. I most
certainly would refuse to use a compiler for which this assumption
wasn't correct.
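
For illustration, the page-boundary check discussed above could be
sketched roughly as follows. This is a minimal sketch under stated
assumptions, not code from the original exchange: the helper names
crosses_page and round_trips and the page_size parameter are
hypothetical, and uintptr_t from C99's <stdint.h> (which postdates this
thread) stands in for the "int of sufficient size". It also assumes a
flat address space in which the integer value of a pointer is the
address itself, which the C standard does not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if the len-byte buffer starting at buf
 * spans more than one page of page_size bytes, 0 otherwise.
 * Assumes a flat address space, so that the integer value of the
 * pointer is the address itself.
 */
static int crosses_page(const void *buf, size_t len, size_t page_size)
{
    uintptr_t first, last;

    assert(page_size != 0);
    if (len == 0)
        return 0;               /* an empty buffer occupies no page */

    first = (uintptr_t)buf;     /* pointer-to-integer cast: the raw address */
    last = first + (len - 1);   /* address of the buffer's final byte */

    /* Different page numbers mean the buffer straddles a boundary. */
    return (first / page_size) != (last / page_size);
}

/*
 * The round-trip assumption from the text: converting a pointer to a
 * sufficiently wide integer and back yields the original pointer.
 */
static int round_trips(const void *p)
{
    return (const void *)(uintptr_t)p == p;
}
```

For example, crosses_page(buf, 512, 4096) reports whether a 512-byte
buffer straddles a 4096-byte page. On a real system the page size would
be queried from the operating system rather than hard-coded.
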
If the machine hardware is such that it's not a reasonable assumption
to make - say on a Lisp Machine, for instance - then, well, forget
about portable C code.

> Now, if you're talking about non-standard extensions to C which would
> allow you to do this stuff - then any other language can contain the
> same non-standard extensions.

Extensions, or not being uptight about pointer typing; call it whatever
you like.

>> [...]
>> 2. Calculate physical addresses for DMA controllers.

> Why should I care? The system/environment should be able to give me the
> address if I need it. But, how do I use a raw address anyway? _Standard_
> C pointers don't give me any such access. Access to such things as
> hardware controllers should be privilaged to the system - and _it_
> can contain machine dependent code - like assembly.

...or like C, which most certainly is more defined than assembler!

I'm not sure what kind of programming you're talking about. There are
languages which are defined similarly to what you have described here,
but few outside academia use them - Euclid, for instance. In my
experience, programmers fall into one of two major groups: application
programmers and system programmers. The former use various 4G and other
application-oriented tools, such as XYZ-SQL, COBOL, or Prolog, to write
applications. The latter do the system-dependent stuff, such as
database, server, and support tool implementations - mostly things that
are system-dependent to start with. Neither of these groups would have
particular use for your proposed language. The application people would
ask you what syntax applies to selecting records in a database, while
the system people would ask you how to set up a 2D bitblt operation in
a graphics device, or how to create a channel program in a mainframe
environment. For sure, some of the work done by system folks falls
somewhere in-between.
But I seriously doubt programming efficiency or maintenance would be
improved to any degree worth mentioning by forcing everyone to learn
Yet Another Language and an entirely new set of idioms when the
previous ones are considered quite sufficient. Can you give me one
example of a project that you or a first-hand reference has been
involved in, that falls between the two major categories I've outlined
above, and that by itself is large enough to warrant not simply making
do with what you've got and are used to, and possibly for an employer
to list experience with your language as desirable?

>> [...]
>> 3. Sort a linked list on addresses of some data pointed to
>> from within the node. Or to keep it sorted as new (addresses
>> of) data is added.

> I guess you'll have to tell me how this differs from sorting on the
> index of the data within an array or sequence. Since the sequence is
> dynamic, ...

So how do I know where a certain index resides? I guess this would be
an undefined topic - although in this example it would be well defined
in C, since the buffers would be of the same type (i.e. arbitrarily
dimensioned character vectors).

> ... you can add all the elements you wish - and still sort on index.

How do I know that the addresses of the previous indexes do not change
as new elements are added? This would have to be undefined, as well.

>> [...]
>> 4. Implement malloc()/free().

> When I found out that the ANSI C standard prohibited comparing/subtracting
> pointers to different objects, I pointed out on comp.lang.c that malloc()
> and free() could not not be written in _standard_ C. They agreed with me.

No doubt you're correct. Implementation is fairly trivial in
"nonstandard" C, and I fail to see how it could be made easier or more
"defined" without any pointers (i.e. explicit object addresses) at all.

>> [...]
>> I'm curious as to why so many programmers engage themselves in hot
>> debates over how to best implement strings.
>> String processing is proportionally insignificant - the first
>> thing done after a read is usually a tokenization, either through
>> hand-written code or the output of a lexical front-end generator.
>> [...]

> Tokens are also strings .... Symbol tables also contain strings
> among other stuff).

First, tokens are best handled as small integers or enumerated types,
while symbol tables are commonly hashed. Other than converting
strings to int tokens and symbols to hash values, very little string
processing is done. Second, take a look at an assembler or compiler,
and you'll be amazed at the total lack of string operations. (Apart
from the lexical front-ends, of course.)

> Text processors usually don't have much data that isn't part of one
> string or another.

Granted, but then for most text processors, a simple string or any
other sequence isn't enough to store the text and all relevant
information. A couple of years ago I wrote a type-setting system - it
should qualify as a "text processor" as well as any. The first thing
done with the incoming text was chopping it up into segments, each
carrying the font, pitch, kerning, etc. information unique to that
segment. The actual characters of a segment weren't used again until it
was time to print them. _All_ work was performed on the remaining
segment information, the lists of segments, and lists of lists of
segments. Of all the hairy things done, _none_ involved character data.
(And rarely any duplication either, for that matter.)

Let's distinguish between "defined" and "portable." Even if a program
adheres to a formal definition, there is no guarantee that it's going
to run on every other system that adheres to the same definition. In
the end, common sense and portability constraints will have to lead all
development.

-- Jan Brittenson
   bson@ai.mit.edu