Path: utzoo!attcan!uunet!lll-winken!sol.ctr.columbia.edu!samsung!cs.utexas.edu!yale!cmcl2!lanl!jlg
From: jlg@lanl.gov (Jim Giles)
Newsgroups: comp.lang.misc
Subject: Re: C's sins of commission
Message-ID: <65662@lanl.gov>
Date: 13 Oct 90 00:22:58 GMT
References: <26295@megaron.cs.arizona.edu>
Organization: Los Alamos Natl Lab, Los Alamos, N.M.
Lines: 189

From article <26295@megaron.cs.arizona.edu>, by gudeman@cs.arizona.edu (David Gudeman):
> In article <65265@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> [...]
> ]Yes!  Because, a[i] and b[j] are guaranteed _not_ to be aliased...
>
> But a[i] and a[j] don't have that guarantee.  So there are a few
> aliasing problems that are easier to solve with arrays.  No big deal.

My objection here is to the word "few" and the phrase "no big deal."
_MOST_ potential aliasing between pointers simulating arrays is
actually between separate arrays.  On a pipelined machine, the slowdown
for aliasing is usually on the order of a factor of 2.  On a vector
machine, the factor may be between 10 and 100.  Most users who are hit
with this problem don't regard it as "no big deal."

> [...]
> ]... The pointer syntax _may_ be used to simulate arrays, but
> ]you might be planning to use it for dynamic memory, strings, recursive
> ]data structures, run-time equivalencing, etc.  How does the reader know
> ]that the pointer will not be used in any of those ways - he knows the
> ]array won't be.
>
> No he doesn't.  In a language without pointers, the array might be
> used to simulate pointers, and the pointer-simulation used for one of
> the above purposes.

Touche.  _IF_ the choice were strictly between pointers and arrays,
then the rest of the important data type construction tools would have
to be simulated with whichever is chosen.  However, I don't recommend
minimalist language design.
The language should allow arrays, sequences (essentially variable
length 1-d arrays), unions (always tagged), records (structs, whatever
you want to call them), and recursive data structures.  These features
should be allowed to be used individually or in any combination.  In
addition, the 'aliased' attribute should be allowed to be applied to
any collection of data items of the same type (which allows the
"shallow copy" assignment to be applied to them).  Further attributes
that should be applicable to any variable are 'dynamic' (which allows
the objects to be allocated and deallocated explicitly by the user)
and 'static' (which makes the variable have permanent scope); the
default is to put the variable on the stack.  Finally, an explicit way
of defeating type checking (for run-time 'equivalence' work) should be
provided.

The presence of all these doesn't _guarantee_ that some user won't try
to simulate one of them with some combination of the others, but why
should a user do so?  It would simply make his code harder to read and
maintain.  I prefer to state _explicitly_ how my data is structured.

Perhaps your objection is founded on this last issue.  Maybe you don't
know a priori how your data really is structured and you want to delay
the decision until most of the code is written.  This violates the
spirit of structured programming (iterative refinement of programs -
which differs from Structured (note the initial capital) programming,
which is a religion of GOTO evasion).

> [...]
> ]                   Each of these features should have separate syntax since
> ]they are separate features.
>
> C-style pointers struck me from my very first exposure as a simple and
> elegant way of merging various things that are only _apparently_
> different. [...]

C pointers always struck me as rather _inelegant_ - even if I wanted
to merge distinct features.

> [...]          I don't suppose you would be willing to just admit that
> people have different tastes and give up your crusade against
> pointers?

I might if I believed that people's tastes were correlated to their
productivity.  I've always worked in or near the consulting office
(help desk, whatever your business calls it).  I worked my way through
college as such a consultant.  In 18 years of such experience I think
I have a pretty good idea of what kind of errors people make, what
kind cause the most trouble, and what kind of language features don't
seem to engender such errors.  Pointers are associated with the error
side of the ledger.

Further, many user productivity studies have been performed on
language features.  (Although I haven't found one on pointers yet.)
Many of the researchers who conduct such tests have remarked that,
very often, the feature the users thought was most productive was
actually the reverse.  Users are, in fact, notoriously bad at gauging
their own productivity or the features that affect it.

For this reason, when I see a feature which is associated with a
disproportionate percentage of errors (and difficult to find and
correct ones at that), I am perfectly willing to ascribe a good share
of the blame for such errors to the feature itself - even if it's a
feature I personally like.  (This is indeed the case.  Ten years ago I
quite liked pointers.  Since then I've found substitutes which are
just as efficient and are more readable and less error prone.)

> [...]
> I'm willing to recognize that you prefer to divide the world up into a
> bunch of separate boxes.  You ought to recognize that others prefer to
> integrate different things into the same box.  It is simple hubris to
> try to convince others that your personal preferences are somehow
> objective and that their preferences are misguided.

And yet, you do not find it "simple hubris" when people (including
yourself) maintain that the features of C are all anyone ever needs.
If my opposition to pointers seems overblown, it is because all the
false hype popularizing C is even more so.
How many times have you chastised a C supporter for claiming that a
preference for Fortran was misguided?  It happens on the net all the
time - at least a dozen times a year on each relevant newsgroup (and
some irrelevant ones).  Yet, such a position is as much "simple
hubris" as the statements I'm making (or more).

> [...]
> ]                        Forcing them all to masquerade as pointers
> ]only confuses the person maintaining the code - and doesn't give the
> ]compiler enough information to adequately optimize.
>
> Funny, I've never been confused by any of the above uses of pointers
> (with the possible exception of run-time equivalencing, I don't know
> what you mean by that...).

This is a variation on the old "blame the victim" approach that
lawyers defending rapists use.  What you're saying is that you don't
have the problem and those that do aren't worth considering.  Aside
from the ethical questions about your apathy toward other programmers,
what about practical issues such as increased software costs, or
increased taxes (the government hires programmers too - not all of
them are as immune as you are to these errors)?

But, let's test your claim: what is the expected argument type in the
following ANSI C style function prototype?

      int f(char *x);

Is the expected argument a single character which is passed by
reference?  Does the function expect an array of char?  Does the
function expect a sequence of char (terminated with a zero byte)?  Or,
perhaps the function expects something entirely different and it is
declaring its argument to be a (char *) because the programmer "knows"
that (char *) is more or less generic (this dependence on
machine-dependent internal structure is what I mean by run-time
equivalence - it is often a valuable and important thing to do, but I
don't think pointer casting - implicit or explicit - is the best
mechanism)?  You can't tell?  You are now having a problem with
pointers that you claim you don't have.
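
To make the ambiguity concrete, here is a minimal sketch in C.  The
four function names are hypothetical, invented only for this
illustration; each takes a (char *) exactly like f above, and the
prototype alone cannot tell a caller which contract applies.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical functions: every one has the shape  int f(char *x). */

/* (1) x is ONE char passed by reference; it is upper-cased in place. */
int capitalize(char *x)
{
    *x = (char)toupper((unsigned char)*x);
    return *x;
}

/* (2) x is the first element of an array of exactly 4 chars. */
int sum4(char *x)
{
    int total = 0;
    int i;
    for (i = 0; i < 4; i++)
        total += x[i];
    return total;
}

/* (3) x is a sequence of chars terminated by a zero byte. */
int count_chars(char *x)
{
    return (int)strlen(x);
}

/* (4) x is really the address of an int, passed as a "generic"
       pointer (the run-time equivalence case). */
int read_int(char *x)
{
    int value;
    memcpy(&value, x, sizeof value);
    return value;
}
```

Handing the address of a single char to count_chars, or a string to
read_int, should compile without a type error, since every call
supplies the (char *) the prototype asks for.  Only the comments
distinguish the four contracts, and the compiler does not read those.
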
The only way to tell what this argument is expected to be is to
inspect the body of the function.  Commentary and other sources may
_claim_ that the argument is used in a particular way - but without
compiler enforcement, such claims are not reliable.

> [...]
>              If my mental picture of a solution to a problem involves
> pointers, then the language should let me _express_ the solution with
> pointers.  Yes, yes, _you_ don't think I should be picturing problem
> solutions with pointers.  I happen to disagree with you, and so do
> thousands of other C programmers.

Then leave the discussion!  C is a fait accompli and this is a
discussion about the possible design of some future language.  If you
feel that C has all you need already, and in a form you like, then I
promise never to twist your arm to buy this new language.  You can
continue to use C until doomsday for all I care.

My remarks are addressed to those who aren't already religious about C
or about a programming paradigm requiring pointers.  To those people I
am pointing out that there are ways to design languages which permit
them to explicitly declare the structure of their data in ways that
the compiler can check - thus rendering a large class of potential
errors completely impossible.  Additional advantages to this approach
include the fact that the compiler itself can make use of this more
explicit information to produce more efficient code.  Another
advantage is that explicit data structuring actually allows greater
flexibility in the use of such structures.

As an example of this last point: consider arrays and sequences.  They
are, as you keep asserting, quite similar concepts.  But, they aren't
identical.  The rank and extent of an array are fixed for the lifetime
of the array (which may be dynamic: specified in an allocate
statement).  The length of a sequence (it is always of rank one) is
variable.
This means that the concatenate operator is quite appropriate and
useful for sequences, but is of doubtful use for arrays (and even of
doubtful meaning for arrays of rank greater than one).  If both arrays
and sequences are implemented in the same way (as you insist, using
pointers), then the introduction of a concatenate operator would
engender errors due to accidental use on arrays.  This may not seem
important and it may not occur very often, but it could be a very
difficult error to discover in a large code.  And, it's so easy to
avoid entirely by making sequences and arrays distinct concepts in the
language.

J. Giles