Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84 SMI; site sun.uucp
Path: utzoo!linus!decvax!decwrl!sun!guy
From: guy@sun.uucp (Guy Harris)
Newsgroups: net.lang.c
Subject: Re: Uses of "short" ?
Message-ID: <2883@sun.uucp>
Date: Sat, 12-Oct-85 19:08:36 EDT
Article-I.D.: sun.2883
Posted: Sat Oct 12 19:08:36 1985
Date-Received: Tue, 15-Oct-85 06:40:11 EDT
References: <486@houxh.UUCP> <2600017@ccvaxa>
Organization: Sun Microsystems, Inc.
Lines: 147

> > I don't want them to have a pretty good idea when it's going to violate
> > that default assumption on a particular machine. I want them to have a
> > pretty good idea when it's going to violate that default assumption on
> > a 16-bit-"int" machine;

> Well, I can see how that would make life easier for you, but it's not
> really my problem. The project I work on would have saved a lot of
> time if the code we're porting hadn't been written for a system using
> memory-mapped files, but I don't curse the authors for writing for the
> environment they had.

One can write:

        int size_of_UNIX_file;

or one can write

        long size_of_UNIX_file;

The former is incorrect, and the latter is correct. The two are
equivalent on 32-bit machines, so there is NO reason to write the former
rather than the latter on a 32-bit machine. If one can write code for a
more general environment with NO extra effort other than a little
thought, one should curse an author who didn't make that extra effort.

> > (Consider all the postings that say "our news system truncates items
> > with more than 64KB, so could you please repost XXX" for an example of
> > why it is a bad practice.)

> What has that to do with anything? Somebody failed to anticipate
> future needs and used a short when she should have used a long.

The code in question uses an "int" where it should have used a "long".
Using a "short" would have been *more* acceptable; the documentation for
this system says items can be up to 65535 bytes long (2^32 bytes in 4.1c
BSD). Since the only *real* constraint on the size of items is the
amount of disk space available and the time taken to transmit the items,
neither of which is significantly affected by the width of a processor's
ALU and registers, the system should not make the maximum item size
dependent on either of those two factors. The ideal would have been to
use "long" instead of "int"; however, if the cost of converting item
databases on PDP-11s would have been too high, using "short" would have
been acceptable. The ideal would have been to do something like

        #ifdef BACKCOMPAT
        typedef unsigned int itemsz_t;
        #else
        typedef unsigned long itemsz_t;
        #endif

and *not* restrict items to 65535 bytes by default; if it's really too
much trouble for a site to convert its database, then they can build a
version which is backwards-compatible with older versions.

> There are people who rely on two-digit year codes, too.

Yes, but how many of them rely on two-digit year codes on 16-bit
machines and four-digit year codes on 32-bit machines? Not planning for
future needs may be regarded as a misfortune; having a system like the
aforementioned meet future needs or not depending on the width of a
machine's registers looks like carelessness. (Sorry, Oscar, but I
didn't think you'd mind...)

There are cases where the difference between a 16-bit machine and a
32-bit machine *is* relevant; an example would be a program which did
FFTs of large data sets.
I have no problem with 1) the program being written very differently
for a PDP-11, which would have to do overlaying, or provide a software
virtual memory system, or perform some other technique to do the FFTing
on disk, and for a VAX, where you could (assuming you could keep the
entire data set in *physical* memory) write it in a more straightforward
fashion (although, if it *didn't* all fit in physical memory, it would
have to use some techniques similar to the PDP-11 techniques to avoid
thrashing) or 2) saying "this program needs a machine with a large
address space".

> Portability is one of many factors to be considered in setting local
> coding standards. I have spent a lot of time recently understanding
> code written for a very different environment and converting it to C.
> It had lots of size and byte-ordering problems. That's the breaks.
> It's not the authors' fault that I had different requirements than they.

In many of these cases, there is little if any gain to be had by writing
software in a non-portable fashion. Under those circumstances, it *is*
the authors' fault that they did something one way when they could have
done it another way with little or no extra effort. In the case of byte
ordering, it takes more effort to write something so that data is
portable between machines. If it's a question of a program which
*doesn't* try to exchange data between machines and *still* fails on
machines with a different byte order than the machine for which it was
written, there'd better have been a significant performance improvement
gained by not writing it portably. And in the case of using "long" vs.
"int", there is NOTHING to be gained from using "int" instead of "long"
on a 32-bit machine (on a truly 32-bit machine, "long"s and "int"s will
both be 32-bit quantities unless the implementor of C was totally out to
lunch), so it SHOULD NOT BE DONE. Period.
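
To make the last two points concrete, here is a minimal sketch in C. It
is not code from the news system under discussion. The names put32 and
get32 are hypothetical, itemsz_t follows the typedef suggested earlier,
and the choice of most-significant-byte-first order is an assumption made
only for illustration. The size is widened to "unsigned long" before
shifting, so the result does not depend on the width of "int", and the
bytes are stored one at a time, so it does not depend on the host's byte
order either.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item-size type, following the typedef in the article. */
#ifdef BACKCOMPAT
typedef unsigned int itemsz_t;   /* only 16 bits on a 16-bit-"int" machine */
#else
typedef unsigned long itemsz_t;  /* at least 32 bits on every machine */
#endif

/*
 * Store the low 32 bits of an item size in buf[0..3], most significant
 * byte first, whatever the byte order of the host.
 */
void put32(unsigned char *buf, itemsz_t size)
{
    unsigned long v = (unsigned long)size;  /* widen before shifting */

    assert(buf != NULL);
    buf[0] = (unsigned char)((v >> 24) & 0xffUL);
    buf[1] = (unsigned char)((v >> 16) & 0xffUL);
    buf[2] = (unsigned char)((v >> 8) & 0xffUL);
    buf[3] = (unsigned char)(v & 0xffUL);
}

/* Recover a size written by put32(), again independent of host order. */
unsigned long get32(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((unsigned long)buf[0] << 24) |
           ((unsigned long)buf[1] << 16) |
           ((unsigned long)buf[2] << 8) |
            (unsigned long)buf[3];
}
```

Built without BACKCOMPAT, put32(buf, 70000UL) leaves the bytes 00 01 11
70 (hex) in buf on a PDP-11 and on a VAX alike, and get32(buf) gives back
70000. The only cost over writing the in-memory "int" straight to disk
is a few shifts and masks.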

> But it is not our business to produce code that runs on PDP-11s,
> let alone (as you requested in a previous posting) code that runs
> efficiently on PDP-11s.

I made no such request, but I'll let that pass. If you can get code
that runs on PDP-11s with no effort other than getting people to use C
properly, it *is* your business to get them to so use C and write
portable code whenever possible. If your system permits code to
reference location 0 (or whatever location a null pointer points to,
assuming it doesn't have a special "invalid pointer" bit pattern), it
*is* your business not to write code which dereferences null pointers -
such code is NOT valid C. Programmer X can get away with writing code
like that, if they have such a system; programmers Y, Z, and W who work
for a company which does not permit code to get away with dereferencing
null pointers have every right to stick it to programmer X when their
company's customers stick it to them because "your machine is broken and
won't run this program". Saying "programmer X is not at fault" is
blaming the victim, not the perpetrator.

> I have no objection to the principle that we should try, other things
> being equal, to write portable code. But the FIRST consideration of
> good professional practice is to write code that is clear,
> maintainable, and efficient in the environment for which we are paid
> to produce it. It is not bad practice to put that environment first.

If all other things are not equal, or close to it, I have no objection
to unportable code. The trouble is that people don't even seem to try
to write portable code when they *are* equal. It *is* bad practice to
blindly assume that the environment you're writing for is the only
interesting environment. Some minimum amount of thought should be given
to portability, even if portability concerns are rejected. Can you
absolutely guarantee that the people who paid you to write that code
won't ever try to build it in a different environment?
If not, by writing non-portable code you may end up costing them *more*
money in the long run; it's more expensive to retroactively fix
non-portable code than to write it portably in the first place. If
somebody says that, now that ANSI C finally "defines 'int's as 16-bit
quantities", they'll start thinking about when it's appropriate to use
"long" and when it's appropriate to use "int", they haven't given the
proper minimum amount of thought to portability.

        Guy Harris