From: utzoo!decvax!harpo!npoiv!alice!research!dmr
Newsgroups: net.unix-wizards
Title: big ptrs, small ints in C
Article-I.D.: research.324
Posted: Fri Dec 10 01:54:20 1982
Received: Sun Dec 12 12:39:34 1982

There has been enough muttering about the size of pointers that I suppose I should say something on the subject.

The most obvious machine on which 16-bit integers and 32-bit pointers are plausible is the Motorola 68000. If you make a compiler for the 68000 according to these specs, and follow the manual faithfully, you get a useful product that successfully handles, for example, most Unix utilities. I know of some programs that were carelessly coded and did cause problems, but mostly there is little trouble, based on what I've heard. When you use such a compiler to develop new applications, as we have been doing locally, things are smooth indeed.

However, there is one big difficulty. The manual states unambiguously that the type of "sizeof" is unsigned (formerly int; but in any case int-sized) and the type of ptr-ptr is int. This makes it difficult to have a large array, and the ability to use lots of storage is presumably one of the reasons one wants to use big pointers.

The most obvious solution is to change the language definition to make the type of sizeof and p-p depend explicitly on the implementation, and in fact this is my current inclination. However, just waving the wand does not solve all problems; in particular a lot of Unix programs stop working, especially those that contain sizeof or p-p as function arguments. (This includes a substantial fraction of those with calls to read, write, and qsort.)

Let's say that the size of ptrs, and the type of sizeof and p-p, are freely selectable. You are implementing a system on the 68000. Consider these choices and their consequences.

1) Simplify life and go for 32-bit ints and ptrs.
Unreliable tests on a small sample of programs indicate you will pay 10-20% typically in execution time; it rises to a factor of 2 on programs with lots of multiplication (as in subscripting) or division.

2) 16-bit ints, 32-bit ptrs, short sizeof. You give up the ability to declare large arrays. However Unix utilities should port with little trouble, and it should be possible to allocate big arrays dynamically. (Nothing says that long subscripts can't be handled.) ptr-ptr is an open question, but I bet it occurs seldom enough not to be a really hot issue.

3) 16-bit ints, 32-bit ptrs, long sizeof. Least portability for existing code, and depends on a change in the written standard which will most assuredly not become a de facto standard merely by being written down. But it seems the best choice for new applications in today's technology.

A couple of related points that people wondered about. The current language definition (as distributed with System III) says that pointers can be converted to "sufficiently long" integers and back again. Also, char ptrs are guaranteed to have the most resolution, and other pointers can be explicitly converted to char pointers and back.

	Dennis Ritchie
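
The points above about sizeof, p-p, and pointer conversions can be sketched in present-day C. This is only an illustration under stated assumptions, not anything taken from the article: the type names size_t, ptrdiff_t, and uintptr_t come from later ISO C (uintptr_t is optional there), and the helper names span, bytes_for, via_char_ptr, via_integer, and self_check are hypothetical. It shows one way an implementation-chosen type for sizeof and p-p, and the two pointer round trips, can look in code.

```c
#include <assert.h>
#include <stddef.h>   /* size_t, ptrdiff_t: implementation-chosen types */
#include <stdint.h>   /* uintptr_t: optional in ISO C, assumed present here */

/* Elements between two pointers into the same array.
   The result is ptrdiff_t rather than a fixed int. */
ptrdiff_t span(const long *first, const long *last)
{
    return last - first;
}

/* Bytes needed for n longs; sizeof yields size_t, not int. */
size_t bytes_for(size_t n)
{
    return n * sizeof(long);
}

/* Round trip: object pointer -> char pointer -> back again. */
long *via_char_ptr(long *p)
{
    char *c = (char *)p;
    return (long *)c;
}

/* Round trip: pointer -> "sufficiently long" integer -> back again. */
long *via_integer(long *p)
{
    uintptr_t bits = (uintptr_t)(void *)p;
    return (long *)(void *)bits;
}

/* Exercises the helpers above; call it from main() to try the sketch. */
void self_check(void)
{
    long a[8] = {0};

    assert(span(&a[1], &a[6]) == 5);
    assert(bytes_for(4) == 4 * sizeof(long));
    assert(via_char_ptr(&a[3]) == &a[3]);
    assert(via_integer(&a[3]) == &a[3]);
}
```

Because callers name size_t and ptrdiff_t instead of int, the same source should compile whether the implementation picks 16-bit or 32-bit ints, which is roughly the portability concern raised for calls to read, write, and qsort.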