Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!rutgers!seismo!munnari!moncskermit!goanna!yabbie!rcodi
From: rcodi@yabbie.rmit.oz (Ian Donaldson)
Newsgroups: comp.unix.wizards
Subject: Re: brk's zero-fill behavior on VAXen (useful undefined checks)
Message-ID: <363@yabbie.rmit.oz>
Date: Sun, 9-Nov-86 22:48:02 EST
Article-I.D.: yabbie.363
Posted: Sun Nov 9 22:48:02 1986
Date-Received: Tue, 11-Nov-86 01:01:23 EST
References: <7208@elsie.UUCP> <5142@brl-smoke.ARPA> <2447@hcr.UUCP>
Organization: RMIT Comm & Elec Eng, Melbourne, Australia.
Lines: 56
Summary: initializing to other than zero is more useful

In article <2447@hcr.UUCP>, mike@hcr.UUCP (Mike Tilson) writes:
> I'd like to point out that there is another very good reason to
> set newly allocated memory to a fixed value: buggy programs are much
> less likely to exhibit non-deterministic behavior, which makes it
> much easier to fix problems. If newly allocated memory were initialized
> with random values, then tracking down wild pointers, etc., would be much
> harder.

I might point out that initializing such memory with zero is less
likely to reveal bugs in a program than initializing it with a
constant garbage value (e.g. 0x3e). A pointer that lived in such memory
would then be 0x3e3e3e3e, a value that will cause most CPUs to give a
bus error or seg fault, because (1) if it is a pointer to an int, the
address is not a multiple of 4, which traps on CPUs that insist on
aligned word accesses (the 68000 only craps out on odd addresses, so
an odd fill byte such as 0x3f would be needed to catch it there); and
(2) very few programs have addresses that live up that high in their
data or that low in their stack segments. Initializing to zero shows
up the error only on machines that disallow references to low memory
(e.g. Suns).

The CDC Cyber 170 series uses this concept to advantage with most
languages; since it has 60-bit words (a silly number, I agree), it sets
all 'bss' storage to 0600000000000004nnnnn, where the n digits are the
address of the storage.
Since pointers on the Cyber cannot exceed 131071 (0377777), any
reference to the data as a pointer will fail. The 06 part is used so
that the hardware can trap any arithmetic operations on such data as
overflows. Fortran, Pascal and several other languages use this to
give sensible post-mortem dumps: it is known with reasonable
probability which variables are undefined, since the address of the
variable is inside its contents.

Minnesota Pascal-4 uses all this to great advantage. When run-time
tests are on, even stack frames are initialized this way, making it
very easy to debug programs that use uninitialized variables. Pointers
declared in parts of the program where run-time checks are switched on
are also physically larger than normal, to accommodate extra
information (the key) so that the pointer can be checked for validity.
When a new() is done, a unique key is tacked on top of the object
allocated; it must match the key in the pointer referencing it,
otherwise a "pointer-invalid" run-time error occurs. On the Cyber this
is easy, since there are so many bits available in a word.

Perhaps, for the sake of the run-time checking available with languages
such as Pascal, on a 32-bit machine you could sacrifice one of the 4G
available states and classify it as 'undefined'. An obvious candidate
comes from the imbalance in the range of signed numbers on 2's
complement machines: 16-bit numbers go from -32768 to 32767, so you
could probably steal -32768 for such checking without affecting too
many programs. Similarly for 32-bit ints (0x80000000).

Pity you can't do a lot of this checking in C without breaking huge
amounts of code. Therefore, Pascal++ :-)

Ian Donaldson.
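
For what it's worth, the constant-garbage fill suggested above might be
sketched in C roughly as follows. This is only an illustration under
assumptions: poison_malloc(), looks_uninitialized() and
poisoned_word32() are made-up names, not part of any real allocator,
and the check is only a "reasonable probability" test, since a program
can legitimately store the same bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FILL_BYTE 0x3e  /* the garbage constant used as an example in the post */

/* Hypothetical debugging wrapper: allocate n bytes and fill them with
 * FILL_BYTE instead of leaving them zeroed or random. */
static void *poison_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        memset(p, FILL_BYTE, n);
    return p;
}

/* Returns 1 if every byte of the object still holds the fill pattern,
 * i.e. the object has probably never been assigned; 0 otherwise. */
static int looks_uninitialized(const void *obj, size_t n)
{
    const unsigned char *b = obj;
    size_t i;

    assert(obj != NULL);
    for (i = 0; i < n; i++)
        if (b[i] != FILL_BYTE)
            return 0;
    return 1;
}

/* The bit pattern a 32-bit pointer or int would hold if it lived in
 * freshly poisoned memory (0x3e repeated in all four bytes). */
static uint32_t poisoned_word32(void)
{
    uint32_t w;
    memset(&w, FILL_BYTE, sizeof w);
    return w;
}

/* Example object containing a pointer that a buggy program might
 * follow before ever setting it. */
struct node {
    struct node *next;
    int value;
};
```

A caller would allocate a struct node with poison_malloc() and could
test looks_uninitialized(&n->next, sizeof n->next) before following the
link; on a machine with 32-bit pointers the unset link has the same
bits as poisoned_word32().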