Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!wuarchive!uunet!snorkelwacker!bloom-beacon!world!burley
From: burley@world.std.com (James C Burley)
Newsgroups: comp.lang.c
Subject: Re: Crash a RISC machine from user-mode code:
Message-ID:
Date: 11 Aug 90 06:43:56 GMT
References: <1826@mountn.dec.com> <49041@seismo.CSS.GOV>
Sender: burley@world.std.com (James C Burley)
Organization: The World
Lines: 206
In-Reply-To: stead@beno.CSS.GOV's message of 11 Aug 90 00:32:00 GMT

NOTE: -LONG- POSTING, look at the summary at the bottom first if you don't want to read a single long posting on crashing systems!! I'd boil it down, but I've already spent too much access $$ just composing the thing, so I apologize to everyone for the length and to those who knew me as a software tech writer long ago (I've always been an overly verbose engineer :-)!

--------

Hmm, this discussion was at first very interesting to me but seems to have gotten off the track I was hoping for...let me explain:

1) As I recall, the original posting talked about somebody wondering if the new RISC machines were bullet-proof in user mode (essentially, based on their wording -- something about "register paths" and such), and proposed running a program that jumped to random data. The result of such a program is the execution of random defined AND UNDEFINED instructions.

2) This kind of program should ONLY be run under so-called "user-mode" protection, i.e. under operating systems like UNIX, OS/2 (I think), VMS, A/UX, and so on, and only on CPUs where those systems offer (and have enabled) memory protection, fault catching, preemptive scheduling, and such like. Thus it is NOT USEFUL to run the program on systems like IBM PCs running DOS, XENIX (I think), or Macintoshes running Apple's non-UNIX OS. (Maybe A/UX fits in this category too?) Why? Because no matter what it does (short of reducing oil prices), any hand-written program could have done the same -- including (caution!) erasing your hard disk!

Doubt I've lost anyone so far....

3) There's no doubt that jumping to random junk produces no useful productive work in the normal sense; nobody is suggesting this is a good way to use any kind of computer. BUT, by running random junk, one may increase the likelihood of discovering a "hole" in the system (hardware or kernel, usually) compared to running regular code generated by a compiler or even regular assembly code written by users. It may even have a better chance than examining the instruction set architecture and trying to purposely write code that breaks the machine.

4) If such a program does anything that any normal user mode program may conceivably do, then it should not be considered worth noting. This is especially the case (even for weird things like deleting files) if the program is run after some other useful program has been run and still has parts of it sitting around in memory; the random program could easily jump to it. Other things included in this "not interesting behavior", IMHO:

   a) Putting the process into an infinite loop (but the system as a whole still works to the same extent it would if one actually ran a hand-coded infinite loop).

   b) Spewing junk to the terminal screen, or hanging for input from the terminal.

   c) Signaling conditions caught by the OS.

   d) Logging out, playing with files, network connections, or other things like that.

   e) Thrashing the swapper or pager (again, assuming any user program can do it).

5) However, if the random program manages to do things clearly out of the accepted realm of "user program", and assuming it (and thus the user or "wetware") cannot invoke "superuser" or some other "give me direct access to the kernel" function, such as "poke the kernel's memory" or "write to raw disk sector", then one may conclude that either the operating system in control has a security hole, or perhaps the hardware itself has a security hole.
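
To make the "not interesting" cases in item 4 concrete, here is a rough sketch in C, assuming a POSIX system. It runs a function in a child process and sorts the result into a few buckets based on the status waitpid() returns. The enum, the function names, and the demo_* stand-ins are all made up for illustration. The point it tries to show: if the parent is still around to collect a status at all, the OS caught whatever happened, so every bucket here is one of the uninteresting ones. The interesting outcomes of item 5 are exactly the ones a harness like this cannot report.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum outcome {
    OUT_EXITED,        /* child ran to an exit                        */
    OUT_FAULT_CAUGHT,  /* 4c: OS caught a fault and signaled the child */
    OUT_RAN_TOO_LONG,  /* 4a: looped until the alarm killed it         */
    OUT_OTHER          /* could not run, or status not recognized      */
};

/* Map a waitpid() status onto the buckets above. */
enum outcome classify_status(int status)
{
    if (WIFEXITED(status))
        return OUT_EXITED;
    if (WIFSIGNALED(status))
        return WTERMSIG(status) == SIGALRM ? OUT_RAN_TOO_LONG
                                           : OUT_FAULT_CAUGHT;
    return OUT_OTHER;
}

/* Run fn in a child with a wall-clock limit and classify how it ended. */
enum outcome run_and_classify(void (*fn)(void), unsigned timeout_secs)
{
    assert(fn != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return OUT_OTHER;
    if (pid == 0) {
        struct rlimit no_core = {0, 0};
        setrlimit(RLIMIT_CORE, &no_core);  /* no core files */
        alarm(timeout_secs);               /* SIGALRM ends a runaway child */
        fn();
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return OUT_OTHER;
    return classify_status(status);
}

/* Hypothetical stand-ins for things random junk might end up doing. */
void demo_exit_cleanly(void) { }

void demo_fault(void)
{
    signal(SIGSEGV, SIG_DFL);  /* make sure the default action applies */
    raise(SIGSEGV);
}

void demo_spin(void)
{
    volatile unsigned long spin = 0;
    for (;;)
        spin++;
}
```

Running the three stand-ins through run_and_classify() with a one-second limit should land them in OUT_EXITED, OUT_FAULT_CAUGHT, and OUT_RAN_TOO_LONG respectively, and none of those results says anything bad about the system.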

THIS IS PART OF HOW ROBERT MORRIS'S PROGRAM TRASHED THE NET: he knew that passing over-long input to a network daemon (fingerd) would escape normal defensive programming (since there wasn't any in that particular case), and allow his program to insinuate part of itself (data/instructions) into the daemon's memory and then be executed as the daemon's code rather than his own. (That particular hole was in a system daemon rather than in the kernel itself, but the principle carries over.)

I highlight this issue because it IS important: if your operating system provides a "hole" through which any user (who can write and execute raw machine code, even if only via BASIC POKE instructions, but certainly via use of assembler/loader) can do something not normally allowed in user mode, then your operating system has a security hole. (I'm not talking about the non-user-mode systems like Mac OS, PC/MS-DOS; I mean Unix, VMS, PRIMOS, and so on.)

Very likely, the hole can be found and fixed (though the fix is painful if the "bug" is really a convenient "entry point" for utilities needing special features; I've dealt with fixing this kind of thing many times, usually involving timesharing systems' batch and printer queue utilities).

But if the problem is that the underlying CPU allows a user mode program to somehow circumvent documented user mode protection, then the problem cannot be fixed without either switching to another kind of CPU (not easy; porting is a problem) or preventing users from writing machine code (the acceptable answer if you are providing only pure end-user services; for example, the Prodigy on-line service allows no programming, so conceptually could be implemented entirely on Apple IIs without having any architectural exposure from a security perspective -- of course, performance is another issue :-).

IF a system is "hackable" from the hardware perspective, the manufacturer of that CPU had better find and fix the problem fast, and perhaps even provide inexpensive replacements to their customers.
Otherwise their machine becomes a "target" of evil hackers, and administrators will learn to avoid any system based on their CPU especially when it comes to attaching such a system to any network or putting any sensitive data on it.

SAMPLE WAYS TO TELL if your system has a "hole" like this, based on the behavior of the "random-jumping" program:

   a) Running the program crashes the entire system, but there is no known way of so doing with a hand-written user program. (The culprit may be the OS or the CPU, but check the OS carefully first.)

   b) The program manages to rewrite part of the (protected) kernel without crashing the system or calling any "may I write the kernel" function. (Likely to be a CPU bug.)

   c) The program somehow causes Iraq to unilaterally disarm.

Again, if the random program does something you cannot imagine ANY user mode program doing (not just a correct or "well-written" one), then you might well be looking at a security hole. ("Security" meaning either a user can access things he/she shouldn't be able to, or is able to trash or crash things he/she shouldn't have access to, like the CPU itself.)

6) If you think the random-jumping program has exposed a hole in your system, be it RISC or CISC, first determine (by reading the documentation or asking an expert on your configuration of CPU and OS) whether your system even ATTEMPTS to catch all possible user-mode violations. If your CPU allows, for example, I/O instructions in user mode, then although it wouldn't fit MY definition of "user mode", it would mean a user mode program could do almost anything (including rewriting a swapping/paging kernel or other kernel-mode programs right out from under themselves by rewriting their disk images), so the random-jumping program would simply be something to avoid running any more!
But, if you are running, say, under VAX/VMS, or on a 68030 running a memory-protecting UNIX, or some such thing, and the random-jumper does something out of bounds, then perhaps you've discovered one or more "holes" in the system.

If you can reproduce the problem reliably (if the program always creates the same random data each time, for example), then you might be able to step through it and find the actual instruction or instruction sequence that causes the crash. (HINT: if it takes long in terms of instruction steps, and your system provides a user-mode n-stepper, find a large value of "n" to step the program that results in a crash, then use a binary search technique to lower "n" until you have a value that falls just short of the crash.)

Once you've narrowed down the problem to a few instructions, if they're user mode and don't involve a kernel call, you might have a true-blue CPU bug: document the problem and discuss with another expert on that CPU (especially, try and reproduce on other chips in case yours just has a local flaw, and on other slightly different models of the same CPU, e.g. a 486 if the failure is on a 386, a 68030 if on a 68040, etc), then if you still think it's a hardware problem, report it to the manufacturer.

However, it's likely that a supposed CPU problem is really an OS problem if the offending instructions cause a valid trap to the OS that the OS mishandles or fails to handle, so make sure the offending instructions aren't trapping to kernel mode at all.

If you find the problem's in the OS, for example a call to an OS function with absurd arguments that don't get "noticed" until it's too late, then (again, after checking with experts and other copies and different versions of the OS) let the OS writer know. And, depending on your own sense of ethics, perhaps let everyone else (via a newsgroup) know as well, so they can plan their own defenses if they are using that OS.
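
The binary search on "n" from the HINT can be sketched in C as follows. This is only an illustration under stated assumptions: the crash must be reproducible (same random data every run), and crashes_within stands for whatever your system's n-stepper lets you do, namely restart the program, step it n instructions, and report whether it crashed. The names last_safe_step, crash_probe, demo_ctx, and demo_probe are made up here; demo_probe is a fake that pretends the crash happens at a fixed step so the search can be tried without a real stepper.

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero if the program, restarted and stepped n instructions,
 * has crashed. Supplied by the caller; ctx is passed through untouched. */
typedef int (*crash_probe)(unsigned long n, void *ctx);

/* Given a step count known_bad that is known to reach the crash, return
 * the largest step count that still falls short of it.
 * Loop invariant: stepping lo instructions is safe, stepping hi crashes. */
unsigned long last_safe_step(crash_probe crashes_within, void *ctx,
                             unsigned long known_bad)
{
    assert(crashes_within != NULL && known_bad > 0);
    unsigned long lo = 0;           /* zero steps cannot crash */
    unsigned long hi = known_bad;
    while (hi - lo > 1) {
        unsigned long mid = lo + (hi - lo) / 2;
        if (crashes_within(mid, ctx))
            hi = mid;               /* crash is at or before mid */
        else
            lo = mid;               /* still fine after mid steps */
    }
    return lo;
}

/* Fake stepper for demonstration: "crashes" once n reaches crash_step,
 * and counts how many times it was asked. */
struct demo_ctx {
    unsigned long crash_step;
    unsigned calls;
};

int demo_probe(unsigned long n, void *ctx)
{
    struct demo_ctx *d = ctx;
    d->calls++;
    return n >= d->crash_step;
}
```

With a crash at step 12345 and a known-bad bound of a million steps, the search settles on 12344 after at most 20 probes, which is a lot less tedious than single-stepping from the start.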

(I wouldn't personally recommend advertising a CPU hole; if you're wrong about somebody's OS, it's fairly easy for them to prove assuming you've narrowed the problem down adequately, and in any case people can actually defend their systems fairly rapidly via patches, but if you're wrong about someone's CPU, they can't show everyone the schematics to prove it and you may have tarnished the manufacturer's image permanently, and meanwhile there isn't much most people can do about it quickly. Wait for the manufacturer to verify/refute the problem and take their own steps, IMHO.)

7) Remember that even if you're the ONLY USER of a system with a "hole", you've still got a security problem unless you're also the ONLY PROGRAMMER of every new (or recent) program running in user mode on your system. If someone else knows of a hole in a particular OS/CPU combination, they might use that knowledge to write a trojan horse program that pretends to be one thing but, when it detects a system for which it has a "kernel access" code to exploit a security hole, does bad things like attaching viruses to other programs or erasing disks.

(IMHO, the best protection against this kind of situation is to only use "free software" that comes with source code and never use the binaries, but always do the rebuilds yourself, and only after inspecting the source code via a quick perusal: it is much harder to hide a code missile in source code than in binary code. Be suspicious of any data tables without adequate explanation, especially if they can get "jumped" to. Unfortunately, scanning assembler code can be much harder than scanning HLL code like C, Pascal, or (best of all due to lack of pointers and such) Fortran and Cobol.)

I know this has been a long posting, but I've tried to explain what I think are the important issues about a random-jumping program.
Again, please don't get excited if such a program goes into an infinite loop, or signals conditions that your OS catches -- any user program can do those things.

DON'T run such a program on ANY system that doesn't offer full user mode protections like memory, I/O, scheduling, and others; you might just end up trashing your hard disk or some such thing.

Finally, if you DO run the program on an "interesting" (i.e. protecting) system and it produces "interesting" (i.e. not-normally-allowed-in-user-mode) results, PLEASE look into it further and, if possible, involve an expert -- you may have taken a step towards preventing the next major virus or trojan horse infiltration! (I mean, if YOU can find the problem, so can someone else who wants publicity for being a mediocre, obnoxious hacker!)

James Craig Burley, Software Craftsperson
burley@world.std.com