Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site ulysses.UUCP
Path: utzoo!linus!decvax!harpo!eagle!mhuxl!ulysses!ggs
From: ggs@ulysses.UUCP (Griff Smith)
Newsgroups: net.unix-wizards
Subject: Re: Unix Error Messages at Crash Time
Message-ID: <741@ulysses.UUCP>
Date: Wed, 21-Dec-83 22:46:28 EST
Article-I.D.: ulysses.741
Posted: Wed Dec 21 22:46:28 1983
Date-Received: Fri, 23-Dec-83 01:32:29 EST
References: <14805@sri-arpa.UUCP>
Organization: AT&T Bell Laboratories, Murray Hill
Lines: 46

With regard to the following:

>Is there anyone out there who knows what Unix error messages at crash time
>mean? I am talking about the ones not explained in section 8 of volume 1.
>Messages like "panic: mba, zero entry", "unit 0: random interrupt", or
>"machine check".

I suppose a direct reply would have been more appropriate, but with a
path like "...!sri-unix!ben%brandeis@csnet-relay" a mail response
wouldn't stand a snowball's chance in Hell of getting there.

"panic: mba, zero entry" happens under 4.1BSD and 4.2BSD when you read
a mag tape that has a hard read error. It is caused by some brain
damage in mt.c that makes it assume that mba.c knows how to "read
backwards". When mt.c gets the "read opposite" status from the tape
controller, it passes a "read backwards" request to mba.c, along with
the buffer address and buffer size. Since this is "read backwards",
mba.c is supposed to map the pages of the buffer into the mba address
space and then set the initial input address to be the end of the
buffer. Unfortunately, it leaves the starting address unchanged. Tape
input starts at the beginning of the buffer, erases any innocent static
or stack variables in front of the buffer until it reaches the
beginning of the page, then falls off the end of the world. If you are
lucky, your process then aborts with a strange error message resulting
from using the text in those variables as binary numbers.
If you are unlucky, the kernel is deranged and panics when it tries to
use a bent table. As far as I can tell, you get the panic if the input
buffer is smaller than the input block and you get the mangled static
area if the buffer is larger than the input block.

"unit 0: random interrupt" should be "unit 0: non-data transfer error
interrupt, error status = xxxxxx". I changed my mt.c to be something
like that, and found that the error status code is usually 32 (base 8).
My DEC tape controller manual says this means "TM fault B", otherwise
known as "I am broken, please fix me". The error code in the LED
display inside the TM front panel gives further help to the DEC CE that
you call in when this happens.

I intend to fix these problems soon, unless someone posts reasonable
solutions and saves me the trouble. Whether the fixes can escape the
proprietary black hole of AT&T Bell Laboratories is another matter.
--
Griff Smith     AT&T Bell Laboratories, Murray Hill
Phone:          (201) 582-7736
Internet:       ggs@ulysses.uucp
UUCP:           ulysses!ggs
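
The "read backwards" addressing problem described above can be
illustrated with a small user-space model. This is only a sketch, not
the actual mt.c or mba.c code: the names xfer_start and reverse_read
are hypothetical, and it assumes the controller stores each incoming
byte at a descending address during a reverse read. It shows why the
transfer must start at the end of the buffer, and how many bytes would
land in front of the buffer if the start address is left at the
beginning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: offset within the buffer where a transfer
 * should begin.  A forward read starts at offset 0; a reverse read
 * must start at the last byte, because addresses then count down.
 */
static size_t xfer_start(size_t len, int reverse)
{
    assert(len > 0);
    return reverse ? len - 1 : 0;
}

/*
 * Hypothetical model of a reverse read.  The tape delivers the block
 * last byte first, and each byte is stored one address lower than the
 * previous one, beginning at buf[start].  Bytes that would fall below
 * buf[0] are not stored here; they are counted and returned, standing
 * in for the memory in front of the buffer that a real transfer would
 * overwrite.
 */
static size_t reverse_read(unsigned char *buf, size_t start,
                           const unsigned char *block, size_t blen)
{
    size_t underrun = 0;

    for (size_t i = 0; i < blen; i++) {
        unsigned char byte = block[blen - 1 - i];

        if (i <= start)
            buf[start - i] = byte;   /* still inside the buffer */
        else
            underrun++;              /* would clobber memory below buf */
    }
    return underrun;
}
```

With a 4-byte buffer and a 4-byte block, starting at xfer_start(4, 1)
fills the buffer in the original byte order with no underrun. Starting
at offset 0, which models the unchanged starting address, stores one
byte and sends the remaining three in front of the buffer.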