Path: utzoo!utgpu!watserv1!watmath!att!rutgers!uwm.edu!rpi!dali.cs.montana.edu!uakari.primate.wisc.edu!sdd.hp.com!cs.utexas.edu!rice!sun-spots-request
From: fuat@cunixf.cc.columbia.edu (Fuat C. Baran)
Newsgroups: comp.sys.sun
Subject: SunOS 4.1 multi-user dump causes crashes
Keywords: SunOS
Message-ID: <8486@brazos.Rice.edu>
Date: 1 Jun 90 22:02:53 GMT
Sender: root@rice.edu
Organization: Sun-Spots
Lines: 74
Approved: Sun-Spots@rice.edu
X-Sun-Spots-Digest: Volume 9, Issue 199, message 3

Last weekend we upgraded our Sun-4/280's from SunOS 4.0.1 to SunOS 4.1.
Since then they have been crashing (panic: writeback error) every time
we try to back up the disks in multi-user mode using dump (our usual
procedure for daily and weekly backups). Backups are done to a 1/2 inch
tape drive on a Xylogics 472 tape controller. The systems crash with the
GENERIC kernel as well as a custom config'ed kernel.

Hardware configuration is 4/280 with rev 26 CPU's (PROM 3.0) (as well as
a rev 22 CPU with PROM 2.8.4 and a rev 14 CPU with PROM 1.7), three
16 MB memory boards, one ALM-II, one Xylogics 472 tape controller with
one tape drive, one Xylogics 450/451(?) controller, 2 Hitachi DK815-10
drives.

Most of the time the system hangs after the panic, though once we were
able to get a core dump. Output on the console at crash time is
(addresses vary slightly):

Memory Error Register 1d4
DVMA=1, context=0, virtual address=fff3cfc0
pme=0, physical address=fc0
panic: writeback error
syncing file system...
{at this point it hangs and we have to reset from the cpu board, though
in one of the 20 or so crashes it saved a core image}

stack backtrace of the vmcore file shows:

_panic(0xf80d1272,0x0,0x1bdc,0xfff3fbdc,0x0,0xf80bcf20) + 6c
_ecc_error(0xffff6004,0xf80a3120,0xc000,0xf80e86f0,0x0,0xf80d1272) + 1c4
_memerr(0x0,0x0,0xffff8000,0x1f0,0xc0,0xd4) + 80
memory_err(?)
_splx(0xf817fc74,0xff005f74,0xff005f74,0x0,0x1,0x64c000) + 14
_hat_pagesync(?)
_page_sortadd(0xf81c4d84,0xf817fc9c,0x80,0x0,0x566000,0xf817fbd4) + 1c8
_pvn_getdirty(0xf817fc9c,0xf81c4d84,0x0,0x12000,0x566000,0xff005f74) + 29c
_pvn_vplist_dirty(0xff005f74,0x0,0x100,0x0,0xf817fcc4,0xf817fc9c) + 110
_spec_putpage(0xff005f74,0x0,0x0,0x100,0x0,0xf8128348) + 1dc
_spec_sync(0x0,0xf80cab90,0xf80cb850,0xf80de9d8,0xff005f70,0xff0fd234) + 98
_sync(0xf81c4fe0,0x120,0xf80c85f8,0xf80c8718,0xf81c5000,0xf80cab48) + 3c
_syscall(0xf81c5000) + 3b4

Since our summer semester started on Tuesday, we haven't had the
opportunity to do exhaustive tests such as single-user vs. multi-user,
tar vs. dump, remote dumps, etc., though we have used rdump on our
Encore Multimax systems to back them up onto the Sun tape drives
successfully.

Sun software support is currently "working on it". We made enough of a
fuss so they have given it "high priority". The first response I got was
"All dumps have to be done single-user, and multi-user dumps are not
supported. If you want, we can design a custom program to do it, though
you'll have to contract us to develop it," though they retracted this
when I asked for that statement in writing. Since it crashes the OS, it
is a bug regardless of what is and isn't supported in the application,
and they have finally begun to look into it. So far they haven't gotten
back to me with an analysis, a fix, or an estimate of how long either
will take.

Does anyone else have a similarly configured system running SunOS 4.1?
Can you do backups with the system multi-user? Does anyone have any
ideas as to what the problem is?

We currently are forced to take the systems standalone to do backups.
Needless to say, these machines are in constant use 24 hours a day by
students working on homework, and they don't appreciate a 2-3 hour
interruption of service for backups, no matter what time of day or
night we schedule it for.
One other alternative is to give up and downgrade to SunOS 4.0.1, which
for the most part worked (ignoring such things as NFS bugs, VNODE hangs,
etc.)...

Any help, suggestions, or reports of similar occurrences would be
appreciated.

Internet: fuat@columbia.edu                    U.S. MAIL: Columbia University
BITNET:   fuat@cunixf                          Center for Computing Activities
UUCP:     ...!rutgers!columbia!cunixf!fuat     712 Watson Labs, 612 W115th St.
Phone:    (212) 854-5128  Fax: (212) 662-6442  New York, NY 10025