Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!uwm.edu!bionet!agate!ucbvax!nrc.CA!Claude.P.Cantin
From: Claude.P.Cantin@nrc.CA
Newsgroups: comp.sys.sgi
Subject: disk timeout, SCSI reset...
Message-ID: <9104051754.aa05599@VMB.BRL.MIL>
Date: 5 Apr 91 22:56:00 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 101

PROBLEM SUMMARY:
---------------

WREN VI hard drive as "dks0d2".

sc0,2,0: timeout after 30 sec. Resetting SCSI BUS
dksc0d2s7: retrying request
dksc0d2s7: retrying request
dksc0d2s7: retrying request

Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: dks0d2s7: retrying request
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: sc0,2,0: timeout after 30 sec. Res

bru "filename" warning - X block checksum error
bru "filename2" warning - block sequence error
bru "filename3" warning - file synchronization error - attempting recovery.

BACKGROUND LEADING TO PROBLEM:
-----------------------------

We received a Seagate Wren VI drive from PARITY systems, already
formatted and partitioned for our Personal IRIS (4D/35). It was even
set up to be used as disk "2" (dks0d2).

I installed it on the PI and used "fx" to be sure it was partitioned.
It is partitioned the same way SGI partitions theirs (16 MB for root,
50 for swap, and the rest for "/usr").

Since partition 7 represents the entire area ("/" + swap + "/usr"), I used

        mkfs /dev/dsk/dks0d2s7

That worked fine. I then issued

        ln /dev/dsk/dks0d2s7 /dev/usr2

and the same for the raw device.

        mount /dev/usr2 /usr2

was also successful, as was writing small files to the disk.

I then wanted to move all users from /usr/people to /usr2/people.
For some reason, I felt like using bru, so I issued

        cd /usr
        bru -cvf /usr2/bru.dat people

This worked fine and created a file of 77 MB.

        cd /usr2
        bru -xvf bru.dat

started normally, BUT at several points (at 13, 21, 23, 30.6, 30.8,
32.3, 38.4, 47.5, 59, 65.3, 75.9 Megabytes), the extraction from the
file stopped; then, when it started again, the messages

sc0,2,0: timeout after 30 sec. Resetting SCSI BUS
dksc0d2s7: retrying request
dksc0d2s7: retrying request
dksc0d2s7: retrying request

would appear on the console.

The /usr/adm/SYSLOG file (the last 20 lines) looks like:

Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: dks0d2s7: retrying request
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: dks0d2s7: retrying request
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: sc0,2,0: timeout after 30 sec. Res
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: dks0d2s7: retrying request
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: sc0,2,0: timeout after 30 sec. Res
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: dks0d2s7: retrying request
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: sc0,2,0: timeout after 30 sec. Res
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: dks0d2s7: retrying request
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: sc0,2,0: timeout after 30 sec. Res
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: dks0d2s7: retrying request

ALSO, as bru was extracting from the file "bru.dat", the following
messages would show up at irregular intervals (and not necessarily
in the following order):

bru "filename" warning - X block checksum error
bru "filename2" warning - block sequence error
bru "filename3" warning - file synchronization error - attempting recovery.

*** The same SCSI timeout errors happened when I used ***
***       cp -r /usr/people/* /usr2/people/           ***

The hard disk is "auto-terminating", so it does not need a SCSI
terminator.

That system is on one of our satellite campuses, so it's hard to keep
carrying equipment back and forth (i.e. carry the disk here and try it
on our own PI, or come back here and get another cable, or terminator,
or anything else...)

What is wrong?? Does anyone have any clue?? Any suggestions??
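
For anyone who wants to gauge how often the resets occur relative to
the retries, here is a minimal sketch in Python (not something that was
run on the PI) that tallies SYSLOG messages by device and message text.
The sample lines are copied from the excerpt above; the function name
"tally" and the assumption that every relevant line has the form
"CIO: <device>: <message>" are illustrative only.

```python
import re
from collections import Counter

# Sample lines copied from the SYSLOG excerpt in this post.
SAMPLE = """\
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: dks0d2s7: retrying request
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: sc0,2,0: timeout after 30 sec. Res
Apr 5 16:27:24 nrcbs3 grcond[471]: CIO: dks0d2s7: retrying request
"""

# Assumed layout: the text after "CIO: " is "<device>: <message>".
LINE_RE = re.compile(r"CIO: (?P<device>[^:]+): (?P<message>.+)$")


def tally(text):
    """Count (device, message) pairs found in SYSLOG-style text."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            key = (match.group("device"), match.group("message").strip())
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Print the most frequent messages first.
    for (device, message), n in tally(SAMPLE).most_common():
        print(f"{n:3d}  {device}: {message}")
```

Fed the whole of /usr/adm/SYSLOG instead of the sample, the same tally
would show whether every timeout is reported against sc0,2,0 or whether
other devices on the bus are affected as well.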

Thank you for your suggestions,

Claude Cantin
National Research Council