Avax135.329
net.unix-wizards
utzoo!decvax!ucbvax!ihnss!houxi!houxm!houxg!lime!vax135!jfr
Wed Mar 10 11:13:37 1982
VAX-11/750 bugs
re: VAX-11/750 bugs

The VAX-11/750 has a long (and continuing) history of bugs in memory
management vs. the CALLS instruction. I have an operating system for
the VAX-11 that features demand paging and the equivalent of TENEX PMAP.
The system has been running on the VAX-11/780 for about 2 years. Due to
microcode/hardware bugs, I have had an extremely difficult time getting
the system to run on the VAX-11/750. Even with the latest release (my
SID registers say 02005E03), I need a heuristic software patch to
correct a microcode error. An exchange of messages with Bill Munson in
February 1982 leaves me with the distinct impression that DEC is not
interested in doing anything effective to fix the bugs until July 1983,
if ever. Here is a synopsis:

August 1980: My operating system won't run on Greg Chesson's beta-test
comet, microcode version <50. Dennis Ritchie determines that
"CALLS $0,..." with a write-protected stack reports the faulting address
as the contents of PC rather than the contents of SP. A new set of ROMs
fixes the problem on Greg's machine, but the general fix is not promised
until level 62, and many level-less-than-62 machines are shipped from
the factory to customers.

July 1981: I get two 11/750s, level 62. The fault address is now
correct, but the fault type word doesn't always say "write or modify
intent"; usually it's zero ("read"). After a week of intensive
debugging, I produce a standalone program which gives pure garbage for
the fault type parameter, and (7/30/81) send the program to Peter Jessel
and Armando Stettner. Response is "Yeah, there's a problem, we'll look
into it."

August 28, 1981: I send a followup message requesting a schedule for the
fix. No response.

Fall 1981: Jessel leaves DEC; problem languishes. Same bad behavior
occurs on other level-62 750s.
Meanwhile I find a heuristic which detects and patches around the error.
The heuristic has not failed yet, but we have a very light load on the
750s.

January 1982: My 11/750s are upgraded to level 94. The standalone
program still bombs in the same way, except that the system ID register
says 02005E03.

Here is a copy of the message I sent on July 30, 1981:

*****************************************************************
re: 11/750 CALLS on write-protected stack

I am having trouble with the memory management on a VAX-11/750. The
fault parameter word for an access control violation does not always
have bits set as described in the VAX Hardware Handbook (1980-81 p.76
Fig. 4-17). In particular, a CALLS instruction in user mode with zero
parameters and with the stack valid but write-protected, sometimes
results in a parameter word of 0 instead of 4. I include a console
transcript with appropriate registers and memory locations examined.
I also include a standalone program which can be deposited and run,
producing a different and even more horrible fault parameter word.

John F. Reiser
Bell Laboratories 4F-635
Holmdel, NJ 07733
(201) 949-3942
vax135!jfr

===================================================================
Console transcript which gets fault parameter word of 0
for CALLS on readonly stack
-------------------------------------------------------------------
>>>B/1
%%
*unix.uerr2
real mem = 1048576
free mem = 896000
# cat /etc/rc
date >>/dev/console
rm -f /etc/mtab
/etc/mount /dev/rp0h /usr
/usr/lib/ex3.6preserve -a
cd /tmp
rm -f *
cd /
rm -f /usr/spool/uucp/STST.* /usr/spool/uucp/LCK.*
rm -f /usr/spool/lpd/lock
/etc/update&
/etc/cron&
/etc/dzkload >>/dev/console
#                               ;; typed to enter multiuser mode
        80000CCE 06             ;; the fault in question
>>>E P
        00C00004                ;; kernel mode, kernel stack
>>>E/G E
        G 0000000E 7FFFFFF0
>>>E/V 7FFFFFF0                 ;; the fault parameter words
        P 0002FDF0 00000000     ;; should be 00000004
>>>E
        P 0002FDF4 7FFFF510     ;; faulting address
>>>E
        P 0002FDF8 00003453     ;; pc
>>>E
        P 0002FDFC 03C00004     ;; psl
>>>E/I 3E                       ;; SID register
        I 0000003E 02003EFF     ;; 11/750, level 62 microcode
>>>E/I 11                       ;; SCBB
        I 00000011 00000200
>>>E/P 220                      ;; access control violation vector
        P 00000220 80000CC8
>>>E/V 80000CC8                 ;; the fault handler code itself
        P 00000CC8 126E00D1     ;;      CMPL $0,(SP)
>>>E                            ;;      BNEQ 1$
        P 00000CCC 1AE10001     ;;      HALT
>>>E                            ;;1$:
        P 00000CD0 00010CAE
>>>E
        P 00000CD4 AED03FBB
>>>E/I 8                        ;; current mapping registers
        I 00000008 8001FE00     ;; P0BR
>>>E
        I 00000009 00000025     ;; P0LR
>>>E
        I 0000000A 7F820000     ;; P1BR
>>>E
        I 0000000B 001FFFF7     ;; P1LR
>>>E/V 8001FFE0                 ;; page table for end of P1
        P 0002C5E0 20000000     ;; 7ffff000
>>>E
        P 0002C5E4 20000000
>>>E
        P 0002C5E8 FD00015F     ;; 7ffff400
>>>E
        P 0002C5EC FD00017D
>>>E
        P 0002C5F0 E4000181     ;; 7ffff800
>>>E
        P 0002C5F4 E4000180
>>>E
        P 0002C5F8 E000017F
>>>E
        P 0002C5FC E400017E
>>>E/V 7FFFF510                 ;; the faulting address
        P 0002BF10 20000000
>>>E
        P 0002BF14 00000000
>>>E
        P 0002BF18 20000000
>>>E
        P 0002BF1C 7FFFF584
>>>E
        P 0002BF20 7FFFF55C
>>>E/V 3453                     ;; code which caused the fault
        P 0002F853 48CF00FB     ;; CALLS $0,^W...(pc)
>>>E
        P 0002F857 CF00FBF2
>>>E/I 3                        ;; USP
        I 00000003 7FFFF52C     ;; same page as fault address
>>>

=====================================================================
Standalone program for producing bad fault parameter word
---------------------------------------------------------------------
#
#       page    contents
#       0       this program
#       1       SCB
#       2       SCB UNIBUS extension
#       3       HALTs
#
        .set    PCBB,0x10
        .set    SCBB,0x11
        .set    SBR,0x0c
        .set    SLR,0x0d
        .set    MAPEN,0x38
        .set    TBIA,0x39

# p.3 is HALTs
        movc5   $0,(r0),$0,$0x200,*$0x600
# SCB on p.1
        movab   *$0x200,r0
        mtpr    r0,$SCBB
# vectors 000 through 0fc halt at same offset on p.3
        movl    $0x100/4,r2
L100:
        movab   0x80000400(r0),(r0)+
        sobgtr  r2,L100
# vectors 100 through 3fc rei
        movl    $(0x400-0x100)/4,r2
L200:
        movl    $0x80000000+_rei,(r0)+
        sobgtr  r2,L200
        nop
        jmp     *$0x80000000+ready
ready:
        movl    $0x80000000+istack,sp
        mtpr    $pcb,$PCBB
        mtpr    $sbr,$SBR
        mtpr    $4,$SLR
        mtpr    $1,$TBIA
        mtpr    $1,$MAPEN
        ldpctx
        rei
foo:
        .word   0
        calls   $0,foo
        halt
        .align  2
_rei:
        rei
        .align  2
sbr:
        .long   0x90000000      # V KW page0
        .long   0x90000001      # V KW page1
        .long   0x90000002      # V KW page2
        .long   0x90000003      # V KW page3
p0br:
        .long   0xf8000000      # V UR page0
pcb:
        .long   0x80000000+kstack,-1,-1,ustack
        .long   0,0,0,0,0,0,0,0,0,0,0,0,0,0     # r0 through r13(fp)
        .long   foo+2,0x03c00000                # pc, psl
        .long   0x80000000+p0br, 0x04000001     # P0
        .long   0x7f800000+p0br+4,0x001fffff    # P1 ontop of P0
        .long   0,0,0,0
istack:
        .long   0,0,0,0
kstack:
        .long   0,0,0,0,0,0,0
ustack:

-----------------------------------------------------------------------
Execution of above program

>>>I
>>>D/P/L 0 60002C
>>>D + 9F02008F
>>>D + 600
>>>D + 2009F9E
>>>D + DA500000
>>>D + 8FD01150
>>>D + 40
>>>D + E09E52
>>>D + 80800004
>>>D + D0F652F5
>>>D + C08F
>>>D + 8FD05200
>>>D + 8000006C
>>>D + F652F580
>>>D + 3F9F1701
>>>D + D0800000
>>>D + F48F
>>>D + 8FDA5E80
>>>D + 84
>>>D + 708FDA10
>>>D + C000000
>>>D + DA0D04DA
>>>D + 1DA3901
>>>D + 20638
>>>D + EF00FB00
>>>D + FFFFFFF7
>>>D + 0
>>>D + 2
>>>D + 90000000
>>>D + 90000001
>>>D + 90000002
>>>D + 90000003
>>>D + F8000000
>>>D + 80000104
>>>D + FFFFFFFF
>>>D + FFFFFFFF
>>>D + 120
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 61
>>>D + 3C00000
>>>D + 80000080
>>>D + 4000001
>>>D + 7F800084
>>>D + 1FFFFF
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>
>>>E P
        041F0000
>>>S 0
        80000621 06
>>>E P
        00C00000
>>>E/G E
        G 0000000E 800000F4
>>>E/V 800000F4
        P 000000F4 00800010     ;; pure garbage
>>>E
        P 000000F8 00000108
>>>E
        P 000000FC 00000061
>>>E
        P 00000100 03C00000
>>>E/I 3E
        I 0000003E 02003EFF
>>>E/I 11
        I 00000011 00000200
>>>E/P 200
        P 00000200 80000600
>>>E
        P 00000204 80000604
>>>E
        P 00000208 80000608
>>>E
        P 0000020C 8000060C
>>>E
        P 00000210 80000610
>>>E
        P 00000214 80000614
>>>E
        P 00000218 80000618
>>>E
        P 0000021C 8000061C
>>>E
        P 00000220 80000620
>>>E
        P 00000224 80000624
>>>E/V 80000620
        P 00000620 00000000
>>>E/I 8
        I 00000008 80000080
>>>E
        I 00000009 00000001
>>>E
        I 0000000A 7F800084
>>>E
        I 0000000B 001FFFFF
>>>E/V 80000080
        P 00000080 F8000000
>>>E
        P 00000084 80000104
>>>E/V 108
        P 00000108 00000000
>>>
-----------------------------------------------------------------------

If the user-mode stack pointer in the assembled PCB above is changed
to 0x80000000 and the program is run, I get a correct fault parameter
word of 00000004.
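
As a cross-check on the values quoted in the transcripts, here is a
minimal C sketch that classifies a fault parameter word and pulls the
microcode level out of a SID value. It assumes the usual VAX layout for
the access control violation parameter (bit 0 length violation, bit 1
PTE reference, bit 2 write or modify intent), and it assumes the
microcode level sits in SID bits 15:8, which agrees with both SID values
quoted above (3E hex is 62, 5E hex is 94). The macro and function names
are made up for illustration and do not come from any system header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout of the access control violation fault parameter. */
#define FP_LENGTH  0x1u   /* bit 0: length violation */
#define FP_PTEREF  0x2u   /* bit 1: fault was on a page table entry reference */
#define FP_WRITE   0x4u   /* bit 2: write or modify intent */
#define FP_DEFINED (FP_LENGTH | FP_PTEREF | FP_WRITE)

/* Nonzero if no bits outside the three defined ones are set. */
static int fp_plausible(uint32_t fp)
{
    return (fp & ~FP_DEFINED) == 0;
}

/* Nonzero if the word reports write or modify intent. */
static int fp_write_intent(uint32_t fp)
{
    return (fp & FP_WRITE) != 0;
}

/* Microcode level, assuming it is held in SID bits 15:8. */
static unsigned sid_ucode_level(uint32_t sid)
{
    return (unsigned)((sid >> 8) & 0xFFu);
}

/* High byte of SID; it is 2 in both values quoted in the post. */
static unsigned sid_cpu_type(uint32_t sid)
{
    return (unsigned)(sid >> 24);
}

/* Apply the helpers to the numbers that appear in the transcripts. */
static void check_quoted_values(void)
{
    /* The expected word: well formed, write intent set. */
    assert(fp_plausible(0x00000004u) && fp_write_intent(0x00000004u));
    /* Seen under UNIX: well formed, but claims a read. */
    assert(fp_plausible(0x00000000u) && !fp_write_intent(0x00000000u));
    /* Seen from the standalone program: undefined bits set. */
    assert(!fp_plausible(0x00800010u));
    assert(sid_ucode_level(0x02003EFFu) == 62u);
    assert(sid_ucode_level(0x02005E03u) == 94u);
    assert(sid_cpu_type(0x02003EFFu) == 2u);
}
```

Calling check_quoted_values() from a test harness passes silently. The
two bad words fail in different ways: 00000000 is a legal value that
merely reports the wrong intent, whereas 00800010 has bits set outside
the defined field.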