Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!rutgers!mit-eddie!uw-beaver!rice!sun-spots-request
From: roy@phri.nyu.edu (Roy Smith)
Newsgroups: comp.sys.sun
Subject: Re: 32MB on new Sparc Server?
Keywords: Hardware
Message-ID: <3772@phri.UUCP>
Date: 12 May 89 18:14:34 GMT
References: <8905081818.AA22009@cs.columbia.edu>
Sender: usenet@rice.edu
Organization: Public Health Research Inst. (NY, NY)
Lines: 44
Approved: Sun-Spots@rice.edu
X-Sun-Spots-Digest: Volume 7, Issue 297, message 1 of 18

dupuy@cs.columbia.edu (Alexander Dupuy) writes:
> the ECC subsystems usually give you enough information to pinpoint
> the failing chip

Alex is right; ECC can pinpoint the source of a memory error down to the
specific chip which is failing. With the proper (fairly trivial) logic in
the memory management section of the OS, you could very well print out
error messages saying "replace chip at position N-14 on board 3". Even if
you don't want to be that fancy, it's easy enough to go from address and
syndrome to chip location if you know the way memory is laid out. And
everything you need to know about the chip layout could be contained in
one side of a single sheet of paper.

You don't even really need ECC to do that; the ROM-resident diagnostics in
a 3/50 can isolate memory errors down to a specific address and
bit-within-word just fine without ECC.

But Noooo, Sun won't tell you how to do this. They claim that the details
of the memory layout are company confidential! We had a memory chip go bad
on a 3/50 once. I called Sun up to try and find out how to map from
address/bit to chip location and they wouldn't tell me. After much
fighting with them, we ended up returning the board to them for repair at
a cost of $1300 and it took over a month! If we wanted 3-day turnaround,
it would have been something like $3000, which is about 80% of what we
paid for the whole workstation new.
If they were just willing to pry loose one piece of paper and send it to
me, I could have fixed it in an hour for $10 in parts.

So, what good does it do to have ECC memory do 99% of the job of locating
a bad chip, if Sun won't give you the information to do the critical last
1% of the job yourself?

Now that I'm in a good mood :-), maybe somebody can tell me why I can buy
a Mbyte of 100ns memory for a Mac-II for $160 but a Mbyte of memory for a
Sun-3 costs more like $500, even from a third party like Clearpoint or
Helios? OK, the Mac memory isn't even parity, but surely the real
difference in price can't be that much, or anywhere near it.

Roy Smith, System Administrator
Public Health Research Institute
{allegra,philabs,cmcl2,rutgers,hombre}!phri!roy -or- roy@phri.nyu.edu
"The connector is the network"