Path: utzoo!attcan!uunet!aplcen!samsung!cs.utexas.edu!sun-barr!newstop!sun!joe!petolino
From: petolino@joe.Sun.COM (Joe Petolino)
Newsgroups: comp.arch
Subject: Re: 370 Operand Alignment and Page Faults
Message-ID: <131098@sun.Eng.Sun.COM>
Date: 1 Feb 90 21:12:15 GMT
References: <9001270059.AA26776@ucbvax.Berkeley.EDU> <49365@sgi.sgi.com>
Sender: news@sun.Eng.Sun.COM
Reply-To: petolino@sun.UUCP (Joe Petolino)
Organization: Sun Microsystems, Mountain View
Lines: 48

>+---------------
>| In article <35102@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>| > Consider the following instruction :
>| >   load.word x,0(x)
>| > where x happens to point across a page boundary....
>| In the 370 analogy to the load.word example above, the instruction that
>| had an unaligned operand crossing a page boundary with both pages faulting
>| (let's make it hard :-) will get started 3 times. The first 2 times it
>| will not finish because the prefetching of the operands will fail, so
>| the operating system will have to restart it twice, but the third time
>| it will finally complete (in a couple of cycles)...
>+---------------
>
>On a related issue, my understanding was (from idle conversation with
>some IBM guys several years ago) that the 370 architecture needs at
>least 8 elements in its TLB, and that the TLB must be *at least* 4-way
>set-associative. The reason is some instruction which copies a source to
>a destination while referring to a table (some version of Translate-And-Test
>maybe?). Anyway, the idea was that if the instruction, the source, the
>destination, and the table entry being used all span pages, then you need
>at least 8 valid entries in the TLB to make progress on the instruction.
>
>(Note: I didn't say to *finish* the instruction, I said "make progress".
>If the count were larger than a page size you could hit the same problems
>at the next page boundary.
>But after handling potential TLB faults [which
>could cause page faults] you would eventually begin to make progress again.)
>
>Has anyone got a more exact reference with the details of the "worst case"?
>What *is* the absolute minimum size of TLB a 370 or a 30xx needs? How big
>are the actual TLBs in real machines?

I don't know the exact answer to all of these questions, but I can give a
counterexample to the 4-way-set-associative requirement. All of the Amdahl
machines that I'm familiar with use 2-way-associative TLBs, and have no
trouble maintaining 370 binary compatibility. I think those IBM guys were
confusing the requirements of one particular implementation with the
requirements of the architecture.

As to the total TLB size, the number 256 sticks in my mind. In practice, this
is probably more a function of the available RAMs than anything else.

On an unrelated but interesting note, you'll notice that I said
'2-way-associative', not '2-way-set-associative'. In some of these designs,
the two entries being looked at are chosen by *different* functions of the
virtual address. The reasoning behind this is left as an exercise for the
reader :-) .

-Joe
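
For readers who want to try the exercise, here is a minimal Python sketch of a 2-way TLB whose two ways are indexed by different functions of the virtual page number. Everything in it is an assumption made for illustration: the post does not say which functions the Amdahl designs used, and the 2 x 128 organization, the XOR-folded second index, the replacement policy, and the names `index_way0`, `index_way1`, and `TwoWayTLB` are all hypothetical. It shows one possible benefit only: three pages that collide in a conventional 2-way set-associative arrangement can all stay resident when the second way uses a different index.

```python
SETS = 128  # assumed: 2 ways x 128 entries (256 total); not stated in the post


def index_way0(vpn):
    # Conventional index: low-order bits of the virtual page number.
    return vpn % SETS


def index_way1(vpn):
    # Hypothetical second index: fold the next-higher bits in with XOR.
    return (vpn ^ (vpn >> 7)) % SETS


class TwoWayTLB:
    """Toy 2-way TLB; each way may use its own index function."""

    def __init__(self, index_fns=(index_way0, index_way1)):
        self.index_fns = index_fns
        self.ways = [[None] * SETS, [None] * SETS]

    def lookup(self, vpn):
        # Probe one candidate entry per way; return the frame number or None.
        for way, fn in zip(self.ways, self.index_fns):
            entry = way[fn(vpn)]
            if entry is not None and entry[0] == vpn:
                return entry[1]
        return None

    def insert(self, vpn, pfn):
        slots = [(way, fn(vpn)) for way, fn in zip(self.ways, self.index_fns)]
        # Reuse a slot that already holds this page.
        for way, i in slots:
            if way[i] is not None and way[i][0] == vpn:
                way[i] = (vpn, pfn)
                return
        # Otherwise take the first empty candidate slot.
        for way, i in slots:
            if way[i] is None:
                way[i] = (vpn, pfn)
                return
        # Both candidates busy: evict from way 0 (arbitrary policy for the sketch).
        way, i = slots[0]
        way[i] = (vpn, pfn)


# Three pages with the same low-order index bits.
pages = (5, 5 + SETS, 5 + 2 * SETS)

# Same index function for both ways: ordinary 2-way set-associative.
set_assoc = TwoWayTLB(index_fns=(index_way0, index_way0))
# Different index function per way.
two_fn = TwoWayTLB()

for vpn in pages:
    set_assoc.insert(vpn, vpn + 1000)
    two_fn.insert(vpn, vpn + 1000)

set_assoc_resident = [vpn for vpn in pages if set_assoc.lookup(vpn) is not None]
two_fn_resident = [vpn for vpn in pages if two_fn.lookup(vpn) is not None]
print(len(set_assoc_resident), len(two_fn_resident))
```

With the same index function in both ways, the third insert has to evict one of the first two pages. With the second function, the three pages land in different entries of way 1, so none is displaced. Whether this was the designers' actual motivation is not something the post confirms.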