Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rutgers!ames!ucbcad!ucbvax!sdcsvax!darrell
From: mash@mips.uucp (John Mashey)
Newsgroups: comp.os.research,mod.os
Subject: Re: Life with TLB and no PT
Message-ID: <3030@sdcsvax.UCSD.EDU>
Date: Thu, 23-Apr-87 00:02:16 EST
Article-I.D.: sdcsvax.3030
Posted: Thu Apr 23 00:02:16 1987
Date-Received: Sat, 25-Apr-87 10:02:09 EST
Sender: darrell@sdcsvax.UCSD.EDU
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 56
Approved: mod-os@sdcsvax.uucp
Xref: mnetor comp.os.research:8 mod.os:148

In article <3027@sdcsvax.UCSD.EDU> stuart@CS.ROCHESTER.EDU (Stuart Friedberg) writes:
>Prodded by something Avie Tevanian of the MACH research group said,
>I have been considering life with a translation lookaside buffer (TLB)
>but without hardware page tables (PT)...
>
>Given a good TLB design which doesn't require frequent flushing of
>entries, it is possible to do without hardware PT entirely. I don't
>know of anybody working on a machine with TLB and no PT, but I wouldn't
>be at all surprised to see one in the next few years......

The following machines, which shipped in 1986 or earlier do exactly this:
	Celerity 12xx (I think)
	HP840 (and other Spectrum RISC machines) (for sure)
	MIPS R2000 (for sure)
(See "Operating Support on a RISC", in Proc. IEEE COMPCON, March 1986,
for rationale behind doing TLBs this way.)
The recently-announced AMD29000 does the same.

>
>Another benefit is reduced hardware/microcode complexity and therefore
>reduced cost....

Yes. Bus interface units are often pretty random. TLB cells are at least
regular.

>
>Is a TLB/no-PT architecture at all feasible? .....

Yes. With 4K pages, 64 fully-associative on-chip TLB entries, an unmapped
hole in the address space in the TLB to greatly lessen the usual
TLB-crunching from the nonlocality of kernels, we usually see 1% of the
user CPU time (or less) in TLB-miss processing.
[Simulations + measurements/counting].
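
[To make the TLB/no-PT idea concrete, here is a minimal sketch in C of a
software-refilled TLB with the parameters quoted above (4K pages, 64
fully-associative entries). It is a toy model, not the R2000 refill code:
the names (soft_map, tlb_refill, translate), the flat mapping array, the
1024-page address space, and the round-robin replacement are all
assumptions made for illustration. The point is only that the "hardware"
path looks at the TLB alone, and the mapping structure is private to the
software miss handler, which also bumps a miss counter.]

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SHIFT  12      /* 4K pages, as quoted in the post */
#define TLB_ENTRIES 64      /* fully associative, as quoted in the post */
#define NPAGES      1024    /* hypothetical size of the toy address space */

typedef struct {
    uint32_t vpn;           /* virtual page number */
    uint32_t pfn;           /* physical frame number */
    int      valid;
} tlb_entry;

tlb_entry     tlb[TLB_ENTRIES];
uint32_t      soft_map[NPAGES];  /* OS-private mapping; never read by "hardware" */
unsigned long tlb_misses;        /* extra counter, bumped once per refill */
unsigned      next_victim;       /* round-robin victim index (an assumption) */

/* Software miss handler: the only code that reads soft_map. */
tlb_entry *tlb_refill(uint32_t vpn)
{
    tlb_entry *e = &tlb[next_victim];
    next_victim = (next_victim + 1) % TLB_ENTRIES;
    e->vpn   = vpn;
    e->pfn   = soft_map[vpn % NPAGES];  /* wrap to stay inside the toy map */
    e->valid = 1;
    tlb_misses++;
    return e;
}

/* "Hardware" path: search every entry, trap to software on a miss. */
uint32_t translate(uint32_t vaddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    uint32_t off = vaddr & ((1u << PAGE_SHIFT) - 1);
    tlb_entry *hit = NULL;

    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            hit = &tlb[i];
            break;
        }
    }
    if (hit == NULL)
        hit = tlb_refill(vpn);
    return (hit->pfn << PAGE_SHIFT) | off;
}
```

[Because the mapping is consulted only in tlb_refill, its layout can be
changed to suit the operating system without touching the lookup path,
and tlb_misses divided by the number of references gives the miss rate
directly.]
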
Once you get it down that far, you stop worrying about it: lots of other
things are MUCH more important for performance.

>
>One thing that is clearly needed for a PT-less TLB system, or even a
>PT-full TLB system with good performance, is a set of context tags
>(process identifiers) so that TLB entries do not need to be flushed on
>every context switch.

We use 6-bit PIDs in our TLB; worst case context-switch overhead for
TLB-flushing = 1 microsecond [on an 8MHz machine] / context switch on the
average, assuming that you have >64 processes, and you round-robin them
without rescheduling. [This isn't the way real systems work: usually you
switch amongst a smaller number of more active processes, even on large
machines.]

Anyway, TLBs refilled by software are pretty easy to measure: you just
put extra counters in there. However, as noted: it stops being an
interesting problem when the performance hit is so low. At least in our
case, 4.3BSD and System V.3 both run [with slightly different refill
code, for convenience] on this, and we've written half a dozen variants
to fit various prospects' existing or preferred environments.
--
-john mashey	DISCLAIMER:
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD: 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086