Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!usc!apple!mips!lloyd!cprice
From: cprice@mips.COM (Charlie Price)
Newsgroups: comp.arch
Subject: Re: Translating 64-bit addresses
Message-ID: <46134@mips.mips.COM>
Date: 23 Feb 91 22:31:59 GMT
References: <6590@hplabsz.HP.COM>
Sender: news@mips.COM
Reply-To: cprice@mips.COM (Charlie Price)
Organization: MIPS Computer Systems, Inc
Lines: 73

In article <6590@hplabsz.HP.COM> connors@hplabs.hp.com (Tim Connors) writes:
>Now that we've entered the brave new world of 64 bit (flat) address spaces,
>is it time to revive the old flame wars on address translation mechanisms?
>
>For 32 bit addresses, Motorola's MC68851 uses a "two level" translation
>tree involving 4K pages (12 bits) and two 10 bit indices, one index for
>each level. How could this technique be applied to 64 bit addresses?
>Would more levels be needed? Should the page size be larger?
>
>More interestingly, could the pointers which link one level to the next be
>only 32 bits and thus save on translation table size? This might
>limit the placement of the tables in a machine with more than 4Gbytes of RAM.
>It also requires switching from 64 to 32 bit mode during TLB miss handling.
>
>What about inverted page tables. Would they be any better for 64 bit
>addresses? Does this make life tough for the MACH operating system?
>
>Are 64 bit addresses spaces more likely to be sparse? What affect does that
>have on the translation mechanism?
>
>I can think of alot more questions, but I'll leave it there except to ask
>if anyone from MIPS can tell us how you intend to do address translation on
>the R4000?

Address translation for the R4000 is a lot like the R2000/R3000.
The short answer is that the in-memory page-table arrangement
is entirely up to the OS programmers because it is all done by software.

These processors have a fully-associative on-chip TLB.
During execution, the hardware looks in the TLB.
If the right information is not in the TLB, the processor takes
an exception and *software* refills the TLB.
The software can do whatever it likes.

DETAILS: Hit "N" now if not interested.

The processors do have to give the exception handler enough information
to do the TLB refill, and this information is in the right form
to make a one-level page table especially fast.
If you use a one-level page table, the R3000 user TLB miss
refill routine takes 9 instructions.

The R3000 has a CONTEXT register that looks like:

    -----------------------------------------------
    |  PTE base   |          bad VPN           |..|
    -----------------------------------------------
                                                ^----- 2 bits of 0

The PTE-base part is filled in by the kernel during a context switch.
The bad-VPN field is filled at exception time, by the hardware,
with the Virtual Page Number (VPN) of the failed translation.
The net intended effect is that when VPN NNN gets a translation fault,
the CONTEXT register contains the *kernel address* of the
1-word user page-table entry for page NNN!

On the R2000/R3000 a TLB miss for a user-mode access
(i.e. a user program) vectors to the UTLB miss exception vector
so it can be handled quickly.
This routine is 9 instructions for a 1-word PTE
(add 1 instruction to shift the address left one
for a 2-word-wide PTE like RISC/os uses).
For a kernel-mode TLB miss, the exception sets a fault cause
register and vectors to the common exception vector,
and this takes more effort to sort out
(but also presumably happens a lot less often).

The R4000 takes all non-nested TLB misses through the fast
exception vector.

There is nothing that *requires* anybody to use a one-level
page table.
If you think that memory savings or whatever makes it worth
the cost in CPU cycles to have a more complex TLB refill routine,
then you (the OS hacker) are allowed to make that tradeoff.
-- 
Charlie Price    cprice@mips.mips.com        (408) 720-1700
MIPS Computer Systems / 928 Arques Ave.
/ Sunnyvale, CA 94086-23650
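
To make the CONTEXT trick above concrete, here is a rough C model of it.
This is only a sketch, not the real refill handler (which is a handful
of MIPS instructions running in the kernel). The field widths used here
(11-bit PTE base in bits 31..21, 19-bit bad VPN in bits 20..2, 4K pages)
are assumptions about the R3000 layout, and the function names
context_after_miss() and load_pte() are made up for illustration.
It also assumes the page table base is aligned so that its low 21 bits
are zero, and that the faulting address is a user (kuseg) address.

```c
#include <stdint.h>

/* Assumed R3000-style layout; see the hedges in the text above. */
#define PAGE_SHIFT    12u                          /* 4K pages */
#define BADVPN_BITS   19u                          /* user VPN width */
#define BADVPN_SHIFT  2u                           /* 2 low bits of 0 */
#define VPN_MASK      ((1u << BADVPN_BITS) - 1u)
#define BADVPN_MASK   (VPN_MASK << BADVPN_SHIFT)
#define PTEBASE_MASK  0xFFE00000u                  /* bits 31..21 */

/*
 * Model of what CONTEXT holds after a user TLB miss on 'vaddr':
 * the software-supplied PTE base in the top bits and the
 * hardware-supplied VPN, pre-scaled by 4, below it.
 */
static uint32_t context_after_miss(uint32_t pte_base, uint32_t vaddr)
{
    uint32_t bad_vpn = (vaddr >> PAGE_SHIFT) & VPN_MASK;
    return (pte_base & PTEBASE_MASK) | (bad_vpn << BADVPN_SHIFT);
}

/*
 * Model of the one-level lookup. On the real machine CONTEXT is used
 * directly as the kernel address of the PTE; here 'table' is a host
 * array standing in for the page table, so we recover the index
 * from the bad-VPN field instead of dereferencing CONTEXT itself.
 */
static uint32_t load_pte(const uint32_t *table, uint32_t context)
{
    uint32_t index = (context & BADVPN_MASK) >> BADVPN_SHIFT;
    return table[index];
}
```

With a base of 0xC0000000 and a miss at 0x00403ABC (VPN 0x403), the
model gives CONTEXT = 0xC000100C, which is base + 4 * VPN: the address
of a 1-word PTE with no arithmetic left for the handler to do.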