Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!uunet!mcsun!ukc!dcl-cs!aber-cs!athene!pcg
From: pcg@cs.aber.ac.uk (Piercarlo Antonio Grandi)
Newsgroups: comp.arch
Subject: Re: Translating 64-bit addresses
Message-ID:
Date: 11 Mar 91 14:41:47 GMT
References: <6590@hplabsz.HP.COM> <12030@pt.cs.cmu.edu> <6626@hplabsz.HP.COM> <92-9BOB@xds13.ferranti.com>
Sender: aro@aber-cs.UUCP
Organization: Coleg Prifysgol Cymru
Lines: 79
Nntp-Posting-Host: aberdb
In-reply-to: peter@ficc.ferranti.com's message of 10 Mar 91 17:50:11 GMT

On 10 Mar 91 17:50:11 GMT, peter@ficc.ferranti.com (peter da silva) said:

peter> Hasn't the PC/RT been found to have surprisingly poor performance
peter> once the number of context switches involved get too high?

I don't know, but this could be for many other reasons. I remember
having seen hints that the RS/6000 does badly on context switching, but
whether this is due to shared memory simulation or rather to one of a
million probable bogosities in the OS I cannot know.

peter> Once you want to access more than 256K (64K for each of DS, SS,
peter> CS, and ES) you *have* to reload the segment registers. The
peter> machine can *not* directly address more than 64K per segment, and
peter> it only has the 4 segment registers. This is a hard limit unless
peter> you start reloading segment registers... which is sufficiently
peter> expensive to have an exquisitely painful impact on performance.

Maybe you tired of reading my article before its end, but I maintain
that the 286 large model, even in pointer-heavy programs, has at most a
50% average slowdown compared with small model, except for pathological
cases.

Such pathological cases are easy to find for every cache organization,
as you will readily concede. Accessing two arrays that happen to map to
the same cache lines kills almost every machine out there, for one
thing...

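To make the cache pathology concrete, here is a minimal sketch of a
direct-mapped cache being walked by a loop that reads a[i] and then
b[i]. The line size, line count and array placements are assumptions
chosen for illustration, not the parameters of any machine discussed
in this thread.

```python
# Hypothetical direct-mapped cache; all sizes are assumed for illustration.
LINE_SIZE = 16                       # bytes per cache line (assumed)
NUM_LINES = 256                      # number of lines (assumed), so 4 KB total
CACHE_BYTES = LINE_SIZE * NUM_LINES


def count_misses(addresses):
    """Count misses for a sequence of byte addresses in a direct-mapped cache."""
    tags = [None] * NUM_LINES        # tag currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // LINE_SIZE    # memory block containing this address
        index = block % NUM_LINES    # the only line this block may occupy
        tag = block // NUM_LINES     # distinguishes blocks sharing that line
        if tags[index] != tag:
            tags[index] = tag        # evict whatever was there
            misses += 1
    return misses


def interleaved(base_a, base_b, n_bytes, step=4):
    """Addresses touched by a loop reading a[i] then b[i] (4-byte elements)."""
    for off in range(0, n_bytes, step):
        yield base_a + off
        yield base_b + off


N = 1024  # bytes walked in each array

# Arrays exactly one cache size apart: a[i] and b[i] fight for the same line.
conflicting = count_misses(interleaved(0, CACHE_BYTES, N))

# Arrays a further half cache apart: they never share a line.
friendly = count_misses(interleaved(0, CACHE_BYTES + CACHE_BYTES // 2, N))

print("misses, conflicting layout:", conflicting)
print("misses, friendly layout:   ", friendly)
```

In the conflicting layout every single access misses, while in the
friendly layout only the first touch of each line does. The point is
only that every organization has such a worst case, not that these
particular numbers apply to the 286 or the RT.
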
That the shadow register organization of the 286 is misguided I have
been ready to concede, but it should not reflect on a judgement of the
merits of shared memory simulation via remapping for reverse MMUs, or of
the merits of segmented architectures in general.

I also have the impression that you loathe the 286 two-dimensional
addressing scheme so much that you also detest all segmentation schemes,
but the two issues are unrelated. Most paged and segmented VM systems
have linear addressing, e.g. the 370, or the VAX-11, and so on.

peter> Loading a segment register is an expensive operation,

pcg> Around 20-30 cycles if memory serves me right. Compared to a
pcg> context switch it is insignificant.

peter> But it happens so much more often.

Dereferencing a far pointer costs only three times as much as a near
pointer, and not every instruction is a far pointer dereference.

Also, when one does segment remapping, one really twiddles the contents
of a field in the LDT (the page table), not that of the segment
registers, and at most once per context switch (and this does not happen
on most context switches). The cost of reloading a segment register and
the cost of remapping a segment are therefore totally unrelated.

peter> Well, only that in the case you're talking about the cost of
peter> remapping the segments is even higher.

True... But not tragic. Taking a trap, finding out which segment should
be remapped, fiddling the LDT of the process that had the segment
mapped, and remapping it might cost maybe as much as reading a block off
the buffer cache, i.e. a few hundred instructions. I would think that it
is of the order of a page fault (mind you, maybe I was not clear before:
just the CPU cost of a page fault, not the many milliseconds of IO time
possibly associated with it), and less frequent.

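The two cost arguments above can be put into rough numbers. This is
back-of-the-envelope arithmetic only: the factor of three for a far
pointer dereference is the figure quoted above, while the dereference
fraction, remap rate, instructions per remap and instruction rate are
hypothetical workload parameters picked for illustration.

```python
def large_model_slowdown(far_deref_fraction, far_cost_ratio=3.0):
    """Run time of large model relative to small model (1.0 = no slowdown).

    far_deref_fraction: share of small-model run time spent in pointer
        dereferences that become far dereferences (assumed parameter).
    far_cost_ratio: cost of a far dereference relative to a near one;
        3.0 is the figure quoted in the text.
    """
    return (1.0 - far_deref_fraction) + far_deref_fraction * far_cost_ratio


def remap_overhead(remaps_per_second, instructions_per_remap,
                   instructions_per_second):
    """Fraction of CPU time spent in trap-driven segment remapping."""
    return remaps_per_second * instructions_per_remap / instructions_per_second


# A program spending a quarter of its time dereferencing pointers
# (hypothetical) lands exactly on a 50% slowdown.
print(large_model_slowdown(0.25))            # 1.5

# 100 remaps per second at 500 instructions each on a 1 MIPS machine
# (all three numbers hypothetical) costs 5% of the CPU.
print(remap_overhead(100, 500, 1_000_000))   # 0.05
```

Read the other way, a 50% average slowdown corresponds to a program
that would spend a quarter of its small-model time dereferencing
pointers, and the remapping overhead scales linearly with how often the
trap is actually taken.
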
I remember that the BSD VM subsystem, which used a similar technique to
simulate a 'referenced' bit for each page (take a fault and map the page
in), cost less than 5% on a VAX-11/780, and that was for much more
frequent faulting.

People do IPC using pipes or System V MSGs or sockets, which cost far
far far more. On machines like the 286 that can share segments
simultaneously, pure shared memory is OK. On those that cannot, like the
RT, the cost is not excessive, and probably lower than that of most
alternatives.
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@aber.ac.uk