Path: utzoo!mnetor!uunet!husc6!bloom-beacon!mit-eddie!uw-beaver!microsoft!jangr
From: jangr@microsoft.UUCP (Jan Gray)
Newsgroups: comp.arch
Subject: Re: The WM Machine
Message-ID: <1439@microsoft.UUCP>
Date: 5 May 88 17:12:35 GMT
References: <5339@aw.sei.cmu.edu>
Reply-To: jangr@microsoft.UUCP (Jan Gray)
Organization: Microsoft Corporation, Redmond, WA
Lines: 77

In article <5339@aw.sei.cmu.edu> firth@sei.cmu.edu (Robert Firth) writes:
>
>Address Modes
>-------------
>

For the WM computation O1 op1 (O2 op2 O3), assume that we have:

two inner operators:
    1lit10 ::= ((O2 << 5) | O3 | (1 << 10))
    0lit10 ::= ((O2 << 5) | O3)

two outer operators:
    lli ::= Rd[15-0]  <- (O1 << 11) | (O2 op2 O3)    ("load lower immediate")
    lui ::= Rd[31-16] <- (O1 << 11) | (O2 op2 O3)    ("load upper immediate")

(I know that instructions are actually of the form
    Rd := O1 op1 (O2 op2 O3)
so that "lui" isn't *strictly* legal given the description he presents,
but also assume Wulf isn't telling us all the little details...a safe
assumption!)

>Consider first, how one might access a simple static variable, at a given
>(32-bit absolute) address.  Clearly, one cannot say the equivalent of
>
>    Load @16#abcdefgh#
>
>since the machine does not permit an absolute address as an operand.

If you allow my assumption above, you can load a 32 bit constant into a
register using lli and lui.  This still takes two cycles but saves you a
read from the literal pool.  You still have to issue a load on this
address...

>Alternatively, one could load the address into a register, using a load
>literal instruction (as is done on the M/500).  But there is no such
>instruction.

No instruction as such, but it can be fabricated from the appropriate
operators...

>As another example, consider accessing a local variable.  If it is
>within 32 bytes of the frame pointer, we are lucky:
>
>    (fp) + (displacement noop noreg)
>
>will serve.
>If it is, say, at local address 40, then we might say
>
>    (fp) + (8 + (rx))
>
>where rx holds the useful constant 32, and at the price of an extra
>addition (and surely an extra cycle) we can now encode a six-bit
>displacement.  (Actually, it would be marginally better to reverse the
>registers).  But how are we to load 32 into rx?  Perhaps by

With the 1lit10 and 0lit10 operations above, you can access 2K bytes of
the frame:

    (fp) + (#10101 0lit10 #010101)

>Finally, any implementation of this design will be faced with serious
>problems in building correct synchronous traps, efficient task context
>switches, and transparent memory management.

That's for sure!

According to the paper, Wulf has been simulating this design on "a fairly
large set of benchmark fragments".  He says this architecture can do about
4 RISC-like instructions per cycle.  I believe him.  What I'm less sure of
is whether there is a memory architecture that will keep this machine
running.  Hopefully someone will implement this architecture (or simulate a
real-world memory architecture for it) and we'll all know for sure.

These are my opinions, but not necessarily my employer's.

Jan Gray  uunet!microsoft!jangr  Microsoft Corp., Redmond Wash.  206-882-8080
------------------------------
"Application for patent protection has been made for portions of the
material described here".  Right.
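
For anyone who wants to check the bit arithmetic, here is a rough Python
sketch of the 0lit10/1lit10 and lli/lui operators as defined in this post.
It is only a model of the proposal, not anything taken from the WM paper.
The assumptions are mine: the literal operand fields O1, O2 and O3 are 5
bits wide, lli and lui leave the other half of Rd untouched, and the helper
names (lit10, half, split_half, load_constant) are made up for illustration.

```python
FIELD_MASK = 0x1F  # assumption: each literal operand field is 5 bits wide


def lit10(o2, o3, high_bit):
    """Inner operator: 0lit10 when high_bit is 0, 1lit10 when it is 1."""
    return ((o2 & FIELD_MASK) << 5) | (o3 & FIELD_MASK) | (high_bit << 10)


def half(o1, o2, o3, high_bit):
    """The 16-bit value (O1 << 11) | (O2 op2 O3) used by lli and lui."""
    return ((o1 & FIELD_MASK) << 11) | lit10(o2, o3, high_bit)


def lli(rd, o1, o2, o3, high_bit):
    """Load lower immediate: replace Rd[15-0] (assumed to keep Rd[31-16])."""
    return (rd & 0xFFFF0000) | half(o1, o2, o3, high_bit)


def lui(rd, o1, o2, o3, high_bit):
    """Load upper immediate: replace Rd[31-16] (assumed to keep Rd[15-0])."""
    return (rd & 0x0000FFFF) | (half(o1, o2, o3, high_bit) << 16)


def split_half(value16):
    """Split a 16-bit value into the fields (o1, o2, o3, high_bit)."""
    return (
        (value16 >> 11) & FIELD_MASK,  # O1: bits 15..11
        (value16 >> 5) & FIELD_MASK,   # O2: bits 9..5
        value16 & FIELD_MASK,          # O3: bits 4..0
        (value16 >> 10) & 1,           # bit 10 selects 1lit10 vs 0lit10
    )


def load_constant(value32):
    """Build a 32-bit constant with one lli followed by one lui."""
    rd = 0
    rd = lli(rd, *split_half(value32 & 0xFFFF))
    rd = lui(rd, *split_half((value32 >> 16) & 0xFFFF))
    return rd
```

Under these assumptions, load_constant(0xABCDEF01) rebuilds the constant in
two steps, and lit10(0x1F, 0x1F, 1) gives 2047, the largest frame offset the
two inner operators can encode, which is where the 2K figure comes from.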