Path: utzoo!attcan!uunet!ogicse!ucsd!usc!zaphod.mps.ohio-state.edu!rpi!uupsi!cmcl2!lanl!lambda!lambda.lanl.gov!scp
From: scp@acl.lanl.gov (Stephen C. Pope)
Newsgroups: comp.sys.celerity
Subject: Re: Model 500 Scalar Processor Instructions
Message-ID: <SCP.90Feb22120525@blanche.acl.lanl.gov>
Date: 22 Feb 90 19:05:25 GMT
References: <SCP.90Feb16092214@blanche.LANL> <GOV> >
	<1914@ncrcce.StPaul.NCR.COM> <6994@celit.fps.com> <6997@celit.fps.com>
Sender: news@lambda.UUCP
Reply-To: scp@acl.lanl.gov
Organization: Advanced Computing Lab, LANL, NM
Lines: 54
In-reply-to: ps@fps.com's message of 21 Feb 90 16:31:37 GMT


on 21 Feb 90 16:31:37 GMT,
ps@fps.com (Patricia Shanahan) said:

[...]

Patricia> If you are doing a gcc port, would you implement a new assembler or use
Patricia> the existing one?

As I mentioned the gcc port, I should make it clear that I haven't actually
tried to do this.  It's not clear to me how interested I am in paleontology!
The main gripe is that it'll be a while before FPS delivers a cc/cpp/ccpass1
that uses dynamically allocated tables instead of fixed-size ones, and I
don't know whether we'll ever see an ANSI compiler that does voids and enums
right.  For me, the main reason to do a port would be to get g++ going.

Patricia> If you implement a new assembler, there is another set
Patricia> of information that you need on some non-interlocked pipeline conflicts.
Patricia> The assembler inserts nop (actually "tw r0,r0") when not optimizing, or
Patricia> re-arranges code when optimizing, to prevent some restricted sequences from
Patricia> occurring. For example, on a Celerity (but not a Model 500) modifying the data
Patricia> register of a store on the immediately following cycle may give undefined
Patricia> results if the store page faults.

The situation is in some ways analogous to the MIPS situation: with a
smart assembler, you might as well let the assembler do the instruction
scheduling, but you might lose out compared to what a really smart gcc
could do.  Alas, instruction scheduling is not really there yet in gcc.
There's also the question of symbol management with the FPS as;  I've not
even looked to see whether g++ will work in concert with as.
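
To make the quoted store hazard concrete, here is a minimal sketch of what
a non-optimizing padding pass might look like.  Only the "tw r0,r0" nop
comes from Patricia's note; the tuple instruction format, the "store"
opcode name and the operand order are my own assumptions for illustration,
not the real FPS assembler internals.

```python
# Toy model (assumed, not the FPS format): an instruction is
# (opcode, dest_reg, src_regs).  For a store, dest_reg is None and
# src_regs[0] is the data register being written to memory.
NOP = ("tw", None, ("r0", "r0"))  # the padding instruction named in the quoted text


def pad_store_hazards(instrs):
    """Insert a nop after any store whose data register is
    overwritten by the immediately following instruction."""
    out = []
    for i, ins in enumerate(instrs):
        out.append(ins)
        opcode, _dest, srcs = ins
        if opcode != "store" or i + 1 >= len(instrs):
            continue
        data_reg = srcs[0]
        next_dest = instrs[i + 1][1]
        if next_dest is not None and next_dest == data_reg:
            out.append(NOP)
    return out


program = [
    ("store", None, ("r3", "r5")),  # store r3 at the address held in r5
    ("add", "r3", ("r3", "r4")),    # overwrites the store's data register
    ("add", "r6", ("r6", "r4")),    # unrelated, needs no padding
]
padded = pad_store_hazards(program)
```

An optimizing pass would instead try to move an independent instruction
(such as the second add here) into the slot rather than spend a cycle on
the nop.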

Patricia> You also need to decide whether to write your own calling conventions or use
Patricia> ours. If you use ours, your code will be able to call and be called by
Patricia> FPS compiled code. Even if you do your own, make sure that r14 contains the
Patricia> address of the end of the memory stack. When calling signal handlers the
Patricia> kernel uses the area below where r14 points.

Patricia> Some of the instructions are obscure, including djibzm. The compiler does
Patricia> not actually generate these, nor do programmers (even on the rare occasions
Patricia> when we resort to assembly language programming) normally use them. The
Patricia> assembler has a set of macros with names like "djneq" for delayed jump on
Patricia> inequality, that are implemented by the assembler using them. Similarly, we
Patricia> have an assembler mnemonic "load ra,lit" which means "load general register
Patricia> ra with literal lit by any suitable method". The assembler generates a 
Patricia> sequence that takes one to five cycles on a Celerity, or one to four cycles
Patricia> on a FPS M500 depending on the value of "lit".

All this non-obvious (but welcomed!) info is exactly why there are
users out here very interested in the free flow of information!
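
As a toy illustration of the "load ra,lit" idea (one mnemonic, a
variable-length expansion chosen from the literal's value), here is a
sketch.  The "ldi" and "shlor" instruction names and the byte-at-a-time
scheme are hypothetical and invented purely for this example; I don't know
what sequences the FPS assembler really emits, and the instruction counts
below are not meant to match the real cycle counts.

```python
def expand_load(ra, lit):
    """Expand a 'load ra,lit' pseudo-instruction into a toy sequence
    for an unsigned 32-bit literal, using as few steps as possible."""
    if not 0 <= lit <= 0xFFFFFFFF:
        raise ValueError("literal must fit in 32 bits")
    # Split into bytes, most significant first, dropping leading zero bytes.
    chunks = list(lit.to_bytes(4, "big"))
    while len(chunks) > 1 and chunks[0] == 0:
        chunks.pop(0)
    seq = [f"ldi {ra},{chunks[0]}"]      # hypothetical: load 8-bit immediate
    for b in chunks[1:]:
        seq.append(f"shlor {ra},{b}")    # hypothetical: ra = (ra << 8) | b
    return seq


small = expand_load("r1", 5)
medium = expand_load("r1", 0x1234)
large = expand_load("r1", 0xFFFFFFFF)
```

Here small is a single instruction, medium takes two, and large takes
four, which is the general shape of "cost depends on the value of lit".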

stephen pope
advanced computing lab, lanl
scp@acl.lanl.gov