Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!mips!apple!agate!usenet.ins.cwru.edu!eagle!bach.lerc.nasa.gov!fsset
From: fsset@bach.lerc.nasa.gov (Scott E. Townsend)
Newsgroups: comp.sys.m88k
Subject: Re: Big Endian vs. Little Endian register pairs
Message-ID: <1991May1.120003.13827@eagle.lerc.nasa.gov>
Date: 1 May 91 12:00:03 GMT
References: <1991Apr30.185908.16474@eagle.lerc.nasa.gov> <1991Apr30.215621.25387@oakhill.sps.mot.com>
Sender: news@eagle.lerc.nasa.gov
Distribution: na
Organization: Nasa Lewis Research Center ( Cleveland )
Lines: 50

In article <1991Apr30.215621.25387@oakhill.sps.mot.com> marvin@bushwood.UUCP (Marvin Denman) writes:
>In article <1991Apr30.185908.16474@eagle.lerc.nasa.gov> fsset@bach.lerc.nasa.gov (Scott E. Townsend) writes:
>>Just curious: does anyone know what the arguments were for/against the
>>register pairing scheme?  It just occurred to me that if the pairing
>>scheme was reversed (i.e. r1,r0 rather than r1,r2) then we'd save a cycle
>>loading many floating-point constants in preparation for fadd & friends.
>>
>>This would of course look Little-Endian in a register dump :-(
>>
>
>How would we save anything?  Can you give a little more explanation?
>
>-- 
>Marvin Denman
>Motorola 88000 Design
>cs.utexas.edu!oakhill!marvin

Well, I was looking at some compiler output and found it often doing
something like this for a small double-precision constant:

	or.u	r4,r0,hi16(constant upper half)
	or	r5,r0,r0
	fadd.ddd r6,r6,r4

This was to implement things like x += 3.0.  Now if the 'endianness' of
register pairs were reversed, an equivalent sequence would be:

	or.u	r1,r0,hi16(constant upper half)
	fadd.ddd r7,r7,r1

This saves a cycle, because r0 always reads as zero and so supplies the
low word for free.  Note that this compiler always saves the original r1
on function entry, so it's available for use.

Upon thinking about this last night, I realized that the compiler could
be a bit more intelligent with the existing 'endianness' by loading
the small constant as a single-precision value.  (Except for exponent
range, a double whose 32 low-order bits are zero is no more precise
than a single.)

So then the original sequence would become:

	or.u	r4,r0,hi16(single-precision upper half)
	fadd.dds r6,r6,r4

Saving the cycle I was so upset about ;-)
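
For anyone who wants to check which constants qualify, here is a rough
C sketch of the test a compiler might apply.  It is not taken from the
compiler I was looking at; the helper names (double_bits, float_bits,
double_fits_hi16, float_fits_hi16) are made up for illustration, and it
assumes IEEE 754 float and double.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a double (assumes IEEE 754 binary64). */
static uint64_t double_bits(double d) {
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Raw bit pattern of a float (assumes IEEE 754 binary32). */
static uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Nonzero if only the top 16 bits of the double are set, i.e. the
 * high word could come from one or.u and the low word is all zero. */
static int double_fits_hi16(double d) {
    return (double_bits(d) & 0x0000FFFFFFFFFFFFULL) == 0;
}

/* Nonzero if d converts to float exactly and that float has only its
 * top 16 bits set, so one or.u could build it for an fadd.dds. */
static int float_fits_hi16(double d) {
    float f;
    if (d > FLT_MAX || d < -FLT_MAX)
        return 0;                 /* outside single exponent range */
    f = (float)d;
    return (double)f == d && (float_bits(f) & 0xFFFFu) == 0;
}
```

For 3.0, double_bits gives 0x4008000000000000 and float_bits gives
0x40400000, so both checks pass and either form needs only one or.u for
the nonzero bits.  A value like 0.1 fails both checks.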

-- 
------------------------------------------------------------------------
Scott Townsend               |   Mail Stop: 5-11
NASA Lewis Research Center   |   Email: fsset@bach.lerc.nasa.gov
Cleveland, Ohio  44135       |