Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cornell!uw-beaver!rice!titan!preston From: preston@titan.rice.edu (Preston Briggs) Newsgroups: comp.arch Subject: paired registers Message-ID: <4259@kalliope.rice.edu> Date: 11 Jul 89 16:18:33 GMT References: <2190@oakhill.UUCP> <13980@lanl.gov> <2199@oakhill.UUCP> Sender: usenet@rice.edu Reply-To: preston@titan.rice.edu (Preston Briggs) Organization: Rice University, Houston Lines: 84 People have been arguing about how difficult it is to compile for a RISC or CISC. The subject of "paired registers" arose as a difficulty associated with RISCs. My 1st (bad) experience was with the AP-101, a sixteen bit version of the 360 used in the Shuttle; certainly a CISC. There are other examples. However, they aren't hard! There are 2 common cases: shifts perhaps source and destination can be profitably paired (IBM RT) floats sometimes single precision requires 1 register, and double requires a pair of (adjacent) registers. The 680x0 family commonly uses a coprocessor to handle FP. The 68881 has 8x80 bit registers and thus avoids the problem. Of course, there is a speed penalty in that "single precision" results are determined to more than the required precision. There are other cases of course; the much discussed "divide and remainder" operation, or multiply giving a 64 bit result, and so on. All of these can be handled in a similar fashion. I use a coloring register allocator, styled after Chaitin (vs. Chow). After building the interference graph, we make a pass over the code looking for COPY's that can be removed. (If the source and destination don't interfere, coalesce them and remove the COPY instruction.) Additionally, we attempt to coalesce the destination and a source operand of 2-address instrctions. These are well known and mentioned in the original papers by Chaitin. It's also possible to handle register pairs at the same time. If, while examining the code, an instruction is discovered that requires some of its registers to be paired, we find a pair of machine registers (that don't interfere with the symbolic (uncolored) registers) and "pre-color" the pair. If no pair can be found (because of pre-existing interferences), then we may have a problem. The solution depends on the available instruction set. For the RT (and paired shifts), a COPY must be inserted later (while handling any remaining 2-address instructions). For FP-type problems, spill code may be required; circumstances vary. You may be unconvinced. The cost (compile-time and coding time) is minimal; we already have all the machinery in place and we're already making a pass over the code to coalesce registers. The "symbolic registers" had to be assigned to real registers eventually, so pre-assigning is ok. It is true that the graph becomes more constrained; but the pairing requirement is the constraint -- pre-assignment just adds this additional constraint to the graph. The scheme requires a graph coloring allocator; but you should be using one anyway. I've implemented the idea for a compiler on the RT (paired shifts). It works pretty well. It's suprising how tricky the resulting code can be; makes looking at assembly fun. The original papers describing register allocation via graph coloring are: "Register aloocation via coloring" Chaitin, Auslander, Chandra, Cocke, Hopkins, Markstein Computer Languages, January 1981 and "Register allocation and spilling via graph coloring" Chaitin Proceedings of the SIGPLAN '82 Symposium on Compiler Construction SIGPLAN Notices, June 1982 The scheme for handling paired registers was mentioned by Martin Hopkins in a paper, "Compiling for the RT PC ROMP", which is collected in a small book called "IBM RT Personal Computer Technology", 1986. Additional papers on graph coloring register allocators appear in SIGPLAN '84, '86, and '89. Preston Briggs preston@titan.rice.edu