Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!bloom-beacon!apple!oliveb!mipos3!omepd!mipon2!mcg From: mcg@mipon2.intel.com (Steven McGeady) Newsgroups: gnu.gcc Subject: Suggestions for improvements/enhancements to GCC Message-ID: <4462@omepd.UUCP> Date: 23 May 89 15:51:57 GMT Sender: news@omepd.UUCP Reply-To: mcg@mipon2.intel.com (Steven McGeady) Lines: 240 Here is a note I sent some time ago - I never saw it reflected onto the net. I am posting three messages - A previous message, which contains changes I needed to make to the base, this one , which contains suggestions for improvements in the base, and a third which contains mods for gcc.c to provide environment variable support for locating passes and files. I have not posted the machine-specific files for the 960. If anyone wants them they can write to me. S. McGeady Intel Corp. (503) 696-4393 ------------ To: rms@wheaties.ai.mit.edu Cc: info-gcc@wheaties.ai.mit.edu Subject: [MSG 3 of 3] Suggestions for improvements/enhancements in GCC Date: Fri, 21 Apr 89 13:46:18 PDT From: mcg 1. Ease of Retarget Issues a. More generic calling sequence support My port would have been made much easier if FUNCTION_ARG could have just returned an RTX for each argument in the list, rather than just for the register arguments. Then I could have returned a register rtx when appropriate, and (plus (arg_pointer_rtx) (const_int 64)) or whatever when I needed to put something into the argument block. I think that both expr.c and stmt.c would be simplified if this were the case. Perhaps you could let expr.c and stmt.c figure it out if a NIL was returned. (One would also need a macro to be called after all parameters were assigned that returned the total amount of stack space consumed, but this is easy to keep track of in CUM.) b. Machine-specific function header setup The machine descriptions should have control of outputting the header of a procedure. The reason for this is the optimization of leaf-procedures: A different kind of definition must preceed the '.globl _foo\n\n_foo:' lines, e.g. .leafproc _foo,__foo rret: return _foo: lda rret,g14 __foo: /* leaf procedure, entered with branch-link */ ... bx (g14) FUNCTION_PROLOGUE is too late for this. I had to hijack FUNCTION_NAME_DECLARE, which is probably not the appropriate place for such code. Why not just let FUNCTION_PROLOGUE output the .globl and the initial label? Pass it the same parameters that you pass FUNCTION_NAME_DECLARE. That seems the easiest solution. c. A three-word integer type would be useful, so three-word BLKmode structures and aggregates could be kept in registers. 2. Optimization Issues a. Support for leaf-procedure and tail-call optimizations These optimizations are quite common in RISC architectures. A leaf-procedure is one that meets certain machine-dependent criteria regarding how many and which registers are used, perhaps whether a frame is needed or how big that frame is, etc. Currently, I need to scan all of the insns in a procedure looking for calls and for argument block or frame references to know whether the procedure is leaf or not. This is because regs_ever_live[FRAME_POINTER_REGNUM] and regs_ever_live[ARG_POINTER_REGNUM] are not correct at FUNCTION_PROLOGUE time, and because there is no indication of whether any calls are made from the procedure. Tail-calls are calls made immediately prior to a return. The call can often "piggy-back" on the current procedure's stack, and avoid the overhead of allocating another one. The crucial test is whether any of the called procedure's actual parameters are addresses on the calling procedure's frame. There is no easy way to determine this currently (that I am aware of), so I scan all the insns looking for a load effective address insn that references the frame pointer. This is not optimal though, because such references may have nothing to do with the actual parameters of any particular call. An indication of whether a call is a potential tail-call would be useful. b. Constant costs & Constants in common registers The CONST_COST macro is severely limited by not knowing the context in which the constant will be used. Take for example a standard RISC architecture, which has a limited number of constants that can appear in an instruction. The constant '256' may not fit in the instruction, so one would assign it a non-zero cost, and it will be put in a register. However, if an insn (or a peephole), can recognize (for example), 'and 256,r0,r0; test r0' and replace it with 'bittst 7' or whatever, then substantial optimizations could be realized. It is easy to define an insn which has operands which match various types of constant operands, but the loading of these operands into registers often eliminates optimization possibilities. The solution might be to give CONST_COST an idea of what instruction context a constant is being used in, or it may be to delay loading of constants into registers until after instruction selection. I can give numerous examples where better code could be generated if I could recognize certain constants: especially bit strings (0x00ff00) and power-of-2 operands (0x80000000). There is an another case as well. I would like to define an instruction template that has a condition: if (operands[0] == const0_rtx && regs_ever_used(REG_G14,...)) Because I have a register used in the calling sequence which is almost always zero, except in routines which require an argument block. I can probably handle this by writing special code which examines the function definition tree, but that duplicates a lot of code in a place it really shouldn't be. c. GCC generates extremely bad code for RISC architectures when performing bit-field extractions on automatic memory-based operands (typically those passed in a register but addressed later). When the bitfield is addressed within a module, GCC assumes that it must write the bitfield out to memory after each statement, causing a larger number of redundant loads and stores to be generated. Instead, GCC should load the variable into memory only immediately before a call or a distinct reference to memory that might be a reference to the structure is made. d. On the same subject as (c), the compiler does not do the dataflow analysis needed to determine whether an addressed automatic variable (either scaler or aggregate) can actually be used at any point in a procedure. It would be worth doing this data flow, at least within basic blocks, if not within the whole procedure. Only spilling the value at basic block boundaries would be a major win. e. Constant-value tracking across basic-blocks is poor, e.g: for (i = 0; i < 100; i++) { ... } generates something like: mov 0,rX mov 99,rY br L1: L2: ... sub 1,rX,rX L1: cmp rX,rY ble L2 The initial branch in this can be eliminated under all circumstances, since the initial comparison is constant. f. Some indication should be given for each insn (especially load/store insns) whether the insn is in a loop or not, and approximately (if known) how many times that loop might be executed. A machine-description might want to make a trade-off between code density and speed in certain circumstances, and generate a different sequence of instructions in a loop. One example is use of complex addressing modes. A complex addressing mode (such as reg+const+(reg*scale)) might be desirable in straight-line code, especially if it is a dense instruction and does not take much longer than the corresponding expanded code to perform the address calculation. However, one would want to expand and strength-reduce or hoist the address address calculation of the load/store occured in a loop. g. An indication of the top and bottom of a loop, and perhaps identification of the loop control insns. There are certain optimizations that may or may not be beyond the scope of the compiler such as loop unrolling and software pipelining. At a less aggressive level, on some RISC machines it is and optimization to align branch targets in memory. This is often not a win except for loop tops. A loop header that noted the minimum and maximum number of times the loop is executed (if known), whether the loop is left with a break, and some other information of that nature would be helpful both to the code generator and potentially to post-pass optimizers. h. Support for __builtin_strcpy, etc, generating movstrqi instructions or equivalent, would be useful. A hook for strcmp would also be nice. I am told that the guy at Data General who has done the 88k port has done this. 3. Machine Type Support a. It would be nice if there were a straightforward way to support 3-word (and 4-word) structures in registers. As noted in an earlier note, I have added support for 4-word structures, and they appear to work, but I have not added 3-word structures out of fear that the base compiler might not work on non-power-of-2 sized data. 4. Environment/Miscellaneous Issues a. It would be desirable if GCC would get its paths (optionally) from environment variables, rather than from compiled-in constants: e.g.: $GmdBASE - base of GNU tools $GmdLIBDIR - where to look for libraries default $GNUBASE/lib $GmdBINDIR - where to look for executable binaries default $GNUBASE/bin $GmdINCDIR - where to look for include files default $GNUBASE/include $GPPmdSPEC - default CPP_SPEC $GASmdSPEC - default ASM_SPEC $GLDmdSPEC - default loader spec $GASmdPATH - path to assembler default $GBINDIR/gas $GLDmdPATH - path to linker default $GBINDIR/gld $GPPmdPATH - path to cpp default $GBINDIR/cpp (the 'md' would be a machine-dependent string) This would allow different users on the same machine to use different combinations of the system and gnu tools without each having a private gcc. Also, it would allow cross-compilers to isolate themselves from the system includes and whatnot. Currently, someone targeting GCC as a cross-compiler must invade gcc, cpp, and all the other tools to scrub them of references to /usr/include, /lib, /usr/lib, etc. If the above variable where used, a "normal", non-cross user would not set any variables, and gcc could substitute in the standards. However, a cross user could just set $GmdBASE, and then get a cross environment of include files and libraries. S. McGeady