Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!uwm.edu!uakari.primate.wisc.edu!xanth!mcnc!rti!xyzzy!dg-rtp!meissner
From: meissner@twohot.rtp.dg.com (Michael Meissner)
Newsgroups: comp.unix.questions
Subject: Re: Getting the most for a process.
Message-ID:
Date: 16 Oct 89 14:01:20 GMT
References: <593@cogent.UUCP> <12034@cgl.ucsf.EDU> <40090@bu-cs.BU.EDU> <1029@crdos1.crd.ge.COM> <20140@mimsy.UUCP>
Sender: usenet@xyzzy.UUCP
Organization: Data General (Languages @ Research Triangle Park, NC.)
Lines: 74
In-reply-to: chris@mimsy.UUCP's message of 12 Oct 89 18:39:31 GMT

In article <20140@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
> In article <1029@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.COM
> (Wm E Davidsen Jr) writes:
> >   The Encore version of make looks at an environment variable and
> >determines how many copies of the ccompilers to start. On a machine with
> >8 cpu's you get a blindingly fast make compared to doing the same thing
> >(in serial) on a faster machine.
>
> (Not if the serial machine is more than 8 times faster, or if there is
> only one source file.)
>
> Unfortunately, the Encore version of cc, which is apparently a Greenhills
> C compiler, has all of its `phases' built in.  Thus, if you are compiling
> a single file, you cannot preprocess on cpu 0, compile on cpu 1, and
> assemble on cpu 2 all at the same time.
>
> Given the standard edit/compile/debug cycle, this---combining
> everything---seems to me to be a major mistake.  Well, not so major as
> all that, perhaps, since most of the time is spent in the compilation
> part, not in preprocessing or assembly.  Still, the potential was
> there, and would return if Encore used gcc as their standard compiler.

Especially when you optimize with gcc, most of the time (80% or more)
is spent in cc1, which makes the following passes over the RTL file:

    * The initial pass creating the RTL file from the TREE file
      created by the parser.
    * A pass to copy any shared RTL structure that should not be
      shared.
    * The first jump optimization pass.
    * A pass to scan for registers to prepare for common
      sub-expression elimination.
    * The common sub-expression elimination pass.
    * Another jump optimization pass.
    * Another register scan pass for loop optimizations.
    * A loop optimization pass.
    * A flow analysis pass.
    * A combiner pass to combine multiple RTL expressions into
      larger RTL expressions.
    * A pass to allocate registers that are live within a single
      basic block.
    * A pass to allocate registers whose lifetime spans multiple
      basic blocks.
    * A final jump optimization pass.
    * An optional delayed branch recognition pass.
    * A final pass that expands peepholes and emits assembler code.

Also note that when using -pipe, the preprocessor buffers its entire
output internally and writes it in one fell swoop at the end.  This
means that, in general, only the compiler proper (cc1) and the
assembler run in parallel.  This helps to some degree.  A few months
ago, I measured how much it helped on a dual processor 88100.  The
build time of the entire compiler was about 14 minutes without -pipe,
and about 10-12 minutes with -pipe.

--
Michael Meissner, Data General.         If compiles where much
Uucp: ...!mcnc!rti!xyzzy!meissner       faster, when would we
Internet: meissner@dg-rtp.DG.COM        have time for netnews?