Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!umich!umeecs!billms
From: billms@zip.eecs.umich.edu (Bill Mangione-Smith)
Newsgroups: comp.arch
Subject: icache size (was Compilers taking advantage of architectural enhancements)
Message-ID: <1990Oct12.135814.14346@zip.eecs.umich.edu>
Date: 12 Oct 90 13:58:14 GMT
References: <3300194@m.cs.uiuc.edu> <AGLEW.90Oct11222801@treflan.crhc.uiuc.edu> <1990Oct12.042251.18884@cs.cmu.edu>
Organization: University of Michigan EECS Dept., Ann Arbor, MI
Lines: 27

In article <1990Oct12.042251.18884@cs.cmu.edu> spot@TR4.GP.CS.CMU.EDU (Scott Draves) writes:
>shouldn't loop unrolling burn lots of registers also?  when unrolling,
>which ceiling will you hit first, the number of registers, or the size
>of the I-cache?

I don't understand why people are still consumed by code size with (most)
aggressive loop optimizations.  Loop unrolling and polycyclic scheduling
do increase code size.  That just isn't important anymore.  Give me a 4k
icache.  That's usually 1k instructions, right?  I've been using the
Astronautics ZS-1, which almost always unrolls loops and does a very good
job of picking the 'correct' unrolling depth.  Yet the loops I've looked
at are almost never expanded to over 100 instructions, let alone 1k.
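
To put numbers on that, here is a back-of-the-envelope sketch.  It assumes
fixed-width 4-byte instructions (typical of RISC encodings, but an assumption
here); the function names are made up for illustration.

```python
def icache_capacity(cache_bytes, bytes_per_instruction=4):
    """Instructions that fit in the icache, assuming a fixed-width encoding."""
    return cache_bytes // bytes_per_instruction


def fraction_used(loop_instructions, cache_bytes, bytes_per_instruction=4):
    """Share of the icache that a loop body of the given length occupies."""
    return loop_instructions / icache_capacity(cache_bytes, bytes_per_instruction)


if __name__ == "__main__":
    # Compare a 100-instruction unrolled loop against 1k, 2k and 4k icaches.
    for size_kb in (1, 2, 4):
        cache_bytes = size_kb * 1024
        capacity = icache_capacity(cache_bytes)
        share = fraction_used(100, cache_bytes)
        print(f"{size_kb}k icache: {capacity} instructions, "
              f"100-instruction loop uses {share:.0%}")
```

Under that assumption a 4k icache holds 1024 instructions, so a
100-instruction unrolled loop takes up a bit under a tenth of it.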

When you are unrolling loops, you only need a certain number of instructions
(dependent on functional-unit and memory latencies) to work with.  Even a small icache,
i.e. 1-4k, can hold the required number of instructions.
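
A rough way to see this, as a simplified model of my own and not the ZS-1
compiler's actual heuristic: assume one instruction issues per cycle, and
unroll just far enough that the independent work from other iterations covers
the longest latency in the loop.  The names and the example numbers below are
hypothetical.

```python
import math


def unroll_depth(longest_latency_cycles, body_issue_cycles):
    """Copies of the loop body needed so that the issue time of the
    unrolled body covers the longest functional-unit or memory latency.
    Simplified model: iterations are independent, one issue per cycle."""
    return max(1, math.ceil(longest_latency_cycles / body_issue_cycles))


def unrolled_size(body_instructions, depth):
    """Instruction count of the unrolled loop (loop overhead ignored)."""
    return body_instructions * depth


if __name__ == "__main__":
    # Hypothetical loop: 5-instruction body, 12-cycle memory latency.
    depth = unroll_depth(longest_latency_cycles=12, body_issue_cycles=5)
    size = unrolled_size(body_instructions=5, depth=depth)
    print(f"unroll depth {depth}, unrolled loop is {size} instructions")
```

In this model the unrolled size is bounded by roughly the latency times the
issue rate, not by the trip count, which is why it stays far below what even
a 1k icache can hold.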

Granted, this issue might still be important for modern cpus that still
have very small icaches, but they are quickly being replaced.

>Scott Draves	

bill
-- 
-------------------------------
	Bill Mangione-Smith
	billms@eecs.umich.edu