Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!yale!cs.yale.edu!peterson-john From: peterson-john@cs.yale.edu (John C. Peterson) Newsgroups: comp.lang.functional Subject: Re: Run-time strictness analysis Message-ID: Date: 28 Nov 90 05:16:55 GMT References: <451sis-b@massey.ac.nz> Sender: news@cs.yale.edu Organization: Yale Haskell Project Lines: 60 Nntp-Posting-Host: object.systemsz.cs.yale.edu Evan Ireland (E.Ireland@massey.ac.nz) asks `lazy f.l. implementors' (parse that as you wish) about compile time and run time strictness issues and separate compilation. Here's what is going on at Yale right now. First, let me comment that laziness is probably the primary performance obstacle we have to deal with so this is certainly a problem that merits serious study. We have not had a chance to explore this problem very far, but we do have a few results. Higher-order functions present the greatest problems; it is impossible to do much to a function like foldr due to separate compilation. Since the argument to foldr is completely unknown, the compiler must plan for the worst when generating the code for foldr. Our only solution (at the moment) is to agressively inline calls to foldr, thus removing the higher-order function entirely. For small functions like foldl and foldr this is an acceptable solution. We currently supply two entry points to every function: a general entry which is completely curried and uses only non-strict (delayed) arguments and an optimized entry which is uncurried and takes strict arguments as recognized by compile time strictness analysis. Right now, we are using a flat domain analysis so this affects only the top level delay of a structure. The optimized entry is used only on direct calls with complete argument lists, all other calls (curried or higher order) use the much less efficient standard entry. This works well for programs without higher-order functions and mainly scalar data objects. We use our own interface files for separate compilation instead of the official Haskell interface files; these keep all information needed to do this optimisation across module boundaries. It should be remembered that interface files are not an essential component of the language and implementations are free to deal with separate compilation issues in manners not described in the report. We also avoid unneeded delays for simple expressions. Expressions are associated with a complexity measure which gives an upper bound on the time taken to evaluate it. When this is below some threshold, defined by the overhead in creating a delay, the expression will be calculated rather than delayed. This picks up simple expressions like x+1 (assuming that + cannot cause a runtime error and that x is strict). There are also some more unusual cases involving structures, such as f [] = [] f x = g (tail x) Here, f is strict in x. Even if g is not strict, it is easier to compute tail x (which yields a delay) than to delay the call to tail. We feel that using multiple versions of a function is a `good thing' and fills needs beyond strictness problems, such as efficient overloaded functions. Partial evaluation (including evaluation with respect to strictness constraints) can yield versions of a function which are far more efficient than the most general case; the problem is to decide which partially evaluated copies of a function to create. This is a subject of ongoing research in our group. Selecting the best function dynamicly based on run-time information is certainly a very interesting area. John Peterson peterson-john@cs.yale.edu