Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ames!ncar!husc6!spdcc!ima!compilers-sender From: wpl@PRC.Unisys.COM (William P Loftus) Newsgroups: comp.compilers Subject: Re: Lex surrogates Message-ID: <3300@ima.ima.isc.com> Date: 6 Feb 89 16:55:16 GMT References: <3286@ima.ima.isc.com> Sender: compilers-sender@ima.ima.isc.com Reply-To: wpl@PRC.Unisys.COM (William P Loftus) Lines: 64 Approved: compilers@ima.UUCP Many of the lexical analyzer generators gain their speed from limiting the specification power of their tools. They are taking advantage of the fact that most languages don't need a general RE processor. In my opinion, this limits the use of such tools to traditional language lexical analyzers (they are not for what Aho et al. call swiss army knives---see the dragon book). However, there is a full RE lexical analyzer generator called Flex. Flex is a lex clone, but much faster. Here is a file contained in the distribution of flex (which is freely available from several ftp and uucp sites) that describes timing: -- flex vs. lex timings for a C tokenizer which includes keywords: Generation times: lex 83.0 secs flex 3.9 flex -cfe 7.1 # uncompressed table, equivalence classes flex -cf 15.0 # uncompressed table, no equivalence classes Scanner object file sizes: lex 41.0K bytes flex 9.4K flex -cfe 49.6K flex -cf 126.5K Running times on a 28,088 line input (685K characters): lex 29.8 secs flex 19.3 flex -cfe 9.0 flex -cf 7.8 The timings were made on a Sun 3/60. All times are user + system CPU time, and don't include hashing of identifiers. Summary: For about the same sized scanner, you get a factor of 3 in performance. For a 30% faster scanner, you get a scanner 1/4th the size, and it's generated in 1/20th the time. For a scanner that's 3 times larger, you get a factor of 3.8 in performance. -- There is a paper by Vern Paxson that describes how he obtains these speeds (I don't have the paper with me at the moment, so I can't tell you the title). William Loftus -- William P Loftus UUCP: wpl@burdvax.UUCP Unisys/Paoli Research Center ARPA: wpl@anarchy.prc.unisys.com PO Box 517 215-354-0614 (home) Paoli, PA 19301 215-648-7248 (work) -- Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU Plausible paths are { decvax | harvard | yale | bbn}!ima Please send responses to the originator of the message -- I cannot forward mail accidentally sent back to compilers. Meta-mail to ima!compilers-request