Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!linus!security!genrad!grkermit!masscomp!clyde!floyd!harpo!seismo!hao!hplabs!sri-unix!dan@bbncd
From: dan%bbncd@sri-unix.UUCP
Newsgroups: net.unix-wizards
Subject: Re: the many greps
Message-ID: <13684@sri-arpa.UUCP>
Date: Tue, 15-Nov-83 17:32:50 EST
Article-I.D.: sri-arpa.13684
Posted: Tue Nov 15 17:32:50 1983
Date-Received: Fri, 18-Nov-83 00:11:06 EST
Lines: 27

From: Dan Franklin

Each time the 3 greps are discussed, and people point out that they use
different algorithms, each best for different kinds of regular
expressions, I am puzzled by the leap to the conclusion that they must
therefore be different programs.

Some UNIX C compilers have several different algorithms for the 'switch'
statement, choosing either an indexed table, a hashed table with linear
rehash, or the obvious if/then/else structure for the output, depending
on the properties of the input. These compilers do not provide
'switch1', 'switch2', and 'switch3' statements; the compiler examines
the properties of the case list and chooses the best representation.

If the only difference between the three greps were the space-time
performance of each algorithm, the sensible thing to do would be to have
one 'grep' which chose the most efficient algorithm for the regular
expression--with, perhaps, a switch so the user could override grep's
choice on special occasions (no heuristic can be perfect). So why
doesn't somebody do just that? Consider how much new-user puzzlement
(and excess unix-wizards mail) would be eliminated!

There is a reason: the three greps interpret three different forms of
regular expression. You can't take an arbitrary shell script which uses,
say, 'grep' and substitute 'egrep' everywhere without first scrutinizing
each regular expression to make sure it doesn't have parentheses,
vertical bars, etc. So even if 'egrep' could use a variant of the 'grep'
algorithm in the right circumstances, you couldn't throw away 'grep'.
(Each command also accepts a different subset of options, but that
problem could be solved.) Too bad.

	Dan Franklin
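
To make the "scrutinizing each regular expression" step concrete, here
is a rough sketch in Python of the kind of check one would have to run
over every pattern in a script before swapping 'grep' for 'egrep'. It
is only an illustration under a simplified model of the two syntaxes:
it assumes that ( ) | + ? are ordinary characters to 'grep' but
operators to 'egrep', and that \( and \) group in 'grep' but are
literal parentheses in 'egrep'. The names egrep_hazards and
EGREP_ONLY_OPERATORS are made up for this example; nothing like them
ships with any grep.

```python
# Assumption (simplified model): these characters are ordinary text to
# 'grep' but operators to 'egrep'.
EGREP_ONLY_OPERATORS = set("()|+?")


def egrep_hazards(pattern):
    """Return (index, text) pairs whose meaning would change under egrep.

    Hypothetical helper: an empty list means the pattern looks safe to
    hand to egrep unchanged, under the simplified model above.
    """
    hazards = []
    i = 0
    n = len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            # A backslash pair: only \( and \) are assumed to differ.
            nxt = pattern[i + 1]
            if nxt in "()":
                hazards.append((i, ch + nxt))
            i += 2
        elif ch == "[":
            # Skip a bracket expression; everything inside is literal.
            i += 1
            if i < n and pattern[i] == "^":
                i += 1
            if i < n and pattern[i] == "]":
                i += 1  # a leading ']' is a member, not the terminator
            while i < n and pattern[i] != "]":
                i += 1
            i += 1  # step past the closing ']'
        else:
            if ch in EGREP_ONLY_OPERATORS:
                hazards.append((i, ch))
            i += 1
    return hazards


if __name__ == "__main__":
    for pat in ["^foo.*bar$", "main(argc)", "yes|no", "[(|)]x"]:
        print(repr(pat), egrep_hazards(pat))
```

A pattern that comes back with an empty list could plausibly be handed
to either program; anything else has to be rewritten by hand, which is
exactly why a blind substitution is not safe.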