Path: utzoo!attcan!uunet!husc6!bloom-beacon!tut.cis.ohio-state.edu!mailrus!cornell!uw-beaver!teknowledge-vaxc!sri-unix!quintus!ok
From: ok@quintus.uucp (Richard A. O'Keefe)
Newsgroups: comp.unix.wizards
Subject: Re: what should egrep '|root' /etc/passwd print?
Message-ID: <424@quintus.UUCP>
Date: 17 Sep 88 09:16:34 GMT
References: <44414@beno.seismo.CSS.GOV> <68203@sun.uucp> <8202@alice.UUCP> <410@quintus.UUCP> <8209@alice.UUCP>
Sender: news@quintus.UUCP
Reply-To: ok@quintus.UUCP (Richard A. O'Keefe)
Organization: Quintus Computer Systems, Inc.
Lines: 26

In article <8209@alice.UUCP> andrew@alice.UUCP (Andrew Hume) writes:
>it sounds appealing to allow a missing RE to mean the empty string
>but i am unconvinced as to its utility. the examples show this;
[He goes on to show that each of the examples can be done without
the use of empty R.E.s]

This misses the point.  In fact, at least two points.
(1) The method of eliminating the empty R.E. is *DIFFERENT* in each case.
(2) The empty R.E. meaning "match the empty string" is not an arbitrary
    association.  "empty" is the identity for concatenation of both
    strings and patterns.  It is the mathematically obvious thing, and
    a well-written program would have to go out of its way to disallow it.

In fact, as the error message from a System V "grep" shows, grep at
least *IS* going out of its way to disallow empty patterns, in order to
support an ed(1) hack (// means "use previous pattern") which cannot
possibly be useful to grep (there *is* no previous pattern).

The argument "I am unconvinced as to its utility [because you can always
translate it away]" can be transferred to the "*" operator: "with the
current egrep" you can always rewrite a pattern P* as (P+)?.  In fact,
*, +, and ? are all special cases of \{..\} (see grep(1)), so instead of
P*, P+, and P? we should be writing P\{0,\}, P\{1,\}, and P\{0,1\}.
(And, of course, empty expressions as ;\{0\}.)

Why go out of your way to prohibit the obvious way of doing something?
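
The identities claimed above can be checked mechanically.  What follows
is only a sketch, and it uses Python's "re" module rather than grep or
egrep, so the syntax is Python's: intervals are written {m,n} without
backslashes, and an empty alternative or empty group is accepted.  The
helper name same_language, the three-character alphabet, and the length
bound are my own choices for illustration; agreement on all short
strings is evidence of equivalence, not a proof.

```python
import re
from itertools import product


def same_language(p, q, alphabet="ab;", max_len=4):
    """Return True if p and q accept exactly the same strings over
    `alphabet`, for every string of length 0..max_len (brute force)."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if bool(re.fullmatch(p, s)) != bool(re.fullmatch(q, s)):
                return False
    return True


# The empty pattern is the identity for concatenation of patterns:
# putting an empty group before or after "ab" changes nothing.
assert same_language("(?:)ab", "ab")
assert same_language("ab(?:)", "ab")

# *, + and ? are special cases of bounded repetition.
assert same_language("a*", "a{0,}")
assert same_language("a+", "a{1,}")
assert same_language("a?", "a{0,1}")

# The star itself can be translated away, as an optional P+.
assert same_language("a*", "(a+)?")

# The empty expression is "zero repetitions of anything", and an empty
# alternative is the same thing as making the other branch optional.
assert same_language("", ";{0}")
assert same_language("|ab", "(ab)?")
```

Each translation works, but each is a different rewrite, which is the
complaint in point (1).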