Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod!usc!snorkelwacker.mit.edu!ai-lab!life!bson
From: bson@ai.mit.edu (Jan Brittenson)
Newsgroups: comp.arch
Subject: Re: UNIX mind-set  -> OK, OK!
Message-ID: <BSON.91Jan14170904@rice-chex.ai.mit.edu>
Date: 14 Jan 91 22:09:04 GMT
References: <1991Jan14.013815.11419@ims.alaska.edu> <11314@lanl.gov>
	<5340@idunno.Princeton.EDU> <1991Jan14.170115.17178@Think.COM>
Sender: news@ai.mit.edu
Organization: nil
Lines: 78
In-reply-to: barmar@think.com's message of 14 Jan 91 17:01:15 GMT

In article <1991Jan14.170115.17178@Think.COM> 
   barmar@think.com (Barry Margolin) writes:

 > I've used several systems (Multics, ITS, TOPS-20) where the commands
 > invoked the globbing routine, ... Filename arguments that don't make
 > sense to be wildcards (e.g. the last argument to mv) are scanned for
 > wildcard characters, and generate an error if any are seen

   The Unix approach has its advantages sometimes. Assume you have
the files "L19901200.axx772" and "L19910100.bxx19974". If you wish to
overwrite the second file with the first it makes sense to use

	mv L*72 L*74

if that's unique enough. Another case is cd. When you have only one
file, and it's a directory, "cd *" makes sense. If I have a list of
files and would like to see if L19901200.axx772 has been backed up, it
makes sense to use the command

	grep -l L*772 L*backup

to get a list of what backup file lists it appears in. This
orthogonality is quite a strength of Unix. Granted, there's an almost
infinite potential for error, with no recovery.
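
   To make the mv case concrete, here is a rough sketch (in Python,
for brevity) of what the shell hands to mv. The helper names
shell_expand() and demo() and the scratch directory are my own
assumptions for illustration, not anything sh provides:

```python
import glob
import os
import tempfile

def shell_expand(words, directory):
    """Roughly what sh does before exec: replace each pattern by its
    sorted matches; a word that matches nothing is passed unchanged."""
    argv = []
    for word in words:
        pattern = os.path.join(glob.escape(directory), word)
        matches = sorted(os.path.basename(m) for m in glob.glob(pattern))
        argv.extend(matches if matches else [word])
    return argv

def demo():
    # Scratch directory holding the two example files from above.
    with tempfile.TemporaryDirectory() as d:
        for name in ("L19901200.axx772", "L19910100.bxx19974"):
            open(os.path.join(d, name), "w").close()
        return shell_expand(["mv", "L*72", "L*74"], d)

print(demo())  # ['mv', 'L19901200.axx772', 'L19910100.bxx19974']
```

mv itself never sees a wildcard; it just gets two ordinary names. If
either pattern had matched more than one file, mv would have received
more names without any warning, which is the potential for error.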

 > Putting the work in the utility allows useful syntaxes such as:
 >
 >     mv * =.old

   This is something I quite miss. But I'd rather see it solved in the
shell than in the tools, perhaps as a variation of the {...} syntax. Say
that {:re} were made a regex quote, then one could write something
like:

	mv {:\(^.+\)$ \1.old}

The name "foo" would thus be replaced by "foo foo.old". At '$' the
shell would go from "match" to "replace." If it didn't match at all,
the shell would just move on to the next file name. I'm confident this
exact issue has been dragged around in the dirt in more than one
newsgroup, so dear netter, please don't regard this as some kind of
shell-extension proposal, only as a highly hypothetical and off-hand
contribution to the where-do-we-glob issue. The point I'd like to make
is that making this modification in a library requires relinking, and
possibly recompiling, every tool that uses it whenever the library is
modified, whereas the shell can be quickly recompiled and replaced,
with several versions available for comparison.
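
   For the curious, the match-then-replace expansion could be
sketched roughly as below. This is only an illustration of the idea:
regex_quote_expand() is a made-up name, the list of names stands in
for a directory listing, and the pattern is written in Python's regex
syntax, so "(...)" rather than "\(...\)":

```python
import re

def regex_quote_expand(pattern, template, names):
    """Hypothetical {:re} expansion: each name that matches `pattern`
    becomes the name itself followed by the words built from
    `template`; names that don't match are skipped."""
    regex = re.compile(pattern)
    words = []
    for name in names:
        match = regex.search(name)
        if match is None:
            continue  # no match: move on to the next file name
        words.append(name)
        words.extend(match.expand(template).split())
    return words

print(regex_quote_expand(r"^(.+)$", r"\1.old", ["foo"]))
print(regex_quote_expand(r"^(.+)\.c$", r"\1.old", ["foo", "bar.c"]))
```

The first call gives the words "foo" and "foo.old"; the second skips
"foo" and gives "bar.c" and "bar.old".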

   I guess every Unix programmer has at some point wished there was a
sh-compatible glob(3) for some very specific purposes, though.

 > If there's a reasonable library routine available, the hardest part
 > should be deciding which arguments should be processed as wildcards.

   It doesn't really make much difference whether it's in a library or
in a shell, as long as there are sufficient hooks for redefining the
syntax. I mean, with globbing in the tool, instead of escaping
don't-matches you end up escaping do-matches, to tell the globber you
do want a specific argument globbed regardless of what the tool thinks
is appropriate.

   In Unix, hooks really aren't necessary, as you can switch shells
easily, or implement a different syntax in your application and be
sure nothing is globbed following exec(2).
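
   A small sketch of the exec(2) point, again in Python: when a
program is started from an argument list with no shell in between,
the child gets its arguments exactly as written. The helper name
argv_seen_by_child() is mine, and the child here is simply another
Python interpreter echoing its arguments:

```python
import subprocess
import sys

def argv_seen_by_child(args):
    """Start a child from an argument list (no shell involved) and
    return the arguments it reports having received."""
    child = "import sys; print(*sys.argv[1:], sep=chr(10))"
    result = subprocess.run([sys.executable, "-c", child, *args],
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

print(argv_seen_by_child(["*", "L*72"]))  # ['*', 'L*72']
```

The wildcards arrive untouched, so an application is free to give
them whatever meaning it likes.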

 > I agree, it isn't cat's job [to glob].  However, it would be the job
 > of a wildcard_match() library routine, which would take a wildcard
 > argument and return an array of filenames.  It would be cat's job to
 > call this for its filename arguments.

   I'm not sure how feasible this is. In order to handle `command`
constructs you would need to include an entire shell, more or less. Or
you could start a subshell, in which case we're back where this
discussion started. (Should the subshell call wildcard_match()?)
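
   As a rough illustration of where such a routine stops, here is a
sketch built on Python's glob module. The name wildcard_match() comes
from the quoted article, but this body, the demo() helper and the
scratch files are only my assumptions:

```python
import glob
import os
import tempfile

def wildcard_match(argument):
    """Hypothetical library routine: return the filenames matching a
    wildcard argument, or the argument itself if nothing matches."""
    matches = sorted(glob.glob(argument))
    return matches if matches else [argument]

def demo():
    with tempfile.TemporaryDirectory() as d:
        for name in ("a.txt", "b.txt"):
            open(os.path.join(d, name), "w").close()
        files = wildcard_match(os.path.join(glob.escape(d), "*.txt"))
        # A backquoted command is not a wildcard; the routine can
        # only hand it back unchanged.
        backquoted = wildcard_match("`ls`")
        return [os.path.basename(f) for f in files], backquoted

print(demo())  # (['a.txt', 'b.txt'], ['`ls`'])
```

The wildcard is expanded, but the `ls` comes back as is; running it
would take a shell, or something very much like one.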

Just some random thoughts,

						-- Jan Brittenson
						   bson@ai.mit.edu