Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!elroy.jpl.nasa.gov!ncar!hsdndev!cmcl2!lanl!cochiti.lanl.gov!jlg
From: jlg@cochiti.lanl.gov (Jim Giles)
Newsgroups: comp.os.misc
Subject: Re: Globbing
Message-ID: <18511@lanl.gov>
Date: 20 Mar 91 17:39:45 GMT
References: <17602@lanl.gov> <WG0A148@xds13.ferranti.com> <18205@lanl.gov> <A23AFH9@xds13.ferranti.com> <18365@lanl.gov> <B.3A_=8@xds13.ferranti.com>
Sender: news@lanl.gov
Reply-To: jlg@cochiti.lanl.gov (Jim Giles)
Organization: Los Alamos National Laboratory
Lines: 99

In article <B.3A_=8@xds13.ferranti.com>, peter@ficc.ferranti.com (Peter
da Silva) writes:
|> [...]
|> What you're saying makes perfect sense in an ideal world, but that's what
|> I've been saying all along. In the real world, you have to assume that
|> the program you're passing stuff to might decide to glob it.

If I don't know what a program is going to do with its arguments,
I ain't a-goin' to use the program!!  Period.

|> In article <18365@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
|> > [... lots of nested quotes or repeated escape symbols ...]
|> 
|> Sounds like the programmer screwed up somewhere. I've never had to nest
|> more than two quotes. [...]

Then, you've never sent arguments to shell scripts which in turn
invoke shell scripts, which in turn invoke shell scripts, ....  This
is a common practice on UNIX systems.  Often, systems 'programs' are
implemented just that way.
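
To make the nesting concrete, here is a rough sketch in Python.  It assumes
a model in which each level re-reads its argument as shell text (as happens
with `eval` or `sh -c`), so every level strips one layer of quoting.  The
helper names quote_for_levels and reparse are made up for this illustration;
shlex.quote and shlex.split are the standard library calls.

```python
import shlex

def quote_for_levels(arg, levels):
    """Wrap arg in one layer of shell quoting per level that will re-parse it."""
    for _ in range(levels):
        arg = shlex.quote(arg)
    return arg

def reparse(text):
    """Model one level re-reading its argument as shell text.

    Assumes the text parses to exactly one word, which holds for
    anything produced by quote_for_levels.
    """
    (word,) = shlex.split(text)
    return word

pattern = "*.c"
for levels in range(4):
    # Show how much protection the same pattern needs at each depth.
    print(levels, quote_for_levels(pattern, levels))
```

Under that assumption the quoting needed grows with every extra level, and
the person typing the command has to know how many levels there are.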

|> [...]
|> > |> 	OPEN(NAME='JGTEST.TXT',TYPE=UNKNOWN)
|> 
|> > Good example!  Note how the ultimate consumer of the string "JGTEST.TXT"
|> > will get exactly that string
|> 
|> But that's not the string he started with. He started with "'JGTEST.TXT'".

The string he started with was ->  JGTEST.TXT  <- with no markers on it
at all.  The apostrophes in the OPEN statement were there to clearly
denote that fact.  The apostrophes are _not_ part of the value - and
are only there to prevent the Fortran compiler from incorrectly trying
to evaluate the string as a variable name or expression.  As I said
before, Fortran does evaluation in the most meaningful context - the
caller.  In a command language, the most meaningful context for argument
evaluation is in the recipient.  In _neither_ language should the 
parameter passing mechanism do the argument evaluation.

|> [...]
|> He quoted it at the top level (the language) and it then went all the way
|> down with no further quotes because none of the levels in the way did any
|> globbing. Your argument that tools should glob is like expecting OPEN to
|> glob.

A good place for it.  If _any_ component of the Fortran programming
environment were to be given the ability to match wildcards against
file names, the I/O library would be the correct place to do it. OPEN
doesn't glob because nothing in a standard Fortran environment does.

What _you_ are recommending is like expecting the LOADER to glob (or
insert globbing code) when it links procedures together.

|> [...]
|> > Still, the difference between what I want and what other programming
|> > languages do is trivial - I just pointed it out:  The arguments should
|> > only be evaluated _once_ in either the command or the programming
|> > language.
|> 
|> And they are. And they are evaluated in a context where the meaning of
|> the argument is known: by the caller. [...]

They are evaluated by the _shell_ NOT the _caller_.  The _shell_ has
no business doing so.  The _shell_ does NOT do it only once, it globs
every time an argument gets passed to it - which in a UNIX environment
may be very often indeed.  The _shell_ hasn't got the slightest idea
what the argument means, but _assumes_ that it's a file name.  So, your
last sentence above is completely false.
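
A small sketch of what I mean by blind evaluation, in Python.  The function
blind_glob and the directory listings are hypothetical, and the rule that an
unmatched pattern is passed through unchanged is an assumption modelled on
Bourne shell behaviour.  The expander has no idea which argument is a file
name, so it treats every one that looks like a pattern as if it were.

```python
from fnmatch import fnmatchcase

WILDCARDS = "*?["

def blind_glob(args, files):
    """Expand every argument that looks like a pattern against files.

    Nothing here knows what the receiving command means by an argument.
    A pattern with no match is passed through unchanged.
    """
    out = []
    for arg in args:
        matches = []
        if any(c in arg for c in WILDCARDS):
            matches = sorted(f for f in files if fnmatchcase(f, arg))
        out.extend(matches or [arg])
    return out

# The user means "*.c" as a pattern for the command itself to interpret.
args = ["-name", "*.c"]
hop1 = blind_glob(args, files=["README"])            # nothing matches here
hop2 = blind_glob(hop1, files=["main.c", "util.c"])  # a later hop, another directory
print(hop1)
print(hop2)
```

The pattern survives the first hop only by accident and is replaced by file
names at the second, before the command that was meant to see it ever runs.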

|> [...]                                 The shell is still a programming
|> language, however much you deny it.

I didn't deny it.  But I will.  The shell scripting syntax and semantics
do indeed constitute a language.  But, the shell itself is an intermediary.
It is nothing more than a particularly poorly informed interpreter.  If
the shell scripting language allowed arguments to be given _types_, then
the problem would solve itself: the shell could glob (once) any argument
which had the data type <list-of-files>.  Any other argument it would
leave alone.  This would be a workable solution.  _BUT_ it would require
that all commands be declared to the shell so that it would know the
types of the command's arguments.  Various people in this discussion have
asserted that this is an unacceptable solution.*  The second best is to
have each command evaluate its own arguments - the command knows what they
mean.  The third best solution is to have the user explicitly evaluate
the arguments himself - he also knows what they mean (or he has no 
business using the command), but it's just a lot of unnecessary work
to force the user to glob manually.  The _worst_ solution is to have
the _shell_ do the argument evaluation _blindly_ - which is what you
are advocating.
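
The typed scheme can be sketched in a few lines of Python.  Everything named
here is hypothetical: the SIGNATURES table, the commands "search" and "echo",
the type names, and the convention that the last declared type covers any
remaining arguments.  It is only meant to show the shape of the idea, not
how any existing shell works.

```python
from fnmatch import fnmatchcase

# Hypothetical declarations: each command tells the shell its argument types.
SIGNATURES = {
    "search": ["pattern", "files"],  # a pattern, then a list of files
    "echo": ["literal"],             # plain strings only
}

def glob_once(pattern, files):
    """Expand one pattern against files; keep it as-is if nothing matches."""
    matches = sorted(f for f in files if fnmatchcase(f, pattern))
    return matches or [pattern]

def expand_typed(command, args, files):
    """Glob only the arguments declared as 'files'; leave the rest alone."""
    types = SIGNATURES[command]
    out = []
    for i, arg in enumerate(args):
        # The last declared type applies to any remaining arguments.
        kind = types[min(i, len(types) - 1)]
        if kind == "files":
            out.extend(glob_once(arg, files))
        else:
            out.append(arg)
    return out

directory = ["main.c", "util.c", "notes.txt"]
print(expand_typed("search", ["*.c", "*.c"], directory))
print(expand_typed("echo", ["*"], directory))
```

The same text "*.c" is left alone in the first position and expanded in the
second, because the declaration says which is which.  No quoting is needed.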

J. Giles

Footnote[*]:  I have no idea why people think a shell which required 
all command names to be declared is unacceptable.  It is standard CS
dogma that all things must be declared in a language.  Yet this is
resisted in the command language.  These days, it's even common for
languages to require declarations of external procedures (right down
to the types of the arguments) - this is just the sort of model for
command languages I have described above.