Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!uwm.edu!cs.utexas.edu!hellgate.utah.edu!uplherc!wicat!meph!gsarff
From: gsarff@meph.UUCP (Gary Sarff)
Newsgroups: comp.arch
Subject: Re: Globbing
Message-ID: <00085@meph.UUCP>
Date: 8 Mar 91 18:24:58 GMT
References: <1991Feb18.152347.28521@dgbt.doc.ca> <474@bria>
    <19217@cbmvax.commodore.com> <5573:Feb2307:19:4491@kramden.acf.nyu.edu>
Reply-To: gsarff@meph.UUCP
Organization: WICAT Systems Inc., Orem Utah
Lines: 123

Sorry for the diatribe, but this posting struck a sensitive spot for me.

In article <5573:Feb2307:19:4491@kramden.acf.nyu.edu>,
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <19217@cbmvax.commodore.com> daveh@cbmvax.commodore.com (Dave Haynie) writes:
>> Unless all possible commands fit into the
>>      command [flags] arg1..argN globbed_filesystem_arg
>> model, you're pretty much in trouble if you only have shell globbing.
>
>Why? You didn't provide any justification for this statement.
>
>> Program driven globbing doesn't force inconsistency, and certainly shell
>> globbing doesn't force consistency, as UNIX is more than happy to prove to
>> anyone using it.
>
>Why? You didn't provide any justification for this statement.
>
>Name one thing that you could accomplish by moving globbing into
>programs---that you couldn't accomplish at least as easily by modifying
>the shell. After all, you're complaining about the user interface, and
>the shell is the program responsible for that interface.

Ok, one thing: modifying the shell to know about all the argument
types/usages of all the utilities you are going to run from it. It seems
to me that somebody has to type all that information in so the shell will
know; it isn't a mind reader. And it seems to me that the information may
have to be entered more than once for the different shells (csh, sh, ksh,
etc.) _AND_ updated every time some user wants to add a program, again
possibly multiple times for the different flavours of shell on the system.
Is this easier than _not_ adding code to the shell and _not_ entering
argument information, but doing it _ONCE_ when the program/utility is
written?

>
>Here are some disadvantages: 1. Programs (such as shell scripts) often
>invoke other programs, even with (gasp) arguments. As is, it suffices to
>use an occasional -- to turn off all argument processing. With globbing
>in every program, this would become much harder.

Really? (As an aside, from looking at many shell scripts, the escaping of
shell metacharacters is hardly "occasional", and it is one reason some
people hold the opinion, which I believe Jim Giles started this thread by
voicing, that shell scripts look like line noise.)

Let's take an example. The OS I develop on at work (called WMCS) has a
pattern searching utility similar to grep, and all of our OS utilities
call a library routine to parse their command lines. Suppose we want to
scan all of the .c source files in a directory for the string include*.h
("include", followed by some possibly empty sequence, followed by ".h").
We have the following command lines:

Example 1:
    WMCS:  wscan *.c include*.h
    UNIX:  grep include\*.h *.c

Which is easier, or more intuitive? I have to remember to escape the *.h
field in UNIX. Or I could go to the trouble of turning the globbing off
for this one command and then turning it back on.

Example 2:
And what about the case where there are a _LOT_ of files in the
directory? A customer sent in a WREN VII 1.2 Gigabyte hard drive a few
weeks back and wanted us to rescue the data off of it and prepare a new
drive to send back to him with his data intact. One directory on that
drive contained over 24,000 files! (Big! application) My command still
works (it even delivered the files in alphabetical order!). The best unix
could do was print "Command line too long", because I had used * for file
wildcarding. A lot of help that is; it didn't even scan my files! Which is
easier now?
Oh, the UNIX way: I should have thought of that and used "find", or
written a shell script on the command line, and suffered the process
creation overhead as the thing loaded and ran grep 24,000 times. Silly me.
Which looks easier now?

Example 3:
And what about this case: I want to do the same scan on the entire disk,
so for the two OS's we get the following:

    WMCS:  wscan /*/*.c include*.h
    UNIX:  probably something like a shell loop, or using find:
           find / -name \*.c -exec grep include\*.h \{\} \;

Now which is easier? Five backslashes for UNIX, the perfect environment
for developers? Bah! Or of course remember to turn the shell globbing off
for the find and then turn it back on. Again, Bah!

>3. Programmers shouldn't be forced to manually handle
>standard conventions just to write a conventional program. Ever heard of
>modularity?

Oh, but programmers and users should be forced to remember which arguments
need to be escaped and which don't, or turn off globbing and turn it back
on, or have to write shell scripts to hide the backslashes and the quotes
from the light of day, and remember that they can't put too many files in
one directory or all the unix utilities that use shell globbing will not
work in that directory? And this seems reasonable to you?

Every time I have asked which seems easier above, I meant for the user.
Many of us here in comp.arch have "users" of our computers, our OS's, our
software, and in turn we ourselves are users of software and OS's to get
our work done. I for one have more important things to do, like improving
the kernel and utilities, than to spare time remembering what should be
quoted and what should not.

>4. The system is slow enough as is without every application scanning its
>arguments multiple times and opening up one directory after another.

Either the shell scans the directory or the utility does; how can one be
slower than the other?
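
To make the utility-side approach concrete, here is a minimal sketch in
Python of a scanner that receives both patterns unexpanded and does its
own matching. It is only an illustration: the function name scan and its
parameters are hypothetical, and it is not the WMCS library routine, whose
interface is not described in this post. Because the file names are read
from the directory inside the process and never passed through an argument
list, the number of matching files is not bounded by a command line length
limit.

```python
import fnmatch
import os
import re


def scan(directory, file_pattern, text_pattern):
    """Yield (file name, line number, line) for each matching line.

    Hypothetical stand-in for a utility that globs for itself; the
    patterns arrive as plain strings, so nothing needs shell escaping.
    """
    # fnmatch.translate anchors the pattern to the whole string, so wrap
    # it in '*' to find the wildcard pattern anywhere within a line.
    line_re = re.compile(fnmatch.translate("*" + text_pattern + "*"))

    # Read the directory in-process and keep only regular files whose
    # names match; sorting gives alphabetical order.
    names = sorted(
        entry.name
        for entry in os.scandir(directory)
        if entry.is_file() and fnmatch.fnmatchcase(entry.name, file_pattern)
    )

    for name in names:
        path = os.path.join(directory, name)
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                text = line.rstrip("\n")
                if line_re.match(text):
                    yield name, number, text


if __name__ == "__main__":
    # Rough equivalent of Example 1, run in the current directory.
    for name, number, text in scan(".", "*.c", "include*.h"):
        print(f"{name}:{number}: {text}")
```

The same function is called with the same two strings whether the
directory holds ten files or 24,000; the only design choice worth noting
is that the names are held in memory so they can be sorted.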
---------------------------------------------------------------------------
Do memory page swapping to floppies?, I said, yes we can do that, but you
haven't lived until you see our machine do swapping over a 1200 Baud modem
line, and keep on ticking.
..gsarff@meph.UUCP