Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!isi.edu!gremlin!charming!jpl
From: jpl@charming.nrtc.northrop.com (Jeffrey P. Lankford )
Newsgroups: comp.lang.rexx
Subject: Re: need a REXX-flavored version of getopt()
Summary: Postings should be more specific re: REXX operating environment
Message-ID: <9493@gremlin.nrtc.northrop.com>
Date: 14 Sep 90 02:19:28 GMT
References: <90239.171548BOYDJ@QUCDN.BITNET> <147@rufus.UUCP> <176@rufus.UUCP>
Sender: news@gremlin.nrtc.northrop.com
Reply-To: jlankford@nrtc.northrop.com
Distribution: comp
Organization: Northrop Research & Technology Center
Lines: 126

In article <176@rufus.UUCP> drake@drake.almaden.ibm.com writes:
> ... CMS *does* have something similar to "getopt()" to assist in parsing
>CMS-flavored command lines. It's called "PARSECMD", and is present in
>all recent releases of CMS.

Before anyone forms the wrong impression that CMS is actually a useful
programming environment, some comparison of PARSECMD and getopt() is in
order. The salient features of getopt(3) are:
* uses a (simple) grammar to specify valid options and any associated
  (optional) arguments,
* accepts a less restrictive format for an instance of options and
  arguments and generates a canonical expression of the options and
  arguments,
* parses the instance and generates an error condition in the event
  that the instance violates the grammar,
* the getopt Unix library function returns individual tokens, while
  the getopt Unix command returns the canonical format (easily
  tokenized, since tokens are merely separated by white space).

In a previous posting, Sam Drake had a good description with some examples.
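The features listed above can be tried out with Python's standard getopt module, which follows the same conventions as the Unix function. What follows is only a sketch: the helper name canonical() and its output layout (options, then "--", then operands, loosely modelled on what the getopt command prints) are assumptions made for illustration, and Python's module takes exactly one argument for an option marked with a colon.

```python
import getopt


def canonical(argv, spec="abo:"):
    """Rewrite argv in a canonical form: one token per option or option
    argument, then "--", then the remaining operands.

    canonical() is a hypothetical helper, not part of any getopt library.
    """
    # getopt.getopt raises getopt.GetoptError if argv violates the spec.
    opts, operands = getopt.getopt(argv, spec)
    tokens = []
    for flag, value in opts:
        tokens.append(flag)
        # An option letter followed by ":" in the spec carries an argument.
        if flag[1] + ":" in spec:
            tokens.append(value)
    return tokens + ["--"] + operands
```

With this helper, canonical(["-aoarg", "f1", "f2"]) and canonical(["-a", "-o", "arg", "f1", "f2"]) return the same token list, and an unknown flag such as "-x" raises getopt.GetoptError.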
I would repeat that here but i lost that posting, so here is another
example (vaguely remembered from a man page somewhere), where the
specification is a statement of "option 'a' or 'b' or 'o' (followed by
one or more arguments) followed by anything":

    c = getopt(argc,argv,"abo:")    /* C language invocation */
    set -- `getopt abo: $*`         # Bourne shell invocation

All the following are legal statements in this grammar (the last being
in canonical form), where "f1" and "f2" are (optional) arguments, not
options:

    -a -o arg f1 f2
    -aoarg f1 f2
    -oarg -a f1 f2
    -a -o arg -- f1 f2

Getopt has the following limitations:
* no means of describing mutually exclusive options (i.e., either 'a'
  or 'b' or neither is legal, but not both -- this must be handled by
  user code that deals with the individual options),
* no means of specifying that arguments to an option may or may not be
  present (i.e., "o:" specifies that if the 'o' option appears it must
  be followed by one (1) or more arguments -- in my extension to getopt
  i arbitrarily use "o." to indicate that any 'o' may be followed by
  zero (0) or more arguments),
* limited to about 50 options (i.e., a-zA-Z -- although any character
  in the character set could be used, do you really want to use
  non-printing characters for option flags?),
* no multi-character options (i.e., "ab" is always 'a' and 'b' --
  although this could be considered an advantage rather than a
  drawback).

PARSECMD is both a CMS command and a macro, where the command is
callable from REXX and the macro from assembler programs. Suppose we
want to parse the command:

    MYCmd1 ft ft [ ( [Disk|PRint] [NUMrecs nnn] [)] ]

where '(' and ')' are literals, '|' is an or, '[' and ']' enclose an
optional part, upper case marks the minimum abbreviation, and 'nnn'
indicates a numeric constant.
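The abbreviation notation in that syntax line can be sketched in a few lines of Python. This assumes the usual CMS reading of the notation (the leading upper-case letters are the shortest acceptable abbreviation, and matching ignores case); the function names are made up for illustration, and nothing here claims to be how PARSECMD itself does the matching.

```python
def min_length(pattern):
    """Count the leading upper-case letters of pattern, e.g. 2 for "PRint"."""
    n = 0
    while n < len(pattern) and pattern[n].isupper():
        n += 1
    return n


def matches(pattern, word):
    """True if word is the keyword in pattern or an allowed abbreviation."""
    full = pattern.upper()
    word = word.upper()
    long_enough = min_length(pattern) <= len(word)
    return long_enough and full.startswith(word)


def expand(word, keywords=("Disk", "PRint", "NUMrecs")):
    """Return the unabbreviated keyword that word stands for, or None."""
    for pattern in keywords:
        if matches(pattern, word):
            return pattern.upper()
    return None
```

Under these assumptions expand("pr") and expand("PRIN") both give "PRINT", while expand("p") gives None because it is shorter than the required "PR".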
The grammar is specified in a DLCS (acronym city here we come)
formatted file as follows:

    :DLCS DMS USER AMENG ;;
    :CMD MMYCMD1 MYCMD1 MYCMD1 3 :;
    :SYN MY1 3 :;
    :OPR FCN(FN) :;
    :OPR FCN(FT) :;
    :OPT KWL( ) :;
    :OPT KWL() FCN(PINTEGER) :;

The grammar specification is converted to a format that PARSECMD can
manipulate by invoking the 'CONVERT COMMANDS' CMS command. Then
'SET LANGUAGE' must be invoked to activate the appropriate language
parser. The French grammar spec might look like:

    :DLCS DMS USER FRANC;;
    :CMD MMYCMD1 MYCMD1 FRANCMD1 8 :;
    :SYN MY1 3 :;
    :OPR FCN(FN) :;
    :OPR FCN(FT) :;
    :OPT KWL( ) :;
    :OPT KWL() FCN(PINTEGER) :;

Then your REXX script can call 'PARSECMD MYCMD1' to read the command
line and generate a canonical form, where options are tokenized to
non-abbreviated form and shoved into a compound symbol (named 'token'
i believe).

As with getopt, the CMS command parser mechanism leaves the handling of
mutually exclusive options to the user code (despite the fact that the
grammar implies some sort of exclusive option specification). Unlike
getopt, an arbitrary number of arguments associated with an option is
not definable; however, options can be multiple characters with an
abbreviation capability (this is still way too verbose).

Bottom line: though perhaps slightly less powerful, getopt is vastly
easier to use than the CMS command parser mechanism; the command syntax
of getopt is more elegant (to my eye) and has an implicit escape to
permit an arbitrary number of trailing tokens, whereas the CMS mechanism
requires every token to be predefined (which is one reason why CMS
commands can't easily handle multiple filename arguments).

Any further discussion of getopt or PARSECMD should be directed to
/dev/null.

Whew, now that we've flogged this topic past the pearly gates (and
it's not even a REXX topic), let's get back to discussing REXX.
How about any of the following (where clearly the environment of choice
should be CMS):
* Portability issues among the various run-time environments
* Performance tuning tricks for environment X
* Debugging techniques for environment Y
* Language extension proposals (only well-reasoned arguments need apply)
* Cute functions i (no, not me -- you) have coded
* Standardization efforts (:-) [WARNING: Surgeon General advises that
  discussion of this topic may be hazardous to your health]

Also, at the risk of being pedantic, I request that future postings
explicitly identify the run-time environment(s) (of course, i exclude
myself because after this posting everyone knows i'm using CMS/REXX),
since it's evident that REXX features vary among the run-time
environments on which it is supported (Amiga, CMS, Unix(?), ...).

Now here's a question (reply directly and i might post a summary) for
all those folks looking for a REXX interpreter/compiler for Unix. Why?
When you could use Bourne shell, or csh, or ksh, or tcsh (or *sh) (and
all the Unix commands expr, awk, sed), why use REXX? I can't imagine any
hefty REXX applications being ported without modification to a different
environment (say CMS to Unix), and trivial applications could easily be
re-written. REXX without extensions would make a lousy Unix command
interpreter (no pipes or i/o redirection or job control or ...) and if
the REXX application isn't a command script, but more a string
processing application, why not use awk?

Jeff Lankford
Northrop Research and Technology Center    213/544-5394
One Research Park, Palos Verdes Peninsula, CA 90274