Xref: utzoo alt.sources:2971 comp.lang.perl:3478 Path: utzoo!utgpu!cs.utexas.edu!uunet!convex!usenet From: tchrist@convex.COM (Tom Christiansen) Newsgroups: alt.sources,comp.lang.perl Subject: The Answer to All Man's Problems (part 4 of 6) Keywords: man perl Message-ID: <1991Jan07.222202.10687@convex.com> Date: 7 Jan 91 22:22:02 GMT References: <1991Jan07.220902.9726@convex.com> Sender: usenet@convex.com (news access account) Reply-To: tchrist@convex.COM (Tom Christiansen) Followup-To: alt.sources.d Organization: CONVEX Software Development, Richardson, TX Lines: 1001 Nntp-Posting-Host: pixel.convex.com Xsection would be recognized, a X.B manq Xdirectory would not be, and while a X.B man3f Xdirectory would be recognized, a X.B man3x11 Xdirectory would not be. X.PP XLikewise, the possible subsections Xfor a man page were also embedded in the source code, so Xa man page named something like X.I /usr/man/man3/XmLabel.3x11 Xwould not be found because X.B 3x11 Xwas not in the hard-coded list of viable subsections. XSome systems install all man pages stripped of subsection Xcomponents in the file name. This situation is less than optimal Xbecause it proves useful to be able Xto supply both a X.M getc 3f Xand a X.M getc 3s . XDistinguishing between subsections is Xparticularly convenient with the ``intro'' man pages; Xa vendor could supply X.M intro 3 X.M intro 3a , X.M intro 3c , X.M intro 3f , X.M intro 3m , X.M intro 3n , X.M intro 3r , X.M intro 3s , Xand X.M intro 3x Xas introductory man pages for the various libraries. XHowever, the task of running X.M access 2 Xon all possible subsections is slow and tedious, requiring Xrecompilation whenever a new subsection is invented. X.NH XReferences in the Filesystem X.PP XThe existing man system had no elegant way to handle Xman pages containing more than one entry. For example, X.M string 3 Xcontains references to X.M strcat 3 , X.M strcpy 3 , Xamongst others. 
Because the \fIman\fP program looks for Xentries only in the file system, these extra references must be Xrepresented as files that reference the base man page. The most Xcommon practice is to have a file consisting of Xa single line Xtelling X.I troff Xto source the other man page. XThis file would read something like: X.sp X.ti 5 X.CW X\&.so man3/string.3 X.CE X.sp XOccasionally, Xextra references are created with a link in the file Xsystem (either a hard link or a symbolic one). Except when Xusing Xhard links, this method wastes Xdisk blocks and inodes. In any case, Xthe directory gains more entries, slowing Xdown accesses to files in those directories. Logic Xmust be built into the \fIman\fP program to Xdetect these extra references. XIf not, when man pages are reformatted into their Xcat directories, separate formatted man pages are stored Xon disk, wasting substantial amounts of disk space Xon duplicate information. XOn systems with numerous man pages, the directories can grow Xso large that all man Xpages for a given section cannot be listed on the command line Xat one time because of kernel restrictions on the total length of the Xarguments to X.M exec 2 . XBecause of the need to store reference information Xin the file system, the problem is only made worse. XThis often happens in Xsection 3 after the man pages for the X Xlibrary have been installed, but Xcan occur in other sections as well. X.PP XThe X.M makewhatis 8 Xprogram is a Bourne shell script that generates the X.I /usr/lib/whatis Xindex, and is used by X.M apropos 1 Xand X.M whatis 1 Xto provide one-line summaries of man pages. These Xprograms are part of the X.I man Xsystem Xand are often links to each other and sometimes to X.I man Xitself. XIf any of Xthe man subdirectories contain more files than the shell Xcan successfully expand on the command line, the X.I makewhatis Xscript fails Xand no index is generated. When this occurs, X.I whatis Xand X.I apropos Xstop working. 
The X.M catman 8 Xprogram, used to pre-format raw man pages, suffers Xfrom the same problem. X.PP XOf course, X.I makewhatis Xwasn't working all that well, anyway. XIt was a wrapper around many calls to little programs Xthat each did a small piece of the work, making it Xrun slowly. XIt, too, had a hard-coded pathname for where man pages resided Xon disk and which sections were permitted. X.I Makewhatis Xdidn't always extract the proper information Xfrom the man page's \s-1NAME\s0 Xsection. When it did, this information was sometimes Xgarbled due to embedded X.I troff Xformatting information. XBut even garbled information was better Xthan none at all. XEven so, these programs left some things to be desired. X.I Apropos Xdidn't understand regular expression searches, and both Xit and X.I whatis Xpreferred to do their own lookups using basic, unoptimized C functions Xlike X.M index 3 Xrather than using a general-purpose optimized string search program Xlike X.M egrep 1 . X.NH XThe Solution X.NH 2 XA Real Database X.PP XThe problem in all these cases appeared to be that the filesystem Xwas being used as a database, and that this paradigm did not hold Xup well to expansion. Therefore the solution was to move Xthis information into a database for more rapid access. XUsing this database, X.I man Xand X.I whatis Xneed no longer call X.M access 2 Xto test all possible locations for the desired man page. XTo solve the other problems, X.M makewhatis 8 Xwould be recoded so it didn't rely on the shell Xfor looking at directories. X.NH 2 XCoding in Perl X.PP XWhen the project was first contemplated, the Xperl programming language by Larry Wall was rapidly Xgaining popularity as an alternative to C for tasks that Xwere either too slow when written as shell scripts, or Xsimply exceeded the shells' somewhat limited capabilities. 
XSince perl was Xoptimized for parsing text, had convenient X.M dbm 3x Xsupport built in to it, and the task really didn't seem complex Xenough to merit a full-blown treatment in C or C++, Xperl was selected as the language of choice. XHaving all code written in perl would also help support Xheterogeneous environments because the resulting scripts could Xbe copied and run on any hardware or software platform supporting Xperl. No recompilation would be required. X.PP XSome concern existed about choosing Xan interpreted language when one of the issues to address was Xthat of speed. It was decided to do the prototype in perl Xand, if necessary, translate this into C should performance Xprove unacceptable. X.PP XThe first task was to recode X.M makewhatis 8 Xto generate the new X.I whatis Xdatabase using \fIdbm\fP. The X.M directory 3 Xroutines were used rather than shell globbing to circumvent Xthe problem of large directories breaking shell wildcard Xexpansions. Perl proved to be an appropriate choice for this Xtype of text processing (see Figure 1). X.BF "\fImakewhatis\fP excerpt #1" Xs/\e\ef([PBIR]|\e(..)//g; # kill font changes Xs/\e\es[+-]?\ed+//g; # kill point changes Xs/\e\e&//g; # and \e& Xs/\e\e\e((ru|ul)/_/g; # xlate to '_' Xs/\e\e\e((mi|hy|em)/-/g; # xlate to '-' Xs/\e\e\e*\e(..//g && # no troff strings X print STDERR "trimmed troff string macro in NAME section of $FILE\en"; Xs/\e\e//g; # kill all remaining backslashes Xs/^\e.\e\e"\es*//; # kill comments Xif (!/\es+-+\es+/) { X # ^ otherwise L-devices would be L X print STDERR "$FILE: no separated dash in $_\en"; X $needcmdlist = 1; # forgive their braindamage X s/.*-//; X $desc = $_; X} else { X ($cmdlist, $desc) = ( $`, $' ); X $cmdlist =~ s/^\es+//; X} X.EF X.NH 2 XDatabase Format X.PP XThe database entries themselves are conveniently Xaccessed as arrays from perl. To save space and Xaccommodate man pages with multiple references, two Xkinds of database entries exist: direct and indirect. 
XIndirect entries are simply references to direct entries. XFor example, indirect entries for X.M getc 3s , X.M getchar 3s , X.M fgetc 3s , Xand X.M getw 3s Xall point to the real entry, which is X.M getc 3s . XIndirect entries are created for multiple entries in Xthe \s-1NAME\s0 section, for symbolic and hard links, and Xfor X.B \&.so Xreferences. Using the \s-1NAME\s0 section is the preferred Xmethod; the others are supported for backwards compatibility. X.PP X.ne 4 XAssuming that the \s-1WHATIS\s0 array has been bound to the Xappropriate X.I dbm Xfile, storing indirect entries is trivial: X.sp X.CW X.ti 1i X$WHATIS{'fgetc'} = 'getc.3s'; X.sp X.CE XWhen a program encounters an indirect entry, such as Xfor \fIfgetc\fP, it must make another lookup based on Xthe return value of the first lookup (stripped of its Xtrailing extension) until it finds a direct entry. The Xtrailing extension is kept so that an indirect reference Xto X.M gtty 3c Xdoesn't accidentally pull out X.M stty 1 Xwhen it really wanted X.M stty 3c . X.PP XThe format of a direct entry is more complicated, because Xit needs to encode the description to be used by X.M whatis 1 Xas well as the section and subsection information. XIt can be distinguished from an indirect entry because Xit contains four fields delimited by control-A's (\s-1ASCII 001\s0), Xwhich are themselves prohibited from being in any Xof the fields. The fields are as follows: X.br X.in +5n X.IP 1 XList of references that point to this man page; this Xis usually everything to the left of the hyphen Xin the \s-1NAME\s0 section. X.IP 2 XRelative pathname of the file the man page is kept in; Xthis is stored for the indirect entries. X.IP 3 XTrailing component of the directory in which the Xman page can be found, such as X.B 3 Xfor \fBman3\fP. X.IP 4 XDescription of the man page for use by Xthe X.I whatis Xand X.I apropos Xprograms; basically everything to the right of the hyphen in the X\s-1NAME\s0 section. 
X.in -5n X.PP XAt first glance, the third field would Xseem redundant. It would appear that you could Xderive it from the character after the dot in the second field. XHowever, to support arbitrary subdirectories like X.B man3f Xor X\fBman3x11\fP, you must also know the name of the Xdirectory so you don't look in X.B man3 Xinstead. Additionally, a long-standing tradition exists Xof using the X.B mano Xsection Xto store old man pages from arbitrary sections. XFurthermore, man pages are sometimes installed in the Xwrong section. To support these scenarios, restrictions Xregarding the format of filenames used for man pages were Xrelaxed in \fIman\fR, X\fImakewhatis\fR, and \fIcatman\fR, Xbut warnings would be issued by X.I makewhatis Xfor man pages installed in directories that don't have Xthe same suffix as the man pages. X.NH 2 XMultiple References to the Same Topic X.PP XA problem arises from the fact that the same topic Xmay exist in more than one section of the manual. XWhen a lookup is performed on a topic, Xyou want to retrieve all possible man page locations Xfor that topic. The X.I whatis Xprogram wants to display them all to the user, while Xthe X.I man Xprogram will either show all the man pages X(if the X.B \-a Xflag is given) or Xsort what it has retrieved according to a particular section and Xsubsection precedence, by default showing entries from section X1 before those from section 2, and so forth. Therefore, Xeach lookup may actually return a list of direct and Xindirect lookups. This list is delimited by control-B's X(\s-1ASCII 002\s0), which are stripped from the data fields, should Xthey somehow contain any. The code for storing a direct entry Xin the X.I whatis Xdatabase is featured in Figure 2. 
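As a rough illustration of the format described above, the following sketch shows how a reader of the database might unpack one lookup: split on control-B to get the individual entries, split each entry on control-A, and chase indirect entries until a direct one is found. It is not taken from the man source. The subroutine name resolve_topic, the plain %WHATIS hash standing in for the dbm-bound array, the depth limit, and the sample data are all assumptions made for the example.

```perl
#!/usr/bin/perl
# Hypothetical sketch: unpack a whatis lookup into its direct entries.
# %WHATIS is an ordinary hash standing in for the dbm-bound array;
# resolve_topic and the sample records are illustrative only.
use strict;
use warnings;

my %WHATIS = (
    'getc'  => join("\001", 'getc, getchar, fgetc, getw', 'man3/getc.3s',
                    '3', 'get character or word from stream'),
    'fgetc' => 'getc.3s',                       # indirect entry
);

sub resolve_topic {
    my ($topic, $depth) = @_;
    $depth ||= 0;
    # Assumed safeguard: give up on missing topics and on reference loops.
    return () if $depth > 8 || !defined $WHATIS{$topic};

    my @found;
    for my $entry (split /\002/, $WHATIS{$topic}) {   # control-B separates entries
        my @fields = split /\001/, $entry, -1;        # control-A separates fields
        if (@fields == 4) {                           # direct entry
            my %rec;
            @rec{qw(list page section desc)} = @fields;
            push @found, \%rec;
        } else {                                      # indirect, e.g. "getc.3s"
            my ($base, $ext) = $entry =~ /^(.*)\.([^.]+)$/ or next;
            # Use the kept extension to pick the matching direct entries.
            push @found, grep { $_->{page} =~ /\.\Q$ext\E$/ }
                         resolve_topic($base, $depth + 1);
        }
    }
    return @found;
}

for my $rec (resolve_topic('fgetc')) {
    print "$rec->{page} ($rec->{section}): $rec->{desc}\n";
}
```

The extension filter is the simplest reading of why the trailing extension is kept; a real implementation would also have to cope with compressed pages and mismatched suffixes.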
X.BF "\fImakewhatis\fP excerpt #2" Xsub store_direct { X local($cmd, $list, $page, $section, $desc) = @_; # args X local($datum); X X $datum = join("\e001", $list, $page, $section, $desc); X X if (defined $WHATIS{$cmd}) { X if (length($WHATIS{$cmd}) + length($datum) + 1 > $MAXDATUM) { X print STDERR "can't store $page -- would break DBM\en"; X return; X } X $WHATIS{$cmd} .= "\e002"; # append separator X } X $WHATIS{$cmd} .= $datum; # append entry X} X.EF X.KE X.PP XNotice the check of the new datum's Xlength against the value of \s-1MAXDATUM.\s0 This is because of the Xinherent limitations in the implementation of the X.M dbm 3x Xroutines. This is 1k for X.I dbm Xand 4k for X.I ndbm . XThis restriction will be relaxed Xif a \fIdbm\fR-compatible set of routines is written without Xthese size limitations. The \s-1GNU\s0 X.I gdbm Xroutines hold promise, but they were released after the Xwriting of these programs and haven't been investigated yet. XIn practice, these limits are seldom if ever reached, especially Xwhen X.I ndbm Xis used. X.NH XOther Problems, Other Solutions X.PP XThe rewrite of X.I makewhatis , X.I catman , Xand X.I man Xto understand multiple man trees and to use a database Xfor topic-to-pathname mapping Xdid much to alleviate the most important problems Xin the existing man system, but several minor problems Xremained. Since this was a complete rewrite of the entire Xsystem, it seemed an appropriate time to address these as well. X.NH 2 XIndexing Long Pages X.PP XSeveral of the most frequently consulted man pages on the system Xhave grown beyond the scope of a quick reference guide, Xinstead filling the function of a detailed user manual. XMan pages of this sort include those for shells, window Xmanagers, Xgeneral purpose Xutilities such as awk and perl, Xand the \s-1X11\s0 man pages. 
XAlthough these man pages Xare internally organized into sections and subsections that Xare easily visible on a hard-copy printout, the on-line Xman system could not recognize these internal Xsections. Instead, the user was forced to search through pages Xof output looking for the section of the man page containing Xthe desired information. X.PP XTo alleviate this time-consuming tedium, the man program Xwas taught to parse the X.I nroff Xsource for man pages in order to build up an index of these sections Xand present them to the user on demand. XSee Figure 3 for an excerpt from the X.M ksh 1 Xindex page, displayable via the new X.B \-i Xswitch. X.BF "\fIksh\fP index excerpt" XIdx Subsections in ksh.1 Lines X 1 NAME 3 X 2 SYNOPSIS 22 X 3 DESCRIPTION 15 X 4 Definitions. 43 X 5 Commands. 338 X 6 Comments. 6 X 7 Aliasing. 107 X 8 Tilde Substitution. 47 X 9 Command Substitution. 28 X10 Process Substitution. 49 X11 Parameter Substitution. 645 X12 Blank Interpretation. 15 X13 File Name Generation. 87 X.EF X.PP XThe X.I /usr/man/idx*/ Xdirectories Xserve the Xsame function for saved indices Xas X.I /usr/man/cat*/ Xdirectories do for saved formatted man pages. XThese are regenerated as needed according to Xthe same criteria used to regenerate the cat pages. XThey can be used to index into a given man page or Xto list a man page's subsections. XTo begin at a given subsection, the user appends Xthe desired subsection to the name of the man page Xon the command line, Xusing a forward slash as a delimiter. Alternatively, Xthe user can just supply a trailing slash on the man page Xname, in which case they are presented with the index listing Xlike the one the X.B \-i Xswitch provides, then prompted for the section Xin which they are interested. A double slash indicates Xan arbitrary regular expression, not a section name. XThis is merely a short-hand notation for first running Xman and then typing X.CW X/expr X.CE Xfrom within the user's pager. 
XSee Figure 4 Xfor example usages of the indexing features. X.BF "Index Examples" Xman -i ksh # show sections Xman ksh/ # show sections, prompt for which one X Xman ksh/tilde Xman ksh/8 # equivalent to preceding line X Xman ksh/file Xman ksh/generat # equivalent to preceding line Xman ksh/13 # so is this X Xman ksh//hangup # start at this string X.EF X.PP XThis indexing scheme is implemented by searching the index stored in X.I /usr/man/idx1/ksh.1 Xif it exists, or generated dynamically otherwise, Xfor the requested subsection. A numeric subsection is Xeasily handled. For strings, a case-insensitive Xpattern match is first Xmade anchored to the front of the string, then \(em failing Xthat \(em anywhere in the section description. This way Xthe user doesn't need to type the full section title. XThe X.I man Xprogram starts up the pager with a Xleading argument to begin at that section. Both X.M more 1 Xand X.M less 1 Xunderstand this particular notation. XIn the first Xexample given above, this would be X.sp X.CW X.ti +.5i Xless '+/^[ \et]*Tilde Substitution' /usr/man/cat1/ksh.1 X.sp X.CE X.PP XOnce again, perl proved Xuseful for coding this algorithm concisely. The Xsubroutine for doing this is given in XFigure 5. Given an expression such as ``5'' Xor ``tilde'' or ``file'' and a pathname of the man Xpage, X.I man Xloads Xan array of subsection Xindex titles and quickly retrieves the proper Xheader to pass on to the pager. Perl's built-in X.B grep Xroutine for selecting from arrays those elements Xconforming to certain criteria made the coding easy. 
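The command-line notation described above (a trailing slash, a slash followed by a number or title fragment, and a double slash for a regular expression) could be taken apart along the following lines. This is only a sketch under stated assumptions, not the code used by man: the subroutine names split_topic and pager_start and the 'kind' labels are invented for the example, and the section title is passed to the pager unescaped, as in the less invocation shown earlier.

```perl
#!/usr/bin/perl
# Hypothetical sketch of parsing "page/subsection" arguments.
# split_topic, pager_start and the kind labels are illustrative names only.
use strict;
use warnings;

# Returns (page, kind, value); kind is one of
# 'none', 'prompt', 'regex', 'index' or 'title'.
sub split_topic {
    my ($arg) = @_;
    return ($arg, 'none', '') unless $arg =~ m{/};
    my ($page, $rest) = split m{/}, $arg, 2;
    return ($page, 'prompt', '')   if $rest eq '';           # "ksh/"
    return ($page, 'regex', $1)    if $rest =~ m{^/(.*)$};   # "ksh//hangup"
    return ($page, 'index', $rest) if $rest =~ /^\d+$/;      # "ksh/8"
    return ($page, 'title', $rest);                          # "ksh/tilde"
}

# Build the pager's leading argument from a full section header,
# such as the one a routine like find_index would return.
sub pager_start {
    my ($header) = @_;
    return '+/^[ \t]*' . $header;
}

for my $arg ('ksh', 'ksh/', 'ksh/8', 'ksh/tilde', 'ksh//hangup') {
    my ($page, $kind, $value) = split_topic($arg);
    printf "%-12s page=%s kind=%s value=%s\n", $arg, $page, $kind, $value;
}
print pager_start('Tilde Substitution'), "\n";
```

Separating the parsing from the index lookup keeps the numeric and title cases distinct before either one touches the saved index; the 'index' and 'title' values would then be handed to the lookup routine to obtain the header passed to pager_start.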
X.BF "Locate Subsection by Index" Xsub find_index { X local($expr, $path) = @_; # subroutine args X local(@matches, @ssindex); X @ssindex = &load_index($path); X X if ($expr > 0) { # test for numeric section X return $ssindex[$expr]; X } else { X if (@matches = grep (/^$expr/i, @ssindex)) { X return $matches[0]; X } elsif (@matches = grep (/$expr/i, @ssindex)) { X return $matches[0]; X } else { X return ''; X } X } X} X.EF X.NH 2 XConditional Tbl and Eqn Inclusion X.PP XSeveral other relatively minor enhancements were made Xto the man system in the course of its rewrite. XOne of these Xwas to include calls to X.M eqn 1 Xand X.M tbl 1 Xwhere appropriate. For instance, the \s-1X11\s0 man pages use X.I tbl Xdirectives to construct a number of tables. XIt was not sufficient to supply Xthese extra filters for all man pages. Besides the Xslight performance degradation this would incur, a Xmore serious problem exists: some systems have man pages that Xcontain embedded X.LB .TS Xand X.LB .TE Xdirectives; however, the data between them was not X.I tbl Xinput, but rather its output. They have already Xbeen pre-processed in the unformatted versions. XTo do so again causes X.I tbl Xto complain bitterly, so heuristics to check for this condition Xwere built in to the function that determines which filters Xare needed. X.PP XTo support tables and equations in man pages when viewed on-line, Xthe output must be run through X.M col 1 Xto be legible. Unfortunately, this strips the man pages Xof any bold font changes, which is undesirable because it is Xoften important to distinguish between bold and italics for Xclarity. Therefore, before the formatted man page is fed to X\fIcol\fP, all text in bold (between escape sequences) Xis converted to character-backspace-character combinations. These Xcombinations Xcan be recognized by the user's pager as a character in Xa bold font, just as underbar-backspace-character is recognized Xas an italic (or underlined) one. 
Unfortunately, while X.I less Xdoes recognize this convention, X.I more Xdoes not. By storing the formatted versions with all escape-sequences Xremoved, the user's pager can be invoked without a pipe to X.I ul Xor X.I col Xto fix the reverse line motion directives. This provides the pager with Xa handle on the pathname of the cat page, allowing users to back up Xto the start of man pages, even exceptionally long ones, without exiting the X.I man Xprogram. This would not be feasible if the pager were being fed Xfrom a pipe. X.NH 2 XTroffing and Previewing Man Pages X.PP XNow that many sites have high-quality laser printers Xand bit-mapped displays, it seemed desirable for X.I man Xto understand how to direct X.I troff Xoutput to these. A new option, \fB-t\fR, Xwas added to mean that X.I troff Xshould be used instead of X\fInroff\fR. XThis way users can easily get pretty-printed versions of Xtheir man pages. X.PP XFor workstation or X-terminal users, X.I man Xwill recognize Xa \s-1TROFF\s0 environment variable or Xcommand line argument to indicate an Xalternate program to use for typesetting. X(This presumes that the program recognizes X.I troff Xoptions.) This method often produces more legible output Xthan X.I nroff Xwould, allows the user to stay in their office, and saves Xtrees as well. X.NH 2 XSection Ordering X.PP XThe same topic can occur in more than one section of Xthe manual, but Xnot all users on the system want the same default Xsection ordering that X.I man Xuses to sort these possible pages. XFor instance, XC programmers who want to look up the man page for X.M sleep 3 Xor X.M stty 3 Xfind that by default, X.I man Xgives them X.M sleep 1 Xand X.M stty 1 Xinstead. A \s-1FORTRAN\s0 programmer may want to see X.M system 3f , Xbut instead gets X.M system 3 . XTo accommodate these needs, the X.I man Xprogram will honor a \s-1MANSECT\s0 environment Xvariable (or a X.B \-S Xcommand line switch) containing a list of section suffixes. 
XIf subsection or multi-character section ordering Xis desired, this string should be colon-delimited. XThe default ordering is ``ln16823457po''. XA C programmer might set his \s-1MANSECT\s0 to be ``231'' instead to access Xsubroutines and system calls before commands of the same name. XA \s-1FORTRAN\s0 programmer might prefer ``3f:2:3:1'' to get Xat the \s-1FORTRAN\s0 versions of subroutines before the standard XC versions. XSections absent from the \s-1MANSECT\s0 have a sorting priority Xlower than any that are present. X.NH 2 XCompressed Man Pages X.PP XBecause man pages are \s-1ASCII\s0 text files, they stand to benefit from Xbeing run through the X.M compress 1 Xprogram. XCompressing man pages Xtypically yields disk space savings of around 60%. XThe start-up time for decompressing the man page when Xviewing is not enough to be bothersome. However, running X.I makewhatis Xacross compressed man pages takes significantly longer Xthan running it over uncompressed ones, so some sites may wish to Xkeep only the formatted pages compressed, not the unformatted Xones. X.PP XTwo different Xways of indicating compressed man pages seem to exist Xtoday. One is where the man page itself has an attached X.B .Z Xsuffix, yielding pathnames like X\fI/usr/man/man1/who.1.Z\fR. XThe other way is to have Xthe section directory contain the X.B .Z Xsuffix Xand have the files named normally, as in X\fI/usr/man/man1.Z/who.1\fR. XEither strategy is supported to ease porting Xthe program to other systems. XAll programs dealing with man pages have been updated to Xunderstand man pages stored in compressed form. X.NH 2 XAutomated Consistency Checking X.PP XAfter receiving a half-dozen or so bug reports regarding Xnon-existent man pages referenced in \s-1SEE\s0 \s-1ALSO\s0 sections, Xit became apparent that the only way to verify that all Xbugs of this nature had really been expurgated would be to automate the process. 
XThe X.I cfman Xprogram Xverifies that man pages Xare mutually consistent in their \s-1SEE\s0 \s-1ALSO\s0 references. It Xalso reports man pages whose X.LB .TH Xline claims the man page is in Xa different place than X.I cfman Xfound it. X.I Cfman Xcan locate man pages Xthat are improperly referenced rather than merely missing. It Xcan be run on an entire man tree, or on individual files as Xan aid to developers writing new man pages. X.BF "Sample \fIcfman\fP run" Xat.1: cron(8) really in cron(1) Xbinmail.1: xsend(1) missing Xdbadd.1: dbm(3) really in dbm(3x) Xksh.1: exec(2) missing Xksh.1: signal(2) missing Xksh.1: ulimit(2) missing Xksh.1: rand(3) really in rand(3c) Xksh.1: profile(5) missing Xld.1: fc(1) really in fc(1f) Xsccstorcs.1: thinks it's in ci(1) Xuuencode.1c: atob(n) missing Xyppasswd.1: mkpasswd(5) missing Xfstream.3: thinks it's in fstream(3c++) Xftpd.8c: syslog(8) missing Xnfmail.8: delivermail(8) missing Xversatec.8: vpr(1) missing X.EF X.PP XThe amount of output produced by X.I cfman Xis startling. XA portion of the output of a sample run Xis seen in Figure 6. XSome of its complaints are relatively harmless, such as X.I dbm Xbeing in section X.B 3x Xrather than section X\fB3\fR, because the X.I man Xprogram can find entries with the subsection left off. XHaving inconsistent X.LB .TH Xheaders is also harmless, although the printed Xman pages will have headers that do not reflect their Xfilenames on the disk. XHowever, entries that refer to pages that are truly absent, like X.M exec 2 Xor X.M delivermail 8 , Xmerit closer attention. X.NH 2 XMultiple Architecture Support X.PP XAs mentioned in the discussion of the need for a \s-1MANPATH\s0, Xa site may for various reasons wish to maintain several Xcomplete sets of man pages on the same machine. Of course, Xa user could know to specify the full pathname of the Xalternate tree on the command line Xor set up their environment appropriately, but this is Xinconvenient. 
Instead, it is preferable Xto specify the machine type on the command line and let Xthe system worry about pathnames. X.ne 5 XConsider these examples: X.br X.CW X.nf X.na X.in +.5i Xman vax csh Xapropos sun rpc Xwhatis tahoe man X.in -.5i X.CE X.ad X.fi X.PP XTo implement this, Xwhen presented with more than one argument, X.I man X(in any of its three guises) Xchecks to see whether the first non-switch argument Xis a directory beneath X.I /usr/man . XIf so, it automatically adjusts its \s-1MANPATH\s0 to that subdirectory. X.PP XNot all vendors use precisely the same set of X.M man 7 Xmacros for formatting their man pages. Furthermore, it's Xhelpful to see in the header of the man page which manual Xit came from. The X.I man Xprogram therefore looks for a local X.I tmac.an Xfile in the root of the current man tree for alternate macro Xdefinitions. If this file exists, it will be used rather than Xthe system defaults for passing to X.I nroff Xor X.I troff Xwhen reformatting. X.NH XPerformance Analysis X.PP XThe X.I man Xprogram is one that is often used on the system, Xso users are sensitive to any significant degradation Xin response time. Because it is written in perl (an Xinterpreted language) this was cause for concern. XOn a \s-1CONVEX C2\s0, the C version runs faster when only Xone element is present in the \s-1MANPATH\s0. XHowever, when the \s-1MANPATH\s0 contains four Xelements, the C version bogs down considerably because of Xthe large number of X.M access 2 Xcalls it must make. X.PP XThe start-up time on the parsing Xof the script, now just over 1300 lines long, is around X0.6 seconds. This time can be reduced by dumping the Xparse tree that perl generates to disk and executing that instead. XThe expense of this action is disk space, as the current implementation Xrequires that the whole perl interpreter be included in the Xnew executable, not just the parse tree. 
This method Xyields performance superior to that of the C version, Xirrespective of the number of components in the user's \s-1MANPATH\s0, Xexcept occasionally on the initial run. This is because the Xprogram needs to be loaded Xinto memory the first time. If perl itself is installed ``sticky'' Xso it is memory resident, start-up time improves considerably. XIn any case, the Xtotal variance (on a \s-1CONVEX\s0) is Xless than two seconds in the worst case (and often Xunder one second), so it was deemed acceptable, particularly Xconsidering the additional functionality the perl version offers. X.PP XNothing in the algorithms employed in the X.I man Xprogram requires that it be written in perl; Xit was just easier this way. It could be rewritten in C Xusing X.M dbm 3x Xroutines, although the development time would probably Xbe much longer. X.PP XThe X.I makewhatis Xprogram was originally a conglomeration of many calls to various individual Xutilities such as X\fIsed\fP, X\fIexpand\fP, X\fIsort\fP, and others. The perl rewrite runs in less than half the time Xof the original, and does a much better job. There are two Xreasons for the speed increase. The first is the cost of the numerous X.M exec 2 Xcalls made via the shell script used by the old version of X.I makewhatis . XThe second is that Xperl is optimized for text processing, which is most of what X.I makewhatis Xis doing. X.PP XTotal development time was only a few weeks, Xwhich was much shorter than originally anticipated. The short Xdevelopment cycle was chiefly attributable to Xthe ease of text processing in perl, the many built-in Xroutines for doing things that in C would have required Xextensive library development, and, last but not at all least, Xthe omission of the compilation stage in the normal edit-compile-test Xcycle of development when working with non-interpreted languages. 
X.NH XConclusions X.PP XThe system described above has been in operation for the last Xsix months on a large local network consisting of three dozen X\s-1CONVEX\s0 machines, a token \s-1VAX\s0, quite a few \s-1HP\s0 workstations Xand servers, and innumerable Sun workstations, all running different Xflavors of \s-1UNIX\s0. Despite this heterogeneity, Xthe same code runs on all systems without alterations. XFew problems have been seen, and those that did arise were quickly Xfixed in the scripts, which could be immediately redistributed Xto the network. The principal project goals of improved functionality, Xextensibility, and execution time were adequately met, and the Xexperience of rewriting a set of standard \s-1UNIX\s0 utilities Xin perl was an educational one. XMan pages stand a much better chance of being internally consistent Xwith each other. XResponse from the user and development community has Xbeen favorable. They have Xbeen relieved by the many bug fixes and pleasantly surprised Xby the new functionality. The suite of man programs will replace Xthe old man system in the next release of \s-1CONVEX\s0 utilities. X.\" Should be .BB here but that seems to mutilate my last BF figure X.sp 3 X.QP X.I X.SM XTom Christiansen left the University of Wisconsin with an \s-1MS-CS\s0 Xin 1987 Xwhere he had been a system administrator for 6 years to join X\s-1CONVEX\s0 XComputer Corporation in Richardson, Texas. XHe is a software development engineer Xin the Internal Tools Group there, designing software tools Xto streamline software development and systems administration Xand to improve overall system security. 
X.BE SHAR_EOF if test 34978 -ne "`wc -c < 'man.ms'`" then echo shar: "error transmitting 'man.ms'" '(should have been 34978 characters)' fi chmod 664 'man.ms' fi echo shar: "extracting 'COPYING'" '(151 characters)' if test -f 'COPYING' then echo shar: "will not over-write existing file 'COPYING'" else sed 's/^ X//' << \SHAR_EOF > 'COPYING' X# You are free to use, modify, and redistribute these scripts X# as you wish for non-commercial purposes provided that this X# notice remains intact. SHAR_EOF if test 151 -ne "`wc -c < 'COPYING'`" then echo shar: "error transmitting 'COPYING'" '(should have been 151 characters)' fi chmod 664 'COPYING' fi echo shar: "extracting 'man'" '(39119 characters)' if test -f 'man' then echo shar: "will not over-write existing file 'man'" else sed 's/^ X//' << \SHAR_EOF > 'man' X#!/usr/local/bin/perl X# X# man - perl rewrite of man system X# tom christiansen X# X# Copyright 1990 Convex Computer Corporation. X# All rights reserved. X# X# -------------------------------------------------------------------------- X# begin configuration section X# X# this should be adequate for CONVEX systems. if you copy this script X# to non-CONVEX systems, or have a particularly outre local setup, you may X# wish to alter some of the defaults. X# -------------------------------------------------------------------------- X X$PAGER = $ENV{'PAGER'} || 'more'; X X# assume "less" pagers want -sf flags, all others must accept -s. X# note: some less's prefer -r to -f. you might also add -i if supported. X# X$is_less = $PAGER =~ /^\S*less(\s+-\S.*)?$/; X$PAGER .= $is_less ? ' -si' : ' -s'; # add -f if using "ul" X X# man roots to look in; you would really rather use a separate tree than X# manl and mann! see %SECTIONS and $MANALT if you do. 
X$MANPATH = &config_path; X X# default section precedence X$MANSECT = $ENV{'MANSECT'} || 'ln16823457po'; X X# colons optional unless you have multi-char section names X# note that HP systems want this: X# $MANSECT = $ENV{'MANSECT'} || '1:1m:6:8:2:3:4:5:7'; X X# alternate architecture man pages in