Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!rice!uw-beaver!fluke!inc
From: inc@tc.fluke.COM (Gary Benson)
Newsgroups: comp.text.sgml
Subject: Re: looking for more information
Message-ID: <1990Dec11.214726.8463@tc.fluke.COM>
Date: 11 Dec 90 21:47:26 GMT
References: <200@tivoli.UUCP> <1990Nov28.105230.10365@tc.fluke.COM> <215@tivoli.UUCP>
Organization: John Fluke Mfg. Co., Inc., Everett, WA
Lines: 131

In article <215@tivoli.UUCP> lark@tivoli.UUCP (Lar Kaufman) writes:

>In case I have misled anyone, I hasten to admit that I am very much a
>student of SGML, not a master. I have _never_actually_used_ SGML in a
>product, so my knowledge is only theoretical. I am only now in a position
>to begin working with SGML concepts and proto-SGML software. I agree with
>Gary's proposal, and I hope someone with a practical knowledge will accept
>the task of maintaining a FAQ file.
>
>Gary also mentions tools written in perl. I would love to see people
>volunteering code and techniques for implementing SGML solutions. I know
>that others have written programs for converting structural information
>to/from SGML using various languages (such as Icon). Where are they?
>Has anyone considered setting up an FTP site for SGML tools?
>
>A final comment: we should remember to distinguish between SGML, the
>standard, and various software products that implement it. It can be
>confusing to mix these. For example, when I say that you can imbed
>chapters in an SGML document, I do not imply any knowledge of how a
>specific SGML product does it (or doesn't do it).

Woops, I didn't mean to set you up for guru-hood, Lar! It's just that
your posting was well-written and informative without being esoteric to
the point of meaninglessness. I hope this newsgroup can be a place for
a wide-spectrum discussion of SGML, but so far, it has seemed weighted
toward theory, and I found your posting to be a refreshing breath of
reality.

As to your idea about people posting code and techniques, I can say
this -- we have several man-years of programming in our quasi-SGML
autocoding programs, and I'm sure I'd be in big trouble if I
disseminated those programs. However, our techniques are rather
interesting (to us, at least), and I was surprised to see no response
to my query about whether others are using them.

Long ago, back when we typeset all of Fluke's technical manuals, a
decision was made in the Publications Department to attempt to keep the
writing function as separate as possible from the production function.
We defined production as encompassing page design, preparation of files
for typesetting, typesetting itself, layout, and of course printing,
binding, and so on. There have been two very interesting results from
that decision:

1. While the industry as a whole has moved to "desk-top publishing",
   we find ourselves without many peers with whom to discuss methods.
   We still have our staff typing in raw text, having rejected the
   "Mac on every desk" approach.

2. We are in an excellent position to take advantage of new software
   tools because we have a lot of experience with implied markup
   techniques.

In our approach, the writer's file has an absolute minimum of explicit
instructions or codes. We have long used the string ---n at the end of
lines to indicate heading levels. This is basically the only "coding"
our writers do in files. Everything else is recognized by context or
through regular expression pattern matching, something that perl is
extremely adept at. We use a perl program to scan the file and
determine what objects are present.

Figure titles are identified by the following string, appearing on a
line by itself:

    Figure n-n. arbitrary text title

When our coding program comes across that string, there is only one
possible generic code to send to the output file:
.

We are toying with the idea of having the title end with a "higher
level generic code" like the heading level indicators. This would serve
as a cue from writer to gencoding program indicating the desired size
of the illustration. For example, "Figure 3-3. Arbitrary Text Title/1"
might indicate a full-page illustration, while changing the number to
2, 3, or 4 would indicate half, third, and quarter pages respectively.

Lists are indented objects beginning with a number or letter, followed
by a dot. When the program is confronted with a list environment, it
compares the current indent to the former one, and the result
determines when to send the tag for proper nesting. For bullet lists,
we use the letter o with no following dot.

As each line is processed, a subroutine scans it for any "special
characters" and sends the required string to the gencode file. We like
+/- to appear as a plus sign above a minus. Regular expressions look
for degree symbols and Greek letters like mu and omega, among others.
For example, the string 9oF means 9 degrees F, while 13 uF means 13
microfarads.

A major concern has been that reviewers should not be asked to try to
make their way through a text loaded with coding. We've found that we
get higher quality review remarks when the review copy looks similar to
the expected final page. That is why we have pre-printout filters that
convert lines ending ---n to boldface. If we do incorporate the
"Figure Title/n" idea, we will probably not print the code even in
review copies, instead converting the number to line or form feeds.

Our perl program currently recognizes and generates generic codes for:

* Section headings
* Notes, Cautions, and Warnings
* Textual headings up to 4th order (we tell writers if they need to go
  any higher than 4th order headings, they are probably writing funny).
* Alpha, numeric, and two types of bullet lists at 4 indent levels
* Figure and Table Titles

...and of course, everything else is just running text :-)

Many of our manuals need special treatment for a variety of things --
special fonts, in-text keycap art, special formats -- so we by no means
have technical publication figured out down to a non-event, but we are
getting there! Generic coding and implied markup are powerful
approaches to the traditional problems in publishing (especially
publishing of structured documents as opposed to books, magazines, and
so on).

As I asked before, I'd be very interested in hearing from others who
are using similar methods. Or other perl users! We had our first
program written for us about 2 1/2 years ago, and it is still cranking
along, even through two dozen patch levels.

Gary Benson
Supervisor, Publication Services
John Fluke Mfg. Co. Inc.

-- 
Gary Benson -=[ S M I L E R ]=- -_-_-_-inc@fluke.com_-_-_-_-_-_-_-_-_-_-
Go jump in a goddam volcano, you fucking cave newt! -greg Nowak
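
The recognition rules described in the post (---n headings, figure
titles on a line by themselves, indented lists nested by comparing
indents, and special-character substitution) can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
Fluke program: it is written in Python rather than perl, and the output
codes (<hN>, <figtitle>, <list>, <li>), the entity strings, and the
function names gencode and special_chars are all invented here, since
the post does not give the real ones. It also does not distinguish
numbered lists from bullet lists in its output.

```python
import re

# All tag and entity names below are hypothetical; the post does not
# give the real generic codes used by the perl program.
HEADING = re.compile(r"^(?P<text>.*?)\s*---(?P<level>\d)\s*$")
FIGURE = re.compile(r"^Figure\s+(?P<num>\d+-\d+)\.\s+(?P<title>.+?)\s*$")
# List items: indented, then a number or letter followed by a dot ...
NUMBERED = re.compile(r"^(?P<indent>\s+)(?:\d+|[A-Za-z])\.\s+(?P<text>.+)$")
# ... or the letter o with no following dot, for bullets.
BULLET = re.compile(r"^(?P<indent>\s+)o\s+(?P<text>.+)$")

SPECIALS = [
    (re.compile(r"\+/-"), "&plusmn;"),                 # +/-
    (re.compile(r"(?<=\d)o(?=[FC]\b)"), "&deg;"),      # 9oF
    (re.compile(r"(?<=\d )u(?=[FAVH]\b)"), "&micro;"), # 13 uF
]


def special_chars(text):
    """Replace typed stand-ins for special characters with entity strings."""
    for pattern, replacement in SPECIALS:
        text = pattern.sub(replacement, text)
    return text


def gencode(lines):
    """Return a list of output lines with generic codes added."""
    out = []
    indents = []  # indent widths of the lists that are currently open

    def close_lists(down_to=-1):
        # Close every open list indented deeper than down_to.
        while indents and indents[-1] > down_to:
            indents.pop()
            out.append("</list>")

    for raw in lines:
        line = special_chars(raw.rstrip("\n"))
        item = NUMBERED.match(line) or BULLET.match(line)
        if item:
            width = len(item.group("indent"))
            close_lists(width)
            if not indents or indents[-1] < width:
                # Deeper indent than the former item: open a nested list.
                indents.append(width)
                out.append("<list>")
            out.append("<li>" + item.group("text") + "</li>")
            continue
        close_lists()  # any non-list line ends all open lists
        heading = HEADING.match(line)
        figure = FIGURE.match(line)
        if heading:
            level = heading.group("level")
            out.append("<h%s>%s</h%s>" % (level, heading.group("text"), level))
        elif figure:
            out.append('<figtitle num="%s">%s</figtitle>'
                       % (figure.group("num"), figure.group("title")))
        elif line.strip():
            out.append(line)  # everything else is just running text
    close_lists()
    return out
```

Calling gencode() on a list of text lines returns the coded lines in
order. The indent stack is one simple way to decide when to open or
close a nested list; the real program may well do it differently.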