Path: utzoo!attcan!uunet!husc6!mailrus!ames!pasteur!ucbvax!decwrl!hplabs!hpda!hp-sde!hpfcdc!hpldola!winter
From: winter@hpldola.HP.COM (Kirt Winter)
Newsgroups: comp.software-eng
Subject: Flexible Prettyprinters, an example...
Message-ID: <1420008@hpldola.HP.COM>
Date: 24 May 88 16:33:59 GMT
Organization: HP Elec. Design Div. -ColoSpgs
Lines: 122

In an earlier discussion about style rules for C shops, I posted a response in which I talked about a flexible, intelligent prettyprinter which I designed in pursuit of my MS. Judging from the responses I've gotten, there is sufficient interest to warrant a full posting to the net.

At the current time, the fate of my work on further versions is in limbo. Some entities have expressed interest in the prettyprinter, but nothing has really piqued my interest enough to get me excited about parting with it. I may yet give the source code (or just the executable) to individuals for personal use.

The challenge was to design a prettyprinter that could be easily adapted to a wide variety of programming styles. To demonstrate the feasibility of the concept, I created the prototype for Turbo Pascal 3.0 source code.

First, I'll talk about what prettyprinters actually do. A prettyprinter is intended to take syntactically legal (usually) source code and, in some manner, transform the "style" in which it is written. This could range from highlighting keywords to rearranging even the order of procedures and functions. The designer of the prettyprinter usually chooses a happy medium, typically altering the spacing of the program.

The usually stated goal of a prettyprinter is to "improve readability" in some manner. Traditional prettyprinter designers have attempted to choose a style that they believe will accomplish that goal. Unfortunately, the response to a traditional prettyprinter is... "I don't like where it puts the relative to ." Or something along that line.
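
To make the "traditional" case concrete, here is a minimal sketch of a fixed-style prettyprinter, written in Python purely for illustration. It is not IPP and not any real tool: the token pattern, the function name `fixed_style`, and the three layout rules are all assumptions chosen for the example, and it ignores string literals, comments, and real numbers. The point is only that every spacing decision is hard-wired by the designer.

```python
import re

# Toy token pattern for a Pascal-like language (an assumption for this sketch,
# not a real Turbo Pascal 3.0 lexer): a few two-character operators,
# identifiers/keywords, integers, then any other single character.
TOKEN = re.compile(r":=|<=|>=|<>|[A-Za-z_][A-Za-z0-9_]*|\d+|\S")

def fixed_style(source: str, indent: str = "  ") -> str:
    """Throw away the author's white-space and emit the designer's choices."""
    lines = []      # finished output lines
    current = ""    # line currently being built
    depth = 0       # BEGIN/END nesting depth

    def flush():
        nonlocal current
        if current:
            lines.append(current)
            current = ""

    for tok in TOKEN.findall(source):
        upper = tok.upper()
        if upper == "END":          # rule 1: END starts its own line, one level out
            flush()
            depth = max(depth - 1, 0)
        if tok == ";":              # rule 2: no space before ';', newline after it
            current += ";"
            flush()
            continue
        if current:
            current += " " + tok    # default: exactly one space between tokens
        else:
            current = indent * depth + tok
        if upper == "BEGIN":        # rule 3: newline after BEGIN, one level in
            flush()
            depth += 1
    flush()
    return "\n".join(lines)

print(fixed_style("begin x:=1;   y := 2 ; end"))
```

Anyone who prefers `x:=1` without spaces, or a different indent for the body, has no recourse short of editing the rules in the program itself.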

In my prototype, I focused on the relative placement of language tokens. That doesn't mean that other style facets (capitalization of keywords, underscores vs. capitals in identifiers, etc.) couldn't be handled, but rather that my prototype doesn't handle them, in order to concentrate on "indentation and placement" issues.

A prettyprinter, whether flexible or not, recognizes "white-spaces" and then replaces the original white-spaces in the source code with ones favored (traditionally) by the prettyprinter's designer. The trick to adding flexibility is to allow a user to specify the white-spaces that will be used to replace the ones in the source.

So, one could envision a prettyprinter in which a separate file of white-spaces would be saved. A user could edit that file, and thereby change the way the prettyprinter worked. This is certainly possible, but in looking at the wide range of styles, and all the places that white-space could be recognized, I felt that while it would still be useful to do it that way, it would also be quite cumbersome for the user.

As I stated earlier, a prettyprinter must recognize white-spaces in order to replace them (even if recognizing means skipping). Why not use the same method, but instead of replacing the white-spaces, analyze and save them? This is the way IPP (my prototype Intelligent PrettyPrinter) is configured. Specifying a new style to IPP is accomplished by typing...

    ipp L condition THEN compound | IF condition THEN statement ;

One simply chooses the areas in the language that one wants to "learn", and inserts tokens accordingly. IPP currently uses 48 of these tokens, and these are sufficient to capture a very wide range of syntactic style. IPP does not currently format anything below the statement level, although this could be added fairly easily.
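
The "analyze and save, then replace" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not IPP's actual design: the names `scan`, `learn`, and `apply_style`, the small keyword set, and the choice to key the saved white-space on pairs of adjacent token classes are all hypothetical. It also stores the literal white-space it sees, so it knows nothing about nesting depth, and unlike IPP as described it operates below the statement level.

```python
import re

# Assumed keyword subset and toy lexer; not a full Turbo Pascal 3.0 grammar.
KEYWORDS = {"IF", "THEN", "ELSE", "BEGIN", "END", "WHILE", "DO"}
LEXEME = re.compile(r"(\s*)(:=|<=|>=|<>|[A-Za-z_][A-Za-z0-9_]*|\d+|\S)")

def scan(source):
    """Yield (preceding white-space, token class, token text) triples."""
    for ws, tok in LEXEME.findall(source):
        if tok.upper() in KEYWORDS:
            cls = tok.upper()       # each keyword is its own class
        elif tok[0].isalpha() or tok[0] == "_":
            cls = "ID"              # all identifiers share one class
        elif tok[0].isdigit():
            cls = "NUM"
        else:
            cls = tok               # punctuation stands for itself
        yield ws, cls, tok

def learn(sample):
    """Recognize the white-space in a sample and save it instead of replacing it."""
    style, prev = {}, None
    for ws, cls, _ in scan(sample):
        if prev is not None:
            style[(prev, cls)] = ws     # a later occurrence overwrites an earlier one
        prev = cls
    return style

def apply_style(source, style, default=" "):
    """Replace the white-space in `source` with the saved white-space."""
    out, prev = [], None
    for _, cls, tok in scan(source):
        if prev is not None:
            out.append(style.get((prev, cls), default))
        out.append(tok)
        prev = cls
    return "".join(out)

style = learn("if a<b then\n    x:=1;")
print(apply_style("IF count < limit THEN total := 0 ;", style))
```

Here the sample teaches tight operators and a line break after THEN, and the second call replays those choices on differently spaced input. Token pairs that never appear in the sample fall back to a single space, which is one reason a sample has to cover the constructs one cares about.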
The only problem with formatting below the statement level (from a learning standpoint, this is another problem with using a "user-defined" style sample) is that this is the area of syntactic style in which programmers are least likely to be consistent (at least based on my work).

Flexible prettyprinters are available for some languages. One exists for LISP, but uses "deformat" statements for each user-defined feature, not anything approaching the "learn by example" method. One exists for the Logitech Modula-2 compiler (and is included with the package), which takes one sample file, modified by the user, and "learns" from that. Ones for C are probably on the horizon.

So, sorry for the long-winded discussion, but again, there seemed to be enough interest for a posting. I'd appreciate hearing any comments, suggestions, offers :-), etc.

Kirt

-------------------------------------------------------------------------------
Kirt Alan Winter                                       winter@hpldola.hp.com
Hewlett Packard - EDD                                  (719) 590-5974
Colorado Springs, Colorado
-------------------------------------------------------------------------------
I had these ideas and opinions before I came to HP.  HP has to find its own.
-------------------------------------------------------------------------------