Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!abvax!iccgcc!schmidtg
From: schmidtg@iccgcc.decnet.ab.com
Newsgroups: comp.lang.forth
Subject: STRIPPERS
Message-ID: <3154.27b16b1a@iccgcc.decnet.ab.com>
Date: 7 Feb 91 19:58:33 GMT
Lines: 75

Some time ago, in a posting regarding metacompilation, a passing
reference was made to a "stripper" program. A stripper removes all
words not used within a given application. Has anybody developed or
used such a program? If so, I would be interested in learning more
about it. Here are some of my own thoughts and questions about how a
stripper might work.

Method 1. The stripper works in conjunction with the metacompiler.

The metacompiler is modified to keep a reference count for each defined
word in the symbol table. After metacompilation, a word whose reference
count is zero is defined but never referenced, and may therefore be
eliminated in the subsequent metacompilation. Now comes the fun part.
The metacompilation is rerun and these words are ignored. How is this
done? One way might be to have the symbol table from the previous run
provide input to the next run. When the metacompiler encounters a
defining word, it checks the "old" symbol table to see whether the word
being defined was referenced. If not, it redirects output from the
target image to the bit bucket until the next defining word is
encountered. Has anybody tried a scheme like this?

Method 2. The stripper compresses a running Forth system.

The Forth system is loaded along with the application. The stripper
program is then loaded; its first word is "TASK". A list of
application words is given to the stripper (e.g. STRIP FOO STRIP BAR
STRIP BLETCH). The stripper recursively examines the call tree ("used
tree"?) of each word and adds each word encountered to its symbol
table. Starting from "TASK" and working towards lower memory, look at
each word and determine whether it is in the symbol table. If it is, do
nothing and go on to the next word.
If not, and there are no other referenced words above this one, advance
the pointer which points to the last word in the system. This
effectively "deletes" the word.

Now for some real fun. If the word is not referenced, but other
referenced words are above it, then shift all the words above this one
down in memory by the size of this word. Doing this, of course,
invalidates every reference to the words that were moved. If your
system is token threaded (I'm certain it's not), you can just change a
single entry in the token list for each moved word. Otherwise the
stripper must now parse the entire program (or rather what's left of it
after a partial strip) and fix up all references to the moved words.
Obviously, this part of the task is more difficult if you opt for
subroutine threading and inline code. Also, when the words are
"shifted", the LFA of the previous shifted word must change too. This
process is continued until the root word is reached. Any vocabulary
pointers must now be changed, and the new system is then saved out to
disk. A disadvantage of this method is that it requires enough memory
to hold the unstripped application plus the stripper program.

What do people think of these methods? Are there better ways to do this
than what I have suggested? There may be some holes too! Let me know
what you think.

--
=============================================================================
Greg Schmidt -> schmidtg@iccgcc.decnet.ab.com
=============================================================================
"People with nothing to hide have nothing to fear from O.B.I.T"
                                                        -- Peter Lomax
-----------------------------------------------------------------------------
Disclaimer: No warranty is expressed or implied. Void where prohibited.
=============================================================================
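
The marking step of Method 2 (walk the call tree from the words handed
to STRIP and keep whatever is reached) can be sketched on a toy model.
This is a rough illustration in Python rather than Forth, and it says
nothing about the hard part, which is moving code and fixing up
addresses. It assumes the dictionary can be modelled as a mapping from
each word to the words it calls, listed in definition order. The names
`reachable`, `strip_dictionary` and `zero_ref_words`, and the sample
words, are hypothetical.

```python
def reachable(calls, roots):
    """Return the set of words reachable from the root words."""
    keep = set()
    stack = list(roots)
    while stack:
        word = stack.pop()
        if word in keep:
            continue                      # already marked
        keep.add(word)
        stack.extend(calls.get(word, ()))  # follow everything it calls
    return keep


def strip_dictionary(calls, roots):
    """Return the surviving words, still in definition order."""
    keep = reachable(calls, roots)
    return [word for word in calls if word in keep]


def zero_ref_words(calls, roots):
    """Words a single reference-count pass (Method 1) would remove."""
    counts = {word: 0 for word in calls}
    for callees in calls.values():
        for callee in callees:
            counts[callee] += 1
    # The application's entry points have no callers but must be kept.
    return [w for w, n in counts.items() if n == 0 and w not in roots]


# Hypothetical dictionary, lowest memory first.
calls = {
    "DUP": [],
    "SWAP": [],
    "SQUARE": ["DUP"],
    "HELPER": ["SWAP"],
    "UNUSED": ["HELPER"],
    "FOO": ["SQUARE"],
    "BAR": ["FOO", "DUP"],
}
roots = ["FOO", "BAR"]

print(strip_dictionary(calls, roots))  # ['DUP', 'SQUARE', 'FOO', 'BAR']
print(zero_ref_words(calls, roots))    # ['UNUSED']
```

In this toy dictionary the call-tree walk drops SWAP, HELPER and UNUSED
in one go, while a single reference-count pass flags only UNUSED,
because HELPER and SWAP are still referenced by words that are
themselves dead. That suggests Method 1 may need to be repeated until
no more zero-count words turn up.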