Path: utzoo!attcan!uunet!wuarchive!usc!julius.cs.uiuc.edu!rpi!uwm.edu!ogicse!plains!kmagel@plains.NoDak.edu
From: kmagel@plains.NoDak.edu (ken magel)
Newsgroups: comp.software-eng
Subject: reverse engineering report
Message-ID: <8204@plains.NoDak.edu>
Date: 11 Feb 91 15:11:02 GMT
Organization: North Dakota State University, Fargo
Lines: 649

Here are the informative replies I have received thus far to my query
concerning the practices of reverse engineering. Several people mentioned
that they would report or ask colleagues to report on actual reverse
engineering efforts, but I have not yet received anything on those efforts.
If I do, I will post another summary.

From byrne@ksuvax1.cis.ksu.edu Mon Feb 4 12:47:41 1991
Received: from harris.cis.ksu.edu by plains.NoDak.edu; Mon, 4 Feb 91 12:47:06 -0600
Return-Path:
Received: from ksuvax1.cis.ksu.edu by harris.cis.ksu.edu (5.58/SWH-2.03); id AA15726; Mon, 4 Feb 91 12:46:12 CST
Received: by ksuvax1.cis.ksu.edu (5.59++/CIS1.1) id AA01405; Mon, 4 Feb 91 12:46:05 CST
Date: Mon, 4 Feb 91 12:46:05 CST
From: byrne@ksuvax1.cis.ksu.edu (Eric J. Byrne)
Message-Id: <9102041846.AA01405@ksuvax1.cis.ksu.edu>
To: kmagel@plains.NoDak.edu
Subject: reverse engineering refs
Status: R

I saw your request on the net for reverse engineering references. Reverse
engineering seems to mean different things to different people, and there
are a variety of reasons for doing it.

Here are some references that I have collected. They range around the topic
of reverse engineering, but some stray a bit. The notes on each are mine and
may or may not be completely accurate. These refs are not in any kind of
order, other than random (if you consider that an order).

Naturally, I hope to see a summary of what you receive.

- Eric

#############

TITLE: Reverse Engineering and Design Recovery: A Taxonomy
AUTHOR: Chikofsky, Elliot J. and Cross, James H.
SOURCE: IEEE Software  VOL: 7  NO: 1
DATE: January 1990  PAGES: 13 - 17
This article defines and relates six terms: reverse engineering, forward
engineering, redocumentation, design recovery, restructuring, and
reengineering.

TITLE: A Knowledge-Based Approach to the Analysis of Code and Program Design Language (PDL)
AUTHOR: Das, Bikas K.
SOURCE: Conference on Software Maintenance
DATE: October 16-19, 1989  PAGES: 290 - 296
This paper presents a knowledge-based technique for understanding programs
(PDL and corresponding code) in terms of their plans. Understanding code
also supports certain QA activities.

TITLE: Program Recognition
AUTHOR: Ourston, Dirk
SOURCE: IEEE Expert  VOL: 4  NO: 4
DATE: Winter 1989  PAGES: 36 - 49
Reviews in some detail three systems in program recognition research: the
Program Recognizer, Talus, and Proust. Gives strengths and limitations of
each system.

TITLE: Recognizing Design Decisions in Programs
AUTHOR: Rugaber, Spencer, Ornburn, Stephen B., and LeBlanc, Richard J.
SOURCE: IEEE Software  VOL: 7  NO: 1
DATE: January 1990  PAGES: 46 - 54
This article discusses design decisions and their effect on source code.
Discusses how to characterize decisions and how to find them.

TITLE: Using Function Abstraction to Understand Program Behavior
AUTHOR: Hausler, Philip A., Pleszkoch, Mark G., Linger, Richard C., and Hevner, Alan R.
SOURCE: IEEE Software  VOL: 7  NO: 1
DATE: January 1990  PAGES: 55 - 63
Discusses proposed characteristics and techniques of an automated system
for function abstraction. The goal of function abstraction is to extract
business rules (requirements) from code and express them in nonprocedural
terms for inspection and analysis.

TITLE: Knowledge-Based Program Analysis
AUTHOR: Harandi, Mehdi T., and Ning, Jim Q.
SOURCE: IEEE Software  VOL: 7  NO: 1
DATE: January 1990  PAGES: 74 - 81
Describes PAT, a support tool for program maintenance.
It uses an object-oriented framework of programming concepts and a
heuristic-based concept-recognition mechanism to understand programs.

TITLE: Recognizing a Program's Design: A Graph-Parsing Approach
AUTHOR: Rich, Charles, and Wills, Linda M.
SOURCE: IEEE Software  VOL: 7  NO: 1
DATE: January 1990  PAGES: 82 - 89
Describes the Recognizer, a program that automatically finds all
occurrences of a given set of cliches in a program and builds a description
of that program. Also discusses difficulties with recognizing cliches.

TITLE: A Reverse Engineering Methodology to Reconstruct Hierarchical Data Flow Diagrams For Software Maintenance
AUTHOR: Benedusi, P., Cimitile, A., and De Carlini, U.
SOURCE: Conference on Software Maintenance
DATE: October 16-19, 1989  PAGES: 180 - 189
Describes a methodology used to produce from code a hierarchy of Data Flow
Diagrams (DFDs) at different levels of abstraction.

TITLE: Design Recovery for Maintenance and Reuse
AUTHOR: Biggerstaff, Ted J.
SOURCE: IEEE Computer  VOL: 22  NO: 7
DATE: July 1989  PAGES: 36 - 49
This article discusses design recovery, proposes an architecture to
implement the concept, illustrates how the architecture operates, and
describes the progress toward implementing it. The architecture is based on
a domain model that can detect instances of known procedural entities.

TITLE: Understanding and Documenting Programs
AUTHOR: Basili, Victor R., and Mills, Harlan D.
SOURCE: IEEE Transactions on Software Engineering  VOL: SE-8  NO: 3
DATE: May 1982  PAGES: 270 - 283
This paper reports on an experiment in trying to understand an unfamiliar
program. The program was restructured, and a specification and correctness
proof were developed for it. The techniques used included function
specification, the discovery of loop invariants, case analysis, and the use
of a bounded indeterminate auxiliary variable.

TITLE: The Retrospective Introduction of Abstraction into Software
AUTHOR: Colbrook, A., and Smythe, C.
SOURCE: Conference on Software Maintenance
DATE: October 16-19, 1989  PAGES: 166 - 173
Proposes a technique that facilitates the retrospective introduction of
abstract data types into existing systems, and presents a software tool to
aid this process. The purpose of the paper is to show that it is possible
to take the original source code data structures and remap them onto a set
of more rigidly defined and understood data structures.

TITLE: Software Renewal: A Case Study
AUTHOR: Sneed, Harry M.
SOURCE: IEEE Software  VOL: 1  NO: 3
DATE: July 1984  PAGES: 56 - 63
Describes a design recovery effort to re-document an existing system and
develop an environment for future maintenance of the system. Gives a good
description of the recovery steps, their interrelationships, and the
products produced. Techniques and problems are skimmed over.

TITLE: Using Modern Design Practices To Upgrade Aging Software Systems
AUTHOR: Britcher, Robert N., and Craig, James J.
SOURCE: IEEE Software  VOL: 3  NO: 3
DATE: May 1986  PAGES: 16 - 24
Gives experiences by IBM in upgrading FAA's NAS en route software. Existing
software was abstracted to a mathematical PDL notation. This level was
redesigned and reimplemented. Only 100,000 LOC out of the 1.5 million LOC
system were redesigned, and 52,000 new LOC were created. Having to redesign
to function with existing code increased the difficulty of the task.

TITLE: Maintenance and Porting of Software by Design Recovery
AUTHOR: Arango, Guillermo, Baxter, Ira, Freeman, Peter, and Pidgeon, Christopher
SOURCE: Conference on Software Maintenance
DATE: 1985  PAGES: 42 - 49
Proposes a method for the re-implementation of programs: recover the design
of a program, then use the recovered design to re-implement the program in
a new environment (porting), with different functionality (maintenance), or
with different performance (enhancement). The method is integrated with the
Draco development paradigm.

TITLE: Re-engineering vs. Reverse Engineering
AUTHOR: Ulrich, William
SOURCE: Software Magazine  VOL: 8  NO: 11
DATE: September 1988  PAGES: 8,10
Gives definitions of re-engineering and reverse engineering. Briefly
explains the benefits of each. Claims that re-engineering is the first step
in reverse engineering and that reverse engineering tools don't exist yet,
but will be available within two years.

TITLE: TMM: Software Maintenance by Transformation
AUTHOR: Arango, Guillermo, Baxter, Ira, Freeman, Peter, and Pidgeon, Christopher
SOURCE: IEEE Software  VOL: 3  NO: 3
DATE: May 1986  PAGES: 27 - 39
Proposes two methods for maintenance. TMM, the transformation maintenance
model, works with the Draco paradigm. It reverses design decisions to reach
a least common abstraction for the current implementation and the desired
implementation. MBA, maintenance by abstraction, handles the situation
where no specification or design information exists. This paper is a
rewrite of the authors' Conference on Software Maintenance-85 paper.

TITLE: Inverse Transformation of Software From Code To Specification
AUTHOR: Sneed, Harry M., and Jandrasics, Gabor
SOURCE: Conference on Software Maintenance
DATE: October 24-27, 1988  PAGES: 102 - 109
Describes how existing Cobol software can be retranslated into a logical
software design stored in the form of a relational database for both data
and program design elements, and how these relations can be retranslated
into an entity/relationship model of the system with entity, structure, and
relationship descriptions. Relations are described and information sources
are given, but transformation techniques are not mentioned.

TITLE: A CASE for Reverse Engineering
AUTHOR: Bachman, Charlie
SOURCE: Datamation  VOL: 34  NO: 13
DATE: July 1, 1988  PAGES: 49 - 56
Non-technical article that discusses the potential for CASE tools that
incorporate support for reverse engineering, re-engineering, and expert
systems to help with backward and forward development.
The article deals mostly with reverse engineering: what it is, its
benefits, and why CASE tools need to support it. Business-oriented article.

TITLE: Simple Tools To Automate Documentation
AUTHOR: Kuhn, D. Richard, and Hollis, Carol G.
SOURCE: Conference on Software Maintenance
DATE: November 11-13, 1985  PAGES: 203 - 210
This paper describes how program information can be extracted from source
code using simple programs. The technique relies on the use of a
programming standard when writing the software. Information retrieved
includes calling interfaces, variable usage, calling-called relationships,
etc. Oriented towards PL/1, but general enough for other languages.

TITLE: SRE: A Knowledge-based Environment for Large-Scale Software Re-engineering Activities
AUTHOR: Kozaczynski, Wojtek, and Ning, Jim Q.
SOURCE: 11th International Conference on Software Engineering
DATE: 1989  PAGES: 113 - 122
This paper describes the underlying principles of a knowledge-based
Software Re-engineering Environment (SRE). Issues related to the
re-engineering of large-scale software systems are addressed. The focus
seems to be more on reverse engineering support than on supporting
modifications.

TITLE: PROMPTER: A Knowledge Based Support Tool for Code Understanding
AUTHOR: Fukunaga, Koichi
SOURCE: 8th International Conference on Software Engineering
DATE: August 28-30, 1985  PAGES: 358 - 363
Reports on a prototype tool called PROMPTER for code understanding. Given
assembler source code for a program, it produces a higher level description
of the program using programming knowledge, hardware knowledge, and program
conventions. The basis for the tool and its structure are given.

TITLE: The Evolution of Programs: Program Abstraction and Instantiation
AUTHOR: Dershowitz, Nachum
SOURCE: 5th International Conference on Software Engineering
DATE: March 9-12, 1981  PAGES: 79 - 88
Gives two detailed examples demonstrating a methodology for deriving an
abstract program schema that captures a shared technique underlying a set
of concrete programs. A schema can be instantiated to create a new concrete
program. The method is based on formal logic and specifications. The
concrete programs must have input/output and body assertions given for this
method to work.

TITLE: A Knowledge-Based System for Software Maintenance
AUTHOR: Calliss, F. W., Khalil, M., Munro, M., and Ward, M.
SOURCE: Conference on Software Maintenance
DATE: October 24-27, 1988  PAGES: 319 - 324
Describes a project to develop a knowledge-based tool called the
Maintainer's Assistant. The tool is directed at helping a maintainer
develop an understanding of unknown code. The tool uses a formal language
to model source code, transformations are used to "realize" the purpose of
the code, and programming plans are used to spot known algorithms.

TITLE: A Documentation Method Based on Cross-Referencing
AUTHOR: Foster, John R., and Munro, Malcolm
SOURCE: Conference on Software Maintenance
DATE: September 21-24, 1987  PAGES: 181 - 185
This paper is concerned with the use and maintenance of documentation. It
describes a toolset (and methodology) called DOCMAN that uses a
cross-referencer to collect the names of all program items. A maintainer
can then add a description of an item as an understanding of that item is
gained. The tool can also be used to view recorded item information.

TITLE: MAP: a Tool for Understanding Software
AUTHOR: Warren, Sally
SOURCE: Sixth International Conference on Software Engineering
DATE: 1982  PAGES: 28 - 37
This paper describes MAP, a tool that helps maintenance programmers
understand their programs.
The paper lists the MAP command set and shows examples of its use. Its
targeted support areas are also well explained. MAP helps show procedural
structure, follow control-flow and data-flow, understand data aliasing,
search for patterns, and compare two different versions of the same
program. It supports COBOL.

TITLE: Automatic Documentation Methodologies For Software Maintenance
AUTHOR: Landis, Larry D., Hyland, Patricia M., Gilbert, Alton L., and Fine, Andrew J.
SOURCE: Prepared by Technical Solutions, Inc. for U.S. Army Research Office
DTIC Number: AD-A204 752
DATE: January 25, 1989
Based on the idea that software maintenance is made easier by accurate
documentation, this report deals with automatic techniques for generating
documentation from source code. This is a brief report indicating the goals
of the research and the results reached. No details are given.

TITLE: Redocumenting Software Systems using Hypertext Technology
AUTHOR: Fletton, Nigel T., and Munro, Malcolm
SOURCE: Conference on Software Maintenance
DATE: October 24-27, 1988  PAGES: 54 - 59
This is a brief paper that discusses the possible advantages of using
hypertext to document a software system. Most of the paper discusses
current software documentation troubles and how they relate to maintenance.
Then the idea of using hypertext is advocated. The authors have just begun
an experimental study using hypertext to collect program information gained
by maintainers while documenting a program.

TITLE: Maintenance and Reverse Engineering: Low-Level Design Documents Production and Improvement
AUTHOR: Antonini, P., Benedusi, P., Cantone, G., and Cimitile, A.
SOURCE: Conference on Software Maintenance
DATE: September 21-24, 1987  PAGES: 91 - 100
This paper presents a technique for generating Jackson's logic diagrams and
Warnier/Orr logical process structures from Cobol source code. The
techniques are general enough to be applied to most programming languages.
The paper also presents a methodology based on using reverse engineering in
the maintenance phase. The methodology is based on the generation of new
design documents and their comparison with earlier document versions.

TITLE: Documentation in a Software Maintenance Environment
AUTHOR: Landis, Larry D., Hyland, Patricia M., Gilbert, Alton L., and Fine, Andrew J.
SOURCE: Conference on Software Maintenance
DATE: October 24-27, 1988  PAGES: 66 - 73
This paper describes a project to help maintenance programmers develop an
understanding of a program by generating documentation from source code.
Recovered documentation includes extended Nassi-Shneiderman charts, a data
dictionary, pictorial representations of data structures, and a
pretty-printer. The system supports Fortran, C, and Ada, which are
translated into an internal language called the Documentation Language
(DL). A good review of documentation methodologies is given in an appendix.

TITLE: PAT: A Knowledge-based Program Analysis Tool
AUTHOR: Harandi, Mehdi T., and Ning, Jim Q.
SOURCE: Conference on Software Maintenance
DATE: October 24-27, 1988  PAGES: 312 - 318
This article describes a knowledge-based tool to support program
understanding and debugging. PAT uses program plans to recognize common
algorithms and typical implementation mistakes. The paper gives an overview
of the tool architecture and explains the program plan notation and use. An
example tool session is included.

TITLE: Software Maintenance as an Engineering Discipline
AUTHOR: Linger, Richard C.
SOURCE: Conference on Software Maintenance
DATE: October 24-27, 1988  PAGES: 292 - 297
This paper argues that software maintenance must use more formal techniques
in order to become a manageable activity. It discusses the use of the
Linger-Mills theory of program primes as a formal construct. Its use for
program control restructuring is given.

TITLE: Reverse Software Engineering
AUTHOR: Prywes, N., Ge, X., Lee, I., and Song, M.
SOURCE: Tech Report MS-CIS-88-99, Department of Computer and Information Science, University of Pennsylvania
DATE: December 1989

From Invader%cup.portal.com@nova.unix.portal.com Thu Feb 7 00:34:54 1991
Received: from portal.COM by plains.NoDak.edu; Thu, 7 Feb 91 00:34:39 -0600
Received: by nova.unix.portal.com (3.1.18.113) id m0j459s-0000pvC; Wed, 6 Feb 91 22:31 PST
Received: by portal.unix.portal.com (%I%) id AA27836; Wed, 6 Feb 91 22:31:05 PST
Received: by hobo.corp.portal.com (4.0/4.0.3 1.6) id AA20576; Wed, 6 Feb 91 22:31:03 PST
To: kmagel@plains.nodak.edu
From: Invader@cup.portal.com
Subject: reverse engineering
Lines: 27
Date: Wed, 6 Feb 91 22:31:02 PST
Message-Id: <9102062231.1.18964@cup.portal.com>
X-Origin: The Portal System (TM)
Status: RO

OK. I'll bite. Reverse engineering has been my hobby for years. I started
off with a paper tape of an interpreter system for the PDP-11 and have gone
from there. Since I'm not dead serious about it, I don't have a great deal
of advice.

In general, though, it requires a tool of some sort. I have built tools
that progressively attack the problem and that save hints about the
outcome. The more interactive, the better. The idea is to show something
useful and, when you understand that, make it more symbolic. An ideal
system would have a lot of built-in knowledge of how programs work and
would do a lot more analysis, breaking the code into basic blocks and using
flow analysis techniques like a compiler would use.

Finally, compiled code is the easiest to deal with because it is normally
very regular (unless the compiler did some code motion or cross jumping).
Regularity is the key, especially in dealing with assembly language.

What is really cool about it is that even at the bit level you can tell
that different people wrote different parts of a program. The personality
comes through.

I don't know why you want to know any of this stuff, but I'd be happy to
talk more about it if you have specific ideas, etc.
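
[Editorial sketch, not part of the reply above.] The "breaking into basic
blocks" step mentioned in this reply can be illustrated with the usual
leader-finding rule a compiler would use. This is only a minimal sketch
under stated assumptions: the instruction format (an opcode plus an
optional jump-target index), the opcode names, and the function names are
all hypothetical and are not taken from any tool described in the message.

```python
# Hypothetical toy instruction format: (opcode, jump_target_index_or_None).
# The opcode names below are assumptions made for illustration only.
BRANCHES = {"jmp", "beq", "bne"}
TERMINATORS = BRANCHES | {"ret", "halt"}


def find_leaders(instrs):
    """Return the sorted indices of instructions that start a basic block."""
    leaders = {0} if instrs else set()
    for i, (op, target) in enumerate(instrs):
        if op in BRANCHES and target is not None:
            leaders.add(target)          # a branch target starts a block
        if op in TERMINATORS and i + 1 < len(instrs):
            leaders.add(i + 1)           # so does whatever follows a branch or return
    return sorted(leaders)


def basic_blocks(instrs):
    """Split the instruction list into half-open (start, end) index ranges."""
    leaders = find_leaders(instrs)
    ends = leaders[1:] + [len(instrs)]
    return list(zip(leaders, ends))


program = [
    ("mov", None),   # 0
    ("cmp", None),   # 1
    ("beq", 5),      # 2: conditional branch to instruction 5
    ("add", None),   # 3
    ("jmp", 1),      # 4: loop back to instruction 1
    ("ret", None),   # 5
]
print(basic_blocks(program))
```

Once the code is cut into blocks like this, flow analysis works on the
graph of blocks rather than on individual instructions, which is where the
regularity of compiled code pays off.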

mkd

ps I'd be glad to receive any other info you get, too.

From bnfb@cs.washington.edu Mon Feb 4 15:03:17 1991
Received: from june.cs.washington.edu by plains.NoDak.edu; Mon, 4 Feb 91 15:03:15 -0600
Received: by june.cs.washington.edu (5.64/7.0jh) id AA21642; Mon, 4 Feb 91 13:03:19 -0800
Date: Mon, 4 Feb 91 13:03:19 -0800
From: bnfb@cs.washington.edu (Bjorn Freeman-Benson)
Return-Path:
Message-Id: <9102042103.AA21642@june.cs.washington.edu>
To: kmagel@plains.NoDak.edu
Subject: Re: reverse engineering
Newsgroups: comp.software-eng
In-Reply-To: <7940@plains.NoDak.edu>
Organization: University of Washington, Computer Science, Seattle
Cc:
Status: R

>Does anyone have any real world experiences they are willing to
>share? Thanks.

What the heck, here's what I do:

I work on a contract basis writing linkers for Zortech C++. My linker must
be compatible with the Microsoft linker, but there are no specs on what the
MS linker does or how it works. Thus I run test cases through the MS linker
and check the output. Then I run a slightly different test case, and check
it again.

For example, I want to know how segment attributes are combined: is it a
logical OR? a last-come, last-served? a defaults are always overridden? a
logical AND? etc. I write dozens of little test cases based on what I
believe the solution to be and check my hypothesis. Sometimes I am right,
sometimes I am surprised, and (rarely, if I am a good engineer) I miss the
truth.

The trick for my work is to apply all the off-by-one, corner-case, etc.
experience that I have to choosing the test cases.

Regards,
Bjorn N. Freeman-Benson

From jcardow@blackbird.afit.af.mil Mon Feb 4 15:14:15 1991
Received: from [129.92.1.2] by plains.NoDak.edu; Mon, 4 Feb 91 15:14:05 -0600
Received: by blackbird.afit.af.mil (5.64+/a0.25) id AA10390; Mon, 4 Feb 91 16:14:05 -0500
Date: Mon, 4 Feb 91 16:14:05 -0500
From: James E. Cardow
Message-Id: <9102042114.AA10390@blackbird.afit.af.mil>
To: kmagel@plains.NoDak.edu
Subject: Re: reverse engineering
Newsgroups: comp.software-eng
References: <7940@plains.NoDak.edu>
Status: R

In comp.software-eng you write:

> About ten days ago, I requested information concerning how people do
>reverse engineering of computer software. To date, there have been six
>requests to post what I found out, but very little in the way of informative
>responses. The January, 1990 issue of IEEE Software is devoted in part to
>reverse engineering. There are some good articles there and some good
>references. Does anyone have any real world experiences they are willing to
>share? Thanks.

Ken,

I am trying to prepare a course in reverse engineering/re-engineering for
working professionals, so while I can't give you practical examples I can
give you some pointers. I did participate in one effort several years ago
without a good understanding of what was going on, which is why I am
interested in developing the course.

Some references of interest:

IEEE Computer Society Tutorial on Software Restructuring by Robert Arnold.
(I attended a tutorial by Mr Arnold last month and he said it is in
revision.)

IEEE Computer Society Proceedings from the Conference on Software
Maintenance for 1990. Several tracks addressed reverse engineering etc. One
paper by Takis Katsoulakos was especially interesting as an overview of
efforts in Europe.

A conference currently in planning by one of the Navy outfits around D.C.
is on current reverse engineering efforts.

Articles by Biggerstaff, Chikofsky, and Rugaber in IEEE Software.

Work by Eric Byrne of Kansas State U. for the Air Force as a summer study.

Don't know if this will help, but I too would be interested in finding out
your results. I'd appreciate any information you would care to share.

Jim Cardow
Air Force Institute of Technology
Wright Patterson AFB, OH

From kim@unagi.cis.upenn.edu Tue Feb 5 13:02:27 1991
Received: from LINC.CIS.UPENN.EDU by plains.NoDak.edu; Tue, 5 Feb 91 13:02:23 -0600
Received: from UNAGI.CIS.UPENN.EDU by linc.cis.upenn.edu id AA04374; Tue, 5 Feb 91 14:02:32 -0500
Return-Path:
Received: by unagi.cis.upenn.edu id AA27771; Tue, 5 Feb 91 14:02:31 EST
Date: Tue, 5 Feb 91 14:02:31 EST
From: kim@unagi.cis.upenn.edu (JEE-IN KIM)
Posted-Date: Tue, 5 Feb 91 14:02:31 EST
Message-Id: <9102051902.AA27771@unagi.cis.upenn.edu>
To: kmagel@plains.NoDak.edu
Subject: Re: reverse engineering
Newsgroups: comp.software-eng
In-Reply-To: <7940@plains.NoDak.edu>
Organization: University of Pennsylvania
Cc:
Status: R

Noah Prywes (nsp@central.cis.upenn.edu) and his colleagues have been
developing an equational language called MODEL, which was used as an
intermediate language for reverse engineering from CMS-2 code to Ada and C.
He and his consulting company have lots of REAL WORLD experience, I should
say. I think you had better ask him for a list of references, including his
technical reports.

Best,
Jee-In

--
---------------------------
Jee-In Kim
kim@unagi.cis.upenn.edu
---------------------------
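
[Editorial sketch, not part of the replies above.] The black-box approach
described in Bjorn Freeman-Benson's reply (guess a rule, write small test
cases, see which guesses survive) can be sketched roughly as follows. This
is a minimal illustration under stated assumptions: `reference_tool` is a
hypothetical stand-in for running the real, undocumented program and
parsing its output, and the candidate rules and bit patterns are invented
for the example. Nothing here reflects how any actual linker combines
segment attributes.

```python
from functools import reduce

# Candidate rules for combining attribute bit flags from several inputs.
# These hypotheses are illustrative assumptions, not documented behaviour.
HYPOTHESES = {
    "logical OR": lambda attrs: reduce(lambda a, b: a | b, attrs),
    "logical AND": lambda attrs: reduce(lambda a, b: a & b, attrs),
    "last wins": lambda attrs: attrs[-1],
    "first wins": lambda attrs: attrs[0],
}


def reference_tool(attrs):
    """Hypothetical stand-in for the undocumented program being probed.

    In real use this would feed a generated test input to the actual tool
    and read back its output. For the demo it secretly uses "last wins".
    """
    return attrs[-1]


def surviving_hypotheses(test_cases, oracle=reference_tool):
    """Return the names of the rules that match the oracle on every case."""
    alive = dict(HYPOTHESES)
    for attrs in test_cases:
        observed = oracle(attrs)
        alive = {name: rule for name, rule in alive.items()
                 if rule(attrs) == observed}
    return sorted(alive)


cases = [
    (0b01, 0b10),   # every rule predicts a different result here
    (0b11, 0b01),   # a second, differently shaped probe
    (0b01, 0b01),   # corner case: identical inputs cannot separate the rules
]
print(surviving_hypotheses(cases))
```

The last case shows why the choice of test cases matters: a probe on which
all hypotheses agree eliminates nothing, so the useful cases are the ones
where the candidate rules predict different outputs.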