Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!think.com!mintaka!spdcc!esegue!compilers-sender
From: adamsf@turing.cs.rpi.edu (Frank Adams)
Newsgroups: comp.compilers
Subject: Re: Help on disassembler/decompilers
Keywords: code, assembler, debug
Message-ID: <_5A%GS%@rpi.edu>
Date: 10 Sep 90 22:20:33 GMT
References: <HOW.90Sep5173755@sundrops.ucdavis.edu> <12976@june.cs.washington.edu>
Sender: compilers-sender@esegue.segue.boston.ma.us
Reply-To: adamsf@turing.cs.rpi.edu (Frank Adams)
Organization: RPI CS Dept.
Lines: 27
Approved: compilers@esegue.segue.boston.ma.us

In article <12976@june.cs.washington.edu> pardo@cs.washington.edu (David Keppel) writes:
>My guess is that decompiling into a language that is, e.g.,
>saccharine-sweetened assembler (C) is `easy', while decompiling, e.g.,
>into APL is hard.

If we assume that the program is to be decompiled into the language in
which it was written, then in general the less the compiler optimizes the
generated code, the easier it is to decompile.
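
As a rough illustration of the optimization point, here is a minimal sketch
in C.  The function names are hypothetical, and the rewritten form is only
one plausible thing an optimizer might do with a multiply by a constant; it
is not a claim about what any particular compiler emits.

```c
#include <assert.h>

/* As the programmer wrote it. */
unsigned scale_as_written(unsigned x)
{
    return x * 10;
}

/* What a naive decompiler might hand back if the compiler replaced the
 * multiply with shifts and an add (8x + 2x).  Same result, but the
 * original intent ("times ten") is no longer visible. */
unsigned scale_as_recovered(unsigned x)
{
    return (x << 3) + (x << 1);
}

/* Small self-check: both forms agree, including on unsigned wraparound. */
void check_scale(void)
{
    assert(scale_as_written(7) == 70u);
    assert(scale_as_recovered(7) == 70u);
    assert(scale_as_written(~0u) == scale_as_recovered(~0u));
}
```

With optimization off, the generated code would more likely contain a plain
multiply, and the decompiler would have a much easier time recovering the
first form.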

A second problem is type inference.  APL, with a fixed set of data types,
is easier in this respect than C.  For example, when the code loads a pointer
into a register and indexes off of it, what kind of struct is the pointer
pointing to?
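
A small sketch of the ambiguity, in C.  The two struct types and all of the
function names are made up for illustration.  The point is only that two
unrelated types can put a field at the same offset, so the load by itself
does not say which one the pointer refers to.

```c
#include <stddef.h>

/* Two unrelated (hypothetical) types with the same layout. */
struct point { int x;  int y;  };
struct range { int lo; int hi; };

/* Typed accessors, as they might appear in the original source. */
int point_y(const struct point *p)  { return p->y;  }
int range_hi(const struct range *r) { return r->hi; }

/* Roughly what the decompiler sees in both cases: load an int from
 * (base register + constant offset), with no type attached. */
int load_int_at(const void *base, size_t offset)
{
    return *(const int *)((const char *)base + offset);
}
```

Since offsetof(struct point, y) and offsetof(struct range, hi) are equal,
point_y() and range_hi() can compile to the same instructions, and both look
like load_int_at() with that offset.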

If the goal is only to get some kind of higher-level language representation
of an arbitrary executable, C will indeed be easier.  But this kind of
decompilation is not very useful: what read

	foo.bar = 0;

in the original is likely to come out as

	*(int *)((char *)&foo + 8) = 0;
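
For anyone who wants to convince themselves that the two forms are the same
store, here is a minimal sketch.  The declaration of foo is my assumption
(the layout is chosen so that bar lands at offset 8 where int is 4 bytes),
and the function names are hypothetical.  I use offsetof rather than a
literal 8 so the sketch does not depend on that.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout; any struct with an int member named bar would do. */
struct foo_type { int a; int b; int bar; };

struct foo_type foo = { 1, 2, 3 };

/* The store as written in the source. */
void clear_by_name(void)
{
    foo.bar = 0;
}

/* The same store as a type-blind decompiler would express it. */
void clear_by_offset(void)
{
    *(int *)((char *)&foo + offsetof(struct foo_type, bar)) = 0;
}

/* Small self-check: both functions clear the same field. */
void check_clear(void)
{
    foo.bar = 5;
    clear_by_name();
    assert(foo.bar == 0);
    foo.bar = 5;
    clear_by_offset();
    assert(foo.bar == 0);
}
```

The second form is perfectly good C, but the field name and the struct type
are gone, which is most of what made the original readable.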
