Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!think.com!mintaka!spdcc!esegue!compilers-sender
From: phorgan@cup.portal.com
Newsgroups: comp.compilers
Subject: Disassembly
Keywords: assembler, debug
Message-ID: <9009091032.1.139@cup.portal.com>
Date: 9 Sep 90 17:32:55 GMT
Sender: compilers-sender@esegue.segue.boston.ma.us
Reply-To: phorgan@cup.portal.com
Organization: Compilers Central
Lines: 25
Approved: compilers@esegue.segue.boston.ma.us

The problem with disassembling arbitrary object code is that data bears a
disturbing resemblance to code at times :) Even when you start disassembling
at a known code address, it's not always possible to determine where code
stops and data begins. Once you're in data, it's not possible to tell where
code starts up again.

This is easy to see with most disassemblers: when you hit data, the unknown
op-code indicator appears (typically ???), followed by random sequences of
??? and op-codes. When the code starts again, the disassembler has often
guessed wrong and swallowed the first byte or two of the 'real' op-code into
a preceding 'false' one. It might take a while to 're-synchronize' and start
showing 'real' op-codes. The only case where this isn't a problem is a
machine whose op-codes all have a single fixed length and an alignment
requirement.

It is possible to reduce the problem with an algorithm that looks ahead,
trying each successive byte as a starting point, and sees which one generates
the longest run of valid instructions. From a 'good starting byte', you could
disassemble in reverse to find a previous starting location. Even this fails
in many cases of self-modifying code, or where strange things are done like
overlapped code (a trick often used in code destined for machines with
limited ROM space). If you're familiar with coding practices for the
processor, though, heuristic methods can be applied with some success.
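
The look-ahead idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the four-op-code instruction set (TOY_ISA), the helper names decode_run and best_start, and the SAMPLE bytes are all invented for the example and do not correspond to any real processor or disassembler. Each candidate offset is scored by how many consecutive valid instructions decode from it, and the best-scoring offset is taken as the likely start of code.

```python
# Hypothetical toy instruction set (an assumption for illustration only):
# opcode byte -> (mnemonic, total instruction length in bytes).
TOY_ISA = {
    0x01: ("NOP", 1),
    0x02: ("LOAD", 2),  # opcode + 8-bit immediate
    0x03: ("JMP", 3),   # opcode + 16-bit address
    0x04: ("RET", 1),
}


def decode_run(buf, start, limit=16):
    """Count consecutive valid instructions decoded from offset `start`.

    Stops at an unknown opcode, a truncated instruction, the end of the
    buffer, or after `limit` instructions.
    """
    pos, count = start, 0
    while pos < len(buf) and count < limit:
        entry = TOY_ISA.get(buf[pos])
        if entry is None:
            break  # the '???' case: not a known opcode
        length = entry[1]
        if pos + length > len(buf):
            break  # operand bytes would run off the end
        pos += length
        count += 1
    return count


def best_start(buf, start, span, limit=16):
    """Try each offset in [start, start + span) and return the one that
    yields the longest run of valid instructions (earliest offset wins ties)."""
    candidates = range(start, min(start + span, len(buf)))
    return max(candidates, key=lambda off: decode_run(buf, off, limit),
               default=start)


# Two bytes of 'data' (0xFF, 0x02) followed by real code at offset 2:
# JMP 0x1000, LOAD 7, NOP, RET.
SAMPLE = bytes([0xFF, 0x02, 0x03, 0x00, 0x10, 0x02, 0x07, 0x01, 0x04])
```

With these sample bytes, decoding from offset 1 treats the data byte 0x02 as a LOAD and swallows the real JMP op-code as its operand, then stops on the next byte; decoding from offset 2 runs cleanly to the end, so best_start(SAMPLE, 0, 4) picks offset 2. A longest-run score is just one possible heuristic, and it can be fooled whenever data happens to decode as a long plausible sequence.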

Patrick Horgan                        phorgan@cup.portal.com
--
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue.  Meta-mail to compilers-request@esegue.