Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!asuvax!ncar!ico!haddock!ima!esegue!compilers-sender
From: albaugh@dms.UUCP (Mike Albaugh)
Newsgroups: comp.compilers
Subject: Re: Disassembly
Keywords: disassemble
Message-ID: <1153@dms.UUCP>
Date: 17 Sep 90 17:02:34 GMT
References:
Sender: compilers-sender@esegue.segue.boston.ma.us
Reply-To: albaugh@dms.UUCP (Mike Albaugh)
Organization: Atari Games Inc., Milpitas, CA
Lines: 49
Approved: compilers@esegue.segue.boston.ma.us

From article , by Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips):
>
> Suggestion:
>
> Use a program that starts with the first executable instruction, marking
> and decoding as it goes every instruction except for conditional branches.
> Upon encountering a conditional branch, follow _both_ branches. If
> implemented recursively, the stop conditions are, 1) a branch to an already
> marked instruction and b) an end-of-program.

Even _I_ have re-invented this particular wheel, although I use a
more-or-less-human-readable "state file" to convey "marks", so I can guide
it "between runs" as to new avenues to explore (e.g. mark as "a new entry"
all the targets of what _I_ can see is a jump-table).

> Even this can fail if there is a conditional branch to garbage, which
> never happens in practice due to the underlying algorithm.

Never say never. If one is disassembling 6502 code, this happens a LOT.
For those unfamiliar with the 6502, it has _no_ unconditional relative
branch (only the three-byte absolute JMP), hence programmers often use a
conditional branch where the condition is "known to be true." The simple
cases are along the lines

        LDA #1
        BNE foo         ; ALWAYS TAKEN

but there seems to be some sort of macho pride taken in establishing the
condition via a _long_ chain of computation. Even in the source, this can
be confusing. When reading raw machine code, the problem of detecting an
entry into the chain of computation (without even labels to go by) that
invalidates the assumptions can get, uh, very interesting.
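
For the curious, here is a minimal sketch of the follow-both-branches
traversal, in Python. It is not the tool described above: the opcode table
is a tiny hand-picked subset of the 6502, it uses a worklist rather than
recursion, and the names trace, entries and always_taken are all made up for
illustration. The always_taken set stands in for one kind of hint a state
file might carry; that it would look like this is purely an assumption.

```python
# Tiny, hypothetical subset of 6502 opcodes: opcode -> (mnemonic, length, kind)
OPCODES = {
    0xA9: ("LDA", 2, "imm"),     # load accumulator, immediate operand
    0xD0: ("BNE", 2, "branch"),  # conditional relative branch
    0xF0: ("BEQ", 2, "branch"),  # conditional relative branch
    0x4C: ("JMP", 3, "jump"),    # absolute jump, never falls through
    0x60: ("RTS", 1, "stop"),    # return, ends this path
    0xEA: ("NOP", 1, "plain"),
}


def trace(mem, base, entries, always_taken=frozenset()):
    """Decode every instruction reachable from `entries`.

    Returns {address: text}. `always_taken` holds addresses of branches
    that a human has declared always taken (an assumed kind of hint).
    """
    marks = {}
    work = list(entries)
    while work:
        pc = work.pop()
        # Stop conditions: already marked, or outside the image.
        while pc not in marks and base <= pc < base + len(mem):
            i = pc - base
            op = mem[i]
            if op not in OPCODES or i + OPCODES[op][1] > len(mem):
                marks[pc] = "???"  # not in the table: data or garbage?
                break
            name, length, kind = OPCODES[op]
            if kind == "imm":
                marks[pc] = f"{name} #${mem[i + 1]:02X}"
            elif kind == "branch":
                raw = mem[i + 1]
                offset = raw - 256 if raw >= 128 else raw  # signed 8-bit
                target = pc + 2 + offset
                marks[pc] = f"{name} ${target:04X}"
                work.append(target)  # follow the taken side later
                if pc in always_taken:
                    break  # hint says there is no fall-through
            elif kind == "jump":
                target = mem[i + 1] | (mem[i + 2] << 8)  # little-endian
                marks[pc] = f"{name} ${target:04X}"
                pc = target
                continue
            else:
                marks[pc] = name
                if kind == "stop":
                    break
            pc += length  # fall through to the next instruction
    return marks


# LDA #1 / BNE foo / one byte of non-code / foo: RTS
image = bytes([0xA9, 0x01, 0xD0, 0x01, 0xFF, 0x60])
naive = trace(image, 0x8000, [0x8000])
guided = trace(image, 0x8000, [0x8000], always_taken={0x8002})
for addr in sorted(naive):
    print(f"{addr:04X}  {naive[addr]}")
```

Run naively, the fall-through side of the BNE lands on the byte at $8004
and gets marked "???"; with the hint, that byte is never visited. Extra
entry points (say, jump-table targets spotted by eye) would simply go into
the entries list.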

Oh, yeah, the short reach of these branches means that they often get used
as "stepping stones", so even the one shown above _might_ fall through.
Now it's my turn to say "but that wouldn't happen in practice" :-)

And while we're at it, do typical disassemblers just handle the documented
instructions, or do they "simulate" the more commonly used "illegal
instructions" that hackers seem to love? I would ordinarily assume that
since this thread started with "de-compiling", we are _not_ interested in
code that "a compiler would never generate", but...

                                        Mike

| Mike Albaugh (albaugh@dms.UUCP || {...decwrl!pyramid!}weitek!dms!albaugh)
| Atari Games Corp (Arcade Games, no relation to the makers of the ST)
| 675 Sycamore Dr. Milpitas, CA 95035           voice: (408)434-1709
--
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue.  Meta-mail to compilers-request@esegue.