Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!cme!cam!ARTEMIS
From: miller@FS1.cam.nist.gov (Bruce R. Miller)
Newsgroups: comp.lang.lisp
Subject: Re: Help needed - Read Macros
Message-ID: <2885815216@ARTEMIS.cam.nist.gov>
Date: 13 Jun 91 15:20:16 GMT
References: <1991Jun13.064841.20364@Daisy.EE.UND.AC.ZA>
Sender: news@cam.nist.gov
Followup-To: comp.lang.lisp
Organization: NIST - Computing and Applied Mathematics Laboratory
Lines: 45

In article <1991Jun13.064841.20364@Daisy.EE.UND.AC.ZA>, Bobby Abraham writes:
> I am needing some help with the following problem in an simple
> assembler I am writing.
>
> I wish to parse expressions such as
> mov #10 20
> add @13 #6
> l: jmp @5
>
> Ideally I would like a lexical analyser to return the following
> mov (immediate 10) 20
> add (indirect 13) (immediate 6)
> (label l) jmp (indirect 5)
>

The first thing is to define your own readtable using copy-readtable
or such; you could start off by copying the standard Lisp readtable.
Then you'll need to define your own readers to replace the ones that
the Lisp readtable gets `wrong' for your syntax, such as #\#. In your
case, #\# and #\@ could be defined much like the #\' reader macro,
with a body something like:

    (list 'immediate (read stream t nil t))

The colon is slightly tricky. First you need to change it in some way
so that it no longer tries to do package prefixes; at the least,

    (set-syntax-from-char #\: #\A *assembler-readtable*)

to make it alphabetic. The catch is that : in your case is a postfix
operator. For postfix and infix operators you need to be able to fetch
the `previous' parsed object (not to mention dealing with binding
powers, etc). Rather than introduce that complexity into the CL
standard, the designers decided to leave it out, with the proposal
that you should use the readtable machinery to `tokenize' the input
and then use a separate parser to do the remaining steps.
Nevertheless, if this is the worst case, you could still do the whole
parse using the readtable.
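
The tokenize-then-post-process route described above might be sketched
roughly as follows. This is only an illustration under stated
assumptions, not a definitive implementation. Apart from
*assembler-readtable*, every name here (make-prefix-reader, %colon,
tokenize-line, fold-labels, parse-line) is made up for the example.
The sketch also swaps in a different technique for the colon: it makes
#\: a terminating macro character rather than using the
set-syntax-from-char call above, because ANSI CL specifies that
set-syntax-from-char does not copy constituent traits, so a conforming
implementation may still treat the colon as a package marker.

```lisp
;; Start from a copy of the standard readtable (NIL means "standard").
(defvar *assembler-readtable* (copy-readtable nil))

;; Hypothetical helper: builds a reader-macro function that wraps the
;; next object read from the stream as (TAG object).
(defun make-prefix-reader (tag)
  (lambda (stream char)
    (declare (ignore char))
    (list tag (read stream t nil t))))

;; #\# is a dispatching macro character in the standard readtable;
;; SET-MACRO-CHARACTER replaces it with an ordinary macro character.
(set-macro-character #\# (make-prefix-reader 'immediate)
                     nil *assembler-readtable*)
(set-macro-character #\@ (make-prefix-reader 'indirect)
                     nil *assembler-readtable*)

;; Assumption of this sketch: #\: becomes a terminating macro character
;; that yields the marker symbol %COLON, so "l:" reads as L then %COLON.
(set-macro-character #\: (lambda (stream char)
                           (declare (ignore stream char))
                           '%colon)
                     nil *assembler-readtable*)

(defun tokenize-line (line)
  "Read every object in the string LINE using *ASSEMBLER-READTABLE*."
  (let ((*readtable* *assembler-readtable*))
    (with-input-from-string (in line)
      ;; The stream itself serves as a unique end-of-file marker.
      (loop for token = (read in nil in)
            until (eq token in)
            collect token))))

(defun fold-labels (tokens)
  "Rewrite each NAME %COLON pair in TOKENS as (LABEL NAME)."
  (cond ((null tokens) '())
        ((eq (second tokens) '%colon)
         (cons (list 'label (first tokens))
               (fold-labels (cddr tokens))))
        (t (cons (first tokens)
                 (fold-labels (rest tokens))))))

(defun parse-line (line)
  (fold-labels (tokenize-line line)))

;; (parse-line "mov #10 20")  => (MOV (IMMEDIATE 10) 20)
;; (parse-line "add @13 #6")  => (ADD (INDIRECT 13) (IMMEDIATE 6))
;; (parse-line "l: jmp @5")   => ((LABEL L) JMP (INDIRECT 5))
```

The postfix colon is resolved in fold-labels, after reading, so the
reader never has to reach back for the `previous' object. Symbols come
back upcased because the copied readtable keeps the standard case
conversion.
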
For the readtable-only route, you need to write a function to replace
the `symbol' reader, i.e. what gets used for every alphanumeric char.
If it discovers a #\: at the end of the token, it returns

    (list 'label (intern string-so-far ...))

rather than simply

    (intern string-so-far ...)

Hope this sketch helps some. Have fun.

bruce
miller@cam.nist.gov