Xref: utzoo comp.lang.icon:591 comp.lang.c:35197 alt.sources:3001
Path: utzoo!utgpu!cs.utexas.edu!uunet!munnari.oz.au!goanna!minyos!monu6!monu1!bruce!alanf
From: alanf@bruce.cs.monash.OZ.AU (Alan Grant Finlay)
Newsgroups: comp.lang.icon,comp.lang.c,alt.sources
Subject: ANSI C to K&R converter written in Icon (source code provided)
Keywords: C, Icon, Ansi
Message-ID: <3579@bruce.cs.monash.OZ.AU>
Date: 13 Jan 91 07:03:38 GMT
Organization: Monash Uni. Computer Science, Australia
Lines: 230

After wasting my time trying to fix up a converter to work for my C sources
I decided to write my own.  Icon seems to be the ideal language for this job
(provided you have a compiler/interpreter).  I originally thought I would do
the job properly (i.e. using a C grammar) but after some reflection I was
soon put off (C's grammar is truly awful).  The result is yet another
converter that works for the author's programs.  However, one advantage of
this converter is that the algorithm is quite easy to follow and could be
easily adapted by Icon programmers to handle a greater subset of C.

The program has the following limitations:

1) function prototypes are recognised by the sequence ");" with no
   intervening spaces, newlines or comments.
2) function prototypes may not contain comments within the parameter list.
3) function definitions may have comments, spaces and newlines within the
   parameter list; however, the output will win no awards for legibility.
4) the presence of function-typed parameters (such as function pointers)
   in a function definition will mess up the conversion of that parameter
   list (usually it just removes the parameters).  There is a workaround,
   as demonstrated by the following example:

    #ifdef ANSI
    void addts(void (*ts)())
    {
    #else
    void addts(ts)
    void (*ts)();
    {
    #endif

    #ifdef ANSI
    }
    #else
    }
    #endif

   This workaround requires that the non-ANSI compiler ignores text which
   is excluded by "#ifdef"s.

The algorithm is to divide the source text into a stream of substrings
labelled as either "considered" or "ignored".
A pipeline is set up to process the stream as follows:

control lines   -> {all lines beginning with # are ignored}
comments        -> {comments are ignored}
brackets        -> {everything within curly brackets is ignored}
declarations    -> {ignore all except function declarations (top level)}
compress        -> {joins together consecutive segments of same label}
prototypes      -> {prototype declarations have the parameters removed}
compress        -> {joins together consecutive segments of same label}
parameter lists -> {function definition parameters are converted to K&R}
compress        -> {joins together consecutive segments of same label}

The workaround in (4) above can now be seen to depend upon the brackets
step.  The source follows next:

-------------------------//----------------------------------------
# Program to convert C programs with ansi style function prototypes to the
# equivalent K&R form.  Only top level declarations are converted.
# Written 7/1/91 by Alan Finlay, Computer Science, Monash University.
#
record ignored(body)       # A program is processed as a sequence of
record considered(body)    # ignored and considered parts.

global idset               # identifier characters
global nidset              # skip these to find next identifier
global spcset              # white space characters

procedure main()
   idset:= &lcase ++ &ucase ++ '0123456789_'
   nidset:= ~idset
   spcset:= ' \n\t\r'
   every text:= compress(parms) do writes(text.body)
end

procedure parms()
# rearrange ansi style parameters to suit K&R syntax.
   par:= "false"     # not doing parameters now
   parlist:= []      # parameter list is empty
   currpar:= ""      # Current parameter is bare
   every x:= compress(protos) do
      if type(x)=="ignored" then suspend x
      else x.body ? while not pos(0) do
         if par=="false" then
            if text:= tab(find("()"))||move(2) then suspend considered(text)
            else {
               if text:= tab(find("("))||move(1) then par:= "true"
               else text:= tab(0)
               suspend considered(text)
               }
         else {
            if text:= tab(upto(',)')) then {
               currpar||:=text;
               # check for (void)
               void:= "false"
               currpar ?
                  if (tab(many(spcset))|0) & ="void" &
                     (tab(many(spcset))|0) & pos(0) then
                        void:= "true"
               if void=="true" & *parlist=0 & &subject[&pos]==")" then {
                  currpar:= ""
                  par:= "false"
                  }
               else { # end of a parameter, extract the identifier
                  currpar ? {
                     if any(nidset) then tab(i:= many(nidset))
                     while tab(many(idset)) & (k:= i) & tab(i:= many(nidset))
                     }
                  # update parlist and output the identifier
                  ### /i:= 1; /k:= 1 # for strange parameters only
                  if i=*currpar+1 then i:= k
                  j:= (currpar ? many(spcset)) | 1
                  put(parlist," "||currpar[j:0]||";\n")
                  currpar||:= move(1)
                  suspend considered(currpar[i:0])
                  currpar:=""
                  if &subject[&pos-1]==")" then { # can release the saved parameters
                     suspend considered("\n")
                     suspend considered(!parlist)
                     parlist:= []
                     par:= "false"
                     }
                  }
               }
            else {
               text:= tab(0) # the parameter continues
               currpar||:= text
               }
            }
end

procedure protos()
# remove parameter types from prototypes.
# only recognises prototypes which end with ");" as prototypes.
# only works for prototypes which are not interrupted by comments etc.
# must be compressed afterwards for parms to work.
   every x:= compress(decs) do
      if type(x)=="ignored" then suspend x
      else x.body ? while not pos(0) do
         if text:= tab(upto('('))||move(1) then {
            if not tab(find(");")) then text||:= tab(0)
            suspend considered(text)
            }
         else {
            text:= tab(0)
            suspend considered(text)
            }
end

procedure compress(seq)
# joins together adjacent text.
   textc:= ""; texti:= ""
   every x:= seq() do
      if type(x)=="ignored" then { # save ignored and expel considered
         texti||:= x.body
         if *textc~=0 then suspend considered(textc)
         textc:= ""
         }
      else { # save considered and expel ignored
         textc||:= x.body
         if *texti~=0 then suspend ignored(texti)
         texti:= ""
         }
   if textc~=="" then return considered(textc) # only one of these
   if texti~=="" then return ignored(texti)    # can apply.
end

procedure decs()
# remove top level data declarations.
   dec:= "false" # not in a declaration now
   every x:= brackets() do
      if type(x)=="ignored" then suspend x
      else x.body ?
         while not pos(0) do
            if dec=="false" then {
               if text:= tab(find("typedef" | "auto" | "static" | "extern" | "register" )) then
                  dec:= "true"
               else text:= tab(0)
               suspend considered(text)
               }
            else {
               if text:= tab(find(";"))||move(1) then dec:= "false"
               else text:= tab(0)
               suspend ignored(text)
               }
end

procedure brackets()
# remove any text between { and } after comments removed.
   bal:= 0 # start with balanced brackets
   every x:= comments() do
      if type(x)=="ignored" then suspend x
      else x.body ? while not pos(0) do
         if text:= tab(upto('{}')) then
            if &subject[&pos]=="{" then {
               bal+:= 1; text||:= move(1)
               if bal=1 then suspend considered(text)
               else suspend ignored(text)
               }
            else {
               bal-:= 1; move(1)
               suspend ignored(text)
               if bal=0 then suspend considered("}")
               else suspend ignored("}")
               }
         else {
            text:= tab(0)
            if bal=0 then suspend considered(text)
            else suspend ignored(text)
            }
end

procedure comments()
# Read std input and remove comments.
# For now compiler control lines are removed here also.
   com:= "false" # not in a comment now
   while line:= read()||"\n" do
      if line[1]=="#" then suspend ignored(line)
      else line ? while not pos(0) do
         if com=="false" then {
            if text:= tab(find("/*")) then com:= "true"
            else text:= tab(0)
            suspend considered(text)
            }
         else {
            if text:= tab(find("*/"))||move(2) then com:= "false"
            else text:= tab(0)
            suspend ignored(text)
            }
end
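
For readers without Icon to hand, the labelled-stream idea can be sketched with Python generators, which behave much like Icon's "suspend". This is only an illustration under stated assumptions, not a port of the program above: the names Segment, CONSIDERED and IGNORED are hypothetical, only the comments and compress stages are imitated, and the comment opener "/*" is emitted as its own ignored piece so that a stray "/*/" is not mistaken for a complete comment.

```python
from typing import Iterable, Iterator, NamedTuple

CONSIDERED, IGNORED = "considered", "ignored"  # hypothetical label constants


class Segment(NamedTuple):
    """One labelled piece of source text (hypothetical name)."""
    label: str
    body: str


def comments(lines: Iterable[str]) -> Iterator[Segment]:
    """Label '#' control lines and /* ... */ comments as ignored."""
    in_comment = False
    for line in lines:
        if line.startswith("#"):          # control line: ignored as a whole
            yield Segment(IGNORED, line)
            continue
        pos = 0
        while pos < len(line):
            if in_comment:
                stop = line.find("*/", pos)
                end = len(line) if stop < 0 else stop + 2
                in_comment = stop < 0     # still open if no closer on this line
                yield Segment(IGNORED, line[pos:end])
            else:
                start = line.find("/*", pos)
                end = len(line) if start < 0 else start
                if end > pos:             # text before the comment (if any)
                    yield Segment(CONSIDERED, line[pos:end])
                if start >= 0:            # the opener itself is ignored text
                    yield Segment(IGNORED, "/*")
                    end = start + 2
                    in_comment = True
            pos = end


def compress(segments: Iterable[Segment]) -> Iterator[Segment]:
    """Join consecutive segments that carry the same label."""
    label, parts = None, []
    for seg in segments:
        if seg.label != label and parts:  # label changed: flush the old run
            yield Segment(label, "".join(parts))
            parts = []
        label = seg.label
        parts.append(seg.body)
    if parts:                             # flush whatever is left at the end
        yield Segment(label, "".join(parts))


if __name__ == "__main__":
    demo = ["#include <stdio.h>\n", "int add(int a, /* first */ int b);\n"]
    for seg in compress(comments(demo)):
        print(seg.label, repr(seg.body))
```

In this sketch, concatenating the bodies in order gives back the input unchanged. In the Icon program it is the later stages (prototypes and parameter lists) that rewrite the considered text, while ignored text passes through untouched.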