Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.3 4.3bsd-beta 6/6/85; site hadron.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxn!ihnp4!qantel!lll-crg!seismo!rlgvax!hadron!jsdy
From: jsdy@hadron.UUCP (Joseph S. D. Yao)
Newsgroups: net.lang.c
Subject: Re: Comments on your program
Message-ID: <103@hadron.UUCP>
Date: Fri, 29-Nov-85 19:10:17 EST
Article-I.D.: hadron.103
Posted: Fri Nov 29 19:10:17 1985
Date-Received: Sun, 1-Dec-85 03:32:45 EST
References: <135@brl-tgr.ARPA>
Reply-To: jsdy@hadron.UUCP (Joseph S. D. Yao)
Organization: Hadron, Inc., Fairfax, VA
Lines: 49

In article <135@brl-tgr.ARPA> cottrell@nbs-vms.arpa (COTTRELL, JAMES) writes:
> ... `Now why didn't you think before posting?'
>> ... This program was written
>> to help decode a bitnet routing table that I had been netcopy'd
>> to me and didn't get translated into ascii. So after running dd
>> over it, the line markers had disappeared into never never land.
>> But from looking real closely at the file I could see that each
>> line was supposed to start with ROUTE..... thus this program:
>
>It doesn't work. Suppose the sequence `ROUROUTE' occurs. The second
>`R' will not be recognized as the start of the sequence!
>
>I thought of ways to use existing tools to do the job. How about this:
>1) run thru `tr' to change all `R's to newlines. This gives you all
>possible places where a line might start. Now run an `ex' script that
>chex (wheat, corn, rice) each line begins with OUTE. If it doesn't,
>then put back the R. Then for each line that begins with an R, join
>it with the previous line. Finally, put back an R on each line.

Yes, Herron's algorithm won't work without some way of backing up.
No, Cottrell's algorithm won't work either. It assumes that ALL
NL's have been removed, which is a possible but not necessary
interpretation of the originally stated problem.

In C, one way to do things is:

    while ((c = my_getchar()) != EOF) {
        if (c != 'R') {
            putchar(c);
            last_put = c;
            continue;
        }
        gather 4 more
        test for ROUTE
        if so, print NL + 5 chars; last_put = 'E';
        else print the 'R'; last_put = 'R';
             ungetchar 4 (which is why my_getchar())
    }
    if (last_put != NL)    /* almost certainly so */
        putchar(NL);

This assumes that Herron is correct in his assumption that the
word "ROUTE" was one-to-one with line starts.

Note also that Herron implies a conversion from E***** to ASCII.
If the original tape/file was blocked with fixed-length records,
then there is a dd arg to size lines (cbs=, I believe). If
var-length, he may have to read all lines in the original for
the record sizes and substitute for them the E@#$%^ NL character
before dd'ing.
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
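
For anyone who wants something compilable, here is a minimal sketch of the
same idea. It is not the code from the post: it works on an in-memory
buffer instead of a stream, so the "backing up" is done by looking ahead
with memcmp() and simply not advancing when the match fails, rather than
by pushing characters back. The function name split_at_route() and the
buffer-size contract are assumptions made for illustration. It also skips
the inserted NL when the previous character is already a newline, on the
assumption that some line breaks may have survived.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NL '\n'

/*
 * Hypothetical helper (not from the original post).
 * Copies n bytes from `in` to `out`, starting a new line before every
 * occurrence of "ROUTE".  `out` must hold at least n + n/5 + 2 bytes:
 * one extra NL per 5-byte match, one trailing NL, one terminating NUL.
 * Returns the number of bytes written, not counting the NUL.
 */
size_t split_at_route(const char *in, size_t n, char *out)
{
    static const char key[] = "ROUTE";
    const size_t klen = sizeof key - 1;
    size_t i = 0, o = 0;
    char last_put = NL;     /* pretend a newline precedes the input */

    assert(in != NULL && out != NULL);

    while (i < n) {
        /* Look ahead instead of reading and ungetting 4 characters. */
        if (n - i >= klen && memcmp(in + i, key, klen) == 0) {
            if (last_put != NL)     /* avoid doubling a surviving NL */
                out[o++] = NL;
            memcpy(out + o, key, klen);
            o += klen;
            i += klen;
            last_put = 'E';
        } else {
            /* Not a line start: copy one character and move on,
             * so an 'R' inside "ROUROUTE" is rescanned correctly. */
            out[o++] = in[i];
            last_put = in[i];
            i++;
        }
    }
    if (last_put != NL)     /* almost certainly so */
        out[o++] = NL;
    out[o] = '\0';
    return o;
}
```

Fed the troublesome `ROUROUTE' case, this writes "ROU", a newline, then
"ROUTE" and a final newline, so the second `R' is recognized. Like the
pseudocode, it still depends on "ROUTE" being one-to-one with line starts.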