Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.3 4.3bsd-beta 6/6/85; site hadron.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxn!ihnp4!qantel!lll-crg!seismo!rlgvax!hadron!jsdy
From: jsdy@hadron.UUCP (Joseph S. D. Yao)
Newsgroups: net.lang.c
Subject: Re: Comments on your program
Message-ID: <103@hadron.UUCP>
Date: Fri, 29-Nov-85 19:10:17 EST
Article-I.D.: hadron.103
Posted: Fri Nov 29 19:10:17 1985
Date-Received: Sun, 1-Dec-85 03:32:45 EST
References: <135@brl-tgr.ARPA>
Reply-To: jsdy@hadron.UUCP (Joseph S. D. Yao)
Organization: Hadron, Inc., Fairfax, VA
Lines: 49

In article <135@brl-tgr.ARPA> cottrell@nbs-vms.arpa (COTTRELL, JAMES) writes:
>                             ...  `Now why didn't you think before posting?'
>>                                    ...  This program was written
>> to help decode a bitnet routing table that I had been netcopy'd
>> to me and didn't get translated into ascii.  So after running dd
>> over it, the line markers had disappeared into never never land.
>> But from looking real closely at the file I could see that each
>> line was supposed to start with ROUTE.....  thus this program:
>
>It doesn't work. Suppose the sequence `ROUROUTE' occurs. The second
>`R' will not be recognized as the start of the sequence!
>
>I thought of ways to use existing tools to do the job. How about this:
>1) run thru `tr' to change all `R's to newlines. This gives you all
>possible places where a line might start. Now run an `ex' script that
>chex (wheat, corn, rice) each line begins with OUTE. If it doesn't,
>then put back the R. Then for each line that begins with an R, join
>it with the previous line. Finally, put back an R on each line.

Yes, Herron's algorithm won't work without some way of backing up.
No, Cottrell's algorithm won't work either.  It assumes that ALL
NL's have been removed, which is a possible but not necessary
interpretation of the originally stated problem.  In C, one way
to do things is:
	while ((c = my_getchar()) != EOF) {
		if (c != 'R') {
			putchar(c);
			last_put = c;
			continue;
		}
		gather 4 more
		test for ROUTE
		if so, print NL + 5 chars; last_put = 'E';
		else print the 'R'; last_put = 'R';
		     ungetchar the 4 (which is why my_getchar())
	}
	if (last_put != NL)	/* almost certainly so */
		putchar(NL);
This assumes that Herron is correct in his assumption that the
word "ROUTE" was one-to-one with line starts.
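
For anyone who wants the outline above filled in, here is a minimal
sketch in C.  The names route_split() and my_ungetchar(), and the
fixed five-slot pushback stack, are my own inventions for illustration
and not anything from Herron's program.  It also makes two assumptions
the outline leaves open: no NL is added when the last character put
was already an NL, and a short tail at EOF (say "ROU") is copied
through unchanged.

```c
#include <assert.h>
#include <stdio.h>

#define NL	'\n'
#define MARK	"ROUTE"
#define MARKLEN	5

/* Small pushback stack; ungetc() only guarantees one character. */
static int pushed[MARKLEN];
static int npushed = 0;

static int my_getchar(FILE *in)
{
	return npushed > 0 ? pushed[--npushed] : getc(in);
}

static void my_ungetchar(int c)
{
	assert(npushed < MARKLEN);
	pushed[npushed++] = c;
}

/* Copy in to out, starting a new line before every "ROUTE". */
void route_split(FILE *in, FILE *out)
{
	int c, i, n;
	int last_put = NL;	/* assumption: no blank line before a leading ROUTE */
	int buf[MARKLEN];

	while ((c = my_getchar(in)) != EOF) {
		if (c != 'R') {
			putc(c, out);
			last_put = c;
			continue;
		}
		/* Gather up to 4 more characters after the 'R'. */
		buf[0] = c;
		for (n = 1; n < MARKLEN; n++)
			if ((buf[n] = my_getchar(in)) == EOF)
				break;
		/* Test for ROUTE; n counts the characters actually read. */
		for (i = 0; i < n && buf[i] == MARK[i]; i++)
			;
		if (i == MARKLEN) {
			if (last_put != NL)
				putc(NL, out);
			fputs(MARK, out);
			last_put = 'E';
		} else {
			/* Not a marker: emit the 'R', push the rest back, last one first. */
			putc('R', out);
			last_put = 'R';
			while (--n >= 1)
				my_ungetchar(buf[n]);
		}
	}
	if (last_put != NL)
		putc(NL, out);
}
```

A main() that just calls route_split(stdin, stdout) turns this into a
filter.  Because the four characters after a failed 'R' are pushed back
and rescanned, the `ROUROUTE' case Cottrell raised comes out as "ROU",
NL, "ROUTE".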

Note also that Herron implies a conversion from E***** to ASCII.
If the original tape/file was blocked with fixed-length records,
then there is a dd arg to size lines (cbs=, I believe).  If
variable-length, he may have to read the original for the record
sizes and substitute the E@#$%^ NL character for each of them
before dd'ing.
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}