Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!mcsun!ukc!pyrltd!abekrd!garyb
From: garyb@abekrd.UUCP (Gary Bartlett)
Newsgroups: comp.unix.shell
Subject: Re: Problem using multiple 'head' commands in shell script
Keywords: head shell buffering
Message-ID: <1678@abekrd.UUCP>
Date: 31 Jan 91 12:37:39 GMT
References: <1671@abekrd.UUCP> <6925@exodus.Eng.Sun.COM>
Organization: Abekas Video Systems Ltd, Reading, England
Lines: 81

In krs@uts.amdahl.com (Kris Stephens [Hail Eris!]) writes:
>In article <6925@exodus.Eng.Sun.COM> mcgrew@ichthous.Eng.Sun.COM (Darin McGrew) writes:
>>In article <1671@abekrd.UUCP> garyb@abekrd.UUCP (Gary Bartlett) writes:
>>->...
>>->It looks like 'head' initially reads in a whole buffer of data from file
>>->(stdin), prints out the requisite number of lines and then dumps the rest
>>->of the buffer.  The next 'head' then reads the NEXT buffer....
>>
>If, however, the echo "Line ?01 follows" in the original example
>was a place holder for "I want to do other stuff here, then pick up
>processing with the next set of lines", neither the awk nor the sed
>calls will allow it, as both simply insert the line-counting messages
>into the stream of data from file.

This is indeed what I intended - see my last piece of news on the subject.

>Dog slow though it be, the following will do it:

>	#!/bin/sh
>	(
>	i=1
>	while [ $i -lt 201 ]
>	do
>		read line; echo "$line"
>		i=`expr $i + 1`
>	done
>	: process some more stuff here
>	cat -
>	) < file

This is effectively what I started out using - a 'while' loop, an 'expr'
counter, and a couple of 'read's.  Hideously slow!

>You may be forced into multiple reads of the file to get something
>resembling good performance:

>The saving graces here are that, even though the file is opened three
>times, (1) only the first 200 lines are read thrice and the second
>200 twice, and (2) one avoids the nearly nightmarish performance of
>the while loops in the example preceeding this one.
>It doesn't hurt, too, that sed is pretty quick.

The thing is, the file I'm merging from may be very long (i.e. very many
sed passes).

>Now, let's take it one step further and generalize it into a function...

I DO like the function idea though.

I did actually write my own 'head' (C) program which turned off all buffering
of the stdin before doing any reading.  This did the trick and worked in the
shell script.  It was faster but not greatly so - I guess it had to read every
character individually.  I did try using line-buffering but this did not work.
It still lost data (although not as much as when using the full-buffering of
head).  I'm not overly happy with that solution though - it's not at all
portable.

*** FLASH OF INSPIRATION ***

I have an idea:
- Process the original file by putting the line number at the beginning of
  each line,
- Process the file to be merged so that the merge points are at the beginning
  of each of these lines,
- Cat the two processed files together and pass through 'sort',
- Remove line numbers from beginning of resulting file,
QED

It doesn't matter how big either file is.

Thoughts?

Thanks again for some very useful input,
Gary

-- 
---------------------------------------------------------------------------
Gary C. Bartlett               NET: garyb@abekrd.co.uk
Abekas Video Systems Ltd.     UUCP: ...!uunet!mcsun!ukc!pyrltd!abekrd!garyb
12 Portman Rd,   Reading,    PHONE: +44 734 585421
Berkshire.       RG3 1EA.      FAX: +44 734 567904
United Kingdom.              TELEX: 847579
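
The numbering-and-sort idea in the post above could be sketched roughly as
follows.  This is only an illustration under assumptions, not the script that
was actually written: the function name merge_by_sort is made up, and the
file to be merged is assumed to hold lines of the form "N<TAB>text", meaning
"put text after line N of the original".  Two extra key columns (a rank and a
sequence number) are added so that each original line sorts ahead of the
lines merged in after it, and merged lines keep their input order.

```shell
#!/bin/sh
# Hypothetical helper: merge_by_sort ORIGINAL INSERTS
# INSERTS is assumed to contain "N<TAB>text" lines (illustrative format only).
merge_by_sort() {
    orig=$1
    inserts=$2
    tab=$(printf '\t')
    {
        # Decorate the original: key = line number, rank 0, sequence 0.
        awk '{ printf "%d\t0\t0\t%s\n", NR, $0 }' "$orig"
        # Decorate the lines to merge: key = merge point, rank 1,
        # sequence = position in INSERTS (keeps their relative order).
        awk '{
            i = index($0, "\t")
            printf "%d\t1\t%d\t%s\n", substr($0, 1, i - 1), NR, substr($0, i + 1)
        }' "$inserts"
    } |
    # Sort numerically on the three key columns.
    sort -t "$tab" -k1,1n -k2,2n -k3,3n |
    # Strip the key columns, leaving only the text.
    cut -f4-
}
```

Called as "merge_by_sort file inserts > merged", each file is read once, so
the cost should not depend on how many merge points there are; the price is
one sort over the combined data.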