Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!apple!amdahl!krs
From: krs@uts.amdahl.com (Kris Stephens [Hail Eris!])
Newsgroups: comp.unix.shell
Subject: Re: Problem using multiple 'head' commands in shell script
Keywords: head shell buffering
Message-ID: <37nW01QQ16Rt00@amdahl.uts.amdahl.com>
Date: 1 Feb 91 16:01:51 GMT
References: <1671@abekrd.UUCP> <6925@exodus.Eng.Sun.COM> <fcPl016n13RO00@amdahl.uts.amdahl.com> <1678@abekrd.UUCP>
Reply-To: krs@amdahl.uts.amdahl.com (Kris Stephens [Hail Eris!])
Organization: Amdahl Corporation, Sunnyvale CA
Lines: 74

In article <1678@abekrd.UUCP> garyb@abekrd.UUCP (Gary Bartlett) writes:
>*** FLASH OF INSPIRATION ***
>
>I have an idea:
>- Process the original file by putting the line number at the beginning of
>  each line,
>- Process the file to be merged so that the merge points are at the beginning
>  of each of these lines,
>- Cat the two processed files together and pass through 'sort',
>- Remove line numbers from beginning of resulting file, QED
>
>This doesn't matter how big either file is.
>
>Thoughts?
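
A rough sketch of the idea quoted above, under assumptions of my own:
merge_by_line is a made-up name, and the file to be merged is assumed
to hold lines of the form "<line-number>:<text>", meaning "put <text>
after that line of the original".  Each line is tagged with its line
number, a 0/1 flag so originals sort ahead of inserted text, and a
sequence number so several inserts at one spot keep their order.

```shell
# merge_by_line ORIGINAL INSERTS   (hypothetical helper, not from the thread)
# INSERTS format is an assumption: "<line-number>:<text>"
merge_by_line() {
	{
		# Original lines become  N:0:0:text
		awk '{ printf "%d:0:0:%s\n", NR, $0 }' "$1"
		# Insert lines become  N:1:seq:text  (N taken from before the first colon)
		awk '{
			n = $0; sub(/:.*/, "", n)
			t = $0; sub(/^[^:]*:/, "", t)
			printf "%d:1:%d:%s\n", n, NR, t
		}' "$2"
	} |
	sort -t: -k1,1n -k2,2n -k3,3n |	# line number, then flag, then sequence
	cut -d: -f4-			# strip the three sort keys
}
```

The temporary space this needs is whatever sort uses for its own work
files, so the size of either input does not matter much.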

Hmmm....  If you've got the blocks available on your disk to
hold a second copy of the file, read it once and break it
into (n) pieces, each of your required length.  The pieces
take about the same number of blocks as the original (plus
at most one block per chunk for a partially filled last block)
and one inode per chunk (number of lines / lines per chunk,
rounded up).
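
To estimate how many chunk files (and so inodes) to expect, round
the division up.  A small sketch; chunk_count is a made-up name and
the chunk size is assumed to be greater than zero:

```shell
# chunk_count FILE LINES_PER_CHUNK   (hypothetical helper)
# Prints ceil(total lines / lines per chunk).
chunk_count() {
	total=`wc -l < "$1"`
	# $total is left unquoted so any padding from wc is dropped
	expr \( $total + "$2" - 1 \) / "$2"
}
```

Note that expr exits non-zero when the result is 0 (an empty file),
though it still prints the 0.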

--- multi-piece processing ---
:
file=${1:-file}
lines=${2:-200}
base=Base

# awk prints the number of subsets it created
subsets=`awk '
NR % lines == 1 || lines == 1 {
	# First line of a new chunk: close the previous one, start the next
	if (outfile != "") close(outfile)
	i++
	outfile = base "" i
	}

	{ print > outfile }	# ">" truncates only when the file is first opened
END { if (outfile != "") close(outfile); print i + 0 }' base="$base" lines="$lines" "$file"`

i=1
while [ "$i" -le "$subsets" ]
do
	subset="$base$i"
	cat "$subset"
	rm "$subset"	# Clean up now
	echo "# End of block $i"
	i=`expr $i + 1`
done > report	# or pipe it to something else (lp?)
--- multi-piece processing ---

The perl-literate will undoubtedly say "If you're going to do this,
use perl instead", but so it goes...   :-)

Anyhow, this reads the data twice (once to break it into pieces
of $lines lines each; once to cat them into the report file).
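
If all you need per chunk is a trailer line like the one above, a
single pass with no temporary files would do.  A minimal sketch,
assuming that is really all the per-chunk work there is (mark_blocks
is a made-up name; the chunk size must be greater than zero):

```shell
# mark_blocks FILE [LINES_PER_CHUNK]   (hypothetical helper)
# Copies FILE to stdout, adding a trailer after every chunk.
mark_blocks() {
	awk -v lines="${2:-200}" '
	{ print }
	# A full chunk just ended
	NR % lines == 0 { print "# End of block " (NR / lines) }
	# A partial last chunk still gets its trailer
	END { if (NR % lines != 0) print "# End of block " (int(NR / lines) + 1) }
	' "$1"
}
```

Something like "mark_blocks file 200 > report" would then stand in
for the split-and-cat loop.  Anything fancier per chunk still wants
the pieces on disk.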

>Thanks again for some very useful input,
>Gary

You're very welcome -- helps me stretch out my thinking, too.
...Kris
-- 
Kristopher Stephens, | (408-746-6047) | krs@uts.amdahl.com | KC6DFS
Amdahl Corporation   |                |                    |
     [The opinions expressed above are mine, solely, and do not    ]
     [necessarily reflect the opinions or policies of Amdahl Corp. ]