Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!att!cbnewsh!ka@cbnewsh.ATT.COM
From: ka@cbnewsh.ATT.COM (Kenneth Almquist)
Newsgroups: comp.unix.wizards
Subject: Re: Algorithm needed: reading/writing a large file
Message-ID: <2024@cbnewsh.ATT.COM>
Date: 7 Jul 89 19:07:12 GMT
References: <205@larry.sal.wisc.edu>
Sender: jgy@cbnewsh.ATT.COM
Reply-To: ka@hulk.att.com
Lines: 17

jwp@larry.sal.wisc.edu (Jeffrey W Percival) writes:
> I am writing a C program (Ultrix, VAXstation 2000) to re-arrange a
> large disk file.  The file contains a large number of fixed length
> records.  [The program sorts the keys in memory, and then does
> random accesses to the file to write the records in order.]
>
> I cannot have all of [the file] in memory, but can have more than the one
> record I am currently using in this shuffle.  How can I speed this up?

I'm pretty sure that you can't speed up a shuffle operation much by
keeping only part of the data in memory.  Anyone with a theoretical
bent want to try to prove this?

I suggest you try the traditional approach of sorting chunks of the
file in memory and then merging the sorted chunks.  This will perform
more I/O than your scheme, but the I/O will be sequential, so the disk
spends its time transferring data rather than seeking.
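
For what it's worth, here is a minimal sketch of the sort-and-merge
approach in C.  It is an illustration under assumptions, not a tuned
implementation: the record layout (struct rec), the name
external_sort(), the MAX_RUNS cap, and the use of tmpfile() to hold
the sorted runs are all my own choices, and the merge picks the
smallest record by a simple linear scan over the runs.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_RUNS 64              /* assumed cap on the number of sorted runs */

struct rec {                     /* hypothetical fixed-length record */
    long key;
    char payload[24];
};

static int cmp_rec(const void *a, const void *b)
{
    long ka = ((const struct rec *)a)->key;
    long kb = ((const struct rec *)b)->key;
    return (ka > kb) - (ka < kb);
}

/* Sort the records of in_path into out_path, holding at most chunk_recs
 * records in memory at once.  Returns 0 on success, -1 on failure. */
int external_sort(const char *in_path, const char *out_path, size_t chunk_recs)
{
    FILE *in, *out, *run[MAX_RUNS];
    struct rec *buf, head[MAX_RUNS];
    int live[MAX_RUNS];
    int nruns = 0, i, rc = -1;
    size_t n;

    assert(chunk_recs > 0);
    if ((in = fopen(in_path, "rb")) == NULL)
        return -1;
    if ((buf = malloc(chunk_recs * sizeof *buf)) == NULL) {
        fclose(in);
        return -1;
    }

    /* Pass 1: read a chunk sequentially, sort it, write it out as a run. */
    while ((n = fread(buf, sizeof *buf, chunk_recs, in)) > 0) {
        if (nruns == MAX_RUNS || (run[nruns] = tmpfile()) == NULL)
            goto done;
        nruns++;
        qsort(buf, n, sizeof *buf, cmp_rec);
        if (fwrite(buf, sizeof *buf, n, run[nruns - 1]) != n)
            goto done;
        rewind(run[nruns - 1]);  /* switch the run from writing to reading */
    }

    /* Pass 2: merge the runs; each run is read front to back. */
    if ((out = fopen(out_path, "wb")) == NULL)
        goto done;
    for (i = 0; i < nruns; i++)
        live[i] = fread(&head[i], sizeof head[i], 1, run[i]) == 1;
    for (;;) {
        int best = -1;
        for (i = 0; i < nruns; i++)          /* linear scan for smallest key */
            if (live[i] && (best < 0 || head[i].key < head[best].key))
                best = i;
        if (best < 0)
            break;                           /* every run is exhausted */
        if (fwrite(&head[best], sizeof head[best], 1, out) != 1) {
            fclose(out);
            goto done;
        }
        live[best] = fread(&head[best], sizeof head[best], 1, run[best]) == 1;
    }
    if (fclose(out) == 0)
        rc = 0;

done:
    for (i = 0; i < nruns; i++)
        fclose(run[i]);
    free(buf);
    fclose(in);
    return rc;
}
```

Something like external_sort("data.in", "data.out", 100000) would then
read the file once, write it once as runs, read the runs once, and
write the result once.  With more than MAX_RUNS chunks this sketch
simply gives up; a real program would merge in several passes or use
a larger chunk size.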
				Kenneth Almquist