Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!apple!sun-barr!decwrl!ucbvax!tut.cis.ohio-state.edu!cica!ctrsol!ginosko!uunet!kddlab!icot32!nttlab!utogw!kuno From: kuno@gssm.otsuka.tsukuba.JUNET (Yasushi Kuno) Newsgroups: comp.unix.wizards Subject: Re: Algorithm needed: reading/writing a large file Message-ID: <416@utogw.gssm.otsuka.tsukuba.JUNET> Date: 12 Jul 89 08:38:07 GMT References: <205@larry.sal.wisc.edu> <8137@bsu-cs.bsu.edu> <207@larry.sal.wisc.edu> Sender: news@utogw.gssm.otsuka.tsukuba.JUNET Lines: 25 jwp@larry.sal.wisc.edu (Jeffrey W Percival) writes: >My question was about optimizing the process of rearranging a disk >file according to a *given* mapping. >One helpful person suggested reading sequentially and writing randomly, >rather than vice-versa, and I tried that but it didn't help. I guess >the benefit gained from using the input stream buffering was cancelled >out by the effective loss of the output stream buffering. >The ever-attentive and always-reliable Mr Torek suggested looking >into disk cache algorithms in OS journals, and verily that I will do, >with thanks to those who responded. Oh, that seems quite right, but how about this? Open N files for writing, read sequentially, and write each record to one of them so that N output files form ordered bucket... i.e. every record in output file 1 is smaller than output file 2, and so on. (You can do this because you already have mapping in the memory.) Then original problem reduce to sorting N smaller random files. Apply this recursively until each file fit into your memory. N and pass number should depend on your system. Isn't this simple? I'm curious if it works fine or not, so let me know if you have tried this. Yasushi Kuno