Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!unix.cis.pitt.edu!dsinc!syd
From: syd@DSI.COM (Syd Weinstein)
Newsgroups: comp.mail.elm
Subject: Re: Faster reading of mailboxes with indexing
Keywords: elm mailbox fast index feature
Message-ID: <1990Feb28.230830.9818@DSI.COM>
Date: 28 Feb 90 23:08:30 GMT
References: <1887@uniol.UUCP>
Reply-To: syd@DSI.COM
Distribution: comp
Organization: Datacomp Systems, Inc.  Huntingdon Valley, PA
Lines: 25

henseler@uniol.UUCP (Herwig Henseler) writes:

>When Un*x-mailers read a mailbox, they have to scan the whole file for
>"From_"-lines to detect the top of the messages. So does elm. I can hardly
>imagine a more ineffective way of achieving this aim! The mailbox format
>is old enough to deserve an improvement...

>Idea: Why not index these positions in a second file, so that only this
>      file (with seek-positions for every "From_"-line together with the
>      "From:"-entry, the "Subject:"-line and the total number of lines)
>      has to be scanned to build the internal tables for elm. This will be
>      _much_ faster !

This was discussed a while back in the development group.  Two proposals
were considered: one was to embed the index in the mail file itself as a
pseudo message; the other was to use a separate file, with the sub-options
of one index file per user or one index file per mail file.
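
To make the separate-file idea concrete, here is a rough sketch of what
building such an index might look like.  It is in Python for brevity (Elm
itself is C) and is not code from Elm or from either proposal.  The function
name build_index and the entry fields are made up for illustration, and it
assumes body lines that begin with "From " have already been escaped (for
example as ">From "), so every such line really starts a message.

```python
import io


def build_index(mbox):
    """Scan a binary mbox stream once and return one entry per message.

    Hypothetical helper: each entry records the seek offset of the
    "From " separator line, the From: and Subject: header values, and
    the total number of lines in the message (separator included).
    """
    entries = []
    current = None
    in_headers = False
    while True:
        offset = mbox.tell()
        line = mbox.readline()
        if not line:
            break
        if line.startswith(b"From "):
            # Assumption: body lines starting with "From " are escaped,
            # so this line is a genuine message separator.
            current = {"offset": offset, "from": "", "subject": "", "lines": 0}
            entries.append(current)
            in_headers = True
        if current is None:
            continue  # ignore anything before the first separator
        current["lines"] += 1
        if in_headers:
            lowered = line.lower()
            if line in (b"\n", b"\r\n"):
                in_headers = False  # blank line ends the header block
            elif lowered.startswith(b"from:"):
                current["from"] = line[5:].strip().decode("latin-1")
            elif lowered.startswith(b"subject:"):
                current["subject"] = line[8:].strip().decode("latin-1")
    return entries
```

The entries could then be written to a per-user or per-mail-file index and
read back instead of rescanning the mailbox.  A real index would also need
some way to notice that the mail file changed underneath it, which this
sketch leaves out.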

However, this whole point becomes less important as we head toward the
Content-Length: header, which allows seeking over the body.  We need to
read the headers in any case.
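
As an illustration of the Content-Length: approach, the sketch below reads
each message's headers and then seeks past the body instead of scanning it.
Again this is Python rather than Elm's C, and not Elm's actual code.  The
function name scan_with_content_length is made up, and it assumes that
Content-Length: gives the exact byte count of the body that follows the
blank line after the headers.  Messages without the header fall back to
scanning line by line for the next "From " line.

```python
import io


def scan_with_content_length(mbox):
    """Return (offset, from, subject) per message in a binary mbox stream.

    Hypothetical helper: headers are always read, but when a message has
    a numeric Content-Length: header the body is skipped with one seek.
    """
    summaries = []
    while True:
        start = mbox.tell()
        line = mbox.readline()
        if not line:
            break
        if not line.startswith(b"From "):
            continue  # blank separator line, or body of a message without a length
        headers = {}
        while True:
            line = mbox.readline()
            if not line or line in (b"\n", b"\r\n"):
                break  # end of file or end of header block
            if line[:1] in b" \t":
                continue  # simplification: ignore folded continuation lines
            name, sep, value = line.partition(b":")
            if sep:
                headers[name.strip().lower()] = value.strip()
        length = headers.get(b"content-length")
        if length is not None and length.isdigit():
            # Assumption: the value is the exact body size in bytes.
            mbox.seek(int(length), io.SEEK_CUR)
        summaries.append(
            (start, headers.get(b"from", b""), headers.get(b"subject", b""))
        )
    return summaries
```

Note that with a trusted length, a body line that happens to start with
"From " is never examined, so it cannot be mistaken for a new message.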
-- 
=====================================================================
Sydney S. Weinstein, CDP, CCP                   Elm Coordinator
Datacomp Systems, Inc.				Voice: (215) 947-9900
syd@DSI.COM or {bpa,vu-vlsi}!dsinc!syd	        FAX:   (215) 938-0235