Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!unix.cis.pitt.edu!dsinc!syd
From: syd@DSI.COM (Syd Weinstein)
Newsgroups: comp.mail.elm
Subject: Re: Faster reading of mailboxes with indexing
Keywords: elm mailbox fast index feature
Message-ID: <1990Feb28.230830.9818@DSI.COM>
Date: 28 Feb 90 23:08:30 GMT
References: <1887@uniol.UUCP>
Reply-To: syd@DSI.COM
Distribution: comp
Organization: Datacomp Systems, Inc. Huntingdon Valley, PA
Lines: 25

henseler@uniol.UUCP (Herwig Henseler) writes:
>When Un*x-mailers read a mailbox, they have to scan the whole file for
>"From_"-lines to detect the top of the messages. So does elm. I can hardly
>imagine a more uneffective way of achieving this aim! The mailbox format
>is old enough to overcome an improvement...

>Idea: Why not index this positions in a second file, so that only this
>      file (with seek-positions for every "From_"-line together with the
>      "From:"-entry, the "Subject:"-line and the total amount of lines)
>      has to be scanned to build the internal tables for elm. This will be
>      _much_ faster !

This was discussed a while back in the development group. Two proposals
were considered: one was to embed the index in the file itself as a fake
pseudo-message; the second was to use a separate file, with the sub-options
of one index file per user or one index file per mail file.

However, this whole point becomes less important as we head toward
the Content-Length: header, which allows seeking over the body. We
need to read the headers in any case, so an index would save little
beyond what skipping the bodies already saves.
-- 
=====================================================================
Sydney S. Weinstein, CDP, CCP                   Elm Coordinator
Datacomp Systems, Inc.                          Voice: (215) 947-9900
syd@DSI.COM or {bpa,vu-vlsi}!dsinc!syd          FAX:   (215) 938-0235
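
For illustration, the Content-Length: approach described in the reply might look roughly like the sketch below. It is written in Python for brevity and is not Elm's actual code. The function name index_mbox is hypothetical, and the sketch assumes that each message starts with a "From " line and that Content-Length: gives the body size in bytes. When the header is missing, it falls back to scanning line by line for the next "From " line.

```python
import io


def index_mbox(f):
    """Return [(offset, headers), ...] for a mailbox opened in binary mode.

    Hypothetical helper: reads every header block, but seeks past a body
    whenever a numeric Content-Length: header is present.
    """
    index = []
    line = f.readline()
    while line:
        if not line.startswith(b"From "):
            # Not at a message boundary: keep scanning line by line.
            line = f.readline()
            continue
        offset = f.tell() - len(line)  # seek position of the "From " line
        headers = {}
        while True:
            line = f.readline()
            if not line or line in (b"\n", b"\r\n"):
                break  # blank line (or EOF) ends the header block
            if line[:1] in b" \t":
                continue  # folded continuation line, ignored in this sketch
            name, sep, value = line.partition(b":")
            if sep:
                key = name.strip().lower().decode("ascii", "replace")
                headers[key] = value.strip().decode("ascii", "replace")
        index.append((offset, headers))
        length = headers.get("content-length", "")
        if length.isdigit():
            # Assumption: the value is the body size in bytes, so the body
            # can be skipped without being read.
            f.seek(int(length), io.SEEK_CUR)
        line = f.readline()
    return index
```

With the header present, a body line that happens to begin with "From " is never examined, so it cannot be mistaken for the start of a new message. The headers are still read for every message, which is the cost the reply says cannot be avoided.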