Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!genrad!panda!husc6!harvard!seismo!umcp-cs!chris From: chris@umcp-cs.UUCP (Chris Torek) Newsgroups: net.micro.amiga,net.unix-wizards Subject: Re: seeks and dbm Message-ID: <1584@umcp-cs.UUCP> Date: Mon, 19-May-86 16:35:25 EDT Article-I.D.: umcp-cs.1584 Posted: Mon May 19 16:35:25 1986 Date-Received: Fri, 23-May-86 05:39:21 EDT References: <12593@ucla-cs.ARPA> <645@baylor.UUCP> <297@maynard.UUCP> Reply-To: chris@maryland.UUCP (Chris Torek) Distribution: net Organization: University of Maryland, Dept. of Computer Sci. Lines: 27 Xref: watmath net.micro.amiga:3209 net.unix-wizards:18123 In article <297@maynard.UUCP> campbell@maynard.UUCP (Larry Campbell) writes: >... the manual describes the [lseek] offset as a long (NOT a "simple >integer" nor a "magic cookie"). Second, if the offsets AREN'T integers, >then almost any database library or package won't work (like dbm(3)), >because they often hash keys into offsets in the index file. It is true that the current dbm code assumes that lseek offsets are simple `long's; but it would be trivial to get dbm to run on a system where this were not the case, as long as there were a simple mapping from an integer index to a `block' of some fixed size. The dbm scheme (`extensible hashing') is to convert a string (`datum') into a 32 bit hash index, then to start with no bits of the hash, and increase the number of bits used until it has reached a `leaf' value for that hash index. There is a bit map of size 2^{max-n-bits-ever-used}-1 that has a bit set for each value that is no longer a leaf. When you reach a leaf, the datum is either in the corresponding block, or not present in the database. The only really tricky thing is storing a datum, because if a leaf block fills up, it must be turned into a non-leaf block. If the hash values are `good', approximately half of the objects in the block will move to a new leaf (the other half will stay because they have a zero in the next bit of their hash). -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516) UUCP: seismo!umcp-cs!chris CSNet: chris@umcp-cs ARPA: chris@mimsy.umd.edu