Path: utzoo!utgpu!watserv1!watmath!att!att!linac!pacific.mps.ohio-state.edu!zaphod.mps.ohio-state.edu!wuarchive!udel!haven!mimsy!chris From: chris@mimsy.umd.edu (Chris Torek) Newsgroups: comp.unix.programmer Subject: Re: ndbm won't cut it, what now? Message-ID: <27794@mimsy.umd.edu> Date: 18 Nov 90 17:53:53 GMT References: <1990Nov11.111118.21856@cs.umn.edu> <2300@sixhub.UUCP> <1990Nov16.145542.22326@iecc.cambridge.ma.us> Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742 Lines: 192 >In article rang@cs.wisc.edu >(Anton Rang) writes: >>... I thought the original poster meant 'multiple keys' in the >>sense of one record having several access paths to it. (This was the impression I got as well.) In article <1990Nov16.145542.22326@iecc.cambridge.ma.us> johnl@iecc.cambridge.ma.us (John R. Levine) writes: >You can fake it by making one key the primary key and storing the actual data >in its record. The secondary keys point to the primary key. To keep sizes >down you might want to make the primary key something small and boring like a >three or four byte serial number. > >Logically each key type is in a separate file, but you can combine them into >one big dbm file using the prefix byte trick. It's gross, but no grosser >than what real data bases do every day. You also need to store the `next serial number', so you have to reserve a key for this as well. This is indeed rather gross. A few years ago I thought about this problem for a while, and came up with a different idea based on the same hashing technique. To explain it I need to describe the idea behind dbm. The way dbm works is to take each `key' string and turn it into a 32-bit number, something like a CRC-32 or a checksum. This number is not necessarily unique (two different strings will produce the same 32-bit value) but collisions should be rare, if the string-to-number algorithm is chosen carefully. In any case, the same string always produces the same number. Now to store a `datum', we turn its `key' into our 32 bit number. This number is taken as the block number in a file of blocks (typically 4096 bytes each). To avoid starting out with 2^32 blocks, we ignore `some' of the bits in this number. Initially we ignore *all* the bits, storing all key/datum pairs in block 0; as the database grows, we ignore fewer bits. We can leave a binary decision tree behind as a record of when we stopped ignoring each bit. This `tree' is simply a string of bits, 0 where we are still ignoring, and 1 where we are not. Given a key hash `h', we start by saying `ignore all the bits', then we ignore fewer bits as long as bit (h&mask)+mask is set: /* * NB: the following is `off the top of my head'. * It may contain errors. */ hash = calchash(key); for (hmask = 0; bit(map, (hash & hmask) + hmask) != 0; hmask = (hmask << 1) | 1) /* void */; Now if the key is anywhere in the database, it is in the block whose number is (hash & hmask). The datum for that key is stored immediately after the key, so all we have to do is read that block and march through every even-numbered entry, comparing against the key. If there is a match, return the corresponding odd-numbered entry. Under this implementation, storing key=value pairs such as mimsy=128.8.128.8 mimsy.umd.edu mimsy.umd.edu=128.8.128.8 mimsy.umd.edu mimsy.cs.umd.edu=128.8.128.8 mimsy.umd.edu (a name to number-and-official-name mapping as might be found in a host number database) requires storing `128.8.128.8 mimsy.umd.edu' three times---51 bytes, at 17 bytes each, counting the host number as 4 bytes. Using John Levine's technique we might instead say Nmimsy=17 Nmimsy.umd.edu=17 Nmimsy.cs.umd.edu=17 #17=128.8.128.8 mimsy.umd.edu so that we can store `128.8.128.8 mimsy.umd.edu' only once, saving 34 bytes (and costing 1+4 per key plus 5 for the intermediate key; we wind up with a net savings of 14 bytes---minus 2 more hidden inside the dbm implementation, used when storing that intermediate key). But this means that to look up a name we have to say: *buf = 'N'; strcpy(buf + 1, name); key.dptr = buf; key.dsize = strlen(buf); datum = fetch(key); if (datum.dptr != NULL) { *buf = '#'; bcopy(datum.dptr, buf + 1, sizeof(int)); key.dsize = 1 + sizeof(int); datum = fetch(key); if (datum.dptr == NULL) panic("lost target"); printf("officially, %s = %.*s\n", name, datum.dsize, datum.dptr); } else printf("%s does not exist\n", name); There *is* another way. Instead of storing keys and data in pairs, we can store each individually. In other words, we can dissociate them---put each in its own block. All we have to do is somehow tie one to the other (otherwise the database is no good!). The 32-bit hash value is almost good enough, but not quite: two different strings can produce the same hash. But what if we augment it? Instead of viewing the database as a series of `key/datum' pairs, suppose we view it as a collection of `items'. Each item has three attributes: A. A number that is unique to this item within its block. (Blocks are typically 4096 bytes or so, so there cannot possibly be more than 4096 or so items; a 16-bit number suffices.) I called this an `index' for want of a better name; this particular index is the item's `in index'. (`Key' might be a better name in some other context :-) .) This exists so that the item can be a datum. B. A `link' record consisting of a 32-bit value and another index. This is only used if the item is considered a `key'. The index here is an `out index'; it is zero if the item is not acting as a key. The 32-bit value is an `out hash'. C. A `link count' telling how many times this item is a key or datum (it can only be a key once). Now to store an item, whether as key or datum, we compute its hash and find its block as above. Then, if the item is a key (has a nonzero `out index') and we are replacing its current content (link target), we `unlink' its current target, store the new datum, and `link' this key to the item produced here. If the item is not a key and is being made one, it simply acquires an `out index'. If the item is a datum, we just increment its link count. To find a key's datum, we `follow the link': the out hash is the hash value for the key's datum. Just as with a key, this (plus the decision bit tree) tells us exactly which block will contain the key's datum (which in this case is certain to exist). Since the item index values are unique to each block, once we have the correct block all we need to do is find the item with the same `in index' number that the key has in its `out index'. Thus, instead of Nmimsy=#17 Nmimsy.umd.edu=#17 Nmimsy.cs.umd.edu=#17 #17=128.8.128.8 mimsy.umd.edu we might have a database that looks like this: : : : More interestingly, if we create a new database and stick hello=hello into it, we get: : This key points to itself! There is no problem with circular references, because a key does not point to another `key', but rather to an `item'. This item has two references (itself as a key, and itself as a datum), not infinitely many. Storing hello=world causes the count to go to 1 (when the as-datum link vanishes); removing `hello' as a key then makes the count go to 0. The main problem with this scheme is that, as with John Levine's, it takes two database accesses to locate an item (though this time the work is hidden from the programmer). There is also a problem with the index values. As each block fills and splits, a new block is created with some index values `used up', and it can become difficult to decide on the next `available in index'. I got around this by recording minimum and maximum `in use' index numbers in each block header, so that typically simply taking the next number suffices. When the min and max values reach 1 and 65535 (actually USHRT_MAX), however, it is necessary to scan the block for a free number. Also, this technique imposes a fair amount of overhead (12 bytes per item for a size offset, link count, two index values, and the out hash, versus 2 bytes per item in dbm/ndbm for the size offset). -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750) Domain: chris@cs.umd.edu Path: uunet!mimsy!chris