Path: utzoo!mnetor!tmsoft!torsqnt!lethe!yunexus!ists!helios.physics.utoronto.ca!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!maverick.ksu.ksu.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!olson
From: olson@sax.cs.uiuc.edu (Bob Olson)
Newsgroups: comp.lang.perl
Subject: Re: Large, sparse dbm databases
Message-ID: <OLSON.91Feb8081825@sax.cs.uiuc.edu>
Date: 8 Feb 91 13:18:25 GMT
References: <1991Feb7.044015.10585@ux1.cso.uiuc.edu> <pl9Ga+oh@cs.psu.edu>
Sender: news@ux1.cso.uiuc.edu (News)
Organization: University of Illinois, Urbana-Champaign
Lines: 11
In-Reply-To: flee@cs.psu.edu's message of 7 Feb 91 23:58:29 GMT

Yep, that seems to be it. Credit also to Randal, who emailed a similar
answer.

My solution?  Tokenize all strings inserted into the database,
using another dbm database for the string->integer conversion and an
array for the integer->string conversion. It works quite well, and the
resulting databases are a LOT smaller, with a seemingly small
performance hit. The database ends up consisting of entries like
	1^\3^\7^\190 --> 4,2,10,5
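
Roughly, the tokenizing scheme can be sketched as below. This is an
illustration only, not the actual code: it is in Python rather than
Perl, a plain dict and list stand in for the string->integer dbm file
and the integer->string array, and the names (Tokenizer, make_key,
split_key) are made up. It assumes the ^\ shown above is the
control-backslash character (octal 034) used as the key separator.

```python
# Sketch only: in-memory structures stand in for the real dbm files.
SEP = "\x1c"  # control-backslash, displayed as ^\ (assumed key separator)


class Tokenizer:
    """Maps each distinct string to a small integer and back."""

    def __init__(self):
        self.str_to_id = {}  # stands in for the string->integer dbm database
        self.id_to_str = []  # the integer->string array

    def encode(self, s):
        """Return the token for s, assigning the next free integer if new."""
        tok = self.str_to_id.get(s)
        if tok is None:
            tok = len(self.id_to_str)
            self.str_to_id[s] = tok
            self.id_to_str.append(s)
        return tok

    def decode(self, tok):
        """Return the original string for an integer token."""
        return self.id_to_str[tok]


def make_key(tokenizer, parts):
    """Build a compact key such as '1^\\3^\\7' from a list of strings."""
    return SEP.join(str(tokenizer.encode(p)) for p in parts)


def split_key(tokenizer, key):
    """Recover the original strings from a tokenized key."""
    return [tokenizer.decode(int(n)) for n in key.split(SEP)]
```

Each long string is stored once, in the tokenizer, and the main
database then holds only short integer keys, which is where the size
saving comes from. The cost is one extra lookup per string on the way
in and one array index on the way out.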

--bob