Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!uwm.edu!mailrus!accuvax.nwu.edu!nucsrl!telecom-request From: wrf@mab.ecse.rpi.edu (Wm Randolph Franklin) Newsgroups: comp.dcom.telecom Subject: Re: A Puzzle Message-ID: <4232@accuvax.nwu.edu> Date: 22 Feb 90 23:51:05 GMT Sender: news@accuvax.nwu.edu Organization: Rensselaer Polytechnic Institute, Troy NY Lines: 38 Approved: Telecom@eecs.nwu.edu X-Submissions-To: telecom@eecs.nwu.edu X-Administrivia-To: telecom-request@eecs.nwu.edu X-Telecom-Digest: Volume 10, Issue 120, message 3 of 12 There is a way to store a sparse list of long numbers or words in less space than it would take to list them, if you allow some false positives. Hence it would be better to store the valid numbers, else people would be falsely accused of using bad cards. Let N = number of numbers/words to store. Set M = 20M (20 is a user parameter) Allocate B[1:M] a long bit string, initially 0. So our total storage req is 20 bits/number, INDEPENDENT of the numbers' lengths, which may be much less than needed to store the numbers. Define 10 hash functions from numbers to bit locations: Hi(n) -> l where i says which hash function, n is the number to be hashed, and l is the output bit number. Now, to store number n, set the 10 bits B[H1(n)], ..., B[H10(n)]. Repeat for all the card numbers to be stored. This will set somewhat less than half of all the bits in B (because of some bits being set several times). To check whether a number is valid, compute the 10 hash functions and check the 10 bits. If any bit is 0, that is definitely an invalid number. On the other hand if the number is invalid there is less than a 1/1024 = 0.1% chance that this scheme gives a false positive. Note that the error rate is a function of 10 and 20. This is also an excellent method to store a spelling dictionary. Does anyone know if this is actuallu used for calling or credit cards? Wm. Randolph Franklin Internet: wrf@ecse.rpi.edu (or @cs.rpi.edu) Bitnet: Wrfrankl@Rpitsmts Telephone: (518) 276-6077; Telex: 6716050 RPI TROU; Fax: (518) 276-6261 Paper: ECSE Dept., 6026 JEC, Rensselaer Polytechnic Inst, Troy NY, 12180