Path: utzoo!mnetor!uunet!husc6!tut.cis.ohio-state.edu!mdf
From: mdf@tut.cis.ohio-state.edu (Mark D. Freeman)
Newsgroups: comp.databases
Subject: Duplicate elimination
Message-ID: <10496@tut.cis.ohio-state.edu>
Date: 13 Apr 88 19:28:28 GMT
Organization: StrongPoint Systems, Inc.; Columbus, OH. (guest of Ohio State U.)
Lines: 28

I am looking for some algorithms to do duplicate detection on addresses.
We have several databases, all of which share this subset of fields:

	First Name
	Last Name
	Address1
	Address2
	City
	State
	Zip

We would like some way of determining whether a new record duplicates an
existing address, taking into account variations in the addressing
(e.g. 201 Test Street and 201 Test St., 201-B Foo Ave and
201 Foo Ave. Apt. B, etc.).

An algorithm to standardize addresses would be great too.  The post office
uses one for their free 9-digit-zip encoding service, but I don't know
how it works.

Thanks!
-- 
Mark D. Freeman   (614) 262-1418   mdf@tut.cis.ohio-state.edu
2440 Medary Avenue   ...!cbosgd!osu-cis!tut.cis.ohio-state.edu!mdf
Columbus, OH 43202-3014   Guest account at The Ohio State University
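
To make the kind of matching asked about above concrete, here is a minimal sketch in Python. It is not the post office's method, which is not described in this post. It simply reduces an address line to a (number, street, unit) key so that the example variants above compare equal. The function name normalize_address and the tiny SUFFIXES and UNIT_WORDS tables are assumptions made for illustration; a usable table would be far larger.

```python
import re

# Hypothetical, deliberately tiny lookup tables (assumptions for illustration).
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD",
            "DRIVE": "DR", "BOULEVARD": "BLVD"}
UNIT_WORDS = {"APT", "APARTMENT", "UNIT", "SUITE", "STE"}


def normalize_address(line):
    """Reduce one address line to a comparable (number, street, unit) key."""
    # Upper-case, turn periods and commas into spaces, split on whitespace.
    tokens = re.sub(r"[.,]", " ", line.upper()).split()
    number, unit = "", ""

    # Leading house number, optionally carrying a unit letter ("201-B", "201B").
    if tokens:
        m = re.fullmatch(r"(\d+)(?:-?([A-Z]))?", tokens[0])
        if m:
            number, unit = m.group(1), m.group(2) or ""
            tokens = tokens[1:]

    street = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in UNIT_WORDS and i + 1 < len(tokens):
            # "APT B" style: the token after the unit word is the unit.
            unit = tokens[i + 1]
            i += 2
            continue
        # Map long suffix forms to their short form; leave other words alone.
        street.append(SUFFIXES.get(tok, tok))
        i += 1

    return (number, " ".join(street), unit)


def same_address(a, b):
    """Treat two address lines as duplicates when their keys are equal."""
    return normalize_address(a) == normalize_address(b)
```

Under these assumptions, same_address("201 Test Street", "201 Test St.") and same_address("201-B Foo Ave", "201 Foo Ave. Apt. B") both hold. Exact key comparison will still miss misspellings and transposed digits, so a real system would likely combine a key like this with the city, state and zip fields and some form of approximate string matching.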