Newsgroups: comp.sys.amiga.programmer Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!snorkelwacker.mit.edu!ira.uka.de!news From: S_ASCHMIDT@iravcl.ira.uka.de (|S| Angela Schmidt) Subject: Re: The Boyer-Moore string searching algorithm In-Reply-To: bojsen@moria.uucp's message of 11 Jun 91 00: 24:51 GMT Message-ID: <1991Jun12.143556.4127@ira.uka.de> Sender: news@ira.uka.de (USENET News System) Organization: University of Karlsruhe, FRG (Informatik Rechnerabteilung) References: <492@regina.uregina.ca> <19453c80.ARN360b@moria.uucp> <1991Jun10.140028.9664@ira.uka.de> <1947e763.ARN36eb@moria.uucp> Date: Wed, 12 Jun 1991 15:16:19 GMT X-News-Reader: VMS NEWS 1.08 Lines: 66 In <1947e763.ARN36eb@moria.uucp> bojsen@moria.uucp writes: > In article <1991Jun10.140028.9664@ira.uka.de>, |S| Angela Schmidt writes: > > > In <19453c80.ARN360b@moria.uucp> bojsen@moria.uucp writes: > > > > > The fastest-known single-keyword string searching algorithm both in theory > > > and practice is the Boyer-Moore string matching algorithm described in > > > Boyer, R.S. and J.S. Moore, ``A Fast String Searching Algorithm'', Comm. > > > ACM 20(10) (1977) pp. 62-72. > > > > > Not totally correct. Boyer-Moore works in O(n*m), another algorithm, KMP only > > needs O(n+m), where n is the length of the text you are searching in and m is > > the length of the text you are searching for. So Boyer&Moore is not the > > fastest in theory, only in practice: > > > I'm sorry, but I believe Boyer-Moore is O(m+n) like KMP. But Boyer-Moore has > a much better average behavior than KMP. For sufficiently large alphabets > Boyer-Moore only needs to examin about n/m characters of the input string > on the average. > Sorry, but I'm *very* shure it has O(n*m) since - in the worst case - it always tries to match the whole string and only recognizes at the last character, that there is a mismatch. Doing this n times causes O(m*n). Of course, normally it only needs a little bit more than n/m - that's why in practice it's so quick. aaaaaaa String you search in called A baaa String you search for called B ^ means that the character matches ! means that the character does not match Strings Characters that will be checked Number of checks --------------------------------------------------------------- aaaaaaa baaa !^^^ A[3] A[2] A[1] - mismatch at A[0] m checks in 1st round aaaaaaa baaa A[4] A[3] A[2] - mismatch at A[1] m checks in 2nd round !^^^ ... (n-m+1 times) This will be done n-m+1 times, so you need O(m*(n-m+1)) and since m< [...] > -- > .------------------------------------------------------------------------------. > | Greetings from Per Bojsen. | > +------------------------------+-----------------------------------------------+ > | EMail: cbmehq!lenler!bojsen | "Names do have power, after all, that of | > | Or: bojsen@dc.dth.dk | conjuring images of places we have not seen" | > `------------------------------+-----------------------------------------------' ---------------------------------------------------------+--------------------- Angela Schmidt Internet: S_ASchmidt@iravcl.ira.uka.de | // (Nessy in IRC) BITNET: UK8B@DKAUNI2.BITNET | Amiga // Phone: +49 731 712316 & 721 6904-263 | \\ // only 1000 MEZ: (10am-9pm) (12pm-10pm) | \X/ the real one! ---------------------------------------------------------+---------------------