Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!cbmvax!cbmehq!cbmdeo!lenler!moria!bojsen From: bojsen@moria.uucp (Per Bojsen) Newsgroups: comp.sys.amiga.programmer Subject: Re: The Boyer-Moore string searching algorithm Message-ID: <1947e763.ARN36eb@moria.uucp> Date: 11 Jun 91 00:24:51 GMT References: <492@regina.uregina.ca> <19453c80.ARN360b@moria.uucp> <1991Jun10.140028.9664@ira.uka.de> Reply-To: cbmehq!moria!bojsen Followup-To: comp.sys.amiga.programmer Organization: IDUN-Soft Aps. Lines: 48 In article <1991Jun10.140028.9664@ira.uka.de>, |S| Angela Schmidt writes: > In <19453c80.ARN360b@moria.uucp> bojsen@moria.uucp writes: > > > The fastest-known single-keyword string searching algorithm both in theory > > and practice is the Boyer-Moore string matching algorithm described in > > Boyer, R.S. and J.S. Moore, ``A Fast String Searching Algorithm'', Comm. > > ACM 20(10) (1977) pp. 62-72. > > > Not totally correct. Boyer-Moore works in O(n*m), another algorithm, KMP only > needs O(n+m), where n is the length of the text you are searching in and m is > the length of the text you are searching for. So Boyer&Moore is not the > fastest in theory, only in practice: > I'm sorry, but I believe Boyer-Moore is O(m+n) like KMP. But Boyer-Moore has a much better average behavior than KMP. For sufficiently large alphabets Boyer-Moore only needs to examin about n/m characters of the input string on the average. > I've already implemented both algorithms. I've also implemented the most > trivialest one which needs about O(n*m) and looks like this: > for (i=0; i j=0; > while (j ... > j++; > } > } > This must be what is known as the brute-force algorithm. > I stopped the time in all three cases. I searched ASCII-Strings on a harddisk, > using the device. So there were a lot of differnent characters. KMP needed > about twice as much(!!!) time as the normal algorithm which needs about > O(n*m) - like Boyer&Moore. But when I used Boyer&Moore, it run much faster - > depending of the length of the searching string. So Boyer&Moore is really the > fastest - in practice. > Yes, and in theory, since it's O(m+n) and *not* O(m*n). You may want to check the references in my original posting. -- .------------------------------------------------------------------------------. | Greetings from Per Bojsen. | +------------------------------+-----------------------------------------------+ | EMail: cbmehq!lenler!bojsen | "Names do have power, after all, that of | | Or: bojsen@dc.dth.dk | conjuring images of places we have not seen" | `------------------------------+-----------------------------------------------'