Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!uunet!clyde.concordia.ca!mcgill-vision!quiche!opus!clement
From: clement@opus.cs.mcgill.ca (Clement Pellerin)
Newsgroups: comp.text
Subject: Re: wanted "French digital dictionary"
Message-ID: <1841@opus.cs.mcgill.ca>
Date: 11 Jan 90 21:04:12 GMT
References: <1838@opus.cs.mcgill.ca> <110@pellan.UUCP> <12755@cgl.ucsf.EDU> <90Jan11.141206est.2694@neat.cs.toronto.edu>
Reply-To: clement@opus.UUCP (Clement Pellerin)
Organization: SOCS, McGill University, Montreal, Canada
Lines: 41

In article <90Jan11.141206est.2694@neat.cs.toronto.edu> J.-F. Lamy writes:
> Spelling checking is more difficult in languages where number and gender
> agreement is an issue. A simple-minded approach like that of spell or ispell
> would give you an immense number of false errors. In the case of French you
> would at least need all conjugated variants of French verbs, and a way to
> deal with accents properly, and I claim that would still not be enough.
> remember well there are over 180 forms of verb conjugation in French -- forget
> about those 3 groups and a few irregular ones you learned about in High School

There are complete books on conjugations. Even we can't get it right;
that's why we need a spelling checker :-)

> The reason I bring this up is that someone tried to do spelling verification
> using that data, and found out that it is a much harder problem than one might
> think it is. So let me go on record as extremely skeptical that anything
> useful would come out of a simple minded approach, and that what works for
> English (spell/ispell) will not carry over to other languages (like French)
> where word morphology is subject to weird and wonderful transmutations when
> changing gender, number or tense.

Granted, spell works because of the simple rules of English. A good French
spelling checker would have to do a great deal of syntactical analysis
before even coming close to what spell can achieve.

I was well aware of the difficulties, and that's the reason I am only
asking for a simple-minded solution. Doing it right is simply not possible.
Nevertheless, I consider that simple-minded help is better than nothing.

I would settle for anything that would look up every word in the dictionary
to see if it is present or not. You seem to imply that even this does not
work. Obviously, errors of number, gender and tense will go unnoticed. It
will at least catch spelling mistakes in the roots of the words. Can you
expand on those experiments? I don't see how he would conclude that this
simple-minded tool is not worth it.

Let me restate that we are also looking for a machine-readable dictionary
with definitions of the words. Fast indexing is a problem, but it should be
easy to solve. Webster on the NeXT does it very well indeed.

--
news
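
The simple-minded lookup asked for above (check each word against a word
list, nothing more) can be sketched roughly as follows. This is only an
illustration under stated assumptions, not how spell or ispell work: the
word list SAMPLE_WORDS and the function names normalize, build_index and
unknown_words are all hypothetical, and a real list would be read from a
machine-readable dictionary file. A hash-based set stands in for the
"fast indexing" step, and accented letters are put into one composed form
before comparison.

```python
import re
import unicodedata

# Hypothetical miniature word list (an assumption for this sketch); a real
# one would be loaded from a machine-readable dictionary, one word per line.
SAMPLE_WORDS = ["le", "la", "chat", "élève", "être", "mange", "pomme", "une"]

# A word is a run of letters, accented letters included.
# Digits, underscores, apostrophes and hyphens all end a word.
WORD_RE = re.compile(r"[^\W\d_]+")


def normalize(word):
    """Put accented letters into composed form (NFC) and lowercase the word."""
    return unicodedata.normalize("NFC", word).lower()


def build_index(words):
    """Build a hash-based set so that each lookup is constant time on average."""
    return {normalize(w) for w in words}


def unknown_words(text, index):
    """Return the words of `text` that are absent from the index, in order."""
    # Compose accents first, so that "e" followed by a combining accent is
    # tokenized as one letter rather than splitting the word.
    text = unicodedata.normalize("NFC", text)
    return [w for w in WORD_RE.findall(text) if normalize(w) not in index]


if __name__ == "__main__":
    index = build_index(SAMPLE_WORDS)
    print(unknown_words("Le chat mange une pome", index))
```

With this sample list the misspelled root "pome" is reported. An inflected
form such as "mangent" is reported as well unless the list happens to
contain it, which is the false-error problem raised in the quoted article;
a form that is in the list but wrong for its context passes unnoticed.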