Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!spool.mu.edu!rex!rouge!vke
From: vke@cacs.usl.edu (Venkatesh K. E.)
Newsgroups: comp.ai.neural-nets
Subject: Re: Clustering
Message-ID: <31771@rouge.usl.edu>
Date: 26 May 91 04:47:23 GMT
Article-I.D.: rouge.31771
References: <1991May20.203008.27681@noose.ecn.purdue.edu> <19703@sdcc6.ucsd.edu>
Sender: anon@rouge.usl.edu
Organization: The Center for Advanced Computer Studies, USL
Lines: 87

In article <19703@sdcc6.ucsd.edu> demers@beowulf.ucsd.edu (David Demers) writes:
>In article reynolds@park.bu.edu (John Reynolds) writes:
>
>>>>>>> On 21 May 91 14:10:44 GMT, greenba@gambia.crd.ge.com (ben a green) said:
>
>>ben> Clustering is a way to sort things into groups that share similarities.
>>ben> If you already know the classes to which the things belong, what's the
>>ben> point of trying to cluster them?
>
>->>In addition to code compression, which is a consequence of both
>->>supervised and unsupervised clustering, some supervised clustering
>->>algorithms can allow generalization. Some systems can tessellate the
>->>input space into homogeneous regions containing only patterns of a
>->>single class. If such regions can be identified, then an informed
>->>guess can be made about the class membership of future, unlabeled
>->>patterns, and they can be treated accordingly.
>
>->>Moreover, it is often inappropriate to group patterns according to
>->>ostensible similarity because the values of important variables may
>->>not be known. Two objects may appear similar but they may differ in
>->>unknown but important variables.
>
>->>Some supervised clustering algorithms can locally warp the similarity
>->>metric so that functionally similar patterns are grouped together and
>->>vice versa. Dimensions which are useful in separating functionally
>->>different classes of objects are enhanced, and irrelevant dimensions
>->>are compressed.
>
>OK, sounds interesting.
>I too naively thought clustering was an unsupervised method.
>But I cannot locate anything about supervised clustering in
>Duda & Hart, Jain, or Hartigan.  Anyone have any references?
>

As far as I know, there are not many papers that discuss clustering of
patterns in supervised mode. But I am sure that there are some published
papers in Information Retrieval (IR). The concepts used in IR are very
similar to those in Pattern Recognition.

The work I have in mind is 'User Oriented Clustering' of documents, which
uses learning from examples. For each cluster (or in this case, category),
some positive and negative examples are provided (in this case, documents
that match the concept and documents that do not). Within the positive
samples, some may be closer to the concept than others, so we can rank
the documents on a linear scale, say 1-5 or 1-7, where 1 stands for most
relevant and 5 or 7 stands for least relevant.

In normal clustering of documents (Salton's work), the document vectors
(pattern vectors) are grouped together based on some standard similarity
measure, for example cosine similarity. This may not be very useful in
User Oriented systems, where user preferences are more important than the
similarity between documents. For example, suppose two documents (journal
articles) are deemed relevant to the concept Computer_Science. The cosine
similarity between these two documents need not be high.

The above-mentioned work is part of the PhD dissertation of Gwang S. Jung,
titled "Connectionist Domain Knowledge Acquisition and its Evaluation in
Information Retrieval", The Center for Advanced Computer Studies,
University of Southwestern Louisiana, May 1991.

Other references can be found in the annual ACM SIGIR conference
proceedings or in IPM (Information Processing & Management).

Other references on learning from examples are:

   1. There are about 5 chapters in the book "Machine Learning: an AI
      Approach", Vol. 2 - Michalski, Carbonell, Mitchell; Morgan
      Kaufmann, 1986.

   2. Vol. 1 may also be useful; I haven't gone through Vol. 3.
      The three volumes are considered to be "The Bible".

   3. Machine Learning, Paradigms and Methods; Artificial Intelligence
      Vol. 40. Also published as a book by MIT Press - edited by
      J. Carbonell.

   4. There is a new book out by MIT Press. Its title is something like
      "Pattern Recognition and Neural Networks" - Carpenter and Grossberg.
      I am not sure about the title. This book may contain some info on
      supervised pattern clustering and recognition using ART
      (Adaptive Resonance Theory).

Any comments on the above observations would be useful. Also, if there
are any papers on supervised clustering that do not come from learning
from examples, I would be eager to take note of them.

----Venkatesh K. E.

-------------------------------------------------------------------------------
Venkatesh K. Elayavalli               phone : (318) 231 5809 (Off)
P. O. Box 42544                       email : vke@cacs.usl.edu
Lafayette, LA 70504.
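
P.S. The point above about cosine similarity versus user relevance can be
sketched in a few lines of Python. This is only a minimal illustration
under stated assumptions, not the method from Jung's dissertation: the
vocabulary, the term counts, and the user ranks are all made up, and the
names cosine_similarity, doc_compilers, doc_neural_nets and user_rank are
hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_a * norm_b)


# Hypothetical term-count vectors over the made-up vocabulary
# [compiler, parser, neuron, backprop].
doc_compilers = [3, 2, 0, 0]
doc_neural_nets = [0, 0, 4, 1]

# Hypothetical user judgments on the 1-5 scale (1 = most relevant)
# for the concept Computer_Science.
user_rank = {"doc_compilers": 1, "doc_neural_nets": 2}

similarity = cosine_similarity(doc_compilers, doc_neural_nets)
both_relevant = all(rank <= 2 for rank in user_rank.values())

print("cosine similarity:", similarity)
print("both judged relevant by the user:", both_relevant)
```

The two vectors share no terms, so a similarity-based clusterer would keep
them apart, even though the (hypothetical) user ranks both as relevant to
the same concept. That gap is what a user-oriented, supervised grouping
would have to learn from the examples.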