Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!rutgers!gatech!bloom-beacon!XEROX.COM!Bagley.PA
From: Bagley.PA@XEROX.COM
Newsgroups: comp.ai.digest
Subject: Large corpora of English text
Message-ID: <19880731213344.3.NICK@HOWARD-JOHNSONS.LCS.MIT.EDU>
Date: 31 Jul 88 21:33:00 GMT
Sender: daemon@bloom-beacon.MIT.EDU
Organization: The Internet
Lines: 20
Approved: ailist@ai.ai.mit.edu

Date: Thu, 28 Jul 88 15:30 EDT
From: Bagley.PA@Xerox.COM
Subject: Large corpora of English text
To: nl-kr@cs.rochester.edu, ailist@ai.ai.mit.edu
Line-fold: no

I am looking for public domain or commercially available corpora of
either written English or transcriptions of spoken English, preferably
significantly longer than a million characters.  If it is tagged with
part-of-speech that would be great, but it isn't necessary.

Thanks for all assistance.

Steve Bagley
System Sciences Laboratory
Xerox PARC
3333 Coyote Hill Road
Palo Alto CA 94301
Bagley.pa@xerox.com
415-494-4331

-------