Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.3 4.3bsd-beta 6/6/85; site harvard.UUCP
Path: utzoo!watmath!clyde!cbosgd!gatech!seismo!harvard!macrakis
From: macrakis@harvard.UUCP (Stavros Macrakis)
Newsgroups: net.internat,net.nlang
Subject: Texts in other languages
Message-ID: <517@harvard.UUCP>
Date: Wed, 20-Nov-85 15:30:02 EST
Article-I.D.: harvard.517
Posted: Wed Nov 20 15:30:02 1985
Date-Received: Sat, 23-Nov-85 01:27:29 EST
Organization: Aiken Comp. Lab., Harvard
Lines: 18
Keywords: text, natural language, corpus
Xref: watmath net.internat:77 net.nlang:3779

For an experiment in text compression, I would find it useful to have
a collection of texts in a variety of languages. Ideally, I would
like a half-dozen distinct texts, each 2000-15000 words long, in each
language. The texts should be in a consistent (documented)
transcription, preferably without formatting commands.

The texts need not be selected to be `representative' of the
language. For instance, technical papers are fine.

The languages in which I am interested are French, Italian, German,
(Modern) Greek, Arabic, and Turkish. If you have texts in other
languages, please let me know.

If you could send me mail describing the texts you might be able to
provide, we can find some way of transferring them later.

Thanks

	-s

Macrakis@Harvard.{Harvard.EDU,ARPA,uucp,csnet}
        @Harvunxh.bitnet