Xref: utzoo comp.ai:7956 comp.ai.edu:121 comp.ai.philosophy:288 comp.ai.neural-nets:2419 comp.edu:3769
Path: utzoo!attcan!uunet!mcsun!unido!sbsvax!fb14vax!ak
From: ak@fb14vax.cs.uni-sb.de (Alfred Kobsa)
Newsgroups: comp.ai,comp.ai.edu,comp.ai.philosophy,comp.ai.neural-nets,comp.edu
Subject: MAILSERVER FOR AI LITERATURE
Keywords: bibliographic database, mail access, refer and LaTeX format
Message-ID: <7452@sbsvax.cs.uni-sb.de>
Date: 5 Nov 90 15:21:05 GMT
Sender: news@sbsvax.cs.uni-sb.de
Followup-To: poster
Lines: 189


                 THE LIDO MAILSERVER FOR AI LITERATURE             Version 2.0

A mail server has been developed at the Computer Science Department of the 
University of Saarbruecken which accesses a large database of bibliographic
data of articles pertaining to the field of Artificial Intelligence. At the 
moment, this database contains more than 13,000 articles, which can be
retrieved via electronic mail. The result will be returned either in LaTeX 
(Bibtex) format or in a Refer-like format.

This mail server is a "by-product" of the bibliographic information system
LIDO which is currently under development at the University of Saarbruecken.
The following people are involved in this project:

Coordination:  Alfred Kobsa
Hackers:       Monika Klar
               Alfred Kobsa
               Peter Schwarz
Wizards:       Gerd Herzog
               Clemens Huwig
Mail-Freak:    Roman Jansen-Winkeln
Data Input:    Christa Weinen
               Gisela Veit

The LIDO MAILSERVER is partly based on the UNIX refer system. Queries to the 
bibliographic database are restricted to the names of the author(s), the title,
and the year of publication. Users may select between full word search (fast,
since index-based; hence prioritized processing) and substring search with
optional regular expressions. Global search with key words is *not* possible.
Users who already have a certain overview of a field will thus probably profit
more from the LIDO MAILSERVER than novices familiarizing themselves with a new
area.

In order to keep the network and computer workload tolerable and to contain
erroneous queries, the following limits have been introduced:
  1. No more than 150 articles may be retrieved per query, and no more than
     500 per message.
  2. Queries with the option `nosubstring' are handled with priority.

Since LIDO is still under development, it cannot be distributed yet. However,
the bibliographic data (3 MB at the moment) may be obtained on a license basis
for a fee of U.S.$ 75.00-300.00 via ftp or on tape. Please understand that
we cannot lend out or copy the articles whose references you retrieve from
the bibliographic database. If you find an error, please send a note to
bib-1@cs.uni-sb.de.

Messages to the LIDO MAILSERVER should be sent to

                  lido@cs.uni-sb.de

and should have the following format:

a) Subject field:

   - First the key word `lidosearch'.
   - Then the desired format of the bibliographic data in the return message:
     `latex' (= Bibtex format) or `nolatex' (= refer-like format). The default
     is `nolatex'.
   - Then the form of retrieval:
     a) `nosubstring': Your search patterns (see below) must be full words. 
        Your message will be handled with priority.
     b) `substring' (default): Your search patterns may be substrings. Regular
        expressions in the egrep notation (see Appendix) may be used as well.
        Plural forms and spelling variants can thereby be accounted for.
   - Then the language that should be used for comments and error messages in 
     the return message: `english' or `deutsch' (default).

b) Body of the Message:
   
   Each line of the body of the message contains one or more search patterns 
   which may refer to the names of the authors, to words in the title, or to 
   the year of publication. If a line contains more than one search pattern, 
   only those articles are retrieved which match *all* patterns. German umlauts
   and the `scharfes s' should be transliterated as follows: A", O", U", a", 
   o", u", s"
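
The message format described above can be put together mechanically. The
following Python sketch is only an illustration: the function names
`transliterate' and `build_request' are hypothetical and not part of LIDO;
only the option keywords, their defaults, and the umlaut transliteration are
taken from the description above.

```python
# Hypothetical helper, not part of LIDO. Keywords and defaults follow the
# format description; everything else is an assumption for illustration.
UMLAUTS = {
    "\u00c4": 'A"', "\u00d6": 'O"', "\u00dc": 'U"',   # A-, O-, U-umlaut
    "\u00e4": 'a"', "\u00f6": 'o"', "\u00fc": 'u"',   # a-, o-, u-umlaut
    "\u00df": 's"',                                   # scharfes s
}


def transliterate(text):
    """Replace German umlauts and scharfes s by the two-character forms."""
    return "".join(UMLAUTS.get(ch, ch) for ch in text)


def build_request(queries, fmt="nolatex", mode="substring", language="deutsch"):
    """Return (subject, body) for a LIDO request.

    queries is a list of queries; each query is a list of search patterns
    that must all match (one query per line of the body).
    """
    if fmt not in ("latex", "nolatex"):
        raise ValueError("fmt must be 'latex' or 'nolatex'")
    if mode not in ("substring", "nosubstring"):
        raise ValueError("mode must be 'substring' or 'nosubstring'")
    if language not in ("english", "deutsch"):
        raise ValueError("language must be 'english' or 'deutsch'")
    subject = "lidosearch {} {} {}".format(fmt, mode, language)
    body = "\n".join(transliterate(" ".join(patterns)) for patterns in queries)
    return subject, body
```

Calling build_request([["kobs", "nat\u00fcrlichspr"]], fmt="latex",
language="english") produces the subject line and body of Example 2 below.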


Example 1:
---------
mail lido@cs.uni-sb.de
Subject: lidosearch latex nosubstring english

wahlster
generation
kobsa models 1989

This message contains three different queries. In the first case, all articles
are retrieved which contain the word `wahlster' as an author's name or as a
word in the title. In the second case, the same applies to `generation'. In the
third case, all articles are retrieved which contain all of `kobsa', `models',
and `1989' (but not articles containing only `model' or `modelling', since
`nosubstring' was selected). The message will be handled with priority since
`nosubstring' was chosen. The references in the return message will be in LaTeX
(Bibtex) format, and error messages and comments will be in English.
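
The difference between the two retrieval modes for a line with several
patterns, such as the third query above, can be sketched in Python as follows.
This is a toy model only and not the LIDO implementation: the function names,
the tokenization, and the case-insensitive comparison are assumptions, and the
records are invented strings rather than entries from the database.

```python
import re


def words(record):
    """Split a record into lower-case words (assumed tokenization)."""
    return set(re.findall(r'[a-z0-9"]+', record.lower()))


def matches(record, patterns, substring=True):
    """True if the record satisfies *all* patterns of one query line."""
    if substring:
        # substring mode: each pattern may occur anywhere, also inside a word
        return all(re.search(p, record, re.IGNORECASE) for p in patterns)
    # nosubstring mode: each pattern must equal a complete word
    tokens = words(record)
    return all(p.lower() in tokens for p in patterns)


# Invented toy records, for illustration only.
r1 = "kobsa | toy title about user models | 1989"
r2 = "kobsa | toy title about user modelling | 1989"
```

With these records, matches(r1, ["kobsa", "models", "1989"], substring=False)
holds, whereas r2 is not matched by the same query. The pattern `model' would
match r2 in substring mode but not in nosubstring mode.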


Example 2:
---------
mail lido@cs.uni-sb.de
Subject: lidosearch latex substring english

kobs natu"rlichspr 

This message contains a single query only. All articles will be retrieved which
contain both the substring `kobs' (as in `Kobsa' or `Jakobson') and the
substring `natu"rlichspr'. The return message will come in LaTeX format,
and error messages and comments will be in English.


Example 3:
---------
mail lido@cs.uni-sb.de
Subject: lidosearch substring english

morpholog(y|ie)
modell?ing
modell*ing
model+ing
ja[ck]obson
\<kobs

This message contains 6 queries which will yield articles containing the 
following strings in the titles or authors' names (the output will come in 
a refer-like format, and the comments will be in English):

Query 1:   `morphology' or `morphologie' (German spelling)
Query 2:   `modeling' or `modelling'
Query 3+4: `modeling', `modelling', `modellling', etc.
Query 5:   `jacobson' or `jakobson'
Query 6:   `kobs' at the beginning of a word (thus articles by Kobsa but
           not by Jakobson are found).
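
Patterns like these can be tried out locally before mailing them. The sketch
below uses Python's `re' module, which is not egrep: in particular it does not
know `\<' and `\>', so the hypothetical helper `egrep_to_python' substitutes
`\b' (edge of a word) as an approximation. Case-insensitive matching is an
assumption as well.

```python
import re


def egrep_to_python(pattern):
    """Approximate egrep's \\< and \\> by Python's \\b (naive replacement)."""
    return pattern.replace(r"\<", r"\b").replace(r"\>", r"\b")


def hits(pattern, candidates):
    """Return the candidate strings in which the pattern is found."""
    rx = re.compile(egrep_to_python(pattern), re.IGNORECASE)
    return [c for c in candidates if rx.search(c)]


spellings = ["modeling", "modelling", "modellling"]
names = ["Kobsa", "Jakobson", "Jacobson"]

print(hits(r"morpholog(y|ie)", ["morphology", "morphologie"]))
print(hits(r"modell?ing", spellings))    # at most two l's
print(hits(r"modell*ing", spellings))    # any number of l's after `model'
print(hits(r"model+ing", spellings))     # one or more l's
print(hits(r"ja[ck]obson", names))
print(hits(r"\<kobs", names))            # only at the beginning of a word
```

The second query accepts only the first two spellings, while the third and
fourth accept all three; the last query finds `Kobsa' but not `Jakobson'.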
            
Summary:
   
   mail lido@cs.uni-sb.de
   Subject: lidosearch [help][info]                        Sends this message
                       {[latex][nolatex]}                  Default: nolatex
                       {[substring][nosubstring]}          Default: substring
                       {[english][deutsch]}                Default: deutsch
Body of message:
         Query pattern(s) of first query
         Query pattern(s) of second query
             :
             :

Bugs: Very long words are truncated by the refer program which underlies the
      'nosubstring' mode of LIDO. Theoretically it could therefore happen that
      additional undesired articles are retrieved by the LIDO MAILSERVER in 
      this mode when long patterns are employed.
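
How truncation can lead to undesired hits may be sketched as follows. The key
length at which words are cut off is not stated here, so `keylen' in this
Python sketch is a hypothetical parameter, and the function names are invented
for the illustration; this is not the refer code.

```python
def index_key(word, keylen):
    """Index key of a word: lower-cased and cut off after keylen characters."""
    return word.lower()[:keylen]


def full_word_hits(pattern, title_words, keylen):
    """Words whose truncated index key equals that of the pattern."""
    key = index_key(pattern, keylen)
    return [w for w in title_words if index_key(w, keylen) == key]


title_words = ["representative", "representation", "model"]

# With an assumed key length of 10, both long words share the key
# "representa", so the pattern also retrieves the undesired word.
print(full_word_hits("representation", title_words, 10))

# With a key length that exceeds the word length, only the exact word is found.
print(full_word_hits("representation", title_words, 20))
```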

Good luck with your bibliographic search with LIDO!

-------------------------------------------------------------------------------


REGULAR EXPRESSIONS
    
     (egrep)   (explanation)

        c      a single (non-meta) character matches itself.
        .      matches any single character except newline.
        ?      postfix operator; preceding item is optional.
        *      postfix operator; preceding item 0 or more times.
        +      postfix operator; preceding item 1 or more times.
        |      infix operator; matches either argument.
       \<      matches the empty string at the beginning of a word.
       \>      matches the empty string at the end of a word.
     [chars]   matches any character in the given class; if the first character
               after [ is ^, matches any character not in the given class;
               a range of characters may be specified by first-last;
               for example, \W (below) is equivalent to the class [^A-Za-z0-9].
       ( )     parentheses are used to override operator precedence.
     \digit    \n matches a repeat of the text matched earlier in the regexp
               by the subexpression inside the nth opening parenthesis.
        \      any special character may be preceded by a backslash to match it
               literally.

(the following are for compatibility with GNU Emacs)
      \b       matches the empty string at the edge of a word.
      \B       matches the empty string if not at the edge of a word.
      \w       matches word-constituent characters (letters & digits).
      \W       matches characters that are not word-constituent.

     Operator precedence is (highest to lowest) ?, *, and +, concatenation,
     and finally |.  All other constructs are syntactically identical to
     normal characters.  For the truly interested, the file dfa.c describes
     (and implements) the exact grammar understood by the parser.