Path: utzoo!utgpu!attcan!uunet!ncrlnk!ncrcae!hubcap!gatech!ukma!tut.cis.ohio-state.edu!uccba!uceng!dmocsny
From: dmocsny@uceng.UC.EDU (daniel mocsny)
Newsgroups: comp.sys.next
Subject: Re: Hundreds of books on an optical disk
Summary: connections.
Message-ID: <407@uceng.UC.EDU>
Date: 10 Nov 88 06:53:42 GMT
References: <0XMtqn087E-0A14EYk@andrew.cmu.edu> <344@uceng.UC.EDU> <1804@garth.UUCP>
Organization: Univ. of Cincinnati, College of Engg.
Lines: 43

In article <1804@garth.UUCP>, fenwick@garth.UUCP (Stephen Fenwick) writes:
> The only problem with this is keeping everything on file in a manner that
> allows users to find what they need.  This is non-trivial, as the information
> content of a work may not be limited by the author's conception of its
> content.  Watch the PBS series "Connections" to see what I mean.  

This is exactly why we need to store information in a form that
retains maximum flexibility: the author cannot predict all the uses
readers will find for it. Suppose we just store all the books and
articles as fully-indexed files, and follow the present card catalog
system. Is this going to make information less accessible than it now
is in printed form?

How much human effort goes into re-typing printed information? Look
at almost any scholarly paper out there.  Up to half of it is
literature survey. Most of the survey is there because the author
can't count on readers having ready access to all the previous papers.
Sometimes the survey adds value, by putting previous work in
perspective, but a lot of it simply gives researchers useless
work to do.

> Machines
> are currently very good at fast data retrieval, but decidedly bad at making
> inferences about the data that they store.

True enough, but I'll be happy to make the inferences about what I
need. First I've got to get at the information. A machine that did
no more than automatically retrieve all the citations in a given paper
would be an enormous help. (You know how frustrating being stumped by
a missing citation is -- the author skips some important steps because
they're in paper X, your library doesn't have the journal, so off you
go, wasting valuable time and money trying to track it down.) I could
also make real progress with a few boolean expressions and short
phrases, provided that I could search abstracts and/or text of papers
and books.
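
As a rough illustration of the boolean search over abstracts described
above, here is a minimal sketch in Python. Everything in it is an
assumption made for the example: the sample records, the function
names, and the query form (all_of / any_of / none_of word lists). It
matches single words only; phrase matching is left out.

```python
import re

# Hypothetical records: (title, abstract) pairs standing in for a
# machine-readable library index. The data is invented for illustration.
PAPERS = [
    ("Paper A", "Optical disk storage for full-text retrieval of journals."),
    ("Paper B", "A survey of card catalog systems in university libraries."),
    ("Paper C", "Full-text search of abstracts using boolean expressions."),
]


def words(text):
    """Lowercase the text and return the set of words it contains."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(papers, all_of=(), any_of=(), none_of=()):
    """Return titles whose abstract satisfies a simple boolean query.

    all_of  -- every term must appear (AND)
    any_of  -- at least one term must appear (OR); ignored if empty
    none_of -- no term may appear (NOT)
    """
    hits = []
    for title, abstract in papers:
        found = words(abstract)
        if not all(term.lower() in found for term in all_of):
            continue
        if any_of and not any(term.lower() in found for term in any_of):
            continue
        if any(term.lower() in found for term in none_of):
            continue
        hits.append(title)
    return hits
```

For instance, search(PAPERS, all_of=["full", "text"],
none_of=["optical"]) keeps only the abstracts that mention both "full"
and "text" but not "optical". A real system would need an inverted
index rather than a linear scan, but the query logic is the same.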

Perhaps someday we will have machines that ``look over your shoulder''
and spot analogies between problem X that's stumping you and problem
Z that appeared in some obscure Eastern-bloc journal. If we could do
that today, our 50-year technological diffusion times would shrink
to weeks or days.

Dan Mocsny