Xref: utzoo comp.protocols.tcp-ip:16594 comp.archives.admin:53
Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!zaphod.mps.ohio-state.edu!samsung!munnari.oz.au!mel.dit.csiro.au!manta.mel.dit.CSIRO.AU!ajw
From: ajw@manta.mel.dit.CSIRO.AU (Andrew Waugh)
Newsgroups: comp.protocols.tcp-ip,comp.archives.admin
Subject: Re: building an interstate (data) highway with no roadmaps
Message-ID: <1991Jun19.001411.5396@mel.dit.csiro.au>
Date: 19 Jun 91 00:14:11 GMT
References: <9106171612.AA01441@mazatzal.merit.edu>
Sender: usenet@mel.dit.csiro.au (usenet mail contact)
Organization: CSIRO DIT (Melb.)
Lines: 93

In article emv@msen.com (Ed Vielmetti) writes:
> X.500 Directory services assume a neat, structured, hierarchical name
> space and a clear line of authority running from the root all the way
> to the leaves.

While this is certainly true, it is important to understand why
this is so.

X.500 is intended to support a distributed directory service. It is
assumed that there will be thousands, if not millions, of repositories
of data (DSAs). These will co-operate to provide the illusion of a
single large directory.

The problem with this model is how you return a negative answer in a
timely fashion. Say you ask your local DSA for a piece of information.
If the local DSA holds the information you want, it will return it.
But what if it doesn't hold the information? Well, the DSA could ask
another DSA, but what if this second DSA also doesn't hold the
information? How many DSAs do you contact before you return the answer
"No, that piece of information does not exist"? All of them?

X.500 solves this problem by structuring the stored data hierarchically
and using this hierarchy as the basis for distributing the data amongst
DSAs. Using a straightforward navigation algorithm, a query for
information can always progress towards the DSA which should hold the
information.
If the information does not exist, that DSA can authoritatively answer
"No such information exists." You don't have to visit all - or even a
large proportion - of the DSAs in the world.

It is important to realise that this is a generic problem with highly
distributed databases. The X.500 designers chose to solve it by
structuring the data. This means that X.500 is suitable for storing
data which can be represented hierarchically and is less suitable for
storing data which cannot. Exactly what data will be suitable for
storing in X.500 is currently an open question - there is simply not
sufficient experience.

The proposed archive database which started this thread will have
exactly the same problem. The solution chosen, if different to that
which X.500 uses, will have problems as well. There is no such thing
as a perfect networking solution!

>X.500 services are hard to run -- the technology is big, bulky,
>osified. So the people who are most interested in running it are the
>"computer center" folks. If you look for the innovative, interesting,
>and desirable applications that you'd want to find on the net, you'll
>see that many of them are being done out in the field in departmental
>computing environments or increasingly in small focused private
>commercial or non-commercial efforts. There's not a terribly good
>reason for these two groups to communicate, and so most X.500 projects
>have much more structure than substance.
>
>X.500 services are directory oriented. The data in them is relatively
>small, of known value, and highly structured. Information about
>archive sources is just about completely counter to these basic
>principles. The amount of information about any particular service
>which you'd like to have on hand can be quite considerable; perhaps at
>minimum access instructions, but more likely some text describing the
>service, who its intended audience is, sample output, etc.
>In addition it would be valuable to keep information on user reactions to
>the system close to the official provided directory notice; these
>reviews (a la the michelin guide) are often more valuable than the
>official propaganda put out by the designer. To search this mass of
>information, you'll want something much more expressive than the
>relatively pitiful X.500 directory access tools -- full text
>searching, at the very minimum, with a way to sensibly deal both with
>structured data and with more fuzzy matches on "similar" items.
>
>X.500 is a holy grail, there's a lot of money which seems to be being
>thrown at it these days in the hope to make it useful. Good luck, I
>wish you well. But please, don't try to cram all the world's data
>into it, because it doesn't all fit. It's a shame that equivalent
>amounts of effort aren't being spent on developing other protocols
>more suited to the task. I'm thinking in particular of the Z39.50
>implementation in WAIS [*] which holds a lot of potential for
>providing a reasonable structure for searching and sifting through
>databases which have rich textual information. Perhaps it's just as
>well that federal subsidy hasn't intruded here and clouded people's
>judgments on the applicability of a particular technology to a
>certain task.

As for the rest of the posting, all I can say is that it must be great
to know so much about the costs and benefits of using X.500. From my
perspective, it is obvious that X.500 will not solve all the world's
problems (nothing ever does :-) but it is way too early to be so
dogmatic. When we have had
	1) the necessary experience of implementing X.500, running
	   X.500 databases and storing different types of data in such
	   a database; and
	2) experience in alternative highly distributed databases
	   (X.500 might prove to be extremely poor for storing certain
	   types of data - but the alternatives might be even worse),
then we can be dogmatic.

andrew waugh
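
P.S. To make the navigation argument earlier in this post concrete,
here is a rough sketch in Python. It is not the X.500 protocol itself
(there is no chaining, referral or knowledge-reference machinery), only
an illustration of why a hierarchical name space allows a bounded,
authoritative "no". The DSA class, the attach/covers/resolve helpers
and all the names and entries are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False, repr=False)
class DSA:
    """Hypothetical directory server responsible for one subtree of names."""
    name: str
    context: tuple                                    # name prefix this DSA answers for
    entries: dict = field(default_factory=dict)       # full name (tuple) -> data
    subordinates: list = field(default_factory=list)  # DSAs holding deeper subtrees
    superior: Optional["DSA"] = None                  # DSA one level up, None at the root


def attach(parent, child):
    """Link a subordinate DSA beneath its superior (illustrative helper)."""
    parent.subordinates.append(child)
    child.superior = parent


def covers(dsa, name):
    """True if `name` lies inside the subtree this DSA is responsible for."""
    return name[:len(dsa.context)] == dsa.context


def resolve(start, name):
    """Walk towards the DSA that should hold `name`.

    Returns (data or None, names of the DSAs visited). A None result is
    authoritative: it comes from the one DSA responsible for that name.
    """
    dsa, visited = start, []
    while True:
        visited.append(dsa.name)
        if not covers(dsa, name):
            # Name is outside this subtree: move up the hierarchy.
            if dsa.superior is None:
                raise ValueError("name lies outside the root context")
            dsa = dsa.superior
            continue
        # Name is inside this subtree: is a subordinate more specific?
        deeper = next((s for s in dsa.subordinates if covers(s, name)), None)
        if deeper is not None:
            dsa = deeper
            continue
        # No more specific DSA exists, so this one's answer is final.
        return dsa.entries.get(name), visited


# A made-up five-DSA world.
root = DSA("root", ())
au, us, gb = DSA("au", ("AU",)), DSA("us", ("US",)), DSA("gb", ("GB",))
csiro = DSA("csiro", ("AU", "CSIRO"))
for child in (au, us, gb):
    attach(root, child)
attach(au, csiro)
csiro.entries[("AU", "CSIRO", "DIT", "ajw")] = "Andrew Waugh"

found, path_hit = resolve(us, ("AU", "CSIRO", "DIT", "ajw"))
missing, path_miss = resolve(us, ("AU", "CSIRO", "DIT", "nobody"))
```

In this toy setup the failed lookup follows exactly the same path as
the successful one, and the "gb" DSA is never asked. The number of DSAs
contacted is bounded by the depth of the tree, not by how many DSAs
exist, which is the whole point of structuring the data.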