Newsgroups: comp.archives.admin
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!wuarchive!uunet!munnari.oz.au!manuel!cmf851
From: cmf851@anu.oz.au (Albert Langer)
Subject: Re: building an interstate (data) highway with no roadmaps
Message-ID: <1991Jun24.202941.21411@newshost.anu.edu.au>
Sender: news@newshost.anu.edu.au
Organization: Computer Services Centre, Australian National University, Canberra, Australia.
References: <2013@uqcspe.cs.uq.oz.au> <89gs5jr@Unify.Com> <11900.Jun2322.59.2491@kramden.acf.nyu.edu>
Date: Mon, 24 Jun 91 20:29:41 GMT

In article <11900.Jun2322.59.2491@kramden.acf.nyu.edu>
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

>I think the Mathematics Subject Classification model would apply quite
>well to archived files (and netnews!).

Sounds like a useful model to start from - especially:

1. Use of more than one level.
2. Codes defined by a central authority.
3. Assignment of primary and any number of secondary codes.

I doubt that there will be much success with self-assignment by authors
of software packages since unlike mathematicians they are not used to
relying on literature searches for prior art anyway. However there is no
way to find out how viable that fourth feature of the maths system is
until we have the codes assigned by a central authority.

If it also turns out to be viable fine, otherwise I propose the
"cooperative cataloging" model used by libraries - i.e. the first major
archive site that stocks the package does the classifying and others
copy - that distributes the work among people who understand the
classification scheme, even though not as widely as by distributing it
to authors as well. (Once it has caught on, and people actually USE the
catalog classifications, one could THEN hope for some self-cataloging by
authors.)

By "major archive site" I really mean "cataloging site" - i.e.
one that is willing to do far more than the typical ftp site in actually
maintaining organized cataloging information. This need not actually be
a site that has disk space available on the internet, though considering
that disk space is now only $2 per MB I don't see why not.

Another set of possible catalogers are the moderators and indexers of
the *sources* groups. (There was some discussion re a classification
scheme in comp.sources.d recently).

>Of course, the MSC (which is available for anonymous ftp on
>e-math.ams.com as mathrev/asciiclass.new) wouldn't apply directly to
>software; we'd have to draft a whole new set of categories. But the
>model will work.

As well as new categories I think we would have to add quite a lot of
features to the model e.g.

1. Version numbers. For whole and component parts.
2. *sources* message-id/subject headings/archive names
3. file sizes for source and object code software, docs, test and
   other data, abstracts (README, HISTORY etc) and various
   combinations, with "standard" filenames.
4. refinement of 3 to include postscript/dvi and "source" forms of
   documentation, compressed and uncompressed versions with various
   packaging methods etc.
5. Patches and what they apply to and result in.
6. Languages used (perhaps merely one of many classifications, but
   could add file sizes and numbers for each).
7. Pre-requisite software. (Not a classification but a reference to
   other cataloged packages with specific version numbers).
8. Pre-requisite hardware.
9. Release status. (alpha, beta, gamma etc)
10. Copyright information. (Whether "freely available" etc)
11. Systems tested on.
12. Systems it is believed to work on.
13. Systems it is believed not to work on.

Only the most important information need be provided initially, but it
should be possible to add other stuff including even review comments or
pointers to discussion in newsgroups.
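
To make the list above a bit more concrete, here is a rough sketch in
Python of what one such record might look like, and of how a second
cataloging site might add to a record that a first site created. Every
name in it (CatalogRecord, merge, the field names, the "X10"-style
codes, the package "examplepkg") is made up for illustration; none of it
comes from the MSC or from MARC, and only a handful of the thirteen
features are covered.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CatalogRecord:
    """Hypothetical catalog record for one version of one package."""
    name: str
    version: str
    primary_code: str                              # exactly one primary code
    secondary_codes: set = field(default_factory=set)
    languages: set = field(default_factory=set)
    requires: set = field(default_factory=set)     # (package, version) pairs
    release_status: Optional[str] = None           # e.g. "alpha", "beta"
    copyright: Optional[str] = None                # e.g. "freely available"
    tested_on: set = field(default_factory=set)
    reviews: list = field(default_factory=list)


def merge(existing: Optional[CatalogRecord],
          update: CatalogRecord) -> CatalogRecord:
    """Add one site's information to another site's record.

    Assumed policy (not from the post): the first cataloger's primary
    code stands, and a differing primary code from a later cataloger is
    kept as a secondary code.
    """
    if existing is None:
        # No record yet: the first site to catalog the package creates it.
        return update
    if (existing.name, existing.version) != (update.name, update.version):
        raise ValueError("records describe different packages or versions")
    secondary = existing.secondary_codes | update.secondary_codes
    secondary = (secondary | {update.primary_code}) - {existing.primary_code}
    return CatalogRecord(
        name=existing.name,
        version=existing.version,
        primary_code=existing.primary_code,
        secondary_codes=secondary,
        languages=existing.languages | update.languages,
        requires=existing.requires | update.requires,
        release_status=existing.release_status or update.release_status,
        copyright=existing.copyright or update.copyright,
        tested_on=existing.tested_on | update.tested_on,
        reviews=existing.reviews + update.reviews,
    )


# Usage: one site creates the record, a second site adds to it.
first = CatalogRecord("examplepkg", "1.0", primary_code="X10",
                      languages={"C"})
later = CatalogRecord("examplepkg", "1.0", primary_code="X20",
                      secondary_codes={"X30"}, tested_on={"system-a"},
                      reviews=["Builds cleanly."])
combined = merge(first, later)
```

The merge only ever adds information, so sites could exchange records in
any order without losing what an earlier cataloger entered; a real
scheme would also need some way to correct mistakes, which this sketch
leaves out.
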
Room for such additions could be provided at the same time as setting up
a system for cooperative cataloging, since coop cataloging implies being
able to take an existing or nonexistent catalog record and add to it and
have that then available for others to use or add to. Adding "review
comments" would be particularly useful.

It still strikes me that libraries are the institutions that should be
doing this. One thing though, if they aren't prepared to take it on yet,
perhaps they could make available the software used at no charge? There
are some very powerful systems in use for cooperative cataloging, and
MARC records, which cover everything from audio tapes to maps, are just
as complex as anything we will need for software packages.

How about just submitting a couple of packages as "publisher" to the LC
and asking for the "Cataloging In Publication" data to be returned
overnight as is done for book manuscripts? Should produce some
discussion :-).

U.S. copyright law clearly defines computer programs as "literary works"
and I can't see anybody claiming that something like "c news" or X
windows is "merely ephemeral" so I guess they would HAVE to catalog it.

The Library of Congress IS on the internet (loc.gov) - but if they won't
accept submissions by email or ftp somebody could just start up a
"publisher" to issue a series of tapes and diskettes for physical
delivery to them, with each volume a separate monograph (not part of a
single serial) containing one software package.

I'm quite serious about this: proper cataloging DOES cost about $200 per
item and it IS THEIR JOB. We should just be helping with specialist
advice.

P.S. For anyone wanting to follow up - I just don't have time - a
contact at the LC is:

Sally H. McCallum, Chief
Network Development and MARC Standards Office
Library of Congress
smcc@seq1.loc.gov
(202) 707-6273

--
Opinions disclaimed (Authoritative answer from opinion server)
Header reply address wrong. Use cmf851@csc2.anu.edu.au