Xref: utzoo comp.sources.d:3170 news.groups:6746
Path: utzoo!utgpu!attcan!uunet!ispi!jbayer
From: jbayer@ispi.UUCP (Jonathan Bayer)
Newsgroups: comp.sources.d,news.groups
Subject: Re: comp.datasets Call For Discussion (was: Re: SAO Part01/49)
Message-ID: <394@ispi.UUCP>
Date: 5 Jan 89 15:43:53 GMT
References: <1249@fig.bbn.com> <1280@vsi1.UUCP> <151@cjsa.WA.COM> <15054@genrad.UUCP> <360@pte.UUCP>
Reply-To: jbayer@ispi.UUCP (Jonathan Bayer)
Organization: Intelligent Software Products, Inc.
Lines: 55

In article <360@pte.UUCP> car@pte.UUCP (Chris Rende) writes:
>In article <15054@genrad.UUCP>, jpn@genrad.com (John P. Nelson) writes:
>> In article <76@sopwith.UUCP> snoopy@sopwith.UUCP (Snoopy) writes:
== =It appears that the number and size of "datasets" posted to the net is
== =growing.  Time to consider a group such as comp.datasets to hold them.
==
== Point of order:  What EXACTLY is a "dataset"?  There was a discussion
== recently about starting a group for GIF format graphic pictures:  would
== this group cover these as well as star catalogs?  I mean, these are
== clearly data, and they are large (well, at least in collection).
=
=I think that a group for data postings is a good idea. I'd like to see more
=data be posted: Astronomical data, medical, graphics, geographical maps,
=census, etc...
=
=What is a dataset? I can just see a month's worth of philosophical flaming
=in the wings... :-) Especially if you're talking about LISP stuff... (((:-)))
=It would probably be best to describe the desired contents of the group
=rather than try to define data in a manner which agrees with all netters.
=
=To start the furnace, here are some thoughts that I have about a comp.data
=newsgroup:
=
=- Used to post machine readable information that is NOT program source code.
=  [I say "NOT program source code" because we have source code groups and
=  because LISP code looks like LISP data.]
=- Format can be ASCII or UUENCODED binary.
=- Said data should NOT be binary executables.
=  There are groups for this as well.
=- Although I'm not quite sure of the posting mechanism, I think that some
=  description of the structure of the data should be included with the data.
=  This could be source code segments (like .h files), a column by column
=  description, or just plain text. [comp.data.d?]
=- Said data should be machine independent. I.e., a UUENCODED binary of a
=  MSDOS directory is not generally useful.
=

This sounds like a good idea.  This way, small sites with little or no
interest in most of the datasets would not have to receive them.

However, I do have a suggestion.  It makes sense to archive these datasets
at several large Usenet sites, similar to the way that comp.sources.unix
(and others) is currently archived.  A notice should be published in a
newsgroup other than the (suggested) comp.sources.data to let everyone
know about the published datasets.  This would let sites that need or want
a dataset get it from a major site.

This suggestion does have a problem.  If several small sites sit several
nodes downstream of an archive site, and each one requests a major dataset,
then the mail load on the intermediate sites will go up tremendously, since
every requested copy is relayed through them.  However, this might be
preferable to always carrying the datasets.

--
Jonathan Bayer                          "The time has come," the Walrus said...
Intelligent Software Products, Inc.
19 Virginia Ave.                        ...uunet!ispi!jbayer
Rockville Centre, NY 11570  (516) 766-2867    jbayer@ispi