Path: utzoo!attcan!uunet!ssbell!kent
From: kent@ssbell.UUCP (Kent Landfield)
Newsgroups: comp.sources.d
Subject: Archiving Sources (was alt.sources archiving)
Message-ID: <448@ssbell.UUCP>
Date: 19 Mar 89 21:33:12 GMT
Reply-To: kent@ssbell.UUCP (Kent Landfield)
Organization: Sterling Software, FSG-IMD, Bellevue, NE.
Lines: 168

J. Eric Townsend in <405@flatline.UUCP> writes:
# >In article <590@alice.marlow.uucp> fox@marlow.UUCP (Paul Fox) writes:
# >>In article <2706@rtech.rtech.com> daveb@rtech.com (Dave Brower) writes:
# Please move this discussion to comp.sources.d. Alt.sources is for source
# postings only. Posting messages here plays havoc with automatic archivers.

It should not play havoc with the archivers; it just forces the archive
administrator at the site to review the archived members of alt.sources
and trash the non-source members.

Chip Rosenthal in <727@vector.UUCP> writes:
# [discussion moved from alt.sources -- what a novel concept]
# Automatic archiving of alt.sources? Ha! What a joke.
# I've long ago decided it was much easier to manually archive the "wheat"
# rather than automatically archive everything and throw out the "chaff".

I disagree! I do not have the time on a daily basis to manually save the
"wheat". I am envious of your position if you truly have time to waste. :-)
I find it much easier to automatically archive the entire newsgroup. Then,
when I can find the time, I remove members that are non-sources. 5 minutes
a week max...

Chip Rosenthal in <727@vector.UUCP> writes:
# The only way archiving is going to work for alt.sources is if posters
# start using secondary headers (a la comp.sources newsgroups) when real
# sources are posted there. Personally, I'd like to see something like:
#
#     Submitted-by: My Name
#     Posting-number: Volume 89
#     Archive-name: pgm_name
#
# This leaves off the "Issue" part of the "Posting-number" field. Roll the
# volume number annually. 12 chars or less on archive name, please, to
# allow for ".Z". Comments?

Well, this is not the *only* way it will work, but it is a method that fits
the r$alz approach. Chip's intent is to move more toward the standard method
of labeling archive members so that existing (and about to be released)
archivers can deal with alt.sources. Why stop at alt.sources? Why not have
a source posting program so that all sources posted to any non-moderated
newsgroup on the net have the same format of auxiliary headers? In this
manner a process could be set on the news directory daily that would
archive *all* sources sent to *any* group. (Rich, are you listening? Is it
time for a posting of post.c?)

Jef Poskanzer in <10985@well.UUCP> writes:
# Why do you need Submitted-by: when you already have From:? Why do you
# need a number, especially if it's only going to change once per year?

Chip Rosenthal in <763@vector.UUCP> writes:
# To minimize the chance of breaking existing archiving programs.

This is altogether too true. Everyone has their own method of archiving
sources. There has been no real tool for the average site to use. Each site
had to come up with their own tools to get the job done. To some I have run
into, archiving is almost as touchy an issue as RHF. (:-)

Jef Poskanzer in <10985@well.UUCP> writes:
# Why do you need a separate Archive-name:, when some simple conventions
# for what goes in the Subject: line would work even better?

For referencing sources, it is much easier to specify an Archive-name than
to rely on the standardization of new "simple conventions". I can't see how
the Subject: line would work "even better". The Subject line should be used
for informing us as to what the contents of the archive member are in
English, not some cryptic "convention".

Jef Poskanzer in <10985@well.UUCP> writes:
# Why do you want to store the postings in a filename specified by the poster,
# with all the security issues that brings up?

I really fail to see a security problem as long as archivers do not use
absolute paths.
Sources archiving should be done from a separate uid/gid, as it stands
now, NOT as root. Posting software should not allow absolute paths, and
the archivers would enforce this by only placing the files within an
archive directory.

Jef Poskanzer in <10985@well.UUCP> writes:
# What's wrong with just saving the posting in a numbered file, grepping
# out the From: and Subject: lines to save in an index, and compressing
# the file?

Chip Rosenthal in <763@vector.UUCP> writes:
# Boy...that's a giant step backwards. What's wrong with doing all of
# that, but using a meaningful name instead of a random number?

Nothing is wrong with Jef's approach. It is the Message-ID method of
archiving. It is the method used by many software archivers for groups that
do not have auxiliary headers. Optimally, all source posted to any newsgroup
would have headers specifying the appropriate information so that archivers
could use a "meaningful name". Currently though, that is not the case.

Presently, there are three methods of archiving:

o Archive-name - The moderators of *most* sources groups assign an official
  Archive-name to each article that gets submitted to the net. The
  Archive-name line in each file has a "new-login" or "elm/part06" type of
  format. For multi-part postings, a subdirectory is created (as indicated
  in the elm example) to hold the separate "parts". This format is used by
  many large archive sites because it is easier for retrieval via mail
  request software such as netlib. The filenames also give hints as to what
  the software is.

o Volume-Issue - Software sent via *most* moderated sources groups has an
  assigned Volume and Issue number. This allows the moderators to track and
  reference the individual items that have been posted to the group. Each
  individual article is given an "Issue" number. The Issues are grouped
  together into a "Volume". There are roughly 100 articles in each Volume,
  but this is an arbitrary split totally up to each moderator.
  This format is extremely useful when the software archives are cataloged.
  It makes searching of the files quicker and verification of complete
  volumes easier. This archive format is recommended for any site that will
  be doing massive searches of the individual volumes, since it keeps the
  quadratic nature of directory searches from making your life miserable.

o Message-ID - The news software stores the articles locally by naming the
  news article by a number generated on every site. The Message-ID number
  ordering is unique to each site. If a Message-ID archive method is used
  (or required by the newsgroup), the news article file is copied to the
  archive directory. The name of the archived article will match the
  Message-ID number of the article contained within. There is nothing wrong
  with this method as long as an index is generated to assist in archive
  member identification. This method is in use for alt.sources and
  comp.sources.mac at ssbell, since neither group has auxiliary headers.

Chip Rosenthal in <763@vector.UUCP> writes:
# I'm sorry, I missed the counterproposal in your message. Were you trying
# to say that one shouldn't try to archive alt.sources? Or were you just
# trying to trash my suggestion? Nor have you explained why this is such
# a crummy idea for alt.sources even though it seems to work for the other
# sources newsgroups.

No. Jef was not trying to say it was a crummy idea. Discussion between
people on an idea _usually_ produces a better solution. I did not see the
replies as flames, but as a "why bother in a group where anything goes, when
archiving can be accomplished already".

I disagree with Jef here. I would like to see a move towards auxiliary
headers on all sources posted to any newsgroup, whether they are moderated
or not. I do not have the time to read every single article in every single
newsgroup to see whether it contains sources.
I HATE it when I need something only to find out that it went through a
newsgroup I never read, two weeks before I was forced to write it myself
out of necessity.

Alt.sources was established as a place where people could get sources posted
without going through a moderator. The group was established for source
postings, *not* discussions.

Cool heads can accomplish something for the net, not just alt.sources.
Source/patches distribution needs to be improved in all NON-moderated
sources groups. How about modifications to Pnews/postnews so that:

    Does this posting contain compilable sources ? [n]: y
    Please enter requested Archive-name: pgm_name

produces an auxiliary header of the type,

    Submitted-by: My Name
    Archive-name: pgm_name
    Posting-number: Volume 89

that can then be read by automated archivers.

This could save us all a lot of time. I know my family would appreciate
seeing me more... 1/2 :-)

			-Kent+
--
Kent Landfield                   UUCP:     kent@ssbell
Sterling Software FSG/IMD        INTERNET: kent%ssbell.uucp@uunet.uu.net
1404 Ft. Crook Rd. South         Phone:    (402) 291-8300
Bellevue, NE. 68005-2969         FAX:      (402) 291-4362
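
As a rough illustration of the auxiliary-header scheme discussed in this article, the sketch below shows how an automated archiver might pick a file name: use the Archive-name header when it is present and stays inside the archive directory, and otherwise fall back to the site's article number (the "Message-ID method" described above). This is only a sketch under stated assumptions, not the code of any existing archiver. The function names, the 50-line scan limit, and the /archive root are all hypothetical.

```python
import posixpath
import re

# The three auxiliary headers proposed in the discussion.
AUX_RE = re.compile(r"^(Submitted-by|Posting-number|Archive-name):\s*(.*?)\s*$")


def parse_aux_headers(article_text, max_lines=50):
    """Collect auxiliary headers found in the first max_lines lines.

    The 50-line limit is an assumption; the first occurrence of each
    header wins.
    """
    headers = {}
    for line in article_text.splitlines()[:max_lines]:
        match = AUX_RE.match(line)
        if match:
            headers.setdefault(match.group(1), match.group(2))
    return headers


def archive_path(archive_root, archive_name):
    """Return a path under archive_root, or None if the name is unsafe.

    Absolute paths and any "", "." or ".." component are rejected, so
    a poster-supplied name cannot escape the archive directory.
    """
    if not archive_name or archive_name.startswith("/"):
        return None
    parts = archive_name.split("/")
    if any(part in ("", ".", "..") for part in parts):
        return None
    return posixpath.join(archive_root, *parts)


def choose_archive_path(archive_root, article_text, article_number):
    """Prefer Archive-name; fall back to the local article number."""
    headers = parse_aux_headers(article_text)
    path = archive_path(archive_root, headers.get("Archive-name", ""))
    if path is None:
        path = posixpath.join(archive_root, str(article_number))
    return path
```

A multi-part name such as "elm/part06" maps to a subdirectory under the archive root, while a name such as "/etc/passwd" or "../x" is refused and the article is filed by number instead. Running the archiver from its own uid/gid, as suggested above, remains a separate safeguard that this sketch does not cover.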