Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!tut.cis.ohio-state.edu!brutus.cs.uiuc.edu!usc!henry.jpl.nasa.gov!elroy.jpl.nasa.gov!peregrine!ccicpg!conexch!sandy From: sandy@conexch.UUCP (Sandford Zelkovitz) Newsgroups: alt.sources Subject: uncmp ported to Xenix Keywords: uncmp103 ported_2_xenix Message-ID: <37270@conexch.UUCP> Date: 20 Sep 89 01:54:02 GMT Organization: The Consultants' Exchange, Orange County, CA Lines: 1906 The following file is part 1 of 2. The program uncmp103 is an MSDOS high speed un-arcer written by Derron Simon. ( I did the Xenix port). --------------------- cut here for 1 of 2 of the Xenix Port ------------------ #!/bin/sh # to extract, remove the header and type "sh filename" if `test ! -s ./dlzw1213.h` then echo "writing ./dlzw1213.h" cat > ./dlzw1213.h << '\Rogue\Monster\' /******************************************************************* * UNCMP - DLZW1213, Version 1.03, created 6-28-89 * * Dynamic Lempel-Ziv-Welch 12/13 bit uncompression module. * * Uncompresses files stored with Crunching (LZW 9-12 bits with RLE * coding) and Squashing (LZW 9-13). The basic compression algorithm * is FC-SWAP (See Storer, James A., _Data Compression Method and * Theory_, 1989, Computer Science Press). * * The great majority of this code came from SQUASH.C by Leslie * Satensten, which was based on the Unix COMPRESS program which is * in the Public Domain. * * This code has been released into the Public Domain. *******************************************************************/ int compress(FILE *,FILE *); int decompress(FILE *, FILE *); /* the next two codes should not be changed lightly, as they must not */ /* lie within the contiguous general code space. */ #define FIRST 257 /* first free entry */ #define CLEAR 256 /* table clear output code */ /* The tab_suffix table needs 2**BITS characters. We */ /* get this from the beginning of htab. The output stack uses the rest */ /* of htab, and contains characters. 
There is plenty of room for any */ /* possible stack (stack used to be 8000 characters). */ #define MAXCODE(n_bits) (( 1<<(n_bits)) - 1) #define tab_prefixof(i) codetab[i] #define tab_suffixof(i) ((unsigned char *)(htab))[i] #define de_stack ((unsigned char *)&tab_suffixof(1< ./errors.c << '\Rogue\Monster\' /******************************************************************* * UNCMP - ERRORS, Version 1.03, created 6-28-89 * * Common error messages are here to save space. * * This code has been released into the Public Domain. *******************************************************************/ #include #include #include "archead.h" #include "global.h" #include "uncmp.h" void read_error(void) { printf("Error reading file\n"); exit(1); } void write_error(void) { printf("Error writing file\n"); exit(1); } void mem_error(void) { printf("Error allocating memory\n"); exit(1); } \Rogue\Monster\ else echo "will not over write ./errors.c" fi if `test ! -s ./fileio.c` then echo "writing ./fileio.c" cat > ./fileio.c << '\Rogue\Monster\' /******************************************************************* * UNCMP - FILEIO, Version 1.03, created 6-28-89 * * Input/Output routines for UNCMP, including RLE. * * Actually does all the I/O involving archives and all RLE * decoding. * * The great majority of this code came from SQUASH.C by Leslie * Satensten, which is in the Public Domain. * * This code has been released into the Public Domain. 
 *******************************************************************/
#include
#include
#include "archead.h"
#include "uncmp.h"
#include "global.h"

/* stuff for non-repeat packing */

#define DLE 0x90                /* repeat sequence marker */
#define UINT_MAX 0xffffffff

/* non-repeat packing states */

#define NOHIST 0                /* don't consider previous input */
#define INREP 1                 /* sending a repeated value */

void putc_pak(char c, FILE *t)  /* output an unpacked byte */
{
    add1crc(c);                 /* update the CRC check value */
    fputc(c, t);
}

/* putc_rle outputs bytes to a file unchanged, except that runs of  */
/* more than two characters are compressed to the format:           */
/*       <char> DLE <count>                                         */
/* When DLE is encountered, the next byte is read as <count> and    */
/* putc_rle writes the previous byte <count>-1 more times, giving   */
/* <count> copies in all.  A <count> of 0 indicates a true DLE      */
/* character.                                                       */

void putc_rle(unsigned char c, FILE *out)       /* put NCR coded bytes */
{
    switch (state) {            /* action depends on our state */
    case NOHIST:                /* no previous history */
        if (c == DLE)           /* if starting a series */
            state = INREP;      /* then remember it next time */
        else
            putc_pak(lastc=c, out);     /* else nothing unusual */
        return;

    case INREP:                 /* in a repeat */
        if (c)                  /* if count is nonzero */
            while (--c)         /* then repeatedly ... */
                putc_pak(lastc, out);   /* ... output the byte */
        else
            putc_pak(DLE, out); /* else output DLE as data */
        state = NOHIST;         /* back to no history */
        return;

    default:
        printf("Fatal error: Bad RLE packing state (%d)", state);
        return;
    }
}

/* getc_pak is implemented this way so an int is used more often */
/* than a long.  This should be faster.
*/ int getc_pak(FILE *in) /* get a byte from an archive */ { static unsigned getc_wrk = 0; if (!getc_wrk) { /* less overhead to manage ints */ /* then indicate end of file */ if (!sizeleft) return EOF; /* if no data left */ else { getc_wrk = UINT_MAX; /* the maximum int (0xffff on PC) */ if (getc_wrk > sizeleft) getc_wrk = sizeleft; sizeleft -= getc_wrk; } } getc_wrk--; /* deduct from input counter */ return (fgetc(in)); /* and return next byte */ } \Rogue\Monster\ else echo "will not over write ./fileio.c" fi if `test ! -s ./filelist.c` then echo "writing ./filelist.c" cat > ./filelist.c << '\Rogue\Monster\' /******************************************************************* * UNCMP - FILELIST, Version 1.03, created 6-28-89 * * Create a linked list of file and check for matches * * Creates a singly-linked list of filenames after qualifying them * and then takes filenames and searches for matches in the list. * * This code has been released into the Public Domain. *******************************************************************/ #include #include #ifdef __TURBOC__ # include #else /* MSC */ # include #endif #include #include "uncmp.h" #include "archead.h" #include "global.h" #define TRUE 1 #define FALSE 0 struct list { char filespec[13]; struct list *next; }; static struct list *header = NULL; /* NULL for first initialization */ static struct list *temp; /* makes any filename into a valid filename for use within an archive */ char *setup_name(char *filename) { unsigned char namelen; char *strptr; namelen = strlen(filename); /* convert all back slashes to forward slashes */ while((strptr = strchr(filename,'\\')) != NULL) *strptr = '/'; /* remove trailing period */ if (filename[namelen-1] == '.') filename[--namelen] = '\0'; /* remove everything before first back slash */ if ((strptr = strrchr(filename,'/')) != NULL) filename = strptr+1; /* remove everything before colon */ if ((strptr = strrchr(filename,':')) != NULL) filename = strptr+1; return(strupr(filename)); 
}

/* adds filename to a singly-linked list */

void setup_list(char *filename)
{
    /* first time initialization */
    if (header == NULL) {
        if ((header = (struct list *)calloc(1,sizeof(struct list))) == NULL) {
            mem_error();
        }
        temp = header;
    }

    /* initialize new node in list */
    if ((temp->next = (struct list *)calloc(1,sizeof(struct list))) == NULL) {
        mem_error();
    }

    /* add filename to list */
    temp = temp->next;
    strcpy(temp->filespec,setup_name(filename));
    temp->next = NULL;
}

/* check to see if filename is in list of files and return TRUE if it is */
/* or FALSE if it isn't */

int check_list(char *filename)
{
    /* convert filename to one used within archives */
    filename = setup_name(filename);

    /* if never initialized then all files should be uncompressed */
    if (header == NULL) return(1);

    /* search for filename to extract */
    temp = header;
    while (temp != NULL) {
        if (compare_files(filename,temp->filespec)) return(1);
        temp = temp->next;
    }
    return(0);
}

int compare_files(char *filename, char *filespec)
{
    while (((*filename != NULL) && (*filename != '.')) ||
           ((*filespec != NULL) && (*filespec != '.'))) {
        if (*filename != *filespec && *filespec != '?') { /* if doesn't match */
            if (*filespec != '*') return (0);             /* and not '*' */
            else {      /* move to extension, filename matches */
                while ((*filename != NULL) && (*filename != '.')) filename++;
                while ((*filespec != NULL) && (*filespec != '.')) filespec++;
                break;
            }
        }
        else {          /* they match */
            filename++;
            filespec++;
        }
    }
    if ((*filename != NULL) && (*filename == '.')) filename++; /* inc past .
*/ if ((*filespec != NULL) && (*filespec == '.')) filespec++; while ((*filename != NULL) || (*filespec != NULL)) { if ((*filename != *filespec) && (*filespec != '?')) { if (*filespec != '*') return (0); else return (1); } else { filename++; filespec++; } } return (1); } int setup_path(char *filename) { int namelen; namelen = strlen(filename); if (filename[namelen-1] == '/') { strcpy(path,filename); return(1); } else { path[0] = NULL; return(0); } } char *strupr(xx) char *xx; /* For Xenix, we REALLY want to convert to lower! */ { char *ptr, *pptr; ptr = malloc(strlen(xx) + 1); if ( ptr == NULL ) return (NULL); strcpy(ptr, xx); pptr = ptr; while ( *pptr != NULL ) { *pptr = tolower(*pptr); pptr++; } return(ptr); } \Rogue\Monster\ else echo "will not over write ./filelist.c" fi if `test ! -s ./gethead.c` then echo "writing ./gethead.c" cat > ./gethead.c << '\Rogue\Monster\' /******************************************************************* * UNCMP - GETHEAD, Version 1.03, created 6-28-89 * * Archive header reader * * Reads the header of archives (including incompatible type 1). * * This code has been released into the Public Domain. 
*******************************************************************/ #include #include #include "archead.h" #include "global.h" #include "uncmp.h" int getarcheader(in) FILE *in; { /* read in archead minus the length field */ if ((fread((char *)&archead, sizeof(char), sizeof(struct archive_header) - sizeof(long), in)) < 2) { printf("Archive has invalid header\n"); exit(1); } /* if archead.arcmark does not have that distinctive arc identifier 0x1a */ /* then it is not an archive */ if (archead.arcmark != 0x1a) { printf("Archive has invalid header\n"); exit(1); } /* if atype is 0 then EOF */ if (archead.atype == 0) return(0); /* if not obsolete header type then the next long is the length field */ if (archead.atype != 1) { if ((fread((char *)&archead.length, sizeof(long), 1, in)) < 1) { printf("Archive has invalid header\n"); exit(1); } } /* if obsolete then set length field equal to size field */ else archead.length = archead.size; return(1); } \Rogue\Monster\ else echo "will not over write ./gethead.c" fi if `test ! -s ./global.c` then echo "writing ./global.c" cat > ./global.c << '\Rogue\Monster\' /******************************************************************* * UNCMP - GLOBAL, Version 1.03, created 6-28-89 * * Contains all global variables used by the modules. * * Every variable accessed by more than one module is here. * * This code has been released into the Public Domain. 
*******************************************************************/ #include "archead.h" unsigned char state; /* state of ncr packing */ unsigned int crc; /* crc of current file */ long sizeleft; /* for fileio routines */ int lastc; /* last character ouput by putc_rle() */ int errors=0; /* number of errors */ char path[63]; /* path name to output to */ char headertype; /* headertype of archive */ struct archive_header archead; /* header for current archive */ /* command line switches (flags) */ char warning=1; char overwrite=0; char testinteg=0; char listarchive=0; \Rogue\Monster\ else echo "will not over write ./global.c" fi if `test ! -s ./global.h` then echo "writing ./global.h" cat > ./global.h << '\Rogue\Monster\' /******************************************************************* * UNCMP - GLOBAL.H, Version 1.03, created 6-28-89 * * Externs for all global variables used by various modules * * This code has been released into the Public Domain. *******************************************************************/ extern unsigned char state; extern unsigned int crc; extern long sizeleft; extern int lastc; extern int errors; extern char path[63]; extern int crctab[]; extern struct archive_header archead; extern char warning; extern char overwrite; extern char testinteg; extern char listarchive; extern char headertype; char *strupr(); \Rogue\Monster\ else echo "will not over write ./global.h" fi if `test ! -s ./huffman.c` then echo "writing ./huffman.c" cat > ./huffman.c << '\Rogue\Monster\' /******************************************************************* * UNCMP - HUFFMAN, Version 1.03, created 6-28-89 * * Classic Huffman with RLE uncompression module. * * The great majority of this code came from SQUPRT33 by Theo Pozzy * which borrowed code from SQ and USQ, and USQ by Richard Greenlaw. * Both of the above mentioned programs are in the Public Domain. * * This code has been released into the Public Domain. 
*******************************************************************/ #include #include #include "archead.h" #include "global.h" #include "uncmp.h" #define SQEOF 256 /* Squeeze EOF */ #define NUMVALS 257 /* 256 data values plus SQEOF */ struct sq_tree /* decoding tree */ { int children[2]; /* left, right */ } dnode[NUMVALS]; /* use large buffer */ void sq_decomp(FILE *in, FILE *out) /* initialize Huffman unsqueezing */ { register int i; /* generic loop index */ register int bitpos; /* last bit position read */ int curbyte; /* last byte value read */ int numnodes; /* get number of nodes in tree, this uses two character input calls */ /* instead of one integer input call for speed */ numnodes = getc_pak(in) | (getc_pak(in)<<8); if ((numnodes < 0) || (numnodes >= NUMVALS)) { printf("File has invlaid decode tree\n"); exit(1); } /* initialize for possible empty tree (SQEOF only) */ dnode[0].children[0] = -(SQEOF + 1); dnode[0].children[1] = -(SQEOF + 1); for(i=0; i= 0; ) { /* traverse tree */ if(++bitpos > 7) { if((curbyte=getc_pak(in)) == EOF) return; bitpos = 0; /* move a level deeper in tree */ i = dnode[i].children[1 & curbyte]; } else i = dnode[i].children[1 & (curbyte >>= 1)]; } /* decode fake node index to original data value */ i = -(i + 1); /* decode special endfile token to normal EOF */ i = (i == SQEOF) ? EOF : i; if (i != EOF) putc_rle(i,out); } }\Rogue\Monster\ else echo "will not over write ./huffman.c" fi if `test ! -s ./link.lst` then echo "writing ./link.lst" cat > ./link.lst << '\Rogue\Monster\' UNCMP+ STUBS+ DISPATCH+ CRC+ FILEIO+ GLOBAL+ DLZW1213+ TESTARC+ GETHEAD+ LISTARC+ FILELIST+ HUFFMAN+ STORE+ ERRORS+ PACK+ SLZW12 UNCMP.EXE UNCMP /NOE /NOI /E /M;\Rogue\Monster\ else echo "will not over write ./link.lst" fi if `test ! 
-s ./listarc.c`
then
echo "writing ./listarc.c"
cat > ./listarc.c << '\Rogue\Monster\'
/*******************************************************************
 * UNCMP - LISTARC, Version 1.03, created 6-28-89
 *
 * Archive directory lister.
 *
 * Lists contents of archives in standard verbose style.
 *
 * The great majority of this code came from AV v2.01 by Derron
 * Simon.  The source code of AV has not been released; however,
 * this code is in the Public Domain.
 *
 * This code has been released into the Public Domain.
 *******************************************************************/
#include
#include
#include "archead.h"
#include "global.h"
#include "uncmp.h"

void list_arc(FILE *in)
{
    unsigned long int total_size = 0L;
    unsigned long total_length = 0L;
    unsigned int total_files = 0;
    char method[9];
    char crushing = 0;
    int hour, min, sec;
    int month, day, year;

    printf(" Filename Length Method SF Size Date Time CRC\n"
           " -------- ------ ------ -- ---- ---- ---- ---\n");

    do {
        switch ((int) archead.atype) {
        case 1:
            strcpy(method, " stored ");
            break;              /* without this, type 1 fell through to type 2 */
        case 2:
            strcpy(method, " Stored ");
            break;
        case 3:
            strcpy(method, " Packed ");
            break;
        case 4:
            strcpy(method, "Squeezed");
            break;
        case 5:
        case 6:
        case 7:
            strcpy(method, "crunched");
            break;
        case 8:
            strcpy(method, "Crunched");
            break;
        case 9:
            strcpy(method, "Squashed");
            break;
        case 10:
            strcpy(method, "Crushed*");
            crushing++;
            break;
        default:
            strcpy(method, "Unknown!");
        }
        year = (archead.date >> 9) & 0x7f;      /* dissect the date */
        month = (archead.date >> 5) & 0x0f;
        day = archead.date & 0x1f;
        hour = (archead.time >> 11) & 0x1f;     /* dissect the time */
        min = (archead.time >> 5) & 0x3f;
        sec = (archead.time & 0x1f) * 2;

        if (check_list(archead.name)) {
            printf(" %-12s %7lu %8s %02d%% %7lu %2d-%02d-%2d %02d:%02d:%02d %04X\n",
                   archead.name, archead.length, method,
                   calcsf(archead.length, archead.size), archead.size,
                   month, day, year + 80, hour, min, sec, archead.crc);
            total_length += archead.length;
            total_size += archead.size;
            total_files++;
        }
        fseek(in,archead.size,1);
    } while (getarcheader(in) != 0);

    printf(" -------- ------ -- ----\n");
    printf(" Total: %-3d %7lu %02d%% %7lu\n", total_files, total_length,
           calcsf(total_length, total_size), total_size);
    if (crushing > 0)
        printf("\n* Crushing not supported in this version of UNCMP\n");
    fclose(in);
}

int calcsf(long length_now, long org_size)
{
    if (length_now == 0) return 0;      /* avoid divide-by-zero error */

    /* divide in fixed point, to avoid FP */
    return ((int) (100 - ((org_size * 100) / (length_now))));
}
\Rogue\Monster\
else
  echo "will not over write ./listarc.c"
fi
if `test ! -s ./pack.c`
then
echo "writing ./pack.c"
cat > ./pack.c << '\Rogue\Monster\'
/*******************************************************************
 * UNCMP - PACK, Version 1.03, created 6-28-89
 *
 * Uncompresses files saved using the pack method (3).  Pack uses
 * RLE compression.
 *
 * This code has been released into the Public Domain.
 *******************************************************************/
#include
#include
#include "archead.h"
#include "global.h"
#include "uncmp.h"

#define MAXOUTSIZE 16384

void rle_decomp(FILE *in, FILE *out)
{
    register int c;
    char *buffer=NULL;

    if ((buffer = (char *)malloc(sizeof(char)*MAXOUTSIZE)) == NULL) {
        /* uncompress char by char if no room for buffer */
        while ((c=getc_pak(in)) != EOF) putc_rle(c,out);
        return;         /* done; do not fall into the buffered path with a NULL buffer */
    }
    while (sizeleft >= MAXOUTSIZE) {
        if (fread(buffer,sizeof(char),MAXOUTSIZE,in) != MAXOUTSIZE)
            read_error();
        for(c=0;c != MAXOUTSIZE;c++) putc_rle(buffer[c],out);
        sizeleft -= MAXOUTSIZE;
    }
    if (fread(buffer,sizeof(char),sizeleft,in) != sizeleft) read_error();
    for(c=0;c != sizeleft;c++) putc_rle(buffer[c],out);
    sizeleft = 0;
    free(buffer);
}
\Rogue\Monster\
else
  echo "will not over write ./pack.c"
fi
if `test !
-s ./programr.man`
then
echo "writing ./programr.man"
cat > ./programr.man << '\Rogue\Monster\'
UNCMP v1.00 PROGRAMMER MANUAL
-----------------------------

About this file
---------------

This document is intended to be helpful for programmers who are
interested in the workings of UNCMP.  People who are not interested in
hacking UNCMP should skip this file and read the USER.MAN file included
in this archive.

Development system used
-----------------------

The author wrote UNCMP on a GRiDCASE 1530 80386 based portable running
MS-DOS 3.30 with 1 megabyte of memory.  MSC v5.1 was used for
development, testing, and the final version; however, Turbo C v2.00 was
tested with UNCMP occasionally and with the final version.  I used the
EC text editor from C Source.

Development history
-------------------

UNCMP was originally created to extract all crunched (arctype 8) and
squashed (arctype 9) files from an archive.  Later it gained the ability
to extract every archive type except crushing (arctype 10) and the
ability to extract only named files.

General background on arc files
-------------------------------

The arc format of archives has become the industry standard since System
Enhancement Associates created the original ARC archive utility.  Since
then ARC has added and removed many different compression systems in an
attempt to create the smallest archives.  Only types 2, 3 and 8 are used
by the current version of ARC and only types 2, 3, 8 and 9 are used by
PKARC.  The other types are obsolete, but still supported by UNCMP.

 #   Description
---  ---------------------------------------------------
 1   no compression, but archive header is shorter than
     current version.
 2   no compression, with new header.
 3   RLE compression.
 4   Squeezing, or classic Huffman.
 5   Lempel-Ziv static 12 bit without RLE
 6   Lempel-Ziv static 12 bit with RLE
 7   Lempel-Ziv static 12 bit with RLE and new hash method.
 8   Dynamic Lempel-Ziv-Welch 9-12 bit with RLE.
 9   Dynamic Lempel-Ziv-Welch 9-13 bit without RLE.
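
As a rough illustration of the table above, the type numbers could be
mapped to method descriptions as in the sketch below.  The function names
arc_type_name() and arc_type_uses_rle() are made up for this example and
do not appear in the UNCMP sources.  The RLE flags follow the table, plus
the assumption (taken from HUFFMAN.C, which sends its output through
putc_rle()) that squeezed files are RLE coded as well.

```c
/* Hypothetical helper, not part of UNCMP: describe an arc type number
   using the wording of the type table. */
const char *arc_type_name(int atype)
{
    switch (atype) {
    case 1:  return "stored, old short header";
    case 2:  return "stored, new header";
    case 3:  return "packed (RLE)";
    case 4:  return "squeezed (classic Huffman)";
    case 5:  return "static LZ 12 bit without RLE";
    case 6:  return "static LZ 12 bit with RLE";
    case 7:  return "static LZ 12 bit with RLE, new hash";
    case 8:  return "dynamic LZW 9-12 bit with RLE";
    case 9:  return "dynamic LZW 9-13 bit without RLE";
    default: return "unknown";
    }
}

/* Hypothetical helper: returns 1 if the decoded output of this type
   still has to pass through the RLE decoder, 0 otherwise.  Type 4 is
   included because HUFFMAN.C calls putc_rle(); that is an inference
   from the sources, not a statement in the table. */
int arc_type_uses_rle(int atype)
{
    switch (atype) {
    case 3:
    case 4:
    case 6:
    case 7:
    case 8:
        return 1;
    default:
        return 0;
    }
}
```

A lister or dispatcher could call arc_type_name(archead.atype) for its
messages and arc_type_uses_rle(archead.atype) to choose between plain
output and RLE decoding; whether that is worth an extra function call per
file is a matter of taste.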

Archive header
--------------

The format for the archive header can be found in ARCHEAD.H.  The
general format of an archive is a header followed by that file's data,
repeated for each file, and then an end-of-archive marker.  The marker
is a special two byte pair 0x1a and 0x00.

Wish list of enhancements
-------------------------

What follows is a wish list of enhancements that I'd be more than happy
to see implemented if you've got the time.

   o Support for garbled files (encrypted)
   o Output to console or printer
   o Rewrite getcode() in module DLZW1213.C in assembly
   o Faster RLE decoding (with large buffer handling)
   o Better buffer handling (relying on setvbuf() is slow)
   o Support for crushing (arctype 10)

Since UNCMP is relatively slow, any speed improvements are welcome.

Problems in portability
-----------------------

I'm not sure where problems will be in porting; however, I believe that
the only real system dependent stuff is in UNCMP.C and deals with
overwrite checking and filestamp setting.

Commenting
----------

I believe that most of the code is well-commented and should be easy to
read.

Testing
-------

If you have a version up and running for your compiler, please send me
the sources and information at one of the BBS's mentioned in the
USER.MAN file.
\Rogue\Monster\
else
  echo "will not over write ./programr.man"
fi
if `test ! -s ./readme.103`
then
echo "writing ./readme.103"
cat > ./readme.103 << '\Rogue\Monster\'
UNCMP v1.03 is the first release of UNCMP to the general public.  Enjoy,
and please read the manual; it's not very long.

Derron Simon
\Rogue\Monster\
else
  echo "will not over write ./readme.103"
fi
if `test ! -s ./slzw12.c`
then
echo "writing ./slzw12.c"
cat > ./slzw12.c << '\Rogue\Monster\'
/*******************************************************************
 * UNCMP - SLZW12, Version 1.03, created 6-28-89
 *
 * Static Lempel-Ziv-Welch 12 bit uncompression module.
* * Uncompresses files created using the obsolete compression methods * 5 (12 bit compression without RLE), 6 (12 bit compression with * RLE), and 7 (12 bit compression with RLE and new hash method). * The basic method is FC-FREEZE (See Storer, James A., _Data * Compression Method and Theory_, 1989, Computer Science Press). * * The great majority of this code came from LZWUNC.C by Kent * Williams which is in the Public Domain. * * This code has been released into the Public Domain. *******************************************************************/ #include #include #include #ifdef __TURBOC__ # include #else /* MSC */ # include # include #endif #include "archead.h" #include "global.h" #include "uncmp.h" #define FALSE (0) #define TRUE !FALSE #define TABSIZE 4096 #define STACKSIZE TABSIZE #define NO_PRED 0xFFFF #define EMPTY 0xF000 #define NOT_FND 0xFFFF #define UEOF ((unsigned)EOF) #define UPDATE TRUE void init_tab(void); int getcode12(FILE *); void upd_tab(unsigned int, unsigned int); unsigned int hash(unsigned int, unsigned char, int); static char stack[STACKSIZE]; /* stack for pushing and popping */ /* characters */ static int sp = 0; /* current stack pointer */ static unsigned int inbuf = EMPTY; static struct entry { char used; unsigned int next; /* hi bit is 'used' flag */ unsigned int predecessor; /* 12 bit code */ unsigned char follower; } string_tab[TABSIZE]; int slzw_decomp(FILE * in, FILE * out, int arctype) { register unsigned int c, tempc; unsigned int code, oldcode, incode, finchar, lastchar; char unknown = FALSE; int code_count = TABSIZE - 256; struct entry *ep; headertype = arctype; init_tab(); /* set up atomic code definitions */ code = oldcode = getcode12(in); c = string_tab[code].follower; /* first code always known */ if (headertype == 5) putc_pak(c, out); else putc_rle(c, out); finchar = c; while (UEOF != (code = incode = getcode12(in))) { ep = &string_tab[code]; /* initialize pointer */ if (!ep->used) { /* if code isn't known */ lastchar = 
finchar; code = oldcode; unknown = TRUE; ep = &string_tab[code]; /* re-initialize pointer */ } while (NO_PRED != ep->predecessor) { /* decode string backwards into stack */ stack[sp++] = (char)ep->follower; if (sp >= STACKSIZE) { printf("\nStack overflow, aborting\n"); exit(1); } code = ep->predecessor; ep = &string_tab[code]; } finchar = ep->follower; /* above loop terminates, one way or another, with */ /* string_tab[code].follower = first char in string */ if (headertype == 5) putc_pak(finchar, out); else putc_rle(finchar, out); /* pop anything stacked during code parsing */ while (EMPTY != (tempc = (sp > 0) ? (int)stack[--sp] : EMPTY)) { if (headertype == 5) putc_pak(tempc, out); else putc_rle(tempc, out); } if (unknown) { /* if code isn't known the follower char of last */ if (headertype == 5) putc_pak(finchar = lastchar, out); else putc_rle(finchar = lastchar, out); unknown = FALSE; } if (code_count) { upd_tab(oldcode, finchar); --code_count; } oldcode = incode; } return (0); /* close all files and quit */ } unsigned hash(unsigned int pred, unsigned char foll, int update) { register unsigned int local, tempnext; static long temp; register struct entry *ep; if (headertype == 7) /* I'm not sure if this works, since I've never seen an archive with */ /* header type 7. If you encounter one, please try it and tell me */ local = ((pred + foll) * 15073) & 0xFFF; else { /* this uses the 'mid-square' algorithm. I.E. for a hash val of n bits */ /* hash = middle binary digits of (key * key). Upon collision, hash */ /* searches down linked list of keys that hashed to that key already. */ /* It will NOT notice if the table is full. This must be handled */ /* elsewhere. */ temp = (pred + foll) | 0x0800; temp *= temp; local = (temp >> 6) & 0x0FFF; /* middle 12 bits of result */ } if (!string_tab[local].used) return local; else { /* if collision has occured */ /* a function called eolist used to be here. 
tempnext is used */ /* because a temporary variable was needed and tempnext in not */ /* used till later on. */ while (0 != (tempnext = string_tab[local].next)) local = tempnext; /* search for free entry from local + 101 */ tempnext = (local + 101) & 0x0FFF; ep = &string_tab[tempnext]; /* initialize pointer */ while (ep->used) { ++tempnext; if (tempnext == TABSIZE) { tempnext = 0; /* handle wrap to beginning of table */ ep = string_tab;/* address of first element of table */ } else ++ep; /* point to next element in table */ } /* put new tempnext into last element in collision list */ if (update) /* if update requested */ string_tab[local].next = tempnext; return tempnext; } } void init_tab(void) { register unsigned int i; memset((char *) string_tab, 0, sizeof(string_tab)); for (i = 0; i <= 255; i++) { upd_tab(NO_PRED, i); } } void upd_tab(unsigned int pred, unsigned int foll) { struct entry *ep; /* pointer to current entry */ /* calculate offset just once */ ep = &string_tab[hash(pred, foll, UPDATE)]; ep->used = TRUE; ep->next = 0; ep->predecessor = pred; ep->follower = foll; } /* getcode fills an input buffer of bits and returns the next 12 bits */ /* from that buffer with each call */ int getcode12(FILE *in) { register int localbuf, returnval; if (EMPTY == inbuf) { /* On code boundary */ if (EOF == (localbuf = getc_pak(in))) { /* H L1 byte - on code boundary */ return EOF; } localbuf &= 0xFF; if (EOF == (inbuf = getc_pak(in))) { /* L0 Hnext */ return EOF; /* The file should always end on code boundary */ } inbuf &= 0xFF; returnval = ((localbuf << 4) & 0xFF0) + ((inbuf >> 4) & 0x00F); inbuf &= 0x000F; } else { /* buffer contains nibble H */ if (EOF == (localbuf = getc_pak(in))) return EOF; localbuf &= 0xFF; returnval = localbuf + ((inbuf << 8) & 0xF00); inbuf = EMPTY; } return returnval; } \Rogue\Monster\ else echo "will not over write ./slzw12.c" fi if `test ! 
-s ./store.c` then echo "writing ./store.c" cat > ./store.c << '\Rogue\Monster\' /******************************************************************* * UNCMP - UNSTORE, Version 1.03, created 6-28-89 * * Uncompress archives stored with method 1 and 2 (storing). * * This file will copy files using a large buffer. The buffer is * MAXOUTSIZE large. * * This code has been released into the Public Domain. *******************************************************************/ #include #include #ifdef __TURBOC__ #include #else /* MSC */ #include #endif #include "archead.h" #include "global.h" #include "uncmp.h" #define MAXOUTSIZE 16384 void store_decomp(FILE *in, FILE *out) { int c; char *buffer=NULL; /* first time initialization */ if ((buffer = (char *)malloc(sizeof(char)*MAXOUTSIZE)) == NULL) { /* do char by char if no room for buffer */ while ((c=getc_pak(in)) != EOF) putc_pak(c,out); return; } while (sizeleft >= MAXOUTSIZE) { if (fread(buffer,sizeof(char),MAXOUTSIZE,in) != MAXOUTSIZE) read_error(); addcrc((char *)buffer,MAXOUTSIZE); fwrite(buffer,sizeof(char),MAXOUTSIZE,out); sizeleft -= MAXOUTSIZE; } if (fread(buffer,sizeof(char),sizeleft,in) != sizeleft) read_error(); addcrc((char *)buffer,sizeleft); fwrite(buffer,sizeof(char),sizeleft,out); /* free the buffer before exiting */ free(buffer); sizeleft = 0; } \Rogue\Monster\ else echo "will not over write ./store.c" fi if `test ! -s ./stubs.c` then echo "writing ./stubs.c" cat > ./stubs.c << '\Rogue\Monster\' /******************************************************************* * UNCMP - STUBS, Version 1.03, created 6-28-89 * * Contains stubs to keep LINK from including never used routines. * * The function _setenvp() is called by MSC startup code to setup * the environment pointer. Since the environment is never used by * UNCMP I have substituted the Microsoft routine will an empty * function. With MSC 5.1 it saves 192 bytes. 
* * The function _nullcheck() is called by MSC shutdown code to check * for writes to the null segment. Since it is needed only for * debugging and the code is pretty well debugged, I have replaced * it with a null routine to save space. It saves 112 bytes with * MSC 5.1. I think a bug in MSC causes problems with setting the * DOS errorcode with _nullcheck(), so this one returns an error- * level of 0 no matter what the main code sets on exit. * * This code has been released into the Public Domain (what a joke!). *******************************************************************/ void _setenvp(void) { } int _nullcheck(void) { return 0; }\Rogue\Monster\ else echo "will not over write ./stubs.c" fi if `test ! -s ./testarc.c` then echo "writing ./testarc.c" cat > ./testarc.c << '\Rogue\Monster\' /******************************************************************* * UNCMP - TESTARC, Version 1.03, created 6-28-89 * * Test for corrupt archives by taking CRC of files included. * * This code has been released into the Public Domain. 
*******************************************************************/ #include #include #include #include "archead.h" #include "global.h" #include "uncmp.h" int testarc(FILE *in) { FILE *null; crc = 0; sizeleft = archead.size; state = 0; /* NOHIST */ if ((null = fopen("NUL","wb"))==NULL) { printf("Error opening NUL device\n"); exit(1); } switch(archead.atype) { case 1: case 2: if (archead.atype == 1) printf("Unstoring, "); else printf("UnStoring, "); store_decomp(in,null); break; case 3: /* can't bypass output with RLE */ printf("UnPacking, "); rle_decomp(in,null); break; case 4: printf("UnSqueezing, "); sq_decomp(in,null); break; case 5: case 6: case 7: printf("Uncrunching, "); slzw_decomp(in,null,archead.atype); break; case 8: case 9: if (archead.atype==8) printf("UnCrunching, "); else printf("UnSquashing, "); dlzw_decomp(in,null,archead.atype,archead.size); break; case 10: if (warning) printf("\nCrushing not supported in this version of UNCMP, skipping\n"); fseek(in,archead.size,1); return(1); default: if (warning) printf("Unknown compression type, skipping\n"); fseek(in,archead.size,1); } if (crc != archead.crc) { printf("Failed\n"); errors++; } else printf("Ok\n"); return(0); }\Rogue\Monster\ else echo "will not over write ./testarc.c" fi if `test ! -s ./uncmp.c` then echo "writing ./uncmp.c" cat > ./uncmp.c << '\Rogue\Monster\' /******************************************************************* * UNCMP - UNCMP, Version 1.00, created 5-13-89 * * Main part of Uncompress program. * * Reads the command line and determines which routines to run. * * This code has been released into the Public Domain. 
 *******************************************************************/

#include
#include
#include
#include
#include
#ifdef __TURBOC__
# include
#endif
#include "archead.h"
#include "global.h"
#include "uncmp.h"

#define _A_NORMAL 0

struct find_t {
    char reserved[21];
    char attrib;
    unsigned wr_time;
    unsigned wr_date;
    long size;
    char name[13];
};

void main(int argc, char **argv)
{
    FILE *in;
    long foffset;
    char arcname[80];
    int curarg = 1;             /* current argument in arg list */

    printf("UNCMP v1.03 - Archive UnCompressor, written 1989 by Derron Simon\n");
    printf("Compiled on " __DATE__ " with " COMPILER "\n\n");

    if (argc < 2) {
        help();
        exit(1);
    }
    while ((argv[curarg][0] == '/') || (argv[curarg][0] == '-')) {
        switch (tolower(argv[curarg][1])) {
        case 'e':
        case 'x':               /* extract */
            break;
        case 'w':               /* warnings off */
            warning = 0;
            break;
        case 'o':               /* overwrite on */
            overwrite = 1;
            break;
        case 't':               /* test integrity */
            testinteg = 1;
            break;
        case 'l':
        case 'v':               /* verbose listing of archive */
            listarchive = 1;
            break;
        case 'h':
        case '?':               /* help with UNCMP */
            help();
            exit(1);
        default:
            printf("Unknown option, run uncmp without arguments for help\n");
            exit(1);
        }
        curarg++;
    }
    strcpy(arcname,argv[curarg++]);
    if (!strchr(arcname,'.')) strcat(arcname,".arc");

    /* if path is the next argument then curarg++, else let it be */

    if (setup_path(strupr(argv[curarg]))) curarg++;

    /* put the arguments in the filename list */

    while (curarg != argc) {
        setup_list(argv[curarg]);
        curarg++;
    }
    if ((in = fopen(arcname,"r")) == NULL) {
        printf("Can't open archive %s\n",arcname);
        exit(1);
    }
    if (!getarcheader(in)) {
        read_error();
    }

    /* what follows is a kludge to get around what could be a bug in setvbuf() */
    /* after setvbuf() the file pointer is incremented by one, so you have */
    /* to save your place before setvbuf() and restore it afterwards */

    fgetpos(in,&foffset);       /* get position */
    if (setvbuf(in,NULL,_IOFBF,INBUFSIZE)) {
        mem_error();
    }
    fflush(in);
    fsetpos(in,&foffset);       /* restore position */

    do {                        /* MAIN LOOP
 */
        if (testinteg) {
            if (check_list(archead.name))   /* if match then test file */
                test_file(in,archead.name);
            else fseek(in,archead.size,1);  /* else skip it */
        }
        else if (listarchive) {
            list_arc(in);                   /* list_arc() only gets */
            break;                          /* called once */
        }
        else {
            if (check_list(archead.name))   /* if match then extract file */
                extract_file(in,archead.name);
            else fseek(in,archead.size,1);  /* else skip it */
        }
    } while (getarcheader(in));

    fclose(in);

    if (testinteg) printf("%d error(s) detected\n",errors);

    printf("\nUNCMP complete\n");
    exit(errors);
}

/* this function extracts a file and tests to see if duplicate exists */

void extract_file(FILE *in,char *filename)
{
#ifdef __TURBOC__
    struct ffblk buffer;
#else                           /* MSC 5.1 */
    struct find_t buffer;
#endif
    char outfile[80];
    int reply;
    FILE *out;
    int failure = 0;

    /* create filename with specified path */

    strcpy(outfile,path);
    strcat(outfile,filename);

    if (!overwrite) {
#ifdef __TURBOC__
        if (!findfirst(outfile,&buffer,FA_ARCH)) {
#else                           /* MSC 5.1 */
        if (!_dos_findfirst(outfile,_A_NORMAL,&buffer)) {
#endif
            printf("File %s exists, overwrite? ",archead.name);
            fflush(stdin);
            reply=getc(stdin);
            if ((reply != 'y') && (reply != 'Y')) {
                printf("Skipping file %s\n",archead.name);
                fseek(in,archead.size,1);
                return;
            }
        }
    }
    if ((out = fopen(strupr(outfile),"w")) == NULL) {
        printf("Cannot create file %s\n",outfile);
        exit(1);
    }
    if (setvbuf(out,NULL,_IOFBF,OUTBUFSIZE)) {
        mem_error();
    }
    rewind(out);

    printf("Extracting file: %-15s ",strupr(filename));
    failure = uncmp(in,out);
    fflush(out);

    /* set date and time, but skip if not MSC since Turbo C has no */
    /* equivalent function */

/*
#ifndef __TURBOC__
    _dos_setftime(fileno(out),archead.date,archead.time);
#endif
*/

    fclose(out);
#ifdef M_XENIX
    xen_setftime(strupr(outfile), archead.date, archead.time);
#endif

    /* if errors during uncompression, then delete attempt at uncompression */

    if (failure) {
        unlink(outfile);
    }
}

void help(void)
{
    printf("UNCMP uncompresses archives created with ARC-compatible archivers\n\n");
    printf("Usage: uncmp -[xtv] -[wo] archive[.ext] [path/] [filename.ext] [...]\n");
    printf(" - \"[xtv]\" specifies operation\n");
    printf(" - \"-x\" extract files (default)\n");
    printf(" - \"-t\" test archive integrity\n");
    printf(" - \"-v\" list contents of archive\n");
    printf(" - \"[wo]\" specifies option\n");
    printf(" - \"-w\" suppresses warnings\n");
    printf(" - \"-o\" overwrite existing files\n");
    printf(" - \"archive.ext\" is the archive to extract from with optional extension,\n");
    printf(" arc is assumed if none is specified.\n");
    printf(" - \"path/\" specifies the path to extract files to.\n");
    printf(" - \"filename.ext\" is the name of a file to extract, any number can be\n");
    printf(" specified.\n");
}

/* the same as extract except it doesn't check for already existing files */
/* and doesn't setup buffers */

void test_file(FILE *in, char *filename)
{
    printf("Testing file: %-15s ",strupr(filename));
    testarc(in);
}

fgetpos(in, addr)
FILE *in;
long *addr;
{
    *addr = ftell(in);
}

fsetpos(in, addr)
FILE *in;
long *addr;
{
    (void)fseek(in, *addr, 0);
}

_dos_findfirst(ofile, ints, bufs )
char *ofile;
int ints;
char *bufs;
{
    int result;
    FILE *Result;

    /* Xenix stand-in for the DOS call: returns 0 if the file can be opened */
    Result = fopen(ofile, "r");
    if( Result == NULL )
        result = 1;
    else {
        fclose(Result);         /* only probing for existence; don't leak the stream */
        result = 0;
    }
    return(result);
}

xen_setftime(f_name, f_date, f_time)    /* set a file's date/time stamp */
char *f_name;                           /* file to set stamp on */
unsigned short f_date, f_time;          /* desired date, time */
{
    time_t times[2];
    unsigned long year, month, day, hour, minute, second;
    static unsigned long monthdays[12] = {
        0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
    };

    year = (f_date >> 9) + 1980;
    month = (f_date >> 5 & 0xF) - 1;
    day = f_date & 0x1F;
    hour = f_time >> 11;
    minute = f_time >> 5 & 0x3F;
    second = f_time << 1 & 0x3F;

    times[0] = time(NULL);
    times[1] = ((((year - 1970l) * 365l + ((year - 1969l) >> 2) +
                  monthdays[month] +
                  ((month > 1l && !(year & 3l)) ? 1l : 0l) +
                  day - 1) * 24l + hour) * 60l + minute) * 60l + second;
    utime(f_name, times);
}
\Rogue\Monster\
else
  echo "will not over write ./uncmp.c"
fi
if `test ! -s ./uncmp.h`
then
echo "writing ./uncmp.h"
cat > ./uncmp.h << '\Rogue\Monster\'
/*******************************************************************
 * UNCMP - UNCMP.H, Version 1.03, created 6-28-89
 *
 * Function prototypes for all functions used, as well as all the
 * defines that are needed for all of the modules.
 *
 * This code has been released into the Public Domain.
 *******************************************************************/

/* defines */

#define OUTBUFSIZE 16384        /* 16k output buffer */
#define INBUFSIZE 16384         /* 16k input buffer */

#ifdef __TURBOC__               /* if TURBO C */
# define COMPILER "Turbo C 2.0 and TLINK 2.0"
#else                           /* Assume MSC 5.1 */
# define COMPILER "MSC 5.1 and LINK 3.65"
#endif

#define PASCAL pascal           /* can be undefined if compiler doesn't support */
                                /* pascal calling. Improves speed slightly */

/* the following is an inline version of addcrc for 1 character, implemented */
/* as a macro, it really speeds things up.
 */

#define add1crc(x) crc = ((crc >> 8) & 0x00FF) ^ crctab[(crc ^ (x)) & 0x00FF];

/* function prototypes */

void main(int argc, char **argv);
void addcrc(char *cc, int i);
int dlzw_decomp(FILE *in,FILE *out,int arctype,long size);
void putc_pak(char c,FILE *out);
void putc_rle(unsigned char c,FILE *out);
int getc_pak(FILE *in);
int getarcheader(FILE *in);
void sq_decomp(FILE *in, FILE *out);
void list_arc(FILE *in);
int calcsf(long length_now,long org_size);
void extract_file(FILE *in,char *filename);
void help(void);
void test_file(FILE *in,char *filename);
int slzw_decomp(FILE *in,FILE *out,int arctype);
int testarc(FILE *in);
int uncmp(FILE *in,FILE *out);
char *setup_name(char *filename);
void setup_list(char *filename);
int check_list(char *filename);
int compare_files(char *filename, char *filespec);
int setup_path(char *filename);
void store_decomp(FILE *in, FILE *out);
void read_error(void);
void write_error(void);
void mem_error(void);
void rle_decomp(FILE *in, FILE *out);
\Rogue\Monster\
else
  echo "will not over write ./uncmp.h"
fi
if `test ! -s ./uncmp.prj`
then
echo "writing ./uncmp.prj"
cat > ./uncmp.prj << '\Rogue\Monster\'
UNCMP.C
STUBS.C
DLZW1213.C
GETHEAD.C
FILEIO.C
FILELIST.C
HUFFMAN.C
SLZW12.C
DISPATCH.C
STORE.C
ERRORS.C
GLOBAL.C
PACK.C
CRC.C
TESTARC.C
LISTARC.C
\Rogue\Monster\
else
  echo "will not over write ./uncmp.prj"
fi
if `test ! -s ./uncmp.tc`
then
echo "writing ./uncmp.tc"
cat > ./uncmp.tc << '\Rogue\Monster\'
Turbo C Configuration File
\Rogue\Monster\
else
  echo "will not over write ./uncmp.tc"
fi
if `test ! -s ./user.man`
then
echo "writing ./user.man"
cat > ./user.man << '\Rogue\Monster\'
UNCMP v1.00 USER MANUAL
-----------------------

What is UNCMP?
--------------

UNCMP is a Public Domain un-archiver with full C source code for MS-DOS.
UNCMP was written by Derron Simon, with most of the code coming from
Public Domain sources and the rest written by Derron Simon.
UNCMP comes complete with source code for MSC 5.1 and Turbo C 2.00 and is
very customizable for C programmers who wish to tailor it to a particular
application; however, you do not need to be a C programmer or technically
inclined to use it.  The interface is straightforward and simple, and
UNCMP has one major thing going for it: IT'S FREE!

Who should use UNCMP?
---------------------

UNCMP was written first to be an end-user application and then to be a
platform for C programmers to use in creating a Public Domain un-archiver.
People who need to extract files from ARC and PKA archives daily should
consider UNCMP if they do not wish to pay for an un-archiver or would
rather use a program that is user-supported.

Who should _not_ use UNCMP?
---------------------------

UNCMP is lacking in areas that may be needed by some people.  UNCMP is
relatively slow; most commercial and shareware un-arcers are between 2 and
3 times faster than UNCMP.  People who need an archiver must also use
another program, since UNCMP cannot create archives; it only extracts
files from archives.

UNCMP feature list
------------------

  o  Ability to extract single or multiple files from archives.
  o  Compatibility with SEA's ARC format and PKWARE's PKA format.
  o  Support for all compression types in common use today, including
     Squashing, Crunching, old style Crunching, Squeezing, packing, and
     storing, but not crushing (a proprietary format used by NoGate's PAK).
  o  Full C source code!
  o  Faster than all other PD un-archivers.
  o  Ability to list contents of archives.
  o  Ability to test the integrity of archives.
  o  Command line flags to toggle automatic overwrite and warning level.
  o  Ability to extract to a path.
  o  Ability to use wildcards in list of files to extract.
  o  Small (under 25k with MSC 5.1).
  o  Portable.

Summary of commands
-------------------

Using UNCMP is very simple.  The format is as follows:

  UNCMP [-xtv] [-wo] archive[.ext] [d:\path\] [filename.ext] [...]

Each of the above arguments that begins with a hyphen ('-') specifies an
operation or modifier.  The first is an operation.  To extract a file, the
option '-x' is given right after UNCMP on the command line.  To test the
integrity of a file's contents, the option '-t' is used.  To examine the
contents of the archive, the option '-v' is used.

The second argument is the modifier.  Only two modifiers are available:
'-w', which turns off warnings, and '-o', which overwrites files that
already exist without asking.

If no options are given, UNCMP defaults to extracting with warnings on and
asking before it overwrites a file.

The archive must have been created with an ARC-compatible program, such as
ARC itself, PKARC, or any of a number of ARC-compatible programs.  The
default extension is ARC and is added if none is given.

The fourth argument is the path to extract the files in the archive to.
It can be any valid path and must end with a slash ('\' or '/') for UNCMP
to recognize it as a path.

The last argument is a file specifier which selects the files to extract.
Wildcards may be used and any number of files may be specified.

Distribution Notice
-------------------

I hold no copyright to UNCMP and release it to the Public Domain.  I hope
that UNCMP will not be vandalized and that I will not be taken advantage
of because of this very liberal policy.  Most of the source code came from
PD sources, so this was written by those people as much as by myself.

If you modify UNCMP I will be very happy to receive your modifications and
release them in the next update.  Please send them to me at one of the
addresses below.  Thanks.

Every official release of UNCMP will be packaged in an ARC-compatible file
with the following filename style:

  UCMP100S.ARC - for the whole package, including C source code,
                 documentation, and executable.
  UCMP100E.ARC - for the executable and user documentation.

Please do not use the above naming conventions for any non-official
releases, and please keep the files together in the archives they came in.
Do not change the file groupings.

About the author
----------------

I am a 17 year old C programmer who right now is looking for a good
college and hoping to major in computer science.  I have been programming
in C for about 3 years and became very interested in data compression.  I
wrote UNCMP because of that interest.  I have written another utility,
called AV, which lists the contents of archives of many different formats.
AV is available on many systems across the country.

I will answer your questions if you send them to one of the addresses
below:

  Derron Simon                  GEnie: D.SIMON
  14 Valley Road
  Glen Mills, PA 19342

Or leave a message on one of the BBS's below, for Derron Simon.

Distribution Points
-------------------

Every official version of UNCMP will be distributed as widely as possible
by me; however, the following BBS's and information services will always
have the latest version.

  PAL Software BBS - (914)762-8055 (2400/1200)
  Rydal Board BBS  - (215)884-6122 (2400/1200)
  GEnie IBM Roundtable Software Library

I will probably add more after the initial release of UNCMP, but they are
the only ones I have right now.  If you must, I will send you a version if
you send me a diskette and a return mailer with postage.  I can only
handle 720k and 1.44mb 3 1/2 diskettes and 360k 5 1/4 diskettes.  Please
try to find another method, because I don't want to be swamped with work.

Trademarks
----------

ARC is a trademark of System Enhancement Associates.

Credits
-------

The following people have contributed to this program, both knowingly and
unknowingly.

  Leslie Satenstein - wrote the SQUASH program from which all of the
                      Crunching and Squashing code came, as well as the
                      CRC and RLE code.
  Kent Williams - wrote the old method crunching routines.
  D.A. Huffman - the founder of the squeezing or huffman method of data
                 compression.
  James A.
  Storer - author of the book _Data Compression Methods and Theory_, from
           which I learned about data compression.
  Richard Greenlaw - wrote the huffman decoding routines used.

I'm sure this list will grow as additions are made to UNCMP.
\Rogue\Monster\
else
  echo "will not over write ./user.man"
fi
echo "Finished archive 1 of 2"
exit
--
uucp:   ...!uunet!zardoz!alphacm!sandy    ....!att!hermix!alphacm!sandy
        ...!trwrb!ucla-an!alphacm!sandy   ....!lcc!alphacm!sandy
phone:  data --- 714-821-9671    voice --- 714-821-9670
Sanford Zelkovitz
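
A note on the date arithmetic in xen_setftime() in uncmp.c above, for
anyone porting it further.  The archive header stores the MS-DOS packed
date and time words, and the routine unpacks them by hand into seconds
since 1970.  The following is only a minimal sketch of that same
arithmetic as a standalone function.  The name dos_to_unix() and the
range check on the month are my own assumptions and are not part of
UNCMP; like the original, it ignores time zones and treats the stamp as
if it were UTC.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not in UNCMP): convert an MS-DOS packed date and
 * time to seconds since 1970-01-01, using the same arithmetic as
 * xen_setftime().  Time zones are ignored.
 *
 *   date word: bits 15-9 year-1980, bits 8-5 month (1-12), bits 4-0 day
 *   time word: bits 15-11 hour, bits 10-5 minute, bits 4-0 seconds/2
 */
static long dos_to_unix(unsigned f_date, unsigned f_time)
{
    /* days before the first of each month in a non-leap year */
    static const long monthdays[12] = {
        0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
    };
    long year   = (long)(f_date >> 9) + 1980;
    long month  = (long)((f_date >> 5) & 0xF) - 1;   /* 0-based */
    long day    = (long)(f_date & 0x1F);
    long hour   = (long)(f_time >> 11);
    long minute = (long)((f_time >> 5) & 0x3F);
    long second = (long)((f_time & 0x1F) << 1);      /* 2-second units */
    long days;

    /* Assumption: clamp a corrupt month field instead of indexing
       outside monthdays[]; the original does not check this. */
    if (month < 0 || month > 11)
        month = 0;

    days = (year - 1970L) * 365L
         + ((year - 1969L) >> 2)                     /* leap days before this year */
         + monthdays[month]
         + ((month > 1 && !(year & 3)) ? 1L : 0L)    /* this year's Feb 29 */
         + day - 1;

    return ((days * 24L + hour) * 60L + minute) * 60L + second;
}
```

The leap-year test (year & 3) is the same shortcut the original uses.
It is correct for every year a DOS stamp can hold except 2100, since the
7-bit year field covers 1980 through 2107.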