Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!wuarchive!uunet!sco!rogerk
From: rogerk@sco.COM (Roger Knopf 5502)
Newsgroups: comp.unix.xenix.sco
Subject: Re: 1Gbyte file on a 130Mb drive (fsdb)
Keywords: big file fsdb
Message-ID: <17956@scorn.sco.COM>
Date: Tue, 25 Jun 91 19:52:46 GMT
References: <124@comix.UUCP>
Sender: news@sco.COM
Reply-To: rogerk@silicon.UUCP (Roger Knopf 5502)
Distribution: na
Organization: The Santa Cruz Operation, Inc.
Lines: 105

In article <124@comix.UUCP> jeffl@comix.Santa-Cruz.CA.US (Jeff Liebermann) writes:
>How does one deal with a bogus 1Gigabyte file?
>I have a Xenix 2.3.3 system that has ls magically
>declare a 45Mb accounting file as 1Gbyte huge.
>
>ls declares it to be 1Gb big.
>du agrees.
>df -v gives the correct filesystem size.
>fsck "Possible wrong file size I=140" (no other errors).
>
>To add to the problem, I'm having difficulty doing
>a backup before attacking.

There may not be anything "wrong" with the inode. There may just be
a gap where nothing was written. For example, if you lseek to offset
10,240 and write 1K, the utilities above will show the file size to
be 11K when in reality you have used only one block. dbm did this at
one time (don't know about now).

The real problem is that no utility recognizes these intervening
empty blocks as not actually being there. After all, if you read
back the above file, you will get 10K of nulls and then your data.
So if you use cp, tar, cpio, etc., they "fill in the blanks" and make
your file actually occupy its logical size by writing nulls into all
the empty blocks.

There are two ways of dealing with this problem:

1) Most databases have a utility which dumps records and loads them.
   Use this to back up the data. To restore the data, you make empty
   data files and use the load utility. This is the preferred method
   if available. It also has the side benefit of (usually) making
   the new files run faster.

2) This is ugly but effective. It only works assuming the "sparse"
   file scenario. Use the attached program to compress the file. To
   read it back, read in each struct, lseek to the offset in lseekval
   and then write the block (actually cnt bytes) to the output file.

-------------------------- sparsecp.c ---------------------------
#include
#include
#include

#define BSIZE 1024

struct orec {
    long lseekval;
    int cnt;
    unsigned char buf[BSIZE];
} rec;

main(argc, argv)
int argc;
char *argv[];
{
    int infd, outfd, n;
    long offset;

    /* error checking is left as an exercise for the reader */

    infd=open(argv[1], O_RDONLY);
    outfd=creat(argv[2], 0700);
    offset=0;

    while ((n=read(infd, rec.buf, BSIZE)) > 0) {
        if (notempty(rec.buf)) {
            rec.lseekval=offset;
            rec.cnt=n;
            write(outfd, &rec, sizeof (struct orec));
            nullout(rec.buf);
            if (n != BSIZE)     /* partial block at EOF */
                break;
        }
        offset+=BSIZE;
    }

    close(infd);
    close(outfd);
}

notempty(s)
unsigned char s[];
{
    register i;

    for (i=0; i
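
The read-back step described for method 2 (read each struct, lseek to
lseekval, write cnt bytes) might look roughly like the sketch below.
This is not part of the original posting: the name sparse_restore is
hypothetical, the code assumes POSIX and ANSI C rather than the K&R
dialect of sparsecp.c, and it assumes the packed file was written on
the same machine so that struct orec has the same layout.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BSIZE 1024

/* Same record layout that sparsecp.c writes. */
struct orec {
    long lseekval;              /* offset of the block in the original file */
    int cnt;                    /* number of valid bytes in buf */
    unsigned char buf[BSIZE];
};

/* Hypothetical helper: rebuild a file from packed records.
 * Blocks that were never recorded are simply skipped over with lseek,
 * so the filesystem can leave them unallocated.
 * Returns 0 on success, -1 on any error. */
int sparse_restore(const char *packed, const char *out)
{
    struct orec rec;
    ssize_t n;
    int rc = 0;
    int infd, outfd;

    infd = open(packed, O_RDONLY);
    if (infd < 0)
        return -1;
    outfd = open(out, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (outfd < 0) {
        close(infd);
        return -1;
    }

    while ((n = read(infd, &rec, sizeof rec)) == (ssize_t)sizeof rec) {
        if (rec.cnt < 0 || rec.cnt > BSIZE ||
            lseek(outfd, (off_t)rec.lseekval, SEEK_SET) == (off_t)-1 ||
            write(outfd, rec.buf, (size_t)rec.cnt) != (ssize_t)rec.cnt) {
            rc = -1;
            break;
        }
    }
    if (rc == 0 && n != 0)      /* read error or truncated record */
        rc = -1;

    close(infd);
    if (close(outfd) != 0)
        rc = -1;
    return rc;
}
```

A caller would use it as sparse_restore("acct.packed", "acct"), with
both file names being placeholders. Since the record format stores
only non-empty blocks and no total length, a file that ended in a hole
would presumably come back shorter than its original logical size.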