Path: utzoo!attcan!uunet!wuarchive!emory!rsiatl!jgd
From: jgd@rsiatl.UUCP (John G. DeArmond)
Newsgroups: comp.unix.sysv386
Subject: Re: Converting DOS text files
Keywords: SCO ODT
Message-ID: <4339@rsiatl.UUCP>
Date: 16 Oct 90 15:56:10 GMT
References: <1477@pai.UUCP>
Organization: Radiation Systems, Inc. (a thinktank, motorcycle, car and gun works facility)
Lines: 81

erc@pai.UUCP (Eric Johnson) writes:

>This is for those of you who have SCO's OpenDesktop with a DOS
>under UNIX, or any other DOS under UNIX that has this problem.

>The problem is this: when you use a DOS-based copy command to copy a text
>file onto your system (from a PC floppy, say), that DOS text file
>is full of CR/LFs (instead of the UNIX line feed) and has a trailing
>Ctrl-Z. On SCO, there is a program to take care of this, called
>dtox. Unfortunately, dtox is a filter. That is, you call it
>with something like:

[program with BIG copyright deleted.]

Please don't take this the wrong way, but your approach, while probably
necessary in a DOS tool-less environment, is terrible for Unix. Here's
how you do it without any programming. Get to know Mr. Shell. He is
your friend. Here's how:

for i in `ls *.txt`
do
    # takes care of read-only temp file name collisions
    rm -f /tmp/$i >/dev/null 2>&1
    tr -d '\032\015' <$i >/tmp/$i
    if [ $? -eq 0 -a -f /tmp/$i ]
    then
        mv -f /tmp/$i $i
    else
        rm -f /tmp/$i >/dev/null 2>&1    # just in case
        echo "tr returned an error on file $i"
        exit 1
    fi
done

If you want to put this in a shell script, simply substitute this for
the first line:

for i in `ls $*`

What this script does is first execute the command in back-ticks
("ls *.txt") and then step through the list of files via the shell
variable "i". Each file is run through tr (translate) invoked in its
delete mode (-d). Tr is told to delete ^M (octal 015) and ^Z (octal 032).
The return code from tr is stored in the shell intrinsic "$?". If tr is
successful, this value will be 0.
The "if" statement checks to see if tr ran ok AND if the temporary file
was created ok, and if so moves the temporary file back on top of the
original.

There are even simpler ways to do this, but this is what popped out of
my head when reading your post. There are several unaddressed error
conditions in this script, such as when a temp file name collision
occurs and the temp file is not owned by you, but these problems are
left as an exercise to the reader :-)

You could, of course, use dtox in place of tr, but this solution is
Unix vendor-independent. You could also use sed, awk, Perl (if
installed) and who knows what else. In other words, get with the Unix
tools show, man :-)

Minor programming note. I don't usually critique coding practices on
the net but in this case I gotta. Your approach is terribly inefficient,
requiring twice as much system resource as necessary. Namely, you first
process the input file a character at a time (which is OK for a quick
hack) and then you copy the temp file back onto the input file a
character at a time (NO NO). The easiest way to move the temp file back
onto the original is to use a system() call with mv. Example:

    sprintf(tmpstr, "mv %s %s", tmpname, filename);
    system(tmpstr);

For a bit of error checking, you could fork() and exec() mv and look at
the return code from wait(). Or, assuming the files are both on the
same file system, you could simply unlink() the old file, link() the
old name to the temp file and unlink() the temp file. That is the most
efficient way of doing it.

While one could (successfully) argue that a system() or fork() system
call would be more expensive than processing small files a byte at a
time, for typical files this would not be the case. And for machines
that process I/O system calls slowly (NCR Towers come to mind), even
small files would seriously degrade performance, especially if you are
doing a lot of them.
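
For anyone who wants to see the two C alternatives above spelled out,
here is a rough sketch, not a definitive implementation. The function
names replace_by_link() and replace_by_mv() are made up for
illustration, the ENOENT check is an added assumption (the original
file may already be gone), and waitpid() stands in for the plain
wait() mentioned above.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: put tmpname in place of filename using
 * unlink()/link()/unlink(). Both names must live on the same file
 * system. Note that filename briefly does not exist between the first
 * unlink() and the link(). Returns 0 on success, -1 on error. */
int replace_by_link(const char *tmpname, const char *filename)
{
    /* Remove the old file; a missing original is not treated as an error. */
    if (unlink(filename) != 0 && errno != ENOENT) {
        perror(filename);
        return -1;
    }
    /* Give the temp file's contents the old name. */
    if (link(tmpname, filename) != 0) {
        perror(tmpname);
        return -1;
    }
    /* Drop the temporary name; the data stays reachable via filename. */
    if (unlink(tmpname) != 0) {
        perror(tmpname);
        return -1;
    }
    return 0;
}

/* Hypothetical helper: fork() and exec mv, then inspect its exit status.
 * Returns 0 if mv exited with status 0, -1 otherwise. */
int replace_by_mv(const char *tmpname, const char *filename)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: run mv; execlp() returns only if the exec failed. */
        execlp("mv", "mv", "-f", tmpname, filename, (char *)0);
        perror("mv");
        _exit(127);
    }
    /* Parent: wait for the child and check how it exited. */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A caller would write the filtered output to the temp file, close it,
and then call replace_by_link(tmpname, filename), falling back to
replace_by_mv() when the two names sit on different file systems. On
systems that provide rename(), a single rename(tmpname, filename) does
the same job without the window in which the file has no name.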

John

John De Armond, WD4OQC  | "The truly ignorant in our society are those people
Radiation Systems, Inc. | who would throw away the parts of the Constitution
Atlanta, Ga             | they find inconvenient."  -me    Defend the 2nd
{emory,uunet}!rsiatl!jgd| with the same fervor as you do the 1st.