Xref: utzoo comp.unix.programmer:427 alt.religion.computers:2001
Path: utzoo!attcan!uunet!lll-winken!cert!netnews.upenn.edu!msuinfo!rang
From: rang@cs.wisc.edu (Anton Rang)
Newsgroups: comp.unix.programmer,alt.religion.computers
Subject: Re: Why use U* over VMS
Message-ID: <RANG.90Nov5230646@nexus.cs.wisc.edu>
Date: 6 Nov 90 04:06:46 GMT
References: <1990Oct25.160937.28144@edm.uucp> <1089@dg.dg.com>
	<1990Oct31.141954.22736@druid.uucp> <3749@idunno.Princeton.EDU>
	<PENA.90Nov1164830@fuug.fi>
Sender: news@msuinfo.cl.msu.edu
Followup-To: alt.religion.computers
Organization: UW-Madison CS department
Lines: 46
In-Reply-To: pena@fuug.fi's message of 1 Nov 90 15:48:30 GMT

In article <PENA.90Nov1164830@fuug.fi> pena@fuug.fi (Olli-Matti Penttinen) writes:
>  The worst thing is that (at least under VMS 4.x) you really had to
>know the exact file type of a *TEXT FILE* to be able to do anything
>useful with it.

  This is an argument which comes up fairly often.  It's only partly
true.  The C I/O model (a file is a stream of bytes) works well for an
awful lot of applications.  The problem is that if you have a more
complex file structure, mapping it into the stream-of-bytes framework
in a way that makes sense is difficult.

  The way that most languages under VMS do their I/O is by using the
high-level RMS calls.  You can $OPEN a file, and then read through it,
one record at a time, using $GET.  This will let you sequentially scan
through a file with a record structure--whether it's a variable-length
record file, a stream file, a 'relative' (numbered-record) file, or an
ISAM file.  You can use $PUT and/or $UPDATE to change files, one
record at a time.

  It's difficult to map this directly into a read/write/fseek model,
except with stream files (which are basically identical to UNIX text
files).  For instance, each record in a (sequential) variable-length
record file has a two-byte length field, and may include a byte of
padding.  If you want to treat fseek() as taking an offset which is a
character count, in this file type, you don't know where to position
the 'file pointer' without scanning from the start of the file.
(That's why it's a "sequential file"--it's designed for sequential
access, and shouldn't be used [under VMS] for random access.)
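
  To make the fseek() problem concrete, here is a rough sketch in C.
It is not RMS code: it just builds an in-memory image of a
variable-length record file and then looks up a character offset.  The
function names (put_record, seek_char) are made up for illustration,
and I'm assuming a little-endian length field with one pad byte after
any odd-length record.  Record boundaries are not counted as
characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append one record at byte offset pos: a two-byte length field
 * (assumed little-endian), the data bytes, and one pad byte if the
 * length is odd.  Returns the offset just past the record. */
size_t put_record(uint8_t *buf, size_t pos, const char *data)
{
    size_t len = strlen(data);

    assert(len <= 0xFFFF);              /* must fit in two bytes */
    buf[pos++] = (uint8_t)(len & 0xFF);
    buf[pos++] = (uint8_t)(len >> 8);
    memcpy(buf + pos, data, len);
    pos += len;
    if (len & 1)
        buf[pos++] = 0;                 /* pad to an even boundary */
    return pos;
}

/* Translate a character count into a byte offset within the image.
 * There is no formula for this: every length field before the target
 * has to be read, so the cost grows with the number of records.
 * Returns (size_t)-1 if char_off is past the last character. */
size_t seek_char(const uint8_t *buf, size_t end, size_t char_off)
{
    size_t pos = 0;

    while (pos + 2 <= end) {
        size_t len = buf[pos] | ((size_t)buf[pos + 1] << 8);

        pos += 2;                       /* skip the length field */
        if (char_off < len)
            return pos + char_off;      /* target is in this record */
        char_off -= len;
        pos += len + (len & 1);         /* skip data and any padding */
    }
    return (size_t)-1;
}
```

  With the records "abc", "de", and "f" written in that order, the
character 'd' is character number 3 but sits at byte offset 8, and the
only way to find that out is to walk past the first record.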

  An ISAM file type is even worse.  If you have records with keys of
'abc', 'def', and 'ghi', they may have that logical order but be
stored physically in a different order.  Different parts of a record
may be in different areas of the disk, and the record itself may be
stored in compressed form.  How could fseek() deal rationally with
this?
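
  A toy version of the same point, again in C and again not real RMS
code: the records sit in one order in "storage" while a separate index
gives the key order.  The struct, the arrays, and the function names
are all invented for the example, and a real ISAM file is a good deal
more involved (multi-level index, buckets, compression).

```c
#include <stddef.h>
#include <string.h>

struct record {
    char key[4];
    const char *data;
};

/* Physical order: whatever order the records happened to land in. */
static const struct record storage[] = {
    { "ghi", "third"  },
    { "abc", "first"  },
    { "def", "second" },
};

#define NRECORDS (sizeof storage / sizeof storage[0])

/* Logical order: slot numbers in storage[], sorted by key. */
static const size_t index_by_key[NRECORDS] = { 1, 2, 0 };

/* Return the n-th record in key order, or NULL if n is out of range. */
const struct record *nth_in_key_order(size_t n)
{
    return n < NRECORDS ? &storage[index_by_key[n]] : NULL;
}

/* Look a record up by key with a binary search over the index. */
const struct record *find_key(const char *key)
{
    size_t lo = 0, hi = NRECORDS;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const struct record *r = &storage[index_by_key[mid]];
        int c = strcmp(key, r->key);

        if (c == 0)
            return r;
        if (c < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return NULL;
}
```

  The first record in key order ('abc') lives in physical slot 1, so
"position 0" means two different things depending on which order you
ask about.  A single byte offset can't express that.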

  If you want an easy port of a program using the UNIX I/O facilities,
it will have to deal with stream files, since that's what UNIX
provides.  (Unfortunately, the VMS C RTL still had some rather odd
quirks last time I looked, but the latest release was supposed to
improve it--I haven't used it since then.)

  Hopefully this isn't really controversial, but I've directed
followups to alt.religion.computers just in case....

	Anton
   
+---------------------------+------------------+-------------+
| Anton Rang (grad student) | rang@cs.wisc.edu | UW--Madison |
+---------------------------+------------------+-------------+