Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!rutgers!mit-eddie!uw-beaver!tektronix!teklds!copper!stevesu
From: stevesu@copper.UUCP
Newsgroups: comp.os.vms
Subject: Re: C RTL [again?]
Message-ID: <1105@copper.TEK.COM>
Date: Wed, 10-Jun-87 00:13:46 EDT
Article-I.D.: copper.1105
Posted: Wed Jun 10 00:13:46 1987
Date-Received: Sat, 13-Jun-87 03:44:18 EDT
References: <12308535119.47.AWALKER@RED.RUTGERS.EDU> <870607122607.00h@CitHex.Caltech.Edu>
Organization: Tektronix Inc., Beaverton, Or.
Lines: 86
Summary: Unix is not record oriented

Inevitably, one quickly finds that many C RTL questions cross over from language issues to operating system issues. Several people have correctly pointed out that various parts of VMS have inherent 65,535 or 32,767 byte record limitations. Unfortunately, these arguments have nothing to do with C functions named read() or write().

Most vendors, DEC included, provide a C run-time library which "just happens" to look a lot like Unix. Presumably this is to make porting programs from Unix to (in this case) VMS easy. Therefore, the C RTL should do a reasonable amount of work to hide filesystem or operating system peculiarities, especially those not present in Unix.

Unix has _absolutely no record structure_. People who are used to traditional record-based operating systems find this concept about as alarming as Westerners do when encountering aborigines running around without clothes, but Unix programmers love this freedom, and having to deal with RMS is one of the biggest shocks when moving from Unix to VMS.

VMS I/O is nominally device-independent; Unix I/O is much more so. The read() and write() calls, even though they are low-level I/O routines, should not unnecessarily reflect underlying device characteristics, particularly on disk files.

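As a rough illustration of what "hiding" a per-call size limit could look like, here is a minimal sketch in C. Everything in it is an assumption made for the example: big_read, raw_read_fn, mem_src, mem_raw_read and the MAX_XFER value are hypothetical names, and the in-memory reader merely stands in for a lower layer that refuses oversized requests. It is not DEC's code or anyone else's actual RTL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-call limit of the lower layer (an assumption). */
#define MAX_XFER ((size_t)32767)

/* A lower-level read: returns bytes read, 0 at end of file, -1 on error. */
typedef long (*raw_read_fn)(void *ctx, char *buf, size_t n);

/* Read up to n bytes by issuing as many limited-size requests as needed. */
long big_read(raw_read_fn raw, void *ctx, char *buf, size_t n)
{
    size_t done = 0;

    assert(raw != NULL && buf != NULL);
    while (done < n) {
        size_t want = n - done;
        long got;

        if (want > MAX_XFER)
            want = MAX_XFER;        /* never exceed the lower layer's limit */
        got = raw(ctx, buf + done, want);
        if (got < 0)                /* report an error only if nothing was read */
            return done > 0 ? (long)done : -1;
        if (got == 0)               /* end of file */
            break;
        done += (size_t)got;
    }
    return (long)done;
}

/* Stand-in lower layer: an in-memory "file" that rejects oversized requests. */
struct mem_src {
    const char *data;
    size_t len;
    size_t pos;
};

long mem_raw_read(void *ctx, char *buf, size_t n)
{
    struct mem_src *s = ctx;
    size_t avail = s->len - s->pos;

    if (n > MAX_XFER)
        return -1;                  /* emulate the per-call restriction */
    if (n > avail)
        n = avail;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return (long)n;
}
```

A caller would invoke big_read(mem_raw_read, &src, buf, 100000) and get all 100,000 bytes back in one call, even though no single lower-level request exceeds MAX_XFER. Note that this loops until the request is satisfied or end of file is reached, which is reasonable for disk files but not what one wants on a terminal.
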
It could be argued that a program that is trying to do reads of more than 32,767 bytes would be more portably written in terms of fopen and fread, so that the stdio package could provide another level of chunking/buffering. It could also be argued that such a program is inherently unportable, because huge numbers like these would not work as the third argument to read() on a 16-bit machine. Dragging record length considerations in, however, begins to compromise the semantics of read().

I must concede that, in the end, it is impossible to ignore record-length considerations when doing C I/O on VMS. Even if huge reads were supported, there are several other alignment problems you have to worry about (lseeks to record boundaries, reads and writes of multiples of the record size, lines not changing size when updating variable-length record files in place, etc.). There are a whole bunch of these limitations, and they could probably be better documented. In particular, Table A-1 should not state that read() and write() have "equivalent functionality." (Several of the "not equivalent" entries in that table document restrictions much less significant than those of read and write.)

                                        Steve Summit
                                        stevesu@copper.tek.com

P.S. Some of you are saying "but to remove the restrictions, the read() and write() emulations would have to do extra buffering, and that would be inefficient." Yes, buffering underneath read() and write() is necessary if you want to remove as many RMS-related restrictions as possible, but no, it doesn't have to be inefficient. It is possible to write these routines so that, when callers perform the "approved," aligned operations, no buffering (and hence no significant overhead) is required. When unaligned calls are made, the buffering is no less efficient than the buffering that would inevitably be introduced in the calling program to work around a restricted call.

P.P.S.
I'm knowledgeable about C RTL issues, and could talk about tradeoffs longer than you probably feel like listening and I feel like typing, because I wrote one here at Tektronix, partly because of licensing restrictions with the version 1 DEC C RTL, and partly because we didn't feel like rewriting large numbers of applications to work around the DEC C RTL limitations. Our library is proprietary, of course, so I can't offer you a copy.

Disclaimers:

Since I don't use DEC's C RTL, for the above-mentioned reason, I can't be sure about its limitations and restrictions. Many of them were removed between versions 1 and 2, including perhaps some of the ones I mentioned in this article.

It is not strictly true that Unix has _no_ record structure; when dealing with "raw" devices like terminals and tape drives, the line/record structure becomes apparent, as indeed it must for programs to work correctly when using these devices nonabstractly. Disk files, however, are completely unstructured, unless you count the st_blksize field in the stat structure in 4.2bsd and Ultrix. (If Berkeley hates VMS as much as they would have us believe, why do they keep putting so many VMSisms in 4bsd?)

Oh, and I hope this doesn't trigger another "Unix vs. VMS" debate. I use 'em both; I'm not even gonna mention which one I prefer :-).
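
The claim in the P.S. above, that aligned requests need no buffering while unaligned ones go through a small buffer, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the proprietary Tektronix library or DEC's RTL: blk_dev, blk_read, stream, stream_read and the 512-byte BLK are hypothetical, the "device" is an in-memory array that only transfers whole aligned blocks, the file length is assumed to be a multiple of BLK, and only the read side is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLK ((size_t)512)   /* hypothetical block/record size (an assumption) */

/* Stand-in for a lower layer that only transfers whole, aligned blocks. */
struct blk_dev {
    const char *data;       /* backing store, nblocks * BLK bytes */
    size_t nblocks;
};

/* Read `count` whole blocks starting at block `first`; returns blocks read. */
static size_t blk_read(struct blk_dev *d, size_t first, char *buf, size_t count)
{
    if (first >= d->nblocks)
        return 0;
    if (count > d->nblocks - first)
        count = d->nblocks - first;
    memcpy(buf, d->data + first * BLK, count * BLK);
    return count;
}

struct stream {
    struct blk_dev *dev;
    size_t pos;             /* current byte offset in the file */
    char bounce[BLK];       /* holds one block for unaligned requests */
    size_t bounce_blk;      /* block number currently held in bounce[] */
    int bounce_valid;
    unsigned long bounced;  /* blocks that had to go through bounce[] */
};

/* Byte-stream read on top of the block-only layer; returns bytes read. */
size_t stream_read(struct stream *s, char *buf, size_t n)
{
    size_t done = 0;

    assert(s != NULL && buf != NULL);
    while (done < n) {
        size_t blk = s->pos / BLK;
        size_t off = s->pos % BLK;
        size_t left = n - done;

        if (off == 0 && left >= BLK) {
            /* Aligned request: whole blocks go straight to the caller. */
            size_t got = blk_read(s->dev, blk, buf + done, left / BLK);
            if (got == 0)
                break;                      /* end of file */
            done += got * BLK;
            s->pos += got * BLK;
        } else {
            /* Unaligned request: fetch one block and copy part of it. */
            size_t take;
            if (!s->bounce_valid || s->bounce_blk != blk) {
                if (blk_read(s->dev, blk, s->bounce, 1) == 0)
                    break;                  /* end of file */
                s->bounce_blk = blk;
                s->bounce_valid = 1;
                s->bounced++;
            }
            take = BLK - off;
            if (take > left)
                take = left;
            memcpy(buf + done, s->bounce + off, take);
            done += take;
            s->pos += take;
        }
    }
    return done;
}
```

With a zero-initialized struct stream pointing at a device, a call such as stream_read(&s, buf, 2 * BLK) from an aligned position never touches bounce[], so the bounced counter stays at zero; only a request that starts or ends inside a block pays for the extra copy.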