Path: utzoo!attcan!uunet!jarthur!ucivax!orion.oac.uci.edu!usc!cs.utexas.edu!fletcher From: WALLI%7178.gm@hac2arpa.hac.com Newsgroups: comp.std.unix Subject: The Context for LIS of POSIX Message-ID: <13460@cs.utexas.edu> Date: 11 Oct 90 01:28:09 GMT Sender: fletcher@cs.utexas.edu Lines: 521 Approved: fletcher@cs.utexas.edu (Guest Moderator, Fletcher Mattox) X-Submissions: std-unix@uunet.uu.net Submitted-by: WALLI%7178.gm@hac2arpa.hac.com The Context for Programming Language Independence for POSIX Stephen R. Walli - walli@osmcl1.gm.hac.com EDS of Canada, Ltd. 1 INTRODUCTION Programming Language Independent Specification (LIS) of POSIX has become a hot topic within the IEEE P1003 Working Groups and ISO WG15 (POSIX). Depending on one's point of view, it will either make the POSIX family of standards more robust and usable or make them completely unusable while seriously delaying the process. What I hope to accomplish here is to present all of the relevant concerns and information in one place, so as to provoke ideas and discussion which will prove fruitful in the upcoming P1003 Seattle meeting and the subsequent WG15 meeting on Orcas Island. The standard disclaimers apply. All views expressed are the author's and do not necessarily represent the opinions of the IEEE P1003 Working Group members, ISO WG15 or the author's employers. I would like to thank Paul Rabin for reading this through, catching some of my oversights and helping me clarify some of my statements. POSIX is a registered trademark of the IEEE. UNIX is a registered trademark of AT&T. 2 WHAT EXACTLY IS THE POINT OF LIS I will not provide any of the historical reasons/arguments/discussions between ISO and the IEEE since all I know is hearsay, and I would not want to raise anyone's ire if I've misunderstood any of the facts. It also serves no real purpose in accomplishing the task at hand to re-iterate these discussions. It is sufficient to say that the direction to accomplish the LIS work is coming from ISO and TCOS-SS has agreed to the work. The directive is to write the POSIX interfaces in a programming language independent way, such that the functionality and behaviour are completely described, but no language semantics are introduced. This then frees up language bindings groups to map to the interface specification in a way most natural for their particular language. Describing the service in a language independent manner also serves to provide a more rigorous definition of the service [1]. Many will argue that POSIX is a C language standard interface to UNIX system services, and all of the noise about any other programming language binding or any other operating system is immaterial. I've both seen this and heard it voiced in the occasional dark corner at P1003 Working Group meetings. While POSIX' roots are most certainly C interfaces to UNIX system Page 2 services, the market has driven POSIX beyond those roots. There are many other language groups which have a vested interest in writing portable applications which want access to operating system services in a portable way [2]. The US government commitment to Ada, and the amount of government funded work in FORTRAN has created the need for two POSIX Working Groups to produce their respective bindings to 1003.1 services. As the commercial market interest in Open Systems grows, I have no doubt we will eventually see a COBOL binding. I would be very surprised if there isn't someone at IBM already working in this direction. The fact remains that the LIS of POSIX is required work for the international acceptance of POSIX, and is here to stay. 3 WHAT ARE PAUL AND STEPHE DOING? Paul Rabin and I are working on the methods document [1] for producing a language independent specification of POSIX. Paul found time to do all the real work of compiling and editing the document, while I acted as critical reviewer and chief nit-picker. The work is based on documents received from ISO/IEC JTC1/SC22/WG11 which is defining a set of methods for creating abstract, programming language independent procedure specifications [3] [4]. The method's objectives are: o to meet the ISO requirement of producing a LIS of POSIX, while adhering to their guidelines on developing these specifications. o to facilitate the development of language bindings from base LIS. o to facilitate the development of base LIS which are sufficiently robust so as to ensure a common recognizable functionality in the bindings. Specific non-goals include: o Interoperability between modules written in different languages bound within the same executable image, or interoperability between applications written in different languages using common services, including data interchange. o Incorporating formal description techniques. o Ensuring the portability of language or binding implementations. o Providing a machine translatable language-independent binding description language. Page 3 We discovered that interoperability between applications written in different programming languages cannot be ensured within the current scope. The general formula presented by WG11 for producing language-independent procedure specifications is to model the interface using abstract data types, then each binding defines its mapping of real data types to these abstract types. Interoperability fails when certain abstract opaque types, process id or file descriptor for example, are mapped by different languages to different types. What may be effectively mapped to a pointer in one language cannot be supported by another language which does not understand pointers. The second language must map the opaque type differently, to the detriment of interoperability. In retrospect, this is not unreasonable. POSIX' goal is to ensure the portability of an application using operating systems services across multiple implementations at the source level. It makes no effort to ensure interoperability between programming languages, nor should that be within the scope of the standard. The method defines a model which contains data types, procedures on those types, and constants. The objects that the system services act upon are modelled by their abstract types. The procedures (services) become the operations on the data types. Operating system service interfaces are presented as an abstract procedure, with input parameters, output parameters and the notion of a completion status. Note that completion status does not refer to a returned value, but could just as easily refer to a raised exception, a signal, a return value of a function, an output parameter of a procedure, or any other entity you can imagine. The methods document goes on to suggest guidelines for both base standard and language binding developers. The methods document has been updated since Danvars and will be presented again in Seattle. It will be put to a mock ballot sometime after Seattle. Donn Terry is managing the ballot list. One interesting example of a similar informal method that I've seen recently is the circulation of ORKID Draft 2.1 [5] in a LIS form with a C binding. ORKID isn't as complex as POSIX, but the draft serves as an interesting and complete example. A C binding accompanies the draft as an appendix, formatted tersely as a C header file. I would be very interested to see a FORTRAN or Ada binding to the draft, if one exists. The draft has the same problem with language interoperability that we discovered with our method, in that there is considerable room for choosing language specific data types to match the opaque types. They go as far as to allow implementations within a language to specify their own data types. I haven't spent enough time with the draft to be able to comment on whether this hurts networked applications, or whether the procedure interface deals with this behind the application developer's back. It is still a valuable example. Page 4 Paul is managing a mailing list for LIS related issues and discussion. Messages for distribution to the whole list should be sent to posix-lis@osf.org or uunet!osf.org!posix-lis. Requests for updates to the list should be sent to posix-lis-request@osf.org. 4 PEOPLE ISSUES There are a number of people issues surrounding the LIS, which should be understood, because the LIS sometimes becomes an emotionally debated topic. An effort has been made to state them unbiasedly and to completely avoid any of the finger pointing arguments which sometimes occur. 4.1 People Issue #1 Many people have devoted a considerable effort into building the current 1003.1 standard and the draft documents which are balloting or near ballot. There is ownership and sweat built into all of the documents. A perception exists that the ISO mandate to produce LIS of the services destroys these documents. It does not. There is a desire to change the documents to produce the best possible standard, yet backwards conformance to the current work has always been a goal. These documents will change in future revisions of POSIX. The knowledge gained and information content is valid. We are discussing, however, an effort that is far beyond simple reformatting of the current documents. ANY significant change at this point will inevitably meet with resistance no matter how it's presented. This whole issue is very analogous to 1003.3 requirements for providing test assertions at this point. At the last P1003 meeting in Danvars, I had an opportunity to spend time in the .1, .5 and .9 rooms, (as well as my home in .4) discussing LIS issues. I think I'm beginning to understand how Roger Martin (P1003.3 chair) feels any time he shows up in another working group to explain test assertion requirements. 4.2 People Issue #2 Working Groups that thought they had ballotable documents are being asked to fulfil additional requirements. These requirements entail considerable extra effort. Base standards groups are being asked to provide base LIS of their services, and the C language binding to the LIS. Bindings groups are being asked to provide reformatted bindings to a base LIS which doesn't yet exist. At the same time test assertion requirements are being presented. Both of these areas are perceived as being tedious and "boring". One Working Group actually went on record saying they felt they would lose membership over these issues. Page 5 For this work to be worthwhile it must be done completely and accurately. This will require exacting effort. Nothing like the "exciting" work of arguing the functionality of a family of services. This comes at the perceived end of work as a draft document prepares to go to ballot. 5 THICK DOCUMENTS OR THIN - A USABILITY ISSUE One of the debates currently being argued in P1003 is whether the individual language bindings are thick documents or thin. In the thick document scenario, there is a base document which describes the abstract service interfaces, and each binding document is a thick standalone document which will repeat the functional descriptions, adjusted for the particular language. This camp's followers are programmers with real experience in receiving a standard on their desk and having to use it as a programming tool. The base document becomes a tool only for language binding writers. The thin document scenario has a base document describing abstract service interfaces, but each thin binding will only include language specific information. All appropriate functional descriptions are pointed to in the base document by reference. This camp is the "Standards aren't for People" crowd. Standards are only meant for conformance testing for procurement. If a programmer actually had to use the binding standard, they would also require the base standard and would then work with a finger stuck in each book. There are actually two separate issues hidden in the thick/thin binding debate. The first issue is whether a binding is allowed to repeat material contained in the base LIS. The second issue concerns whether a binding provides a direct one-to-one mapping to the base, or whether it can be creative and more directly map to the semantics of the language being bound. For the record, the P1003.5 (Ada) Working Group decided early in their history to create a standalone document appropriate to the Ada language. The P1003.9 (FORTRAN) Working Group chose to create a binding which points to the "base" document, mapping its services one-to-one as closely as possible. We are working under the assumption that ISO ascribes to the thin binding camp. Semantically, standards do not overlap. Standards are allowed to refer to other standards. There are genuine and realistic concerns with synchronizing standards documents if many documents contain overlapping material. For the record, I'll voice the following suggestion. The STANDARDS themselves will be individually drafted and balloted documents as in the thin binding camp. The LIS standard comes first. The binding standard comes second. However, instead of merely pointing to a another document, or including its own interpretation of the contents of the other standard, the text of the LIS is directly embedded. The Page 6 embedded LIS text is clearly delineated so as to be clearly separate from the binding text, and only the binding text is ballotable in the draft binding document. This would hopefully solve the usability issue put forward by the one document camp. Think of it as a documentation analogy to software development. Instead of subroutine calls pointing elsewhere, there are already expanded "macro" calls to "speed" the understanding. (Publication synchronization concerns become source control concerns similar to different applications referencing the same "macro" library.) It would simplify the synchronization issue. Ultimately I believe the publication should be usable by mere mortals. 6 THE CASE FOR RIGOROUS FORMAL METHODS AND THE CASE AGAINST Another hot debate is the level of rigor required by the LIS. Our understanding is that natural language descriptions of the services are sufficient for the current LIS of POSIX. It is explicitly stated in the draft methods document that we are not pursuing a formal method of specification for the standard. There seems to be a lack of experience and standardization of formal methods within the standards community. Little work has been done to formally specify standards. (Now that I've publicly made this statement, I'm sure I'll be accosted by everyone next week who has seen anything even remotely looking like a formal standard.) I base this statement on three P1003 meetings worth of LIS BOFs where everyone is quick to suggest their favourite formal method, but there is never anyone who has taken the trouble to bring an example of it used to specify a standard. Please do. The author welcomes all related information. A recent issue of IEEE Software had a very reasonable article on the use of formal methods in the standards process [6]. This work came out of a working group formed by the British Computer Society (BCS) to address the lack of informed opinion on the matter. They outline briefly the reasons for using formal methods, a few examples of experiments with using formal methods on standards, and finish with a set of guidelines. These guidelines are by far the strongest part of the article. They also refer briefly to an ISO three-phase plan [7] to bring formal methods into the standards process. 1. Phase 1 has the use of formal methods restricted due to lack of experience and expertise. Work is done in parallel on a formal specification of the standard, which hopefully contributes to the robustness and clarity of the standard, and the results are published as a SEPARATE report. Page 7 2. Phase 2 has seen the knowledge and experience base expanded in the use of formal methods in standards creation, and the work proceeds in parallel and is sufficient to be published as an informative annex. (Note: This is not a balloted NORMATIVE one, but an unballoted informative one.) 3. Phase 3 sees the standards developing body well versed in formal methods and the formal description is the standard with a natural language commentary. One thing that the article warns against is the retrofitting of formal methods to an existing standard, because it can often demonstrate the lack of conceptual integrity of the original work. Additionally they recommend choosing an appropriate formal method to express the standard, depending on such factors as the proposed standard's content, mathematical sufficiency and accessibility to the standards forming group. The primary formal methods that have been suggested are Z, and VDM. Z actually has a number of interesting examples to consider. Recent work has been done to produce a formal specification of P1003.1 using Z, and it was reported upon by Martin Kirk [8]. The report concluded that while the work was useful at finding "weasel worded" areas of the standard, it requires a large effort to continue this work. Several other problems exist as well. Some of these problems had to do with the complexity of POSIX, and its deliberate areas of ambiguity. Other problems encountered had to do with the Z notation and the choice of model. Indeed this raises an area of concern with how far formal methods should be applied to POSIX. POSIX has deliberate areas of ambiguity, "weasel words", and unspecified nature. This is required so as to allow a maximum number of implementations, to not restrict implementations in unnecessary ways or force implementations. POSIX is a standard for portable operating system service interfaces. It is not the specification of an operating system [9]. There are also times when weasel words are the only way to arrive at consensus. Another interesting example of Z in this area is a recent article by J. Michael Spivey on using Z to specify a real-time kernel [10]. This is the specification of a minimal kernel and not an interface to it. Spivey discusses several deficiencies in his specification of the kernel, and addresses all of them at the risk of making his specification more complex and less understandable. He does conclude that using a formal specification is a valuable and beneficial tool for answering questions about the kernel, but he then "suggests that the idea of using a formal specification as a complete contract between implementer and user is not very helpful." [11] Indeed it has been sensibly pointed out that the use of formal methods is beneficial to aiding understanding about the object being specified, but that they need not be a complete specification [12] Page 8 [13] [14]. This certainly fits in with the ISO three-phase approach to introducing formalism into standards. They never require a complete formal specification without natural language commentary. The Vienna Development Method (VDM) is also frequently suggested as a candidate formal method. VDM has a similar flavour to Z but does not have a facility similar to Z's schema calculus to allow simple specifications to be built into complex ones. This summarises all the current discussion I've discovered to date concerning actual formal methods to specify a standard POSIX interface. 7 A BRIEF NOTE ON TESTING AND CONFORMANCE ISSUES There are several testing issues about LIS of POSIX no matter how formal the specification method. The following questions have been raised. How does one "test" a language independent specification? At first glance, one doesn't. Test assertions are merely done at the binding level to allow implementations to demonstrate conformance. This certainly needs to be done. But can formal or natural language assertions be made about the LIS, which can be tested manually by argument and inspection, and which can then act as a basis set of assertions used when building language binding assertions? >From a different point of view, is there a set of assertions that can be made about the LIS which can help determine how good a binding reflects the base? How do bindings conform to the base? If a binding becomes a one-to-one mapping of the base LIS, then they conform directly. If they do not completely map the binding or map it differently, how is conformance measured? All of these questions need some thought, and I hope this generates some creative feedback for next week. 8 SUMMARY I hope I have presented completely and with as little bias as possible the issues surrounding the language independent specification of POSIX. Hopefully at the BOF gatherings at P1003 and the WG15 Ad Hoc, many of these issues can be solved to everyone's satisfaction, with a care towards the tremendous effort which has gone on to date at building POSIX. I look forward to seeing everyone there. Page 9 9 REFERENCES [1] Paul Rabin and Stephen Walli, "Draft TCOS-SS Programming Language Independent Specification Methods", Draft 1, 15 July, 1990. [2] Dominic Dunlop, comp.std.unix Volume 20, Number 110, USENET, 5 July, 1990. [3] "Proposed DTR 10182 on:Information Processing Systems - Guidelines for Language Bindings", ISO/IEC JTC1/SC22 N754, International Standards Organization, Geneva. [4] "Common Language-Independent Datatypes: Working Draft #3", ISO/IEC JTC1/SC22/WG11 N162, International Standards Organization, Geneva. [5] ORKID Working Group, "ORKID - Open Real-time Kernel Interface Definition, Draft 2.1", August 1990. [6] David Blyth, et al, "The Case for Formal Methods in Standards", IEEE Software, Volume 7, Number 5, September, 1990. [7] "JTC1 Statement of Policy on Formal Description Techniques", ISO/IEC JTC1 N145, and ISO/IEC JTC1/SC18 N1333, International Standards Organization, Geneva, 1987. This reference was pointed to in [6] and I have not yet been able to obtain a copy. [8] Martin Kirk, "Z Specification of P1003.1", ISO/IEC JTC1/SC22/WG15 N115, International Standards Organization, Geneva, September, 1990. [9] Donn Terry, "Suggested Response to JTC1/SC22/WG15 N115", Document SC22/WG15 US TAG N146. [10] J. Michael Spivey, "Specifying a Real-Time Kernel", IEEE Software, Volume 7, Number 5, September, 1990. [11] Ibid. p.27 [12] J. Michael Spivey, "The Z Notation : a reference manual", Prentice Hall International, 1989 [13] Anthony Hall, "Seven Myths of Formal Methods", IEEE Software, Volume 7, Number 5, September, 1990. [14] Jeannette M. Wing, "A Specifier's Introduction to Formal Methods", Computer, Volume 23, Number 9, September 1990. Volume-Number: Volume 21, Number 197