Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!rutgers!apple!voder!pyramid!nsc!taux01!taux02!chaim
From: chaim@taux02.UUCP (Chaim Bendelac)
Newsgroups: comp.arch
Subject: Re: Software Distribution
Summary: We are optimistic, aren't we?
Message-ID: <127@taux02.UUCP>
Date: 2 Sep 88 07:03:00 GMT
References: <891@taux01.UUCP>
Organization: National Semiconductor (IC) Ltd, Israel
Lines: 105

In article <891@taux01.UUCP>, chaim@taux01.UUCP (I) wondered:
> ...if there is no room for another standard layer, specially
> designed for software DISTRIBUTION. Imagine an Intermediate Program
> Representation ... language independent, architecture independent...

I asked:
> 3. What are the main obstacles? Economical? Political? Technical?
> 4. What are the other advantages or disadvantages?

Below is a [strongly edited] summary of the discussion so far. It seems to
me that the problem-solvers outvoice the problem-raisers. I somehow have the
feeling that we have not covered many of the problems. Let me try to be more
specific:

Let the goal be an IR for distribution purposes, that covers "traditional"
architectures (you know what I mean) WITHOUT giving one a clear advantage
over others (a particular object-format is out), that covers "popular"
languages (C, Cobol, Modula-2, Pascal, Fortran, perhaps Ada and Lisp), that
lives under a REAL unix-standard (AT&T's, or OSF's or whoever's), and that
does not attempt to solve "system-code" - ONLY "application programs"
(rule-of-thumb: if you cannot write the program in any language but C the
program probably disqualifies). Squeezing the last 1% of performance out of
the system is NOT a goal, but the IR should be optimizable, both before and
after distribution. "Tokenizing the source" is fine, if that allows me to
write ONE single code generator for all these language translators out
there, with a 100%-proof semantic definition of the IR. I want a R E A L
separation.
I want an "IR code-generator validation suite" to test my code generator, so
that applications can be assured their stuff runs on my machine/architecture.
What are the constraints?

-- Chaim Bendelac (National Semiconductor Corporation)
----------------------------------------------------------------------------
Summary of current status of discussion:

> From: henry@utzoo.uucp (Henry Spencer) it's tricky to build such a
> representation in which the front end doesn't need to know *anything* about
> the machine. Things like data-type sizes

= From: henry@utzoo.uucp (Henry Spencer) Is the layout of structs in memory
= decided before or after the IR is generated? What about "sizeof"? "varargs"?
= you will end up with a tokenized version of the source.

> From: bpendlet@esunix.UUCP (Bob Pendleton) you can get away with specifying
> the data radix and the minimum number of digits required. "short int x;"
> can be translated into "x: static allocated signed binary min 16". The
> translator would translate declarations into constraints on the valid
> representations of the declared items.
> The layout of structs must be done by the machine specific code
> generator. "sizeof" becomes a symbolic expression evaluated by the
> code generator. In one system I wrote, all data size computations were done
> in the linker. Worked out very well. The machine independent code for a
> varargs call could look something like this:
>     vararg_block
>         code for arg 1
>         code for arg 2
>         :
> It's important to remember that this intermediate language must be
> usable by ALL programming languages, not just C.
> The problems are just not that big.

= From: cik@l.cc.purdue.edu (Herman Rubin) I could probably list over 1000
= hardware-type operations which I would find useful. which algorithm to use
= would be dependent on the timing of these operations. one cannot optimize
= a program without knowing the explicit properties of the target machine.
= We must face the fact that there cannot be efficient portable software.

> From: mrspock@hubcap.UUCP (Steve Benz) I think the real bugaboo here will
> be system calls and the like. Granted that they are theoretically identical,
> but in reality, they're not so.

= From: aglew@urbsdc.Urbana.Gould.COM In a recent UNIX World Omri Serlin
= (I think) mentioned that OSF is considering something with a name like
= "Architecture Independent Exchange Format" as a challenge to the plethora
= of ABIs in the AT&T/SUN world.

> From: henry@utzoo.uucp (Henry Spencer) in the real world, the part of the
> compiler that does not make hardware-dependent decisions is the easy and
> small part. What this would amount to, almost, is a sort of encrypted source.
> And about whether the programmer was competent enough to make the code
> really portable. Don't forget that condition.
> This new flexibility also opens the door to a whole new range of bugs,
> since the code can now be run on machines which the author never even
> compiled it on.

= From: dick@ccb.ucsf.edu (Dick Karpinski) the Gnu C's Register Transfer
= Language (gcc's RTL) does look like a tokenized version of the source.
= I would foresee a sort of validation suite to test both the gcc backend (with
= the machine description) and system calls on the target system.

> From: chase@Ozona.orc.olivetti.com (David Chase) RTL isn't a UNCOL, no. I'm
> afraid these make it rather non-universal.

= From: rpw3@amdcad.AMD.COM (Rob Warnock) Some companies have made major
= advances in the art of emulation of one CPU on another, particularly when
= the emulated CPU is the IBM PC. If one designed a virtual "machine" that was
= specifically easy to emulate this *might* be a suitable form for "portable"
= object programs (as contrasted with some "universal intermediate form").

> From: mash@mips.COM (John Mashey) The R3000 is pretty easy to convert; the
> hardest machines to convert FROM are those with condition codes.
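
To make Bob Pendleton's suggestion in the summary above a little more
concrete, here is a minimal sketch of an IR that carries declarations as
constraints ("signed binary min 16") and leaves "sizeof" symbolic until a
machine-specific code generator sees it. Everything in it is an assumption
made up for illustration and comes from none of the systems mentioned: the
class names, the two targets, widths that are multiples of 8 bits, and the
natural-alignment rule for struct layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntConstraint:
    """IR declaration such as 'signed binary min 16' (hypothetical form)."""
    signed: bool
    min_bits: int


@dataclass(frozen=True)
class StructDecl:
    """IR struct: ordered (name, IntConstraint) pairs, no offsets yet."""
    fields: tuple


@dataclass(frozen=True)
class SizeOf:
    """Symbolic sizeof; the front end never turns it into a number."""
    decl: object


@dataclass(frozen=True)
class Target:
    """Hypothetical target: integer widths it supports, ascending, in bits."""
    name: str
    int_widths: tuple


def pick_width(constraint, target):
    """Smallest target width (in bits) that satisfies the constraint."""
    for width in target.int_widths:
        if width >= constraint.min_bits:
            return width
    raise ValueError(f"{target.name} has no integer of {constraint.min_bits} bits")


def layout(struct, target):
    """Back-end struct layout; assumes each field is aligned to its own size."""
    offset, max_align, offsets = 0, 1, {}
    for name, constraint in struct.fields:
        size = pick_width(constraint, target) // 8
        offset = (offset + size - 1) // size * size  # round up to alignment
        offsets[name] = offset
        offset += size
        max_align = max(max_align, size)
    total = (offset + max_align - 1) // max_align * max_align
    return offsets, total


def evaluate_sizeof(expr, target):
    """Resolve a symbolic SizeOf to bytes for one particular target."""
    if isinstance(expr.decl, StructDecl):
        return layout(expr.decl, target)[1]
    return pick_width(expr.decl, target) // 8


# Two made-up targets: one byte-addressed, one with only 32-bit integers.
byte_machine = Target("byte_machine", (8, 16, 32))
word_machine = Target("word_machine", (32,))

short_x = IntConstraint(signed=True, min_bits=16)  # "short int x;"
pair = StructDecl((("a", IntConstraint(True, 8)),
                   ("b", IntConstraint(True, 16))))

for target in (byte_machine, word_machine):
    print(target.name,
          evaluate_sizeof(SizeOf(short_x), target),
          evaluate_sizeof(SizeOf(pair), target))
```

The same distributed IR gives different sizes and offsets on each target,
which is the separation being asked for. It also illustrates Henry Spencer's
objection: code that silently depends on a particular size or layout would
break once the numbers are chosen after distribution.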
-----------------------------------------------------------------------------
-- chaim@nsc