Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!auspex!guy
From: guy@auspex.auspex.com (Guy Harris)
Newsgroups: comp.unix.internals
Subject: Re: Fundamental defect of the concept of shared libraries
Keywords: ISC i386 shared libraries
Message-ID: <8167@auspex.auspex.com>
Date: 3 Jun 91 17:33:20 GMT
References: <265@titccy.cc.titech.ac.jp> <8144@auspex.auspex.com> <276@titccy.cc.titech.ac.jp>
Organization: Auspex Systems, Santa Clara
Lines: 75

>:Yes, of course. Bnews is the real example showing significance of call
>:overhead.
>
>I cited B news as the real example showing significance of call overhead.

Umm, if call overhead is significant, inlining is a win, right?

The trick here is that in B news, the string comparison operation is
partially inlined by using a macro; that form of inlining works just
fine with shared libraries.
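
As a rough sketch of what "partially inlined by using a macro" can look
like (this is not copied from the B news source; STREQ_FAST and
count_matches are names made up for illustration): the first characters
are compared inline, and strcmp() is called only when they match.

```c
#include <string.h>

/*
 * Hypothetical macro, not the actual B news one: compare the first
 * characters inline and call strcmp() only when they are equal.
 * Most mismatches are rejected without a procedure call.
 * The arguments are evaluated more than once, so they must not
 * have side effects.
 */
#define STREQ_FAST(a, b) \
    (*(a) == *(b) && strcmp((a), (b)) == 0)

/* Count how many entries in a table are equal to the given key. */
static int
count_matches(const char *const *table, int n, const char *key)
{
    int i, hits = 0;

    for (i = 0; i < n; i++)
        if (STREQ_FAST(table[i], key))
            hits++;
    return hits;
}
```

The macro is expanded in the caller at compile time, so the only thing
that still goes through the shared library is the strcmp() call itself.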

>>There are two separate issues here, which you're mixing together:
>>
>>1) the issue of code that will run regardless of what its virtual
>>   address is, and that doesn't have to be modified to run at a
>>   different address;
>>
>>2) the issue of mapping the same physical page into different virtual
>>   addresses within different processes.
>
>I am not mixing them.

Yes, you're continuing to mix them.  See below.

>>I
>>sincerely *hope* nobody was claiming that the fact that you couldn't was
>>at *all* a major obstacle to implementing position-independent shareable
>>code objects!
>
>What you don't and I didn't understand is position-independent code is
>not necessary for shared libraries. Roughly-position-independent code
>is enough.

See, you're still mixing them!

The first issue is, as stated, the one of making code that runs
regardless of what address it's located at.  On most if not all of the
major architectures on which UNIX runs, that can be done, and that code
is *fully* position-independent - you could move it by as little as some
minimal amount (the actual amount depends on the alignment requirements
for various instructions).
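
A loose analogy in C, using data rather than code (real
position-independent code gets the same effect with PC-relative
references generated by the compiler; struct blob, blob_init and
blob_msg are hypothetical names): if internal references are stored as
offsets from the object's own start rather than as absolute addresses,
the object keeps working after being copied to any other address.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical illustration: a blob whose internal reference is an
 * offset from its own start, not an absolute address.
 */
struct blob {
    size_t msg_off;     /* offset of the message from the blob start */
    char   bytes[32];   /* storage that the offset refers to */
};

static void
blob_init(struct blob *b, const char *msg)
{
    strncpy(b->bytes, msg, sizeof b->bytes - 1);
    b->bytes[sizeof b->bytes - 1] = '\0';
    b->msg_off = offsetof(struct blob, bytes);
}

/* Resolve the reference relative to wherever the blob is right now. */
static const char *
blob_msg(const struct blob *b)
{
    return (const char *)b + b->msg_off;
}
```

Had the blob stored a `char *` pointing into itself, a copy at a new
address would still point back at the old location; that is the kind of
fixup that position-independent code avoids needing.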

In practice, on a system with address mapping, in order to share the
code it has to be put on page or segment boundaries; if it's put on
page boundaries, it can only be relocated by an integral number of
pages - but that has nothing to do with the way the code was made
position-independent.
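
The page-granularity constraint is just arithmetic on addresses.  A
minimal sketch, assuming a 4096-byte page (the size and the names
SKETCH_PAGE_SIZE, page_round_up and is_page_multiple are assumptions
for illustration, not anything a particular system defines):

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size; real systems vary */

/* Round an address up to the next page boundary. */
static uintptr_t
page_round_up(uintptr_t addr)
{
    return (addr + SKETCH_PAGE_SIZE - 1) & ~(uintptr_t)(SKETCH_PAGE_SIZE - 1);
}

/* True if moving code by `delta` bytes keeps it on a page boundary. */
static int
is_page_multiple(uintptr_t delta)
{
    return (delta & (SKETCH_PAGE_SIZE - 1)) == 0;
}
```

The restriction checked here comes from the mapping hardware, not from
the code being mapped.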

The second issue is the one of making the code be cacheable if you map
it in at different addresses on a machine with a virtually-indexed cache
(whether virtually or physically tagged; both can deal with aliases,
although virtually-tagged caches have to work a little harder at it), or
making it shareable without having to shuffle the page map on a context
switch on a machine with inverted page tables.  That issue means that
the alignment requirements on the code are stricter, e.g. aligning all
the virtual addresses so that a given location lands on the same cache
line in all address spaces.
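
One way to picture the stricter alignment, as a sketch only: assume a
direct-mapped, virtually-indexed cache of 64KB (the size and the names
SKETCH_CACHE_SIZE, cache_index and aliases_are_aligned are made up for
illustration).  Two virtual addresses for the same physical location
land on the same cache line when they are equal modulo the cache size.

```c
#include <stdint.h>

/* Assumed for illustration: a 64KB direct-mapped cache. */
#define SKETCH_CACHE_SIZE (64u * 1024u)

/* The part of a virtual address that selects the cache line. */
static uintptr_t
cache_index(uintptr_t va)
{
    return va & (SKETCH_CACHE_SIZE - 1);
}

/* True if two mappings of the same location select the same line. */
static int
aliases_are_aligned(uintptr_t va1, uintptr_t va2)
{
    return cache_index(va1) == cache_index(va2);
}
```

Under these assumptions, mapping a library at 0x10000 in one process
and 0x50000 in another passes the check; 0x10000 and 0x52000 does not,
even though both are page-aligned.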

If you *don't* do that, the code will still *work* just fine, because
the code is fully position-independent, not "roughly
position-independent"; it'll just run slower because you'll have to mark
it non-cacheable.

There may well be architectures on which the code can't be made
fully-position-independent, i.e. such that it can't be made to run *at
all* unless the position of the code is only adjusted by e.g. a segment
size; however, that's not true of the 68K, the 88K, SPARC, MIPS, the
386 and up, the VAX, or the IBM 3[679]0 - I didn't bother buying the WE32K
or i860 S5R4 ABI books, so I didn't see whether they do
fully-position-independent code or not.  Shared libraries could probably
be done on such an architecture, assuming the alignment requirements
aren't *too* strict.  However, given that the high-volume architectures
don't have that problem, and given that I don't work on any low-volume
architectures that have that problem, I didn't spend any energy worrying
about it.