Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!bbn!oberon!bloom-beacon!mit-eddie!rutgers!bellcore!faline!sabre!gamma!ulysses!sfmag!sfsup!shap
From: shap@sfsup.UUCP (J.S.Shapiro)
Newsgroups: comp.arch,comp.os.misc
Subject: Re: Shared libraries (Was: Re: Big Programs Hurt Performance)
Message-ID: <2114@sfsup.UUCP>
Date: Sat, 26-Sep-87 16:23:46 EDT
Article-I.D.: sfsup.2114
Posted: Sat Sep 26 16:23:46 1987
Date-Received: Wed, 30-Sep-87 07:26:01 EDT
References: <6886@eddie.MIT.EDU) <2501@xanth.UUCP> <2067@sfsup.UUCP> <443@devvax.JPL.NASA.GOV>
Organization: AT&T-IS, Summit N.J. USA
Lines: 103
Summary: Answer to why one wants unshared libraries
Xref: mnetor comp.arch:2417 comp.os.misc:251

In article <443@devvax.JPL.NASA.GOV>, des@jplpro.JPL.NASA.GOV (David Smyth) writes:
> In article <28957@sun.uucp> guy%gorodish@Sun.COM (Guy Harris) writes:
>
> How is COPYING the old shared libraries into executables which need
> them ANY savings in disk usage?  It seems it will be a DEAD LOSS:
> core (bigger executable images); virtual memory (it gets used up even
> if paged out); AND disk space (the executable file gets bigger for EVERY
> program which needs the unshared library).
>

I think you missed the idea. A shared library is usually not a single
monolithic object, and the incompatibility with an old version is
usually temporary. Since the libraries only provide for *unresolved*
references, it suffices as a temporary fix to haul only the problematic
object out of the old library for inclusion in your code, and continue
to use the new shared library. That is, you don't have to use *all* of
the old library.

Two things mitigate this. First, changes in libraries are almost always
bug fixes or compatible with the documentation. If you depend on a bug,
you really *do* deserve what you get, particularly given that most
companies that produce compilation systems provide workaround lists.
If you have not been following the docs, you also deserve what you get.

The other possible case is a major upheaval in the compilation system,
as will tend to happen with the forthcoming batch of ANSI C compilers in
the market. ANSI has changed C a lot. In these cases you need to do
substantial rework anyway, and linking with the old objects is a way to
get a working interim product to your customers while you provide a real
solution.

Yes, in the short term it is a lose from the standpoint of space if you
have to do this for a lot of routines; however, disk is cheaper than
nonproductivity, and on a temporary basis most customers won't object.

> Why EVER have unsharable libraries???

There are many architectures out there which don't support shared
libraries (particularly position independent ones) gracefully. Having a
shared library means reserving a good sized chunk of your address space
for each shared library you anticipate, and it becomes a fairly
difficult administrative problem to parcel out chunks of the address
space to your VARs. On many architectures, position independent code
means a performance hit of 20% (or more), and only recently have
advances in hardware technology made this acceptable.

It's a tradeoff. Many architectures can't do shared libraries at all,
and any compilation system that wants to deal with these architectures
*as well as* the newer architectures faces a difficult problem.

> Why EVER have libraries specifically linked to an executable???

See above; I'll deal with the specific claims below:

>	a) If it is an application which makes repeated calls
>	   to a library, the FIRST invocation may be slower, but
>	   all following invocations can be VERY CLOSE to the same
>	   speed [Message/Object Programming, Brad J. Cox, see
>	   table 1].

Well, this isn't really a win. There are basically two techniques for
making this hack work.
These are:

(1) completely relocate the executable when you load it into core to
    execute it

(2) come up with a backpatching scheme such that the first time you
    call a function from any given place, some intermediate glue
    examines the CALL statement and backpatches the *real* function
    pointer into place.

Option (1) is debatable: that can be a lot of relocation, and if your
binary is big the relocation takes a long time. Whether or not this is
a good choice depends on how many times you need to fire up the binary,
how big it is, and how much disk space it saves you to use the shared
libraries. It also makes efficient paging hard (see below).

Option (2) is very difficult to do on many architectures, requires
careful code generation, and prevents taking advantage of span-dependent
instructions for calls. This has its own impact, and it is potentially
sizeable.

>	b) Speed Critical Applications probably want to be vectorized,
>	   and I would think reducing the competition for core via
>	   shared libraries would be a BIG win if swapping is reduced
>	   even a little bit (I don't know much about vectorized
>	   algorithms, I only work on these archaic Suns, Vaxen, and
>	   such Von Nueman rubbish :^) ).

Consider that both methods require modifying text pages, and this means
that you have to reserve space for these pages in your paging area.
This prevents you from paging in from the text portion of the original
program file. Shared libraries tend to be small sets of core
facilities. Chances are many more pages will reference them than there
are in the shared library, and this hurts you in swap area.

Note that to make this work you need *writable* shared text, which
opens a whole other can of worms.

There is a technique which avoids all this: have an indirection table
and a directory in each library, or a well-known-globals list, as
someone suggested, but this implies a remarkable performance hit.
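
For concreteness, here is a rough C sketch of the indirection-table
idea, with each slot filled in on first use. It is an illustration
under stated assumptions, not how any particular compilation system
does it: the routine names (lib_add, lib_mul), the slot numbers, and
the lib_call entry point are all made up. Note that it patches a
pointer in writable *data*, not the CALL instruction in the text, so it
is a stand-in for option (2) rather than an instance of it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "library" routines; the names are illustrative only. */
static int lib_add(int a, int b) { return a + b; }
static int lib_mul(int a, int b) { return a * b; }

typedef int (*lib_fn)(int, int);

/* Hypothetical slot numbers a client would be compiled against. */
enum { SLOT_ADD, SLOT_MUL, SLOT_COUNT };

/* Counts trips through the slow path, to show it runs once per slot. */
int resolve_count = 0;

/* The "directory": where each routine really lives in this library. */
static const lib_fn directory[SLOT_COUNT] = { lib_add, lib_mul };

/* The indirection table: writable data, NULL until a slot is bound. */
static lib_fn table[SLOT_COUNT];

/* Slow path: look the routine up in the directory and patch the slot. */
static lib_fn resolve(int slot)
{
    resolve_count++;
    table[slot] = directory[slot];
    return table[slot];
}

/* Every call goes through the table: one extra load, one test, and an
 * indirect call, even after the slot has been bound. */
int lib_call(int slot, int a, int b)
{
    assert(slot >= 0 && slot < SLOT_COUNT);
    lib_fn fn = table[slot];
    if (fn == NULL)
        fn = resolve(slot);
    return fn(a, b);
}
```

The first lib_call(SLOT_ADD, ...) takes the slow path and later ones do
not, which is the "first invocation may be slower" behaviour quoted
above. The text pages stay clean, but every call still pays for the
table lookup and the indirect call, which is where the performance hit
comes from.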

In short, it ain't all as easy as it sounds, which is why most
compilation systems still don't support it at all.

And that is why you want non-shared libraries.

Jon Shapiro
AT&T Information Systems