Path: utzoo!attcan!uunet!samsung!usc!orion.oac.uci.edu!ucivax!gateway
From: bnrgate!bnr.ca!pww@uunet.uu.NET (Peter Whittaker)
Newsgroups: comp.protocols.iso.x400
Subject: Re: DATA Compression and X400 standards
Message-ID: <1990Oct30.162942.11200@bnrgate.bnr.ca>
Date: 30 Oct 90 19:27:02 GMT
References: <Qb=4CfO00VADA1N41e@andrew.cmu.edu>
Organization: Bell-Northern Research, Ltd., Ottawa, Ontario, CANADA
Lines: 92
Approved: usenet@ICS.UCI.EDU
x-attn: jns
ReSent-To: mhsnews@ICS.UCI.EDU

In article <Qb=4CfO00VADA1N41e@andrew.cmu.edu> ms6b+@andrew.cmu.edu (Marvin Sirbu) writes:
>Shannon's theory of
>information says that the more you know about the message set, the more
>effectively you can compress it.  Thus, if I send a multi-media message,

(a bit deleted)

>While it may appear simpler to use a single compression scheme at a layer
>below the application, such an approach may sacrifice substantial
>potential efficiency gains in transmission.
>
>

Can't help but agree that compression should be higher in the stack,
and for a variety of reasons (number 3 is the most important, IMHO).

1) As Marvin states, you get better compression when you know what you
   are compressing.  Compressing data of unknown type/origin seems kinda
   silly:  it could already be in its most space-efficient form, and
   (please correct me if I'm wrong) compressing it again could lead to
   pathological behavior where the 'compressed' data is bulkier than
   the original.

2) Compress higher in the stack, and the lower layers have less data to move,
   i.e. less memory to manipulate, less room for transmission/reception/
   allocation/deallocation errors.

3) Compression is an example of manipulation of user data:  from the OSI
   purist's perspective (I'm a purist on odd-numbered days - Happy Hallowe'en)
   the last (lowest numbered) layer to touch user data is the presentation
   layer (layer 6).  Once it gets further down, the OSI stack assumes it's
   safe to ship.  It can't assume that it's in best form to ship, but it's
   bound to heed the 'prerogative' of layer 6:  that's where ASN.1 is made,
   and where the BER are applied.

   Not to mention that when data is compressed, it has to be uncompressed
   (trivial, right?).  But how does the other side of the connection know
   data is compressed?  It seems to me that compression vs non-compression
   would be part of the context negotiations at session establishment:
   the initiator and responder would have to agree on what set of compression
   routines to use, if any, and how to indicate to one another that compression
   had been applied.

   My understanding of layers 4 and below (I work on the upper 3-4 layers,
   depending on how you define the application stack) is that peer-to-peer
   communication there does not provide any services for such negotiation.
   (Please correct me if I'm wrong....).

   There are some more practical considerations too (NOTE:  OSI purists
   may go into conniption fits :@} ).

   The presentation layer (layer 6) is responsible for translation between
   network independent and host specific data representations.  It is also
   the last layer that 'knows' what data types it's handling (all that layer
   5 and below see are bits).  So, the presentation layer is the last layer
   that can make a determination as to the most efficient compression
   routine to be applied to a certain body type (or generic data type).

   Furthermore, when compressing the data, are you compressing to save
   local disk space and memory, or to save network resources?

   In the former case, X.400 (at layer 7) could call a presentation service
   element and ask it to perform some compression on a body type before
   transmission.  For example, a store-and-forward node could receive the
   data, identify the data type, compress it, then store it till it's
   time to forward it.  All this depends on the store time, of course
   (is it worth processing 10 pages of g3fax if you're only going to store
   it for ten minutes?).

   In the latter case, the network may benefit from having compression
   applied to the machine dependent data representation (i.e. compress, then
   encode as ASN.1) or it might benefit from compression after encoding.

   The only way to know which to do is to make compression routines available
   to the presentation layer (for use before or after ASN.1 encoding), and
   to experiment, and collect some metrics.  In time, we'll (hopefully)
   have a body of experimental evidence of what-works-best-when-in-most-cases
   (or maybe somebody can work it all out in theory:  theories are easier to
   program to than experimental data).
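
To make the "experiment and collect some metrics" idea concrete, here is a
rough sketch in Python of the kind of measurement I have in mind.  It is only
an illustration under assumptions:  zlib stands in for "some compression
routine", the helper names (compression_ratio, measure) are made up for this
example, and the sample data is synthetic rather than real X.400 traffic.
Random bytes play the role of data that is already in its most compact form.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size / original size (below 1.0 means it shrank)."""
    return len(zlib.compress(data, level)) / len(data)


def measure(samples: dict) -> dict:
    """Map each sample name to its compression ratio."""
    return {name: compression_ratio(data) for name, data in samples.items()}


# Hypothetical sample data (assumption: not taken from any real message store).
text = b"The quick brown fox jumps over the lazy dog. " * 200
random_bytes = os.urandom(len(text))  # stand-in for already-compact data

results = measure({
    "repetitive text": text,
    "random bytes": random_bytes,
})

for name, ratio in results.items():
    print(f"{name:16s} ratio = {ratio:.3f}")
```

The repetitive text should shrink a great deal, while the random bytes should
come out slightly larger than they went in, which is the pathological case from
point 1.  A real experiment would feed in actual body types (text, g3fax, and
so on), both before and after ASN.1 encoding, and compare the ratios.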


--
Peter Whittaker         [~~~~~~~~~~~~~~~~~~~~~~~~~~]    Open Systems Integration
pww@bnr.ca              [                          ]    Bell Northern Research
Ph: +1 613 765 2064     [                          ]    P.O. Box 3511, Station C
FAX:+1 613 763 3283     [__________________________]    Ottawa, Ontario, K1Y 4H7