Path: utzoo!attcan!uunet!mcsun!inria!mirsa!jerry.inria.fr!huitema
From: huitema@jerry.inria.fr (Christian Huitema)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: Efficiency (or lack thereof) of ASN.1.
Message-ID: <8020@mirsa.inria.fr>
Date: 11 Jun 90 12:06:59 GMT
References: <9006092155.AA05105@psi.com> <11866@ditmela.oz>
Sender: news@mirsa.inria.fr
Reply-To: huitema@jerry.inria.fr (Christian Huitema)
Organization: INRIA Sophia Antipolis
Lines: 52

A few comments on the efficiency of ASN.1. We did quite a lot of work on
that at INRIA, including the development of alternative encoding rules.
The cost of ASN.1 derives from the recursive ``type-length-value''
structure of the BER, plus the cost of decoding length fields, integers,
and reals. The cost relative to a simpler encoding depends on the type
of the elements. Taking them in order:

* the handling of REALs is ludicrous. It is almost equivalent to
converting them to strings using something like <sprintf/sscanf>, only
more complex.

* the handling of the Integers, as explained by MTR, uses a variable
length, big endian, two's complement notation. The coding can be
optimized somewhat, e.g. by predicting the range of the element or by
loosening your compliance to the BER (as advocated by some experts from
DEC); the decoding routines on the other hand must be general. Their
cost depends on the quality of the implementation; I measured a ratio of
10 to 20 between BER decoding of integers and a simple ``move''; an
endian reversal would have cost 5 to 6 moves.

* the handling of the strings is comparable to what can be found in
other syntaxes: the coding of the tag and length field may take 20
instructions instead of 1 or 2, but it is outweighed by the copying of
the string itself, which is syntax independent. The decoding can however
take much longer if your partner insists on passing its strings as
``sequences of segments''.

* the handling of structures requires an analysis of the tags and length
fields. A sequence without optional components has a very low cost; a
set is very costly; a choice costs marginally more than a ``case of''
operation; an array (set of, sequence of) takes significantly longer to
decode than a simple ``count + pointer'' representation, as one can only
determine the number of elements (useful for malloc, bounds checking,
etc) by decoding each element and testing the ``length'' field.
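
To make the integer and array points concrete, here is a minimal sketch
in Python. It is an illustration, not the code we measured: the function
names are invented for this example, and it assumes definite lengths and
single-octet tags only. It shows the general integer decoder (variable
length, big endian, two's complement) and the extra pass over every
header that is needed just to learn how many elements a ``sequence of''
contains.

```python
def read_header(buf, pos):
    """Parse one BER identifier octet and a definite length.

    Returns (identifier, length, value_start). Sketch only: high tag
    numbers and the indefinite length form are rejected.
    """
    ident = buf[pos]
    if ident & 0x1F == 0x1F:
        raise ValueError("high tag numbers not handled in this sketch")
    length = buf[pos + 1]
    pos += 2
    if length == 0x80:
        raise ValueError("indefinite length not handled in this sketch")
    if length & 0x80:
        # Long form: the low 7 bits give the number of length octets.
        n = length & 0x7F
        length = int.from_bytes(buf[pos:pos + n], "big")
        pos += n
    return ident, length, pos


def decode_integer(content):
    """Decode variable length, big endian, two's complement octets."""
    if not content:
        raise ValueError("an INTEGER needs at least one content octet")
    value = 0
    for octet in content:
        value = (value << 8) | octet
    if content[0] & 0x80:
        # Sign bit set: subtract 2**(8 * number of octets).
        value -= 1 << (8 * len(content))
    return value


def count_elements(buf, pos, end):
    """Count the elements between pos and end by walking every header."""
    count = 0
    while pos < end:
        _, length, value_start = read_header(buf, pos)
        pos = value_start + length
        count += 1
    if pos != end:
        raise ValueError("element overruns the enclosing length")
    return count


def decode_integer_array(buf):
    """Decode a SEQUENCE OF INTEGER into a preallocated list."""
    ident, length, pos = read_header(buf, 0)
    if ident != 0x30:
        raise ValueError("expected a SEQUENCE (identifier 0x30)")
    # First pass, only to size the result (the "malloc" problem).
    result = [0] * count_elements(buf, pos, pos + length)
    for i in range(len(result)):
        ident, length, start = read_header(buf, pos)
        if ident != 0x02:
            raise ValueError("expected an INTEGER (identifier 0x02)")
        result[i] = decode_integer(buf[start:start + length])
        pos = start + length
    return result


# SEQUENCE OF INTEGER holding 1, -129 and 256.
message = bytes.fromhex("300b020101" "0202ff7f" "02020100")
values = decode_integer_array(message)
```

With a ``count + pointer'' representation the first pass disappears and
each element is a fixed-size load; here every element costs a header
parse, a length test and a byte-by-byte accumulation.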

In short: the efficiency of ASN.1/BER depends on the complexity of the
structure and on the type of the elements. If a simple structure
contains mostly strings, ASN.1 is very efficient. If a semi-complex
structure contains a mix of strings and integers (or booleans), one can
expect 1 to 4 instructions per byte -- depending on the mix. If the
message is a matrix of floating point numbers exchanged between a Cray
and a graphical workstation, then ASN.1 is about as good as converting
the matrix to a text file... So if the application deals mostly with
numbers, one should try to negotiate something other than the BER.

For information handling protocols like MHS, X.500, DFR or CMIP (and
SGMP) the cost of the encoding is probably in the 1-4 inst/byte range.
This is significant, although probably much less than the cost of
retrieving the information itself. And one should point out that the
coding of the DNS is about as costly...

Christian Huitema