Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!uwm.edu!spool.mu.edu!agate!dog.ee.lbl.gov!elf.ee.lbl.gov!torek
From: torek@elf.ee.lbl.gov (Chris Torek)
Newsgroups: comp.unix.internals
Subject: Re: Byte Order on workstations
Message-ID: <14784@dog.ee.lbl.gov>
Date: 28 Jun 91 12:35:18 GMT
References: <15145@ector.cs.purdue.edu>
Reply-To: torek@elf.ee.lbl.gov (Chris Torek)
Organization: Lawrence Berkeley Laboratory, Berkeley
Lines: 68
X-Local-Date: Fri, 28 Jun 91 05:35:18 PDT

In article <15145@ector.cs.purdue.edu> wangjw@brandon.cs.purdue.edu () writes:
>... If we communicate between workstations of different byte order, e.g.,
>Dec station -- Sun, we must first transform data into network order before 
>sending them and change them back to host order after receiving them.

You probably misunderstand the `byte order problem':

>  My question is that in a network environment, how is this problem solved?
>For example, when this mail reaches your machine, how does your machine 
>know that this mail is from a Sun instead of a Dec Station?

It does not.  Mail is not a problem because it is interpreted consistently.

The `byte order problem' is not that one or another kind of machine gets
things `backwards'.  The bytes you send from a VAX to a Sun, or vice
versa, are the same when received as when sent.  The reason that a
`number' like 0x1234 seems to change to 0x3412 is not because it *did*
change, but rather because you changed the way you interpret the bytes.

,erehwyreve redro emas eht ni era enil siht no setyb ehT
but you have to read them right to left to make sense of them.

`Little endian' machines (VAX, DECstation, etc) interpret multibyte
numbers this way:

	byte 0: 0x12
	byte 1: 0x34

The number is the low byte plus the next byte times 256 plus the next
times 65536 plus....  In this case the number is 0x3412.  Big endian
machines, on the other hand, start `at the top'.  If the multibyte
number is four bytes long, the number is byte 0 times 16777216
plus the next byte times 65536 plus the next byte times 256 plus the
last byte.  Here the number is two bytes, so it is 0x1234.  The
difference is that while we read English text left-to-right, the
machine reads the bytes in the machine's order (whatever that is) and
then presents them to us left-to-right in ASCII.
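
To make the two readings concrete, here is a small sketch in C.  The
function names (read_le, read_be, read_native16) are made up for this
illustration; they are not from any library.  The first two spell out
the arithmetic described above, and the third simply lets the host
load the same two bytes in whatever order it happens to use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Little-endian reading: byte 0 is the least significant, so the
 * value is p[0] + p[1]*256 + p[2]*65536 + ...
 */
uint32_t read_le(const unsigned char *p, size_t n)
{
	uint32_t v = 0;
	size_t i;

	assert(n <= 4);		/* a uint32_t holds at most four bytes */
	for (i = n; i > 0; i--)
		v = (v << 8) | p[i - 1];
	return v;
}

/*
 * Big-endian reading: byte 0 is the most significant, so each
 * later byte pushes the earlier ones up by a factor of 256.
 */
uint32_t read_be(const unsigned char *p, size_t n)
{
	uint32_t v = 0;
	size_t i;

	assert(n <= 4);
	for (i = 0; i < n; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Host reading: copy the two bytes, untouched, into a 16-bit object
 * and let the machine interpret them in its own order.
 */
uint16_t read_native16(const unsigned char *p)
{
	uint16_t v;

	memcpy(&v, p, sizeof v);
	return v;
}
```

Given the two bytes 0x12, 0x34 from the example, read_le(b, 2) yields
0x3412 and read_be(b, 2) yields 0x1234, while read_native16(b) yields
whichever of the two matches the host.  The array b is never modified;
only the reading differs.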

Once again: the underlying bit sequence HAS NOT CHANGED.  The reason
it `looks' different is that different machines `look' in different
orders.  As long as the machines agree to `look at' the bytes in the
same order, the problem vanishes.
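
One way for two machines to `agree' is to fix the order in the code
that writes and reads the bytes, rather than leaving it to the host.
The sketch below assumes the agreed order is most-significant byte
first (the usual `network order'); put_be32 and get_be32 are
hypothetical names chosen for this example.  The BSD htonl() and
ntohl() routines serve the same purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store v in buf most-significant byte first.  The shifts work on
 * the numeric value, so the result is the same on any host.
 */
void put_be32(unsigned char *buf, uint32_t v)
{
	assert(buf != NULL);
	buf[0] = (v >> 24) & 0xff;
	buf[1] = (v >> 16) & 0xff;
	buf[2] = (v >> 8) & 0xff;
	buf[3] = v & 0xff;
}

/*
 * Rebuild the value from four bytes written by put_be32, again
 * without depending on the host's own byte order.
 */
uint32_t get_be32(const unsigned char *buf)
{
	assert(buf != NULL);
	return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	    ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}
```

A sender calls put_be32 before writing the four bytes; the receiver
calls get_be32 after reading them.  Neither side needs to know what
kind of machine the other one is.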

(Internet) mail is defined as a text sequence, and most computers use
the same order for text.  Thus there is no problem.  The semantics are
the same because the symbols (ASCII characters) are interpreted
identically.%  This is the basis of communication: when two entities
agree on a set of symbols and interpretation rules, those two entities
can communicate.  The `network byte order problem' is a result of a
small variation in the interpretation rules.
-----
% Note that this breaks down in some instances, e.g., the bytes {|}
  may be displayed as national letters in Oslo (the national variants
  of ISO 646 reassign those codes) but as punctuation in Akron, Ohio.
-----

Incidentally, this is sort of a microcosmic version of the problems
ASN.1 and other standards are trying to solve.  In order to communicate
some data set, we need to get the machines to agree on both the symbols
and their meanings.  The ISO approach seems to be to define translation
layer after translation layer, and hope that, somewhere along the way,
all the pieces get defined, rather than to start with the fact that the
pieces must get defined, to define them, and only then to arrange those
definitions in some pleasing order for recording in English (or other
human language).
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov