Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!uwm.edu!spool.mu.edu!agate!dog.ee.lbl.gov!elf.ee.lbl.gov!torek
From: torek@elf.ee.lbl.gov (Chris Torek)
Newsgroups: comp.unix.internals
Subject: Re: Byte Order on workstations
Message-ID: <14784@dog.ee.lbl.gov>
Date: 28 Jun 91 12:35:18 GMT
References: <15145@ector.cs.purdue.edu>
Reply-To: torek@elf.ee.lbl.gov (Chris Torek)
Organization: Lawrence Berkeley Laboratory, Berkeley
Lines: 68
X-Local-Date: Fri, 28 Jun 91 05:35:18 PDT

In article <15145@ector.cs.purdue.edu> wangjw@brandon.cs.purdue.edu () writes:
>... If we communicate between workstations of different byte order, .e.g.,
>Dec station -- Sun, we must first transform data into network order before
>sending them and change them back to host order after receiving them.

You probably misunderstand the `byte order problem':

> My question is that in a network environment, how is this problem solved?
>For example, when this mail reaches your machine, how does your machine
>know that this mail is from a Sun instead of a Dec Station?

It does not.  Mail is not a problem because it is interpreted consistently.

The `byte order problem' is not that one or another kind of machine
gets things `backwards'.  The bytes you send from a VAX to a Sun, or
vice versa, are the same when received as when sent.  The reason that
a `number' like 0x1234 seems to change to 0x3412 is not because it
*did* change, but rather because you changed the way you interpret
the bytes.

	,erehwyreve redro emas eht ni era enil siht no setyb ehT

but you have to read them right to left to make sense of them.

`Little endian' machines (VAX, DECstation, etc) interpret multibyte
numbers this way:

	byte 0: 0x12
	byte 1: 0x34

The number is the low byte plus the next byte times 256 plus the next
times 65536 plus....  In this case the number is 0x3412.  Big endian
machines, on the other hand, start `at the top'.
If the multibyte number is four bytes long, the number is the low
byte (byte 0) times 16777216 plus the next byte times 65536 plus the
next byte times 256 plus the last byte.  Here the number is two bytes,
so it is 0x1234.

The difference is that while we read English text left-to-right, the
machine reads the bytes in the machine's order (whatever that is) and
then presents them to us left-to-right in ASCII.

Once again: the underlying bit sequence HAS NOT CHANGED.  The reason
it `looks' different is that different machines `look' in different
orders.  As long as the machines agree to `look at' the bytes in the
same order, the problem vanishes.

(Internet) mail is defined as a text sequence, and most computers use
the same order for text.  Thus there is no problem.  The semantics are
the same because the symbols (ASCII characters) are interpreted
identically.%  This is the basis of communication: when two entities
agree on a set of symbols and interpretation rules, those two entities
can communicate.  The `network byte order problem' is a result of a
small variation in the interpretation rules.

-----
% Note that this breaks down in some instances, e.g., the bytes {|} may
  be interpreted differently on a display in Oslo than on another in
  Akron, Ohio.
-----

Incidentally, this is sort of a microcosmic version of the problems
ASN.1 and other standards are trying to solve.  In order to
communicate some data set, we need to get the machines to agree on
both the symbols and their meanings.  The ISO approach seems to be to
define translation layer after translation layer, and hope that,
somewhere along the way, all the pieces get defined, rather than to
start with the fact that the pieces must get defined, to define them,
and only then to arrange those definitions in some pleasing order for
recording in English (or other human language).
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov
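
[Editor's sketch, not part of the original article.]  The two-byte
example in the post (byte 0 = 0x12, byte 1 = 0x34) can be tried
directly.  The following is a minimal Python sketch under stated
assumptions: the helper names little_endian and big_endian are
hypothetical, chosen here only for illustration, and the use of
struct's "!" (network order) format is one possible way for two hosts
to agree on an interpretation rule, not something the post prescribes.

```python
import struct

# The same two bytes throughout; nothing below ever alters them.
data = bytes([0x12, 0x34])  # byte 0 = 0x12, byte 1 = 0x34


def little_endian(b):
    """Low byte, plus the next byte times 256, plus the next times 65536, ..."""
    return sum(byte << (8 * i) for i, byte in enumerate(b))


def big_endian(b):
    """Start 'at the top': the first byte carries the largest weight."""
    value = 0
    for byte in b:
        value = value * 256 + byte
    return value


print(hex(little_endian(data)))  # 0x3412
print(hex(big_endian(data)))     # 0x1234

# Python's built-in conversions apply the same two rules.
assert little_endian(data) == int.from_bytes(data, "little")
assert big_endian(data) == int.from_bytes(data, "big")

# Agreeing on one rule for the wire: "!" means network (big-endian) order.
wire = struct.pack("!H", 0x1234)         # sender writes bytes 0x12, 0x34
(received,) = struct.unpack("!H", wire)  # receiver reads with the same rule
assert received == 0x1234
```

Both functions are handed the identical bytes object; only the rule
for weighting each byte differs, which is the point the post makes.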