Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!cs.utexas.edu!uunet!mcvax!kth!sunic!enea!sommar
From: sommar@enea.se (Erland Sommarskog)
Newsgroups: comp.lang.eiffel
Subject: Character and string literals
Message-ID: <102@enea.se>
Date: 8 Jul 89 21:08:26 GMT
Organization: Enea Data AB, Sweden
Lines: 47

When writing my article on Eiffel and national characters I wanted
to check which characters Eiffel allows in character and string
literals. The result was somewhat surprising and puzzling.

I wrote a class that had 256 features, a0 to a255. They were
declared as
    a0 : character is '\000';
    a1 : character is '\001';
except that where I write "\001" here, the source contained the
character itself. The first attempt revealed that newline, apostrophe
and backslash caused a syntax error on which the compiler gave up,
but that was expected. Of the remaining characters the following were
regarded as "Invalid character constant":
    1-31, 127, 135, 138, 146, 155, 162, 166, 170, 173, 181, 184, 192,
    201, 208, 212, 216, 219, 227, 230, 238, 247-255.

Those below 128 are obvious. They are non-printing characters in
the ASCII set, and it's understandable that Eiffel forbids writing
them explicitly. But above 128? Is ISE using some eight-bit set
with gaps in it at the points in the list above? It's probably not
a standard set in that case. If Eiffel were to support ISO 8859 -
which I think it should - it would forbid 1-31 and 127-159 and permit
everything else, if it wanted to be restrictive in this area. I'm
not sure it should.

The next thing I tried was replacing "character" with "string" and
the apostrophes with quotes, and compiling again. (I also had to
remove the quote character from the list, of course.) This gave the
following output:

Pass 1 on class pelle
Pass 2 on class pelle
Interface has not changed.
Pass 4 on class pelle
C-compiling pelle
"pelle.c", line 3304: unexpected EOF
"pelle.c", line 3304: newline in string or char constant
"pelle.c", line 3305: syntax error at or near string "));?
*** ec: C-compilation canceled

That is, what Eiffel didn't allow in character literals, it did
allow in strings! That doesn't seem like consistent behaviour to me.

Now, what about the error the C compiler detected? The cause is
the very last string, which contains character 255 (which corresponds
to lowercase dotted "y" in 8859/1). Apparently the C compiler takes
this as end of file. (My knowledge of C and Unix is little, but isn't
-1 often a code for end of file? And -1 and 255 are the same thing
for a byte.)
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
Bowlers on strike!
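
The guess about -1 and 255 can be illustrated with a small C sketch.
It is only an illustration of the suspected mechanism, not a claim
about how the C compiler in question was actually written. The two
function names are made up for this example, and the code assumes
the usual situation where EOF is -1 and converting 255 to a signed
char yields -1 (two's complement).

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: scans a buffer the way a reader would if it
 * stored each byte in a signed char before testing for EOF.
 * Returns the number of bytes seen before "end of file". */
size_t count_until_eof_buggy(const unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        /* Assumption: 255 converts to -1 here (two's complement). */
        signed char c = (signed char)buf[i];
        /* c is promoted to int -1, which equals the usual EOF. */
        if (c == EOF)
            break;
    }
    return i;
}

/* Hypothetical helper: the same scan, but each byte is kept in an
 * int, so its value stays in 0..255 and can never equal a negative
 * EOF. */
size_t count_until_eof_correct(const unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        int c = buf[i];
        if (c == EOF)
            break;
    }
    return i;
}
```

Given the four bytes of a string literal such as quote, character
255, quote, semicolon, the first function stops after one byte, as
if the file ended inside the string, while the second sees all four.
That would be consistent with "unexpected EOF" followed by "newline
in string or char constant".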