Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!husc6!m2c!wpi!jhallen
From: jhallen@wpi.wpi.edu (Joseph H Allen)
Newsgroups: comp.lang.misc
Subject: Re: Anyone want to design a language?
Message-ID: <8475@wpi.wpi.edu>
Date: 17 Feb 90 17:32:08 GMT
References: <22569:05:10:24@stealth.acf.nyu.edu>
Reply-To: jhallen@wpi.wpi.edu (Joseph H Allen)
Organization: Worcester Polytechnic Institute, Worcester, MA
Lines: 344

In article <22569:05:10:24@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>I'm bored, it's a cloudy day, and I can't stand Ada.
>So what do you want in a compiled, imperative, perhaps object-oriented
>language? Take C as a starting point for good ideas and feel free to
>use parts of any other language. Remember: This isn't Ada. If it gets
>too complicated, trash it. Simple is beautiful. Modular design is
>beautiful. And above all, remember that this is going to be a language
>people can actually like.

Ok, I'll bite. Here's a compiled language I'd like to see:

(1) No semicolons.

(2) Except for end of line comments. /* These comments are evil */

(3) Block structure indicated by indentation level:

    while a!=b
      int q                 ; Multi-line body
      q=z*5
      r+=foo(q)
    q=6

    while a!=b r=foo(z*5)   ; Single line body
    q=6

    if a==b c=d

    if a==b
      c=d
      r=500
    else
      q=r
      s=t

    etc.

So that you can put blocks on single lines, [ and ] can also be used to
indicate block structure in the conventional way.

(4) Overloadable AND definable operators

(5) All characters allowed in symbols. For example, a typical definition
might be:

    int :^&%&^*?: = 8

This is so that operators can be defined. There shouldn't be separate
character sets for operators and identifiers.
I.E., instead of detecting the end of identifiers by the presence of
operator or whitespace characters, the longest possible string which can be
a symbol is detected:

    if these are symbols:

      abc def abcdef

    then when the input sees:

      abc       ; abc is recognized
      abcdef    ; abcdef is recognized
      defabc    ; def is recognized and then abc is recognized

This requires that special separators be used to delimit symbols in
declarations (or wherever they first appear). Perhaps to save typing there
might be a default identifier character set which doesn't require these
delimiters.

Symbol recognition should occur before constant recognition. I.E., this
way you can define:

    int :4: = 5             ; Make 4 equal to 5

(6) Nifty C declarations which allow one type to be shared among multiple
declarations, each of which might have an initializer.

    Bad:
      it : integer;
      this : integer;

    Good:
      int it = 7, this = 5, that = 0, theother = 10

(7) However, the convoluted C declaration system needs to be replaced:

    instead of:

      int **foo[]           an array of pointers to int pointers

    do this:

      [] * * int foo

(8) Eliminate arrays. They aren't needed. Use pointers and macros instead.

(9) For constants:

    $hex
    decimal
    %binary
    'c'                     ; Character
    'abc'                   ; String

    (sorry, no octal. You could do it with 0777 but that's gross)

These are equivalent strings:

    'a' \ 'b' \ 'c' \ 13 \ 10 \ 0
    'abc' \ 13 \ 10 \ 0

I.E., no escape sequences needed. Strings are just integer constants
concatenated together. Any constant expression can be used in these
constants, so:

    const int CR = 13
    const int LF = 10
    const int EOS = 0

    'abc' \ CR \ LF \ EOS

I would prefer ',' for the concatenation character but it's needed
elsewhere.

(10) Standard operators. Grouped together in equal precedence:

    ( )       Precedence
    [ ]       Block and precedence
    `         Get symbol from previous scope level (C++'s '::')
    @         Get object at address (C's '*')
    #         Return address of object (C's '&')
    .         Member selector. No need for '->'. Why does C use '->' anyway?
    ~         Bit-wise not
    -         Negate
    sizeof    Size of argument on right
    base      Distance between member indicated on right and base address
              of structure
    >> <<     Shift right and shift left
    *         Multiply
    /         Divide
    //        Modulus
    &         Bit-wise and
    +         Add
    -         Subtract
    |         Bit-wise or
    ^         Bit-wise exclusive or
    = += -= |= ^= *= /= //= &= >>= <<= &&= ||=
              Assignments
    : +: -: |: ^: *: /: //: &: >>: <<: &&: ||:
              Assignments which work the other way:
                a += b means add b to a and return the result
                a +: b means add b to a but return the original value of a
    == >= <= != > <
              Comparison
    !         Logical not
    &&        Logical and
    ||        Logical or

(11) Blocks return the last value generated:

    a = [
      int q
      q=r
      r=t
      t=q
    ]                       ; a gets r

(12) Statements return their last value:

    a = if b==c 500 else 1000   ; if b equals c a gets 500
                                ; otherwise it gets 1000

    (This way, there is no need for the '?:' operator)

(13) Like C++, declarations can be made anywhere.

(14) Statements

    if expr expr else expr
    do expr until expr
    while expr expr
    return          (C's 'return expr' is 'expr return' in this language)
    break
    continue
    goto expr       (gotos take code addresses)

(15) Structure and code generation rules:

    int a
    int b

These are always right next to each other and a is at a lower address.
(GNU C actually puts b at a lower address.) The rules for this are the same
as in structures.

Structure members are placed in the order they appear in the definition;
they are never sorted. Bytes are first packed and then padded on machines
with alignment problems. I.E.,

    typedef IT
      int a
      char b
      char c
      char d
      int e

(oh, did I mention that there is no 'struct' symbol? Use typedef and blocks
instead)

b, c and d are all in one integer. That integer has 1 extra byte of padding
in it.

(16) Basic types should be:

    int expr        ; a signed integer of at least expr bits
    uint expr       ; an unsigned integer of at least expr bits

A set of macros might be used for the machine standard types.
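
For comparison, item (16) can be roughly approximated in present-day C. This is only a sketch, assuming a C99 compiler with <stdint.h>; the names INT_N, UINT_N, M_CHAR, M_SHORT, M_WORD, BITS_OF and check_widths are made up here for illustration and are not part of the proposal. The "machine standard" widths chosen below are likewise an assumption (a hypothetical 32-bit machine).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for the post's `int expr` / `uint expr`:
 * INT_N(32) expands to int_least32_t, a signed type of at least 32 bits.
 * Only works for the widths <stdint.h> provides (8, 16, 32, 64). */
#define INT_N(bits)  int_least##bits##_t
#define UINT_N(bits) uint_least##bits##_t

/* "Machine standard" types built from macros, as item (16) suggests.
 * The widths are an assumption for an imaginary 32-bit machine. */
#define M_CHAR  UINT_N(8)
#define M_SHORT INT_N(16)
#define M_WORD  INT_N(32)

/* Storage size of a type, in bits. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Aborts if any alias provides fewer bits than it asked for. */
void check_widths(void)
{
    assert(BITS_OF(INT_N(8))   >= 8);
    assert(BITS_OF(UINT_N(16)) >= 16);
    assert(BITS_OF(M_WORD)     >= 32);
}
```

A declaration then reads `INT_N(32) counter = 0;`, which is about as close as C gets to `int 32 counter = 0`. Unlike the proposal, the width has to be a literal token rather than an arbitrary constant expression.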

(17) More types shit:

    const                   ; for addressable constants
    inline                  ; for small non-addressable constants
                            ; (and inline functions)
    register                ; non-addressable variable
                            ; fully addressable variable (blank)
    macro                   ; same as inline but with no type checking
    op LEFT RIGHT RETURN    ; an operator or function
                            ; LEFT indicates left-side arguments
                            ; RIGHT indicates right-side arguments
                            ; RETURN is the return type
    op void RIGHT RETURN    ; This is a traditional function

(18) There should be a symbol for the automatic conversion stuff. This way
you can control how conversions work:

    op void NSTRING s int CONVERT = atoi(s.text)

This overloads the conversion function CONVERT to allow automatic
conversion from NSTRINGs (strings with a number in them, say) to integers.

(19) prec SYMBOL expr

sets the precedence of operator SYMBOL to expr (a number).

(20) In this function, the right argument is a pointer to a string (s is an
address of (#) a character (int 8)) and it returns a 32 bit integer. When
it's called you actually give it an address.

    op void # int 8 s int 32 atoi = ...

This defines the '+=' operator. a is a reference to an int. When you call
it you put a variable on the left as usual:

    x += y

but the function will actually receive the address of the variable:

    op @ int 32 a int 32 b int :+=: = @a = @a + b

(and this is the '+:' operator)

    op @ int 32 a int 32 b int ::=: =
      int 32 tmp
      tmp=@a                ; Remember original value of left side
      @a = @a + b           ; Add
      tmp                   ; Return original value

There should also be a modifier so that the '@' is automatically assumed in
the function (I.E., like pointers in Pascal):

    op ref @ int 32 a int 32 b int ::=: =
      int 32 tmp
      tmp=a                 ; don't need @a since 'ref' is there
      a = a + b
      tmp

(21) More about structures

- Classes == structures

- There should be a word 'inherit' which copies the contents of the
  indicated structure definition into the new one.
  I.E.:

    typedef me
      int a
      int b

    typedef you
      inherit me
      int c

  is the same as

    typedef you
      int a
      int b
      int c

- Inherits with clashing members are not allowed. Use instances instead.

- Function arguments are really structures. If a function returns a
  structure, that structure is placed on the stack, not in a global
  variable.

- Member functions are indicated in function declarations. There should be
  another type qualifier which indicates that a function gets a pointer to
  the structure and that all members of that structure look like local
  variables to the function.

- To get the instance.message form, function pointers should be used in the
  structure.

- There should be a way to indicate default structure values for when
  structures are created. Possibly this could be done in a
  constructor/destructor system.

(22) Named arguments. You should be able to call a function in two ways:

    func(10,20,30)              ; positional arguments
    func(`a=20, `c=30, `b=20)   ; arguments are specifically named

There's much, much more to do and there are problems with what I have. But
this is sort of what my ideal language should look like. The general goal is
to make it both one step above assembly language and completely extendable.
--
"Come on Duke, lets do those crimes" - Debbie
"Yeah... Yeah, lets go get sushi... and not pay" - Duke