Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!thunder.mcrcim.mcgill.edu!snorkelwacker.mit.edu!bloom-beacon!eru!hagbard!sunic!mcsun!ukc!icdoc!qmw-cs!eliot
From: eliot@cs.qmw.ac.uk (Eliot Miranda)
Newsgroups: comp.object
Subject: Re: value semantic versus reference semantic
Keywords: Smalltalk, distribution, values, object identity, immutability
Message-ID: <3683@sequent.cs.qmw.ac.uk>
Date: 15 May 91 11:28:34 GMT
References: <68780001@hpcupt1.cup.hp.com> <1991May14.093053.3017@jyu.fi>
Followup-To: comp.object
Organization: Computer Science Dept, QMW, University of London, UK.
Lines: 131

>In article <68780001@hpcupt1.cup.hp.com> thomasw@hpcupt1.cup.hp.com (Thomas Wang) writes:
>>It has occurred to me that value semantic is equivalent to reference
>>semantic if the value of 'b' can never change.  So one can use reference
>>semantic on immutable classes, and expect them to be well behaved.

This is not true of typical OOPLs without some effort.  Systems which give
access to an object's identity (C++ & Smalltalk) or allow enumeration of a
class's instances (Smalltalk) can allow the programmer to show that two or
more equi-valued immutable objects exist.
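
A minimal sketch of that point, in Python rather than Smalltalk (an assumption
made only so that the example can be run as it stands).  Point is a
hypothetical class; Python's == compares values and its 'is' exposes identity,
roughly as = and == do in Smalltalk.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Point:
    """Hypothetical immutable class: fields cannot be reassigned once set."""
    x: int
    y: int


a = Point(1, 2)
b = Point(1, 2)

same_value = (a == b)    # value comparison: the generated __eq__ compares fields
same_object = (a is b)   # identity comparison: reveals that two objects exist

try:
    a.x = 10             # any attempt to mutate is refused
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Both objects are immutable and equal in value, yet the identity test still
tells them apart, so reference semantics remains observable.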


In article <1991May14.093053.3017@jyu.fi> sakkinen@jytko.jyu.fi (Markku Sakkinen) writes:
>To avoid all ambiguity, it might be better to speak of immutable _objects_.
>I must admit another good point in C++ (newer versions): immutability
>('const') is not an attribute of a class but rather of a variable or object.
>There seems to be the bad point in Smalltalk that one really cannot define
>sensible immutable classes, because there is no distinction between
>the initialisation and the modification of an object.  (Smalltalk experts:
>is there a way around this problem?)

Indeed. Note that in standard Smalltalk-80 & Smalltalk/V systems the symbols
are a set of unique strings.  These symbols are tested for equality by looking
at their pointer.  Message lookup matches the pointer of the message selector
symbol with an occurrence of the pointer in some dictionary.  One simple way
to implement 'private' messages is to create a private set of objects (they
don't even have to be strings, they could be e.g. SmallIntegers).  Within some
compilation group (e.g. a class & its metaclass) certain message selectors can
be taken from this private set.  This means that only instances of the class
or the class itself can send messages in this set & have them understood.
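
The lookup-by-pointer idea can be sketched as follows, again in Python as a
stand-in for Smalltalk.  Account, BALANCE, _SET_BALANCE and perform are all
hypothetical names.  Python itself does not hide the private selector the way
a compilation group would, so this shows only the identity-based lookup, not
real enforcement.

```python
BALANCE = object()            # public selector, exported for clients to use


class Account:
    """Hypothetical class whose methods are found by selector identity."""

    _SET_BALANCE = object()   # private selector, meant to stay inside the class

    def __init__(self, amount):
        self._amount = 0
        # Code inside the class can name the private selector.
        self.perform(Account._SET_BALANCE, amount)

    def _set_balance(self, amount):
        self._amount = amount

    def _balance(self):
        return self._amount

    def perform(self, selector, *args):
        # Match on the selector object itself (pointer comparison), not its value.
        for key, method in Account._methods:
            if key is selector:
                return method(self, *args)
        raise AttributeError("message not understood")


# The "method dictionary": pairs of (selector object, method).
Account._methods = [
    (BALANCE, Account._balance),
    (Account._SET_BALANCE, Account._set_balance),
]

acct = Account(5)
balance = acct.perform(BALANCE)

try:
    acct.perform(object(), 99)   # a forged selector is a different object
    forged_understood = True
except AttributeError:
    forged_understood = False
```

A sender that does not hold the very object used as the private selector
cannot produce a message that the lookup will match.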

To implement truly immutable objects in Smalltalk arrange that
	a) the initialization message selector is private to the object's class
	b) override all mutating primitives with shouldNotImplement versions

This gives truly immutable objects provided that you trust the object's classes
to correctly initialize them & not diddle about later.
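
A rough Python analogue of a) and b), under the same trust assumption.
ImmutablePoint is a hypothetical class; the blocked __setattr__ and
__delattr__ play the part of the shouldNotImplement overrides, and only the
class's own initialization goes around them.

```python
class ImmutablePoint:
    """Hypothetical immutable object: state is set once, mutators always fail."""

    __slots__ = ("_x", "_y")

    def __init__(self, x, y):
        # Initialization bypasses the blocked mutator.  By convention only the
        # class itself does this; Python cannot enforce that convention.
        object.__setattr__(self, "_x", x)
        object.__setattr__(self, "_y", y)

    def __setattr__(self, name, value):
        raise TypeError("shouldNotImplement: ImmutablePoint is immutable")

    def __delattr__(self, name):
        raise TypeError("shouldNotImplement: ImmutablePoint is immutable")

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y


p = ImmutablePoint(1, 2)

try:
    p._x = 5                  # ordinary assignment is refused
    assignment_allowed = True
except TypeError:
    assignment_allowed = False
```

As in the Smalltalk case, this is only as strong as the class's own
discipline: nothing stops the class (or, in Python, anyone) from running the
initialization code a second time.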

To implement truly immutable values (e.g. which can be arbitrarily duplicated
without being able to distinguish between equi-valued instances)
	a) override the == primitive with a method that compares the values of
	   the receiver & anObject.  This could be primitively implemented for
	   simple values (e.g. strings)
	b) stumble across the instance enumeration facilities:
		someInstance nextInstance allInstances allInstancesDo:
	   and realise you have a problem :-)
Part a) is a problem in some Smalltalk virtual machines because they don't
bother to look up the #== message.  On these platforms you typically have to
change the set of special selectors so it doesn't include #== and recompile the
entire system.  This ensures that the compiler no longer generates the special
selector send of #==, which is never looked up.

Too many people take the Smalltalk compilation system as a given (it is harder
not to in Smalltalk/V).  Being open, Smalltalk-80 is much more flexible.

>
>In reference-based languages, for an assignment to be equivalent to
>value semantics, it is usually not sufficient for the assigned object to be
>immutable itself;  it must be "recursively immutable" in the worst case.
>But note that with reference semantics there is even no general definition
>of the "value" of an object:  usually it is something between "shallow value"
>and "deep value".
More than that. In systems like Smalltalk with many reflective facilities
you must also think hard about e.g. allInstancesDo:.  It's no good not being
able to distinguish between two immutable objects with the same value if you
can demonstrate that in fact more than one exists.  E.g. in Smalltalk, if
running the following

	| count |
	count := 0.
	ImmutableStringValue allInstancesDo: [:s|
		(s size = 3
		and: [(s at: 1) = $Y
		and: [(s at: 2) = $E
		and: [(s at: 3) = $S]]]) ifTrue: [
			count := count + 1]].
	count

evaluates to something more than 1 then your implementation of immutable values
is flawed.  I haven't thought of a good solution to this.
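
The same check can be sketched in Python, using gc.get_objects() as a rough
stand-in for allInstancesDo: (it lists only objects tracked by the collector,
which includes instances of ordinary classes).  The class here is a
hypothetical, deliberately naive version with value equality but no protection
against duplicates; count_instances is an invented helper.

```python
import gc


class ImmutableStringValue:
    """Hypothetical value class: value-based equality, duplicates not prevented."""

    def __init__(self, text):
        self._text = text

    def __eq__(self, other):
        return (isinstance(other, ImmutableStringValue)
                and self._text == other._text)

    def __hash__(self):
        return hash(self._text)

    def text(self):
        return self._text


def count_instances(cls, text):
    """Count live instances of cls that hold the given text."""
    return sum(1 for obj in gc.get_objects()
               if type(obj) is cls and obj.text() == text)


first = ImmutableStringValue("YES")
second = ImmutableStringValue("YES")

indistinguishable_by_value = (first == second)
duplicates = count_instances(ImmutableStringValue, "YES")
```

Equality cannot tell the two apart, but the enumeration still finds both, which
is exactly the flaw described above.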
		
>
>>However, it seems the running speed of program using immutable classes
>>can be slower than an equivalent program using mutable classes, since so many
>>garbage objects can be generated by the immutable version.
>>The question is 'are there optimization algorithms that can improve
>>the performance of immutable classes?'
Certainly.  Use hash tables to hold all instances (a la the Smalltalk-80
symbol table), where the hash tables are 'weak arrays' so that instances
referenced only from the table will get collected.  When you want to create an
instance of some immutable value class, given its intended value, first look in
the hash table & reuse any instance found there.  I think the same technique
could be used to solve the
allInstancesDo: problem either by arranging that duplicates are never created
or by ignoring instances not in the table during enumeration.
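
A minimal sketch of such a weak table, in Python, where
weakref.WeakValueDictionary stands in for a hash table built on weak arrays.
InternedStringValue and live_count are hypothetical names, and this is one
possible arrangement rather than how any particular Smalltalk does it.

```python
import gc
import weakref


class InternedStringValue:
    """Hypothetical value class: at most one live instance per distinct text."""

    # Weak table: an entry vanishes once nothing else references its instance.
    _table = weakref.WeakValueDictionary()

    def __new__(cls, text):
        existing = cls._table.get(text)
        if existing is not None:
            return existing              # reuse instead of creating a duplicate
        instance = super().__new__(cls)
        instance._text = text
        cls._table[text] = instance
        return instance

    def text(self):
        return self._text

    @classmethod
    def live_count(cls):
        """Number of instances currently reachable through the table."""
        return len(cls._table)


a = InternedStringValue("YES")
b = InternedStringValue("YES")

shared = (a is b)                        # both names refer to one object
count_while_live = InternedStringValue.live_count()

del a, b                                 # drop the only strong references
gc.collect()
count_after_release = InternedStringValue.live_count()
```

Because equal values always yield the same object, no duplicate is ever
created, and because the table holds its entries weakly, unused values are
still reclaimed.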


>There should be people in this group who know some answers to the question,
>but so far I haven't seen any.  However, I have some doubts about
>the relevance of the problem:
>1. If you are modelling mutable entities with your software, why should
>   you use immutable objects in the first place?
>2. If you really need immutable objects, there will not be generated
>   many garbage objects.
>
>I.e., use both mutable and immutable objects according to where
>each alternative is more appropriate.  Personally, I do not believe
>in "applicative object-oriented languages" (if that's what you had
>at the back of your mind), except perhaps for very special applications
>(pun not originally intended, but it came for free).
>
>Markku Sakkinen
We're doing work on a distributed Smalltalk and for us immutable values can be
freely copied between machines, which is much cheaper than creating proxies.
Consider e.g. a database that returns a string in response to a query from
another machine.  If the string is mutable & referenced by other objects then
a proxy must be created. (If the string is not referenced by other objects it
can be migrated).  The proxy will reference the string via some global id.
When the client wants to fetch the characters in the string a remote message
send will be needed for each character (some sort of concurrency control, e.g.
the read & write batons used in distributed filing systems, can optimize this).
If the string is an immutable value then it can simply be copied.
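
The difference in cost can be illustrated with a toy model in Python.  Nothing
here is taken from our system: Network, StringProxy and the two fetch
functions are hypothetical, and the "network" merely counts messages.

```python
class Network:
    """Hypothetical link between machines; it only counts the messages sent."""

    def __init__(self):
        self.messages = 0

    def send(self, request):
        self.messages += 1
        return request()


class StringProxy:
    """Client-side proxy for a mutable string that stays on the server."""

    def __init__(self, network, remote_chars):
        self._network = network
        self._remote = remote_chars      # would live on the server in reality

    def size(self):
        return self._network.send(lambda: len(self._remote))

    def at(self, index):
        return self._network.send(lambda: self._remote[index])


def fetch_via_proxy(network, remote_chars):
    """Read a mutable remote string one character per remote send."""
    proxy = StringProxy(network, remote_chars)
    return "".join(proxy.at(i) for i in range(proxy.size()))


def fetch_by_copy(network, immutable_text):
    """An immutable value can simply be shipped whole in one message."""
    return network.send(lambda: immutable_text)


proxy_net = Network()
via_proxy = fetch_via_proxy(proxy_net, list("YES"))

copy_net = Network()
via_copy = fetch_by_copy(copy_net, "YES")
```

For the three-character string the proxy route costs four messages in this
model (one for the size plus one per character), against a single message for
the copy.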

In our distributed Smalltalk we intend to arrange that substantial parts of
classes (especially methods) will be 'recursively immutable' by Markku's
definition.  Changes to classes will probably be done via a version system.
This should make the management of a large shared code-base much simpler.
-- 
Eliot Miranda			email:	eliot@dcs.qmw.ac.uk
Dept of Computer Science	ARPA:	eliot%dcs.qmw.ac.uk@nsf.ac.uk
Queen Mary Westfield College	UUCP:	eliot@qmw-dcs.uucp
Mile End Road			Fax:	081 980 6533 (+44 81 980 6533)
LONDON E1 4NS			Tel:	071 975 5229 (+44 71 975 5229)