Xref: utzoo comp.lang.smalltalk:3117 comp.object:3814 Newsgroups: comp.lang.smalltalk,comp.object Path: utzoo!utgpu!cunews!bnrgate!bqnes74!news From: CWatts@BNR.CA (Carl Watts) Subject: Smalltalk scaleability & IdentityDictionaries Message-ID: <1991Jun26.193441.28581@bqnes74.bnr.ca> Sender: news@bqnes74.bnr.ca Organization: Bell Northern Research Date: Wed, 26 Jun 91 19:34:41 GMT Gray Huggins of Texas Instruments recently sent me mail asking me questions and expressing concerns about Smalltalk scaleability. The example given was the limitation of the number of elements in an IdentityDictionary. My response (which follows), I think, would be of interest to other members of comp.lang.smalltalk and comp.object: Gray, Smalltalk scales better than other other language I know of. The reason is simple, you can model everything from the Space Shuttle to an Integer the same way. As an object (that perhaps uses other objects) that provides an excapsulated interface to some behavior. Now a Space Shuttle object may have thousands of other objects that it uses to provide its behavior, and an Integer may use no other objects to provide its behavior, but the ways in which you treat the two objects are pretty much the same. They both are Objects. And they both have a message interface allowing interaction with it. Smalltalk was designed from the start to not have any problems scaling. What other language do you now of that can as easily handle 2 + 3 as it can evaluate: 958685745856745747465784756948574847584737384877584 + 9293458823458723745734875847574874875748874374373646 Smalltalk tells you the answer to the first is 5 and the answer to the second is 10252144569315469493200660604523449723333611759251230 Smalltalk was the first general purpose language I know of that allowed infinite precision Integer arithmetic. Its thanks to the objects Integer, LargePositiveInteger, and LargeNegativeInteger. The IdentityDictionary problem is extremely easy to solve. Just get out of the mindset that the Classes that come with Smalltalk are somehow holier-than-holy and are by definition perfect and complete. They aren't. Every Class in Smalltalk is implemented to fullfil a certain purpose. And certain implementation compromises and design choices are always made. When IdentityDictionary was implemented some of the compromises and design choices that were made were: 1) Speed over Space. Hashing was used for fast speed at the cost of additional space. 2) A single object with indexable instance variables rather than some 'tree' based representation that costs more objects and where the benefits only become apparent when a large number of items are in the Dictionary. The class was not meant to necessarily be the perfect implementation of IdentityDictionary for all possible uses. The answer to your problem should be obvious now. Just change IdentityDictionary or make a new kind of IdentityDictionary. Simple. Now, of course the question comes up, what new design compromises and design choices do you want to make for this new class? a) Do you want to change the implementation of IdentityDictionary itself to use, say a balanced tree type representation? b) Do you want IdentityDictionarys to turn into LargeIdentityDictionarys when they get a certain number of elements in them (much like a SmallInteger turns into a LargePositiveInteger when it gets big enough). Then the LargeIdentityDictionary can use a different representation for storage that is more appropriate when you have large numbers of elements. Regardless of the first these choices, what new representation do you want? Multiple Arrays? Trees of IdentityDictionarys Self balancing splay trees of node elements? With each decision you are making new design decisions for your new kind of IdentityDictionary. Your new class will serve some particular need. IdentityDictionary was constructed to serve the most common need for a class like this. A fast, relative efficient IdentityDictionary. It didn't promise to be all things to all people. We chose to write a new kind of IdentityDictionary called LargeIdentityDictionary. IdentityDictionarys turned themselves into LargeIdentityDictionarys when they've got move than 16000 elements in them. And the converse also happens, if a LargeIdentityDictionary gets less than 8000 elements in it, it turns itself back into a normal IdentityDictionary. A LargeIdentityDictionary uses a different method of storing elements which has no limit on the number of elements. What you pay for this is slightly slower access times. Access time goes from O(constant) to O(log n). Still vary fast but slower than the hashing scheme employed by normal IdentityDictionarys. Hope this sheds some light.