Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!zaphod.mps.ohio-state.edu!rpi!sarah!cs.albany.edu!stst
From: stst@cs.albany.edu (Stefan Strack)
Newsgroups: comp.lang.prolog
Subject: Array implementation in (PDC) Prolog
Keywords: array, PDC Prolog
Message-ID: <858@watson.albany.edu>
Date: 22 Jun 91 01:31:36 GMT
Reply-To: stst@watson.albany.edu.UUCP (Stefan Strack)
Organization: SUNY Albany, Comp. Sci. Dept.
Lines: 69

This post is a bit lengthy, and concerns mostly PDC Prolog, so hit 'n' if
you're not interested .. also, I am not a research person, so no flames from
you Prolog-Gurus please ;-)

Prolog is great for recursive data-structures, such as lists, trees, etc..
Sometimes, though, I find myself needing a simple one-dimensional array.
What I mostly resort to is to implement an array as a collection of indexed
facts, such as:

        array(0,Contents1).
        array(1,Contents2).
        ...

Array-updates are implemented as assert/retract operation. The nice thing 
about this implementation is that accesses by index and by contents are 
possible. The bad news is that access speed is inversely related to the 
"size" of the array. So, what I am looking for is a better implementation of 
an array in Prolog. This implementation should:

        - be reasonably memory efficient
        - allow fast look-up by index (fast look-up by contents would
          also be good, but isn't critical for my application)
        - provide array-size independent access times (or at least with a
          reasonable upper bound)

By now, it should be clear that I'm not talking about pure Prolog.
Specifically, I am looking for something that works with PDC Prolog (that's
the only Prolog I have access to). The best solution that I came up with so
far involves using "external databases" in PDC Prolog. In short, the array
contents would be stored in an external database, in no particular order. A
B+ tree would be used to maintain the indicices in ascending order. Nodes in
the B+ tree consist of a string-key (sorted) and a pointer into the external
database, such as:

            B+ tree                External database
        Key     Pointer
        "0000"  A ---------------> Contents1
        "0001"  B ---------------> Contents2
        "0002"  C ---------------> Contents3
         ...                       ...

In this implementation, an empty array starts out as an empty B+ tree and an
empty external database. As the array gets filled with elements, the external
database is simply appended to, and the new index is inserted into the B+ 
tree, converting the numerical index into its string equivalent to form the 
key. Similarly, look-up of array[N] involves constructing the key-string "N",
and fetching the pointer to the external database element from the B+ tree.

Although I haven't tried this yet, it looks to me that this would be faster
than the array-as-collection-of-indexed-facts method in the beginning,
at least for the kind of large arrays I am interested in. It however requires
additional memory for the B+ tree.

Can you think of a more efficient solution? Maybe I am overlooking something
elegantly trivial here? Again, the keyword is "(largely) array-size
independent access-time". Although I am mostly interested in PDC Prolog, I
would also like to hear how you would do it in _your_ favorite Prolog.  And
yes, I expect a few to say "if you want to use arrays, go program in C or
Pascal", but that's not what I am interested in. Also "interface your Prolog
program to a C routine that maintains the array" is also not acceptable,
because my array should hold compound objects.

In case you are asking what kind of program I am writing: it's an
implementation of A.K.Dewdney's "Core War" game, and I am looking for a good
representation of the Core memory array.

 Please post or reply by email, I'll post a summary if there's interest (??).

Thanks, Stefan (stst@cs.albany.edu)