Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uunet!mcsun!ukc!dcl-cs!aber-cs!thor!pcg
From: pcg@thor.cs.aber.ac.uk (Piercarlo Grandi)
Newsgroups: comp.editors
Subject: A new idea (?)
Message-ID:
Date: 3 Jan 90 18:01:37 GMT
References: <1558@aber-cs.UUCP> <129799@sun.Eng.Sun.COM>
Sender: pcg@aber-cs.UUCP
Organization: Coleg Prifysgol Cymru
Lines: 120

There has been some discussion of my analysis of GNU Emacs memory management, with various opinions. There has also been (in comp.editors) a call for editor ideas. This posting merges the two.

There is (and this is a much wider subject) a fundamental divide between those who think that programming is building a tool that works (often Americans) and those who think that programming is communicating descriptions of entities and operations on them, both to humans and to computers (often Europeans).

The first type of programmers are 'hackers', and they aim for the 80% solution, that is, the solution that works for them 80% of the time, and damn the rest; the major example of this attitude is Bill Joy. The others aim for simple, documentable, complete solutions based on well understood principles; the major example is Dijkstra. (Note: both Joy and Dijkstra are extreme examples.)

I tend to think that I belong to the Dijkstra group, not the Joy crowd. I have been thinking about designing my own proper editor, not as a quick (or slow) hack, but as it should be done. I think I have got a clue (IMNHO), and here it is. I think it would make a fine Ph.D. thesis subject. It is an old idea of mine, but I haven't yet seen it published (the closest thing is the concept of the Smalltalk browser).

How can we characterize an editor? It is clearly (to my jaundiced eye) a set manipulator (secondarily, a set browser). In particular, I think we can restrict our discussion to manipulations of sequences (Hoare's article in Structured Programming), even though at least the TSS editor was key based.
More precisely, an editor should be equivalent to a generic sequence facility. What kind of operations can be done on sequences? Well: add element, remove element, apply, concatenation, order comparison if applicable, ...

An editor is something that reads a sequence from a file into a buffer, manipulates it, and then writes it back. It can take advantage of the fact that, while the sequence is in the buffer, manipulation need not be strictly sequential, for example if the entire sequence can be held in the buffer.

Notice that under this definition rn, mail -f, and a lot of other programs are editors. Indeed an editor should be generic in the sense that it operates on sequences of entities, where entities may have any type: news articles, mail messages, passwd lines, etc...

An editor should have a general purpose sequence management engine, and a number of modules with a standardized interface that define the semantics for each specific record type. Each record type module should have primitives that indicate record boundaries, allow for (potentially recursive!) intra-record editing, display the record contents in a pretty form, and allow content-based selection of records. An editor should have an internal language that allows the user to write programs using these operations through a high-level interface.

It is clear that a generic editor can be what a browser is *and* more. I can easily conceive of an editor that in one window allows me to edit a mailbox, in another a C source where each file-level entity is considered a record, in another a list of newsgroups where each newsgroup is considered a record, and in yet another one of these newsgroups, where each article is considered a record. Even simple text may have different modules, e.g. for files made up of lines composed of ASCII characters or of two-byte Chinese ones. I can easily conceive of an editor whose 'lines' can take multiple lines of screen, and have many fields.
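To make the split between engine and record-type modules concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of the proposal: the names RecordType, LineRecords, MboxRecords and SequenceEngine are hypothetical, the mailbox splitter uses a simplified "every line starting with 'From ' begins a message" rule, and intra-record editing and the internal language are left out.

```python
from typing import Callable, List, Protocol


class RecordType(Protocol):
    """Hypothetical standardized interface for a record-type module."""

    def split(self, data: str) -> List[str]: ...          # record boundaries
    def display(self, record: str) -> str: ...            # pretty form
    def matches(self, record: str, query: str) -> bool: ...  # content selection


class LineRecords:
    """Records are newline-terminated text lines (e.g. passwd lines)."""

    def split(self, data: str) -> List[str]:
        return data.splitlines(keepends=True)

    def display(self, record: str) -> str:
        return record.rstrip("\n")

    def matches(self, record: str, query: str) -> bool:
        return query in record


class MboxRecords:
    """Records are mail messages; simplified rule: 'From ' starts a message."""

    def split(self, data: str) -> List[str]:
        records: List[str] = []
        for line in data.splitlines(keepends=True):
            if line.startswith("From ") or not records:
                records.append(line)
            else:
                records[-1] += line
        return records

    def display(self, record: str) -> str:
        # Show a message as its subject, the way a mail reader lists it.
        for line in record.splitlines():
            if line.startswith("Subject:"):
                return line[len("Subject:"):].strip()
        return "(no subject)"

    def matches(self, record: str, query: str) -> bool:
        return query in record


class SequenceEngine:
    """Generic engine: knows about sequences, nothing about record contents."""

    def __init__(self, rtype: RecordType, data: str) -> None:
        self.rtype = rtype
        self.records = rtype.split(data)

    def insert(self, index: int, record: str) -> None:
        self.records.insert(index, record)

    def remove(self, index: int) -> str:
        return self.records.pop(index)

    def apply(self, fn: Callable[[str], str]) -> None:
        self.records = [fn(r) for r in self.records]

    def select(self, query: str) -> List[int]:
        return [i for i, r in enumerate(self.records)
                if self.rtype.matches(r, query)]

    def listing(self) -> List[str]:
        return [self.rtype.display(r) for r in self.records]

    def write(self) -> str:
        return "".join(self.records)
```

The same SequenceEngine can be handed LineRecords() for a passwd file or MboxRecords() for a mailbox; only the module changes, which is the point of the standardized interface.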
In some sense much of the user friendliness of GNU Emacs comes from such a view of the world, except that in GNU Emacs all types of 'records' are forcibly mapped into ASCII lines, in an ad hoc way. This is both inefficient and limiting.

As to implementation details, I would implement this editor by mapping a file into memory (or just leaving it where it is), and building a list of pointers to each record boundary in it. I would perform updates by recording them in a log (either in memory or in a file) threaded with the pointer list, as this avoids the need to copy the original and allows easy undo. Writing out then simply means following the pointer list and writing out a record at a time (with obvious optimizations if records happen to be contiguous in storage).

If many updates are done, the user should be given the option of trimming the log (forgoing some undo), or of merging the log with the original and restarting. Before either of these becomes necessary, the log of updated records should of course be compacted every now and then, both to drop useless entries and to cluster records for better locality.

As per the Emacs cookbook, I would have an editing engine distinct from the redisplay engine, communicating via a window image buffer. As per the good old Arizona idea of frontending ed(1) for full screen editing, one could conceivably have them in distinct processes, thus making it possible to have several front ends, e.g. one oriented to full screen editing, one for teletypes, one for X, etc...

The editing engine and the display engine can be *entirely* independent of the record semantics provided in each specialized module. As an optimization, to avoid the necessary indirection overhead in the most common case, I can conceive of the module for records of ASCII characters terminated by newline being compiled in and special cased, to reduce path length.
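The pointer-list-plus-log scheme described above can be sketched as follows, again in Python. This is a rough illustration under stated assumptions, not a definitive design: the names Span, LoggedBuffer and line_boundaries are hypothetical, the "mapped file" is just an immutable bytes object, and undo is done by snapshotting the whole pointer list on each edit, which is simpler (and more wasteful) than threading undo information through the log.

```python
from dataclasses import dataclass
from typing import List, Tuple

ORIGINAL, LOG = 0, 1  # which store a span points into


@dataclass(frozen=True)
class Span:
    source: int  # ORIGINAL or LOG
    start: int   # offset of the record's first byte
    end: int     # offset one past the record's last byte


def line_boundaries(data: bytes) -> List[Tuple[int, int]]:
    """Record boundaries for newline-terminated records (assumed record type)."""
    bounds, start = [], 0
    for i, byte in enumerate(data):
        if byte == 0x0A:
            bounds.append((start, i + 1))
            start = i + 1
    if start < len(data):
        bounds.append((start, len(data)))
    return bounds


class LoggedBuffer:
    def __init__(self, original: bytes, boundaries: List[Tuple[int, int]]) -> None:
        self.original = original   # never modified; could be an mmap
        self.log = bytearray()     # append-only store of new record contents
        self.spans = [Span(ORIGINAL, s, e) for s, e in boundaries]
        self.undo_stack: List[List[Span]] = []  # assumption: snapshot undo

    def _checkpoint(self) -> None:
        self.undo_stack.append(list(self.spans))

    def _append_to_log(self, data: bytes) -> Span:
        start = len(self.log)
        self.log += data
        return Span(LOG, start, len(self.log))

    def replace(self, index: int, data: bytes) -> None:
        self._checkpoint()
        self.spans[index] = self._append_to_log(data)

    def insert(self, index: int, data: bytes) -> None:
        self._checkpoint()
        self.spans.insert(index, self._append_to_log(data))

    def delete(self, index: int) -> None:
        self._checkpoint()
        del self.spans[index]

    def undo(self) -> None:
        if self.undo_stack:
            self.spans = self.undo_stack.pop()

    def write_out(self) -> bytes:
        """Follow the pointer list, coalescing records contiguous in storage."""
        out = bytearray()
        i = 0
        while i < len(self.spans):
            first = self.spans[i]
            end = first.end
            while (i + 1 < len(self.spans)
                   and self.spans[i + 1].source == first.source
                   and self.spans[i + 1].start == end):
                end = self.spans[i + 1].end
                i += 1
            store = self.original if first.source == ORIGINAL else self.log
            out += store[first.start:end]
            i += 1
        return bytes(out)
```

Note that edits never touch the original bytes; they only append to the log and rearrange pointers, which is why undo reduces to restoring an earlier pointer list. Trimming or compacting the log, as discussed above, would amount to rewriting the LOG spans and discarding undo snapshots that refer to dropped entries; that part is not shown.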
As I see it, the right way to design such an editor would be to carefully analyze the 'sequence of records' type constructor, define a suitable abstract interface for it, making sure it makes good semantic sense, and then define what operations on the single records need to be available to support the higher level interface. But alas, this is not hacking...

Any takers?
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk