Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!uwm.edu!uakari.primate.wisc.edu!nic.MR.NET!uc!uh!fin
From: fin@uh.msc.umn.edu (Craig Finseth)
Newsgroups: comp.editors
Subject: Re: A new idea (?)
Message-ID: <1026@uc.msc.umn.edu>
Date: 3 Jan 90 23:18:30 GMT
References: <1558@aber-cs.UUCP> <129799@sun.Eng.Sun.COM>
Sender: news@uc.msc.umn.edu
Reply-To: fin@uh.UUCP (Craig Finseth)
Organization: Minnesota Supercomputer Center, Minneapolis, MN
Lines: 114

> There is (and this is a much wider subject) a fundamental divide
> between those that think that programming is building a tool that
> works (often Americans) and those that think that programming is
> communicating descriptions of entities and operations on them, both
> to humans and computers (often Europeans). The first type of
> programmers are 'hackers', and they aim for the 80% solution, that is
> the solution that works for them 80% of the time, and damn the rest;
> the major example of this attitude is Bill Joy; the others aim for
> simple, documentable, complete solutions based on well understood
> principles; major example is Dijkstra. (Note: both Joy and Dijkstra
> are extreme examples).

I am very uncomfortable with this division. I prefer a three-way
division:

	- Those that aim for the 80% solution (hackers: "see, it works
	  for me").

	- Those that aim for good textbook cases (academics: "see this
	  elegant algorithm? so what if it scales O(n^n)").

	- Those that design, produce, and document complete solutions.
	  Such solutions perform well, have good user interfaces, are
	  easy to maintain, and fit well into the system as a whole.

(Many well-thought-out general statements omitted.)

> In some sense much of the user friendliness of GNU Emacs comes from
> such a view of the world, only that in GNU Emacs all types of
> 'records' are forcibly mapped into ASCII lines, in an ad hoc way.
> This is both inefficient and limiting.

Ah, but it is very powerful and consistent.

This very issue comes up time and again with structure editors.
Yes, it is very pure and consistent to build an editor in terms of
objects. However, IT IS IMPORTANT THAT THE EDITOR'S VIEW OF THE
OBJECTS IS WELL-MATCHED TO THE USER'S VIEW. (Note carefully which
view is variable and which is constant.)

I view the Emacs interface as very consistent. All of my learning
regarding editing this message carries directly over to editing C
code, to editing nroff source, and to patching binary files.

Many people have developed many different structure editors. I have
yet to see one that has a user interface with the same degree of
consistency, carryover, and power as the Emacs interface. I'm not
(yet) saying that it can't be done: I'm saying that it has not been
done and you may wish to think twice about the scope of the problem
that you're tackling.

> As to implementation details, I would implement this editor by
> mapping a file into memory (or just leaving it where it is), and
> building a list of pointers to each record boundary in it;
> performing updates by recording them in a log (either in memory or
> in a file), as this avoids the need to copy the original, and allows
> easy undo, threaded with the pointer list. Writing out simply means
> then following the pointer list and writing out a record at a time
> (with obvious optimizations if records happen to be contiguous in
> storage). If many updates are done, the user should be given the
> option of trimming the log (forgoing some undo), or to merge the log
> with the original and restart. Before any of these two are
> necessary, the log of updated records should be compacted, of
> course, every now and then, both as to useless entries and as to
> clustering for better locality.

This is a good design in theory. Does it scale well in practice? In
particular, as redisplay -- not editing -- has been shown by your own
data to be the sticking point, how well does this representation
speed up redisplay?
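
The quoted design (a pointer list over record boundaries, updates
recorded in a log, write-out by following the pointers) can be
sketched roughly as below. This is only an illustration in Python,
not code from either poster: the names RecordBuffer, Span, ORIGINAL
and LOG are made up for the example, records are assumed to be
newline-terminated byte strings, and log trimming and compaction are
left out.

```python
import io
from dataclasses import dataclass

ORIGINAL, LOG = 0, 1  # which store holds a record's bytes (hypothetical tags)


@dataclass(frozen=True)
class Span:
    store: int   # ORIGINAL or LOG
    start: int   # byte offset into that store
    length: int  # record length, terminator included


class RecordBuffer:
    """Pointer list over an untouched original plus an append-only log."""

    def __init__(self, data: bytes, terminator: bytes = b"\n"):
        self.original = data    # never copied or modified
        self.log = bytearray()  # new record text is only ever appended here
        self.spans = []         # one Span per record, in file order
        self.undo_stack = []    # (index, spans added, spans removed)
        start = 0
        while start < len(data):
            end = data.find(terminator, start)
            end = len(data) if end < 0 else end + len(terminator)
            self.spans.append(Span(ORIGINAL, start, end - start))
            start = end

    def _store(self, span):
        return self.original if span.store == ORIGINAL else self.log

    def _append(self, text: bytes) -> Span:
        span = Span(LOG, len(self.log), len(text))
        self.log += text
        return span

    def _splice(self, index, removed, new_spans):
        old = self.spans[index:index + removed]
        self.spans[index:index + removed] = new_spans
        self.undo_stack.append((index, len(new_spans), old))

    def record(self, index) -> bytes:
        s = self.spans[index]
        return bytes(self._store(s)[s.start:s.start + s.length])

    def replace(self, index, text: bytes):
        self._splice(index, 1, [self._append(text)])

    def insert(self, index, text: bytes):
        self._splice(index, 0, [self._append(text)])

    def delete(self, index):
        self._splice(index, 1, [])

    def undo(self) -> bool:
        """Revert the most recent update by restoring the old pointers."""
        if not self.undo_stack:
            return False
        index, added, old = self.undo_stack.pop()
        self.spans[index:index + added] = old
        return True

    def write(self, out) -> int:
        """Follow the pointer list, merging runs that are contiguous in
        storage into one write. Returns the number of writes issued."""
        writes, i = 0, 0
        while i < len(self.spans):
            first = self.spans[i]
            end = first.start + first.length
            i += 1
            while (i < len(self.spans)
                   and self.spans[i].store == first.store
                   and self.spans[i].start == end):
                end += self.spans[i].length
                i += 1
            out.write(bytes(self._store(first)[first.start:end]))
            writes += 1
        return writes


buf = RecordBuffer(b"alpha\nbeta\ngamma\n")
buf.replace(1, b"BETA\n")   # the original bytes are left alone
out = io.BytesIO()
buf.write(out)              # three writes: original, log, original
buf.undo()                  # pointer list is back to the original records
```

Note that nothing here answers the redisplay question: finding the
records that fall inside a window still means walking the pointer
list, and the sketch makes no attempt to keep that cheap.
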
(The best argument that I am aware of for linked line representation
is the performance boost that it gives to redisplay.)

> As per the Emacs cookbook I would have an editing engine distinct
> from the redisplay engine, communicating via a window image
> buffer.....

The "cookbook" also pointed out that the "pure" model would in
general not perform well enough. Well-designed "behind the scenes"
hooks between the editing and redisplay engines are in general
necessary.

> The editing engine and the display engine can be *entirely*
> independent of the record semantics provided in each specialized
> module. As an optimization, to avoid the necessary indirection
> overhead in the most common case, I can conceive the module for
> records of ASCII characters terminated by newline being compiled in
> and special cased, to reduce path length.

You're already bringing in those pesky details which clutter up
elegant algorithms.

> As I see it, the right way to design such an editor would be to
> carefully analyze the 'sequence of records' type constructor, define
> a suitable abstract interface for it, making sure it makes semantic
> good sense, and then define what kind of operations on the single
> records are needed to be available to support the higher level
> interface. But alas, this is not hacking...

Neither will it lead to a pure, elegant program. Good software
engineering (the field that many of us claim to practice) must
simultaneously encompass high level design and nitty-gritty detail.
As with all engineering, its essence is compromise.

Your model is a good solid one. It must be shaped by all external
constraints (user interface, machine performance limits,
representations of the objects to be manipulated, etc.) into an
actual program product. In the process, the model will be refined
and changed, most often by compromises that are mandated by external
constraints and that you would rather not have to make.

Craig A. Finseth			fin@msc.umn.edu [CAF13]
Minnesota Supercomputer Center, Inc.
+1 612 624 3375