Path: utzoo!telly!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!WSL.DEC.COM!bothner
From: bothner@WSL.DEC.COM
Newsgroups: gnu.gdb.bug
Subject: Gdb reading of symbols
Message-ID: <8904172200.AA23780@gilroy.pa.dec.com>
Date: 17 Apr 89 21:59:59 GMT
Sender: daemon@tut.cis.ohio-state.edu
Distribution: gnu
Organization: GNUs Not Usenet
Lines: 64

This is a discussion on improving gdb, rather than a bug report.

To reduce the start-up time of gdb, there seems to be some hack
to do "lazy" reading of symbols. I'm not sure this is a win,
because whenever I do a "bt" I get very annoying pauses (when
debugging a large program, such as an X server). These
unpredictable pauses detract very strongly from the user interface.

There is an alternative: Using gcc's "-gg" flag, which writes out
symbols in gdb internal format. The main problem (besides the extra
non-standard magic needed in gas, ld, and gdb) is that gdb symbols
are rather large.

The solution is not to give up on gdbsyms, but to make the gdb
symbol tables more compact.

The information in a gdb symbol segment is of three main kinds:
- Symbol names. It is hard to make these more compact, and standard
  dbx symbols are no more compact anyway. It may be possible to win
  some by avoiding duplicate strings; I don't know how worthwhile
  or hard this would be. Some more space might be saved by only
  padding to long boundaries when needed.
- Line number information. Gdb symbols are already more compact
  than dbx symbols. However, the space needed could be halved
  by using shorts instead of longs. The machine code addresses
  would have to be relative instead of absolute. Changing this part
  would most affect the assembler, so I would leave it alone for now.
- Tables of pointers, integers, and flags. This is where a lot of
  space can be saved. Most of these fields are 32 bits, where
  16 bits is plenty for almost all cases.
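
As a rough illustration of the line number item above (shorts instead
of longs, with relative instead of absolute addresses), here is a
minimal sketch in C. The struct and function names are hypothetical
and are not taken from the gdb sources; the sketch assumes a 32-bit
"long" as on the machines of the time, and a per-table base address
against which each entry's address is stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "wide" entry: 32-bit line number, 32-bit absolute address. */
struct wide_line_entry {
    uint32_t line;
    uint32_t address;
};

/* Hypothetical "narrow" entry: 16-bit line number and a 16-bit
   offset from a base address kept once per table (assumption). */
struct narrow_line_entry {
    uint16_t line;
    uint16_t offset;   /* address minus the table's base address */
};

/* Pack one wide entry into a narrow one.
   Returns 0 on success, -1 if the line number or the address offset
   does not fit in 16 bits (a real format would need an escape for
   that case; this sketch only reports it). */
int pack_line_entry(uint32_t base,
                    const struct wide_line_entry *in,
                    struct narrow_line_entry *out)
{
    if (in->address < base)
        return -1;
    if (in->address - base > UINT16_MAX || in->line > UINT16_MAX)
        return -1;
    out->line = (uint16_t)in->line;
    out->offset = (uint16_t)(in->address - base);
    return 0;
}

/* Recover the absolute address from a narrow entry. */
uint32_t line_entry_address(uint32_t base, const struct narrow_line_entry *e)
{
    assert(e != NULL);
    return base + e->offset;
}
```

Each narrow entry takes 4 bytes against 8 for the wide one, which is
the halving mentioned above. The cost is one addition per lookup and
the need to handle entries that fall outside the 16-bit range.
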

Suggestions:
- All pointers to data structures within the symbol segment
  (such as pointers to strings, or types) are replaced by
  a 16-bit relative displacement. (An exception might be made
  for certain global structures and tables, such as the type vector.)
  One of the 16 bits is reserved as an overflow flag, which is
  used to indicate an extra level of indirection for the very few
  cases where the displacement won't fit in 16 bits:
  If the overflow flag is on, the displacement indicates
  a 32-bit (absolute or relative?) pointer to the real data.
- Using relative displacements means that gdb internally
  uses relative pointers. Dereferencing a pointer becomes
  somewhat slower, but I cannot imagine a significant effect.
  Gdb would no longer relocate a segment on startup.
  This would save some startup time. In addition, less data
  would need to be read from disk, and less memory would be used.
  Because the symbol segment is read-only, a system with mappable
  files need only map the segment into virtual memory, without
  having to actually read it.
- Similarly, all (or most) integer, enum, and flag fields
  can be made 16 bits (or less). For integers that can occasionally
  (though infrequently) need more than 16 bits, I again suggest
  reserving one bit as an overflow bit, to indicate a pointer
  to a 32-bit value.

Another potential win is to reduce the duplication of symbols
that arises because many files include the same include files.
That is best done within the context of a module-oriented system,
which requires a fairly major change in philosophy from
the C way of doing separate compilation.

Unfortunately, I cannot volunteer to do the work, so I guess
I don't deserve much say in the matter...

	--Per Bothner
Western Software Lab, Digital Equipment, 100 Hamilton Ave, Palo Alto CA 94301
bothner@wsl.dec.com ...!decwrl!bothner