Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!elroy.jpl.nasa.gov!ames!apple!snorkelwacker.mit.edu!shelby!neon!opbibtex
From: opbibtex@Neon.Stanford.EDU (Oren Patashnik)
Newsgroups: comp.text.tex
Subject: Re: sorting in bibtex
Keywords: bibtex, sorting
Message-ID: <1991Feb9.203132.6226@Neon.Stanford.EDU>
Date: 9 Feb 91 20:31:32 GMT
References: <790@utrcu1.UUCP> <1991Feb7.175558.9848@Neon.Stanford.EDU>
Organization: Computer Science Department, Stanford University
Lines: 140

I got lots of requests to post a spiel I mentioned in a previous
article, so here it is.

---------------------------------------

First I'll flame a bit about bibliography styles in general. I'll
argue that author-date styles, whose citations in the text look like
[Jones 76] or (Jones, 1976) or [Jon76], are based on outdated
technology and are bad. Then I'll give some .bst-file information.

BEGIN FLAME

I understand that there's often little choice in choosing a
bibliography style. Journal X says you must use style Y and that's
that. If you have a choice, however, I strongly recommend that you
choose something like BibTeX's `plain' (numbers-in-brackets) standard
style. Such a style, van Leunen argues convincingly in her "Handbook
for Scholars", encourages better writing than the alternatives---more
concrete, more vivid.

By contrast, author-date styles encourage flabby writing. For
instance many such styles almost require the passive voice---"It has
been shown [Knuth 76] that ..."; the `plain' style avoids the passive
voice---"Knuth [13] shows that ..."

But the passive-voice problem isn't the worst of it. The author-date
style forces you to include in your sentence author and date
information, even when some or all of that information is more a
distraction than a help to the reader. Furthermore, author-date makes
it awkward to include other information that might be helpful.
The `plain' style, on the other hand, allows you to include exactly
the information in the sentence that belongs---sometimes the author,
sometimes the year, sometimes neither, but sometimes other stuff---
while minimally interrupting the flow of the sentence. For example
if the year information is crucial you simply write something like:
"Knuth's seminal 1976 paper on mumble [13] shows ..."

Another example, in `plain':

    The field blossomed in 1976, starting with Knuth's tantalizing
    theory [13]. Others tore the theory to shreds [7], [8], [12],
    [41], [43], but that theory sparked . . .

This is reasonably clean and crisp. Here's how it might come out in
`author-date':

    The field blossomed one year, starting with a prominent computer
    scientist's tantalizing theory [Knuth 76]; others tore the theory
    to shreds [Aho and Ullman 76], [Aho et al. 76],
    [Hopcroft and Ullman 76], [Ullman and Yannakakis 76],
    [Yannakakis 76], but that theory sparked . . .

The passage, in conforming to the rigidity of the author-date style,
has become a metastasized mess.

"But The Chicago Manual of Style prefers an author-date style," some
people point out. True, but for anachronistic reasons:

    The chief disadvantage of [a style like `plain'] is that
    additions or deletions cannot be made after the manuscript is
    typed without changing numbers in both text references and list.
    (page 401, thirteenth edition)

With computer-typesetting systems like LaTeX, however, this
disadvantage obviously evaporates.

"But I hate it when someone writes a sentence like `The mumble
theorem was proved in [13]', forcing me to flip to the reference list
to see who did the proving," others complain. I hate that too; but
in my view the fault lies with the *writer* of that sentence, not
with the number-in-brackets style itself.
Nothing in that flexible style prevents a writer from saying
`Knuth [13] proved the mumble theorem,' or from giving additional
information that's useful to the reader (or from omitting even the
author's name in the rare circumstances that call for it). To
overstate the argument a bit: Just as we don't blame a typeface for
the poor writing of those who use that typeface, we shouldn't blame
the numbers-in-brackets style for the sloppiness of those who use
that style.

And it's not just the text that suffers from an author-date style;
the reference list has logical deficiencies too, as anyone who's
written a thorough program for such a style can attest. *All*
author-date styles have these deficiencies, but which deficiencies
arise depends on the specific author-date style. For example a style
that uses labels like [ABC86] (for a 1986 paper by Fred Aza, Joe
Bloe, and Bill Collier) must sort first by label, then by author
(otherwise---if it sorted first by author---a reader might have to
search three pages of `A' listings in a large bibliography to find
the reference, because he won't necessarily know that the `A' stands
for `Aza' and he must therefore look through all the `A' listings
before finding [ABC86] at the end). But this kind of sorting (label
first, then author) gives an unnatural order: The [ABC86] paper, for
example, would come near the beginning of the `A' listings, rather
than in its natural spot near the end. [Note: Since I originally
wrote this spiel, I've changed my mind---I now think that the
problems with label-first sorting are worse than the problems with
author-first sorting; hence the next version of the `alpha' standard
style will probably use author-first sorting; but having to decide
which type of sorting is worse merely underscores the author-date
deficiencies.]

Alternatively, a style that uses labels like [Aza et al. 86] will
produce entries that tend not to be as far from their natural spot
as with the [ABC86] style (although the alternative style can still
produce an unnatural order), but the longer, more cumbersome labels
are a nuisance.

Worse yet are styles that don't use labels in the reference list at
all. They have the advantage of being in natural order, but they
might separate the entries for [Smith et al. 83a] and
[Smith et al. 83b] by a full page in a large reference list. Not
only that, but these nonlabel styles might require the reader to
search two pages of `Smith' listings to find the [Smith et al. 84]
entry.

Any one of these problems is fixable, but only at the expense of
introducing new, worse problems. What a mess.

There are other problems with certain author-date styles, but that's
enough for now. (If forced to choose among these author-date styles,
I'd choose, probably kicking and screaming, the one that produces
labels like [ABC86] and [Knu76], which is BibTeX's `alpha' style.)

The `plain' style has none of these reference-list deficiencies---
it produces the natural order, it has short labels, and it has the
simplest and quickest scheme for finding a reference in the list---
all while providing the most flexible in-text citation scheme.

END OF FLAME.

If you don't buy my arguments, or if you're saddled with an
unenlightened editor, there are author-date options. BibTeX's
standard style `alpha' uses labels like [ABC86] for multiple authors
and [Knu76] for single authors.

There is also an `apalike' style that has no labels in the reference
list and that produces citations in the text like (Aho, 1983) or
(Aho and Hopcroft, 1983) or (Aho et al., 1983). This style resides
in the Clarkson style collection. (In addition to the file
apalike.bst, you'll need apalike.sty (so that you can give `apalike'
as an optional argument to the \documentstyle command) if you're
using BibTeX with LaTeX, or apalike.tex if you're using BibTeX with
TeX.)
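
For the LaTeX case, the setup just described might look roughly like
the sketch below. It assumes apalike.sty and apalike.bst are
installed where LaTeX and BibTeX can find them; the database name
`refs' and the citation key `aho83' are hypothetical, invented only
for this illustration.

```latex
% Minimal sketch, using the \documentstyle syntax mentioned above.
% Assumes apalike.sty and apalike.bst are on the search path.
% "refs" (that is, refs.bib) and the key "aho83" are hypothetical.
\documentstyle[apalike]{article}   % loads apalike.sty as an option
\begin{document}
The mumble theorem has a short proof \cite{aho83}.
\bibliographystyle{apalike}        % tells BibTeX to use apalike.bst
\bibliography{refs}                % tells BibTeX to read refs.bib
\end{document}
```

With this in place the \cite should come out in the (Aho, 1983) form
described above, after the usual LaTeX, BibTeX, LaTeX, LaTeX sequence
of runs. For `alpha', which is a standard style, the optional
argument to \documentstyle should not be needed; only the argument
of \bibliographystyle changes.
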
Both these styles are for BibTeX version 0.99 or later.

If you need a variation on these styles, it's best to (1) start with
`alpha' if your style uses labels in the reference list, or with
`apalike' if your style doesn't have labels, and (2) then modify (but
change the name when you're finished modifying). The Clarkson style
collection may have other author-date styles as well.

--Oren Patashnik (opbibtex@neon.stanford.edu)