11 Related Work

As noted in the introduction, most existing documentation tools fall into one of three categories: LaTeX-like tools, JavaDoc-like tools, and WEB-like tools.

The LaTeX category includes general word-processing tools like Microsoft Word, but LaTeX offers the crucial advantage of programmability, where macros enable automatic formatting of API details. Systems like Skribe (Gallesio and Serrano 2005) improve LaTeX by offering a sane programming language. Even in a programmable documentation language, however, a lack of connection to source code means that information is duplicated in documentation and source, and binding and evaluation rules inherent to the source language are not automatically reflected in documentation and in examples related to those bindings.

The JavaDoc category includes perldoc for Perl, RDoc for Ruby, Haddock (Marlow 2002) for Haskell, OCamlDoc (Leroy 2007), Doxygen (van Heesch 2007) for various languages (including Java, C++, C#, and Fortran), and many others. Such tools improve on the LaTeX category, in that they provide a closer connection to the programs that they document. In particular, they are specifically designed for library API documentation, where they shine in automatic extraction of API details from the source code. These tools are not suitable for other kinds of stand-alone documents, such as overview documents, tutorials, and papers (like this one), where prose and document structuring are more central than API details.

Literate programming tools such as WEB (Knuth 1984) and noweb (Ramsey 1994) are designed for documenting the implementation of a library as much as the API that a library exports. In a sense, these tools are an extreme version of the JavaDoc category, where the information communicated to a reader is drawn from both the prose and the executable source. In doing so, unfortunately, the tools typically revert to a textual slice-and-dice of the program and prose sources, instead of a programmable layer that spans the two halves.

Simonis and Weiss (2003) provide a more complete overview of existing systems and add ProgDoc, which is similar to noweb in the way that it uses a pipeline of tools. Scribble builds on many ideas from these predecessors, but fits them into an extensible framework backed by an expressive programming language.

Skribe (categorized above in the LaTeX group) is by far the system most closely related to Scribble. Like Scribble, Skribe builds on Scheme to construct representations of documents using Scheme functions and macros, and it uses an extension of Scheme syntax to make it more suitable for working with literal text. (Skribe uses square brackets to quote strings, and within square brackets, a comma followed by an open parenthesis escapes back into Scheme.) Skribe’s format-independent document structure and its use of passes to render a document influenced the design of Scribble. Skribe, however, lacks an integration with lexical binding and the module system that is the heart of Scribble. For example, a scheme form that typesets and links an identifier in a lexically sensitive way is not possible to implement in Skribe without building a PLT Scheme-style module and macro layer on top of Skribe.

Scribble builds on a long line of work in Lisp-style language extensibility, including traditional Lisp macros, lexically scoped macros in Scheme (Dybvig et al. 1993), and readtable-based syntactic extension as in Common Lisp. Phase-sensitive binding through for-label is specific to PLT Scheme, as is the disciplined approach to reader extension embodied by #lang.

The SLaTeX (Sitaram 2007) system provides automatic formatting of Scheme code within a LaTeX document. To identify syntactic forms and constants, SLaTeX relies on defkeyword and defconstant declarations. In this mode, the author of a work in progress must constantly add another “standard” binding to SLaTeX’s list; SLaTeX’s built-in table of syntactic forms is small compared to the number of syntactic forms available in PLT Scheme. More generally, the problem is the usual one for “standards”: there are many to choose from. Scribble solves this problem with for-label imports and by directly using the namespace-management functionality of PLT Scheme modules.

Many systems follow the Lisp tradition of docstrings, in which documentation is associated with run-time values and used for online help. Python supports docstrings, and its doctest module even extracts and executes examples as tests, analogous to Scribble’s examples form. Scribble supports a docstring-like connection between run-time bindings and documentation, but using lexical-binding information instead of the value associated with a binding. For example, (help cons) in PLT Scheme’s read-eval-print loop opens documentation for cons based on its binding as imported from scheme/base, and not based on the procedure obtained by evaluating cons.
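
To make the contrast concrete, the following is a minimal Python sketch of the docstring approach described above. The function area and the name alias are hypothetical, chosen only for illustration; the sketch assumes nothing beyond the standard doctest module. It shows that the documentation travels with the run-time value rather than with the binding used to reach it, and that doctest can run the example embedded in the docstring.

```python
import doctest


def area(width, height):
    """Return the area of a rectangle.

    >>> area(3, 4)
    12
    """
    return width * height


# The docstring is stored on the function value itself, so any other
# name bound to the same value sees the same documentation.
alias = area
same_doc = alias.__doc__ is area.__doc__

# doctest extracts the ">>>" example from the docstring and executes it,
# comparing the actual result against the expected output line.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(area, "area", globs={"area": area}):
    runner.run(test)
results = runner.summarize(verbose=False)
```

Here results records how many examples were attempted and how many failed. Because the lookup goes through the value, renaming or re-exporting the function does not change which documentation is found; that is the opposite of the binding-based lookup that Scribble uses.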

Smalltalk programming environments (Kay 1993) have always encouraged programmers to use the source (with its comments) as documentation, and environments like Eclipse and Visual Studio now make code navigation similarly convenient for other languages. Such tools do not, however, supplant the need for external documentation such as guides and tutorials.

In terms of surface syntax, many documentation systems build on S-expression notation (or its cousin XML) as a way to encode both document structure and program structure. Such representations are especially appropriate for an intermediate representation of documentation, as in DocBook (Walsh and Muellner 2008). S-expression encodings of documentation are especially common in Lisp projects, where data and code are mingled easily.