4 @s and []s and {}s, Oh My!
Users of a text-markup language experience first and foremost the language’s concrete syntax. The same is true of any language, but in the case of text, authors with different backgrounds have arrived at a remarkably consistent view of the appropriate syntax: it should use blank lines to indicate paragraph breaks, double-quote characters should not be special, and so on. At the same time, a programmable mark-up language needs a natural escape to the programming layer and back.
From the perspective of a programming language, conventional notations for string literals are terrible for writing text. The quoting rules tend to be complex, and they usually omit an escape for arbitrarily nested expressions. “Here strings” and string interpolation can alleviate some of the quoting and escape problems, but they are insufficient for writing large amounts of text with frequent nested escapes to the programming language. More importantly, building text in terms of string escapes and operations like string-append distracts from the business of writing prose, which is about text and markup rather than strings and function calls.
Indeed, many documentation systems, like JavaDoc, avoid the limitations of string literals in the language by defining a completely new syntax that is embedded within comments. Of course, this approach sacrifices any connection between the text and the programming language.
For Scribble, our solution is the @-notation, which is a text-friendly alternative to traditional S-expression syntax. More precisely, the @-notation is another way to write down arbitrary S-expressions, but it is tuned for writing blocks of free-form text. The @-expression notation is a strict extension of PLT Scheme’s S-expression syntax; the @ character has no special meaning in Scheme strings, in comments, or in the middle of Scheme identifiers. Furthermore, since it builds on the existing S-expression parser, it inherits all of the existing source-location support for error messages.
4.1 @-expressions as S-expressions
An @-expression maps to an S-expression as follows:
An @‹op›{...} sequence applies ‹op› to text-mode arguments. For example,
@emph{Yes!}
is equivalent to the S-expression
(emph "Yes!")
As another example, since @ keeps its meaning inside text-mode arguments,
@section{Country @emph{and} Western}
is equivalent to the S-expression
(section "Country " (emph "and") " Western")
An @‹op›[...] sequence applies ‹op› to S-expression arguments. For example,
@itemize[(item "a") (item "b")]
is equivalent to the S-expression
(itemize (item "a") (item "b"))
An @‹op›[...]{...} sequence combines S-expression arguments and text-mode arguments. For example,
@title[#:style 'toc]{Contracts}
is equivalent to the S-expression
(title #:style 'toc "Contracts")
where #:style uses PLT Scheme’s S-expression notation for a keyword.
An @‹op› sequence without an immediately following { or [ is equivalent to just ‹op› in Scheme mode. For example,
@username
is equivalent to the S-expression
username
so that
@emph{committed by @username}
is equivalent to
(emph "committed by " username)
An ‹op› can be omitted in any of the above forms. For example,
@{Country @emph{and} Western}
is equivalent to the S-expression
("Country " (emph "and") " Western")
which is useful in some quoted or macro contexts.
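The mapping rules above can be checked mechanically by quoting @-expressions and comparing the result with the corresponding S-expressions. The following is a minimal sketch, assuming a current Racket installation in which the at-exp meta-language adds @-notation to the racket/base reader; the use of rackunit is likewise an assumption made only for illustration.

```racket
#lang at-exp racket/base
;; Assumption: a current Racket installation, where `at-exp` layers the
;; @-notation reader on top of racket/base.
(require rackunit)

;; @op{...}: text-mode arguments are read as strings.
(check-equal? '@emph{Yes!} '(emph "Yes!"))

;; @op[...]{...}: S-expression arguments first, then text-mode arguments.
(check-equal? '@title[#:style 'toc]{Contracts}
              '(title #:style 'toc "Contracts"))

;; @op with no arguments is just op.
(check-equal? '@username 'username)

;; @ keeps its meaning inside a text-mode argument.
(check-equal? '@emph{committed by @username}
              '(emph "committed by " username))

;; The op itself can be omitted.
(check-equal? '@{Country @emph{and} Western}
              '("Country " (emph "and") " Western"))
```

Because every form is quoted, nothing is evaluated; the checks compare only what the reader produces.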
Another way to describe the @-expression syntax is simply @‹op›[...]{...} where each of the three parts is optional. When ‹op› is included but both kinds of arguments are missing, then ‹op› can produce a value to use directly instead of a function to call. The ‹op› in an @-expression is not constrained to be an identifier; it can be any S-expression. For example, an argumentless @(require scribble/manual) is equivalent to the S-expression (require scribble/manual).
The spectrum of @-expression forms enables a document author to use whichever variant is most convenient. For a given operation, however, one particular variant is typically used. In general, @‹op›{...} or @‹op›[...] is used to imply a typesetting operation, whereas @‹op› more directly implies an escape to Scheme. Hence, the form @emph{Yes!} is preferred to the equivalent @(emph "Yes!"), while @(require scribble/manual) is preferred to the equivalent @require[scribble/manual].
A combination of S-expression and text-mode arguments is often useful to “customize” an operation that consumes text. The @title[#:style 'toc]{Contracts} example illustrates this combination, where the optional 'toc style customizes the typeset result of the title function. In other cases, an operation that specifically leverages S-expression notation may also have a text component. For example,
@defproc[(circle [diameter real?]) pict?]{
  Creates an unfilled ellipse.
}
is equivalent to
(defproc (circle [diameter real?]) pict?
  "Creates an unfilled ellipse.")
but as the description of the procedure becomes more involved, using text mode for the description becomes much more convenient.
An @ works both as an escape from text mode and as a form constructor in S-expression contexts. As a result, @-forms keep their meaning whether they are used in a Scheme expression or in a Scribble text part. This equivalence significantly reduces the need for explicit quoting and unquoting operations, and it helps avoid bugs due to incorrect quoting levels. For example, instead of @itemize[(item "a") (item "b")], an itemization is normally written @itemize[@item{a} @item{b}], since items for an itemization are better written in text mode than as conventional strings; in this case, @item{a} can be used directly without first switching back to text mode.
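The equivalence of the two itemization spellings can be observed in an expression context. In this sketch, item and itemize are hypothetical stand-ins that merely build lists; they are not Scribble's functions and are defined only so that the example runs without the documentation libraries. A current Racket installation with the at-exp meta-language is assumed.

```racket
#lang at-exp racket/base
(require rackunit)

;; Hypothetical stand-ins for Scribble's item and itemize: they only
;; record their arguments in a tagged list.
(define (item . content) (cons 'item content))
(define (itemize . items) (cons 'itemize items))

;; Items written as conventional strings in S-expression mode.
(define with-strings @itemize[(item "a") (item "b")])

;; Items written with nested @-forms; no switch back to text mode is needed.
(define with-text @itemize[@item{a} @item{b}])

(check-equal? with-strings with-text)
(check-equal? with-text '(itemize (item "a") (item "b")))
```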
Overall, @-expressions are crucial to Scribble’s flexibility in the same way that S-expressions are crucial to Scheme’s flexibility – and, in the same way, the benefit is difficult to quantify. Furthermore, just as S-expressions can be used for more than writing Scheme programs, the @ notation can be used for purposes other than documentation, and the @-notation parser is available for use in PLT Scheme separate from the rest of the Scribble infrastructure. We use it as an alternative to HTML for building the plt-scheme.org web pages, more generally in a template system supported by the PLT Scheme web server, and also as a text preprocessor language similar in spirit to m4 for generating plain-text files.
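As a rough illustration of using the parser on its own, the sketch below reads a string in text mode with read-inside from the scribble/reader library of current Racket distributions, then fills in a template. The helpers read-text and render, and the substitution table passed to render, are hypothetical names introduced here; they are not part of Scribble or of the template systems mentioned above.

```racket
#lang racket/base
;; scribble/reader exports its own `read`, so its bindings are prefixed.
(require (prefix-in at: scribble/reader)
         rackunit)

;; Hypothetical helper: parse a string as if it were the body of @{...}.
(define (read-text str)
  (at:read-inside (open-input-string str)))

;; Hypothetical helper: a tiny preprocessor that keeps strings as they are
;; and replaces each symbol with its entry in the table env.
(define (render forms env)
  (apply string-append
         (for/list ([f (in-list forms)])
           (if (string? f) f (hash-ref env f)))))

(check-equal? (read-text "Hello @emph{world}!")
              '("Hello " (emph "world") "!"))

(check-equal? (render (read-text "Hello @name and welcome")
                      (hash 'name "Ada"))
              "Hello Ada and welcome")
```

The reader only produces S-expressions; what a form such as (emph "world") means is left entirely to the consumer, which is what allows the same notation to serve documentation, web pages, and plain-text generation.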
4.2 Documentation-Specific Decoding
The @ notation supports local text transformations and mark-up, but it does not directly address some other problems specific to organizing a document’s source:
Section content should be grouped implicitly via section, subsection, etc. declarations, instead of explicitly nesting section constructions.
Paragraph breaks should be determined by empty lines in the source text, instead of explicitly constructing paragraph values.
A handful of ASCII character sequences should be converted automatically to more sophisticated typesetting elements, such as converting `` and '' to curly quotes or --- to an em-dash.
These transformations are specific to typesetting, and they are not appropriate for other contexts where the @ notation is useful. Therefore, the @ parser in Scribble faithfully preserves the original text in Scheme strings, and a separate decode layer in Scribble provides additional transformations.
Functions like bold and emph apply decode-content to their arguments to perform ASCII transformations, and item calls decode-flow to transform ASCII sequences and form paragraphs between empty lines. In contrast, tt and verbatim do not call the decode layer, and they instead typeset text exactly as it is given.
For example, the source document
@(require scribble/manual)

@title{Tubers}

@section{Problem}

You say ``potato.''

I say ``potato.''

@section{Solution}

Call the whole thing off.
invokes the decode layer, producing a module that is roughly equivalent to the following:
(require scribble/struct)
(provide doc)

(define doc
  (make-part
   "Tubers"
   (list
    (make-part
     "Problem"
     (list
      (make-paragraph
       (list "You say \u201Cpotato.\u201D"))
      (make-paragraph
       (list "I say \u201Cpotato.\u201D"))))
    (make-part
     "Solution"
     (list
      (make-paragraph
       (list "Call the whole thing off.")))))))
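To make the division of labor concrete, the following toy model mimics two of the decode steps described above: converting ASCII sequences and forming paragraphs at empty lines. It is a simplified sketch, not Scribble's decode implementation; toy-decode-string and toy-decode-flow are hypothetical names, and the 'paragraph tag stands in for a real paragraph value. The input imitates what the @ reader delivers for a text body, where each newline arrives as a separate "\n" string.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical helper: convert a few ASCII sequences to typographic
;; characters, in the spirit of decode-content.
(define (toy-decode-string s)
  (let* ([s (string-replace s "``" "\u201C")]   ; opening curly quote
         [s (string-replace s "''" "\u201D")]   ; closing curly quote
         [s (string-replace s "---" "\u2014")]) ; em-dash
    s))

;; Hypothetical helper: join the strings, split at runs of two or more
;; newlines (empty lines), and decode each resulting paragraph.
(define (toy-decode-flow strs)
  (for/list ([para (in-list (regexp-split #rx"\n\n+"
                                          (apply string-append strs)))]
             #:unless (string=? para ""))
    (list 'paragraph (toy-decode-string para))))

(toy-decode-flow '("You say ``potato.''" "\n" "\n" "I say ``potato.''"))
```

The call yields two paragraph entries, one per sentence, each with curly quotes in place of the ASCII sequences. Keeping this step separate from the reader is what lets non-typesetting clients of the @ notation receive their text unmodified.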