PPrint: A Universal Pretty-Printer
by Dave Herman (dherman at ccs dot neu dot edu)
PPrint is a library for pretty-printing: generating textual representations formatted to fit as well as possible on a fixed-width device such as a text editor or printer. While PLT Scheme provides an excellent pretty-printer for Scheme, this package provides a more general library for pretty-printing any text.
1 Getting Started with PPrint
To use PPrint, first require it from PLaneT:
(require (planet dherman/pprint:4))
Here’s a simple example of pretty-printing a fragment of code.
Examples:
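As a sketch of such an example, assuming the PLaneT require shown above and using only v-append, nest, and text, the following builds and prints a small C-style loop. The fragment being printed and the name while-loop are illustrative choices, not part of the library.

```scheme
#lang scheme/base
(require (planet dherman/pprint:4))

;; Hypothetical input: a C-style loop, one text doc per source line.
;; v-append joins docs with line breaks; nest 4 indents the lines that
;; follow a break inside it by four more columns.
(define while-loop
  (v-append
   (nest 4 (v-append (text "while (true) {")
                     (text "f();")
                     (nest 4 (v-append (text "if (done())")
                                       (text "exit();")))))
   (text "}")))

(pretty-print while-loop)
```

The closing brace is placed outside the outer nest, so it should line up with the while keyword rather than with the loop body.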
Things to notice about this example:
The pretty-print function takes as its argument the result of composing many different PPrint library functions such as v-append and text.
The v-append function appends multiple lines of text.
The nest function increases the indentation level for subsequent lines.
2 Abstract Documents
Formatting text in PPrint involves creating an "abstract document" or doc, which encapsulates formatting information for the pretty printer. The library functions of PPrint build and combine docs, which can then be rendered for pretty printing (see Rendering Documents).
(doc? x) → boolean?
x : any
Determines whether a value is a member of the doc datatype.
When using the markup constructor, the doc datatype may be thought of as a parameterized type doc a for arbitrary markup of type a. See the documentation for markup for details.
3 Library Documentation
3.1 Rendering Documents
(pretty-print d [out width]) → any
d : doc?
out : output-port? = (current-output-port)
width : natural-number/c = (current-page-width)
Pretty prints the doc d to the output out with a maximum page width of width.
(pretty-format d [width]) → string?
d : doc?
width : natural-number/c = (current-page-width)
Pretty prints the doc d to a string with a maximum page width of width.
(pretty-markup d combine [width]) → (or/c string? a)
d : doc?
combine : ((or/c string? a) (or/c string? a) -> (or/c string? a))
width : natural-number/c = (current-page-width)
Pretty prints the doc d to an instance of type a, which is determined by the type of the markup nodes in d, with a maximum page width of width.
The process of generating the markup relies on the ability to concatenate strings or markup, and this concatenation is dependent on the type a. So the combine argument is required in order to concatenate fragments of marked-up text.
(current-page-width) → natural-number/c
(current-page-width w) → void?
w : natural-number/c
A parameter specifying the default maximum page width, in columns, for pretty printing.
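The three ways of choosing a width can be sketched as follows. The document d and the names narrow, wide, out, and printed are illustrative only; group and v-append are described under Basic Documents and Compound Documents.

```scheme
#lang scheme/base
(require (planet dherman/pprint:4))

(define d (group (v-append (text "alpha") (text "beta") (text "gamma"))))

;; Width passed explicitly as the optional argument.
(define narrow (pretty-format d 8))

;; Width taken from the current-page-width parameter instead.
(define wide
  (parameterize ([current-page-width 40])
    (pretty-format d)))

;; pretty-print takes the port before the width, so supplying a width
;; means supplying a port as well; a string port is used here.
(define out (open-output-string))
(pretty-print d out 40)
(define printed (get-output-string out))
```

At a width of 8 the grouped document should not fit on one line, while at 40 it should.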
3.2 Basic Documents
empty : doc?
The empty document, which contains the empty string.
(char c) → doc?
c : char?
Constructs a document containing the single character c.
(text s) → doc?
s : string?
Constructs a document containing the fixed string s.
(nest n d) → doc?
n : natural-number/c
d : doc?
Constructs a document like d but with the current indentation level increased by n.
NOTE: The nest combinator does not affect the current line’s indentation. Indentation is only inserted after a line or a break.
Examples:
> (pretty-print (nest 4 (text "not indented")))
not indented
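For nest to have a visible effect, the nested document needs a line break inside it. A minimal sketch, with nested-doc as an illustrative name:

```scheme
#lang scheme/base
(require (planet dherman/pprint:4))

;; The first text starts wherever the current line already is; only the
;; text after the line break picks up the extra 4 columns from nest.
(define nested-doc
  (nest 4 (v-append (text "not indented")
                    (text "indented"))))

(pretty-print nested-doc)
```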
(label s d) → doc?
s : string?
d : doc?
Constructs a document like d but with the current indentation suffixed by the string s.
(markup f d) → (doc a)
f : ((or/c string? a) -> (or/c string? a))
d : (doc a)
Creates a document node with a markup transformer, which is applied by pretty-markup to produce a pretty-printed document with markup information. The markup is assumed not to affect the width of the string. This allows you, for example, to produce X-expressions from pretty-printed source.
Examples:
> (pretty-markup (markup (λ (x) `(em ,x)) (text "hi!")) combine)
(em "hi!")
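One possible combine for X-expression-like fragments is sketched below. It is a hypothetical definition written for illustration, not necessarily the one used to produce the output above: it appends adjacent strings, treats the empty string as an identity, and otherwise wraps the two fragments in a span element.

```scheme
#lang scheme/base
(require (planet dherman/pprint:4))

;; Hypothetical combine: each argument is either a string or a
;; marked-up fragment (here, a quoted X-expression).
(define (combine x1 x2)
  (cond
    [(and (string? x1) (string? x2)) (string-append x1 x2)]
    [(equal? x1 "") x2]            ; nothing on the left to keep
    [(equal? x2 "") x1]            ; nothing on the right to keep
    [else `(span ,x1 ,x2)]))       ; assumed wrapper element

(define marked
  (pretty-markup (markup (lambda (x) `(em ,x)) (text "hi!")) combine))
```

The exact value of marked depends on how pretty-markup calls combine, so no output is shown here.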
(group d) → doc?
d : doc?
Creates a document like d but with all line breaks removed, if it fits on a single line.
line : doc?
A document containing a line break, which is replaced with a single space when placed in the context of a group.
break : doc?
A document containing a line break, which is replaced with the empty string when placed in the context of a group.
soft-line : doc?
Equivalent to (group break).
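The difference between line and break shows up only inside a group. A small sketch, with spaced, tight, and softly as illustrative names:

```scheme
#lang scheme/base
(require (planet dherman/pprint:4))

(define spaced (group (h-append (text "a") line (text "b"))))
(define tight  (group (h-append (text "a") break (text "b"))))
(define softly (h-append (text "a") soft-line (text "b")))

;; Wide page: each group fits, so line should turn into a space and
;; break into the empty string.
(for-each (lambda (d) (display (pretty-format d 80)) (newline))
          (list spaced tight softly))

;; Page of width 1: the group no longer fits on one line, so the line
;; break should be kept.
(display (pretty-format spaced 1))
(newline)
```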
3.3 Compound Documents
(h-append d ...) → doc?
d : doc?
Concatenates documents d ....
(hs-append d ...) → doc?
d : doc?
Concatenates documents d ... with successive pairs of documents separated by space.
(v-append d ...) → doc?
d : doc?
Concatenates documents d ... with successive pairs of documents separated by line.
(vs-append d ...) → doc?
d : doc?
Concatenates documents d ... with successive pairs of documents separated by soft-line.
(vb-append d ...) → doc?
d : doc?
Concatenates documents d ... with successive pairs of documents separated by break.
(vsb-append d ...) → doc?
d : doc?
Concatenates documents d ... with successive pairs of documents separated by soft-break.
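To compare the six combinators side by side, the same three words can be passed to each in turn. The names words and combinators are illustrative:

```scheme
#lang scheme/base
(require (planet dherman/pprint:4))

(define words (list (text "one") (text "two") (text "three")))

(define combinators
  (list h-append hs-append v-append vs-append vb-append vsb-append))

;; Each combinator is variadic, so apply spreads the list of docs
;; into separate arguments.
(for-each (lambda (combinator)
            (pretty-print (apply combinator words))
            (newline))
          combinators)
```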
3.4 List Utilities
(h-concat ds) → doc?
ds : (listof doc?)
Concatenates documents ds.
(hs-concat ds) → doc?
ds : (listof doc?)
Concatenates documents ds with successive pairs of documents separated by space.
(v-concat ds) → doc?
ds : (listof doc?)
Concatenates documents ds with successive pairs of documents separated by line.
(vs-concat ds) → doc?
ds : (listof doc?)
Concatenates documents ds with successive pairs of documents separated by soft-line.
(v-concat/s ds) → doc?
ds : (listof doc?)
Concatenates documents ds with successive pairs of documents separated by spaces if they all fit on one line; otherwise concatenates them vertically. Equivalent to (group (v-concat ds)).
(vb-concat ds) → doc?
ds : (listof doc?)
Concatenates documents ds with successive pairs of documents separated by break.
(vsb-concat ds) → doc?
ds : (listof doc?)
Concatenates documents ds with successive pairs of documents separated by soft-break.
(vb-concat/s ds) → doc?
ds : (listof doc?)
Concatenates documents ds if they all fit on one line; otherwise concatenates them vertically. Equivalent to (group (vb-concat ds)).
(apply-infix d ds) → (listof doc?)
d : doc?
ds : (listof doc?)
Concatenates documents ds with successive pairs of documents separated by d.
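A typical use of the list utilities is formatting an argument list. The sketch below relies on the result contract (listof doc?) given above for apply-infix, and therefore passes its result to v-concat/s rather than printing it directly. The helper call-doc is hypothetical.

```scheme
#lang scheme/base
(require (planet dherman/pprint:4))

;; call-doc : string (listof string) -> doc
;; Formats name(arg, arg, ...); the arguments are laid out on one line
;; when they fit and vertically otherwise (see v-concat/s).
(define (call-doc name args)
  (h-append (text name)
            lparen
            (v-concat/s (apply-infix comma (map text args)))
            rparen))

(display (pretty-format (call-doc "f" '("x" "y" "z")) 40))
(newline)
```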
3.5 Fillers
(fill n d) → doc?
n : natural-number/c
d : doc?
Creates a document like d but with enough spaces to pad its width to n, or no spaces if the width is already greater than or equal to n.
Examples:
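A sketch of fill used to line up a column of names, assuming a hypothetical helper signature and made-up name/type pairs:

```scheme
#lang scheme/base
(require (planet dherman/pprint:4))

;; signature : string string -> doc
;; Pads the name to 6 columns so that the "::" separators line up for
;; names of 6 characters or fewer.
(define (signature name type)
  (hs-append (fill 6 (text name)) (text "::") (text type)))

(define sigs
  (v-concat (list (signature "empty" "Doc")
                  (signature "nest" "Int -> Doc -> Doc")
                  (signature "linebreak" "Doc"))))

(pretty-print sigs)
```

The name linebreak is longer than 6 characters, so per the description above it should receive no padding and its separator should not line up with the others.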
(fill/break n d) → doc?
n : natural-number/c
d : doc?
Creates a document like d but with enough spaces to pad its width to n, or if the width is already n or greater, increases the nesting level by n and appends a line.
Examples:
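A sketch of fill/break on a column of names, with a hypothetical helper signature/break and made-up name/type pairs:

```scheme
#lang scheme/base
(require (planet dherman/pprint:4))

;; signature/break : string string -> doc
;; Like padding with fill, except that an over-long name is followed
;; by a line break instead of running past the column.
(define (signature/break name type)
  (hs-append (fill/break 6 (text name)) (text "::") (text type)))

(define sigs/break
  (v-concat (list (signature/break "empty" "Doc")
                  (signature/break "nest" "Int -> Doc -> Doc")
                  (signature/break "linebreak" "Doc"))))

(pretty-print sigs/break)
```

Here the over-long name linebreak should be followed by a line break, with its type continuing on the next line at the increased nesting level.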
3.6 Context-Sensitive Alignment
The alignment operators were introduced in Daan Leijen’s PPrint library for Haskell. These are useful in practice but more expensive than other operations. They determine their layout relative to the current column.
(align d) → doc?
d : doc?
Creates a document like d but with the nesting level set to the current column.
(hang n d) → doc?
n : natural-number/c
d : doc?
Creates a document like d but with the nesting level set to the current column plus n. Equivalent to (align (nest n d)).
(indent n d) → doc?
n : natural-number/c
d : doc?
Creates a document like d but indented by n spaces from the current column.
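The three operators can be compared by attaching the same two-line body to a prefix. The names body and with-prefix are illustrative:

```scheme
#lang scheme/base
(require (planet dherman/pprint:4))

(define body (v-append (text "first") (text "second")))

;; with-prefix : doc -> doc
;; Puts a fixed prefix in front so the current column is not zero.
(define (with-prefix d)
  (hs-append (text "prefix") d))

;; align: "second" should start in the same column as "first".
(pretty-print (with-prefix (align body)))
(newline)

;; hang 2: "second" should start two columns to the right of "first".
(pretty-print (with-prefix (hang 2 body)))
(newline)

;; indent 2: the whole body should be shifted two columns to the right.
(pretty-print (with-prefix (indent 2 body)))
(newline)
```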
3.7 Useful Constants
comma : doc?
(char #\,)
semi : doc?
(char #\;)
colon : doc?
(char #\:)
lparen : doc?
(char #\()
rparen : doc?
(char #\))
lbracket : doc?
(char #\[)
rbracket : doc?
(char #\])
lbrace : doc?
(char #\{)
rbrace : doc?
(char #\})
langle : doc?
(char #\<)
rangle : doc?
(char #\>)
space : doc?
(char #\space)
ellipsis : doc?
(text "...")
squote : doc?
(char #\')
dquote : doc?
(char #\")
dot : doc?
(char #\.)
backslash : doc?
(char #\\)
equals : doc?
(char #\=)
4 Haskell Compatibility Library
(require (planet dherman/pprint:4/haskell))
For those who are more familiar with the names in the Haskell library, this library is provided as a compatibility mode. (This might be useful for porting existing Haskell code, for example.)
empty : doc?
Same as empty.
char : doc?
Same as char.
text : doc?
Same as text.
nest : (natural-number/c doc? -> doc?)
Same as nest.
group : (doc? -> doc?)
Same as group.
line : doc?
Same as line.
linebreak : doc?
Same as break.
softline : doc?
Same as soft-line.
softbreak : doc?
Same as soft-break.
<> : (doc? ... -> doc?)
Same as h-append.
<+> : (doc? ... -> doc?)
Same as hs-append.
<$> : (doc? ... -> doc?)
Same as v-append.
</> : (doc? ... -> doc?)
Same as vs-append.
<$$> : (doc? ... -> doc?)
Same as vb-append.
<//> : (doc? ... -> doc?)
Same as vsb-append.
hcat : ((listof doc?) -> doc?)
Same as h-concat.
hsep : ((listof doc?) -> doc?)
Same as hs-concat.
vsep : ((listof doc?) -> doc?)
Same as v-concat.
fill-sep : ((listof doc?) -> doc?)
Same as vs-concat.
sep : ((listof doc?) -> doc?)
Same as v-concat/s.
vcat : ((listof doc?) -> doc?)
Same as vb-concat.
fill-cat : ((listof doc?) -> doc?)
Same as vsb-concat.
cat : ((listof doc?) -> doc?)
Same as vb-concat/s.
punctuate : (doc? (listof doc?) -> doc?)
Same as apply-infix.
fill : (natural-number/c doc? -> doc?)
Same as fill.
fill-break : (natural-number/c doc? -> doc?)
Same as fill/break.
align : (doc? -> doc?)
Same as align.
hang : (natural-number/c doc? -> doc?)
Same as hang.
indent : (natural-number/c doc? -> doc?)
Same as indent.
comma : doc?
Same as comma.
semi : doc?
Same as semi.
colon : doc?
Same as colon.
lparen : doc?
Same as lparen.
rparen : doc?
Same as rparen.
lbrace : doc?
Same as lbrace.
rbrace : doc?
Same as rbrace.
lbracket : doc?
Same as lbracket.
rbracket : doc?
Same as rbracket.
langle : doc?
Same as langle.
rangle : doc?
Same as rangle.
space : doc?
Same as space.
ellipsis : doc?
Same as ellipsis.
squote : doc?
Same as squote.
dquote : doc?
Same as dquote.
dot : doc?
Same as dot.
backslash : doc?
Same as backslash.
equals : doc?
Same as equals.
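A short sketch of the compatibility names in use. Since the list above does not include the rendering functions, this sketch assumes pretty-format has to be imported separately from the main library, and uses only-in so that no other names are pulled in twice. The names binding and block are illustrative.

```scheme
#lang scheme/base
(require (planet dherman/pprint:4/haskell)
         (only-in (planet dherman/pprint:4) pretty-format))

;; <+> plays the role of hs-append, hsep the role of hs-concat.
(define binding (<+> (text "let") (text "x") equals (text "1")))
(define block (hsep (list lbrace binding rbrace)))

(display (pretty-format block 40))
(newline)
```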
5 Design Notes
5.1 History
Functional pretty printers have a surprisingly long and illustrious tradition in the literature. The ancestry of this library goes something like this:
1995 - John Hughes publishes a paper [Hug95] on creating an algebra of "pretty documents" for the implementation of a pretty-printing library.
1997 - Simon Peyton Jones implements this as a Haskell library [Pey97].
1998 - Philip Wadler publishes a paper [Wad98] improving on Hughes’ algebra and design.
2001 - Daan Leijen implements this as a Haskell library [Lei01].
2001 - Ralph Becket ports Leijen’s library to Mercury, a strict functional/logic language [Bec02].
This library is a translation of the Haskell PPrint library, but with help from Becket’s Mercury implementation for maintaining efficiency in a strict language.
5.2 Mercury Port
Becket’s port makes the following modifications to the Haskell library:
He eliminates the UNION constructor, since the only place union is really required is for the group operation. In a strict language, this prevents unnecessary construction of duplicate data.
He delays the calculation of best and flatten on the two arms of the union.
He adds a LABEL constructor, which allows programmers to specify arbitrary text for indentation, rather than just spaces.
Becket further modifies the Haskell algorithm by eliminating the SimpleDoc datatype and directly producing output from within the layout function, rather than first generating the intermediate SimpleDoc. However, this changes the behavior of the original algorithm. The layout function in the Haskell library examines not just the current sub-document but its entire context (i.e., the rest of the document) in order to determine whether it fits on the current line. The Mercury port, however, only uses the current sub-document to make this decision.
The following example demonstrates the difference in behavior:
Examples:
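A document of the kind this comparison is about can be sketched as two words joined by a soft line break. The choice of vs-append and of a page width of 10 are assumptions made for illustration, and two-words is a made-up name.

```scheme
#lang scheme/base
(require (planet dherman/pprint:4))

;; "pretty printer" is 14 characters wide when laid out on one line.
(define two-words (vs-append (text "pretty") (text "printer")))

;; A page of width 10 is too narrow for the one-line layout.
(display (pretty-format two-words 10))
(newline)
```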
With a column width less than 14 (i.e., (string-length "pretty printer")), the Haskell library would determine that the flattened document does not fit, and decide to break lines. The Mercury library, however, only looks at the soft break and chooses not to break because (text " ") has length 1 and therefore fits, and it subsequently overruns the length of the line.
5.3 Scheme Port
I’ve chosen a design somewhere in between the two. The code mostly follows the Haskell version, but I’ve replaced the UNION constructor with a GROUP construct as in Becket’s implementation. This way there is no unnecessary duplication of data. Furthermore, the flattened version is computed only on demand, when the layout function reaches a GROUP node, and the recursion on the non-flattened version is performed only if the flattened version fails to fit.
I’ve also added Becket’s LABEL constructor.
5.4 Modification History
2006/9/26 - Added MARKUP constructor with markup and pretty-markup operations.
2006/9/27 - The previous implementation didn’t correctly prune the search space. Philip Wadler [Wad98] demonstrated examples of nested occurrences of GROUP:
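;; Builds n nested GROUP nodes and lays them out on a 5-column page,
;; discarding the result; only the running time is of interest.
(define (test-performance n)
  (parameterize ([current-page-width 5])
    (pretty-format
     (let build-example ([n n])
       (if (= n 1)
           (group (v-append (text "hello")
                            (text (number->string n))))
           (group (v-append (build-example (sub1 n))
                            (text (number->string n)))))))
    (void)))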
This example nests GROUP nodes to an arbitrary depth, and the very first one encountered by the layout algorithm should discover that flattening will fail (i.e., because "hello" already fills the page width of 5 characters, so nothing can follow it on the same line). Previously, the layout algorithm would completely compute the layout of the flattened version before calling fits? to discover that it would fail.
In a lazy language, this is optimized for free: the complete recursive call to layout isn’t computed until it’s needed, and if fits? determines it isn’t needed, it gets short-circuited.
In an eager language, you need to perform the short-circuiting explicitly. I’ve added an implementation of backtracking with exceptions in the layout algorithm. You can test the above example and see that it performs quite well now.
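The idea of explicit short-circuiting can be sketched with a self-contained toy layout function. This is not the library’s code: the document model (make-cat, make-grp, strings, and the symbol 'line), the no-fit signal, and the names layout, render, and nested are all hypothetical. The toy also checks only the group’s own text rather than the rest of the document, so it illustrates the backtracking mechanism and nothing more.

```scheme
#lang scheme/base

;; Toy document model (hypothetical, not PPrint's internals):
;;   string        literal text
;;   'line         a newline, or a single space when flattened
;;   (make-cat ds) concatenation of the docs in the list ds
;;   (make-grp d)  try d flattened first, else lay it out with breaks
(define-struct cat (docs))
(define-struct grp (doc))

;; Raised as soon as flattened text overruns the page width.
(define-struct no-fit ())

;; layout : doc boolean nat nat -> (cons string nat)
;; Returns the rendered text paired with the column reached.
(define (layout d flat? col width)
  (cond
    [(string? d)
     (let ([end (+ col (string-length d))])
       (when (and flat? (> end width))
         (raise (make-no-fit)))        ; abandon the flat attempt at once
       (cons d end))]
    [(eq? d 'line)
     (if flat?
         (layout " " #t col width)
         (cons "\n" 0))]
    [(cat? d)
     (let loop ([ds (cat-docs d)] [acc ""] [col col])
       (if (null? ds)
           (cons acc col)
           (let ([r (layout (car ds) flat? col width)])
             (loop (cdr ds) (string-append acc (car r)) (cdr r)))))]
    [(grp? d)
     (if flat?
         (layout (grp-doc d) #t col width)
         ;; Backtracking: the handler runs only if the flat attempt
         ;; raised no-fit, and then redoes the group with breaks.
         (with-handlers ([no-fit? (lambda (e)
                                    (layout (grp-doc d) #f col width))])
           (layout (grp-doc d) #t col width)))]
    [else (error 'layout "not a toy doc: ~e" d)]))

;; render : doc nat -> string
(define (render d width)
  (car (layout d #f 0 width)))

;; nested : nat -> doc
;; Same shape as test-performance above: n nested groups.
(define (nested n)
  (make-grp
   (make-cat (list (if (= n 1) "hello" (nested (sub1 n)))
                   'line
                   (number->string n)))))

(render (make-grp (make-cat (list "a" 'line "b"))) 10) ; => "a b"
(render (make-grp (make-cat (list "a" 'line "b"))) 2)  ; => "a\nb"
(render (nested 3) 5)                                  ; => "hello\n1\n2\n3"
```

In the nested case each flat attempt is abandoned at the space that would follow "hello", so no attempt lays out more than that one word before backtracking.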
2006/9/29 - Added the combine argument to pretty-markup.
2008/9/2 - Finally fixed the implementation of the layout algorithm. It was not trying the flattened version first, so it was never producing flattened output. Also, backtracking should happen when we reach a TEXT node that’s wider than the remainder of the column, whereas the code was backtracking after overrunning the column.
Bibliography
[Bec02] Ralph Becket, “pprint.m.” 2002. http://www.cs.mu.oz.au/research/mercury/information/doc-latest/mercury_library/pprint.html
[Hug95] John Hughes, “The Design of a Pretty-Printing Library.” 1995. http://www.cs.chalmers.se/~rjmh/Papers/pretty.html
[Lei01] Daan Leijen, “PPrint, a Prettier Printer.” 2001. http://research.microsoft.com/users/daan/pprint.html
[Pey97] Simon Peyton Jones, “A Pretty-Printer Library in Haskell.” 1997. http://research.microsoft.com/~simonpj/downloads/pretty-printer/pretty.html
[Wad98] Philip Wadler, “A Prettier Printer.” 1998. http://homepages.inf.ed.ac.uk/wadler/topics/language-design.html#prettier