The design of (yet another!) pretty-printing library

The design of (yet another!) pretty-printing library
Dave Herman
Northeastern University
June 2005


Functional pretty printers have a long and surprisingly illustrious tradition in
the literature. The ancestry of this library goes something like this:

    1995 - John Hughes publishes a paper [2] on creating an algebra of "pretty
           documents" for the implementation of a pretty-printing library.
    1997 - Simon Peyton Jones implements this as a Haskell library [4].
    1998 - Philip Wadler publishes a paper [5] improving on Hughes' algebra and
    2001 - Daan Leijen implements this as a Haskell library [3].
    2002 - Ralph Becket ports Leijen's library to Mercury, a strict
           functional/logic language [1].

This library is a translation of the Haskell PPrint library, but with
help from Becket's Mercury implementation for maintaining efficiency
in a strict language.


Becket's port makes the following modifications to the Haskell

- He eliminates the `UNION' constructor, since the only place union is
  really required is for the `group' operation. In a strict language,
  this prevents unnecessary construction of duplicate data.

- He delays the calculation of `best' and `flatten' on the two arms of
  the union.

- He adds a `LABEL' constructor, which allows programmers to specify
  arbitrary text for indentation, rather than just spaces.

Becket further modifies the Haskell algorithm by eliminating the
`SimpleDoc' datatype and directly producing output from within the
layout procedure, rather than first generating the intermediate
SimpleDoc. However, this changes the behavior of the original
algorithm. The layout procedure in the Haskell library examines not
just the current sub-document but its entire context (i.e., the rest
of the document) in order to determine whether it fits on the current
line. The Mercury port, however, only uses the current sub-document to
make this decision.

The following example demonstrates the difference in behavior:

    text "pretty" </> text "printer"

With a column width less than 19 (i.e., length "pretty printer"), the
Haskell library would determine that the flattened document does not
fit, and decide to break lines. The Mercury library, however, only
looks at the soft break and chooses not to break because text " " has
length 1 and therefore fits; it subsequently overruns the length of
the line.


I've chosen a design somewhere in between the two. The code mostly
follows the Haskell version, but I've replaced the `UNION' constructor
with a `GROUP' construct as in Becket's implementation. This way there
is no unnecessary duplication of data. Furthermore, the flattened
version is only computed by need when the layout procedure reaches a
`GROUP' node, and of course the recursion on the non-flattened version
is only computed if the flattened version fails to fit.

I've also added Becket's `LABEL' constructor, which seems like a great


[1] Ralph Becket, pprint.m (2002)

[2] John Hughes, "The Design of a Pretty-Printing Library" (1995)

[3] Daan Leijen, "PPrint, a Prettier Printer" (2001)

[4] Simon Peyton Jones, "A Pretty-Printer Library in Haskell" (1997)

[5] Philip Wadler, "A Prettier Printer" (1998)