#lang scribble/doc
@(require scribble/manual
          scribble/eval
          scheme/runtime-path
          (planet cce/scheme:4:1/planet)
          (for-label scheme/base
                     "../coma/rpn-macro.ss"
                     "../macro.ss"
                     ))
@(define-runtime-path demo "../pic18/demo.ss")

@(define box-eval
   (let ((eval (make-base-eval)))
     (eval `(require (file ,(path->string (simplify-path demo)))))
     eval))

@(define-syntax ex
   (syntax-rules ()
     ((_ () . args)
      (interaction #:eval box-eval . args))
     ((_ . args)
      (defs+int #:eval box-eval . args))))


@title{Staapl}

@section{Introduction}

Staapl is a Scheme to Forth metaprogramming system.  To illustrate the
general idea we're going to use a concrete application: the Forth
compiler for the 8-bit Microchip PIC18 microcontroller architecture.
This section uses REPL interaction with some example code written on
top of the compiler to demonstrate the code generation process.

@;defmodule/this-package[pic18/demo]
@defmodule[(planet zwizwa/staapl/pic18/demo)]
@subsection{Forth}

The @scheme[code>] form provided by the demonstration module
interprets the forms in its body as PIC18 Forth code, compiles them,
and prints out the resulting intermediate code with one instruction
per line.

@ex[()
    (code> 123)]

The instruction @scheme[qw] tells the target machine to load the
number @scheme[123] on the run-time parameter stack.  It is short for
``quote word.''  Providing a sequence of numbers in the @scheme[code>]
body will generate concatenated machine code, which is executed by the
target machine from top to bottom.

@ex[()
    (code> 1 2 3)]


Target code is represented in this intermediate form during the first
code generation pass to facilitate code transformations.  It consists
of a mix of pseudo instructions and real PIC18 instructions.  The code
generator will eventually clean up all occurances of @scheme[qw]
before attempting translation to binary machine code.  The
@scheme[pic18>] form performs this extra step and shows real machine code
output.

@ex[()
    (pic18> 123)]

The first instruction stores the contents of the working register in
the 2nd position on the parameter stack, and the second instruction
replaces the contents of the working register with @scheme[123].
Again, concatenating compiler input produces concatenated output:

@ex[()
    (pic18> 1 2 3)]


The intermediate instruction set which contains the @scheme[qw]
instruction is useful for implementing @emph{partial evaluation}
rules.  When compiling a particular Forth word, the compiler can
inspect the code already compiled to determine if it can combine its
effect with the effect to be compiled.

@ex[()(pic18> +)]
@ex[()(pic18> 1 +)]
@ex[()(pic18> 1 2 +)]

This illustrates 3 different modes of computation.  The first program
computes the addition at run-time, taking both input values from the
runtime stack and putting back the result.  The second program adds
the literal value @scheme[1] to the top of the stack using a different
machine instruction.  The third program doesn't perform any run-time
computation at all and simply loads the result of the addition that
was computed at comple-time because both inputs to the addition where
available.

Note that in this last program the result of the compile-time
computation is not shown.  Instead it shows a program @scheme[(1 2 +)]
that gives the result upon evaluation.  The compiler doesn't need to
know the exact value at this point.  It only needs to know that the
value can be determined later when necessary.  This is essential for
integration with the assembler, since these expressions might contain
symbolic representations of code addresses that only the assembler
knows.

Using the intermediate form with the @scheme[qw] pseudo-instructions
to compile the Forth program @scheme[1 2 +] shows the key idea: the
target @emph{code list} can be interpreted as a parameter stack, with
the top of the stack at the bottom of the code list.

@ex[()(code> 1 2)]
@ex[()(code> 1 2 +)]

The @emph{code stack} can be used as the argument passing mechanism
for a @emph{language of macros} that is active at compile time.
Machine instructions then become datatypes of this language.  The word
@scheme[+] names a function that operates at compile time.
It inspects the code stack and if it finds one or two @scheme[qw] objects it
can use them as input to the addition operation and compile a simpler
run-time instruction.


@subsection{Scheme}

While traditional Forth has its own metaprogramming facilities,
combining it with a Scheme-based meta system gives the added advantage
of tying into a powerful @emph{lexical scoping} mechanism.  The
ability to assign local names to objects comes in handy when dealing
with complex data structures, which is sometimes difficult to do in a
combinator language like Forth or Joy.  At the same time, the absence
of lexical binding forms in Forth code make it very suitable to be
handled as @emph{data}, avoiding complications associated to name
capture.

This sections introduces the forms @scheme[macro:],
@scheme[compositions], @scheme[patterns] and @scheme[tv:] which
comprise the metaprogramming interface between the Forth and Scheme
languages in Staapl.

@subsubsection{Code and Composition}

A representation of a Forth program can be composed using the
@scheme[macro:] form.

@ex[((define code1 (macro: 123))
     (define code2 (macro: 1 +)))]

The objects created by @scheme[macro:] are called @emph{concatenative
macros}.  These are functions that operate on compilation state
objects.  


@ex[() code1]

The target code associated to the objects @scheme[code1] and
@scheme[code2] can be printed using the function
@scheme[state-print-code] which extracts a code stack from a state object
and prints it, and the function @scheme[state:stack] which produces a
state object with an empty code stack to serve as an initial state.


@ex[((define (print-code code-rep)
       (state-print-code (code-rep (state:stack)))))
    (print-code code1)
    (print-code code2)]

It is seldom necessary to apply concatenative macros to state objects
manually since composition using the @scheme[macro:] form will usually
suffice.  In practice such application is performed by the framework
when generating target code.

The forms residing in the code body of a @scheme[macro:] form have a
one-to-one correspondence to concatenative macros.  @emph{Literals}
found in the body of a @scheme[macro:] are mapped to compilation state
transformers that append @scheme[qw] instructions to the current code
list.  @emph{Identifiers} are mapped to function values using the
@scheme[macro] form, which fishes them out of the @scheme[(macro)]
dictionary.

@ex[() (macro +)]

Note that the objects produced by @scheme[(macro +)] and @scheme[(macro:
+)] are different, although they have the same behaviour.  The former
is a variable reference while the latter creates a new abstraction.
This is similar to the distinction between the Scheme expressions
@scheme[+] and @scheme[(lambda (a b) (+ a b))].

Composing macros that are not in the @scheme[(macro)] dictionary is
possible using the @scheme[unquote] operation.  The @scheme[macro:]
form behaves similar to @scheme[quasiquote].  Unquoted objects come
from the Scheme lexical environment and are interpreted as macros.

@ex[((define code3 (macro: ,code1 ,code2)))
    (print-code code3)]

This illustrates the advantage of keeping target code in
@emph{delayed} form.  The effect of both pieces of code has been
combined into a single target operation.

It is possible to unquote Scheme values as literals by wrapping a
@scheme[quote] form around the @scheme[unquote] form.

@ex[((define value 42)
     (define code4 (macro: ',value)))
    (print-code code4)]

To define new identifiers in a particular dictionary, we can use the
@scheme[compositions] form.

@ex[((compositions (macro) macro:
      (inc  1 +)
      (dec  1 -)))
    (print-code (macro: 100 inc))
    (print-code (macro: 100 dec))]

The first two sub-forms in a @scheme[compositions] form indicate the
target dictionary and body compiler respectively.  The rest of the
body consists of a list of lists, where the first element of each list
is an identifier to which a macro will be associated in the
dictionary, and the rest of the list is a code body that's passed to
the body compiler form.


@subsubsection{Primitives}

The previous section describes how to compose existing code to create
new code by concatenation, and how to evaluate code into a form that
can be passed to the assembler.  This section will describe how to
define primitive macros operating on stacks of target machine
instructions.

Creating an instance of a machine code instruction is done using the
@scheme[op:] form.  It is exactly these objects that are produced when
concatenative macros are evaluated.

@ex[((define ins1 (op: addwf 42 0 0))
     (define ins2 (op: addlw 123))
     (define ins3 (op: qw 123)))]

Real opcodes can be passed to the assembler to produce binary output.
Pseudo instructions cannot.

@ex[()
    (op-apply ins1 0)
    (op-apply ins2 0)
    (op-apply ins3 0)]

However, pseudo instructions can be used to hold intermediate data
during the compilation phase.  The following will illustrate the use
of a form to define new primitive operations.  We'll create a macro
@scheme[add] that will behave like the macro @scheme[+] encountered
before.

Creating new primtive macros is done with the @scheme[patterns] form.
Its first subform specifies the dictionary to which definitions are
associated.  The rest of the forms contains lists of pattern and
template pairs.  The following example defines the @scheme[add] macro
as not taking any input from the code stack, but producing an
@scheme[addwf] instruction as output.

@ex[((patterns (macro)
       ((add) ([addwf POSTDEC0 0 0]))))]

The templates in a @scheme[pattern] form are lists of forms that are
passed on to the @scheme[op:] form, resulting in lists of instruction
objects.  Verifying if it works gives:

@ex[()
    (print-code (macro: add))]

This is an example of a non-optimizing macro which only performs code
generation. To add different behaviour for different input patterns,
extra clauses can be added.

@ex[((patterns (macro)
       (([qw a] add) ([addlw a]))               
       ((add)        ([addwf POSTDEC0 0 0]))))
    (print-code (macro: 123 add))
    (print-code (macro: add))]

When a @scheme[qw] instruction appears in the input, it is
deconstructed and its operand is used as the operand of a
@scheme[addlw] operation.  The @scheme[patterns] form is built on the
PLT Scheme @scheme[match] form, deconstructing a @emph{stack} of
instructions according to input patterns, and constructing lists of
instructions to be added to the compilation state to replace the
matching top of stack.

Upto now our @scheme[add] doesn't really perform compile-time
computation other than selecting a different instruction based on the
presence of literal data.  Using the @scheme[tv:] form we can add
proper computation when there are two literals available.  For now,
think of @scheme[tv:] as an RPN calculator behaving as in:

@ex[()
    (target-value->number (tv: 1 2 +))
    (let ([a 100]
          [b 200])
      (target-value->number (tv: a b +)))
    ]

The @scheme[tv:] form will install wrappers to enable computations
that can only be performed after the assembler has assigned numerical
addresses to code labels.  The use of this compile-time calculator
then leads to an implementation of the @scheme[+] macro whith 3
evaluation modes:

@ex[((patterns (macro)
       (([qw a] [qw b] add) ([qw (tv: a b +)]))
       (([qw a] add)        ([addlw a]))               
       ((add)               ([addwf POSTDEC0 0 0]))))
    (print-code (macro: 1 2 add))
    (print-code (macro: 123 add))
    (print-code (macro: add))]


@subsection{More}

Staapl contains a generic concatenative language parser in
@scheme[staapl/rpn] which is used to implement the languages
@scheme[macro:], @scheme[tv:], @scheme[scat:] and @scheme[target:].
This language can be extended with @emph{prefix parsers} to implement
a Forth-style prefix syntax for defining words.

At the core of the Forth code generator is an incremental compiler
which constructs a control flow graph as its first output pass.
Subsequent optimization passes operate on this structure.

Assembler opcodes can be specified using the @scheme[instruction-set]
special form defined in @scheme[staapl/asm].  This is used to define
an assembler for the PIC18 instruction set.

The distribution contains some example Forth code and library routines
that implement host to target tethering.

There is a significant body of code to perform run-time target access
through the @scheme[target:] language, and incremental code upload.
Staapl can emulate a standard Forth console.

Finally, Staapl provides the @scheme[staaplc] command line application
which can be used to compile stand-alone Forth programs to binary, for
upload with a microcontroller programmer tool.


@section{Reference}

@defmodule[(planet zwizwa/staapl/macro)]

@defform[(macro: word ...)]

@defform[(op: word ...)]

@defform[(scat: word ...)]

@defform[(tv: word ...)]

@defform[(patterns dictionary (pattern template) ...)]

@defform[(compositions dictionary composer (id word ...))]