2 Scheme
While traditional Forth has its own metaprogramming facilities, combining it with a Scheme-based meta system gives the added advantage of tying into a powerful lexical scoping mechanism. The ability to assign local names to objects comes in handy when dealing with complex data structures, which is sometimes difficult to do in a combinator language like Forth or Joy. At the same time, the absence of lexical binding forms in Forth code make it very suitable to be handled as data, avoiding complications associated to name capture.
This sections introduces the forms macro:, compositions, patterns and tv: which comprise the metaprogramming interface between the Forth and Scheme languages in Staapl.
2.1 Code and Composition
A representation of a Forth program can be composed using the macro: form.
(define code1 (macro: 123)) |
(define code2 (macro: 1 +)) |
The objects created by macro: are called concatenative macros. These are functions that operate on compilation state objects.
> code1 |
#state->state |
The target code associated to the objects code1 and code2 can be printed using the function state-print-code which extracts a code stack from a state object and prints it, and the function state:stack which produces a state object with an empty code stack to serve as an initial state.
(define (print-code code-rep) |
(state-print-code (code-rep (state:stack)))) |
> (print-code code1) |
[qw 123] |
> (print-code code2) |
[addlw 1] |
It is seldom necessary to apply concatenative macros to state objects manually since composition using the macro: form will usually suffice. In practice such application is performed by the framework when generating target code.
The forms residing in the code body of a macro: form have a one-to-one correspondence to concatenative macros. Literals found in the body of a macro: are mapped to compilation state transformers that append qw instructions to the current code list. Identifiers are mapped to function values using the macro form, which fishes them out of the (macro) dictionary.
> (macro +) |
#state->state |
Note that the objects produced by (macro +) and (macro: +) are different, although they have the same behaviour. The former is a variable reference while the latter creates a new abstraction. This is similar to the distinction between the Scheme expressions + and (lambda (a b) (+ a b)).
Composing macros that are not in the (macro) dictionary is possible using the unquote operation. The macro: form behaves similar to quasiquote. Unquoted objects come from the Scheme lexical environment and are interpreted as macros.
(define code3 (macro: ,code1 ,code2))
> (print-code code3) |
[qw (123 1 +)] |
This illustrates the advantage of keeping target code in delayed form. The effect of both pieces of code has been combined into a single target operation.
It is possible to unquote Scheme values as literals by wrapping a quote form around the unquote form.
(define value 42) |
(define code4 (macro: ',value)) |
> (print-code code4) |
[qw 42] |
To define new identifiers in a particular dictionary, we can use the compositions form.
(compositions (macro) macro: |
(inc 1 +) |
(dec 1 -)) |
> (print-code (macro: 100 inc)) |
[qw (100 1 +)] |
> (print-code (macro: 100 dec)) |
[qw (100 1 -)] |
The first two sub-forms in a compositions form indicate the target dictionary and body compiler respectively. The rest of the body consists of a list of lists, where the first element of each list is an identifier to which a macro will be associated in the dictionary, and the rest of the list is a code body that’s passed to the body compiler form.
2.2 Primitives
The previous section describes how to compose existing code to create new code by concatenation, and how to evaluate code into a form that can be passed to the assembler. This section will describe how to define primitive macros operating on stacks of target machine instructions.
Creating an instance of a machine code instruction is done using the op: form. It is exactly these objects that are produced when concatenative macros are evaluated.
(define ins1 (op: addwf 42 0 0)) |
(define ins2 (op: addlw 123)) |
(define ins3 (op: qw 123)) |
Real opcodes can be passed to the assembler to produce binary output. Pseudo instructions cannot.
> (op-apply ins1 0) |
(9258) |
> (op-apply ins2 0) |
(3963) |
> (op-apply ins3 0) |
asm-pseudo-op: qw |
However, pseudo instructions can be used to hold intermediate data during the compilation phase. The following will illustrate the use of a form to define new primitive operations. We’ll create a macro add that will behave like the macro + encountered before.
Creating new primtive macros is done with the patterns form. Its first subform specifies the dictionary to which definitions are associated. The rest of the forms contains lists of pattern and template pairs. The following example defines the add macro as not taking any input from the code stack, but producing an addwf instruction as output.
(patterns (macro) |
((add) ([addwf POSTDEC0 0 0]))) |
The templates in a pattern form are lists of forms that are passed on to the op: form, resulting in lists of instruction objects. Verifying if it works gives:
> (print-code (macro: add)) |
[addwf POSTDEC0 0 0] |
This is an example of a non-optimizing macro which only performs code generation. To add different behaviour for different input patterns, extra clauses can be added.
(patterns (macro) |
(([qw a] add) ([addlw a])) |
((add) ([addwf POSTDEC0 0 0]))) |
> (print-code (macro: 123 add)) |
[addlw 123] |
> (print-code (macro: add)) |
[addwf POSTDEC0 0 0] |
When a qw instruction appears in the input, it is deconstructed and its operand is used as the operand of a addlw operation. The patterns form is built on the PLT Scheme match form, deconstructing a stack of instructions according to input patterns, and constructing lists of instructions to be added to the compilation state to replace the matching top of stack.
Upto now our add doesn’t really perform compile-time computation other than selecting a different instruction based on the presence of literal data. Using the tv: form we can add proper computation when there are two literals available. For now, think of tv: as an RPN calculator behaving as in:
> (target-value->number (tv: 1 2 +)) | |||
3 | |||
| |||
300 |
The tv: form will install wrappers to enable computations that can only be performed after the assembler has assigned numerical addresses to code labels. The use of this compile-time calculator then leads to an implementation of the + macro whith 3 evaluation modes:
(patterns (macro) |
(([qw a] [qw b] add) ([qw (tv: a b +)])) |
(([qw a] add) ([addlw a])) |
((add) ([addwf POSTDEC0 0 0]))) |
> (print-code (macro: 1 2 add)) |
[qw (1 2 +)] |
> (print-code (macro: 123 add)) |
[addlw 123] |
> (print-code (macro: add)) |
[addwf POSTDEC0 0 0] |