Version: 4.1.5.5

1 Forth

Staapl contains a Forth compiler for the Microchip PIC18 8-bit microcontroller architecture. We’re going to use it to illustrate code generation and processing.

  > (require (planet zwizwa/staapl/pic18/demo))

  Welcome to the PIC18 code generator demo.

The code> form provided by the demonstration module interprets the forms in its body as PIC18 Forth code, compiles them, and prints out the resulting intermediate code with one instruction per line.

  > (code> 123)

  [qw 123]

The instruction qw, short for “quote word,” tells the target machine to push the number 123 onto the run-time parameter stack. Providing a sequence of numbers in the code> body generates concatenated code, which the target machine executes from top to bottom.

  > (code> 1 2 3)

  [qw 1]

  [qw 2]

  [qw 3]
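
The concatenation property can be mimicked with a few lines of Python. This is only a toy model of the behaviour shown above, not Staapl's implementation; the names compile_literals and show are hypothetical, and instructions are modelled as plain tuples.

```python
def compile_literals(numbers):
    """Toy model: compile each number to one qw pseudo instruction."""
    return [("qw", n) for n in numbers]


def show(instruction):
    """Format an instruction tuple in the bracketed style used by code>."""
    return "[" + " ".join(str(field) for field in instruction) + "]"


code = compile_literals([1, 2, 3])
for instruction in code:
    print(show(instruction))  # prints [qw 1], [qw 2], [qw 3], one per line

# Concatenating the inputs gives the concatenation of the outputs.
assert compile_literals([1, 2]) + compile_literals([3]) == code
```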

Target code is represented in this intermediate form during the first code generation pass to facilitate code transformations. It consists of a mix of pseudo instructions and real PIC18 instructions. The code generator eventually eliminates all occurrences of qw before attempting translation to binary machine code. The pic18> form performs this extra step and shows real machine code output.

  > (pic18> 123)

  [movwf PREINC0 0]

  [movlw 123]

The first instruction stores the contents of the working register in the second position on the parameter stack, and the second instruction replaces the contents of the working register with 123. The working register thus always holds the top of the stack. Again, concatenating compiler input produces concatenated output:

  > (pic18> 1 2 3)

  [movwf PREINC0 0]

  [movlw 1]

  [movwf PREINC0 0]

  [movlw 2]

  [movwf PREINC0 0]

  [movlw 3]
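
To make the stack effect of this instruction pair concrete, here is a small Python sketch that rewrites each qw into the movwf/movlw pair and then runs the result on a toy machine. It is a simplified model under stated assumptions: only these two instructions are supported, RAM is a dictionary, status flags and banking are ignored, and the names lower and run are hypothetical.

```python
def lower(code):
    """Replace each qw pseudo instruction by the movwf/movlw pair."""
    out = []
    for instruction in code:
        if instruction[0] == "qw":
            out.append(("movwf", "PREINC0", 0))   # save the old top of stack
            out.append(("movlw", instruction[1]))  # new top of stack goes in W
        else:
            out.append(instruction)
    return out


def run(program, w=0, fsr0=0):
    """Execute the program on a toy machine with W, FSR0 and a RAM dict."""
    ram = {}
    for instruction in program:
        if instruction[0] == "movwf" and instruction[1] == "PREINC0":
            fsr0 += 1       # pre-increment the stack pointer ...
            ram[fsr0] = w   # ... then store W at the new location
        elif instruction[0] == "movlw":
            w = instruction[1] & 0xFF  # load an 8-bit literal into W
        else:
            raise ValueError(f"unsupported instruction: {instruction!r}")
    return w, fsr0, ram


program = lower([("qw", 1), ("qw", 2), ("qw", 3)])
w, fsr0, ram = run(program)
print(w, ram[fsr0], ram[fsr0 - 1])  # top, second and third stack entries
```

After the run, the top of the stack sits in W and the older entries sit in RAM below the pointer. The lowest RAM cell holds whatever W contained before the program started.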

The intermediate instruction set, which contains the qw instruction, is useful for implementing partial evaluation rules. When compiling a particular Forth word, the compiler can inspect the code already compiled to determine whether the effect of that code can be combined with the effect of the word being compiled.

  > (pic18> +)

  [addwf POSTDEC0 0 0]

  > (pic18> 1 +)

  [addlw 1]

  > (pic18> 1 2 +)

  [movwf PREINC0 0]

  [movlw (1 2 +)]

This illustrates three different modes of computation. The first program computes the addition at run-time, taking both input values from the run-time stack and putting back the result. The second program adds the literal value 1 to the top of the stack using a different machine instruction. The third program doesn’t perform any run-time computation at all and simply loads the result of the addition, which was computed at compile-time because both inputs to the addition were available.

Note that in this last program the result of the compile-time computation is not shown. Instead, the output shows a program (1 2 +) that gives the result upon evaluation. The compiler doesn’t need to know the exact value at this point. It only needs to know that the value can be determined later when necessary. This is essential for integration with the assembler, since these expressions might contain symbolic representations of code addresses that only the assembler knows.
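
The idea of a value that is only determined later can be sketched in Python as a postfix expression that stays unevaluated until a symbol table is available. This is an illustrative model, not how Staapl represents such expressions; the function evaluate, the tuple encoding, and the label loop_start with its address are all hypothetical.

```python
def evaluate(expr, symbols=None):
    """Evaluate a deferred postfix expression such as (1, 2, "+").

    Strings other than "+" are treated as symbolic names and looked up in
    `symbols`; a KeyError means the value cannot be determined yet.
    """
    symbols = symbols or {}
    stack = []
    for item in expr:
        if item == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif isinstance(item, str):
            stack.append(symbols[item])
        else:
            stack.append(item)
    (value,) = stack  # a well-formed expression leaves exactly one value
    return value


# A purely numeric expression can be evaluated at any time.
print(evaluate((1, 2, "+")))  # 3

# An expression that mentions a label has to wait until an address is
# assigned to it (0x20 is a made-up address for illustration).
print(evaluate(("loop_start", 2, "+"), {"loop_start": 0x20}))  # 34
```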

Using the intermediate form with the qw pseudo-instructions to compile the Forth program 1 2 + shows the key idea: the target code list can be interpreted as a parameter stack, with the top of the stack at the bottom of the code list.

  > (code> 1 2)

  [qw 1]

  [qw 2]

  > (code> 1 2 +)

  [qw (1 2 +)]

The code stack can be used as the argument passing mechanism for a language of macros that is active at compile time. Machine instructions then become data types of this language. The word + names a function that operates at compile time. It inspects the code stack, and if it finds one or two qw objects, it uses them as input to the addition operation and compiles a simpler run-time instruction.
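
A minimal Python sketch of such a macro, assuming the code list is a Python list of tuples with the top of the stack at the end. It reproduces the three outcomes for + shown earlier, but it is a model written for this explanation, not Staapl's macro definition; plus, MACROS and compile_forth are hypothetical names, and the assumption that the one-literal and no-literal cases emit addlw and addwf directly at this stage is taken from the pic18> output above.

```python
def as_expr(operand):
    """View a qw operand as a postfix expression tuple."""
    return operand if isinstance(operand, tuple) else (operand,)


def plus(code):
    """Compile-time behaviour of +, chosen by inspecting the code stack."""
    if len(code) >= 2 and code[-1][0] == "qw" and code[-2][0] == "qw":
        # Two literals: fold them into one deferred expression.
        a, b = code[-2][1], code[-1][1]
        return code[:-2] + [("qw", as_expr(a) + as_expr(b) + ("+",))]
    if code and code[-1][0] == "qw":
        # One literal: add it to the top of the run-time stack.
        return code[:-1] + [("addlw", code[-1][1])]
    # No literals: both inputs come from the run-time stack.
    return code + [("addwf", "POSTDEC0", 0, 0)]


MACROS = {"+": plus}


def compile_forth(words):
    """Compile a list of numbers and macro names, left to right."""
    code = []
    for word in words:
        if word in MACROS:
            code = MACROS[word](code)
        else:
            code = code + [("qw", word)]
    return code


print(compile_forth(["+"]))        # [('addwf', 'POSTDEC0', 0, 0)]
print(compile_forth([1, "+"]))     # [('addlw', 1)]
print(compile_forth([1, 2, "+"]))  # [('qw', (1, 2, '+'))]
```

Each macro takes the code list and returns a new one, so the arguments of + are simply the last entries of the list, exactly as they would be the top entries of a parameter stack.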