F*dging up a Racket
Danny Yoo <dyoo@cs.wpi.edu>
Source code can be found at: https://github.com/dyoo/brainfudge. The latest version of this document lives in http://hashcollision.org/brainfudge.
1 Introduction
If people say that Racket is just a Scheme, they are short-selling Racket a little. It’s more accurate to say that Racket is a language laboratory, with support for many different languages.
#lang racket
(define-syntax-rule (while test body ...)
  (let loop ()
    (when test
      body ...
      (loop))))

;; From this point forward, we've got while loops.
(while (not (string=? (read-line) "quit"))
  (printf "never going to give you up\n")
  (printf "never going to let you down\n"))
#lang racket
We can understand the situation better by looking at another environment on our desktop, namely the web browser. A web browser supports different kinds of HTML variants, since HTML is a moving target, and browsers have come up with crazy rules for figuring out how to take an arbitrary document and decide what HTML parsing rules to apply to it.
HTML 5 tries to make this determination somewhat more straightforward: we can define an HTML 5 document by putting a DOCTYPE declaration at the very top of the file which self-describes the document as being html.
<!DOCTYPE html>
<html lang="en">
<head><title>Hello world</title></head>
<body><p>Hello world!</p></body>
</html>
Going back to the world of Racket, we see by analogy that the #lang line in a Racket program is a self-description of how to treat the rest of the program. (Actually, the #lang line is quite a bit more active than this, but we’ll get to this in a moment.)
#lang datalog
ancestor(A, B) :- parent(A, B).
ancestor(A, B) :- parent(A, C), D = C, ancestor(D, B).
parent(john, douglas).
parent(bob, john).
ancestor(A, B)?
#lang planet dyoo/bf
++++++[>++++++++++++<-]>.
>++++++++++[>++++++++++<-]>+.
+++++++..+++.>++++[>+++++++++++<-]>.
<+++[>----<-]>.<<<<<+++[>+++++<-]>.
>>.+++.------.--------.>>+.
Ignoring the question of why?!! someone would do this, let’s ask another: how do we build this? This tutorial will cover how to build this language into Racket from scratch.
Let’s get started!
2 The view from high orbit
#lang planet dyoo/bf
,[.,]
As mentioned earlier, a #lang line is quite active: it tells the Racket runtime how to convert from the surface syntax to a meaningful program. Programs in Racket get digested in a few stages; the process looks something like this:
                 reader        macro expansion
surface syntax ---------> AST -----------------> core forms
When Racket sees #lang planet dyoo/bf, it will look for a particular module that we call a reader; a reader consumes surface syntax and excretes ASTs, and these ASTs are then annotated so that Racket knows how to make sense out of them later on. At this point, the rest of the Racket infrastructure kicks in and macro-expands the ASTs out, ultimately, to a core language.
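The two arrows in that diagram can be poked at from ordinary Racket. The following is a minimal sketch, not part of the brainf*ck implementation: it lets the standard read-syntax play the role of a reader, and calls expand to run macro expansion. The helper names read-stage and expand-stage, and the source name "demo", are made up for this illustration.

```racket
#lang racket

;; Stage 1 (reader): turn surface text into a syntax object.
;; "demo" is a made-up source name for this illustration.
(define (read-stage text)
  (read-syntax "demo" (open-input-string text)))

;; Stage 2 (macro expansion): expand the syntax object to core forms.
;; We expand inside a fresh namespace that knows racket/base.
(define (expand-stage stx)
  (parameterize ([current-namespace (make-base-namespace)])
    (expand stx)))

(define surface "(when #t (displayln 1))")
(define ast (read-stage surface))
(define core (expand-stage ast))

(syntax->datum ast)   ; the AST, viewed as plain data
(syntax->datum core)  ; the same program with `when` expanded away
```

The exact shape of the expanded form depends on the Racket version, so it is worth printing it rather than trusting any particular listing.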
Capture the meaning of brainf*ck by writing a semantics module.
Go from the line noise of the surface syntax into a more structured form by writing a parser module.
Connect the pieces, the semantics and the surface syntax parser, by making a reader module.
Profit!
3 Flight preparations
$ mkdir bf
Ultimately, we want to put the fruit of our labor onto PLaneT, since that’ll make it easier for others to use our work. Let’s set up a PLaneT development link so the Racket environment knows about our work directory. I already have an account on PLaneT with my username dyoo. You can get an account fairly easily.
$ planet link dyoo bf.plt 1 0 bf
$ cd bf
~/bf$ cat >hello.rkt
#lang racket
"hello world"
~/bf$ racket
Welcome to Racket v5.1.1.
> (require (planet dyoo/bf/hello))
"hello world"
>
4 The brainf*ck language
a byte array of data, and
a pointer into that data array
Increment the data pointer (>)
Decrement the data pointer (<)
Increment the byte at the data pointer (+)
Decrement the byte at the data pointer (-)
Write a byte to standard output (.)
Read a byte from standard input (,)
Perform a loop until the byte at the data pointer is zero ([, ])
"semantics.rkt"
#lang racket

(require rackunit)                ;; for unit testing
(provide (all-defined-out))

;; Our state contains two pieces.
(define-struct state (data ptr)
  #:mutable)

;; Creates a new state, with a byte array of 30000 zeros, and
;; the pointer at index 0.
(define (new-state)
  (make-state (make-vector 30000 0)
              0))

;; increment the data pointer
(define (increment-ptr a-state)
  (set-state-ptr! a-state (add1 (state-ptr a-state))))

;; decrement the data pointer
(define (decrement-ptr a-state)
  (set-state-ptr! a-state (sub1 (state-ptr a-state))))

;; increment the byte at the data pointer
(define (increment-byte a-state)
  (let ([v (state-data a-state)]
        [i (state-ptr a-state)])
    (vector-set! v i (add1 (vector-ref v i)))))

;; decrement the byte at the data pointer
(define (decrement-byte a-state)
  (let ([v (state-data a-state)]
        [i (state-ptr a-state)])
    (vector-set! v i (sub1 (vector-ref v i)))))

;; print the byte at the data pointer
(define (write-byte-to-stdout a-state)
  (let ([v (state-data a-state)]
        [i (state-ptr a-state)])
    (write-byte (vector-ref v i) (current-output-port))))

;; read a byte from stdin into the data pointer
(define (read-byte-from-stdin a-state)
  (let ([v (state-data a-state)]
        [i (state-ptr a-state)])
    (vector-set! v i (read-byte (current-input-port)))))

;; we know how to do loops!
(define-syntax-rule (loop a-state body ...)
  (let loop ()
    (unless (= (vector-ref (state-data a-state) (state-ptr a-state))
               0)
      body ...
      (loop))))
"semantics.rkt"
;; Simple exercises.
(let ([s (new-state)])
  (increment-byte s)
  (check-equal? 1 (vector-ref (state-data s) 0))
  (increment-byte s)
  (check-equal? 2 (vector-ref (state-data s) 0))
  (decrement-byte s)
  (check-equal? 1 (vector-ref (state-data s) 0)))

;; pointer movement
(let ([s (new-state)])
  (increment-ptr s)
  (increment-byte s)
  (check-equal? 0 (vector-ref (state-data s) 0))
  (check-equal? 1 (vector-ref (state-data s) 1))
  (decrement-ptr s)
  (increment-byte s)
  (check-equal? 1 (vector-ref (state-data s) 0))
  (check-equal? 1 (vector-ref (state-data s) 1)))

;; make sure standard input is doing something
(let ([s (new-state)])
  (parameterize ([current-input-port
                  (open-input-bytes (bytes 3 1 4))])
    (read-byte-from-stdin s)
    (increment-ptr s)
    (read-byte-from-stdin s)
    (increment-ptr s)
    (read-byte-from-stdin s))
  (check-equal? 3 (vector-ref (state-data s) 0))
  (check-equal? 1 (vector-ref (state-data s) 1))
  (check-equal? 4 (vector-ref (state-data s) 2)))

;; make sure standard output is doing something
(let ([s (new-state)])
  (set-state-data! s (vector 80 76 84))
  (let ([simulated-stdout (open-output-string)])
    (parameterize ([current-output-port simulated-stdout])
      (write-byte-to-stdout s)
      (increment-ptr s)
      (write-byte-to-stdout s)
      (increment-ptr s)
      (write-byte-to-stdout s))
    (check-equal? "PLT" (get-output-string simulated-stdout))))

;; Let's see that we can clear.
(let ([s (new-state)])
  (set-state-data! s (vector 0 104 101 108 112 109 101 105
                             109 109 101 108 116 105 110 103))
  (set-state-ptr! s 15)
  ;; [ [-] < ]
  (loop s
        (loop s (decrement-byte s))
        (decrement-ptr s))
  (check-equal? 0 (state-ptr s))
  (check-equal? (make-vector 16 0) (state-data s)))
Good! Our tests, at the very least, let us know that our definitions are doing something reasonable, and they should all pass.
However, there are a few things that we may want to fix in the future, like the lack of error trapping if the input stream contains eof. And there’s no bounds-checking on the ptr or on the values in the data. Wow, there are quite a few things that we might want to fix. But at the very least, we now have a module that captures the semantics of brainf*ck.
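To make those gaps concrete, here is a rough sketch of what hardened cell operations might look like. It is not the tutorial's code, and it bakes in two assumptions that the text above does not settle: cells wrap around modulo 256, and reading at end-of-file stores 0 in the current cell. The names wrap-byte, checked-ref, and the safe-... functions are hypothetical, and the functions take the data vector and pointer directly to keep the sketch self-contained.

```racket
#lang racket

;; ASSUMPTION: cell values wrap around modulo 256.
(define (wrap-byte n)
  (modulo n 256))

;; checked-ref: vector integer -> any
;; Like vector-ref, but with an error that mentions the data pointer.
(define (checked-ref data ptr)
  (unless (and (exact-nonnegative-integer? ptr)
               (< ptr (vector-length data)))
    (error 'bf "data pointer ~a is out of bounds" ptr))
  (vector-ref data ptr))

(define (safe-increment-byte! data ptr)
  (vector-set! data ptr (wrap-byte (add1 (checked-ref data ptr)))))

(define (safe-decrement-byte! data ptr)
  (vector-set! data ptr (wrap-byte (sub1 (checked-ref data ptr)))))

;; ASSUMPTION: at end-of-file we store 0 rather than the eof object.
(define (safe-read-byte! data ptr in)
  (checked-ref data ptr) ; bounds check before touching the port
  (let ([b (read-byte in)])
    (vector-set! data ptr (if (eof-object? b) 0 b))))
```

Other brainf*ck implementations pick different conventions for end-of-file (leaving the cell unchanged, for instance), so the choice here is only one of several reasonable ones.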
5 Lisping a language
"language.rkt"
#lang racket

(require "semantics.rkt")

(provide greater-than
         less-than
         plus
         minus
         period
         comma
         brackets
         (rename-out [my-module-begin #%module-begin]))

;; The current-state is a parameter used by the
;; rest of this language.
(define current-state (make-parameter (new-state)))

;; Every module in this language will make sure that it
;; uses a fresh state.
(define-syntax-rule (my-module-begin body ...)
  (#%plain-module-begin
    (parameterize ([current-state (new-state)])
      body ...)))

(define-syntax-rule (greater-than)
  (increment-ptr (current-state)))

(define-syntax-rule (less-than)
  (decrement-ptr (current-state)))

(define-syntax-rule (plus)
  (increment-byte (current-state)))

(define-syntax-rule (minus)
  (decrement-byte (current-state)))

(define-syntax-rule (period)
  (write-byte-to-stdout (current-state)))

(define-syntax-rule (comma)
  (read-byte-from-stdin (current-state)))

(define-syntax-rule (brackets body ...)
  (loop (current-state) body ...))
This "language.rkt" presents brainf*ck as an s-expression-based language. It uses the semantics we’ve coded up, and defines rules for handling greater-than, less-than, and the other forms. We have a parameter called current-state that holds the state of the brainf*ck machine that’s used throughout the language.
> (syntax->datum
   (expand '(module an-example-module '#%kernel
              "hello"
              "world")))
'(module an-example-module '#%kernel (#%module-begin '"hello" '"world"))
#lang s-exp (planet dyoo/bf/language)
(plus)(plus)(plus)(plus)(plus) (plus)(plus)(plus)(plus)(plus)
(brackets
 (greater-than) (plus)(plus)(plus)(plus)(plus) (plus)(plus)
 (greater-than) (plus)(plus)(plus)(plus)(plus) (plus)(plus) (plus)(plus)(plus)
 (greater-than) (plus)(plus)(plus)
 (greater-than) (plus)
 (less-than)(less-than)(less-than) (less-than) (minus))
(greater-than) (plus)(plus) (period)
(greater-than) (plus) (period)
(plus)(plus)(plus)(plus)(plus) (plus)(plus) (period)
(period) (plus)(plus)(plus) (period)
(greater-than) (plus)(plus) (period)
(less-than)(less-than) (plus)(plus)(plus)(plus)(plus) (plus)(plus)(plus)(plus)(plus) (plus)(plus)(plus)(plus)(plus) (period)
(greater-than) (period)
(plus)(plus)(plus) (period)
(minus)(minus)(minus)(minus)(minus)(minus)(period)
(minus)(minus)(minus)(minus)(minus)(minus)(minus)(minus) (period)(greater-than) (plus) (period)
(greater-than) (period)
The #lang line here is saying, essentially, that the following program is written with s-expressions, and should be treated with the module language "language.rkt" that we just wrote up. And if we run this program, we should see a familiar greeting. Hurrah!
... But wait! We can’t just declare victory here. We really do want to allow the throngs of brainf*ck programmers to write brainf*ck in the surface syntax that they deserve. Keep "language.rkt" on hand, though. We will reuse it by having our parser transform the surface syntax into the forms we defined in "language.rkt".
Let’s get that parser working!
6 Parsing the surface syntax
The Racket toolchain includes a professional-strength lexer and parser in the parser-tools collection. For the sake of keeping this example terse, we’ll write a simple recursive-descent parser without using the parser-tools collection. (But if our surface syntax were any more complicated, we might reconsider this decision.)
The expected output of a successful parse should be some kind of abstract syntax tree. What representation should we use for the tree? Although we can use s-expressions, they’re pretty lossy: they don’t record where they came from in the original source text. For the case of brainf*ck, we might not care, but if we were to write a parser for a more professional, sophisticated language (like LOLCODE) we want source locations so we can give good error messages during parsing or run-time.
As an alternative to plain s-expressions, we’ll use a data structure built into Racket called a syntax object; syntax objects let us represent ASTs, just like s-expressions, and they also carry along auxiliary information, such as source locations. Plus, as we briefly saw in our play with expand, syntax objects are the native data structure that Racket itself uses during macro expansion, so we might as well use them ourselves.
> (define an-example-syntax-object (datum->syntax #f 'hello (list "hello.rkt" 1 20 32 5)))
> an-example-syntax-object #<syntax:1:20 hello>
> (syntax? an-example-syntax-object) #t
> (syntax->datum an-example-syntax-object) 'hello
> (symbol? (syntax->datum an-example-syntax-object)) #t
> (syntax-source an-example-syntax-object) "hello.rkt"
> (syntax-line an-example-syntax-object) 1
> (syntax-column an-example-syntax-object) 20
> (syntax-position an-example-syntax-object) 32
> (syntax-span an-example-syntax-object) 5
"parser.rkt"
#lang racket

;; The only visible export of this module will be parse-expr.
(provide parse-expr)

;; While loops...
(define-syntax-rule (while test body ...)
  (let loop ()
    (when test
      body ...
      (loop))))

;; ignorable-next-char?: input-port -> boolean
;; Produces true if the next character is something we should ignore.
(define (ignorable-next-char? in)
  (let ([next-ch (peek-char in)])
    (cond
      [(eof-object? next-ch)
       #f]
      [else
       (not (member next-ch '(#\< #\> #\+ #\- #\, #\. #\[ #\])))])))

;; parse-expr: any input-port -> (U syntax eof)
;; Either produces a syntax object or the eof object.
(define (parse-expr source-name in)
  (while (ignorable-next-char? in) (read-char in))
  (let*-values ([(line column position) (port-next-location in)]
                [(next-char) (read-char in)])

    ;; We'll use this function to generate the syntax objects by
    ;; default.
    ;; The only category this doesn't cover are brackets.
    (define (default-make-syntax type)
      (datum->syntax #f
                     (list type)
                     (list source-name line column position 1)))
    (cond
      [(eof-object? next-char) eof]
      [else
       (case next-char
         [(#\<) (default-make-syntax 'less-than)]
         [(#\>) (default-make-syntax 'greater-than)]
         [(#\+) (default-make-syntax 'plus)]
         [(#\-) (default-make-syntax 'minus)]
         [(#\,) (default-make-syntax 'comma)]
         [(#\.) (default-make-syntax 'period)]
         [(#\[)
          ;; The slightly messy case is bracket. We keep reading
          ;; a list of exprs, and then construct a wrapping bracket
          ;; around the whole thing.
          (let*-values ([(elements) (parse-exprs source-name in)]
                        [(following-line following-column
                                         following-position)
                         (port-next-location in)])
            (datum->syntax #f
                           `(brackets ,@elements)
                           (list source-name
                                 line
                                 column
                                 position
                                 (- following-position
                                    position))))]
         [(#\])
          eof])])))

;; parse-exprs: any input-port -> (listof syntax)
;; Parse a list of expressions.
(define (parse-exprs source-name in)
  (let ([next-expr (parse-expr source-name in)])
    (cond
      [(eof-object? next-expr)
       empty]
      [else
       (cons next-expr (parse-exprs source-name in))])))
> (define my-sample-input-port (open-input-string ",[.,]"))
> (define first-stx
    (parse-expr "my-sample-program.rkt" my-sample-input-port))
> first-stx
#<syntax::1 (comma)>
> (define second-stx
    (parse-expr "my-sample-program.rkt" my-sample-input-port))
> second-stx
#<syntax::2 (brackets (period) (comma))>
> (parse-expr "my-sample-program.rkt" my-sample-input-port) #<eof>
> (syntax->datum second-stx) '(brackets (period) (comma))
> (syntax-source second-stx) "my-sample-program.rkt"
> (syntax-position second-stx) 2
> (syntax-span second-stx) 4
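Notice that the printed syntax objects in this session carry a position but no line or column. That is because Racket ports only track lines and columns once line counting has been switched on with port-count-lines!. The sketch below illustrates the difference with port-next-location, the same function the parser uses; the helper name location-after is made up for this example.

```racket
#lang racket

;; location-after: string natural boolean -> (list line column position)
;; Reads n characters from a string port, then reports the location
;; of the next character as a three-element list.
(define (location-after text n count-lines?)
  (define in (open-input-string text))
  (when count-lines?
    (port-count-lines! in))
  (for ([i (in-range n)])
    (read-char in))
  (call-with-values (lambda () (port-next-location in)) list))

;; Skip past "+" and the newline, then ask where "-" is.
(location-after "+\n-" 2 #f)  ; line and column come back as #f
(location-after "+\n-" 2 #t)  ; line and column are now tracked
```

Whether line counting is already enabled on the port that a reader receives depends on how the file is being loaded, so a parser that wants good locations should not assume it either way.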
We mentioned that the parser wasn’t too hard... but then again, we haven’t written good traps for error conditions. This parser is a baby parser. If we were more rigorous, we’d probably implement it with the parser-tools collection, write unit tests for the parser with rackunit, and make sure to produce good error messages when Bad Things happen (like having unbalanced brackets or parentheses).
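As one small illustration of the kind of trap we skipped, here is a sketch of a standalone bracket checker. It is not part of "parser.rkt": the name find-unbalanced-bracket is hypothetical, and it works on a plain string rather than on a port, so it only shows the bookkeeping involved.

```racket
#lang racket

;; find-unbalanced-bracket: string -> (or/c #f natural)
;; Returns #f when every [ has a matching ]. Otherwise returns the
;; 0-based index of the first ] that has nothing to close or, if the
;; text ends with brackets still open, the index of the most recently
;; opened [ that was never closed.
(define (find-unbalanced-bracket src)
  (let loop ([i 0] [open-positions '()])
    (cond
      [(= i (string-length src))
       (if (empty? open-positions)
           #f
           (first open-positions))]
      [else
       (case (string-ref src i)
         [(#\[) (loop (add1 i) (cons i open-positions))]
         [(#\])
          (if (empty? open-positions)
              i
              (loop (add1 i) (rest open-positions)))]
         [else (loop (add1 i) open-positions)])])))

(find-unbalanced-bracket ",[.,]")  ; balanced
(find-unbalanced-bracket "+]")     ; stray closing bracket
(find-unbalanced-bracket "[[+]")   ; outer bracket never closed
```

A real parser would raise a syntax error carrying the source location instead of returning an index, but the stack of open positions would look much the same.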
Still, we’ve now got the language and a parser. How do we tie them together?
7 Crossing the wires
A parser in "parser.rkt" for the surface syntax that produces ASTs
A module language in "language.rkt" that provides the meaning for those ASTs.
#lang planet dyoo/bf
"lang/reader.rkt"
#lang s-exp syntax/module-reader
(planet dyoo/bf/language)
#:read my-read
#:read-syntax my-read-syntax

(require "../parser.rkt")

(define (my-read in)
  (syntax->datum (my-read-syntax #f in)))

(define (my-read-syntax src in)
  (parse-expr src in))
$ cat hello2.rkt
#lang planet dyoo/bf
++++++[>++++++++++++<-]>.
>++++++++++[>++++++++++<-]>+.
+++++++..+++.>++++[>+++++++++++<-]>.
<+++[>----<-]>.<<<<<+++[>+++++<-]>.
>>.+++.------.--------.>>+.

$ racket hello2.rkt
Hello, World!
Sweet, sweet words.
8 Landing on PLaneT
Finally, we want to get this work onto PLaneT so that other people can share in the joy of writing brainf*ck in Racket. Let’s do it!
First, let’s go back to the parent of our work directory. Once we’re there, we’ll use the planet create command.
$ planet create bf
planet create bf
MzTarring ./...
MzTarring ./lang...

WARNING:
Package has no info.rkt file. This means it will not have a description or documentation on the PLaneT web site.

$ ls -l bf.plt
-rw-rw-r-- 1 dyoo nogroup 3358 Jun 12 19:39 bf.plt
"info.rkt"
#lang setup/infotab
(define name "bf: a brainf*ck compiler for Racket")
(define categories '(devtools))
(define can-be-loaded-with 'all)
(define required-core-version "5.1.1")
(define version "1.0")
(define repositories '("4.x"))
(define scribblings '())
(define primary-file "language.rkt")
(define blurb
  '("Provides support for the brainf*ck language."))
(define release-notes
  '((p "First release")))
$ planet unlink dyoo bf.plt 1 0
$ racket hello2.rkt
require: PLaneT could not find the requested package: Server had no matching package: No package matched the specified criteria
$ planet fileinject dyoo bf.plt 1 0
planet fileinject dyoo bf.plt 1 0

============= Installing bf.plt on Sun, 12 Jun 2011 19:49:50 =============
raco setup: Unpacking archive from /home/dyoo/bf.plt
raco setup: unpacking README in /home/dyoo/.racket/planet/300/5.1.1/cache/dyoo/bf.plt/1/0/./
raco setup: unpacking hello.rkt in /home/dyoo/.racket/planet/300/5.1.1/cache/dyoo/bf.plt/1/0/./
raco setup: unpacking hello2.rkt in /home/dyoo/.racket/planet/300/5.1.1/cache/dyoo/bf.plt/1/0/./
raco setup: making directory lang in /home/dyoo/.racket/planet/300/5.1.1/cache/dyoo/bf.plt/1/0/./
raco setup: unpacking reader.rkt in /home/dyoo/.racket/planet/300/5.1.1/cache/dyoo/bf.plt/1/0/./lang/
raco setup: unpacking language.rkt in /home/dyoo/.racket/planet/300/5.1.1/cache/dyoo/bf.plt/1/0/./
raco setup: unpacking parser.rkt in /home/dyoo/.racket/planet/300/5.1.1/cache/dyoo/bf.plt/1/0/./
raco setup: unpacking semantics.rkt in /home/dyoo/.racket/planet/300/5.1.1/cache/dyoo/bf.plt/1/0/./
raco setup: version: 5.1.1 [3m]
...
$ racket hello2.rkt
Hello, World!
Once we’re finally satisfied with the package’s contents, we can finally upload it onto PLaneT. If you log onto planet.racket-lang.org, the user interface will allow you to upload your "bf.plt" package.
9 Acknowledgements
Very special thanks to Shriram Krishnamurthi for being understanding when I told him I had coded a brainf*ck compiler. Guillaume Marceau, Rodolfo Carvalho, and Eric Hanchrow helped with grammar and spelling checks. Casey Klein suggested a section in the tutorial that shows how we can generate errors that point to original sources.
Furthermore, thanks to those who commented from the /r/programming Reddit thread: they helped isolate a performance issue regarding parameters and motivated the following section on optimization. David Van Horn pointed out how to use PyPy’s JIT properly, the results of which are amazing. Sam Tobin-Hochstadt provided a few optimization suggestions, many of which are in the main (planet dyoo/bf) implementation.
Finally, big shoutouts to the PLT group at
Brown University —
10 Epilo... Optimization and Polishing!
So we upload and release the package on PLaneT, and send our marketeers out to spread the Word. We kick back, lazily twiddle our thumbs, and await the adoration of the global brainf*ck community.
To our dismay, someone brings up the fact that our implementation is slower than an interpreter written in another language. What?!
But the Internet is absolutely correct. Let’s run the numbers. We can grab another brainf*ck implementation and try it on a benchmarking program, like the one that generates prime numbers. Let’s see what the competition looks like:
$ echo 100 | time ~/local/pypy/bin/pypy example1.py prime.b
Primes up to: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
16.72user 0.24system 0:17.18elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+3554minor)pagefaults 0swaps
Ok, about sixteen seconds. Not bad. We’re not even using their JIT, and they’re still producing reasonable results.
Now let’s look at our own performance. We surely can’t do worse, right?
$ raco make prime.rkt && (echo 100 | time racket prime.rkt)
Primes up to: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
37.36user 0.65system 0:38.15elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+10259minor)pagefaults 0swaps
Thirty-seven seconds. Wow. Ouch.
Outrageous! Aren’t interpreters supposed to be slower than compilers? Isn’t Racket a JIT-compiled language? What the heck happened?
We tried to follow the creed that says Get it right, then get it fast... except that we didn’t. We forgot the second part about getting it fast. Just because something is compiled and driven by a JIT doesn’t guarantee that the generated code’s going to perform particularly well, and the benchmark above shows that something strange is happening.
So let’s try our hand at optimization! We may not get the raw performance of an impressive project like PyPy, but we still should be able to perform reasonably well. Furthermore, we will include some error handling that uses the source locations we constructed in our parser, in order to precisely point out runtime errors in the original source.
As a warning, if you ran through the previous sections, you may want to take a small break before continuing forward. This optimization section is included near the end because the changes we’ll be making require some deeper digging into Racket’s language infrastructure, especially with macros. Take a relaxing walk, and then come back to this when you’re ready.
10.1 Staring into the hot-spot
If we look a little more closely at our implementation, we might notice something funny. Well, we might notice many things that look funny in our brainf*ck implementation, but there’s a particular one we’ll focus on: each of the forms in "language.rkt" refers to the current-state parameter. We use that parameter to make sure the other forms in the language use the same current-state value. And of course we want this kind of localized behavior, to prevent the kind of interference that might happen if two brainf*ck programs run.
... But every use of the parameter appears to be a function call. Just how bad is that? Let’s see. We can fire up our trusty DrRacket and try the following program in our Interactions window:
> (require rackunit)
> (define my-parameter (make-parameter (box 0)))
> (time
   (parameterize ([my-parameter (box 0)])
     (for ([x (in-range 10000000)])
       (set-box! (my-parameter)
                 (add1 (unbox (my-parameter)))))
     (check-equal? (unbox (my-parameter)) 10000000)))
cpu time: 2964 real time: 3188 gc time: 0
Hmmmm... Ok, what if we didn’t have the parameter, and just accessed the variable more directly?
> (require rackunit)
> (time
   (let ([my-parameter (box 0)])
     (for ([x (in-range 10000000)])
       (set-box! my-parameter
                 (add1 (unbox my-parameter))))
     (check-equal? (unbox my-parameter) 10000000)))
cpu time: 56 real time: 56 gc time: 0
In the immortal words of Neo: Whoa. Ok, we’ve got ourselves a target!
Let’s take a look again at the definition of our my-module-begin in "language.rkt".
(define current-state (make-parameter (new-state)))

(define-syntax-rule (my-module-begin body ...)
  (#%plain-module-begin
    (parameterize ([current-state (new-state)])
      body ...)))
Let’s replace the use of the parameterize here with a simpler let. Now we’ve got something like this:
(define-syntax-rule (my-module-begin body ...)
  (#%plain-module-begin
    (let ([my-fresh-state (new-state)])
      body ...)))
But now we have a small problem: we want the rest of the inner body forms to syntactically recognize and re-route any use of current-state with this my-fresh-state binding. But we certainly can’t just rewrite the whole "language.rkt" and replace uses of current-state with my-fresh-state, because my-fresh-state isn’t a global variable! What do we do?
There’s a tool in the Racket library that allows us to solve this problem: it’s called a syntax parameter. A syntax parameter is similar to the reviled parameter that we talked about earlier, except that it works syntactically rather than dynamically. A common use of a syntax parameter is to let us wrap a certain area in our code, and say: “Anywhere this identifier shows up, rename it to use this variable instead.”
Let’s see a demonstration of these in action, because all this talk is a little abstract. What do these syntax parameters really do for us? Let’s play with them again a little.
> (require racket/stxparam)
> (define-syntax-parameter name (lambda (stx) #'"Madoka"))
> name
"Madoka"
> (define-syntax-rule (say-your-name)
    (printf "Your name is ~a\n" name))
> (define (outside-the-barrier)
    (printf "outside-the-barrier says: ")
    (say-your-name))
> (say-your-name)
Your name is Madoka
> (let ([the-hero "Homerun"])
    (syntax-parameterize ([name (make-rename-transformer #'the-hero)])
      (say-your-name)
      (outside-the-barrier)))
Your name is Homerun
outside-the-barrier says: Your name is Madoka
It helps to keep in mind that, in Racket, macros are functions that work during compile-time. They take an input syntax, and produce an output syntax. Here, we define name to be a macro that expands to #'"Madoka" by default. When we use name directly, and when we use it in (say-your-name) for the first time, we’re seeing this default in place.
However, we make things more interesting (and a little more confusing!) in the second use of say-your-name: we use let to create a variable binding, and then use syntax-parameterize to reroute every use of name, syntactically, with a use of the-hero. Within the boundary defined at the syntax-parameterize’s body, name is magically transformed! That’s why we can see "Homerun" in the second use of (say-your-name).
Yet, where we use it from outside-the-barrier, name still takes on the default. Why?
(define (outside-the-barrier)
  (printf "outside-the-barrier says: ")
  (printf "Your name is ~a\n" name))
The use of name here is lexically outside the barrier set up by syntax-parameterize.
(let ([the-hero "Homerun"])
  (syntax-parameterize ([name (make-rename-transformer #'the-hero)])
    (say-your-name)
    (outside-the-barrier)))

(let ([the-hero "Homerun"])
  (syntax-parameterize ([name (make-rename-transformer #'the-hero)])
    (printf "Your name is ~a\n" name)
    (outside-the-barrier)))
Ah! So the use of name that’s introduced by say-your-name is within the lexical boundaries of the syntax-parameterize form. But outside-the-barrier is a plain, vanilla function, and because it’s not a macro, it doesn’t inline itself into the syntax-parameterize’s body. We can compare this with the more dynamic behavior of parameterize, and see that this difference is what makes syntax-parameterize different from parameterize. Well, we could tell that they’re different just from the names... but the behavior we’re seeing here makes it more clear just what that difference is.
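For comparison, here is a small sketch of the dynamic behavior using an ordinary run-time parameter. The names dynamic-name and describe are made up for this example; the point is only that a plain function called inside the body of a parameterize does see the new value, because the lookup happens at run time rather than at expansion time.

```racket
#lang racket

;; A run-time parameter, in contrast to a syntax parameter.
;; `dynamic-name` and `describe` are hypothetical names.
(define dynamic-name (make-parameter "Madoka"))

;; A plain, vanilla function, defined outside any parameterize.
(define (describe)
  (format "Your name is ~a" (dynamic-name)))

(describe)                                ; uses the default value
(parameterize ([dynamic-name "Homerun"])
  (describe))                             ; sees the parameterized value
```

That run-time lookup is exactly the flexibility we are paying for on every use of a parameter.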
Whew! Frankly, all of this is a little magical. But the hilarious thing, despite all this verbiage about syntax parameters, is that the implementation of the language looks almost exactly the same as before. Here’s a version of the language that uses these syntax parameters; let’s save it into "language.rkt" and replace the previous contents.
"language.rkt"
#lang racket

(require "semantics.rkt"
         racket/stxparam)

(provide greater-than
         less-than
         plus
         minus
         period
         comma
         brackets
         (rename-out [my-module-begin #%module-begin]))

;; The current-state is a syntax parameter used by the
;; rest of this language.
(define-syntax-parameter current-state #f)

;; Every module in this language will make sure that it
;; uses a fresh state.
(define-syntax-rule (my-module-begin body ...)
  (#%plain-module-begin
    (let ([fresh-state (new-state)])
      (syntax-parameterize
          ([current-state
            (make-rename-transformer #'fresh-state)])
        body ...))))

(define-syntax-rule (greater-than)
  (increment-ptr current-state))

(define-syntax-rule (less-than)
  (decrement-ptr current-state))

(define-syntax-rule (plus)
  (increment-byte current-state))

(define-syntax-rule (minus)
  (decrement-byte current-state))

(define-syntax-rule (period)
  (write-byte-to-stdout current-state))

(define-syntax-rule (comma)
  (read-byte-from-stdin current-state))

(define-syntax-rule (brackets body ...)
  (loop current-state body ...))
What effect does this change alone make to our performance on brainf*ck prime generation? Let’s cross our fingers!
$ raco make prime.rkt && (echo 100 | time racket prime.rkt)
Primes up to: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
6.38user 0.09system 0:06.63elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+10121minor)pagefaults 0swaps
Now that’s more like it! Down from thirty-seven seconds to about six and a half. Nice. When we compare this versus the previous implementation of the language, we might laugh ruefully: we just got rid of a few more parentheses and typed in a few symbols. But of course, that’s not what we truly did. What in the world just happened?
Let’s summarize exactly what we did: earlier, we had used parameterize to maintain some shared local state within the dynamic extent of our module’s body. However, on reflection, we see that we don’t need the full power of dynamic scope: a simpler (and cheaper!) lexical scoping mechanism is sufficient here. We now use syntax-parameterize as our mechanism for sharing that state with the rest of the language. And if we ever see parameterize in a tight inner loop again, we shudder instinctively.
But now ambition rears its head and whispers to us: can we make the code go faster? At some point, we’ll hit diminishing returns, but let’s see what other obvious things we can do, and observe what happens to the benchmark results as we optimize.
10.2 Macros, macros everywhere
One trivial thing we can do is revisit our "semantics.rkt" file, and transform all of the exported function definitions into macros. This allows Racket’s compiler to inline the definitions for each use. That is, right now, Racket processes and expands our brainf*ck programs only up to the function definitions in "semantics.rkt". Basically, we can go in and replace each define with a define-syntax-rule.
"semantics.rkt"
#lang racket

(require rackunit)                ;; for unit testing
(provide (all-defined-out))

;; Our state contains two pieces.
(define-struct state (data ptr)
  #:mutable)

;; Creates a new state, with a byte array of 30000 zeros, and
;; the pointer at index 0.
(define-syntax-rule (new-state)
  (make-state (make-vector 30000 0)
              0))

;; increment the data pointer
(define-syntax-rule (increment-ptr a-state)
  (set-state-ptr! a-state (add1 (state-ptr a-state))))

;; decrement the data pointer
(define-syntax-rule (decrement-ptr a-state)
  (set-state-ptr! a-state (sub1 (state-ptr a-state))))

;; increment the byte at the data pointer
(define-syntax-rule (increment-byte a-state)
  (let ([v (state-data a-state)]
        [i (state-ptr a-state)])
    (vector-set! v i (add1 (vector-ref v i)))))

;; decrement the byte at the data pointer
(define-syntax-rule (decrement-byte a-state)
  (let ([v (state-data a-state)]
        [i (state-ptr a-state)])
    (vector-set! v i (sub1 (vector-ref v i)))))

;; print the byte at the data pointer
(define-syntax-rule (write-byte-to-stdout a-state)
  (let ([v (state-data a-state)]
        [i (state-ptr a-state)])
    (write-byte (vector-ref v i) (current-output-port))))

;; read a byte from stdin into the data pointer
(define-syntax-rule (read-byte-from-stdin a-state)
  (let ([v (state-data a-state)]
        [i (state-ptr a-state)])
    (vector-set! v i (read-byte (current-input-port)))))

;; we know how to do loops!
(define-syntax-rule (loop a-state body ...)
  (let loop ()
    (unless (= (vector-ref (state-data a-state) (state-ptr a-state))
               0)
      body ...
      (loop))))
$ raco make prime.rkt && (echo 100 | time racket prime.rkt)
Primes up to: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
3.78user 0.10system 0:03.96elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+10101minor)pagefaults 0swaps
Ok, inlining each of the definitions of the semantics gives us a little more performance, at the cost of some code expansion. But not a large one.
10.3 Structures? Where we’re going, we won’t need structures...
While we have our eye on "semantics.rkt", we might wonder: how much is it costing us to access the data and ptr fields of our state? The use of the structure introduces an indirect memory access. Maybe we can eliminate it, by saying that the state of our language consists of two pieces, rather than one aggregate piece. So one proposal we can consider is to remove the structure, and have each of the rules in our semantics deal with both pieces of the state.
The editing for this will be somewhat non-local: we’ll need to touch both "semantics.rkt" and "language.rkt" because each form in the semantics will take in two pieces, and each language construct in the language must provide those two pieces. Let’s see what this looks like for both files.
"semantics.rkt"
#lang racket

(provide (all-defined-out))

;; Provides two values: a byte array of 30000 zeros, and
;; the pointer at index 0.
(define-syntax-rule (new-state)
  (values (make-vector 30000 0)
          0))

;; increment the data pointer
(define-syntax-rule (increment-ptr data ptr)
  (set! ptr (add1 ptr)))

;; decrement the data pointer
(define-syntax-rule (decrement-ptr data ptr)
  (set! ptr (sub1 ptr)))

;; increment the byte at the data pointer
(define-syntax-rule (increment-byte data ptr)
  (vector-set! data ptr (add1 (vector-ref data ptr))))

;; decrement the byte at the data pointer
(define-syntax-rule (decrement-byte data ptr)
  (vector-set! data ptr (sub1 (vector-ref data ptr))))

;; print the byte at the data pointer
(define-syntax-rule (write-byte-to-stdout data ptr)
  (write-byte (vector-ref data ptr) (current-output-port)))

;; read a byte from stdin into the data pointer
(define-syntax-rule (read-byte-from-stdin data ptr)
  (vector-set! data ptr (read-byte (current-input-port))))

;; we know how to do loops!
(define-syntax-rule (loop data ptr body ...)
  (let loop ()
    (unless (= (vector-ref data ptr)
               0)
      body ...
      (loop))))
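Notice that increment-ptr now uses set! directly on ptr. That only works because the semantics are macros. The following sketch, with made-up names bump-macro! and bump-fn!, shows what would happen if we tried the same thing with a function.

```racket
#lang racket

;; Macro version: ptr is replaced by whatever variable the caller
;; wrote, so the set! mutates the caller's variable.
(define-syntax-rule (bump-macro! ptr)
  (set! ptr (add1 ptr)))

;; Function version: ptr is a fresh local variable that holds a copy
;; of the argument's value, so the set! is invisible to the caller.
(define (bump-fn! ptr)
  (set! ptr (add1 ptr))
  ptr)

(define p 0)
(bump-macro! p)                 ; p is now 1
(define returned (bump-fn! p))  ; returns 2, but p is still 1
```

So splitting the state into two plain variables and turning the semantics into macros go hand in hand: a function-based version would need to return the new pointer instead of mutating it.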
"language.rkt"
#lang racket

(require "semantics.rkt"
         racket/stxparam)

(provide greater-than
         less-than
         plus
         minus
         period
         comma
         brackets
         (rename-out [my-module-begin #%module-begin]))

;; The current-data and current-ptr are syntax parameters used by the
;; rest of this language.
(define-syntax-parameter current-data #f)
(define-syntax-parameter current-ptr #f)

;; Every module in this language will make sure that it
;; uses a fresh state.
(define-syntax-rule (my-module-begin body ...)
  (#%plain-module-begin
   (let-values ([(fresh-data fresh-ptr) (new-state)])
     (syntax-parameterize
         ([current-data (make-rename-transformer #'fresh-data)]
          [current-ptr (make-rename-transformer #'fresh-ptr)])
       body ...))))

(define-syntax-rule (greater-than)
  (increment-ptr current-data current-ptr))

(define-syntax-rule (less-than)
  (decrement-ptr current-data current-ptr))

(define-syntax-rule (plus)
  (increment-byte current-data current-ptr))

(define-syntax-rule (minus)
  (decrement-byte current-data current-ptr))

(define-syntax-rule (period)
  (write-byte-to-stdout current-data current-ptr))

(define-syntax-rule (comma)
  (read-byte-from-stdin current-data current-ptr))

(define-syntax-rule (brackets body ...)
  (loop current-data current-ptr body ...))
(let-values ([(data ptr) (new-state)]) ...)
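The syntax-parameter plumbing in "language.rkt" can look a little mysterious, so here is a scaled-down sketch of the same pattern with a single variable. The names current-counter, with-counter and tick are invented for this example and are not part of the brainf*ck language.

```racket
#lang racket

(require racket/stxparam)

;; A syntax parameter with no meaning outside of with-counter.
(define-syntax-parameter current-counter #f)

;; Introduces a fresh variable and, for the extent of the body, makes
;; current-counter an alias for it.  The value of the whole form is
;; the final value of that variable.
(define-syntax-rule (with-counter body ...)
  (let ([fresh 0])
    (syntax-parameterize
        ([current-counter (make-rename-transformer #'fresh)])
      body ...
      fresh)))

;; Because current-counter is renamed to a real variable, set! on it
;; mutates that variable.
(define-syntax-rule (tick)
  (set! current-counter (add1 current-counter)))

(define total (with-counter (tick) (tick) (tick)))
```

Each (tick) finds the variable that the enclosing with-counter set up, in the same way that each brainf*ck instruction finds the fresh-data and fresh-ptr variables introduced by my-module-begin.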
In any case, what does our benchmark tell us about this optimization?
$ raco make prime.rkt && (echo 100 | time racket prime.rkt)
Primes up to: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
1.13user 0.09system 0:01.30elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+10095minor)pagefaults 0swaps
That seems like a more substantial optimization! Ok, we’re down to about a second plus a little more.
10.4 Strapping on the safety goggles
Let’s pause for a moment. We should ask ourselves: is our language actually doing the Right thing? We might consider the following situations:
The program may try to read a byte from the standard input port, and encounter eof instead.
A program may try to increment the value at the pointer beyond the boundaries of a byte.
The machine might be instructed to shift the pointer *clunk* off the data array.
What happens in our current implementation when these situations arise?
Oh dear. We should have looked at this earlier! How shameful! None of these are directly addressed by our current implementation. We’d better correct these flaws before continuing forward, before anyone else notices. And even if this costs us a few milliseconds in performance, it’s certainly worth knowing exactly what should happen in these situations.
10.4.1 eof
According to the Portable Brainf*ck guide,
If a program attempts to input a value when there is no more data in the input stream, the value in the current cell after such an operation is implementation-defined. (The most common choices are to either store 0, or store -1, or to leave the cell’s value unchanged. This is frequently the most problematic issue for the programmer wishing to achieve portability.)
Let’s choose to treat the reading of eof as a zero. We can change the definition of read-byte-from-stdin in "semantics.rkt" to do this.
;; read a byte from stdin into the data pointer
(define-syntax-rule (read-byte-from-stdin data ptr)
  (vector-set! data ptr
               (let ([a-value (read-byte (current-input-port))])
                 (if (eof-object? a-value)
                     0
                     a-value))))
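We can try out the eof handling in isolation, without touching standard input, by reading from a bytes port. This is only a sketch: read-byte-or-zero is a hypothetical helper that mirrors the let and if inside the macro above.

```racket
#lang racket

;; Hypothetical helper: read one byte from the port in, treating eof
;; as zero.
(define (read-byte-or-zero in)
  (let ([a-value (read-byte in)])
    (if (eof-object? a-value)
        0
        a-value)))

;; A port that holds exactly one byte, the letter A.
(define in (open-input-bytes #"A"))

(define first-read (read-byte-or-zero in))   ; 65, the byte for A
(define second-read (read-byte-or-zero in))  ; input exhausted, so 0
```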
10.4.2 out-of-range byte mutation
Next, let’s look at what the portability guide says about what happens when we increment or decrement a byte past certain limits.
The range of values a single cell can contain is implementation-defined. (The range need not be consistent, either: consider the case of a "bignum" implementation, whose cells’ ranges would be limited only by currently available resources.) However, the range of every cell shall always at least include the values 0 through 127, inclusive.)
If a program attempts to either decrement the value of a cell below its documented minimum value, if any, or increment the value of a cell beyond its documented maximum value, if any, then the value in the cell after such an operation is implementation-defined. (Most implementations choose to let the value wrap around in a fashion typical to C integers, but this is not required.)
So it looks like we have a little leeway here. We’ve implicitly been using a vector of bytes, since we’ve been using read-byte and write-byte on the values of the data vector. Since bytes range between 0 and 255, let’s keep our cells in that range too. One simple tool we can use is modulo, which allows us to keep the values in that range. Let’s use it.
;; increment the byte at the data pointer
(define-syntax-rule (increment-byte data ptr)
  (vector-set! data ptr
               (modulo (add1 (vector-ref data ptr)) 256)))

;; decrement the byte at the data pointer
(define-syntax-rule (decrement-byte data ptr)
  (vector-set! data ptr
               (modulo (sub1 (vector-ref data ptr)) 256)))
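A quick check of the two boundary cases shows the wraparound. The helpers wrap-add1 and wrap-sub1 are hypothetical names for the arithmetic used in the macros above.

```racket
#lang racket

;; The arithmetic from increment-byte and decrement-byte, on its own.
(define (wrap-add1 n) (modulo (add1 n) 256))
(define (wrap-sub1 n) (modulo (sub1 n) 256))

(wrap-add1 255)           ; 0: incrementing past the top wraps to zero
(wrap-sub1 0)             ; 255: decrementing below zero wraps to the top
(remainder (sub1 0) 256)  ; -1: remainder would leave us out of range
```

The last line is the reason to reach for modulo rather than remainder: modulo takes the sign of its second argument, so with a divisor of 256 the result always lands between 0 and 255.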
10.4.3 out-of-bounds pointer movement
What does the portability guide say about moving the tape out-of-bounds?
If a program attempts to move the pointer below the first array cell, or beyond the last array cell, then that program’s behavior is undefined. (A few implementations cause the pointer to wrap around, but many, perhaps most, implementations behave in a manner consistent with a C pointer wandering off into arbitrary memory.)
Wait. Stop right there. It is absolutely unacceptable for us to just have the pointer wander out-of-bounds like that. Even we brainf*ck programmers must have our standards. Instead, let’s make it a guaranteed runtime error that halts evaluation. Moreover, let’s make sure the error message points directly at the offending instruction in the source text.
How do we get our errors to highlight in DrRacket? Racket, like many languages, provides exceptions as structured values. In particular, DrRacket will cooperate when it sees an exception that provides source location.
#lang racket

;; We create a structure that supports the
;; prop:exn:srclocs protocol.  It carries
;; with it the location of the syntax that
;; is guilty.
(define-struct (exn:fail:he-who-shall-not-be-named
                exn:fail)
  (a-srcloc)
  #:property prop:exn:srclocs
  (lambda (a-struct)
    (match a-struct
      [(struct exn:fail:he-who-shall-not-be-named
         (msg marks a-srcloc))
       (list a-srcloc)])))

;; We can play with this by creating a form that
;; looks at identifiers, and only flags specific ones.
(define-syntax (skeeterize stx)
  (syntax-case stx ()
    [(_ expr)
     (cond
       [(and (identifier? #'expr)
             (eq? (syntax-e #'expr) 'voldemort))
        (quasisyntax/loc stx
          (raise (make-exn:fail:he-who-shall-not-be-named
                  "oh dear don't say his name"
                  (current-continuation-marks)
                  (srcloc '#,(syntax-source #'expr)
                          '#,(syntax-line #'expr)
                          '#,(syntax-column #'expr)
                          '#,(syntax-position #'expr)
                          '#,(syntax-span #'expr)))))]
       [else
        ;; Otherwise, leave the expression alone.
        #'expr])]))

(define (f x)
  (* (skeeterize x) x))

(define (g voldemort)
  (* (skeeterize voldemort) voldemort))

;; Examples:
(f 7)
(g 7)  ;; The error should highlight the use
       ;; of the one-who-shall-not-be-named
       ;; in g.
When we construct an exn:fail:he-who-shall-not-be-named with make-exn:fail:he-who-shall-not-be-named, we provide it a srcloc taken from the originating syntax object. Furthermore, we tell the Racket runtime that this structure is a good source for source locations, by annotating the structure’s definition with prop:exn:srclocs. This allows the runtime system to cooperate with the DrRacket editor, so that when an exn:fail:he-who-shall-not-be-named does get raised at runtime, the editor can nicely highlight the offending party.
When we were looking at parsing, we were careful enough to produce syntax objects with source locations. It would be a shame to waste that effort. Here’s what we’ll do: we’ll adjust the semantics of increment-ptr and decrement-ptr to take in one more argument: a representation of the source location. If we see that the pointer’s going to fall off, we can then raise an exception that’s annotated with srcloc information. That should give the DrRacket environment the information it needs to highlight tape-movement errors at runtime.
(define-syntax (greater-than stx)
  (syntax-case stx ()
    [(_)
     (quasisyntax/loc stx
       (increment-ptr current-data current-ptr
                      (srcloc '#,(syntax-source stx)
                              '#,(syntax-line stx)
                              '#,(syntax-column stx)
                              '#,(syntax-position stx)
                              '#,(syntax-span stx))))]))
One small complication is that we need the ability to talk about the source location of the syntax object being fed to the greater-than macro, so we switched from using define-syntax-rule to the more low-level syntax-case macro definer.
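To see what syntax-case buys us, here is a stripped-down sketch of a macro that looks at its own use site. The name line-of-use is invented for this example. Each use expands to a quoted constant: the line number of that use, or #f if the syntax object carries no line information.

```racket
#lang racket

;; Expands to the (quoted) line number at which the macro is used.
;; stx is the syntax object for the whole use, for example (line-of-use).
(define-syntax (line-of-use stx)
  (syntax-case stx ()
    [(_)
     (quasisyntax/loc stx
       '#,(syntax-line stx))]))

;; Two uses on consecutive lines expand to two different constants.
(define first-use (line-of-use))
(define second-use (line-of-use))
```

A define-syntax-rule macro has no name for the syntax object it is applied to, which is why greater-than had to be rewritten in this lower-level style.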
Let’s look at the corresponding changes we need to make to increment-ptr; assuming we have a definition for an exn:fail:out-of-bounds exception, the code for increment-ptr will look like this.
(define-syntax-rule (increment-ptr data ptr loc)
  (begin
    (set! ptr (add1 ptr))
    (when (>= ptr (vector-length data))
      (raise (make-exn:fail:out-of-bounds
              "out of bounds"
              (current-continuation-marks)
              loc)))))
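Outside of DrRacket, we can observe the same protocol by catching the exception ourselves and asking it for its source locations. The sketch below assumes an exception structure shaped like exn:fail:out-of-bounds; the function checked-move and the hand-written srcloc are hypothetical stand-ins for the macros and for real source locations.

```racket
#lang racket

;; An exception that carries a source location and exposes it
;; through the prop:exn:srclocs protocol.
(define-struct (exn:fail:out-of-bounds exn:fail)
  (srcloc)
  #:property prop:exn:srclocs
  (lambda (a-struct)
    (list (exn:fail:out-of-bounds-srcloc a-struct))))

;; Hypothetical function version of the pointer check: move ptr by
;; delta within a tape of the given size, or raise with the location.
(define (checked-move ptr delta size loc)
  (define new-ptr (+ ptr delta))
  (when (or (< new-ptr 0) (>= new-ptr size))
    (raise (make-exn:fail:out-of-bounds
            "out of bounds"
            (current-continuation-marks)
            loc)))
  new-ptr)

;; A made-up location: line 3, column 4, position 20, span 1.
(define loc (srcloc "example.rkt" 3 4 20 1))

;; Any value with the property satisfies exn:srclocs?, and
;; exn:srclocs-accessor retrieves the procedure we attached above.
(define caught
  (with-handlers ([exn:srclocs?
                   (lambda (e) ((exn:srclocs-accessor e) e))])
    (checked-move 0 -1 30000 loc)))
```

Here caught ends up as a one-element list holding the srcloc we passed in, which is the same information an editor can use to decide what to highlight.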
Our "semantics.rkt" and "language.rkt" now look like the following:
"semantics.rkt"
#lang racket

(provide (all-defined-out))

;; We use a customized error structure that supports
;; source location reporting.
(define-struct (exn:fail:out-of-bounds exn:fail)
  (srcloc)
  #:property prop:exn:srclocs
  (lambda (a-struct)
    (list (exn:fail:out-of-bounds-srcloc a-struct))))

;; Provides two values: a byte array of 30000 zeros, and
;; the pointer at index 0.
(define-syntax-rule (new-state)
  (values (make-vector 30000 0)
          0))

;; increment the data pointer
(define-syntax-rule (increment-ptr data ptr loc)
  (begin
    (set! ptr (add1 ptr))
    (when (>= ptr (vector-length data))
      (raise (make-exn:fail:out-of-bounds
              "out of bounds"
              (current-continuation-marks)
              loc)))))

;; decrement the data pointer
(define-syntax-rule (decrement-ptr data ptr loc)
  (begin
    (set! ptr (sub1 ptr))
    (when (< ptr 0)
      (raise (make-exn:fail:out-of-bounds
              "out of bounds"
              (current-continuation-marks)
              loc)))))

;; increment the byte at the data pointer
(define-syntax-rule (increment-byte data ptr)
  (vector-set! data ptr
               (modulo (add1 (vector-ref data ptr)) 256)))

;; decrement the byte at the data pointer
(define-syntax-rule (decrement-byte data ptr)
  (vector-set! data ptr
               (modulo (sub1 (vector-ref data ptr)) 256)))

;; print the byte at the data pointer
(define-syntax-rule (write-byte-to-stdout data ptr)
  (write-byte (vector-ref data ptr) (current-output-port)))

;; read a byte from stdin into the data pointer
(define-syntax-rule (read-byte-from-stdin data ptr)
  (vector-set! data ptr
               (let ([a-value (read-byte (current-input-port))])
                 (if (eof-object? a-value)
                     0
                     a-value))))

;; we know how to do loops!
(define-syntax-rule (loop data ptr body ...)
  (let loop ()
    (unless (= (vector-ref data ptr)
               0)
      body ...
      (loop))))
"language.rkt"
#lang racket

(require "semantics.rkt"
         racket/stxparam)

(provide greater-than
         less-than
         plus
         minus
         period
         comma
         brackets
         (rename-out [my-module-begin #%module-begin]))

;; The current-data and current-ptr are syntax parameters used by the
;; rest of this language.
(define-syntax-parameter current-data #f)
(define-syntax-parameter current-ptr #f)

;; Every module in this language will make sure that it
;; uses a fresh state.
(define-syntax-rule (my-module-begin body ...)
  (#%plain-module-begin
   (let-values ([(fresh-data fresh-ptr) (new-state)])
     (syntax-parameterize
         ([current-data (make-rename-transformer #'fresh-data)]
          [current-ptr (make-rename-transformer #'fresh-ptr)])
       body ...))))

(define-syntax (greater-than stx)
  (syntax-case stx ()
    [(_)
     (quasisyntax/loc stx
       (increment-ptr current-data current-ptr
                      (srcloc '#,(syntax-source stx)
                              '#,(syntax-line stx)
                              '#,(syntax-column stx)
                              '#,(syntax-position stx)
                              '#,(syntax-span stx))))]))

(define-syntax (less-than stx)
  (syntax-case stx ()
    [(_)
     (quasisyntax/loc stx
       (decrement-ptr current-data current-ptr
                      (srcloc '#,(syntax-source stx)
                              '#,(syntax-line stx)
                              '#,(syntax-column stx)
                              '#,(syntax-position stx)
                              '#,(syntax-span stx))))]))

(define-syntax-rule (plus)
  (increment-byte current-data current-ptr))

(define-syntax-rule (minus)
  (decrement-byte current-data current-ptr))

(define-syntax-rule (period)
  (write-byte-to-stdout current-data current-ptr))

(define-syntax-rule (comma)
  (read-byte-from-stdin current-data current-ptr))

(define-syntax-rule (brackets body ...)
  (loop current-data current-ptr body ...))
#lang planet dyoo/bf

***********
* *
* o> <o *
* *
* <<<<<<<< *
* *
********
DrRacket will properly highlight the second < in the program, the one at the left edge of the face’s mouth. Hurrah!
$ raco make prime.rkt && (echo 100 | time racket prime.rkt)
Primes up to: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
1.56user 0.08system 0:01.71elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+8678minor)pagefaults 0swaps
10.5 Running with scissors
There’s one obvious thing we haven’t done yet: we haven’t taken a look at racket/unsafe/ops. racket/unsafe/ops provides functions that act like +, vector-ref, and many of the other functions we’ve used in "semantics.rkt". However, unlike their safe equivalents, the ones in racket/unsafe/ops don’t do type tests on their inputs.
This can reduce some runtime costs. We’re already making sure that the pointer lies within the boundaries of the data, with our last set of changes, so having Racket do the same kind of internal check is redundant.
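For example, the safe definition of increment-byte:
(define-syntax-rule (increment-byte data ptr)
  (vector-set! data ptr
               (modulo (add1 (vector-ref data ptr)) 256)))
can be rewritten in terms of the unsafe operations as:
(define-syntax-rule (increment-byte data ptr)
  (unsafe-vector-set! data ptr
                      (unsafe-fxmodulo
                       (unsafe-fx+ (unsafe-vector-ref data ptr) 1)
                       256)))
The danger is that a mistake the safe operations would have caught, such as swapping the arguments, now goes unchecked:
;; WARNING WARNING DO NOT ACTUALLY EXECUTE THIS!!!
(unsafe-vector-ref ptr data)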
So we need to tread very, very carefully.
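One way to tread carefully is to compare the unsafe code against the safe code on inputs we know to be valid. The following sketch does this for the byte increment. The functions safe-increment! and unsafe-increment! are hypothetical function versions of the macro, written only for this comparison.

```racket
#lang racket

(require racket/unsafe/ops)

;; The safe arithmetic from increment-byte, as a function.
(define (safe-increment! v i)
  (vector-set! v i (modulo (add1 (vector-ref v i)) 256)))

;; The same arithmetic with unsafe operations.  This is only valid
;; when v is a vector, i is an in-bounds fixnum index, and the cell
;; holds a fixnum.
(define (unsafe-increment! v i)
  (unsafe-vector-set! v i
                      (unsafe-fxmodulo
                       (unsafe-fx+ (unsafe-vector-ref v i) 1)
                       256)))

(define safe-data (make-vector 4 0))
(define unsafe-data (make-vector 4 0))

;; Increment cell 2 of each vector 300 times, enough to wrap around.
(for ([step (in-range 300)])
  (safe-increment! safe-data 2)
  (unsafe-increment! unsafe-data 2))
```

After the loop, both vectors hold 44 in cell 2, since 300 wraps around 256 once. A check like this only tells us that the two versions agree on valid inputs. It says nothing about what the unsafe version does on invalid ones, so the bounds checks in increment-ptr and decrement-ptr still carry the real burden.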
"semantics.rkt"
#lang racket

;; unsafe operations for speed.
;; But be very careful!
(require racket/unsafe/ops)

(provide (all-defined-out))

;; We use a customized error structure that supports
;; source location reporting.
(define-struct (exn:fail:out-of-bounds exn:fail)
  (srcloc)
  #:property prop:exn:srclocs
  (lambda (a-struct)
    (list (exn:fail:out-of-bounds-srcloc a-struct))))

;; Provides two values: a byte array of 30000 zeros, and
;; the pointer at index 0.
(define-syntax-rule (new-state)
  (values (make-vector 30000 0)
          0))

;; increment the data pointer
(define-syntax-rule (increment-ptr data ptr loc)
  (begin
    (set! ptr (unsafe-fx+ ptr 1))
    (when (unsafe-fx>= ptr (unsafe-vector-length data))
      (raise (make-exn:fail:out-of-bounds
              "out of bounds"
              (current-continuation-marks)
              loc)))))

;; decrement the data pointer
(define-syntax-rule (decrement-ptr data ptr loc)
  (begin
    (set! ptr (unsafe-fx- ptr 1))
    (when (unsafe-fx< ptr 0)
      (raise (make-exn:fail:out-of-bounds
              "out of bounds"
              (current-continuation-marks)
              loc)))))

;; increment the byte at the data pointer
(define-syntax-rule (increment-byte data ptr)
  (unsafe-vector-set! data ptr
                      (unsafe-fxmodulo
                       (unsafe-fx+ (unsafe-vector-ref data ptr) 1)
                       256)))

;; decrement the byte at the data pointer
(define-syntax-rule (decrement-byte data ptr)
  (unsafe-vector-set! data ptr
                      (unsafe-fxmodulo
                       (unsafe-fx- (unsafe-vector-ref data ptr) 1)
                       256)))

;; print the byte at the data pointer
(define-syntax-rule (write-byte-to-stdout data ptr)
  (write-byte (unsafe-vector-ref data ptr) (current-output-port)))

;; read a byte from stdin into the data pointer
(define-syntax-rule (read-byte-from-stdin data ptr)
  (unsafe-vector-set! data ptr
                      (let ([a-value (read-byte (current-input-port))])
                        (if (eof-object? a-value)
                            0
                            a-value))))

;; we know how to do loops!
(define-syntax-rule (loop data ptr body ...)
  (let loop ()
    (unless (unsafe-fx= (unsafe-vector-ref data ptr)
                        0)
      body ...
      (loop))))
11 A final accounting
We can see the net effect of applying the combination of all these optimizations.
$ raco make prime.rkt && (echo 100 | time racket prime.rkt)
Primes up to: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
1.32user 0.05system 0:01.73elapsed 79%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+8696minor)pagefaults 0swaps
And that’s not too bad of a result. We’ve gone from thirty-seven seconds to just over one.
What we have in hand isn’t the world’s fastest brainf*ck implementation. Ours isn’t horrible, mind you, but it doesn’t win speed records. What we do have, though, is an implementation that’s fairly straightforward, and integrates well with the umbrella of other languages and tools in Racket.
It’s one with which we can easily run experiments. What if we wanted to run brainf*ck programs in parallel? What if we want to run these programs under a restrictive sandbox? Would using a type system allow us to remove all those messy unsafe annotations in our semantics, while still removing the redundant type checks?
And what if we want to look at other languages besides brainf*ck? Now that we have a better understanding about how the Racket language toolchain works, how easy is it to implement a more realistic language?
There’s much more content about building languages in the Racket Guide; hopefully, this tutorial helps other hackers who’d love to try their hand at language design and implementation.
Also, please feel free to ask questions on the Racket Users mailing list; we’ll be happy to talk!